Module 4: Linked Lists

Module Overview

Understanding and implementing linked list data structures in Java. Learn about the mechanics, benefits, and trade-offs of using linked lists versus other collection types.

Learning Objectives

  • Design and implement a class that uses a List as a member that maintains an ordered collection of objects
  • Explain why accessing a value by index from a LinkedList runs in linear O(n) time
  • Explain why inserting or removing the first value in a LinkedList runs in constant O(1) time
  • Analyze whether to use a LinkedList or an ArrayList for a provided scenario

Video Content: Linked Lists Introduction

This video introduces the concept of linked lists, explaining their structure, basic operations, and comparative advantages over array-based lists in certain scenarios.

// Basic Node class for a singly linked list
public class Node<T> {
    private T data;
    private Node<T> next;
    
    public Node(T data) {
        this.data = data;
        this.next = null;
    }
    
    public T getData() {
        return data;
    }
    
    public void setData(T data) {
        this.data = data;
    }
    
    public Node<T> getNext() {
        return next;
    }
    
    public void setNext(Node<T> next) {
        this.next = next;
    }
}

// Simple LinkedList implementation
public class LinkedList<T> {
    private Node<T> head;
    private int size;
    
    public LinkedList() {
        head = null;
        size = 0;
    }
    
    // Add to the beginning - O(1) operation
    public void addFirst(T data) {
        Node<T> newNode = new Node<>(data);
        newNode.setNext(head);
        head = newNode;
        size++;
    }
    
    // Get element at index - O(n) operation
    public T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        
        Node<T> current = head;
        for (int i = 0; i < index; i++) {
            current = current.getNext();
        }
        
        return current.getData();
    }
}

This example shows a basic implementation of a singly linked list with a Node class and essential operations. Note how adding to the beginning is an O(1) operation (constant time) because it only requires updating the head reference, while accessing an element by index is an O(n) operation (linear time) as we must traverse from the head to the desired index.

What is a linked list?

A linked list is a series of nodes, each of which contains an associated data element and a pointer to either one or two other nodes. Each node is an object instance that has an associated data element and a reference to the next member.

A real-life example that illustrates a linked list is a treasure hunt. Treasure hunts are a fun way to get out with friends and do some exploration. All treasure hunts start at a known location that is given to all participants. The "head pointer" of a linked list points to its start location. At each location, you pick up a prize and a clue leading to the next location. The final location of the treasure hunt has a prize, but no clue, since there are no more locations to visit. The "tail pointer" of a linked list points to its end location. In the treasure hunt, you don't necessarily know how many locations there will be, but you always know that each location's clue points to the next location in the hunt. To get a location by its index, you must visit all the previous locations in order; it's not an O(1) lookup like an array.

Nodes in linked lists are like locations in a treasure hunt. Each node (location) has a bit of data (the prize) and a pointer to the next node (a clue leading to the next location). If you want to locate the first clue you can use the head pointer to find it. However, finding a location in the middle of the treasure hunt would be more work. You would have to start at the head and visit locations one a time, following their clues until you reach the location you are looking for.

The figure below depicts the data structures that are typically used to represent the linked list. The head node is the leftmost node and the tail node is the rightmost node. There is a pointer to the head node, and some linked lists have a pointer to the tail node. Each node has a pointer to the next node in the linked list, but the tail node's pointer is null. You iterate through a linked list by starting at the head node and traveling toward the tail node.

Modeling of a Single linked list. A linked list with 3 nodes: a head node with a data element and a pointer to the next node, a middle node with a data element and a pointer to the next node, and a tail node with a data element and a null pointer. There are also pointers to the head node and tail node.

Modeling of a Single linked list. A linked list with 3 nodes: a head node with a data element and a pointer to the next node, a middle node with a data element and a pointer to the next node, and a tail node with a data element and a null pointer. There are also pointers to the head node and tail node.

Situations well suited for linked lists

Linked lists are most useful for a list of objects accessed in order. We add objects to the tail of the list. We call removing objects from the head of the list "First In, First Out" (FIFO). We call removing objects from the tail of the list "Last In, First Out" (LIFO).

This simplified Amazon call center uses FIFO operations:

  • We take calls in the order they are received, meaning we add new calls to the end of the list and take calls from the front of the list.
  • When we finish helping the customer at the front of the line, we remove them from the front of the list.
  • We do not change the order of the calls, except for adding calls at the end of the list and removing them from the front of the list.

Recall that removing the head node for an ArrayList is O(n), because we remove the element at index 0 and shift all the remaining elements lower by one index. In just a moment, we will describe how a linked list can remove its head node in O(1), making a linked list a better fit for this call center.

A stack of dinner plates uses LIFO operations:

  • As we wash each plate, we place it on top of the stack (end of the list).
  • As we make each guest's meal, we remove the plate on top of the stack (end of the list).

Although we can add and remove from the end of an ArrayList in constant time, resizing it whenever it grows larger than its backing array is expensive. While a single family's stack of dinner plates won't grow very large, a stack of dinner plates for a buffet restaurant would be better suited to a linked list.

Adding a node to the tail of a linked list

Let's consider what operations are necessary to add a new element to the end of a linked list.

Linked lists don't store their nodes in an array, so we can't just add a node to the last array index. Instead, each node points to the next node in order; therefore we must:

  1. Create the new node with its data element
  2. Find the tail node
  3. Update the tail node to point to the new node
  4. If a tail pointer exists, then update it to point at the new node

If the list does not have a tail pointer, then finding the tail node requires starting at the head and iterating until we reach the last node, making it an O(n) operation. If the list has a tail pointer, then finding the tail node only requires accessing the value in the tail pointer, which is an O(1) operation. The rest of the operations are all O(1), so the time complexity of adding a node to the tail of a linked list is the same as the time complexity of finding the current tail node.

The work required to add to the end of an ArrayList is similar, but includes large penalties for resizing when the list grows larger than its backing array. In use cases that also include adding nodes anywhere other than the end of the list, ArrayList has a penalty for moving all its later items by one index. Linked lists are often more suitable for these use cases.

A linked list with a new node added to its tail. There are 3 nodes: a head node with a data element and pointer to the next node, the original tail node with a data element and a new pointer to the next node, and the new node which has a data element and null pointer.

A linked list with a new node added to its tail. There are 3 nodes: a head node with a data element and pointer to the next node, the original tail node with a data element and a new pointer to the next node, and the new node which has a data element and null pointer.

Adding a node to the head of a linked list

Adding a new node to the head of a linked list is similar to adding a new tail node:

  1. Create the new node with its data element
  2. Find the head node
  3. Update the new node to point at the head node
  4. Update the head pointer to point at the new node

Because all linked lists have a head pointer, finding the head node is an O(1) operation. The other operations are all O(1). Therefore, adding a new head node is a constant-time operation.

A linked list with a new node added to the head. There are 3 nodes: a new node a data element and pointer to the original head node, the head node with a data element and a pointer to the tail node, and the tail node with a data element and a null pointer. The head pointer has been updated to point at the new node.

A linked list with a new node added to the head. There are 3 nodes: a new node a data element and pointer to the original head node, the head node with a data element and a pointer to the tail node, and the tail node with a data element and a null pointer. The head pointer has been updated to point at the new node.

Removing a node from a linked list

The time complexity of removing a node from a linked list depends on finding the nodes that need to be updated, just like adding a node to the tail. Up to three nodes can be involved: the preceding node (the node that points to the node to be removed), the node to be removed, and the following node (the node that the node-to-be-removed points at.)

  1. Find the preceding node. If you are removing the head node, there is no preceding node.
  2. Update the previous node's pointer to point at the following node. (If you are removing the head node, update the head pointer instead.)
  3. Set the removed node's pointer to null.

Consider the time complexity of removing the first element from an ArrayList and from a linked list. An ArrayList would have to shift all other elements one index lower, making it O(n). To remove an element, a linked list only needs to change the head pointer to point at the second node in the list, making it O(1) no matter how many items are in the list. This makes linked lists ideal for FIFO operations, when we add nodes to the tail of the list and remove them from the head of the list.

A linked list after removing the head node. The old head node and the old head pointer are faded out, indicating that the head pointer now points to the new head node. The new head node is unchanged: it still has a data element and pointer to the tail node, which has a data element and a null pointer.

A linked list after removing the head node. The old head node and the old head pointer are faded out, indicating that the head pointer now points to the new head node. The new head node is unchanged: it still has a data element and pointer to the tail node, which has a data element and a null pointer.

Accessing nodes in a linked list

The efficiency of adding linked list tail nodes and removing linked list head nodes comes at a price. Imagine you're about to start a treasure hunt and your friend calls you up and says they'll meet you at the coffee shop near the 10th location for lunch. They don't want to spoil the hunt, so the only way for you to meet them is to follow the clues through the first nine locations before you can get the clue for the 10th.

Like the treasure hunt, a linked list has no array index to look up entries in. To access the element at n, we must iterate through the list until we count up to n nodes. Linked lists are well suited to operations at their ends, but far less suited to operations requiring indexing.

When to use Java's LinkedList versus ArrayList -- Examples

Java uses a "doubly linked list" for its LinkedList implementation. Each node in a doubly linked list has a pointer to its preceding node as well as its following node. Java's LinkedList has both a tail pointer and a head pointer. Remember from our description above that the time complexity of "Adding a node to the tail of a linked list" depends on finding the tail node. Since LinkedList has a tail pointer, adding to its end is O(1).

Consider the time complexity of removing the tail node in a doubly linked list. Since we have a tail pointer, and the tail node has a pointer to its preceding node, finding the preceding node always requires exactly two O(1) operations. Therefore, removing the tail node is an O(1) operation.

The LinkedList JavaDoc shows that LinkedList implements the List interface. Here's a table of the complexity of some List operations for Java's ArrayList and LinkedList:

Operation LinkedList ArrayList Description
size() O(1) O(1) Number of elements in list
add(index, ...) O(n) O(n) Add to middle of list
add(0, ...) O(1) O(n) Add to beginning of list
add(...) / add(last, ...) O(1) O(1) Add to end of list
remove(index) O(n) O(n) Remove a middle element
remove(0) O(1) O(n) Remove first element
remove(last) O(1) O(1) Remove last element
get(index) O(n) O(1) Access a middle element
get(0) O(1) O(1) Access first element
get(last) O(1) O(1) Access last element

The table shows that LinkedList is typically more efficient than ArrayList unless the use case requires access by index. To choose between List implementations, you must compare the use cases that require accessing an element by index against the ones that require adding or removing elements from the head of the list. If indexing is more common, use an ArrayList. If adding or removing from the head is more common, use a LinkedList.

We typically use LinkedList when we will always handle items in list order:

  • List of orders to fulfill
  • List of customers to contact
  • Lists of successful and failed calls (almost always added, then printed, with no other access)
  • List of books to be downloaded
  • List of categories (almost always printed together)

We typically use ArrayList when we may access items arbitrarily or unpredictably, including when we may sort them:

  • List of items in an order (usually small, customers sometimes want to access interior nodes)
  • List of names to sort (sorting accesses items by index)
  • Instructions (often requires interrupting and starting over in the middle)
  • List of lines in a file (editing often accesses lines by index)

Video Content: Implementing a Linked List

This video walks through building a custom linked list implementation, demonstrating key operations and design considerations.

// More complete linked list implementation
public class CustomLinkedList<E> {
    
    private static class Node<E> {
        E data;
        Node<E> next;
        
        Node(E data) {
            this.data = data;
            this.next = null;
        }
    }
    
    private Node<E> head;
    private Node<E> tail;
    private int size;
    
    public CustomLinkedList() {
        this.head = null;
        this.tail = null;
        this.size = 0;
    }
    
    // Add to the end - with tail reference this is O(1)
    public void add(E element) {
        Node<E> newNode = new Node<>(element);
        
        if (head == null) {
            head = newNode;
            tail = newNode;
        } else {
            tail.next = newNode;
            tail = newNode;
        }
        
        size++;
    }
    
    // Add at specific index - O(n) in worst case
    public void add(int index, E element) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        
        if (index == 0) {
            addFirst(element);
            return;
        }
        
        if (index == size) {
            add(element);
            return;
        }
        
        Node<E> current = head;
        for (int i = 0; i < index - 1; i++) {
            current = current.next;
        }
        
        Node<E> newNode = new Node<>(element);
        newNode.next = current.next;
        current.next = newNode;
        size++;
    }
    
    // Add to the beginning - O(1)
    public void addFirst(E element) {
        Node<E> newNode = new Node<>(element);
        
        if (head == null) {
            head = newNode;
            tail = newNode;
        } else {
            newNode.next = head;
            head = newNode;
        }
        
        size++;
    }
    
    // Get element at index - O(n)
    public E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        
        Node<E> current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        
        return current.data;
    }
    
    // Remove first element - O(1)
    public E removeFirst() {
        if (head == null) {
            throw new NoSuchElementException();
        }
        
        E data = head.data;
        head = head.next;
        
        if (head == null) {
            tail = null;
        }
        
        size--;
        return data;
    }
}

This more comprehensive implementation demonstrates key linked list operations with time complexity analysis. Note the use of tail reference to make adding to the end an O(1) operation. The code includes methods for inserting elements at different positions, retrieving elements by index, and removing elements, each with appropriate time complexity considerations.

Call Center Implementation Example

This section depicts the implementation of the Call Center from the previous reading. When a customer contacts the Call Center, a separate system creates a "case" for the contact and stores the caller's name, the phone number, the issue details, the time, and other information.

Our Call Center only uses the ID of each case. A LinkedList uses an object reference as its data element, so it could refer to a more complex class, but our implementation uses only the String case ID to simplify our examination.

The Call Center Class

The CallCenter class uses the String case ID. CallCenter has one attribute, calls, which is a LinkedList of Strings. We will discuss these Call Center use cases:

  • Main scenario: A new call gets put on hold and enters the line -> receiveCall()
  • Main scenario: The call at the head of the line gets assigned to an agent and removed -> nextCall()
  • Rare scenario: Search for a call by case ID -> findCall()
  • Rare scenario: Handle a call by index -> getCallAt()

A Note on LinkedList Methods

Since LinkedList is frequently used when we expect to manipulate the ends of a list, it implements methods from the Deque interface. Deque (pronounced "deck") stands for "Double Ended Queue", and defines readable names for these end-related methods. In particular, addFirst(), addLast(), removeFirst(), and removeLast() provide intuitive alternatives for verbose code like dishes.remove(dishes.size() - 1).

We will not use the other Deque methods in ATA, but we encourage you to read the LinkedList JavaDoc to explore them.

CallCenter uses addLast() and removeFirst() instead of the List alternatives.

package com.amazon.ata.deploying.prework;

import java.util.LinkedList;

public class CallCenter {

    // The list of calls. We use a LinkedList for access to the
    // addLast() and removeFirst() methods, which aren't part of List.
    private final LinkedList<String> calls;

    /**
     * Constructor.
     */
    public CallCenter() {
        calls = new LinkedList<>();
    }

    /**
     * A call gets into line to get processed.
     * @param caseId - case ID assigned to this contact
     */
    public void receiveCall(String caseId) {
        calls.addLast(caseId);
    }

    /**
     * The call at the head of the line gets processed.
     * @return call at head of the line
     */
    public String nextCall() {
        return calls.removeFirst();
    }

    /**
     * Handle a call by index.
     * @param index - The desired index of the call to retrieve from the LinkedList
     * @return - The case ID of the call at the index (the data element) 
     */
    public String getCallAt(int index) {
        return calls.get(index);
    }

    /**
     * Search the list and return the first call with a given case ID.
     * @param caseId - The ID of the case we're searching for
     * @return - The first call matching the case ID
     */
    public String findCall(String caseId) {
        for (String call : calls) {
            if (call.equals(caseId)) {
                return call;
            }
        }
        return null;
    }
}

Adding a New Call as the Tail Node

When a customer calls into the Call Center, we need to add the call to the tail of the list. The receiveCall() method calls the LinkedList method addLast() to add the new call to the tail of the LinkedList. For example, when a call with case ID "A1" gets added to the back of an empty list, then a call with case "B2" gets added to the back of the list, the list order will be ("A1", "B2") as shown below. Adding the call "C3" to the back of the list results in order ("A1", "B2", "C3").

Adding a new Tail node to the list. There are 3 nodes; the first is a node with data element  = "A1" and a pointer to the next node, the second node is the old tail node with data element  = "B2" and a pointer to the next node, and the third node is the new tail node with data element  = "C3" and a null pointer. The tail pointer has been updated to point at the new tail node.

Adding a new Tail node to the list. There are 3 nodes; the first is a node with data element = "A1" and a pointer to the next node, the second node is the old tail node with data element = "B2" and a pointer to the next node, and the third node is the new tail node with data element = "C3" and a null pointer. The tail pointer has been updated to point at the new tail node.

Processing the Head Node Call

When an agent is available, nextCall() removes the head node and returns it. If calls included the case IDs ("A1", "B2", "C3"), nextCall() would remove the case ID "A1" and return it, leaving calls with the case IDs ("B2", "C3").

Removing the Head node. There are 3 nodes; the first node with data element  = "A1" has been removed. The second node is the new head node with data element  = "B2" and a pointer to the next node, and the third node has data element  = "C3" and a pointer to the next node. The head pointer has been updated to point at the new head node.

Removing the Head node. There are 3 nodes; the first node with data element = "A1" has been removed. The second node is the new head node with data element = "B2" and a pointer to the next node, and the third node has data element = "C3" and a pointer to the next node. The head pointer has been updated to point at the new head node.

Finding a Call by Case ID

We can use findCall() to search for a call with a given case ID. In the worst case, the call we're looking for is at the end of the list or doesn't exist, and findCall() must go through the entire list.

Handling a Call by Index

On the rare occasion when we want to retrieve a call by index, we use getCallAt(). Following the example above, if calls included the case IDs ("A1", "B2", "C3"), then getCallAt(0) would return the case ID "A1". Our code does not remove the call from the LinkedList.

Comparing LinkedList and ArrayList implementations for the Call Center

What happens to the time complexity if we had decided to use an ArrayList for calls instead of a LinkedList? This table depicts the time complexity of the CallCenter methods for each implementation:

Operation LinkedList ArrayList
receiveCall() O(1) O(1)
nextCall() O(1) O(n)
findCall() O(n) O(n)
getCallAt() O(n) O(1)

As discussed in the earlier reading, adding and removing from the ends favor the LinkedList, whereas reading (but not adding or removing) interior nodes by index favors the ArrayList. Time complexity becomes increasingly important with larger size lists.

Consider what would happen in a Call Center that receives 500,000 calls each day. As calls arrive, we invoke receiveCall(). With either a LinkedList or an ArrayList, this is an O(1) operation. As agents became available, we invoke nextCall(). With a LinkedList, this would be an O(1) operation; with an ArrayList, each call would be O(n), because we need to move all the other calls to a lower index.

If we must invoke getCallAt(425000), the LinkedList implementation would traverse hundreds of thousands of nodes before finding the correct index. The ArrayList would just return the indexed call in O(1).

Since the use cases indicate that we will invoke nextCall() much more often than getCallAt(), the LinkedList is the most efficient choice.

Guided Project

Video Content: Sprint 14 Linked Lists Overview

This video provides a comprehensive overview of linked lists, summarizing key concepts, implementations, and use cases covered in this module.

// Example: Choosing the right list implementation for the job
import java.util.LinkedList;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ListSelectionExample {
    
    // Queue/stack operations (add/remove at ends) - use LinkedList
    public static void queueOperationExample() {
        Deque<String> queue = new LinkedList<>();
        
        // Add to end (enqueue)
        queue.addLast("First");
        queue.addLast("Second");
        queue.addLast("Third");
        
        // Remove from front (dequeue)
        String first = queue.removeFirst();  // "First"
        String second = queue.removeFirst(); // "Second"
    }
    
    // Random access and modification - use ArrayList
    public static void randomAccessExample() {
        List<Integer> numbers = new ArrayList<>();
        
        // Fill with data
        for (int i = 0; i < 1000; i++) {
            numbers.add(i);
        }
        
        // Fast random access
        int value = numbers.get(500);  // O(1) operation
        
        // Update value at specific index
        numbers.set(500, 9999);        // O(1) operation
    }
    
    // Frequent insertions/deletions in the middle - consider choices carefully
    public static void frequentModificationExample() {
        // If most operations are at the beginning/end
        LinkedList<String> linkedNames = new LinkedList<>();
        
        // If most operations are random access with occasional insertions
        ArrayList<String> arrayNames = new ArrayList<>();
        
        // LinkedList is better when you have a reference to the node
        // and can directly insert/remove without traversal
        // But often ArrayList is still faster overall due to cache locality
        // and lower overhead, unless insertions/deletions dominate
    }
}

This code demonstrates practical scenarios for choosing between LinkedList and ArrayList. LinkedList is optimal for queue/stack operations (adding/removing at ends) and for frequent insertions in the middle when you already have a reference to the position. ArrayList excels with random access and performs better overall in many scenarios due to memory locality and lower overhead per element.

Mastery Task 4: Without Music, Life Would B-flat

Milestone 1: Implement AddSongToPlaylistActivity

We're able to Create, Get, and Update playlists, but they're not very useful without being able to add songs! Let's fix that by implementing the AddSongToPlaylist API. Review the "Music Playlist Service API Implementation Notes" section of the design document for requirements.

NOTE: For this task, you don't need to support the queueNext option. We will do that in a future task. We will implement your base API first for this task and add more features later. In the next milestone, we'll do the same thing with the GetPlaylistSongs endpoint.

Our AlbumTrack java model is empty. We'll need to add the fields matching the data model and stored data in the album_tracks table in your account. Make sure to include annotations to mark the partition key and sort key.

Notice some attributes in the table are not camelCase, such as track_number. However, we can use the attributeName property of the @DynamoDBAttribute annotation to explicitly set the attribute name.

When the Java model is ready, implement a getAlbumTrack method in AlbumTrackDao that uses the DynamoDBMapper to load an item from the album_tracks table. Add tests for AlbumTrackDao as appropriate.

As with PlaylistModel for previous APIs, we must convert our AlbumTrack data model to the API-defined SongModel. With the updated AlbumTrack class, create a toSongModel method in ModelConverter to map the new fields from the AlbumTrack object to the SongModel object. Update ModelConverterTest as appropriate.

Once done, we can implement AddSongToPlaylistActivity's handleRequest method to fetch the album information and add it to the Playlist's songList, save the updated Playlist and return the updated song list from the API. Update the Activity to use the ModelConverter to convert the AlbumTrack's to SongModel's as needed. Make the change, and then uncomment and run the AddSongToPlaylistActivityTest unit tests.

Since we're only partially implementing the AddSongToPlaylistActivity class in this sprint, you only need the following tests to pass to complete this Mastery Task:

  • handleRequest_validRequest_addsSongToEndOfPlaylist
  • handleRequest_noMatchingPlaylistId_throwsPlaylistNotFoundException
  • handleRequest_noMatchingAlbumTrack_throwsAlbumTrackNotFoundException

Once your tests pass, upload your code to Lambda and ensure it works.

Milestone 2: Implement GetPlaylistSongsActivity

Next, let's implement GetPlaylistSongsActivity's handleRequest method to return the song list, along with unit tests as needed. Review the "Music Playlist Service API Implementation Notes" section of the design document for requirements.

No need to support the order field in the GetPlaylistSongsRequest yet, as you will implement that in a later task. For now, we'll only be able to return songs in the default playlist order. This API only needs to load the Playlist from the playlists table and return the Playlist's song list.

To return the result of this activity, we'll have to convert the AlbumTrack's to our API defined SongModel class and add them in a list in the GetPlaylistSongsResult.

Since we're only partially implementing the GetPlaylistSongsActivity class in this sprint, you only need the following tests to pass to complete this Mastery Task:

  • handleRequest_playlistExistsWithSongs_returnsSongsInPlaylist
  • handleRequest_playlistExistsWithoutSongs_returnsEmptyList
  • handleRequest_noMatchingPlaylistId_throwsPlaylistNotFoundException

Doneness Checklist

  • You've implemented AddSongToPlaylist's functionality
  • You've implemented GetPlaylistSongs functionality
  • AddSongsToPlaylistActivityTest tests pass
  • GetPlaylistSongsActivityTest tests pass
  • Both of your Lambda functions are working on AWS

Resources