ArrayLists & Node Lists

Array List ADT, Dynamic Resizing, Position-Based Lists, and Iterators

CS205 Data Structures
ArrayList (index-based)          Node List (position-based)

+---+---+---+---+---+            +---+    +---+    +---+
| A | B | C | D |   |            | A |<-->| B |<-->| C |
+---+---+---+---+---+            +---+    +---+    +---+
  0   1   2   3                   p1       p2       p3
  ^                                ^
index access O(1)                position access O(1)

The List ADT

An abstract, ordered collection of elements

A List stores elements in a linear sequence. Each element has an index (rank) from 0 to n-1.

Core Operations

get(i)     - return the element at index i
set(i, e)  - replace the element at index i with e
add(i, e)  - insert e at index i, shifting later elements right
remove(i)  - remove the element at index i, shifting later elements left
size()     - return the number of elements

Additional Operations

isEmpty()    - is the list empty?
addFirst(e)  - same as add(0, e)
addLast(e)   - same as add(size(), e)

Analogy: A Numbered Lineup

Think of people standing in a numbered line. You can tell someone to stand at position 3, and everyone behind them scoots back one spot.

Key Idea

The List ADT is abstract -- it defines what operations are available, not how they are implemented. An ArrayList and a LinkedList both implement the same List ADT!

        List ADT (interface)
           /          \
     ArrayList      LinkedList
      (array)    (nodes + pointers)
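One way to express the ADT in Java is as an interface, sketched below. The names IndexList and ArrayBackedList are assumptions made for illustration; they are not part of the lecture or of java.util. The adapter simply delegates to java.util.ArrayList to show that the interface says nothing about the implementation behind it.

```java
import java.util.ArrayList;

// Hypothetical interface mirroring the List ADT operations listed above.
interface IndexList<E> {
    E get(int i);
    E set(int i, E e);
    void add(int i, E e);
    E remove(int i);
    int size();

    // Additional operations, defined purely in terms of the core ones.
    default boolean isEmpty() { return size() == 0; }
    default void addFirst(E e) { add(0, e); }
    default void addLast(E e) { add(size(), e); }
}

// Hypothetical adapter: one possible implementation, delegating to java.util.ArrayList.
class ArrayBackedList<E> implements IndexList<E> {
    private final ArrayList<E> items = new ArrayList<>();

    public E get(int i) { return items.get(i); }
    public E set(int i, E e) { return items.set(i, e); }
    public void add(int i, E e) { items.add(i, e); }
    public E remove(int i) { return items.remove(i); }
    public int size() { return items.size(); }
}
```

Client code written against IndexList would keep working if ArrayBackedList were swapped for a node-based implementation.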

Array-Based List (ArrayList)

A dynamic array that grows as needed

An ArrayList stores elements in a contiguous backing array. Two key values track its state:

  • size -- number of elements actually stored
  • capacity -- total slots available in backing array
capacity = 8, size = 5

      +----+----+----+----+----+----+----+----+
arr:  | 10 | 20 | 30 | 40 | 50 |    |    |    |
      +----+----+----+----+----+----+----+----+
idx:    0    1    2    3    4    5    6    7
      <---- used (size = 5) ---> <-- empty -->

Key Idea

size ≤ capacity always. When size == capacity, the array is full and must be resized before adding more elements.

Warning: Index Bounds

Valid indices are 0 to size - 1. Accessing index 5 in the diagram above throws IndexOutOfBoundsException even though the backing array has slots 5-7!

public class ArrayList<E> {
    E[] data;   // backing array
    int size;   // # of real elements

    @SuppressWarnings("unchecked")   // generic arrays cannot be created directly
    ArrayList(int capacity) {
        data = (E[]) new Object[capacity];
        size = 0;
    }
}

Get and Set -- O(1) Random Access

The superpower of array-based lists

Direct Index Access

get(2): go directly to arr[2], no searching needed!

      +----+----+----+----+----+
arr:  | 10 | 20 | 30 | 40 | 50 |
      +----+----+----+----+----+
        0    1    2    3    4
                  ^
                  |
             return 30 -- O(1)!

set(2, 99): replace arr[2] in-place

      +----+----+----+----+----+
arr:  | 10 | 20 | 99 | 40 | 50 |
      +----+----+----+----+----+
        0    1    2    3    4
                  ^
          was 30, now 99 -- O(1)!

Code

// O(1) -- constant time
public E get(int i) {
    if (i < 0 || i >= size)
        throw new IndexOutOfBoundsException();
    return data[i];
}

// O(1) -- constant time; returns the element that was replaced
public E set(int i, E element) {
    if (i < 0 || i >= size)
        throw new IndexOutOfBoundsException();
    E old = data[i];
    data[i] = element;
    return old;
}

Analogy: Hotel Rooms

Like going directly to Room 302 in a hotel. You don't check rooms 1, 2, 3... You just walk straight there. That's what array indexing does -- address arithmetic computes the memory location instantly.

Key Idea

O(1) random access is the main reason to use an ArrayList. If your workload is read-heavy, ArrayList is likely the best choice.


Add at Index -- Shifting Right

Making room costs O(n) in the worst case

add(2, 99): Insert 99 at index 2

BEFORE:
      +----+----+----+----+----+----+
arr:  | 10 | 20 | 30 | 40 | 50 |    |   size = 5, capacity = 6
      +----+----+----+----+----+----+
        0    1    2    3    4    5

STEP 1: Shift elements at indices 2..4 one position RIGHT (start from the back!)
      +----+----+----+----+----+----+
arr:  | 10 | 20 |    | 30 | 40 | 50 |   shifted 30, 40, 50 right
      +----+----+----+----+----+----+
        0    1    2    3    4    5
                  ^ gap opened up

STEP 2: Place new element at index 2, increment size
      +----+----+----+----+----+----+
arr:  | 10 | 20 | 99 | 30 | 40 | 50 |   size = 6
      +----+----+----+----+----+----+
        0    1    2    3    4    5
                  ^ inserted!
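public void add(int i, E element) {
    if (i < 0 || i > size)                      // i == size is allowed (append)
        throw new IndexOutOfBoundsException();
    if (size == data.length)
        resize(Math.max(1, 2 * data.length));   // grow first; also handles capacity 0
    // shift right from back
    for (int k = size - 1; k >= i; k--)
        data[k + 1] = data[k];
    data[i] = element;
    size++;
}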

O(n) Worst Case

add(0, e) shifts ALL n elements. On average, add at a random index shifts n/2 elements. Only add(size, e) (append) avoids shifting entirely.
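The shift counts above can be checked with a small experiment. This is a minimal sketch on a plain int array; ShiftDemo and insertAt are hypothetical names, and the caller is assumed to have left at least one spare slot in the array.

```java
class ShiftDemo {
    // Inserts value at index i of arr, which currently holds `size` elements
    // and has at least one free slot. Returns how many elements were shifted.
    static int insertAt(int[] arr, int size, int i, int value) {
        int shifts = 0;
        for (int k = size - 1; k >= i; k--) {   // walk from the back toward i
            arr[k + 1] = arr[k];
            shifts++;
        }
        arr[i] = value;
        return shifts;
    }
}
```

With five stored elements, inserting at index 0 shifts 5 elements, inserting at index 2 shifts 3, and inserting at index 5 (append) shifts none.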


Remove at Index -- Shifting Left

Filling the gap costs O(n) in the worst case

remove(2): Remove element at index 2 (value 99)

BEFORE:
      +----+----+----+----+----+----+
arr:  | 10 | 20 | 99 | 30 | 40 | 50 |   size = 6
      +----+----+----+----+----+----+
        0    1    2    3    4    5
                  ^ remove this

STEP 1: Save removed element, then shift indices 3..5 one position LEFT
      +----+----+----+----+----+----+
arr:  | 10 | 20 | 30 | 40 | 50 |    |   shifted 30, 40, 50 left
      +----+----+----+----+----+----+
        0    1    2    3    4    5
                                 ^ now unused

STEP 2: Decrement size, null out old last slot
      +----+----+----+----+----+----+
arr:  | 10 | 20 | 30 | 40 | 50 |null|   size = 5
      +----+----+----+----+----+----+
        0    1    2    3    4    5

return 99 (the removed element)
public E remove(int i) {
    if (i < 0 || i >= size)
        throw new IndexOutOfBoundsException();
    E removed = data[i];
    // shift left to close the gap
    for (int k = i; k < size - 1; k++)
        data[k] = data[k + 1];
    data[size - 1] = null;   // help GC
    size--;
    return removed;
}

Key Idea

Setting the old last slot to null prevents memory leaks -- otherwise the array still references an object the user thinks was removed.


Add at End (Append)

Usually O(1), but occasionally O(n) -- amortized O(1)

Case 1: Array has room

addLast(60): size=5, capacity=8

BEFORE:
+----+----+----+----+----+----+----+----+
| 10 | 20 | 30 | 40 | 50 |    |    |    |
+----+----+----+----+----+----+----+----+
  0    1    2    3    4    5    6    7

AFTER: just place at data[size], size++
+----+----+----+----+----+----+----+----+
| 10 | 20 | 30 | 40 | 50 | 60 |    |    |
+----+----+----+----+----+----+----+----+
  0    1    2    3    4    5    6    7

Cost: O(1) -- no shifting, no copying!

Case 2: Array is full

addLast(60): size=5, capacity=5 (FULL!)

+----+----+----+----+----+
| 10 | 20 | 30 | 40 | 50 |   <-- no room!
+----+----+----+----+----+

Must RESIZE first (see next slide), then place 60 at the end.

Cost: O(n) for this one insertion
public void addLast(E element) {
    if (size == data.length)
        resize(Math.max(1, 2 * data.length));   // also handles capacity 0
    data[size] = element;
    size++;
}

// Equivalent to:
//   add(size, element)
// but avoids the shifting loop

Analogy: A Notebook

Writing on the next blank line is instant. But when the notebook is full, you need to buy a bigger one and copy all your old notes -- that takes a while. However, the bigger notebook means many quick writes before the next copy.

Key Idea: Amortized O(1)

Resizing is rare and gets rarer as the array grows. Averaged over many operations, each addLast costs only O(1). This is called amortized analysis.


Dynamic Resizing -- The Doubling Strategy

Why doubling the capacity gives amortized O(1) appends

When the array is full, allocate 2x capacity and copy everything:

FULL (capacity = 4):
+----+----+----+----+
| 10 | 20 | 30 | 40 |                       size = 4, capacity = 4
+----+----+----+----+
  |    |    |    |
  v    v    v    v                          copy all 4 elements
+----+----+----+----+----+----+----+----+
| 10 | 20 | 30 | 40 |    |    |    |    |   new capacity = 8
+----+----+----+----+----+----+----+----+

Now insert new element:
+----+----+----+----+----+----+----+----+
| 10 | 20 | 30 | 40 | 50 |    |    |    |   size = 5
+----+----+----+----+----+----+----+----+

Next 3 appends are FREE (no resize needed)!

Amortized Analysis (Accounting Method)

Growth history (starting capacity = 1):

Op#     Size    Capacity   Resize?   Copy cost
-----   -----   --------   -------   ---------
1       1       1          --        0
2       2       2          yes!      1
3       3       4          yes!      2
4       4       4          --        0
5       5       8          yes!      4
6-8     6-8     8          --        0
9       9       16         yes!      8
10-16   10-16   16         --        0

Total copies after n inserts (n >= 2): 1 + 2 + 4 + ... + 2^k, where 2^k is the largest power of two below n.
This sum equals 2^(k+1) - 1, which is less than 2n.
Amortized copy cost per insert < 2n / n = 2, i.e. O(1).

Key Idea: Why doubling, not +1?

If we grew by +1 each time, every append copies all elements: total cost = 1+2+3+...+n = O(n^2). Doubling makes total copies = O(n), giving amortized O(1) per append.
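The two growth policies can be compared by simply counting copies. The sketch below simulates n appends from an initial capacity of 1, as in the growth history above; GrowthCost and copiesForAppends are hypothetical names, and no real elements are stored.

```java
class GrowthCost {
    // Counts the element copies caused by resizing during n appends,
    // starting from capacity 1. With doubling == false the array grows by +1.
    static long copiesForAppends(int n, boolean doubling) {
        int capacity = 1;
        int size = 0;
        long copies = 0;
        for (int op = 0; op < n; op++) {
            if (size == capacity) {
                copies += size;   // a resize copies every stored element
                capacity = doubling ? capacity * 2 : capacity + 1;
            }
            size++;
        }
        return copies;
    }
}
```

For n = 16 this gives 15 copies with doubling (1 + 2 + 4 + 8) and 120 copies with +1 growth (1 + 2 + ... + 15). For n = 1000 the counts are 1023 and 499500.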

Warning: Memory Waste

Doubling can waste up to 50% of memory (size = n, capacity = 2n). This is the time-space tradeoff. Growth factor 1.5 wastes less memory but still achieves amortized O(1).


Shrinking Strategy

When and how to reclaim unused memory

Naive approach: shrink at 1/2 full

THRASHING -- worst-case scenario!

capacity = 4, size = 4 (full)
+--+--+--+--+
|##|##|##|##|               size/cap = 1
+--+--+--+--+

add() --> array is full, so grow to 8 first; size = 5
+--+--+--+--+--+--+--+--+
|##|##|##|##|##|  |  |  |   grew!
+--+--+--+--+--+--+--+--+

remove() --> size = 4 = capacity/2, so shrink to 4
+--+--+--+--+
|##|##|##|##|               shrank!
+--+--+--+--+

add() --> full again, grow to 8 AGAIN!

Every add/remove pair = O(n) copies!

Warning: Thrashing

If you grow at 2x and shrink at 1/2, alternating add/remove near the boundary triggers resize every single operation. This destroys amortized performance!

Smart approach: shrink at 1/4 full

Rule: shrink to HALF when size = capacity/4

capacity = 16, size = 4 (quarter full)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|##|##|##|##|  |  |  |  |  |  |  |  |  |  |  |  |   trigger: size = cap/4
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Shrink to capacity/2 = 8:
+--+--+--+--+--+--+--+--+
|##|##|##|##|  |  |  |  |   half full, not at boundary
+--+--+--+--+--+--+--+--+

Now you need 2 more removes to trigger another shrink (size = 2 = capacity/4), or 4 adds to fill the array (the 5th add then grows it). In general, after a resize to capacity c, at least c/4 operations happen before the next resize.

Key Idea: Hysteresis

By leaving a buffer zone between the grow threshold (full) and shrink threshold (1/4 full), we guarantee enough operations between resizes to amortize the copy cost.

Summary of thresholds:

GROW:   when size == capacity       new capacity = 2 * old capacity
SHRINK: when size == capacity / 4   new capacity = old capacity / 2
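The grow and shrink thresholds can be put together in a small end-only dynamic array. This is a sketch, not the lecture's implementation: ResizingArray, removeLast, and the resizes counter are hypothetical, and elements are stored as Object to keep the example short.

```java
class ResizingArray {
    private Object[] data = new Object[1];
    private int size = 0;
    int resizes = 0;   // how many times the backing array was reallocated

    int size() { return size; }
    int capacity() { return data.length; }

    void addLast(Object e) {
        if (size == data.length)
            resize(2 * data.length);               // GROW when full
        data[size++] = e;
    }

    Object removeLast() {
        if (size == 0) throw new IllegalStateException("list is empty");
        Object removed = data[--size];
        data[size] = null;                         // drop the stale reference
        if (size > 0 && size == data.length / 4)
            resize(data.length / 2);               // SHRINK at one-quarter full
        return removed;
    }

    private void resize(int newCapacity) {
        Object[] copy = new Object[newCapacity];
        System.arraycopy(data, 0, copy, 0, size);
        data = copy;
        resizes++;
    }
}
```

After five appends the capacity is 8 (three resizes). Alternating removeLast and addLast at that point triggers no further resizes, which is exactly what the half-full shrink rule fails to guarantee.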

The Position ADT

A stable handle to an element, immune to shifting

The Problem with Indices

You're tracking element "C" at index 2.

BEFORE insert:
+---+---+---+---+---+
| A | B | C | D | E |       "C" is at index 2
+---+---+---+---+---+
  0   1   2   3   4

AFTER add(1, "X"):
+---+---+---+---+---+---+
| A | X | B | C | D | E |   "C" is now at index 3!
+---+---+---+---+---+---+
  0   1   2   3   4   5
          ^
  Your stored index 2 now points to "B"!

Indices Are Fragile

Every insertion or deletion can invalidate stored indices. If you save "the element at index 2", that reference breaks after any insertion or removal at or before that index.
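The same failure is easy to reproduce with java.util.ArrayList. The class and method names below are hypothetical; the list contents match the diagram above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StaleIndexDemo {
    // Saves the index of "C", inserts before it, then looks the saved index up again.
    static String lookupAfterInsert() {
        List<String> list = new ArrayList<>(Arrays.asList("A", "B", "C", "D", "E"));
        int saved = list.indexOf("C");   // 2
        list.add(1, "X");                // everything from index 1 onward shifts right
        return list.get(saved);          // the saved index now names a different element
    }
}
```

lookupAfterInsert() returns "B", not "C": the element did not change, but its index did.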

The Solution: Positions

A Position is an object that holds onto an element, regardless of where it sits in the sequence. Insertions and deletions don't break other positions.

Position interface:
+------------------------+
| Position<E>            |
| ---------------------- |
| getElement() : E       |
+------------------------+

You hold position p for "C". No matter how many inserts/deletes happen around it, p.getElement() keeps returning "C" (as long as "C" itself is not removed or replaced).

Analogy: Name Tags vs Seat Numbers

Index = "person in seat #3" -- changes when people move. Position = "the person wearing the Alice name tag" -- always refers to Alice no matter where she sits.

Key Idea

The Position ADT decouples identity from location. This is the foundation for positional (node) lists.


Node List (Positional Linked List)

A doubly-linked list where each node IS a position

Doubly-Linked List with Sentinel Nodes:

 header                                              trailer
 (dummy)                                             (dummy)
+------+     +------+     +------+     +------+     +------+
|      |---->|      |---->|      |---->|      |---->|      |
| null |     |  A   |     |  B   |     |  C   |     | null |
|      |<----|      |<----|      |<----|      |<----|      |
+------+     +------+     +------+     +------+     +------+
                p1           p2           p3
            (position)   (position)   (position)

Each node has:
+-----------+
| element   |       the stored data
| prev  ----------> pointer to previous node
| next  ----------> pointer to next node
+-----------+

Positional List Operations

first()          : Position of first element
last()           : Position of last element
before(p)        : Position before p
after(p)         : Position after p
addBefore(p, e)  : Insert e before p
addAfter(p, e)   : Insert e after p
set(p, e)        : Replace element at p
remove(p)        : Remove element at p

Key Idea: Sentinels

Header and trailer are dummy nodes that simplify edge cases. We never insert before header or after trailer. This eliminates null-checking for first/last operations.

Analogy: Train Cars

Each car (node) is coupled to the car in front and behind. To add a new car, you just recouple the links. The engine (header) and caboose (trailer) never change.


Node List Operations -- O(1) Each!

Pointer rewiring instead of element shifting

addAfter(p2, "X"): Insert "X" after position p2 (which holds "B")

BEFORE:
    +---+        +---+        +---+
... | A | <----> | B | <----> | C | ...
    +---+        +---+        +---+
     p1           p2           p3

STEP 1: Create new node holding "X"
    +---+
    | X |  new
    +---+

STEP 2: Rewire 4 pointers
    +---+        +---+        +---+        +---+
... | A | <----> | B | <----> | X | <----> | C | ...
    +---+        +---+        +---+        +---+
     p1           p2          pNew          p3

Only 4 pointer assignments -- O(1)!

remove(p2): Remove node at position p2

BEFORE:
+---+     +---+     +---+
| A |<--> | B |<--> | C |
+---+     +---+     +---+
 p1        p2        p3

Rewire: A.next = C, C.prev = A

AFTER:
+---+               +---+
| A |<------------> | C |
+---+               +---+
 p1                  p3

          +---+
          | B |  (garbage collected)
          +---+

O(1) -- no shifting!
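// Java-style pseudocode; assumes Node<E> implements Position<E>
void addAfter(Position<E> p, E element) {
    Node<E> node = (Node<E>) p;
    // the constructor sets newNode.prev = node and newNode.next = node.next
    Node<E> newNode = new Node<>(element, node, node.next);
    node.next.prev = newNode;   // old successor points back to the new node
    node.next = newNode;        // p points forward to the new node
    size++;
}

E remove(Position<E> p) {
    Node<E> node = (Node<E>) p;
    node.prev.next = node.next;   // predecessor skips over the removed node
    node.next.prev = node.prev;   // successor skips back over it
    size--;
    return node.element;
}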

Key Idea

All positional insert/remove operations are O(1) because we only rewire a constant number of pointers. No elements are shifted. This is the fundamental advantage over ArrayList.
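A compilable version of these operations might look like the sketch below. It is one possible design, not the lecture's reference code: the class name NodeList, the validating helper node(), and the choice to null out a removed node's links are all assumptions.

```java
interface Position<E> {
    E getElement();
}

class NodeList<E> {
    private static class Node<E> implements Position<E> {
        E element;
        Node<E> prev, next;
        Node(E element, Node<E> prev, Node<E> next) {
            this.element = element; this.prev = prev; this.next = next;
        }
        public E getElement() { return element; }
    }

    private final Node<E> header = new Node<>(null, null, null);     // sentinel
    private final Node<E> trailer = new Node<>(null, header, null);  // sentinel
    private int size = 0;

    NodeList() { header.next = trailer; }

    int size() { return size; }
    Position<E> first() { return size == 0 ? null : header.next; }

    Position<E> after(Position<E> p) {
        Node<E> n = node(p).next;
        return n == trailer ? null : n;
    }

    Position<E> addLast(E e) { return insertBetween(e, trailer.prev, trailer); }

    Position<E> addAfter(Position<E> p, E e) {
        Node<E> n = node(p);
        return insertBetween(e, n, n.next);
    }

    E remove(Position<E> p) {
        Node<E> n = node(p);
        n.prev.next = n.next;
        n.next.prev = n.prev;
        n.prev = null;   // mark the position as no longer in the list
        n.next = null;
        size--;
        return n.element;
    }

    // Four pointer assignments: two in the constructor, two here.
    private Position<E> insertBetween(E e, Node<E> before, Node<E> afterNode) {
        Node<E> n = new Node<>(e, before, afterNode);
        before.next = n;
        afterNode.prev = n;
        size++;
        return n;
    }

    @SuppressWarnings("unchecked")
    private Node<E> node(Position<E> p) {
        if (!(p instanceof Node)) throw new IllegalArgumentException("invalid position");
        Node<E> n = (Node<E>) p;
        if (n.next == null) throw new IllegalArgumentException("position is not in the list");
        return n;
    }
}
```

Every method here runs in O(1) time: none of them loops, regardless of how long the list is.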


Iterators

Traversing a collection without knowing its implementation

Iterator Interface

+---------------------+
| Iterator<E>         |
| ------------------- |
| hasNext() : boolean |
| next()    : E       |
+---------------------+

Usage pattern:

Iterator<String> it = list.iterator();
while (it.hasNext()) {
    String s = it.next();
    System.out.println(s);
}

// Or with for-each (syntactic sugar):
for (String s : list) {
    System.out.println(s);
}

How the iterator walks through a list:

+---+---+---+---+---+
| A | B | C | D | E |
+---+---+---+---+---+
  ^ cursor

next() returns "A", advances cursor:

+---+---+---+---+---+
| A | B | C | D | E |
+---+---+---+---+---+
      ^ cursor

hasNext()? YES (cursor != end)

Why Iterators?

Analogy: TV Remote

You press "next channel" without knowing if signals come via cable, satellite, or streaming. The remote is the iterator -- it gives you a uniform way to move through content regardless of the source.

Iterator on ArrayList vs Node List

ArrayList iterator:
    cursor = index (int)
    next() returns data[cursor++]

Node List iterator:
    cursor = node reference
    next() {
        E val = cursor.element;
        cursor = cursor.next;
        return val;
    }

SAME interface, DIFFERENT internals!

Key Idea: Abstraction

Iterators decouple traversal logic from data structure details. Code using iterators works with ANY Iterable collection without modification.
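To make this concrete, the sketch below gives a small linked structure its own iterator and then traverses it with code that only knows about Iterable. Chain and IterationDemo are hypothetical names; the chain is singly linked to keep the example short.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class Chain<E> implements Iterable<E> {
    private static class Node<E> {
        final E element;
        Node<E> next;
        Node(E element) { this.element = element; }
    }

    private Node<E> head, tail;

    void addLast(E e) {
        Node<E> n = new Node<>(e);
        if (head == null) head = n; else tail.next = n;
        tail = n;
    }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private Node<E> cursor = head;   // cursor is a node reference

            @Override public boolean hasNext() { return cursor != null; }

            @Override public E next() {
                if (cursor == null) throw new NoSuchElementException();
                E val = cursor.element;
                cursor = cursor.next;
                return val;
            }
        };
    }
}

class IterationDemo {
    // Works with any Iterable: ArrayList, LinkedList, or the Chain above.
    static String join(Iterable<?> items) {
        StringBuilder sb = new StringBuilder();
        for (Object item : items) sb.append(item);
        return sb.toString();
    }
}
```

Because Chain implements Iterable, the for-each loop in join needs no changes to handle it.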


ArrayList vs Node List -- Comparison

Choosing the right implementation for your workload

Operation                       ArrayList                Node List (Doubly Linked)
-----------------------------   ----------------------   ------------------------------------------
get(i) / Access by index        O(1)                     O(n)
set(i, e)                       O(1)                     O(n)
add(0, e) / Insert at front     O(n)                     O(1)
add(n, e) / Append at end       O(1)* amortized          O(1)
add(i, e) / Insert at middle    O(n)                     O(1) if you have position, O(n) to find it
remove(i) / Delete at middle    O(n)                     O(1) if you have position, O(n) to find it
Memory per element              1 reference              3 references (elem + prev + next)
Cache performance               Excellent (contiguous)   Poor (scattered in memory)

Key Idea

Node List insert/delete is O(1) only if you already have the position. Finding a position by value or index still costs O(n). The advantage comes when you hold positions from prior operations.

Analogy

ArrayList = a numbered bookshelf. Finding book #47 is instant, but inserting a book in the middle means sliding everything over. Node List = a chain of paperclips. Easy to add/remove a clip anywhere, but finding the 47th clip means counting from the start.


Java Collections Framework

How ArrayList and LinkedList fit into the bigger picture

              Iterable<E>
                  |
             Collection<E>
              /         \
         List<E>       Set<E>  ...
         /     \
  ArrayList   LinkedList
   (array)    (doubly-linked list)

Both implement the same List<E> interface!

java.util.ArrayList

  • Backed by Object[] array
  • Default initial capacity: 10
  • Growth factor: roughly 1.5x (not 2x)
  • Implements List, RandomAccess

java.util.LinkedList

  • Doubly-linked list; current OpenJDK versions keep first and last node references rather than header/trailer sentinels
  • Also implements Deque (double-ended queue)
  • No RandomAccess marker -- indexing is O(n)

Common Methods (both share via List interface)

List<String> list = new ArrayList<>();   // or: new LinkedList<>();
list.add("hello");        // append
list.add(0, "world");     // insert at index
list.get(1);              // "hello"
list.set(0, "hi");        // replace
list.remove(0);           // remove by index
list.remove("hello");     // remove by value
list.size();              // number of elements
list.contains("hi");      // search
list.indexOf("hi");       // find index

// Iterate
for (String s : list) { ... }

Warning: LinkedList Indexing

Java's LinkedList.get(i) walks from the head (or tail if i > n/2). Using get(i) in a loop on a LinkedList is O(n^2)! Use an iterator or for-each instead.

Key Idea

Java's LinkedList does NOT expose Position objects. The positional list ADT from lecture is more powerful than java.util.LinkedList because positions let you do O(1) insert/delete at known locations.
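The closest standard-library analogue is a ListIterator, which acts as a cursor and can insert or remove at its current location without an index lookup. The sketch below uses only java.util calls; the class and method names are hypothetical. Unlike a Position, the cursor is fail-fast: it stops being usable once the list is structurally modified through any other route.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class CursorInsertDemo {
    // Walks the list once and inserts "X" right after "B" through the cursor.
    static List<String> insertAfterB() {
        LinkedList<String> list = new LinkedList<>();
        list.add("A");
        list.add("B");
        list.add("C");
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            if (it.next().equals("B")) {
                it.add("X");   // inserted at the cursor, just after "B"
            }
        }
        return list;
    }
}
```

insertAfterB() returns [A, B, X, C]. Finding "B" still costs O(n); only the insertion at the cursor is O(1) on a LinkedList.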


Choosing the Right List

Match the data structure to your workload

Use ArrayList when...

  • Frequent random access by index (read-heavy)
  • Mostly appending to the end
  • Memory efficiency matters (less overhead per element)
  • You need good cache locality (iteration speed)
  • The list size is relatively stable
Good for ArrayList:
- Database result caching
- Lookup tables
- Buffers where you mostly append
- Any read-heavy, write-rare pattern

Rule of Thumb

When in doubt, use ArrayList. It is the better default in nearly all practical scenarios due to cache performance and lower memory overhead.

Use LinkedList / Node List when...

  • Frequent insertion/deletion at both ends (deque pattern)
  • You hold position references and need O(1) insert/delete at those positions
  • Elements are frequently reordered or spliced
  • No random access needed
Good for LinkedList / Node List:
- Implementing undo/redo (position-based)
- LRU cache (move to front on access)
- Music playlist with reordering
- Any insert/delete-heavy pattern at known positions

Beware

In practice, ArrayList often beats LinkedList even for middle insertions on modern hardware, because CPU cache effects dominate. Profile before assuming LinkedList is faster!


Common Pitfalls

Mistakes that bite data structures students

1. ConcurrentModificationException

// WRONG: modifying the list while iterating
for (String s : list) {
    if (s.equals("bad"))
        list.remove(s);   // usually throws ConcurrentModificationException on the next step
}

// RIGHT: use the iterator's remove()
Iterator<String> it = list.iterator();
while (it.hasNext()) {
    if (it.next().equals("bad"))
        it.remove();      // safe!
}

2. IndexOutOfBoundsException

// List has size 5 (indices 0-4)
list.get(5);    // CRASH! Off by one
list.get(-1);   // CRASH! Negative index

// Remember: valid range is [0, size-1]
// add() valid range is [0, size]

3. Forgetting That add() and remove() Shift Elements

// list = [A, B, C, D, E]
// Goal: remove the elements at original indices 1 and 3 (B and D)
list.remove(1);   // removes B
// Now list = [A, C, D, E]
list.remove(3);   // removes E, not D!
// After the first remove, the later indices shifted left by one.

4. O(n^2) Loop on LinkedList

// TERRIBLE on LinkedList: O(n^2)
for (int i = 0; i < list.size(); i++) {
    process(list.get(i));   // get(i) walks from head each time!
}

// GOOD: O(n) with iterator
for (String s : list) {
    process(s);
}

5. Confusing size() with capacity

ArrayList<String> list = new ArrayList<>(100);
// capacity = 100, but size = 0!
list.get(0);   // CRASH! IndexOutOfBoundsException
// The list is EMPTY despite having
// 100 slots in the backing array.

Warning: Remove-While-Iterating

Removing elements while iterating is one of the most common bugs with collections. If you need to remove elements during traversal, use the iterator's own remove() method, or build a separate "to-remove" list first.
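Two safe alternatives are sketched below: Collection.removeIf (available since Java 8) and the two-pass "to-remove" list. The class and method names are hypothetical, and both methods work on a copy so the caller's list is left untouched.

```java
import java.util.ArrayList;
import java.util.List;

class SafeRemovalDemo {
    // One call; the collection handles the traversal and removal itself.
    static List<String> withoutBad(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.removeIf(s -> s.equals("bad"));
        return copy;
    }

    // Two passes: collect first, remove after the loop has finished.
    static List<String> withoutBadTwoPass(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        List<String> toRemove = new ArrayList<>();
        for (String s : copy) {
            if (s.equals("bad")) toRemove.add(s);
        }
        copy.removeAll(toRemove);   // no iteration is in progress here
        return copy;
    }
}
```

For the input [ok, bad, fine, bad], both methods return [ok, fine].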

Key Idea: Think About Indices

When removing multiple elements by index, work backwards (from high to low) so earlier removals don't shift the positions of elements you haven't removed yet.
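A minimal sketch of the backwards rule, assuming the indices arrive sorted in ascending order with no duplicates (BackwardRemovalDemo and removeAllAt are hypothetical names):

```java
import java.util.List;

class BackwardRemovalDemo {
    // Removes the elements at the given indices, highest index first,
    // so each removal leaves the remaining target indices unchanged.
    static <E> void removeAllAt(List<E> list, int[] ascendingIndices) {
        for (int k = ascendingIndices.length - 1; k >= 0; k--) {
            list.remove(ascendingIndices[k]);   // an int argument selects remove(int index)
        }
    }
}
```

Applied to [A, B, C, D, E] with indices 1 and 3, this leaves [A, C, E]: D is removed first, and B is still at index 1 afterwards.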


Summary & Cheat Sheet

Everything on one slide

ArrayList

+---+---+---+---+---+---+
| A | B | C | D |   |   |   contiguous array
+---+---+---+---+---+---+
  0   1   2   3

get/set(i)  : O(1)
add(i, e)   : O(n)   shift right
remove(i)   : O(n)   shift left
addLast(e)  : O(1)*  amortized
size()      : O(1)

Resize: double when full
Shrink: halve when 1/4 full

Node List (Positional)

H <-> [A] <-> [B] <-> [C] <-> T
       p1      p2      p3

addBefore/After(p, e) : O(1)
remove(p)             : O(1)
Access by index       : O(n)

Sentinels: header (H) + trailer (T)

Complexity Quick Reference

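Operation                    ArrayList                       Node List
--------------------------   -----------------------------   ------------------------------
Index access                 O(1)                            O(n)
Insert/delete at ends        O(1)* at back / O(n) at front   O(1)
Insert/delete at position    O(n)                            O(1) with the position in hand
Memory / element             Low                             High
Cache locality               Great                           Poor

* amortized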

Core Takeaways

  • ArrayList: fast random access, slow middle insert/delete, great cache locality
  • Node List: O(1) insert/delete at known positions, O(n) to find a position
  • Positions are stable references; indices are fragile
  • Iterators abstract traversal from structure
  • Doubling + shrink-at-quarter = amortized O(1) appends and removals at the end
  • When in doubt, use ArrayList