Heaps

The Efficient Priority Queue

CS205 Data Structures



What is a Heap?

A complete binary tree with a special ordering property

Two Requirements

  • Complete binary tree -- every level is fully filled except possibly the last, which is filled left to right
  • Heap-order property -- every node satisfies an ordering rule relative to its children

Min-Heap: Parent <= Children

The smallest element is always at the root. Every parent is smaller than (or equal to) both of its children.

Max-Heap: Parent >= Children

The largest element is always at the root. Every parent is larger than (or equal to) both of its children.

Min-Heap Example

        ( 1 )
       /     \
   ( 3 )     ( 5 )
   /   \     /
( 7 ) ( 9 ) ( 6 )

Root = minimum element = 1

Check heap-order:
  1 <= 3 and 1 <= 5   OK
  3 <= 7 and 3 <= 9   OK
  5 <= 6              OK

Key Idea

A heap is not a fully sorted structure. It only guarantees the root is the min (or max). The left child is not necessarily smaller than the right child. This partial ordering is what makes heaps fast.


Heap Properties

The three properties that make heaps work

1. Complete Binary Tree

All levels full except possibly the last. The last level is filled left to right with no gaps.

COMPLETE (valid):            NOT COMPLETE (invalid):

       ( 1 )                        ( 1 )
      /     \                      /     \
  ( 3 )     ( 5 )              ( 3 )     ( 5 )
  /   \                        /             \
( 7 ) ( 9 )                  ( 7 )           ( 9 )
                                    ^ gap here!

2. Heap-Order Property

Min-heap: every parent <= its children
Max-heap: every parent >= its children

This means the root always holds the extreme value (min or max).

3. Height = O(log n)

Because the tree is complete, it is always balanced: no binary tree with n nodes can be shorter. A heap with n nodes has height:

height = floor( log2(n) )

  n = 1   --> height = 0
  n = 3   --> height = 1
  n = 7   --> height = 2
  n = 15  --> height = 3
  n = 31  --> height = 4

Every operation that travels root-to-leaf is O(log n)!
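The height formula can be checked with a few lines of Java. This is only a sketch: the class name HeapHeight and the method name height are made up for illustration and are not part of the course code.

```java
// Hypothetical helper (names are assumptions): height of a heap with n >= 1 nodes.
class HeapHeight {
    static int height(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // For a positive int, 31 - numberOfLeadingZeros(n) equals floor(log2(n)).
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 3, 7, 15, 31, 1_000_000}) {
            System.out.println("n = " + n + " --> height = " + height(n));
        }
    }
}
```

For n = 1,000,000 this gives a height of 19, so a root-to-leaf walk touches about 20 nodes.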

Analogy: A Company Hierarchy

Think of a min-heap like a company where the CEO (root) always has the lowest employee ID number. Every manager's ID is lower than their direct reports. You can always find the CEO instantly -- just look at the top!


Why Heaps are Brilliant

The best of both worlds for priority queue operations

The Priority Queue Problem

We need a data structure that supports:

  • insert(key) -- add a new element
  • removeMin() -- remove and return the smallest element

Implementation    insert      removeMin
Unsorted List     O(1)        O(n)
Sorted List       O(n)        O(1)
Heap              O(log n)    O(log n)

Unsorted List:
  insert    = O(1)   -- just append
  removeMin = O(n)   -- scan entire list
  [7, 2, 9, 1, 5]    -- where is min? must check all!

Sorted List:
  insert    = O(n)   -- find correct position
  removeMin = O(1)   -- it's at the front
  [1, 2, 5, 7, 9]    -- min at front! but insert shifts

Heap:
  insert    = O(log n)   -- add + bubble up
  removeMin = O(log n)   -- swap + bubble down

        ( 1 )
       /     \
   ( 2 )     ( 5 )     -- min at root!
   /   \               -- insert/remove travel height
( 7 ) ( 9 )

Key Idea

A heap gives O(log n) for both operations. With n = 1,000,000, that is ~20 steps instead of 1,000,000. This makes heaps the go-to implementation for priority queues.


Array Representation

No pointers needed -- store a heap in a simple array!

The Tree:                     The Array:

        ( 1 )                 index:   0   1   2   3   4   5
       /     \                       ┌───┬───┬───┬───┬───┬───┐
   ( 3 )     ( 5 )            value: │ 1 │ 3 │ 5 │ 7 │ 9 │ 6 │
   /   \     /                       └───┴───┴───┴───┴───┴───┘
( 7 ) ( 9 ) ( 6 )

How? Read the tree level by level, left to right:
  Level 0: 1        --> arr[0] = 1
  Level 1: 3, 5     --> arr[1] = 3, arr[2] = 5
  Level 2: 7, 9, 6  --> arr[3] = 7, arr[4] = 9, arr[5] = 6

Key Idea: Level-Order = Array Order

Because the tree is complete, there are no gaps when we lay it out level by level. This means we can use a flat array with zero wasted space -- no left/right child pointers needed!

Analogy: Stadium Seating

Imagine filling stadium seats row by row, left to right, no empty seats. Seat number alone tells you which row and position you are in. That is exactly how a heap fits in an array.


Array Indexing Formulas

Navigate the tree using simple arithmetic -- no pointers!

The Three Formulas (0-indexed)

Given a node at index i:

  parent(i)     = (i - 1) / 2
  leftChild(i)  = 2 * i + 1
  rightChild(i) = 2 * i + 2

Example: Node at index 1 (value 3)

  parent(1)     = (1-1)/2 = 0  --> arr[0] = 1
  leftChild(1)  = 2*1+1   = 3  --> arr[3] = 7
  rightChild(1) = 2*1+2   = 4  --> arr[4] = 9

Example: Node at index 4 (value 9)

  parent(4)    = (4-1)/2 = 1  --> arr[1] = 3
  leftChild(4) = 2*4+1   = 9  --> out of bounds (no children!)

Visualized on the Tree

              [0]( 1 )
             /        \
      [1]( 3 )        [2]( 5 )
      /      \          /
[3](7)    [4](9)    [5](6)

Array:
  index:   0   1   2   3   4   5
         ┌───┬───┬───┬───┬───┬───┐
         │ 1 │ 3 │ 5 │ 7 │ 9 │ 6 │
         └───┴───┴───┴───┴───┴───┘
               |
  Node at index 1: parent = [0], left = [3], right = [4]

Watch Out: Integer Division

The parent formula uses integer division (floor). Both left child (index 1) and right child (index 2) map to parent index 0. This is intentional: (1-1)/2 = 0 and (2-1)/2 = 0.
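The three formulas translate directly into Java, where `/` on ints already truncates. The sketch below also uses them to check the heap-order property of an array; the class name HeapIndex and the method isMinHeap are illustrative names, not something defined in these slides.

```java
// Illustrative helpers (names are assumptions) for a 0-indexed array heap.
class HeapIndex {
    static int parent(int i)     { return (i - 1) / 2; } // int division; only used for i >= 1
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }

    // True if every non-root element is >= its parent (min-heap order).
    static boolean isMinHeap(int[] arr) {
        for (int i = 1; i < arr.length; i++) {
            if (arr[i] < arr[parent(i)]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] heap = {1, 3, 5, 7, 9, 6};
        System.out.println(parent(4));                   // 1
        System.out.println(leftChild(1));                // 3
        System.out.println(leftChild(4) < heap.length);  // false: index 4 has no children
        System.out.println(isMinHeap(heap));             // true
    }
}
```

Checking each node against its parent, rather than each parent against its children, avoids any bounds tests for missing children.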


Insert: Upheap (Bubble Up)

Add at the end, then swim the element up to restore heap order

Algorithm

  1. Add the new element at the next available position (end of array = next leaf spot)
  2. Compare with parent
  3. If new element < parent, swap them
  4. Repeat until heap order is restored or we reach the root

// Pseudocode: insert into min-heap
insert(heap, value):
    heap.add(value)            // add at end
    i = heap.size - 1          // index of new elem

    // bubble up
    while i > 0:
        parent = (i - 1) / 2
        if heap[i] < heap[parent]:
            swap(heap[i], heap[parent])
            i = parent
        else:
            break              // heap order OK

Analogy: New Employee

A new employee joins at the bottom of the org chart. If they outrank their manager (smaller key), they swap positions. They keep getting promoted until they meet someone who outranks them -- or they become CEO (root).

Time Complexity:
  Worst case: element bubbles all the way from leaf to root.
  Distance  = height = O(log n)
  Each swap = O(1)
  Total: O(log n)

Key Idea

The completeness property guarantees that adding at the end keeps the tree complete. The bubble-up fixes the heap order. Two properties, two steps, O(log n).
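The two steps (append, then bubble up) might look like this in Java. This is a minimal sketch that assumes the heap is stored in a `List<Integer>`; the class name MinHeapInsert is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch of min-heap insertion on an array-backed list (class name is an assumption).
class MinHeapInsert {
    static void insert(List<Integer> heap, int value) {
        heap.add(value);                 // step 1: next free leaf keeps the tree complete
        int i = heap.size() - 1;
        while (i > 0) {                  // step 2: bubble up until order holds or we hit the root
            int parent = (i - 1) / 2;
            if (heap.get(i) < heap.get(parent)) {
                Collections.swap(heap, i, parent);
                i = parent;
            } else {
                break;                   // parent <= new element: heap order restored
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> heap = new ArrayList<>(Arrays.asList(1, 4, 5, 7, 9, 6));
        insert(heap, 2);
        System.out.println(heap);        // [1, 4, 2, 7, 9, 6, 5]
    }
}
```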


Insert Example: Insert 2 into a Min-Heap

Step-by-step trace showing each swap

Starting heap:
Array: [1, 4, 5, 7, 9, 6]

        ( 1 )
       /     \
   ( 4 )     ( 5 )
   /   \     /
( 7 ) ( 9 ) ( 6 )

Step 1: Add 2 at the end
Array: [1, 4, 5, 7, 9, 6, 2]

        ( 1 )
       /     \
   ( 4 )     ( 5 )
   /   \     /   \
( 7 ) ( 9 ) ( 6 ) ( 2 )   <-- new element at index 6

Compare 2 with parent at index (6-1)/2 = 2 --> parent is 5
2 < 5 --> SWAP!

Step 2: Swap 2 and 5
Array: [1, 4, 2, 7, 9, 6, 5]

        ( 1 )
       /     \
   ( 4 )     ( 2 )        <-- 2 moved up
   /   \     /   \
( 7 ) ( 9 ) ( 6 ) ( 5 )

Compare 2 with parent at index (2-1)/2 = 0 --> parent is 1
2 > 1 --> STOP! Heap order restored.

Result

Final heap: [1, 4, 2, 7, 9, 6, 5]. The element 2 bubbled up one level. Only 1 swap was needed because 2 > 1 (the root). In the worst case, an element could bubble all the way to the root: O(log n) swaps.


RemoveMin: Downheap (Bubble Down)

Remove root, move last to root, then sink it down

Algorithm

  1. Save the root value (this is the min)
  2. Move the last element to the root position
  3. Remove the last position (shrink array)
  4. Bubble down: compare with children, swap with the smaller child
  5. Repeat until heap order is restored or we reach a leaf

// Pseudocode: removeMin (assumes the heap is not empty)
removeMin(heap):
    min = heap[0]
    heap[0] = heap[last]       // move last to root
    heap.removeLast()

    // bubble down
    i = 0
    while hasLeftChild(i):
        smallest = i
        if heap[left(i)] < heap[smallest]:
            smallest = left(i)
        if hasRightChild(i) and heap[right(i)] < heap[smallest]:
            smallest = right(i)
        if smallest != i:
            swap(heap[i], heap[smallest])
            i = smallest
        else:
            break
    return min

Why Swap with the Smaller Child?

If we swapped with the larger child, that child would become the parent of the smaller child -- violating heap order! Always pick the smaller child (in a min-heap) to maintain the invariant.

Why not just remove root directly?

      ( 1 )
     /     \            Remove 1... now what?
  ( 3 )   ( 5 )         The tree has a hole!
  /   \
( 7 ) ( 9 )

Moving the last element to the root keeps the tree COMPLETE.
Then bubble-down fixes ORDER.

Key Idea

RemoveMin maintains both heap properties: moving the last element to root preserves completeness, and bubble-down restores heap order. Time: O(log n).
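A Java version of removeMin could be sketched as follows, again assuming a `List<Integer>` as storage. The class name MinHeapRemove is hypothetical, and the empty-heap check is an addition that the pseudocode leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

// Sketch of removeMin on an array-backed min-heap (class name is an assumption).
class MinHeapRemove {
    static int removeMin(List<Integer> heap) {
        if (heap.isEmpty()) {
            throw new NoSuchElementException("heap is empty");
        }
        int min = heap.get(0);
        int last = heap.remove(heap.size() - 1);   // shrink the array by one
        if (heap.isEmpty()) {
            return min;                            // the root was the only element
        }
        heap.set(0, last);                         // last element takes the root's place

        int i = 0;
        int n = heap.size();
        while (2 * i + 1 < n) {                    // while node i has a left child
            int left = 2 * i + 1;
            int right = left + 1;
            int smallest = i;
            if (heap.get(left) < heap.get(smallest)) smallest = left;
            if (right < n && heap.get(right) < heap.get(smallest)) smallest = right;
            if (smallest == i) break;              // both children are >= node i
            Collections.swap(heap, i, smallest);   // swap with the smaller child
            i = smallest;
        }
        return min;
    }

    public static void main(String[] args) {
        List<Integer> heap = new ArrayList<>(Arrays.asList(1, 3, 5, 7, 9, 6));
        System.out.println(removeMin(heap));       // 1
        System.out.println(heap);                  // [3, 6, 5, 7, 9]
    }
}
```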


RemoveMin Example

Remove the minimum from a min-heap, step by step

Starting heap:
Array: [1, 3, 5, 7, 9, 6]

        ( 1 )             <-- remove this (min)
       /     \
   ( 3 )     ( 5 )
   /   \     /
( 7 ) ( 9 ) ( 6 )

Step 1: Move last to root
Array: [6, 3, 5, 7, 9]

        ( 6 )             <-- last element (6) placed at root
       /     \
   ( 3 )     ( 5 )
   /   \
( 7 ) ( 9 )

Compare 6 with children: left=3, right=5
Smaller child = 3. Is 6 > 3? YES --> SWAP!

Step 2: Swap 6 and 3
Array: [3, 6, 5, 7, 9]

        ( 3 )
       /     \
   ( 6 )     ( 5 )
   /   \
( 7 ) ( 9 )

Compare 6 with children: left=7, right=9
Smaller child = 7. Is 6 > 7? NO --> STOP!

Result

Returned min = 1. Final heap: [3, 6, 5, 7, 9]. Element 6 sank one level and stopped because it was already smaller than both children. Heap order restored.


Building a Heap: Top-Down

Insert elements one at a time -- O(n log n)

Build a min-heap from: [5, 3, 8, 1, 4]

Insert 5:    Insert 3:      Insert 8:      Insert 1:            Insert 4:
             3 < 5, swap!   8 > 3, stop    1 < 5, swap!         4 > 3, stop
                                           then 1 < 3, swap!

  (5)          (3)            (3)             (1)                  (1)
               /              /  \            /  \                 /  \
             (5)            (5)  (8)        (3)  (8)             (3)  (8)
                                            /                    /  \
                                          (5)                  (5)  (4)

Trace the Array

Start: []
+5: [5]
+3: [5, 3]        --> bubble up --> [3, 5]
+8: [3, 5, 8]     (8 > 3, OK)
+1: [3, 5, 8, 1]  --> bubble up:
      [3, 1, 8, 5]   (swap 1,5)
      [1, 3, 8, 5]   (swap 1,3)
+4: [1, 3, 8, 5, 4]  (4 > 3, OK)

Time Complexity: O(n log n)

Each insert is O(log n) in the worst case. Doing n inserts gives O(n log n) total. This works, but we can do better with the bottom-up approach on the next slide!


Building a Heap: Bottom-Up (Heapify)

Start from the last internal node and sift down -- O(n)!

Input array: [5, 3, 8, 1, 4, 2, 7] (just treat it as a tree directly)

Initial tree (NOT a heap):          Last non-leaf = index (n/2)-1 = 2

              [0]( 5 )
             /        \
      [1]( 3 )        [2]( 8 )
      /      \        /      \
[3](1)    [4](4)  [5](2)    [6](7)

Sift down index 2 (value 8):
children: 2, 7 -- smallest child = 2 -- 8 > 2, swap!

          ( 5 )
         /     \
     ( 3 )     ( 2 )
     /   \     /   \
   (1)   (4) (8)   (7)
              ^ 8 moved down

Sift down index 1 (value 3):
children: 1, 4 -- smallest child = 1 -- 3 > 1, swap!

          ( 5 )
         /     \
     ( 1 )     ( 2 )
     /   \     /   \
   (3)   (4) (8)   (7)
    ^ 1 moved up, 3 moved down

Sift down index 0 (value 5):
children: 1, 2 -- smallest child = 1 -- 5 > 1, swap!

          ( 1 )
         /     \
     ( 5 )     ( 2 )
     /   \     /   \
   (3)   (4) (8)   (7)

continue: 5 > 3? YES, swap 5 and 3

          ( 1 )
         /     \
     ( 3 )     ( 2 )
     /   \     /   \
   (5)   (4) (8)   (7)

Key Idea

Process nodes from the last internal node up to the root. Each node sifts down at most to the bottom. Result: a valid min-heap [1, 3, 2, 5, 4, 8, 7] built in O(n) time!
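Bottom-up heapify can be sketched in Java as a sift-down helper plus one loop over the internal nodes. The names BottomUpHeapify, siftDown, and buildMinHeap are illustrative assumptions, not identifiers from the course.

```java
import java.util.Arrays;

// Sketch of bottom-up construction of a min-heap in place (names are assumptions).
class BottomUpHeapify {
    // Sink a[i] until both children (within the first n elements) are >= it.
    static void siftDown(int[] a, int i, int n) {
        while (2 * i + 1 < n) {
            int left = 2 * i + 1;
            int right = left + 1;
            int smallest = i;
            if (a[left] < a[smallest]) smallest = left;
            if (right < n && a[right] < a[smallest]) smallest = right;
            if (smallest == i) return;
            int tmp = a[i];
            a[i] = a[smallest];
            a[smallest] = tmp;
            i = smallest;
        }
    }

    static void buildMinHeap(int[] a) {
        // Leaves are already one-element heaps, so start at the last internal node.
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            siftDown(a, i, a.length);
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 4, 2, 7};
        buildMinHeap(a);
        System.out.println(Arrays.toString(a));   // [1, 3, 2, 5, 4, 8, 7]
    }
}
```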


Why Bottom-Up Heapify is O(n)

Most nodes are near the bottom and barely need to move

The Intuition

Level 0 (root):     1 node      sifts down up to h levels
Level 1:            2 nodes     sift down up to h-1 levels
Level 2:            4 nodes     sift down up to h-2 levels
...
Level h-1:          n/4 nodes   sift down at most 1 level
Level h (leaves):   n/2 nodes   sift down 0 levels (skip!)

Half the nodes are leaves -- they do zero work! A quarter of nodes sift down at most 1 level. Only 1 node (root) sifts down h levels.

The Math

Total work = sum over all levels of (nodes at level k) * (h - k):

\[
\sum_{k=0}^{h} (\text{nodes at level } k)\,(h-k) \;=\; \sum_{k=0}^{h} 2^{k}\,(h-k) \;=\; 2^{h+1} - h - 2 \;=\; O(n)
\]

(NOT n log n!)
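One way to see why the sum is linear, sketched under the assumption of a perfect tree with n = 2^(h+1) - 1 nodes, is to substitute j = h - k (the distance from the bottom level) and use the standard series sum of j/2^j, which equals 2:

```latex
\begin{aligned}
\sum_{k=0}^{h} 2^{k}\,(h-k)
  &= \sum_{j=0}^{h} j \cdot 2^{\,h-j}
   = 2^{h} \sum_{j=0}^{h} \frac{j}{2^{j}} \\
  &< 2^{h} \sum_{j=0}^{\infty} \frac{j}{2^{j}}
   = 2^{h} \cdot 2
   = 2^{h+1}
   = n + 1 .
\end{aligned}
```

So the total number of sift-down steps is below n + 1, which agrees with the exact value 2^(h+1) - h - 2.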

Key Idea

Top-down build = O(n log n) in the worst case because the many elements inserted late (the bottom levels) may each bubble up almost the full height. Bottom-up build = O(n) because the many elements at the bottom sift down only a short distance. The work is concentrated where it is cheapest.


Heap Sort

Build a max-heap, then repeatedly extract the maximum -- O(n log n), in-place!

Algorithm

  1. Build a max-heap from the array using bottom-up heapify -- O(n)
  2. Repeat n-1 times:
    • Swap root (max) with the last unsorted element
    • Shrink the heap size by 1
    • Bubble down the new root to restore heap order

// Pseudocode
heapSort(arr):
    n = arr.length

    // Phase 1: Build max-heap
    buildMaxHeap(arr)                  // O(n)
    heapSize = n

    // Phase 2: Extract max repeatedly
    for i = n-1 down to 1:             // O(n log n)
        swap(arr[0], arr[i])           // max goes to end
        heapSize--
        siftDown(arr, 0, heapSize)     // fix heap within arr[0 .. heapSize-1]

Why Max-Heap for Sorting?

We want to sort in ascending order. By extracting the max and placing it at the end of the array, we fill the sorted portion from right to left. The sorted elements accumulate at the back while the heap shrinks at the front.

┌──────────────┬─────────────┐
│  HEAP part   │ SORTED part │
│ (shrinking)  │  (growing)  │
└──────────────┴─────────────┘

Each step:
  1. Swap max to end of heap
  2. Heap shrinks by 1
  3. Sorted region grows by 1

Analogy

Like a talent show: the winner (max) of each round is retired to the "hall of fame" (sorted region). The remaining contestants re-compete (re-heapify).
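The two phases fit in a short Java sketch. The class name HeapSortDemo and the three-argument siftDown are assumptions made for this example; the third argument is the current heap size, so the sorted tail is never touched.

```java
import java.util.Arrays;

// Sketch of in-place heap sort using a max-heap (names are assumptions).
class HeapSortDemo {
    // Sink a[i] within a[0 .. heapSize-1] until it is >= both children.
    static void siftDown(int[] a, int i, int heapSize) {
        while (2 * i + 1 < heapSize) {
            int left = 2 * i + 1;
            int right = left + 1;
            int largest = i;
            if (a[left] > a[largest]) largest = left;
            if (right < heapSize && a[right] > a[largest]) largest = right;
            if (largest == i) return;
            int tmp = a[i];
            a[i] = a[largest];
            a[largest] = tmp;
            i = largest;
        }
    }

    static void heapSort(int[] a) {
        int n = a.length;
        // Phase 1: bottom-up build of a max-heap.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n);
        }
        // Phase 2: move the max to the end, shrink the heap, restore order.
        for (int end = n - 1; end >= 1; end--) {
            int tmp = a[0];
            a[0] = a[end];
            a[end] = tmp;
            siftDown(a, 0, end);   // heap now occupies a[0 .. end-1]
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 4, 2};
        heapSort(a);
        System.out.println(Arrays.toString(a));   // [1, 2, 3, 4, 5, 8]
    }
}
```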


Heap Sort Trace

Sort [5, 3, 8, 1, 4, 2] using heap sort

Step 0: Build max-heap from [5, 3, 8, 1, 4, 2]
Result: [8, 4, 5, 1, 3, 2]

        ( 8 )
       /     \
   ( 4 )     ( 5 )
   /   \     /
( 1 ) ( 3 ) ( 2 )

Step 1: Swap 8 and 2, sift down
[2, 4, 5, 1, 3 | 8] --> sift: [5, 4, 2, 1, 3 | 8]

      ( 5 )
     /     \
  ( 4 )   ( 2 )         sorted: [8]
  /   \
( 1 ) ( 3 )

Step 2: Swap 5 and 3, sift down
[3, 4, 2, 1 | 5, 8] --> sift: [4, 3, 2, 1 | 5, 8]

      ( 4 )
     /     \
  ( 3 )   ( 2 )         sorted: [5, 8]
  /
( 1 )

Step 3: Swap 4 and 1, sift down
[1, 3, 2 | 4, 5, 8] --> sift: [3, 1, 2 | 4, 5, 8]

    ( 3 )
   /     \              sorted: [4, 5, 8]
( 1 )   ( 2 )

Step 4: Swap 3 and 2, sift down
[2, 1 | 3, 4, 5, 8]     (2 > 1, nothing to sift)

    ( 2 )
   /                    sorted: [3, 4, 5, 8]
( 1 )

Step 5: Swap 2 and 1. DONE!
[1 | 2, 3, 4, 5, 8]     sorted: [1, 2, 3, 4, 5, 8]

Final sorted array: [1, 2, 3, 4, 5, 8]

Min-Heap vs Max-Heap

Same structure, just flip the comparison

Min-Heap

      ( 1 )             parent <= children
     /     \
  ( 3 )   ( 2 )         root = MINIMUM
  /   \
( 5 ) ( 4 )

  • Parent <= children
  • Root = smallest element
  • Used for: priority queues (get min-priority item)
  • removeMin() returns the smallest

Java's PriorityQueue is a min-heap by default.

Max-Heap

      ( 9 )             parent >= children
     /     \
  ( 7 )   ( 8 )         root = MAXIMUM
  /   \
( 3 ) ( 5 )

  • Parent >= children
  • Root = largest element
  • Used for: heap sort (extract max to sort ascending)
  • removeMax() returns the largest

To get a max-heap in Java, use a reversed comparator.

Key Idea

The only difference is the direction of the comparison operator. All algorithms (insert, remove, heapify) are identical in structure -- just swap < with >. Think of them as the same data structure with a configurable comparator.


Java's PriorityQueue

Built-in min-heap -- ready to use

Basic Usage (Min-Heap)

import java.util.PriorityQueue;

PriorityQueue<Integer> pq = new PriorityQueue<>();

pq.add(5);      // insert
pq.add(3);
pq.add(8);
pq.add(1);

pq.peek();      // 1 (min, no remove)
pq.poll();      // 1 (remove min)
pq.poll();      // 3
pq.poll();      // 5
pq.poll();      // 8

pq.size();      // 0
pq.isEmpty();   // true

Key Methods

Method    Description    Time
add(e)    Insert         O(log n)
peek()    View min       O(1)
poll()    Remove min     O(log n)
size()    Count          O(1)

Max-Heap with Reversed Comparator

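import java.util.Collections;
import java.util.PriorityQueue;

// Option 1: Collections.reverseOrder()
PriorityQueue<Integer> maxPQ = new PriorityQueue<>(
    Collections.reverseOrder()
);

// Option 2: Lambda comparator (an alternative, so it gets its own name here).
// Integer.compare(b, a) avoids the overflow that b - a can hit for extreme values.
PriorityQueue<Integer> maxPQ2 = new PriorityQueue<>(
    (a, b) -> Integer.compare(b, a)
);

maxPQ.add(5);
maxPQ.add(3);
maxPQ.add(8);

maxPQ.poll();   // 8 (max!)
maxPQ.poll();   // 5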

Custom Objects

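// Priority queue of tasks by priority (smallest priority value first).
// Integer.compare is safer than t1.priority - t2.priority, which can overflow.
PriorityQueue<Task> taskPQ = new PriorityQueue<>(
    (t1, t2) -> Integer.compare(t1.priority, t2.priority)
);

// Or implement Comparable<Task>
// in your Task class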
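The Comparable route might look like the sketch below. The slides only mention a Task class with a priority field; the name field, the constructor, and the convention that a smaller number means more urgent are assumptions made for this example.

```java
import java.util.PriorityQueue;

// Hypothetical Task class: only "priority" comes from the slides, the rest is assumed.
class Task implements Comparable<Task> {
    final String name;      // assumed field, for readable output
    final int priority;     // assumption: smaller value = more urgent

    Task(String name, int priority) {
        this.name = name;
        this.priority = priority;
    }

    @Override
    public int compareTo(Task other) {
        // Natural ordering by priority; PriorityQueue uses it when no comparator is given.
        return Integer.compare(this.priority, other.priority);
    }
}

class TaskQueueDemo {
    public static void main(String[] args) {
        PriorityQueue<Task> taskPQ = new PriorityQueue<>();
        taskPQ.add(new Task("write report", 3));
        taskPQ.add(new Task("fix outage", 1));
        taskPQ.add(new Task("review PR", 2));

        while (!taskPQ.isEmpty()) {
            System.out.println(taskPQ.poll().name);
        }
        // prints: fix outage, review PR, write report (one per line)
    }
}
```

With Comparable, the ordering lives in the class itself; a comparator passed to the constructor is the better fit when different queues need different orderings of the same type.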

Common Mistake

Java's PriorityQueue is a min-heap by default. If you need the largest element first, you must provide a reversed comparator!


Application: Top-K Elements

Find the k largest elements in a stream using a min-heap of size k

The Trick: Use a Min-Heap of Size k

To find the k largest, maintain a min-heap of size k. The root is always the smallest of the k largest seen so far. It acts as a gatekeeper.

Algorithm:
  1. Insert first k elements
  2. For each remaining element:
     - If element > heap root (min): remove root, insert element
     - Otherwise: skip it
  3. Heap contains k largest!

// Java implementation
PriorityQueue<Integer> minHeap = new PriorityQueue<>();

for (int num : stream) {
    if (minHeap.size() < k) {
        minHeap.add(num);
    } else if (num > minHeap.peek()) {
        minHeap.poll();
        minHeap.add(num);
    }
}
// minHeap has k largest elements

Example: Top 3 from [4,1,7,3,9,2,8]

k = 3, stream: 4, 1, 7, 3, 9, 2, 8

Process 4: heap = [4]
Process 1: heap = [1, 4]
Process 7: heap = [1, 4, 7]   (size=k)
Process 3: 3 > root(1)? YES   remove 1, add 3   heap = [3, 4, 7]
Process 9: 9 > root(3)? YES   remove 3, add 9   heap = [4, 9, 7]
Process 2: 2 > root(4)? NO, skip
Process 8: 8 > root(4)? YES   remove 4, add 8   heap = [7, 9, 8]

Top 3 = {7, 8, 9}

Time: O(n log k)

We process n elements, and each heap operation is O(log k). Since k is typically much smaller than n, this is far better than sorting the whole array at O(n log n). For k = 10 and n = 1 billion, each heap operation travels at most about log2(10) ≈ 3.3 levels, versus log2(n) ≈ 30 per element for a full sort: roughly 3 billion steps in total instead of ~30 billion!


Application: Merge K Sorted Lists

Use a min-heap of size k to efficiently merge -- O(n log k)

The Problem

Given k sorted lists with a total of n elements, merge them into one sorted list.

List 1: [1, 4, 7]
List 2: [2, 5, 8]
List 3: [3, 6, 9]

Merged: [1, 2, 3, 4, 5, 6, 7, 8, 9]

Algorithm

  1. Insert the first element of each list into a min-heap (with a tag for which list it came from)
  2. Extract min from heap, add to result
  3. Insert the next element from that same list
  4. Repeat until heap is empty

Trace

Heap initially: {1, 2, 3} (one from each list)

Extract 1 (from L1), insert 4 --> heap: {2, 3, 4}   out: [1]
Extract 2 (from L2), insert 5 --> heap: {3, 4, 5}   out: [1,2]
Extract 3 (from L3), insert 6 --> heap: {4, 5, 6}   out: [1,2,3]
Extract 4 (from L1), insert 7 --> heap: {5, 6, 7}   out: [1,2,3,4]
... and so on until all done.

Time: O(n log k)

The heap always has at most k elements (one per list). Each of the n total elements is inserted and extracted once. Each operation costs O(log k). Total: O(n log k).
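The merge can be sketched in Java with `java.util.PriorityQueue`. The class name MergeKSorted and the choice to store each heap entry as an `int[]` of {value, list index, position in list} are assumptions for this example; the list index plays the role of the "tag" from step 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of a k-way merge with a min-heap (names and entry layout are assumptions).
class MergeKSorted {
    static List<Integer> merge(List<List<Integer>> lists) {
        // Each entry is {value, listIndex, positionInList}, ordered by value.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));

        // Seed the heap with the first element of every non-empty list.
        for (int li = 0; li < lists.size(); li++) {
            if (!lists.get(li).isEmpty()) {
                heap.add(new int[] {lists.get(li).get(0), li, 0});
            }
        }

        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] entry = heap.poll();              // smallest remaining value
            out.add(entry[0]);
            List<Integer> source = lists.get(entry[1]);
            int next = entry[2] + 1;
            if (next < source.size()) {             // refill from the same list
                heap.add(new int[] {source.get(next), entry[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = Arrays.asList(
            Arrays.asList(1, 4, 7),
            Arrays.asList(2, 5, 8),
            Arrays.asList(3, 6, 9));
        System.out.println(merge(lists));   // [1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```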

Analogy

Like a tournament bracket for k runners. At each step, you pick the fastest runner (min), record their time, and their next teammate enters the race. The heap keeps the bracket organized.


Summary & Cheat Sheet

Everything you need to know about heaps in one slide

Core Operations

Operation            Time          How
peek / findMin       O(1)          Return root
insert               O(log n)      Add at end + bubble up
removeMin            O(log n)      Swap root/last + bubble down
build (top-down)     O(n log n)    Insert one by one
build (bottom-up)    O(n)          Heapify from last parent
heap sort            O(n log n)    Build max-heap + extract

Array Formulas (0-indexed)

  parent(i)     = (i - 1) / 2
  leftChild(i)  = 2 * i + 1
  rightChild(i) = 2 * i + 2

When to Use a Heap

  • Priority Queue -- process items by priority
  • Top-K elements -- min-heap of size k, O(n log k)
  • Merge K sorted lists -- min-heap of size k
  • Heap Sort -- O(n log n), in-place, not stable
  • Median finding -- two heaps (max + min)
  • Dijkstra's algorithm -- shortest path

Heap Sort Properties

Time:   O(n log n)  (always)
Space:  O(1)        (in-place!)
Stable: NO

Compare: merge sort is stable but uses O(n) extra space. Quicksort is O(n^2) in the worst case but usually faster in practice.

The Big Picture

A heap is a partially ordered, complete binary tree stored in an array. It trades full sorting for fast access to the extreme element. This trade-off is what makes priority queues, heap sort, and many graph algorithms efficient.
