DFS - Depth-First Search

Go Deep, Then Backtrack

[Title diagram: a small directed graph in which DFS follows A --> B --> D to a dead end, then backtracks to explore C, E, F, and G.]

CS205 Data Structures


What is DFS?

Depth-First Search explores a graph by going as deep as possible along each branch before backtracking.

[Figure: an undirected graph on the six vertices A to F, drawn next to its DFS order.]

DFS order from A: A (1), B (2), D (3), F (4), E (5), then backtrack to A and visit C (6).

Analogy: Maze Explorer

Imagine exploring a maze: you always walk forward, taking the first available turn. When you hit a dead end, you backtrack to the last intersection and try a different path.

Key Idea

DFS is driven by a stack (either the call stack via recursion, or an explicit stack). It visits vertices in a deep-first manner rather than level-by-level.


DFS Uses a Stack (or Recursion)

Recursion IS a Stack

Call Stack during DFS:

  DFS(F)   <-- top of stack
  DFS(E)
  DFS(D)
  DFS(B)
  DFS(A)   <-- bottom (first call)

When DFS(F) returns, we "pop" back to DFS(E), then DFS(D)... This IS backtracking!

Key Insight

The recursive version uses the program's call stack implicitly. The iterative version uses an explicit stack data structure. Both do the same thing.

Two Equivalent Approaches

Aspect   Recursive                        Iterative
Stack    Call stack (implicit)            Explicit Stack object
Push     Recursive call                   stack.push(v)
Pop      Function return                  stack.pop()
Risk     Stack overflow on deep graphs    Uses heap memory

Warning

Recursive DFS can cause a stack overflow on very deep graphs (e.g., a path of 100,000 nodes). The iterative version avoids this.


DFS Algorithm (Recursive)

Pseudocode

function DFS(G):
    for each vertex v in G:
        visited[v] = false
    for each vertex v in G:
        if not visited[v]:
            DFS-Visit(v)

function DFS-Visit(u):
    visited[u] = true
    // process vertex u here
    for each neighbor v of u:
        if not visited[v]:
            DFS-Visit(v)

How It Works

  • Mark the current vertex as visited
  • Recurse on each unvisited neighbor
  • When all neighbors are visited, the function returns (backtracks)
  • Outer loop handles disconnected graphs

DFS-Visit(A)
├── DFS-Visit(B)
│   ├── DFS-Visit(D)
│   │   └── (no unvisited neighbors)
│   └── DFS-Visit(E)
│       └── DFS-Visit(F)
└── DFS-Visit(C)
    └── (no unvisited neighbors)

Key Idea

The recursion tree of DFS-Visit calls IS the DFS tree of the graph.
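
The recursive pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the dictionary-of-lists graph and the names `dfs` and `dfs_visit` are assumptions, and the adjacency lists were chosen so that the calls reproduce the recursion tree shown on this slide.

```python
def dfs(graph):
    """Return vertices in DFS visit order, covering every component."""
    visited = set()
    order = []

    def dfs_visit(u):
        visited.add(u)
        order.append(u)          # the "process vertex u" step
        for v in graph[u]:
            if v not in visited:
                dfs_visit(v)     # go deeper before trying the next neighbor

    for v in graph:              # outer loop handles disconnected graphs
        if v not in visited:
            dfs_visit(v)
    return order


# Hypothetical undirected graph matching the recursion tree above.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D", "E"],
    "C": ["A"],
    "D": ["B"],
    "E": ["B", "F"],
    "F": ["E"],
}
print(dfs(graph))  # ['A', 'B', 'D', 'E', 'F', 'C']
```

The order in which `dfs_visit` is entered is exactly a preorder walk of the DFS tree.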


DFS Algorithm (Iterative)

Pseudocode

function DFS-Iterative(G, source):
    let S = new Stack()
    let visited = new Set()
    S.push(source)
    while S is not empty:
        u = S.pop()
        if u not in visited:
            visited.add(u)
            // process vertex u here
            for each neighbor v of u:
                if v not in visited:
                    S.push(v)

Step-by-Step Logic

  1. Push the source onto the stack
  2. Pop a vertex from the stack
  3. If it hasn't been visited yet, mark it visited
  4. Push all unvisited neighbors onto the stack
  5. Repeat until the stack is empty

Warning: Subtle Difference

The iterative version may visit vertices in a different order than the recursive version, because it pushes ALL neighbors at once. The recursive version visits one neighbor completely before even looking at the next.
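
A small Python sketch of the iterative pseudocode makes the ordering difference concrete. The function name `dfs_iterative` and the `reverse_push` flag are illustrative assumptions; the graph is the seven-vertex example used in the step-by-step slides. Pushing neighbors in reverse order leaves the first neighbor on top of the stack, which reproduces the recursive visit order for this lazy "mark on pop" variant.

```python
def dfs_iterative(graph, source, reverse_push=True):
    """Stack-based DFS; returns vertices in the order they are visited."""
    visited = set()
    order = []
    stack = [source]
    while stack:
        u = stack.pop()
        if u in visited:
            continue                          # stale duplicate pushed earlier
        visited.add(u)
        order.append(u)
        neighbors = graph[u]
        if reverse_push:
            neighbors = reversed(neighbors)   # first neighbor ends up on top
        for v in neighbors:
            if v not in visited:
                stack.append(v)
    return order


graph = {
    "A": ["B", "C", "D"], "B": ["A", "E"], "C": ["A", "F"], "D": ["A", "E"],
    "E": ["B", "D", "G"], "F": ["C"], "G": ["E"],
}
print(dfs_iterative(graph, "A"))                      # ['A', 'B', 'E', 'D', 'G', 'C', 'F']
print(dfs_iterative(graph, "A", reverse_push=False))  # ['A', 'D', 'E', 'G', 'B', 'C', 'F']
```

Both outputs are valid DFS orders; only the first matches what the recursive version produces with alphabetical neighbor lists.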

BFS vs DFS: One Data Structure Apart

Replace the Stack with a Queue, and you get BFS! That is the only structural difference.
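
The swap can be illustrated with a single loop whose only variable part is which end of the frontier is removed. This is a sketch under assumptions: `traverse` and its `mode` parameter are hypothetical names, and `collections.deque` stands in for both the stack and the queue.

```python
from collections import deque


def traverse(graph, source, mode="dfs"):
    """Same loop for both searches; only the end we remove from differs."""
    frontier = deque([source])
    visited = set()
    order = []
    while frontier:
        # DFS takes the newest entry (stack); BFS takes the oldest (queue).
        u = frontier.pop() if mode == "dfs" else frontier.popleft()
        if u in visited:
            continue
        visited.add(u)
        order.append(u)
        for v in graph[u]:
            if v not in visited:
                frontier.append(v)
    return order


graph = {
    "A": ["B", "C", "D"], "B": ["A", "E"], "C": ["A", "F"], "D": ["A", "E"],
    "E": ["B", "D", "G"], "F": ["C"], "G": ["E"],
}
print(traverse(graph, "A", "bfs"))  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(traverse(graph, "A", "dfs"))  # ['A', 'D', 'E', 'G', 'B', 'C', 'F']
```

The DFS order here starts with D rather than B because neighbors are pushed in list order, so the last neighbor is popped first.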


DFS Step-by-Step Example

Graph with 7 vertices. We start DFS at vertex A. Neighbors processed in alphabetical order.

Our Graph (undirected):

      (A)-----(B)
      / \       \
     /   \       \
   (C)   (D)----(E)
    |             |
    |             |
   (F)           (G)

Adjacency Lists:
  A: [B, C, D]
  B: [A, E]
  C: [A, F]
  D: [A, E]
  E: [B, D, G]
  F: [C]
  G: [E]

Step 1: Visit A

  Visited: {A}
  Stack:   [A calls B, C, D]
  Path:    A
  Action:  Visit A, recurse on first unvisited neighbor: B

Step 2: Visit B

  Visited: {A, B}
  Stack:   [..., B calls A(skip), E]
  Path:    A → B
  Action:  Visit B, A already visited, recurse on E

Step 3: Visit E

  Visited: {A, B, E}
  Stack:   [..., E calls B(skip), D, G]
  Path:    A → B → E
  Action:  Visit E, B visited, recurse on D


DFS Step-by-Step (continued)

Graph state after Step 3: A, B, and E are visited; C, D, F, and G are still unvisited.

Step 4: Visit D

  Visited: {A, B, E, D}
  Stack:   [..., D calls A(skip), E(skip)]
  Path:    A → B → E → D
  Action:  Visit D, all neighbors visited. BACKTRACK to E!

Step 5: Visit G

  Visited: {A, B, E, D, G}
  Stack:   [..., G calls E(skip)]
  Path:    A → B → E → G
  Action:  Back at E, next unvisited neighbor is G. Visit G. G has no unvisited neighbors. BACKTRACK to E, then B, then A!

Step 6: Visit C

  Visited: {A, B, E, D, G, C}
  Stack:   [..., A calls C]
  Path:    A → C
  Action:  Back at A, next unvisited neighbor is C. Visit C. Recurse on F.


DFS Step-by-Step (Final)

Step 7: Visit F

  Visited: {A, B, E, D, G, C, F}
  Stack:   [..., F calls C(skip)]
  Path:    A → C → F
  Action:  Visit F. Only neighbor C is already visited. BACKTRACK to C, then A. A has no more unvisited neighbors. DONE!

DFS Visit Order

A → B → E → D → G → C → F

Notice how DFS goes deep (A→B→E→D) before backtracking. It does NOT visit level-by-level like BFS would.

Final DFS Tree

DFS Tree:

        (A)
       /   \
     (B)   (C)
      |      \
     (E)     (F)
     /  \
   (D)  (G)

Tree edges (solid): A-B, B-E, E-D, E-G, A-C, C-F
Non-tree edge (not shown): A-D (when D examines A, A is already visited and is not D's parent, so D-A is a back edge)

Complete graph with edge types: the six tree edges above plus the single back edge A-D account for all seven edges of the graph.
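
As a rough sketch, the tree and the leftover edges can be recorded during the traversal itself. The names `dfs_tree`, `parent`, and `non_tree` are illustrative assumptions, and the code assumes a simple undirected graph (no parallel edges) given as the adjacency lists from the step-by-step example.

```python
def dfs_tree(graph, root):
    """Return (parent map, set of non-tree edges) for the DFS from root."""
    parent = {root: None}
    non_tree = set()

    def visit(u):
        for v in graph[u]:
            if v not in parent:
                parent[v] = u                      # tree edge u-v
                visit(v)
            elif v != parent[u]:
                non_tree.add(frozenset((u, v)))    # undirected: store once

    visit(root)
    return parent, non_tree


graph = {
    "A": ["B", "C", "D"], "B": ["A", "E"], "C": ["A", "F"], "D": ["A", "E"],
    "E": ["B", "D", "G"], "F": ["C"], "G": ["E"],
}
parent, non_tree = dfs_tree(graph, "A")
tree_edges = [(p, c) for c, p in parent.items() if p is not None]
print(tree_edges)  # [('A', 'B'), ('B', 'E'), ('E', 'D'), ('E', 'G'), ('A', 'C'), ('C', 'F')]
print(non_tree)    # {frozenset({'A', 'D'})}
```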

The DFS Tree & Edge Classification

When DFS traverses a directed graph, every edge falls into one of four categories:

Directed graph (edges): A→B, A→C, B→D, B→E, C→F, D→E, E→D, F→A

DFS tree from A (B explores E before D):

  A ━ B ━ E ━ D
  A ━ C ━ F

Edge Classification:
━━━━━━━━━━━━━━━━━━━
  A→B : Tree edge
  B→E : Tree edge
  E→D : Tree edge
  A→C : Tree edge
  D→E : Back edge (to ancestor)
  B→D : Forward edge (to descendant)
  C→F : Tree edge
  F→A : Back edge (to ancestor)

Edge Type   Goes To                            Meaning
Tree        Unvisited vertex                   Part of DFS tree
Back        Ancestor in DFS tree               Indicates a cycle!
Forward     Descendant in DFS tree             Shortcut down
Cross       Neither ancestor nor descendant    Between branches

Important for Undirected Graphs

In undirected graphs, there are only tree edges and back edges. Forward and cross edges cannot exist.

Key Idea

Back edge = cycle. This is the foundation of DFS-based cycle detection.


Discovery and Finish Times

DFS assigns two timestamps to each vertex: d[v] (discovery) and f[v] (finish).

DFS with timestamps:

Graph:
  A → B → D
  |   |
  v   v
  C   E

Vertex   d[v]   f[v]
━━━━━━━━━━━━━━━━━━━━
A        1      10
B        2      7
D        3      4
E        5      6
C        8      9

Timeline:
  time  1   2   3   4   5   6   7   8   9   10
        A[  B[  D[  D]  E[  E]  B]  C[  C]  A]

Parenthesis view: ( A ( B ( D ) ( E ) ) ( C ) )

Parenthesis Theorem

For any two vertices u and v, exactly one of these is true:

  • [d[u], f[u]] and [d[v], f[v]] are entirely disjoint (neither is ancestor of other)
  • One interval completely contains the other (ancestor-descendant relationship)

They never partially overlap. Like properly nested parentheses!
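
A minimal Python sketch of the timestamping, using the five-vertex graph from this slide. The names `dfs_timestamps` and `nested_or_disjoint` are assumptions made for illustration; the second function simply checks the two cases of the theorem for one pair of vertices.

```python
def dfs_timestamps(graph):
    """Return discovery and finish times (d, f) for every vertex."""
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time              # interval opens
        for v in graph[u]:
            if v not in d:
                visit(v)
        time += 1
        f[u] = time              # interval closes

    for v in graph:
        if v not in d:
            visit(v)
    return d, f


def nested_or_disjoint(d, f, u, v):
    """True if the intervals of u and v are disjoint or one contains the other."""
    disjoint = f[u] < d[v] or f[v] < d[u]
    nested = (d[u] < d[v] and f[v] < f[u]) or (d[v] < d[u] and f[u] < f[v])
    return disjoint or nested


graph = {"A": ["B", "C"], "B": ["D", "E"], "C": [], "D": [], "E": []}
d, f = dfs_timestamps(graph)
print(d)  # {'A': 1, 'B': 2, 'D': 3, 'E': 5, 'C': 8}
print(f)  # {'D': 4, 'E': 6, 'B': 7, 'C': 9, 'A': 10}
```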

Analogy: Nested Boxes

Think of each vertex as a box that opens at time d[v] and closes at f[v]. Boxes are either completely inside one another or completely separate -- they never partially overlap.

Edge Classification via Timestamps

  For an edge u → v:

  • Tree/Forward: d[u] < d[v] < f[v] < f[u]
  • Back: d[v] < d[u] < f[u] < f[v]
  • Cross: d[v] < f[v] < d[u] < f[u]

BFS vs DFS Comparison

Same Graph, Different Orders

Graph:

      (A)-----(B)
      / \       \
     /   \       \
   (C)   (D)----(E)
    |             |
   (F)           (G)

BFS from A (queue):         DFS from A (stack):
━━━━━━━━━━━━━━━━━━━         ━━━━━━━━━━━━━━━━━━━
Level 0: A                  A
Level 1: B, C, D            B
Level 2: E, F               E
Level 3: G                  D G
                            C F
Order: A B C D E F G        Order: A B E D G C F

BFS Tree:                   DFS Tree:

       (A)                        (A)
      / | \                      /   \
   (B) (C) (D)                 (B)   (C)
    |   |    \                  |      \
   (E) (F)   (E)x              (E)     (F)
    |                          /  \
   (G)                       (D)  (G)

(E)x: E was already discovered via B, so D-E is not a BFS tree edge.

Property           BFS                                                      DFS
Data structure     Queue                                                    Stack
Exploration        Level by level                                           Deep first
Shortest path?     Yes (unweighted)                                         No
Path existence?    Yes                                                      Yes
Cycle detection    Possible                                                 Natural
Topological sort   Not directly (Kahn's algorithm adds in-degree counts)    Yes (reverse finish order)
Time complexity    O(V + E)                                                 O(V + E)
Space complexity   O(V)                                                     O(V)

When to Use Which?

  • BFS: Shortest path, closest neighbors, level-order
  • DFS: Cycle detection, topological sort, connected components, path finding, maze solving

Edge Classification in Directed Graphs

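Directed Graph:

  (A)→(B)→(C)
   |  ↗    |
   v /     v
  (D)→(E)→(F)
       ↓
      (A)   ← back edge to ancestor!

Edges: A→B, A→D, B→C, C→F, D→B, D→E, E→A, E→F

DFS from A (d/f times, neighbors in alphabetical order):
━━━━━━━━━━━━━━━━━━━━━━
  A: 1/12   B: 2/7   C: 3/6   F: 4/5   D: 8/11   E: 9/10

DFS Tree:
  A (1/12)
  ┣━ B (2/7)
  ┃  ┗━ C (3/6)
  ┃     ┗━ F (4/5)
  ┗━ D (8/11)
     ┗━ E (9/10)

Non-tree edges:
  Edge D→B: cross (B already finished, different branch)
  Edge E→F: cross (F already finished, different branch)
  Edge E→A: back  (A is an ancestor, still open)

A→D is a tree edge: D is still undiscovered when A examines it, so this graph has no forward edge.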

How to Classify During DFS

Use vertex colors (states):

Color    State          Meaning
WHITE    Undiscovered   Not yet visited
GRAY     Discovered     In progress (on stack)
BLACK    Finished       Fully explored

// When exploring edge u → v:
if color[v] == WHITE:       // Tree edge
if color[v] == GRAY:        // Back edge (v is ancestor, CYCLE!)
if color[v] == BLACK:
    if d[u] < d[v]:         // Forward edge
    else:                   // Cross edge

The Critical Rule

Edge to a GRAY vertex = Back edge = CYCLE! A gray vertex is still being processed (it's an ancestor on the current DFS path).
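
The color rules can be sketched in Python as below. This is an illustration under assumptions: `classify_edges` is a hypothetical name, and the graph is the edge list from the classification example on the earlier "DFS Tree & Edge Classification" slide, with B's neighbors ordered E then D so that the result matches that slide. The extra edge C→D added at the end is not in the slides; it is included only to show a cross edge.

```python
WHITE, GRAY, BLACK = 0, 1, 2


def classify_edges(graph):
    """Label every directed edge as tree, back, forward, or cross."""
    color = {v: WHITE for v in graph}
    d = {}                       # discovery times
    time = 0
    kinds = {}

    def visit(u):
        nonlocal time
        color[u] = GRAY
        time += 1
        d[u] = time
        for v in graph[u]:
            if color[v] == WHITE:
                kinds[(u, v)] = "tree"
                visit(v)
            elif color[v] == GRAY:
                kinds[(u, v)] = "back"       # v is an ancestor: cycle
            elif d[u] < d[v]:
                kinds[(u, v)] = "forward"    # v is a finished descendant
            else:
                kinds[(u, v)] = "cross"      # v finished in another branch
        color[u] = BLACK

    for v in graph:
        if color[v] == WHITE:
            visit(v)
    return kinds


graph = {
    "A": ["B", "C"], "B": ["E", "D"], "C": ["F"],
    "D": ["E"], "E": ["D"], "F": ["A"],
}
kinds = classify_edges(graph)
print(kinds[("B", "D")], kinds[("D", "E")], kinds[("F", "A")])  # forward back back

graph["C"].append("D")           # hypothetical extra edge, not in the slides
print(classify_edges(graph)[("C", "D")])  # cross
```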


Cycle Detection with DFS

Directed Graph: 3-Color Method

function hasCycle(G):
    for each vertex v in G:
        color[v] = WHITE
    for each vertex v in G:
        if color[v] == WHITE:
            if dfsDetect(v):
                return true
    return false

function dfsDetect(u):
    color[u] = GRAY              // in progress
    for each neighbor v of u:
        if color[v] == GRAY:
            return true          // CYCLE!
        if color[v] == WHITE:
            if dfsDetect(v):
                return true
    color[u] = BLACK             // done
    return false

Why Back Edge = Cycle?

If edge u → v is a back edge:
  v is an ancestor of u in the DFS tree

  Path:  v ~~> ... ~~> u    (tree path)
         ^             |
         |             |
         +<-- u → v ---+    (back edge)

The tree path v ~~> u plus the back edge u → v forms a CYCLE!
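
A minimal Python sketch of the 3-color method, assuming a dictionary-of-lists directed graph; `has_cycle_directed` is an illustrative name. The second test graph is a diamond, which reaches D twice without containing a cycle: the second arrival finds D already BLACK, not GRAY.

```python
WHITE, GRAY, BLACK = 0, 1, 2


def has_cycle_directed(graph):
    """Return True if the directed graph contains a cycle."""
    color = {v: WHITE for v in graph}

    def visit(u):
        color[u] = GRAY                  # u is on the current DFS path
        for v in graph[u]:
            if color[v] == GRAY:
                return True              # back edge to an ancestor
            if color[v] == WHITE and visit(v):
                return True
        color[u] = BLACK                 # fully explored
        return False

    return any(color[v] == WHITE and visit(v) for v in graph)


cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(has_cycle_directed(cyclic))   # True
print(has_cycle_directed(diamond))  # False
```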

Undirected Graph: Simpler

function dfsDetect(u, parent):
    visited[u] = true
    for each neighbor v of u:
        if not visited[v]:
            if dfsDetect(v, u):
                return true
        else if v != parent:
            return true          // CYCLE!
    return false

Undirected Caveat

In undirected graphs, every edge appears twice (u-v and v-u). We must exclude the parent edge when checking for back edges.
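
The undirected variant might look like this in Python. It is a sketch that assumes a simple graph (no parallel edges or self-loops), since the parent check would otherwise miss a two-vertex cycle made of duplicate edges; `has_cycle_undirected` is an illustrative name.

```python
def has_cycle_undirected(graph):
    """Return True if the simple undirected graph contains a cycle."""
    visited = set()

    def visit(u, parent):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                if visit(v, u):
                    return True
            elif v != parent:
                return True      # visited, and not the edge we arrived by
        return False

    return any(v not in visited and visit(v, None) for v in graph)


tree = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
print(has_cycle_undirected(tree))      # False
print(has_cycle_undirected(triangle))  # True
```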


Topological Sort

A linear ordering of vertices such that for every directed edge u → v, vertex u comes before v.

DAG (Directed Acyclic Graph):

  (A)→(B)→(D)
   |       ↗
   v      /
  (C)→(E)→(F)

Topological Order:
  A, C, E, B, D, F
  or A, B, C, E, D, F
  or A, B, C, E, F, D
  ... (multiple valid orders)

Key Idea

Topological sort = reverse of DFS finish order. When a vertex finishes (all descendants explored), prepend it to the result.

Algorithm

function topologicalSort(G):
    let result = []
    let visited = new Set()
    for each vertex v in G:
        if v not in visited:
            dfs(v, visited, result)
    return result.reverse()

function dfs(u, visited, result):
    visited.add(u)
    for each neighbor v of u:
        if v not in visited:
            dfs(v, visited, result)
    result.push(u)               // post-order!

Prerequisite: No Cycles!

Topological sort only works on DAGs (Directed Acyclic Graphs). If there is a cycle, no valid topological order exists.


Topological Sort: Course Prerequisites

Course Prerequisite DAG:

  (CS101)→(CS201)→(CS301)
     |       |       |
     v       v       v
  (CS102)→(CS202) (CS401)
     |       |
     v       v
  (MATH1) (CS302)

Adjacency:
  CS101 → CS201, CS102
  CS201 → CS301, CS202
  CS301 → CS401
  CS102 → CS202, MATH1
  CS202 → CS302

Analogy: Getting Dressed

Underwear before pants, socks before shoes, shirt before jacket. Topological sort gives you an order that respects ALL dependencies.

DFS Trace (finish order)

Start DFS at CS101:

  CS101 → CS201 → CS301 → CS401
      finish: CS401 (1)
      finish: CS301 (2)
    → CS202 → CS302
      finish: CS302 (3)
      finish: CS202 (4)
      finish: CS201 (5)
    → CS102 → MATH1
      finish: MATH1 (6)
      (CS202 already visited)
      finish: CS102 (7)
      finish: CS101 (8)

Result

Finish order: CS401, CS301, CS302, CS202, CS201, MATH1, CS102, CS101

Reversed (topological):

CS101, CS102, MATH1, CS201, CS202, CS302, CS301, CS401

Every course appears after all its prerequisites. Valid schedule!
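
The trace above can be reproduced with a short Python sketch. The function name `topological_sort` and the dictionary encoding of the course graph are assumptions; the adjacency lists are copied from this slide, with empty lists added for courses that have no outgoing edges. The sketch assumes the input is a DAG and does not check for cycles.

```python
def topological_sort(graph):
    """Return a topological order: the reverse of the DFS finish order."""
    visited = set()
    finish_order = []

    def visit(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v)
        finish_order.append(u)       # post-order: u finishes after its descendants

    for v in graph:
        if v not in visited:
            visit(v)
    return finish_order[::-1]


courses = {
    "CS101": ["CS201", "CS102"],
    "CS201": ["CS301", "CS202"],
    "CS301": ["CS401"],
    "CS102": ["CS202", "MATH1"],
    "CS202": ["CS302"],
    "CS401": [], "MATH1": [], "CS302": [],
}
order = topological_sort(courses)
print(order)
# ['CS101', 'CS102', 'MATH1', 'CS201', 'CS202', 'CS302', 'CS301', 'CS401']
```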


Application: Connected Components (Undirected)

Undirected Graph with 3 components:

  Component 0:    Component 1:
  (A)--(B)        (E)--(F)
   |   /           |
  (C)             (G)

  Component 2:
  (H)--(I)

Run DFS from each unvisited vertex. Each DFS call discovers one full connected component.

function connectedComponents(G):
    let comp = 0
    let visited = new Set()
    let component = {}
    for each vertex v in G:
        if v not in visited:
            dfs(v, visited, comp, component)
            comp += 1
    return component             // comp now holds the number of components

function dfs(u, visited, comp, component):
    visited.add(u)
    component[u] = comp
    for each neighbor v of u:
        if v not in visited:
            dfs(v, visited, comp, component)

How It Works

Step 1: DFS from A (Component 0)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Visit: A → B → C (all connected)
  component[A]=0, component[B]=0, component[C]=0

Step 2: A, B, C visited. Next unvisited: E
DFS from E (Component 1)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Visit: E → F, E → G
  component[E]=1, component[F]=1, component[G]=1

Step 3: Next unvisited: H
DFS from H (Component 2)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Visit: H → I
  component[H]=2, component[I]=2

Result: 3 connected components found!

Key Idea

Each call to DFS from the outer loop discovers exactly one connected component. The total number of outer-loop calls = number of components.
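
A compact Python sketch of the same idea, using the three-component graph from this slide. It substitutes an explicit stack for recursion (a choice made here, not in the slides) and marks vertices when they are pushed; `connected_components` is an illustrative name.

```python
def connected_components(graph):
    """Return (number of components, vertex -> component id)."""
    component = {}
    count = 0
    for start in graph:
        if start in component:
            continue                     # already labeled by an earlier DFS
        component[start] = count
        stack = [start]
        while stack:
            u = stack.pop()
            for v in graph[u]:
                if v not in component:
                    component[v] = count
                    stack.append(v)
        count += 1                       # one outer-loop DFS = one component
    return count, component


graph = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"],
    "E": ["F", "G"], "F": ["E"], "G": ["E"],
    "H": ["I"], "I": ["H"],
}
count, component = connected_components(graph)
print(count)  # 3
```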


Application: Strongly Connected Components (Directed)

An SCC is a maximal set of vertices where every vertex is reachable from every other vertex.

Directed Graph:

  (A)→(B)       (E)→(F)
   ↑   |         ↑   |
   |   v         |   v
  (D)<-(C)      (H)<-(G)

SCC 1: {A, B, C, D}  (cycle: A→B→C→D→A)
SCC 2: {E, F, G, H}  (cycle: E→F→G→H→E)

Kosaraju's Algorithm (Overview)

  1. Run DFS on original graph G. Record finish times.
  2. Compute G^T (reverse all edges).
  3. Run DFS on G^T in decreasing finish time order.
  4. Each DFS tree in step 3 = one SCC.

Kosaraju's Step-by-Step:

Step 1: DFS on G (starting at A, then at E). Finish order:
        D(1), C(2), B(3), A(4), H(5), G(6), F(7), E(8)

Step 2: Reverse all edges (G^T):

  (A)<-(B)      (E)<-(F)
   |    ↑        |    ↑
   v    |        v    |
  (D)→(C)       (H)→(G)

Step 3: DFS on G^T, processing vertices in decreasing finish order:
        E(8), F(7), G(6), H(5), A(4), ...
        DFS from E: visits E, H, G, F → SCC!
        DFS from A: visits A, D, C, B → SCC!

Result: 2 SCCs found

Analogy

Think of SCCs as "islands" in a directed graph where you can travel between any two cities on the same island. Kosaraju's finds these islands by looking at the graph forwards AND backwards.
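
The four steps of Kosaraju's algorithm can be sketched in Python as follows, using the eight-vertex graph from this slide. The names `kosaraju`, `finish_order`, and `reversed_graph` are illustrative assumptions, and recursion is used for brevity, so very deep graphs would need an explicit stack instead.

```python
def kosaraju(graph):
    """Return the strongly connected components as lists of vertices."""
    # Step 1: DFS on G, recording vertices in finish order.
    visited = set()
    finish_order = []

    def visit(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v)
        finish_order.append(u)

    for v in graph:
        if v not in visited:
            visit(v)

    # Step 2: build the transpose graph (every edge reversed).
    reversed_graph = {v: [] for v in graph}
    for u in graph:
        for v in graph[u]:
            reversed_graph[v].append(u)

    # Steps 3 and 4: DFS on the transpose in decreasing finish order;
    # each tree found this way is one SCC.
    assigned = set()
    sccs = []

    def collect(u, members):
        assigned.add(u)
        members.append(u)
        for v in reversed_graph[u]:
            if v not in assigned:
                collect(v, members)

    for u in reversed(finish_order):
        if u not in assigned:
            members = []
            collect(u, members)
            sccs.append(members)
    return sccs


graph = {
    "A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"],
    "E": ["F"], "F": ["G"], "G": ["H"], "H": ["E"],
}
print(kosaraju(graph))  # [['E', 'H', 'G', 'F'], ['A', 'D', 'C', 'B']]
```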


Application: Maze Generation & Solving

Maze Generation (DFS Random Walk)

Grid of cells (all walls up):

  +--+--+--+--+
  |  |  |  |  |
  +--+--+--+--+
  |  |  |  |  |
  +--+--+--+--+
  |  |  |  |  |
  +--+--+--+--+

DFS with random neighbor selection: start at (0,0), pick a random unvisited neighbor, and remove the wall between the two cells. Backtrack when stuck.

[Figure: the resulting 4 × 3 maze after carving.]

DFS creates long, winding corridors with few branches -- a "perfect maze" (exactly one path between any two cells).
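
A hedged Python sketch of the random-walk generator described above. The representation is an assumption made for illustration: instead of drawing walls, `generate_maze` returns the set of opened passages as pairs of adjacent cells, and the `seed` parameter exists only to make runs repeatable.

```python
import random


def generate_maze(rows, cols, seed=None):
    """Carve a perfect maze; return the opened walls as pairs of cells."""
    rng = random.Random(seed)
    visited = {(0, 0)}
    passages = set()
    stack = [(0, 0)]                     # current DFS path
    while stack:
        r, c = stack[-1]
        unvisited = [
            (nr, nc)
            for nr, nc in ((r, c + 1), (r + 1, c), (r, c - 1), (r - 1, c))
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited
        ]
        if not unvisited:
            stack.pop()                  # stuck: backtrack
            continue
        nxt = rng.choice(unvisited)
        passages.add(frozenset(((r, c), nxt)))   # remove the wall between them
        visited.add(nxt)
        stack.append(nxt)
    return passages


passages = generate_maze(3, 4, seed=1)
print(len(passages))  # 11
```

Every carved passage attaches one new cell to the visited region, so a grid of 12 cells always ends with 11 passages and no loops, which is what makes the maze "perfect".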

Maze Solving (DFS Backtracking)

[Figure: the maze with start S in the top-left cell and exit E in the bottom-right cell, shown next to the same maze with the DFS solution path marked.]

* = solution path

Why DFS Works for Mazes

DFS naturally backtracks when it hits dead ends, making it perfect for exploring mazes. It finds a path (not necessarily the shortest). Use BFS if you need the shortest path.
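
A small Python sketch of DFS maze solving with explicit backtracking. The grid encoding (`#` for walls, any other character for open cells) and the name `solve_maze` are assumptions made for this example; the maze itself is hypothetical and is not the one drawn on the slide.

```python
def solve_maze(grid, start, goal):
    """Return one path of (row, col) cells from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    visited = {start}
    path = [start]

    def explore(cell):
        if cell == goal:
            return True
        r, c = cell
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            nr, nc = r + dr, c + dc
            nxt = (nr, nc)
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and nxt not in visited):
                visited.add(nxt)
                path.append(nxt)
                if explore(nxt):
                    return True
                path.pop()               # dead end: backtrack
        return False

    return path if explore(start) else None


maze = [
    "S.#.",
    ".##.",
    "...E",
]
print(solve_maze(maze, (0, 0), (2, 3)))
# [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (2, 3)]
```

The search first steps right into the dead end at (0, 1), pops it off the path, and then finds the route down the left column.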

Analogy: Wall Follower

DFS maze solving is like the "always follow the left wall" strategy. You explore one path completely before trying another.


Time and Space Complexity

Time: O(V + E)

Why O(V + E)?

  Each vertex is visited exactly ONCE:
    visited[v] = true              → O(V) total
  Each edge is examined exactly ONCE (directed) or TWICE (undirected):
    for each neighbor v of u       → O(E) total

  Total: O(V) + O(E) = O(V + E)

  V = number of vertices
  E = number of edges

  Dense graph:  E ≈ V²  → O(V²)
  Sparse graph: E ≈ V   → O(V)

Space: O(V)

Space breakdown:
  visited[] array:          O(V)
  Stack (explicit):         O(V) worst case if each vertex is pushed at most once; the push-all-neighbors variant can hold up to O(E) entries
  Call stack (recursive):   O(V) worst case

Worst case for stack depth:
  Path graph: A-B-C-D-...-Z
  Stack depth = V

  A → B → C → ... → Z
  Stack: [A, B, C, ..., Z]

Representation      DFS Time    DFS Space
Adjacency List      O(V + E)    O(V)
Adjacency Matrix    O(V²)       O(V)

Adjacency Matrix Penalty

With an adjacency matrix, finding neighbors of a vertex takes O(V) instead of O(degree(v)), so total time becomes O(V²) regardless of the number of edges.


Summary & Cheat Sheet

DFS at a Glance

Property          Value
Data Structure    Stack (or recursion)
Strategy          Go deep, then backtrack
Time              O(V + E)
Space             O(V)
Shortest Path?    No (use BFS)
Complete?         Yes (finite graphs)

Core Applications

  • Cycle detection (back edge = cycle)
  • Topological sort (reverse finish order)
  • Connected components (undirected)
  • Strongly connected components (Kosaraju/Tarjan)
  • Path finding and backtracking
  • Maze generation and solving

Quick Reference: Pseudocode

// Recursive DFS
function DFS-Visit(u):
    visited[u] = true
    d[u] = ++time
    for each v in adj[u]:
        if not visited[v]:
            DFS-Visit(v)
    f[u] = ++time

// Iterative DFS
function DFS-Iter(source):
    stack.push(source)
    while stack not empty:
        u = stack.pop()
        if not visited[u]:
            visited[u] = true
            for each v in adj[u]:
                stack.push(v)

// Topological Sort
//   = reverse of DFS finish order

// Cycle Detection
//   = edge to GRAY vertex (directed)
//   = edge to visited non-parent (undirected)

DFS Mental Model:
  1. Push / Call
  2. Pop / Enter function
  3. Mark visited
  4. Process
  5. Push neighbors / Recurse
  6. Backtrack when stuck