DFS - Depth-First Search

Go Deep, Then Backtrack

A --> B --> D --> (dead end, backtrack!) | | v v C E --> F | v G

CS205 Data Structures


1 / 20

What is DFS?

Depth-First Search explores a graph by going as deep as possible along each branch before backtracking.

Graph:            DFS Order:

(A)----(B)        A (1)
 | \    |         |
 |  \   |         B (2)
 |   \  |         |
(C)   (D)         D (3)
       |          |
(E)----(F)        F (4)
                  |
                  E (5)
                  backtrack to A...
                  C (6)

Analogy: Maze Explorer

Imagine exploring a maze: you always walk forward, taking the first available turn. When you hit a dead end, you backtrack to the last intersection and try a different path.

Key Idea

DFS is driven by a stack (either the call stack via recursion, or an explicit stack). It visits vertices depth-first rather than level-by-level.

2 / 20

DFS Uses a Stack (or Recursion)

Recursion IS a Stack

Call Stack during DFS:

DFS(F)   <-- top of stack
DFS(E)
DFS(D)
DFS(B)
DFS(A)   <-- bottom (first call)

When DFS(F) returns, we "pop" back to DFS(E), then DFS(D)... This IS backtracking!

Key Insight

The recursive version uses the program's call stack implicitly. The iterative version uses an explicit stack data structure. Both do the same thing.

Two Equivalent Approaches

	Recursive	Iterative
Stack	Call stack (implicit)	Explicit Stack object
Push	Recursive call	stack.push(v)
Pop	Function return	stack.pop()
Risk	Stack overflow on deep graphs	Uses heap memory

Warning

Recursive DFS can cause a stack overflow on very deep graphs (e.g., a path of 100,000 nodes). The iterative version avoids this.

3 / 20

DFS Algorithm (Recursive)

Pseudocode

function DFS(G):
    for each vertex v in G:
        visited[v] = false

    for each vertex v in G:
        if not visited[v]:
            DFS-Visit(v)


function DFS-Visit(u):
    visited[u] = true
    // process vertex u here

    for each neighbor v of u:
        if not visited[v]:
            DFS-Visit(v)
      

How It Works

Mark the current vertex as visited
Recurse on each unvisited neighbor
When all neighbors are visited, the function returns (backtracks)
Outer loop handles disconnected graphs

DFS-Visit(A)
├── DFS-Visit(B)
│   ├── DFS-Visit(D)
│   │   └── (no unvisited neighbors)
│   └── DFS-Visit(E)
│       └── DFS-Visit(F)
└── DFS-Visit(C)
    └── (no unvisited neighbors)

Key Idea

The recursion tree of DFS-Visit calls IS the DFS tree of the graph.
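A minimal Python sketch of the recursive pseudocode on this slide. The adjacency dictionary `graph` is an assumed example, chosen so that the calls mirror the recursion tree shown above; the names `dfs` and `dfs_visit` are illustrative only.

```python
def dfs_visit(graph, u, visited, order):
    """Visit u, then recurse on each unvisited neighbor."""
    visited.add(u)
    order.append(u)  # "process vertex u here"
    for v in graph[u]:
        if v not in visited:
            dfs_visit(graph, v, visited, order)


def dfs(graph):
    """Run DFS from every unvisited vertex so disconnected graphs are covered."""
    visited, order = set(), []
    for v in graph:
        if v not in visited:
            dfs_visit(graph, v, visited, order)
    return order


# Assumed example graph (directed), matching the recursion tree above.
graph = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": [],
    "D": [],
    "E": ["F"],
    "F": [],
}

order = dfs(graph)
print(order)  # ['A', 'B', 'D', 'E', 'F', 'C']
```

Each nested call to `dfs_visit` corresponds to one node of the recursion tree, and returning from a call is the backtracking step.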

4 / 20

DFS Algorithm (Iterative)

Pseudocode

function DFS-Iterative(G, source):
    let S = new Stack()
    let visited = new Set()

    S.push(source)

    while S is not empty:
        u = S.pop()

        if u not in visited:
            visited.add(u)
            // process vertex u here

            for each neighbor v of u:
                if v not in visited:
                    S.push(v)
      

Step-by-Step Logic

Push the source onto the stack
Pop a vertex from the stack
If it hasn't been visited yet, mark it visited
Push all unvisited neighbors onto the stack
Repeat until the stack is empty

Warning: Subtle Difference

The iterative version may visit vertices in a different order than the recursive version, because it pushes ALL neighbors at once. The recursive version visits one neighbor completely before even looking at the next.

BFS vs DFS: One Character Difference

Replace the Stack with a Queue, and you get BFS! That is the only structural difference.
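The iterative pseudocode and the stack-to-queue swap can be sketched in Python as follows. This is an illustration under assumptions: the small `graph` is made up, and pushing neighbors in reverse order is one common way (not the only one) to make the iterative visit order line up with the recursive one.

```python
from collections import deque


def dfs_iterative(graph, source):
    """DFS with an explicit stack; returns vertices in visit order."""
    visited, order = set(), []
    stack = [source]
    while stack:
        u = stack.pop()
        if u in visited:
            continue  # u may have been pushed more than once
        visited.add(u)
        order.append(u)
        # Push in reverse so the first listed neighbor is popped first.
        for v in reversed(graph[u]):
            if v not in visited:
                stack.append(v)
    return order


def bfs(graph, source):
    """Same loop, but with a FIFO queue instead of a LIFO stack."""
    visited, order = set(), []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u in visited:
            continue
        visited.add(u)
        order.append(u)
        for v in graph[u]:
            if v not in visited:
                queue.append(v)
    return order


# Assumed example graph (undirected, stored as adjacency lists).
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}

dfs_order = dfs_iterative(graph, "A")  # ['A', 'B', 'D', 'C']
bfs_order = bfs(graph, "A")            # ['A', 'B', 'C', 'D']
```

Because a vertex is only marked when it is popped, the same vertex can sit on the stack several times; the `if u in visited` check discards the stale copies.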

5 / 20

DFS Step-by-Step Example

Graph with 7 vertices. We start DFS at vertex A. Neighbors processed in alphabetical order.

Our Graph (undirected):

      (A)-----(B)
     /   \       \
    /     \       \
  (C)     (D)----(E)
   |               |
   |               |
  (F)             (G)

Adjacency Lists:
A: [B, C, D]
B: [A, E]
C: [A, F]
D: [A, E]
E: [B, D, G]
F: [C]
G: [E]

Step 1: Visit A

Visited: {A}
Stack: [A calls B, C, D]
Path: A
Action: Visit A, recurse on first unvisited neighbor: B

Step 2: Visit B

Visited: {A, B}
Stack: [..., B calls A(skip), E]
Path: A → B
Action: Visit B, A already visited, recurse on E

Step 3: Visit E

Visited: {A, B, E}
Stack: [..., E calls B(skip), D, G]
Path: A → B → E
Action: Visit E, B visited, recurse on D


6 / 20

DFS Step-by-Step (continued)

Graph State after Step 3:

      (A)-----(B)
     /   \       \
    /     \       \
  (C)     (D)----(E)
   |               |
   |               |
  (F)             (G)

Visited: A, B, E
Unvisited: C, D, F, G

Step 4: Visit D

Visited: {A, B, E, D}
Stack: [..., D calls A(skip), E(skip)]
Path: A → B → E → D
Action: Visit D, all neighbors visited. BACKTRACK to E!

Step 5: Visit G

Visited: {A, B, E, D, G}
Stack: [..., G calls E(skip)]
Path: A → B → E → G
Action: Back at E, next unvisited neighbor is G. Visit G. G has no unvisited neighbors. BACKTRACK to E, then B, then A!

Step 6: Visit C

Visited: {A, B, E, D, G, C}
Stack: [..., A calls C]
Path: A → C
Action: Back at A, next unvisited neighbor is C. Visit C. Recurse on F.


7 / 20

DFS Step-by-Step (Final)

Step 7: Visit F

Visited: {A, B, E, D, G, C, F}
Stack: [..., F calls C(skip)]
Path: A → C → F
Action: Visit F. Only neighbor C is already visited. BACKTRACK to C, then A. A has no more unvisited neighbors. DONE!

DFS Visit Order

A → B → E → D → G → C → F

Notice how DFS goes deep (A→B→E→D) before backtracking. It does NOT visit level-by-level like BFS would.

Final DFS Tree

DFS Tree:

        (A)
       /   \
     (B)   (C)
      |      \
     (E)     (F)
    /   \
  (D)   (G)

Tree edges (solid): A-B, B-E, E-D, E-G, A-C, C-F
Non-tree edge (not shown): A-D (when D examines A, A is already visited and is an ancestor of D, so this is a back edge)

Complete graph with edge types:
A-B : tree edge
A-C : tree edge
A-D : back edge
B-E : tree edge
C-F : tree edge
D-E : tree edge
E-G : tree edge
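As a cross-check of the walkthrough, here is a small Python sketch that runs recursive DFS on the 7-vertex example and records which edges end up in the DFS tree. The adjacency lists are copied from the example; the helper name `dfs_tree` and the parent-map representation are assumptions made for illustration.

```python
def dfs_tree(graph, source):
    """Return (visit order, parent map) for a recursive DFS from source."""
    order, parent = [], {source: None}

    def visit(u):
        order.append(u)
        for v in graph[u]:
            if v not in parent:
                parent[v] = u  # (u, v) becomes a tree edge
                visit(v)

    visit(source)
    return order, parent


# Adjacency lists of the 7-vertex example graph from the walkthrough.
graph = {
    "A": ["B", "C", "D"],
    "B": ["A", "E"],
    "C": ["A", "F"],
    "D": ["A", "E"],
    "E": ["B", "D", "G"],
    "F": ["C"],
    "G": ["E"],
}

order, parent = dfs_tree(graph, "A")
tree_edges = {frozenset((p, c)) for c, p in parent.items() if p is not None}
all_edges = {frozenset((u, v)) for u in graph for v in graph[u]}
non_tree = sorted("-".join(sorted(e)) for e in all_edges - tree_edges)

print(" ".join(order))  # A B E D G C F
print(non_tree)         # ['A-D']
```

Undirected edges are stored as `frozenset` pairs so that A-B and B-A count as the same edge.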

8 / 20

The DFS Tree & Edge Classification

When DFS traverses a directed graph, every edge falls into one of four categories:

Directed graph (edge list): A→B, A→C, B→D, B→E, E→D, D→E, C→F, F→A

DFS from A, with B exploring E before D.

DFS Tree:
A
├── B
│   └── E
│       └── D
└── C
    └── F

Edge Classification:
A→B : Tree edge
B→E : Tree edge
E→D : Tree edge
D→E : Back edge (to ancestor)
B→D : Forward edge (to descendant)
A→C : Tree edge
C→F : Tree edge
F→A : Back edge (to ancestor)

Edge Type	Goes To	Meaning
Tree	Unvisited vertex	Part of DFS tree
Back	Ancestor in DFS tree	Indicates a cycle!
Forward	Descendant in DFS tree	Shortcut down
Cross	Neither ancestor nor descendant	Between branches

Important for Undirected Graphs

In undirected graphs, there are only tree edges and back edges. Forward and cross edges cannot exist.

Key Idea

Back edge = cycle. This is the foundation of DFS-based cycle detection.

9 / 20

Discovery and Finish Times

DFS assigns two timestamps to each vertex: d[v] (discovery) and f[v] (finish).

DFS with timestamps:

Graph:
A → B → D
|   |
v   v
C   E

Vertex	d[v]	f[v]
A	1	10
B	2	7
D	3	4
E	5	6
C	8	9

Timeline:
time:  1   2   3   4   5   6   7   8   9   10
       A[  B[  D[  D]  E[  E]  B]  C[  C]  A]

Parenthesis view: ( A ( B ( D ) ( E ) ) ( C ) )

Parenthesis Theorem

For any two vertices u and v, exactly one of these is true:

[d[u], f[u]] and [d[v], f[v]] are entirely disjoint (neither is ancestor of other)
One interval completely contains the other (ancestor-descendant relationship)

They never partially overlap. Like properly nested parentheses!

Analogy: Nested Boxes

Think of each vertex as a box that opens at time d[v] and closes at f[v]. Boxes are either completely inside one another or completely separate -- they never partially overlap.

Edge Classification via Timestamps (for an edge u → v)

Tree/Forward: d[u] < d[v] < f[v] < f[u]
Back: d[v] < d[u] < f[u] < f[v]
Cross: d[v] < f[v] < d[u] < f[u]
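The timestamp rules can be tried out with a short Python sketch. It assumes the small directed graph from the timestamp example (A → B, A → C, B → D, B → E); the function names `dfs_timestamps` and `classify` are illustrative.

```python
def dfs_timestamps(graph):
    """Return discovery and finish times (d, f) for every vertex."""
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time  # u is discovered
        for v in graph[u]:
            if v not in d:
                visit(v)
        time += 1
        f[u] = time  # u is finished

    for u in graph:
        if u not in d:
            visit(u)
    return d, f


def classify(u, v, d, f):
    """Classify a directed edge u -> v using only the two intervals."""
    if d[u] < d[v] < f[v] < f[u]:
        return "tree/forward"  # v's interval is nested inside u's
    if d[v] <= d[u] < f[u] <= f[v]:
        return "back"          # u's interval is nested inside v's
    return "cross"             # v finished before u was discovered


# Directed graph from the timestamp example.
graph = {"A": ["B", "C"], "B": ["D", "E"], "C": [], "D": [], "E": []}
d, f = dfs_timestamps(graph)

print(d)  # {'A': 1, 'B': 2, 'D': 3, 'E': 5, 'C': 8}
print(f)  # {'D': 4, 'E': 6, 'B': 7, 'C': 9, 'A': 10}
print(classify("A", "B", d, f))  # tree/forward
```

Timestamps alone cannot separate tree edges from forward edges; that distinction needs to be made during the traversal, when v is either undiscovered (tree) or already finished (forward).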

10 / 20

BFS vs DFS Comparison

Same Graph, Different Orders

Graph:

      (A)-----(B)
     /   \       \
    /     \       \
  (C)     (D)----(E)
   |               |
  (F)             (G)

BFS from A (queue):
Level 0: A
Level 1: B, C, D
Level 2: E, F
Level 3: G
Order: A B C D E F G

DFS from A (stack):
A → B → E → D
(backtrack to E) → G
(backtrack to A) → C → F
Order: A B E D G C F

Property	BFS	DFS
Data structure	Queue	Stack
Exploration	Level by level	Deep first
Shortest path?	Yes (unweighted)	No
Path existence?	Yes	Yes
Cycle detection	Possible	Natural
Topological sort	Yes (Kahn's algorithm)	Yes (reverse finish order)
Time complexity	O(V + E)	O(V + E)
Space complexity	O(V)	O(V)

When to Use Which?

BFS: Shortest path, closest neighbors, level-order
DFS: Cycle detection, topological sort, connected components, path finding, maze solving

11 / 20

Edge Classification in Directed Graphs

Directed graph (edge list): A→B, B→C, C→F, A→D, D→B, D→E, E→F, E→A

(A)→(B)→(C)
 |   ↗    |
 v  /     v
(D)→(E)→(F)
     ↓
    (A)  ← back edge to ancestor!

DFS from A, neighbors in alphabetical order (d/f times):
A: 1/12
B: 2/7
C: 3/6
F: 4/5
D: 8/11
E: 9/10

DFS Tree:
A (1/12)
├── B (2/7)
│   └── C (3/6)
│       └── F (4/5)
└── D (8/11)
    └── E (9/10)

Edge A→D: tree (D is still undiscovered when A examines it)
Edge D→B: cross (B finished, different branch)
Edge E→F: cross (F finished, different branch)
Edge E→A: back (A is ancestor, still open)

How to Classify During DFS

Use vertex colors (states):

Color	State	Meaning
WHITE	Undiscovered	Not yet visited
GRAY	Discovered	In progress (on stack)
BLACK	Finished	Fully explored

// When exploring edge u → v:
if color[v] == WHITE:
    // Tree edge
if color[v] == GRAY:
    // Back edge (v is ancestor, CYCLE!)
if color[v] == BLACK:
    if d[u] < d[v]:
        // Forward edge
    else:
        // Cross edge
      

The Critical Rule

Edge to a GRAY vertex = Back edge = CYCLE! A gray vertex is still being processed (it's an ancestor on the current DFS path).
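A Python sketch of the color-based classification above. The graph is hypothetical, chosen only so that all four edge types show up; `classify_edges` is an illustrative name.

```python
WHITE, GRAY, BLACK = 0, 1, 2


def classify_edges(graph):
    """Label every directed edge as tree, back, forward, or cross."""
    color = {u: WHITE for u in graph}
    d = {}       # discovery times
    labels = {}  # (u, v) -> edge type
    time = 0

    def visit(u):
        nonlocal time
        color[u] = GRAY
        time += 1
        d[u] = time
        for v in graph[u]:
            if color[v] == WHITE:
                labels[(u, v)] = "tree"
                visit(v)
            elif color[v] == GRAY:
                labels[(u, v)] = "back"     # v is an ancestor on the current path
            elif d[u] < d[v]:
                labels[(u, v)] = "forward"  # v is an already-finished descendant
            else:
                labels[(u, v)] = "cross"    # v finished before u was discovered
        color[u] = BLACK

    for u in graph:
        if color[u] == WHITE:
            visit(u)
    return labels


# Hypothetical graph chosen so that all four edge types appear.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
labels = classify_edges(graph)

for edge, kind in labels.items():
    print(edge, kind)
# ('A', 'B') tree
# ('B', 'C') tree
# ('C', 'A') back
# ('A', 'C') forward
# ('D', 'C') cross
```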

12 / 20

Cycle Detection with DFS

Directed Graph: 3-Color Method

function hasCycle(G):
    for each vertex v in G:
        color[v] = WHITE

    for each vertex v in G:
        if color[v] == WHITE:
            if dfsDetect(v):
                return true
    return false

function dfsDetect(u):
    color[u] = GRAY   // in progress

    for each neighbor v of u:
        if color[v] == GRAY:
            return true  // CYCLE!
        if color[v] == WHITE:
            if dfsDetect(v):
                return true

    color[u] = BLACK   // done
    return false
      

Why Back Edge = Cycle?

If edge u → v is a back edge, then v is an ancestor of u in the DFS tree.

Path:  v ~~> ... ~~> u      (tree path)
       ^             |
       |             |
       +<-- u → v ---+      (back edge)

The tree path v ~~> u plus the back edge u → v forms a CYCLE!

Undirected Graph: Simpler

function dfsDetect(u, parent):
    visited[u] = true

    for each neighbor v of u:
        if not visited[v]:
            if dfsDetect(v, u):
                return true
        else if v != parent:
            return true  // CYCLE!

    return false
      

Undirected Caveat

In undirected graphs, every edge appears twice (u-v and v-u). We must exclude the parent edge when checking for back edges.
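Both detectors can be sketched in Python as below. The function names and test graphs are assumptions for illustration, and the undirected version assumes a simple graph (no parallel edges), since a second edge between a vertex and its parent would be skipped by the parent check.

```python
WHITE, GRAY, BLACK = 0, 1, 2


def has_cycle_directed(graph):
    """Three-color DFS: an edge into a GRAY vertex closes a cycle."""
    color = {u: WHITE for u in graph}

    def visit(u):
        color[u] = GRAY  # in progress
        for v in graph[u]:
            if color[v] == GRAY:
                return True
            if color[v] == WHITE and visit(v):
                return True
        color[u] = BLACK  # done
        return False

    return any(color[u] == WHITE and visit(u) for u in graph)


def has_cycle_undirected(graph):
    """Parent-tracking DFS for a simple undirected graph."""
    visited = set()

    def visit(u, parent):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                if visit(v, u):
                    return True
            elif v != parent:
                return True  # visited non-parent neighbor: cycle
        return False

    return any(u not in visited and visit(u, None) for u in graph)


# Hypothetical test graphs.
dag = {"A": ["B", "C"], "B": ["C"], "C": []}
ring = {"A": ["B"], "B": ["C"], "C": ["A"]}
tree = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}

print(has_cycle_directed(dag), has_cycle_directed(ring))          # False True
print(has_cycle_undirected(tree), has_cycle_undirected(triangle))  # False True
```

Note that `dag` contains the edge A → C to an already BLACK vertex; a plain visited set would wrongly report that as a cycle, which is why the directed case needs three colors.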

13 / 20

Topological Sort

A linear ordering of vertices such that for every directed edge u → v, vertex u comes before v.

DAG (Directed Acyclic Graph):

(A)→(B)→(D)
 |       ↗
 v      /
(C)→(E)→(F)

Edges: A→B, A→C, B→D, C→E, E→D, E→F

Topological Order:
A, C, E, B, D, F
or A, B, C, E, D, F
or A, B, C, E, F, D
... (multiple valid orders)

Key Idea

Topological sort = reverse of DFS finish order. When a vertex finishes (all descendants explored), prepend it to the result.

Algorithm

function topologicalSort(G):
    let result = []
    let visited = new Set()

    for each vertex v in G:
        if v not in visited:
            dfs(v, visited, result)

    return result.reverse()

function dfs(u, visited, result):
    visited.add(u)

    for each neighbor v of u:
        if v not in visited:
            dfs(v, visited, result)

    result.push(u) // post-order!
      

Prerequisite: No Cycles!

Topological sort only works on DAGs (Directed Acyclic Graphs). If there is a cycle, no valid topological order exists.

14 / 20

Topological Sort: Course Prerequisites

Course Prerequisite DAG:

(CS101)→(CS201)→(CS301)
   |       |       |
   v       v       v
(CS102)→(CS202) (CS401)
   |       |
   v       v
(MATH1) (CS302)

Adjacency:
CS101 → CS201, CS102
CS201 → CS301, CS202
CS301 → CS401
CS102 → CS202, MATH1
CS202 → CS302

Analogy: Getting Dressed

Underwear before pants, socks before shoes, shirt before jacket. Topological sort gives you an order that respects ALL dependencies.

DFS Trace (finish order)

Start DFS at CS101:
CS101 → CS201 → CS301 → CS401
    finish: CS401 (1)
    finish: CS301 (2)
  → CS202 → CS302
    finish: CS302 (3)
    finish: CS202 (4)
    finish: CS201 (5)
  → CS102 → MATH1
    finish: MATH1 (6)
    (CS202 already visited)
    finish: CS102 (7)
    finish: CS101 (8)

Result

Finish order: CS401, CS301, CS302, CS202, CS201, MATH1, CS102, CS101

Reversed (topological):

CS101, CS102, MATH1, CS201, CS202, CS302, CS301, CS401

Every course appears after all its prerequisites. Valid schedule!
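The trace can be reproduced with a short Python sketch of DFS-based topological sort. The `courses` dictionary is copied from the adjacency list on this slide (courses with no outgoing edges get empty lists); the function name is illustrative, and the code assumes its input is a DAG and does not check for cycles.

```python
def topological_sort(graph):
    """Return vertices in reverse DFS finish order (assumes the graph is a DAG)."""
    visited, finished = set(), []

    def visit(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v)
        finished.append(u)  # post-order: u finishes after all its descendants

    for u in graph:
        if u not in visited:
            visit(u)
    return finished[::-1]


# Course prerequisite DAG from the slide (edge u -> v: take u before v).
courses = {
    "CS101": ["CS201", "CS102"],
    "CS201": ["CS301", "CS202"],
    "CS301": ["CS401"],
    "CS102": ["CS202", "MATH1"],
    "CS202": ["CS302"],
    "CS401": [],
    "CS302": [],
    "MATH1": [],
}

schedule = topological_sort(courses)
position = {course: i for i, course in enumerate(schedule)}

# Every prerequisite appears before the course that depends on it.
assert all(position[u] < position[v] for u in courses for v in courses[u])
print(schedule)
# ['CS101', 'CS102', 'MATH1', 'CS201', 'CS202', 'CS302', 'CS301', 'CS401']
```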


15 / 20

Application: Connected Components (Undirected)

Undirected Graph with 3 components:

Component 0:    Component 1:
(A)--(B)        (E)--(F)
 |  /            |
(C)             (G)

Component 2:
(H)--(I)

Run DFS from each unvisited vertex. Each DFS call discovers one full connected component.

function connectedComponents(G):
    let comp = 0
    let visited = new Set()
    let component = {}

    for each vertex v in G:
        if v not in visited:
            dfs(v, visited, comp, component)
            comp += 1

    return comp, component

function dfs(u, visited, comp, component):
    visited.add(u)
    component[u] = comp
    for each neighbor v of u:
        if v not in visited:
            dfs(v, visited, comp, component)
      

How It Works

Step 1: DFS from A (Component 0)
Visit: A → B → C (all connected)
component[A]=0, component[B]=0, component[C]=0

Step 2: A, B, C visited. Next unvisited: E
DFS from E (Component 1)
Visit: E → F, E → G
component[E]=1, component[F]=1, component[G]=1

Step 3: Next unvisited: H
DFS from H (Component 2)
Visit: H → I
component[H]=2, component[I]=2

Result: 3 connected components found!

Key Idea

Each call to DFS from the outer loop discovers exactly one connected component. The total number of outer-loop calls = number of components.
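A Python sketch of the component-labeling idea, run on the three-component example graph from this slide. The function name and the choice to return both the count and the labels are assumptions for illustration.

```python
def connected_components(graph):
    """Return (number of components, vertex -> component id)."""
    component = {}

    def visit(u, comp):
        component[u] = comp  # doubles as the "visited" marker
        for v in graph[u]:
            if v not in component:
                visit(v, comp)

    count = 0
    for u in graph:
        if u not in component:
            visit(u, count)  # one outer-loop call per component
            count += 1
    return count, component


# Example graph from the slide: {A, B, C}, {E, F, G}, {H, I}.
graph = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"],
    "E": ["F", "G"], "F": ["E"], "G": ["E"],
    "H": ["I"], "I": ["H"],
}

count, component = connected_components(graph)
print(count)      # 3
print(component)  # {'A': 0, 'B': 0, 'C': 0, 'E': 1, 'F': 1, 'G': 1, 'H': 2, 'I': 2}
```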


16 / 20

Application: Strongly Connected Components (Directed)

An SCC is a maximal set of vertices where every vertex is reachable from every other vertex.

Directed Graph:

(A)→(B)    (E)→(F)
 ↑   |      ↑   |
 |   v      |   v
(D)<-(C)   (H)<-(G)

SCC 1: {A, B, C, D} (cycle: A→B→C→D→A)
SCC 2: {E, F, G, H} (cycle: E→F→G→H→E)

Kosaraju's Algorithm (Overview)

Run DFS on original graph G. Record finish times.
Compute G^T (reverse all edges).
Run DFS on G^T in decreasing finish time order.
Each DFS tree in step 3 = one SCC.

Kosaraju's Step-by-Step:

Step 1: DFS on G (starting at A, then E). Finish order:
D(1), C(2), B(3), A(4), H(5), G(6), F(7), E(8)

Step 2: Reverse all edges (G^T):

(A)<-(B)    (E)<-(F)
 |   ↑       |   ↑
 v   |       v   |
(D)→(C)    (H)→(G)

Step 3: DFS on G^T, processing vertices in decreasing finish order:
E(8), F(7), G(6), H(5), A(4), ...
DFS from E: visits E, H, G, F → SCC!
DFS from A: visits A, D, C, B → SCC!

Result: 2 SCCs found

Analogy

Think of SCCs as "islands" in a directed graph where you can travel between any two cities on the same island. Kosaraju's finds these islands by looking at the graph forwards AND backwards.
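The overview of Kosaraju's algorithm can be sketched in Python as follows. The graph is the two-cycle example from this slide; the function and variable names are illustrative, and the recursive passes are only suitable for small graphs because of Python's recursion limit.

```python
def kosaraju(graph):
    """Return the strongly connected components as a list of vertex lists."""
    # Pass 1: DFS on G, recording vertices in order of finish time.
    visited, finish_order = set(), []

    def visit(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v)
        finish_order.append(u)

    for u in graph:
        if u not in visited:
            visit(u)

    # Build the transpose graph G^T (every edge reversed).
    transpose = {u: [] for u in graph}
    for u in graph:
        for v in graph[u]:
            transpose[v].append(u)

    # Pass 2: DFS on G^T in decreasing finish time; each tree is one SCC.
    assigned, sccs = set(), []

    def collect(u, scc):
        assigned.add(u)
        scc.append(u)
        for v in transpose[u]:
            if v not in assigned:
                collect(v, scc)

    for u in reversed(finish_order):
        if u not in assigned:
            scc = []
            collect(u, scc)
            sccs.append(scc)
    return sccs


# Two directed 4-cycles, as in the example on this slide.
graph = {
    "A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"],
    "E": ["F"], "F": ["G"], "G": ["H"], "H": ["E"],
}

sccs = kosaraju(graph)
print(sccs)  # [['E', 'H', 'G', 'F'], ['A', 'D', 'C', 'B']]
```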

17 / 20

Application: Maze Generation & Solving

Maze Generation (DFS Random Walk)

Grid of cells (all walls up):

+--+--+--+--+
|  |  |  |  |
+--+--+--+--+
|  |  |  |  |
+--+--+--+--+
|  |  |  |  |
+--+--+--+--+

DFS with random neighbor selection: Start at (0,0), pick a random unvisited neighbor, remove the wall between them. Backtrack when stuck.

Result: [Figure: the same 3 x 4 grid after wall removal, forming a maze]

DFS creates long, winding corridors with few branches -- a "perfect maze" (exactly one path between any two cells).

Maze Solving (DFS Backtracking)

[Figure: a 3 x 4 maze with start S in the top-left cell and exit E in the bottom-right cell, shown next to the same maze with the DFS solution path marked]

* = solution path

Why DFS Works for Mazes

DFS naturally backtracks when it hits dead ends, making it perfect for exploring mazes. It finds a path (not necessarily the shortest). Use BFS if you need the shortest path.
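A Python sketch of DFS maze solving with backtracking. The grid encoding is an assumption made for this example ('#' wall, '.' open cell, 'S' start, 'E' exit), as are the maze itself and the function name `solve_maze`.

```python
def solve_maze(maze):
    """Return a list of (row, col) cells from 'S' to 'E', or None if unreachable."""
    rows, cols = len(maze), len(maze[0])
    start = next(
        (r, c) for r in range(rows) for c in range(cols) if maze[r][c] == "S"
    )
    visited = set()
    path = []

    def visit(r, c):
        if not (0 <= r < rows and 0 <= c < cols):
            return False
        if maze[r][c] == "#" or (r, c) in visited:
            return False
        visited.add((r, c))
        path.append((r, c))
        if maze[r][c] == "E":
            return True
        # Try right, down, left, up; the first direction that reaches E wins.
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            if visit(r + dr, c + dc):
                return True
        path.pop()  # dead end: backtrack
        return False

    return path if visit(*start) else None


# Hypothetical maze: '#' is a wall, '.' is open, S is start, E is exit.
maze = [
    "S.#.",
    "#.#.",
    "#...",
    "##.E",
]

path = solve_maze(maze)
print(path)
# [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2), (2, 3), (3, 3)]
```

The `path.pop()` line is the backtracking step: a cell is removed from the candidate path once every direction out of it has failed.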

Analogy: Wall Follower

In a maze with no loops, DFS maze solving resembles the "always follow the left wall" strategy: you explore one path completely before trying another.

18 / 20

Time and Space Complexity

Time: O(V + E)

Why O(V + E)?

Each vertex is visited exactly ONCE: visited[v] = true → O(V) total
Each edge is examined exactly ONCE (directed) or TWICE (undirected): for each neighbor v of u → O(E) total

Total: O(V) + O(E) = O(V + E)

V = number of vertices
E = number of edges

Dense graph: E ≈ V² → O(V²)
Sparse graph: E ≈ V → O(V)
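The counting argument can be checked empirically with a small instrumented DFS in Python. The graph is the 7-vertex undirected example used earlier in the deck (7 vertices, 7 edges); the counter names are illustrative.

```python
def dfs_work(graph):
    """Run DFS over the whole graph and count vertex visits and edge checks."""
    visited = set()
    counts = {"vertex_visits": 0, "edge_checks": 0}

    def visit(u):
        visited.add(u)
        counts["vertex_visits"] += 1
        for v in graph[u]:
            counts["edge_checks"] += 1  # one check per adjacency-list entry
            if v not in visited:
                visit(v)

    for u in graph:
        if u not in visited:
            visit(u)
    return counts


# 7-vertex undirected example graph: 7 vertices, 7 edges.
graph = {
    "A": ["B", "C", "D"],
    "B": ["A", "E"],
    "C": ["A", "F"],
    "D": ["A", "E"],
    "E": ["B", "D", "G"],
    "F": ["C"],
    "G": ["E"],
}

counts = dfs_work(graph)
print(counts)  # {'vertex_visits': 7, 'edge_checks': 14}
```

The 14 edge checks are exactly 2E for this undirected graph, since each edge appears in two adjacency lists.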

Space: O(V)

Space breakdown:
visited[] array: O(V)
Stack (explicit): O(V) worst case if each vertex is pushed at most once; the push-all-neighbors version shown earlier can hold duplicates, up to O(E) entries
Call stack (recursive): O(V) worst case

Worst case for stack depth:
Path graph: A-B-C-D-...-Z
Stack depth = V
A → B → C → ... → Z
Stack: [A, B, C, ..., Z]

Representation	DFS Time	DFS Space
Adjacency List	O(V + E)	O(V)
Adjacency Matrix	O(V²)	O(V)

Adjacency Matrix Penalty

With an adjacency matrix, finding neighbors of a vertex takes O(V) instead of O(degree(v)), so total time becomes O(V²) regardless of the number of edges.

19 / 20

Summary & Cheat Sheet

DFS at a Glance

Property	Value
Data Structure	Stack (or recursion)
Strategy	Go deep, then backtrack
Time	O(V + E)
Space	O(V)
Shortest Path?	No (use BFS)
Complete?	Yes (finite graphs)

Core Applications

Cycle detection (back edge = cycle)
Topological sort (reverse finish order)
Connected components (undirected)
Strongly connected components (Kosaraju/Tarjan)
Path finding and backtracking
Maze generation and solving

Quick Reference: Pseudocode

// Recursive DFS
function DFS-Visit(u):
    visited[u] = true
    d[u] = ++time
    for each v in adj[u]:
        if not visited[v]:
            DFS-Visit(v)
    f[u] = ++time

// Iterative DFS
function DFS-Iter(source):
    stack.push(source)
    while stack not empty:
        u = stack.pop()
        if not visited[u]:
            visited[u] = true
            for each v in adj[u]:
                stack.push(v)

// Topological Sort
// = reverse of DFS finish order

// Cycle Detection
// = edge to GRAY vertex (directed)
// = edge to visited non-parent (undirected)
      

DFS Mental Model:
1. Push / Call
2. Pop / Enter function
3. Mark visited
4. Process
5. Push neighbors / Recurse
6. Backtrack when stuck

20 / 20