Deterministic Finite Automata (DFA)

A complete guide to DFAs -- formal definitions, design patterns, minimization, and the tennis scoring example

Based on CS305 lecture materials (Ullman: Finite Automata & Tennis)

1 / 20

The Big Picture

Where do DFAs fit in the Chomsky hierarchy?

+==============================================================+
| Recursively Enumerable (Type 0)                              |
| Turing Machines recognize these                              |
| +==========================================================+ |
| | Context-Sensitive (Type 1)                               | |
| | +======================================================+ | |
| | | Context-Free (Type 2)                                | | |
| | | Pushdown Automata (PDAs) recognize these             | | |
| | | Example: balanced parentheses, a^n b^n               | | |
| | | +==================================================+ | | |
| | | | Regular Languages (Type 3)                       | | | |
| | | |                                                  | | | |
| | | | *** DFAs recognize EXACTLY these ***             | | | |
| | | |                                                  | | | |
| | | | Also: NFAs, Regular Expressions,                 | | | |
| | | |       Regular Grammars                           | | | |
| | | +==================================================+ | | |
| | +======================================================+ | |
| +==========================================================+ |
+==============================================================+

Why does this matter?

DFAs are the simplest computational model that still recognizes a useful class of languages. They are the foundation of everything above. Understanding DFAs is the first step toward understanding all of computation theory.

2 / 20

What is a Finite Automaton?

Intuition before formalism -- you already know these!

A finite automaton is a formal system that:

  • Remembers only a finite amount of information
  • Information is represented by its state
  • State changes in response to inputs
  • Rules for state changes are called transitions

Turnstile:

    [LOCKED] --coin--> [UNLOCKED] --push--> [LOCKED]

Traffic Light:

    [RED] --timer--> [GREEN] --timer--> [YELLOW] --timer--> [RED]

Analogy: Vending Machine

A vending machine is a finite automaton! It has states (how much money inserted), inputs (coins, button presses), and transitions (inserting a quarter changes the state from "$0.50" to "$0.75"). It has finitely many states and follows deterministic rules.

Why "Finite"?

The machine has a fixed, finite number of states. It cannot count to infinity or remember an unbounded amount of data. This is its fundamental limitation -- and also what makes it analyzable.

Why study finite automata?

Used in circuit design and verification, text processing (grep, regex), compilers (lexical analysis), protocol verification, and modeling simple patterns of events.

3 / 20

Formal Definition: The 5-Tuple

A DFA is precisely defined by (Q, Σ, δ, q0, F)

Symbol   Name                  What it is
Q        States                A finite set of states
Σ        Alphabet              A finite set of input symbols
δ        Transition function   δ : Q × Σ → Q
q0       Start state           q0 ∈ Q (exactly one)
F        Accept states         F ⊆ Q (zero or more)

Example DFA:

    Q     = {q0, q1, q2}
    Sigma = {0, 1}
    start = q0
    F     = {q2} (accept states)

    delta:
        delta(q0, 0) = q1    delta(q0, 1) = q0
        delta(q1, 0) = q2    delta(q1, 1) = q0
        delta(q2, 0) = q2    delta(q2, 1) = q2
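
One way to hold the five components in code is a small record type. This is a minimal Python sketch; the class name `DFA` and its field names are illustrative choices, not something the lecture materials prescribe.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    states: frozenset    # Q
    alphabet: frozenset  # Sigma
    delta: dict          # maps (state, symbol) -> next state
    start: str           # q0
    accept: frozenset    # F

# The example DFA from this slide.
example = DFA(
    states=frozenset({"q0", "q1", "q2"}),
    alphabet=frozenset({"0", "1"}),
    delta={
        ("q0", "0"): "q1", ("q0", "1"): "q0",
        ("q1", "0"): "q2", ("q1", "1"): "q0",
        ("q2", "0"): "q2", ("q2", "1"): "q2",
    },
    start="q0",
    accept=frozenset({"q2"}),
)
```

Storing delta as a dictionary keyed by (state, symbol) mirrors the signature δ : Q × Σ → Q directly.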

The "Deterministic" in DFA

δ is a total function: for every state q and every symbol a, δ(q, a) produces exactly one next state. No choices. No ambiguity. No missing transitions.

Common exam mistake

Students often draw DFAs with missing transitions. If state q has no arrow for symbol 'a', that is NOT a valid DFA! Every state must have exactly one outgoing transition for each symbol in Σ.

Analogy: GPS Navigation

A DFA is like a GPS that gives you exactly one instruction at every intersection. "Turn left." No options, no ambiguity. You just follow the rules. At the end of your trip, you check: "Am I at an acceptable destination?"

4 / 20

State Diagram Notation

How to draw DFAs -- the visual language

The four visual elements

1. STATES = circles

       ( q0 )   ( q1 )   ( q2 )

2. START STATE = arrow from nowhere

       -->( q0 )

3. ACCEPT/FINAL STATES = double circle

       (( q2 ))

4. TRANSITIONS = labeled arrows

                a
       ( q0 )----->( q1 )

   Self-loop:

       +--a--+
       |     |
       v     |
       ( q0 )-+

Complete example: "contains at least two 0s"

     +--1--+          +--1--+           +--0,1--+
     |     |          |     |           |       |
     v     |          v     |           v       |
 -->( q0 )-+---0---->( q1 )-+---0---->(( q2 ))--+

q0 and q1 loop to themselves on 1; q2 loops to itself on 0 and 1.

Reading a state diagram

Start at the arrow. Follow transitions for each input symbol. After all input is consumed, check: are you in a double-circle state? If yes: accept. If no: reject.

5 / 20

Transition Tables

An equivalent way to represent DFAs -- great for systematic work

The same DFA, two representations

State Diagram:

     +--1--+          +--1--+           +--0,1--+
     |     |          |     |           |       |
     v     |          v     |           v       |
 -->( q0 )-+---0---->( q1 )-+---0---->(( q2 ))--+

Transition Table:

          0     1
→ q0     q1    q0
  q1     q2    q1
* q2     q2    q2

→ = start state, * = accept state

How to read the table

Each row is a state. Each column is an input symbol. The cell at row q, column a gives δ(q, a) -- the next state.

Row "q0", column "0" says q1. This means: "from state q0, on input 0, go to q1."

Completeness check

In a valid DFA transition table, every cell must be filled with exactly one state. If any cell is empty or has multiple states, it is not a DFA.
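
The completeness check is easy to automate. The following is a small sketch, assuming the table is stored as a Python dictionary keyed by (state, symbol); the function name `table_problems` is hypothetical. With that representation a cell cannot hold several states at once, so only empty cells and unknown targets need checking.

```python
def table_problems(states, alphabet, table):
    """List every cell that is empty or names a state outside Q."""
    problems = []
    for q in sorted(states):
        for a in sorted(alphabet):
            target = table.get((q, a))
            if target is None:
                problems.append((q, a, "missing"))
            elif target not in states:
                problems.append((q, a, "unknown target"))
    return problems

# The "at least two 0s" table from this slide.
STATES = {"q0", "q1", "q2"}
TABLE = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q2", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}

# An empty list means every cell is filled with a valid state.
print(table_problems(STATES, {"0", "1"}, TABLE))
```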

When to use which?

Diagrams are better for intuition and seeing structure. Tables are better for algorithms (minimization, product construction) and for making sure you have not missed any transitions.

6 / 20

Example: Even Number of 0s

Building a DFA step by step -- the thought process matters

The problem

Design a DFA over Σ = {0, 1} that accepts all strings with an even number of 0s (including zero 0s).

Step 1: What do we need to remember?

We only care about the parity (even/odd) of the count of 0s seen so far. That is a finite amount of information -- just 2 states!

Step 2: Define states
  • E = "seen an even number of 0s so far"
  • O = "seen an odd number of 0s so far"
Step 3: Start state = E (zero 0s is even)
Step 4: Accept state = {E} (we want even count)

Step 5: Define transitions

           0    1
→ * E      O    E
    O      E    O

Reading a 0 flips parity. Reading a 1 keeps it the same.

Step 6: Draw the state diagram

      +---1---+                   +---1---+
      |       |                   |       |
      v       |         0         v       |
  -->(( E ))--+ ---------------> ( O )----+
         ^                         |
         |            0            |
         +-------------------------+

Design principle

Ask: "What is the minimum information I need to remember?" That tells you how many states you need. For the parity of a single count, the answer is 2 (tracking two parities at once would need 4).

7 / 20

DFA Simulator: Even Number of 0s

The procedure is purely mechanical

Start at q0. For each symbol in the input (left to right), look up the next state. At the end, check membership in F. That is all a DFA does.
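
The mechanical procedure can be sketched in a few lines of Python. This is an illustration for the even-0s DFA only; the function name `simulate` and the dictionary layout are assumptions made for the example.

```python
# Transition table of the even-0s DFA (states E and O).
DELTA = {("E", "0"): "O", ("E", "1"): "E",
         ("O", "0"): "E", ("O", "1"): "O"}

def simulate(w, delta=DELTA, start="E", accept=frozenset({"E"})):
    """Return (accepted, list of states visited) for input string w."""
    state = start
    visited = [state]
    for symbol in w:                      # left to right, one lookup per symbol
        state = delta[(state, symbol)]
        visited.append(state)
    return state in accept, visited       # final membership check in F

accepted, visited = simulate("0110")
print(" -> ".join(visited), "accept" if accepted else "reject")
```

The loop body is a single table lookup, which is why a DFA runs in time linear in the length of the input.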

8 / 20

Tennis Scoring as a DFA

A real-world DFA example from the lecture: scoring a game of tennis

Tennis game rules

  • One person serves throughout a game
  • Points: Love (0), 15, 30, 40
  • Must score at least 4 points to win
  • Must win by at least 2 points
  • When tied at 40-40: Deuce
  • From Deuce, one point ahead: Advantage

Alphabet

Σ = {s, o}

  • s = server wins the point
  • o = opponent wins the point

Why is this a DFA?

The score is the state. Each point (s or o) deterministically moves to exactly one new score. There is a finite number of possible scores. The "Server Wins" state is the accept state.

Tennis DFA (20 states): s = server wins point (move right), o = opponent wins point (move down).
Scores shown as Server-Opponent. Accept = ((SvrW)).

 -->(Love) --s--> (15-Lv)  --s--> (30-Lv)  --s--> (40-Lv) --s--> ((SvrW))
       |o            |o              |o              |o
       v             v               v               v
    (Lv-15) --s--> (15-all) --s--> (30-15)  --s--> (40-15) --s--> ((SvrW))
       |o            |o              |o              |o
       v             v               v               v
    (Lv-30) --s--> (15-30)  --s--> (30-all) --s--> (40-30) --s--> ((SvrW))
       |o            |o              |o              |o
       v             v               v               v
    (Lv-40) --s--> (15-40)  --s--> (30-40)  --s--> (Deuce)
       |o            |o              |o
       v             v               v
     (OppW)        (OppW)          (OppW)

Deuce loop:

    (Deuce)  --s--> (Ad-in)       (Deuce)  --o--> (Ad-out)
    (Ad-in)  --s--> ((SvrW))      (Ad-in)  --o--> (Deuce)
    (Ad-out) --s--> (Deuce)       (Ad-out) --o--> (OppW)

((SvrW)) and (OppW) are each a single state, drawn more than once for layout.

SvrW = Server Wins, OppW = Opponent Wins, Ad = Advantage, Lv = Love

9 / 20

Tennis DFA: Full State Diagram

The complete picture with all 20 states and transitions

All 20 states and their 40 transitions, listed as a table (→ = start state, * = accept state):

State            on s           on o
→ Love           15-Love        Love-15
  15-Love        30-Love        15-all
  30-Love        40-Love        30-15
  40-Love        Server Wins    40-15
  Love-15        15-all         Love-30
  15-all         30-15          15-30
  30-15          40-15          30-all
  40-15          Server Wins    40-30
  Love-30        15-30          Love-40
  15-30          30-all         15-40
  30-all         40-30          30-40
  40-30          Server Wins    Deuce
  Love-40        15-40          Opp Wins
  15-40          30-40          Opp Wins
  30-40          Deuce          Opp Wins
  Deuce          Ad-in          Ad-out
  Ad-in          Server Wins    Deuce
  Ad-out         Deuce          Opp Wins
* Server Wins    Server Wins    Server Wins
  Opp Wins       Opp Wins       Opp Wins

Example trace: "sosososososs"

Love →s 15-Lv →o 15-all →s 30-15 →o 30-all →s 40-30 →o Deuce →s Ad-in →o Deuce →s Ad-in →o Deuce →s Ad-in →s Server Wins!

Note: Server Wins and Opp Wins are "absorbing"

Once the game is over, both s and o transitions from Server Wins lead back to Server Wins (and similarly for Opp Wins). The game is over -- no more points change the outcome. These are sometimes called trap states or absorbing states.
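
Because the scoring rules are so regular, the transition function can be generated rather than typed out. The sketch below is one possible construction in Python; the helper names `score`, `build_tennis_delta`, and `server_wins` are hypothetical, and the state names follow the Server-Opponent convention used on these slides.

```python
POINTS = ["Love", "15", "30", "40"]

def score(s, o):
    """Name of the state with s server points and o opponent points (0..3 each)."""
    if s == 0 and o == 0:
        return "Love"
    if s == 3 and o == 3:
        return "Deuce"
    if s == o:
        return POINTS[s] + "-all"
    return POINTS[s] + "-" + POINTS[o]

def build_tennis_delta():
    delta = {}
    for s in range(4):
        for o in range(4):
            if (s, o) == (3, 3):
                continue  # Deuce is handled with the advantage states below
            here = score(s, o)
            # A player on 40 wins the game with the next point.
            delta[(here, "s")] = "Server Wins" if s == 3 else score(s + 1, o)
            delta[(here, "o")] = "Opp Wins" if o == 3 else score(s, o + 1)
    delta.update({
        ("Deuce", "s"): "Ad-in",        ("Deuce", "o"): "Ad-out",
        ("Ad-in", "s"): "Server Wins",  ("Ad-in", "o"): "Deuce",
        ("Ad-out", "s"): "Deuce",       ("Ad-out", "o"): "Opp Wins",
    })
    for final in ("Server Wins", "Opp Wins"):  # absorbing states
        delta[(final, "s")] = final
        delta[(final, "o")] = final
    return delta

TENNIS = build_tennis_delta()

def server_wins(w):
    """Run the DFA from Love and report whether it ends in the accept state."""
    state = "Love"
    for point in w:
        state = TENNIS[(state, point)]
    return state == "Server Wins"

print(len({q for q, _ in TENNIS}), "states")
print(server_wins("sosososososs"))
```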

10 / 20

Extended Transition Function

δ̂ (delta-hat): extending δ from single symbols to entire strings

The problem

δ(q, a) tells us the next state for one symbol. But we need to process entire strings. We define δ̂(q, w) by induction on the length of w:

Base case: δ̂(q, ε) = q

"Reading nothing leaves you where you are."

Inductive case: δ̂(q, wa) = δ(δ̂(q, w), a)

"To process string wa: first process w to reach some state p, then take one more step on symbol a."

Worked example

Using the "even 0s" DFA with states E, O:

delta-hat(E, "010") = ?

    delta-hat(E, epsilon) = E                    (base)
    delta-hat(E, "0")     = delta(E, 0) = O      (E + 0 -> O)
    delta-hat(E, "01")    = delta(O, 1) = O      (O + 1 -> O)
    delta-hat(E, "010")   = delta(O, 0) = E      (O + 0 -> E)

Result: E (accept state) -> "010" ACCEPTED

Why define it this formally?

This inductive definition lets us prove things about DFAs mathematically. For example, we can prove that two DFAs A1 and A2 accept the same language by showing that, for every string w, A1's δ̂ ends in one of its accept states exactly when A2's δ̂ ends in one of its accept states. Without formal definitions, such claims cannot even be stated precisely.
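
The two-case definition translates almost word for word into a recursive function. This is a sketch in Python for the even-0s DFA; the name `delta_hat` and the dictionary encoding of δ are assumptions of the example.

```python
DELTA = {("E", "0"): "O", ("E", "1"): "E",
         ("O", "0"): "E", ("O", "1"): "O"}

def delta_hat(q, w, delta=DELTA):
    """Extended transition function, defined by induction on len(w)."""
    if w == "":
        return q                                  # base: delta_hat(q, epsilon) = q
    prefix, last = w[:-1], w[-1]
    return delta[(delta_hat(q, prefix, delta), last)]  # delta(delta_hat(q, w), a)

print(delta_hat("E", "010"))
```

A loop would be more efficient in practice; the recursion is written this way only to mirror the induction.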

11 / 20

Language of a DFA

L(A) -- the set of all strings a DFA accepts

Definition

For a DFA A = (Q, Σ, δ, q0, F), the language of A is:

L(A) = { w ∈ Σ* | δ̂(q0, w) ∈ F }

"The set of all strings over Σ that, when processed from q0, end in an accept state."

Examples

DFA 1 (even 0s):
    L(A) = { w in {0,1}* | w has an even number of 0s }
    Examples in L:     "", "1", "11", "00", "1001", "0110"
    Examples not in L: "0", "10", "000"

DFA 2 (tennis):
    L(A) = { w in {s,o}* | server wins the game }
    Examples in L:     "ssss", "sosososs"
    Examples not in L: "oooo", "so"
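
Since L(A) is usually infinite, code can only list a finite slice of it. The following Python sketch enumerates the members of the even-0s language up to a given length; `accepts` and `language_up_to` are hypothetical helper names.

```python
from itertools import product

DELTA = {("E", "0"): "O", ("E", "1"): "E",
         ("O", "0"): "E", ("O", "1"): "O"}

def accepts(w):
    """True when the even-0s DFA ends in its accept state E."""
    state = "E"
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state == "E"

def language_up_to(n, alphabet="01"):
    """All accepted strings of length at most n, shortest first."""
    words = ("".join(t) for k in range(n + 1) for t in product(alphabet, repeat=k))
    return [w for w in words if accepts(w)]

print(language_up_to(2))
```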

A language is called "regular" if...

A language L is regular if there exists some DFA A such that L = L(A).

Equivalently: L is regular if it can be recognized by a DFA, an NFA, or described by a regular expression (all three formalisms define the same class).

Not every language is regular!

The language { a^n b^n | n ≥ 0 } is not regular. No DFA can count to an arbitrary n. DFAs have finite memory, so they cannot match unbounded counts.

Analogy: club bouncers

Think of a DFA as a bouncer with a simple checklist. The language is the guest list. The bouncer reads your invitation (input string) and follows fixed rules to decide: are you in or out?

12 / 20

Product Construction

Combining two DFAs to build intersection and union

The idea

Given DFA A1 recognizing L1 and DFA A2 recognizing L2, we can build a single DFA recognizing L1 ∩ L2 or L1 ∪ L2.

Construction

Run both DFAs simultaneously.

States of product DFA: Q1 × Q2 (all pairs)

Transition: δ((p, q), a) = (δ1(p, a), δ2(q, a))

Start: (q0_1, q0_2), where q0_i is the start state of Ai

For intersection: F = F1 × F2 (both accept)

For union: F = (F1 × Q2) ∪ (Q1 × F2) (either accepts)

Example: L1 ∩ L2

L1 = "even number of 0s", L2 = "ends with 1"

A1: states {E, O}, start E, accept {E}
A2: states {A, B}, start A, accept {B}
    A --0--> A,  A --1--> B
    B --0--> A,  B --1--> B

Product DFA states: {(E,A), (E,B), (O,A), (O,B)}
Start: (E, A)

delta:
    (E,A) --0--> (O,A)    (E,A) --1--> (E,B)
    (E,B) --0--> (O,A)    (E,B) --1--> (E,B)
    (O,A) --0--> (E,A)    (O,A) --1--> (O,B)
    (O,B) --0--> (E,A)    (O,B) --1--> (O,B)

Intersection accept: {(E,B)}                 (even 0s AND ends with 1)
Union accept:        {(E,A), (E,B), (O,B)}   (even 0s OR ends with 1)

Why does this matter?

This proves regular languages are closed under intersection and union. If L1 and L2 are regular, so are L1 ∩ L2 and L1 ∪ L2.
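
The construction is short enough to write out directly. Below is a minimal Python sketch, assuming each DFA is given as a transition dictionary plus start state and accept set; the function names `product_dfa` and `accepts` are illustrative.

```python
def product_dfa(d1, s1, f1, d2, s2, f2, alphabet, mode="intersection"):
    """Build the product of two DFAs given as (delta, start, accept) triples."""
    q1 = {p for p, _ in d1}
    q2 = {q for q, _ in d2}
    # Run both machines in lockstep on every pair of states.
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])
             for p in q1 for q in q2 for a in alphabet}
    if mode == "intersection":
        accept = {(p, q) for p in f1 for q in f2}
    elif mode == "union":
        accept = {(p, q) for p in q1 for q in q2 if p in f1 or q in f2}
    else:
        raise ValueError("mode must be 'intersection' or 'union'")
    return delta, (s1, s2), accept

def accepts(delta, start, accept, w):
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accept

# A1 = even number of 0s, A2 = ends with 1 (the example from this slide).
A1 = {("E", "0"): "O", ("E", "1"): "E", ("O", "0"): "E", ("O", "1"): "O"}
A2 = {("A", "0"): "A", ("A", "1"): "B", ("B", "0"): "A", ("B", "1"): "B"}

delta, start, both = product_dfa(A1, "E", {"E"}, A2, "A", {"B"}, "01", "intersection")
_, _, either = product_dfa(A1, "E", {"E"}, A2, "A", {"B"}, "01", "union")
print(sorted(both), sorted(either))
```

Only the accept set differs between the two modes; the states, transitions, and start state are shared.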

13 / 20

Complement of a DFA

The simplest closure proof you will ever see

The trick

Given a DFA A recognizing L, build a DFA for the complement Σ* \ L by simply swapping accept and non-accept states.

Accept states become non-accept. Non-accept states become accept. Everything else stays the same.

Original: accept = {q2}

     +--1--+          +--1--+           +--0,1--+
     |     |          |     |           |       |
     v     |          v     |           v       |
 -->( q0 )-+---0---->( q1 )-+---0---->(( q2 ))--+

Complement: accept = {q0, q1}

     +--1--+            +--1--+             +--0,1--+
     |     |            |     |             |       |
     v     |            v     |             v       |
 -->(( q0 ))-+--0---->(( q1 ))-+--0------>( q2 )----+

Why does this work?

For any string w, the DFA ends in exactly one state. That state is either in F (accept) or not in F (reject). Swapping F flips every accept to reject and vice versa -- so the complement DFA accepts exactly the strings the original rejects.
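
The swap can be checked exhaustively on short strings. This Python sketch uses the "at least two 0s" DFA from above; `complement_accept` and `accepts` are hypothetical names, and the check up to length 5 is an illustration rather than a proof.

```python
from itertools import product

STATES = {"q0", "q1", "q2"}
ACCEPT = {"q2"}
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q2", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}

def complement_accept(states, accept):
    """Accept set of the complement DFA: Q minus F. Nothing else changes."""
    return set(states) - set(accept)

def accepts(w, accept):
    state = "q0"
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in accept

COMP = complement_accept(STATES, ACCEPT)

# Every short string is accepted by exactly one of the two machines.
words = ["".join(t) for k in range(6) for t in product("01", repeat=k)]
print(all(accepts(w, ACCEPT) != accepts(w, COMP) for w in words))
```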

This ONLY works for DFAs, not NFAs!

An NFA accepts if any path leads to an accept state. Swapping accept/non-accept in an NFA does NOT give the complement. (Some paths might accept while others reject the same string.) You must first convert the NFA to a DFA, then complement.

Analogy: pass/fail grading

Imagine a class where ≥60 passes. The complement is like changing the rule to "<60 passes." Same exam, same scores, just flip which scores count as passing.

14 / 20

DFA Minimization

Finding the smallest DFA for a given language

Why minimize?

  • Smaller DFA = less memory, faster execution
  • The minimum DFA for a regular language is unique (up to renaming states)
  • Two DFAs accept the same language if and only if their minimized versions are identical up to renaming states

Key concept: Distinguishable States

States p and q are distinguishable if there exists some string w such that exactly one of δ̂(p, w) and δ̂(q, w) is in F.

In other words: there is a string that one state accepts but the other rejects. They behave differently, so they must stay separate.

States that are not distinguishable are called equivalent -- they can be merged.

Can these two states be merged?

    Start: A.  Accept: C.

    A --0--> B      A --1--> D
    B --0--> C      B --1--> D
    D --0--> C      D --1--> C

A and B:
    From A, "00" reaches C (accept)
    From B, "0" reaches C (accept)
    But from A, "0" reaches B (not accept)
    and from B, "0" reaches C (accept)

The string "0" distinguishes A from B.
They are DIFFERENT and cannot be merged.

Analogy: identical twins

If two states respond identically to every possible future input, they are "twins" and can be merged into one. The table-filling algorithm systematically finds all such twins.

15 / 20

Table-Filling Algorithm

A systematic way to find all distinguishable pairs of states

The algorithm

Step 1 (Base case): Mark every pair (p, q) where exactly one of p, q is an accept state. These are distinguishable by ε (the empty string).
Step 2 (Induction): For each unmarked pair (p, q), check: for some input symbol a, is the pair (δ(p,a), δ(q,a)) already marked? If yes, mark (p, q) too.
Step 3: Repeat Step 2 until no more pairs can be marked.
Step 4: All unmarked pairs are equivalent and can be merged.

Why does it work?

If δ(p, a) and δ(q, a) are distinguishable by some string w, then p and q are distinguishable by the string aw. The algorithm propagates distinguishability backwards through transitions.

Visualizing the table

Triangular table (only need pairs where p < q):

         q1   q2   q3   q4   q5
    q0 [    |    |    |    |    ]
    q1      [    |    |    |    ]
    q2           [    |    |    ]
    q3                [    |    ]
    q4                     [    ]

Step 1:  Mark all (accept, non-accept) pairs with "X" in the base round.
Step 2+: For each empty cell (p,q), check if delta(p,a) vs delta(q,a)
         is already marked for any symbol a. If so, mark this cell too.
Step 3:  When nothing new gets marked, STOP.
Step 4:  Empty cells = equivalent pairs = MERGE!

Complexity

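Run round by round as described, the algorithm takes O(n^3 · |Σ|) time where n = |Q|: there are O(n^2) pairs, each pair is checked against every symbol at most once per round, and there are at most n rounds. Keeping, for each pair, a list of the pairs that depend on it brings this down to O(n^2 · |Σ|).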

16 / 20

Table-Filling Algorithm: Step-Through

Watch the algorithm find equivalent state pairs.

Transition Table

         0    1
→ A      B    C
  B      D    E
* C      F    C
  D      D    E
* E      F    C
  F      D    E

Accept: {C, E}

Distinguishability Table

        B    C    D    E    F
   A [    |    |    |    |    ]
   B      [    |    |    |    ]
   C           [    |    |    ]
   D                [    |    ]
   E                     [    ]

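The four steps of the table-filling algorithm can be run on this six-state DFA with a short script. This is a round-by-round sketch in Python, not an optimized implementation; the function name `table_filling` and the use of frozensets for unordered pairs are choices made for the example.

```python
from itertools import combinations

STATES = "ABCDEF"
ACCEPT = {"C", "E"}
DELTA = {
    ("A", "0"): "B", ("A", "1"): "C",
    ("B", "0"): "D", ("B", "1"): "E",
    ("C", "0"): "F", ("C", "1"): "C",
    ("D", "0"): "D", ("D", "1"): "E",
    ("E", "0"): "F", ("E", "1"): "C",
    ("F", "0"): "D", ("F", "1"): "E",
}

def table_filling(states, alphabet, delta, accept):
    """Return the set of distinguishable (marked) pairs as frozensets."""
    # Step 1: pairs with exactly one accepting member are distinguished by epsilon.
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in accept) != (q in accept)}
    changed = True
    while changed:                       # Step 3: repeat until a round marks nothing
        changed = False
        for p, q in combinations(states, 2):
            if frozenset((p, q)) in marked:
                continue
            for a in alphabet:           # Step 2: look one symbol ahead
                successor = frozenset((delta[(p, a)], delta[(q, a)]))
                if successor in marked:
                    marked.add(frozenset((p, q)))
                    changed = True
                    break
    return marked

marked = table_filling(STATES, "01", DELTA, ACCEPT)
# Step 4: whatever stayed unmarked is an equivalent pair.
unmarked = {frozenset(pair) for pair in combinations(STATES, 2)} - marked
print(sorted("".join(sorted(pair)) for pair in unmarked))
```

On this DFA the sketch leaves seven pairs unmarked, which group into the two classes {A, B, D, F} and {C, E}, so the minimized machine has two states.
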
17 / 20

The Myhill-Nerode Theorem

The deepest result about DFAs -- connecting equivalence classes to minimum states

The Equivalence Relation ≡_L

For a language L, define: x ≡_L y if and only if for every string z, xz ∈ L ⇔ yz ∈ L.

"x and y are equivalent if no suffix can tell them apart with respect to L."

The Theorem (three parts)

  1. L is regular if and only if L has a finite number of equivalence classes
  2. The number of equivalence classes equals the minimum number of states in any DFA for L
  3. The minimum-state DFA is unique (up to renaming)

Example: L = "even number of 0s"

Equivalence classes of strings:

Class 1: strings with even # of 0s
    "", "1", "11", "00", "1001", ...
    Any suffix that makes one accept also makes the other accept.

Class 2: strings with odd # of 0s
    "0", "01", "000", "10", ...

Only 2 classes -> minimum DFA has 2 states!
(This matches our E/O DFA exactly.)

For non-regular languages

If L = { a^n b^n }, then the strings "", "a", "aa", "aaa", ... are all in different equivalence classes (infinitely many). By Myhill-Nerode, L is not regular. This is an alternative to the pumping lemma for proving non-regularity!
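
The reason those strings fall into different classes is that the suffix b^i separates a^i from a^j whenever i ≠ j. The Python sketch below checks this for small i and j; `in_language` and `separated_by_suffix` are hypothetical helper names, and a finite check like this illustrates the argument rather than proving it.

```python
def in_language(w):
    """Membership test for { a^n b^n | n >= 0 }."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def separated_by_suffix(i, j):
    """True when the suffix b^i puts a^i in L but leaves a^j outside L."""
    z = "b" * i
    return in_language("a" * i + z) and not in_language("a" * j + z)

print(all(separated_by_suffix(i, j)
          for i in range(8) for j in range(8) if i != j))
```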

Why does this matter?

Myhill-Nerode gives us a precise lower bound on the number of states. It is not just an algorithm -- it is a theorem about the nature of regular languages themselves.

18 / 20

Dead States and Trap States

States you can never leave -- or should never reach

Dead state (a non-accepting trap state, a.k.a. sink state)

A dead state is a non-accepting state where all transitions (for every symbol) loop back to itself. Once you enter, you can never reach an accept state.

Example: "starts with 1" over {0,1}

                          +--0,1--+
                          |       |
                          v       |
    -->( q0 )----1---->(( q1 ))---+
          |
          0
          v
       ( DEAD )---+
          ^       |
          |       |
          +--0,1--+

DEAD state: non-accepting, all arrows point back to itself. It's a "black hole."

When do you need dead states?

  • A DFA requires total transitions -- every state must have a transition for every symbol
  • When no "useful" state exists for a transition, send it to the dead state
  • The dead state makes the DFA complete

Common omission

Students often leave out the dead state when drawing DFAs. This is technically incorrect -- the DFA is incomplete. Always include it, or explicitly note "transitions to dead state omitted for clarity."
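
Adding the dead state to an incomplete drawing is a mechanical step. The following Python sketch shows one way to do it, assuming the partial transitions are stored in a dictionary; the function name `complete` and the state label "DEAD" are illustrative.

```python
def complete(states, alphabet, partial, dead="DEAD"):
    """Return (states, delta) with every missing transition sent to a dead state."""
    delta = dict(partial)
    missing = [(q, a) for q in states for a in alphabet if (q, a) not in delta]
    if not missing:
        return set(states), delta        # already total, no dead state needed
    for q, a in missing:
        delta[(q, a)] = dead
    for a in alphabet:
        delta[(dead, a)] = dead          # the dead state loops on every symbol
    return set(states) | {dead}, delta

# "starts with 1" drawn without its dead state: q0 has no arrow for 0.
partial = {("q0", "1"): "q1", ("q1", "0"): "q1", ("q1", "1"): "q1"}
states, delta = complete({"q0", "q1"}, "01", partial)
print(sorted(states), delta[("q0", "0")])
```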

Absorbing states (like Tennis)

In the tennis DFA, "Server Wins" and "Opp Wins" are absorbing states: all transitions loop back. "Server Wins" is an accepting absorbing state. "Opp Wins" is a dead state (non-accepting absorbing state). Both are trap states -- once you are in, you never leave.

19 / 20

DFA Design Patterns & Cheat Sheet

Common patterns and a quick reference for exams

Common DFA Design Patterns

1. "AT LEAST ONE a"
   Two states: not-yet (S), seen-a (T, accept).

       -->( S )--a-->(( T ))
       S: self-loop on b
       T: self-loop on a,b

2. "ENDS WITH ab"
   Three states tracking suffix progress.

       -->( S )--a-->( A )--b-->(( AB ))
       S:  b -> S
       A:  a -> A
       AB: a -> A,  b -> S

3. "CONTAINS substring aba"
   Build states for each prefix matched.
   Once full match: stay in accept forever.

4. "DIVISIBILITY by n"
   States = {0, 1, ..., n-1} = remainders.
   delta(r, d) = (r * base + d) mod n
   Accept = {0}.

   Example: divisible by 3 in binary
   States: {0, 1, 2}, start=0, accept={0}
   delta(r, b) = (2r + b) mod 3
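
Pattern 4 can be generated for any n and base from the formula above. This is a short Python sketch, assuming digits are read most significant first; `divisibility_delta` and `divisible` are hypothetical names. Note that under this encoding the empty string stays in state 0 and is therefore accepted.

```python
def divisibility_delta(n, base=2):
    """Transition table with states 0..n-1 (remainders) and digits 0..base-1."""
    return {(r, d): (r * base + d) % n
            for r in range(n) for d in range(base)}

def divisible(digits, n, base=2):
    """Run the remainder DFA over a digit string, most significant digit first."""
    delta = divisibility_delta(n, base)
    r = 0                                 # start state: remainder 0
    for ch in digits:
        r = delta[(r, int(ch, base))]
    return r == 0                         # accept set is {0}

print(divisibility_delta(3))
print(divisible("110", 3), divisible("111", 3))
```

Reading a digit d after a prefix with value v gives value v * base + d, which is why only the remainder needs to be remembered.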

Summary & Cheat Sheet

Core Facts

  • DFA = (Q, Σ, δ, q0, F) -- 5-tuple
  • δ is a total function: Q × Σ → Q
  • Accepts w if δ̂(q0, w) ∈ F
  • L(A) = { w | δ̂(q0, w) ∈ F }

Closure Properties

  • Complement: swap F and Q\F
  • Union: product construction, F = either accepts
  • Intersection: product construction, F = both accept
  • Concatenation, Kleene star: need NFAs (covered later)

Common Mistakes to Avoid

  • Missing transitions (every state needs one per symbol)
  • Forgetting the dead/trap state
  • Complementing an NFA (must convert to DFA first!)
  • Confusing DFA state count with language complexity
  • Not specifying all 5 components of the tuple

Minimization: Table-filling algorithm. Myhill-Nerode gives the exact minimum state count. The minimum DFA is unique up to renaming.

20 / 20