A complete guide to DFAs -- formal definitions, design patterns, minimization, and the tennis scoring example
Based on CS305 lecture materials (Ullman: Finite Automata & Tennis)
Where do DFAs fit in the Chomsky hierarchy?
DFAs are the simplest computational model that still recognizes a useful class of languages. Every more powerful model above them in the hierarchy builds on the same ideas, so understanding DFAs is the first step toward understanding the rest of computation theory.
Intuition before formalism -- you already know these!
A finite automaton is a formal system that:
A vending machine is a finite automaton! It has states (how much money inserted), inputs (coins, button presses), and transitions (inserting a quarter changes the state from "$0.50" to "$0.75"). It has finitely many states and follows deterministic rules.
The machine has a fixed, finite number of states. It cannot count to infinity or remember an unbounded amount of data. This is its fundamental limitation -- and also what makes it analyzable.
Used in circuit design and verification, text processing (grep, regex), compilers (lexical analysis), protocol verification, and modeling simple patterns of events.
A DFA is precisely defined by (Q, Σ, δ, q0, F)
| Symbol | Name | What it is |
|---|---|---|
| Q | States | A finite set of states |
| Σ | Alphabet | A finite set of input symbols |
| δ | Transition function | δ : Q × Σ → Q |
| q0 | Start state | q0 ∈ Q (exactly one) |
| F | Accept states | F ⊆ Q (zero or more) |
δ is a total function: for every state q and every symbol a, δ(q, a) produces exactly one next state. No choices. No ambiguity. No missing transitions.
Students often draw DFAs with missing transitions. If state q has no arrow for symbol 'a', that is NOT a valid DFA! Every state must have exactly one outgoing transition for each symbol in Σ.
A DFA is like a GPS that gives you exactly one instruction at every intersection. "Turn left." No options, no ambiguity. You just follow the rules. At the end of your trip, you check: "Am I at an acceptable destination?"
How to draw DFAs -- the visual language
Start at the arrow. Follow transitions for each input symbol. After all input is consumed, check: are you in a double-circle state? If yes: accept. If no: reject.
An equivalent way to represent DFAs -- great for systematic work
| State | 0 | 1 |
|---|---|---|
| → q0 | q1 | q0 |
| q1 | q2 | q1 |
| *q2 | q2 | q2 |
→ = start state, * = accept state
Each row is a state. Each column is an input symbol. The cell at row q, column a gives δ(q, a) -- the next state.
Row "q0", column "0" says q1. This means: "from state q0, on input 0, go to q1."
In a valid DFA transition table, every cell must be filled with exactly one state. If any cell is empty or has multiple states, it is not a DFA.
Diagrams are better for intuition and seeing structure. Tables are better for algorithms (minimization, product construction) and for making sure you have not missed any transitions.
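The "every cell must be filled" rule can be checked mechanically. The sketch below stores the q0/q1/q2 table as a nested dictionary and lists any missing or invalid cells. The function name `missing_transitions` and the dictionary layout are illustrative choices, not part of the lecture material.

```python
def missing_transitions(states, alphabet, delta):
    """Return the (state, symbol) cells that do not name exactly one valid state."""
    return [(q, a) for q in states for a in alphabet
            if delta.get(q, {}).get(a) not in states]

STATES = {"q0", "q1", "q2"}
ALPHABET = {"0", "1"}
DELTA = {                      # one row per state, one entry per input symbol
    "q0": {"0": "q1", "1": "q0"},
    "q1": {"0": "q2", "1": "q1"},
    "q2": {"0": "q2", "1": "q2"},
}

print(missing_transitions(STATES, ALPHABET, DELTA))   # [] means the table is total

# Dropping one arrow makes the table an invalid DFA.
BROKEN = dict(DELTA, q0={"0": "q1"})
print(missing_transitions(STATES, ALPHABET, BROKEN))  # [('q0', '1')]
```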
Building a DFA step by step -- the thought process matters
Design a DFA over Σ = {0, 1} that accepts all strings with an even number of 0s (including zero 0s).
We only care about the parity (even/odd) of the count of 0s seen so far. That is a finite amount of information -- just 2 states!
| State | 0 | 1 |
|---|---|---|
| → *E | O | E |
| O | E | O |
Reading a 0 flips parity. Reading a 1 keeps it the same.
Ask: "What is the minimum information I need to remember?" That tells you how many states you need. For parity questions, the answer is always 2.
Start at q0. For each symbol in the input (left to right), look up the next state. At the end, check membership in F. That is all a DFA does.
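As a minimal sketch, here is that loop applied to the "even number of 0s" DFA with states E and O. The names `DELTA`, `START`, `ACCEPT`, and `accepts` are illustrative.

```python
DELTA = {
    "E": {"0": "O", "1": "E"},   # a 0 flips parity, a 1 keeps it
    "O": {"0": "E", "1": "O"},
}
START = "E"
ACCEPT = {"E"}

def accepts(w):
    """Run the DFA on w and report whether it ends in an accept state."""
    state = START
    for symbol in w:             # left to right, one table lookup per symbol
        state = DELTA[state][symbol]
    return state in ACCEPT

for w in ["", "1", "0", "1001", "000"]:
    print(repr(w), accepts(w))
# '' True, '1' True, '0' False, '1001' True, '000' False
```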
A real-world DFA example from the lecture: scoring a game of tennis
Σ = {s, o}
The score is the state. Each point (s or o) deterministically moves to exactly one new score. There is a finite number of possible scores. The "Server Wins" state is the accept state.
SvrW = Server Wins, OppW = Opponent Wins, Ad = Advantage, Lv = Love
The complete picture with all 20 states and transitions
Love →s 15-Lv →o 15-all →s 30-15 →o 30-all →s 40-30 →o Deuce →s Ad-in →o Deuce →s Ad-in →o Deuce →s Ad-in →s Server Wins!
Once the game is over, both s and o transitions from Server Wins lead back to Server Wins (and similarly for Opp Wins). The game is over -- no more points change the outcome. These are sometimes called trap states or absorbing states.
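One way to reproduce the tennis DFA is to generate the transition table from the scoring rules instead of typing all 20 rows by hand. The sketch below is an assumption about how to encode the rules, not the lecture's own construction. It tracks raw point counts, maps them to score names with a helper called `label`, and explores every score reachable from Love.

```python
POINTS = ["Lv", "15", "30", "40"]

def label(s, o):
    """Map raw point counts (server, opponent) to a score name."""
    if s >= 4 and s - o >= 2:
        return "SvrW"
    if o >= 4 and o - s >= 2:
        return "OppW"
    if s >= 3 and o >= 3:
        return "Deuce" if s == o else ("Ad-in" if s > o else "Ad-out")
    if s == o:
        return "Love" if s == 0 else f"{POINTS[s]}-all"
    return f"{POINTS[s]}-{POINTS[o]}"

def build_tennis_dfa():
    """Explore all scores reachable from Love and record delta as nested dicts."""
    counts = {"Love": (0, 0)}            # one representative point count per score
    delta, todo = {}, ["Love"]
    while todo:
        q = todo.pop()
        if q in ("SvrW", "OppW"):        # game over: both symbols loop back
            delta[q] = {"s": q, "o": q}
            continue
        s, o = counts[q]
        delta[q] = {}
        for sym, (s2, o2) in (("s", (s + 1, o)), ("o", (s, o + 1))):
            nxt = label(s2, o2)
            delta[q][sym] = nxt
            if nxt not in counts:
                counts[nxt] = (s2, o2)
                todo.append(nxt)
    return delta

def run(delta, points, state="Love"):
    """Follow one transition per point and return the final score."""
    for p in points:
        state = delta[state][p]
    return state

TENNIS = build_tennis_dfa()
print(len(TENNIS))                       # 20 states
print(run(TENNIS, "sosososososs"))       # SvrW, the deuce-heavy game traced above
```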
δ̂ (delta-hat): extending δ from single symbols to entire strings
δ(q, a) tells us the next state for one symbol. But we need to process entire strings. We define δ̂(q, w) by induction on the length of w:
Base case: δ̂(q, ε) = q
"Reading nothing leaves you where you are."
Inductive case: δ̂(q, wa) = δ(δ̂(q, w), a)
"To process string wa: first process w to reach some state p, then take one more step on symbol a."
Using the "even 0s" DFA with states E, O:
This inductive definition lets us prove things about DFAs mathematically. For example, we can prove that two DFAs accept the same language by showing that, for every string w, running δ̂ from each start state on w lands in an accept state in one DFA exactly when it does in the other. Without a formal definition, there is nothing precise to prove.
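The two cases of the definition translate directly into a recursive function. The sketch below uses the "even 0s" DFA with states E and O. `DELTA` and `delta_hat` are illustrative names, and the recursion peels off the last symbol just as the inductive case does.

```python
DELTA = {
    ("E", "0"): "O", ("E", "1"): "E",
    ("O", "0"): "E", ("O", "1"): "O",
}

def delta_hat(q, w):
    """Extended transition function, defined by induction on len(w)."""
    if w == "":                                   # base case: reading nothing
        return q
    p = delta_hat(q, w[:-1])                      # process the prefix first
    return DELTA[(p, w[-1])]                      # then one step on the last symbol

for prefix in ["", "0", "01", "010"]:
    print(repr(prefix), "->", delta_hat("E", prefix))
# '' -> E, '0' -> O, '01' -> O, '010' -> E
```

Since δ̂(E, 010) = E and E is an accept state, the string 010 is accepted, which matches its two 0s.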
L(A) -- the set of all strings a DFA accepts
For a DFA A = (Q, Σ, δ, q0, F), the language of A is:
L(A) = { w ∈ Σ* | δ̂(q0, w) ∈ F }
"The set of all strings over Σ that, when processed from q0, end in an accept state."
A language L is regular if there exists some DFA A such that L = L(A).
Equivalently: L is regular if it can be recognized by a DFA, an NFA, or described by a regular expression (all three formalisms define the same class).
The language { aⁿbⁿ | n ≥ 0 } is not regular. No DFA can count to an arbitrary n. DFAs have finite memory, so they cannot match unbounded counts.
Think of a DFA as a bouncer with a simple checklist. The language is the guest list. The bouncer reads your invitation (input string) and follows fixed rules to decide: are you in or out?
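L(A) is usually infinite, but a finite slice of it can be listed by brute force. The sketch below enumerates every string up to a given length and keeps those that end in an accept state, again for the "even 0s" DFA. The helper name `language_up_to` is an assumption made for illustration.

```python
from itertools import product

def language_up_to(n, alphabet, delta, start, accept):
    """List the strings of length <= n that the DFA accepts, shortest first."""
    result = []
    for length in range(n + 1):
        for letters in product(sorted(alphabet), repeat=length):
            state = start
            for a in letters:
                state = delta[state][a]
            if state in accept:
                result.append("".join(letters))
    return result

EVEN_ZEROS = {"E": {"0": "O", "1": "E"}, "O": {"0": "E", "1": "O"}}

print(language_up_to(2, {"0", "1"}, EVEN_ZEROS, "E", {"E"}))
# ['', '1', '00', '11']
```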
Combining two DFAs to build intersection and union
Given DFA A1 recognizing L1 and DFA A2 recognizing L2, we can build a single DFA recognizing L1 ∩ L2 or L1 ∪ L2.
Run both DFAs simultaneously.
States of product DFA: Q1 × Q2 (all pairs)
Transition: δ((p, q), a) = (δ1(p, a), δ2(q, a))
Start: the pair (q0 of A1, q0 of A2)
For intersection: F = F1 × F2 (both accept)
For union: F = (F1 × Q2) ∪ (Q1 × F2) (either accepts)
L1 = "even number of 0s", L2 = "ends with 1"
This proves regular languages are closed under intersection and union. If L1 and L2 are regular, so are L1 ∩ L2 and L1 ∪ L2.
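The construction is short enough to sketch in full for L1 and L2. The dictionary layout, the function names, and the state names N and Y for the "ends with 1" DFA are assumptions made for this example.

```python
EVEN_ZEROS = {
    "delta": {"E": {"0": "O", "1": "E"}, "O": {"0": "E", "1": "O"}},
    "start": "E", "accept": {"E"},
}
ENDS_WITH_1 = {   # N = last symbol was not 1 (or no input yet), Y = last symbol was 1
    "delta": {"N": {"0": "N", "1": "Y"}, "Y": {"0": "N", "1": "Y"}},
    "start": "N", "accept": {"Y"},
}

def product_dfa(a1, a2, mode):
    """Build the product DFA; mode is 'intersection' or 'union'."""
    delta = {
        (p, q): {a: (a1["delta"][p][a], a2["delta"][q][a]) for a in a1["delta"][p]}
        for p in a1["delta"] for q in a2["delta"]
    }
    if mode == "intersection":
        accept = {(p, q) for (p, q) in delta if p in a1["accept"] and q in a2["accept"]}
    else:
        accept = {(p, q) for (p, q) in delta if p in a1["accept"] or q in a2["accept"]}
    return {"delta": delta, "start": (a1["start"], a2["start"]), "accept": accept}

def accepts(dfa, w):
    state = dfa["start"]
    for a in w:
        state = dfa["delta"][state][a]
    return state in dfa["accept"]

both = product_dfa(EVEN_ZEROS, ENDS_WITH_1, "intersection")
either = product_dfa(EVEN_ZEROS, ENDS_WITH_1, "union")
print(len(both["delta"]))                           # 4 pair states
print(accepts(both, "001"), accepts(both, "01"))    # True False
print(accepts(either, "01"), accepts(either, "0"))  # True False
```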
The simplest closure proof you will ever see
Given a DFA A recognizing L, build a DFA for the complement Σ* \ L by simply swapping accept and non-accept states.
Accept states become non-accept. Non-accept states become accept. Everything else stays the same.
For any string w, the DFA ends in exactly one state. That state is either in F (accept) or not in F (reject). Swapping F flips every accept to reject and vice versa -- so the complement DFA accepts exactly the strings the original rejects.
An NFA accepts if any path leads to an accept state. Swapping accept/non-accept in an NFA does NOT give the complement. (Some paths might accept while others reject the same string.) You must first convert the NFA to a DFA, then complement.
Imagine a class where ≥60 passes. The complement is like changing the rule to "<60 passes." Same exam, same scores, just flip which scores count as passing.
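In code, the swap is a single set difference. The sketch below complements the "even 0s" DFA and checks by brute force that the two machines disagree on every short string. The names `complement` and `accepts` are illustrative.

```python
from itertools import product

EVEN_ZEROS = {
    "delta": {"E": {"0": "O", "1": "E"}, "O": {"0": "E", "1": "O"}},
    "start": "E", "accept": {"E"},
}

def complement(dfa):
    """Same states, start, and transitions; accept set replaced by Q minus F."""
    states = set(dfa["delta"])
    return {"delta": dfa["delta"], "start": dfa["start"],
            "accept": states - dfa["accept"]}

def accepts(dfa, w):
    state = dfa["start"]
    for a in w:
        state = dfa["delta"][state][a]
    return state in dfa["accept"]

ODD_ZEROS = complement(EVEN_ZEROS)
print(ODD_ZEROS["accept"])                       # {'O'}

# Every string up to length 4 is accepted by exactly one of the two DFAs.
strings = ["".join(t) for n in range(5) for t in product("01", repeat=n)]
print(all(accepts(EVEN_ZEROS, w) != accepts(ODD_ZEROS, w) for w in strings))  # True
```

This relies on the transition function being total. With missing transitions, some strings would be rejected by both machines.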
Finding the smallest DFA for a given language
States p and q are distinguishable if there exists some string w such that exactly one of δ̂(p, w) and δ̂(q, w) is in F.
In other words: there is a string that one state accepts but the other rejects. They behave differently, so they must stay separate.
States that are not distinguishable are called equivalent -- they can be merged.
If two states respond identically to every possible future input, they are "twins" and can be merged into one. The table-filling algorithm systematically finds all such twins.
A systematic way to find all distinguishable pairs of states
If δ(p, a) and δ(q, a) are distinguishable by some string w, then p and q are distinguishable by the string aw. The algorithm propagates distinguishability backwards through transitions.
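A simple round-based implementation rechecks all O(n²) pairs in each round and needs at most n rounds, so it runs in O(n³) time for a fixed alphabet, where n = |Q|. Keeping, for each pair, a list of the pairs that depend on it lets each pair be processed a constant number of times, which brings the running time down to O(n²).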
Example: find the equivalent state pairs in the following DFA.
| State | 0 | 1 |
|---|---|---|
| → A | B | C |
| B | D | E |
| *C | F | C |
| D | D | E |
| *E | F | C |
| F | D | E |
Accept: {C, E}
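A round-based version of table filling can be sketched for this six-state DFA. It first marks every pair with exactly one accept state, then repeats until a full round marks nothing new. This is the simple variant, not the dependency-list variant, and the function name `distinguishable_pairs` is an illustrative choice.

```python
from itertools import combinations

DELTA = {
    "A": {"0": "B", "1": "C"},
    "B": {"0": "D", "1": "E"},
    "C": {"0": "F", "1": "C"},
    "D": {"0": "D", "1": "E"},
    "E": {"0": "F", "1": "C"},
    "F": {"0": "D", "1": "E"},
}
ACCEPT = {"C", "E"}

def distinguishable_pairs(delta, accept):
    """Return the set of distinguishable pairs, each stored as a frozenset."""
    states = sorted(delta)
    # Round 0: the empty string separates accept from non-accept states.
    marked = {frozenset(pq) for pq in combinations(states, 2)
              if (pq[0] in accept) != (pq[1] in accept)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            if frozenset((p, q)) in marked:
                continue
            # Mark (p, q) if some symbol leads to an already marked pair.
            if any(frozenset((delta[p][a], delta[q][a])) in marked for a in delta[p]):
                marked.add(frozenset((p, q)))
                changed = True
    return marked

marked = distinguishable_pairs(DELTA, ACCEPT)
equivalent = [pq for pq in combinations(sorted(DELTA), 2) if frozenset(pq) not in marked]
print(equivalent)
# [('A', 'B'), ('A', 'D'), ('A', 'F'), ('B', 'D'), ('B', 'F'), ('C', 'E'), ('D', 'F')]
```

The unmarked pairs group the states into {A, B, D, F} and {C, E}, so the minimized DFA has two states. It accepts exactly the strings that end in 1.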
The deepest result about DFAs -- connecting equivalence classes to minimum states
For a language L, define: x ≡L y if and only if for every string z, xz ∈ L ⇔ yz ∈ L.
"x and y are equivalent if no suffix can tell them apart with respect to L."
If L = { aⁿbⁿ | n ≥ 0 }, then the strings "", "a", "aa", "aaa", ... are all in different equivalence classes (infinitely many). By Myhill-Nerode, L is not regular. This is an alternative to the pumping lemma for proving non-regularity!
Myhill-Nerode gives us a precise lower bound on the number of states. It is not just an algorithm -- it is a theorem about the nature of regular languages themselves.
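The claim that the prefixes aⁱ fall into different classes can be spot-checked: for i ≠ j, the suffix bⁱ completes aⁱ to a member of L but not aʲ. The sketch below verifies this for small i and j. It is a finite sanity check under that choice of suffix, not a proof, and the function names are illustrative.

```python
def in_L(w):
    """Membership test for L = { a^n b^n | n >= 0 }."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def distinguishing_suffix(i, j):
    """For i != j, return z such that a^i z is in L but a^j z is not."""
    z = "b" * i
    assert in_L("a" * i + z) and not in_L("a" * j + z)
    return z

# Every pair of distinct prefixes a^i, a^j with i, j < 6 is separated by some suffix.
for i in range(6):
    for j in range(6):
        if i != j:
            distinguishing_suffix(i, j)

print(repr(distinguishing_suffix(2, 3)))   # 'bb': "aabb" is in L, "aaabb" is not
```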
States you can never leave -- or should never reach
A dead state is a non-accepting state where all transitions (for every symbol) loop back to itself. Once you enter, you can never reach an accept state.
Students often leave out the dead state when drawing DFAs. This is technically incorrect -- the DFA is incomplete. Always include it, or explicitly note "transitions to dead state omitted for clarity."
In the tennis DFA, "Server Wins" and "Opp Wins" are absorbing states: all transitions loop back. "Server Wins" is an accepting absorbing state. "Opp Wins" is a dead state (non-accepting absorbing state). Both are trap states -- once you are in, you never leave.
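A diagram drawn with omitted arrows can be completed mechanically by routing every missing transition to a fresh dead state. The sketch below does this for a hypothetical partial DFA that accepts strings starting with 01. The state names and the function `complete_with_dead_state` are assumptions made for illustration.

```python
def complete_with_dead_state(alphabet, partial_delta, dead="dead"):
    """Return a total transition table, adding a dead state only if needed."""
    delta = {q: dict(row) for q, row in partial_delta.items()}
    missing = [(q, a) for q in delta for a in alphabet if a not in delta[q]]
    if not missing:
        return delta
    for q, a in missing:
        delta[q][a] = dead
    delta[dead] = {a: dead for a in alphabet}   # every symbol loops back
    return delta

# Hypothetical partial diagram for "starts with 01"; q2 is the only accept state.
PARTIAL = {
    "q0": {"0": "q1"},
    "q1": {"1": "q2"},
    "q2": {"0": "q2", "1": "q2"},
}

TOTAL = complete_with_dead_state(["0", "1"], PARTIAL)
print(TOTAL["q0"]["1"], TOTAL["q1"]["0"])   # dead dead
print(TOTAL["dead"])                        # {'0': 'dead', '1': 'dead'}
```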
Common patterns and a quick reference for exams
Minimization: Table-filling algorithm. Myhill-Nerode gives the exact minimum state count. The minimum DFA is unique up to renaming.