Extended NFA with Epsilon-Transitions
CS305 -- Formal Language Theory
We have now seen DFAs and NFAs. The epsilon-NFA adds one more layer of nondeterminism -- but all three recognize exactly the same class of languages: the regular languages.
Adding epsilon-transitions does NOT increase the power of the machine. It only makes designing automata easier and more modular. Any epsilon-NFA can be converted to an equivalent DFA.
Sometimes, designing an NFA for a complex language is still painful. We want to:
Epsilon-transitions are like LEGO connectors: they let you snap small, tested components together into bigger machines without redesigning anything.
The epsilon-transitions from q0 let us "choose" which sub-machine to enter -- no input consumed!
An epsilon-transition (written as an ε-transition) is a transition that the machine can take without reading any input symbol.
Think of states as rooms in a castle. Normal transitions are doors that require a key (an input symbol) to pass through. Epsilon-transitions are secret passages -- you can slip through them at any time, for free, without spending a key.
Or think of epsilon-transitions as teleporters between states. You can beam yourself to the destination instantly without consuming any resource. And you can chain teleporters: q0 → q1 → q2 all for free!
ε is NOT a symbol in the alphabet Σ. It represents the empty string. The machine never "reads" an ε from the tape.
An epsilon-NFA is a 5-tuple E = (Q, Σ, δ, q0, F) where:
| Component | Meaning |
|---|---|
| Q | Finite set of states |
| Σ | Finite input alphabet (ε ∉ Σ) |
| δ | Q × (Σ ∪ {ε}) → P(Q) |
| q0 | Start state (q0 ∈ Q) |
| F | Set of accept states (F ⊆ Q) |
In a plain NFA: δ : Q × Σ → P(Q)
In an ε-NFA: δ : Q × (Σ ∪ {ε}) → P(Q)
The transition function now also accepts ε as input. This means the transition table gets an extra column for ε.
Students sometimes add ε to the alphabet Σ. Don't! ε is never in Σ. The transition function's domain is extended to include ε, but the alphabet itself stays the same.
Let's build an ε-NFA over {a,b} with two branches: one accepts strings that start with 'a', the other accepts strings that end with "bb" and contain no earlier "bb".
| State | a | b | ε |
|---|---|---|---|
| → q0 | ∅ | ∅ | {q1, q3} |
| q1 | {q2} | ∅ | ∅ |
| *q2 | {q2} | {q2} | ∅ |
| q3 | {q3} | {q4} | ∅ |
| q4 | {q3} | {q5} | ∅ |
| *q5 | ∅ | ∅ | ∅ |
Note the ε column -- that's what makes this an ε-NFA, not just an NFA.
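One way to hold such a table in code is a nested dictionary, with the ε column stored under a key that is not an alphabet symbol. The sketch below only encodes the table above; the names `EPS`, `delta`, and `step` are illustrative choices, not part of the course material.

```python
EPS = ""  # stand-in key for ε; the empty string is not a symbol of the alphabet

alphabet = {"a", "b"}  # ε is deliberately absent from Σ

# Transition table from above: state -> {label -> set of next states}.
# A missing entry stands for the empty set ∅.
delta = {
    "q0": {EPS: {"q1", "q3"}},
    "q1": {"a": {"q2"}},
    "q2": {"a": {"q2"}, "b": {"q2"}},
    "q3": {"a": {"q3"}, "b": {"q4"}},
    "q4": {"a": {"q3"}, "b": {"q5"}},
    "q5": {},
}
start = "q0"
accepting = {"q2", "q5"}


def step(state, label):
    """Look up δ(state, label), returning ∅ for missing entries."""
    return delta.get(state, {}).get(label, set())


print(sorted(step("q0", EPS)))  # ['q1', 'q3']  (the ε column for q0)
print(sorted(step("q0", "a")))  # []            (q0 has no a-transition)
```

Keeping ε out of `alphabet` mirrors the definition: only the domain of δ is extended, not Σ.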
The epsilon-closure of a state q, written CL(q) or ECLOSE(q), is the set of all states reachable from q by following zero or more epsilon-transitions.
CL(q) is the smallest set such that q ∈ CL(q), and whenever p ∈ CL(q) and r ∈ δ(p, ε), then r ∈ CL(q) as well.
Imagine each ε-transition is a wormhole. CL(q) is the set of all places you can reach from q using only wormholes (no fuel/input needed). You always include your starting location!
A state is always in its own epsilon-closure. CL(q) always contains q itself (zero epsilon-transitions = staying put).
Use BFS or DFS starting from q, following only ε-transitions.
Epsilon-closure is essentially a graph reachability problem. The ε-transitions form a directed graph, and CL(q) is just all nodes reachable from q in that graph.
Epsilon-transitions can form cycles. Always track visited states to avoid infinite loops.
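The BFS described above can be sketched as follows. This is a minimal illustration, assuming δ is stored as a nested dictionary with ε under the key `EPS`; the function name `eclose` and the small machine with an ε-cycle are made up for the example.

```python
from collections import deque

EPS = ""  # stand-in key for ε (an assumption of this sketch)


def eclose(delta, q):
    """Return CL(q): all states reachable from q via zero or more ε-moves."""
    seen = {q}                 # q is always in its own closure
    queue = deque([q])
    while queue:
        p = queue.popleft()
        for r in delta.get(p, {}).get(EPS, set()):
            if r not in seen:  # the visited check is what breaks ε-cycles
                seen.add(r)
                queue.append(r)
    return seen


# Hypothetical machine with an ε-cycle q0 -> q1 -> q2 -> q0 and one a-edge.
delta = {
    "q0": {EPS: {"q1"}},
    "q1": {EPS: {"q2"}, "a": {"q3"}},
    "q2": {EPS: {"q0"}},
    "q3": {},
}

print(sorted(eclose(delta, "q0")))  # ['q0', 'q1', 'q2']
print(sorted(eclose(delta, "q3")))  # ['q3']
```

The a-edge from q1 to q3 is ignored because only ε-transitions count, and the cycle terminates because each state is enqueued at most once.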
We often need the epsilon-closure of a set of states, not just one state.
For a set of states S ⊆ Q:
CL(S) = ⋃_{q ∈ S} CL(q)
Just take the union of the epsilon-closures of every state in S.
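As a small illustration, the union can be written almost literally. The sketch assumes the same nested-dictionary encoding of δ with ε under `EPS`; `eclose` and `eclose_set` are hypothetical helper names, and the machine is the q0..q4 example used later in these slides.

```python
EPS = ""  # stand-in key for ε (an assumption of this sketch)


def eclose(delta, q):
    """CL(q) via DFS over ε-edges only."""
    seen, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for r in delta.get(p, {}).get(EPS, set()) - seen:
            seen.add(r)
            stack.append(r)
    return seen


def eclose_set(delta, states):
    """CL(S) = union of CL(q) over all q in S."""
    return set().union(*(eclose(delta, q) for q in states))


# ε-edges: q0 -> q1 and q2 -> q3.
delta = {
    "q0": {EPS: {"q1"}},
    "q1": {"a": {"q2"}},
    "q2": {EPS: {"q3"}},
    "q3": {"b": {"q4"}},
    "q4": {},
}

print(sorted(eclose_set(delta, {"q0", "q2"})))  # ['q0', 'q1', 'q2', 'q3']
print(eclose_set(delta, set()))                 # set()
```

The closure of the empty set is empty, which matters later when a move on some symbol leads nowhere.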
When processing input in an ε-NFA, after reading a symbol we may land in multiple states (just like in an ordinary NFA). We need the epsilon-closure of that entire set before processing the next symbol.
After each "real" step (reading a symbol), all your tokens teleport through every wormhole they can reach. You expand via epsilon once before reading any input, and again after every symbol.
The extended transition function δ̂(q, w) tells us the set of states reachable from q after reading string w, accounting for all epsilon-transitions.
Base case:
δ̂(q, ε) = CL(q)
On empty input, you can reach exactly the states in CL(q): q itself plus anything reachable through ε-transitions alone.
Inductive case: For string w = xa (x is a string, a is a symbol):
δ̂(q, xa) = CL( ⋃_{p ∈ δ̂(q,x)} δ(p, a) )
In words: first process x to get a set of states, then from each of those states follow the 'a'-transition, then take the epsilon-closure of all the resulting states.
In a plain NFA, δ̂(q, ε) = {q}. In an ε-NFA, δ̂(q, ε) = CL(q), which may include many states!
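The two cases translate into a short loop: start from CL(q), then for each symbol move and close again. The sketch below assumes δ is a nested dictionary with ε under `EPS`; `eclose_set` and `delta_hat` are illustrative names, and the machine is the q0..q4 example used later in these slides.

```python
EPS = ""  # stand-in key for ε (an assumption of this sketch)


def eclose_set(delta, states):
    """ε-closure of a set of states (DFS over ε-edges)."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in delta.get(p, {}).get(EPS, set()) - seen:
            seen.add(r)
            stack.append(r)
    return seen


def delta_hat(delta, q, w):
    """Extended transition function for an ε-NFA."""
    current = eclose_set(delta, {q})           # base case: δ̂(q, ε) = CL(q)
    for a in w:                                # inductive case, one symbol at a time
        moved = set()
        for p in current:
            moved |= delta.get(p, {}).get(a, set())
        current = eclose_set(delta, moved)     # close after every symbol
    return current


delta = {
    "q0": {EPS: {"q1"}},
    "q1": {"a": {"q2"}},
    "q2": {EPS: {"q3"}},
    "q3": {"b": {"q4"}},
    "q4": {},
}

print(sorted(delta_hat(delta, "q0", "")))    # ['q0', 'q1']
print(sorted(delta_hat(delta, "q0", "a")))   # ['q2', 'q3']
print(sorted(delta_hat(delta, "q0", "ab")))  # ['q4']
print(sorted(delta_hat(delta, "q0", "b")))   # []
```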
ε-NFA: accepts "ab" or "b"
An ε-NFA E = (Q, Σ, δ, q0, F) accepts string w if and only if:
δ̂(q0, w) ∩ F ≠ ∅
That is, after processing w (with all epsilon-closures), at least one of the states we end up in is an accept state.
The language recognized by E is:
L(E) = { w ∈ Σ* | δ̂(q0, w) ∩ F ≠ ∅ }
Even if q0 is NOT an accept state, the ε-NFA might still accept the empty string! If any state in CL(q0) is an accept state, then ε ∈ L(E).
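The acceptance condition is a one-line check once δ̂ is available. The sketch below assumes the nested-dictionary encoding with ε under `EPS`; the two-state machine is hypothetical and chosen so that q0 is not accepting yet ε is still in the language.

```python
EPS = ""  # stand-in key for ε (an assumption of this sketch)


def eclose_set(delta, states):
    """ε-closure of a set of states."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in delta.get(p, {}).get(EPS, set()) - seen:
            seen.add(r)
            stack.append(r)
    return seen


def accepts(delta, start, accepting, w):
    """True iff δ̂(start, w) ∩ F is non-empty."""
    current = eclose_set(delta, {start})
    for a in w:
        moved = set()
        for p in current:
            moved |= delta.get(p, {}).get(a, set())
        current = eclose_set(delta, moved)
    return bool(current & accepting)


# Hypothetical machine: q0 --ε--> q1, q1 loops on 'a', F = {q1}.
delta = {"q0": {EPS: {"q1"}}, "q1": {"a": {"q1"}}}
start, accepting = "q0", {"q1"}

print(accepts(delta, start, accepting, ""))    # True: q1 ∈ CL(q0)
print(accepts(delta, start, accepting, "aa"))  # True
print(accepts(delta, start, accepting, "b"))   # False
```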
We can eliminate all epsilon-transitions to get an equivalent ordinary NFA.
Given ε-NFA E = (Q, Σ, δE, q0, F), construct NFA N = (Q, Σ, δN, q0, F') where:
If CL(q0) ∩ F ≠ ∅, then q0 becomes an accept state in the NFA (even if it wasn't before). This ensures the NFA still accepts ε when the ε-NFA did.
| State | a | b | ε |
|---|---|---|---|
| → q0 | ∅ | ∅ | {q1} |
| q1 | {q2} | ∅ | ∅ |
| q2 | ∅ | ∅ | {q3} |
| q3 | ∅ | {q4} | ∅ |
| *q4 | ∅ | ∅ | ∅ |
| State | a | b |
|---|---|---|
| → q0 | -- | -- |
| q1 | -- | -- |
| q2 | -- | -- |
| q3 | -- | -- |
| *q4 | -- | -- |
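The blank table can be filled in mechanically. The sketch below assumes one common textbook construction, δN(q, a) = CL( ⋃_{p ∈ CL(q)} δE(p, a) ), paired with the accept-state rule stated above; other equivalent constructions exist, and the names `remove_epsilon` and `eclose_set` are illustrative. It is applied to the q0..q4 ε-NFA from the first table.

```python
EPS = ""  # stand-in key for ε (an assumption of this sketch)


def eclose_set(delta, states):
    """ε-closure of a set of states."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in delta.get(p, {}).get(EPS, set()) - seen:
            seen.add(r)
            stack.append(r)
    return seen


def remove_epsilon(states, alphabet, delta, start, accepting):
    """Build an ε-free NFA table: close, move on a, close again."""
    new_delta = {}
    for q in states:
        row = {}
        for a in alphabet:
            moved = set()
            for p in eclose_set(delta, {q}):
                moved |= delta.get(p, {}).get(a, set())
            row[a] = eclose_set(delta, moved)
        new_delta[q] = row
    new_accepting = set(accepting)
    if eclose_set(delta, {start}) & accepting:
        new_accepting.add(start)  # keeps ε in the language when it was accepted
    return new_delta, new_accepting


states = ["q0", "q1", "q2", "q3", "q4"]
delta = {
    "q0": {EPS: {"q1"}},
    "q1": {"a": {"q2"}},
    "q2": {EPS: {"q3"}},
    "q3": {"b": {"q4"}},
    "q4": {},
}
nfa, nfa_accepting = remove_epsilon(states, {"a", "b"}, delta, "q0", {"q4"})

for q in states:
    print(q, sorted(nfa[q]["a"]), sorted(nfa[q]["b"]))
# q0 ['q2', 'q3'] []
# q1 ['q2', 'q3'] []
# q2 [] ['q4']
# q3 [] ['q4']
# q4 [] []
print(sorted(nfa_accepting))  # ['q4']
```

Here CL(q0) = {q0, q1} contains no accept state, so q0 stays non-accepting and the accept set is unchanged.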
We can skip the intermediate NFA and go straight from ε-NFA to DFA using a modified subset construction with epsilon-closures baked in.
δDFA(S, a) = CL( ⋃_{q ∈ S} δ(q, a) )
Move on a, then epsilon-close. It's the same algorithm you already know from NFA → DFA, except you wrap every intermediate result in CL(). Think of it as "subset construction wearing epsilon-closure glasses."
| State | a | b | ε |
|---|---|---|---|
| → q0 | ∅ | ∅ | {q1} |
| q1 | {q2} | ∅ | ∅ |
| q2 | ∅ | {q3} | ∅ |
| *q3 | ∅ | ∅ | ∅ |
| DFA State | ε-NFA States | a | b | Accept? |
|---|---|---|---|---|
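The rows of this table can be generated with a worklist version of the subset construction. The sketch below assumes the start state of the DFA is CL(q0) and that a DFA state is accepting when it contains an ε-NFA accept state; the names `to_dfa` and `eclose_set` are illustrative. It runs on the q0..q3 ε-NFA above.

```python
from collections import deque

EPS = ""  # stand-in key for ε (an assumption of this sketch)


def eclose_set(delta, states):
    """ε-closure of a set of states."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in delta.get(p, {}).get(EPS, set()) - seen:
            seen.add(r)
            stack.append(r)
    return seen


def to_dfa(alphabet, delta, start, accepting):
    """Subset construction with ε-closure applied after every move."""
    start_set = frozenset(eclose_set(delta, {start}))
    dfa = {}
    queue = deque([start_set])
    while queue:
        S = queue.popleft()
        if S in dfa:
            continue
        dfa[S] = {}
        for a in sorted(alphabet):
            moved = set()
            for q in S:
                moved |= delta.get(q, {}).get(a, set())
            T = frozenset(eclose_set(delta, moved))  # move on a, then close
            dfa[S][a] = T
            if T not in dfa:
                queue.append(T)
    dfa_accepting = {S for S in dfa if S & accepting}
    return start_set, dfa, dfa_accepting


delta = {
    "q0": {EPS: {"q1"}},
    "q1": {"a": {"q2"}},
    "q2": {"b": {"q3"}},
    "q3": {},
}
start_set, dfa, dfa_accepting = to_dfa({"a", "b"}, delta, "q0", {"q3"})

print(sorted(start_set))                       # ['q0', 'q1']
print(sorted(dfa[start_set]["a"]))             # ['q2']
print(sorted(dfa[frozenset({"q2"})]["b"]))     # ['q3']
print(len(dfa))                                # 4 (includes the dead state ∅)
```

Only subsets reachable from CL(q0) are generated, so the result has four DFA states rather than all 16 subsets of Q.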
F = {q3, q5}, ε: q0→{q1,q4}
Thompson's Construction converts any regex to an ε-NFA mechanically. Epsilon-transitions are the glue:
Every regex operator (union, concat, star) maps to an ε-NFA pattern.
Epsilon-transitions let us compose automata like functions. Build small, test small, combine freely. This idea underlies lexer generators such as lex and automata-based regex matchers such as grep.
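To make the "glue" concrete, here is a minimal sketch of Thompson-style fragments for a single symbol, concatenation, union, and star. It follows the usual textbook patterns but is not taken from the slides: the `Builder` class, its method names, and the choice to join concatenated fragments with an ε-edge (rather than merging states) are assumptions of this example.

```python
from collections import defaultdict
from itertools import count

EPS = ""  # stand-in key for ε (an assumption of this sketch)


class Builder:
    """Accumulates transitions; each fragment is a (start, accept) pair."""

    def __init__(self):
        self.delta = defaultdict(lambda: defaultdict(set))
        self._ids = count()

    def _edge(self, p, label, r):
        self.delta[p][label].add(r)

    def symbol(self, a):
        s, f = next(self._ids), next(self._ids)
        self._edge(s, a, f)
        return s, f

    def concat(self, x, y):
        self._edge(x[1], EPS, y[0])        # glue accept of x to start of y
        return x[0], y[1]

    def union(self, x, y):
        s, f = next(self._ids), next(self._ids)
        for frag in (x, y):
            self._edge(s, EPS, frag[0])    # choose a branch for free
            self._edge(frag[1], EPS, f)    # rejoin for free
        return s, f

    def star(self, x):
        s, f = next(self._ids), next(self._ids)
        self._edge(s, EPS, x[0])           # enter the loop body
        self._edge(s, EPS, f)              # or skip it (zero repetitions)
        self._edge(x[1], EPS, x[0])        # repeat
        self._edge(x[1], EPS, f)           # or leave
        return s, f


def accepts(delta, frag, w):
    """Simulate the fragment as an ε-NFA with one start and one accept state."""
    def close(states):
        seen, stack = set(states), list(states)
        while stack:
            p = stack.pop()
            for r in delta[p][EPS] - seen:
                seen.add(r)
                stack.append(r)
        return seen

    current = close({frag[0]})
    for a in w:
        current = close({r for p in current for r in delta[p][a]})
    return frag[1] in current


b = Builder()
# Regex (a|b)*b, built bottom-up from fragments.
regex = b.concat(b.star(b.union(b.symbol("a"), b.symbol("b"))), b.symbol("b"))

print(accepts(b.delta, regex, "b"))     # True
print(accepts(b.delta, regex, "abab"))  # True
print(accepts(b.delta, regex, "a"))     # False
print(accepts(b.delta, regex, ""))      # False
```

Each operator adds at most two states and a handful of ε-edges, and no existing fragment is ever modified internally, which is the "snap together without redesigning" property described earlier.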
| Concept | Definition |
|---|---|
| ε-transition | Move between states without consuming input |
| CL(q) | All states reachable from q via ε* |
| CL(S) | ∪ CL(q) for all q in S |
| δ̂(q,ε) | = CL(q) |
| δ̂(q,xa) | = CL( ∪ δ(p,a) for all p in δ̂(q,x) ) |
| Accepts w | δ̂(q0,w) ∩ F ≠ ∅ |
ε-transitions are like escalators in a mall -- they move you between floors for free. The mall (language recognized) doesn't change if you remove them and add staircases (direct transitions) instead. Just the convenience of getting around changes.