The Ultimate Model of Computation
CS305 - Formal Language Theory
We have been climbing a ladder of computational power. The Turing Machine sits at the very top.
| Type | Grammar | Machine |
|---|---|---|
| 3 | Regular | DFA / NFA |
| 2 | Context-Free | PDA |
| 1 | Context-Sensitive | LBA |
| 0 | Unrestricted | TM |
Each class strictly contains the ones with higher type numbers: every regular language is context-free, every context-free language is context-sensitive, and so on up to Type 0. TMs can do everything the lower models can, and more.
A TM consists of three parts: an infinite tape, a read/write head, and a finite control.
Think of a TM as the simplest possible general-purpose computer. The tape is RAM (but infinite). The head is the read/write mechanism. The finite control is the CPU running a fixed program.
| Symbol | Name | Description |
|---|---|---|
| Q | States | Finite set of states |
| Σ | Input alphabet | Symbols in input (Σ ⊂ Γ, B ∉ Σ) |
| Γ | Tape alphabet | All tape symbols (Σ ∪ {B} ⊆ Γ) |
| δ | Transition fn | Q × Γ → Q × Γ × {L, R} |
| q0 | Start state | q0 ∈ Q |
| B | Blank | B ∈ Γ but B ∉ Σ |
| F | Final states | F ⊆ Q (accepting) |
Σ is what the input is made of (e.g., {0, 1}). Γ includes Σ, the blank B, and work symbols (e.g., X, Y for marking).
The blank B is in Γ but NOT in Σ. The input never contains blanks -- blanks only appear as "empty" cells on the tape.
Each transition: what to write, where to move, which state next.
δ(q1, 0) = (q2, X, R)
"In state q1, reading 0: write X, move right, go to q2."
The TM halts when δ is undefined for the current (state, symbol) pair.
Unlike DFAs, a TM might never stop. This is a fundamental feature that leads to undecidability.
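To make the definition concrete, here is a minimal single-tape simulator sketch in Python. The dict-based δ, the name run_tm, the step limit, and treating q2 as accepting in the demo are all illustrative choices, not part of the formal model; acceptance is simply checked when the machine halts with no transition defined, and the step limit only protects the demo from the "might never stop" behaviour, it does not decide halting.

```python
# Minimal single-tape TM simulator (illustrative sketch).
# delta maps (state, symbol) -> (next_state, symbol_to_write, 'L' or 'R').

def run_tm(delta, input_string, start, accept, blank='B', max_steps=10_000):
    tape = dict(enumerate(input_string))       # sparse tape; missing cells are blank
    head, state = 0, start
    for _ in range(max_steps):
        symbol = tape.get(head, blank)
        if (state, symbol) not in delta:       # delta undefined => the TM halts
            return state in accept, tape
        state, write, move = delta[(state, symbol)]
        tape[head] = write                     # write first, then move the head one cell
        head += 1 if move == 'R' else -1
    raise RuntimeError("no halt within max_steps; the TM may loop forever")

# The single transition shown above: delta(q1, 0) = (q2, X, R)
delta = {('q1', '0'): ('q2', 'X', 'R')}
accepted, tape = run_tm(delta, '0', start='q1', accept={'q2'})
print(accepted)                                # True: halts in q2, treated as accepting here
print(''.join(tape[i] for i in sorted(tape)))  # 'X': the 0 was overwritten
```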
TM for { w ∈ {a,b}* | w = wᴿ }. Strategy: match first & last chars, shrink inward.
Mark the first character with X, scan right to the last unmarked character, check that it matches, mark it with X too, and scan back left. Repeat until every character is marked. The state acts as 1-bit memory — it "remembers" which character was seen on the left, then verifies the match on the right.
Complete δ for M = ({q0, qa, qb, qca, qcb, qback, qacc}, {a,b}, {a,b,X,B}, δ, q0, B, {qacc})
| State | Read | Next State | Write | Move |
|---|---|---|---|---|
| q0 | a | qa | X | R |
| q0 | b | qb | X | R |
| q0 | X | q0 | X | R |
| q0 | B | qacc | B | R |
| qa | a | qa | a | R |
| qa | b | qa | b | R |
| qa | X | qca | X | L |
| qa | B | qca | B | L |
| qb | a | qb | a | R |
| qb | b | qb | b | R |
| qb | X | qcb | X | L |
| qb | B | qcb | B | L |
| State | Read | Next State | Write | Move |
|---|---|---|---|---|
| qca | a | qback | X | L |
| qca | X | qacc | X | R |
| qcb | b | qback | X | L |
| qcb | X | qacc | X | R |
| qback | a | qback | a | L |
| qback | b | qback | b | L |
| qback | X | q0 | X | R |
Three rows lead to qacc (accept): q0 reading B, qca reading X, and qcb reading X. Missing entries → reject (halt with no transition). E.g., qca reading b = mismatch → reject.
Even a "simple" palindrome check needs this many transitions. TM programming is low-level — every read/write/move must be specified.
Building TMs for complex tasks uses a few recurring "tricks."
Replace a symbol with a "marked" version (e.g., 0 → X) to remember you processed it.
To insert or delete a symbol, shift all symbols right/left by one cell. Requires O(n) steps per shift.
Treat each tape cell as a tuple. E.g., one track for data, one for markers.
Design a TM for a subtask, then "call" it by entering its start state.
These techniques show TMs can simulate structured programming: variables (marked cells), arrays (tape regions), and function calls (subroutines).
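As a tiny illustration of the multiple-tracks idea: each cell of Γ is a pair, so one read/write of the single head updates both tracks at once. The (data, mark) pairing, the state name q_scan, and the marking task are all made up for this sketch.

```python
# Multiple tracks: each tape cell is a tuple (data, mark), i.e. Gamma is a set of pairs.
BLANK = ('B', ' ')

# A one-state sweep that stamps '*' on the mark track above every 'a'.
delta = {
    ('q_scan', ('a', ' ')): ('q_scan', ('a', '*'), 'R'),
    ('q_scan', ('b', ' ')): ('q_scan', ('b', ' '), 'R'),
}

def run(w):
    tape = {i: (c, ' ') for i, c in enumerate(w)}   # data track = input, mark track empty
    head, state = 0, 'q_scan'
    while (state, tape.get(head, BLANK)) in delta:
        state, cell, move = delta[(state, tape.get(head, BLANK))]
        tape[head] = cell                            # one write updates both tracks
        head += 1 if move == 'R' else -1
    return [tape[i] for i in sorted(tape)]

print(run('abab'))   # [('a','*'), ('b',' '), ('a','*'), ('b',' ')]
```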
TMs can compute functions, not just decide languages. Here: add 1 to a binary number.
This is exactly how you add 1 by hand: start from the right, flip bits, and carry the 1 leftward until you find a 0 to absorb it.
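One possible δ for this incrementer, in the same dict style as above (the state names q_right, q_carry, q_done are mine): scan right to the blank, then carry leftward.

```python
# Binary increment: phase 1 scans right, phase 2 carries left.
delta = {
    ('q_right', '0'): ('q_right', '0', 'R'),   # phase 1: scan right to the blank
    ('q_right', '1'): ('q_right', '1', 'R'),
    ('q_right', 'B'): ('q_carry', 'B', 'L'),   # found the right end; turn around
    ('q_carry', '1'): ('q_carry', '0', 'L'),   # phase 2: flip 1s to 0s, keep carrying
    ('q_carry', '0'): ('q_done', '1', 'L'),    # a 0 absorbs the carry; done
    ('q_carry', 'B'): ('q_done', '1', 'L'),    # all 1s: the carry becomes a new leading 1
}

def increment(bits, max_steps=10_000):
    tape, head, state = dict(enumerate(bits)), 0, 'q_right'
    for _ in range(max_steps):
        sym = tape.get(head, 'B')
        if (state, sym) not in delta:          # q_done has no outgoing transitions: halt
            break
        state, write, move = delta[(state, sym)]
        tape[head] = write
        head += 1 if move == 'R' else -1
    return ''.join(tape[i] for i in sorted(tape)).strip('B')

print(increment('1011'))   # '1100'  (11 + 1 = 12)
print(increment('111'))    # '1000'  ( 7 + 1 =  8)
```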
A multi-tape TM has k tapes, each with its own independent read/write head.
Reads all heads at once, writes to all, moves each independently.
Multi-tape TMs are much easier to program. Use one tape for input, one as scratch space. They are often the "go-to" model for algorithm design in theory.
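A sketch of why multi-tape programming is pleasant: with two tapes, "copy the input onto the scratch tape" is a single three-transition phase (in a 2-tape palindrome checker this would be followed by rewinding tape 1 and comparing it against tape 2 read backwards). The transition format and the stay move S are the usual multi-tape conventions; the names are mine.

```python
# A 2-tape TM (illustrative): copy the input from tape 1 to tape 2, then halt.
# delta maps (state, (sym1, sym2)) -> (next_state, (write1, write2), (move1, move2)),
# with moves in {'L', 'R', 'S'}.
delta = {
    ('q_copy', ('0', 'B')): ('q_copy', ('0', '0'), ('R', 'R')),   # copy a 0, advance both heads
    ('q_copy', ('1', 'B')): ('q_copy', ('1', '1'), ('R', 'R')),   # copy a 1, advance both heads
    ('q_copy', ('B', 'B')): ('q_done', ('B', 'B'), ('S', 'S')),   # end of input: stop
}

def run_2tape(w, max_steps=10_000):
    tapes = [dict(enumerate(w)), {}]            # tape 1 holds the input, tape 2 is blank
    heads, state = [0, 0], 'q_copy'
    for _ in range(max_steps):
        syms = tuple(t.get(h, 'B') for t, h in zip(tapes, heads))   # read all heads at once
        if (state, syms) not in delta:
            break
        state, writes, moves = delta[(state, syms)]
        for i in range(2):                      # write and move each head independently
            tapes[i][heads[i]] = writes[i]
            heads[i] += {'L': -1, 'R': 1, 'S': 0}[moves[i]]
    return state, ''.join(tapes[1][i] for i in sorted(tapes[1])).strip('B')

print(run_2tape('0110'))   # ('q_done', '0110'): tape 2 now holds a copy of the input
```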
Every k-tape TM can be simulated by a single-tape TM. Here's exactly how.
The single tape stores the contents of all k tapes, with a special * marker recording where each virtual head sits.
To simulate one step: sweep across the used portion of the tape to find the * markers and read the symbols under each virtual head, then sweep again to write the new symbols and move each * marker left or right.
Each step of the k-tape TM therefore costs O(n) steps on the single tape (one sweep to find the markers), where n is the length of the used portion. After t steps the used portion spans at most O(t) cells, so total cost is O(t²) — polynomial slowdown only.
Adding more tapes does NOT increase the class of languages recognized. Convenience yes, extra power no. Same languages, just slower.
An NTM can have multiple possible transitions for the same (state, symbol) pair.
An NTM accepts if at least one computation path reaches an accepting state.
NTMs are equivalent in power to DTMs. Every NTM can be simulated by a DTM using BFS of the computation tree.
The DTM simulation may be exponentially slower: O(cᵗ) steps. Whether this blowup is necessary is the P vs NP problem!
NTM = "lucky guesser" that always picks the right branch. DTM = methodical searcher trying every branch. Same problems solved, potentially different speeds.
Input tape: 0 1 1 0
What will the tape contain when the TM halts?
Unlike DFAs/PDAs that always finish reading input, a TM on input w can do one of three things:
Enters an accept state and halts
Enters a reject state and halts
Never halts — runs infinitely
The third possibility — looping — is what makes TMs fundamentally different from DFAs and PDAs.
If you run a TM and it hasn't halted after 1 billion steps, you cannot tell whether it will eventually halt or loop forever. There is no general algorithm to decide this — that is the Halting Problem (covered in the UTM lecture).
DFA/PDA = a test that always finishes and gives you a grade.
TM = a test that might finish... or might keep grading forever. You're stuck waiting, unsure if it will ever return your paper.
We classify TMs based on whether they always halt, which defines two fundamental language classes.
A TM that halts on every input — it always says YES or NO, never loops.
Examples: {aⁿbⁿcⁿ}, {palindromes}, {connected graphs}
A TM that accepts strings in L but may loop forever on strings not in L.
Example: A_TM = {⟨M,w⟩ | M accepts w}
Every decider is also a recognizer (halting is a special case of not-looping). But NOT every recognizer is a decider. Decidable ⊂ Recognizable.
The most important philosophical claim in computer science — proposed independently by Alonzo Church and Alan Turing in 1936.
"Every function that would naturally be regarded as computable can be computed by a Turing Machine."
In plain English: if there is an algorithm for it, a TM can do it.
- A thesis — a claim / hypothesis, not a proven fact
- Universally supported — every proposed model is equivalent to TMs
- Battle-tested — no counterexample in 90 years
Evidence: Lambda calculus, recursive functions, RAM machines, Post systems, quantum computers — all compute exactly what TMs compute. Nothing more.
- Not a theorem — cannot be formally proved (no rigorous definition of "algorithm" to prove against)
- Not about speed — TMs can be astronomically slow; this says nothing about efficiency
- Not unfalsifiable — could theoretically be disproved by finding a "computable" function no TM can compute
If an exam asks "Is the Church-Turing Thesis a theorem?" the answer is NO. It is a widely accepted hypothesis. Saying it is "proved" is always wrong.
What is wrong? What input breaks it?
Many TM variations exist. Remarkably, they all have the exact same computational power.
| Variant | Description |
|---|---|
| Multi-tape | k independent tapes + heads |
| Multi-track | Each cell holds a tuple |
| 2-way infinite | Tape infinite in both directions |
| Stay option | Head can stay put (L, R, or S) |
| Multi-head | Multiple heads on one tape |
| Nondeterministic | Multiple transitions per pair |
| Enumerator | Prints strings instead of accepting |
This robustness is evidence for the Church-Turing Thesis. No matter how you tweak the TM model, you get the same computable functions.
Any TM M can be encoded as a binary string ⟨M⟩. This allows TMs to take other TMs as input!
Once TMs are strings, we can feed a TM description to another TM. This is what makes the Universal TM and self-referential questions like the Halting Problem possible.
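One common encoding scheme (the exact convention varies by textbook, and the helper name encode_tm is mine): number states, tape symbols and directions from 1, write each transition δ(q_i, X_j) = (q_k, X_l, D_m) as 0^i 1 0^j 1 0^k 1 0^l 1 0^m, and separate transitions with 11. Whatever the details, ⟨M⟩ ends up being an ordinary finite binary string.

```python
# A standard-style binary encoding <M> of a TM's transition table.

def encode_tm(delta, states, symbols):
    state_no = {q: i + 1 for i, q in enumerate(states)}    # q1, q2, ...
    sym_no   = {x: i + 1 for i, x in enumerate(symbols)}   # X1, X2, ...
    dir_no   = {'L': 1, 'R': 2}
    parts = []
    for (q, x), (q2, y, d) in sorted(delta.items()):
        nums = [state_no[q], sym_no[x], state_no[q2], sym_no[y], dir_no[d]]
        parts.append('1'.join('0' * n for n in nums))       # one transition as 0^..1..0^..
    return '11'.join(parts)                                  # 11 separates transitions

# The one-transition example from earlier: delta(q1, 0) = (q2, X, R)
delta = {('q1', '0'): ('q2', 'X', 'R')}
print(encode_tm(delta, states=['q1', 'q2'], symbols=['0', '1', 'X', 'B']))
# -> '0101001000100', a finite binary string another TM can take as input
```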
There are only countably many TMs (each is a finite binary string), but uncountably many languages (by Cantor's diagonal argument). Therefore most languages have no TM at all — they are not even recognizable, let alone decidable.
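The same counting argument written out in symbols (notation mine, just restating the claim above):

```latex
\[
  |\{\text{TMs}\}| \;\le\; |\{0,1\}^*| \;=\; \aleph_0
  \qquad\text{whereas}\qquad
  |\{\,L \subseteq \Sigma^*\,\}| \;=\; |\mathcal{P}(\Sigma^*)| \;=\; 2^{\aleph_0} \;>\; \aleph_0 ,
\]
```

so the map sending each TM to the language it recognizes cannot be onto: almost every language is left out.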
An Instantaneous Description (ID) is a snapshot of a TM at one moment: the tape contents, head position, and current state — written as a single string.
The ID string uqv means:
u = tape symbols to the left of the head
q = the current state (inserted where the head is)
v = the symbol under the head plus everything to its right
The head reads the first symbol of v.
⊢ means "yields in one step." The full sequence is a computation history — a complete record of what the TM did.
IDs let us write an entire computation as a mathematical object (a sequence of strings). This is essential for formal proofs — e.g., proving a language is undecidable by encoding computation histories.
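A sketch of how a computation history can be printed mechanically (the function id_sequence is mine; the δ is the binary-increment machine sketched earlier): each line is one ID, with the state written at the head's position.

```python
def id_sequence(delta, w, start, blank='B', max_steps=50):
    tape, head, state = list(w) or [blank], 0, start
    history = []
    for _ in range(max_steps):
        # ID "u q v": tape left of the head, then the state, then the rest of the tape
        history.append((''.join(tape[:head]) + ' ' + state + ' ' + ''.join(tape[head:])).strip())
        sym = tape[head]
        if (state, sym) not in delta:
            break                                   # halted: this was the final ID
        state, write, move = delta[(state, sym)]
        tape[head] = write
        head += 1 if move == 'R' else -1
        if head == len(tape):
            tape.append(blank)                      # grow the tape to the right
        elif head < 0:
            tape.insert(0, blank); head = 0         # grow the tape to the left
    return history

# The binary-increment delta from earlier, run on "011":
delta = {
    ('q_right', '0'): ('q_right', '0', 'R'), ('q_right', '1'): ('q_right', '1', 'R'),
    ('q_right', 'B'): ('q_carry', 'B', 'L'),
    ('q_carry', '1'): ('q_carry', '0', 'L'), ('q_carry', '0'): ('q_done', '1', 'L'),
    ('q_carry', 'B'): ('q_done', '1', 'L'),
}
for line in id_sequence(delta, '011', start='q_right'):
    print(line)   # from "q_right 011" down to "q_done B100B": 011 + 1 = 100
```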
A Turing Machine has a finite control, an infinite tape, and a read/write head. It reads, writes, moves L/R, changes state. It is the most powerful standard model of computation.
On input 01, what is the sequence of IDs?
Hint: You can use the marking technique.
Which strategy works?