CS305 — Formal Language Theory
We've climbed the Chomsky hierarchy. Now we hit the ceiling.
Turing machines are the most powerful computational model we know. But are there problems that even TMs cannot solve? The shocking answer is yes — and most problems are unsolvable.
Think of it like exploring a mountain range. We've been climbing higher (more powerful machines). Now we discover there's a hard ceiling — a boundary even infinite time and memory can't cross.
Any TM can be written as a finite binary string 〈M〉.
A Turing machine M is defined by finite components:
All of this is finite, so we can encode it as a binary string 〈M〉.
A Python program is just a text file — a string of characters. Similarly, a Turing machine is just a string of bits. Programs are data. This is the insight that changes everything.
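The slides do not fix a particular encoding, so the following is only a sketch of one common textbook-style scheme, chosen here as an assumption: each transition is a tuple of five positive indices (state, symbol, next state, written symbol, direction), each index is written in unary with 0s, fields are separated by a single 1, and transitions are separated by 11. The function names `encode_tm` and `decode_tm` are hypothetical.

```python
# Assumed encoding (not taken from the slides): a transition
# delta(q_i, X_j) = (q_k, X_l, D_m) becomes 0^i 1 0^j 1 0^k 1 0^l 1 0^m,
# and consecutive transitions are joined with "11".

def encode_tm(transitions):
    """Encode a list of (i, j, k, l, m) tuples, all indices >= 1, as a bit string."""
    return "11".join("1".join("0" * n for n in t) for t in transitions)

def decode_tm(bits):
    """Invert encode_tm: recover the list of index tuples from the bit string."""
    return [tuple(len(block) for block in chunk.split("1"))
            for chunk in bits.split("11")]

# A toy two-transition machine, written purely as data.
toy = [(1, 1, 2, 1, 1), (2, 2, 1, 2, 2)]
code = encode_tm(toy)
print(code)              # a string over {0, 1}
print(decode_tm(code))   # the original transition list
```

Because every index is at least 1, the pattern 11 can only occur between transitions, which is what makes the decoding unambiguous.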
One machine to simulate them all.
The UTM U takes two inputs: the code of a TM 〈M〉 and an input string w. It simulates M running on w. Whatever M would do, U does the same. It is a universal simulator.
The UTM is an interpreter. Just like Python reads a .py file and executes it, the UTM reads the encoding of a TM and executes it. Or think of it as an emulator — like running a Game Boy game on your PC.
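To make the interpreter analogy concrete, here is a minimal sketch of a simulator that takes a machine description as data. The dictionary format, the state names, and the `max_steps` budget are assumptions made for this example; a real UTM has no step budget and works on the encoded string directly.

```python
def run_tm(delta, accept, reject, tape_input, start="q0", blank="_", max_steps=10_000):
    """Simulate a single-tape TM described by the dict `delta`.

    delta maps (state, symbol) -> (next_state, written_symbol, "L" or "R").
    Returns "accept", "reject", or None if the step budget runs out.
    A missing transition is treated as reject (an assumption of this sketch).
    """
    tape = dict(enumerate(tape_input))
    state, head, steps = start, 0, 0
    while state not in (accept, reject):
        if steps == max_steps:
            return None
        symbol = tape.get(head, blank)
        if (state, symbol) not in delta:
            return "reject"
        state, tape[head], move = delta[(state, symbol)]
        head += 1 if move == "R" else -1
        steps += 1
    return "accept" if state == accept else "reject"

# Example "program": accept strings with an even number of 1s.
EVEN_ONES = {
    ("q0", "1"): ("q1", "1", "R"),
    ("q1", "1"): ("q0", "1", "R"),
    ("q0", "_"): ("qa", "_", "R"),
    ("q1", "_"): ("qr", "_", "R"),
}
print(run_tm(EVEN_ONES, "qa", "qr", "1111"))
```

The simulator itself is fixed; only the data passed to it changes, which is the sense in which one machine simulates all others.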
Three-tape simulation, step by step.
The UTM uses 3 tapes for clarity, but any multi-tape TM can be converted to a single-tape TM. A UTM can therefore be a single-tape machine; it is just slower (the standard conversion costs at most a quadratic slowdown).
The UTM is itself a TM with a finite number of states. The "program" it runs is on its tape, not in its states.
There are more problems than programs. We prove this by constructing a language no TM can handle.
List all TMs down the rows (M1, M2, ...) and all strings across the columns (w1, w2, ...).
Each cell says whether Mi accepts (A) or does not accept (R) string wj. Here R covers both rejecting and looping forever.
Every TM appears in the list, so every language that some TM recognizes is a row in this table. The question is: does every possible language appear as a row?
Look at the diagonal cells: what M1 does on w1, what M2 does on w2, what M3 does on w3, etc.
The diagonal picks one cell from each row — it "samples" the behavior of every TM on its own index string.
Think of it like asking each student to grade their own homework. Student 1 grades problem 1, student 2 grades problem 2, etc.
Create a new language Ld by flipping every diagonal entry: A→R and R→A.
Ld disagrees with M1 on w1 (different at column 1).
Ld disagrees with M2 on w2 (different at column 2).
Ld disagrees with every Mi on wi!
So Ld is not the language of any TM in the list. Since the list contains all TMs, no TM recognizes Ld.
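The flip can be checked on a finite corner of the table. The sketch below fills an n-by-n table at random (standing in for the real, infinite table, which cannot be computed) and confirms that the flipped diagonal differs from every row; the names `flip_diagonal` and `table` are made up for this illustration.

```python
import random

def flip_diagonal(table):
    """table[i][j] is True if machine i accepts string j; return the flipped diagonal."""
    return [not table[i][i] for i in range(len(table))]

random.seed(0)  # any seed works; the argument does not depend on the entries
n = 6
table = [[random.choice([True, False]) for _ in range(n)] for _ in range(n)]
l_d = flip_diagonal(table)

# l_d disagrees with row i at column i, so it cannot equal any row.
for i, row in enumerate(table):
    assert l_d[i] != row[i]
assert l_d not in table
```

The same reasoning applies to the infinite table: whichever row is proposed as a match, the diagonal column of that row is where the match fails.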
A specific language that NO Turing machine can recognize. Not even RE.
Ld = { wi | Mi does NOT accept wi }
The "diagonal" language: wi is in Ld exactly when the i-th TM rejects or loops on wi
Ld is not just undecidable — it's not even recognizable (RE). No TM of any kind can recognize it, even one allowed to loop forever on non-members. This is strictly worse than the Halting Problem.
The most famous undecidable problem in computer science.
HALT = { ⟨M, w⟩ | M is a TM and M halts on input w }
Given a program and an input, does the program eventually stop?
There is no TM that can correctly decide, for every ⟨M, w⟩, whether M halts on w. We prove this by contradiction.
"A barber shaves everyone who doesn't shave themselves." Who shaves the barber? The proof uses the same trick: self-reference + negation = paradox.
If HALT were decidable, we could settle problems no one has been able to solve, such as Goldbach's conjecture:
"Every even number > 2 is the sum of two primes."
(e.g., 4=2+2, 6=3+3, 8=3+5, 28=11+17)
Unproven since 1742! But if HALT were decidable, we could write a program that searches for a counterexample, then use H to check if it halts. If it halts → conjecture is false. If it loops → conjecture is true. Instant answer to a 280-year-old mystery.
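The counterexample search described above might look like the following sketch. The `limit` parameter is an addition for demonstration so the code can be run safely; with `limit=None` the loop halts if and only if the conjecture is false, which is exactly the question a halting decider would answer.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_goldbach_sum(n):
    """True if the even number n can be written as a sum of two primes."""
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

def search_counterexample(limit=None):
    """Return the first even n > 2 that is not a sum of two primes.

    With limit=None this never returns unless a counterexample exists.
    """
    n = 4
    while limit is None or n <= limit:
        if not is_goldbach_sum(n):
            return n
        n += 2
    return None

print(search_counterexample(limit=2000))  # None: no counterexample in this range
```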
We build a paradox machine that breaks any assumed halting decider.
No TM H can decide the Halting Problem. This is a fundamental mathematical impossibility, not a limitation of current technology. No faster computer, better algorithm, or AI will ever solve it.
What happens when we feed D its own encoding?
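The construction can be imitated in Python as a rough sketch. Programs are modeled as generator functions (an assumption of this example) so that "looping forever" can be observed safely with a step budget, and `make_d` builds D from any candidate decider `h`. Whatever `h` predicts about D, D does the opposite.

```python
def halts_within(program, budget=1000):
    """Run a generator-based program for at most `budget` steps; True if it finished."""
    gen = program()
    for _ in range(budget):
        try:
            next(gen)
        except StopIteration:
            return True
    return False

def make_d(h):
    """Build the paradox program D for a claimed halting decider h."""
    def d():
        if h(d):            # h claims "D halts on its own code"...
            while True:     # ...so D loops forever
                yield
        # h claims "D loops", so D halts immediately
    return d

# Two deliberately simple candidate deciders (hypothetical stand-ins for H).
candidates = {"always yes": lambda p: True, "always no": lambda p: False}
for name, h in candidates.items():
    d = make_d(h)
    print(name, "predicts", h(d), "but D halts:", halts_within(d))
```

The step budget is only a stand-in for "never halts"; here we know by construction that the looping branch really is an infinite loop. The two candidates are trivial, but the argument on the slides shows the same mismatch arises for any total `h`.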
Where do Ld and HALT fit? Let's formalize the three classes we've seen.
A TM always halts with the correct YES or NO.
Machine: Decider
Examples:
{aⁿbⁿcⁿ}, primes, {connected graphs}
A TM accepts strings in L, but may loop forever on strings not in L.
Machine: Recognizer
Examples:
HALT, ATM
No TM at all — not even one that loops. Completely unrecognizable.
Machine: None exists
Examples:
Ld, complement of HALT
L is decidable if and only if both L and its complement L̄ are RE.
Decidable = RE ∩ co-RE
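One direction of this equivalence is constructive: given a recognizer for L and a recognizer for its complement, run them in alternation and wait for whichever accepts first. The sketch below assumes recognizers are modeled as generators that finish when they accept and run forever otherwise; `rec_even` and `rec_odd` are toy stand-ins invented for the example.

```python
def decide(rec_l, rec_co_l, w):
    """Alternate single steps of two recognizers; exactly one eventually accepts."""
    in_l, in_co_l = rec_l(w), rec_co_l(w)
    while True:
        for gen, answer in ((in_l, True), (in_co_l, False)):
            try:
                next(gen)          # advance this recognizer by one step
            except StopIteration:  # it accepted, so we know the answer
                return answer

def rec_even(w):
    """Toy recognizer for even-length strings; loops forever on non-members."""
    if len(w) % 2 == 1:
        while True:
            yield
    for _ in w:
        yield

def rec_odd(w):
    """Toy recognizer for the complement: odd-length strings."""
    if len(w) % 2 == 0:
        while True:
            yield
    for _ in w:
        yield

print(decide(rec_even, rec_odd, "ab"), decide(rec_even, rec_odd, "abc"))
```

Neither recognizer is a decider on its own, yet the combination always halts, because every string is in L or in its complement.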
Decidable ⊊ RE ⊊ All Languages.
Each level is strictly larger. Most languages are not even RE.
How to prove new problems undecidable without building a new paradox each time.
"If you could solve Problem B, you could use that solution to also solve Problem A."
If A is undecidable, then B must be undecidable too — otherwise we'd have a way to solve A.
Imagine you can't solve a maze (Problem A). Someone claims they can solve a different puzzle (Problem B). You show: "If I could solve your puzzle, I could use it to solve my maze." Since your maze is unsolvable, their puzzle must be unsolvable too.
We reduce FROM the known-hard problem TO the new one. Reducing HALT → B means "HALT is no harder than B" — so B is at least as hard as HALT.
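As an illustration of the direction of a reduction, the sketch below turns a HALT instance (M, w) into a machine M' whose language is empty exactly when M does not halt on w. Programs are modeled as plain Python callables, and `decides_empty` is a hypothetical emptiness decider that cannot actually exist; it appears only to show how it would be used.

```python
def build_m_prime(m, w):
    """Map a HALT instance (m, w) to a program m_prime.

    m_prime ignores its own input, runs m on w, and then accepts.
    So L(m_prime) is everything if m halts on w, and empty otherwise.
    """
    def m_prime(x):
        m(w)          # never returns if m loops on w
        return True   # reached only if m halted on w
    return m_prime

def decide_halt(m, w, decides_empty):
    """Would decide HALT, if the hypothetical decides_empty existed."""
    return not decides_empty(build_m_prime(m, w))

# The mapping itself is computable and always terminates:
m_prime = build_m_prime(len, "abc")
print(m_prime("any input at all"))
```

Building M' is easy; all the difficulty sits in `decides_empty`. Since HALT is undecidable, no such decider can exist.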
Undecidability spreads through reductions.
ETM = { 〈M〉 | L(M) = ∅ } — Is the language empty?
| Problem | Question | In plain English |
|---|---|---|
| ETM | Is L(M) = ∅? | Does M accept nothing? (every string is rejected or loops) |
| EQTM | Is L(M1) = L(M2)? | Do two TMs accept exactly the same strings? |
| ALLTM | Is L(M) = Σ*? | Does M accept everything? (never rejects or loops) |
| REGTM | Is L(M) regular? | Could a simple DFA do M's job? |
Any non-trivial property of the language of a TM is undecidable. "Is L(M) finite?", "Is L(M) context-free?", etc. — all undecidable!
For each problem, decide: Decidable, RE (but not decidable), or Neither?
The nuclear weapon of undecidability results.
Every non-trivial property of the language recognized by a Turing machine is undecidable.
Not always-true or always-false.
Depends only on what M computes, not how it computes.
"You can never determine ANY interesting fact about what a program outputs, just by looking at the program." You can ask about the code's structure (how many lines? what variables?), but not its behavior.
Not all unsolvable problems are equally unsolvable.
A language is decidable iff it is both RE and co-RE. The decidable languages sit at the intersection.
| Class | Description | Example |
|---|---|---|
| Decidable | TM always halts | aⁿbⁿ |
| RE (not dec.) | TM halts on "yes", may loop on "no" | HALT |
| co-RE (not dec.) | TM halts on "no", may loop on "yes" | Compl(HALT), Ld |
| Neither | Not RE, not co-RE | EQTM |
L is decidable ⇔ L is RE AND co-RE. If HALT's complement were RE, then HALT would be decidable. But it's not.
The reduction below has a mistake. Can you find it?
The full picture of computational power.
| Type | Language | Recognizer | Closure |
|---|---|---|---|
| 3 | Regular | DFA/NFA | ∪ ∩ * comp |
| 2 | Context-Free | PDA | ∪ * (not ∩) |
| 1 | Context-Sensitive | LBA | ∪ ∩ comp |
| 0 | RE | TM | ∪ ∩ (not comp) |
| — | Not RE | None | — |
Each level strictly contains the one below it. The jump from decidable to undecidable is not about "not enough power" — it's a provable impossibility that no amount of computation can overcome.
It's not that we haven't found the right algorithm yet. It's that no algorithm can possibly exist. This is proven mathematically, not just observed empirically.
What we can NEVER build, no matter how smart we get.
We can't solve these in general, but CAN solve them for specific cases. Real tools use heuristics, approximations, and restricted inputs. They work most of the time but can never be perfect.
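A simple example of such an approximation is a step-bounded check that answers "halts" when it can and "unknown" otherwise. This is only a sketch: programs are modeled as generators that yield once per step, and `bounded_halt_check`, `collatz`, and `spin` are names invented for the illustration.

```python
def bounded_halt_check(program, arg, max_steps):
    """Run program(arg) for at most max_steps steps.

    Returns "halts" if it finished in time, otherwise "unknown".
    It never answers "loops", because no finite budget can justify that.
    """
    gen = program(arg)
    for _ in range(max_steps):
        try:
            next(gen)
        except StopIteration:
            return "halts"
    return "unknown"

def collatz(n):
    """Iterate the Collatz map until reaching 1, yielding once per step."""
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield

def spin(_):
    """A program that never halts."""
    while True:
        yield

print(bounded_halt_check(collatz, 6, 100))
print(bounded_halt_check(spin, None, 100))
```

The tool is sound about "halts" but incomplete: a slow program and a looping program both produce "unknown", and raising the budget only moves the boundary.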
Anything as powerful as a TM inherits all the same limitations.
| System | Turing-Complete? |
|---|---|
| Python, Java, C++ | Yes |
| HTML + CSS | Yes (with user interaction) |
| Conway's Game of Life | Yes |
| Magic: The Gathering | Yes |
| PowerPoint | Yes (with animations) |
| Regular Expressions | No (just regular langs) |
| SQL (basic) | No (relational algebra) |
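Church-Turing thesis: every "reasonable" model of computation can be simulated by a Turing machine, and the general-purpose ones are exactly equivalent to it. This is a widely accepted thesis, not a theorem; no physical device is known to compute more than a TM.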
Every Turing-complete system has the same ceiling. No programming language, no matter how advanced, can solve the Halting Problem. Quantum computers can't either: they can be faster on some problems, but they compute exactly the same class of functions as a TM.
Where undecidability meets the rest of computer science.
Some theorists propose machines more powerful than TMs (oracle machines, infinite time TMs). These are mathematical abstractions — no physical device is known to exceed TM power.
Use Rice's Theorem or reduction arguments to classify each language.
Everything you need to know, on one slide.
| Concept | Key Point |
|---|---|
| TM Encoding | Every TM is a finite binary string |
| UTM | One TM simulates all others (= interpreter) |
| Counting | Countable TMs vs uncountable languages |
| Ld | Diagonal language, not even RE |
| HALT | RE but not decidable |
| Reductions | Prove B undecidable via A ≤ B, where A is known undecidable |
| Rice's Thm | All non-trivial props of L(M) undecidable |
1. Assume decider H for HALT exists
2. Build D: run H(〈M,〈M〉〉); do opposite
3. Run D(〈D〉): contradiction either way
4. Therefore H cannot exist
When asked "is X decidable?", first check: is it a non-trivial property of L(M)? If yes, cite Rice's Theorem. Only do a full reduction if Rice's doesn't apply.
Test your understanding.
The key proof technique.
The nuclear option.
You've completed the UTM & Undecidability enhanced slide deck. Key takeaways: (1) TMs can be encoded as strings, enabling the UTM. (2) Diagonalization proves most languages are unrecognizable. (3) The Halting Problem is undecidable via self-reference + negation. (4) Rice's Theorem generalizes undecidability to all non-trivial properties of L(M).