CS305 -- Formal Language Theory & Complexity
$1,000,000 · Unsolved since 1971
We've studied what's computable. Now: among computable problems, which are EFFICIENTLY solvable?
Computability: "Your keys exist somewhere in the universe."
Complexity: "Can you find them before lunch?"
In the first half of the course, we asked: "Can this problem be solved at all?"
Now we ask: "Can it be solved FAST ENOUGH to be practical?"
A problem can be decidable yet completely impractical. An algorithm taking 2ⁿ steps on an input of size n = 1000 would run far longer than the age of the universe. "Solvable in theory" is not "solvable in practice."
The central question of complexity theory:
Is finding solutions fundamentally harder than checking them?
Measuring efficiency as a function of input size n
O(n) -- Linear
O(n log n) -- Linearithmic
O(n²) -- Quadratic
O(n³) -- Cubic
O(nᵏ) -- Polynomial for fixed k
These grow manageably.
O(2ⁿ) -- Exponential
O(n!) -- Factorial
These EXPLODE.
At n=100: O(n²) = 10,000 steps (milliseconds).
O(2ⁿ) ≈ 10³⁰ steps -- more than the age of the universe in nanoseconds.
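The n = 100 comparison can be checked with a few lines of arithmetic. This is a minimal sketch: the figure of roughly 13.8 billion years for the age of the universe is an assumption not stated on the slide, and all variable names are illustrative.

```python
# Compare polynomial and exponential step counts at n = 100.
n = 100

quadratic_steps = n ** 2        # 10,000
exponential_steps = 2 ** n      # about 1.27 * 10**30

# Assumed age of the universe: ~13.8 billion years, expressed in nanoseconds.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
age_of_universe_ns = 13.8e9 * SECONDS_PER_YEAR * 1e9   # about 4.4 * 10**26

print(f"n^2 = {quadratic_steps:,}")
print(f"2^n = {exponential_steps:.3e}")
print(f"2^n / age in ns = {exponential_steps / age_of_universe_ns:.0f}")
```

Under that assumption, 2¹⁰⁰ exceeds the age of the universe in nanoseconds by a factor of a few thousand, so even one step per nanosecond would not be enough.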
Problems solvable in polynomial time by a deterministic Turing Machine
P captures the notion of feasible computation. If a problem is in P, we can build practical software to solve it.
P problems are like recipes where cooking time is proportional to the number of guests. Double the guests? Maybe quadruple the cooking time. Manageable.
| Problem | Algorithm | Time |
|---|---|---|
| Sorting | Merge sort | O(n log n) |
| Shortest Path | Dijkstra | O(V²) |
| GCD | Euclid's | O(log n) |
| Primality | AKS (2002) | O(n⁶) |
| Matching | Edmonds' | O(V³) |
| 2-SAT | SCC-based | O(V+E) |
| Graph connectivity | BFS/DFS | O(V+E) |
Being "in P" doesn't mean trivially fast -- O(n¹⁰⁰) is technically polynomial but impractical. In practice, most P algorithms are low-degree polynomials.
Problems where solutions can be VERIFIED quickly -- the certificate/verifier definition
NP = Nondeterministic Polynomial time, NOT "Non-Polynomial"!
NP does NOT mean "not solvable in polynomial time." Many NP problems ARE in P! NP means verifiable in polynomial time.
Set = {3, 7, 1, 8, 4}, Target = 12.
Verifying {3,1,8} sums to 12: instant (3 additions).
Finding which subset sums to 12: potentially 2ⁿ subsets to check.
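The gap between checking and finding can be sketched in code. The function names `verify_subset_sum` and `find_subset_sum` are hypothetical, chosen for this illustration; the search shown is plain brute force, not the best known algorithm.

```python
from itertools import combinations

def verify_subset_sum(numbers, target, certificate):
    """Verifier: check a proposed subset in polynomial time."""
    pool = list(numbers)
    for x in certificate:
        if x not in pool:      # certificate may only use available numbers
            return False
        pool.remove(x)         # each element can be used at most once
    return sum(certificate) == target

def find_subset_sum(numbers, target):
    """Brute-force search: tries up to 2^n subsets in the worst case."""
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == target:
                return list(subset)
    return None

numbers = [3, 7, 1, 8, 4]
print(verify_subset_sum(numbers, 12, [3, 1, 8]))   # True
print(find_subset_sum(numbers, 12))                # some subset summing to 12
```

The verifier touches each certificate element once, while the search may enumerate every subset before it succeeds or gives up.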
The equivalent "lucky guessing" definition using Nondeterministic Turing Machines
Think of it as a two-phase machine:
Nondeterministically "guess" a certificate. Magically picks the right one if it exists.
Deterministically check the guess in polynomial time. This is the verifier.
A friend with perfect intuition always guesses right. You still double-check their work. If checking is fast (polynomial), the problem is in NP.
Certificate-based: "A short proof exists and can be checked quickly."
NTM-based: "A nondeterministic machine can find and verify in poly time."
Each branch = different candidate certificate.
Every problem in P is also in NP -- but is the reverse true?
If you can solve a problem in polynomial time, you can certainly verify a solution in polynomial time -- just solve it again and compare!
Given a P-algorithm for L: build a verifier V(x,c) that ignores c, runs the P-algorithm on x, and accepts iff the algorithm says YES. V runs in poly time, so L ∈ NP.
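The construction of V can be made concrete with any problem in P. The sketch below assumes graph reachability (decided by BFS) as the example problem; `decide_reachable` and `verifier` are illustrative names, not part of the original slide.

```python
from collections import deque

def decide_reachable(graph, source, target):
    """Poly-time decider (BFS): is target reachable from source?"""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False

def verifier(instance, certificate):
    """V(x, c): ignore c and simply run the decider on x."""
    graph, source, target = instance
    return decide_reachable(graph, source, target)

graph = {"a": ["b"], "b": ["c"], "c": [], "d": []}
print(verifier((graph, "a", "c"), certificate=None))        # True
print(verifier((graph, "a", "d"), certificate="anything"))  # False
```

Because the verifier never reads the certificate, any string works as a "proof" for a YES instance, and no string can make it accept a NO instance.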
This is one of the Clay Millennium Prize Problems. Solve it (either direction) and win $1,000,000. Open since 1971 -- over 50 years of the brightest minds failing to resolve it.
Two possible worlds
If P = NP (most experts think this is UNLIKELY):
Cryptography breaks. RSA, AES, blockchain -- all gone.
Optimization becomes trivial. Scheduling, logistics, protein folding -- all solvable optimally.
Mathematical proofs become automated. Computers find a proof of any theorem that has a reasonably short one.
AI leaps. Optimal neural network training in poly time.
If P ≠ NP:
Cryptography is safe. Encryption works as intended.
Fundamental asymmetry exists. Creating IS harder than checking.
Creativity has value. Some tasks require genuine insight.
Use approximation & heuristics. Practical solutions still exist.
A 2019 poll: 88% of complexity theorists expect P ≠ NP.
Decades of effort by brilliant researchers have failed to find polynomial algorithms for NP-complete problems. It would be astonishing if all that effort missed something. But nobody can PROVE it either way!
The tool for comparing problem difficulty: A ≤ₚ B
Problem A reduces to problem B (written A ≤ₚ B) if we can transform any instance of A into an instance of B in polynomial time, such that solving B gives us the answer to A.
If you can translate French to English quickly, and you have an English dictionary, then you can look up French words: French-lookup reduces to English-lookup.
A ≤ₚ B means "A is no harder than B" (or "B is at least as hard as A").
Showing a hard problem reduces TO your problem means your problem is hard too!
The hardest problems IN NP
A problem is NP-Complete if:
1. It is in NP (solutions can be verified in polynomial time), AND
2. It is NP-Hard (every NP problem reduces to it in poly time)
If you find a polynomial algorithm for ANY NP-complete problem, then P = NP. Conversely, if ANY NP-complete problem has no poly-time algorithm, then P ≠ NP. They're the "gatekeepers."
SAT is NP-complete -- the theorem that launched complexity theory
Cook-Levin theorem (Cook, 1971): SAT is NP-complete. Boolean Satisfiability was the FIRST problem ever proven NP-complete. Every other NP-completeness proof builds on this.
3 variables = 2³ = 8 assignments. n variables = 2ⁿ assignments. At n = 300, that is more than the number of atoms in the observable universe.
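The exponential blow-up is easy to see in a brute-force SAT search. The sketch below assumes a clause encoding in which the integer k stands for variable k and -k for its negation; the function names are illustrative, and the 10⁸⁰ figure for atoms in the observable universe is a commonly quoted estimate, not a value from the slide.

```python
from itertools import product

def satisfies(clauses, assignment):
    """Poly-time check: every clause has at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses, num_vars):
    """Try all 2^n assignments; return a satisfying one or None."""
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if satisfies(clauses, assignment):
            return assignment
    return None

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
clauses = [[1, -2], [2, 3], [-1, -3]]
print(brute_force_sat(clauses, num_vars=3))
print(2 ** 3, 2 ** 300 > 10 ** 80)   # 8 True
```

Checking one assignment is cheap; the cost comes entirely from the number of assignments the loop may have to visit.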
For each problem, decide: Is it in P, NP-complete, or NP-hard (not in NP)?
Cook knocked over the first domino (SAT). Karp followed in 1972 with a list of 21 NP-complete problems. Now thousands of NP-complete problems are known. Each needs just ONE reduction from an existing NP-complete problem.
Step-by-step construction showing how to reduce 3-SAT to the Clique problem
A k-clique in the constructed graph, where k is the number of clauses, corresponds to picking one TRUE literal from each clause with no two picks contradicting each other -- exactly a satisfying assignment!
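One standard way to build that graph is sketched below. It assumes the clause encoding where k means variable k and -k its negation; `three_sat_to_clique` and `has_clique` are hypothetical names, and the clique check is exponential brute force included only to demonstrate the correspondence on a tiny formula.

```python
from itertools import combinations

def three_sat_to_clique(clauses):
    """Build (vertices, edges, k) from a CNF formula.

    One vertex per literal occurrence (clause_index, literal); an edge joins
    two vertices from different clauses whose literals do not contradict.
    """
    vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if u[0] != v[0] and u[1] != -v[1]
    }
    return vertices, edges, len(clauses)

def has_clique(vertices, edges, k):
    """Brute-force clique check, for demonstrating the reduction only."""
    return any(
        all(frozenset(pair) in edges for pair in combinations(group, 2))
        for group in combinations(vertices, k)
    )

clauses = [[1, 2, 3], [-1, -2, 3], [1, -2, -3]]
vertices, edges, k = three_sat_to_clique(clauses)
print(len(vertices), k, has_clique(vertices, edges, k))   # 9 3 True
```

The construction itself runs in polynomial time (at most a quadratic number of vertex pairs); only the demonstration solver is slow.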
This reduction from Vertex Cover to Set Cover has a bug. Find and fix it!
What should Sᵥ contain instead?
Understanding the crucial distinction
NP-Hard: every NP problem reduces to it. "At least as hard as the hardest NP problems."
Does NOT need to be in NP itself!
NP-Complete: NP-Hard AND in NP. The "sweet spot" -- maximally hard within NP. These are the gatekeepers of P vs NP.
NP-hard problems can be harder than NP! They might not even be decidable.
Example: The Halting Problem is NP-hard (every NP problem reduces to it) but it's not in NP -- it's not even decidable!
| Problem | In NP? | NP-Hard? |
|---|---|---|
| Sorting | Yes (in P) | No |
| SAT | Yes | Yes (NP-Complete) |
| TSP Decision | Yes | Yes (NP-Complete) |
| Halting Problem | No | Yes (NP-Hard only) |
| TSP Optimization | No | Yes (NP-Hard only) |
An NP-hard problem is a "master lock." Pick this one lock, every other NP lock opens. NP-complete means it's also a lock in the NP building. Some master locks (Halting Problem) aren't even in the building!
What would each answer mean for the world?
RSA, AES, blockchain -- all broken. Attackers decrypt any message, forge any signature. Online banking, HTTPS -- gone.
Scheduling, logistics, protein folding, chip design -- all solvable optimally in polynomial time.
Checking a proof is fast, so finding a proof of bounded length is an NP problem. If P=NP, computers could automatically find a proof of any theorem that has a reasonably short one.
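The hardness of certain problems is what encryption relies on. P ≠ NP is necessary for that hardness, though not by itself a proof that any particular cipher is secure. Your online banking stays safe.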
Creating IS harder than checking. Writing a proof is harder than verifying one. This asymmetry is built into mathematics.
Some tasks require genuine insight that cannot be shortcut by computation.
Your problem is NP-complete. Now what? Don't despair!
Find a solution provably CLOSE to optimal. Vertex Cover: a simple matching-based greedy algorithm gives ≤ 2× optimal. TSP (metric): Christofides gives ≤ 1.5× optimal.
Simulated annealing, genetic algorithms, local search. TSP tours for millions of cities found near-optimally in practice.
2-SAT is in P. Graph coloring on trees is in P. Your specific inputs might have exploitable structure.
Random assignment satisfies ≥ 7/8 of clauses in MAX-3-SAT on average.
Modern SAT solvers handle millions of variables using DPLL, CDCL, unit propagation. Worst case exponential, typical cases fast.
If n is small, even O(2ⁿ) is fine. Subset Sum with 30 elements? 2³⁰ ≈ 10⁹ -- seconds on modern hardware.
NP-completeness says "you can't always find the exit in a giant maze quickly." But YOUR maze might have helpful signs, be small, or you might accept an exit that's close enough.
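As one example of "close enough," the factor-2 guarantee for Vertex Cover mentioned above can be obtained by repeatedly taking both endpoints of an uncovered edge. The sketch below is a minimal version of that matching-based approach; the function name and the sample graph are illustrative.

```python
def vertex_cover_2approx(edges):
    """Matching-based 2-approximation: take both endpoints of any edge
    that is not yet covered."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

edges = [(1, 2), (1, 3), (1, 4), (4, 5)]
cover = vertex_cover_2approx(edges)
print(sorted(cover))   # [1, 2, 4, 5]; an optimal cover is {1, 4}
```

The edges that trigger an update share no endpoints, and any cover must contain at least one endpoint of each of them, so the result is never more than twice the optimum. Here it is exactly twice.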
NP is just one level in a vast tower of complexity
co-NP: complements of NP problems. "Can you verify something is NOT the case?"
Example: "Is this formula UNSATISFIABLE?"
PSPACE: polynomial SPACE, possibly exponential time. Example: Quantified Boolean Formulas (QBF) -- two-player game logic. PSPACE = NPSPACE (Savitch's theorem).
EXPTIME: exponential time. Generalized chess & checkers are EXPTIME-complete. We KNOW P ≠ EXPTIME!
P ⊆ NP ⊆ PSPACE ⊆ EXPTIME -- all known.
But we only know P ≠ EXPTIME for certain. Whether P ≠ NP and whether NP ≠ PSPACE are still open!
Match each problem to its complexity class
Everything you need to know about P vs NP on one slide
1. Show X is in NP (describe a verifier).
2. Reduce a known NP-complete problem to X.
That's it! Usually reduce from 3-SAT.
P vs NP asks: is finding solutions fundamentally harder than checking solutions? The most important open question in CS. We believe yes (P≠NP), but proving it remains one of humanity's greatest challenges.
Test your understanding of P, NP, and reductions
Test your understanding of reductions and proofs
Final challenge on complexity hierarchy and implications