CS305 -- Formal Language Theory & Complexity
We've studied what's computable. Now: among computable problems, which are EFFICIENTLY solvable?
In the first half of the course, we asked: "Can this problem be solved at all?"
Now we ask: "Can it be solved FAST ENOUGH to be practical?"
Computability: "Your keys exist somewhere in the universe."
Complexity: "Can you find them before lunch?"
Time complexity as a function of input size n
O(n) -- Linear
O(n log n) -- Linearithmic
O(n^2) -- Quadratic
O(n^3) -- Cubic
O(n^k) -- Polynomial for fixed k
These grow manageably.
O(2^n) -- Exponential
O(n!) -- Factorial
O(n^n) -- Superexponential
These EXPLODE.
At n=100, an O(n^2) algorithm takes 10,000 steps (well under a millisecond on modern hardware).
An O(2^n) algorithm takes about 1.3 x 10^30 steps -- that's more than the age of the universe in nanoseconds (roughly 4 x 10^26).
No amount of faster hardware will save you from exponential growth.
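The arithmetic behind these numbers can be checked directly. The sketch below assumes a universe age of about 13.8 billion years; that figure and the variable names are illustrative choices, not part of the slides.

```python
# Step counts at n = 100 for a quadratic and an exponential algorithm.
n = 100
quadratic_steps = n ** 2
exponential_steps = 2 ** n

# Assumption: the universe is roughly 13.8 billion years old.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
universe_age_ns = 13.8e9 * SECONDS_PER_YEAR * 1e9

print(f"n^2 at n=100: {quadratic_steps:,} steps")
print(f"2^n at n=100: {exponential_steps:.3e} steps")
print(f"Age of universe: {universe_age_ns:.3e} ns")
print(f"2^n exceeds that age by a factor of about {exponential_steps / universe_age_ns:,.0f}")
```

Doubling the hardware speed only halves the running time, while adding a single element to the input doubles the work again.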
Problems solvable in polynomial time by a deterministic Turing Machine
P captures the notion of feasible computation. If a problem is in P, we can build practical software to solve it.
P problems are like recipes where the cooking time grows reasonably with the number of guests. Double the guests? Maybe the cooking time doubles, maybe it quadruples. That's manageable.
A tour of problems we CAN solve efficiently
Being "in P" doesn't mean it's trivially fast -- O(n^100) is technically polynomial but impractical. In practice, most P algorithms we use are O(n), O(n log n), or low-degree polynomials.
Problems where solutions can be VERIFIED quickly
NP stands for Nondeterministic Polynomial time, NOT "Non-Polynomial"!
NP does NOT mean "not solvable in polynomial time." Many NP problems ARE in P! NP means verifiable in polynomial time.
Set = {3, 7, 1, 8, 4}, Target = 12.
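To make "verifiable in polynomial time" concrete, here is a minimal sketch of a verifier for this Subset Sum instance. The function name verify_subset_sum and the use of a list as the certificate are assumptions made for illustration.

```python
from collections import Counter

def verify_subset_sum(numbers, target, certificate):
    """Check a claimed solution in time linear in the input size."""
    available = Counter(numbers)
    chosen = Counter(certificate)
    # Every chosen number must come from the set (respecting multiplicity).
    if any(chosen[x] > available[x] for x in chosen):
        return False
    return sum(certificate) == target

numbers = [3, 7, 1, 8, 4]
print(verify_subset_sum(numbers, 12, [8, 4]))     # True
print(verify_subset_sum(numbers, 12, [3, 7, 1]))  # False: sums to 11
print(verify_subset_sum(numbers, 12, [6, 6]))     # False: 6 is not in the set
```

The verifier never searches for a solution. It only checks the one it is handed, which is why it stays fast.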
The equivalent "lucky guessing" definition
Think of it as a two-phase machine:
Nondeterministically "guess" a certificate (candidate solution). Magically picks the right one if it exists.
Deterministically check the guess in polynomial time. This is the verifier.
Imagine you have a friend with perfect intuition. They always guess the right answer, but you still need to double-check their work. If the checking step is fast (polynomial), the problem is in NP.
Certificate-based: "A short proof exists and can be checked quickly."
NTM-based: "A nondeterministic machine can find and verify in poly time."
Each nondeterministic branch corresponds to a different candidate certificate.
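A deterministic computer has no "magic guess", so one way to picture the two phases is to try every branch in turn. The sketch below does that for Subset Sum; the function names are hypothetical, and the loop over all subsets is exactly the exponential cost that nondeterminism hides.

```python
from itertools import combinations

def verify(numbers, target, certificate):
    # Phase 2: the deterministic, polynomial-time check.
    return sum(certificate) == target

def simulate_guess_and_check(numbers, target):
    """Try every branch of the 'guess' phase one after another."""
    branches = 0
    for size in range(len(numbers) + 1):
        # Phase 1: each subset is one candidate certificate (one branch).
        for certificate in combinations(numbers, size):
            branches += 1
            if verify(numbers, target, certificate):
                return certificate, branches
    return None, branches

certificate, branches = simulate_guess_and_check([3, 7, 1, 8, 4], 12)
print(certificate, branches)
print(simulate_guess_and_check([3, 7, 1, 8, 4], 100))  # no branch accepts
```

When no certificate exists, the simulation has to visit all 2^n branches before it can answer "no".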
Problems where we can verify but (probably) can't efficiently solve
For the hardest NP problems: finding the answer seems to require searching an exponential space, but checking a given answer is fast. That asymmetry is the heart of P vs NP.
Every problem in P is also in NP -- but is the reverse true?
If you can solve a problem in polynomial time, you can certainly verify a solution in polynomial time.
This is one of the Clay Millennium Prize Problems. Solve it (either direction) and win $1,000,000.
It has been open since Stephen Cook formalized it in 1971 -- over 50 years of the brightest minds failing to resolve it.
Two possible worlds
Decades of effort by brilliant researchers have failed to find polynomial algorithms for NP-complete problems. It would be astonishing if all that effort missed something. But nobody can PROVE it either way!
The tool for comparing problem difficulty
Problem A reduces to problem B (written A ≤_P B) if we can transform any instance of A into an instance of B in polynomial time, such that solving B gives us the answer to A.
If you can translate French to English quickly, and you have an English dictionary, then you can effectively look up French words. You have reduced French-lookup to English-lookup.
A ≤_P B means "A is no harder than B" (or "B is at least as hard as A").
If you show a hard problem reduces TO your problem, your problem must be hard too!
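As an illustration of A ≤_P B, here is a sketch of one classic reduction: Independent Set to Clique, via the complement graph. The choice of this pair and all function names are assumptions for the example; has_clique is a brute-force stand-in for "a solver for B", not an efficient algorithm.

```python
from itertools import combinations

def reduce_independent_set_to_clique(vertices, edges, k):
    """Map an Independent Set instance to a Clique instance in O(|V|^2) time."""
    edge_set = {frozenset(e) for e in edges}
    complement_edges = [
        (u, v) for u, v in combinations(vertices, 2)
        if frozenset((u, v)) not in edge_set
    ]
    return vertices, complement_edges, k

def has_clique(vertices, edges, k):
    """Stand-in solver for problem B (brute force, exponential in general)."""
    edge_set = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in edge_set for pair in combinations(group, 2))
        for group in combinations(vertices, k)
    )

def has_independent_set(vertices, edges, k):
    # Solve A by translating the instance to B and asking B's solver.
    return has_clique(*reduce_independent_set_to_clique(vertices, edges, k))

# Path graph 1-2-3-4: {1, 3} is independent, but no 3 vertices are.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4)]
print(has_independent_set(vertices, edges, 2))  # True
print(has_independent_set(vertices, edges, 3))  # False
```

The translation itself is cheap. All the difficulty sits inside the solver for B, which is why a fast solver for B would immediately give a fast solver for A.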
Problems that are "at least as hard as anything in NP"
NP-hard problems can be harder than NP! They might not even be decidable.
Example: The Halting Problem is NP-hard (every NP problem reduces to it) but it's not in NP -- it's not even decidable!
Think of an NP-hard problem as a "master lock." If you can pick this one lock, every other lock (NP problem) opens automatically. It's at least as tough as every other lock in the building.
The hardest problems IN NP
A problem is NP-Complete if it is:
1. In NP (solutions can be verified in polynomial time), AND
2. NP-Hard (every NP problem reduces to it)
If you find a polynomial algorithm for ANY NP-complete problem, then P = NP (all NP problems become easy). Conversely, if you prove ANY NP-complete problem has no poly-time algorithm, then P ≠ NP. They're the "gatekeepers" of the P vs NP question.
The theorem that launched complexity theory
SAT is NP-complete.
Boolean Satisfiability was the FIRST problem ever proven NP-complete. Every other NP-completeness proof builds on this foundation.
Cook showed that the general act of computation can be captured by Boolean logic. Every polynomial-time verification can be "compiled" into a SAT instance.
It's like discovering that every recipe in every cookbook can be translated into one universal recipe format. SAT is that universal format -- it can express ANY NP computation.
Stephen Cook (Toronto) proved this in 1971. Leonid Levin independently proved a similar result in the Soviet Union. Cook received the Turing Award in 1982 for this work.
Boolean Satisfiability: a truth table over x1, x2, x3, with one column per clause (C1 to C4) and one for the whole formula φ
With 3 variables, we checked 8 rows. With n variables, there are 2^n possible assignments. At n=300, that's more than the atoms in the universe.
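The row-by-row check can be sketched in a few lines. The four clauses below are a hypothetical formula chosen for illustration (the slide's own clauses are not given in the text); a positive integer i stands for x_i and a negative integer -i for NOT x_i.

```python
from itertools import product

# Hypothetical 4-clause formula over x1..x3:
# (x1 OR x2) AND (NOT x1 OR x3) AND (NOT x2 OR NOT x3) AND (x1 OR NOT x3)
clauses = [(1, 2), (-1, 3), (-2, -3), (1, -3)]

def clause_true(clause, assignment):
    # A clause holds if at least one of its literals is true.
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def satisfying_assignments(clauses, num_vars):
    found = []
    for values in product([False, True], repeat=num_vars):  # all 2^n rows
        assignment = dict(enumerate(values, start=1))
        if all(clause_true(c, assignment) for c in clauses):
            found.append(values)
    return found

models = satisfying_assignments(clauses, 3)
print(models)  # [(False, True, False), (True, False, True)]
print(2 ** 300 > 10 ** 80)  # True; 10^80 is a common estimate for atoms in the universe
```

Each row is cheap to evaluate. The cost comes entirely from the number of rows.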
When every clause has exactly 3 literals, it's called 3-SAT. This restricted version is STILL NP-complete! (2-SAT, however, is in P.)
1. Show X is in NP (give a polynomial-time verifier).
2. Pick a known NP-complete problem Y and show Y ≤_P X (reduce Y to X in polynomial time).
Cook knocked over the first domino (SAT). Karp's 1972 paper brought the list to 21 NP-complete problems. Now thousands of NP-complete problems are known. Each new one just needs ONE reduction from an existing NP-complete problem.
The "greatest hits" -- problems you'll see everywhere
3-SAT -- SAT where each clause has exactly 3 literals.
(x1 OR ~x2 OR x3) AND (~x1 OR x4 OR x2)
The "workhorse" for reductions.
Clique -- Find k vertices all connected to each other.
"Is there a friend group of size 5 where everyone knows everyone?"
Independent Set -- Find k vertices with NO edges between them.
"Can you seat 5 people at a dinner where no two are enemies?"
Vertex Cover -- Find k vertices that touch every edge.
"Place k security cameras to watch every hallway."
Hamiltonian Cycle -- Visit every vertex exactly once and return to start.
"Can the mail carrier visit every house on one loop?"
Traveling Salesman (TSP) -- Is there a tour of all cities with distance ≤ k?
4 cities: "Can I visit A, B, C, D and return in ≤ 20 miles?"
Subset Sum -- Given S = {3, 7, 1, 8, 4, 12}, target = 15.
Is there a subset summing to 15? Yes: {3, 8, 4}.
Graph Coloring -- Color vertices with k colors, no adjacent vertices same color.
"Color a map with 3 colors so no bordering countries match."
If you solve ANY ONE of these in polynomial time, you've solved ALL of them (and won $1M). They all reduce to each other. They stand or fall together.
Your problem is NP-complete. Now what?
Don't despair! NP-completeness is a worst-case statement. In practice, there are many strategies:
Don't find the OPTIMAL solution -- find one that's provably CLOSE to optimal.
Example: For Vertex Cover, a simple algorithm (repeatedly pick any uncovered edge and add BOTH of its endpoints) always finds a cover at most 2x the optimal size. Good enough for many applications!
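One standard way to get the factor-2 guarantee for Vertex Cover is the matching-based method: take any edge that is not yet covered and add both of its endpoints. The sketch below assumes the graph is given as a list of edges; the function name is hypothetical.

```python
def vertex_cover_2_approx(edges):
    """Pick any uncovered edge and take BOTH endpoints; repeat."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# Star around vertex 0 plus one separate edge: the optimal cover is {0, 4}.
edges = [(0, 1), (0, 2), (0, 3), (4, 5)]
cover = vertex_cover_2_approx(edges)
print(sorted(cover))  # [0, 1, 4, 5]
```

The picked edges share no endpoints, and any cover must contain at least one endpoint of each of them, so the result is never more than twice the optimum. Here it is exactly twice: 4 vertices against an optimum of 2.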
Algorithms that work well in practice without guarantees: simulated annealing, genetic algorithms, local search.
Example: TSP tours for millions of cities are routinely found near-optimally using heuristics.
Your specific inputs might have structure that makes the problem easier.
Example: 2-SAT is in P! Graph coloring on trees is in P! TSP on Euclidean distances has good approximations.
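Coloring a tree shows how special structure removes the hardness: a tree never needs more than 2 colors, and a breadth-first pass finds such a coloring in linear time. This is a minimal sketch that assumes the tree is connected and given as an adjacency dictionary; the names are illustrative.

```python
from collections import deque

def two_color_tree(adjacency, root):
    """BFS from the root, alternating colors by depth."""
    color = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in color:
                color[neighbor] = 1 - color[node]
                queue.append(neighbor)
    return color

tree = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
coloring = two_color_tree(tree, "a")
print(coloring)  # {'a': 0, 'b': 1, 'c': 1, 'd': 0}
```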
Allow random coin flips. Sometimes randomness helps!
Example: In MAX-3-SAT, a uniformly random assignment satisfies 7/8 of the clauses in expectation (when each clause has 3 distinct variables).
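The 7/8 figure can be checked exactly on a small instance by averaging over every assignment instead of sampling. The formula below is hypothetical; the only property that matters is that each clause uses three distinct variables, so exactly 1 of the 8 settings of those variables falsifies it.

```python
from fractions import Fraction
from itertools import product

# Hypothetical 3-CNF over x1..x4; i means x_i, -i means NOT x_i.
clauses = [(1, -2, 3), (-1, 4, 2), (2, 3, -4), (-1, -3, -4)]
num_vars = 4

def count_satisfied(clauses, values):
    return sum(
        any(values[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# Sum the number of satisfied clauses over all 2^n assignments.
total = sum(
    count_satisfied(clauses, values)
    for values in product([False, True], repeat=num_vars)
)
average_fraction = Fraction(total, (2 ** num_vars) * len(clauses))
print(average_fraction)  # 7/8
```

Because the average is 7/8, at least one assignment must satisfy 7/8 of the clauses or more.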
Modern SAT solvers handle formulas with millions of variables using clever techniques (DPLL, CDCL, unit propagation). Worst case is exponential, but typical cases are fast.
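To give a feel for these techniques, here is a stripped-down DPLL sketch: unit propagation plus branching, with none of the clause learning or heuristics that real CDCL solvers add. It assumes clauses are non-empty lists of nonzero integers (i for x_i, -i for NOT x_i); the function names are hypothetical.

```python
def simplify(clauses, literal):
    """Assume `literal` is true: drop satisfied clauses, shrink the rest."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue                      # clause already satisfied
        reduced = [l for l in clause if l != -literal]
        if not reduced:
            return None                   # empty clause: conflict
        result.append(reduced)
    return result

def dpll(clauses, assignment=()):
    """Return a tuple of true literals satisfying the CNF, or None."""
    if clauses is None:
        return None
    if not clauses:
        return assignment
    # Unit propagation: a one-literal clause forces that literal.
    for clause in clauses:
        if len(clause) == 1:
            return dpll(simplify(clauses, clause[0]), assignment + (clause[0],))
    # Otherwise branch on the first literal of the first clause.
    literal = clauses[0][0]
    result = dpll(simplify(clauses, literal), assignment + (literal,))
    if result is not None:
        return result
    return dpll(simplify(clauses, -literal), assignment + (-literal,))

# (x1 OR ~x2 OR x3) AND (~x1 OR x4 OR x2)
print(dpll([[1, -2, 3], [-1, 4, 2]]))  # (1, 4): set x1 and x4 true
print(dpll([[1], [-1]]))               # None: unsatisfiable
```

Variables that do not appear in the returned tuple can take either value. The worst case is still exponential, since every branch may have to be explored.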
If n is small enough, even O(2^n) is fine. Subset Sum with 30 elements? 2^30 ≈ 10^9 -- a computer handles that in seconds.
NP-completeness says "you can't always find the exit in a giant maze quickly." But YOUR maze might have helpful signs, be smaller than you think, or you might accept finding an exit that's close enough to the shortest one.
NP is just one level in a vast tower of complexity
co-NP -- Complements of NP problems. "Can you verify that something is NOT the case?"
Example: "Is this formula UNSATISFIABLE?" (complement of SAT). Easy to prove satisfiable (give an assignment), hard to prove unsatisfiable.
PSPACE -- Problems solvable with polynomial SPACE (but possibly exponential time).
Example: Quantified Boolean Formulas (QBF): "For all x, there exists y, such that φ(x,y) is true." Like a two-player game -- much harder than SAT!
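A recursive evaluator shows why QBF fits in polynomial space: the recursion is only as deep as the number of variables, even though it may visit exponentially many assignments. This is a minimal sketch; the prefix encoding and the name eval_qbf are assumptions for illustration.

```python
def eval_qbf(prefix, formula, assignment=None):
    """prefix: list of ('forall' | 'exists', name); formula: dict -> bool."""
    assignment = assignment or {}
    if not prefix:
        return formula(assignment)
    quantifier, name = prefix[0]
    # Evaluate both values of this variable; depth equals the prefix length.
    branches = (
        eval_qbf(prefix[1:], formula, {**assignment, name: value})
        for value in (False, True)
    )
    return all(branches) if quantifier == "forall" else any(branches)

differs = lambda a: a["x"] != a["y"]
# "For all x, there exists y, such that x != y": true (pick y = not x).
print(eval_qbf([("forall", "x"), ("exists", "y")], differs))  # True
# Swapping the quantifiers makes it false: no single y differs from every x.
print(eval_qbf([("exists", "y"), ("forall", "x")], differs))  # False
```

The second call is the two-player game in miniature: whoever chooses last wins.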
EXPTIME -- Problems solvable in exponential time. Generalized chess and checkers are EXPTIME-complete, so they provably have no polynomial-time algorithm. We KNOW P ≠ EXPTIME!
P ⊆ NP ⊆ PSPACE ⊆ EXPTIME -- all inclusions are known.
But the only separation we can prove in this chain is P ≠ EXPTIME, so at least one of the inclusions must be strict. Whether P ≠ NP, NP ≠ PSPACE, or PSPACE ≠ EXPTIME are all open questions!
The "mathematical apocalypse" scenario
If someone proved P = NP and gave us actual algorithms, the consequences would be staggering:
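Public-key cryptography (RSA, Diffie-Hellman) relies on problems being HARD to solve (factoring, discrete log), and symmetric ciphers such as AES rely on key search being infeasible. If P = NP with practical algorithms, an attacker could break essentially every cipher in common use, forge digital signatures, and decrypt intercepted messages. Online banking, HTTPS, blockchain -- all gone.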
Scheduling, logistics, resource allocation, protein folding, chip design -- all solvable optimally in polynomial time. Companies would save billions. Supply chains perfected overnight.
Many ML problems (optimal neural network training, feature selection) are NP-hard. If P = NP, we could find provably optimal models efficiently.
Checking a proof is fast (verify each step), so finding a proof of bounded length is an NP problem. If P = NP, computers could automatically find a proof of any theorem in time polynomial in the length of its shortest proof. Mathematics itself would be transformed.
Writing a symphony "as good as Beethoven's" (if quality can be verified) becomes a computation. Generating optimal code, designing drugs, composing music -- all become algorithmic.
Even if P = NP, the polynomial might be impractically large (e.g., O(n^1000000)). A "yes" answer doesn't guarantee practical algorithms -- but historically, polynomial algorithms are eventually improved.
The world as we know it -- some problems are inherently hard
Most computer scientists believe P ≠ NP. In a 2019 poll, 88% of respondents expected P ≠ NP.
The hardness of certain problems guarantees that encryption, digital signatures, and secure communication work as intended. Your online banking is safe because breaking the encryption is (probably) an inherently hard problem.
There really IS a deep difference between creating and checking. Composing a symphony is harder than appreciating one. Writing a proof is harder than verifying one. This asymmetry is built into the fabric of mathematics.
Some tasks require genuine insight that cannot be shortcut by brute computation. Human (and AI) creativity retains its value.
We'd need to show that no polynomial-time algorithm exists for SAT -- ruling out ALL possible algorithms, including ones nobody has thought of yet. Known barrier results (relativization, natural proofs, algebrization) show that the standard proof techniques, including diagonalization, are insufficient for this task.
Razborov and Rudich (1997) showed that, under standard cryptographic assumptions, a large class of "natural" proof strategies cannot separate P from NP. We need genuinely new mathematical ideas.
Proving P ≠ NP is like proving that no shortcut exists through a maze -- not just that YOU can't find one, but that nobody ever could. You have to rule out every conceivable path.
Everything you need to know about P vs NP on one slide
1. Show X is in NP (describe a verifier).
2. Reduce a known NP-complete problem to X.
That's it! The "known problem" is usually 3-SAT.
P vs NP is about whether finding solutions is fundamentally harder than checking solutions. It's arguably the most important open question in all of computer science and mathematics. We believe the answer is yes (P ≠ NP), but proving it remains one of humanity's greatest intellectual challenges.