Proving Languages Are NOT Regular (or Not Context-Free)
"Can I prove this language IS regular?"
--> Build a DFA/NFA/regex for it.
"Can I prove this language is NOT regular?"
--> Use the PUMPING LEMMA.
CS305 - Formal Language Theory
The Big Picture
The pumping lemma is a tool for proving NEGATIVE results
What it tells you
A language is NOT regular
A language is NOT context-free
Key Idea
Every regular language has a certain "pumping" property. If a language lacks this property, it cannot be regular.
What it does NOT tell you
It does not prove a language IS regular
Passing the pumping lemma does not mean "regular"
Warning
The pumping lemma is a necessary condition for regularity, NOT a sufficient one. Think of it like a one-way test.
Proof by contradiction structure:
+-------------------------------------------------+
| 1. Assume L is regular (for contradiction) |
| 2. Then pumping lemma applies to L |
| 3. Find a string that CANNOT be pumped |
| 4. Contradiction! L is NOT regular. |
+-------------------------------------------------+
Intuition: Why Pumping Works
The Pigeonhole Principle meets finite automata
A DFA has a finite number of states. Say it has p states.
If it reads a string of length ≥ p, it makes at least p + 1 state visits (including the start state).
By the pigeonhole principle, some state must be visited twice. That means there's a loop!
Analogy
If you have 5 pigeonholes and 6 pigeons, at least one hole has 2 pigeons. If you have p states and p+1 visits, at least one state is visited twice.
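The pigeonhole argument can be watched in action. This is a minimal sketch, assuming a made-up 2-state DFA (the states "even"/"odd" and the helper names `run` and `find_loop_split` are illustrative, not part of the course material): it runs the DFA, stops at the first repeated state, and cuts the input into x, y, z around the loop.

```python
def run(delta, start, s):
    """Return the state the DFA ends in after reading s."""
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state

def find_loop_split(delta, start, s):
    """Split s = x + y + z at the first state that is visited twice."""
    seen = {start: 0}              # state -> position of its first visit
    state = start
    for pos, ch in enumerate(s, start=1):
        state = delta[(state, ch)]
        if state in seen:          # pigeonhole: we have been here before
            first = seen[state]
            return s[:first], s[first:pos], s[pos:]
        seen[state] = pos
    return None                    # no repeat: s is shorter than the state count

# Hypothetical DFA over {a, b} that tracks the parity of the number of a's.
delta = {("even", "a"): "odd", ("even", "b"): "even",
         ("odd", "a"): "even", ("odd", "b"): "odd"}

x, y, z = find_loop_split(delta, "even", "abba")
print(repr(x), repr(y), repr(z))
# Going around the loop any number of times ends in the same state.
print({run(delta, "even", x + y * i + z) for i in range(5)})
```

Because the repeat is found within the first p symbols read, the split automatically satisfies |xy| ≤ p.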
3 / 19
The Pumping Lemma for Regular Languages
The formal statement you need to memorize
Pumping Lemma (Regular Languages)
If L is a regular language, then there exists a number p (the pumping length) such that for every string s ∈ L with |s| ≥ p, s can be written as s = xyz satisfying:
1. |y| > 0 (y is not empty)
2. |xy| ≤ p (loop is in the first p characters)
3. xy^i z ∈ L for all i ≥ 0 (pump y any number of times)
The string s broken into x, y, z:
|<------------ s = xyz ------------>|
|<--- x --->|<-- y -->|<--- z ----->|
| | LOOP | |
|<--- xy ---->|
| ≤ p chars |
Pumping:
i=0: x z (delete the loop)
i=1: x y z (original string)
i=2: x yy z (traverse loop twice)
i=3: x yyy z (traverse loop three times)
...
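To make the three conditions concrete, here is a small sketch that enumerates every split allowed by conditions 1 and 2 and then tests condition 3 for a few values of i. The language (even number of a's), the value p = 2, and the helper names are assumptions chosen for illustration; checking i only up to 5 is a bounded spot check, not a proof.

```python
def valid_splits(s, p):
    """Yield every (x, y, z) with s == x + y + z, |y| > 0 and |xy| <= p."""
    for end in range(1, min(p, len(s)) + 1):   # end = |xy|
        for start in range(end):               # start = |x|
            yield s[:start], s[start:end], s[end:]

def pump(x, y, z, i):
    """Return x y^i z."""
    return x + y * i + z

def even_as(w):
    """Assumed example language: strings over {a, b} with an even number of a's."""
    return w.count("a") % 2 == 0

s, p = "abab", 2
good = [(x, y, z) for x, y, z in valid_splits(s, p)
        if all(even_as(pump(x, y, z, i)) for i in range(6))]
print(list(valid_splits(s, p)))
print(good)
```

For a regular language the lemma only promises that at least one such split exists. Here only the split whose y is the single b survives.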
The Pumping Game: Prove {a^n b^n} is NOT Regular
Play the adversarial game -- you are the prover!
1. Adversary picks pumping length p
2. You pick s ∈ L with |s| ≥ p. Suggestion: s = a^p b^p
3. Adversary splits s = xyz (|y| > 0, |xy| ≤ p)
4. You pick i to pump
Critical Point
Your argument must work for ALL possible values of p and ALL valid splits xyz. You only get to choose s and i.
The Proof Game as a Flowchart
Follow this template for every pumping lemma proof
START: "I want to prove L is not regular."
|
v
+-----------------------------------------------+
| Step 0: ASSUME (for contradiction) L is |
| regular. Then the pumping lemma holds. |
+-----------------------------------------------+
|
v
+-----------------------------------------------+
| Step 1: Let p be the pumping length |
| (given by the lemma -- we don't pick |
| its value, just call it p). |
+-----------------------------------------------+
|
v
+-----------------------------------------------+
| Step 2: CHOOSE a specific string s in L |
| with |s| >= p. |
| (This is YOUR strategic choice!) |
+-----------------------------------------------+
|
v
+-----------------------------------------------+
| Step 3: CONSIDER ANY split s = xyz |
| where |y| > 0 and |xy| <= p. |
| (You must handle ALL valid splits!) |
+-----------------------------------------------+
|
v
+-----------------------------------------------+
| Step 4: FIND an i >= 0 such that |
| xy^i z is NOT in L. |
| (Usually i = 0 or i = 2 works.) |
+-----------------------------------------------+
|
v
+-----------------------------------------------+
| Step 5: CONTRADICTION with the pumping lemma. |
| Therefore L is NOT regular. QED |
+-----------------------------------------------+
Pro Tip: Choosing s
Pick s so that any way the adversary splits the first p characters forces y to land in an "inconvenient" region. Strings like a^p b^p work well because the first p characters are all a's, so y must be all a's -- pumping it breaks the a/b balance.
Explore: Pumping a^p b^p
Why this always works
Since |xy| ≤ p and the first p characters are all a's, y must consist entirely of a's. Pumping y (with i ≠ 1) changes the number of a's but not b's, so the result cannot be in {a^n b^n}.
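The claim can be spot-checked by brute force for small p. This is a sketch only (the function names are illustrative, and a finite check does not replace the proof): it tries every valid split of a^p b^p and confirms that pumping with i = 0 or i = 2 always leaves the language.

```python
def in_anbn(w):
    """Membership test for {a^n b^n | n >= 0}."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def every_split_breaks(p, i):
    """True if x y^i z is outside the language for every valid split of a^p b^p."""
    s = "a" * p + "b" * p
    for end in range(1, p + 1):        # end = |xy| <= p
        for start in range(end):       # start = |x|, so |y| = end - start > 0
            x, y, z = s[:start], s[start:end], s[end:]
            if in_anbn(x + y * i + z):
                return False
    return True

for p in range(1, 8):
    print(p, every_split_breaks(p, 0), every_split_breaks(p, 2))
```

With i = 1 the string is unchanged, so `every_split_breaks(p, 1)` is False, which is why the proof needs i ≠ 1.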
Example 2: {ww | w ∈ {0,1}*}
Prove this language is not regular
0. Assume L = {ww} is regular.
1. Let p be the pumping length.
2. Choose s = 0^p 1 0^p 1.
Here w = 0^p 1, so s = ww ∈ L. And |s| = 2p + 2 ≥ p.
3. Consider any split s = xyz with |y| > 0, |xy| ≤ p.
Since |xy| ≤ p and the first p chars are all 0's, y = 0^k for some k ≥ 1.
4. Choose i = 2. Then:
xy^2 z = 0^(p+k) 1 0^p 1
Total length = 2p + 2 + k (parity alone does not settle it; the position of the 1's does).
Suppose xy^2 z = ww. The string contains exactly two 1's, so each copy of w contains exactly one, and w ends in 1 because the whole string does. Hence w = 0^m 1 and ww = 0^m 1 0^m 1, which forces m = p + k and m = p. That is impossible since k ≥ 1. So xy^2 z ∉ L.
5. Contradiction! L is not regular. □
s = 0^p 1 0^p 1 (this is ww where w = 0^p 1)
0 0 0 ... 0 0 1 0 0 0 ... 0 0 1
|<-- p -->| |<-- p -->|
|<--- w --->| |<--- w --->|
Since |xy| <= p:
|< x >|< y >|<--------- z --------->|
0 .. 0 0..0 0 .. 0 1 0 0 .. 0 0 1
||
all 0's
After pumping (i=2):
0 0 .. 0 0 0..0 0..0 1 0 0 .. 0 0 1
|<--- p+k 0's --->| |<-- p -->|
For this to be ww:
the string has exactly two 1's, so
w = 0^m 1 and ww = 0^m 1 0^m 1.
The two halves CANNOT match because
there are p+k zeros before the first "1"
but only p zeros before the second "1".
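A quick sanity check of this example, as a sketch with illustrative helper names: for small p, pump every valid split of 0^p 1 0^p 1 with i = 2 and test whether any result is of the form ww. The check is finite, so it supports the proof rather than replacing it.

```python
def in_ww(t):
    """Membership test for {ww}: even length and both halves equal."""
    half, odd = divmod(len(t), 2)
    return odd == 0 and t[:half] == t[half:]

def pumped_strings(p, i=2):
    """Yield x y^i z for every valid split of s = 0^p 1 0^p 1."""
    s = ("0" * p + "1") * 2
    for end in range(1, p + 1):        # end = |xy| <= p, so y is all 0's
        for start in range(end):       # start = |x|
            yield s[:start] + s[start:end] * i + s[end:]

survivors = [t for p in range(1, 10) for t in pumped_strings(p) if in_ww(t)]
print(survivors)
```

An empty list means no split survived pumping for p = 1..9.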
Example 3: {1^(n^2) | n ≥ 0}
Strings of 1s whose length is a perfect square -- proof using number theory
0. Assume L = {1^(n^2)} is regular.
1. Let p be the pumping length.
2. Choose s = 1^(p^2).
s ∈ L since p^2 is a perfect square. |s| = p^2 ≥ p.
3. Consider any split s = xyz with |y| = k where 1 ≤ k ≤ p (since |y| > 0 and |xy| ≤ p).
4. Choose i = 2. Then:
|xy^2 z| = p^2 + k
We need to show p^2 + k is NOT a perfect square.
Since 1 ≤ k ≤ p:
p^2 < p^2 + k ≤ p^2 + p < p^2 + 2p + 1 = (p+1)^2
So p^2 + k is strictly between two consecutive perfect squares. Therefore it is NOT a perfect square!
5. Contradiction! L is not regular. □
The key number theory insight:
Perfect squares: 0, 1, 4, 9, 16, 25, 36, ...
Gaps between consecutive squares GROW:
n: 0 1 2 3 4 5 6
n^2: 0 1 4 9 16 25 36
gap: 1 3 5 7 9 11
Gap between p^2 and (p+1)^2:
(p+1)^2 - p^2 = 2p + 1
When we pump, we add k where 1 <= k <= p:
p^2 + k
Since k <= p < 2p + 1:
p^2 < p^2 + k < (p+1)^2
|-------|-----------|------------|
p^2 p^2+1 p^2+p (p+1)^2
|<--- k --->|
falls in the GAP!
NOT a perfect square!
Analogy
Think of perfect squares as "stepping stones" that get further and further apart. Pumping adds a small amount (at most p), but the gap to the next stone is 2p+1. You land in the water every time!
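The inequality is easy to spot-check numerically. The sketch below (helper names are illustrative) lists every length p^2 + k that pumping with i = 2 can produce and confirms that none is a perfect square for p up to 199.

```python
import math

def is_square(n):
    """True if n is a perfect square (exact integer arithmetic)."""
    r = math.isqrt(n)
    return r * r == n

def pumped_lengths(p):
    """Lengths |x y^2 z| = p^2 + k for every possible |y| = k, 1 <= k <= p."""
    return [p * p + k for k in range(1, p + 1)]

hits = [(p, n) for p in range(1, 200) for n in pumped_lengths(p) if is_square(n)]
print(pumped_lengths(3))
print(hits)
```

`math.isqrt` avoids the floating-point rounding that `math.sqrt` could introduce for large values.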
Common Mistakes
These errors cost points on every exam. Do not make them!
Mistake 1: Picking a specific p
"Let p = 5..." -- NO! You don't get to choose p. The adversary chooses it. Your proof must work for any p.
Mistake 2: Picking a specific split
"Let x = a^2, y = a^3, z = ..." -- NO! The adversary picks the split. You must argue about all valid splits.
Mistake 3: Forgetting |xy| ≤ p
This condition restricts WHERE y can be. It's often the most useful condition! Don't ignore it -- it constrains the adversary's choices.
Mistake 4: Wrong quantifier order
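"For some split xyz... for all i..." -- BACKWARDS! That is what the lemma promises about a regular language. To contradict it, you must handle every split the adversary might pick, and for each one you only need a single i that fails.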
Mistake 5: Using pumping to prove regularity
"The language can be pumped, so it's regular." -- WRONG! The pumping lemma is one-directional. It can only prove non-regularity.
The Correct Quantifier Order
FOR ALL p (adversary picks)
THERE EXISTS s (you pick)
FOR ALL xyz splits (adversary picks)
THERE EXISTS i (you pick)
xy^i z NOT in L
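The quantifier order maps directly onto nested loops. This sketch plays the game for one value of p; the palindrome language, the two candidate strings, and the name `you_win` are assumptions made up for the illustration. It also shows why the choice of s matters.

```python
def you_win(in_lang, choose_s, p, exponents=(0, 2)):
    """For one p: does every valid split of s = choose_s(p) have a breaking i?"""
    s = choose_s(p)                              # THERE EXISTS s (you pick)
    assert in_lang(s) and len(s) >= p            # s must be a legal choice
    for end in range(1, p + 1):                  # FOR ALL splits (adversary picks)
        for start in range(end):
            x, y, z = s[:start], s[start:end], s[end:]
            # THERE EXISTS i (you pick): fail if no tried exponent breaks it
            if all(in_lang(x + y * i + z) for i in exponents):
                return False
    return True

def is_pal(w):
    """Assumed example language: palindromes over {a, b}."""
    return w == w[::-1]

good_s = lambda p: "a" * p + "b" + "a" * p       # pumping unbalances the two sides
bad_s = lambda p: "a" * (2 * p)                  # stays a palindrome when pumped

print([you_win(is_pal, good_s, p) for p in range(1, 7)])
print(you_win(is_pal, bad_s, 3))
```

A real proof must argue for all p symbolically. The loop over p here only checks a few cases.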
When the Pumping Lemma Fails
A necessary condition is not the same as a sufficient condition
Surprising Fact
There exist languages that are NOT regular but still satisfy the pumping lemma!
Consider the language:
L = {a^i b^j c^k | i, j, k ≥ 0 and if i = 1 then j = k}
This language is NOT regular (intersecting it with the regular language ab*c* gives {a b^n c^n}, which is not regular), but you cannot prove this using the pumping lemma alone -- it satisfies the pumping property!
Analogy
"All dogs are mammals" does NOT mean "all mammals are dogs." Similarly, "all regular languages are pumpable" does NOT mean "all pumpable languages are regular."
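The "pumpable but not regular" claim can be explored by brute force. This is a sketch under stated assumptions: pumping length p = 2, strings up to length 8, and exponents i = 0..6 only, so it is a bounded check and not a proof. The helper names are illustrative.

```python
import re
from itertools import product

def in_L(w):
    """a^i b^j c^k with the side condition: if i == 1 then j == k."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    if m is None:
        return False
    i, j, k = (len(g) for g in m.groups())
    return i != 1 or j == k

def has_pumpable_split(s, p, max_i=6):
    """Is there a valid split whose pumped strings stay in L for i = 0..max_i?"""
    for end in range(1, min(p, len(s)) + 1):
        for start in range(end):
            x, y, z = s[:start], s[start:end], s[end:]
            if all(in_L(x + y * i + z) for i in range(max_i + 1)):
                return True
    return False

p = 2  # assumed pumping length for this check
members = ["".join(t) for n in range(p, 9) for t in product("abc", repeat=n)
           if in_L("".join(t))]
print(len(members), all(has_pumpable_split(s, p) for s in members))
```

Every member tested has a split that survives pumping, so the adversary always has a winning move and the game from the earlier slides cannot be won.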
The logical relationship:
+--------------------------------------+
| All languages |
| |
| +--------------------------------+ |
| | Pumpable languages | |
| | | |
| | +--------------------------+ | |
| | | Regular languages | | |
| | | | | |
| | +--------------------------+ | |
| | ^ | |
| | These are pumpable AND | |
| | regular. | |
| | | |
| | * Non-regular but pumpable | |
| | languages live HERE | |
| +--------------------------------+ |
| |
| * Non-pumpable languages |
| are definitely NOT regular |
+--------------------------------------+
Pumping lemma proves: NOT pumpable --> NOT regular
It CANNOT prove: Pumpable --> Regular
When the pumping lemma is insufficient, use:
Myhill-Nerode theorem (necessary AND sufficient)
Closure properties (intersect with a regular language, then pump)
The Pumping Lemma for CFLs
Same idea, but now we pump TWO substrings
Pumping Lemma (Context-Free Languages)
If L is context-free, then there exists p such that for every s ∈ L with |s| ≥ p, s can be written as s = uvxyz satisfying:
1. |vy| > 0 (v and y are not BOTH empty)
2. |vxy| ≤ p (the "middle chunk" is bounded)
3. uv^i xy^i z ∈ L for all i ≥ 0 (pump v and y together)
The string s broken into u, v, x, y, z:
|<------------------ s = uvxyz ------------------>|
|<- u ->|<- v ->|<- x ->|<- y ->|<----- z ------>|
| | PUMP | | PUMP | |
| |<----- vxy ---->|
| | <= p chars |
Pumping (v and y are pumped TOGETHER, same number of copies):
i=0: u x z (delete both v and y)
i=1: u v x y z (original)
i=2: u vv x yy z (double both)
i=3: u vvv x yyy z (triple both)
Key Difference from Regular Pumping
Regular: pump ONE substring (y). CFL: pump TWO substrings (v and y) in sync. This is because CFGs can generate matching pairs (like matching parentheses), and pumping both sides together preserves the pairing.
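A tiny sketch of "in sync" pumping, using the context-free language {a^n b^n} and one hand-picked split (the split and the helper names are assumptions for illustration). Pumping v and y together keeps the string in the language; pumping v alone does not.

```python
def pump_cfl(u, v, x, y, z, i):
    """Return u v^i x y^i z: v and y are repeated the same number of times."""
    return u + v * i + x + y * i + z

def in_anbn(w):
    """Membership test for {a^n b^n | n >= 0}."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

# Split aaabbb as u=aa, v=a, x="", y=b, z=bb (|vy| = 2 > 0, |vxy| = 2).
parts = ("aa", "a", "", "b", "bb")
print([pump_cfl(*parts, i) for i in range(4)])
print(all(in_anbn(pump_cfl(*parts, i)) for i in range(10)))

# Repeating only v (one-sided pumping) breaks the a/b pairing.
u, v, x, y, z = parts
print(in_anbn(u + v * 2 + x + y + z))
```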
Intuition: Why CFL Pumping Works
The parse tree argument -- a repeated variable means a "nestable" pattern
A context-free grammar has a finite number of variables (nonterminals).
If a string s is long enough, its parse tree must be tall. A tall tree means a long root-to-leaf path.
By the pigeonhole principle, some variable A must appear twice on this path.
The subtree rooted at the upper A generates vxy. The subtree rooted at the lower A generates just x.
We can replace the lower A's subtree with the upper A's subtree (or vice versa), giving us the pumping effect!
Parse tree with repeated variable A:
S
/|\
/ | \
u . z <-- generates u...z
|
A <--------- UPPER occurrence of A
/|\
/ | \
v . y <-- generates v...y
|
A <--------- LOWER occurrence of A
|
x <-- generates x
String: u v x y z
PUMP UP (replace lower A with upper A's tree):
S
/|\
u . z
|
A
/|\
v . y
|
A <-- plug in upper A again!
/|\
v . y
|
A
|
x
Result: u v v x y y z = uv^2 xy^2 z
PUMP DOWN (replace upper A with lower A's tree):
S
/|\
u . z
|
A
|
x
Result: u x z = uv^0 xy^0 z
Example: {a^n b^n c^n | n ≥ 0}
Prove this language is not context-free
0. Assume L = {a^n b^n c^n} is context-free.
1. Let p be the pumping length.
2. Choose s = a^p b^p c^p.
s ∈ L and |s| = 3p ≥ p.
3. Consider any split s = uvxyz with |vy| > 0 and |vxy| ≤ p.
Since |vxy| ≤ p, the substring vxy can span at most two of the three symbol types (a, b, c). It cannot touch all three, because the block of p b's lies between the a's and the c's.
4. Choose i = 2. Since |vy| > 0, uv^2 xy^2 z has more of at least one symbol, but the symbol that vxy does not touch still occurs exactly p times. The counts of a's, b's, c's are no longer all equal.
So uv^2 xy^2 z ∉ L.
5. Contradiction! L is not context-free. □
s = a^p b^p c^p
a a...a a b b...b b c c...c c
|<- p ->| |<- p ->| |<- p ->|
Since |vxy| <= p, vxy fits in a window
of width p. Where can this window be?
Case 1: vxy is all a's and b's (no c's)
a a [a..a b..b] b c c...c c
|<= p chars>|
Pumping increases a's or b's (or both),
but NOT c's. Counts become unequal!
Case 2: vxy is all b's and c's (no a's)
a a...a a b [b..b c..c] c
|<= p chars>|
Pumping increases b's or c's (or both),
but NOT a's. Counts become unequal!
Case 3: vxy is all a's (or all b's/c's)
Same argument -- only one count changes.
In ALL cases, pumping breaks the
a-count = b-count = c-count requirement!
Key Insight
The constraint |vxy| ≤ p is what makes this work. It prevents the "pump zone" from touching all three symbol groups simultaneously.
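The case analysis can be cross-checked by enumerating every allowed five-way split for small p. This is a sketch with illustrative helper names, and a bounded check only.

```python
def in_anbncn(w):
    """Membership test for {a^n b^n c^n | n >= 0}."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

def cfl_splits(s, p):
    """Yield (u, v, x, y, z) with s = uvxyz, |vy| > 0 and |vxy| <= p."""
    n = len(s)
    for a in range(n + 1):                     # a = |u|
        limit = min(a + p, n)                  # vxy must end by here
        for b in range(a, limit + 1):          # b = |uv|
            for c in range(b, limit + 1):      # c = |uvx|
                for d in range(c, limit + 1):  # d = |uvxy|
                    if (b - a) + (d - c) > 0:  # v and y not both empty
                        yield s[:a], s[a:b], s[b:c], s[c:d], s[d:]

def every_split_breaks(p, i=2):
    """True if u v^i x y^i z leaves the language for every allowed split."""
    s = "a" * p + "b" * p + "c" * p
    return all(not in_anbncn(u + v * i + x + y * i + z)
               for u, v, x, y, z in cfl_splits(s, p))

print([every_split_breaks(p) for p in range(1, 6)])
```

Dropping the `limit` bound (that is, ignoring |vxy| ≤ p) would admit splits such as v = a^p, x = b^p, y = c^p, which pump happily only if b is ignored; the window constraint is what rules such splits out.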
Example: {ww | w ∈ {0,1}*}
Not just non-regular -- also NOT context-free!
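0. Assume L = {ww} is context-free.
1. Let p be the pumping length.
2. Choose s = 0^p 1^p 0^p 1^p.
Here w = 0^p 1^p, so s = ww ∈ L, and |s| = 4p ≥ p.
3. Consider any split s = uvxyz with |vy| > 0 and |vxy| ≤ p.
Since |vxy| ≤ p, it sits within a window of at most p characters. In the string 0^p 1^p 0^p 1^p, this window straddles at most two adjacent blocks of the four.
4. Choose i = 2. If v and y each lie inside a single block, the pumped string still has the form 0^a 1^b 0^c 1^d, which is ww only when a = c and b = d; but some block grew while its partner two blocks away kept length p. If v or y straddles a block boundary, pumping produces six alternating blocks, whereas a string ww that starts with 0 and ends with 1 has a multiple of four.
Either way, the first half and second half can no longer match.
5. Contradiction! L is not context-free. □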
s = 0^p 1^p 0^p 1^p
0...0 1...1 0...0 1...1
|blk1| |blk2| |blk3| |blk4|
|< w = 0^p 1^p >|< w = 0^p 1^p >|
|vxy| <= p, so the window sits in one
of these regions:
Region A: within block 1 (all 0s)
Region B: straddling blocks 1-2 (0s and 1s)
Region C: within block 2 (all 1s)
Region D: straddling blocks 2-3 (1s and 0s)
Region E: within block 3 (all 0s)
Region F: straddling blocks 3-4 (0s and 1s)
Region G: within block 4 (all 1s)
In every case, pumping affects at most
2 adjacent blocks. The other 2 blocks
stay the same.
Example - Region D (v in 1^p, y in 0^p):
Pumping gives: 0^p 1^(p+a) 0^(p+b) 1^p
To be ww we need p = p+b and p+a = p,
i.e. a = b = 0. But |vy| > 0. No match!
Note
Contrast with {ww^R} (even-length palindromes), which IS context-free. ww requires "copying" which CFGs cannot do; ww^R requires "mirroring" which CFGs handle via nesting.
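The region-by-region argument for ww can be double-checked by enumeration. This sketch (helper names are illustrative, small p only) pumps every allowed split of 0^p 1^p 0^p 1^p with i = 2 and collects any result that is still of the form ww.

```python
def in_ww(t):
    """Membership test for {ww}: even length and both halves equal."""
    half, odd = divmod(len(t), 2)
    return odd == 0 and t[:half] == t[half:]

def cfl_splits(s, p):
    """Yield (u, v, x, y, z) with s = uvxyz, |vy| > 0 and |vxy| <= p."""
    n = len(s)
    for a in range(n + 1):                     # a = |u|
        limit = min(a + p, n)
        for b in range(a, limit + 1):          # b = |uv|
            for c in range(b, limit + 1):      # c = |uvx|
                for d in range(c, limit + 1):  # d = |uvxy|
                    if (b - a) + (d - c) > 0:
                        yield s[:a], s[a:b], s[b:c], s[c:d], s[d:]

def surviving_splits(p, i=2):
    """Splits of 0^p 1^p 0^p 1^p whose pumped string is still in {ww}."""
    s = ("0" * p + "1" * p) * 2
    return [(u, v, x, y, z) for u, v, x, y, z in cfl_splits(s, p)
            if in_ww(u + v * i + x + y * i + z)]

print([len(surviving_splits(p)) for p in range(1, 5)])
```

A list of zeros means the adversary found no surviving split for those values of p.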
Comparing the Two Pumping Lemmas
Side-by-side: Regular vs. Context-Free
Feature          Regular Languages PL        Context-Free Languages PL
---------------  --------------------------  -------------------------------
Split            s = xyz (3 parts)           s = uvxyz (5 parts)
Pumped parts     y alone                     v and y together (in sync)
Non-empty        |y| > 0                     |vy| > 0
Length bound     |xy| ≤ p                    |vxy| ≤ p
Pumped string    xy^i z ∈ L                  uv^i xy^i z ∈ L
Source of loop   Repeated state in DFA       Repeated variable in parse tree
Proves           Language is NOT regular     Language is NOT context-free
Limitation       Necessary, not sufficient   Necessary, not sufficient
Regular PL: |-- x --|-- y --|--- z ---|
PUMP
Pumped: |-- x --|yyyyyy|--- z ---|
CFL PL: |- u -|- v -|- x -|- y -|- z -|
PUMP PUMP
Pumped: |- u -|vvvvv|- x -|yyyyy|- z -|
How to Decide Which to Use
Trying to prove a language is not regular? Use the regular pumping lemma first (simpler). Trying to prove it's not context-free? The regular pumping lemma cannot show that; use the CFL pumping lemma. If you already know a language is not regular, the CFL lemma may let you show it is also not context-free, but it can never confirm that a language IS context-free.
Beyond Pumping
Other techniques for proving non-regularity and non-context-freeness
Myhill-Nerode Theorem
A language L is regular if and only if it has a finite number of equivalence classes under the indistinguishability relation.
Advantage over Pumping
Myhill-Nerode is necessary AND sufficient. If the pumping lemma can't prove non-regularity, Myhill-Nerode still can.
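To see the indistinguishability relation at work, here is a sketch for {a^n b^n} (the helper names and the finite lists of prefixes and suffixes are assumptions for illustration). Two prefixes are distinguishable if some suffix puts exactly one of them in the language; the prefixes a^0, a^1, a^2, ... turn out to be pairwise distinguishable, which is the Myhill-Nerode reason the language is not regular.

```python
def in_anbn(w):
    """Membership test for {a^n b^n | n >= 0}."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def distinguishing_suffix(u, v, in_lang, suffixes):
    """Return a suffix t with exactly one of u+t, v+t in the language, if any."""
    for t in suffixes:
        if in_lang(u + t) != in_lang(v + t):
            return t
    return None

prefixes = ["a" * i for i in range(6)]
suffixes = ["b" * j for j in range(6)]
table = {(u, v): distinguishing_suffix(u, v, in_anbn, suffixes)
         for i, u in enumerate(prefixes) for v in prefixes[i + 1:]}
print(all(t is not None for t in table.values()))
print(table[("a", "aa")], table[("aa", "aaa")])
```

The code only checks six prefixes. The proof observes that a^i and a^j with i ≠ j are separated by b^i for every i, giving infinitely many equivalence classes.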
Closure Properties
Regular and context-free languages are closed under certain operations. Strategy:
Assume L is regular (or CF)
Intersect L with a known regular language
Show the result is a known non-regular (or non-CF) language
Contradiction with closure!
Ogden's Lemma
A strengthened version of the CFL pumping lemma where you can "mark" certain positions and the lemma guarantees the pump includes marked positions.
Example: Closure property proof
Prove L = {w in {0,1,2}* : w has equally
many 0s, 1s and 2s} is not CF.
Alternative to pumping L directly:
1. Assume L is CF.
2. CF languages are closed under
intersection with regular languages.
3. Let R = 0* 1* 2* (regular).
4. L intersect R = {0^n 1^n 2^n}.
5. But {0^n 1^n 2^n} is not CF (same
pumping proof as a^n b^n c^n).
Contradiction!
Closure properties let you REDUCE
a hard problem to an easier one!
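The intersection strategy can be illustrated on a small scale. This sketch assumes an example language chosen for the illustration (strings over {0, 1, 2} with equally many of each symbol, in any order) and intersects it with the regular language 0*1*2* for all strings up to a given length.

```python
import re
from itertools import product

def equal_counts(w):
    """Assumed example language: equally many 0s, 1s and 2s, in any order."""
    return w.count("0") == w.count("1") == w.count("2")

R = re.compile(r"0*1*2*")   # the regular language used as a filter

def intersection(max_len):
    """All strings of length <= max_len that are in both languages."""
    words = ("".join(t) for n in range(max_len + 1)
             for t in product("012", repeat=n))
    return [w for w in words if equal_counts(w) and R.fullmatch(w)]

print(intersection(6))
```

What survives the filter is exactly the 0^n 1^n 2^n pattern, so a context-free grammar for the larger language would yield one for a language already known not to be context-free.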
Analogy
The pumping lemma is a screwdriver -- great for most screws. Myhill-Nerode is a power drill -- works on everything but takes more setup. Closure properties are like using a friend's tool -- reduce the problem to one they already solved.
Summary & Cheat Sheet
Your quick reference for pumping lemma proofs
Proof Template (Regular)
1. Assume L is regular.
2. Let p = pumping length.
3. Choose s in L, |s| >= p.
(TIP: make first p chars uniform)
4. Let s = xyz, |y| > 0, |xy| <= p.
5. Show xy^i z not in L for some i.
(TIP: try i = 0 or i = 2 first)
6. Contradiction. L not regular. QED.
Proof Template (CFL)
1. Assume L is context-free.
2. Let p = pumping length.
3. Choose s in L, |s| >= p.
4. Let s = uvxyz, |vy| > 0, |vxy| <= p.
5. Show uv^i xy^i z not in L for some i.
(TIP: |vxy| <= p limits the window)
6. Contradiction. L not CF. QED.
Golden Rule of String Choice
Pick s so that the constraint |xy| ≤ p (or |vxy| ≤ p) forces the pump zone into a region that will break the language's defining property when pumped.
Quick Reference Table
Language          Regular?   CF?
---------------   --------   ---
a^n b^n           No         Yes
ww                No         No
ww^R              No         Yes
a^n b^n c^n       No         No
1^(n^2)           No         No
balanced parens   No         Yes
Remember!
The Quantifier Chant
For all p, there exists s, for all xyz, there exists i.
Adversary, You, Adversary, You. A-Y-A-Y.
THEY pick p --> YOU pick s
THEY split --> YOU pick i
If you always win --> NOT regular!