CS135-lecture-20210318

Pumping lemma for CFL #

PL for RL

Given \( RL, L \) there exists \( p \) such that if \( w \in L \) and \( |w| \geq p \) then there exists \( w = xyz \) where

  • \( |y| > 0 \)
  • \( |xy| \leq p \)
  • \( xy^iz \in L \)

We consume \( w \) and we get to a state that is the first repeated state. At some point we come back to it, and then to the accept.

image_2021-03-18-13-05-46

Since \( w \) is \( p \) or longer, it must visit one state more than once. We can identify this state as the one we get repeated. The part that gets us there the first time \( x \) , the part that gets us there the second time \( y \) , and the part that goes to the accept state \( z \) .

If \( x \) gets us to the repeated state, then we can go around \( y \) as many times we want (or omit).

Proof template #

Given context-free language \( L \) , there exists \( p \) such that if \( w \in L \) and \( |w| \geq p \) , then there exists a \( w = uvxyz \) where

  • \( |vy| > 0 \)
  • \( |vxy| \leq p \)
  • \( uv^ixy^iz \in L \)

Main use: we look for strings that are in the language and at least \( p \) long, and then apply the pumping lemma to show that the language is not context-free.

To prove:

  • Assume \( L \) is context-free
  • Pick \( w \in L \) with \( |w| \geq p \)
  • Argue that the pumping lemma says that there exists \( w = uvxyz \) with
    • \( |vy| > 0 \)
    • \( |vyx| \leq p \)
    • \( uv^ixy^i \in L \)
  • Argue that either if we pump down ( \( i = 0 \) gives us \( uxz \) , or if we pump up \( i > 0 \) we get \( uvvxyyz \) and show that \( e \not \in L \)
  • Pumping lemma says string is in \( L \) , contradiction, so \( L \) is not context-free.

Example #

Consider \( L = \{a^n b^n c^n : n \geq 0\} \)

This starts by consuming \( a \) ’s and pushing them onto the stack. Then when we start seeing \( b \) ’s we pop the \( a \) ’s off. When we reach the \( c \) ’s we don’t have a way to remember how many \( a \) ’s or \( b \) ’s we saw.

Note: A Turning machine can do this because it has 2 stacks.

So lets prove its not context-free using the pumping lemma.

The string we pick must be length \( p \) and exist in \( L \) :

\[\begin{aligned} w = a^p b^p c^p \end{aligned}\]

\( w \) exists in the language and is at least \( p \) long (it is in fact \( 3p \) long).

So lets assume that this language is context-free, and apply the pumping lemma.

Possible divisions for \( w \) :

image_2021-03-18-13-24-55

Since \( w = uvxyz \) , we don’t know exactly where they are however

image_2021-03-18-13-25-22

We do know that \( 0 < |vxy| \leq p \)

image_2021-03-18-13-26-20

Its possible that \( vxy \) :

  • appear in the region of just \( a \) ’s, since there are at least \( p \) \( a \) ’s. So \( vxy = \) all \( a \) ’s.
  • appear in the \( a \) ’s and \( b \) ’s.
  • any other combination of 2, but not all 3

image_2021-03-18-13-29-16

So, \( uvvxyyz \) will increase 1 or 2 types of characters, but not all 3. Thus it is out of balance, and the string is not in \( L \) .

\[\begin{aligned} uvvxyyz \not \in L \end{aligned}\]

Therefore, the language \( L = \{a^n b^n c^n : n \geq 0\} \) is not context-free.

Why the pumping lemma is true #

If \( L \) is context-free, then there is a context-free grammar for it. So, every string \( w \in L \) has a parse tree:

image_2021-03-18-13-34-54

where the fringe is \( w \) and the interior is non-terminals.

image_2021-03-18-13-35-42

Observe,

  • The longer that \( w \) is, the taller the tree
  • The taller the tree, the more non-terminals in the longest root-to-leaf path
  • Once the root-to-leaf path is long enough, some non-terminal must repeat in it

There are a finite number of productions in the grammar, and so as the tree gets longer and longer until there must be a repeat. This is where the pumping lemma comes in.

As an abstraction:

image_2021-03-18-13-39-20

Some part of \( w \) will be derived from \( A \) .

image_2021-03-18-13-40-12 image_2021-03-18-13-40-42

This tells us that

  • \( x \in L(A) \) , and
  • \( vxy \in L(A) \)

So each time that \( A \) is reached, we can replace it with \( x \) or \( vxy \) , we can pump the \( v \) and \( y \) as many times we want.

Note: This is similar to the pumping lemma for regular languages because it also uses the pigeon hole principle.