Homework 5

Last updated: Mon, 18 Mar 2024 11:58:43 -0400

Out: Mon Mar 18, 12:00pm EST (noon) Due: Mon Mar 25, 12:00pm EST (noon)

This assignment begins to explore context-free grammars (CFGs) and context-free languages (CFLs).

Homework Problems

  1. A Regular Language? (11 points)

  2. A Context-Free Language? (11 points)

  3. CFGs and Their String Derivations (11 points)

  4. More Induction Practice (11 points)

  5. README (1 point)

Total: 45 points

Submitting

Submit your solution to this assignment on Gradescope under hw5. Please assign each page to the correct problem and make sure your solutions are legible.

A submission must also include a README containing the required information.

1 A Regular Language?

Here is a language that approximates basic whitespace checking in a language like Python.

\mathit{WS} = \left\{\sqcup^*\texttt{if:}\downarrow\sqcup^*\texttt{else:}\mid\downarrow\textrm{ is newline; }\sqcup\textrm{ is space; both lines have the same number of spaces}\right\}

\Sigma = \left\{\texttt{if:},\texttt{else:},\sqcup,\downarrow\right\} (in real-world grammars, terminal (i.e., alphabet) "symbols" can be, and often are, whole words)
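To make the definition concrete, here is a minimal membership check for \mathit{WS}. It is only a sketch: it assumes that "spaces match" means the two lines have the same number of leading spaces, and the stand-in characters `_` (for \sqcup) and `"\n"` (for \downarrow), as well as the function name `in_ws`, are hypothetical choices made for illustration. It does not prove anything about regularity.

```python
# Hypothetical single-character stand-ins for the alphabet symbols:
# "_" plays the role of the space symbol, "\n" the newline symbol.
SPACE = "_"
NEWLINE = "\n"


def in_ws(s: str) -> bool:
    """Return True if s has the form SPACE^n 'if:' NEWLINE SPACE^n 'else:'."""
    lines = s.split(NEWLINE)
    if len(lines) != 2:
        return False
    first, second = lines
    if not first.endswith("if:") or not second.endswith("else:"):
        return False
    indent1 = first[: -len("if:")]
    indent2 = second[: -len("else:")]
    # Both prefixes must consist only of spaces and have equal length.
    return (
        set(indent1) <= {SPACE}
        and set(indent2) <= {SPACE}
        and len(indent1) == len(indent2)
    )
```

For example, `in_ws("__if:\n__else:")` holds, while `in_ws("__if:\n_else:")` does not, because the indentation of the two lines differs.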

Prove that this language is not a regular language.

The proof must use the Pumping Lemma and proof by contradiction. Make sure your proof has all the required components.

2 A Context-Free Language?

Now prove that the \mathit{WS} language from Problem 1 (A Regular Language?) above is a context-free language (CFL).

Your proof must be in the form of a Statements and Justifications table, and it must use the CFG representation of CFLs.

3 CFGs and Their String Derivations

Here’s a context-free grammar (CFG), called \mathit{LIKEPY}, representing the core of a Python-like language:

\left\langle STMTS+\right\rangle \rightarrow \left\langle STMT\right\rangle; \left\langle STMTS+\right\rangle \mid \left\langle STMT\right\rangle

\left\langle STMT\right\rangle \rightarrow \left\langle ID\right\rangle \texttt{=} \left\langle EXPR\right\rangle \mid \texttt{if:} \left\langle EXPR\right\rangle\texttt{then:} \left\langle STMT\right\rangle \texttt{else:} \left\langle STMT\right\rangle \mid \texttt{print(} \left\langle EXPR\right\rangle \texttt{)} \mid \left\langle FNDEF\right\rangle \mid \left\langle EXPR\right\rangle

\left\langle FNDEF\right\rangle \rightarrow \texttt{def} \left\langle ID\right\rangle \texttt{(}\left\langle IDS+\right\rangle \texttt{)}\texttt{:} \left\langle STMT\right\rangle

\left\langle EXPR\right\rangle \rightarrow \left\langle ID\right\rangle \mid \left\langle NUM\right\rangle \mid \left\langle EXPR\right\rangle \texttt{==} \left\langle EXPR\right\rangle \mid \left\langle EXPR\right\rangle \texttt{*} \left\langle EXPR\right\rangle \mid \left\langle EXPR\right\rangle \texttt{+} \left\langle EXPR\right\rangle \mid \left\langle LAM\right\rangle \mid \left\langle ID\right\rangle\texttt{(} \left\langle EXPRS\right\rangle \texttt{)}

\left\langle LAM\right\rangle \rightarrow \texttt{lambda} \left\langle IDS+\right\rangle \texttt{:} \left\langle EXPR\right\rangle

\left\langle EXPRS\right\rangle \rightarrow \left\langle EXPRS+\right\rangle \mid \varepsilon

\left\langle EXPRS+\right\rangle \rightarrow \left\langle EXPR\right\rangle, \left\langle EXPRS+\right\rangle \mid \left\langle EXPR\right\rangle

\left\langle NUM\right\rangle \rightarrow 0\mid 1\mid 2\mid 3\mid 4\mid 5\mid 6\mid 7\mid 8\mid 9

\left\langle IDS+\right\rangle \rightarrow \left\langle ID\right\rangle \texttt{,} \left\langle IDS+\right\rangle \mid \left\langle ID\right\rangle

\left\langle ID\right\rangle \rightarrow \texttt{f}\mid\texttt{g}\mid\texttt{x}\mid\texttt{y}

(Yes, real-world languages are much more complicated than textbook examples.)

The variables (nonterminals) of the CFG are all the names enclosed in angle brackets, e.g., \left\langle EXPR\right\rangle is a variable name. All other symbols used in the rules are terminals (ignoring whitespace). You may treat a multi-symbol terminal, e.g., \texttt{if:}, as a single terminal.
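As an illustration of what "derivation steps" look like, the sketch below encodes a small fragment of the \mathit{LIKEPY} rules and applies them to the leftmost variable one step at a time. The dict encoding, the name `RULES`, and the helper `leftmost_derivation` are assumptions made for this example, not part of the assignment. The derived string is deliberately too short to satisfy the six-step requirement below.

```python
# A small fragment of the LIKEPY rules, encoded as a dict from variable
# name to a list of right-hand sides (each a list of symbols).
# This encoding is an illustrative assumption, not the formal description.
RULES = {
    "STMTS+": [["STMT", ";", "STMTS+"], ["STMT"]],
    "STMT": [["print(", "EXPR", ")"], ["EXPR"]],
    "EXPR": [["ID"], ["NUM"]],
    "ID": [["f"], ["g"], ["x"], ["y"]],
    "NUM": [[str(d)] for d in range(10)],
}


def leftmost_derivation(start, choices):
    """Rewrite the leftmost variable once per entry in choices.

    choices[i] is the index of the right-hand side to use at step i.
    Returns the list of sentential forms, starting with [start].
    """
    form = [start]
    steps = [list(form)]
    for choice in choices:
        # Find the leftmost symbol that is a variable.
        pos = next(i for i, sym in enumerate(form) if sym in RULES)
        form = form[:pos] + RULES[form[pos]][choice] + form[pos + 1:]
        steps.append(list(form))
    return steps


# Four steps: STMTS+ => STMT => print( EXPR ) => print( ID ) => print( x )
for form in leftmost_derivation("STMTS+", [1, 0, 0, 2]):
    print(" ".join(form))
```

Each printed line is one sentential form; consecutive lines differ by exactly one rule application, which is the level of detail a written derivation should show.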

  1. Give two strings in \mathit{LIKEPY}’s language by showing their derivation steps. Each derivation must have at least six steps.

  2. Give a formal description of the grammar \mathit{LIKEPY}. You may assume that the given rules are in a set called \mathit{PYRULES}.

  3. Give parse trees for the following strings in the language of \mathit{LIKEPY}:
    1. \texttt{def f(x): x + 1}

    2. \texttt{lambda x: x + 1}

    3. \texttt{f = lambda x,y: x + y; def g(x,y): x*y; f(2,3)==g(2,3);}

4 More Induction Practice

Prove that the following statement is true for string derivations using some CFG G=\left\langle V,\Sigma,R,S\right\rangle:

If \alpha\Rightarrow_G^*\beta then \gamma_1\alpha\gamma_2\Rightarrow_G^*\gamma_1\beta\gamma_2

where \alpha,\beta,\gamma_1,\gamma_2\in (V\cup\Sigma)^*.

In other words, if a grammar can derive \beta from \alpha, then it can carry out the same derivation when \alpha appears as a substring of a larger string.
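One concrete instance of the statement, taking G to be \mathit{LIKEPY} from Problem 3 purely as an illustrative assumption (the statement itself is about an arbitrary CFG, and a single instance is not a proof):

```latex
\alpha = \left\langle EXPR\right\rangle,\quad
\beta = \texttt{x},\quad
\gamma_1 = \texttt{print(},\quad
\gamma_2 = \texttt{)}

\left\langle EXPR\right\rangle
  \Rightarrow \left\langle ID\right\rangle
  \Rightarrow \texttt{x}

\texttt{print(}\left\langle EXPR\right\rangle\texttt{)}
  \Rightarrow \texttt{print(}\left\langle ID\right\rangle\texttt{)}
  \Rightarrow \texttt{print(x)}
```

The second derivation applies the same rules in the same order as the first; the surrounding context \gamma_1 and \gamma_2 is carried along unchanged at every step.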

Since \Rightarrow^* is a recursive definition, the proof must also be recursive, i.e., you must use proof by induction.

Make sure to clearly state all the necessary components of such a proof, including:
  1. what the induction is "on",

  2. base case(s),

  3. and inductive case(s), each of which includes an inductive hypothesis.

In addition, the proof of each case should be clearly explained with a Statements and Justifications table, as described in class.