On this page:
1 Concatentation
2 Kleene Star
3 A Regular Expression Matcher
4 The FLIP Operation
5 A Non-Regular language
6 A Regular and a Non-Regular Language

Homework 4

Last updated: Mon, 22 Feb 2021 16:17:36 -0500

Out: Mon Feb 22, 00:00 EST Due: Sun Feb 28, 23:59 EST

This is the last homework related to Chapter 1.

You will explore NFAs and their relation to regular expressions; an additional closure property of regular langs; and practice proving when a language is not regular.

Homework Problems

  1. Concatentation

  2. Kleene Star

  3. A Regular Expression Matcher (6 pts + 6 pts bonus)

  4. The FLIP Operation (5 pts)

  5. A Non-Regular language (5 pts)

  6. A Regular and a Non-Regular Language (4 + 4 = 8 pts)

Total: 24 points (+ 6 bonus points)

Submitting

Submit the solution to this assignment in Gradescope hw4.

The submission must include the following files (NOTE: everything is case-sensitive):
  • a Makefile with the following targets:
    • setup (optional)

    • run-hw4-regexp

  • a README containing the required information,

  • the source code files needed by your Makefile,

  • and a pdf or plain-text file containing the answer to the non-coding questions.

1 Concatentation

Your Task

Implement concatentation for your NFAs from A Data Representation for NFAs.

Specifically, write a function that, given:

  • an NFA recognizing language A_1, and

  • an NFA recognizing language A_2

returns an NFA recognizing A_1 \circ A_2.

To do this, you’ll need to be able to extract individual components of an NFA instance, e.g., the states, transitions, etc., and use them to create a new NFA.

Figure 1.48 from the textbook should be helpful here.

NOTE: When combining NFAs be careful with duplicate state names. Since the given files may use the same state names, it may be a good idea to first rename them to something unique.

There will not be any direct tests for this problem.

2 Kleene Star

Your Task

Implement the Kleene Star operation for your NFAs from A Data Representation for NFAs.

Specifically, write a function that, given an NFA recognizing language A, returns an NFA recognizing A^*.

To do this, you’ll need to be able to extract individual components of an NFA instance, e.g., the states, transitions, etc., and use them to create a new NFA.

Figure 1.50 from the textbook should be helpful here.

There will be not any direct tests for this problem.

3 A Regular Expression Matcher

This problem will show you that you’ve almost implemented (the core of) a regular expression matcher, which is found in nearly every programming language! Congrats!

The regexp matcher included with most PLs are of course much more efficient, and have many more bells and whistles than the theoretical models we are studying. But the basic behavior is indeed the same.

To complete it, you’ll just need use your NFA datastructure from HW3, in combination with the union, concatenation, and Kleene star operations you implemented so far.

You will also need the definition (Def 1.52) of Regular Expressions from the textbook, which are defined using exactly these three operations:

R is a regular expression if R is either:
  • a for some a\in an alphabet \Sigma,

  • the empty string \varepsilon,

  • the empty set \emptyset,

  • R_1 \cup R_2, sometimes written R_1\mid R_2, where R_1 and R_2 are regular expressions,

  • R_1 \circ R_2, sometimes written R_1R_2, where R_1 and R_2 are regular expressions,

  • R_1^*, where R_1 is a regular expression.

Your Task

Use Lemma 1.55 and the closed NFA operations you’ve implemented to create an equivalent NFA for each of the regular expressions below (these are all from Example 1.53 in the textbook).

There are 12 examples total, but you only need to submit 6 to get full credit (every example beyond 6 will be extra credit).

Assume that \Sigma = \{0,1\}, a ^+ means "1 or more", and a \Sigma in a regular expression below means 0\cup 1.

  1. 0^*10^* = \{w \mid w \textrm{ contains a single } 1\}

  2. \Sigma^*1\Sigma^* = \{w \mid w \textrm{ has at least one } 1\}

  3. \Sigma^*001\Sigma^* = \{w \mid w \textrm{ contains the string } 001 \textrm{ as a substring}\}

  4. 1^*(01^+)^* = \{w \mid \textrm{every } 0 \textrm{ in } w \textrm{ is followed by at least one } 1\}

  5. (\Sigma\Sigma)^* = \{w \mid w \textrm{ is a string of even length}\}

  6. (\Sigma\Sigma\Sigma)^* = \{w \mid \textrm{the length of } w \textrm{ is a multiple of } 3\}

  7. 01 \cup 10 = \{01,10\}

  8. 0\Sigma^*0\cup 1\Sigma^*1\cup 0\cup 1 = \{w\mid w \textrm{ starts and ends with the same symbol}\}

  9. (0\cup\varepsilon)1^* = 01^*\cup1^*

  10. (0\cup\varepsilon)(1\cup\varepsilon) = \{\varepsilon,0,1,01\}

  11. 1^*\emptyset = \emptyset

  12. \emptyset^* = \{\varepsilon\}

Each NFA you create is doing exactly the same thing as a real regexp matcher! To prove this to you, the grader will test your submission using the regexp implementation from an actual PL (e.g., could be Perl, Python, Java, Racket, etc.).

Specifically, your solution will be tested as follows:

More Hints:

To further help you, below are the exact regexp patterns (each corresponds to a numbered example from above) that the grader will use to grade your output NFA (the grader will only check strings up to length 5).

Note that in typical regexp matching, a union of single chars, e.g., 0 \cup 1, is typically written [01], while the "vertical bar" \mid is a more general union operation.

  1. "0*10*"

  2. "[01]*1[01]*"

  3. "[01]*001[01]*"

  4. "1*(01+)*"

  5. "([01][01])*"

  6. "([01][01][01])*"

  7. "01|10"

  8. "0[01]*0|1[01]*1|0|1"

  9. "01*|1*"

  10. "^$|^0$|^1$|^01$"

    (won’t work without the ^ or $, which stand for "string start" and "string end", respectively, demonstrating that there are some differences between the theoretical regular expressions we study in class, and the regexp matching available in programming languages)

  11. "\b\B" (pattern that won’t match anything)

  12. ""

With these, you should be able to easily test your solution on your own using the regexp library in your favorite language. Make sure to choose "exact match" ("partial match" is the default in some langs and will not recognize the same language), which corresponds to the textbook’s notion of matching. Specifically, for any given string in the language of alphabet \Sigma = \{0,1\} (up to length 5), your NFA should accept the string if and only if the equivalent regexp pattern matches on the string.

Finally, if you still need more explanations, the internet has plenty of sites dedicated to learning about, and interactively exploring regexps, e.g., regexr.com or regex101.com.

4 The FLIP Operation

Define the \textrm{FLIP} operation on a language to be:

\textrm{FLIP}(L) = \{c_n \ldots c_0 \mid c_0\ldots c_n \in L\}, \textrm{where } c_i \in \Sigma

Prove that regular languages are closed under the \textrm{FLIP} operation.

Your answer may:
  • use any representation of regular languages, e.g., DFA, NFA, regular expression, etc.,

  • use any proof style (e.g., proof by construction, proof by induction, etc.),

  • and may assume any theorems already proven in class or the book.

The proof must be submitted as either a pdf or plain text file. Also, make sure to assign it to the proper problem in Gradescope when submitting.

5 A Non-Regular language

Prove that the language L = \{www \mid w\in \{0,1\}^*\} is not regular.

6 A Regular and a Non-Regular Language

The language L_1 = \{a^nsa^n\mid n > 0\textrm{ and }s\in\Sigma^*\} is regular.

The language L_2 = \{a^nbsa^n\mid n > 0\textrm{ and }s\in\Sigma^*\}, however, is not regular.

You may assume \Sigma = \{a,b\}.

  1. Prove that L_1 is regular.

  2. Prove that L_2 is not regular.