Regular Expressions
Regular expressions describe languages based on algebra operations. The language defined by regular expression
Regular Expression Definition (Bases)
A regular expression is defined as follows:
- Basis1: If
is a symbol in the alphabet, then is a regular expression and represents the language . - Basis2:
is a regular expression and represents the language . - Basis3:
is a regular expression and represents the language .
Regular Expression Operators (Inductions)
The regular expression operators are defined as follows:
- Induction 1 (Union): If
and are regular expressions, then is also a regular expression. is the set of strings that are in or . - Induction 2 (Concatenation): If R1 and R2 are regular expressions, then
is also a regular expression. is the set of strings that are the concatenation of a string in followed by a string in . - Induction 3 (Kleene Star): If
is a regular expression, then is also a regular expression. is the set of strings that are the concatenation of zero or more strings in .
Precedence of Operators
The precedence of operators is as follows:
- Kleene Star has the highest precedence.
- Concatenation has the second highest precedence. The concatenation operator is implicit, so it is not necessary to write it.
- Union has the lowest precedence.
Regular Expression Examples
- Example 1:
represents the set of all binary strings. - Example 2:
represents the set of all binary strings that have no two consecutive s.
Equivalence of Regular Expressions and Finite Automata
Regular expressions and finite automata are equivalent. A regular expression can be easily converted to a NFA by using Thompson's construction method. For any regular expression
Basic Rules
- If the regular expression is a symbol
, then it is converted to an NFA with two states.
R is only a symbol 'a'
- If the regular expression
is , then it is converted to an NFA with two state, linked by an -transition.
R is epsilon
- If the regular expression is
, then it is converted to an NFA with two states, but without any transitions.
- If the regular expression is a symbol
Inductive Rules
- If the regular expression is
, then the NFA is like a choice between the NFAs for and .
R is s + t
- If the regular expression is
, then the NFA sequentially combines the NFAs for and .
R is s.t
- If the regular expression is
, then the NFA for is a loop that can be taken zero or more times.
R is s*
- If the regular expression is