Skip to content

ckshitij/RE_TO_DFA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Regular Expression to DFA

  • How To Compile Code
  • VIDEO LINK OF CONVERSION AND OTHER REFERENCES
  • ABOUT CONVERSION OF RE TO DFA

How to Compile code

  • If you don't have c++ compiler first install it using following command :-
     sudo apt-get install g++
    
  • Compile the cpp file by using command :-
     g++ -std=c++11 RE_TO_DFA.cpp -o re
     ./re
    

VIDEO LINK AND REFERANCES

CONVERSION OF RE TO DFA

  • Regular expression is used to represent the language (lexeme) of finite automata (lexical analyzer).
  • Regular expressions are used to specify regular languages and finite automata are used to recognize the regular languages. Many computer applications such as compilers, operating system utilities, text editors make use of regular languages.
  • In these applications, the regular expressions and finite automata are used to recognize this language.

Formal definition of Regular expression

  • The class of regular expressions over ∑ is defined recursively as follows:
    • The letters φ and Є are regular expressions over ∑ .
    • Every letter ‘a’ c ∑is a regular expression over ∑ .
    • If ‘R1’ and ‘R are regular expressions over ∑, then so are ‘(R1|R2)’, ‘(R1.R2)’ and (R1)* Where
    ‘|’ indicates alternative or parallel paths.
    ‘.’ Indicates concatenation
    ‘*’ indicates closure
    
    • The regular expressions are only those that are obtained using rules (1) and (2).

Formal definition of DFA:

  • The formal definition of finite automata is denoted by a tuple

    ( Q, ∑,d, qo, f)
    

    Where

    Q à Finite set of table
    ∑à finite input alphabet
    qoà Initial state of FA,qo, qo Q
    Fà set of final states, F c Q
    dà Transition function called as state function mapping
    Q * ∑ à Q
    i.e. d= Q*∑à Q
    
  • A FA is called deterministic (DFA) if from every vertex of its transition graph, there is an unique input symbol which takes the vertex state to the required next state.

Algorithm

The steps in algorithm are

  • Accept the given regular expression with end of character as #

  • Covert the regular expressions to its equivalent postfix form manually.

  • Construct a syntax tree from the postfix expression obtained in step 2.

  • Assign positions to leaf nodes

  • Compute following functions.

Nullable, Firstpos, Lastpos, Followpos
  • Computation of Nullables : All nodes except the '*' nodes are not nullable.
  • Also if some leaf node is for έ, then it is also nullable.
  • Firstpos (Firstposition): At each node n of the syntax tree of a regular expression, we define a function firstpos(n) that gives the set of first positions that can match first symbol of a string generated by sub expression rooted at ‘n’.
  • Lastpos (lastposition): At each node n of the syntax tree of a regular expression, we define a function lastpos(n) that gives the set of last positions that can match last symbol of a string generated by sub expression rooted at ‘n’.
  • To compute firstposition and last position, we need to know which nodes are the roots of sub expression that generate languages that include the empty string. Such nodes are Nullable. We define nullable(n) to be true if node ‘n’ is nullable , false otherwise.

  • Computation of Followpos : Followpos(i) tells us what positions can follow position i in the syntax tree. This can be computed as follows.

    • if n is a ‘.’ (cat) Node, with a left child C1 and right child C2 and i is a position in the Lastpos(C1), then all positions in Firstpos(C2) are in Followpos(i)
    • if n is a * (closure) Node and i is a position in the Lastpos(n), then all positions in Firstpos(n) are Followpos(i)
  • Construct DFA from Follow Pos.