lr0multinonterminalproductionstatemachine.html

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
  <head>
    <title>RIPAL: Responsive and Intuitive Parsing for the Analysis of Language</title>
    <link href="styles/styles.css" rel="stylesheet" type="text/css" media="all"/>
    <meta name="viewport" content="width=device-width, initial-scale=1"/>
  </head>
  <body>
    <h1>RIPAL: Responsive and Intuitive Parsing for the Analysis of Language</h1>
    <h2>Pages</h2>
    <nav aria-label="Pages">
      <ul>
        <li><a href="index.html">Home</a></li>
        <li><a href="theory.html">Theory</a></li>
        <li><a href="contact.html">Contact</a></li>
      </ul>
    </nav>
    <h2>LR(0) multi-nonterminal production state machine</h2>
    <div class="section">
      <h3>Background</h3>
      <p>In a previous section, we illustrated the method for building the LR(0) state machine for a grammar producing a nonterminal symbol in one of its productions.</p>
      <p>In this section, we will expand this construction to illustrate the process for constructing an LR(0) state machine for a grammar producing multiple nonterminal symbols in a single production.</p>
    </div>
    <div class="section">
      <h3>Our grammar</h3>
      <p>The language we will be analyzing is specified by the following grammar:</p>
      <p>
        S &rarr; A B<br/>
        A &rarr; a<br/>
        B &rarr; b
      </p>
      <p>This grammar has the following augmented grammar:</p>
      <p>
        S' &rarr; S $<br/>
        S &rarr; A B<br/>
        A &rarr; a<br/>
        B &rarr; b
      </p>
    </div>
    <div class="section">
      <h3>Our parse table</h3>
      <p>This grammar has the following LR(0) parse table:</p>
      <table class="parsetable">
        <tr>
          <td></td>
          <td>a</td>
          <td>b</td>
          <td>$</td>
          <td>S</td>
          <td>A</td>
          <td>B</td>
          <td>S'</td>
        </tr>
        <tr>
          <td>state<sub>1</sub></td>
          <td>shift<sub>5</sub></td>
          <td></td>
          <td></td>
          <td>goto<sub>6</sub></td>
          <td>goto<sub>2</sub></td>
          <td></td>
          <td></td>
        </tr>
        <tr>
          <td>state<sub>2</sub></td>
          <td></td>
          <td>shift<sub>3</sub></td>
          <td></td>
          <td></td>
          <td></td>
          <td>goto<sub>4</sub></td>
          <td></td>
        </tr>
        <tr>
          <td>state<sub>3</sub></td>
          <td>reduce<sub>4</sub></td>
          <td>reduce<sub>4</sub></td>
          <td>reduce<sub>4</sub></td>
          <td></td>
          <td></td>
          <td></td>
          <td></td>
        </tr>
        <tr>
          <td>state<sub>4</sub></td>
          <td>reduce<sub>2</sub></td>
          <td>reduce<sub>2</sub></td>
          <td>reduce<sub>2</sub></td>
          <td></td>
          <td></td>
          <td></td>
          <td></td>
        </tr>
        <tr>
          <td>state<sub>5</sub></td>
          <td>reduce<sub>3</sub></td>
          <td>reduce<sub>3</sub></td>
          <td>reduce<sub>3</sub></td>
          <td></td>
          <td></td>
          <td></td>
          <td></td>
        </tr>
        <tr>
          <td>state<sub>6</sub></td>
          <td></td>
          <td></td>
          <td>accept</td>
          <td></td>
          <td></td>
          <td></td>
          <td></td>
        </tr>
      </table>
      <p>Parsing input string ab:</p>
      <table class="parsetrace">
        <tr>
          <th>Input queue</th>
          <th>Parse stack</th>
          <th>Action</th>
        </tr>
        <tr>
          <td>ab</td>
          <td>1</td>
          <td>Apply action of shift<sub>5</sub>, which corresponds to state<sub>1</sub> and a in our parse table</td>
        </tr>
        <tr>
          <td>b</td>
          <td>1 a 5</td>
          <td>Apply action of reduce<sub>3</sub>, which corresponds to state<sub>5</sub> and b in our parse table</td>
        </tr>
        <tr>
          <td>b</td>
          <td>1 A</td>
          <td>Apply action of goto<sub>2</sub>, which corresponds to state<sub>1</sub> and A in our parse table</td>
        </tr>
        <tr>
          <td>b</td>
          <td>1 A 2</td>
          <td>Apply action of shift<sub>3</sub>, which corresponds to state<sub>2</sub> and b in our parse table</td>
        </tr>
        <tr>
          <td></td>
          <td>1 A 2 b 3</td>
          <td>Apply action of reduce<sub>4</sub>, which corresponds to state<sub>3</sub> and $ in our parse table</td>
        </tr>
        <tr>
          <td></td>
          <td>1 A 2 B</td>
          <td>Apply action of goto<sub>4</sub>, which corresponds to state<sub>2</sub> and B in our parse table</td>
        </tr>
        <tr>
          <td></td>
          <td>1 A 2 B 4</td>
          <td>Apply action of reduce<sub>2</sub>, which corresponds to state<sub>4</sub> and $ in our parse table</td>
        </tr>
        <tr>
          <td></td>
          <td>1 S</td>
          <td>Apply action of goto<sub>6</sub>, which corresponds to state<sub>1</sub> and S in our parse table</td>
        </tr>
        <tr>
          <td></td>
          <td>1 S 6</td>
          <td>Accept, since this action corresponds to state<sub>6</sub> and $ in our parse table</td>
        </tr>
      </table>
    </div>
    <div class="section">
      <h3>Initial state</h3>
      <p>As per our standard process, we will construct our initial qualified production rule.</p>
      <p>Our initial set of qualified productions is:</p>
      <p>{S' &rarr; &middot; S $}</p>
      <p>Calculating the closure set, we end up with:</p>
      <p>{S' &rarr; &middot; S $, S &rarr; &middot; A B, A &rarr; &middot; a}</p>
      <p>This is as far as we can go, since we would need to process nonterminal A or terminal a to proceed further.</p>
      <p>As a result, our initial parse state is:</p>
      <img src="assets/multinonterminalproductiondfaparsestate1.png" alt="DFA state 1. This state contains the following productions: S prime produces dot S dollar, S produces dot A B, A produces dot a."/>
    </div>
    <div class="section">
      <h3>Handling terminal a</h3>
      <p>Observe the following sequence of parse stack states encountered in the parsing example above:</p>
      <ol>
        <li>1</li>
        <li>1 a 5</li>
        <li>1 A</li>
      </ol>
      <p>Here, we first shift a onto our parse stack and move into state<sub>5</sub>. Here, state<sub>5</sub> represent being at the point in our parsing procedure where the application of a reduction corresponding to the production A &rarr; a can take place.</p>
      <p>In applying this reduction, we are able to return to our initial parse state and continue the standard chaining process.</p>
      <p>To calculate the closure for state<sub>5</sub>, we take the following qualified production:</p>
      <p>A &rarr; &middot; a</p>
      <p>We shift the dot past the terminal a that we are processing and create the following qualified production set from this action:</p>
      <p>{A &rarr; a &middot;}</p>
      <p>Since the dot is at the end of the production, this closure can't be further expanded. We end up with state<sub>5</sub> being defined as follows:</p>
      <img src="assets/multinonterminalproductiondfaparsestate5.png" alt="DFA state 5. This state contains the following production: A produces a dot."/>
      <p>Finally, we connect state<sub>1</sub> and state<sub>5</sub> with the appropriate transition as follows:</p>
      <img src="assets/multinonterminalproductiondfa2parsestates.png" alt="DFA containing two states. State 1 contains the following productions: S prime produces dot S dollar, S produces dot A B, A produces dot a. State 5 contains the following production: A produces a dot. State 1 transitions to state 5 on terminal symbol a."/>
    </div>
    <div class="section">
      <h3>Handling terminal b</h3>
      <p>Observe the following sequence of parse stack states encountered in the parsing example above:</p>
      <ol>
        <li>1 A</li>
        <li>1 A 2</li>
        <li>1 A 2 b 3</li>
        <li>1 A 2 B</li>
      </ol>
      <p>After reducing terminal a to nonterminal A, we want to further make a reduction via the rule S &rarr; A B. However, we need to first get nonterminal B onto our parse stack to use this rule.</p>
      <p>First, we use the goto<sub>2</sub> action to transition into a state where we are ready to process more input symbols.</p>
      <p>We shift terminal b onto the parse stack and transition into state<sub>3</sub>, in preparation for applying a reduction via the rule B &rarr; b. Then we can apply our reduction as shown in the last stack state listed above.</p>
      <p>To calculate the closure for state<sub>2</sub>, we take the following qualified production:</p>
      <p>S &rarr; &middot; A B</p>
      <p>We shift the dot past the nonterminal A that we are processing and create the following qualified production set from this action:</p>
      <p>{S &rarr; A &middot; B}</p>
      <p>Expanding via our closure calculation, we end up with a closure of:</p>
      <p>{S &rarr; A &middot; B, B &rarr; &middot; b}</p>
      <p>We end up with state<sub>2</sub> being defined as follows:</p>
      <img src="assets/multinonterminalproductiondfaparsestate2.png" alt="DFA state 2. This state contains the following productions: S produces A dot B, B produces dot b."/>
      <p>Finally, we connect state<sub>1</sub> and state<sub>2</sub> with the appropriate transition as follows:</p>
      <img src="assets/multinonterminalproductiondfa3parsestates.png" alt="DFA containing three states. State 1 contains the following productions: S prime produces dot S dollar, S produces dot A B, A produces dot a. State 5 contains the following production: A produces a dot. State 1 transitions to state 5 on terminal symbol a. State 2 contains the following productions: S produces A dot B, B produces dot b. State 1 transitions to state 2 on nonterminal symbol A."/>
      <p>Next, we create state<sub>3</sub> by taking qualified production:</p>
      <p>B &rarr; &middot; b</p>
      <p>We shift the dot past the terminal b that we are processing and create the following qualified production set from this action:</p>
      <p>{B &rarr; b &middot;}
      <p>Since the dot is as the end of the production, this closure can't be further expanded. We end up with state<sub>3</sub> being defined as follows:</p>
      <img src="assets/multinonterminalproductiondfaparsestate3.png" alt="DFA state 3. This state contains the following production: B produces b dot."/>
      <p>Finally, we connect state<sub>2</sub> and state<sub>3</sub> with the appropriate transition as follows:</p>
      <img src="assets/multinonterminalproductiondfa4parsestates.png" alt="DFA containing four states. State 1 contains the following productions: S prime produces dot S dollar, S produces dot A B, A produces dot a. State 5 contains the following production: A produces a dot. State 1 transitions to state 5 on terminal symbol a. State 2 contains the following productions: S produces A dot B, B produces dot b. State 1 transitions to state 2 on nonterminal symbol A. State 3 contains the following production: B produces b dot. State 2 transitions to state 3 on terminal symbol b."/>
    </div>
    <div class="section">
      <h3>Handling nonterminal S</h3>
      <p>Observe the following sequence of parse stack states encountered in the parsing example above:</p>
      <ol>
        <li>1 A 2 B</li>
        <li>1 A 2 B 4</li>
        <li>1 S</li>
      </ol>
      <p>After reducing terminal b to nonterminal B, we arrive back in state<sub>2</sub>.</p>
      <p>From here, we use our standard goto action corresponding to nonterminal processing and transition into state<sub>4</sub>.</p>
      <p>To calculate the closure for state<sub>4</sub>, we start with the qualified production:</p>
      <p>S &rarr; A &middot; B</p>
      <p>We shift the dot past the nonterminal B that we are processing and create the following qualified production set from this action:</p>
      <p>{S &rarr; A B &middot;}</p>
      <p>Since the dot is at the end of the production, this closure can't be further expanded. We end up with state<sub>4</sub> being defined as follows:</p>
      <img src="assets/multinonterminalproductiondfaparsestate4.png" alt="DFA state 4. This state contains the following production: S produces A B dot."/>
      <p>Finally, we connect state<sub>2</sub> and state<sub>4</sub> with the appropriate transition as follows:</p>
      <img src="assets/multinonterminalproductiondfa5parsestates.png" alt="DFA containing five states. State 1 contains the following productions: S prime produces dot S dollar, S produces dot A B, A produces dot a. State 5 contains the following production: A produces a dot. State 1 transitions to state 5 on terminal symbol a. State 2 contains the following productions: S produces A dot B, B produces dot b. State 1 transitions to state 2 on nonterminal symbol A. State 3 contains the following production: B produces b dot. State 2 transitions to state 3 on terminal symbol b. State 4 contains the following production: S produces A B dot. State 2 transitions to state 4 on nonterminal symbol B."/>
      <p>At this point, we are able to reduce nonterminals A and B to nonterminal S via the production S &rarr; A B, discarding the state information attached to those nonterminals.</p>
    </div>
    <div class="section">
      <h3>Handling end of string</h3>
      <p>After reducing A B to S, from state<sub>1</sub>, we want to process the end of string according to our standard approach.</p>
      <p>Extending our DFA, we end up with a final diagram of:</p>
      <img src="assets/multinonterminalproductiondfa7parsestates.png" alt="DFA containing seven states. State 1 contains the following productions: S prime produces dot S dollar, S produces dot A B, A produces dot a. State 5 contains the following production: A produces a dot. State 1 transitions to state 5 on terminal symbol a. State 2 contains the following productions: S produces A dot B, B produces dot b. State 1 transitions to state 2 on nonterminal symbol A. State 3 contains the following production: B produces b dot. State 2 transitions to state 3 on terminal symbol b. State 4 contains the following production: S produces A B dot. State 2 transitions to state 4 on nonterminal symbol B. State 6 contains the following production: S prime produces S dot $. State 1 transitions to state 6 on nonterminal symbol S. State 7 contains the following production: S prime produces S $ dot. State 6 transitions to state 7 on end of string symbol $. State 7 is an accepting state."/>
    </div>
    <div class="section">
      <h3>Conclusion</h3>
      <p>We've seen the LR(0) DFA construction process for a grammar containing a multi-nonterminal production.</p>
      <p>In the next section, we will start to explore the process of converting an LR(0) DFA into a parse table.</p>
    </div>
    <hr/>
    <p>GitHub Repository: <a href="https://github.com/bprollinson/ripal">https://github.com/bprollinson/ripal</a></p>
    <p>Copyright &copy; 2017 Brendan Rollinson-Lorimer</p>
  </body>
</html>