CS135-lecture-20210426

Parsing using Java #

Consider

\( S' \to S\$ \\ S \to aSx \mid bSx \mid \lambda \)

Heres the predictor table

Production \( \text{First (right hand side)} \) If nullable, \( \text{Follow(left hand side)} \)
\( S \to aSx \) \( a \)
\( S \to bSx \) \( b \)
\( S \to \lambda \) \( x,\$ \)

In code:

We are using a queue of strings to simulate the scanner.

import java.util.Queue;
import java.util.LinkedList;
import java.util.Arrays;

public class ParseTest {

    private static Queue<String> toks;

    private static boolean isEmpty() { return toks.isEmpty(); }
    private static String  next()    { return toks.element(); }

    private static void match(String tok) {
        if (isEmpty() || !next().equals(tok))
            throw new IllegalStateException("Error on match(" + tok + ")");
        toks.remove();
    }

    public static void parseS() {
        if (isEmpty()) {
            // $ predicts S-> \lambda, so leave toks alone
        } else if (next().equals("a")) {
            match("a"); parseS(); match("x");
        } else if (next().equals("b")) {
            match("b"); parseS(); match("x");
        } else if (next().equals("x")) {
            // x predicts S-> \lambda, so leave toks alone
        } else {
            throw new IllegalStateException("Unexpected: " + next());
        }
    }

    public static void main(String[] args) {
        toks = new LinkedList<String>(Arrays.asList("a","b","x","x"));
        parseS();
        System.out.println(isEmpty() ? "accept" : "reject");
    }
}

Creating a parse tree #

Using the previous language, the string \( abxx \) ’s structure isn’t at first clear. Lets turn it into a tree:

image_2021-04-26-12-08-56

It is useful to represent files as a tree (like structured XML files), to make it easy to traverse the data.

We should be able to take a grammar and turn it into a tree that represents the structure of the input. Whenever we do a parse, it returns a reference to a node representing that symbol.

Each leaf will be a node, and each interior non-terminal will also be a node. The non-terminals have a responsibility to build a node and return a reference to it.

We will use Java’s DefaultMutableTreeNode class, which implements TreeNode.

 import javax.swing.tree.DefaultMutableTreeNode;
    // other existing methods from above
    public static DefaultMutableTreeNode parseS() {
        DefaultMutableTreeNode rval = new DefaultMutableTreeNode("S");
        if (isEmpty() || next().equals("x")) {                 // S --> lambda
            rval.add(new DefaultMutableTreeNode(""));
        } else if (next().equals("a") || next().equals("b")) { // S --> aSx | bSx
            String tok = next();  
            match(tok);
            rval.add(new DefaultMutableTreeNode(tok));
            rval.add(parseS());
            match("x");
            rval.add(new DefaultMutableTreeNode("x"));
        } else {                                               // Unexpected token
            throw new IllegalStateException("Unexpected: " + next());
        }
        return rval;
    }
Same method but thoroughly commented
 import javax.swing.tree.DefaultMutableTreeNode;
    // ....
    public static DefaultMutableTreeNode parseS() {
        // Create node to be returned and label it "S"
        DefaultMutableTreeNode rval = new DefaultMutableTreeNode("S");
        if (isEmpty()) {
            // S non-terminal is nullable, so $ and x both indicate S -> lambda
            // Make child be a leaf node labeled "" (empty string)
            rval.add(new DefaultMutableTreeNode(""));
        } else if (next().equals("a")) {
            // Match "a" and add first child, a leaf node labeled "a"
            match("a");
            rval.add(new DefaultMutableTreeNode("a"))
            // parseS() and have resulting subtree be our second child
            rval.add(parseS());
            // Match "x" and add third child, a leaf node labeled "x"
            match("x");
            rval.add(new DefaultMutableTreeNode("x"))
        } else if (next().equals("b")) {
            // Match "b" and add first child, a leaf node labeled "b"
            match("b");
            rval.add(new DefaultMutableTreeNode("b"))
            // parseS() and have resulting subtree be our second child
            rval.add(parseS());
            // Match "x" and add third child, a leaf node labeled "x"
            match("x");
            rval.add(new DefaultMutableTreeNode("x"))
        } else if (next().equals("x")) {
            // S non-terminal is nullable, so $ and x both indicate S -> lambda
            // Make child be a leaf node labeled "" (empty string)
            rval.add(new DefaultMutableTreeNode(""));
        } else {
            // Unexpected token, so throw an exception
            throw new IllegalStateException("Unexpected: " + next());
        }
        return rval;
    }
Note: The add() method will create a new child at the next index.

When we want to print out this tree we can use a method like so:

    public static void printLeaves(TreeNode noderef) {
        if (noderef.getChildCount()==0) {           // noderef is a leaf node
            String label = noderef.toString();
            System.out.print(label);
        } else {
            for (int i=0; i<noderef.getChildCount(); i++) {
                printLeaves(noderef.getChildAt(i));
            }
        }
    }