Parsing using Java #
Consider
\( S' \to S\$ \\ S \to aSx \mid bSx \mid \lambda \)Heres the predictor table
Production | \( \text{First (right hand side)} \) | If nullable, \( \text{Follow(left hand side)} \) |
---|---|---|
\( S \to aSx \) | \( a \) | |
\( S \to bSx \) | \( b \) | |
\( S \to \lambda \) | \( x,\$ \) |
In code:
We are using a queue of strings to simulate the scanner.
import java.util.Queue;
import java.util.LinkedList;
import java.util.Arrays;
public class ParseTest {
private static Queue<String> toks;
private static boolean isEmpty() { return toks.isEmpty(); }
private static String next() { return toks.element(); }
private static void match(String tok) {
if (isEmpty() || !next().equals(tok))
throw new IllegalStateException("Error on match(" + tok + ")");
toks.remove();
}
public static void parseS() {
if (isEmpty()) {
// $ predicts S-> \lambda, so leave toks alone
} else if (next().equals("a")) {
match("a"); parseS(); match("x");
} else if (next().equals("b")) {
match("b"); parseS(); match("x");
} else if (next().equals("x")) {
// x predicts S-> \lambda, so leave toks alone
} else {
throw new IllegalStateException("Unexpected: " + next());
}
}
public static void main(String[] args) {
toks = new LinkedList<String>(Arrays.asList("a","b","x","x"));
parseS();
System.out.println(isEmpty() ? "accept" : "reject");
}
}
Creating a parse tree #
Using the previous language, the string \( abxx \) ’s structure isn’t at first clear. Lets turn it into a tree:
It is useful to represent files as a tree (like structured XML files), to make it easy to traverse the data.
We should be able to take a grammar and turn it into a tree that represents the structure of the input. Whenever we do a parse, it returns a reference to a node representing that symbol.
Each leaf will be a node, and each interior non-terminal will also be a node. The non-terminals have a responsibility to build a node and return a reference to it.
We will use Java’s DefaultMutableTreeNode
class, which implements TreeNode
.
import javax.swing.tree.DefaultMutableTreeNode;
// other existing methods from above
public static DefaultMutableTreeNode parseS() {
DefaultMutableTreeNode rval = new DefaultMutableTreeNode("S");
if (isEmpty() || next().equals("x")) { // S --> lambda
rval.add(new DefaultMutableTreeNode(""));
} else if (next().equals("a") || next().equals("b")) { // S --> aSx | bSx
String tok = next();
match(tok);
rval.add(new DefaultMutableTreeNode(tok));
rval.add(parseS());
match("x");
rval.add(new DefaultMutableTreeNode("x"));
} else { // Unexpected token
throw new IllegalStateException("Unexpected: " + next());
}
return rval;
}
Same method but thoroughly commented
import javax.swing.tree.DefaultMutableTreeNode;
// ....
public static DefaultMutableTreeNode parseS() {
// Create node to be returned and label it "S"
DefaultMutableTreeNode rval = new DefaultMutableTreeNode("S");
if (isEmpty()) {
// S non-terminal is nullable, so $ and x both indicate S -> lambda
// Make child be a leaf node labeled "" (empty string)
rval.add(new DefaultMutableTreeNode(""));
} else if (next().equals("a")) {
// Match "a" and add first child, a leaf node labeled "a"
match("a");
rval.add(new DefaultMutableTreeNode("a"))
// parseS() and have resulting subtree be our second child
rval.add(parseS());
// Match "x" and add third child, a leaf node labeled "x"
match("x");
rval.add(new DefaultMutableTreeNode("x"))
} else if (next().equals("b")) {
// Match "b" and add first child, a leaf node labeled "b"
match("b");
rval.add(new DefaultMutableTreeNode("b"))
// parseS() and have resulting subtree be our second child
rval.add(parseS());
// Match "x" and add third child, a leaf node labeled "x"
match("x");
rval.add(new DefaultMutableTreeNode("x"))
} else if (next().equals("x")) {
// S non-terminal is nullable, so $ and x both indicate S -> lambda
// Make child be a leaf node labeled "" (empty string)
rval.add(new DefaultMutableTreeNode(""));
} else {
// Unexpected token, so throw an exception
throw new IllegalStateException("Unexpected: " + next());
}
return rval;
}
Note: The add()
method will create a new child at the next index.
When we want to print out this tree we can use a method like so:
public static void printLeaves(TreeNode noderef) {
if (noderef.getChildCount()==0) { // noderef is a leaf node
String label = noderef.toString();
System.out.print(label);
} else {
for (int i=0; i<noderef.getChildCount(); i++) {
printLeaves(noderef.getChildAt(i));
}
}
}