Skip to content

Backtracking

Ronald Franco edited this page Dec 27, 2018 · 2 revisions

Before attempting to match a DEF node, the values of all the out-flow variables of the (non)terminals present on the RHS of a production are saved onto a std::stack. This allows the parsing algorithm to backtrack and restore the old values of out-flow variables, in the case a production is not fully matched and out-flow variables are altered. Recall that out-flow variables are the only type of flow variable that gets altered by (non)terminal and thus saving the values of the in-flow variables of these (non)terminals, before attempting to match a DEF node, is not necessary.

Motivating Example

Consider the following incomplete AFG that performs syntax-directed translation of some simple programming language with basic control-flow structures to java bytecode:

Parser<> S; // Represents one statement in the simple programming language
size_t a;

// IF_CODE is a terminal matching the "if" token
// THEN_CODE is a terminal matching the "then" token
// E is some boolean expression
S = IF_CODE & E & [&]{ emit(iconst_0); a = pc; emit(if_icmpeq,0); } 
    & THEN_CODE & S & [&]{ backpatch(a,pc-a); };

The incomplete AFG given above, outlines an if-then control flow structure to the appropriate java bytecode. We note the following:

  • Flow variable a is used to store the program counter before reaching the keyword
  • After setting a, there is a recursive call on S
  • If the recursive call alters a, the old value of a is lost as it was not saved
  • Overall, the above AFG cannot handle nested if-then statements

Flow variables local to the production can be saved if it is an out-flow variable of one symbol in the production. The use of a marker nonterminal, whose sole purpose is to copy the program counter into an out-flow variable to save a local copy, would be desirable. This is a result of all the out-flow variable values being saved before attempting to match a (non)terminal, which could overwrite the values of flow variables. Thus, the following altered AFG can handle nested if-then statements:

Parser<> S;
Parser<size_t> T;
size_t a, pc;

S = IF_CODE & E & [&]{ emit(iconst_0); } & T(pc)>>a 
    & [&]{ emit(if_icmpeq,0); } & THEN_CODE & S 
    & [&]{ backpatch(a,pc-a); };

size_t x;

// Empty llambda, always matches
// Copy in-flow variable to out-flow variable
T(x)>>x = [&]{ };

The marker nonterminal T has an instance in the above AFG that passes the program counter as an in-flow variable and has an out-flow variable a. The marker nonterminal T has a formal definition that does nothing but copy the value of the in-flow variable to the out-flow variable. This is done by designing T in such a way that it will always match, the empty lambda in the formal definition gives us this property, and by making the in- and out- flow variables the same.