Next: Operator Precedence, Previous: Lookahead Tokens, Up: The Bison Parser Algorithm [Contents][Index]
Suppose we are parsing a language which has if-then and if-then-else statements, with a pair of rules like this:
if_stmt: "if" expr "then" stmt | "if" expr "then" stmt "else" stmt ;
Here "if"
, "then"
and "else"
are terminal symbols for
specific keyword tokens.
When the "else"
token is read and becomes the lookahead token, the
contents of the stack (assuming the input is valid) are just right for
reduction by the first rule. But it is also legitimate to shift the
"else"
, because that would lead to eventual reduction by the second
rule.
This situation, where either a shift or a reduction would be valid, is called a shift/reduce conflict. Bison is designed to resolve these conflicts by choosing to shift, unless otherwise directed by operator precedence declarations. To see the reason for this, let’s contrast it with the other alternative.
Since the parser prefers to shift the "else"
, the result is to attach
the else-clause to the innermost if-statement, making these two inputs
equivalent:
if x then if y then win; else lose; if x then do; if y then win; else lose; end;
But if the parser chose to reduce when possible rather than shift, the result would be to attach the else-clause to the outermost if-statement, making these two inputs equivalent:
if x then if y then win; else lose; if x then do; if y then win; end; else lose;
The conflict exists because the grammar as written is ambiguous: either
parsing of the simple nested if-statement is legitimate. The established
convention is that these ambiguities are resolved by attaching the
else-clause to the innermost if-statement; this is what Bison accomplishes
by choosing to shift rather than reduce. (It would ideally be cleaner to
write an unambiguous grammar, but that is very hard to do in this case.)
This particular ambiguity was first encountered in the specifications of
Algol 60 and is called the “dangling else
” ambiguity.
To assist the grammar author in understanding the nature of each conflict, Bison can be asked to generate “counterexamples”. In the present case it actually even proves that the grammar is ambiguous by exhibiting a string with two different parses:
Example: "if" expr "then" "if" expr "then" stmt • "else" stmt Shift derivation if_stmt ↳ 3: "if" expr "then" stmt ↳ 2: if_stmt ↳ 4: "if" expr "then" stmt • "else" stmt Example: "if" expr "then" "if" expr "then" stmt • "else" stmt Reduce derivation if_stmt ↳ 4: "if" expr "then" stmt "else" stmt ↳ 2: if_stmt ↳ 3: "if" expr "then" stmt •
See Generation of Counterexamples, for more details.
To avoid warnings from Bison about predictable, legitimate shift/reduce
conflicts, you can use the %expect n
declaration.
There will be no warning as long as the number of shift/reduce conflicts
is exactly n, and Bison will report an error if there is a
different number.
See Suppressing Conflict Warnings. However, we don’t
recommend the use of %expect
(except ‘%expect 0’!), as an equal
number of conflicts does not mean that they are the same. When
possible, you should rather use precedence directives to fix the
conflicts explicitly (see Using Precedence For Non Operators).
The definition of if_stmt
above is solely to blame for the
conflict, but the conflict does not actually appear without additional
rules. Here is a complete Bison grammar file that actually manifests
the conflict:
%%
stmt: expr | if_stmt ;
if_stmt: "if" expr "then" stmt | "if" expr "then" stmt "else" stmt ;
expr: "identifier" ;
Next: Operator Precedence, Previous: Lookahead Tokens, Up: The Bison Parser Algorithm [Contents][Index]