37.3 Writing PEG Rules

Something to be aware of when writing PEG rules is that they are greedy. Rules which can consume a variable amount of text will always consume the maximum amount possible, even if that causes a rule that might otherwise have matched to fail later on: a repetition never backtracks to give up text it has already consumed. For instance, this rule will never succeed:

(forest (+ "tree" (* [blank])) "tree" (eol))

The PEX (+ "tree" (* [blank])) will consume all the repetitions of the word ‘tree’, leaving none to match the final ‘tree’.
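The failure can be observed directly. The following is a minimal sketch, assuming the peg library as shipped with recent Emacs versions (with-peg-rules, peg-run and peg); the helper name my-forest-greedy-p is hypothetical and exists only for illustration.

```elisp
(require 'peg)

(defun my-forest-greedy-p (text)
  "Return non-nil if TEXT matches the greedy `forest' rule.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (with-peg-rules
        ;; The rule exactly as written above.
        ((forest (+ "tree" (* [blank])) "tree" (eol)))
      ;; With no failure function, `peg-run' returns nil on failure.
      (peg-run (peg forest)))))

;; The repetition consumes every "tree", so nothing is left for
;; the final "tree" and the rule fails.
(my-forest-greedy-p "tree tree tree")   ; => nil
```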

In these situations, the desired result can be obtained by using predicates and guards – namely the not, if and guard expressions – to constrain behavior. For instance:

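(forest (+ "tree" (* [blank]) (not (eol))) "tree" (eol))

Here the (not (eol)) check is part of each repetition, so the repetition that would consume the last ‘tree’ on the line fails and is undone, leaving that ‘tree’ for the remainder of the rule.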

The if and not operators accept a parsing expression and interpret it as a boolean, without moving point. The contents of a guard operator are evaluated as regular Lisp (not a PEX) and should return a boolean value. A nil value causes the match to fail.
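As an illustration of these operators, the sketch below uses an if lookahead to stop a repetition before the last ‘tree’, and a guard containing ordinary Lisp to reject long lines. It assumes the peg library as shipped with recent Emacs versions; the helper my-forest-p and its MAX-COLUMN argument are hypothetical names introduced for this example.

```elisp
(require 'peg)

(defun my-forest-p (text &optional max-column)
  "Return non-nil if TEXT is two or more \"tree\" words on one line.
If MAX-COLUMN is non-nil, also require the match to end at or
before that column.  Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (with-peg-rules
        ((forest
          ;; Each repetition must be followed by another "tree";
          ;; `if' checks this without moving point.
          (+ "tree" (* [blank]) (if "tree"))
          "tree" (eol)
          ;; `guard' evaluates plain Lisp; nil makes the match fail.
          (guard (or (null max-column)
                     (<= (current-column) max-column)))))
      (peg-run (peg forest)))))

(my-forest-p "tree tree tree")      ; => t
(my-forest-p "tree")                ; => nil, no second "tree"
(my-forest-p "tree tree tree" 10)   ; => nil, rejected by the guard
```

The lookahead keeps the repetition from taking the final ‘tree’, so the rest of the rule still has something to match.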

Another potentially unexpected behavior is that parsing will move point as far as possible, even if the parsing ultimately fails. This rule:

(end-game "game" (eob))

when run in a buffer containing the text “game over” after point, will move point to just after “game” and then halt parsing, returning nil. Successful parsing will always return t, or the contents of the parsing stack if it is not empty.
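If a caller needs point to stay put when parsing fails, it can record point and restore it afterwards. The following is a minimal sketch, assuming the peg library as shipped with recent Emacs versions; my-peg-run-or-stay and my-end-game-demo are hypothetical helpers, not part of the library.

```elisp
(require 'peg)

(defun my-peg-run-or-stay (matcher)
  "Run the PEG MATCHER at point; on failure, restore point.
Hypothetical helper, for illustration only."
  (let ((start (point)))
    (or (peg-run matcher)
        (progn (goto-char start) nil))))

(defun my-end-game-demo (restore)
  "Parse \"game over\" with `end-game'; return (RESULT . POINT).
If RESTORE is non-nil, use `my-peg-run-or-stay'."
  (with-temp-buffer
    (insert "game over")
    (goto-char (point-min))
    (with-peg-rules
        ((end-game "game" (eob)))
      (let ((result (if restore
                        (my-peg-run-or-stay (peg end-game))
                      (peg-run (peg end-game)))))
        (cons result (point))))))

(my-end-game-demo nil)   ; => (nil . 5), point left just after "game"
(my-end-game-demo t)     ; => (nil . 1), point restored
```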