With both %define api.value.type variant and %define api.token.constructor, the parser defines the type symbol_type, and expects yylex to have the following prototype.

parser::symbol_type yylex ()
parser::symbol_type yylex (type1 arg1, ...)
    Return a complete symbol, aggregating its kind (i.e., the traditional value returned by yylex), its semantic value, and possibly its location. Invocations of ‘%lex-param {type1 arg1}’ yield additional arguments.
parser::symbol_type
    A “complete symbol” that binds together its kind, value and (when applicable) location.

symbol_kind_type kind () const
    The kind of this symbol.

const char * name () const
    The name of the kind of this symbol. Returns a std::string when parse.error is verbose.
symbol_type provides the following constructors.

symbol_type (int token, const value_type& value, const location_type& location)
symbol_type (int token, const location_type& location)
symbol_type (int token, const value_type& value)
symbol_type (int token)
    Build a complete terminal symbol for the token kind token (including the api.token.prefix), whose semantic value, if it has one, is value of adequate value_type. Pass the location if and only if location tracking is enabled.

    Consistency between token and value_type is checked via an assert.
For instance, given the following declarations:

    %define api.token.prefix {TOK_}
    %token <std::string> IDENTIFIER;
    %token <int> INTEGER;
    %token ':';

you may use these constructors:

    symbol_type (int token, const std::string&, const location_type&);
    symbol_type (int token, const int&, const location_type&);
    symbol_type (int token, const location_type&);
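As a rough illustration of how such a run-time check can work, here is a simplified stand-in for symbol_type. It is not Bison's generated code: locations are omitted, the token codes 258 and 259 are assumed values, and the accessors text () and number () are hypothetical names introduced only for this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical token codes carrying the TOK_ prefix declared above.
enum token_kind_type { TOK_IDENTIFIER = 258, TOK_INTEGER = 259 };

// Simplified stand-in for the generated symbol_type (no locations).
class symbol_type {
public:
  // Tokens without a semantic value, here only ':'.
  explicit symbol_type(int token)
    : kind_(token), text_(), number_(0) {
    assert(token == ':');
  }
  // Tokens whose semantic value is a std::string.
  symbol_type(int token, const std::string& value)
    : kind_(token), text_(value), number_(0) {
    assert(token == TOK_IDENTIFIER);
  }
  // Tokens whose semantic value is an int.
  symbol_type(int token, const int& value)
    : kind_(token), text_(), number_(value) {
    assert(token == TOK_INTEGER);
  }

  int kind() const { return kind_; }
  const std::string& text() const { return text_; }  // hypothetical accessor
  int number() const { return number_; }             // hypothetical accessor

private:
  int kind_;
  std::string text_;
  int number_;
};
```

In this sketch, overload resolution picks the constructor from the type of the value alone, so a call that pairs a token with a value of the wrong type still compiles; only the assertion inside the selected constructor can catch the mismatch.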
Correct matching between token kinds and value types is checked via assert; for instance, ‘symbol_type (TOK_IDENTIFIER, 42)’ would abort. Named constructors are preferable (see below), as they offer better type safety (for instance ‘make_IDENTIFIER (42)’ would not even compile), but symbol_type constructors may help when token kinds are discovered at run-time, e.g.,

    [a-z]+   {
               if (auto i = lookup_keyword (yytext))
                 return yy::parser::symbol_type (i, loc);
               else
                 return yy::parser::make_IDENTIFIER (yytext, loc);
             }
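The scanner rule above relies on a user-supplied lookup_keyword helper that the manual does not define. One possible shape for it is sketched below, assuming it returns the token code of a keyword and 0 for any other word; the keyword set and the codes TOK_IF, TOK_ELSE and TOK_WHILE are made up for the example, and a real scanner would take them from the generated parser header.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical keyword token codes, for illustration only.
enum keyword_token { TOK_IF = 300, TOK_ELSE = 301, TOK_WHILE = 302 };

// Return the token code of `text` if it is a keyword, or 0 otherwise,
// so that the result can be tested directly in an `if`.
int lookup_keyword(const std::string& text) {
  // The scanner pattern [a-z]+ never matches an empty string.
  assert(!text.empty());
  static const std::unordered_map<std::string, int> keywords = {
    {"if", TOK_IF}, {"else", TOK_ELSE}, {"while", TOK_WHILE},
  };
  const auto it = keywords.find(text);
  return it == keywords.end() ? 0 : it->second;
}
```

Using 0 as the "not a keyword" result works in this sketch because none of the keyword codes is 0.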
Note that it is possible to write type-incorrect code that still compiles (e.g., ‘symbol_type (':', yytext, loc)’). It fails at run time, provided that assertions are enabled (i.e., -DNDEBUG was not passed to the compiler). Bison supports an alternative that guarantees that type-incorrect code does not even compile: it generates named constructors as follows.
symbol_type make_token (const value_type& value, const location_type& location)
symbol_type make_token (const location_type& location)
symbol_type make_token (const value_type& value)
    Build a complete terminal symbol for the token kind token (not including the api.token.prefix), whose semantic value, if it has one, is value of adequate value_type. Pass the location if and only if location tracking is enabled.
For instance, given the following declarations:

    %define api.token.prefix {TOK_}
    %token <std::string> IDENTIFIER;
    %token <int> INTEGER;
    %token COLON;
    %token EOF 0;

Bison generates:

    symbol_type make_IDENTIFIER (const std::string&, const location_type&);
    symbol_type make_INTEGER (const int&, const location_type&);
    symbol_type make_COLON (const location_type&);
    symbol_type make_EOF (const location_type&);

which should be used in a scanner as follows.

    [a-z]+   return yy::parser::make_IDENTIFIER (yytext, loc);
    [0-9]+   return yy::parser::make_INTEGER (text_to_int (yytext), loc);
    ":"      return yy::parser::make_COLON (loc);
    <<EOF>>  return yy::parser::make_EOF (loc);
Tokens that do not have an identifier are not accessible through named constructors: you cannot simply use characters such as ':'. Every such token, including the end-of-file token, must be declared with a name using %token.
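To see why named constructors catch type errors at compile time, the following self-contained sketch mimics them with a hand-written stand-in. Everything in it is an assumption made for illustration: the parser struct, its kind codes, the fields of symbol_type and the scan function are not Bison output, locations are left out, and std::atoi stands in for text_to_int.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <type_traits>
#include <vector>

// Simplified stand-in for a generated parser class (no locations).
struct parser {
  enum kind_type { K_EOF = 0, K_IDENTIFIER, K_INTEGER, K_COLON };

  struct symbol_type {
    kind_type kind;
    std::string text;  // semantic value of IDENTIFIER
    int number;        // semantic value of INTEGER
  };

  // One named constructor per token kind: the signature fixes the value type.
  static symbol_type make_IDENTIFIER(const std::string& v) {
    return symbol_type{K_IDENTIFIER, v, 0};
  }
  static symbol_type make_INTEGER(const int& v) {
    return symbol_type{K_INTEGER, std::string(), v};
  }
  static symbol_type make_COLON() { return symbol_type{K_COLON, std::string(), 0}; }
  static symbol_type make_EOF() { return symbol_type{K_EOF, std::string(), 0}; }
};

// An int value does not convert to std::string, which is why a call such as
// make_IDENTIFIER (42) is rejected by the compiler.
static_assert(!std::is_convertible<int, std::string>::value,
              "int must not convert to std::string");

// Hand-written scanner for the patterns [a-z]+, [0-9]+ and ":".
// Returns every symbol found in `input`, terminated by the EOF symbol.
std::vector<parser::symbol_type> scan(const std::string& input) {
  std::vector<parser::symbol_type> out;
  std::size_t i = 0;
  while (i < input.size()) {
    const unsigned char c = static_cast<unsigned char>(input[i]);
    std::size_t j = i;
    if (std::islower(c)) {
      while (j < input.size() && std::islower(static_cast<unsigned char>(input[j])))
        ++j;
      out.push_back(parser::make_IDENTIFIER(input.substr(i, j - i)));
      i = j;
    } else if (std::isdigit(c)) {
      while (j < input.size() && std::isdigit(static_cast<unsigned char>(input[j])))
        ++j;
      out.push_back(parser::make_INTEGER(std::atoi(input.substr(i, j - i).c_str())));
      i = j;
    } else if (c == ':') {
      out.push_back(parser::make_COLON());
      ++i;
    } else {
      ++i;  // this sketch silently skips anything else, e.g. blanks
    }
  }
  out.push_back(parser::make_EOF());
  return out;
}
```

Because each make_ function accepts exactly one value type, pairing a token kind with the wrong value is a compile-time error here, with no run-time assertion involved.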