yylex
The value that yylex returns must be the positive numeric code for the kind of token it has just found; a zero or negative value signifies end-of-input.
When a token kind is referred to in the grammar rules by a name, that name becomes, in the parser implementation file, an enumerator of the enum yytoken_kind_t whose value is the proper numeric code for that token kind. So yylex should return that name to indicate the token kind. See Symbols, Terminal and Nonterminal.
When a token is referred to in the grammar rules by a character literal, the numeric code for that character is also the code for the token kind. So yylex can simply return that character code, possibly converted to unsigned char to avoid sign-extension. The null character must not be used this way, because its code is zero and that signifies end-of-input.
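The conversion to unsigned char matters on platforms where plain char is signed: a byte with the high bit set would otherwise be returned as a negative int, which the parser would take for end-of-input. A minimal sketch, in which the helper name char_token_code is hypothetical and not part of any Bison interface:

```c
/* Hypothetical helper (not a Bison API): map a character read from the
   input to the code of the matching character-literal token.  */
static int
char_token_code (char c)
{
  /* Converting through unsigned char keeps the result in the range
     0..UCHAR_MAX, even where plain char is signed and c holds a byte
     with the high bit set.  */
  return (unsigned char) c;
}
```

With this helper, an ASCII character such as '+' maps to itself, and a byte such as 0xE9 maps to 233 rather than to a negative value.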
Here is an example showing these things:
    int
    yylex (void)
    {
      …
      if (c == EOF)    /* Detect end-of-input. */
        return YYEOF;
      …
      else if (c == '+' || c == '-')
        return c;      /* Assume token kind for '+' is '+'. */
      …
      else
        return INT;    /* Return the kind of the token. */
      …
    }
This interface has been designed so that the output from the lex utility can be used without change as the definition of yylex.
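To make the calling convention concrete, here is a minimal hand-written sketch of yylex that scans a string instead of a file. Several things in it are assumptions made for illustration: the enum is a stand-in for the one Bison would generate (the value 258 for INT is assumed, not taken from a real grammar), and scan_pos and int_value are hypothetical variables, the latter standing in for the semantic value a real scanner would store in yylval.

```c
#include <ctype.h>

/* Stand-in for the enum Bison would generate in the parser
   implementation file; the value of INT is an assumption.  */
enum yytoken_kind_t { YYEOF = 0, INT = 258 };

/* Hypothetical input source: a null-terminated string.  */
static const char *scan_pos = "";

/* Hypothetical stand-in for the semantic value of the last INT.  */
static int int_value;

int
yylex (void)
{
  /* Skip blanks between tokens.  */
  while (*scan_pos == ' ' || *scan_pos == '\t' || *scan_pos == '\n')
    scan_pos++;

  /* The terminating null marks end-of-input, so it is never
     returned as a character token.  */
  if (*scan_pos == '\0')
    return YYEOF;

  /* A run of digits is a named token kind: return its enumerator.  */
  if (isdigit ((unsigned char) *scan_pos))
    {
      int_value = 0;
      while (isdigit ((unsigned char) *scan_pos))
        int_value = int_value * 10 + (*scan_pos++ - '0');
      return INT;
    }

  /* Any other character is its own token kind; convert through
     unsigned char to avoid sign-extension.  */
  return (unsigned char) *scan_pos++;
}
```

Pointing scan_pos at "12 + 3" and calling yylex repeatedly yields INT, then '+', then INT, and YYEOF on every call after that.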