3.1.1.2 Gperf Declarations
The declaration section can contain gperf declarations. They influence the
way gperf works, like command line options do.
In fact, every such declaration is equivalent to a command line option.
There are three forms of declarations:
- Declarations without argument, like ‘%compare-lengths’.
- Declarations with an argument, like ‘%switch=count’.
- Declarations of names of entities in the output file, like
‘%define lookup-function-name name’.
When a declaration is given both in the input file and as a command line
option, the command-line option's value prevails.
The following gperf declarations are available.
- ‘%delimiters=delimiter-list’
- Allows you to provide a string containing delimiters used to
separate keywords from their attributes. The default is ",". This
option is essential if you want to use keywords that have embedded
commas or newlines.
- ‘%struct-type’
- Allows you to include a struct type declaration for generated code; see
above for an example.
- ‘%ignore-case’
- Consider upper and lower case ASCII characters as equivalent. The string
comparison will use a case-insensitive character comparison. Note that
locale-dependent case mappings are ignored.
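To illustrate what an ASCII-only comparison that ignores the locale looks like, here is a minimal sketch in C. The helper names ascii_upper and ascii_casecmp are hypothetical; they are not the names gperf emits, and the generated code may differ in detail.

```c
/* Hypothetical helper: map 'a'..'z' to 'A'..'Z' and leave every other
   byte unchanged. No locale is consulted. */
static unsigned char ascii_upper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 'a' + 'A') : c;
}

/* Hypothetical helper: compare two NUL-terminated strings, treating
   upper and lower case ASCII letters as equal. Returns 0 on a match
   and a nonzero value otherwise. */
static int ascii_casecmp(const char *s1, const char *s2)
{
    for (;;) {
        unsigned char c1 = ascii_upper((unsigned char)*s1++);
        unsigned char c2 = ascii_upper((unsigned char)*s2++);
        if (c1 != c2)
            return (int)c1 - (int)c2;
        if (c1 == '\0')
            return 0;
    }
}
```

Bytes outside the ASCII letters, such as accented characters in an 8-bit encoding, are compared as-is in this sketch, which matches the note that locale-dependent mappings are ignored.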
- ‘%language=language-name’
- Instructs gperf to generate code in the language specified by the
option's argument. Languages handled are currently:
- ‘KR-C’
- Old-style K&R C. This language is understood by old-style C compilers and
ANSI C compilers, but ANSI C compilers may flag warnings (or even errors)
because the generated code lacks ‘const’.
- ‘C’
- Common C. This language is understood by ANSI C compilers, and also by
old-style C compilers, provided that you #define const to empty for
compilers which don't know about this keyword.
- ‘ANSI-C’
- ANSI C. This language is understood by ANSI C (C89, ISO C90) compilers,
ISO C99 compilers, and C++ compilers.
- ‘C++’
- C++. This language is understood by C++ compilers.
The default is ANSI-C.
- ‘%define slot-name name’
- This declaration is only useful when option ‘-t’ (or, equivalently, the
‘%struct-type’ declaration) has been given.
By default, the program assumes the structure component identifier for
the keyword is ‘name’. This option allows an arbitrary choice of
identifier for this component, although it still must occur as the first
field in your supplied struct.
- ‘%define initializer-suffix initializers’
- This declaration is only useful when option ‘-t’ (or, equivalently, the
‘%struct-type’ declaration) has been given.
It lets you specify initializers for the structure members following
slot-name in empty hash table entries. The list of initializers
should start with a comma. By default, the emitted code will
zero-initialize structure members following slot-name.
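As an illustration of the effect on empty entries, the following sketch assumes a hypothetical user struct named month and the declaration ‘%define initializer-suffix ,-1’. Both tables are hand-written stand-ins, not actual gperf output.

```c
/* Hypothetical user-supplied struct; 'name' is the slot-name field. */
struct month {
    const char *name;
    int number;
};

/* Stand-in for a generated table, assuming
   "%define initializer-suffix ,-1". Empty slots carry -1 in 'number'. */
static const struct month wordlist_with_suffix[] = {
    {"", -1},        /* empty slot: "" followed by the suffix ",-1" */
    {"may", 5},
    {"", -1},
    {"june", 6}
};

/* Same table without the declaration: members after 'name' are
   zero-initialized in empty slots. */
static const struct month wordlist_default[] = {
    {""},
    {"may", 5},
    {""},
    {"june", 6}
};
```

A non-zero marker such as -1 can be convenient when 0 is itself a meaningful value for the member.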
- ‘%define hash-function-name name’
- Allows you to specify the name for the generated hash function. Default
name is ‘hash’. This option permits the use of two hash tables in
the same file.
- ‘%define lookup-function-name name’
- Allows you to specify the name for the generated lookup function.
Default name is ‘in_word_set’. This option permits multiple
generated hash functions to be used in the same application.
- ‘%define class-name name’
- This option is only useful when option ‘-L C++’ (or, equivalently,
the ‘%language=C++’ declaration) has been given. It
allows you to specify the name of the generated C++ class. Default name is
Perfect_Hash.
- ‘%7bit’
- This option specifies that all strings that will be passed as arguments
to the generated hash function and the generated lookup function will
solely consist of 7-bit ASCII characters (bytes in the range 0..127).
(Note that the ANSI C functions isalnum and isgraph do not guarantee that a
byte is in this range. Only an explicit test like ‘c >= 'A' && c <= 'Z'’
guarantees this.)
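A caller that cannot be sure its input is 7-bit clean might validate it with an explicit range test before calling the lookup function. The helper below is a minimal sketch of one such pre-check; the name is_7bit is hypothetical and is not part of the generated code.

```c
#include <stddef.h>

/* Hypothetical helper: return 1 if every one of the 'len' bytes of
   'str' lies in the range 0..127, and 0 otherwise. */
static int is_7bit(const char *str, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        /* Cast first: plain char may be signed on some platforms. */
        if ((unsigned char)str[i] > 127)
            return 0;
    }
    return 1;
}
```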
- ‘%compare-lengths’
- Compare keyword lengths before trying a string comparison. This option
is mandatory for binary comparisons (see Binary Strings). It also might
cut down on the number of string comparisons made during the lookup, since
keywords with different lengths are never compared via strcmp.
However, using ‘%compare-lengths’ might greatly increase the size of the
generated C code if the lookup table range is large (which implies that
the switch option ‘-S’ or ‘%switch’ is not enabled), since the length
table contains as many elements as there are entries in the lookup table.
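The idea of checking the length first can be sketched as follows. The tables and the function slot_matches are simplified, hand-written stand-ins (assumptions for illustration), not the code gperf generates. The second keyword contains an embedded NUL byte to show why a length check combined with a byte-wise comparison is needed for binary strings.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for a generated keyword table and its parallel length
   table; the second keyword contains an embedded NUL. */
static const char *const words[] = { "ab", "a\0b" };
static const unsigned char lengths[] = { 2, 3 };

/* Hypothetical check of one table slot: reject on a length mismatch
   first, and only then compare bytes. memcmp does not stop at NUL. */
static int slot_matches(size_t slot, const char *str, size_t len)
{
    return len == lengths[slot] && memcmp(str, words[slot], len) == 0;
}
```

Note that the length table has one element per table slot, which is the source of the size increase described above.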
- ‘%compare-strncmp’
- Generates C code that uses the strncmp function to perform string
comparisons. The default action is to use strcmp.
- ‘%readonly-tables’
- Makes the contents of all generated lookup tables constant, i.e.,
“readonly”. Many compilers can generate more efficient code for this
by putting the tables in readonly memory.
- ‘%enum’
- Define constant values using an enum local to the lookup function rather
than with #defines. This also means that different lookup functions can
reside in the same file. Thanks to James Clark <jjc@ai.mit.edu>.
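The scoping effect can be seen in a small sketch: two functions in one file each define an enum constant with the same name, which would clash if both were #defines at file scope. The function names and values are hypothetical.

```c
/* Each enum is local to its function, so the two TOTAL_KEYWORDS
   constants do not collide even though they share a name. */
static int first_total(void)
{
    enum { TOTAL_KEYWORDS = 3 };
    return TOTAL_KEYWORDS;
}

static int second_total(void)
{
    enum { TOTAL_KEYWORDS = 12 };
    return TOTAL_KEYWORDS;
}
```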
- ‘%includes’
- Include the necessary system include file, <string.h>, at the beginning
of the code. By default, this is not done; the user must include this
header file explicitly to allow compilation of the code.
- ‘%global-table’
- Generate the static table of keywords as a static global variable,
rather than hiding it inside of the lookup function (which is the
default behavior).
- ‘%pic’
- Optimize the generated table for inclusion in shared libraries. This
reduces the startup time of programs using a shared library containing
the generated code. If the ‘%struct-type’ declaration (or,
equivalently, the option ‘-t’) is also given, the first field of the
user-defined struct must be of type ‘int’, not ‘char *’, because
it will contain offsets into the string pool instead of actual strings.
To convert such an offset to a string, you can use the expression
‘stringpool + o’, where o is the offset. The string pool
name can be changed through the ‘%define string-pool-name’ declaration.
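The offset scheme can be sketched as follows. This is a simplified, hand-written illustration: the pool layout, the struct month, and the helper month_name are assumptions, and the string pool that gperf actually generates is laid out differently in detail. Only the expression ‘stringpool + o’ is taken from the text above.

```c
#include <string.h>

/* Stand-in for a generated string pool: the keywords are stored back
   to back, each terminated by NUL. */
static const char stringpool[] = "may\0june\0july";

/* Hypothetical user struct for use with %pic: the first field is an
   int offset into the pool, not a char pointer. */
struct month {
    int name;
    int number;
};

static const struct month wordlist[] = {
    {0, 5},   /* "may"  starts at offset 0 */
    {4, 6},   /* "june" starts at offset 4 */
    {9, 7}    /* "july" starts at offset 9 */
};

/* Convert an entry's offset back to a string: stringpool + o. */
static const char *month_name(const struct month *m)
{
    return stringpool + m->name;
}
```

Because the table holds integers rather than pointers, it presumably needs no address fix-ups when the shared library is loaded, which is where the startup saving would come from.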
- ‘%define string-pool-name name’
- Allows you to specify the name of the generated string pool created by
the declaration ‘%pic’ (or, equivalently, the option ‘-P’).
The default name is ‘stringpool’. This declaration permits the use of
two hash tables in the same file, with ‘%pic’ and even when the
‘%global-table’ declaration (or, equivalently, the option ‘-G’)
is given.
- ‘%null-strings’
- Use NULL strings instead of empty strings for empty keyword table entries.
This reduces the startup time of programs using a shared library containing
the generated code (but not as much as the declaration ‘%pic’), at the
expense of one more test-and-branch instruction at run time.
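The extra test mentioned above can be pictured with this minimal sketch. The table and the function slot_matches are hand-written assumptions, not gperf output.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a generated table in which empty slots are NULL
   rather than "". */
static const char *const wordlist[] = { NULL, "may", NULL, "june" };

/* Hypothetical slot check: the NULL test is the one additional
   test-and-branch; strcmp must not be called on a NULL pointer. */
static int slot_matches(size_t slot, const char *str)
{
    const char *s = wordlist[slot];

    return s != NULL && strcmp(str, s) == 0;
}
```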
- ‘%define constants-prefix prefix’
- Allows you to specify a prefix for the constants TOTAL_KEYWORDS,
MIN_WORD_LENGTH, MAX_WORD_LENGTH, and so on. This option
permits the use of two hash tables in the same file, even when the option
‘-E’ (or, equivalently, the ‘%enum’ declaration) is not given or
the option ‘-G’ (or, equivalently, the ‘%global-table’ declaration)
is given.
- ‘%define word-array-name name’
- Allows you to specify the name for the generated array containing the
hash table. Default name is ‘wordlist’. This option permits the
use of two hash tables in the same file, even when the option ‘-G’
(or, equivalently, the ‘%global-table’ declaration) is given.
- ‘%define length-table-name name’
- Allows you to specify the name for the generated array containing the
length table. Default name is ‘lengthtable’. This option permits the
use of two length tables in the same file, even when the option ‘-G’
(or, equivalently, the ‘%global-table’ declaration) is given.
- ‘%switch=count’
- Causes the generated C code to use a switch statement scheme, rather than
an array lookup table. This can lead to a reduction in both time and space
requirements for some input files. The argument to this option determines
how many switch statements are generated. A value of 1 generates 1 switch
containing all the elements, a value of 2 generates 2 switch statements
with 1/2 the elements in each, etc. This is useful since many C compilers
cannot correctly generate code for large switch statements. This option
was inspired in part by Keith Bostic's original C program.
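To give a feel for the switch scheme, here is a toy sketch with a single switch statement. The names toy_hash and toy_lookup are hypothetical, and the hash is deliberately trivial (the string length); a real generated hash function is more involved.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a generated hash function: the hash is just the
   length, which happens to be unique for the three keywords below. */
static unsigned int toy_hash(const char *str, size_t len)
{
    (void)str;
    return (unsigned int)len;
}

/* Hypothetical switch-based lookup: each case names the single keyword
   that hashes to that value, so no keyword array is indexed. */
static const char *toy_lookup(const char *str, size_t len)
{
    const char *candidate;

    switch (toy_hash(str, len)) {
    case 3: candidate = "may";    break;
    case 4: candidate = "june";   break;
    case 6: candidate = "august"; break;
    default: return NULL;
    }
    return strcmp(str, candidate) == 0 ? candidate : NULL;
}
```

Unused hash values need no storage in this scheme, since they all fall through to the default case.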
- ‘%omit-struct-type’
- Prevents the transfer of the type declaration to the output file. Use
this option if the type is already defined elsewhere.