There are some additional coding conventions for code in GCC, beyond those in the GNU Coding Standards. Some existing code may not follow these conventions, but they must be used for new code. If changing existing code to follow these conventions, it is best to send changes to follow the conventions separately from any other changes to the code.
Documentation, both of user interfaces and of internals, must be maintained and kept up to date. In particular:
- All command-line options (including all --param arguments) must be documented in the GCC manual.
- The documentation of the tree and RTL data structures and interfaces must be kept complete and up to date.
ChangeLog entries are part of git commit messages and are automatically put
into a corresponding ChangeLog file. A ChangeLog template can be easily generated
with the ./contrib/mklog.py script. GCC offers a checking script that
verifies proper ChangeLog formatting for a particular git commit (see the
git gcc-verify git alias). The checking script covers the most commonly used
ChangeLog formats, and the following paragraphs explain what it supports.
See also what the GNU Coding Standards have to say about what goes in ChangeLogs; in particular, descriptions of the purpose of code and changes should go in comments rather than the ChangeLog, though a single line overall description of the changes may be useful above the ChangeLog entry for a large batch of changes.
Components:
- git_description - a leading text with the git commit description
- committer_timestamp - line with a timestamp and an author name and email (2 spaces before and after the name): 2020-04-23␣␣Martin Liska␣␣<mliska@suse.cz>
- additional_author - line with an additional commit author name and email (starting with a tab and 4 spaces): \t␣␣␣␣Martin Liska␣␣<mliska@suse.cz>
- changelog_location - a location of a ChangeLog file; supported formats: a/b/c/ChangeLog, a/b/c/ChangeLog:, a/b/c/ (where the ChangeLog file lives in the folder), \ta/b/c/ and a/b/c
- pr_entry - bug report reference: \tPR component/12345
- changelog_file - a modified file mentioned in a ChangeLog; supported formats: \t* a/b/c/file.c:, \t* a/b/c/file.c (function):, \t* a/b/c/file1.c, a/b/c/file2.c:
- changelog_file_comment - line that follows a changelog_file with a description of changes in the file; must start with \t
- co_authored_by - GitHub format for a Co-Authored-By

Format rules:
- git_description - optional; ends right before one of the other components is found
- committer_timestamp - optional; when found before a changelog_file, then it is added to each changelog entry
- additional_author - optional
- changelog_location - optional; parser attempts to identify ChangeLog file based on modified files; $changelog_location belonging to a different ChangeLog must be separated with an empty line
- pr_entry - optional; can contain any number of PR entries
- changelog_file - each changelog_location must contain at least one file
- changelog_file_comment - optional
- co_authored_by - optional, can contain more than one

Documented behaviors:
- a missing changelog_location file location can be deduced based on the group of changelog_files
- co_authored_by is added to each ChangeLog entry
- ChangeLog files, DATESTAMP, BASE-VER and DEV-PHASE can be modified only separately from other files

Example patch:

This patch adds a second movk pattern that models the instruction
as a "normal" and/ior operation rather than an insertion.  It fixes
the third insv_1.c failure in PR87763, which was a regression from
GCC 8.

2020-02-06  John Foo  <john@example.com>

gcc/
	PR target/87763
	* config/aarch64/aarch64-protos.h (aarch64_movk_shift): Declare.
	* config/aarch64/aarch64.c (aarch64_movk_shift): New function.
	* config/aarch64/aarch64.md (aarch64_movk<mode>): New pattern.

gcc/testsuite/
	PR target/87763
	* gcc.target/aarch64/movk_2.c: New test.

Co-Authored-By: Jack Bar <jack@example.com>

Tokenized version:

$git_description

$committer_timestamp

$changelog_location
$pr_entry
$changelog_file
$changelog_file
$changelog_file

$changelog_location
$pr_entry
$changelog_file

$co_authored_by
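
To make the committer_timestamp shape concrete, here is a minimal C++ sketch of a check for it. It is not the GCC checking script and does not claim to match its behavior; the function name and the regular expression are assumptions made for illustration, and the email test is deliberately loose.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper, not part of GCC: returns true if LINE looks like
// "YYYY-MM-DD", two spaces, a name, two spaces, and "<email>".
bool
looks_like_committer_timestamp (const std::string &line)
{
  static const std::regex pattern
    (R"(\d{4}-\d{2}-\d{2}  \S.*\S  <[^<>\s]+@[^<>\s]+>)");
  // regex_match requires the whole line to match the pattern.
  return std::regex_match (line, pattern);
}
```

A line with single spaces around the name is rejected, which is the property the "2 spaces before and after name" rule describes.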
There are strict requirements for portability of code in GCC to older systems whose compilers do not implement all of the latest ISO C and C++ standards.
The directories gcc, libcpp and fixincludes may use C++11.
They may also use the long long type
if the host C++ compiler supports it.
These directories should use reasonably portable parts of C++11,
so that it is possible to build GCC with C++ compilers other than GCC itself.
If testing reveals that
reasonably recent versions of non-GCC C++ compilers cannot compile GCC,
then GCC code should be adjusted accordingly.
(Avoiding unusual language constructs helps immensely.)
Furthermore,
these directories should also be compatible with later C++ standards.
The directories libiberty and libdecnumber must use C
and require at least an ANSI C89 or ISO C90 host compiler.
C code should avoid pre-standard style function definitions, unnecessary
function prototypes and use of the now deprecated PARAMS macro.
See README.Portability for details of some of the portability problems
that may arise. Some of these problems are warned about by
gcc -Wtraditional, which is included in the default warning options
in a bootstrap.
The programs included in GCC are linked with the
libiberty library, which will replace some standard
library functions if not present on the system used, so those
functions may be freely used in GCC. In particular, the ISO C string
functions memcmp, memcpy, memmove, memset, strchr and strrchr are
preferred to the old functions bcmp, bcopy, bzero, index and rindex;
see messages 1 and 2. The older functions must no longer be used in
GCC; apart from index, these identifiers are poisoned to prevent their
use.
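
As an illustration of the preferred ISO C functions, the following sketch shows the replacements for three of the old calls. The reg_info type and the function names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical record type, used only for illustration.
struct reg_info
{
  int regno;
  int cost;
};

// memset instead of bzero.
void
clear_reg_info (reg_info *info)
{
  std::memset (info, 0, sizeof *info);
}

// memcpy instead of bcopy; note that the destination comes first.
void
copy_reg_info (reg_info *dest, const reg_info *src)
{
  std::memcpy (dest, src, sizeof *dest);
}

// strrchr instead of rindex.
const char *
file_suffix (const char *name)
{
  return std::strrchr (name, '.');
}
```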
Machine-independent files may contain conditionals on features of a
particular system, but should never contain conditionals such as
#ifdef __hpux__ on the name or version of a particular
system. Exceptions may be made to this on a release branch late in
the release cycle, to reduce the risk involved in fixing a problem
that only shows up on one particular system.
Function prototypes for extern functions should only occur in header files. Functions should be ordered within source files to minimize the number of function prototypes, by defining them before their first use. Function prototypes should only be used when necessary, to break mutually recursive cycles.
touch should never be used in GCC Makefiles. Instead of touch foo
always use $(STAMP) foo.
Every language or library feature, whether standard or a GNU extension, and every warning GCC can give, should have testcases thoroughly covering both its specification and its implementation. Every bug fixed should have a testcase to detect if the bug recurs.
The testsuite READMEs discuss the requirement to use abort ()
for runtime failures and exit (0) for success.
For compile-time tests, a trick taken from autoconf may be used to evaluate
expressions: a declaration extern char x[(EXPR) ? 1 : -1];
will compile successfully if and only if EXPR is nonzero.
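
The two techniques above can be sketched as follows. The names add, check_add and int_has_at_least_16_bits are hypothetical; a real testcase would call the run-time check from main and then finish with exit (0).

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>

// Compile-time check: if the expression were zero, the array size would be
// -1 and the declaration would be rejected by the compiler.
extern char int_has_at_least_16_bits[(sizeof (int) * CHAR_BIT >= 16) ? 1 : -1];

// Hypothetical function under test.
int
add (int a, int b)
{
  return a + b;
}

// Run-time check: call abort () on failure, return normally on success.
void
check_add ()
{
  if (add (2, 3) != 5)
    std::abort ();
}
```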
Where appropriate, testsuite entries should include comments giving their origin: the people who added them or submitted the bug report they relate to, possibly with a reference to a PR in our bug tracking system. There are some copyright guidelines on what can be included in the testsuite.
If a testcase itself is incorrect, but there's a possibility that an improved testcase might fail on some platform where the incorrect testcase passed, the old testcase should be removed and a new testcase (with a different name) should be added. This helps automated regression-checkers distinguish a true regression from an improvement to the testsuite.
- Use of the input_location global, and of the diagnostic functions that implicitly use input_location, is deprecated; the preferred technique is to pass around locations ultimately derived from the location of some explicitly chosen source code token.
- Diagnostics should generally use the GCC-specific formats such as %qs or %< and %> for quoting and %m for errno numbers.
- Identifiers should generally be formatted with %E or %qE; use of identifier_to_locale is needed if the identifier text is used directly.
- Formats such as %wd should be used with types such as HOST_WIDE_INT (HOST_WIDE_INT_PRINT_DEC is a format for the host printf functions, not for the GCC diagnostic functions).
- error is for defects in the user's code.
- sorry is for correct user input programs but unimplemented functionalities.
- warning is for advisory diagnostics; it may be used for diagnostics that have severity less than an error.
- inform is for adding additional explanatory information to a diagnostic.
- internal_error is used for conditions that should not be triggered by any user input whether valid or invalid and including invalid asms and LTO binary data (sometimes, as an exception, there is a call to error before further information is printed and an ICE is triggered). Assertion failures should not be triggered by invalid input.
- inform is for informative notes accompanying errors and warnings.

The following conventions of spelling and terminology apply throughout GCC, including the manuals, web pages, diagnostics, comments, and (except where they require spaces or hyphens to be used) function and variable names, although consistency in user-visible documentation and diagnostics is more important than that in comments and code. The following table lists some simple cases:
Use... | ...instead of | Rationale |
---|---|---|
American spelling (in particular -ize, -or) | British spelling (in particular -ise, -our) | |
"32-bit" (adjective) | "32 bit" | |
"alphanumeric" | "alpha numeric" | |
"back end" (noun) | "back-end" or "backend" | |
"back-end" (adjective) | "back end" or "backend" | |
"bit-field" | "bit field" or "bitfield" | Spelling used in C and C++ standards |
"built-in" as an adjective ("built-in function") or "built in" | "builtin" | "builtin" isn't a word |
"bug fix" (noun) or "bug-fix" (adjective) | "bugfix" or "bug-fix" | "bugfix" isn't a word |
"ColdFire" | "coldfire" or "Coldfire" | |
"command-line option" | "command line option" | |
"compilation time" (noun); how long it takes to compile the program | "compile time" | |
"compile time" (noun), "compile-time" (adjective); the time at which the program is compiled | ||
"dependent" (adjective), "dependence", "dependency" | "dependant", "dependance", "dependancy" | |
"enumerated" | "enumeral" | Terminology used in C and C++ standards |
"epilogue" | "epilog" | Established convention |
"execution time" (noun); how long it takes the program to run | "run time" or "runtime" | |
file name | filename | |
"floating-point" (adjective) | "floating point" | |
"free software" or just "free" | "Open Source" or "OpenSource" | |
"front end" (noun) | "front-end" or "frontend" | |
"front-end" (adjective) | "front end" or "frontend" | |
"GNU/Linux" (except in reference to the kernel) | "Linux" or "linux" or "Linux/GNU" | |
"link time" (noun), "link-time" (adjective); the time at which the program is linked | ||
"lowercase" | "lower case" or "lower-case" | |
"H8S" | "H8/S" | |
"Microsoft Windows" | "Windows" | |
"MIPS" | "Mips" or "mips" | |
"nonzero" | "non-zero" or "non zero" | |
"null character" | "zero character" | |
"Objective-C" | "Objective C" | |
"prologue" | "prolog" | Established convention |
"PowerPC" | "powerpc", "powerPC" or "PowerPc" | |
"Red Hat" | "RedHat" or "Redhat" | |
"return type" (noun), "return value" (noun) | "return-type", "return-value" | |
"run time" (noun), "run-time" (adjective); the time at which the program is run | "runtime" | |
"runtime" (both noun and adjective); libraries and system support present at run time | "run time", "run-time" | |
"SPARC" | "Sparc" or "sparc" | |
"testcase", "testsuite" | "test-case" or "test case", "test-suite" or "test suite" | |
"uppercase" | "upper case" or "upper-case" | |
"VAX", "VAXen", "MicroVAX" | "vax" or "Vax", "vaxen" or "vaxes", "microvax" or "microVAX" |
"GCC" should be used for the GNU Compiler Collection, both
generally and as the GNU C Compiler in the context of compiling C;
"G++" for the C++ compiler; "gcc" and "g++" (lowercase), marked up
with @command when in Texinfo, for the commands for
compilation when the emphasis is on those; "GNU C" and "GNU C++" for
language dialects; and try to avoid the older term "GNU CC".
Use a comma after "e.g." or "i.e." if and only if it is appropriate
in the context and the slight pause a comma means helps the reader; do
not add them automatically in all cases just because some style guides
say so. (In Texinfo manuals, @: should be used after
"e.g." and "i.e." when a comma isn't used.)
In Texinfo manuals, Texinfo 4.0 features may be used, and should be
used where appropriate. URLs should be marked up with @uref; email
addresses with @email; command-line options with @option; names of
commands with @command; environment variables with @env.
NULL should be written as @code{NULL}. Tables of
contents should come just after the title page; printed manuals will
be formatted (for example, by make dvi) using texi2dvi,
which reruns TeX until cross-references
stabilize, so there is no need for a table of contents to go at the
end for it to have correct page numbers. The @refill
feature is obsolete and should not be used. All manuals should use
@dircategory and @direntry to provide Info
directory information for install-info.
It is useful to read the Texinfo manual. Some general Texinfo style issues discussed in that manual should be noted:
- TeX quotes (` or `` and ' or '') should be used; neutral ASCII double quotes ("...") should not be. Similarly, TeX dashes (-- (two hyphens) for an en dash and --- (three hyphens) for an em dash) should be used; normally these dashes should not have whitespace on either side. Minus signs should be written as @minus{}.
- For an ellipsis, @dots{} should be used; for a literal sequence of three dots in a programming language, the dots should be written as such (...) rather than as @dots{}.
- English text in programming language comments in examples should be enclosed in @r{} so that it is printed in a non-fixed-width font.
- Full stops that are preceded by a lowercase letter but do not end a sentence should be followed by @: if they are not followed by other punctuation such as a comma; full stops, question marks and exclamation marks that end a sentence but are preceded by an uppercase letter should be written as "@.", "@?" and "@!", respectively. (This is not required if the capital letter is within @code or @samp.)
- Use @code for an expression in a program, for the name of a variable or function used in a program, or for a keyword in a programming language. However, avoid @code in uses of language keywords as adjectives. For example, appropriate uses of @code are in phrases such as "@code{const}-qualified type", or "@code{asm} statement", or "function returns @code{true}". Examples where @code should be avoided are phrases such as "const variable", "volatile access", or "condition is false."

Some files and packages in the GCC source tree are imported from elsewhere, and we want to minimize divergence from their upstream sources. The following files should be updated only according to the rules set below:
- ... automake --add-missing --copy --force-missing.
- ... t-softfp) are in git glibc, and changes should go there before going into GCC.
- ... doxygen -u). The files in doc/html are generated from the Docbook sources in doc/xml and should not be changed manually. The files in doc/xml/gnu are based on the GNU licenses and should not be changed without prior permission, if at all.

The following conventions apply to both C and C++.
The compiler must build cleanly with -Wall -Wextra.
Code should use gcc_assert (EXPR) to check invariants.
Use gcc_unreachable () to mark places that should never be
reachable (such as an unreachable default case of a
switch). Do not use gcc_assert (0) for such purposes, as
gcc_unreachable gives the compiler more information. The
assertions are enabled unless explicitly configured off with
--enable-checking=none. Do not use abort.
User input should never be validated by either gcc_assert
or gcc_unreachable. If the checks are expensive or the
compiler can reasonably carry on after the error, they may be
conditioned on --enable-checking by using gcc_checking_assert.
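
The following sketch shows the intended division of labor between gcc_assert and gcc_unreachable. So that it compiles outside the GCC tree, it defines simplified stand-ins for the two macros; these stand-ins, the op_kind enumeration and both functions are assumptions made for illustration. Inside GCC the real macros are already available and should be used directly.

```cpp
#include <cassert>
#include <cstdlib>

// Stand-ins so that the sketch compiles outside the GCC tree; they are not
// the real GCC definitions.
#define gcc_assert(EXPR) ((void) ((EXPR) ? 0 : (std::abort (), 0)))
#define gcc_unreachable() (std::abort ())

// Hypothetical operation codes.
enum op_kind { OP_ADD, OP_SUB };

int
fold_op (op_kind kind, int lhs, int rhs)
{
  switch (kind)
    {
    case OP_ADD:
      return lhs + rhs;
    case OP_SUB:
      return lhs - rhs;
    default:
      // Not gcc_assert (0): this marks the path itself as impossible.
      gcc_unreachable ();
    }
}

int
element_at (const int *vec, int len, int idx)
{
  // An internal invariant, not validation of user input.
  gcc_assert (idx >= 0 && idx < len);
  return vec[idx];
}
```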
Code testing properties of characters from user source code should
use macros such as ISALPHA from safe-ctype.h
instead of the standard functions such as isalpha from
<ctype.h> to avoid any locale-dependency of the
language accepted.
Testing for ERROR_MARKs should be done by comparing
against error_mark_node rather than by comparing the
TREE_CODE against ERROR_MARK; see message.
Internal numeric parameters that may affect generated code should
be controlled by --param rather than being hardcoded.
Inline functions only when you have reason to believe that the expansion of the function is smaller than a call to the function or that inlining is significant to the run-time of the compiler.
Lines shall be at most 80 columns.
Macro names should be in ALL_CAPS
when it's important to be aware that it's a macro
(e.g. accessors and simple predicates),
but in lowercase (e.g., size_int)
where the macro is a wrapper for efficiency
that should be considered as a function;
see messages 1 and 2.
Other names should be lowercase and separated by low_lines.
Code in GCC should use the following formatting conventions:
For | Use... | ...instead of |
---|---|---|
logical not | !x | ! x |
bitwise complement | ~x | ~ x |
unary minus | -x | - x |
cast | (type) x | (type)x |
pointer cast | (type *) x | (type*)x |
pointer return type | type *f (void) | type* f (void) |
pointer dereference | *x | * x |
The following conventions apply only to C++.
These conventions will change over time, but changing them requires a convincing rationale.
C++ is a complex language, and we strive to use it in a manner that is not surprising. So, the primary rule is to be reasonable. Use a language feature in known good ways. If you need to use a feature in an unusual way, or a way that violates the "should" rules below, seek guidance, review and feedback from the wider community.
All use of C++ features is subject to the decisions of the maintainers of the relevant components. (This restates something that is always true for gcc, which is that component maintainers make the final decisions about those components.)
Variables should be defined at the point of first use, rather than at the top of the function. The existing code obviously does not follow that rule, so variables may be defined at the top of the function, as in C90.
Variables may be simultaneously defined and tested in control expressions.
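
A minimal sketch of a variable that is defined at its point of first use and tested in the same control expression; key_length is a hypothetical helper used only for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns the length of the key in a "key=value"
// string, or -1 if there is no '=' separator.
int
key_length (const char *option)
{
  // EQ is defined and tested in the condition, and is in scope only here.
  if (const char *eq = std::strchr (option, '='))
    return static_cast<int> (eq - option);
  return -1;
}
```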
Some coding conventions, including GCC's own in the past, recommend
using the struct keyword (also known as the class-key) for
plain old data (POD) types. However, since the POD concept has been
replaced in C++ by a set of much more nuanced distinctions, the current
guidance (though not a requirement) is to use the struct
class-key when defining structures that could be used without
change in C, and use class for all other classes. It is
recommended to use the same class-key consistently in all
declarations and, if necessary, in uses of the class.
The -Wmismatched-tags warning option helps detect mismatches.
The -Wredundant-tags GCC option further helps identify places
where the class-key can safely be omitted.
See the guidance in Struct Definitions for the suggested choice of a class-key.
A class defined with the class-key class will often
(but not always) have a declaration of a
special member function.
If any one of these is declared,
then all should be either declared
or have an explicit comment saying that the default is intended.
Single inheritance is permitted. Use public inheritance to describe interface inheritance, i.e. 'is-a' relationships. Use private and protected inheritance to describe implementation inheritance. Implementation inheritance can be expedient, but think twice before using it in code intended to last a long time.
Complex hierarchies are to be avoided. Take special care with multiple inheritance. On the rare occasion that using multiple inheritance is indeed useful, prepare design rationales in advance, and take special care to make documentation of the entire hierarchy clear. (In particular, multiple inheritance can be an acceptable way of combining "traits"-style classes that only contain static member functions. Its use with data-carrying classes is more problematic.)
Think carefully about the size and performance impact of virtual functions and virtual bases before using them.
Prefer to make data members private.
All constructors should initialize data members in the member initializer list rather than in the body of the constructor.
A class with virtual functions or virtual bases should have a virtual destructor.
Single argument constructors should nearly always be declared explicit.
Conversion operators should be avoided.
Overloading functions is permitted, but take care to ensure that overloads are not surprising, i.e. semantically identical or at least very similar. Virtual functions should not be overloaded.
Overloading operators is permitted, but take care to ensure that overloads are not surprising. Some unsurprising uses are in the implementation of numeric types and in following the C++ Standard Library's conventions. In addition, overloaded operators, excepting the call operator, should not be used for expensive implementations.
Note: in declarations of operator functions or in invocations of
such functions that involve the keyword operator,
the full name of the operator should be considered as including
the keyword with no spaces in between the keyword and the operator
token. Thus, the expected format of a declaration of an operator
is
T &operator== (const T &, const T &);
and not
T &operator == (const T &, const T &);
(with the space between operator and ==).
Default arguments are another type of function overloading, and the same rules apply. Default arguments must always be POD values, i.e. may not run constructors. Virtual functions should not have default arguments.
Constructors and destructors, even those with empty bodies, are often much larger than programmers expect. Prefer non-inline versions unless you have evidence that the inline version is smaller or has a significant performance impact.
To avoid excessive compiler size,
consider implementing non-trivial templates
on a non-template base class with void* parameters.
Namespaces are encouraged. All separable libraries should have a unique global namespace. All individual tools should have a unique global namespace. Nested include directory names should map to nested namespaces when possible.
Header files should have neither using directives
nor namespace-scope using declarations.
Run-time type information (RTTI), and with it dynamic_cast, is permitted
when certain non-default --enable-checking options are enabled,
so as to allow checkers to report dynamic types.
However, by default, RTTI is not permitted
and the compiler must build cleanly with -fno-rtti.
C-style casts should not be used. Instead, use C++-style casts.
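
For instance, conversions that would be written as (int) x or (const unsigned char *) p in C use a named cast instead. Both helpers below are hypothetical and exist only to show the spelling.

```cpp
#include <cassert>

// Hypothetical helper: narrows the result with a named cast rather than
// a C-style (int) cast.
int
percent (long done, long total)
{
  return static_cast<int> (done * 100 / total);
}

// Hypothetical helper: views an opaque pointer as bytes.
const unsigned char *
as_bytes (const void *ptr)
{
  return static_cast<const unsigned char *> (ptr);
}
```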
Exceptions and throw specifications are not permitted
and the compiler must build cleanly with -fno-exceptions.
Use of the standard library is permitted. Note, however, that it is currently not usable with garbage collected data.
Compiler messages, and indeed any text that needs i18n, should continue to use the existing facilities.
For long-term code, at least for now,
we will continue to use printf style I/O
rather than <iostream> style I/O.
When structs and/or classes have member functions,
prefer to name data members with a leading m_
and static data members with a leading s_.
Template parameter names should use CamelCase, following the C++ Standard.
Note that the rules for classes do not apply to structs. Structs continue to behave as before.
If the entire class definition fits on a single line, put it on a single line. Otherwise, use the following rules.
Do not indent protection labels.
Indent class members by two spaces.
Prefer to put the entire class head on a single line.
class gnuclass : base {
Otherwise, start the colon of the base list at the beginning of a line.

class a_rather_long_class_name
  : with_a_very_long_base_name, and_another_just_to_make_life_hard
{
  int member;
};

If the base clause exceeds one line, move the overflowing bases to the next line and indent by two spaces.

class gnuclass
  : base1 <template_argument1>, base2 <template_argument1>,
    base3 <template_argument1>, base4 <template_argument1>
{
  int member;
};

When defining a class,
Semantic constraints may require a different declaration order, but seek to minimize the potential confusion.
Close a class definition with a right brace, semicolon, optional closing comment, and a new line.
}; // class gnuclass
Define all members outside the class definition. That is, there are no function bodies or member initializers inside the class definition.
Prefer to put the entire member head on a single line.

gnuclass::gnuclass () : base_class ()
{
  ...
}

When that is not possible, place the colon of the initializer clause at the beginning of a line.

gnuclass::gnuclass ()
  : base1 (), base2 (), member1 (), member2 (), member3 (), member4 ()
{
  ...
}

If the initializer clause exceeds one line, move overflowing initializers to the next line and indent by two spaces.

gnuclass::gnuclass ()
  : base1 (some_expression), base2 (another_expression),
    member1 (my_expressions_everywhere)
{
  ...
}

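
Several of the class conventions above (private data members with an m_ prefix, an explicit single-argument constructor, member initializer lists, unindented protection labels, and members defined outside the class) can be seen together in one small sketch. The interval class is hypothetical and is not taken from GCC.

```cpp
#include <cassert>

class interval {
public:
  explicit interval (int point);
  interval (int low, int high);

  int width () const;

  // Copy operations and destructor: the defaults are intended.

private:
  int m_low;
  int m_high;
}; // class interval

interval::interval (int point) : m_low (point), m_high (point)
{
}

interval::interval (int low, int high) : m_low (low), m_high (high)
{
}

int
interval::width () const
{
  return m_high - m_low;
}
```

Because the single-argument constructor is explicit, an int is never silently converted to an interval at a call site.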
If a C++ function name is long enough to cause the first function parameter with its type to exceed 80 characters, it should appear on the next line indented four spaces.

void
very_long_class_name::very_long_function_name (
    very_long_type_name arg)
{

Sometimes the class qualifier and function name together exceed 80 characters.
In this case, break the line before the :: operator.
We may wish to do so pre-emptively for all class member functions.

void
very_long_template_class_name <with, a, great, many, arguments>
::very_long_function_name (
    very_long_type_name arg)
{

A declaration following a template parameter list should not have additional indentation.
Prefer typename over class in template parameter lists.
Prefer an extern "C" block to a declaration qualifier.
Open an extern "C" block with the left brace on the same line.
extern "C" {
Close an extern "C" block
with a right brace, optional closing comment, and a new line.
} // extern "C"
Definitions within the body of an extern "C" block are not indented.
Open a namespace with the namespace name followed by a left brace and a new line.
namespace gnutool {
Close a namespace with a right brace, optional closing comment, and a new line.
} // namespace gnutool
Definitions within the body of a namespace are not indented.
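
Put together, the namespace and extern "C" layout rules look like this. The gnutool namespace name comes from the text above; the two functions are hypothetical.

```cpp
#include <cassert>

namespace gnutool {

// Hypothetical function; note that it is not indented inside the namespace.
int
twice (int value)
{
  return 2 * value;
}

} // namespace gnutool

extern "C" {

// Hypothetical C-linkage entry point; not indented inside the block either.
int
gnutool_twice (int value)
{
  return gnutool::twice (value);
}

} // extern "C"
```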
There should be a space between the lambda-introducer and the parameter list, if any.
Lambdas that do not outlive their enclosing function should
typically use [&] implicit capture.

auto l = [&] (tree arg) { ... };

If a lambda does not fit on one line, the left brace should be indented like the body of a for-statement.

auto l = [&] (tree arg) mutable -> int
  {
    ...
  };

This also applies if the lambda is the last argument, and only lambda argument, to a function.

std::for_each (start, end, [&] (tree arg)
  {
    ...
  });

To get the above behavior from GNU Emacs CC Mode, you can add this to your .emacs:

(defun lambda-offset (elem)
  "If the opening brace of a lambda is on a new line, indent it one step."
  (if (assq 'inline-open c-syntactic-context)
      '+
    0))
(add-hook 'c++-mode-hook '(lambda ()
                            (c-set-offset 'inlambda 'lambda-offset)))

If the multi-line lambda is not the last argument, or there are multiple
lambda arguments, you are encouraged to make them local variables, as
the l examples above. If you do pass them directly, they should
be indented like other parameters.

my_algo (start, end,
         [&] (tree arg)
           {
             thing one...
           },
         [&] (tree arg)
           {
             thing two...
           });

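
Since the snippets above are schematic, here is a small compilable sketch of the "last and only lambda argument" layout. The function sum_of_squares is hypothetical, and int stands in for the tree type so that the example builds outside GCC.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

int
sum_of_squares (const std::vector<int> &values)
{
  int total = 0;
  // The lambda does not outlive this function, so [&] capture is used.
  std::for_each (values.begin (), values.end (), [&] (int value)
    {
      total += value * value;
    });
  return total;
}
```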
See also the GDB coding standards.
Python scripts should follow PEP 8 – Style Guide for Python Code
which can be verified by the flake8 tool.
We recommend using the following flake8
plug-ins:
Copyright (C) Free Software Foundation, Inc. Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.
These pages are maintained by the GCC team. Last modified 2024-09-17.