26.1 Version control and generated files

Background: Distributed Generated Files

Packages made with Autoconf and Automake ship with some generated files like configure or Makefile.in. These files were generated on the developer’s machine and are distributed so that end-users do not have to install the maintainer tools required to rebuild them. Other generated files like Lex scanners, Yacc parsers, or Info documentation are usually distributed on similar grounds.

Automake generates rules in Makefiles to rebuild these files. For instance, make will run autoconf to rebuild configure whenever configure.ac is changed. This makes development safer by ensuring that configure is never out of date with respect to configure.ac.

As generated files shipped in packages are up-to-date, and because tar preserves timestamps, these rebuild rules are not triggered when a user unpacks and builds a package.
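The effect of preserved timestamps can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the scratch directories, the empty placeholder files, and the chosen timestamps are all hypothetical, and the `-nt` file test merely approximates the modification-time comparison that make performs.

```shell
# Hypothetical scratch directories standing in for the developer's tree
# and the user's unpacked tree.
src=$(mktemp -d)
dst=$(mktemp -d)

# On the developer's machine, configure was generated after configure.ac
# was last edited, so it carries the later timestamp.
touch -t 202401010000.00 "$src/configure.ac"
touch -t 202401010005.00 "$src/configure"

# Pack and unpack; tar restores the recorded modification times by default.
tar -cf "$dst/pkg.tar" -C "$src" configure.ac configure
tar -xf "$dst/pkg.tar" -C "$dst"

# Approximate make's check: is the target newer than its prerequisite?
if [ "$dst/configure" -nt "$dst/configure.ac" ]; then
  result="no rebuild needed"
else
  result="rebuild would be triggered"
fi
echo "$result"
```

Because the unpacked configure is still newer than configure.ac, a rule with configure as its target and configure.ac as its prerequisite would consider the target up to date.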

Background: Version Control and Timestamps

Typically when you update files with version control commands, working files will have the timestamp of your update, not the original timestamp of the commit. This is meant to make sure that make notices that source files have been updated.

This timestamp shift is troublesome when both sources and generated files are kept under version control. Because version control commands often process files in lexical order, configure.ac will appear newer than configure after a version control command that updates both files, even if configure was newer than configure.ac when it was committed. Calling make will then trigger a spurious rebuild of configure.
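A minimal sketch of this scenario follows. It does not invoke any real version control tool; it only assumes that an update writes configure first and configure.ac one second later, which is simulated here with explicit (hypothetical) timestamps in a scratch directory.

```shell
# Hypothetical working directory standing in for a fresh checkout.
work=$(mktemp -d)

# Simulate an update that writes files in lexical order:
# configure first, configure.ac a moment later.
touch -t 202401010000.00 "$work/configure"
touch -t 202401010000.01 "$work/configure.ac"

# Approximate make's check: is the prerequisite newer than the target?
if [ "$work/configure.ac" -nt "$work/configure" ]; then
  status="configure looks out of date"
else
  status="configure looks up to date"
fi
echo "$status"
```

The contents of both files are irrelevant here: the ordering of the modification times alone is what makes configure appear stale.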

Living with Version Control in Autoconfiscated Projects

There are basically two clans among maintainers: those who keep all distributed files under version control, including generated files, and those who keep generated files out of version control.

All Files under Version Control

Generated Files out of Version Control

One way to get version control and make to work peacefully together is to never store generated files in version control, i.e., do not version-control files that are Makefile targets (also called derived files).

This way developers are not annoyed by changes to generated files. It does not matter if developers all have different versions of the maintainer tools (assuming these versions are compatible, of course). And finally, timestamps are not lost; changes to source files can’t be missed as in the Makefile.am/Makefile.in example discussed earlier.

The drawback is that the repository does not contain some of the files that are distributed, so builders now need to install various development tools (maybe even specific versions) before they can build a checkout. But, after all, the job of version control is versioning, not distribution.
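A project that follows this approach might help builders by checking for the required tools before attempting to regenerate anything. The helper below is a hypothetical sketch, not part of Automake: the function name missing_tools is invented for illustration, and the list of tools a given package needs is an assumption that each project would have to adjust.

```shell
# Hypothetical helper: print the names of the given tools that cannot be
# found in PATH, separated by spaces (prints an empty line if none are missing).
missing_tools() {
  missing_list=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing_list="$missing_list $tool"
  done
  # Strip the leading space left by the first append.
  echo "${missing_list# }"
}
```

A bootstrap script could call, for example, `missing_tools autoconf automake libtoolize` and stop with a helpful message if the output is not empty. Note that this only checks for presence; it does not verify that the installed versions are the ones the package expects.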

Allowing developers to use different versions of their tools can also hide bugs during distributed development. Indeed, developers will be using (hence testing) their own generated files, instead of the generated files that will be released. The developer who prepares the tarball might be using a version of the tool that produces bogus output (for instance a non-portable C file), something other developers could have noticed if they weren’t using their own versions of this tool.

Third-party Files

Another class of files not discussed here (because they do not cause timestamp issues) is files that are shipped with a package, but maintained elsewhere. For instance, tools like gettextize and autopoint (from Gettext) or libtoolize (from Libtool) will install or update files in your package.

These files, whether they are kept under version control or not, raise similar concerns about version mismatch between developers’ tools. The Gettext manual has a section about this; see Integrating with Version Control Systems in GNU gettext tools.