Files in the file system occasionally have holes. A hole
in a file is a section of the file’s contents which was never written.
The contents of a hole read as all zeros. On many operating systems,
actual disk storage is not allocated for holes, but they are counted
in the length of the file. If you archive such a file, tar
could create an archive that takes more space than the original
occupies on disk. To have tar attempt to recognize the holes in a
file, use ‘--sparse’ (‘-S’). When you use this option, then, for any
file using less disk space than would be expected from its length, tar
searches the file for holes. It then records in the archive where
the holes (consecutive stretches of zeros) are, and archives only the
“real contents” of the file. On extraction (‘--sparse’ is not
needed on extraction), holes are recreated in such files wherever they
were found. Thus, if you use ‘--sparse’, tar archives won’t
take more space than the original.
GNU tar uses two methods for detecting holes in sparse files. These
methods are described later in this subsection.
‘-S’
‘--sparse’
This option instructs tar to test each file for sparseness
before attempting to archive it. If the file is found to be sparse, it
is treated specially, which reduces the amount of space
used by its image in the archive.
This option is meaningful only when creating or updating archives. It has no effect on extraction.
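The sparseness test described above (a file "using less disk space than would be expected from its length") can be approximated from stat data. The following is a minimal sketch, not tar's actual implementation: the name looks_sparse is hypothetical, and the code assumes a Unix-like system where st_blocks counts 512-byte units.

```python
import os
import tempfile

# Assumption: st_blocks is reported in 512-byte units (true on Linux and most Unix systems).
BLOCK_UNIT = 512


def looks_sparse(size: int, blocks: int) -> bool:
    """Return True when the allocated space is smaller than the file length."""
    return blocks * BLOCK_UNIT < size


def demo() -> None:
    # Create a 1 MiB file whose only written bytes are the last four.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        f.seek(1024 * 1024 - 4)
        f.write(b"data")
    try:
        st = os.stat(path)
        print("length:", st.st_size,
              "allocated:", st.st_blocks * BLOCK_UNIT,
              "looks sparse:", looks_sparse(st.st_size, st.st_blocks))
    finally:
        os.unlink(path)


if __name__ == "__main__":
    demo()
```

Whether the demo file is actually stored sparsely depends on the file system, so the printed values will vary from system to system.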
Consider using ‘--sparse’ when performing file system backups, to avoid archiving the expanded forms of files stored sparsely in the system.
Even if your system has no sparse files currently, some may be
created in the future. If you use ‘--sparse’ while making file
system backups as a matter of course, you can be assured the archive
will never take more space on the media than the files take on disk
(otherwise, archiving a disk filled with sparse files might take
hundreds of tapes). See section Using tar to Perform Incremental Dumps.
However, be aware that the ‘--sparse’ option may present a serious
drawback. Namely, in order to determine the positions of holes in a file,
tar may have to read it before trying to archive it, so in total
the file may be read twice. This may happen when your OS or your FS
does not support the SEEK_HOLE/SEEK_DATA feature in lseek (see
‘--hole-detection’, below).
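To illustrate why the file may have to be read in full before archiving, here is a minimal sketch of locating holes as consecutive stretches of zeros. It is not tar's code: the function name data_regions and the 512-byte granularity are assumptions made for the example, and it scans an in-memory buffer where a real tool would read the file in chunks.

```python
def data_regions(buf: bytes, block: int = 512) -> list[tuple[int, int]]:
    """Return (offset, length) pairs for the blocks of buf that are not all zeros."""
    regions: list[tuple[int, int]] = []
    zero = bytes(block)
    for off in range(0, len(buf), block):
        chunk = buf[off:off + block]
        if chunk == zero[:len(chunk)]:
            continue  # an all-zero block is treated as part of a hole
        if regions and regions[-1][0] + regions[-1][1] == off:
            # This block directly follows the previous data region: extend it.
            regions[-1] = (regions[-1][0], regions[-1][1] + len(chunk))
        else:
            regions.append((off, len(chunk)))
    return regions
```

Only after such a pass is the map of data regions known, which is why the contents are then read a second time to be archived.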
When using the ‘POSIX’ archive format, GNU tar is able to store
sparse files in three distinct ways, called sparse
formats. A sparse format is identified by its number,
consisting, as usual, of two decimal numbers delimited by a dot. By
default, format ‘1.0’ is used. If, for some reason, you wish to
use an earlier format, you can select it using the
‘--sparse-version’ option.
‘--sparse-version=version’
Select the format to store sparse files in. Valid version values are: ‘0.0’, ‘0.1’ and ‘1.0’. See section Storing Sparse Files, for a detailed description of each format.
Using the ‘--sparse-version’ option implies ‘--sparse’.
‘--hole-detection=method’
Enforce a concrete hole detection method. Before the real contents of a sparse
file are stored, tar needs to gather knowledge about file
sparseness, because the file’s map of holes must be
stored in the tar header before the file contents are archived.
Currently, two methods of hole detection are implemented:
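‘--hole-detection=seek’
Seek through the file for data and holes. This method uses the lseek
system call (SEEK_HOLE and SEEK_DATA), which is able to
reuse file system knowledge about sparse file contents, so the
detection is usually very fast. To use this feature, your file system
and operating system must support it. At the time of this writing
(2015) this feature, in spite of not being accepted by POSIX, is
fairly widely supported by different operating systems.
‘--hole-detection=raw’
Read the whole file before archiving it and treat consecutive
stretches of zeros as holes. Compared to the previous method, this is
usually much slower, although more portable.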
When no ‘--hole-detection’ option is given, tar uses
the ‘seek’ method, if supported by the operating system.
Using the ‘--hole-detection’ option implies ‘--sparse’.
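The lseek-based detection can be sketched with Python's os.lseek, which exposes SEEK_DATA and SEEK_HOLE on systems that provide them. This is an illustration of the system call interface only, not tar's implementation. The function name data_extents is hypothetical, and the code assumes a platform where os.SEEK_DATA and os.SEEK_HOLE exist (for example Linux).

```python
import errno
import os
import tempfile


def data_extents(path):
    """Return (offset, length) pairs of data regions found with SEEK_DATA/SEEK_HOLE."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        pos = 0
        while pos < size:
            try:
                start = os.lseek(fd, pos, os.SEEK_DATA)  # next offset holding data
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # only a hole remains before end of file
                raise
            end = os.lseek(fd, start, os.SEEK_HOLE)  # end of that run of data
            extents.append((start, end - start))
            pos = end
    finally:
        os.close(fd)
    return extents


if __name__ == "__main__":
    # Write four bytes, skip ahead 1 MiB, write four more: the gap may become a hole.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"head")
        f.seek(1024 * 1024)
        f.write(b"tail")
    print(data_extents(f.name))
    os.unlink(f.name)
```

No file contents are read here, only offsets are queried, which is why this approach is usually fast. On a file system that does not track holes, the whole file is typically reported as a single data extent.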
This document was generated on August 23, 2023 using texi2html 5.0.