wc: Print newline, word, and byte counts
wc counts the number of bytes, characters, words, and newlines
in each given file, or standard input if none are given
or for a file of ‘-’. A word is a nonempty sequence of non white
space delimited by white space characters or by start or end of input.
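As a rough illustration of the four kinds of count, the sketch below feeds one small sample through wc four times. The sample text and the helper name `sample` are made up for this example; the character count is not asserted to a fixed value because it depends on the current locale.

```shell
# Hypothetical sample: "café au<TAB>lait" on the first line, then "fin"
# with no trailing newline. \303\251 is the two-byte UTF-8 encoding of "é".
sample() { printf 'caf\303\251 au\tlait\nfin'; }

bytes=$(sample | wc -c)   # every byte, including the tab and the newline
chars=$(sample | wc -m)   # locale dependent: "é" is 1 character in UTF-8
words=$(sample | wc -w)   # runs of non white space
lines=$(sample | wc -l)   # newline characters only; "fin" adds none

echo "bytes=$bytes chars=$chars words=$words lines=$lines"
```

The line count here is 1 rather than 2, because only newline characters are counted and the last line is not terminated.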
Synopsis:
wc [option]… [file]…
wc prints one line of counts for each file, and if the file was
given as an argument, it prints the file name following the counts. By default,
if more than one file is given, wc prints a final line
containing the cumulative counts, with the file name ‘total’.
This ‘total’ line can be controlled with the --total option,
which is a GNU extension.
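The per-file lines and the final ‘total’ line can be seen with two scratch files. This is a minimal sketch: the file names are invented, and because --total may be missing from older wc versions, the script probes for it before using it.

```shell
# Create two small sample files in a scratch directory (names are made up).
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/a.txt"
printf 'three\n' > "$dir/b.txt"

# Two file arguments: one line per file, then a final line named "total".
default_out=$(wc -l "$dir/a.txt" "$dir/b.txt")
printf '%s\n' "$default_out"

# --total is a GNU extension; check that this wc accepts it first.
if wc --total=never /dev/null >/dev/null 2>&1; then
    # Same counts, but with the final total line suppressed.
    wc -l --total=never "$dir/a.txt" "$dir/b.txt"
fi

rm -r "$dir"
```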
The counts are printed in this order: newlines, words, characters, bytes,
maximum line length.
Each count is printed right-justified in a field with at least one
space between fields so that the numbers and file names normally line
up nicely in columns. The width of the count fields varies depending
on the inputs, so you should not depend on a particular field width.
However, as a GNU extension, if only one count is printed,
it is guaranteed to be printed without leading spaces.
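In a script, the practical consequence is that a single count read from standard input can be captured directly into a variable. The sketch below assumes a throwaway temporary file; the second form strips white space defensively for wc implementations that do not make the GNU guarantee.

```shell
# Throwaway input file with three lines.
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

# Redirecting the file to standard input keeps the file name out of the
# output, and a single count has no leading spaces with GNU wc.
n=$(wc -l < "$tmp")

# Portable variant: remove any padding another implementation might add.
n_portable=$(wc -l < "$tmp" | tr -d '[:space:]')

rm -f "$tmp"
echo "lines: $n (portable form: $n_portable)"
```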
By default, wc prints three counts: the newline, word, and byte
counts. Options can specify that only certain counts be printed.
Options do not undo others previously given, so
wc --bytes --words
prints both the byte counts and the word counts.
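A short sketch of how the options accumulate: although --bytes is given first, the output follows the fixed order described above, so the word count comes before the byte count. The input string is arbitrary.

```shell
# Both options stay in effect; the word count is printed before the byte
# count, whatever order the options were given in.
counts=$(printf 'one two\n' | wc --bytes --words)

# Split the output on white space into positional parameters.
set -- $counts
words=$1
bytes=$2

echo "words=$words bytes=$bytes"
```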
With the --max-line-length option, wc prints the length
of the longest line per file, and if there is more than one file it
prints the maximum (not the sum) of those lengths. The line lengths here
are measured in screen columns, according to the current locale and
assuming tab positions in every 8th column.
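The two points above, tab expansion and "maximum, not sum", can be checked with a small sketch. The file names are invented, and the widths in the comments are what GNU wc should report; other implementations may measure differently.

```shell
# "a", a tab, then "b": the tab advances to column 8, so "b" ends at 9.
tab_width=$(printf 'a\tb\n' | wc -L)

# Two scratch files whose longest lines are 3 and 10 columns wide.
dir=$(mktemp -d)
printf 'abc\n' > "$dir/short.txt"
printf '0123456789\n' > "$dir/long.txt"

# The last output line holds the maximum of the per-file widths (10),
# not their sum (13).
set -- $(wc -L "$dir/short.txt" "$dir/long.txt" | tail -n 1)
overall=$1

rm -r "$dir"
echo "tab_width=$tab_width overall=$overall"
```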
The program accepts the following options. Also see Common options.
‘-c’, ‘--bytes’
Print only the byte counts.
‘-m’, ‘--chars’
Print only the character counts, as per the current locale. Encoding errors are not counted.
‘-w’, ‘--words’
Print only the word counts. A word is a nonempty sequence of non white
space delimited by white space characters or by start or end of input.
The current locale determines which characters are white space.
GNU wc treats encoding errors as non white space.
Unless the environment variable POSIXLY_CORRECT is set,
GNU wc treats the following Unicode characters as white
space even if the current locale does not: U+00A0 NO-BREAK SPACE,
U+2007 FIGURE SPACE, U+202F NARROW NO-BREAK SPACE, and U+2060 WORD
JOINER.
‘-l’, ‘--lines’
Print only the newline character counts. If a file ends in a non-newline character, its trailing partial line is not counted.
‘-L’, ‘--max-line-length’
Print only the maximum display widths. Tabs are set at every 8th column. Display widths of wide characters are considered. Non-printable characters are given 0 width.
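‘--total=when’
Control when and how the final line with cumulative counts is printed. when is one of:
‘auto’
This is the default mode of wc when no --total
option is specified. Output a total line if more than one file
is specified.
‘always’
Always output a total line, irrespective of the number of files processed.
‘only’
Output only the total counts, without the individual file counts.
‘never’
Never output a total line.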
‘--files0-from=file’
Disallow processing files named on the command line, and instead process
those named in file file; each name being terminated by a zero byte
(ASCII NUL).
This is useful when the list of file names is so long that it may exceed
a command line length limitation.
In such cases, running wc via xargs is undesirable
because it splits the list into pieces and makes wc print
a total for each sublist rather than for the entire list.
One way to produce a list of ASCII NUL terminated file
names is with GNU find, using its -print0 predicate.
If file is ‘-’ then the ASCII NUL terminated
file names are read from standard input.
For example, to find the length of the longest line in any .c or .h file in the current hierarchy, do this:
find . -name '*.[ch]' -print0 | wc -L --files0-from=- | tail -n1
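The same pipeline shape can also produce a single grand total of lines. The sketch below builds a scratch tree so that it can be run anywhere; the directory and file names are invented for illustration, and it assumes a find with -print0 and a wc with --files0-from.

```shell
# Scratch tree with two matching files and one that should be skipped.
dir=$(mktemp -d)
mkdir "$dir/src"
printf 'int x;\n' > "$dir/src/a.c"
printf 'int y;\nint z;\n' > "$dir/src/b.h"
printf 'ignored\n' > "$dir/src/notes.txt"

# find emits NUL-terminated names; wc reads them from standard input,
# so all files are counted in one invocation with one total line.
report=$(find "$dir/src" -name '*.[ch]' -print0 | wc -l --files0-from=-)
printf '%s\n' "$report"

# The first field of the last line is the cumulative line count.
total=$(printf '%s\n' "$report" | tail -n 1 | awk '{print $1}')

rm -r "$dir"
echo "total lines: $total"
```

Because wc runs only once, there is a single total line however many names find produces, which is the property that the xargs approach cannot guarantee.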
An exit status of zero indicates success, and a nonzero value indicates failure.
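A script can branch on this status. In the minimal sketch below, the missing path is a made-up name that is assumed not to exist on the system.

```shell
# Hypothetical path that is assumed not to exist.
missing=/nonexistent/wc-demo-file

# A readable (here empty) input succeeds, so the status is zero.
ok_status=0
wc -l /dev/null >/dev/null 2>&1 || ok_status=$?

# An unreadable file makes wc fail, so the status is nonzero.
fail_status=0
wc -l "$missing" >/dev/null 2>&1 || fail_status=$?

echo "ok_status=$ok_status fail_status=$fail_status"
```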