6.1 wc: Print newline, word, and byte counts

wc counts the number of bytes, characters, words, and newlines in each given file, or in standard input if no file is given or if a file is ‘-’. A word is a nonempty sequence of non white space characters delimited by white space characters or by the start or end of input. Synopsis:

wc [option]... [file]...

wc prints one line of counts for each file, and if the file was given as an argument, it prints the file name following the counts. By default, if more than one file is given, wc prints a final line containing the cumulative counts, with the file name ‘total’. This ‘total’ line can be controlled with the --total option, which is a GNU extension. The counts are printed in this order: newlines, words, characters, bytes, maximum line length. Each count is printed right-justified in a field with at least one space between fields so that the numbers and file names normally line up nicely in columns. The width of the count fields varies depending on the inputs, so you should not depend on a particular field width. However, as a GNU extension, if only one count is printed, it is guaranteed to be printed without leading spaces.

By default, wc prints three counts: the newline, word, and byte counts. Options can specify that only certain counts be printed. Options do not undo others previously given, so

wc --bytes --words

prints both the byte counts and the word counts.
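As a minimal sketch of this behavior (assuming GNU wc and a POSIX shell; the sample text and variable names are made up for illustration), the options below are given in the opposite order to the output, yet the word count still comes first because the print order is fixed:

```shell
# Options combine rather than override; the output order is fixed (words before bytes).
counts=$(printf 'hello world\n' | wc --bytes --words)
# Split the output on white space so the variable field width does not matter.
set -- $counts
words=$1
bytes=$2
echo "words=$words bytes=$bytes"
```

Splitting the output into fields, rather than cutting at fixed columns, avoids depending on a particular field width.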

With the --max-line-length option, wc prints the length of the longest line per file, and if there is more than one file it prints the maximum (not the sum) of those lengths. The line lengths here are measured in screen columns, according to the current locale and assuming tab positions in every 8th column.
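The difference between a maximum and a sum can be seen with two small files. This is only a sketch, assuming GNU wc and mktemp are available; the file names short.txt and long.txt are hypothetical:

```shell
# Two hypothetical sample files whose longest lines are 3 and 5 columns wide.
tmpdir=$(mktemp -d)
printf 'abc\n' > "$tmpdir/short.txt"
printf 'ab\nabcde\n' > "$tmpdir/long.txt"
# The final line reports the maximum of the per-file values (5), not their sum (8).
set -- $(wc --max-line-length "$tmpdir/short.txt" "$tmpdir/long.txt" | tail -n 1)
overall=$1
echo "$overall"
rm -r "$tmpdir"
```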

The program accepts the following options. Also see Common options.

-c
--bytes

Print only the byte counts.

-m
--chars

Print only the character counts, as per the current locale. Encoding errors are not counted.
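To illustrate how byte and character counts can differ, the following sketch (assuming GNU wc and a printf that understands octal escapes) feeds wc a two-byte UTF-8 sequence followed by a newline. The byte count is always 3, while the character count depends on the current locale:

```shell
# '\303\251' is the two-byte UTF-8 encoding of U+00E9 (e with acute accent).
bytes=$(printf '\303\251\n' | wc -c)
# The character count is locale-dependent: a UTF-8 locale reports 2
# (the accented letter plus the newline); other locales may differ.
chars=$(printf '\303\251\n' | wc -m)
echo "bytes=$bytes chars=$chars"
```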

-w
--words

Print only the word counts. A word is a nonempty sequence of non white space delimited by white space characters or by start or end of input. The current locale determines which characters are white space. GNU wc treats encoding errors as non white space.

Unless the environment variable POSIXLY_CORRECT is set, GNU wc treats the following Unicode characters as white space even if the current locale does not: U+00A0 NO-BREAK SPACE, U+2007 FIGURE SPACE, U+202F NARROW NO-BREAK SPACE, and U+2060 WORD JOINER.
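The word definition above can be checked with a short sketch (assuming GNU wc; the input is made up for illustration). Runs of spaces, tabs, and newlines each act as a single delimiter, and the final word needs no trailing white space because end of input also delimits it:

```shell
# Leading spaces, doubled spaces, a tab, and an empty line produce no extra words.
words=$(printf '  one  two\tthree\n\nfour' | wc -w)
echo "$words"
```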

-l
--lines

Print only the newline character counts. If a file ends in a non-newline character, its trailing partial line is not counted.
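Because -l counts newline characters rather than lines of text, a final line that lacks a trailing newline does not add to the count. A small sketch, assuming GNU wc and illustrative variable names:

```shell
complete=$(printf 'a\nb\n' | wc -l)   # two newline characters
partial=$(printf 'a\nb' | wc -l)      # 'b' is not followed by a newline
echo "complete=$complete partial=$partial"
```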

-L
--max-line-length

Print only the maximum display widths. Tabs are set at every 8th column. Display widths of wide characters are considered. Non-printable characters are given 0 width.
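The effect of tab stops on the reported width can be sketched as follows (assuming GNU wc; the input is made up for illustration). The line contains only four characters, but the tab pads it out to the next multiple of 8 columns:

```shell
# 'ab' fills columns 1-2, the tab pads the line out to 8 columns,
# and 'c' occupies the ninth column.
width=$(printf 'ab\tc\n' | wc -L)
echo "$width"
```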

--total=when

Control when and how the final line with cumulative counts is printed. when is one of:

  • auto - This is the default mode of wc when no --total option is specified. Output a total line if more than one file is specified.
  • always - Always output a total line, irrespective of the number of files processed.
  • only - Only output total counts. I.e., don’t print individual file counts, suppress any leading spaces, and don’t print the ‘total’ word itself, to simplify subsequent processing.
  • never - Never output a total line.
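A sketch of how --total=only can simplify scripting, assuming a GNU wc recent enough to accept --total; the probe on /dev/null, the fallback value unsupported, and the file names are all assumptions of this example rather than part of wc:

```shell
tmpdir=$(mktemp -d)
printf 'one\ntwo\n' > "$tmpdir/a.txt"
printf 'three\n' > "$tmpdir/b.txt"
if wc --total=only /dev/null >/dev/null 2>&1; then
  # Only the cumulative newline count is printed: no per-file lines,
  # no leading spaces, and no 'total' label.
  total=$(wc -l --total=only "$tmpdir/a.txt" "$tmpdir/b.txt")
else
  # Older wc builds reject --total; record that instead of failing.
  total=unsupported
fi
echo "$total"
rm -r "$tmpdir"
```

Where the option is supported, the result is a bare number that can be used in arithmetic without further parsing.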

--files0-from=file

Disallow processing files named on the command line, and instead process those listed in file, where each name is terminated by a zero byte (ASCII NUL). This is useful when the list of file names is so long that it may exceed a command line length limitation. In such cases, running wc via xargs is undesirable because it splits the list into pieces and makes wc print a total for each sublist rather than for the entire list. One way to produce a list of ASCII NUL terminated file names is with GNU find, using its -print0 predicate. If file is ‘-’ then the ASCII NUL terminated file names are read from standard input.

For example, to find the length of the longest line in any .c or .h file in the current hierarchy, do this:

find . -name '*.[ch]' -print0 |
  wc -L --files0-from=- | tail -n1

An exit status of zero indicates success, and a nonzero value indicates failure.
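A script can branch on this exit status directly. The sketch below assumes that the hypothetical path /nonexistent/wc-example-file does not exist, so wc fails:

```shell
# A hypothetical path that is assumed not to exist.
missing=/nonexistent/wc-example-file
if wc -l "$missing" 2>/dev/null; then
  result=success
else
  result=failure
fi
echo "$result"
```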