datamash uses tabs (ASCII character 0x09) as the default field delimiter. Use -W to treat one or more consecutive whitespace characters as a single field delimiter. Use -t (--field-separator) to set a custom field delimiter.
The following examples illustrate the various options.
By default, fields are separated by a single tab. Consecutive tabs denote multiple fields (this is consistent with GNU coreutils’ cut):
$ printf '1\t\t2\n' | datamash sum 3
2
$ printf '1\t\t2\n' | cut -f3
2
Every tab separates two fields. A line starting with a tab thus starts with an empty field, and a line ending with a tab ends with an empty field.
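The empty-field rule above can be checked without datamash by counting tab-separated fields. The following is a minimal sketch that uses awk purely as a stand-in; the helper name count_tab_fields is hypothetical, and the sketch assumes that awk's handling of a single-character separator matches the rule described here.

```shell
# Hypothetical helper: print the number of tab-separated fields on each line.
# awk is only a stand-in for illustration; it is not part of datamash.
count_tab_fields() {
    awk -F'\t' '{ print NF }'
}

printf '\tA\n'    | count_tab_fields   # leading tab: an empty field, then A
printf 'A\t\n'    | count_tab_fields   # trailing tab: A, then an empty field
printf '1\t\t2\n' | count_tab_fields   # two tabs: 1, an empty field, 2
```

The first two lines each count as two fields and the third as three, because every tab separates exactly two fields.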
Using -W, one or more consecutive whitespace characters are treated as a single field delimiter:
$ printf '1 \t 2\n' | datamash -W sum 2
2
$ printf '1 \t 2\n' | datamash -W sum 3
datamash: invalid input: field 3 requested, line 1 has only 2 fields
With -W, leading whitespace is ignored, but trailing whitespace is significant. A line starting with one or more consecutive whitespace characters followed by a non-whitespace character starts with a non-empty field. A line ending with one or more consecutive whitespace characters ends with an empty field.
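The asymmetry between leading and trailing whitespace can be modelled with standard tools. This is only a sketch of the rule as stated above, not of datamash's implementation: the helper name count_ws_fields is hypothetical, and it assumes that "whitespace" means spaces and tabs.

```shell
# Hypothetical helper modelling the -W rule described above (not datamash itself):
# drop leading whitespace, then split on runs of spaces and tabs.
count_ws_fields() {
    sed 's/^[[:space:]]*//' | awk -F'[ \t]+' '{ print NF }'
}

printf '  A B\n' | count_ws_fields   # leading whitespace is ignored: A, B
printf 'A B  \n' | count_ws_fields   # trailing whitespace adds an empty last field
```

Under this model the first line has two fields and the second has three, which matches the description: leading whitespace does not create a field, while trailing whitespace does.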
With -t, a custom field delimiter character can be specified. Consecutive delimiters are not merged: each one separates two fields, so adjacent delimiters enclose an empty field:
$ printf '1,10,,100\n' | datamash -t, sum 4
100
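As with tabs, the same input can be cross-checked against cut. This sketch shows only how cut numbers comma-separated fields; that it agrees with datamash -t, for this input is suggested by the example above rather than guaranteed in general.

```shell
# cut serves as a cross-check here; it also keeps the empty field between
# two adjacent commas, so 100 is field 4 rather than field 3.
printf '1,10,,100\n' | cut -d, -f4   # prints 100
printf '1,10,,100\n' | cut -d, -f3   # prints an empty line (the empty third field)
```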