Next: Rounding numbers, Previous: Check - checking tabular structure, Up: Usage Examples [Contents][Index]
Cross-tabulation compares the relationship between two fields. Given the following input file:
$ cat input.txt a x 3 a y 7 b x 21 a x 40
Show cross-tabulation between the first field (a/b) and the second field (x/y) - counting how many times each pair appears (note: sorting is required):
$ datamash -s crosstab 1,2 < input.txt x y a 2 1 b 1 N/A
The default operation is count - in the above example, a and x appear twice in the input file, while b and y never appear together.
An optional grouping operation can be used instead of counting.
For each pair, sum the values in the third column:
$ datamash -s crosstab 1,2 sum 3 < input.txt x y a 43 7 b 21 N/A
For each pair, list all unique values in the third column:
$ datamash -s crosstab 1,2 unique 3 < input.txt x y a 3,40 7 b 21 N/A
Note that using --header-out with crosstab prints a line showing how to interpret the rows and columns, and what operation was used.
$ datamash -s --header-in --header-out crosstab 1,2 < input.txt GroupBy(a) GroupBy(x) count(a) x y a 1 1 b 1 N/A