cut


The cut command extracts selected parts of each line of content passed to stdin.

The call signature is:

cut {options} {path}

When path is specified, cut reads the contents of that file instead of stdin.
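As a minimal sketch of the two input modes, the snippet below gives the same line to cut once through a pipe and once through a path argument. The temporary file comes from mktemp and is purely illustrative (mktemp is assumed to be available); the -f option used here selects a tab-delimited field and is covered in the next section.

```shell
# Create a throwaway file; its name is chosen by mktemp and is illustrative only.
tmpfile=$(mktemp)
printf 'alpha\tbeta\n' > "$tmpfile"

# Mode 1: read from stdin via a pipe.
from_stdin=$(printf 'alpha\tbeta\n' | cut -f 2)

# Mode 2: read from the file named by the path argument.
from_file=$(cut -f 2 "$tmpfile")

echo "$from_stdin"   # beta
echo "$from_file"    # beta

rm -f "$tmpfile"
```

Both calls produce the same result; only the source of the input differs.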

cut is often used either to extract fields from delimited files or to extract positional data by index. Let's look at each of those:

Extract By Delimited Fields

Data can be extracted from delimited fields in two steps.

First, each line is split based on the delimiter. By default the cut command assumes tab-delimited data, though this can be changed using the -d option, followed by the character to use as a delimiter.

Second, fields are selected with the -f option and returned by numerical index, starting from 1. Field index specifications can take several forms:

Spec    Meaning
n       include field n
n,m     include fields n and m
n-m     include fields from n to m
n-      include fields from n to the end
-m      include fields from 1 to m

where n and m are integers.
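The forms above can be tried side by side on a single line of input. This is only a sketch: the colon-delimited sample line and the variable names are made up for illustration.

```shell
# Hypothetical sample line with five colon-delimited fields.
line='a:b:c:d:e'

single=$(printf '%s\n' "$line" | cut -d : -f 2)        # n:   field 2
pair=$(printf '%s\n' "$line" | cut -d : -f 2,4)        # n,m: fields 2 and 4
range=$(printf '%s\n' "$line" | cut -d : -f 2-4)       # n-m: fields 2 through 4
to_end=$(printf '%s\n' "$line" | cut -d : -f 3-)       # n-:  field 3 to the end
from_start=$(printf '%s\n' "$line" | cut -d : -f -2)   # -m:  fields 1 through 2

echo "$single"       # b
echo "$pair"         # b:d
echo "$range"        # b:c:d
echo "$to_end"       # c:d:e
echo "$from_start"   # a:b
```

Note that when several fields are selected, cut joins them in the output with the same delimiter it used to split the line.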

To demonstrate, suppose you have the following file:

ninja$: cat colors.csv
red,#f00
green,#0f0
blue,#00f
yellow,#ff0
cyan,#0ff
magenta,#f0f
ninja$:  

and want to extract the color name from each line. This can be achieved by first setting the delimiter to a comma (,), then selecting the first field from each line:

ninja$: cut -d , -f 1 colors.csv
red
green
blue
yellow
cyan
magenta
ninja$:  

Alternatively, to extract the hex value for each color, select the second field:

ninja$: cut -d , -f 2 colors.csv
#f00
#0f0
#00f
#ff0
#0ff
#f0f
ninja$:  

Multiple fields can be specified using comma-separated values. One quirk of the cut command is that fields are returned in the order they are read, not in the order they are specified. For example, if we wanted to reverse fields 1 and 2 and return them we might try the following:

ninja$: cut -d , -f 2,1 colors.csv
red,#f00
green,#0f0
blue,#00f
yellow,#ff0
cyan,#0ff
magenta,#f0f
ninja$:  

Note that both fields are returned in the original order, despite the ordering of the spec.
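If the fields really do need to come out in a different order, cut alone cannot do it. One possible workaround, sketched here with awk rather than cut, is to print the fields explicitly in the desired order. The sample input mirrors the first two rows of colors.csv; the variable names are illustrative.

```shell
# cut ignores the order of the spec: 2,1 behaves like 1,2.
same_order=$(printf 'red,#f00\n' | cut -d , -f 2,1)
echo "$same_order"   # red,#f00

# awk (a different tool, not a cut option) can emit fields in any order.
# -F , sets the input field separator to a comma.
swapped=$(printf 'red,#f00\ngreen,#0f0\n' | awk -F , '{ print $2 "," $1 }')
echo "$swapped"
# #f00,red
# #0f0,green
```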

Extract By Index

Data can be extracted from each line by character position using the -c option, followed by the column(s) to return. Columns are specified in a manner similar to above:

Spec    Meaning
n       include column n
n,m     include columns n and m
n-m     include columns from n to m
n-      include columns from n to the end
-m      include columns from 1 to m

where n and m are integers.

For example, to return only the first 6 characters from each line:

ninja$: cut -c -6 colors.csv
red,#f
green,
blue,#
yellow
cyan,#
magent
ninja$:  

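or to return the seven columns from column 3 through column 9 (a line with fewer than 9 characters, such as the first, returns only the characters it has):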

ninja$: cut -c 3-9 colors.csv
d,#f00
een,#0f
ue,#00f
llow,#f
an,#0ff
genta,#
ninja$:  

Finally, more complex cases can combine multiple column-specifications as follows:

ninja$: cut -c 1-3,9 colors.csv
red
gref
bluf
yelf
cyaf
mag#
ninja$:
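
Two behaviors visible in the outputs above can be checked in isolation. The sketch below uses single lines taken from colors.csv and illustrative variable names: a line shorter than the requested range yields only the characters that exist, and, as with fields, columns come back in line order regardless of how the spec is written.

```shell
# 'red,#f00' has only 8 characters, so columns 3-9 return just 6 of them.
short=$(printf 'red,#f00\n' | cut -c 3-9)
echo "$short"      # d,#f00

# The order of the column spec does not change the order of the output.
forward=$(printf 'green,#0f0\n' | cut -c 1-3,9)
reversed=$(printf 'green,#0f0\n' | cut -c 9,1-3)
echo "$forward"    # gref
echo "$reversed"   # gref
```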