The Coreutils uniq Command

The uniq command collapses each run of adjacent duplicate lines in its input into a single line and writes the result to stdout.

The call signature is

uniq {options} {path}

If path is specified, uniq reads its input from that file instead of from stdin.
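
As a minimal sketch of the two input modes, the following writes a small throwaway file and checks that naming the file and redirecting it to stdin give the same result. The temporary file and the variable names are illustrative assumptions, not part of the original text.

```shell
# Hypothetical sample data written to a temporary file.
tmp=$(mktemp)
printf 'a\na\nb\n' > "$tmp"

# Read the file named on the command line.
from_path=$(uniq "$tmp")

# Read the same data from stdin.
from_stdin=$(uniq < "$tmp")

printf '%s\n' "$from_path"
rm -f "$tmp"
```

Both variables should hold the same two lines, since only the source of the input differs.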

The basic behavior can be seen with a simple example. Suppose we have the following file and want to eliminate repeated lines.

ninja$: cat names.txt

red

red

green

red

blue

red

ninja$:

As a first attempt, we call the uniq command and pass the path to this file as an argument:

ninja$: uniq names.txt

red

green

red

blue

red

ninja$:

The uniq command reduced the adjacent lines containing "red" to a single line, but the other lines containing "red" remain because they are not adjacent to one another.

If we want to achieve truly unique lines, we should sort the content prior to piping it to uniq:

ninja$: sort names.txt | uniq

blue

green

red

ninja$:

which yields the desired result.
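
For whole-line comparisons like this one, sort can also discard the duplicates itself through its -u option, which saves the second process. A short sketch, using inline sample data (an assumption, so that the snippet does not depend on names.txt):

```shell
# Illustrative input with a repeated, non-adjacent value.
input=$(printf 'red\ngreen\nred\nblue\nred\n')

# Sort, then collapse adjacent duplicates with uniq.
piped=$(printf '%s\n' "$input" | sort | uniq)

# Let sort discard the duplicates itself.
direct=$(printf '%s\n' "$input" | sort -u)

printf '%s\n' "$direct"
```

The pipeline form is still needed whenever uniq's own options are wanted, since sort -u only removes duplicates.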

Summarizing Input

The uniq command includes a few options that can help generate summaries of input data.

First, the -c option produces the same lines as the previous example, but prefixes each one with the number of times it occurred consecutively in the input. Because the input is sorted here, that number is the total count of each line:

ninja$: sort names.txt | uniq -c

1 blue

1 green

4 red

ninja$:
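
A common follow-up to counting is to rank lines by frequency, by sorting the counted output numerically in reverse. The sketch below assumes inline sample data rather than names.txt, and the variable name is illustrative:

```shell
# Illustrative input: red appears three times, green twice, blue once.
# sort groups identical lines, uniq -c counts each group,
# and sort -rn puts the highest count first.
ranked=$(printf 'red\ngreen\nred\nblue\ngreen\nred\n' | sort | uniq -c | sort -rn)

printf '%s\n' "$ranked"
```

The first line of the result should be the most frequent value, which is red in this sample.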

Similarly, the -d option prints one copy of each line that is repeated, and omits lines that occur only once:

ninja$: sort names.txt | uniq -d

red

ninja$:

Finally, these options can also be combined to generate a report containing duplicate lines and the number of occurrences of each:

ninja$: sort names.txt | uniq -c -d

4 red

ninja$:
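
Since these are ordinary single-letter options, they can usually be bundled into one argument as well. A brief sketch with inline sample data (an assumption made for self-containment; the variable names are illustrative):

```shell
# Illustrative input, already sorted: only red is repeated.
sorted=$(printf 'blue\ngreen\nred\nred\nred\n')

# Options given separately.
separate=$(printf '%s\n' "$sorted" | uniq -c -d)

# The same options bundled into a single argument.
bundled=$(printf '%s\n' "$sorted" | uniq -cd)

printf '%s\n' "$bundled"
```

Both forms should report a single line for red with a count of 3.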