Get correct number of lines in diff output

I want to get the correct number of lines in the output of diff (specifically with the -y and --suppress-common-lines options). A simple wc -l does not work: if both files end without a newline and their last lines differ, wc -l won’t count the last line of the output.

Is there a simple and efficient solution to avoid this?

For example, if you have files “a”:

a
b
c
d   #no newline here

And “b”:

a
b
c
D    #no newline here

The output is:

$ diff -y --suppress-common-lines a b | wc -l
0

Which is obviously incorrect since diff does output a line.

Asked By: Bakuriu

||

As stated in the man and info pages, it seems the -l (--lines) option of wc prints the number of newline characters. So if a line doesn’t end with a newline character, it doesn’t increment the count.
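
A quick way to see this is to feed wc -l the same text with and without a final newline. This is only a sketch; the variable names are made up for illustration:

```shell
# Two lines of text, but only one newline character: the last line is unterminated.
unterminated=$(printf 'a\nb' | wc -l)

# The same two lines, each terminated by a newline character.
terminated=$(printf 'a\nb\n' | wc -l)

# wc -l reports newline characters, so the first count is 1 and the second is 2.
echo "unterminated: $unterminated"
echo "terminated: $terminated"
```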

Answered By: user22304

There is no newline, so wc -l is correct. Instead, you want to count the number of line starts. One way to do it:

$ diff -y --suppress-common-lines a b | grep '^' | wc -l
1

Answered By: ire_and_curses
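
The pipeline above can probably be shortened with grep -c, which counts the matching lines itself. A minimal sketch, assuming a grep that treats an unterminated last line as a line (GNU grep does); count_lines is a hypothetical helper name, not something from the answer:

```shell
# Hypothetical helper: count lines on stdin, including a final unterminated one.
# '^' matches the start of every line, and -c prints the number of matching lines.
count_lines() {
    grep -c '^'
}

# One unterminated line, like the diff output in the question.
one=$(printf 'd | D' | count_lines)

# Two lines, the last one unterminated.
two=$(printf 'a\nb' | count_lines)

echo "$one $two"
```

With the question's files this would be diff -y --suppress-common-lines a b | grep -c '^'. Note that grep -c exits with status 1 when it counts zero lines, which matters if the script checks exit codes.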

It’s not incorrect. A line has to be terminated by an LF character; otherwise, it’s not a line (and anyway, wc -l is documented to count newline characters, not lines).

You could pipe the output into something that adds back the missing LF character. GNU paste does it:

$ diff -y --suppress-common-lines <(printf a) <(printf b) | wc -l
0
$ diff -y --suppress-common-lines <(printf a) <(printf b) | paste | wc -l
1

It might not work with other implementations of paste, but since you’re using GNU-specific options to diff, we can probably safely assume that you have GNU paste as well. The behavior of text utilities for non-terminated lines is unspecified by POSIX.
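
If GNU paste is not available, awk may serve as an alternative, since it counts a final unterminated line as a record. This is a sketch only and not part of the approach above; as with paste, behavior on input without a trailing newline is not guaranteed by POSIX, though common awk implementations appear to handle it:

```shell
# awk increments NR for every record it reads, including a last
# record that is not terminated by a newline character.
with_awk=$(printf 'a\nb' | awk 'END { print NR }')

# For comparison, wc -l sees only one newline character in the same input.
with_wc=$(printf 'a\nb' | wc -l)

echo "awk: $with_awk, wc -l: $with_wc"
```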

Answered By: Stéphane Chazelas