Count total number of lines before/after a pattern match

I have a long list of IP addresses that are not in sequence. I need to find how many IP addresses come before and after a particular IP address. How can I achieve this?

Asked By: Mandar Shinde


grep can count the number of lines on which a particular pattern is found: the -c option does that. Combining -c with -v counts the lines that do not match the pattern.

Example:

grep -c -v <pattern> file

So if you try something like:

grep -c -v 192.168.x.x file.log

that should work.
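Note that grep -c -v reports one combined figure for all non-matching lines, so it does not by itself split the count into "before" and "after". A minimal sketch of one way to get both numbers with grep, assuming one address per line; the sample addresses and the variable names (file, ip, others, match, total) are made up for illustration:

```shell
# Hypothetical sample data: five addresses, one per line.
file=$(mktemp)
printf '%s\n' 10.0.0.1 10.0.0.2 192.168.1.1 10.0.0.3 10.0.0.4 > "$file"

ip=192.168.1.1

# -F treats the IP as a fixed string and -x requires a whole-line match,
# so the dots are not regex wildcards and 192.168.1.10 does not match.
others=$(grep -c -v -F -x -- "$ip" "$file")   # all lines that are not the IP

# -n prefixes each match with its line number; keep only the first match.
match=$(grep -n -F -x -- "$ip" "$file" | head -n 1 | cut -d: -f1)
total=$(wc -l < "$file")

before=$((match - 1))
after=$((total - match))
echo "others=$others before=$before after=$after"

rm -f "$file"
```

With this sample the IP sits on line 3 of 5, so the combined count is 4 and the split is 2 before, 2 after.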

Answered By: ryekayo

Here’s a little bit of Perl code that does it:

perl -ne '
     if (1 .. /192\.168\.1\.1/) { $before++ }
     else                       { $after++  }
     END {
         $before--;  # The matching line itself was counted
         printf "Before: %d, After: %d\n", $before, $after;
     }' your_file

This counts the total number of lines before and after the line containing the IP 192.168.1.1. Replace with your desired IP.

Using nothing but Bash:

before=0
match=0
after=0
while read line;do
    if [ "$line" = 192.168.1.1 ];then
        match=1
    elif [ $match -eq 0 ];then
        before=$(($before+1))
    else
        after=$(($after + 1))
    fi
done < your_file
printf "Before: %d, After: %d\n" "$before" "$after"
Answered By: Joseph R.

Number of lines before and after a match, including the match (i.e. you need to subtract 1 from the result if you want to exclude the match):

sed -n '0,/pattern/p' file | wc -l
sed -n '/pattern/,$p' file | wc -l

But this has nothing to do with IP addresses in particular.
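A minimal sketch of the subtraction described above, on made-up sample data. One assumption to flag: the 0,/pattern/ address form is a GNU sed extension, so this sketch swaps in sed '/pattern/q' (print up to and including the first match, then quit) for the "before" count. It also assumes the pattern matches at least one line.

```shell
# Hypothetical sample data: five addresses, one per line.
file=$(mktemp)
printf '%s\n' 10.0.0.1 10.0.0.2 192.168.1.1 10.0.0.3 10.0.0.4 > "$file"

# Anchored, with escaped dots, so only the exact address matches.
pattern='^192\.168\.1\.1$'

# Lines from the start up to and including the first match.
upto=$(sed "/$pattern/q" "$file" | wc -l)

# Lines from the first match to the end of the file.
from=$(sed -n "/$pattern/,\$p" "$file" | wc -l)

# Both counts include the matching line, so subtract 1 from each.
before=$((upto - 1))
after=$((from - 1))
echo "Before: $before, After: $after"

rm -f "$file"
```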

Answered By: vinc17

Maybe the easiest is to print the line number of the first match (the number of lines before it is that number minus 1):

sed -n '/pattern/{=; q;}' file

Thanks @JoshepR for pointing out the error.

Answered By: jpmuc

I tried the following commands, which are a bit complicated but give accurate results:

After:

a=$(cat file | wc -l) && b=$(cat -n file | grep <Pattern> | awk '{print $1}') && echo "$a - $b" | bc -l

Before:

echo "`cat -n file | grep <Pattern> | awk '{print $1}'`-1" | bc -l
Answered By: Mandar Shinde

An awk solution reporting the number of lines before and after the last match:

awk '/192\.168\.1\.1/{x=NR};{y=NR} END{printf "before-%d, after-%d\n" , x-1, y-x}'  file
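Because x is overwritten on every match, the command above reports positions relative to the last match. A small sketch of the difference when the address appears twice; the sample data and the variable names (ip, last, first) are invented for illustration, and the exact-line comparison $0 == ip is a substitution for the regex match:

```shell
# Hypothetical sample data in which the target address appears twice.
file=$(mktemp)
printf '%s\n' 10.0.0.1 192.168.1.1 10.0.0.2 192.168.1.1 10.0.0.3 > "$file"

# Last match: x is overwritten each time the line equals the IP.
last=$(awk -v ip=192.168.1.1 '
    $0 == ip { x = NR }
    END { printf "before-%d, after-%d", x - 1, NR - x }' "$file")

# First match: only record the line number while x is still unset.
first=$(awk -v ip=192.168.1.1 '
    $0 == ip && !x { x = NR }
    END { printf "before-%d, after-%d", x - 1, NR - x }' "$file")

echo "last match:  $last"
echo "first match: $first"

rm -f "$file"
```

Here the last match is on line 4 of 5 and the first on line 2, so the two variants report different splits.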
Answered By: iruvar

I did this two ways, though I think I like this best:

: $(( afterl=( lastl=$(wc -l <~/file) ) - 2 -
  $(( beforel=( matchl=$(sed -n "/$IP/{=;q;}" <~/file) ) - 1
)) ))
for n in last match afters befores
do  printf '%s line%s :\t%d\n' \
        "${n%s}" "${n##*[!s]}" $((${n%s}l))
done

That saves all of those as current shell variables – and evaluates them in the for loop afterwards for output. It counts the total lines in the file with wc and then gets the first matched line number with sed.

Its output:

last line :     1000
match line :    200
after lines :   799
before lines :  199

I also did:

sed -n "/$IP/=;$=" ~/file |
tr '\n' ' ' | {
IFS=' ' read ml ll
printf '%s line%s:\t%d\n' \
    last '' $((ll=${ll##* })) \
    match '' $ml \
    after s "$((al=ll-ml-1))" \
    before s $((bl=ml-1))
}

sed prints only the matching and last line numbers, then tr translates the intervening newlines to spaces, and read reads the first of sed's results into $ml and all others into $ll. Possible multiple match cases are handled by stripping all but the last result out of $ll's expansion when setting it again later.

Its output:

last line :     1000
match line :    200
after lines :   799
before lines :  199

Both methods were tested on the file generated in the following way:

IP='some string for which I seek'
for count in 1 2 3 4 5
do  printf '%.199d%s\n' 0 "$IP"
done | tr 0 '\n' >~/file

It does, by line number:

  1. sets the search string
  2. loops five times to ensure there will be multiple matches
  3. prints 199 zeroes then "$IP" then a newline
  4. pipes output to tr – which translates zeroes to newlines then into ~/file
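As a sanity check on the steps above, here is a sketch that builds the same kind of file in a temporary location (an assumption, used instead of ~/file so nothing in the home directory is overwritten) and inspects it with wc and grep. The variable names total, matches and first are illustrative only.

```shell
IP='some string for which I seek'
file=$(mktemp)   # temporary file standing in for ~/file

# Five blocks, each 199 zeroes followed by the search string and a newline;
# tr then turns every zero into a newline, giving 199 empty lines per block.
for count in 1 2 3 4 5
do  printf '%.199d%s\n' 0 "$IP"
done | tr 0 '\n' > "$file"

total=$(wc -l < "$file")                      # total number of lines
matches=$(grep -c -F -x -- "$IP" "$file")     # lines equal to the search string
first=$(grep -n -F -x -- "$IP" "$file" | head -n 1 | cut -d: -f1)

echo "total=$total matches=$matches first=$first"

rm -f "$file"
```

Each block contributes 200 lines, so the file has 1000 lines with the first match on line 200, which agrees with the "last line" and "match line" values shown in the outputs above.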
Answered By: mikeserv

Using Raku (formerly known as Perl_6)

Take a random sample of 10 IP addresses, examine the 7th, get back two numbers (6 3), the first representing the number of lines before the targeted 7th IP, and the second representing the number of lines after the 7th IP:

~$ raku -e 'my @a = lines; my $b = @a.elems;  
            for @a.grep("194.186.137.164", :k) {
                say $( $_, ($b-1) - $_, ) 
            };'  file

#OR

~$ raku -e 'say slurp.split(/^^ 194.186.137.164 \n /).map: *.match(:global, "\n").elems;'  file

Sample Input (from https://catonmat.net/tools/generate-random-ip-addresses):

149.81.24.12
158.23.21.37
174.27.60.219
50.22.84.121
65.77.204.162
100.113.99.25
194.186.137.164
170.64.193.219
83.185.96.218
7.168.11.61

Sample Output (either code solution above):

(6 3)

Of course, you can assign the output into an @-sigiled array, and print a more pretty version using @array[0] and @array[1] indexing. Or if you prefer a different approach:

~$ raku -e 'for lines.kv -> $k,$v {  
            if $v.match("194.186.137.164") { 
            put "Lines before: $k\n", "Lines after:  ", lines.elems;}};'  file
Lines before: 6
Lines after:  3

#OR

~$ raku -ne 'state $i;  ++$i;  
             put "Lines before: ", $i-1, "\nLines after:  ", lines.elems  
             if .match("194.186.137.164");'  file
Lines before: 6
Lines after:  3

#OR

~$ raku -ne 'state $i;  ++$i; state $j;  
             if .match("194.186.137.164") { $j = $i; put "Lines before: ", $i-1};  
             END say "Lines after:  ", $i-$j;'  file
Lines before: 6
Lines after:  3

Note: the original problem says all IPs are unique, so the code above fails for files with duplicate lines. But that is easily remedied. Testing on a 10-line file where the IP in the 7th position is also copied into an 11th line:

~$ raku -e 'my @a = lines; my $b = @a.elems;  
            my @pos = @a.grep("194.186.137.164", :k);  
            for @pos -> $p {say ++$, ".\nLines before: $p\nLines after:  ", ($b-1) - $p, "\n"};'  file

Sample Output (note file is now 11-lines long):

1.
Lines before: 6
Lines after:  4

2.
Lines before: 10
Lines after:  0

Note: the first solution at the top of the page will handle multiple matches (i.e. duplicate IPs) as well.

https://raku.org

Answered By: jubilatious1