Finding the line number of the first occurrence of a text in a bash script

I need to find the line number of the first occurrence of a given search string, which must be at the start of a line in a text file, and store it in a variable in my bash script. For example, I want to find the first occurrence of "c":

abc
bde
cddefefef // this is the line that I need its line number
Casdasd // C here is capital, I dont need it
azczxczxc
b223r23r2fe
Cssdfsdfsdf
dccccdcdcCCDcdccCCC
eCCCCCC

I came up with this, but as you can see there are big problems:

   trimLineNum=$(cat "${varFileLog}" | grep -m1 -n "c")
   echo "c is at line #"${trimLineNum}

The output will be:

c is at line #1:abc

Problems:

  1. It matches the first line, because there is a "c" somewhere in that line.
  2. The output also includes the content of the line! I want just the line number.

What should I change to fix those problems?

Asked By: DEKKER

||

You need to tell grep about your “that should be in the start of a line” constraint, by anchoring the match to the start of a line with ^:

trimLineNum=$(grep -m1 -n -- '^c' "${varFileLog}")

Then post-process grep’s output to only keep the line number:

trimLineNum=$(grep -m1 -n -- '^c' "${varFileLog}")
trimLineNum="${trimLineNum%%:*}"

Note that -m is a GNU extension. With GNU grep you also need --, even though ^c doesn't start with -, in case $varFileLog itself starts with -, because GNU grep accepts options even after non-option arguments. Standardly, you could pipe the output to head -n 1 instead.

If there's no match, the grep -m1 command will return false/failure, while the pipeline through head -n 1 will always return true unless you enable the pipefail option, which is supported by several shells including bash.
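
Putting the two steps together with a check for the no-match case might look like the sketch below. The temporary sample file is an assumption for illustration only and stands in for your log file; -m1 still requires a grep that supports it, such as GNU grep.

```shell
# Sample data from the question, written to a temporary file (illustrative only).
varFileLog=$(mktemp)
printf '%s\n' abc bde cddefefef Casdasd azczxczxc > "$varFileLog"

# grep exits non-zero when nothing matches, so the assignment can be tested directly.
if trimLineNum=$(grep -m1 -n -- '^c' "$varFileLog"); then
    trimLineNum=${trimLineNum%%:*}   # drop everything from the first ":" onwards
    echo "c is at line #${trimLineNum}"
else
    echo "no line starts with c" >&2
fi

rm -f -- "$varFileLog"
```

With this sample data the script prints "c is at line #3".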

Answered By: Stephen Kitt

Several solutions exist

with AWK

awk '/^c/ { print NR; exit}' "${varFileLog}"
  • /^c/: matches the line starting with c
  • print NR: prints the record (line) number
  • exit : does not continue processing

As I like awk, this is my preferred solution
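
To store the result in a variable and tell "found" from "not found", one possible sketch follows. The found flag and the END rule are additions for illustration, not part of the one-liner above, and the sample file is made up. The second command assumes the search text is a fixed string rather than a regex, held in a hypothetical variable named needle.

```shell
# Sample data written to a temporary file (illustrative only).
varFileLog=$(mktemp)
printf '%s\n' abc bde cddefefef Casdasd > "$varFileLog"

# exit in the main rule still runs END, which sets the status:
# 0 if a line matched, 1 otherwise.
if lineNum=$(awk '/^c/ { print NR; found = 1; exit } END { exit !found }' "$varFileLog"); then
    echo "c is at line #${lineNum}"
fi

# Fixed-string variant: index() returns 1 when the line starts with the text.
# Note that awk -v interprets backslash escapes in the assigned value.
needle='c'   # hypothetical variable holding the search text
fixedNum=$(awk -v s="$needle" 'index($0, s) == 1 { print NR; exit }' "$varFileLog")

rm -f -- "$varFileLog"
```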

with grep + filtering

grep -n '^c' "${varFileLog}" | head -n1 | sed 's/:.*//'
  • '^c': matches the line starting with c
  • head -n1 : only displays first line from grep’s results
  • sed 's/:.*//' : removes anything after the :

sed 's/:.*//' and cut -d: -f1 have the same effect in that case

about performance

The grep + filtering pipeline above may be slower than Stephen’s solution, in which grep -m1 stops reading at the first match:

grep -m1 -n '^c' "${varFileLog}" | cut -d: -f1
Answered By: lauhub

With POSIX sed, you suppress normal output with the -n option, then for the line starting with c (pattern ^c), print the line number with = and quit:

sed -n '/^c/{=;q;}'

With GNU sed, you can use the Q command to quit without output and simplify to

sed '/^c/!d;=;Q'
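
Both commands read standard input as written; in a script you would typically pass the file name and capture the output. A small sketch using the POSIX form, with a sample file that is only there for illustration:

```shell
# Sample data written to a temporary file (illustrative only).
varFileLog=$(mktemp)
printf '%s\n' abc bde cddefefef Casdasd > "$varFileLog"

# = prints the current line number, q quits at the first matching line.
trimLineNum=$(sed -n '/^c/{=;q;}' "$varFileLog")

# sed exits 0 whether or not a line matched, so test for an empty result instead.
if [ -n "$trimLineNum" ]; then
    echo "c is at line #${trimLineNum}"
else
    echo "no line starts with c" >&2
fi

rm -f -- "$varFileLog"
```
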
Answered By: Philippos

grep can print the line number of a match with -n or --line-number so you can use that.

$ cat sample.txt | grep '^c' --line-number
3:cddefefef // this is the line that I need its line number
10:cat // added to illustrate getting the first occurrence vs. all

The problem then reduces to:

  • Get the first line only
  • Pull out just the number without the matched text

You can do the first one with head and the second with cut:

$ cat sample.txt | grep '^c' --line-number | head -n 1 | cut -d':' -f1
3

In your example output, you have some extra text – I’m not sure if this is important to you or not. However, when you have a number on STDOUT, adding some string prefix to this is a straightforward task, and I’ll leave that one up to you.
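
For completeness, here is a sketch of that last step: capture the number in a variable, then print it with the prefix from the question. The temporary sample file and the message variable are assumptions for illustration only.

```shell
# Sample data written to a temporary file (illustrative only).
sample=$(mktemp)
printf '%s\n' abc bde cddefefef Casdasd cat > "$sample"

# Command substitution captures the pipeline's output without the trailing newline.
trimLineNum=$(grep -n '^c' "$sample" | head -n 1 | cut -d: -f1)

# printf builds the prefixed message from the captured number.
message=$(printf 'c is at line #%s' "$trimLineNum")
echo "$message"

rm -f -- "$sample"
```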

Answered By: Jessica

Using Raku (formerly known as Perl_6)

raku -ne 'state $i; ++$i; say "c starts line $i" and last if m/^c/;'  

OR

raku -ne 'state $i; ++$i; say "c starts line $i" and last if (.index("c").defined && .index("c") == 0);' 

OR

raku -ne 'state $i; ++$i; say "c starts line $i" and last if .starts-with("c");' 

Outputs:

c starts line 3

Raku’s -ne (linewise non-autoprinting) command line flags are used. To get the line number, a state variable $i is initialized once, then incremented for every line read. Where a beginning-of-line "c" is identified (either by regex, or index, or starts-with), the string "c starts line $i" is interpolated and output (say).

Note: the low-precedence and last in each example above stops processing after the first match. Remove it to return all matching line numbers, e.g.:

~$ raku -ne 'state $i; ++$i; say "c starts line $i" if m/^c/;'  file
c starts line 3
c starts line 10

Addendum: Thanks to this SO answer, here’s a quick way to get the first zero-indexed line number starting with "c" using Raku’s first routine:

~$ raku -e 'say lines.first(* ~~ / ^ c /):k;' file
2

#OR

~$ perl6 -e 'say lines.first(*.starts-with("c")):k;'  file
2

Sample Input:

abc
bde
cddefefef // this is the line that I need its line number
Casdasd // C here is capital, I dont need it
azczxczxc
b223r23r2fe
Cssdfsdfsdf
dccccdcdcCCDcdccCCC
eCCCCCC
cddefefef // this is the line that I need its line number (again)

https://raku.org

Answered By: jubilatious1