Finding the line number of first occurrence of a text in bash script
I need to find the line number of the first occurrence of a given search string, which must be at the start of a line in a text file, and store it in a variable in my bash script. For example, I want to find the first line starting with "c":
abc
bde
cddefefef // this is the line that I need its line number
Casdasd // C here is capital, I dont need it
azczxczxc
b223r23r2fe
Cssdfsdfsdf
dccccdcdcCCDcdccCCC
eCCCCCC
I came up with this but as you see there are big problems
trimLineNum=$(cat "${varFileLog}" | grep -m1 -n "c")
echo "c is at line #"${trimLineNum}
The output will be:
c is at line #1:abc
Problems:
- So obviously it matches the first line, because there is a "c" in the line.
- The output will also include the content of the line as well! I want it to be just the number of the line
what should I change to fix those problems?
You need to tell grep about your “that should be in the start of a line” constraint, by anchoring the match to the start of a line with ^:
trimLineNum=$(grep -m1 -n -- '^c' "${varFileLog}")
Then post-process grep’s output to only keep the line number:
trimLineNum=$(grep -m1 -n -- '^c' "${varFileLog}")
trimLineNum="${trimLineNum%%:*}"
Note that -m is a GNU extension (and with GNU grep, you need -- even though ^c doesn’t start with -, in case $varFileLog itself might start with -, as GNU grep accepts options even after non-option arguments). Standardly, you could pipe the output to head -n 1 instead.
If there’s no match, the grep -m1 command will return false/failure, while the pipeline through head -n 1 will always return true unless you enable the pipefail option, as supported by several shells including bash.
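Putting the two steps together, a wrapper could look roughly like the sketch below. The function name first_line_starting_with is made up for illustration; it assumes GNU grep (for -m) and a prefix that contains no regex metacharacters.

```shell
# Hypothetical helper (not part of the answer above): print the number of the
# first line of file $2 that starts with $1. Assumes GNU grep for -m1, and
# that $1 contains no regex metacharacters since it is used as a pattern.
first_line_starting_with() {
    local prefix="$1" file="$2" match
    # grep exits non-zero when nothing matches; pass that failure on.
    match=$(grep -m1 -n -- "^${prefix}" "$file") || return 1
    # Drop everything from the first colon on, keeping only the line number.
    printf '%s\n' "${match%%:*}"
}

# Demo on a throwaway file holding the first lines of the question's sample.
sample=$(mktemp)
printf '%s\n' abc bde cddefefef Casdasd azczxczxc > "$sample"

if trimLineNum=$(first_line_starting_with c "$sample"); then
    echo "c is at line #${trimLineNum}"    # prints: c is at line #3
else
    echo "no line starts with c" >&2
fi
rm -f -- "$sample"
```

Because the function returns grep's failure status, the caller can tell "no match" apart from a real line number without inspecting the output.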
Several solutions exist
with AWK
awk '/^c/ { print NR; exit}' "${varFileLog}"
- /^c/: matches the line starting with c
- print NR: prints the record (line) number
- exit: does not continue processing
As I like awk, this is my preferred solution
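If the search string lives in a shell variable, one option is to hand it to awk with -v and compare it literally with index() instead of building a regex. This is only a sketch: the function name awk_first_line is hypothetical, and it assumes the string contains no backslashes, since awk processes escape sequences in -v assignments.

```shell
# Hypothetical helper: print the number of the first line of file $2 that
# starts with the literal string $1; exit non-zero if there is no such line.
awk_first_line() {
    awk -v s="$1" '
        # index() returns 1 when s begins at the first character of the line.
        index($0, s) == 1 { print NR; found = 1; exit }
        # exit in the main rule still runs END, so set the final status here.
        END { exit !found }
    ' "$2"
}

sample=$(mktemp)
printf '%s\n' abc bde cddefefef Casdasd cat > "$sample"
awk_first_line c "$sample"    # prints 3
rm -f -- "$sample"
```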
with grep + filtering
grep -n '^c' "${varFileLog}" | head -n1 | sed 's/:.*//'
- '^c': matches the line starting with c
- head -n1: only displays the first line from grep’s results
- sed 's/:.*//': removes everything from the first : onward
sed 's/:.*//' and cut -d: -f1 have the same effect in that case
about performance
This may be slower than Stephen’s solution:
grep -m1 -n '^c' "${varFileLog}" | cut -d: -f1
With POSIX sed, you suppress normal output with the -n option, then for the line starting with c (pattern ^c), print the line number with = and quit with q:
sed -n '/^c/{=;q;}'
With GNU sed, you can use the Q command to quit without output and simplify to:
sed '/^c/!d;=;Q'
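To store the result in a variable as the question asks, the POSIX form could be wrapped as in the sketch below. The name sed_first_line is made up here, and the prefix is assumed to contain neither / nor regex metacharacters. sed exits successfully whether or not a line matched, so the sketch checks for an empty result instead of the exit status.

```shell
# Hypothetical helper: POSIX sed variant reading the file named in $2.
# $1 is spliced into the sed script, so it must not contain "/" or regex
# metacharacters.
sed_first_line() {
    sed -n -e "/^$1/{=;q;}" "$2"
}

sample=$(mktemp)
printf '%s\n' abc bde cddefefef Casdasd cat > "$sample"

trimLineNum=$(sed_first_line c "$sample")
# An empty string means no line matched.
if [ -n "$trimLineNum" ]; then
    echo "c is at line #${trimLineNum}"    # prints: c is at line #3
else
    echo "no line starts with c" >&2
fi
rm -f -- "$sample"
```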
grep can print the line number of a match with -n or --line-number, so you can use that.
$ cat sample.txt | grep '^c' --line-number
3:cddefefef // this is the line that I need its line number
10:cat // added to illustrate getting the first occurrence vs. all
The problem then reduces to:
- Get the first line only
- Pull out just the number without the matched text
You can do the first one with head and the second with cut:
$ cat sample.txt | grep '^c' --line-number | head -n 1 | cut -d':' -f1
3
In your example output, you have some extra text – I’m not sure if this is important to you or not. However, when you have a number on STDOUT, adding some string prefix to this is a straightforward task, and I’ll leave that one up to you.
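For readers who do want the prefix, one possible way to capture the number and print it is sketched below. The function name first_match_line and the temporary sample.txt are assumptions for the demo, not part of the answer above.

```shell
# Recreate a small sample.txt in a temporary directory for the demo.
dir=$(mktemp -d)
printf '%s\n' abc bde cddefefef Casdasd cat > "$dir/sample.txt"

# Hypothetical wrapper around the portable grep | head | cut pipeline.
# $1 is used as a regex, so it should not contain regex metacharacters.
first_match_line() {
    grep -n -- "^$1" "$2" | head -n 1 | cut -d: -f1
}

lineNum=$(first_match_line c "$dir/sample.txt")
# The pipeline's status is that of cut, so test for an empty result instead.
if [ -n "$lineNum" ]; then
    echo "c is at line #${lineNum}"    # prints: c is at line #3
else
    echo "no line starts with c" >&2
fi
rm -rf -- "$dir"
```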
Using Raku (formerly known as Perl_6)
raku -ne 'state $i; ++$i; say "c starts line $i" and last if m/^c/;'
OR
raku -ne 'state $i; ++$i; say "c starts line $i" and last if (.index("c").defined && .index("c") == 0);'
OR
raku -ne 'state $i; ++$i; say "c starts line $i" and last if .starts-with("c");'
Outputs:
c starts line 3
Raku’s -ne (linewise non-autoprinting) command line flags are used. To get the line number, a state variable $i is initialized once, then incremented for every line read. Where a beginning-of-line "c" is identified (either by regex, or index, or starts-with), the string "c starts line $i" is interpolated and output (say).
Note: the low-precedence conditional and last is added to each example above so that processing stops at the first match. Remove this conditional to return all matching line numbers, e.g.:
~$ raku -ne 'state $i; ++$i; say "c starts line $i" if m/^c/;' file
c starts line 3
c starts line 10
Addendum: Thanks to this SO answer, here’s a quick way to get the zero-indexed number of the first line starting with "c" using Raku’s first routine:
~$ raku -e 'say lines.first(* ~~ / ^ c /):k;' file
2
#OR
~$ perl6 -e 'say lines.first(*.starts-with("c")):k;' file
2
Sample Input:
abc
bde
cddefefef // this is the line that I need its line number
Casdasd // C here is capital, I dont need it
azczxczxc
b223r23r2fe
Cssdfsdfsdf
dccccdcdcCCDcdccCCC
eCCCCCC
cddefefef // this is the line that I need its line number (again)