Grep a variable number before / after a string in a line of text

I want to extract a number from a line of text.

cat log.txt | grep "License term"

01/01/2024:00:30 License term is 123 days.

I only want to separate the number that is counting down from that line.
Is there a way to print the word before / after a matching keyword? Like "output the string after 'is'" or "output the string before 'days'"?

The format will most likely stay the same, but it can change in the future (the format of the log entry may change, or other words may be added).

Asked By: mkcdu

||

If the format is always the same, I would prefer awk over grep, something like awk '{ print $5 }' or, if you also want the date, awk '{ print $1" "$5 }'

Answered By: admstg

The following sed command will find any line in log.txt that matches the regular expression .*License term is \([0-9]*\) days\.$, i.e., any line that contains the substring "License term is ", followed by an optional positive integer, followed by the substring " days." at the end of the line. When such a line is found, the whole line is replaced by the integer, and the modified line is printed, effectively extracting the number that you are asking for.

sed -n 's/.*License term is \([0-9]*\) days\.$/\1/p' log.txt
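
As a small sketch of the same substitution, here it is with extended regular expressions, which avoid the backslashes around the group. This assumes a sed that supports -E (GNU and BSD sed do); the temporary file and the variable names log and days are only for illustration, and the sample lines come from this thread.

```shell
# Build a throwaway log file containing the sample lines from this thread.
log=$(mktemp)
printf '%s\n' '#dummy_line' '' '01/01/2024:00:30 License term is 123 days.' > "$log"

# -n suppresses automatic printing; the trailing p prints only lines
# on which the substitution succeeded. \1 is the captured number.
days=$(sed -nE 's/.*License term is ([0-9]+) days\.$/\1/p' "$log")

echo "$days"
rm -f "$log"
```

Using [0-9]+ instead of [0-9]* here means a line without any digits is not matched at all, rather than producing an empty output line.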

Another way of doing this is with awk. The following uses a slightly different approach, just matching lines that contain the string License term and then outputting the second-to-last space-delimited word on these lines:

awk '/License term/ { print $(NF-1) }' log.txt
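
If the position of the number may shift in the future, a possible variation is to look for the keyword itself and print its neighbour, which is what the question literally asks for. This is only a sketch: the variable names line, after and before are made up for the example, and it assumes the words are separated by whitespace and that "days." keeps its trailing dot.

```shell
line='01/01/2024:00:30 License term is 123 days.'

# Print the field that follows the word "is".
after=$(printf '%s\n' "$line" |
  awk '{ for (i = 1; i < NF; i++) if ($i == "is") print $(i + 1) }')

# Print the field that precedes the word "days." (with the trailing dot).
before=$(printf '%s\n' "$line" |
  awk '{ for (i = 2; i <= NF; i++) if ($i == "days.") print $(i - 1) }')

echo "after 'is': $after"
echo "before 'days.': $before"
```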

You may obviously also combine grep with e.g. cut to cut out the 5th space-delimited field from the lines containing the string License term:

grep -F 'License term' log.txt | cut -d ' ' -f 5

I’m using grep with its -F option here to signal that we are searching with a string rather than with a regular expression.

Answered By: Kusalananda

With GNU grep, you can use the -o and -P options, which mean "print only the matching part of the line" and "use Perl Compatible Regular Expressions" respectively. With PCRE, we can use \K, which means "discard anything matched up to this point". Combining all this gives us:

$ grep -oP 'License term is \K\d+' log.txt
123

Whether that will work consistently, of course, depends on what else you have in log.txt, but it will work for your example.
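
For the "string before days" half of the question, a look-ahead may do the job in the same way. This is a sketch that assumes GNU grep built with PCRE support; the variable names line and days are just for the example.

```shell
line='01/01/2024:00:30 License term is 123 days.'

# \d+ matches a run of digits; (?= days) requires " days" to follow
# without including it in the text that -o prints.
days=$(printf '%s\n' "$line" | grep -oP '\d+(?= days)')

echo "$days"
```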

Answered By: terdon

Using Raku (formerly known as Perl_6)

~$ raku -ne '.put if s/ .* "License term is " (<[0..9]>*) " days." $/$0/;'  log.txt   

#OR:

~$ raku -ne '.put if s/ .* License \s term \s is \s (<[0..9]>*) \s days \. $/$0/;'  log.txt

#OR:

~$ raku -ne '.put if s/ .* License <.ws> term <.ws> is <.ws>  (<[0..9]>*) <.ws> days \. $/$0/;'  log.txt

OR:

~$ raku -ne 'if /License \s term/ { put .words[4] };'  log.txt

#OR:

~$ raku -ne 'put .words[4]  if /License \s term/;'   log.txt

OR:

~$ raku -e '$0.put for lines.match(/ "License term is "  ( \d+ ) /);'  log.txt

#OR:

~$ raku -e '.put for lines.match(/ "License term is "  ( \d+ ) /);'  log.txt

These answers written in Raku are similar in many respects to the excellent sed and awk answers already posted. The first two sets of answers use the -ne non-autoprinting linewise flags. In the first set the s/// form is used. In the second set Raku’s words routine is used to split on whitespace. In the last set of answers, Raku’s lines routine is used to find/return sub-matches.

Sample Input:

#dummy_line followed by blank line

01/01/2024:00:30 License term is 123 days.
#dummy_line

Sample Output:

123

"Capture markers" as well as look-aheads / look-behinds are also available using the Raku regex engine. See the first link below for details. Also, you have your choice of capturing <[0..9]> ASCII digits, or \d ASCII-plus-Unicode digits.

https://docs.raku.org/language/regexes#Regexes
https://raku.org

Answered By: jubilatious1

With pcre2grep (or its predecessor pcregrep):

pcre2grep -xo1 '\d\d/\d\d/\d\d\d\d:\d\d:\d\d License term is (\d+) days\.' < log.txt

This does a conservative matching of those lines, only selecting the ones that match that pattern exactly, and outputs the number matched by the 1st capture group.

Or the same with perl (the p in pcre2grep):

perl -lne '
  print $1 if m{^\d\d/\d\d/\d\d\d\d:\d\d:\d\d License term is (\d+) days\.$}
  ' < log.txt

Answered By: Stéphane Chazelas

You can use grep and tr:

$ grep -o 'is.* days' log.txt | tr -dc '[0-9]\n'
123
Answered By: user9101329