Isolate one specific instance of a word in a string
I have a grep command to process the string "test-test test". I want the command to show me only the "free-standing" word test highlighted, but not test-test.
However, when I run
echo "test-test test" | grep -w "test"
I get all test occurrences highlighted.
Is it possible to restrict the matching to only the isolated occurrence of test?
With Perl regexes (supported in GNU grep with the -P flag), you can use a negative lookbehind and lookahead to assert that the match for test is not adjacent to a hyphen (-):
echo "test-test test" | grep -P '(?<!-)test(?!-)'
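To see which occurrences those hyphen assertions accept, the matches can be counted with -o. This is only a sketch: it assumes GNU grep built with PCRE support (-P) and the -o extension, and count_not_hyphenated is a hypothetical helper name. Note that the assertions only exclude hyphens, so the test inside testing still counts.

```shell
# Hypothetical helper: count occurrences of "test" that have no hyphen
# immediately before or after them.
# Assumes GNU grep with -P (PCRE) and -o; neither option is in POSIX.
count_not_hyphenated() {
    printf '%s\n' "$1" | grep -oP '(?<!-)test(?!-)' | wc -l | tr -d ' '
}

count_not_hyphenated 'test-test test'      # only the last word qualifies
count_not_hyphenated 'testing test-test'   # the "test" in "testing" qualifies
```

Both calls should print 1: in the first string only the final test has no adjacent hyphen, and in the second only the test at the start of testing does.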
You get all test occurrences in bold because, for -w, a word is a sequence of word characters delimited by non-word characters, word characters being alphanumerics and underscores. - is not a word character.
In testing-test+test2;test_3;untested, only the second test occurrence matches because that's the only one that is neither preceded nor followed by a word character.
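The -w rule can be checked by counting matches with -o, which prints each match on its own line. A minimal sketch, assuming a grep that supports -o (GNU and BSD grep do, but it is not in POSIX); count_word_matches is a hypothetical helper name:

```shell
# Hypothetical helper: count how many times "test" matches under -w.
# Assumes grep supports -o (a GNU/BSD extension, not POSIX).
count_word_matches() {
    printf '%s\n' "$1" | grep -ow 'test' | wc -l | tr -d ' '
}

count_word_matches 'test-test test'
count_word_matches 'testing-test+test2;test_3;untested'
```

The first call should print 3, because the hyphen and the space are both non-word characters, and the second should print 1, for the test between - and +.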
If you want to match on whitespace-delimited words, you'll have to specify those look-around assertions by hand, and you'll need Perl-compatible regexps for that, which not all grep implementations support. Since you seem to have a grep='grep --color' alias, if your grep supports --color, there's a chance it also supports a -P or -X perl option, and then you can do:
grep --color -P '(?<!\S)test(?!\S)'
(?<!...) and (?!...) being respectively the negative look-behind and look-ahead assertion operators in Perl regexps. \S, like POSIX's [^[:space:]], matches any character that is not classified as whitespace. So that says: "match on test provided it's neither preceded nor followed by a non-whitespace character".
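As a quick check of the whitespace-delimited reading, the matches can be counted rather than highlighted. This sketch assumes GNU grep built with PCRE support (-P) and the -o extension; count_isolated is a hypothetical helper name.

```shell
# Hypothetical helper: count occurrences of "test" that are delimited by
# whitespace or by the start/end of the line.
# Assumes GNU grep with -P (PCRE) and -o; neither option is in POSIX.
count_isolated() {
    printf '%s\n' "$1" | grep -oP '(?<!\S)test(?!\S)' | wc -l | tr -d ' '
}

count_isolated 'test-test test'
count_isolated 'testing-test+test2;test_3;untested'
```

The first call should print 1 (only the free-standing word), and the second 0, since every test in that string touches a non-whitespace character.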
If you want something akin to highlighting, here's a simple idea from PerlMonks. Instead of grep's color option, use Perl/Raku and surround the matches with a marker such as **:
~$ printf 'test-test test\ntest\n\n' | perl -lpe 's/(?<!\S)(test)(?!\S)/**$1**/g;'
test-test **test**
**test**
~$ printf 'test-test test\ntest\n\n' | raku -pe 's:g/ <!after \S > (test) <!before \S > /**$0**/;'
test-test **test**
**test**
[Special thanks to @StéphaneChazelas for expounding on Perl-style lookarounds in his answer.]
Raku note: Lookbehinds are spelled after in Raku, with ? for positive and ! for negative, while lookaheads are spelled before in Raku, with ? for positive and ! for negative. So you can literally read the Raku code above as "substitute globally any matches to the word 'test' not after a non-whitespace character and also not before a non-whitespace character…".
Using extended regex, you can achieve it like so:
grep -E "(^| )test( |$)"
With the minor caveat that this will also highlight any leading or trailing space, but you won't see it because a space has no visible glyph.
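To make that caveat visible, each match can be printed inside brackets so the captured spaces show up. A small sketch, assuming a grep that supports -o (a GNU/BSD extension); show_matches is a hypothetical helper name:

```shell
# Hypothetical helper: print every match of the extended regex wrapped in
# brackets, so any leading or trailing space in the match becomes visible.
# Assumes grep supports -o (GNU/BSD extension, not POSIX).
show_matches() {
    printf '%s\n' "$1" | grep -oE '(^| )test( |$)' | sed 's/.*/[&]/'
}

show_matches 'test-test test'
show_matches 'a test b'
```

The first call should print [ test] and the second [ test ], showing that the delimiting spaces are part of the match.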