Backreference in Awk regex
Is it possible to do this in Awk?:
echo "eoe" | sed -nr '/^(.*)o\1$/p'
Not in standard awk: POSIX awk uses POSIX EREs, which don't support back-references, and \1 means the 0x1 character in awk (though there are some ambiguities). It is possible with busybox awk, though, using:
busybox awk '$0 ~ "^(.*)o\\1$"'
(What that should do, that is, whether \1 in that regexp should match a literal 1, match the 0x1 character, or be unspecified, is unclear in the POSIX specification. My reading is that it should match a 0x1 character, but it doesn't with /usr/xpg4/bin/awk on Solaris 11, for instance, which is a certified OS; there it matches a literal 1 instead.)
With any awk, for that particular regexp, you could take another approach like:
awk 'length % 2 &&
substr($0, (length+1)/2, 1) == "o" &&
substr($0, 1, (length-1)/2) == substr($0, (length+3)/2)'
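To see why this works: a line matching ^(.*)o\1$ must have odd length, an "o" exactly in the middle, and identical halves on either side. Here is a minimal sketch that wraps the awk program above in a shell function and runs it on a few sample lines. The function name mirror_o and the sample inputs are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical helper: same awk program as above, wrapped for reuse.
mirror_o() {
  awk 'length % 2 &&
       substr($0, (length+1)/2, 1) == "o" &&
       substr($0, 1, (length-1)/2) == substr($0, (length+3)/2)'
}

# "eoe", "abcoabc" and "o" have the form X o X; "eoa" and "abab" do not.
printf '%s\n' eoe abcoabc o eoa abab | mirror_o
```

With these inputs, only eoe, abcoabc and o should be printed; a lone "o" matches because both halves are then empty.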
As mentioned above, POSIX EREs don't support back-references. GNU sed with -r uses EREs, but those are GNU EREs, which support back-references as an extension over the standard. What that means is that
grep -Ex '(.*)o\1'
(or the same with egrep) is not portable. However:
grep -x '\(.*\)o\1'
is POSIX and portable. POSIX BREs do support back-references, as did historical implementations of grep. perl regexps and PCREs support back-references as well, so you can do:
perl -lne 'print if /^(.*)o\1$/'
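To compare the two portable forms side by side, here is a minimal sketch that feeds the same lines to the BRE grep and to perl. The function names bre_match and perl_match and the sample inputs are illustrative assumptions, not part of the original answer; it also assumes perl is installed.

```shell
# Hypothetical wrappers around the two commands discussed above.
bre_match() {
  # POSIX BRE: \( \) form the group, \1 refers back to it;
  # -x requires the whole line to match, so no ^ and $ are needed.
  grep -x '\(.*\)o\1'
}

perl_match() {
  # perl regexp: ( ) form the group, \1 is the back-reference;
  # the anchors are written explicitly.
  perl -lne 'print if /^(.*)o\1$/'
}

printf '%s\n' eoe abcoabc eoa | bre_match
printf '%s\n' eoe abcoabc eoa | perl_match
```

Both commands should print eoe and abcoabc and skip eoa, since "e" and "a" differ.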