Awk substr index 0

I just discovered that substr() in awk accepts either 0 or 1 as the index of the first character of a string. I tested this in Gawk 5.1.0 and macOS awk 20070501.

awk 'BEGIN {print substr("abcd", 0, 1)}'

outputs "a", as does

awk 'BEGIN {print substr("abcd", 1, 1)}'


awk 'BEGIN {print substr("abcd", 2, 1)}'

outputs "b" just to prove that nothing’s obviously wrong.

I didn’t see anything in the man pages or the Gawk info file other than mentions of 1-indexing.

For consistency with the documentation and with the fact that index() returns 1 for the first position and 0 for no match, it would be good policy to always use 1.

My question: why does substr() accept both values? Is this documented anywhere, and do other awk implementations behave the same way?

From the GNU awk online documentation: ‘substr() function’:

If start is less than one, substr() treats it as if it was one. (POSIX
doesn’t specify what to do in this case: BWK awk acts this way, and
therefore gawk does too.) If start is greater than the number of
characters in the string, substr() returns the null string. Similarly,
if length is present but less than or equal to zero, the null string
is returned.
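
The rule quoted above (a start below one is treated as one) can be sketched as a small wrapper, for cases where a script should get the same result on any awk. This is only an illustration: clamp_substr and the shell function clamp_demo are hypothetical names, not part of gawk or any other awk.

```shell
# Hypothetical demo: clamp_demo and clamp_substr are illustrative names only.
clamp_demo() {
    awk '
    # Force start to be at least 1 before calling substr(), mirroring the
    # documented gawk/BWK rule instead of relying on the implementation.
    function clamp_substr(s, start, len) {
        if (start < 1)
            start = 1
        return substr(s, start, len)
    }
    BEGIN {
        print clamp_substr("abcd", 0, 1)     # same as substr("abcd", 1, 1)
        print clamp_substr("12345", -3, 3)   # same as substr("12345", 1, 3)
    }'
}

clamp_demo
```

Since POSIX leaves a start below one unspecified, clamping before the call keeps the result independent of the implementation.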

Answered By: αғsнιη

The inconsistent handling of a non-positive starting index in substr() has long been one of the criteria I’ve used in my own module to identify which awk variant it’s running on:

function _testawk_util20(_) {

    # Result of substr("12345", -3, 3), by variant:
    # mawk 1  :: [12345]
    # mawk 2  :: []
    # [ng]awk :: [123]

    return substr("12345", -3, 3)
}
Answered By: RARE Kpop Manifesto