Why does the POSIX for-loop allow a `for name in ; do` (no words in list) syntax?
I’m trying to understand the for-loop syntax in the POSIX shell grammar rules:
for_clause : For name do_group
           | For name sequential_sep do_group
           | For name linebreak in sequential_sep do_group
           | For name linebreak in wordlist sequential_sep do_group
The last form (pretty much the default) and the first two make sense. Quoting from POSIX:
Omitting:
in word...
shall be equivalent to:
in "$@"
so the first two are just looping over the arguments to the shell script.
I just don’t understand how a for-loop consisting of `for name in; do echo $name; done` makes sense. I thought it would still just loop over the arguments to the shell script, but after writing a small script, that doesn’t seem to be the case. The for loop just seems to get ignored.
So what is the purpose of that third variation?
It’s just that `wordlist` is defined as:
wordlist : wordlist WORD
         | WORD
So one or more words, but the `for var in ...; do ...; done` loop can loop over zero or more words, so it needs `For name linebreak in sequential_sep do_group` to cover the case of zero words.
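As a minimal sketch of how the grammar forms differ at run time, the two functions below (the names `count_implicit` and `count_empty` are made up for illustration) count loop iterations with the `in` clause omitted and with an empty `in` clause:

```sh
#!/bin/sh
# count_implicit: no "in" clause at all, so the loop iterates over "$@".
count_implicit() {
    n=0
    for arg do
        n=$((n + 1))
    done
    echo "$n"
}

# count_empty: "in" followed by zero words, so the body never runs,
# regardless of the positional parameters.
count_empty() {
    n=0
    for arg in; do
        n=$((n + 1))
    done
    echo "$n"
}

count_implicit a "b c" d   # prints 3
count_empty a "b c" d      # prints 0
```

The second function is the third grammar production in action: syntactically valid, but a loop over nothing.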
Having `wordlist` defined as zero or more words:
wordlist : wordlist WORD
         | /* empty */
would probably have been less confusing.
If you look at SUSv2, there was:
for_clause : For name linebreak do_group
           | For name linebreak in wordlist sequential_sep do_group
wordlist   : wordlist WORD
           | WORD
Meaning that an empty list was not allowed. So it looks like what happened is that in SUSv3, they fixed that omission by adding:
For name linebreak in sequential_sep do_group
rather than changing the definition of `wordlist`.
Note that we’re talking about the shell language grammar here, so it’s not about `for i in $var; do ...; done` or `for i in $(some cmd); do ...; done`, where the `$var`/`$(some cmd)` expansions could result in an empty list. Those `$var` and `$(some cmd)` are each one WORD token in the POSIX formalisation of the shell language.
We are talking about `for i in; do ...; done`, where there is literally no WORD between the `in` and the `do`.
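To make the distinction between grammar and expansion concrete, here is a small sketch (the variable names `empty`, `unquoted` and `quoted` are chosen for illustration). In both loops the grammar sees exactly one WORD after `in`; only the result of expansion differs:

```sh
#!/bin/sh
empty=

unquoted=0
for i in $empty; do      # one WORD token; expands to zero fields
    unquoted=$((unquoted + 1))
done

quoted=0
for i in "$empty"; do    # one WORD token; expands to one empty field
    quoted=$((quoted + 1))
done

echo "unquoted: $unquoted, quoted: $quoted"   # unquoted: 0, quoted: 1
```

Neither loop uses the `For name linebreak in sequential_sep do_group` production; that one only applies when nothing at all is written between `in` and the separator.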
I agree that code is not particularly useful. There is little reason for one to write a loop that explicitly loops over nothing. Some reasons that one could think of:
Generated code:
if ...; then
  list=' "foo" "bar baz" "$var" '
else
  list=
fi
eval '
  for i in '"$list"'; do
    printf "%s\n" "$i"
  done
'
Or anything that generates `sh` scripts and constructs a `for` loop over explicit arguments.
Or to comment out some code:
for commented_out in; do
this code is commented out
done
Though something like:
:||:<<'EOF'
this code is commented out and (harmless) even
if it contains invalid syntax
EOF
Would be more idiomatic.
I did not find the change request that led to the modification of the standard, but I suppose they changed it mainly because all existing shell implementations allowed `for i in; do ...; done`, so there was no point in forbidding it in the standard.
You’ll find that it doesn’t allow `if then ...; else ...; fi`, nor `if; then ...; else ...; fi`, nor `if ...; then;else ...; fi`¹ (but obviously allows `if $empty; then $empty; else $empty; fi`), even though there’s no really good reason to forbid them, because in practice most implementations (`zsh` being a notable exception) choke on them, starting with the Bourne shell, which introduced the construct, and the Korn shell, on which the POSIX `sh` specification is based.
¹ Actually, in the Bourne shell, which did not have the `!` keyword to negate the exit status of a pipeline, you had to write `if cmd; then :; else command if cmd fails; fi` in place of `if ! cmd; then command if cmd fails; fi`, that is, use an explicit `:` null command in the `then` part, as the Bourne shell did not allow an empty `then` part.
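A rough sketch of the two spellings from the footnote, wrapped in hypothetical helper functions (`report_failure_old` and `report_failure_new` are names invented here) that run a command and print a message only when it fails:

```sh
#!/bin/sh
# Bourne-era spelling: the then-part cannot be empty, so it holds
# the ":" null command and the real work goes in the else-part.
report_failure_old() {
    if "$@"; then
        :
    else
        echo "failed"
    fi
}

# POSIX spelling: "!" negates the exit status of the pipeline.
report_failure_new() {
    if ! "$@"; then
        echo "failed"
    fi
}

report_failure_old false   # prints "failed"
report_failure_new false   # prints "failed"
report_failure_old true    # prints nothing
```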
The reasoning for this behavior is best illustrated with an example. Assume you have a variable named `items` and you need to call a function on each word in that variable, but it might evaluate to an empty string in some cases.
With POSIX-compliant behavior, you can just use:
for x in ${items} ; do
    do_something "$x"
done
Here, it doesn’t matter if `${items}` evaluates to an empty string or not, because if it does, the loop will have nothing to loop over and will just be skipped.
But if the shell handled the case of an empty word list differently (either not at all, or equivalent to one of the forms without the `in`), then you would instead need (at minimum):
if [ -n "${items}" ]; then
    for x in ${items} ; do
        do_something "$x"
    done
fi
So this behavior saves you (at minimum) a level of indentation and an extra conditional check in the relatively common case of having to iterate over a potentially empty list of items of unknown length. Many other languages with for loops that iterate over a collection of items (such as Python or JavaScript) behave in exactly the same manner. They just explicitly require a variable (or literal) to be provided to the loop, while shell script does not appear to because of how a literal list of items to loop over is defined (namely, an empty one is just an unquoted empty string).