When will `find . -exec COMMAND {} +` execute COMMAND multiple times?

If I do

find . -exec echo {} +

it prints all paths on one line, i.e. the command echo is executed only once.

But according to man find,

-exec command {} +
    ... the number of invocations of the command will 
be much  less  than  the  number  of matched files. ...

It seems that in some circumstances the command will be executed multiple times. Am I right? Please exemplify.

Asked By: frozen-flame

||

POSIX systems impose a maximum length on the argument list of a new process. find splits the execution into several invocations if the file paths together are longer than this. To see the limit on Linux, use xargs --show-limits (this doesn't work on Mac OS; if someone knows a better alternative, please comment here).

edit: stolen straight from Gnouc's answer, the POSIX way to get the maximum length of the argument list is getconf ARG_MAX. However, I ran an experiment on my Mac OS machine, and it looks like find uses a little more than half that number. This is consistent with the fact that, on systems where it works, xargs --show-limits tells us that it won't use the maximum argument length either (it too will use about half that number). However, I couldn't find an explanation for that.

edit 2: it seems that the only reliable way to determine how many parameters find will stick together for each invocation is to experiment, for example by running

find / -exec echo {} + | wc -cl

Since the output has one line per echo invocation (assuming no path contains a newline), wc -l counts the invocations. The total number of bytes echoed is given by wc -c. Dividing the second by the first gives the average number of bytes in the parameters for each command invocation (albeit a slightly lower value than the actual limit, because of rounding: each invocation falls short by roughly half the average length of a path on your system).
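A more controlled variant of this experiment can be sketched as follows. It is only an illustration under a few assumptions: mktemp -d is available, and the file count (2000) and the 200-character name padding are arbitrary values chosen to make the combined path length large. Instead of echo it runs a small inner sh that prints "$#", the number of pathnames it received, so file names containing newlines cannot skew the count.

```shell
#!/bin/sh
# Sketch: count how often find runs the command given to "-exec ... {} +".
# Assumption: mktemp -d exists; n and the padding width are arbitrary.
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT

pad=$(printf '%0200d' 0)   # 200 zeros, used to make every path long
n=2000
i=0
while [ "$i" -lt "$n" ]; do
    : > "$dir/$pad$i"      # create an empty file without spawning a process
    i=$((i + 1))
done

# Each invocation of the inner sh prints how many pathnames it was given.
counts=$(find "$dir" -type f -exec sh -c 'echo "$#"' sh {} +)

batches=$(printf '%s\n' "$counts" | wc -l | tr -d ' ')
total=$(printf '%s\n' "$counts" | awk '{ s += $1 } END { print s }')
echo "invocations: $batches, pathnames: $total"
```

The pathname total should always equal the number of files created. The number of invocations depends on the find implementation and the system limits, so it may be 1 on one machine and larger on another; raising n makes more than one invocation more likely.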

Answered By: pqnet

POSIX defines find -exec utility_name [argument …] {} + as:

The end of the primary expression shall be punctuated by a <semicolon>
or by a <plus-sign>. Only a <plus-sign> that immediately follows an
argument containing only the two characters “{}” shall punctuate the
end of the primary expression. Other uses of the <plus-sign> shall not
be treated as special. If the primary expression is punctuated by a
<semicolon>, the utility utility_name shall be invoked once for each
pathname and the primary shall evaluate as true if the utility returns
a zero value as exit status. A utility_name or argument containing
only the two characters “{}” shall be replaced by the current
pathname. If a utility_name or argument string contains the two
characters “{}”, but not just the two characters “{}”, it is
implementation-defined whether find replaces those two characters or
uses the string without change.

If the primary expression is punctuated by a <plus-sign>, the primary
shall always evaluate as true, and the pathnames for which the primary
is evaluated shall be aggregated into sets. The utility utility_name
shall be invoked once for each set of aggregated pathnames. Each
invocation shall begin after the last pathname in the set is
aggregated, and shall be completed before the find utility exits and
before the first pathname in the next set (if any) is aggregated for
this primary, but it is otherwise unspecified whether the invocation
occurs before, during, or after the evaluations of other primaries. If
any invocation returns a non-zero value as exit status, the find
utility shall return a non-zero exit status. An argument containing
only the two characters “{}” shall be replaced by the set of
aggregated pathnames, with each pathname passed as a separate argument
to the invoked utility in the same order that it was aggregated. The
size of any set of two or more pathnames shall be limited such that
execution of the utility does not cause the system’s {ARG_MAX} limit
to be exceeded. If more than one argument containing the two
characters “{}” is present, the behavior is unspecified.

When adding another pathname to the current set would exceed the system's ARG_MAX, find executes the command with the set aggregated so far and starts a new set.

You can get ARG_MAX using getconf:

$ getconf ARG_MAX
2097152

On some systems the actual value of ARG_MAX can differ; you can refer here for more details.
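Since ARG_MAX covers the environment as well as the arguments passed to exec, a rough upper bound on the room left for pathnames can be sketched like this. The variable names are illustrative, and the environment size is only an approximation (it ignores per-string bookkeeping overhead and any extra headroom a particular find implementation reserves).

```shell
#!/bin/sh
# Sketch: estimate how much of ARG_MAX is left for command-line arguments.
arg_max=$(getconf ARG_MAX)

# Approximate size of the current environment in bytes
# (one "NAME=value" entry per variable, plus one terminator byte each).
env_bytes=$(env | wc -c | tr -d ' ')

headroom=$((arg_max - env_bytes))
echo "ARG_MAX:            $arg_max"
echo "environment (est.): $env_bytes"
echo "left for arguments: $headroom (upper bound)"
```

The limit find actually applies per invocation can be well below this bound, so treat the result as a ceiling rather than a prediction.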

Answered By: cuonglm