Properly escaping output from pipe in xargs
Example:
% touch -- safe-name -name-with-dash-prefix "name with space" 'name-with-double-quote"' "name-with-single-quote'" 'name-with-backslash\'
xargs can’t seem to handle double quotes:
% ls | xargs ls -l
xargs: unmatched double quote; by default quotes are special to xargs unless you use the -0 option
ls: invalid option -- 'e'
Try 'ls --help' for more information.
If we use the -0 option, it has trouble with the name that has a dash prefix:
% ls -- * | xargs -0 -- ls -l --
ls: invalid option -- 'e'
Try 'ls --help' for more information.
And this is before trying other potentially problematic characters such as newlines, control characters, etc.
For xargs to understand the -0 null-delimited input option, the sending party must also apply the null delimiter to the data it sends. Otherwise the two sides are not synchronized.
One option is the GNU find command, which can emit such delimiters:
find . -maxdepth 1 ! -name . -print0 | xargs -0 ls -ld
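To see that synchronization in practice, here is a minimal sketch that creates a few awkward names in a scratch directory and counts how many arguments actually reach the command. The scratch directory, the file names and the count variable are illustrative assumptions; -print0 and -0 are extensions found in GNU and BSD tools rather than POSIX options.

```shell
# Scratch directory so the sketch does not touch real files (hypothetical names).
dir=$(mktemp -d)
touch -- "$dir/safe-name" "$dir/-dash" "$dir/with space" \
         "$dir/with\"dquote" "$dir/with'squote"

# find emits NUL-terminated paths and xargs -0 splits on NUL only, so quotes,
# spaces and leading dashes pass through untouched.
# sh -c 'echo "$#"' sh prints the number of arguments it received.
count=$(find "$dir" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)
echo "$count"

rm -rf -- "$dir"
```

If both sides agree on NUL as the delimiter, the count should match the number of files created.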
As you said, xargs doesn’t like unmatched double quotes unless you use -0, but -0 only makes sense if you feed it null-terminated data. So, this fails:
$ echo * | xargs
xargs: unmatched double quote; by default quotes are special to xargs unless you use the -0 option
name-with-backslash -name-with-dash-prefix
But this works:
$ printf '%s\0' -- * | xargs -0
-- name-with-backslash\ -name-with-dash-prefix name-with-double-quote" name-with-single-quote' name with space safe-name
In any case, your basic approach is not really the best way to do this. Instead of fiddling about with xargs and ls and whatnot, just use shell globs instead:
$ for f in *; do ls -l -- "$f"; done
-rw-r--r-- 1 terdon terdon 4142 Aug 11 16:03 a
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 'name-with-backslash\'
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 -name-with-dash-prefix
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 'name-with-double-quote"'
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 "name-with-single-quote'"
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 'name with space'
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 safe-name
The POSIX specification does give you an example for that:
ls | sed -e 's/"/"\\""/g' -e 's/.*/"&"/' | xargs -E '' printf '<%s>\n'
(Since filenames are arbitrary sequences of bytes (other than / and NUL) while sed and xargs expect text, you’d also need to set the locale to C, where every non-NUL byte is a valid character, to make that reliable. Even then it can fail with xargs implementations that have a very low limit on the maximum length of an argument.)
The -E '' is needed for some xargs implementations that, without it, would treat a _ argument as the end of input (where echo a _ b | xargs outputs only a, for instance).
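The quoting step can be tried out in isolation. The following is a minimal sketch, assuming a scratch directory and hypothetical file names (including one named _ to exercise the -E '' point); it fixes the locale to C for the whole pipeline and still assumes that no name contains a newline.

```shell
# Scratch directory with names that contain a space, double quotes,
# a single quote, and a lone underscore (all hypothetical).
dir=$(mktemp -d)
touch -- "$dir/a b" "$dir/say \"hi\"" "$dir/it's" "$dir/_"

out=$(
  cd "$dir" || exit
  # In the C locale every non-NUL byte is a valid character.
  LC_ALL=C; export LC_ALL
  # Rewrite each embedded " as "\"" (close quote, escaped quote, reopen quote),
  # then wrap the whole line in double quotes for xargs.
  ls | sed -e 's/"/"\\""/g' -e 's/.*/"&"/' | xargs -E '' printf '<%s>\n'
)
printf '%s\n' "$out"

rm -rf -- "$dir"
```

Each name should come back between angle brackets, one per line, with its quotes and spaces intact.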
With GNU xargs, you can use:
ls | xargs -rd '\n' printf '<%s>\n'
(also adding -r (also a GNU extension) so that the command is not run if the input is empty).
GNU xargs also has a -0 option that has been copied by a few other implementations, so:
ls | tr '\n' '\0' | xargs -0 printf '<%s>\n'
is slightly more portable.
All of those assume the file names don’t contain newline characters. If there may be filenames with newline characters, the output of ls is simply not post-processable. If you get:
a
b
That can be either two files called a and b, or one file called a<newline>b; there’s no way to tell.
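The ambiguity is easy to reproduce. This is a small sketch, assuming a scratch directory and an ls that writes names literally when its output is a pipe (the default for GNU ls); the variable names are illustrative.

```shell
dir=$(mktemp -d)
# A single file whose name contains a newline.
touch -- "$dir/a
b"

# Line-based view: count the lines that ls writes to a pipe.
# The arithmetic expansion strips any padding that wc may add.
lines=$(( $(cd "$dir" && ls | wc -l) ))

# Shell view: the glob expands to one word per file, whatever the name contains.
set -- "$dir"/*
files=$#

echo "lines from ls: $lines, actual files: $files"
rm -rf -- "$dir"
```

The line count and the file count should disagree, which is exactly why a newline-delimited listing cannot be trusted.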
GNU ls has a --quoting-style=shell-always option which makes its output unambiguous and post-processable, but the quoting is not compatible with the quoting expected by xargs. xargs recognises the "...", \x and '...' forms of quoting. But both "..." and '...' are strong quotes and can’t contain newline characters (only \ can escape newline characters for xargs), so that’s not compatible with sh quoting, where only '...' are strong quotes (and can contain newline characters) but \<newline> is a line-continuation (is removed) instead of an escaped newline.
You can use the shell to parse that output and then output it in a format expected by xargs:
eval "files=($(ls --quoting-style=shell-always))"
[ "${#files[@]}" -eq 0 ] || printf '%s\0' "${files[@]}" |
  xargs -0 printf '<%s>\n'
Or you can have the shell get the list of files and pass it NUL-delimited to xargs. For instance:
- with zsh:
  print -rNC1 -- *(N) | xargs -r0 printf '<%s>\n'
- with ksh93:
  (set -- ~(N)*; (($# == 0)) || printf '%s\0' "$@") | xargs -r0 printf '<%s>\n'
- with fish:
  begin set -l files *; string join0 -- $files; end | xargs -r0 printf '<%s>\n'
- with bash:
  (
    shopt -s nullglob
    set -- *
    (($# == 0)) || printf '%s\0' "$@"
  ) | xargs -r0 printf '<%s>\n'
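For a plain POSIX sh, which has no nullglob, the same idea can be sketched as follows. This variant is an assumption of this note rather than one of the shells listed above: it detects an unmatched glob by checking whether the first expanded word exists, and it relies on xargs -0, which is a common extension rather than a POSIX option. The scratch directory and file names are hypothetical.

```shell
dir=$(mktemp -d)
touch -- "$dir/-dash" "$dir/with space" "$dir/with
newline"

count=$(
  cd "$dir" || exit
  set -- *
  # Without nullglob an unmatched * stays literal, so drop it if no such file exists.
  [ -e "$1" ] || [ -L "$1" ] || set --
  # Only run the pipeline when there is something to send; each name is NUL-terminated.
  [ "$#" -eq 0 ] || printf '%s\0' "$@" | xargs -0 sh -c 'echo "$#"' sh
)
echo "$count"

rm -rf -- "$dir"
```

The name containing a newline should be counted once, since only NUL separates the arguments.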
2023 Edit. Since version 9.0 of GNU coreutils (September 2021), GNU ls has a --zero option that can be used in conjunction with xargs -r0:
ls --zero | xargs -r0 printf '<%s>\n'
It is extremely silly to try to parse the output of a command (ls) that is not designed to be parsed, in order to feed another command that is not designed to deal with several characters (for example, newlines and {}), when the shell does that by itself:
set -- *; for f; do echo "<$f>"; done
set -- *
for f
do ls -- "$f"
done
Or, in one command line:
$ set -- *; for f; do echo "<$f>"; done
<name-with-backslash\>
<-name-with-dash-prefix>
<name-with-double-quote">
<name-with-single-quote'>
<name with space>
<safe-name>
<with_a
newline>
Note that this handles newlines perfectly well (the last filename in the output contains one as an example).
Or, if the number of files makes the shell slow, use find:
$ find ./ -type f -exec echo '<{}>' \;
<./safe-name>
<./with_a
newline>
<./name-with-double-quote">
<./-name-with-dash-prefix>
<./name with space>
<./name-with-single-quote'>
<./name-with-backslash\>
Just mind that find processes dot-files and sub-directories differently than the shell does.
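If the goal is to get roughly what the * glob would give (no hidden files, no descent into sub-directories), one possible sketch uses only POSIX find primaries. The scratch directory, the file names and the count variable below are illustrative assumptions.

```shell
dir=$(mktemp -d)
mkdir -- "$dir/sub"
touch -- "$dir/sub/nested" "$dir/.hidden" "$dir/visible" "$dir/with space"

count=$(
  cd "$dir" || exit
  # "! -name . -prune" keeps find from descending below the top level,
  # "! -name '.*'" skips dot-files, and "{} +" passes many names per invocation.
  find . ! -name . -prune ! -name '.*' -exec sh -c 'echo "$#"' sh {} +
)
echo "$count"

rm -rf -- "$dir"
```

Like the glob, this still lists the sub directory itself, but it does not report the hidden file or the file nested inside sub.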