Retrieve large number of files from remote server with wildcards
I’m trying to download a large number of files from a remote server. Part of the path is known, but there’s a folder name that’s randomly generated, so I think I have to use wildcards. The path is something like this:
/home/myuser/files/<random folder name>/*.ext
So I was trying this:
rsync -av myuser@myserver.com:~/files/**/*.ext ./
This gives me the following error:
bash: /usr/bin/rsync: Argument list too long
I also tried scp instead of rsync but got the same error. It seems the shell expands the wildcard into a list of files that is too long to pass as arguments.
What’s the right way to achieve this?
Instead of letting the remote shell expand a glob that results in a too-long argument list, use --include and --exclude filters to transfer only the files that you want:

rsync -aim --include='*/' --include='*.ext' --exclude='*' myuser@myserver.com:files ./
This gives you a directory called files in the current directory. Beneath it, you will find the parts of the remote directory structure that contain the .ext files, including the files themselves. Directories without .ext files do not appear on the target side because we use -m (--prune-empty-dirs).
With the --include and --exclude filters, we include any directory (needed for recursion) and any name matching *.ext, then exclude everything else. These filters work on a "first match wins" basis, which means the --exclude='*' filter must come last. The rsync utility evaluates the filters as it traverses the source directory structure.
If you then want to move all the synced files into the current directory (ignoring the possibility of name clashes), you could use find like so:

find files -type f -name '*.ext' -exec mv {} . \;

Note the backslash before the semicolon: it must be escaped so that find, not the shell, receives it as the terminator of the -exec command.
This looks for regular files in or beneath files whose names match the pattern *.ext. Matching files are moved to the current directory with mv.
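The flattening step can also be tried safely on a scratch copy first (a sketch; the /tmp/movedemo paths and file names are made up for the demo):

```shell
# A stand-in for the synced tree: one .ext file nested in a subdirectory,
# plus a non-matching file that should stay where it is.
mkdir -p /tmp/movedemo/files/sub
touch /tmp/movedemo/files/sub/one.ext /tmp/movedemo/files/sub/two.txt

cd /tmp/movedemo
# Move every regular *.ext file found beneath files/ into the current directory.
find files -type f -name '*.ext' -exec mv {} . \;

ls
```

If two subdirectories contain files with the same name, later moves will overwrite earlier ones; passing -n (or -i) to mv is a simple guard against that.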