How to rsync only new files

I am trying to set up rsync to synchronize my main web server to the remote server by adding newly generated files to the latter.

Here is the command that I use:

rsync -avh --update -e "ssh -i /path/to/thishost-rsync-key" remoteuser@remotehost:/foo/bar /foo/bar

But it seems that the web server actually transfers all files despite the --update flag. I have tried different flag combinations (e.g. omitting -a and using -uv instead), but none helped. How can I modify the rsync command to send out only newly added files?

Asked By: supermario

||

From man rsync:

--ignore-existing       skip updating files that exist on receiver

--update does something slightly different, which is probably why you are getting unexpected results (see man rsync):

This forces rsync to skip any files which exist on the destination and have a modified time that is newer than the source file. (If an existing destination file has a modification time equal to the source file’s, it will be updated if the sizes are different.)

Answered By: Chris Down

In my experience with rsync, copying a 1 TB partition in a single run is too large to be efficient. It takes rsync forever to process it. Instead, do it by subdirectory: run rsync once for each main subdirectory. It goes a lot faster if it doesn’t have to juggle tens of thousands of files at once.

Answered By: turgut kalfaoglu

The issue might be caused by different user/group IDs on source and target servers. My case was similar, having all files transferred instead of only the modified/new ones. The solution was to use parameters -t (instead of -a), and -P (equivalent to --partial --progress):

rsync -h -v -r -P -t <source> <target>

or shorter (thanks @Manngo):

rsync -hvrPt <source> <target>

This transfers only new files and files that already exist but have been modified. Parameter -a does too much, like user and group ID sync, which in my case cannot work as I have different users and groups on my source and target systems. Therefore, with -a all my source and target files were always regarded as "different".

The parameters in detail:

  • -h: human readable numbers
  • -v: verbose
  • -r: recurse into directories
  • -P: --partial (keep partially transferred files) +
            --progress (show progress during transfer)
  • -t: preserve modification times
Answered By: t0r0X

The problem you describe is probably because you are missing the trailing / on the source directory. As a result, rsync will copy all the files twice: the first time to /foo/bar and the second time to /foo/bar/bar. Thereafter it will efficiently copy updates to /foo/bar/bar.

The correct command should be this:

rsync -avh -e "ssh -i /path/to/thishost-rsync-key" remoteuser@remotehost:/foo/bar/ /foo/bar
Answered By: roaima

Note that rsync --ignore-existing -raz --progress /var/www/88021064/var xx@server.tld:/usr/home/xxx/public_html/var/ will not do anything.
The way it worked for me is rsync --ignore-existing -raz --progress /var/www/88021064/var/* xx@server.tld:/usr/home/xxx/public_html/var/
Note that this will not update hidden files, because * does not match names that start with a dot. For hidden files, run the command one more time with /var/www/88021064/var/.[^.]* as the source.

Answered By: David Bister