How to sync two folders with command line tools?

Having migrated to Linux from Windows, I would like to find an alternative to WinMerge or, better, learn command line tools to compare and sync two folders on Linux. I would be grateful if you could tell me how to do the following tasks on the command line… (I have studied diff and rsync, but I still need some help.)

We have two folders: “/home/user/A” and “/home/user/B”

Folder A is the place where regular files and folders are saved and folder B is a backup folder that serves as a complete mirror of folder A. (Nothing is directly saved or modified by the user in folder B.)

My questions are:

  • How to list files that exist only in folder B? (E.g. the ones deleted from folder A since the last synchronization.)

  • How to copy files that exist in only folder B back into folder A?

  • How to list files that exist in both folders but have different timestamps or sizes? (The ones that have been modified in folder A since the last synchronization. I would like to avoid using checksums, because there are tens of thousands of files and it’d make the process too slow.)

  • How to make an exact copy of folder A into folder B? I mean, copy everything from folder A into folder B that exists only in folder A and delete everything from folder B that exists only in folder B, but without touching the files that are the same in both folders.

Asked By: user21417

||

This puts folder A into folder B:

rsync -avu --delete "/home/user/A" "/home/user/B"

If you want the contents of folders A and B to be the same, put /home/user/A/ (with the slash) as the source. This takes not the folder A but all of its content and puts it into folder B. Like this:

rsync -avu --delete "/home/user/A/" "/home/user/B"
  • -a archive mode: recurse into directories and preserve symlinks, permissions, modification times, owner and group
  • -v run verbosely
  • -u skip files that are newer in the target; if the modification times are equal, a file is still updated when its size differs
  • --delete delete files in the target folder that do not exist in the source

Manpage: https://download.samba.org/pub/rsync/rsync.html
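
To see the effect of the trailing slash without touching real data, here is a minimal sandbox sketch. It assumes rsync is installed; the temporary directory and the file names (sub/file.txt, old.txt, B1, B2) are made up for illustration.

```shell
#!/bin/sh
# Sandbox demo: how a trailing slash on the source changes the result.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found, skipping demo"; exit 0; }

tmp=$(mktemp -d)                      # throwaway directory, not your real data
mkdir -p "$tmp/A/sub" "$tmp/B1" "$tmp/B2"
echo "hello" > "$tmp/A/sub/file.txt"
echo "stale" > "$tmp/B2/old.txt"      # exists only in the target

# No trailing slash: folder A itself is created inside B1.
rsync -au --delete "$tmp/A" "$tmp/B1"

# Trailing slash: the contents of A are mirrored into B2,
# and old.txt is deleted because it does not exist in A.
rsync -au --delete "$tmp/A/" "$tmp/B2"

# Show the resulting layout of both targets.
find "$tmp/B1" "$tmp/B2" | sort
```

Adding -n (--dry-run) to either rsync command makes it only report what it would do, which is worth trying before running --delete against a real backup.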

Answered By: TuxForLife

You could use the unison tool, developed by Benjamin Pierce at U Penn.

Let us assume you have two directories,

/home/user/Documents/dirA/ and /home/user/Documents/dirB/

To synchronize these two, you may use:

~$ unison -ui text /home/user/Documents/dirA/ /home/user/Documents/dirB/

In its output, unison will display each directory and file that differs between the two directories you have asked it to sync. On the initial run it will recommend synchronizing additively (replicating missing files in both locations); it then creates and maintains a synchronization tree on your machine, and on subsequent runs it implements true synchronization (i.e., if you delete a file from .../dirA, it will get deleted from .../dirB as well). You can also review each change and optionally choose to forward or reverse synchronize between the two directories.

Optionally, to launch the graphical interface, simply remove the -ui text option from your command, although I find the CLI simpler and faster to use.

More on this: Unison tutorial at Unison user documentation.

Answered By: axolotl

You can use it this way:

rsync -avu --delete /home/user/A/* /home/user/B/

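This way you will copy the content of folder A into folder B, not folder A itself. Keep in mind that the shell expands the * before rsync runs, so hidden files at the top level of A are skipped and --delete does not remove top-level entries that exist only in B.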
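
A small sandbox sketch of how the A/* form differs from the A/ form, assuming rsync is installed and a shell with default globbing (no dotglob). The file names (notes.txt, .config, stale.txt) and the two target folders are invented for the demo.

```shell
#!/bin/sh
# Sandbox demo: "A/*" versus "A/" as the rsync source.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found, skipping demo"; exit 0; }

tmp=$(mktemp -d)
mkdir -p "$tmp/A" "$tmp/B_glob" "$tmp/B_slash"
echo "visible" > "$tmp/A/notes.txt"
echo "hidden"  > "$tmp/A/.config"
echo "stale"   > "$tmp/B_glob/stale.txt"    # exists only in the target
echo "stale"   > "$tmp/B_slash/stale.txt"   # exists only in the target

# The shell expands A/* first, so rsync is only handed notes.txt.
# .config is never seen, and the top level of B_glob is not a
# synchronized directory, so stale.txt is left alone.
rsync -au --delete "$tmp"/A/* "$tmp/B_glob/"

# With A/ rsync walks the directory itself: .config is copied
# and stale.txt is deleted.
rsync -au --delete "$tmp/A/" "$tmp/B_slash/"

echo "B_glob:";  ls -A "$tmp/B_glob"
echo "B_slash:"; ls -A "$tmp/B_slash"
```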

Answered By: Isaias Sanchez

The answer from TuxForLife is pretty good, but I strongly suggest you use -c when syncing locally. You can argue that it’s not worth the time/network penalty to do it for remote syncs, but it is totally worth it for local files because the speed is so great.

-c, --checksum
       This forces the sender to checksum every regular file using a 128-bit  MD4
       checksum.   It  does this during the initial file-system scan as it builds
       the list of all available files. The receiver then checksums  its  version
       of  each  file  (if  it exists and it has the same size as its sender-side
       counterpart) in order to decide which files need to be updated: files with
       either  a  changed  size  or a changed checksum are selected for transfer.
       Since this whole-file checksumming of all files on both sides of the  con-
       nection  occurs  in  addition to the automatic checksum verifications that
       occur during a file's transfer, this option can be quite slow.

       Note that rsync always verifies that each transferred file  was  correctly
       reconstructed  on  the receiving side by checking its whole-file checksum,
       but that automatic after-the-transfer verification has nothing to do  with
       this  option's  before-the-transfer  "Does  this file need to be updated?"
       check.

This shows how having the same size and time stamps can fail you.

The setup

$ cd /tmp

$ mkdir -p {A,b}/1/2/{3,4}

$ echo "___________from A" | 
      tee A/1/2/x  | tee A/1/2/3/y  | tee A/1/2/4/z  | 
  tr A b | 
      tee b/1/2/x  | tee b/1/2/3/y  | tee b/1/2/4/z  | 
      tee b/1/2/x0 | tee b/1/2/3/y0 >     b/1/2/4/z0

$ find A b -type f | xargs -I% sh -c "echo %; cat %;"
A/1/2/3/y
___________from A
A/1/2/4/z
___________from A
A/1/2/x
___________from A
b/1/2/3/y
___________from b
b/1/2/3/y0
___________from b
b/1/2/4/z
___________from b
b/1/2/4/z0
___________from b
b/1/2/x
___________from b
b/1/2/x0
___________from b

The rsync that copies nothing because the files all have the same size and timestamp

$ rsync -avu A/ b
building file list ... done

sent 138 bytes  received 20 bytes  316.00 bytes/sec
total size is 57  speedup is 0.36

$ find A b -type f | xargs -I% sh -c "echo %; cat %;"
A/1/2/3/y
___________from A
A/1/2/4/z
___________from A
A/1/2/x
___________from A
b/1/2/3/y
___________from b
b/1/2/3/y0
___________from b
b/1/2/4/z
___________from b
b/1/2/4/z0
___________from b
b/1/2/x
___________from b
b/1/2/x0
___________from b    

The rsync that works correctly because it compares checksums

$ rsync -cavu A/ b
building file list ... done
1/2/x
1/2/3/y
1/2/4/z

sent 381 bytes  received 86 bytes  934.00 bytes/sec
total size is 57  speedup is 0.12

$ find A b -type f | xargs -I% sh -c "echo %; cat %;"
A/1/2/3/y
___________from A
A/1/2/4/z
___________from A
A/1/2/x
___________from A
b/1/2/3/y
___________from A
b/1/2/3/y0
___________from b
b/1/2/4/z
___________from A
b/1/2/4/z0
___________from b
b/1/2/x
___________from A
b/1/2/x0
___________from b
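
If you want to reproduce the same-size, same-timestamp case deterministically, here is a compact sketch that only does dry runs. It assumes rsync is installed; f.txt and the temporary directory are made up, and touch -r is used to force identical modification times. Unlike the commands above it leaves out -u and adds -i and -n.

```shell
#!/bin/sh
# Sandbox demo: quick check (size + mtime) versus -c (checksum).
command -v rsync >/dev/null 2>&1 || { echo "rsync not found, skipping demo"; exit 0; }

tmp=$(mktemp -d)
mkdir -p "$tmp/A" "$tmp/B"
printf 'from A\n' > "$tmp/A/f.txt"
printf 'from B\n' > "$tmp/B/f.txt"        # same size, different content
touch -r "$tmp/A/f.txt" "$tmp/B/f.txt"    # give B's copy the same mtime as A's

# -n = dry run, -i = itemize changes; nothing is copied in either run.
quick=$(rsync -ain  "$tmp/A/" "$tmp/B/")  # compares size and mtime only
full=$(rsync -aicn "$tmp/A/" "$tmp/B/")   # also compares checksums

echo "quick check reports:"; echo "$quick"
echo "with -c reports:";     echo "$full"
```

The quick check should not mention f.txt at all, while the -c run should list it as a file to transfer. Either run may still show a line for the directory itself if the directory timestamps differ.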
Answered By: Bruno Bronosky

This is what I’m using for backing up personal files, where I don’t care about everything covered by -a, and want more useful information printed.

rsync -rtu --delete --info=del,name,stats2 "/home/<user>/<src>/" "/run/media/<user>/<drive>/<dst>"

From the rsync man page:

-r, --recursive
This tells rsync to copy directories recursively.

-t, --times
This tells rsync to transfer modification times along with the files and update them on the remote system.

-u, --update
This forces rsync to skip any files which exist on the destination and have a modified time that is newer than the source file. (If an existing destination file has a modification time equal to the source file’s, it will be updated if the sizes are different.)

--delete
This tells rsync to delete extraneous files from the receiving side (ones that aren’t on the sending side), but only for the directories that are being synchronized.

--info=FLAGS
This option lets you have fine-grained control over the information output you want to see.

From rsync --info=help

DEL        Mention deletions on the receiving side  
NAME       Mention 1) updated file/dir names, 2) unchanged names  
STATS      Mention statistics at end of run (levels 1-3)
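
Combined with -n (--dry-run), the del and name flags give a preview of what a sync would delete and copy without changing anything. This is only a sandbox sketch: it assumes an rsync new enough to support --info (3.1 or later), and the file names are invented for the demo.

```shell
#!/bin/sh
# Sandbox demo: preview deletions and updates with --info and a dry run.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found, skipping demo"; exit 0; }

tmp=$(mktemp -d)
mkdir -p "$tmp/A" "$tmp/B"
echo "same" > "$tmp/A/same.txt"
cp -p "$tmp/A/same.txt" "$tmp/B/same.txt"   # identical copy, mtime preserved
echo "new"  > "$tmp/A/only_in_A.txt"
echo "old"  > "$tmp/B/only_in_B.txt"

# -n reports what would happen; no file is copied or deleted.
report=$(rsync -rtun --delete --info=del,name "$tmp/A/" "$tmp/B/")
echo "$report"
```

Files that exist only in the destination show up as "deleting ..." lines, files that would be copied are listed by name, and unchanged files are not mentioned.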

While less explicit, this is seemingly equivalent and shorter:

rsync -rtuv --delete --info=stats2 "/home/<user>/<src>/" "/run/media/<user>/<drive>/<dst>"

-v, --verbose
A single -v will give you information about what files are being transferred and a brief summary at the end [stats1].

Answered By: Honest Abe

You might have a look at Fitus/Zaloha.sh. It is a synchronizer implemented as a bash shell script that uses only standard Unix commands. It is easy to use:

$ Zaloha.sh --sourceDir="test_source" --backupDir="test_backup"

Answered By: user380458

Old thread but used this to sync up two 2TB drives.

Using

 $ rsync -cavu --delete /home/user/A/* /home/user/B/

did not work (says "no such directory exists"), but using

 $ rsync -cavu --delete /home/user/A /home/user/B 

did work…

Answered By: DKeeler

Another option: https://github.com/mikkorantalainen/rsync-continuous

This script uses bash, inotifywait and rsync over ssh to create a very fast one-way sync: only modified files need to be transferred, and if only part of a file has been modified, rsync will transfer only the modified part.

Answered By: Mikko Rantalainen