mv with rsync functionality

I’m trying to merge two directory trees that have many common elements, but each also has elements that are present in only one of the two trees. The main issue I am having is that when mv encounters two sub-directories with the same relative paths it either keeps the source (with -f) or the destination (with -n), but I can’t make it take the union of both sub-directories. I could of course use rsync with --remove-source-files, but this will actually copy the data and then delete the old files, as opposed to a true move. The two directory trees contain several hundred GB of data and are both on the same partition, so I would love to do a true move if possible for the sake of time.

Right now I have find * -type f -exec mv -n {} /destination/{} \; but this only moves files from the source to the destination when the target directory already exists. I can preface this command with something like mv -n * /destination/ but this only brings over the top-level directories. Is there any way to do this in a single line? I could always write a script that checks whether the directory exists before moving each file, but this seems like such a basic task that there should be an easier way.

Asked By: Reed Espinosa

||

You could use prename to get what you want. On some distributions (e.g. Debian/Ubuntu) this should be installed by default and aliased to rename. Other distros may use a different rename. You could change to the directory above the source directory and do:

find source -exec prename 's:^source:/path/to/dest:' {} +

This will refuse to move files that already exist in the destination tree, and it will leave empty directories behind where the directory names overlap, so you will have to remove them afterwards. You can add the -f option to prename to make it overwrite existing files.

Example:

$ mkdir -p dir1/{common,sub1} dir2/{common,sub2}

$ touch dir1/sub1/file dir2/sub2/file dir1/common/common dir2/common/common dir1/common/diff1 dir2/common/diff2

$ tree dir*
dir1
├── common
│   ├── common
│   └── diff1
└── sub1
    └── file
dir2
├── common
│   ├── common
│   └── diff2
└── sub2
    └── file

4 directories, 6 files

$ find dir2 -depth -exec rename 's/^dir2/dir1/' {} +
Can't rename dir2/sub2/file dir1/sub2/file: No such file or directory
dir2/common/common not renamed: dir1/common/common already exists
dir2/common not renamed: dir1/common already exists
dir2 not renamed: dir1 already exists

$ tree dir*
dir1
├── common
│   ├── common
│   ├── diff1
│   └── diff2
├── sub1
│   └── file
└── sub2
    └── file
dir2
└── common
    └── common

4 directories, 6 files
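
The directories left behind in the source tree can be pruned once the merge is done. Here is a minimal sketch, assuming a find that supports -empty and -delete (GNU and BSD find do, but neither is in POSIX). The tree under mktemp is a made-up stand-in for dir2, not real data:

```shell
#!/bin/sh
# Sketch: prune empty directories left in the source tree after a merge.
# Assumption: find supports -empty and -delete (GNU/BSD extensions).
set -eu

# Hypothetical leftover tree, built in a scratch directory for the demo.
work=$(mktemp -d)
mkdir -p "$work/dir2/common" "$work/dir2/sub2/deeper"
touch "$work/dir2/common/common"   # a file that collided and was not moved

# -delete implies -depth, so nested empty directories are removed
# before their parents are tested with -empty.
find "$work/dir2" -type d -empty -delete

# Show what is left: only the directory that still holds a file.
find "$work/dir2" | sort
```

Only directories that are already empty are removed, so any file that was refused because of a collision stays where it is and can be inspected by hand.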

Update:

To give a source for prename: it usually comes bundled with perl (hence the ‘p’). On Debian/Ubuntu it is part of the perl package. If you want to get it separately, one of the answerers to the question “Get the Perl rename utility instead of the built-in rename” has made a separate repository for it: https://github.com/subogero/rename

Answered By: Graeme

Warning: This answer will not work on file systems that do not support hard links (e.g., FAT).

Another option (which may be more portable) is

cd source_directory
find . -type f -print0 | cpio --pass-through --null --link --make-directories dest_dir

cpio (copy in & out) is a dinosaur that predates tar. Like tar, it can create or extract from archives. Unlike tar (correct me if I’m wrong), it can copy directory trees with a single command; that’s what --pass-through means. (I guess you could do this with tar -cf - source option(s) and argument(s) | tar -xf - destination option(s) and argument(s).) The other options:

  • --null means “expect filenames to be delimited by nulls”; i.e., read output from find … -print0.
  • --link means “link files from the source directory into the destination directory, if possible”.
  • --make-directories needs no explanation.

This can be abbreviated to cpio -p0ld dest_dir
Add --verbose or -v if you want.

Then, after this finishes,

  • Check for collisions and handle appropriately.
  • Verify that your destination directory is populated with hard links.
  • Remove the source directory.
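
The three follow-up steps above could be sketched roughly as follows. This is an illustration, not a tested recipe for real data: the scratch tree, the file names, and the use of ln to stand in for what cpio --link would have produced are all assumptions made for the demo. It also assumes a shell whose test supports -ef (bash, dash and busybox do).

```shell
#!/bin/sh
# Sketch: after the cpio --link pass, remove only those source files that
# are verifiably hard links to their destination counterparts.
set -eu

# Hypothetical demo tree in a scratch directory.
work=$(mktemp -d)
src=$work/source_directory
dst=$work/dest_dir
mkdir -p "$src/sub" "$dst/sub"
echo moved > "$src/sub/a"
echo mine  > "$src/sub/b"
echo other > "$dst/sub/b"        # pre-existing destination file: a collision
ln "$src/sub/a" "$dst/sub/a"     # stands in for the link cpio would create

cd "$src"
# "$f" -ef "$dst/$f" is true only when both names refer to the same inode,
# i.e. the destination really is a hard link to the source file.
find . -type f -exec sh -c '
    dst=$1; shift
    for f do
        if [ "$f" -ef "$dst/$f" ]; then
            rm -- "$f"
        else
            printf "kept (collision or not linked): %s\n" "$f"
        fi
    done' sh "$dst" {} +
```

In this demo sub/a is removed from the source because the destination holds the same inode, while sub/b is reported and kept because a different file already exists at the destination. Whatever remains in the source afterwards is exactly the set of collisions to resolve by hand.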

You can actually use rsync to move files rather than copy them:

rsync -av --link-dest "$PWD"/src/ --remove-source-files src/ dst

In the normal way of things, rsync -av --remove-source-files src/ dst will copy files from src/ to dst and then remove the source. By adding the --link-dest option and pointing it to the src/ directory, the target files will be linked rather than copied.

The only caveats that I can see are:

  • --link-dest is relative to the destination directory so we need to specify the full path to src/
  • rsync won’t report files as being copied if they are linked, so the output from rsync -av --progress will report no files copied

NB. Tested using a file that consumed 90% of my available disk space. Copying would have failed (90% + 90% » 100%) but linking/moving worked correctly.
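
Since rsync does not report linked files as copied, one way to check that a file was linked rather than copied is to compare inode numbers before and after. Here is a small sketch on a throwaway tree. The directory layout and file name are made up for the demo, and it assumes rsync is installed and that the scratch file system supports hard links:

```shell
#!/bin/sh
# Sketch: check whether the rsync --link-dest trick moved a file by
# hard-linking it (same inode) or by copying it (new inode).
set -eu
command -v rsync >/dev/null || { echo "rsync not installed; skipping"; exit 0; }

# Hypothetical demo tree in a scratch directory.
work=$(mktemp -d)
cd "$work"
mkdir -p src/sub dst
echo data > src/sub/file

# Inode of the source file before the move.
before=$(ls -i src/sub/file | awk '{print $1}')

rsync -a --link-dest "$PWD"/src/ --remove-source-files src/ dst

# Inode of the destination file after the move.
after=$(ls -i dst/sub/file | awk '{print $1}')

if [ "$before" = "$after" ]; then
    echo "moved by hard link (inode $after)"
else
    echo "copied (inode $before became $after)"
fi
```

Note that --remove-source-files only removes non-directories, so the emptied directory skeleton under src/ is still there afterwards and has to be removed separately.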

Answered By: roaima