mv with rsync functionality
I’m trying to merge two directory trees that have many common elements but also each have elements that are only present in one of the two trees. The main issue I am having is that when mv encounters two sub-directories with the same relative paths it either keeps the source (with -f) or the destination (with -n), but I can’t make it take the union of both sub-directories. I could of course use rsync with --remove-source-files, but this will actually copy the data and then delete the old files, as opposed to a true move. The two directory trees contain several hundred GB of data and are both on the same partition, so I would love to do a true move if possible for the sake of time.
Right now I have
find * -type f -exec mv -n {} /destination/{} \;
but this only moves files from the source to the destination when the target directory already exists. I can preface this command with something like mv -n * /destination/ but this only brings over the top-level directories. Is there any way to do this in a single line? I could always write a script to check whether the directory exists before moving the file, but this seems like such a basic task that there should be an easier way.
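For reference, the "script that checks whether the directory exists" mentioned above could look roughly like the sketch below. The function name merge_move and the scratch directory are made up for illustration, and the destination is assumed to be an absolute path on the same file system, so that each mv is a plain rename. Files that already exist in the destination are left in the source, similar to mv -n.

```shell
# merge_move SRC DST (hypothetical helper): move every regular file under
# SRC to the same relative path under DST, creating directories as needed.
# DST must be an absolute path. Existing files in DST are never overwritten.
merge_move() {
    ( cd "$1" && find . -type f -exec sh -c '
        dst=$1; shift
        for f do
            if [ -e "$dst/$f" ]; then
                printf "kept existing: %s\n" "$dst/$f" >&2
            else
                mkdir -p "$dst/${f%/*}" && mv "$f" "$dst/$f"
            fi
        done' sh "$2" {} + )
}

# Demo in a scratch directory
demo=$(mktemp -d)
mkdir -p "$demo/src/common" "$demo/src/sub2" "$demo/dst/common"
echo src > "$demo/src/common/common"
echo dst > "$demo/dst/common/common"      # same relative path: a collision
touch "$demo/src/common/diff2" "$demo/src/sub2/file"
merge_move "$demo/src" "$demo/dst"
```

After the call, diff2 and sub2/file live under dst, while the colliding common/common stays in src and the copy in dst is untouched. Empty directories remain in the source tree.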
You could use prename to get what you want. On some distributions (e.g. Debian/Ubuntu) this should be installed by default and aliased to rename. Other distros may use a different rename. You could change to the directory above the source directory and do:
find source -exec prename 's:^source:/path/to/dest:' {} +
This will refuse to move files that already exist in the destination tree, and it will leave empty directories behind where directory names overlap, so you will have to remove them afterwards. You can add the -f option to prename to make it overwrite existing files.
Example:
$ mkdir -p dir1/{common,sub1} dir2/{common,sub2}
$ touch dir1/sub1/file dir2/sub2/file dir1/common/common dir2/common/common dir1/common/diff1 dir2/common/diff2
$ tree dir*
dir1
├── common
│   ├── common
│   └── diff1
└── sub1
    └── file
dir2
├── common
│   ├── common
│   └── diff2
└── sub2
    └── file
4 directories, 6 files
$ find dir2 -depth -exec rename 's/^dir2/dir1/' {} +
Can't rename dir2/sub2/file dir1/sub2/file: No such file or directory
dir2/common/common not renamed: dir1/common/common already exists
dir2/common not renamed: dir1/common already exists
dir2 not renamed: dir1 already exists
$ tree dir*
dir1
├── common
│   ├── common
│   ├── diff1
│   └── diff2
├── sub1
│   └── file
└── sub2
    └── file
dir2
└── common
    └── common
4 directories, 6 files
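The clean-up of leftover empty directories mentioned before the example can be sketched as follows. This is only an illustration: prune_empty_dirs is a made-up name, and the -empty test is assumed to be available (it is an extension found in GNU and BSD find, not in POSIX). Directories that still contain files, such as collisions that were not moved, are left alone.

```shell
# prune_empty_dirs DIR (hypothetical helper): remove empty directories
# under DIR, deepest first, so that parents which become empty go too.
prune_empty_dirs() {
    find "$1" -depth -type d -empty -exec rmdir {} \;
}

# Demo in a scratch directory
demo=$(mktemp -d)
mkdir -p "$demo/dir2/common" "$demo/dir2/sub2/deeper"
touch "$demo/dir2/common/common"     # a collided file that stayed behind
prune_empty_dirs "$demo/dir2"
```

Here sub2 and sub2/deeper are removed, while dir2/common survives because it still holds a file.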
Update:
To give a source for prename: it usually comes bundled with perl (hence the ‘p’). On Debian/Ubuntu it is part of the perl package. If you want to get it separately, one of the answerers to the question "Get the Perl rename utility instead of the built-in rename" has made a separate repository for it: https://github.com/subogero/rename
Warning: This answer will not work on file systems that do not support hard links (e.g., FAT).
Another option (which may be more portable) is
cd source_directory
find . -type f -print0 | cpio --pass-through --null --link --make-directories dest_dir
cpio (copy in & out) is a dinosaur that predates tar. Like tar, it can create or extract from archives. Unlike tar (correct me if I’m wrong), it can copy directory trees with a single command. (I guess you could do this with tar -cf - source option(s) and argument(s) | tar -xf - destination option(s) and argument(s).) That’s what --pass-through means.
--null means “expect filenames to be delimited by nulls”; i.e., read output from find … -print0.
--link means “link files from the source directory into the destination directory, if possible”.
--make-directories needs no explanation.
This can be abbreviated cpio -p0ld dest_dir. Add --verbose or -v if you want.
Then, after this finishes,
- Check for collisions and handle appropriately.
- Verify that your destination directory is populated with hard links.
- Remove the source directory.
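The first two checks could be sketched roughly like this. The helper name unlinked_files is made up, the destination is assumed to be given as an absolute path, and the -ef test (true when two paths refer to the same file) is assumed to be supported by the system's sh, which holds for bash and dash. To keep the demo independent of cpio, the hard link is created with ln.

```shell
# unlinked_files SRC DST (hypothetical helper): print every regular file
# under SRC whose counterpart under DST is not a hard link to it, i.e. it
# is missing or is a different, pre-existing file (a collision).
# DST must be an absolute path.
unlinked_files() {
    ( cd "$1" && find . -type f -exec sh -c '
        dst=$1; shift
        for f do
            [ "$f" -ef "$dst/$f" ] || printf "%s\n" "$f"
        done' sh "$2" {} + )
}

# Demo in a scratch directory
demo=$(mktemp -d)
mkdir -p "$demo/src/a" "$demo/dst/a"
echo one > "$demo/src/a/moved"
echo two > "$demo/src/a/clash"
ln "$demo/src/a/moved" "$demo/dst/a/moved"   # stands in for what --link does
echo other > "$demo/dst/a/clash"             # pre-existing file: a collision
leftover=$(unlinked_files "$demo/src" "$demo/dst")
printf '%s\n' "$leftover"
```

Only ./a/clash is listed. Once the list is empty (or every listed file has been dealt with), the source directory can be removed.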
You can actually use rsync to move files rather than copy them:
rsync -av --link-dest "$PWD"/src/ --remove-source-files src/ dst
In the normal way of things, rsync -av --remove-source-files src/ dst will copy files from src/ to dst and then remove the source. By adding the --link-dest option and pointing it to the src/ directory, the target files will be linked rather than copied.
The only caveats that I can see are:
- --link-dest is relative to the destination directory, so we need to specify the full path to src/
- rsync won’t report files as being copied if they are linked, so the output from rsync -av --progress will report no files copied
NB. Tested using a file that consumed 90% of my available disk space. Copying would have failed (90% + 90% » 100%) but linking/moving worked correctly.