https://bugzilla.samba.org/show_bug.cgi?id=12570
Bug ID: 12570
Summary: Problems with --checksum --existing
Product: rsync
Version: 3.1.1
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P5
Component: core
Assignee: [email protected]
Reporter: [email protected]
QA Contact: [email protected]
Problem:
I've got an SD card with some movies, a few of which are corrupted.
I want to re-copy only the files that don't match the good copies in "/movies/".
command:
rsync --checksum --existing -vhriP /movies/ /media/128-SD/Movies/
The problem here is that *all* files in "/movies/" are hashed before anything
else happens. This can be verified with lsof: "lsof +D /movies".
I've got <100GB in "/media/128-SD/Movies/".
I've got >1.5TB in "/movies/", and hashing all of those files is just a huge
waste of time and system resources.
When "--existing" and "--checksum" are both used, the algorithm should first
make a list of candidate files, then start hashing. It should *not* start
hashing everything on the send-side and then figure out which files might be
needed.
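A minimal sketch of the ordering proposed above, done outside rsync: build the
candidate list first, then let only those files be hashed. The function name
"list_candidates" and the list path "/tmp/candidates.txt" are made up for
illustration, and the sketch assumes file names contain no newline characters.
It relies on rsync's "--files-from" option, which should limit the sender's
file list (and therefore the "--checksum" hashing) to the listed paths; I have
not measured this on the data set above.

```shell
# list_candidates SRC DST  (hypothetical helper, not part of rsync)
# Print, one per line and relative to DST, every regular file under DST
# that also exists under SRC. These are the only files "--existing"
# could ever update, so they are the only ones worth hashing.
# Assumption: file names contain no newline characters.
list_candidates() {
    src=$1
    dst=$2
    # The cd happens in a subshell, so the caller's directory is unchanged.
    (cd "$dst" && find . -type f) | while IFS= read -r rel; do
        rel=${rel#./}                      # strip the leading "./" from find
        if [ -f "$src/$rel" ]; then
            printf '%s\n' "$rel"
        fi
    done
}

# Possible usage (paths taken from the report, list file name is arbitrary):
#   list_candidates /movies /media/128-SD/Movies > /tmp/candidates.txt
#   rsync --checksum --files-from=/tmp/candidates.txt -vhiP \
#       /movies/ /media/128-SD/Movies/
```

With "--files-from" the listed paths are taken relative to the source
directory, so "-r" and "--existing" are no longer needed: every listed file
already exists on both sides.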
Workaround for me:
diff -r /movies/ /media/128-SD/Movies/ | grep differ | awk '{print "pv " $3" > "$5}' | sh
N.B.: that workaround requires "pv" and only works with file names that do not
contain spaces, but for me it's a quick and easy way to see progress while
files are being copied. "cp" would work fine in place of "pv" (without the
">" redirection).
On my system, that workaround saved me about 1-2 days of hashing and completed
in less than an hour.
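For file names that do contain spaces, a variant along these lines might be
safer. It is only a sketch: the function name "recopy_differing" is made up,
it uses "cmp -s" and "cp" instead of "diff -r" and "pv" (so there is no
progress bar), and it still assumes file names contain no newline characters.
Like the one-liner above, it ignores files that exist on only one side.

```shell
# recopy_differing SRC DST  (hypothetical helper)
# For every regular file under DST that also exists under SRC, compare the
# two byte for byte and overwrite the DST copy when they differ.
# Assumption: file names contain no newline characters.
recopy_differing() {
    src=$1
    dst=$2
    (cd "$dst" && find . -type f) | while IFS= read -r rel; do
        rel=${rel#./}                      # strip the leading "./" from find
        # cmp -s is silent and exits non-zero when the files differ.
        if [ -f "$src/$rel" ] && ! cmp -s "$src/$rel" "$dst/$rel"; then
            printf 'copying %s\n' "$rel"
            cp "$src/$rel" "$dst/$rel"
        fi
    done
}

# Possible usage (paths taken from the report):
#   recopy_differing /movies /media/128-SD/Movies
```

Only files that exist on both sides are read, so nothing outside the <100GB
set on the card gets compared.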