Well, thanks for THAT pointer! Using --checksum-choice=none results in a
speedup of somewhere between 2 and 3 times. That validates the
checksum theory that things have been pointing towards. Now to get xxhash
into rsync and I think we are all set.
Thanks,
Bill
On 6/18/19 9:57 AM, Ellis H. Wilson III wrote:
On 6/18/19 9:16 AM, Bill Wichser wrote:
Stock RH 7 version, rsync-3.1.2-6.el7_6.1.x86_64. We've tried a
number of recompiles with both the gcc and Intel compilers. The only
difference between otherwise identical compiles was md4 vs. md5.
/bin/rsync -lptgoDAH -v --numeric-ids -d --relative --delete
--delete-after --files-from=...
I'm not asking for help. I'm just asking whether anyone has attempted to
change the algorithm to something much faster.
I refer you to this project https://cyan4973.github.io/xxHash/ where
there is a table of speeds. Regardless of what anyone might
speculate, we are pursuing this route of changing out the algorithm.
Maybe it's all for naught. Maybe it isn't. But in a few weeks
hopefully we'll have determined.
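A rough way to gauge how much a faster algorithm might help is to time the candidate hashes over the same buffer. The sketch below is illustrative only: the helper names are hypothetical, md4 is included only if the local Python build provides it, and xxh64 is measured only if the third-party `xxhash` Python package happens to be installed. It times Python bindings, not rsync's own C code, so treat the numbers as a relative hint at best.

```python
import hashlib
import time


def throughput_mb_per_s(make_hasher, payload, repeats=5):
    """Hash payload `repeats` times and return the best observed rate in MB/s."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hasher = make_hasher()
        hasher.update(payload)
        hasher.digest()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very small payloads.
    return len(payload) / max(best, 1e-9) / 1e6


def candidate_hashers():
    """Collect constructors for the hashes this Python build can actually create."""
    candidates = {}
    for name in ("md4", "md5"):
        try:
            hashlib.new(name)
        except ValueError:
            continue  # algorithm not provided by this build
        candidates[name] = lambda name=name: hashlib.new(name)
    try:
        import xxhash  # third-party package; assumed, not part of the stdlib
    except ImportError:
        pass
    else:
        candidates["xxh64"] = xxhash.xxh64
    return candidates


if __name__ == "__main__":
    payload = bytes(64 * 1024 * 1024)  # 64 MiB of zeros as a stand-in for file data
    for name, make_hasher in candidate_hashers().items():
        rate = throughput_mb_per_s(make_hasher, payload)
        print(f"{name}: {rate:.0f} MB/s")
```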
Very interesting. From the rsync man page:
"Note that rsync always verifies that each transferred file was
correctly reconstructed on the receiving side by checking a
whole-file checksum that is generated as the file is transferred, but
that automatic after-the-transfer verification has nothing to do with
this option’s before-the-transfer "Does this file need to be updated?"
check."
So it sounds like you have sufficient churn in large files that the
checksum validation post-transfer is your bottleneck. Short of hacking
rsync to use a faster algorithm, your remaining choice is to set
--checksum-choice=STR to none, and then perform your own
hashing out-of-band to check the transferred data using the list you
have provided via --files-from. This will nerf rsync's ability to do
delta-transfer, which may be ok depending on the nature of your churning
files. If your pipes are huge (atypical for DR), your CPU is weak, and
your churning data is mostly completely new or completely changed files,
--checksum-choice=none may work very well for you.
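The out-of-band hashing could be sketched roughly as follows. This is a minimal illustration, not a vetted tool: the function names are hypothetical, the list file is assumed to hold one relative path per line, and sha256 is just a placeholder that any algorithm name known to hashlib could replace. In practice each manifest would be built on its own host and the two results compared afterwards.

```python
import hashlib
from pathlib import Path


def hash_file(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of one file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_file_list(list_path):
    """Read one path per line from the list file, skipping blank lines."""
    with open(list_path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def manifest(root, relative_paths, algorithm="sha256"):
    """Map each listed path to its digest under root.

    Entries that are not regular files (directories, missing paths) map to None.
    """
    root = Path(root)
    result = {}
    for rel in relative_paths:
        target = root / rel.lstrip("/")
        result[rel] = hash_file(target, algorithm) if target.is_file() else None
    return result


def mismatches(source_manifest, dest_manifest):
    """Return the sorted paths whose entries differ between the two manifests."""
    return sorted(
        rel
        for rel, digest in source_manifest.items()
        if dest_manifest.get(rel) != digest
    )
```

Calling `mismatches(manifest(src_root, paths), manifest(dst_root, paths))` with `paths = read_file_list(...)` then yields the files that would need another look. A path that is absent on both sides compares as equal here, so a stricter check might treat None as an error.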
Best,
ellis
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf