On 6/18/19 9:16 AM, Bill Wichser wrote:
> Stock RH 7 version, rsync-3.1.2-6.el7_6.1.x86_64.  We've tried a number of recompiles.  gcc, Intel.  The only thing between identical compiles was the md4 vs md5.
>
> /bin/rsync -lptgoDAH -v --numeric-ids -d --relative --delete --delete-after --files-from=...
>
> I'm not asking for help.  Just if anyone had attempted to change the algorithm into something much faster.
>
> I refer you to this project https://cyan4973.github.io/xxHash/ where there is a table of speeds.  Regardless of what anyone might speculate, we are pursuing this route of changing out the algorithm.  Maybe it's all for naught.  Maybe it isn't.  But in a few weeks hopefully we'll have determined.

Very interesting.  From the rsync man page:

"Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking a whole-file checksum that is generated as the file is transferred, but that automatic after-the-transfer verification has nothing to do with this option’s before-the-transfer "Does this file need to be updated?" check."

So it sounds like you have sufficient churn in large files that the post-transfer checksum validation is your bottleneck. Short of hacking rsync to use a faster algorithm, your remaining choice is to use --checksum-choice=STR with STR set to none, and then perform your own hashing out-of-band to check the transferred data, using the list you already provide via --files-from. This will nerf rsync's ability to do delta-transfer, which may be ok depending on the nature of your churning files. If your pipes are huge (atypical for DR), your CPU is weak, and your churning data is mostly completely new or completely changed files, --checksum-choice=none may work very well for you.
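
For what it's worth, the out-of-band check could look something like the rough Python sketch below. It is only an illustration under my own assumptions: the function names are hypothetical, the list file is assumed to hold one relative path per line, and blake2b from the standard library stands in for a faster hash (xxHash would need a third-party module, which I have not assumed here).

```python
import hashlib
from pathlib import Path


def file_digest(path, algo="blake2b", chunk_size=1 << 20):
    """Hash one file in fixed-size chunks so large files never sit fully in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root, files_from, algo="blake2b"):
    """Map each path listed in files_from (one per line) to its digest under root.

    Assumption: paths are relative to root; leading slashes are stripped.
    Entries that are not regular files under root are skipped.
    """
    manifest = {}
    for line in Path(files_from).read_text().splitlines():
        rel = line.strip().lstrip("/")
        if not rel:
            continue
        target = Path(root) / rel
        if target.is_file():
            manifest[rel] = file_digest(target, algo)
    return manifest


def mismatches(source_manifest, dest_manifest):
    """Return sorted paths whose digests differ or that exist on only one side."""
    paths = set(source_manifest) | set(dest_manifest)
    return sorted(p for p in paths if source_manifest.get(p) != dest_manifest.get(p))
```

The idea would be to run build_manifest on each end with the same list, ship one manifest across, and pass both to mismatches; anything it returns would need to be re-sent.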

Best,

ellis

--
Ellis H. Wilson III, Ph.D.
     www.ellisv3.com
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
