On 6/17/19 6:43 AM, Bill Wichser wrote:
> md5 checksums take a lot of compute time with huge files and even
> with millions of smaller ones. The bulk of the time for running
> rsync is spent in computing the source and destination checksums
> and we'd like to alleviate that pain of a cryptographic algorithm.
First of all, I would note that rsync only uses checksums if you tell it to; otherwise it just uses file times and sizes to determine what to transfer.
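For example (just a sketch; the host name and paths here are placeholders):

  # default quick check: compare size + mtime only, no checksums
  rsync -av /data/ backuphost:/backup/data/

  # -c / --checksum forces a full MD5 pass over every file on both
  # ends before deciding what to send, which is where that compute
  # time goes
  rsync -avc /data/ backuphost:/backup/data/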
rsync is also single-threaded, so I would take a look at what was previously called parsync, but is now parsyncfp :-)
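If you just want a feel for the idea, you can hand-roll a crude version of it: run one rsync per top-level directory, a few at a time (the paths, host and process count below are placeholders, and parsyncfp does the chunking and load balancing far more carefully):

  # run up to 8 rsyncs at once, one per top-level subdirectory;
  # note this crude sketch skips plain files sitting directly in /data
  cd /data
  find . -mindepth 1 -maxdepth 1 -type d -print0 \
    | xargs -0 -P 8 -I{} rsync -a {} backuphost:/backup/data/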
http://moo.nac.uci.edu/~hjm/parsync/

There is a caveat there, though:

# As a warning, the main use case for parsyncfp is really only
# very large data transfers thru fairly fast network connections
# (>1Gb). Below this speed, rsync itself can saturate the
# connection, so there’s little reason to use parsyncfp and in
# fact the overhead of testing the existence of and starting more
# rsyncs tends to worsen its performance on small transfers to
# slightly less than rsync alone.

Good luck!
Chris

-- 
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA