On 6/17/19 6:43 AM, Bill Wichser wrote:
> md5 checksums take a lot of compute time with huge files and even
> with millions of smaller ones. The bulk of the time for running
> rsync is spent in computing the source and destination checksums
> and we'd like to alleviate that pain of a cryptographic algorithm.
First of all, I would note that rsync only uses checksums if you tell it to; otherwise it just uses file times and sizes to determine what to transfer.
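For example (just a sketch; the host name and paths here are placeholders):

  # default quick check: compare size + mtime only, no checksums
  rsync -av /data/ backuphost:/backup/data/

  # -c / --checksum forces a full MD5 pass over every file on both
  # ends before deciding what to send, which is where that compute
  # time goes
  rsync -avc /data/ backuphost:/backup/data/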
rsync is also single-threaded, so I would take a look at what was previously called parsync, but is now parsyncfp :-)
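If you just want a feel for the idea, you can hand-roll a crude version of it: run one rsync per top-level directory, a few at a time (the paths, host and process count below are placeholders, and parsyncfp does the chunking and load balancing far more carefully):

  # run up to 8 rsyncs at once, one per top-level subdirectory;
  # note this crude sketch skips plain files sitting directly in /data
  cd /data
  find . -mindepth 1 -maxdepth 1 -type d -print0 \
    | xargs -0 -P 8 -I{} rsync -a {} backuphost:/backup/data/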
http://moo.nac.uci.edu/~hjm/parsync/

There is a caveat there, though:

# As a warning, the main use case for parsyncfp is really only
# very large data transfers thru fairly fast network connections
# (>1Gb). Below this speed, rsync itself can saturate the
# connection, so there’s little reason to use parsyncfp and in
# fact the overhead of testing the existence of and starting more
# rsyncs tends to worsen its performance on small transfers to
# slightly less than rsync alone.

Good luck!
Chris

-- 
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA