On Mon, 17 Jun 2019 08:29:53 -0700 Christopher Samuel <ch...@csamuel.org> wrote:
> On 6/17/19 6:43 AM, Bill Wichser wrote:
> >
> > md5 checksums take a lot of compute time with huge files and even
> > with millions of smaller ones. The bulk of the time for running
> > rsync is spent in computing the source and destination checksums
> > and we'd like to alleviate that pain of a cryptographic algorithm.
>
> First of all I would note that rsync only uses checksums if you tell
> it to, otherwise it just uses file times and sizes to determine what
> to transfer.

As Chris says, with -c rsync decides whether a file needs to be synced
based on the content of the file (by hashing it on both the source and
destination side). It does _NOT_ protect the transfer with that checksum,
nor does it verify the destination-side write with it.

In the end, the (significant) performance cost of using -c boils down to
the cost of doing an open+read of each file on both the source and
destination side (instead of just a stat). The hashing algorithm is not
the main problem.

/Peter

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf