I built against xxHash-0.7.0; you'll need to grab a version and install it. For rsync itself I have a diff, xxhash.patch, along with the rsync rpms, in

https://tigress-web.princeton.edu/~bill/

If I get time I'll try to pass this along to the upstream rsync folks. It is performing at about the same speed as using --checksum, so we are happy. This has been in production and seems to work fine.
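
For anyone who wants to get a feel for the idea before touching rsync itself, here is a minimal Python sketch of a "choose your checksummer" whole-file comparison. It is not the patch: the function names and the CHECKSUMMERS table are made up for illustration, and zlib.crc32 from the standard library stands in for xxhash as the non-cryptographic option, only because it needs no extra install.

```python
import hashlib
import zlib

CHUNK = 1 << 20  # read files in 1 MiB pieces so huge files never sit in memory


def md5_digest(path):
    """Whole-file MD5, the cryptographic (slower) option."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            h.update(block)
    return h.hexdigest()


def crc32_digest(path):
    """Whole-file CRC32, a stand-in here for a fast non-cryptographic hash."""
    crc = 0
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            crc = zlib.crc32(block, crc)  # running CRC carried across blocks
    return format(crc & 0xFFFFFFFF, "08x")


# Hypothetical plugin table: add an entry to offer another checksummer.
CHECKSUMMERS = {"md5": md5_digest, "crc32": crc32_digest}


def file_checksum(path, algo="md5"):
    return CHECKSUMMERS[algo](path)


def same_content(src, dst, algo="md5"):
    """True when source and destination produce the same checksum."""
    return file_checksum(src, algo) == file_checksum(dst, algo)
```

Calling same_content(src, dst, algo="crc32") swaps the hash without changing the comparison logic. A 32-bit CRC collides far more easily than a 64-bit xxhash would, so treat it purely as a placeholder for the faster algorithm.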

Bill

On 9/30/19 8:55 PM, Stu Midgley wrote:
That's pretty awesome. Are you going to make it available, or push it upstream?

If not... how can we get it?

On Tue, Oct 1, 2019 at 1:09 AM Bill Wichser <b...@princeton.edu> wrote:

    Just wanted to circle back on my original question.  I changed the rsync
    code, adding xxhash, and we see about a 3x speedup.  Good enough since it
    is very close to not using any checksum speedups.

    Bill

    On 6/17/19 9:43 AM, Bill Wichser wrote:
     > We have moved to a rsync disk backup system, from TSM tape, in order to
     > have a DR for our 10 PB GPFS filesystem.  We looked at a lot of options
     > but here we are.
     >
     > md5 checksums take a lot of compute time with huge files and even with
     > millions of smaller ones.  The bulk of the time for running rsync is
     > spent in computing the source and destination checksums and we'd like to
     > alleviate that pain of a cryptographic algorithm.
     >
     > Googling around, I found no mention of using a technique like this to
     > improve rsync performance.  I did find reference to a few hashing
     > algorithms though which could certainly work here (xxhash, murmurhash,
     > sbox, cityhash64).
     >
     > Rsync has certainly been around for a few years!  We are going to pursue
     > changing the current checksum algorithm and using something much faster.
     > If anyone has done this already and would like to share their
     > experiences that would be wonderful. Ideally this could be some optional
     > plugin for rsync where users could choose which checksummer to use.
     >
     > Bill
     > _______________________________________________
     > Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
     > To change your subscription (digest mode or unsubscribe) visit
     > https://beowulf.org/cgi-bin/mailman/listinfo/beowulf



--
Dr Stuart Midgley
sdm...@gmail.com