I built against xxHash-0.7.0. You'll need to grab a version and
install it. For rsync itself I have a diff, xxhash.patch, along with
the rsync RPMs at
https://tigress-web.princeton.edu/~bill/
If I get time I'll try to pass this to the upstream rsync folks. It is
performing about the same speed as using --checksum so we are happy.
This has been in production and seems to work fine.
Bill
On 9/30/19 8:55 PM, Stu Midgley wrote:
That's pretty awesome, are you going to make it available? or push it
upstream?
If not... how can we get it?
On Tue, Oct 1, 2019 at 1:09 AM Bill Wichser <b...@princeton.edu> wrote:
Just wanted to circle back on my original question. I changed the rsync
code, adding xxhash, and we see about a 3x speedup. Good enough since it
is very close to not using any checksum speedups.
Bill
On 6/17/19 9:43 AM, Bill Wichser wrote:
> We have moved to a rsync disk backup system, from TSM tape, in order to
> have a DR for our 10 PB GPFS filesystem. We looked at a lot of options
> but here we are.
>
> md5 checksums take a lot of compute time with huge files and even with
> millions of smaller ones. The bulk of the time for running rsync is
> spent in computing the source and destination checksums and we'd like to
> alleviate that pain of a cryptographic algorithm.
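
To get a rough feel for the gap described above, here is a minimal
timing sketch. It is not the rsync code path: it hashes an in-memory
buffer with Python's hashlib.md5 and, as a stand-in for a fast
non-cryptographic hash, zlib.crc32 (xxhash itself is not in the Python
standard library). The function names and the 64 MiB buffer size are
assumptions made for illustration, and the numbers will vary by machine.

```python
import hashlib
import time
import zlib


def throughput_mb_s(digest_fn, data, repeats=5):
    """Return the best observed throughput (MiB/s) of digest_fn over data."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        digest_fn(data)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very small inputs.
    return len(data) / (1024 * 1024) / max(best, 1e-9)


def md5_digest(data):
    """Cryptographic hash, the kind of checksum the post calls expensive."""
    return hashlib.md5(data).hexdigest()


def crc32_digest(data):
    """CRC-32, used here only as a stand-in for a fast non-cryptographic hash."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


if __name__ == "__main__":
    payload = bytes(64 * 1024 * 1024)  # 64 MiB of zeros (arbitrary test size)
    for name, fn in (("md5", md5_digest), ("crc32", crc32_digest)):
        print(f"{name}: {throughput_mb_s(fn, payload):.0f} MiB/s")
```

Taking the best of several runs reduces noise from other activity on
the machine; a real comparison would also have to account for disk
read time, which this sketch leaves out.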
>
> Googling around, I found no mention of using a technique like this to
> improve rsync performance. I did find reference to a few hashing
> algorithms though which could certainly work here (xxhash, murmurhash,
> sbox, cityhash64).
>
> Rsync has certainly been around for a few years! We are going to pursue
> changing the current checksum algorithm and using something much faster.
> If anyone has done this already and would like to share their
> experiences that would be wonderful. Ideally this could be some optional
> plugin for rsync where users could choose which checksummer to use.
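
The "choose which checksummer to use" idea can be sketched as a small
registry keyed by name. This is only an illustration in Python, not
rsync's internals or the patch mentioned in this thread: the class
names, the CHECKSUMMERS table, and file_checksum are hypothetical, and
CRC-32 again stands in for a fast non-cryptographic hash such as xxhash.

```python
import hashlib
import zlib

CHUNK_SIZE = 1024 * 1024  # read files in 1 MiB pieces (arbitrary choice)


class Md5Checksummer:
    """Wraps hashlib.md5 behind a minimal update/hexdigest interface."""

    def __init__(self):
        self._hash = hashlib.md5()

    def update(self, chunk):
        self._hash.update(chunk)

    def hexdigest(self):
        return self._hash.hexdigest()


class Crc32Checksummer:
    """Running CRC-32; a stand-in for a fast non-cryptographic checksum."""

    def __init__(self):
        self._value = 0

    def update(self, chunk):
        # zlib.crc32 accepts the previous value to continue a running CRC.
        self._value = zlib.crc32(chunk, self._value)

    def hexdigest(self):
        return format(self._value & 0xFFFFFFFF, "08x")


# Hypothetical registry: a new algorithm is added by registering a class.
CHECKSUMMERS = {"md5": Md5Checksummer, "crc32": Crc32Checksummer}


def file_checksum(path, algorithm="md5"):
    """Checksum a file in chunks with the checksummer selected by name."""
    try:
        summer = CHECKSUMMERS[algorithm]()
    except KeyError:
        raise ValueError(f"unknown checksummer: {algorithm!r}") from None
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            summer.update(chunk)
    return summer.hexdigest()
```

Both ends of a transfer would have to agree on the algorithm name
before comparing digests, which is the part a real plugin mechanism
would need to negotiate.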
>
> Bill
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
--
Dr Stuart Midgley
sdm...@gmail.com