Why not use existing pftool?
https://github.com/pftool/pftool
-Josip
On 6/17/19 10:07 AM, Michael Di Domenico wrote:
just out of morbid curiosity i popped through the rsync code. it
doesn't look terribly difficult to wedge in a new algo. but honestly,
if i was going to go through the trouble i'd write a new tool that
walks the file tree in parallel and logs the checksums to a database.
i've had problems rsync'ing big filesystems in the past, so i try to
avoid it as a DR or poor-man's snapshotting tool.
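For illustration only, the idea above (hash files in parallel and log the checksums to a database) might look roughly like the Python sketch below. Everything in it is an assumption rather than anything from this thread: the function names, the SQLite table layout, and the choice of BLAKE2b as a faster alternative to md5 are all hypothetical. Note that only the hashing runs in parallel here; the directory walk itself is a plain serial os.walk.

```python
import hashlib
import os
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def file_digest(path, algo="blake2b", chunk_size=1 << 20):
    """Hash one file in fixed-size chunks so large files never sit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return path, os.path.getsize(path), h.hexdigest()


def iter_files(root):
    """Yield every regular file under root (symlinks are skipped)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield path


def log_checksums(root, db_path, workers=8, algo="blake2b"):
    """Hash files under root with a thread pool and record results in SQLite.

    Hypothetical schema: one row per path with size, algorithm, and digest.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checksums ("
        "path TEXT PRIMARY KEY, size INTEGER, algo TEXT, digest TEXT)"
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: file_digest(p, algo), iter_files(root))
        rows = ((p, size, algo, digest) for p, size, digest in results)
        # All database writes happen in this one thread.
        conn.executemany(
            "INSERT OR REPLACE INTO checksums VALUES (?, ?, ?, ?)", rows
        )
    conn.commit()
    conn.close()
```

Calling log_checksums("/some/tree", "sums.db") on both source and destination would give two tables that can be compared offline. Threads are used because hashlib releases the GIL while hashing large buffers; on a real parallel filesystem the walk would likely need to be parallelized as well.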
On Mon, Jun 17, 2019 at 11:30 AM Christopher Samuel <ch...@csamuel.org> wrote:
On 6/17/19 6:43 AM, Bill Wichser wrote:
md5 checksums take a lot of compute time with huge files and even with
millions of smaller ones. The bulk of the time for running rsync is
spent in computing the source and destination checksums and we'd like to
alleviate that pain of a cryptographic algorithm.
First of all I would note that rsync only uses checksums if you tell it
to, otherwise it just uses file times and sizes to determine what to
transfer.
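As a rough illustration of that default behaviour (the option that turns on whole-file checksums is -c / --checksum), the size-and-time comparison can be approximated in Python as below. This is a simplified sketch, not rsync's actual code: the function name is made up, and truncating modification times to whole seconds is an assumption made for the example.

```python
import os


def needs_transfer(src, dst):
    """Approximate a size + mtime check: transfer unless both match."""
    try:
        d = os.stat(dst)
    except FileNotFoundError:
        return True  # nothing at the destination yet
    s = os.stat(src)
    # Assumption: compare modification times at one-second granularity.
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)
```

Note that no file contents are read at all, which is why this check is cheap, and also why two files with the same size and timestamp but different contents are treated as identical unless checksums are requested.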
rsync is also single-threaded, so I would take a look at what was
previously called parsync, but is now parsyncfp :-)
http://moo.nac.uci.edu/~hjm/parsync/
There is the caveat there though:
# As a warning, the main use case for parsyncfp is really only
# very large data transfers thru fairly fast network connections
# (>1Gb). Below this speed, rsync itself can saturate the
# connection, so there’s little reason to use parsyncfp and in
# fact the overhead of testing the existence of and starting more
# rsyncs tends to worsen its performance on small transfers to
# slightly less than rsync alone.
Good luck!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
--
Dr. Josip Loncaric, LANL, MS-T001, P.O. Box 1663, Los Alamos, NM 87545
mailto:jo...@lanl.gov Cell: +1-505-412-8490 Phone: +1-505-412-6538
--
E Pluribus Unum