I also highly recommend fpsync. Here is a rudimentary guide to this:
https://www.rc.fas.harvard.edu/resources/documentation/transferring-data-on-the-cluster/
I can get line speed with fpsync, but a single rsync usually tops out
at about 0.3-1 GB/s. You really want that parallelism. We use fpsync
for all our large-scale data movement here and Globus for external
transfers.
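To illustrate the parallelism idea, here is a minimal Python sketch. It is not fpsync itself: it splits a directory's files into size-balanced buckets and runs one rsync per bucket. The function names, the default of 8 workers, and the greedy partitioning are assumptions made for illustration; fpsync does its own partitioning via fpart.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def partition_by_size(files, n_buckets):
    """Greedily split (path, size) pairs into n_buckets of similar total size."""
    buckets = [[] for _ in range(n_buckets)]
    totals = [0] * n_buckets
    # Place the largest files first, each into the currently lightest bucket.
    for path, size in sorted(files, key=lambda item: item[1], reverse=True):
        lightest = totals.index(min(totals))
        buckets[lightest].append(path)
        totals[lightest] += size
    return buckets


def list_files(src):
    """Return (path relative to src, size in bytes) for each regular file."""
    src = Path(src)
    return [(str(p.relative_to(src)), p.stat().st_size)
            for p in src.rglob("*") if p.is_file()]


def sync_bucket(src, dst, rel_paths):
    """Run one rsync over a subset of files, listed on stdin."""
    # --files-from=- makes rsync read the relative paths from standard input.
    cmd = ["rsync", "-a", "--inplace", "--files-from=-", f"{src}/", f"{dst}/"]
    payload = "\n".join(rel_paths) + "\n"
    return subprocess.run(cmd, input=payload, text=True, check=True)


def parallel_rsync(src, dst, workers=8):
    """Copy src to dst with up to `workers` concurrent rsync processes."""
    buckets = [b for b in partition_by_size(list_files(src), workers) if b]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every job and re-raises any rsync failure.
        list(pool.map(lambda b: sync_bucket(src, dst, b), buckets))
```

Calling `parallel_rsync("/dir1", "/dir2")` would start up to 8 rsync processes at once. Each file still goes through a single rsync process, so this helps most when there are at least as many files as workers. Only regular files are listed, so empty directories are not recreated.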
-Paul Edmon-
On 1/2/20 10:45 AM, Joe Landman wrote:
On 1/2/20 10:26 AM, Michael Di Domenico wrote:
does anyone know or has anyone gotten rsync to push wire speed
transfers of big files over 10G links? i'm trying to sync a directory
with several large files. the data is coming from local disk to a
lustre filesystem. i'm not using ssh in this case. i have 10G
ethernet between both machines. both end points have more than
enough spindles to handle 900MB/sec.
i'm using 'rsync -rav --progress --stats -x --inplace
--compress-level=0 /dir1/ /dir2/' but each file (which is hundreds of
GB) is getting choked at 100MB/sec
A few thoughts
1) are you sure your traffic is traversing the high bandwidth link?
Always good to check ...
2) how many files are you xfering? Are these generally large files or
many small files, or a distribution with a long tail towards small
files? The latter two will hit your metadata system fairly hard, and
in the case of Lustre, performance will depend critically upon the
MDS/MDT architecture and implementation. FWIW, on the big system I was
setting up late last year, we hit MIOP-level reads/writes, but then
again, that system was architected correctly.
3) wire speed xfers are generally the exception unless you are doing
large sequential single files. There are tricks you can do to enable
this, but they are often complex. You can use an array of
writers/readers and leverage parallelism, but you risk invoking
congestion/pause throttling on your switch.
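As a quick sanity check for point 1, the rates quoted in this thread can be set against raw link capacity. This is a small Python sketch that assumes decimal units (1 Gb/s = 125 MB/s) and ignores protocol overhead; the function names are made up for illustration.

```python
def link_capacity_mb_per_s(gbit_per_s):
    """Raw link capacity in MB/s (decimal units), ignoring protocol overhead."""
    return gbit_per_s * 1000 / 8


def utilization(observed_mb_per_s, gbit_per_s):
    """Fraction of raw link capacity that an observed rate represents."""
    return observed_mb_per_s / link_capacity_mb_per_s(gbit_per_s)


# Rates reported in the thread, compared against a 10 Gb/s link.
for label, rate in [("rsync", 100), ("iperf/dd", 900)]:
    print(f"{label}: {rate} MB/s = {utilization(rate, 10):.0%} of 10 Gb/s")

# For comparison: raw capacity of a 1 Gb/s link.
print(f"1 Gb/s link capacity: {link_capacity_mb_per_s(1):.0f} MB/s")
```

The observed 100 MB/s is 8% of a 10 Gb/s link and sits just under the 125 MB/s raw capacity of a 1 Gb/s link, which is one reason the path check is worth doing.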
running iperf and dd between the client and the lustre hits 900MB/sec,
so i fully believe this is an rsync limitation.
googling around hasn't lent any solid advice, most of the articles are
people that don't check the network first...
with the prevalence of 10G these days, i'm surprised this hasn't come
up before, or my google-fu really stinks. which doesn't bode well
given it's the first work day of 2020 :(
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf