If you have no choice but to use a single rsync, then either set up an 
rsyncd server on the other end to bypass ssh, or use something like 
hpn-ssh for performance.
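
A minimal sketch of the rsyncd route, just to illustrate (the [data] 
module name and the "receiver" hostname are placeholders; uid/gid and 
auth settings are left out):

   # /etc/rsyncd.conf on the receiving host
   [data]
       path = /dir2
       read only = no

   # start the daemon (listens on port 873 by default)
   rsync --daemon

   # push from the sending host, no ssh in the path
   rsync -rav --progress --stats --inplace --compress-level=0 /dir1/ receiver::data/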

Bill

On 1/2/20 10:52 AM, Bill Abbott wrote:
> Fpsync and parsyncfp both do a great job with multiple rsyncs, although
> you have to be careful about --delete.  The best performance for fewer,
> larger files, if it's an initial or one-time transfer, is bbcp with
> multiple streams.
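> 
> For bbcp, something roughly like this (the stream count and window size
> are just starting points to tune, not firm recommendations):
> 
>    # 8 parallel TCP streams, 8 MB window, progress every 2 seconds
>    bbcp -s 8 -w 8m -P 2 /dir1/bigfile receiver:/dir2/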
> 
> Also jack up the TCP send buffer and turn on jumbo frames.
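> 
> On Linux that's something along these lines (the buffer sizes are
> example values and site-dependent, eth0 is a placeholder, and the
> switch plus the far end also need MTU 9000 for jumbo frames to help):
> 
>    # raise the TCP buffer ceilings
>    sysctl -w net.core.wmem_max=67108864
>    sysctl -w net.core.rmem_max=67108864
>    sysctl -w net.ipv4.tcp_wmem="4096 87380 67108864"
>    sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
> 
>    # enable jumbo frames on the 10G interface
>    ip link set dev eth0 mtu 9000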
> 
> Bill
> 
> On 1/2/20 10:48 AM, Paul Edmon wrote:
>> I also highly recommend fpsync.  Here is a rudimentary guide to this:
>> https://www.rc.fas.harvard.edu/resources/documentation/transferring-data-on-the-cluster/
>>
>> I can get line speed with fpsync but single rsyncs usually only get up
>> to about 0.3-1 GB/s.  You really want that parallelism.  We use fpsync
>> for all our large scale data movement here and Globus for external
>> transfers.
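>>
>> A typical invocation looks something like this (worker count and
>> per-job limits here are illustrative; see the guide above for details):
>>
>>    # 8 parallel rsync workers, up to 2000 files or ~4 GB per sync job
>>    fpsync -n 8 -f 2000 -s $((4*1024*1024*1024)) /dir1/ /dir2/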
>>
>> -Paul Edmon-
>>
>> On 1/2/20 10:45 AM, Joe Landman wrote:
>>>
>>> On 1/2/20 10:26 AM, Michael Di Domenico wrote:
>>>> does anyone know or has anyone gotten rsync to push wire speed
>>>> transfers of big files over 10G links?  i'm trying to sync a directory
>>>> with several large files.  the data is coming from local disk to a
>>>> lustre filesystem.  i'm not using ssh in this case.  i have 10G
>>>> ethernet between both machines.  both end points have more than
>>>> enough spindles to handle 900MB/sec.
>>>>
>>>> i'm using 'rsync -rav --progress --stats -x --inplace
>>>> --compress-level=0 /dir1/ /dir2/' but each file (which is hundreds of
>>>> GBs) is getting choked at 100MB/sec
>>>
>>> A few thoughts
>>>
>>> 1) are you sure your traffic is traversing the high bandwidth link?
>>> Always good to check ...
>>>
>>> 2) how many files are you xfering?  Are these generally large files or
>>> many small files, or a distribution with a long tail towards small
>>> files?  The latter two will hit your metadata system fairly hard, and
>>> in the case of Lustre, performance will depend critically upon the
>>> MDS/MDT architecture and implementation.  FWIW, on the big system I was
>>> setting up late last year we hit MIOP-level reads/writes, but then
>>> again, that one was architected correctly.
>>>
>>> 3) wire speed xfers are generally the exception unless you are doing
>>> single large sequential files.  There are tricks you can do to enable
>>> this, but they are often complex.  You can use an array of
>>> writers/readers and leverage parallelism, but you risk invoking
>>> congestion/pause throttling on your switch.
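>>>
>>> A crude sketch of the parallel-writers idea (the worker count is
>>> arbitrary, it assumes no spaces in directory names, and tools like
>>> fpsync/parsyncfp handle all of this far more gracefully):
>>>
>>>    # one rsync per top-level subdirectory, four at a time;
>>>    # top-level regular files would still need a separate pass
>>>    cd /dir1 && ls -d */ | xargs -P4 -I{} \
>>>        rsync -a --inplace --compress-level=0 {} /dir2/{}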
>>>
>>>
>>>>
>>>> running iperf and dd between the client and the lustre hits 900MB/sec,
>>>> so i fully believe this is an rsync limitation.
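>>>>
>>>> (i.e. sanity checks along these lines, where the host and lustre path
>>>> are placeholders:
>>>>
>>>>    iperf -c <lustre host> -P 4                                 # network
>>>>    dd if=/dev/zero of=/path/on/lustre/testfile bs=1M count=100000 oflag=direct   # disk
>>>> )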
>>>>
>>>> googling around hasn't turned up any solid advice; most of the articles
>>>> are from people who don't check the network first...
>>>>
>>>> with the prevalence of 10G these days, i'm surprised this hasn't come
>>>> up before, or my google-fu really stinks.  which doesn't bode well
>>>> given it's the first work day of 2020 :(
>>>
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
