If you have no choice but to use a single rsync, then either set up an rsyncd server on the other end to bypass ssh, or use something like hpn-ssh for performance.
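For the rsyncd route, a minimal sketch (the module name "data", the paths, and the host name "receiver" are placeholders for illustration; check rsyncd.conf(5) before relying on it):

    # /etc/rsyncd.conf on the receiving host
    [data]
        path = /dir2
        read only = false
        use chroot = yes

    # start the daemon (or use the distro's rsyncd service)
    rsync --daemon

    # on the sending host, talk to the daemon directly, no ssh in the path
    rsync -rav --progress --stats -x --inplace --compress-level=0 \
        /dir1/ rsync://receiver/data/

The daemon listens on TCP 873 by default, so the ssh cipher overhead drops out of the picture entirely.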
Bill

On 1/2/20 10:52 AM, Bill Abbott wrote:
> Fpsync and parsyncfp both do a great job with multiple rsyncs, although
> you have to be careful about --delete. The best performance for fewer,
> larger files, if it's an initial or one-time transfer, is bbcp with
> multiple streams.
>
> Also jack up the TCP send buffer and turn on jumbo frames.
>
> Bill
>
> On 1/2/20 10:48 AM, Paul Edmon wrote:
>> I also highly recommend fpsync. Here is a rudimentary guide to this:
>> https://www.rc.fas.harvard.edu/resources/documentation/transferring-data-on-the-cluster/
>>
>> I can get line speed with fpsync, but single rsyncs usually only get up
>> to about 0.3-1 GB/s. You really want that parallelism. We use fpsync
>> for all our large-scale data movement here and Globus for external
>> transfers.
>>
>> -Paul Edmon-
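For the fpsync route Paul describes above, the general shape is something like this (the job and file counts are arbitrary starting points, and the option names are from memory, so check fpsync(1)):

    # fpsync ships with fpart; it partitions the tree and runs N rsyncs in parallel
    fpsync -n 8 -f 2000 -o "-a --inplace --compress-level=0" \
        /dir1/ /dir2/

With files in the hundreds of GB the parallelism comes from having several files in flight at once rather than from splitting any single file, so the win over one rsync depends on how many files there are to spread across workers.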
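And for the TCP buffer / jumbo frame tuning and bbcp suggestion above, roughly the following (the numbers are illustrative, so size the buffers to your bandwidth-delay product; jumbo frames have to be enabled on both hosts and every switch port in the path; the bbcp flags are from memory, so check its help output):

    # bigger TCP buffers (both ends)
    sysctl -w net.core.rmem_max=67108864
    sysctl -w net.core.wmem_max=67108864
    sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 67108864"

    # jumbo frames on the 10G interface
    ip link set dev eth0 mtu 9000

    # bbcp with multiple streams for a one-time bulk copy
    bbcp -P 2 -s 16 -w 8m /dir1/bigfile user@receiver:/dir2/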
>> On 1/2/20 10:45 AM, Joe Landman wrote:
>>> On 1/2/20 10:26 AM, Michael Di Domenico wrote:
>>>> Does anyone know, or has anyone gotten, rsync to push wire-speed
>>>> transfers of big files over 10G links? I'm trying to sync a directory
>>>> with several large files. The data is coming from local disk to a
>>>> Lustre filesystem. I'm not using ssh in this case. I have 10G
>>>> Ethernet between both machines. Both endpoints have more than
>>>> enough spindles to handle 900 MB/sec.
>>>>
>>>> I'm using 'rsync -rav --progress --stats -x --inplace
>>>> --compress-level=0 /dir1/ /dir2/', but each file (which is hundreds
>>>> of GB) is getting choked at 100 MB/sec.
>>>
>>> A few thoughts:
>>>
>>> 1) Are you sure your traffic is traversing the high-bandwidth link?
>>> Always good to check ...
>>>
>>> 2) How many files are you transferring? Are these generally large
>>> files, many small files, or a distribution with a long tail towards
>>> small files? The latter two will hit your metadata system fairly hard,
>>> and in the case of Lustre, performance will depend critically upon the
>>> MDS/MDT architecture and implementation. FWIW, on the big system I was
>>> setting up late last year we hit MIOP-level reads/writes, but then
>>> again, that one was architected correctly.
>>>
>>> 3) Wire-speed transfers are generally the exception unless you are
>>> doing large sequential single files. There are tricks you can do to
>>> enable this, but they are often complex. You can use an array of
>>> writers/readers and leverage parallelism, but you risk invoking
>>> congestion/pause throttling on your switch.
>>>
>>>> Running iperf and dd between the client and the Lustre hits 900
>>>> MB/sec, so I fully believe this is an rsync limitation.
>>>>
>>>> Googling around hasn't lent any solid advice; most of the articles
>>>> are by people who don't check the network first ...
>>>>
>>>> With the prevalence of 10G these days, I'm surprised this hasn't come
>>>> up before, or my google-fu really stinks, which doesn't bode well
>>>> given it's the first work day of 2020 :(

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf