Fpsync and parsyncfp both do a great job of driving multiple rsyncs, although you have to be careful with --delete. For fewer, larger files, if it's an initial or one-time transfer, the best performance is bbcp with multiple streams. Also jack up the TCP send buffer and turn on jumbo frames.
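Roughly the sort of invocations I mean (flags from memory, and the buffer sizes, NIC name, and paths below are placeholders, so check the man pages and tune for your own hardware):

    # parallel rsyncs via fpsync (ships with fpart): 8 concurrent sync jobs
    fpsync -n 8 -v /dir1/ /dir2/

    # one-shot copy of a huge file with bbcp: 16 TCP streams, 8 MB window
    bbcp -P 10 -s 16 -w 8M /dir1/bigfile.dat user@desthost:/dir2/

    # bigger TCP buffers (values illustrative, not tuned recommendations)
    sysctl -w net.core.rmem_max=134217728
    sysctl -w net.core.wmem_max=134217728
    sysctl -w net.ipv4.tcp_rmem="4096 87380 134217728"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"

    # jumbo frames; every hop (NICs and switches) must support MTU 9000
    ip link set dev eth0 mtu 9000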
Bill

On 1/2/20 10:48 AM, Paul Edmon wrote:
> I also highly recommend fpsync. Here is a rudimentary guide to this:
> https://www.rc.fas.harvard.edu/resources/documentation/transferring-data-on-the-cluster/
>
> I can get line speed with fpsync, but single rsyncs usually only get up to about 0.3-1 GB/s. You really want that parallelism. We use fpsync for all our large-scale data movement here and Globus for external transfers.
>
> -Paul Edmon-
>
> On 1/2/20 10:45 AM, Joe Landman wrote:
>>
>> On 1/2/20 10:26 AM, Michael Di Domenico wrote:
>>> does anyone know or has anyone gotten rsync to push wire speed transfers of big files over 10G links? i'm trying to sync a directory with several large files. the data is coming from local disk to a lustre filesystem. i'm not using ssh in this case. i have 10G ethernet between both machines. both end points have more than enough spindles to handle 900 MB/sec.
>>>
>>> i'm using 'rsync -rav --progress --stats -x --inplace --compress-level=0 /dir1/ /dir2/' but each file (which is 100's of GB's) is getting choked at 100 MB/sec
>>
>> A few thoughts:
>>
>> 1) Are you sure your traffic is traversing the high-bandwidth link? Always good to check ...
>>
>> 2) How many files are you transferring? Are these generally large files, many small files, or a distribution with a long tail towards small files? The latter two will hit your metadata system fairly hard, and in the case of Lustre, performance will depend critically upon the MDS/MDT architecture and implementation. FWIW, on the big system I was working on setting up late last year we hit MIOP-level reads/writes, but then again, that was architected correctly.
>>
>> 3) Wire-speed transfers are generally the exception unless you are doing large sequential single files. There are tricks you can do to enable this, but they are often complex. You can use an array of writers/readers and leverage parallelism, but you risk invoking congestion/pause throttling on your switch.
>>
>>> running iperf and dd between the client and the lustre hits 900 MB/sec, so i fully believe this is an rsync limitation.
>>>
>>> googling around hasn't lent any solid advice, most of the articles are people that don't check the network first...
>>>
>>> with the prevalence of 10G these days, i'm surprised this hasn't come up before, or my google-fu really stinks, which doesn't bode well given it's the first work day of 2020 :(
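For the original question, a rough sketch of the array-of-readers/writers idea Joe describes, assuming the top level of /dir1 holds enough entries to spread across the workers (paths and the job count are placeholders taken from the post):

    # run up to 8 rsyncs at once, one per top-level entry in /dir1
    cd /dir1 && ls -1 | xargs -n 1 -P 8 -I{} rsync -a --inplace {} /dir2/

Whether this beats a single rsync depends on how evenly the files spread across the jobs, and on whether the switch starts pause-throttling, as Joe notes.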
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf