Re: [Beowulf] 10G and rsync

2020-01-02 Thread David Mathog
On Thu, 2 Jan 2020 13:32:17 Michael Di Domenico wrote:
> On Thu, Jan 2, 2020 at 12:44 PM David Mathog wrote:
>> 1. Is a single large file transfer rate reasonable?
>> 2. Ditto for several large files?
> yes, if i transfer files outside of rsync performance is reasonable
>> Are you sure there is not a pa

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Lance Wilson via Beowulf
Hi Michael, Your speed looks suspiciously like the maximum HDD speed. How many stripes are the directories set to on the source and destination? This is a common problem for my researchers as they don't understand Lustre. They expect to have single stream bandwidth in the multi-GB/s range and get about the spe

[Beowulf] 10G and rsync

2020-01-02 Thread Michael Di Domenico
On Thu, Jan 2, 2020 at 12:44 PM David Mathog wrote:
> 1. Is a single large file transfer rate reasonable?
> 2. Ditto for several large files?
yes, if i transfer files outside of rsync performance is reasonable
> Are you sure there is not a patrol read ongoing on one system or the
> other?
That

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Jonathan Aquilina
Aren’t you going to have issues with I/O contention if you are copying on the same machine but different directories? Is the transfer even going over the link? Regards, Jonathan Aquilina

Re: [Beowulf] 10G and rsync

2020-01-02 Thread David Mathog
On Thu, 2 Jan 2020 11:27:58 -0500 Michael Di Domenico wrote:
>> 2) how many files are you xfering? Are these generally large files or many small files, or a distribution with a long tail towards small files?
> small, 100-200 files, definitely not a MDT issue
1. Is a single large file transfer rate

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Jonathan Engwall
The whitepaper was interesting. Single core VMs might be your best bet.
On Thu, Jan 2, 2020, 8:48 AM Michael Di Domenico wrote:
> i'll check it, but keep in mind. i'm not copying files between two
> servers, but rather between two directories on the same server.
>
> ideally if rsync is still us

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Michael Di Domenico
i'll check it, but keep in mind. i'm not copying files between two servers, but rather between two directories on the same server. ideally if rsync is still using ssh under the covers in my scenario, i'm hopeful hpn-ssh might alleviate the bottleneck condition. if it's not i'm back to square one

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Alex Chekholko via Beowulf
Hi Michael, I would recommend trying 'bbcp' before 'hpn-ssh' as the latter will really only benefit you for high-latency links, e.g. across country. Put the bbcp binary on both sides and try it out. If you don't have a way to install bbcp into a system $PATH, you can specify the absolute path to

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Michael Di Domenico
just to further the discussion and for everyone's education i found this whitepaper, which seems to confirm what i see https://www.intel.com/content/dam/support/us/en/documents/network/sb/fedexcasestudyfinal.pdf maybe hpn-ssh is something i can work into my process On Thu, Jan 2, 2020 at 10:26

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Michael Di Domenico
> 1) are you sure your traffic is traversing the high bandwidth link?
> Always good to check ...
Yup, only have one on each side, one switch connecting them together. (ex network engineer, always check the network first, we can't be trusted) :)
> 2) how many files are you xfering? Are these gene

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Bill Abbott
If you have no choice but to use a single rsync then either set up an rsyncd server on the other end to bypass ssh or use something like hpn-ssh for performance.
Bill
On 1/2/20 10:52 AM, Bill Abbott wrote:
> Fpsync and parsyncfp both do a great job with multiple rsyncs, although
> you have to be

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Bill Abbott
Fpsync and parsyncfp both do a great job with multiple rsyncs, although you have to be careful about --delete. The best performance for fewer, larger files, if it's an initial or one-time transfer, is bbcp with multiple streams. Also jack up the tcp send buffer and turn on jumbo frames. Bill

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Paul Edmon
I also highly recommend fpsync. Here is a rudimentary guide to this: https://www.rc.fas.harvard.edu/resources/documentation/transferring-data-on-the-cluster/ I can get line speed with fpsync but single rsyncs usually only get up to about 0.3-1 GB/s. You really want that parallelism. We use f
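[Editor's sketch] The parallelism recommended above can be illustrated by splitting a file list into size-balanced batches and handing each batch to its own rsync process. This is a minimal sketch of the general idea, not how fpsync or parsyncfp are actually implemented. The helper names `partition_by_size` and `rsync_argv` are hypothetical, as are the example paths; only the rsync options `-a` and `--files-from` are real.

```python
import heapq
from typing import Dict, List, Tuple


def partition_by_size(sizes: Dict[str, int], workers: int) -> List[List[str]]:
    """Greedy largest-first assignment of files to `workers` batches.

    `sizes` maps a relative path to its size in bytes. Each file goes to
    whichever batch currently holds the fewest bytes.
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    # Min-heap of (bytes_assigned, batch_index): lightest batch is on top.
    heap: List[Tuple[int, int]] = [(0, i) for i in range(workers)]
    heapq.heapify(heap)
    batches: List[List[str]] = [[] for _ in range(workers)]
    # Largest files first; ties broken by name so the result is deterministic.
    for path, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        batches[idx].append(path)
        heapq.heappush(heap, (total + size, idx))
    return batches


def rsync_argv(list_file: str, src_dir: str, dst_dir: str) -> List[str]:
    """Build (but do not run) one rsync command for one batch.

    `list_file` is assumed to contain the batch's paths, one per line,
    relative to `src_dir`.
    """
    return ["rsync", "-a", f"--files-from={list_file}", src_dir, dst_dir]


if __name__ == "__main__":
    # Hypothetical file sizes in bytes, for illustration only.
    example = {"a.dat": 100, "b.dat": 90, "c.dat": 10, "d.dat": 5}
    for i, batch in enumerate(partition_by_size(example, workers=2)):
        print(i, batch, rsync_argv(f"/tmp/batch{i}.txt", "/dir1/", "/dir2/"))
```

Each command could then be launched concurrently, for example with `subprocess.Popen`, with one batch list written per worker. As noted elsewhere in the thread, take care with `--delete` when running several rsyncs against the same destination.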

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Joe Landman
On 1/2/20 10:39 AM, Michael Di Domenico wrote:
> On Thu, Jan 2, 2020 at 10:35 AM Chris Dagdigian wrote:
>> - I noticed you did not test small file / metadata operations. My past experience has found that this was the #1 cause of slowness in rsync and other file transfers. iperf and IOR tests are al

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Joe Landman
On 1/2/20 10:26 AM, Michael Di Domenico wrote:
> does anyone know or has anyone gotten rsync to push wire speed transfers of big files over 10G links? i'm trying to sync a directory with several large files. the data is coming from local disk to a lustre filesystem. i'm not using ssh in this ca

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Michael Di Domenico
On Thu, Jan 2, 2020 at 10:35 AM Chris Dagdigian wrote:
>
> - I noticed you did not test small file / metadata operations. My past
> experience has found that this was the #1 cause of slowness in rsync and
> other file transfers. iperf and IOR tests are all well and good but you
> should run someth

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Chris Dagdigian
A few times a year I need to shift a few petabytes over the wire for a client and based on last year's project some thoughts ... - I noticed you did not test small file / metadata operations. My past experience has found that this was the #1 cause of slowness in rsync and other file transfers.

Re: [Beowulf] 10G and rsync

2020-01-02 Thread Carsten Aulbert
Hi
we usually see close to wire-speed with rsync or other tools...
On 1/2/20 4:26 PM, Michael Di Domenico wrote:
> i'm using 'rsync -rav --progress --stats -x --inplace
> --compress-level=0 /dir1/ /dir2/' but each file (which is 100's of
> GB's) is getting choked at 100MB/sec
Hmm, Isn't this rsyn

[Beowulf] 10G and rsync

2020-01-02 Thread Michael Di Domenico
does anyone know or has anyone gotten rsync to push wire speed transfers of big files over 10G links? i'm trying to sync a directory with several large files. the data is coming from local disk to a lustre filesystem. i'm not using ssh in this case. i have 10G ethernet between both machines.
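[Editor's sketch] To put "wire speed" on a 10G link next to the roughly 100MB/sec per-file rate reported in this thread, the arithmetic can be worked through as below. It assumes decimal units (1 GB = 1000 MB) and ignores Ethernet, IP and TCP overhead, so the real ceiling sits somewhat below the figure computed here. The function names are hypothetical.

```python
def line_rate_mb_per_s(gbit_per_s: float) -> float:
    """Raw link rate in MB/s (decimal units, protocol overhead ignored)."""
    return gbit_per_s * 1000 / 8


def transfer_seconds(size_gb: float, rate_mb_per_s: float) -> float:
    """Seconds needed to move `size_gb` gigabytes at `rate_mb_per_s`."""
    return size_gb * 1000 / rate_mb_per_s


if __name__ == "__main__":
    wire = line_rate_mb_per_s(10)   # 10 Gbit/s link
    observed = 100.0                # MB/s, the rate reported in the thread
    print(f"raw 10G rate: {wire:.0f} MB/s")
    print(f"observed rate is {observed / wire:.0%} of the raw rate")
    # A single 100 GB file, as an example of the "100's of GB" files mentioned.
    print(f"100 GB at observed rate: {transfer_seconds(100, observed):.0f} s")
    print(f"100 GB at raw 10G rate:  {transfer_seconds(100, wire):.0f} s")
```

The point of the comparison is only scale: a rate near 100 MB/s is far enough below the raw link rate that the network itself is unlikely to be the limiting factor.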