Michael Di Domenico wrote:
> lets see if i can clarify
>
> assuming there are two clusters - clusterA and clusterB
>
> Each cluster is 32nodes and has 50TB of storage attached

Attached how? Is the 50TB sitting on one file server on each cluster, or is it distributed across the cluster? We need more details.

> the aggregate network bandwidth between the clusters is 800MB/sec
>
> the problem is the per-node bandwidth on clusterB is 30MB/sec

Is there a switch on each cluster so that each node can write directly to the interconnect between clusters? Specifically, can node A12 write to node B12? Sounds like there might be, and since you seem to care about the per-node bandwidth on the target, it sounds like the data is distributed on A and will again be distributed across nodes on B. If that's what you mean, then you just need to queue up a job on each node to do something like:

    (cd $DATADIRECTORY ; tar -cf - . ) \
      | ssh matching_target_node 'cd $DATADIRECTORY; tar -xf -'

(This assumes $DATADIRECTORY is also defined on the target node.) It will run in parallel, using up all of your interconnect bandwidth.

If, on the other hand, the only per-node rate you care about is that of the one fileserver on B, then it is a different problem.

On the other, other hand, if you can temporarily store the data on each node of B, and the cumulative bandwidth that way is 800MB/sec, you could conceivably transfer it in parallel from A to all 32 destinations in B, and put the mess back together in B later. However, if you are still rate limited to 30MB/sec on a single B fileserver, then the total time to complete this operation will not change; only the time the data is in transit between the clusters will be reduced.

Regards,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
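
P.S. As a rough sanity check on that last point, here is a back-of-envelope sketch of the transfer times for 50TB at the two rates mentioned. It assumes decimal units (1 TB = 1,000,000 MB) and that each path actually sustains its nominal rate, which real links rarely do; the transfer_seconds helper is made up for illustration.

```sh
#!/bin/sh
# Back-of-envelope transfer times.
# Assumptions: decimal units (1 TB = 1,000,000 MB), nominal rates sustained.

# transfer_seconds is a hypothetical helper: $1 = size in TB, $2 = rate in MB/s
transfer_seconds() {
    echo $(( $1 * 1000000 / $2 ))
}

aggregate=$(transfer_seconds 50 800)   # all nodes in parallel over the interconnect
single=$(transfer_seconds 50 30)       # everything funnelled through one 30MB/sec fileserver

echo "800MB/sec aggregate: ${aggregate} s (about $(( aggregate / 3600 )) hours)"
echo "30MB/sec fileserver: ${single} s (about $(( single / 86400 )) days)"
```

With these numbers the parallel path works out to roughly 17 hours, while a single 30MB/sec fileserver needs roughly 19 days, so the fileserver dominates no matter how fast the data crosses between the clusters.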