On Fri, Mar 05, 2010 at 12:32:37PM -0500, Michael Di Domenico wrote:
> As I expect from the smartest sysadmins on the planet, everyone has
> over-analyzed the issue... :)
>
> Let's see if I can clarify.
>
> Assuming there are two clusters - clusterA and clusterB
>
> Each cluster is 32 nodes and has 50TB of storage attached
>
> The aggregate network bandwidth between the clusters is 800MB/sec
>
> The problem is the per-node bandwidth on clusterB is 30MB/sec
>
> So if I use a single node to copy the 20TB of data from clusterB, yes,
> it's going to take me 7 days to copy everything
>
> I'd like to parallelize that across multiple nodes to drive the aggregate up
>
> I was hoping someone would pop up and say, hey, use this magical piece of
> software (which I'm unable to locate)..
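
For what it's worth, here is a rough back-of-the-envelope check of those numbers in Python. It assumes decimal units (1 TB = 10^6 MB), ideal scaling across nodes, and that the 800MB/sec aggregate is a hard cap; the function name is only for illustration.

```python
import math

def transfer_days(total_tb, per_node_mb_s, nodes, aggregate_cap_mb_s):
    """Estimate copy time in days, assuming ideal scaling up to the aggregate cap."""
    total_mb = total_tb * 1_000_000  # assumption: decimal units, 1 TB = 10^6 MB
    rate = min(nodes * per_node_mb_s, aggregate_cap_mb_s)
    return total_mb / rate / 86_400  # 86,400 seconds per day

# One node at 30 MB/sec
single = transfer_days(20, 30, 1, 800)

# Smallest node count whose combined rate reaches the 800 MB/sec aggregate
saturating_nodes = math.ceil(800 / 30)
parallel = transfer_days(20, 30, saturating_nodes, 800)

print(f"1 node: {single:.1f} days")
print(f"{saturating_nodes} nodes: {parallel * 24:.1f} hours")
```

Under those assumptions a single node needs a bit under 8 days, while 27 nodes would already saturate the link and bring the copy down to roughly 7 hours, so there is little to gain from using all 32.
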

You might be able to use "dar" for this:

http://dar.linux.free.fr/

Dar will let you slice up your 20 TB of data into even-sized pieces that you
can transfer in parallel, then reconstruct on the receiving side.

David S.

> On Fri, Mar 5, 2010 at 11:30 AM, kyron <ky...@neuralbs.com> wrote:
> > On Fri, 05 Mar 2010 11:22:14 -0500, Mike Davis <jmdav...@vcu.edu> wrote:
> >> Michael Di Domenico wrote:
> >>> How does one copy large (20TB) amounts of data from one cluster to
> >>> another?
> >>>
> >>> Assuming that each node in the cluster can only do about 30MB/sec
> >>> between clusters and i want to preserve the uid/gid/timestamps, etc
> >>>
> >> If the clusters are co-lo I wouldn't copy I would use shared storage. If
> >> they are not co-located I would use patience.
> >>
> >> Seriously though, for a one time copy, I would consider copying to an
> >> external system and then physically moving that system. To do this and
> >> preserve ownerships you will need to duplicate accounts and groups.
> >
> > ...and we are all assuming non-compressibility; otherwise, use pbzip2 ;)
>
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
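
A rough sketch of the "even-sized pieces, transferred in parallel" idea, separate from dar itself: split a file list into one batch per copy node so that each batch carries about the same amount of data. This is a plain greedy heuristic (largest file first, into the lightest batch), not a description of how dar slices archives; the function name and the sample paths and sizes are made up for illustration.

```python
import heapq

def split_into_batches(files, n_batches):
    """Greedily assign (path, size) pairs to n_batches of roughly equal total size."""
    # Heap of (total_size, batch_index); the lightest batch is always on top.
    heap = [(0, i) for i in range(n_batches)]
    heapq.heapify(heap)
    batches = [[] for _ in range(n_batches)]
    # Place the largest files first so the small ones can even things out.
    for path, size in sorted(files, key=lambda f: f[1], reverse=True):
        total, idx = heapq.heappop(heap)
        batches[idx].append(path)
        heapq.heappush(heap, (total + size, idx))
    totals = [0] * n_batches
    for total, idx in heap:
        totals[idx] = total
    return batches, totals

# Hypothetical file list: (path, size in GB)
files = [("/data/a", 700), ("/data/b", 500), ("/data/c", 400),
         ("/data/d", 300), ("/data/e", 100)]
batches, totals = split_into_batches(files, 2)
for batch, total in zip(batches, totals):
    print(total, batch)
```

Each batch could then be handed to a separate node running whatever copy tool preserves uid/gid/timestamps, which is what drives the aggregate rate up.
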