Henning Fehrmann wrote:
Hi everybody,

Copying a big file onto all nodes in a cluster is a rather common problem.
I would have thought that there might be a standard tool for distributing the files in an efficient way. So far, I haven't found one.

Assume one has a network design which allows non-blocking full-duplex
wire-speed connections between N/2 pairs of nodes, where N is the number
of nodes in the cluster. It is basically a non-blocking core switch.
In this case the following scheme would be convenient and rather simple:

The file is placed on node n1 and one builds a chain of nodes n1, n2, ..., nN.

One splits the file into many packages (p1..pM); let's say a fragment fits
into one TCP packet. In the first step n1 transmits the package p1 to node n2.
In the second step n1 transmits the package p2 to n2 and n2 transmits p1 to
node n3.
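
The step-by-step schedule described above can be sketched in a few lines of Python. This is only an illustration of the idealized scheme, assuming every hop takes exactly one step; the function name chain_schedule is made up for this example.

```python
def chain_schedule(num_nodes, num_packages):
    """Return, for each step, the list of (sender, receiver, package) transfers.

    Nodes and packages are numbered from 1, matching n1..nN and p1..pM.
    """
    steps = []
    # The last package leaves n1 at step M and needs N-2 more hops to reach nN.
    total_steps = num_packages + num_nodes - 2
    for step in range(1, total_steps + 1):
        transfers = []
        for sender in range(1, num_nodes):   # nN only receives
            package = step - sender + 1      # package this sender forwards now
            if 1 <= package <= num_packages:
                transfers.append((sender, sender + 1, package))
        steps.append(transfers)
    return steps
```

For example, chain_schedule(4, 3)[1] gives [(1, 2, 2), (2, 3, 1)]: in the second step n1 sends p2 to n2 while n2 sends p1 to n3. Each node sends at most one package and receives at most one package per step, which is why full-duplex links matter.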

Someone has implemented this bucket brigade model for data transfer. It's not the only one available, as each NIC has two neighbors to communicate with, and thus winds up at effectively 1/2 the bandwidth, or a serialization of the packets. Not that this is a bad thing, but for big file distribution, this could be a problem.


The transmission of a single package is fast. The time to pass a particular
package through the whole chain of nodes is short compared with the time of
the entire copying process. E.g., using jumbo frames a package can have a
size of ca. 10 kB. In a Gb network the transmission time of a single package
between nodes is of the order of 0.1 ms. Even in a cluster with 1024 nodes it
takes in the ideal case just 0.1 s to pass a package from node n1 through all
nodes to n1024.

On each node the package is stored and, in the end, one reassembles the file.
For big files (size >> 10 MB) the required time is approximately the same as that needed to copy the file between two nodes, plus 0.1 s.
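
As a rough check of that estimate, here is a small calculation under the same idealized assumptions (10 kB packages, 1 Gb/s links, no protocol overhead, no latency, no disk limits). The function name and default values are just for illustration.

```python
def chain_copy_time(file_bytes, num_nodes, package_bytes=10_000, link_bits_per_s=1e9):
    """Return (pair_time, chain_time) in seconds for an idealized transfer.

    pair_time:  time to copy the file between two nodes.
    chain_time: time until the last node in a chain of num_nodes has the file.
    """
    t_package = package_bytes * 8 / link_bits_per_s   # time for one hop
    num_packages = -(-file_bytes // package_bytes)    # ceiling division
    pair_time = num_packages * t_package
    # The last package leaves n1 after M steps and needs N-2 further hops.
    chain_time = (num_packages + num_nodes - 2) * t_package
    return pair_time, chain_time
```

With a 1 GB file and 1024 nodes this gives roughly 8 s for the two-node copy and less than 0.1 s extra for the whole chain, in line with the numbers above.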

One basically needs a daemon which handles copy requests and establishes the connection to the next node in the chain.
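
A minimal sketch of what such a relay could look like, using only the Python standard library. It is not any existing tool: the names relay and send_file, the chunk size, and the "one connection, one file" convention are all assumptions made for this example, and it has no error handling or request protocol. Since TCP is a byte stream, the code forwards chunks as they arrive instead of managing individual packets itself.

```python
import socket

CHUNK = 64 * 1024  # read size; an arbitrary choice, not tuned


def relay(listen_sock, out_path, next_addr=None):
    """Receive one file on listen_sock, store it, and forward it down the chain.

    next_addr is the (host, port) of the next node, or None for the last node.
    """
    upstream, _ = listen_sock.accept()
    downstream = socket.create_connection(next_addr) if next_addr else None
    try:
        with upstream, open(out_path, "wb") as out:
            while True:
                chunk = upstream.recv(CHUNK)
                if not chunk:                  # upstream closed: end of file
                    break
                out.write(chunk)               # keep a local copy
                if downstream:
                    downstream.sendall(chunk)  # pass the bucket on
    finally:
        if downstream:
            downstream.close()                 # signals end of file downstream


def send_file(path, first_addr):
    """Head of the chain (n1): stream the file to the first relay."""
    with socket.create_connection(first_addr) as sock, open(path, "rb") as src:
        while chunk := src.read(CHUNK):
            sock.sendall(chunk)
```

Each node would run relay() on a listening socket with next_addr pointing at its successor, and n1 would call send_file() with the address of n2. A real daemon would also need to agree on file names, detect failed nodes, and verify the data.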

Has somebody written such a tool?

I saw something like this several years ago.

We were working on a different type of tool that exploited the fact that you have N/2 pairs, and tried to maximize the flow to these N/2 pairs. It included error correction and a few other nice things (multi-sourcing was on the roadmap). Never could find interested customers/users for it, so it fell off the radar. We called it xcp, and you used it as

        xcp [set of files] cluster://name/path/to/deposit/files/into

and it handled it all for you.

Prior to that, we had a system that used multicast, but after seeing what this did to other traffic on the gigabit switches, we went away from that. That was mcp, and was dated around 2000-ish or so.

You can use BitTorrent to do something approximately like xcp, though at lower performance.

Joe


Cheers,
Henning Fehrmann

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615