Douglas Eadline wrote:
[...]
Indeed, an excellent question. It seems logical, does it really help though
(or do I just feel clever about using the extra Ethernet Port)  I can see
that if you have a lot of monitoring traffic that might cause an issue,
but I have never tested that notion as well. Of course it all depends...
I wonder if a dual Ethernet node would be better served by something like
a FNN (http://aggregate.org/FNN/) Tim Mattox can probably weigh in on
this.

Hello, Doug.

One of the first systems I saw that used dual Ethernet networks, as opposed to just channel bonding multiple NICs, was the EPCC BOBCAT:

        http://www.epcc.ed.ac.uk/bobcat/

Although this system has now been dismantled, it inspired me to build a similar cluster here at the Rowett:

        http://bobcat.rri.sari.ac.uk

The most important feature of a 'BOBCAT' architecture Beowulf is the use of 'diskless' compute nodes with separate dual network fabrics for the 'system' and 'application' traffic. The 'diskless' nodes are really 'dataless' because they have scratch disks for /tmp and swap, but no operating system installed.

This approach is useful because it means that you can still control the Beowulf cluster via the 'system' network even if the 'application' network becomes saturated. The traffic is segregated between the two private network fabrics.

In fact, the system I built here has three NICs in the servers and uses NAT on the head node, so that compute nodes can make outgoing connections to the internet from the private cluster network via the LAN. This allows, for example, our [EMAIL PROTECTED] jobs on the nodes to download work units.

This system works very well and, incidentally, demonstrates that poor performance of 'diskless' compute nodes with NFS-mounted root filesystems might have more to do with saturation of the cluster interconnect by HPC 'application' traffic than with NFS congestion on a 64-node cluster. I'm aware that NFS does not scale up very well to large clusters: No flames!

Our cluster has three networks:

143.234.32.0    LAN             100Base-T       (public)
192.168.0.0     System          100Base-T       (private)
192.168.1.0     Application     Gigabit         (private)

The compute nodes have two NICs, one connected to each private network. The servers have three NICs, connected to both private networks and the LAN.
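
To make the segregation concrete, here is a rough Python sketch that maps an address onto one of the three networks listed above. Note the assumptions: I only gave the network addresses, so the /24 prefix lengths are a guess for illustration, and the host addresses and the function names (fabric_of, is_private_fabric) are hypothetical, not something we actually run on the cluster.

```python
import ipaddress

# The three networks from the table above. The /24 prefix lengths are an
# ASSUMPTION for illustration: only the network addresses were given.
NETWORKS = {
    "LAN": ipaddress.ip_network("143.234.32.0/24"),
    "System": ipaddress.ip_network("192.168.0.0/24"),
    "Application": ipaddress.ip_network("192.168.1.0/24"),
}


def fabric_of(address):
    """Return the name of the network containing 'address', or None."""
    ip = ipaddress.ip_address(address)
    for name, network in NETWORKS.items():
        if ip in network:
            return name
    return None


def is_private_fabric(address):
    """True if 'address' is on one of the two private cluster fabrics."""
    return fabric_of(address) in ("System", "Application")


if __name__ == "__main__":
    # Hypothetical host addresses, chosen only to exercise each network.
    for addr in ("192.168.0.17", "192.168.1.17", "143.234.32.5"):
        print(addr, fabric_of(addr), is_private_fabric(addr))
```

A check like this could be used, for example, to confirm that a monitoring or control daemon is bound to a 'system' address rather than an 'application' one.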

        Tony.
--
Dr. A.J.Travis,                     |  mailto:[EMAIL PROTECTED]
Rowett Research Institute,          |    http://www.rri.sari.ac.uk/~ajt
Greenburn Road, Bucksburn,          |   phone:+44 (0)1224 712751
Aberdeen AB21 9SB, Scotland, UK.    |     fax:+44 (0)1224 716687
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
