Prentice Bisbal wrote:
Alan Ward wrote:
Good day.
I have been reading the ongoing discussion on network usage with some
interest, mainly because in all the (admittedly very small, 4- to
8-node) clusters we have set up so far, we have always gone with
doubling the network. Nowadays we mostly run a 100 Mbit/s "el cheapo"
Fast Ethernet network for control, NFS and monitoring, while the faster
Gigabit Ethernet network is exclusively for MPI. Applications are CFD,
with various levels of granularity.
Anybody care to comment?
-Alan
My new cluster, which is still in labor, will have InfiniBand for MPI,
and we have 10 Gigabit Ethernet switches for management, NFS, etc. The
nodes only have 1 Gigabit Ethernet ports, so it will effectively be a
1 Gb network.
I'm also curious whether the dual networks are overkill, and whether
putting I/O on the slower network will actually make the system slower
than running all traffic over IB, since the nodes would spend longer
waiting for those I/O operations to finish.
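One way to get a feel for that question is a quick back-of-the-envelope
calculation. The little Python sketch below is only illustrative: the
checkpoint size, compute interval and the usable-bandwidth figures for
GbE and IB are assumptions, not measurements, so plug in your own
numbers.

# Rough stall-time comparison for writing results over a slow network
# versus a fast one.  Every number here is an illustrative assumption.
checkpoint_gb = 1.0        # GB written per checkpoint (assumed)
compute_minutes = 10.0     # compute time between checkpoints (assumed)

bandwidth_mb_s = {
    "1 GbE (rough usable)": 110.0,   # ~0.9 Gbit/s of payload, assumed
    "IB   (rough usable)":  900.0,   # order-of-magnitude assumption only
}

for fabric, mb_s in bandwidth_mb_s.items():
    stall_s = checkpoint_gb * 1024.0 / mb_s
    overhead = 100.0 * stall_s / (compute_minutes * 60.0 + stall_s)
    print("%-22s stall %6.1f s per checkpoint (%4.1f%% of wall time)"
          % (fabric, stall_s, overhead))

If the stall is a couple of percent of wall time either way, the second
network is not costing you much; if your codes checkpoint or read input
constantly, the gap between the two lines is exactly the extra waiting
described above.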
Obviously YMMV, but when tuning systems and looking for where the
bottlenecks are, we check for a few obvious things in the network
design:
1) Poorly designed (office-quality, usually uplinked and oversubscribed)
networks have an interesting effect: take two 128-port gigabit switches
connected together by a 1 or 2 gigabit, or even a 10 GbE, uplink, and
you see a plateau in MPI scalability as soon as jobs span both switches
(see the back-of-the-envelope numbers below). Yes, from a real customer
case. :(
2) I/O: after burning incense to the daemons of low latency, the next
big area we see trouble in is (curiously enough) I/O bandwidth. I can't
count how many times I have seen a big, expensive, shiny new cluster
with an absolutely terrible I/O design, usually starting with a NAS
hanging off a single 1 GbE link for 32 or more nodes. Well, OK, YMMV,
but when you run lots of Gaussian jobs that hammer NFS over that link,
you are going to experience pain; the per-node numbers below show why.
And no, this is not where you stick the NetApps. Yes, from several real
customer cases. :( :(
No, you don't need to perform ritual incantations to make the I/O go
faster. You just need good hardware and good design to get the data to
the hardware.
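To put rough numbers on both cases above (a sketch only; the port
counts, uplink widths and node count are just the figures mentioned in
this thread, and protocol overhead is ignored):

# Back-of-the-envelope per-node bandwidth for the two cases above.
# Idealized line rates, no protocol overhead; illustrative only.

# Case 1: two 128-port GbE switches joined by a thin uplink, with every
# node talking to a peer on the other switch (worst case for MPI).
nodes_per_switch = 128
for uplink_gbit in (1, 2, 10):
    per_node_mbit = uplink_gbit * 1000.0 / nodes_per_switch
    print("%2d Gbit uplink -> ~%5.1f Mbit/s per node across switches "
          "(oversubscription %.0f:1)"
          % (uplink_gbit, per_node_mbit,
             nodes_per_switch / float(uplink_gbit)))

# Case 2: one NAS on a single 1 GbE link serving 32 nodes at once.
nas_link_mb_s = 110.0      # rough usable payload on 1 GbE, assumed
nodes = 32
print("NFS share per node: ~%.1f MB/s when all %d nodes hit the NAS "
      "together" % (nas_link_mb_s / nodes, nodes))

Single-digit Mbit/s per node across the uplink, and a few MB/s per node
from the NAS, are exactly where the MPI plateau and the Gaussian pain
come from.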
1 GbE may be great for embarrassingly parallel (EP) jobs that only
occasionally write to disk. But if your jobs are going to hammer on the
disks, you need to check your I/O design and make sure it scales; a
quick sanity check follows below. Yes, we are biased: we strongly
believe in good I/O systems.
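The crude check: time a streaming write from one node, then from every
node at once, and see whether the aggregate grows. The path below is
just an assumed placeholder, and this is only a rough sanity check, not
a real benchmark; use something like IOR or iozone when you want numbers
you can trust.

import os
import time

# Crude streaming-write check.  Run it on one node, then simultaneously
# on all nodes (e.g. via the scheduler), and compare the per-node MB/s.
# TARGET is a placeholder path on the shared filesystem under test.
TARGET = "/mnt/nfs/scratch/io_check.%d" % os.getpid()
SIZE_MB = 512
CHUNK = b"\0" * (1 << 20)          # 1 MB per write

start = time.time()
with open(TARGET, "wb") as f:
    for _ in range(SIZE_MB):
        f.write(CHUNK)
    f.flush()
    os.fsync(f.fileno())           # make sure the data really left the node
elapsed = time.time() - start

print("%d MB in %.1f s -> %.1f MB/s" % (SIZE_MB, elapsed, SIZE_MB / elapsed))
os.remove(TARGET)

If the aggregate across N nodes stays pinned at roughly one link's worth
of bandwidth no matter how many nodes join in, the bottleneck is the
design, not the disks.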
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf