On Tue, Aug 5, 2008 at 4:25 PM, Gus Correa <[EMAIL PROTECTED]> wrote:
> Is anybody using Infiniband to provide both
> MPI connection and parallel file system services on a Beowulf cluster?
>
> I thought to have a storage node that would
> serve a parallel file system to the beowulf nodes over IB
> (something like a NFS on steroids).
> The same IB net would also work as the MPI interconnect.
>
> Is this design possible?

We have customers doing Lustre and MPI with IB successfully. They still have a good old gigabit management network to fall back on: it makes sense to keep this around because gigabit is so low-cost by comparison and it's rock-solid.

But you should know that you need more than a single node providing disk I/O before you start to see the performance benefit. I/O from a single node can, generally, barely fill a gigabit link. To exceed that gigabit level of performance, you'd need more than one storage node delivering storage to the Lustre network.

> On a small cluster, does it require two separate IB physical networks (cards
> and switch),
> or can it be done with a single IB card per node and one switch?

It can be done with a single IB network.

> Is this design efficient?

Generally speaking, MPI programs will not be fetching/writing data from/to storage at the same time they are doing MPI calls, so there tends not to be much contention to worry about at the node level.

> Are there other practical and cost effective alternatives to this idea?

If the cluster is small enough, using gigabit with a shared filesystem is preferred, since IB's low latency has relatively little effect on the big source of latency in any storage system: the physical disks. It's not until you cross the gigabit bandwidth barrier that IB really starts to make sense, and that's a barrier that's not crossed that often in a small cluster.

> Would this type of design work with GigE instead of IB?

Yes, but you'd still want IB for low latency MPI traffic.

> I confess I know nothing about parallel file systems and IB.
> So, please forgive me if my questions are nonsense.

Lustre and Panasas are certainly both stable options in this area.

--
Jason D. Clinton
Advanced Clustering Technologies, Inc.
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
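
To put rough numbers on the gigabit point in the reply, here is a back-of-the-envelope sketch in Python. The 125 MB/s figure is simply the 1 Gb/s line rate divided by 8, before any protocol overhead. The 100 MB/s per-storage-node rate is a hypothetical value picked for illustration, not a measurement from the thread, and the function names are made up for this example. It also assumes the best case where every storage node streams at its full rate at once.

```python
import math

# 1 Gb/s line rate expressed in MB/s, ignoring protocol overhead.
GIGABIT_MB_PER_S = 1000 / 8

# Hypothetical figure: one storage node sustaining about 100 MB/s from its disks.
PER_NODE_MB_PER_S = 100.0


def aggregate_storage_mb_per_s(storage_nodes: int, per_node_mb_per_s: float) -> float:
    """Best-case combined throughput if every storage node streams at full rate."""
    return storage_nodes * per_node_mb_per_s


def nodes_needed_to_exceed(link_mb_per_s: float, per_node_mb_per_s: float) -> int:
    """Smallest number of storage nodes whose combined rate exceeds the link rate."""
    return math.floor(link_mb_per_s / per_node_mb_per_s) + 1


for n in (1, 2, 4):
    total = aggregate_storage_mb_per_s(n, PER_NODE_MB_PER_S)
    verdict = "exceeds" if total > GIGABIT_MB_PER_S else "fits within"
    print(f"{n} storage node(s): {total:.0f} MB/s {verdict} "
          f"gigabit ({GIGABIT_MB_PER_S:.0f} MB/s)")
```

Under these assumed numbers a single storage node stays inside what gigabit can carry, and it takes a second node before the faster interconnect has anything extra to move. Real per-node rates depend on the disks and controllers, so substitute measured values before drawing conclusions.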
