Greg Lindahl wrote:
Joe Landman wrote:

Contrary to the comments from the technology's detractors, the
TOE/RDMA card *did* provide a fairly significant performance delta for
real apps running MPI over gigabit ethernet.

As a detractor of TOEs, I should point out that one data point does
not prove that it's common that apps get a benefit.

True; however, it does show that it is possible to get better performance.

I'd be willing to bet that this app was doing extremely large
transfers, and maybe even managed to get more concurrency with the

The app was StarCD. These are not big transfers; it doesn't move gigabytes to its nodes.

TOE... which could easily be a flaw in the MPI implementation's TCP
driver, a pretty common thing to be wrong. For example, LAM was always

Yes, this is possible.

much better than MPICH over TCP, and I wouldn't be surprised if
OpenMPI continues this superiority over MPICH-2.

There are minor issues with OpenMPI and things like Overflow, but other than that it works extremely well.

The most interesting thing, to me, is that the various people selling
TOEs in the HPC arena publish almost no benchmarks. What's the message
rate and N1/2? The only N1/2 I've ever seen published was 100 kbytes.
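
For reference, N1/2 is the message size at which a link delivers half of its peak bandwidth. The following is a minimal sketch, assuming the simple cost model T(n) = latency + n / peak_bw; that model, the function names, and the example numbers are illustrative assumptions and do not come from any benchmark discussed in this thread.

```python
# Sketch of N1/2 under an ASSUMED linear cost model:
#   T(n) = latency + n / peak_bw
# All names and numbers here are hypothetical illustrations.

def transfer_time(n_bytes, latency_s, peak_bw):
    """Time in seconds to send one message of n_bytes."""
    return latency_s + n_bytes / peak_bw

def effective_bandwidth(n_bytes, latency_s, peak_bw):
    """Achieved bandwidth in bytes/s for one message of n_bytes."""
    return n_bytes / transfer_time(n_bytes, latency_s, peak_bw)

def n_half(latency_s, peak_bw):
    """Message size in bytes where effective bandwidth is peak_bw / 2.

    Solving n / (latency + n / peak_bw) = peak_bw / 2 gives
    n = latency * peak_bw.
    """
    return latency_s * peak_bw

def implied_latency(n_half_bytes, peak_bw):
    """Per-message overhead in seconds implied by a published N1/2."""
    return n_half_bytes / peak_bw
```

Under this model, and taking gigabit ethernet peak as roughly 125e6 bytes/s, `implied_latency(100e3, 125e6)` works out to about 800 microseconds of per-message overhead, which is why a published N1/2 of 100 kbytes is not a flattering number.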

Microbenchmarks concern me less than real application wallclock differences. Frankly, we have seen far too many microbenchmarks pushed while real applications are avoided.

For this test, on 16 machines with 2 processors per machine, the StarCD run was about 4x better on the TOE/RDMA Ammasso card than it was over the exact same infrastructure without the TOE/RDMA. Every other MPI application we ran (Fluent, etc.) showed similar behavior.

Sadly, as Ammasso is out of business, this is not something we can really use these days.

Mark Hahn and others pointed out that the cost-benefit analysis (CBA) for this may not work out well, and I agree. For its cost, TOE/RDMA honestly does not look like it provides significant benefits in HPC relative to other technologies. There may be some specific corner cases where it does, but I think the hardware has improved, and baseline SDR IB is competitive enough with TOE that using TOE may not make much sense in many situations.

(Obviously I'm not including Myricom in this bucket: they do publish
microbenchmarks.)

-- greg


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
