A non-intrusive test you could try is to replace your MPI (mpich) with a
lower-latency one; Scali and MPI/Gamma are just two examples. These can
bring your latency down to 15 µs or so.

gamma is highly hardware dependent.  does scali really provide a latency
improvement independent of hardware?

If this drastically ups your efficiency, you know where your bottleneck is.

indeed.  but another alternative is to find a _SLOWER_ MPI implementation.
in fact, I wonder if there's a handy place in, say, mpich, to put a simple
usleep() for this purpose.  perhaps just enable tracing.

usleep as a tool for performance characterization!
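
to make the idea concrete, here is a rough sketch in Python rather than inside
mpich itself. everything in it is hypothetical: `with_injected_delay`,
`time_workload` and `latency_bound_messages` are illustrative names, and the
`send` callable only stands in for a real MPI send. it assumes the run time
grows roughly linearly with the injected per-message delay, so the slowdown
divided by the delay estimates how many messages sit on the critical path.

```python
import time


def with_injected_delay(send, delay_s):
    """Return a wrapper that sleeps delay_s seconds before each call to send."""
    def delayed_send(*args, **kwargs):
        # Plays the role of the usleep() one might put in the MPI send path.
        time.sleep(delay_s)
        return send(*args, **kwargs)
    return delayed_send


def time_workload(send, n_messages):
    """Stand-in for the real application: issue n_messages sends, return wall time."""
    start = time.perf_counter()
    for i in range(n_messages):
        send(i)
    return time.perf_counter() - start


def latency_bound_messages(t_base, t_delayed, delay_s):
    """Estimate the number of latency-bound messages from the observed slowdown."""
    return (t_delayed - t_base) / delay_s


if __name__ == "__main__":
    sent = []
    delay = 0.001  # 1 ms of artificial latency per message (arbitrary choice)
    t_base = time_workload(sent.append, 50)
    t_slow = time_workload(with_injected_delay(sent.append, delay), 50)
    print("estimated latency-bound messages:",
          latency_bound_messages(t_base, t_slow, delay))
```

if the estimate is close to the total message count, the code is waiting on
nearly every message; if it is much smaller, most of the communication is
overlapped or off the critical path and a lower-latency MPI is unlikely to
help much.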
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
