At 01:23 25.06.2008, Chris Samuel wrote:
> > IMHO, the MPI should virtualize these resources
> > and relieve the end-user/application programmer
> > from the burden.
> IMHO the resource manager (Torque, SGE, LSF, etc) should
> be setting up cpusets for the jobs based on what the
> scheduler has told it to use and the MPI shouldn't
> get a choice in the matter. :-)
I am inclined to agree with you in a perfect
world. But, from my understanding, the resource
managers do not know the relationship between
the cores. E.g., do cores 3 and 5 share a
cache? Do they share a north-bridge bus, or are
they located on different sockets?
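As a rough sketch of the kind of probing this
takes (my illustration only, assuming Linux sysfs
on a 2.6 kernel; exact file names vary between
kernel versions, e.g. older kernels expose
shared_cpu_map rather than shared_cpu_list):

  #include <stdio.h>

  /* Print one topology attribute from sysfs, if present. */
  static void show(const char *label, const char *path)
  {
      char buf[256];
      FILE *f = fopen(path, "r");
      if (f && fgets(buf, sizeof buf, f))
          printf("%s: %s", label, buf);
      if (f)
          fclose(f);
  }

  int main(void)
  {
      /* Which socket does core 3 sit on? */
      show("core 3 socket",
           "/sys/devices/system/cpu/cpu3/topology/physical_package_id");
      /* Which cores share its cache at this level? */
      show("core 3 cache sharers",
           "/sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_list");
      return 0;
  }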
This is information we are using to optimize how
point-to-point communication is implemented. The
code base involved is fairly complicated and I do
not expect resource management systems to cope with it.
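To illustrate the idea (hypothetical names and a
hard-coded topology, not our actual code base):
given topology facts like those probed above, the
library can pick the cheapest channel per process
pair:

  #include <stdio.h>

  enum channel { SHARED_CACHE, SAME_SOCKET_SHM, CROSS_SOCKET_SHM };

  struct core_info { int socket; int cache_group; };

  /* Choose an intra-node channel from core topology. */
  static enum channel pick_channel(struct core_info a,
                                   struct core_info b)
  {
      if (a.cache_group == b.cache_group)
          return SHARED_CACHE;     /* exchange through the shared cache */
      if (a.socket == b.socket)
          return SAME_SOCKET_SHM;  /* shared memory, same socket */
      return CROSS_SOCKET_SHM;     /* shared memory across the north bridge */
  }

  int main(void)
  {
      /* Example: cores 3 and 5 on one socket, different caches. */
      struct core_info core3 = { 0, 1 }, core5 = { 0, 2 };
      printf("channel = %d\n", pick_channel(core3, core5));
      return 0;
  }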
I posted some measurements of the benefit of
this method some time ago and I include it here
as a reference:
http://www.scali.com/info/SHM-perf-8bytes-2007-12-20.htm
If you look at the ping-ping numbers, you will see
a nearly constant message rate, independent of
the placement of the processes. This is contrary to
other MPIs, which (apparently) do not use this technique.
So, in a practical world I go for performance, not perfect layering ;-)
> Also helps when newbies run OpenMP codes thinking they're
> single CPU codes and get 3 or 4 on the same 8 CPU node.
Not sure I follow you here. Do you mean pure OpenMP or hybrid models?
Thanks, Håkon