At 01:23 25.06.2008, Chris Samuel wrote:
> > IMHO, the MPI should virtualize these resources
> > and relieve the end-user/application programmer
> > from the burden.
>
> IMHO the resource manager (Torque, SGE, LSF, etc) should
> be setting up cpusets for the jobs based on what the
> scheduler has told it to use and the MPI shouldn't
> get a choice in the matter. :-)

In a perfect world I am inclined to agree with you. But, to my understanding, the resource managers do not know the relationships between the cores. E.g., do cores 3 and 5 share a cache? Do they share a north-bridge bus, or are they located on different sockets?
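For what it is worth, on Linux this kind of information is exposed under sysfs, so a library can dig it out itself. Here is a minimal sketch of answering the "do cores 3 and 5 share a cache?" question. To be clear: this is not our code; the sysfs paths and the choice of cache index are assumptions (index2 is merely a common spot for the shared L2/L3 on current x86 boxes).

#include <stdio.h>
#include <string.h>

/* Return 1 if the two cores report identical sharing lists for
 * the given cache index, 0 if not, -1 on error. */
static int share_cache(int core_a, int core_b, int cache_index)
{
    char path[128], list_a[256], list_b[256];
    FILE *f;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list",
             core_a, cache_index);
    if (!(f = fopen(path, "r")))
        return -1;
    if (!fgets(list_a, sizeof(list_a), f)) { fclose(f); return -1; }
    fclose(f);

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list",
             core_b, cache_index);
    if (!(f = fopen(path, "r")))
        return -1;
    if (!fgets(list_b, sizeof(list_b), f)) { fclose(f); return -1; }
    fclose(f);

    return strcmp(list_a, list_b) == 0;
}

int main(void)
{
    int shared = share_cache(3, 5, 2);  /* index2: often the shared cache */
    if (shared < 0)
        perror("reading sysfs cache topology");
    else
        printf("cores 3 and 5 %s a cache\n",
               shared ? "share" : "do not share");
    return 0;
}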

This is the information we're using to optimize how point-to-point communication is implemented. The code base involved is fairly complicated, and I do not expect resource management systems to cope with it.
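To give a flavour of what I mean, without showing the real code base: a toy sketch of how topology knowledge could steer per-pair channel parameters. Every name and number below is invented for illustration; it is not our implementation.

#include <stddef.h>
#include <stdio.h>

typedef struct {
    size_t fifo_bytes;   /* per-pair shared-memory ring buffer size */
    size_t eager_limit;  /* above this, switch to rendezvous        */
} chan_params;

static chan_params choose_params(int same_cache, int same_socket)
{
    chan_params p;
    if (same_cache) {
        /* Peers share a cache: keep the FIFO small enough to stay
         * cache-resident, so small messages never touch RAM. */
        p.fifo_bytes  = 16 * 1024;
        p.eager_limit = 1 * 1024;
    } else if (same_socket) {
        /* Same socket, different caches: medium-sized buffers. */
        p.fifo_bytes  = 64 * 1024;
        p.eager_limit = 4 * 1024;
    } else {
        /* Cross-socket: larger buffers amortize the coherency
         * traffic over the north-bridge. */
        p.fifo_bytes  = 256 * 1024;
        p.eager_limit = 16 * 1024;
    }
    return p;
}

int main(void)
{
    chan_params p = choose_params(1 /* same cache */, 1 /* same socket */);
    printf("fifo=%zu bytes, eager limit=%zu bytes\n",
           p.fifo_bytes, p.eager_limit);
    return 0;
}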

I posted some measurements of the benefit of this method some time ago, and I include them here as a reference: http://www.scali.com/info/SHM-perf-8bytes-2007-12-20.htm If you look at the ping-ping numbers, you will see a nearly constant message rate, independent of the placement of the processes. This is contrary to other MPIs, which (apparently) do not use this technique.

So, in a practical world I go for performance, not perfect layering ;-)

> Also helps when newbies run OpenMP codes thinking they're
> single CPU codes and get 3 or 4 on the same 8 CPU node.

Not sure I follow you here. Do you mean pure OpenMP or hybrid models?



Thanks, Håkon

