At 01:23 25.06.2008, Chris Samuel wrote:
> > IMHO, the MPI should virtualize these resources
> > and relieve the end-user/application programmer
> > from the burden.
> IMHO the resource manager (Torque, SGE, LSF, etc) should
> be setting up cpusets for the jobs based on what the
> scheduler has told it to use and the MPI shouldn't
> get a choice in the matter. :-)
I am inclined to agree with you in a perfect
world. But, from my understanding, the resource
managers do not know the relationship between
the cores. E.g., do cores 3 and 5 share a
cache? Do they share a north-bridge bus, or are
they located on different sockets?
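As a rough sketch of the kind of probing this
takes (my illustration only, assuming Linux sysfs
on a 2.6 kernel; exact file names vary between
kernel versions, e.g. older kernels expose
shared_cpu_map rather than shared_cpu_list):

  #include <stdio.h>

  /* Print one topology attribute from sysfs, if present. */
  static void show(const char *label, const char *path)
  {
      char buf[256];
      FILE *f = fopen(path, "r");
      if (f && fgets(buf, sizeof buf, f))
          printf("%s: %s", label, buf);
      if (f)
          fclose(f);
  }

  int main(void)
  {
      /* Which socket does core 3 sit on? */
      show("core 3 socket",
           "/sys/devices/system/cpu/cpu3/topology/physical_package_id");
      /* Which cores share its cache at this level? */
      show("core 3 cache sharers",
           "/sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_list");
      return 0;
  }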
This is information we are using to optimize how
point-to-point communication is implemented. The
code base involved is fairly complicated and I do
not expect resource management systems to cope with it.
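To illustrate the idea (hypothetical names and a
hard-coded topology, not our actual code base):
given topology facts like those probed above, the
library can pick the cheapest channel per process
pair:

  #include <stdio.h>

  enum channel { SHARED_CACHE, SAME_SOCKET_SHM, CROSS_SOCKET_SHM };

  struct core_info { int socket; int cache_group; };

  /* Choose an intra-node channel from core topology. */
  static enum channel pick_channel(struct core_info a,
                                   struct core_info b)
  {
      if (a.cache_group == b.cache_group)
          return SHARED_CACHE;     /* exchange through the shared cache */
      if (a.socket == b.socket)
          return SAME_SOCKET_SHM;  /* shared memory, same socket */
      return CROSS_SOCKET_SHM;     /* shared memory across the north bridge */
  }

  int main(void)
  {
      /* Example: cores 3 and 5 on one socket, different caches. */
      struct core_info core3 = { 0, 1 }, core5 = { 0, 2 };
      printf("channel = %d\n", pick_channel(core3, core5));
      return 0;
  }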
I posted some measurements of the benefit of
this method some time ago and I include it here
as a reference:
http://www.scali.com/info/SHM-perf-8bytes-2007-12-20.htm
If you look at the ping-ping numbers, you will see
a nearly constant message rate, independent of
the placement of the processes. This is contrary to
other MPIs, which (apparently) do not use this technique.
So, in a practical world I go for performance, not perfect layering ;-)
> Also helps when newbies run OpenMP codes thinking they're
> single CPU codes and get 3 or 4 on the same 8 CPU node.
Not sure I follow you here. Do you mean pure OpenMP or hybrid models?
Thanks, Håkon