----- "Paul Jackson" <[EMAIL PROTECTED]> wrote: Hi Paul,
> Chris wrote:
> > The 2.6 cpuset support in Torque came out of a long
>
> Would you have any pointers to some more details of what
> you've done here?

Sure - mostly it was discussed on the torquedev list after the
initial discussion at SC'07. Garrick started the thread here:

http://www.supercluster.org/pipermail/torquedev/2007-November/000748.html

His announcement of the initial implementation, along with notes
on the differences from the plan, is here:

http://www.supercluster.org/pipermail/torquedev/2008-January/000842.html

There is a Wiki page on it too, but that isn't up to date as it
doesn't mention that the per-vnode/core cpusets were dropped due
to the OpenMPI issues:

http://www.clusterresources.com/wiki/doku.php?id=torque:3.5_linux_cpuset_support

> I'm the maintainer, and one of the authors, of Linux 2.6
> cpusets, and would like to do what I can with cpusets to make
> life easier (or at least no more painful) for cluster and MPI
> folks.

Wonderful! First of all, thanks so much for the code.

The only major issue we've come across is not due to cpusets
themselves, but to the way that things like OpenMPI tend to work:
they launch a single process per node via the MPI launcher, and
that process then forks off all the child processes necessary.
This means it's not easy to lock MPI tasks to cores via this
method, and it's also not trivial for the MPI program to work out
which cores it may bind itself to via sched_setaffinity().

> My background comes more from the "big honkin NUMA iron"
> running a Single System Image on 100's or 1000's of CPUs
> (SGI Irix/Origin and later Linux/Altix), which was the
> "country of origin" for cpusets, so my interest (and
> ignorance) in asking this question is more to gain
> an understanding of how cpusets have been adapted to
> clusters, as I understand less well the needs of clusters,
> and what if anything cpusets might do here to be of more use.
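(A small illustration of the discovery side of the fork problem
mentioned above: on Linux a process can at least read back the CPU
mask its cpuset leaves it, e.g. via sched_getaffinity(). This is
just a sketch of that idea, not what Torque or OpenMPI actually do.)

```python
import os

# A forked child inherits its parent's cpuset; sched_getaffinity(0)
# reads back exactly the cores this process may run on, so an MPI
# rank could at least see what it has available to divide up.
allowed = sorted(os.sched_getaffinity(0))
print("cores this process may use:", allowed)

# Pinning ourselves to the first allowed core (roughly what a rank
# might do after working out its share) needs no privileges:
os.sched_setaffinity(0, {allowed[0]})
```

The hard part Torque can't solve for the job is the coordination:
each forked rank has to agree with its siblings on who takes which
of those allowed cores.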
The main purpose we're using them for is as a quick and easy way
to catch users who don't know better and do things like run an
OpenMP code as a single-CPU job, overloading a node (and causing
chaos for other users) when it discovers 8 cores. Single-CPU jobs
get the benefit of being locked to a single core, and even MPI
jobs get some benefit in that they can only be migrated between
the cores they've been allocated.

> Totally totally trivial nit -- you wrote: [...]
>
> I prefer in my setups to have that mount command be:
>
>     mount -t cpuset cpuset /dev/cpuset
>
> so that the mount shows up in the output of the mount(8) command
> with 'cpuset' in the mount 'device' field, not 'none'.

Thanks for that, much appreciated!

cheers,
Chris

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
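(Appendix for anyone trying this at home: with /dev/cpuset mounted
as Paul suggests, a per-job cpuset is just a directory plus a few
small files. The sketch below is hypothetical - the job name, CPU
list and memory-node list are made-up examples, not Torque's actual
code - but the `cpus`, `mems` and `tasks` file names are the real
cpuset-filesystem interface.)

```python
import os

CPUSET_ROOT = "/dev/cpuset"  # mounted: mount -t cpuset cpuset /dev/cpuset


def cpuset_files(name, cpus, mems, pid=None):
    """Map cpuset-fs paths to the values a per-job cpuset needs.

    `name` is a hypothetical per-job directory (e.g. a job id);
    `cpus`/`mems` use the kernel's list format, e.g. "0-3" and "0".
    """
    d = os.path.join(CPUSET_ROOT, name)
    files = {
        os.path.join(d, "cpus"): cpus,  # cores the job may run on
        os.path.join(d, "mems"): mems,  # NUMA nodes it may allocate from
    }
    if pid is not None:
        # writing a pid into 'tasks' moves that process into the cpuset
        files[os.path.join(d, "tasks")] = str(pid)
    return files


def apply_cpuset(name, cpus, mems, pid=None):
    """Create the cpuset and write the files (needs root + mounted fs)."""
    os.makedirs(os.path.join(CPUSET_ROOT, name), exist_ok=True)
    for path, value in cpuset_files(name, cpus, mems, pid).items():
        with open(path, "w") as f:
            f.write(value)
```

Children forked inside the cpuset inherit it, which is exactly why
it catches the runaway-OpenMP case described above.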