----- "Rayson Ho" <[EMAIL PROTECTED]> wrote:

> I am working on adding processor affinity support for serial and
> parallel jobs for Grid Engine, and I am working with the OpenMPI
> developers to define an interface.

FWIW the Torque approach (currently in trunk in SVN) is to not use CPU
affinity but instead use the cpuset support in most modern Linux kernels.

So once you've got /dev/cpuset created and have mounted the VFS with
"mount -t cpuset - /dev/cpuset", the new pbs_mom will automatically create
(if it doesn't already exist) a "torque" cpuset with all the CPUs in it.

It then creates job cpusets beneath that for each job and a "vnode" (aka
per-process) cpuset for each process created.

So, on an 8 core box running a 4 CPU MPI job you'd end up with:

/dev/cpuset/torque (8 cores)
/dev/cpuset/torque/1.cluster-m.foo.edu/ (4 cores)
/dev/cpuset/torque/1.cluster-m.foo.edu/1/ (1 core)
/dev/cpuset/torque/1.cluster-m.foo.edu/2/ (1 core)
/dev/cpuset/torque/1.cluster-m.foo.edu/3/ (1 core)
/dev/cpuset/torque/1.cluster-m.foo.edu/4/ (1 core)

SMP processes would end up in the job set, whereas processes launched via
PBS's TM API would end up in their appropriate vnode set.

So if a user launches what they think is a single CPU serial job, but it
actually turns out to be a code that detects how many cores are in the
system and then uses all of them, it will no longer affect other users'
code on the system - their job will just take a hammering instead! :-)

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
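
For anyone wanting to see the cpuset naming scheme described above spelled out, here is a minimal Python sketch that only computes the expected directory names and core counts. It is not how pbs_mom does it and it never touches /dev/cpuset; the function name torque_cpuset_layout and its parameters are made up for illustration, and the one-core-per-vnode rule is an assumption taken from the 8 core / 4 CPU example.

```python
from posixpath import join


def torque_cpuset_layout(job_id, total_cores, job_cores,
                         root="/dev/cpuset/torque"):
    """Return (path, core_count) pairs for the layout in the example.

    Hypothetical helper: it models the naming only and assumes each
    vnode cpuset holds exactly one core, as in the 4 CPU MPI example.
    """
    if not 0 < job_cores <= total_cores:
        raise ValueError("job_cores must be between 1 and total_cores")

    # Top-level "torque" cpuset containing every CPU on the node.
    layout = [(root, total_cores)]

    # One cpuset per job, sized to the cores the job was given.
    job_dir = join(root, job_id)
    layout.append((job_dir, job_cores))

    # One "vnode" cpuset per process, numbered from 1.
    for vnode in range(1, job_cores + 1):
        layout.append((join(job_dir, str(vnode)), 1))

    return layout


if __name__ == "__main__":
    for path, cores in torque_cpuset_layout("1.cluster-m.foo.edu", 8, 4):
        print(f"{path} ({cores} core{'s' if cores != 1 else ''})")
```

Run as a script, it prints the same six entries as the listing (without the trailing slashes). Keeping it as a pure function makes it easy to check the expected layout for other node and job sizes without needing a node that has cpusets mounted.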