On Sun, Jun 10, 2018 at 06:46:04PM +1000, Chris Samuel wrote:
> On Sunday, 10 June 2018 1:48:18 AM AEST Skylar Thompson wrote:
>
> > We're a Grid Engine shop, and we have the execd/shepherds place each job in
> > its own cgroup with CPU and memory limits in place.
>
> Slurm supports cgroups as well (and we use it extensively); the idea here
> is more to try and avoid/minimise unnecessary inter-node MPI traffic.

We have very little MPI, but if I had to solve this in GE, I would try to
fill up one node before sending jobs to another. There are two ways to do
that. The queue sort order (defaults to instance load, but can be set to a
simple sequence number) is the general one, while the allocation rule for
parallel environments (defaults to round_robin, but can be set to fill_up)
is specific to multi-slot jobs. I'm not sure of the specifics for Slurm,
though.

-- 
Skylar

Beowulf mailing list, Beowulf@beowulf.org
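
To illustrate the difference between the two allocation rules mentioned
above, here is a toy Python model of how a multi-slot job might be spread
over hosts. This is only a sketch under simplifying assumptions: the function
name allocate, the host names and the slot counts are hypothetical, hosts are
visited in a fixed order, and none of this is Grid Engine's actual scheduler
code.

```python
def allocate(slots_needed, free_slots, rule):
    """Toy model: spread slots_needed over hosts.

    free_slots maps a (hypothetical) host name to its free slot count.
    Hosts are considered in dict insertion order, which stands in for
    whatever ordering a real scheduler would apply.
    """
    if slots_needed > sum(free_slots.values()):
        raise ValueError("not enough free slots for this job")

    granted = {host: 0 for host in free_slots}

    if rule == "fill_up":
        # Take as many slots as possible from each host before moving on.
        for host, free in free_slots.items():
            take = min(free, slots_needed)
            granted[host] = take
            slots_needed -= take
            if slots_needed == 0:
                break
    elif rule == "round_robin":
        # Hand out one slot per host per pass until the job is satisfied.
        while slots_needed > 0:
            for host, free in free_slots.items():
                if slots_needed == 0:
                    break
                if granted[host] < free:
                    granted[host] += 1
                    slots_needed -= 1
    else:
        raise ValueError("unknown rule: " + rule)

    # Report only the hosts that actually received slots.
    return {host: n for host, n in granted.items() if n > 0}


if __name__ == "__main__":
    hosts = {"node1": 16, "node2": 16}
    print(allocate(8, hosts, "fill_up"))
    print(allocate(8, hosts, "round_robin"))
```

With two empty 16-slot hosts, an 8-slot job stays on one host under fill_up
and is split evenly across both under round_robin, which is where the
inter-node MPI traffic would come from.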