I look after a very heterogeneous GPU Slurm setup, and some nodes have quite few cores. We use a job_submit Lua script that calculates the number of CPU cores requested per GPU. The script then scans a table of 'weak nodes', each with a 'max cores per GPU' property, and appends the names of the matching nodes to the job descriptor's exc_nodes property.
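For anyone who wants to try the same idea, here is a rough model of the logic in Python rather than the Lua script itself. It is only a sketch under stated assumptions: the node names, the limits, the JobDesc class and the function names are all made up for illustration, and I am assuming a node is excluded when the job asks for more cores per GPU than that node's limit.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical 'weak nodes' table: node name -> max CPU cores a job
# may request per GPU on that node. Names and limits are invented.
WEAK_NODES: Dict[str, int] = {
    "gpu01": 2,
    "gpu02": 4,
    "gpu03": 6,
}


@dataclass
class JobDesc:
    """Stand-in for the few job descriptor fields this sketch needs."""
    cpus: int                        # total CPU cores requested
    gpus: int                        # total GPUs requested
    exc_nodes: Optional[str] = None  # comma-separated excluded node names


def cores_per_gpu(job: JobDesc) -> Optional[float]:
    """Return requested cores per GPU, or None for jobs without GPUs."""
    if job.gpus <= 0:
        return None
    return job.cpus / job.gpus


def exclude_weak_nodes(job: JobDesc,
                       weak_nodes: Dict[str, int] = WEAK_NODES) -> JobDesc:
    """Append every weak node whose limit the job exceeds to exc_nodes.

    Assumption: a node is excluded when the requested cores per GPU is
    strictly greater than the node's 'max cores per GPU' value.
    """
    ratio = cores_per_gpu(job)
    if ratio is None:
        return job  # not a GPU job, leave it alone

    too_weak = sorted(name for name, limit in weak_nodes.items()
                      if ratio > limit)
    if too_weak:
        # Naive comma split: this does not understand Slurm hostlist
        # expressions such as gpu[01-03].
        existing = job.exc_nodes.split(",") if job.exc_nodes else []
        merged = existing + [n for n in too_weak if n not in existing]
        job.exc_nodes = ",".join(merged)
    return job
```

With this table, a job asking for 8 cores and 2 GPUs (4 cores per GPU) would be kept off gpu01 only, while a job asking for 12 cores and 1 GPU would be kept off all three weak nodes. Any nodes the user already excluded are preserved.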
It's not particularly elegant, but it does work quite well for us.

Aaron

On 20 October 2020 at 18:17 BST, Relu Patrascu wrote:
> Hi all,
>
> We have a GPU cluster and have run into this issue occasionally. Assume
> four GPUs per node; when a user requests a GPU on such a node, and all
> the cores, or all the RAM, the other three GPUs will be wasted for the
> duration of the job, as Slurm has no more cores or RAM available to
> allocate those GPUs to subsequent jobs.
>
> We have a "soft" solution to this, but it's not ideal. That is, we
> assigned large TresBillingWeights to CPU consumption, thus discouraging
> users from allocating many CPUs.
>
> Ideal for us would be to be able to define a number of CPUs to always be
> available on a node, for each GPU. It would help to have a similar
> feature for an amount of RAM.
>
> Take for example a node that has:
>
> * four GPUs
> * 16 CPUs
>
> Let's assume that most jobs would work just fine with a minimum number
> of 2 CPUs per GPU. Then we could set in the node definition a variable
> such as
>
> CpusReservedPerGpu = 2
>
> The first job to run on this node could get between 2 and 10 CPUs, thus
> leaving at least 6 CPUs for potential incoming jobs (2 per remaining GPU).
>
> We couldn't find a way to do this, are we missing something? We'd rather
> not modify the source code again :/
>
> Regards,
>
> Relu

--
Research Fellow
School of Computer Science
University of Nottingham