Yes, GrpTRESRunMins is what I meant.
Thank you also for the solution; I hadn't tried that syntax.
Interesting that GrpCPURunMins works while GrpMemRunMins does not.
I also noticed that if the limit is specified as
GrpTRESRunMins=Memory=1000,Cpu=2000 only the CPU portion takes effect --
th

Diego,
I'm *guessing* that you are tripping over the use of "--tasks 32" on a
heterogeneous cluster, though your comment about the node without InfiniBand
troubles me. If you drain that node, or exclude it in your command line, that
might correct the problem. I wonder if OMPI and PMIx have deci

Hello all.
I have been trying to debug this problem for some weeks, but it seems I'm
still missing something.
I have a small, (very) heterogeneous cluster. After upgrading to Debian
10 and packaged versions of Slurm and IB drivers/tools, I noticed that
*sometimes* jobs requesting 32 or more threads f

Hi Geoffrey,
I'm just curious: what causes a user to decide that a given node
has an issue? If a node is healthy in all respects, why would a user
decide not to use it?
We can certainly perform all sorts of node health checks from Slurm by
configuring the use of LBNL Node Health