Re: [slurm-users] Job are pending when plenty of resources available

2020-03-30 Thread Marcus Wagner
Hi Mike, but that would mean that 409978 requests nearly the whole cluster. I'm wondering what resources it is waiting for. Yet there are nearly 32000 nodes idle, so I would assume such a one-node job would fit. But you are right, it depends on the higher-priority job. Best Marcus On 3/30/20 3:47 PM, Ren
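
For readers following the thread, the pending Reason field and the scheduler's start estimate are the usual places to look for what a job like this is waiting on. A minimal sketch, assuming only the job ID 409978 mentioned in the thread (the output format strings are illustrative):

    # Pending reason, requested time limit, node/CPU/memory request
    squeue -j 409978 -o "%i %P %T %r %l %D %C %m"

    # Scheduler's estimated start time for the job
    squeue --start -j 409978

    # Idle node counts per partition, to compare against the request
    sinfo -t idle -o "%P %D"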

[slurm-users] MaxTime and partition config

2020-03-30 Thread Sajesh Singh
CentOS 7.7, Slurm 20.02. Having a bit of a time with jobs that are configured with a walltime of more than 365 days. The job is accepted to run, but the squeue -l output shows the TIME_LIMIT as INVALID. If I look at the job through scontrol it shows the correct TimeLimit. Any ideas as to what c
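
For reference, a minimal sketch of how a long time limit is normally expressed; the partition name, node range, walltime value, job script, and <jobid> below are placeholders, not taken from the original message:

    # slurm.conf: partition cap in days-hours:minutes:seconds, or UNLIMITED
    PartitionName=long Nodes=node[001-010] MaxTime=UNLIMITED State=UP

    # Requesting a walltime of 400 days in the same days-hours:minutes:seconds format
    sbatch --time=400-00:00:00 job.sh

    # Compare what the controller stored with what squeue displays
    scontrol show job <jobid> | grep TimeLimit
    squeue -l -j <jobid>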

Re: [slurm-users] Job are pending when plenty of resources available

2020-03-30 Thread Renfro, Michael
All of this is subject to scheduler configuration, but: what has job 409978 requested, in terms of resources and time? It looks like it's the highest priority pending job in the interactive partition, and I’d expect the interactive partition has a higher priority than the regress partition. As
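
A hedged sketch of how those two questions are usually answered; the job ID and partition names are from the thread, and the commands are standard Slurm tools (the grep pattern is only illustrative):

    # What the job requested: nodes, TRES, memory and time limit
    scontrol show job 409978 | grep -E 'NumNodes|NumCPUs|TRES|MinMemory|TimeLimit'

    # Priority breakdown for the pending job
    sprio -j 409978 -l

    # Partition-level priority settings being compared
    scontrol show partition interactive
    scontrol show partition regress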

Re: [slurm-users] DefMemPerGPU bug?

2020-03-30 Thread Bas van der Vlies
We have the same issue, see: * https://bugs.schedmd.com/show_bug.cgi?id=8527 * As a temporary fix we switched back to DefMemPerCpu. Regards On 26/03/2020 16:42, Wayne Hendricks wrote: When using 20.02/cons_tres and defining DefMemPerGPU, jobs submitted that request GPUs without defining "--mem" will
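
For context, a minimal slurm.conf sketch contrasting the two defaults mentioned above; the memory values and job script are illustrative assumptions, not taken from the bug report:

    # cons_tres is required for per-GPU defaults (Slurm 20.02)
    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory

    # Default memory (MB) per allocated GPU, the setting tied to the reported behaviour
    DefMemPerGPU=8192

    # Workaround from the thread: use a per-CPU default instead
    # (the thread treats the two settings as alternatives, not used together)
    #DefMemPerCPU=2048

    # A job that requests a GPU without --mem and relies on the default
    sbatch --gres=gpu:1 job.sh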