We are facing the same issue as described in SLURM Bug 4717 [1]. Moe Jette clarified that the statement in the old gres.conf man page about the Cores= parameter, that "only the identified cores can be allocated with each generic resource", is wrong, and that the core binding is only advisory by default.
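To make the difference between the two readings concrete, here is a toy Python model. It is only a sketch of the semantics as I understand them, not how Slurm's scheduler is implemented: the function name `allocatable_gpus`, the two-GPU node layout, and the assumption that a job needs just one free core are all made up for illustration.

```python
def allocatable_gpus(gpu_cores, busy_cores, enforce_binding):
    """Return the GPUs a new one-core job could still get in this toy model.

    gpu_cores:       dict mapping a GPU name to the set of cores listed for it
    busy_cores:      set of cores already held by other jobs
    enforce_binding: True  -> the job must use a core listed for the GPU
                     False -> the listed cores are only a preference
    """
    # Assumption: every core on the node is listed for some GPU.
    all_cores = set().union(*gpu_cores.values())
    free_cores = all_cores - busy_cores

    result = []
    for gpu, cores in sorted(gpu_cores.items()):
        # With enforced binding only the GPU's own free cores count;
        # with advisory binding any free core on the node is acceptable.
        candidates = (cores & free_cores) if enforce_binding else free_cores
        if candidates:
            result.append(gpu)
    return result


# Hypothetical node: two GPUs with two cores listed for each.
gpu_cores = {"gpu0": {0, 1}, "gpu1": {2, 3}}
busy_cores = {0, 1}  # another job holds every core listed for gpu0

print(allocatable_gpus(gpu_cores, busy_cores, enforce_binding=False))
print(allocatable_gpus(gpu_cores, busy_cores, enforce_binding=True))
```

In this model, gpu0 stays allocatable under advisory binding because cores 2 and 3 are still free, but it drops out once binding is enforced.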
We are running SLURM version 17.02.10 and use the CPUs= parameter rather than Cores=. I assume the two have exactly the same effect. However, CPUs= does seem to enforce CPU binding on our cluster: if one user allocates all the cores bound to a GPU, that GPU becomes unallocatable. (So --gres-flags=enforce-binding appears to be the default behaviour, and I haven't found a way to disable it.)

I checked the RELEASE_NOTES of 17.02 and 17.11, but didn't find any clue about a behaviour change. Can anyone tell me whether this is the expected behaviour in version 17.02.10, a bug that hasn't been fixed in 17.02.10, or something wrong with our configuration?

Here is our gres.conf:

NodeName=wmc-slave-g[1-3] Name=gpu File=/dev/nvidia[0-1] CPUs=0-11
NodeName=wmc-slave-g[1-3] Name=gpu File=/dev/nvidia[2-3] CPUs=12-23

[1] https://bugs.schedmd.com/show_bug.cgi?id=4717

--
崔灏 / CUI Hao
Homepage: i-yu.me
Twitter: @cuihaoleo