Hello, yesterday we upgraded our cluster from Slurm 20.02.2 to 20.02.5 and noticed some problems when using GPUs together with more than one CPU per task. I was able to reproduce the problem in a small Docker container, which is described at the following link: https://github.com/bikerdanny/docker-centos-slurm/tree/gres-bug
I created a separate branch (gres-bug) to reproduce the problem; please check out the README.md. Could anybody tell me what we are doing wrong and how we can solve the problem? We also found that using "--cpus-per-gpu" instead of "--cpus-per-task" works with values greater than 1.

Kind regards and stay healthy,
Danny Rotscher