Can Slurm be configured to use the Multi-Instance GPU (MIG) instances of an A100?
I can't add the same device multiple times in gres.conf; this was supported in
Slurm 18 but is not supported in Slurm 20.
cat gres.conf
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia0
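A minimal sketch of the MIG route, assuming Slurm 21.08 or newer with slurmd built against NVML (MIG-aware GRES autodetection landed in 21.08; the node name gpunode01 and the count of 7 instances are placeholders, not taken from the original report):

# gres.conf: let slurmd enumerate the MIG instances itself,
# instead of listing /dev/nvidia0 several times
AutoDetect=nvml

# slurm.conf: declare the GRES type and the per-node count
GresTypes=gpu
NodeName=gpunode01 Gres=gpu:7 CPUs=64 RealMemory=256000 State=UNKNOWN

With that in place, a job requesting --gres=gpu:1 should be handed a single MIG instance.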
Hi,
I'm new to slurm (as admin) and I need some help. Testing my initial
setup with:
[begou@tenibre ~]$ salloc -n 1 sh
salloc: Granted job allocation 11
sh-4.4$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
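A few stock Slurm commands that can help narrow down why the allocation is granted while squeue prints only the header; job ID 11 is simply the one from the transcript above:

scontrol show job 11     # full record of the allocation: State, Reason, assigned node
squeue -u $USER -t all   # the user's jobs in every state, not just pending/running
sinfo                    # check whether the partition's nodes are down or drained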
Looking into this more, it looks like memory.max_usage_in_bytes and
memory.usage_in_bytes also count file cache, which is very surprising and
not at all useful. But total_rss in memory.stat shows a more correct
number. Looking at that one for a real job gives me around 30 GB, which
matches my other d
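A short sketch of where those counters live for a Slurm job under cgroup v1; the slurm/uid_*/job_* directory layout is an assumption that depends on cgroup.conf (ConstrainRAMSpace=yes and related settings), and <jobid> is a placeholder:

JOBDIR=/sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_<jobid>
cat $JOBDIR/memory.usage_in_bytes        # RSS plus page cache, hence the inflated figure
cat $JOBDIR/memory.max_usage_in_bytes    # high-water mark, also includes page cache
grep '^total_rss ' $JOBDIR/memory.stat   # anonymous memory only, the more useful number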
This looks like it may be trying to do something using MPI.
What does your slurm.conf look like for that node?
Brian Andrus
On 11/10/2020 2:54 AM, Patrick Bégou wrote:
Hi,
I'm new to slurm (as admin) and I need some help. Testing my initial
setup with:
[begou@tenibre ~]$ salloc -n 1 sh
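For reference, the slurm.conf lines Brian is asking about are usually the MPI default plus the node and partition definitions; the values below are a generic sketch, not Patrick's actual configuration:

MpiDefault=none
NodeName=tenibre CPUs=32 RealMemory=128000 State=UNKNOWN
PartitionName=debug Nodes=tenibre Default=YES MaxTime=INFINITE State=UP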