[slurm-users] Advice on managing GPU cards using SLURM

2018-03-05 Thread david baker
Hello, I'm sure that this question has been asked before. We have recently added some GPU nodes to our SLURM cluster. There are 10 nodes each providing 2 * Tesla V100-PCIE-16GB cards. There are 10 nodes each providing 4 * GeForce GTX 1080 Ti cards. I'm aware that the simplest way to manage t

Re: [slurm-users] SLURM_JOB_NODELIST not available in prolog / epilog scripts

2018-03-05 Thread John Hearns
Dan, thank you very much for a comprehensive and understandable reply. On 5 March 2018 at 16:28, Dan Jordan wrote: > John/Chris, > > Thanks for your advice. I'll need to do some reading on cgroups, I've > never even been exposed to that concept. I don't even know if the SLURM > setup I have acce

Re: [slurm-users] SLURM_JOB_NODELIST not available in prolog / epilog scripts

2018-03-05 Thread Dan Jordan
John/Chris, Thanks for your advice. I'll need to do some reading on cgroups, I've never even been exposed to that concept. I don't even know if the SLURM setup I have access to has the cgroups or PAM plugin/modules enabled/available. Unfortunately I'm not involved in the administration of SLURM,

[slurm-users] Advice on managing GPU cards using SLURM

2018-03-05 Thread Baker D . J .
Hello, I'm sure that this question has been asked before. We have recently added some GPU nodes to our SLURM cluster. There are 10 nodes each providing 2 * Tesla V100-PCIE-16GB cards. There are 10 nodes each providing 4 * GeForce GTX 1080 Ti cards. I'm aware that the simplest way to manage these