Hi Taras,
no, we have set ConstrainDevices to "yes".
And that is why CUDA_VISIBLE_DEVICES starts from zero.
Otherwise both jobs mentioned below would have ended up on one GPU, but as
nvidia-smi clearly shows (I did not include the output this time, see my
earlier post), both GPUs are in use, environment of
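For what it's worth, here is the kind of check I mean, as a minimal sketch; the nvidia-smi query flags are real, but the expected values in the comments are my assumptions, not output from our jobs:
{{{
# On the node itself (outside any job cgroup), both physical GPUs show load:
nvidia-smi --query-gpu=index,utilization.gpu --format=csv

# Inside each job (e.g. from its batch script): with ConstrainDevices=yes
# the device cgroup only contains the job's own GPU, so the index is
# renumbered and starts at 0 in *both* jobs.
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"   # expected: 0 in either job
echo "SLURM_JOB_GPUS=$SLURM_JOB_GPUS"               # expected: differs per job
}}}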
Which version of Slurm do you use? As of Slurm 19.05 there is:
* DefCpuPerGPU
{{{
PartitionName=gpu_shared_education DefCpuPerGPU=3 DefMemPerCPU=20900
Default=No DefaultTime=5 DisableRootJobs=YES ExclusiveUser=NO MaxNodes=1
MaxTime=2-0 Nodes=r30n[4] OverSubscribe=FORCE Priority=1000
QOS=p_gpu_shared_educ
}}}
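To illustrate how DefCpuPerGPU acts (a sketch, not taken from an actual run; the partition name matches the config above, the job itself is a placeholder):
{{{
# Submit a 2-GPU job without requesting CPUs explicitly:
sbatch --partition=gpu_shared_education --gres=gpu:2 --wrap="sleep 120"

# With DefCpuPerGPU=3 this job should default to 2 * 3 = 6 CPUs;
# verify the allocation afterwards (replace <jobid> accordingly):
scontrol show job <jobid> | grep -E 'NumCPUs|TresPerNode'
}}}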
Hi,
We would like to enforce a fixed ratio of allocated CPUs to GPUs. To
explain further: when a job is submitted requesting a certain number of
GPUs (using --gres=gpu:n), we would like to fix the number of CPUs that will
be allocated to the job based on the number of GPUs. We would like this to
Hi Marcus,
This may depend on ConstrainDevices in cgroup.conf. I guess it is set to
"no" in your case.
Best regards,
Taras
On Tue, Jun 23, 2020 at 4:02 PM Marcus Wagner wrote:
> Hi Kota,
>
> thanks for the hint.
>
> Yet, I'm still a little astonished, as, if I remember right,
> CUDA_VISIBLE_DEVICES in a cgroup always starts from zero.
Hi Kota,
thanks for the hint.
Yet, I'm still a little astonished, as, if I remember right,
CUDA_VISIBLE_DEVICES in a cgroup always starts from zero. That was already
the case years ago, when we were still using LSF.
But SLURM_JOB_GPUS seems to be the right thing:
same node, two different users (and t
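To make the comparison concrete, here is a sketch of the check; the jobids are placeholders and the expected values in the comments are my assumptions, not the actual output from our node:
{{{
# Show the detailed GRES allocation of both jobs:
scontrol -dd show job <jobid1> | grep -i gres
scontrol -dd show job <jobid2> | grep -i gres

# Expectation: CUDA_VISIBLE_DEVICES is 0 inside both cgroups, while
# SLURM_JOB_GPUS (and the IDX shown above) reports the physical index,
# e.g. 0 for one job and 1 for the other, so that is what tells the
# two users' jobs apart.
}}}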