After 9 months of development and testing we are pleased to announce the
availability of Slurm version 18.08.0!
Downloads are available from https://www.schedmd.com/downloads.php.
Thank you to all customers, partners, and community members who
contributed to getting this release done. A list o
On 08/29/2018 04:59 PM, Chris Samuel wrote:
On Thursday, 30 August 2018 12:45:51 AM AEST Brian W. Johanson wrote:
In your example, you do not have enough memory for both sruns at the same
time.
Nice spot, I think I was thinking in mem-per-task (which doesn't exist) then!
Unfortunately fixing
Sorry, about the variable replacement: on 17.02, even if I don't set
CUDA_VISIBLE_DEVICES=NoDevFiles in the srun, the result is the same.
From: slurm-users On Behalf Of Chaofeng
Zhang
Sent: Friday, August 31, 2018 12:16 AM
To: Slurm User Community List
Subject: [External] Re: [slurm-users] Wheth
I found this is a bug in Slurm 17.11.7: if I run the same command under 17.02,
the variable can be replaced. Below is the command run under Slurm 17.02:
[root@head ~]# export CUDA_VISIBLE_DEVICES=0,1
[root@head ~]# srun -N1 -n1 --nodelist=head
--export=CUDA_VISIBLE_DEVICES=NoDevFiles,ALL env|grep CUDA
CUDA_
export CUDA_VISIBLE_DEVICES=0,1
srun -N1 -n1 --nodelist=head --export=CUDA_VISIBLE_DEVICES=NoDevFiles,ALL
env|grep CUDA
The srun result is CUDA_VISIBLE_DEVICES=0,1. How can I replace
CUDA_VISIBLE_DEVICES with NoDevFiles?
Thanks.
$ export CUDA_VISIBLE_DEVICES=0,1; srun -N 1 -n 1 --gres=none -p GPU
/usr/bin/env |grep CUDA
CUDA_VISIBLE_DEVICES=0,1
This result should be CUDA_VISIBLE_DEVICES=NoDevFiles, and it really is
NoDevFiles in 17.02, so this must be a bug in 17.11.7.
From: slurm-users On Behalf Of Brian W. Johanson
In my case, the GPU resource is defined in the job file with #SBATCH
--gres=gpu:2, so when I use srun, CUDA_VISIBLE_DEVICES=0,1 is already in the
shell. I just want to set CUDA_VISIBLE_DEVICES=NoDevFiles in one specific srun;
it does not work in 17.11.7.
But it works in 17.02.
#!/bin/bash
#SBATCH
and to answer "CUDA_VISIBLE_DEVICES can't be set to NoDevFiles in Slurm 17.11.7":
CUDA_VISIBLE_DEVICES is unset if --gres=none; if it is set in the user's
environment, it will remain set to whatever it was. If you really want to see
NoDevFiles, set it in /etc/profile.d, it will get clobbered when the r
sbatch --wrap="command --args" is similar to what you're looking for.
Ryan
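A minimal sketch of the --wrap approach Ryan mentions; the job options and command below are illustrative placeholders, not from the thread (this needs a working Slurm installation to run):

```
# sbatch generates a trivial batch script around the command for you,
# so a bare binary can be submitted without writing a script first:
sbatch --wrap="hostname"

# Normal sbatch options still apply on the command line:
sbatch --job-name=wrapped --ntasks=1 --wrap="/path/to/binary --args"
```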
On 08/30/2018 09:12 AM, Anson Abraham wrote:
In Sun Grid Engine, there's an option (parameter) of -b
"Gives the user the possibility to indicate explicitly whether command
should be treated as binary or script. If the value of -b is 'y', then
command may be a binary or script. "
I cannot find the equivalent for Slurm. Is there an option
I also remember there being write-only permissions involved when working
with cgroups and devices .. which bent my head slightly..
On Thu, 30 Aug 2018 at 17:02, John Hearns wrote:
Chaofeng, I agree with what Chris says. You should be using cgroups.
I did a lot of work with cgroups and GPUs in PBSPro (yes, I know...
splitter!)
With cgroups you only get access to the devices which are allocated to that
cgroup, and you get CUDA_VISIBLE_DEVICES set for you.
Remember also to lo
Chris’ method will set CUDA_VISIBLE_DEVICES like you’re used to, and it will
help keep you or your users from picking conflicting devices.
My cgroup/GPU settings from slurm.conf:
[renfro@login ~]$ egrep -i '(cgroup|gpu)' /etc/slurm/slurm.conf | grep -v '^#'
ProctrackType=proctrack/cgroup
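For context, a sketch of the companion settings that typically accompany cgroup-based device constraint; the node name, device paths, and GPU counts below are illustrative, not taken from the config above:

```
# slurm.conf (fragment)
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
GresTypes=gpu
NodeName=gpunode01 Gres=gpu:2 State=UNKNOWN

# cgroup.conf
ConstrainDevices=yes   # jobs can open only the device files they were allocated

# gres.conf (on the GPU node)
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1
```

With ConstrainDevices=yes, a job that did not request GPUs cannot open /dev/nvidia* at all, so what CUDA_VISIBLE_DEVICES happens to contain matters much less.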
CUDA_VISIBLE_DEVICES is used by many AI frameworks, such as TensorFlow, to
determine which GPUs to use. So this environment variable is critical to us.
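To illustrate why the variable matters, here is a hypothetical sketch (no real CUDA runtime involved; the function is invented for illustration) of how a GPU-aware launcher interprets CUDA_VISIBLE_DEVICES. A value like NoDevFiles names no real device, which is exactly why setting it hides all GPUs:

```shell
#!/bin/bash
# Hypothetical illustration of CUDA_VISIBLE_DEVICES semantics:
# unset -> all GPUs visible; empty or bogus value -> no GPUs; a list -> only those.
list_gpus() {
  case "${CUDA_VISIBLE_DEVICES-unset}" in
    unset)          echo "variable unset: all GPUs visible" ;;
    ""|NoDevFiles)  echo "no GPUs visible" ;;
    *)              echo "GPUs restricted to: ${CUDA_VISIBLE_DEVICES}" ;;
  esac
}

CUDA_VISIBLE_DEVICES=0,1 list_gpus         # GPUs restricted to: 0,1
CUDA_VISIBLE_DEVICES=NoDevFiles list_gpus  # no GPUs visible
```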
-----Original Message-----
From: slurm-users On Behalf Of Chris
Samuel
Sent: Thursday, August 30, 2018 4:42 PM
To: slurm-users@lists.schedmd.com
Subject: [Exte
On Thursday, 30 August 2018 6:38:08 PM AEST Chaofeng Zhang wrote:
> CUDA_VISIBLE_DEVICES can't be set to NoDevFiles in Slurm 17.11.7. This
> worked when we used Slurm 17.02.
You probably should be using cgroups instead to constrain access to GPUs.
Then it doesn't matter what you set CUDA_VIS
CUDA_VISIBLE_DEVICES can't be set to NoDevFiles in Slurm 17.11.7. This
worked when we used Slurm 17.02.
Slurm 17.02:
[root@head ~]# export CUDA_VISIBLE_DEVICES=0,1
[root@head ~]# srun -N1 -n1 --gres=none --nodelist=head /usr/bin/env|grep CUDA
CUDA_HOME=/usr/local/cuda
CUDA_VISIBLE_DEVICES=NoD