Hi Sajesh,
On 10/8/20 4:18 pm, Sajesh Singh wrote:
> Thank you for the tip. That works as expected.
No worries, glad it's useful. Do be aware that the core bindings for the
GPUs would likely need to be adjusted for your hardware!
Best of luck,
Chris
--
Chris Samuel : http://www.csamuel
Christopher,
Thank you for the tip. That works as expected.
-SS-
-Original Message-
From: slurm-users On Behalf Of
Christopher Samuel
Sent: Thursday, October 8, 2020 6:52 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] CUDA environment variable not being set
On 10/8/20 3:48 pm, Sajesh Singh wrote:
> Thank you. Looks like the fix is indeed the missing file
> /etc/slurm/cgroup_allowed_devices_file.conf
No, you don't want that, that will allow all access to GPUs whether
people have requested them or not.
What you want is in gres.conf and looks like this:
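For a node with two NVIDIA GPUs, a sketch of that gres.conf might be something like the following (the device paths, Type name, and core bindings here are illustrative, not taken from your node):

```
# Illustrative gres.conf for a node with two NVIDIA GPUs.
# File= paths and Cores= ranges must match the actual hardware;
# the Type= label is just an example name.
Name=gpu Type=m500 File=/dev/nvidia0 Cores=0-7
Name=gpu Type=m500 File=/dev/nvidia1 Cores=8-15
```

With explicit File= entries, slurmd knows which device files to hand to a job, and CUDA_VISIBLE_DEVICES gets set only for the GPUs the job requested.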
Hi Sajesh,
On 10/8/20 11:57 am, Sajesh Singh wrote:
> debug: common_gres_set_env: unable to set env vars, no device files
> configured
I suspect the clue is here - what does your gres.conf look like?
Does it list the devices in /dev for the GPUs?
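One quick way to check on the node, assuming the stock NVIDIA driver device naming:

```shell
# List the GPU device files that gres.conf should reference; the
# /dev/nvidia* paths are the usual ones created by the NVIDIA driver.
ls -l /dev/nvidia* 2>/dev/null || echo "no /dev/nvidia* device files found"
```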
All the best,
Chris
--
Chris Samuel : http:/
From: slurm-users On Behalf Of Relu Patrascu
Sent: Thursday, October 8, 2020 4:26 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] CUDA environment variable not being set
Yes. It is located in the /etc/slurm directory
--
-SS-
From: slurm-users On Behalf Of Brian
Andrus
Sent: Thursday, October 8, 2020 5:02 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] CUDA environment variable not being set
do you have your gres.conf on the nodes also?
Brian Andrus
On 10/8/2020 11:57 AM, Sajesh Singh wrote:
> Slurm 18.08
> CentOS 7.7.1908
> I have 2 M500 GPUs in a compute node which is defined in the
> slurm.conf and gres.conf of the cluster, but if I launch a job
> requesting GPUs the environment variable CUDA_VISIBLE_DEVICES is never set.
From: slurm-users On Behalf Of Renfro,
Michael
Sent: Thursday, October 8, 2020 4:53 PM
To: Slurm User Community List
Subject: Re: [slurm-users] CUDA environment variable not being set
It seems as
From: slurm-users On Behalf Of Relu Patrascu
Sent: Thursday, October 8, 2020 4:26 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] CUDA environment variable not being set
That usually means you don't have the nvidia kernel module loaded,
probably because there's no driver installed.
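A minimal sanity check along those lines (standard Linux paths assumed):

```shell
# A loaded nvidia kernel module shows up in /proc/modules; if this
# prints the fallback message instead, the driver is likely not
# installed (or not loaded) on the node.
grep '^nvidia' /proc/modules 2>/dev/null || echo "nvidia module not loaded"
```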
Relu
On 2020-10-08 14:57, Sajesh Singh wrote:
> Slurm 18.08
> CentOS 7.7.1908
> I have 2 M500 GPUs in a compute node which is defined in the
> slurm.conf and gres.conf of the cluster, but if I launch a job
> requesting GPUs the environment variable CUDA_VISIBLE_DEVICES is never set.
Slurm 18.08
CentOS 7.7.1908
I have 2 M500 GPUs in a compute node which is defined in the slurm.conf and
gres.conf of the cluster, but if I launch a job requesting GPUs the environment
variable CUDA_VISIBLE_DEVICES is never set and I see the following messages in
the slurmd.log file:
debug: common_gres_set_env: unable to set env vars, no device files configured
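For reference, the node is declared along these lines in slurm.conf (the node name, CPU count, and memory below are illustrative placeholders, not the real config):

```
# Illustrative slurm.conf fragment for a two-GPU node; NodeName,
# CPUs, and RealMemory are placeholders.
GresTypes=gpu
NodeName=gpu01 Gres=gpu:2 CPUs=16 RealMemory=64000 State=UNKNOWN
```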