On Thu, 2021-05-06 at 08:58 +0000, Williams, Gareth (IM&T, Black Mountain) wrote:
> ROCR_VISIBLE_DEVICES is the closer analogy. GPU_DEVICE_ORDINAL is in
> principle more generic (though it does have GPU in the name). OpenCL
> could in principle (can!) run on other devices which could/can have
> more exotic topology, but for the sake of simplicity they are likely
> to be presented as a list of devices...
>
> Gareth

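For illustration, here is a minimal Python sketch of how a job script might check which of the masking variables named in this thread is set. The function name and the lookup order are my own assumptions. Neither Slurm nor ROCm documents them, and the sketch only reads the environment without touching any device.

```python
import os

# Variable names mentioned in this thread. The lookup order is an assumption
# made for illustration, not a documented precedence.
MASK_VARIABLES = ("CUDA_VISIBLE_DEVICES", "ROCR_VISIBLE_DEVICES", "GPU_DEVICE_ORDINAL")


def visible_devices(env=None):
    """Return (variable_name, device_ids) for the first mask variable that is set.

    Returns (None, None) when no mask variable is set, meaning no restriction
    was requested through the environment.
    """
    env = os.environ if env is None else env
    for name in MASK_VARIABLES:
        if name in env:
            # Values are comma-separated lists; keep the tokens as strings
            # because some runtimes may accept identifiers other than integers.
            ids = [tok.strip() for tok in env[name].split(",") if tok.strip()]
            return name, ids
    return None, None


if __name__ == "__main__":
    name, ids = visible_devices()
    if name is None:
        print("no device mask set")
    else:
        print(f"{name} -> {ids}")
```

Run inside a job allocation, this only reports what the environment asks for. Whether the runtime actually enforces it is a separate question (see the cgroups discussion quoted below).
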
Here is a ROCm issue discussion on device selection:
https://github.com/RadeonOpenCompute/ROCm/issues/994

ROCm also has a different way to select devices, by serial number using the rocm-smi interface. This approach is much more reliable than using device ordinals:
https://rocmdocs.amd.com/en/latest/ROCm_System_Managment/ROCm-SMI-CLI.html?highlight=showuniqueid

> -----Original Message-----
> From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Valerio Bellizzomi
> Sent: Thursday, 6 May 2021 6:35 PM
> To: slurm-users@lists.schedmd.com
> Subject: Re: [slurm-users] CUDA vs OpenCL
>
> On Thu, 2021-05-06 at 08:00 +0000, Williams, Gareth (IM&T, Black Mountain) wrote:
> > The post has me thinking so I did a little searching... AMD have an
> > offering that supports OpenCL and they are not NVIDIA. They use a
> > different approach:
> > https://rocmdocs.amd.com/en/latest/Programming_Guides/Opencl-programming-guide.html#masking-visible-devices
>
> Thank you for the pointer. It seems to me that they just name the
> variable differently (GPU_DEVICE_ORDINAL) but the approach is the
> same.
>
> > FWIW I did not yet see anything there about cgroups and enforced
> > device visibility/constraints vs playing nicely with environment
> > variables.
>
> Here is the documentation on device cgroups:
> https://rocmdocs.amd.com/en/latest/ROCm_System_Managment/ROCm-System-Managment.html?highlight=device%20cgroups#device-cgroup
>
> > For reference, I have no AMD affiliation and little to no direct
> > experience.
> >
> > It is pretty easy to also find what else supports OpenCL
> > (Wikipedia?).
> > What environment to honor seems to me to mostly be a software choice,
> > and most of the software is from vendors, albeit sometimes being open
> > source or using or relying on open source components or layers.
> >
> > Gareth
> >
> > -----Original Message-----
> > From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Valerio Bellizzomi
> > Sent: Thursday, 6 May 2021 5:21 PM
> > To: slurm-users@lists.schedmd.com
> > Subject: Re: [slurm-users] CUDA vs OpenCL
> >
> > On Wed, 2021-04-28 at 10:56 +0200, Valerio Bellizzomi wrote:
> > > Greetings,
> > > I see here https://slurm.schedmd.com/gres.html#GPU_Management that
> > > CUDA_VISIBLE_DEVICES is available for NVIDIA GPUs, what about OpenCL
> > > GPUs?
> > >
> > > Is there an OPENCL_VISIBLE_DEVICES ?
> >
> > Lack of followup lets me conclude that there isn't an OpenCL
> > equivalent of CUDA_VISIBLE_DEVICES. It is unfortunate that this open
> > source software is committed to a single gpu supplier.