MPS only works for the first GPU in a system. If you have a server with 
multiple GPUs, you can only share the first GPU between multiple jobs.
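
(For comparison, MPS is configured as its own gres type. A minimal sketch based on the Slurm MPS docs - node and device names are illustrative:

    # gres.conf on the GPU node
    Name=gpu File=/dev/nvidia0
    Name=mps Count=100          # 100 units = 100% of the GPU

    # slurm.conf
    GresTypes=gpu,mps

    # job requesting half the GPU via MPS
    sbatch --gres=mps:50 job.sh
)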

Sharding, on the other hand, works for all GPUs in the system. Note that 
sharding is "soft": Slurm does not monitor actual GPU usage, so jobs have to 
respect the resources they requested.

Sharding works great in our setup (3 servers with 8, 6 and 4 Nvidia GPUs, 
respectively, plus a few smaller single-GPU boxes). We mainly use 1 shard = 
1 GB of GPU memory, but other conventions are possible.
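
In case it is useful, here is roughly what the sharding config looks like for 
one of our nodes - a sketch only, with illustrative node and device names, 
following the Slurm sharding docs:

    # gres.conf on a node with one 24 GB GPU
    Name=gpu File=/dev/nvidia0
    Name=shard Count=24         # our convention: 1 shard = 1 GB of GPU memory

    # slurm.conf
    GresTypes=gpu,shard
    NodeName=gpunode01 Gres=gpu:1,shard:24 ...

    # job requesting 4 GB worth of GPU
    sbatch --gres=shard:4 job.sh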

Cheers,

Esben



________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Ward 
Poelmans <ward.poelm...@vub.be>
Sent: Wednesday, January 25, 2023 13:19
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] GPU: MPS vs Sharding

Hi,

Slurm 22.05 has a new thing called GPU sharding that allows a single GPU to be 
used by multiple jobs at once. As far as I understood, the major difference 
from the MPS approach is that sharding should be generic (not tied to NVidia 
technology).

Has anyone tried it out? Does it work well? Any caveats or downsides compared 
to MPS?

Thanks,


Ward
