[slurm-users] DefCpuPerGPU and multiple partitions

2024-04-09 Thread Ansgar Esztermann-Kirchner via slurm-users
Hello List, does anyone have experience with DefCpuPerGPU and jobs requesting multiple partitions? I would expect Slurm to select a partition from those requested by the job, then assign CPUs based on that partition's DefCpuPerGPU. But according to my observations, it appears that (at least sometimes) …
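
For readers hitting the same question, here is a minimal sketch of the kind of setup being described, with hypothetical partition names (p-small, p-large) that each set their own DefCpuPerGPU. Passing --cpus-per-gpu on the command line overrides whichever partition default would otherwise apply, which is one way to sidestep the ambiguity until the selection behaviour is pinned down:

    # slurm.conf (excerpt, hypothetical partitions)
    PartitionName=p-small Nodes=node[01-10] DefCpuPerGPU=4
    PartitionName=p-large Nodes=node[11-20] DefCpuPerGPU=8

    # Job may start in either partition; CPU count then depends on which one is chosen
    sbatch --partition=p-small,p-large --gres=gpu:1 job.sh

    # Explicit per-GPU CPU request, independent of the chosen partition
    sbatch --partition=p-small,p-large --gres=gpu:1 --cpus-per-gpu=4 job.sh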

Re: [slurm-users] Job flexibility with cons_tres

2021-02-12 Thread Ansgar Esztermann-Kirchner
On Fri, Feb 12, 2021 at 09:47:56AM +0100, Ole Holm Nielsen wrote: > Could you kindly say where you have found documentation of the DefaultCpusPerGpu (or DefCpusPerGpu?) parameter. Humph, I shouldn't have written the message from memory. It's actually DefCpuPerGPU (singular). > I'm unable to …
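
For reference, the parameter as spelled in slurm.conf is DefCpuPerGPU; it can be set cluster-wide or per partition and requires the cons_tres select plugin. A minimal, hypothetical excerpt:

    # slurm.conf -- cluster-wide default (requires select/cons_tres)
    SelectType=select/cons_tres
    DefCpuPerGPU=2

    # ...or per partition, overriding the cluster-wide value
    PartitionName=gpu Nodes=gpunode[01-04] DefCpuPerGPU=4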

Re: [slurm-users] Job flexibility with cons_tres

2021-02-12 Thread Ansgar Esztermann-Kirchner
On Mon, Feb 08, 2021 at 12:36:06PM +0100, Ansgar Esztermann-Kirchner wrote: > Of course, one could use different partitions for different nodes, and then submit individual jobs with CPU requests tailored to one such partition, but I'd prefer a more flexible approach where a given …

Re: [slurm-users] Job flexibility with cons_tres

2021-02-10 Thread Ansgar Esztermann-Kirchner
Hi Yair, thank you very much for your reply. I'll keep the points you make in mind while we're evolving our configuration toward something that can be called production-ready. A. -- Ansgar Esztermann, Sysadmin, Dep. Theoretical and Computational Biophysics, http://www.mpibpc.mpg.de/grubmueller/esz

[slurm-users] Job flexibility with cons_tres

2021-02-08 Thread Ansgar Esztermann-Kirchner
Hello List, we're running a heterogeneous cluster (just x86_64, but a lot of different node types from 8 to 64 HW threads, 1 to 4 GPUs). Our processing power (for our main application, at least) is exclusively provided by the GPUs, so cons_tres looks quite promising: depending on the size of the …
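
A rough sketch of the kind of cons_tres configuration this implies, with made-up node names, counts, and memory sizes; the idea is that jobs request GPUs and the CPU allocation follows via a per-GPU default instead of being tailored per node type:

    # slurm.conf (excerpt, hypothetical nodes)
    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory
    GresTypes=gpu

    NodeName=small[01-10] CPUs=8  Gres=gpu:1 RealMemory=32000
    NodeName=big[01-05]   CPUs=64 Gres=gpu:4 RealMemory=256000

    PartitionName=gpu Nodes=small[01-10],big[01-05] Default=YES DefCpuPerGPU=8

    # One submit line then fits any node type
    sbatch --gres=gpu:1 run_mdrun.sh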

[slurm-users] incompatible plugin version

2019-05-24 Thread Ansgar Esztermann-Kirchner
Hello List, I'm seeing a version clash when trying to start MPI jobs via srun. In stderr, my executable (mdrun) complains about: mdrun: /usr/lib/x86_64-linux-gnu/slurm/auth_munge.so: Incompatible Slurm plugin version (17.11.9) I've checked my installation, and found nothing that suggests there …
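
A few generic commands that can help narrow down where a stale 17.11 plugin is coming from. The plugin path below is taken from the error message above; dpkg -S assumes a Debian/Ubuntu system, which that path suggests:

    # Versions of the client tools and daemon actually installed on this node
    srun --version
    slurmd -V

    # Version and plugin directory seen by the running configuration
    scontrol show config | grep -i -e SLURM_VERSION -e PluginDir

    # Where the complained-about plugin lives and which package owns it
    ls -l /usr/lib/x86_64-linux-gnu/slurm/auth_munge.so
    dpkg -S /usr/lib/x86_64-linux-gnu/slurm/auth_munge.so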

Re: [slurm-users] Kinda Off-Topic: data management for Slurm clusters

2019-02-26 Thread Ansgar Esztermann-Kirchner
Hi, I'd like to share our set-up as well, even though it's very specialized and thus probably won't work in most places. However, it's also very efficient in terms of budget when it does. Our users don't usually have shared data sets, so we don't need high bandwidth at any particular point -- the …

Re: [slurm-users] How to partition nodes into smaller units

2019-02-11 Thread Ansgar Esztermann-Kirchner
Hi, > On 05.02.19 16:46, Ansgar Esztermann-Kirchner wrote: > > [...] -- we'd like to have two "half nodes", where jobs will be able to use one of the two GPUs, plus (at most) half of the CPUs. With SGE, we've put two queues on the nodes, …

[slurm-users] How to partition nodes into smaller units

2019-02-05 Thread Ansgar Esztermann-Kirchner
Hello List, we're operating a large-ish cluster (about 900 nodes) with diverse hardware. It has been running with SGE for several years now, but the more we refine our configuration, the more we're feeling SGE's limitations. Therefore, we're considering switching to Slurm. The latest challenge is …
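
For the archives, two common ways to express the "half node" idea in Slurm (node names and counts below are made up): cap CPUs per node at the partition level, or let select/cons_tres carve nodes up per GPU so no explicit halving is needed at all.

    # slurm.conf (excerpt) -- option 1: partition-level cap on a 2-GPU, 16-CPU node
    NodeName=gpunode[01-04] CPUs=16 Gres=gpu:2 RealMemory=128000
    PartitionName=half Nodes=gpunode[01-04] MaxCPUsPerNode=8

    # gres.conf on each node
    NodeName=gpunode[01-04] Name=gpu File=/dev/nvidia[0-1]

    # Option 2: with SelectType=select/cons_tres, a job simply asks for its share
    sbatch --gres=gpu:1 --cpus-per-task=8 job.sh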