Re: [slurm-users] HA for slurmdbd

2022-02-15 Thread Xand Meaden
OK, feeling a bit silly about having sent this after re-re-reading the man page for slurm.conf... and discovering the AccountingStorageBackupHost setting. Sorry for wasting the time of anyone who read that :) Xand On 15/02/2022 15:46, Xand Meaden wrote: Hello, I'm wondering what others a

Re: [slurm-users] HA for slurmdbd

2022-02-15 Thread Brian Andrus
There hasn't been as much effort to make slurmdbd as resilient as you are hinting at because there has been no need. The database itself can be made resilient for keeping the data safe. Data that cannot go into the database is cached until the database becomes available, even if that is to failov

Re: [slurm-users] Limiting srun to a specific partition

2022-02-15 Thread Ewan Roche
It doesn’t affect the use case of connecting via srun afterwards as no new job is submitted so the job_submit.lua logic is never called.
$ srun --pty /bin/bash
srun: error: submit_job: ERROR: interactive jobs are not allowed in the CPU or GPU partitions. Use the interactive partition
srun: error

Re: [slurm-users] Limiting srun to a specific partition

2022-02-15 Thread Tina Friedrich
...that would interfere with users 'logging in' to a job to check on it though, wouldn't it? I mean we do have pam_slurm_adopt configured but I still tell people it's preferable to use 'srun --jobid= --pty /bin/bash' to check what a specific job is doing as pam_slurm_adopt doesn't seem to i

Re: [slurm-users] Limiting srun to a specific partition

2022-02-15 Thread Ewan Roche
Hi Peter, as Rémi said, the way to do this in Slurm is via a job submit plugin. For example in our job_submit.lua we have
    if (job_desc.partition == "cpu" or job_desc.partition == "gpu") and job_desc.qos ~= "admin" then
        if job_desc.script == nil or job_desc.script == '' then
            slurm.l
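
The excerpt above is cut off in the archive, so here is a minimal sketch of how such a check might look as a complete job_submit.lua. It is not the poster's actual file. The partition names "cpu" and "gpu" and the "admin" QOS come from the excerpt, and the message wording is borrowed from the srun output quoted earlier in the thread. The surrounding function layout and the stub `slurm` table (there only so the file can be exercised with a plain Lua interpreter) are assumptions.

```lua
-- Sketch only: reject interactive jobs in the "cpu" and "gpu" partitions
-- unless the job uses the "admin" QOS.

-- slurmctld provides the global `slurm` table. This stub is an assumption
-- used only when the file is run outside slurmctld for testing.
slurm = slurm or {
    SUCCESS = 0,
    ERROR = -1,
    log_user = function(fmt, ...) print(string.format(fmt, ...)) end,
}

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- A nil partition (none requested) compares unequal and is not restricted here.
    local restricted = job_desc.partition == "cpu" or job_desc.partition == "gpu"
    if restricted and job_desc.qos ~= "admin" then
        -- The excerpt treats a missing or empty batch script as "interactive".
        if job_desc.script == nil or job_desc.script == '' then
            slurm.log_user("interactive jobs are not allowed in the CPU or GPU partitions. Use the interactive partition")
            return slurm.ERROR
        end
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    -- Modifications are passed through unchanged in this sketch.
    return slurm.SUCCESS
end
```

The example script shipped with Slurm also ends with a top-level `return slurm.SUCCESS`, which a real deployment would probably want to keep. As noted earlier in the thread, attaching to a running job with srun submits no new job, so this check would not be triggered in that case.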