Re: [slurm-users] pam_slurm_adopt not working for all users

2021-05-26 Thread Ole Holm Nielsen
Hi Loris, On 5/27/21 8:19 AM, Loris Bennett wrote: Regarding keys vs. host-based SSH, I see that host-based would be more elegant, but would involve more configuration. What exactly are the simplification gains you see? I just have a single cluster and naively I would think dropping a script in

Re: [slurm-users] pam_slurm_adopt not working for all users

2021-05-26 Thread Loris Bennett
Hi Michael, Michael Jennings writes: > On Tuesday, 25 May 2021, at 14:09:54 (+0200), > Loris Bennett wrote: > >> I think my main problem is that I expect logging in to a node with a job >> to work with pam_slurm_adopt but without any SSH keys. My assumption >> was that MUNGE takes care of the a

Re: [slurm-users] Upgrading slurm - can I do it while jobs running?

2021-05-26 Thread Ole Holm Nielsen
On 26-05-2021 20:23, Will Dennis wrote: About to embark on my first Slurm upgrade (building from source now, into a versioned path /opt/slurm// which is then symlinked to /opt/slurm/current/ for the “in-use” one…) This is a new cluster, running 20.11.5 (which we now know has a CVE that was fixe

Re: [slurm-users] Upgrading slurm - can I do it while jobs running?

2021-05-26 Thread Will Dennis
On Wednesday, May 26, 2021 at 2:49 PM Ole Holm Nielsen said: > I strongly recommend reading the SchedMD presentations in the > [snipped] page, especially the "Field > notes" documents. The latest one is "Field Notes 4: From The Frontlines > of Slurm Support", Jason Booth, SchedMD. Yes, thanks fo

Re: [slurm-users] Upgrading slurm - can I do it while jobs running?

2021-05-26 Thread Will Dennis
Yup, in our case, it would be 20.11.5 -> 20.11.7. From: slurm-users on behalf of Paul Edmon Date: Wednesday, May 26, 2021 at 2:59 PM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Upgrading slurm - can I do it while jobs running? We generally pause scheduling during upgrades out

Re: [slurm-users] Upgrading slurm - can I do it while jobs running?

2021-05-26 Thread Paul Edmon
We generally pause scheduling during upgrades out of paranoia more than anything. What that means is that we set all our partitions to DOWN and suspend all the jobs. Then we do the upgrade. That said, I know of people who do it live without much trouble. The risk is more substantial for maj

Re: [slurm-users] Upgrading slurm - can I do it while jobs running?

2021-05-26 Thread Antony Cleave
Short answer: yes. It's not risk-free, but as long as you increase all the timeouts to your worst-case estimate x4 and make sure you understand the upgrades section of this link, https://slurm.schedmd.com/quickstart_admin.html, and keep it open for reference, you should be fine. Antony On Wed, 26 May 2

[slurm-users] Upgrading slurm - can I do it while jobs running?

2021-05-26 Thread Will Dennis
Hi all, About to embark on my first Slurm upgrade (building from source now, into a versioned path /opt/slurm// which is then symlinked to /opt/slurm/current/ for the “in-use” one…) This is a new cluster, running 20.11.5 (which we now know has a CVE that was fixed in 20.11.7) but I have resear

Re: [slurm-users] Cluster usage, filtered by partition

2021-05-26 Thread Alan Orth
Hi, Every year or so a manager/auditor asks to see our cluster usage as well, and I use the R scripts from slurm-stats to generate them: https://github.com/CSCfi/slurm-stats This will give you a nice CSV with lots of data. It hasn't been updated in a few years, but it worked with R/4.0 the last time I tried