Re: [slurm-users] REST-based CLI tools out there somewhere?

2023-11-10 Thread Chip Seraphine
We actually have a bunch of people doing that. It’s fine for “I want to just run squeue and see how busy the cluster is without having to head over there and look”, but it starts to break down with “my GUI framework runs squeue and sacct to check job status every few seconds, all day long, per

Re: [slurm-users] REST-based CLI tools out there somewhere?

2023-11-10 Thread Davide DelVento
> Having a large number of researchers able to run arbitrary code on the same submit host has a marked tendency to result in an overloaded host. There are various ways to regulate that, ranging from "constant scolding" to "aggressive quotas/cgroups/etc", but all involve some degree of inco

Re: [slurm-users] How to delay the start of slurmd until Infiniband/OPA network is fully up?

2023-11-10 Thread Max Rutkowski
Hi Ward, On 10.11.2023 at 19:45, Ward Poelmans wrote: Hi Ole, On 10/11/2023 15:04, Ole Holm Nielsen wrote: On 11/5/23 21:32, Ward Poelmans wrote: Yes, it's very similar. I've put our systemd unit file also online on https://gist.github.com/wpoely86/cf88e8e41ee885677082a7b08e12ae11 This mig

Re: [slurm-users] REST-based CLI tools out there somewhere?

2023-11-10 Thread Jared Baker
At the risk of going a bit off the rails and alternative to the REST method (maybe), but not too far as we've been thinking of alternative ways for similar things (not slurm). Anyway, SSH certificates with a forced command entry and wrapper for slurm commands on submit hosts along with small wrappe

Re: [slurm-users] REST-based CLI tools out there somewhere?

2023-11-10 Thread Chip Seraphine
On 11/10/23, 2:25 AM, "slurm-users on behalf of Loris Bennett" wrote: >> Basically, we want to create a subset of the s* commands that can be run from some arbitrary mach

Re: [slurm-users] How to delay the start of slurmd until Infiniband/OPA network is fully up?

2023-11-10 Thread Ward Poelmans
Hi Ole, On 10/11/2023 15:04, Ole Holm Nielsen wrote: On 11/5/23 21:32, Ward Poelmans wrote: Yes, it's very similar. I've put our systemd unit file also online on https://gist.github.com/wpoely86/cf88e8e41ee885677082a7b08e12ae11 This might disturb the logic in waitforib.sh, or at least cause

Re: [slurm-users] REST-based CLI tools out there somewhere?

2023-11-10 Thread Chip Seraphine
> what are the security concerns? The cluster is shared between some business units that do not want to share data, so if we install the munge key on a machine that users have administrative or physical access to, it could become compromised. This could allow them to run jobs on the cluster as

[slurm-users] JobState of RaisedSignal:53 Real-time_signal_19; slurm 23.02.4

2023-11-10 Thread Robert Kudyba
The user is launching a Singularity container for RStudio and the final option for --rsession-path does not exist. scontrol show job 420719 JobId=420719 JobName=r2.sbatch UserId=ouruser(552199) GroupId=user(500) MCS_label=N/A Priority=1428 Nice=0 Account=ouracct QOS=xxx JobState=FAILED Reaso

Re: [slurm-users] How to delay the start of slurmd until Infiniband/OPA network is fully up?

2023-11-10 Thread Ole Holm Nielsen
Hi Ward, On 11/5/23 21:32, Ward Poelmans wrote: Yes, it's very similar. I've put our systemd unit file also online on https://gist.github.com/wpoely86/cf88e8e41ee885677082a7b08e12ae11 This looks really good! However, I was testing the waitforib.sh script on a SuperMicro server WITHOUT Infini

Re: [slurm-users] REST-based CLI tools out there somewhere?

2023-11-10 Thread Loris Bennett
Chip Seraphine writes: > Hello, > Our users submit their jobs from shared submit hosts, and have expressed an understandable preference for being able to submit directly from their own workstations. The obvious solution (installing the slurm client on their workstations, or providing a