Re: [slurm-users] Are SLURM_JOB_USER and SLURM_JOB_UID always constant and available

2020-05-20 Thread Christopher Samuel
On 5/20/20 7:23 pm, Kevin Buckley wrote: Are they set as part of the job payload creation, and so would ignore any node-local lookup, or set as the job gets allocated to the various nodes it will run on? Looking at git, it's a bit of both: src/slurmd/slurmd/req.c: setenvf(&env, "SLUR
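A minimal sketch of how a site Prolog might record these variables per node so they can be compared across a multi-node job (the path and the use of logger are illustrative, not from the thread):

    #!/bin/bash
    # Hypothetical Prolog fragment: log the user/uid Slurm hands this node,
    # so the values can be compared across all nodes of a multi-node job.
    logger -t slurm-prolog \
        "job=${SLURM_JOB_ID} node=$(hostname -s) user=${SLURM_JOB_USER} uid=${SLURM_JOB_UID}"
    exit 0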

[slurm-users] Are SLURM_JOB_USER and SLURM_JOB_UID always constant and available

2020-05-20 Thread Kevin Buckley
Just trying to get our heads around when SLURM_JOB_USER and SLURM_JOB_UID get set. Are they set as part of the job payload creation, and so would ignore any node-local lookup, or set as the job gets allocated to the various nodes it will run on? I know they're available to the Prolog script, and
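As a hedged illustration of the question (not something from the thread itself), a Prolog could compare the value Slurm supplies with a node-local lookup to see whether the two ever disagree:

    #!/bin/bash
    # Hypothetical Prolog fragment: compare the controller-supplied SLURM_JOB_USER
    # with what this node's own passwd database says about SLURM_JOB_UID.
    local_user=$(getent passwd "${SLURM_JOB_UID}" | cut -d: -f1)
    if [ "${local_user}" != "${SLURM_JOB_USER}" ]; then
        logger -t slurm-prolog \
            "job=${SLURM_JOB_ID}: SLURM_JOB_USER=${SLURM_JOB_USER}, but uid ${SLURM_JOB_UID} resolves locally to '${local_user}'"
    fi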

Re: [slurm-users] How does slurm keep track of latest jobid

2020-05-20 Thread Christoph Brüning
I'd assume that the NEXT_JOB_ID comes from one of the files in slurmctld's state directory ($ scontrol show config | grep StateSaveLocation). Not sure, but I think I have seen the id being reset when this directory got lost once... Best, Christoph On 20/05/2020 16.31, Ole Holm Nielsen wrote:
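A hedged way to poke at this: locate the state directory and look at what slurmctld keeps there (the path below is only an example value, and the guess that the next job id lives in the job state file is an assumption, not confirmed in the thread):

    $ scontrol show config | grep StateSaveLocation
    $ sudo ls -l /var/spool/slurmctld/   # substitute your own StateSaveLocation;
                                         # the job state file is the likely candidate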

Re: [slurm-users] How does slurm keep track of latest jobid

2020-05-20 Thread Ole Holm Nielsen
On 5/20/20 7:57 AM, Ole Holm Nielsen wrote: On 20-05-2020 00:03, Flynn, David P. (Dave) wrote: Where does Slurm keep track of the latest jobid? Since it is persistent across reboots, I suspect it’s in a file somewhere. $ scontrol show config | grep MaxJobId Sorry, I should have written: $
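For reference, and not necessarily the command the poster meant to correct to, the job-id range itself is bounded by two slurm.conf parameters that can be read the same way:

    $ scontrol show config | grep -E 'FirstJobId|MaxJobId'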

Re: [slurm-users] Overzealous PartitionQoS Limits

2020-05-20 Thread Christoph Brüning
Quick update: When we increase the GrpNodes limit, some of the jobs start running. However, they run on nodes that already have jobs from the "long" partition running. To my understanding, that should not change the node count against which the GrpNodes limit is applied... Best, Christoph
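One hedged way to check what is actually being counted against the limit (the partition/QoS name "long" is taken from the thread; the format fields are illustrative):

    $ sacctmgr show qos long format=Name%10,GrpTRES%30     # limits currently attached
    $ squeue -p long -t RUNNING -o '%N' | sort -u          # node lists of running "long" jobs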

[slurm-users] Overzealous PartitionQoS Limits

2020-05-20 Thread Christoph Brüning
Dear all, we set up a floating partition as described in SLURM's QoS documentation, to allow for jobs with a longer-than-usual walltime on part of our cluster: a QoS with GrpCPUs and GrpNodes limits, attached to the longer-walltime partition, which contains all nodes. We observe that jobs are s
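A minimal sketch of that kind of setup, with placeholder names and limits rather than the poster's actual values (current sacctmgr expresses the GrpCPUs/GrpNodes limits as GrpTRES):

    $ sacctmgr add qos long
    $ sacctmgr modify qos long set GrpTRES=cpu=256,node=8
    # and in slurm.conf, attach the QOS to the longer-walltime partition spanning all nodes:
    #   PartitionName=long Nodes=ALL QOS=long MaxTime=14-00:00:00 State=UP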