If you have accounting implemented, just set MaxJobs and it will do the
trick:
MaxJobs= The total number of jobs able to run at any given time for the
given association. If this limit is reached new jobs will be queued but
only allowed to run after previous jobs complete from the association.
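Note that the association MaxJobs limit above is enforced per association, not per partition. For a hard "one running job in this partition, regardless of who submits" cap, a partition QOS with GrpJobs=1 is another option. A rough config sketch (QOS and partition names are made up; assumes slurmdbd accounting is in place and AccountingStorageEnforce includes qos):

```
# Create a QOS that allows at most one running job in total (GrpJobs=1)
sacctmgr add qos onejob
sacctmgr modify qos onejob set GrpJobs=1

# slurm.conf: attach the QOS to the partition and enforce QOS limits
#   AccountingStorageEnforce=limits,qos
#   PartitionName=serial Nodes=node[01-08] Default=NO QOS=onejob
```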
Greetings!
I am trying to set up a partition that will only allow one job at a time to
run, regardless of who submits it.
So multiple jobs from multiple users can be in the queue, but I only want the
partition to run one at a time.
I also have the need to set up an additional partition with the
Thanks for the replies!
This is exactly what I need.
-tom
From: slurm-users on behalf of Ole Holm Nielsen
Sent: Friday, October 18, 2019 2:15 PM
To: slurm-users@lists.schedmd.com
Subject: [EXT] Re: [slurm-users] How to find core count per job per node
$ scontrol --details show job 1653838
JobId=1653838 JobName=v1.20
...
Nodes=r00g01 CPU_IDs=31-35 Mem=5120 GRES_IDX=
Nodes=r00n16 CPU_IDs=34-35 Mem=2048 GRES_IDX=
Nodes=r00n20 CPU_IDs=12-17,30-35 Mem=12288 GRES_IDX=
Nodes=r01n16 CPU_IDs=15 Mem=1024 GRES_IDX=
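Those CPU_IDs range lists can be totalled per node with a few lines of scripting. A minimal sketch (the function name is mine, and it only parses the detail-line format shown above):

```python
import re

def cores_per_node(scontrol_output):
    """Count allocated cores per node from `scontrol --details show job` output.

    Looks for lines like: Nodes=r00n20 CPU_IDs=12-17,30-35 Mem=12288
    and expands each comma-separated CPU_IDs range into a core count.
    """
    counts = {}
    for node, ids in re.findall(r"Nodes=(\S+)\s+CPU_IDs=(\S+)", scontrol_output):
        total = 0
        for part in ids.split(","):
            lo, _, hi = part.partition("-")          # "12-17" or a single id "15"
            total += int(hi or lo) - int(lo) + 1
        counts[node] = total
    return counts

# The four detail lines from the job above:
sample = """\
Nodes=r00g01 CPU_IDs=31-35 Mem=5120 GRES_IDX=
Nodes=r00n16 CPU_IDs=34-35 Mem=2048 GRES_IDX=
Nodes=r00n20 CPU_IDs=12-17,30-35 Mem=12288 GRES_IDX=
Nodes=r01n16 CPU_IDs=15 Mem=1024 GRES_IDX="""

print(cores_per_node(sample))
# {'r00g01': 5, 'r00n16': 2, 'r00n20': 12, 'r01n16': 1}
```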
Thanks for sharing that!
Adding the "--details" flag to the scontrol lookup of the job shows the per-node CPU allocation:
$ scontrol --details show job 1636832
JobId=1636832 JobName=R3_L2d
:
NodeList=r00g01,r00n09
BatchHost=r00g01
NumNodes=2 NumCPUs=60 NumTasks=60 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=60,mem=60G,node=2,billing=55350
Sock
On 18-10-2019 19:56, Tom Wurgler wrote:
I need to know how many cores a given job is using per node.
Say my nodes have 24 cores each and I run a 36-way job.
It takes a node and a half.
"scontrol show job <jobid>"
shows me 36 cores and the 2 nodes it is running on.
But I want to know how it split the job up between the nodes.
Thanks for any info!
Hi Lech,
Thanks for the hint. I didn't know about that option.
Another way would be to just retain the StateSaveLocation files and move those
over to the sandbox in which I've tested the upgrade. Once I copied the files
and re-did the upgrade from scratch, the IDs were consecutive as expected.
Hi Florian,
You can use the FirstJobId option from slurm.conf to continue the JobIds
seamlessly.
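For example, if the highest job ID in the old database were known, slurm.conf on the new installation could start numbering just above it (the value below is a placeholder, not from this thread):

```
# slurm.conf: make new job IDs continue after the old database's highest ID
# (100001 is hypothetical; use the last old job ID + 1)
FirstJobId=100001
```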
Kind Regards,
Lech
> On 18.10.2019 at 11:47, Florian Zillner wrote:
>
> Hi all,
>
> we’re using OpenHPC packages to run SLURM. Current OpenHPC Version is 1.3.8
> (SLURM 18.08.8), though we’re
Hi all,
we're using OpenHPC packages to run SLURM. Current OpenHPC Version is 1.3.8
(SLURM 18.08.8), though we're still at 1.3.3 (SLURM 17.02.7), for now.
I've successfully attempted an upgrade in a separate testing environment, which
works fine once you adhere to the upgrading notes... So the