Navin,
Check out 'sprio'; it will show you how the job priority changes
with the weight changes you are making.
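For example, something along these lines (exact options can vary with your Slurm version, so check the sprio man page):

    sprio -w        # show the priority weights currently in effect
    sprio -l        # per-job breakdown of the priority factors

sprio -w prints the configured weights, and sprio -l shows how each pending job's priority decomposes into the individual factors such as fairshare.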
-b
On 4/29/20 5:00 AM, navin srivastava wrote:
Thanks Daniel.
All the jobs went into the run state, so I am unable to provide the details,
but I will definitely reach out later if we see a similar issue.
I am more interested in understanding FIFO combined with Fair Tree. It would be
good if anybody could provide some insight on this combination, and also how
the behaviour will change if we enable backfilling here.
What is the role of Fair Tree here? The relevant settings are:
PriorityType=priority/multifactor
PriorityDecayHalfLife=2
PriorityUsageResetPeriod=DAILY
PriorityWeightFairshare=500000
PriorityFlags=FAIR_TREE
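To make the backfill part of the question concrete, I assume enabling it alongside these settings would look roughly like the following in slurm.conf (the bf_* values are only illustrative, not something we run today):

    SchedulerType=sched/backfill
    # illustrative backfill tuning knobs, adjust to taste
    SchedulerParameters=bf_window=1440,bf_interval=30,bf_continue

with Fair Tree still deciding each job's priority via the fairshare factor, and backfill only starting lower-priority jobs when that does not delay higher-priority ones.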
Regards
Navin.
On Mon, Apr 27, 2020 at 9:37 PM Daniel Letai <d...@letai.org.il> wrote:
Are you sure there are enough resources available? The node is in the
MIXED state, so it's configured for both partitions - it's possible
that earlier, lower-priority jobs are already running and are
blocking the later jobs, especially since it's FIFO.
It would really help if you pasted the results of:
squeue
sinfo
As well as the exact sbatch line, so we can see how many resources
per node are requested.
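For example (the format strings below are only suggestions):

    squeue -p GPUsmall --state=PD -o "%.10i %.9P %.8u %.2t %.10M %.6D %.4C %R"   # pending jobs plus pending reason
    sinfo -p GPUsmall -N -o "%N %T %C %G"                                        # per-node state, CPUs (A/I/O/T) and GRES

The %R column from squeue and the allocated/idle CPU counts from sinfo usually make it clear whether the scheduler actually sees free resources.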
On 26/04/2020 12:00:06, navin srivastava wrote:
Thanks Brian,
As suggested, I went through the documentation. What I understood is that
Fair Tree drives the fairshare mechanism, and jobs should be scheduled
based on that.
So it means job scheduling will be FIFO, but priority will be decided by
fairshare; I am not sure whether the two conflict here. I see that the
normal jobs' priority is lower than the GPUsmall priority, so if resources
are available in the GPUsmall partition those jobs should run. No job is
pending because of GPU resources; the jobs do not even request GPUs.
Is there any article where I can see how fairshare works and which
settings should not conflict with it?
According to the documentation, it never says that FIFO should be disabled
if fair-share is applied.
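For what it's worth, I guess the actual fairshare values can be inspected with something like (assuming sshare is available on this version):

    sshare -l   # long listing of fairshare usage and factors per account/user

so it should be visible whether the normal and GPUsmall jobs really end up with different fairshare factors.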
Regards
Navin.
On Sat, Apr 25, 2020 at 12:47 AM Brian W. Johanson <bjoha...@psc.edu> wrote:
If you haven't looked at the man page for slurm.conf, it will
answer most if not all of your questions:
https://slurm.schedmd.com/slurm.conf.html. I would rely on the
manual version that was distributed with the release you have
installed, as options do change.
There is a ton of information that is tedious to get through,
but reading it multiple times opens many doors.
DefaultTime is listed in there as a Partition option.
If you are scheduling gres/gpu resources, it's quite possible
there are cores available with no corresponding GPUs available.
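For example, a partition definition with a default time limit could look something like this (the node list and times are just placeholders):

    PartitionName=GPUsmall Nodes=node[18-19] DefaultTime=04:00:00 MaxTime=7-00:00:00 State=UP

Jobs submitted without --time then pick up DefaultTime instead of running against the partition's MaxTime.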
-b
On 4/24/20 2:49 PM, navin srivastava wrote:
Thanks Brian.
I need to check the job order.
Is there any way to define a default time limit for a job if the
user does not specify one?
Also, what is the meaning of Fair Tree in the priority settings in
slurm.conf?
The sets of nodes are different in the partitions; does FIFO not
care about partitions at all?
Is it strict ordering, meaning the job that came first will go, and
until it runs no other job will be allowed?
Also, the priority is high for the GPUsmall partition and low for
normal jobs, and the nodes of the normal partition are full while
GPUsmall cores are available.
Regards
Navin
On Fri, Apr 24, 2020, 23:49 Brian W. Johanson <bjoha...@psc.edu> wrote:
Without seeing the jobs in your queue, I would expect
the next job in FIFO order to be too large to fit in the
currently idle resources.
Configure it to use the backfill scheduler:
SchedulerType=sched/backfill
SchedulerType
    Identifies the type of scheduler to be used. Note the slurmctld
    daemon must be restarted for a change in scheduler type to become
    effective (reconfiguring a running daemon has no effect for this
    parameter). The scontrol command can be used to manually change
    job priorities if desired. Acceptable values include:

    sched/backfill
        For a backfill scheduling module to augment the default FIFO
        scheduling. Backfill scheduling will initiate lower-priority
        jobs if doing so does not delay the expected initiation time
        of any higher priority job. Effectiveness of backfill
        scheduling is dependent upon users specifying job time
        limits, otherwise all jobs will have the same time limit and
        backfilling is impossible. Note documentation for the
        SchedulerParameters option above. This is the default
        configuration.

    sched/builtin
        This is the FIFO scheduler which initiates jobs in priority
        order. If any job in the partition can not be scheduled, no
        lower priority job in that partition will be scheduled. An
        exception is made for jobs that can not run due to partition
        constraints (e.g. the time limit) or down/drained nodes. In
        that case, lower priority jobs can be initiated and not
        impact the higher priority job.
Your partitions are set with MaxTime=INFINITE; if your users are
not specifying a reasonable time limit on their jobs, this won't
help either.
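For example, a per-job limit can be passed at submission (the time value and script name are only illustrative):

    sbatch --time=02:00:00 job.sh

or a DefaultTime can be set per partition, so the backfill scheduler has real time limits to plan around.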
-b
On 4/24/20 1:52 PM, navin srivastava wrote:
In addition to the above, when I look at sprio for the jobs in both
partitions it shows:
For the normal queue, all jobs show the same priority:
JOBID    PARTITION  PRIORITY  FAIRSHARE
1291352  normal     15789     15789
For GPUsmall, all jobs show the same priority:
JOBID    PARTITION  PRIORITY  FAIRSHARE
1291339  GPUsmall   21052     21053
On Fri, Apr 24, 2020 at 11:14 PM navin srivastava <navin.alt...@gmail.com> wrote:
Hi Team,
We are facing an issue in our environment. Resources are
free, but jobs are going into the queued (PD) state rather
than running.
I have attached the slurm.conf file here.
Scenario:
There are jobs in only 2 partitions:
344 jobs are in PD state in the normal partition; the nodes
belonging to the normal partition are full, so no more jobs
can run there.
1300 jobs in the GPUsmall partition are queued, and enough
CPUs are available to execute them, but the jobs are not
being scheduled on the free nodes.
There are no pending jobs in any other partition.
For example, the node status of node18:
NodeName=node18 Arch=x86_64 CoresPerSocket=18
CPUAlloc=6 CPUErr=0 CPUTot=36 CPULoad=4.07
AvailableFeatures=K2200
ActiveFeatures=K2200
Gres=gpu:2
NodeAddr=node18 NodeHostName=node18 Version=17.11
OS=Linux 4.4.140-94.42-default #1 SMP Tue Jul 17
07:44:50 UTC 2018 (0b375e4)
RealMemory=1 AllocMem=0 FreeMem=79532 Sockets=2
Boards=1
State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1
Owner=N/A MCS_label=N/A
Partitions=GPUsmall,pm_shared
BootTime=2019-12-10T14:16:37
SlurmdStartTime=2019-12-10T14:24:08
CfgTRES=cpu=36,mem=1M,billing=36
AllocTRES=cpu=6
CapWatts=n/a
CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
ExtSensorsJoules=n/s ExtSensorsWatts=0
ExtSensorsTemp=n/s
node19:
NodeName=node19 Arch=x86_64 CoresPerSocket=18
CPUAlloc=16 CPUErr=0 CPUTot=36 CPULoad=15.43
AvailableFeatures=K2200
ActiveFeatures=K2200
Gres=gpu:2
NodeAddr=node19 NodeHostName=node19 Version=17.11
OS=Linux 4.12.14-94.41-default #1 SMP Wed Oct 31
12:25:04 UTC 2018 (3090901)
RealMemory=1 AllocMem=0 FreeMem=63998 Sockets=2
Boards=1
State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1
Owner=N/A MCS_label=N/A
Partitions=GPUsmall,pm_shared
BootTime=2020-03-12T06:51:54
SlurmdStartTime=2020-03-12T06:53:14
CfgTRES=cpu=36,mem=1M,billing=36
AllocTRES=cpu=16
CapWatts=n/a
CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
ExtSensorsJoules=n/s ExtSensorsWatts=0
ExtSensorsTemp=n/s
Could you please help me understand what could
be the reason?
--
Regards,
Daniel Letai
+972 (0)505 870 456