If you haven't looked at the man page for slurm.conf, it will answer most if not all of your questions: https://slurm.schedmd.com/slurm.conf.html. That said, I would rely on the manual page distributed with the version you have installed, as options do change.

There is a ton of information that is tedious to get through, but reading through it multiple times opens many doors.

DefaultTime is listed in there as a Partition option.
If you are scheduling gres/gpu resources, it's quite possible there are cores available with no corresponding GPUs free.
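
For example, a partition line might look something like this (the node list
and times are only placeholders, adjust for your site):

    PartitionName=normal Nodes=node[01-16] DefaultTime=01:00:00 MaxTime=14-00:00:00 State=UP

And to compare idle cores against the gres configured on the GPU nodes,
something like:

    sinfo -p GPUsmall -N -o "%n %C %G"

where %C prints allocated/idle/other/total CPUs and %G the gres defined on
each node.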

-b

On 4/24/20 2:49 PM, navin srivastava wrote:
Thanks Brian.

I need to check the job order.

Is there any way to define a default time limit for a job if the user does not specify one?

Also, what is the meaning of fairtree in the priorities in the slurm.conf file?

The set of nodes is different in each partition, and FIFO does not care about any partitioning. Is it strict ordering, meaning the job that came first will go, and until it runs it will not allow others?

Also, the priority is high for the gpusmall partition and low for normal jobs, and the nodes of the normal partition are full, but gpusmall cores are available.

Regards
Navin

On Fri, Apr 24, 2020, 23:49 Brian W. Johanson <bjoha...@psc.edu> wrote:

    Without seeing the jobs in your queue, I would expect the next job
    in FIFO order to be too large to fit in the current idle resources.

    Configure it to use the backfill scheduler:
    SchedulerType=sched/backfill

          SchedulerType
                  Identifies the type of scheduler to be used.  Note
    the slurmctld daemon must be restarted for a change in scheduler
    type to become effective (reconfiguring a running daemon has no
    effect for this parameter).  The scontrol command can be used to
    manually change job priorities if desired.  Acceptable values include:

                  sched/backfill
                         For a backfill scheduling module to augment
    the default FIFO scheduling.  Backfill scheduling will initiate
    lower-priority jobs if doing so does not delay the expected
    initiation time of any higher priority job.  Effectiveness of
    backfill scheduling is dependent upon users specifying job time
    limits, otherwise all jobs will have the same time limit and
    backfilling is impossible.  Note documentation for the
    SchedulerParameters option above.  This is the default configuration.

                  sched/builtin
                         This  is  the  FIFO scheduler which initiates
    jobs in priority order.  If any job in the partition can not be
    scheduled, no lower priority job in that partition will be
    scheduled.  An exception is made for jobs that can not run due to
    partition constraints (e.g. the time limit) or down/drained
    nodes.  In that case, lower priority jobs can be initiated and not
    impact the higher priority job.
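
    As a rough sketch, the relevant slurm.conf lines might look like this
    (the bf_* values are only illustrative; see SchedulerParameters in the
    man page for the full list of backfill knobs):

        SchedulerType=sched/backfill
        SchedulerParameters=bf_window=11520,bf_continue,bf_max_job_test=1000

    After restarting slurmctld, 'scontrol show config | grep Scheduler'
    shows what the controller is actually running with.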



    Your partitions are set with MaxTime=INFINITE; if your users are
    not specifying a reasonable time limit for their jobs, this won't
    help either.
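
    For example, a user can put something like this in the batch script
    (the value is only a placeholder):

        #SBATCH --time=04:00:00

    or pass --time on the sbatch/srun command line; a partition
    DefaultTime then only has to cover the jobs that don't.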


    -b


    On 4/24/20 1:52 PM, navin srivastava wrote:
    In addition to the above, when I look at sprio for both jobs it
    shows:

    For the normal queue, all jobs show the same priority:

     JOBID PARTITION   PRIORITY  FAIRSHARE
   1291352 normal         15789      15789

    For GPUsmall, all jobs show the same priority:

     JOBID PARTITION   PRIORITY  FAIRSHARE
   1291339 GPUsmall       21052      21053
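
    (For reference, something like the following would show the configured
    priority weights and the full per-factor breakdown, using a job id from
    the queue above:

        sprio -w
        sprio -l -j 1291339

    Exact columns depend on the Priority* settings in slurm.conf.)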

    On Fri, Apr 24, 2020 at 11:14 PM navin srivastava <navin.alt...@gmail.com> wrote:

        Hi Team,

        We are facing an issue in our environment: resources are free,
        but jobs are going into the queued (PD) state and not running.

        I have attached the slurm.conf file here.

        Scenario:

        There are jobs in only 2 partitions:
        344 jobs are in the PD state in the normal partition, and the
        nodes belonging to the normal partition are full, so no more
        jobs can run there.

        1300 jobs in the GPUsmall partition are queued, and enough CPU
        is available to execute them, but I see the jobs are not being
        scheduled on the free nodes.
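
        (The Reason column from squeue usually says why a job is still
        pending; for example, with the job id above as a placeholder:

            squeue -p GPUsmall -t PD -o "%.10i %.9P %.8u %.6D %R" | head
            scontrol show job 1291339 | grep -i Reason

        might point at the cause.)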

        There are no pending jobs in any other partition.
        For example, the node status of node18:

        NodeName=node18 Arch=x86_64 CoresPerSocket=18
           CPUAlloc=6 CPUErr=0 CPUTot=36 CPULoad=4.07
           AvailableFeatures=K2200
           ActiveFeatures=K2200
           Gres=gpu:2
           NodeAddr=node18 NodeHostName=node18 Version=17.11
           OS=Linux 4.4.140-94.42-default #1 SMP Tue Jul 17 07:44:50
        UTC 2018 (0b375e4)
           RealMemory=1 AllocMem=0 FreeMem=79532 Sockets=2 Boards=1
           State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A
        MCS_label=N/A
           Partitions=GPUsmall,pm_shared
           BootTime=2019-12-10T14:16:37
        SlurmdStartTime=2019-12-10T14:24:08
           CfgTRES=cpu=36,mem=1M,billing=36
           AllocTRES=cpu=6
           CapWatts=n/a
           CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
           ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

        node19:

        NodeName=node19 Arch=x86_64 CoresPerSocket=18
           CPUAlloc=16 CPUErr=0 CPUTot=36 CPULoad=15.43
           AvailableFeatures=K2200
           ActiveFeatures=K2200
           Gres=gpu:2
           NodeAddr=node19 NodeHostName=node19 Version=17.11
           OS=Linux 4.12.14-94.41-default #1 SMP Wed Oct 31 12:25:04
        UTC 2018 (3090901)
           RealMemory=1 AllocMem=0 FreeMem=63998 Sockets=2 Boards=1
           State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A
        MCS_label=N/A
           Partitions=GPUsmall,pm_shared
           BootTime=2020-03-12T06:51:54
        SlurmdStartTime=2020-03-12T06:53:14
           CfgTRES=cpu=36,mem=1M,billing=36
           AllocTRES=cpu=16
           CapWatts=n/a
           CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
           ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

        Could you please help me understand what the reason could be?
