Slurm version 19.05.3 is now available, and includes a series of fixes since 19.05.2 was released nearly two months ago.

Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support

* Changes in Slurm 19.05.3
==========================
 -- Fix missing check from conversion of cray -> cray_aries.
 -- Improve job state reason string when required nodes are not available by
    not including those that don't belong to the job partition.
 -- Set a more appropriate ESLURM_RESERVATION_MAINT job state reason for jobs
    requesting feature(s) and required nodes are in a maintenance reservation.
 -- Fix logic to better handle maintenance reservations.
 -- Add spank options to cache in remote callback.
 -- Enforce the use of spank_option_getopt().
 -- Fix select plugins' will-run test under-allocating node usage for
    completing jobs.
 -- Treat nodes in COMPLETING state as currently available for the job
    will-run test.
 -- Cray - fix contribs slurm.conf.j2 with updated cray_aries plugin names.
 -- job_submit/lua - fix problem where nil was expected for min_mem_per_cpu.
 -- Fix extra, unaccounted TRESRunMins usage created by heterogeneous jobs when
    running with the priority/multifactor plugin.
 -- Detach threads once they are done to avoid having to join them
    in track scripts code.
 -- Handle situation where a slurmctld tries to communicate with slurmdbd more
    than once at the same time.
 -- Fix XOR/XAND features like cpu&fastio&[knl|westmere] to be resolved
    correctly.
 -- Don't update [min|max]_exit_code on job array task requeue.
 -- Don't assume the first node of a job is the batch host when testing if the
    job's allocated nodes are booted/ready.
 -- Make --batch=<feature> requests wait for all nodes to be booted so that it
    can choose the batch host after the nodes have been booted -- possibly with
    different features.
 -- Fix talking to batch host on its protocol version when using --batch.
 -- gres/mic plugin - add missing fini() function to clean up plugin state.
 -- Move _validate_node_choice() before prolog/epilog check.
 -- Look forward one week while creating a new reservation.
 -- Set missing resv_desc.flags before calling _select_nodes().
 -- Use correct start_time for TIME_FLOAT reservation in _job_overlap().
 -- Properly enforce a job's mem-per-cpu option when allocating the node
    exclusively to that job.
 -- sched/backfill - clear estimated sched_nodes as done for start_time.
 -- Have safe_[read|write] handle EAGAIN and EINTR.
 -- Fix checking for flag with logical AND.
 -- Correct "extern" definition of variable if compiling with __APPLE__.
 -- Deprecate FastSchedule. FastSchedule will be removed in 20.02.
    The FastSchedule=2 functionality (used for testing and development) has
    been retained as the new SlurmdParameters=config_overrides option.
 -- Fix preemption issue when picking nodes for a feature job request.
 -- Fix race condition preventing held array job from getting a db_index.
 -- Fix select/cons_tres gres code infinite loop leaving slurmctld unresponsive.
 -- Remove redefinition of global variable in gres.c.
 -- Fix issue where GPU devices are denied access when MPS is enabled.
 -- Fix uninitialized errors when compiling with CFLAGS="--coverage".
 -- Fix scancel --full for proctrack/cgroups.
 -- Fix sdiag backfill last and mean queue length stats.
 -- Do not remove batch host when resizing/shrinking a batch job.
 -- nss_slurm - fix file descriptor leaks.
 -- Fix preemption for jobs using complex feature requests
    (e.g. -C "[rack1*2&rack2*4]").
 -- Fix memory leaks in preemption when jobs request multiple features.
 -- Allow Operator users to show/fix runaways.
 -- Disallow coordinators from showing/fixing runaways.
 -- mpi/pmi2 - increase array len to avoid buffer size exceeded error.
 -- Preserve rebooting node's nextstate when updating state with scontrol.
 -- Fully merge slurm.conf and gres.conf before node_config_load().
 -- Remove FastSchedule dependence from gres.conf's AutoDetect=nvml.
 -- Forbid mix of typed and untyped GRES of same name in slurm.conf.
 -- cons_tres: Prevent creating a job without CPUs.
 -- Prevent underflow when filtering cores with gres.
 -- proctrack/cray_aries: use current pid instead of thread if we're in a fork.
 -- Fix missing check for prolog launch credential creation failure that can
    lead to segfaults.
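
As an illustration of the safe_[read|write] entry above, here is a minimal
sketch of a read loop that retries on EINTR and EAGAIN. This is not Slurm's
actual macro: the function name read_fully() is hypothetical, and waiting
with poll() on EAGAIN is an assumption made for this example.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Slurm's safe_read macro): read exactly `len`
 * bytes from `fd`, retrying when read() is interrupted (EINTR) or when a
 * non-blocking descriptor has no data yet (EAGAIN/EWOULDBLOCK).
 * Returns 0 on success, -1 on EOF before `len` bytes or on a hard error.
 */
static int read_fully(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t remaining = len;

	while (remaining > 0) {
		ssize_t n = read(fd, p, remaining);

		if (n > 0) {
			p += n;
			remaining -= (size_t) n;
		} else if (n == 0) {
			return -1;	/* peer closed before all bytes arrived */
		} else if (errno == EINTR) {
			continue;	/* interrupted by a signal: just retry */
		} else if (errno == EAGAIN || errno == EWOULDBLOCK) {
			/* Wait until the descriptor is readable, then retry. */
			struct pollfd pfd = { .fd = fd, .events = POLLIN };
			if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
				return -1;
		} else {
			return -1;	/* any other errno is a real failure */
		}
	}
	return 0;
}
```

A write-side helper would follow the same pattern, using write() and
POLLOUT in place of read() and POLLIN.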
