We are pleased to announce the availability of Slurm version 17.11.8.
This release includes over 30 fixes made since 17.11.7 was released at the end of May. Among them is a change to the slurmd.service file used with systemd: the fix prevents systemd from destroying the cgroup hierarchies that slurmd and slurmstepd have created whenever 'systemctl daemon-reload' is called (e.g., by yum/rpm).
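For sites that maintain their own copy of the unit file, the following is a minimal sketch of how one might check that a slurmd.service file carries the Delegate=yes setting listed in the changes for this release. The helper name has_delegate and the trimmed-down SAMPLE unit text are assumptions made for illustration; SAMPLE is not the file shipped with Slurm. The check only recognises boolean values, although newer systemd versions also accept a list of controller names for Delegate=.

```python
import configparser

# Hypothetical, trimmed-down unit text used only for illustration;
# this is NOT the slurmd.service file shipped with Slurm.
SAMPLE = """\
[Unit]
Description=Slurm node daemon

[Service]
Type=forking
ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS
Delegate=yes
"""


def has_delegate(unit_text: str) -> bool:
    """Return True if the [Service] section sets Delegate to a true boolean."""
    # Unit files are INI-like; strict=False tolerates repeated keys and
    # interpolation=None leaves characters such as '%' untouched.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep the original case of directive names
    parser.read_string(unit_text)
    value = parser.get("Service", "Delegate", fallback="")
    # systemd accepts 1, yes, true and on as true boolean values.
    return value.strip().lower() in {"1", "yes", "true", "on"}


if __name__ == "__main__":
    print(has_delegate(SAMPLE))
```

To check an installed file, read its contents and pass the text to has_delegate(); the location of the unit file depends on how Slurm was packaged.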
Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support
* Changes in Slurm 17.11.8
==========================
 -- Fix incomplete RESPONSE_[RESOURCE|JOB_PACK]_ALLOCATION building path.
 -- Do not allocate nodes that were marked down due to the node not responding by ResumeTimeout.
 -- task/cray plugin - search for "mems" cgroup information in the file "cpuset.mems" then fall back to the file "mems".
 -- Fix ipmi profile debug uninitialized variable.
 -- Improve detection of Lua package on older RHEL distributions.
 -- PMIx: fixed the direct connect inline msg sending.
 -- MYSQL: Fix issue not handling all fields when loading an archive dump.
 -- Allow a job_submit plugin to change the admin_comment field during job_submit_plugin_modify().
 -- job_submit/lua - fix access into reservation table.
 -- MySQL - Prevent deadlock caused by archive logic locking reads.
 -- Don't enforce MaxQueryTimeRange when requesting specific jobs.
 -- Modify --test-only logic to properly support jobs submitted to more than one partition.
 -- Prevent slurmctld from aborting when attempting to set non-existing qos as def_qos_id.
 -- Add new job dependency type of "afterburstbuffer". The pending job will be delayed until the first job completes execution and its burst buffer stage-out is completed.
 -- Reorder proctrack/task plugin load in the slurmstepd to match that of slurmd and avoid the race condition that calling task before proctrack can introduce.
 -- Prevent reboot of a busy KNL node when requesting inactive features.
 -- Revert to previous behavior when requesting memory per cpu/node introduced in 17.11.7.
 -- Fix to reinitialize previously adjusted job members to their original value when validating the job memory in multi-partition requests.
 -- Fix _step_signal() from always returning SLURM_SUCCESS.
 -- Combine active and available node feature change logs on one line rather than one line per node for performance reasons.
 -- Prevent occasionally leaking freezer cgroups.
 -- Fix potential segfault when closing the mpi/pmi2 plugin.
 -- Fix issues with --exclusive=[user|mcs] to work correctly with preemption or when job requests a specific list of hosts.
 -- Make code compile with hdf5 1.10.2+
 -- mpi/pmix: Fixed the collectives canceling.
 -- SlurmDBD: improve error message handling on archive load failure.
 -- Fix incorrect locking when deleting reservations.
 -- Fix incorrect locking when setting up the power save module.
 -- Fix setting format output length for squeue when showing array jobs.
 -- Add xstrstr function.
 -- Fix printing out of --hint options in sbatch, salloc --help.
 -- Prevent possible divide by zero in _validate_time_limit().
 -- Add Delegate=yes to the slurmd.service file to prevent systemd from interfering with the jobs' cgroup hierarchies.
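
As an illustration of the new "afterburstbuffer" dependency type listed above, here is a minimal sketch that assembles an sbatch command line using it. The helper name afterburstbuffer_cmd, the job ID 1234 and the script name stage_two.sh are hypothetical; the sketch only builds the argument list and does not submit anything.

```python
import shlex
from typing import List, Sequence


def afterburstbuffer_cmd(script: str, job_ids: Sequence[int]) -> List[str]:
    """Build an sbatch argument list that waits on the given jobs.

    The dependent job is held until each listed job has finished and its
    burst buffer stage-out has completed.
    """
    if not job_ids:
        raise ValueError("at least one job ID is required")
    # Multiple job IDs are joined with ':' after the dependency type.
    dependency = "afterburstbuffer:" + ":".join(str(j) for j in job_ids)
    return ["sbatch", "--dependency=" + dependency, script]


if __name__ == "__main__":
    # Hypothetical job ID and script name, for illustration only.
    args = afterburstbuffer_cmd("stage_two.sh", [1234])
    print(" ".join(shlex.quote(a) for a in args))
    # prints: sbatch --dependency=afterburstbuffer:1234 stage_two.sh
```

The returned list can be passed to subprocess.run() on a host where sbatch is available.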