Slurm versions 24.11.5, 24.05.8, and 23.11.11 are now available and include a fix for a recently discovered security issue.

SchedMD customers were informed on April 23rd and provided a patch on
request; this process is documented in our security policy. [1]

A mistake with permission handling for Coordinators within Slurm's accounting system can allow a Coordinator to promote a user to Administrator. (CVE-2025-43904)

Thank you to Sekou Diakite (HPE) for reporting this.

Downloads are available at https://www.schedmd.com/downloads.php .
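
As a rough aid for deciding whether an installation needs this update, the sketch below compares a version string against the three fixed releases named above. It is only an illustration: the function names are hypothetical, it assumes plain "major.minor.micro" strings (optionally prefixed as in "slurm 24.11.5"), and it returns None for any release series this announcement does not cover rather than guessing.

```python
# Fixed releases named in this announcement: (major, minor) -> first fixed micro.
FIXED_MICRO = {(24, 11): 5, (24, 5): 8, (23, 11): 11}


def parse_version(text):
    """Parse 'major.minor.micro', tolerating a leading word such as 'slurm'."""
    token = text.strip().split()[-1]
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.micro, got {token!r}")
    return tuple(int(part) for part in parts)


def includes_fix(version_text):
    """Return True/False for a covered series, or None if the series is unknown."""
    major, minor, micro = parse_version(version_text)
    fixed_micro = FIXED_MICRO.get((major, minor))
    if fixed_micro is None:
        # Series not mentioned in this announcement; no claim is made.
        return None
    return micro >= fixed_micro


if __name__ == "__main__":
    for candidate in ("24.11.4", "slurm 24.05.8", "23.11.11", "22.05.9"):
        print(candidate, "->", includes_fix(candidate))
```

Release candidates or distribution-suffixed strings (for example with a trailing "-1") are deliberately not handled and raise ValueError.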

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security-policy/

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support

* Changes in Slurm 24.11.5
==========================
 -- Return error to scontrol reboot on bad nodelists.
 -- slurmrestd - Report an error when QOS resolution fails for v0.0.40
    endpoints.
 -- slurmrestd - Report an error when QOS resolution fails for v0.0.41
    endpoints.
 -- slurmrestd - Report an error when QOS resolution fails for v0.0.42
    endpoints.
 -- data_parser/v0.0.42 - Added +inline_enums flag which modifies the
    output when generating OpenAPI specification. It causes enum arrays to not
    be defined in their own schema with references ($ref) to them. Instead they
    will be dumped inline.
 -- Fix binding error with tres-bind map/mask on partial node allocations.
 -- Fix stepmgr enabled steps being able to request features.
 -- Reject step creation if requested feature is not available in job.
 -- slurmd - Restrict listening for new incoming RPC requests further into
    startup.
 -- slurmd - Avoid auth/slurm related hangs of CLI commands during startup
    and shutdown.
 -- slurmctld - Restrict processing new incoming RPC requests further into
    startup. Stop processing requests sooner during shutdown.
 -- slurmctld - Avoid auth/slurm related hangs of CLI commands during
    startup and shutdown.
 -- slurmctld - Avoid race condition during shutdown or reconfigure that
    could result in a crash due to delayed processing of a connection while
    plugins are unloaded.
 -- Fix small memleak when getting the job list from the database.
 -- Fix incorrect printing of % escape characters when printing stdio
    fields for jobs.
 -- Fix padding parsing when printing stdio fields for jobs.
 -- Fix printing %A array job id when expanding patterns.
 -- Fix reservations causing jobs to be held for Bad Constraints.
 -- switch/hpe_slingshot - Prevent potential segfault on failed curl
    request to the fabric manager.
 -- Fix printing incorrect array job id when expanding stdio file names.
    The %A will now be substituted by the correct value.
 -- switch/hpe_slingshot - Fix VNI range not updating on slurmctld restart
    or reconfigure.
 -- Fix steps not being created when using certain combinations of -c and
    -n lower than the job's requested resources, when using stepmgr and nodes
    are configured with CPUs == Sockets*CoresPerSocket.
 -- Permit configuring the number of retry attempts to destroy CXI service
    via the new destroy_retries SwitchParameter.
 -- Do not reset memory.high and memory.swap.max in slurmd startup or
    reconfigure, since slurmd never actually modifies these settings.
 -- Fix reconfigure failure of slurmd when it has been started manually and
    the CoreSpecLimits have been removed from slurm.conf.
 -- Set or reset CoreSpec limits when slurmd is reconfigured and it was
    started with systemd.
 -- switch/hpe_slingshot - Make sure the slurmctld can free step VNIs after
    the controller restarts or reconfigures while the job is running.
 -- Fix backup slurmctld failure on 2nd takeover.
 -- Testsuite - fix python test 130_2.
 -- Fix security issue where a coordinator could add a user with elevated
    privileges. CVE-2025-43904.
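
Several of the entries above concern how stdio file name patterns are expanded (% escapes, zero padding, and the %A array job id). For readers unfamiliar with these patterns, the following is a minimal sketch of the expansion rules as described in the sbatch documentation. It is not Slurm's implementation: the function name and arguments are hypothetical, only %%, %j, %A, and %a are handled, and falling back to the job id for %A on non-array jobs is an assumption made here.

```python
import re

# Matches '%%', or '%' + optional zero-pad width + one of the specifiers A, a, j.
_SPEC = re.compile(r"%(%|(\d*)([Aaj]))")


def expand_stdio_pattern(pattern, job_id, array_job_id=None, array_task_id=None):
    """Expand a small subset of stdio file name patterns (illustrative only)."""
    values = {
        "j": job_id,
        # Assumption: non-array jobs use their own job id for %A.
        "A": job_id if array_job_id is None else array_job_id,
        "a": array_task_id,
    }

    def replace(match):
        if match.group(1) == "%":
            return "%"  # '%%' is a literal percent sign
        width = int(match.group(2) or 0)
        value = values[match.group(3)]
        if value is None:
            return match.group(0)  # leave the specifier untouched
        return str(value).zfill(width)

    return _SPEC.sub(replace, pattern)


if __name__ == "__main__":
    print(expand_stdio_pattern("slurm-%A_%a.out", 1001, 1000, 3))
    print(expand_stdio_pattern("%6j.log", 42))
    print(expand_stdio_pattern("100%%-%j", 42))
```

Specifiers outside this subset, such as %N, pass through unchanged in the sketch.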

* Changes in Slurm 24.05.8
==========================
 -- Testsuite - fix python test 130_2.
 -- Fix security issue where a coordinator could add a user with elevated
    privileges. CVE-2025-43904.

* Changes in Slurm 23.11.11
===========================
 -- Fixed a job requeuing issue that merged job entries into the same SLUID
    when all nodes in a job failed simultaneously.
 -- Add ABORT_ON_FATAL environment variable to capture a backtrace from any
    fatal() message.
 -- Testsuite - fix python test 130_2.
 -- Fix security issue where a coordinator could add a user with elevated
    privileges. CVE-2025-43904.

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com