We are pleased to announce the availability of Slurm version 23.11.8.

The 23.11.8 release fixes some potential crashes in slurmctld, slurmrestd, and slurmd when using less common features; two issues in auth/slurm; and a few other minor bugs.

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

-Marshall


 -- Fix slurmctld crash when reconfiguring while a PrologSlurmctld is running.
 -- Fix slurmctld crash after a job has been resized.
 -- Fix slurmctld and slurmdbd potentially stopping instead of performing a
    logrotate when receiving SIGUSR2 when using auth/slurm.
 -- Fix keepalive CommunicationParameters in slurm.conf not having a disabled
    value when these parameters are not set. This could log an error when
    setting a socket, for example during slurmdbd registration with slurmctld.
 -- switch/hpe_slingshot - Fix slurmctld crash when upgrading from 23.02.
 -- Fix "Could not find group" errors from validate_group() when using
    AllowGroups with large /etc/group files.
 -- slurmrestd - Prevent a slurmrestd segfault when parsing the crontab field,
    which was never usable. Now it explicitly ignores the value and emits a
    warning if it is used for the following endpoints:
      'POST /slurm/v0.0.39/job/{job_id}'
      'POST /slurm/v0.0.39/job/submit'
      'POST /slurm/v0.0.40/job/{job_id}'
      'POST /slurm/v0.0.40/job/submit'
 -- Fix getting user environment when using sbatch with "--get-user-env" or
    "--export=" when there is a user profile script that reads /proc.
 -- Prevent slurmd from crashing if acct_gather_energy/gpu is configured but
    GresTypes is not configured.
 -- Do not log the following errors when AcctGatherEnergyType plugins are used
    but a node does not have or cannot find sensors:
    "error: _get_joules_task: can't get info from slurmd"
    "error: slurm_get_node_energy: Zero Bytes were transmitted or received"
    However, the following error will continue to be logged:
    "error: Can't get energy data. No power sensors are available. Try later"
 -- Fix cloud nodes not being able to forward to nodes that restarted with new
    IP addresses.
 -- sacct - Fix printing of job group for job steps.
 -- Fix error in scrontab jobs when using slurm.conf:PropagatePrioProcess=1.
 -- Fix slurmctld crash on a batch job submission with "--nodes 0,...".
 -- Fix dynamic IP address fanout forwarding when using auth/slurm.
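
For the slurmrestd change above, clients that still send a crontab field to the listed endpoints will now see a warning. One way to avoid it is to drop the field before submitting. The following is a minimal sketch, not part of Slurm: strip_crontab is a hypothetical helper, and the payload layout ("script", "job", "name") is only an illustrative assumption about what a client might send.

```python
def strip_crontab(payload):
    """Return a copy of payload with every 'crontab' key removed.

    Hypothetical client-side helper; it walks nested dicts and lists
    and builds new containers, so the input is left unmodified.
    """
    if isinstance(payload, dict):
        return {
            key: strip_crontab(value)
            for key, value in payload.items()
            if key != "crontab"
        }
    if isinstance(payload, list):
        return [strip_crontab(item) for item in payload]
    return payload


# Illustrative payload only; the field names are assumptions.
payload = {
    "script": "#!/bin/bash\nhostname\n",
    "job": {"name": "demo", "crontab": {"minute": "0"}},
}
cleaned = strip_crontab(payload)
```

The cleaned dictionary can then be serialized as the request body, while the original payload stays available for logging or debugging.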
