Re: [slurm-users] CommunicationParameters=block_null_hash issue in 21.08.8

Ole Holm Nielsen Thu, 05 May 2022 06:44:49 -0700

Hi Marcus,

On 5/5/22 14:45, Marcus Boden wrote:

We had similar issues on our systems. As I understand from the bug you linked, we just need to wait until all the old jobs are finished (and the old slurmstepd are gone). So a full drain should not be necessary?


Yes, I believe that sounds right.

I've been thinking about how to determine the timestamp of the oldest job running on the cluster, and then make sure this is after the time that all slurmd daemons were upgraded to 21.08.8.


This command will tell you the oldest running jobs:

$ squeue -t running -O StartTime | sort | head

You can add more -O options to get JobIDs etc., as long as you sort on the StartTime column (Slurm ISO 8601 timestamps[1] can simply be sorted in lexicographical order).
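To check the oldest start time against the moment the slurmd daemons were upgraded, something like the following might work. This is only a sketch: the function name jobs_started_before and the cutoff timestamp are made up for illustration, and it assumes StartTime is the first column of the squeue output so that a plain string comparison is enough.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): reads lines of the form
# "StartTime JobID ..." on stdin and prints, oldest first, the jobs that
# started before the cutoff given as the first argument.
jobs_started_before() {
    cutoff="$1"
    # ISO 8601 timestamps in the same format sort and compare correctly as
    # plain strings; appending "" forces awk to do a string comparison.
    sort | awk -v cutoff="$cutoff" '($1 "") < (cutoff "")'
}

# On a live cluster (the cutoff below is a placeholder for your own
# upgrade time; -h suppresses the squeue header line):
# squeue -h -t running -O StartTime,JobID | jobs_started_before 2022-05-05T10:00:00
```

If this prints nothing, no running job predates the cutoff, so the old slurmstepd processes should all be gone.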


I hope this helps.

/Ole


[1] https://en.wikipedia.org/wiki/ISO_8601

On 05.05.22 13:53, Ole Holm Nielsen wrote:
Just a heads-up regarding setting CommunicationParameters=block_null_hash in slurm.conf:
On 5/4/22 21:50, Tim Wickberg wrote:
CVE-2022-29500:
An architectural flaw with how credentials are handled can be exploited to allow an unprivileged user to impersonate the SlurmUser account. Access to the SlurmUser account can be used to execute arbitrary processes as root.
This issue impacts all Slurm releases since at least Slurm 1.0.0.
Systems remain vulnerable until all slurmdbd, slurmctld, and slurmd processes have been restarted in the cluster.
Once all daemons have been upgraded sites are encouraged to add "block_null_hash" to CommunicationParameters. That new option provides additional protection against a potential exploit.
The block_null_hash still needs to be documented in the slurm.conf man-page. But in https://bugs.schedmd.com/show_bug.cgi?id=14002 I was assured that it's OK to use it now.
I upgraded 21.08.7 to 21.08.8 using RPM packages while the cluster was running production jobs. This is perhaps not recommended (see https://slurm.schedmd.com/quickstart_admin.html#upgrade), but it worked without a glitch also in this case.
However, when I defined CommunicationParameters=block_null_hash in slurm.conf later today, I started getting RPC errors on the compute nodes and in slurmctld when jobs were completing, see bug 14002.
I would recommend that sites hold off a bit on setting CommunicationParameters=block_null_hash until we have found a resolution in bug 14002. Draining all jobs from the cluster before setting this parameter may be the safe approach(?).
