Typically Slurm only supports upgrading across at most two major
releases at a time. If you are on 18.08 you can likely go no further
than 20.02 in one step. Once you have upgraded to 20.02, you can go on
to 20.11 or 21.08.
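As a rough illustration of that rule, here is a small sketch that plans
the intermediate hops. The release list and the two-release limit are
assumptions written from memory, and plan_upgrade is a hypothetical
helper, not a Slurm tool, so check the upgrade notes for your target
release before relying on it.

```python
# Major Slurm releases in order (assumed list; extend or correct as needed).
RELEASES = ["17.11", "18.08", "19.05", "20.02", "20.11", "21.08", "22.05"]


def plan_upgrade(current, target, max_jump=2):
    """Return the releases to install, in order, to get from current to target.

    Assumes each step may skip ahead by at most `max_jump` major releases.
    """
    start, end = RELEASES.index(current), RELEASES.index(target)
    if end < start:
        raise ValueError("downgrades are not supported")
    hops = []
    while start < end:
        # Jump as far as allowed without overshooting the target.
        start = min(start + max_jump, end)
        hops.append(RELEASES[start])
    return hops


print(plan_upgrade("18.08", "20.11"))  # ['20.02', '20.11']
```

Under those assumptions, going from 18.08 to 20.11 needs a stop at 20.02
on the way.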
-Paul Edmon-
On 9/8/22 11:38 AM, Wadud Miah wrote:
Hi Mick,
I have checked that the compute nodes and controllers all have the
same version of SLURM (20.11.9). I am indeed trying to upgrade SlurmDB
first, and am getting the following errors in slurmdbd.log:
[2022-09-08T15:45:11.115] slurmdbd version 20.11.9 started
[2022-09-08T15:45:23.001] error: unpack_header: protocol_version 8448 not supported
[2022-09-08T15:33:57.001] unpacking header
[2022-09-08T15:33:57.001] error: destroy_forward: no init
[2022-09-08T15:33:57.001] error: slurm_unpack_received_msg: Message receive failure
[2022-09-08T15:33:57.011] error: CONN:11 Failed to unpack SLURM_PERSIST_INIT message
Regards,
Wadud.
------------------------------------------------------------------------
*From:* slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf
of Timony, Mick <michael_tim...@hms.harvard.edu>
*Sent:* 08 September 2022 16:24
*To:* Slurm User Community List <slurm-users@lists.schedmd.com>
*Subject:* Re: [slurm-users] Upgrading SLURM from 18 to 20.11.9
This thread on the forums may help:
https://groups.google.com/g/slurm-users/c/YB55Ru9rvD4
It looks like something on your network is still running an older
version of Slurm. I'd check the Slurm version installed on your
compute nodes and controllers.
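The 8448 in the log can be decoded to see which release the offending
client is speaking. A minimal sketch, assuming the protocol version is
packed as (major << 8) | minor and using a major-to-release table
recalled from Slurm's slurm_protocol_common.h. Treat the table as an
assumption and check it against your own source tree;
decode_protocol_version is a hypothetical helper.

```python
# Assumed mapping of protocol "major" byte to Slurm release, written from
# memory of slurm_protocol_common.h; verify against the Slurm source.
PROTOCOL_MAJOR_TO_RELEASE = {
    32: "17.11",
    33: "18.08",
    34: "19.05",
    35: "20.02",
    36: "20.11",
    37: "21.08",
}


def decode_protocol_version(value):
    """Split a protocol_version integer into (release, major, minor)."""
    major, minor = value >> 8, value & 0xFF
    release = PROTOCOL_MAJOR_TO_RELEASE.get(major, "unknown")
    return release, major, minor


print(decode_protocol_version(8448))  # ('18.08', 33, 0)
```

Under those assumptions, 8448 decodes to 18.08, which would be
consistent with an 18.08 daemon (for example an old slurmctld) still
trying to talk to the new slurmdbd.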
The recommended approach to upgrading is to upgrade the SlurmDB first,
then the controllers, then the compute nodes. More info here:
https://slurm.schedmd.com/quickstart_admin.html#upgrade
Regards
--
Mick Timony
Senior DevOps Engineer
Harvard Medical School
--
------------------------------------------------------------------------
*From:* slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf
of Wadud Miah <w.m...@soton.ac.uk>
*Sent:* Thursday, September 8, 2022 10:47 AM
*To:* slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
*Subject:* [slurm-users] Upgrading SLURM from 18 to 20.11.9
Hi,
I am trying to upgrade from SLURM 18 to 20.11.9. When I start
slurmdbd (version 20.11.9), I get the following error messages in
/var/log/slurm/slurmdbd.log:
[2022-09-08T15:45:11.115] slurmdbd version 20.11.9 started
[2022-09-08T15:45:23.001] error: unpack_header: protocol_version 8448 not supported
[2022-09-08T15:33:57.001] unpacking header
[2022-09-08T15:33:57.001] error: destroy_forward: no init
[2022-09-08T15:33:57.001] error: slurm_unpack_received_msg: Message receive failure
[2022-09-08T15:33:57.011] error: CONN:11 Failed to unpack SLURM_PERSIST_INIT message
Any help will be greatly appreciated.
Regards,
----------
Wadud Miah
Research Computing Support
University of Southampton