Hi Mick,

I have checked that the compute nodes and controllers all have the same
version of SLURM (20.11.9). I am indeed trying to upgrade SlurmDB first, and am
still getting the following errors in slurmdbd.log:

[2022-09-08T15:45:11.115] slurmdbd version 20.11.9 started
[2022-09-08T15:45:23.001] error: unpack_header: protocol_version 8448 not supported
[2022-09-08T15:33:57.001] unpacking header
[2022-09-08T15:33:57.001] error: destroy_forward: no init
[2022-09-08T15:33:57.001] error: slurm_unpack_received_msg: Message receive failure
[2022-09-08T15:33:57.011] error: CONN:11 Failed to unpack SLURM_PERSIST_INIT message

Regards,
Wadud.

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Timony, 
Mick <michael_tim...@hms.harvard.edu>
Sent: 08 September 2022 16:24
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Upgrading SLURM from 18 to 20.11.9

This thread on the forums may help:

https://groups.google.com/g/slurm-users/c/YB55Ru9rvD4


It looks like something on your network is still running an older version of
Slurm. I'd check the Slurm version installed on your compute nodes and
controllers.
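
One way to read the "protocol_version 8448" value from the log is to unpack it.
The sketch below assumes Slurm packs the protocol version as
(major_id << 8) | minor, and that the major_id-to-release table matches the
Slurm source headers. Both are assumptions from memory and should be checked
against your own source tree. The function names are hypothetical.

```python
# Assumption: Slurm encodes its RPC protocol version as (major_id << 8) | minor.
# Assumption: this major_id -> release table matches Slurm's source headers;
# verify it against the source for your installed version.
ASSUMED_RELEASES = {
    33: "18.08",
    34: "19.05",
    35: "20.02",
    36: "20.11",
}


def decode_protocol_version(value):
    """Split a packed protocol version into (major_id, minor)."""
    return value >> 8, value & 0xFF


def describe(value):
    """Return a readable summary of a packed protocol version."""
    major_id, minor = decode_protocol_version(value)
    release = ASSUMED_RELEASES.get(major_id, "unknown")
    return (
        f"protocol_version {value}: major_id={major_id}, "
        f"minor={minor}, assumed release={release}"
    )


if __name__ == "__main__":
    print(describe(8448))
```

If the table is right, 8448 corresponds to an 18.08 client, which would fit
the idea that a daemon or command from the old install is still contacting
slurmdbd.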

The recommended approach to upgrading is to upgrade the SlurmDB first, then the 
controllers, then the compute nodes. More info here:

https://slurm.schedmd.com/quickstart_admin.html#upgrade
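
The ordering above (SlurmDB, then controllers, then compute nodes) can be
sanity-checked with a small script. This is only a sketch: the function names
and the example version strings are hypothetical, and it checks ordering only,
not whether the gap between versions is one Slurm supports.

```python
def parse_version(text):
    """Turn a version string such as '20.11.9' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def upgrade_order_ok(slurmdbd, slurmctld, slurmd_versions):
    """Return True if slurmdbd >= slurmctld >= every slurmd version.

    This mirrors the recommended order: the database daemon is upgraded
    first, then the controllers, then the compute nodes.
    """
    dbd = parse_version(slurmdbd)
    ctld = parse_version(slurmctld)
    nodes = [parse_version(v) for v in slurmd_versions]
    return dbd >= ctld and all(ctld >= node for node in nodes)


if __name__ == "__main__":
    # Hypothetical mid-upgrade state: only slurmdbd has been upgraded so far.
    print(upgrade_order_ok("20.11.9", "18.08.8", ["18.08.8", "18.08.8"]))
```

The version strings would normally come from running each daemon with -V on
the relevant hosts.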

Regards
--
Mick Timony
Senior DevOps Engineer
Harvard Medical School
--

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Wadud 
Miah <w.m...@soton.ac.uk>
Sent: Thursday, September 8, 2022 10:47 AM
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Upgrading SLURM from 18 to 20.11.9

Hi,

I am attempting to upgrade from SLURM 18 to 20.11.9 and when I attempt to start 
slurmdbd (version 20.11.9), I get the following error messages in 
/var/log/slurm/slurmdbd.log:

[2022-09-08T15:45:11.115] slurmdbd version 20.11.9 started
[2022-09-08T15:45:23.001] error: unpack_header: protocol_version 8448 not supported
[2022-09-08T15:33:57.001] unpacking header
[2022-09-08T15:33:57.001] error: destroy_forward: no init
[2022-09-08T15:33:57.001] error: slurm_unpack_received_msg: Message receive failure
[2022-09-08T15:33:57.011] error: CONN:11 Failed to unpack SLURM_PERSIST_INIT message

Any help will be greatly appreciated.

Regards,

----------
Wadud Miah
Research Computing Support
University of Southampton
