[slurm-users] Help with slurmdbd and Slurmctld
Hello,

I need help with a problem. I just updated my operating system, and slurmdbd and slurmctld no longer start. I get these errors:

slurmdbd: fatal: Database schema is too old for this version of Slurm to upgrade.

slurmctld: fatal: Can not recover last_tres state, incompatible version, got 9472 need >= 9728 <= 10240,start with '-i' to ignore this. Warning: using -i will lose the data that can't be recovered.

I hope someone can help!

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
[slurm-users] Re: Help with slurmdbd and Slurmctld
Hello,

Thank you for your responses. I am currently running Ubuntu 24.04 LTS with Slurm version 23.11.4. As far as I know, the Slurm version is the same as before.

Thank you.

On Mon, Jun 17, 2024 at 6:21 PM Ole Holm Nielsen via slurm-users <slurm-users@lists.schedmd.com> wrote:
> On 6/17/24 17:56, stth via slurm-users wrote:
> > [...]
>
> You don't give any details about what upgrades you did, so it's not easy
> to provide help.
>
> In case you're upgrading Slurm within the same OS, there is some
> information in this Wiki page:
>
> https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/#upgrading-slurm
>
> /Ole
[slurm-users] Slurmctld Problems
Dear slurm users,

This is my first time setting up Slurm, and I am looking for a solution to the errors below. Has anyone here already encountered this problem? I would really appreciate the help. mariadb, slurmdbd and slurmd are active.

× slurmctld.service - Slurm controller daemon
     Loaded: loaded (/usr/lib/systemd/system/slurmctld.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Tue 2024-06-25 10:06:39 UTC; 2min 42s ago
   Duration: 584ms
       Docs: man:slurmctld(8)
    Process: 63738 ExecStart=/usr/sbin/slurmctld --systemd $SLURMCTLD_OPTIONS (code=exited, status=1/FAILURE)
   Main PID: 63738 (code=exited, status=1/FAILURE)
        CPU: 25ms

Jun 25 10:06:39 server systemd[1]: Starting slurmctld.service - Slurm controller daemon...
Jun 25 10:06:39 server (lurmctld)[63738]: slurmctld.service: Referenced but unset environment variable evaluates to an empty string: SLURMCTLD_OPTIONS
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: slurmctld version 23.11.4 started on servercluster
Jun 25 10:06:39 server systemd[1]: Started slurmctld.service - Slurm controller daemon.
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: accounting_storage/slurmdbd: clusteracct_storage_p_register_ctld: Registering slurmctld at port 6817 with slurmdbd
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: priority/multifactor: _read_last_decay_ran: No last decay (/var/spool/slurm/state/priority_last_decay_ran) to recover
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: No memory enforcing mechanism configured.
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: fatal: Can not recover last_conf_lite, incompatible version, (9472 not between 9728 and 10240), start with '-i' to ignore this. Warning: using -i will lose the data that can't be recovered.
Jun 25 10:06:39 server systemd[1]: slurmctld.service: Main process exited, code=exited, status=1/FAILURE
Jun 25 10:06:39 server systemd[1]: slurmctld.service: Failed with result 'exit-code'.
[slurm-users] Re: Slurmctld Problems
Hello,

slurmctld.log and journalctl -u slurmctld --no-pager give the same information I have already provided.

The message "Referenced but unset environment variable evaluates to an empty string: SLURMCTLD_OPTIONS" comes from the files in /etc/default (slurmdbd/slurmctld/slurmd), where there is a line: SLURMDBD_OPTIONS="". But it has nothing to do with the fact that the daemon is not active.

On Tue, Jun 25, 2024 at 3:49 PM daijiangkuicgo--- via slurm-users <slurm-users@lists.schedmd.com> wrote:
> What's your "Referenced but unset environment variable evaluates to an
> empty string: SLURMCTLD_OPTIONS"? Meanwhile, you can check slurmctld.log and
> journalctl -u slurmctld --no-pager.
[slurm-users] Re: Slurmctld Problems
Hello Lorenzo,

Thank you for your reply. Yes, I have version 23.11.4.

Lorenzo Bosio wrote on Tue, Jun 25, 2024 at 16:50:
> Hello,
>
> I suppose the actual error is:
>
> slurmctld: fatal: Can not recover last_conf_lite, incompatible version,
> (9472 not between 9728 and 10240), start with '-i' to ignore this.
> Warning: using -i will lose the data that can't be recovered.
>
> Did you upgrade from Slurm 21.08 (9472) to your current version 23.11
> (10240)? See here for a reference to the numbers:
> https://github.com/SchedMD/slurm/blob/40058e4df5fa243f4c340db9622ed559ce771778/src/common/slurm_protocol_common.h#L63
>
> You have to stay within a window of 2 releases for upgrades to work.
>
> Best regards,
> Lorenzo
>
> On 25/06/24 16:30, stth via slurm-users wrote:
> > [...]
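The version numbers in the error message can be decoded by hand. Below is a minimal Python sketch, assuming the encoding used in the linked slurm_protocol_common.h, where each release's protocol version is defined as (index << 8) | minor. The release table and the helper names describe and can_recover are illustrative only and are not part of Slurm; the window bounds 9728 and 10240 are taken from the error message itself.

```python
# Assumption: Slurm encodes its protocol version as (index << 8) | minor,
# as in src/common/slurm_protocol_common.h for the 23.11 branch.
# This table covers only the releases discussed in this thread.
KNOWN_RELEASES = {37: "21.08", 38: "22.05", 39: "23.02", 40: "23.11"}


def describe(version: int) -> str:
    """Split a protocol version into its index and minor parts."""
    index, minor = version >> 8, version & 0xFF
    release = KNOWN_RELEASES.get(index, "unknown")
    return f"{version} = ({index} << 8) | {minor} -> Slurm {release}"


def can_recover(state_version: int, oldest: int = 9728, newest: int = 10240) -> bool:
    """True if saved state falls inside the window reported by slurmctld 23.11."""
    return oldest <= state_version <= newest


if __name__ == "__main__":
    for v in (9472, 9728, 10240):
        print(describe(v), "| recoverable by 23.11:", can_recover(v))
```

Under these assumptions, 9472 decodes to index 37 (21.08), which lies one release below the oldest version that 23.11 accepts (9728, i.e. 22.05). That matches the explanation above that the saved state was written by a release outside the supported upgrade window.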
[slurm-users] Re: Slurmctld Problems
Hi Timo,

Thanks. The old data wasn't important, so I did that. I changed the ExecStart line in /usr/lib/systemd/system/slurmctld.service as follows:

ExecStart=/usr/sbin/slurmctld --systemd -i $SLURMCTLD_OPTIONS

slurmctld is now active.

Timo Rothenpieler via slurm-users wrote on Tue, Jun 25, 2024 at 17:26:
> On 25/06/2024 12:20, stth via slurm-users wrote:
> > Jun 25 10:06:39 server slurmctld[63738]: slurmctld: fatal: Can not
> > recover last_conf_lite, incompatible version, (9472 not between 9728 and
> > 10240), start with '-i' to ignore this. Warning: using -i will lose the
> > data that can't be recovered.
>
> Seems like it's not the first time, but the first time in a long while.
> If there is no important data in that old db, just do what the error
> says as a one-off.
[slurm-users] Cgroup
Hello,

I am configuring cgroups on my server for the first time. I've created a cgroup.conf file in the Slurm directory with the following values:

ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
AllowedRAMSpace=90
AllowedSwapSpace=10

I feel like this configuration might be incomplete. Can anyone provide assistance? I haven't found any specific guidance online.

Thanks!