[slurm-users] Help with slurmdbd and Slurmctld

2024-06-17 Thread stth via slurm-users
Hello,
I need help with a problem. I just updated my operating system, and
slurmdbd and slurmctld no longer work. I get these errors:

slurmdbd: fatal: Database schema is too old for this version of Slurm to
upgrade.

slurmctld: fatal: Can not recover last_tres state, incompatible version,
got 9472 need >= 9728 <= 10240,start with '-i' to ignore this. Warning:
using -i will lose the data that can't be recovered.


I hope someone can help!

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: Help with slurmdbd and Slurmctld

2024-06-18 Thread stth via slurm-users
Hello,

Thank you for your responses. I am currently running Ubuntu 24.04 LTS
with Slurm version 23.11.4. As far as I know, the Slurm version
is the same as before.

Thank you.

On Mon, Jun 17, 2024 at 6:21 PM Ole Holm Nielsen via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> You don't give any details about what upgrades you did, so it's not easy
> to provide help.
>
> In case you're upgrading Slurm within the same OS, there is some
> information in this Wiki page:
>
> https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/#upgrading-slurm
>
> /Ole



[slurm-users] Slurmctld Problems

2024-06-25 Thread stth via slurm-users
Dear slurm users,

This is my first time setting up Slurm, and I am looking for a solution to
the errors below. Has anyone here already encountered this problem? I would
really appreciate the help. mariadb, slurmdbd and slurmd are active.

× slurmctld.service - Slurm controller daemon
     Loaded: loaded (/usr/lib/systemd/system/slurmctld.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Tue 2024-06-25 10:06:39 UTC; 2min 42s ago
   Duration: 584ms
       Docs: man:slurmctld(8)
    Process: 63738 ExecStart=/usr/sbin/slurmctld --systemd $SLURMCTLD_OPTIONS (code=exited, status=1/FAILURE)
   Main PID: 63738 (code=exited, status=1/FAILURE)
        CPU: 25ms

Jun 25 10:06:39 server systemd[1]: Starting slurmctld.service - Slurm controller daemon...
Jun 25 10:06:39 server (lurmctld)[63738]: slurmctld.service: Referenced but unset environment variable evaluates to an empty string: SLURMCTLD_OPTIONS
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: slurmctld version 23.11.4 started on servercluster
Jun 25 10:06:39 server systemd[1]: Started slurmctld.service - Slurm controller daemon.
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: accounting_storage/slurmdbd: clusteracct_storage_p_register_ctld: Registering slurmctld at port 6817 with slurmdbd
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: priority/multifactor: _read_last_decay_ran: No last decay (/var/spool/slurm/state/priority_last_decay_ran) to recover
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: No memory enforcing mechanism configured.
Jun 25 10:06:39 server slurmctld[63738]: slurmctld: fatal: Can not recover last_conf_lite, incompatible version, (9472 not between 9728 and 10240), start with '-i' to ignore this. Warning: using -i will lose the data that can't be recovered.
Jun 25 10:06:39 server systemd[1]: slurmctld.service: Main process exited, code=exited, status=1/FAILURE
Jun 25 10:06:39 server systemd[1]: slurmctld.service: Failed with result 'exit-code'.



[slurm-users] Re: Slurmctld Problems

2024-06-25 Thread stth via slurm-users
Hello,
slurmctld.log and journalctl -u slurmctld --no-pager give the same information
that I have already provided.
The message "Referenced but unset environment variable evaluates to an empty
string: SLURMCTLD_OPTIONS" comes from the files in /etc/default
(slurmdbd/slurmctld/slurmd), where there is a line: SLURMDBD_OPTIONS="".

However, it has nothing to do with the fact that the daemon is not
active.

On Tue, Jun 25, 2024 at 3:49 PM daijiangkuicgo--- via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> What is behind your "Referenced but unset environment variable evaluates
> to an empty string: SLURMCTLD_OPTIONS" message? Meanwhile, you can check
> slurmctld.log and journalctl -u slurmctld --no-pager.



[slurm-users] Re: Slurmctld Problems

2024-06-25 Thread stth via slurm-users
Hello Lorenzo,

Thank you for your reply. Yes, I have version 23.11.4.

Lorenzo Bosio wrote on Tue, 25 June 2024 at 16:50:

> Hello,
>
> I suppose the actual error is:
>
> slurmctld: fatal: Can not recover last_conf_lite, incompatible version,
> (9472 not between 9728 and 10240), start with '-i' to ignore this.
> Warning: using -i will lose the data that can't be recovered.
>
> Did you upgrade from Slurm 21.08 (9472) to your current version 23.11
> (10240)? See here for a reference to the numbers:
> https://github.com/SchedMD/slurm/blob/40058e4df5fa243f4c340db9622ed559ce771778/src/common/slurm_protocol_common.h#L63
>
> You have to stay within a two-release window for upgrades to work.
>
> Best regards,
> Lorenzo
>
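
For reference, the numbers in the fatal error can be read as Slurm releases. The following is a minimal sketch, not part of Slurm: the function name and the table are hypothetical, the values for 21.08 (9472) and 23.11 (10240) are the ones quoted in the reply above, and the values for 22.05 (9728) and 23.02 (9984) are assumed from the linked header and should be checked there.

```python
import re

# Protocol version numbers mapped to release names.
# 9472 and 10240 are quoted in the thread; 9728 and 9984 are assumptions
# taken from slurm_protocol_common.h and should be verified against it.
PROTOCOL_RELEASES = {
    9472: "21.08",
    9728: "22.05",
    9984: "23.02",
    10240: "23.11",
}

# The two wordings of the error that appear in this thread.
_PATTERNS = [
    re.compile(r"got (\d+) need >= (\d+) <= (\d+)"),
    re.compile(r"\((\d+) not between (\d+) and (\d+)\)"),
]


def explain_version_error(message):
    """Return the releases named by a 'Can not recover ...' error, or None."""
    for pattern in _PATTERNS:
        match = pattern.search(message)
        if match:
            found, low, high = (int(group) for group in match.groups())
            return {
                "state_written_by": PROTOCOL_RELEASES.get(found, "unknown"),
                "oldest_accepted": PROTOCOL_RELEASES.get(low, "unknown"),
                "newest_accepted": PROTOCOL_RELEASES.get(high, "unknown"),
                "compatible": low <= found <= high,
            }
    return None


if __name__ == "__main__":
    error = (
        "slurmctld: fatal: Can not recover last_conf_lite, incompatible "
        "version, (9472 not between 9728 and 10240), start with '-i' to "
        "ignore this."
    )
    print(explain_version_error(error))
```

Under these assumptions, the state files were written by a 21.08 controller, which is older than the oldest release that 23.11 can read.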



[slurm-users] Re: Slurmctld Problems

2024-06-25 Thread stth via slurm-users
Hi Timo,

Thanks. The old data wasn't important, so I did that. I changed the
ExecStart line in /usr/lib/systemd/system/slurmctld.service as follows:

ExecStart=/usr/sbin/slurmctld --systemd -i $SLURMCTLD_OPTIONS

slurmctld is now active.

Timo Rothenpieler via slurm-users wrote on Tue, 25 June 2024 at 17:26:

> On 25/06/2024 12:20, stth via slurm-users wrote:
> > Jun 25 10:06:39 server slurmctld[63738]: slurmctld: fatal: Can not
> > recover last_conf_lite, incompatible version, (9472 not between 9728 and
> > 10240), start with '-i' to ignore this. Warning: using -i will lose the
> > data that can't be recovered.
>
> Seems like it's not the first time, but the first time in a long while.
> If there is no important data in that old db, just do what the error
> says as a one-off.



[slurm-users] Cgroup

2024-07-22 Thread stth via slurm-users
Hello,

I am configuring cgroups on my server for the first time. I've created a
cgroup.conf file in the Slurm directory with the following values:

ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
AllowedRAMSpace=90
AllowedSwapSpace=10

I feel like this configuration might be incomplete. Can anyone provide
assistance? I haven't found any specific guidance online.
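
To make the two percentage settings above concrete, here is a minimal sketch of how they might translate into limits for one job. It assumes, following my reading of the cgroup.conf man page, that AllowedRAMSpace and AllowedSwapSpace are both percentages of the job's allocated memory and that the RAM+swap limit is the RAM limit plus the swap allowance. The function names are hypothetical and this is not how Slurm itself computes the limits internally.

```python
def parse_cgroup_conf(text):
    """Parse simple Key=Value lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def memory_limits_mb(settings, allocated_mb):
    """Return (ram_limit, ram_plus_swap_limit) in MB for one job.

    Assumption: both settings are percentages of the allocated memory,
    with defaults of 100 (RAM) and 0 (swap) when a key is absent.
    """
    ram_pct = float(settings.get("AllowedRAMSpace", 100))
    swap_pct = float(settings.get("AllowedSwapSpace", 0))
    ram_limit = allocated_mb * ram_pct / 100
    swap_allowance = allocated_mb * swap_pct / 100
    return ram_limit, ram_limit + swap_allowance


if __name__ == "__main__":
    conf = """
    ConstrainCores=yes
    ConstrainRAMSpace=yes
    ConstrainSwapSpace=yes
    AllowedRAMSpace=90
    AllowedSwapSpace=10
    """
    settings = parse_cgroup_conf(conf)
    # Hypothetical job that requested 4000 MB of memory.
    print(memory_limits_mb(settings, 4000))
```

Under these assumptions, a job that requests 4000 MB would be held to less RAM than it asked for, which may or may not be the intent of AllowedRAMSpace=90.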

Thanks!
