OK, I made some progress here.

I removed and purged slurmdbd, MySQL, MariaDB, etc. and started from scratch.
I added the recommended mysqld settings.
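
For reference, the byte counts that slurmdbd logged at debug2 earlier in this thread (134217728 and 50331648) are easier to compare against my.cnf-style settings once converted to MiB. A minimal sketch; to_mib is a hypothetical helper name, not part of Slurm or MariaDB:

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to whole MiB.
to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Values copied from the slurmdbd debug2 output quoted in this thread.
echo "innodb_buffer_pool_size: $(to_mib 134217728) MiB"
echo "innodb_log_file_size: $(to_mib 50331648) MiB"
```

These come out to 128 MiB and 48 MiB. Together with the 50-second lock wait timeout, they look like MariaDB's built-in defaults rather than values read from a cnf file, which would explain why they do not appear in any file under /etc/mysql.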

I started slurmdbd manually (sudo slurmdbd -D /path/to/conf) and everything
worked well.

When I tried to start the service with sudo systemctl start slurmdbd.service,
it didn't work:

sudo systemctl status slurmdbd.service
● slurmdbd.service - Slurm DBD accounting daemon
     Loaded: loaded (/etc/systemd/system/slurmdbd.service; enabled; vendor preset: enabled)
     Active: failed (Result: timeout) since Fri 2024-05-31 00:21:30 UTC; 2min 5s ago
    Process: 6258 ExecStart=/usr/sbin/slurmdbd -D /etc/slurm-llnl/slurmdbd.conf (code=exited, status=0/SUCCESS)

May 31 00:20:00 hannibal-hn systemd[1]: Starting Slurm DBD accounting daemon...
May 31 00:21:30 hannibal-hn systemd[1]: slurmdbd.service: start operation timed out. Terminating.
May 31 00:21:30 hannibal-hn systemd[1]: slurmdbd.service: Failed with result 'timeout'.
May 31 00:21:30 hannibal-hn systemd[1]: Failed to start Slurm DBD accounting daemon.

Even though it is the same command?!

Any ideas?
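
One thing that may be worth checking (an assumption; nothing in this thread confirms it): the status output shows the unit running slurmdbd with -D, which keeps the daemon in the foreground, and the start job gave up 90 seconds after it began (00:20:00 to 00:21:30), which is systemd's default start timeout. If the unit file declares Type=forking, systemd waits for the launched process to fork and exit, which a foreground daemon never does, and the SIGTERM in the earlier logs would then be systemd giving up. The sketch below checks unit-file text for that combination. check_unit and the sample unit text are hypothetical (only the ExecStart line is taken from the status output); on the real host the text would come from systemctl cat slurmdbd.service.

```shell
#!/bin/sh
# Hypothetical helper: flag unit text that pairs Type=forking with a
# foreground "-D" flag in ExecStart.
check_unit() {
    unit_text=$1
    # Take the last Type= and ExecStart= assignments found in the text.
    unit_type=$(printf '%s\n' "$unit_text" | sed -n 's/^Type=//p' | tail -n 1)
    exec_start=$(printf '%s\n' "$unit_text" | sed -n 's/^ExecStart=//p' | tail -n 1)

    # Look for -D as a standalone argument on the command line.
    case " $exec_start " in
        *" -D "*) foreground=yes ;;
        *)        foreground=no ;;
    esac

    if [ "$unit_type" = "forking" ] && [ "$foreground" = "yes" ]; then
        echo "mismatch: Type=forking with foreground -D"
    else
        # With ExecStart set and no Type= line, systemd falls back to "simple".
        echo "ok: Type=${unit_type:-simple} foreground=$foreground"
    fi
}

# Assumed sample; the real unit file on this host may differ.
sample='[Service]
Type=forking
ExecStart=/usr/sbin/slurmdbd -D /etc/slurm-llnl/slurmdbd.conf
PIDFile=/run/slurmdbd.pid'

check_unit "$sample"
```

If the real unit does show this combination, two common ways out are dropping -D from ExecStart, or overriding the type to Type=simple in a drop-in created with sudo systemctl edit slurmdbd.service.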


On Thu, May 30, 2024 at 5:02 PM Radhouane Aniba <arad...@gmail.com> wrote:

> Thank you Ahmet and Brian,
>
> Ahmet, which conf file in particular is slurmdbd reading from? I parsed all
> the cnf files for mysql and I cannot find the values it is displaying here:
>
> slurmdbd: debug2: Attempting to connect to localhost:3306
> slurmdbd: debug2: innodb_buffer_pool_size: 134217728
> slurmdbd: debug2: innodb_log_file_size: 50331648
> slurmdbd: debug2: innodb_lock_wait_timeout: 50
> slurmdbd: error: Database settings not recommended values:
> innodb_buffer_pool_size innodb_lock_wait_timeout
>
>
> sudo tree /etc/mysql/*
> /etc/mysql/conf.d
> ├── mysql.cnf
> └── mysqldump.cnf
> /etc/mysql/debian.cnf
> /etc/mysql/debian-start
> /etc/mysql/FROZEN
> /etc/mysql/mariadb.cnf
> /etc/mysql/mariadb.conf.d
> ├── 50-client.cnf
> ├── 50-mysql-clients.cnf
> ├── 50-mysqld_safe.cnf
> └── 50-server.cnf
> /etc/mysql/my.cnf
> /etc/mysql/my.cnf.fallback
> /etc/mysql/mysql.cnf
> /etc/mysql/mysql.conf.d
> ├── mysql.cnf
> └── mysqld.cnf
>
> On Thu, May 30, 2024 at 12:21 PM Brian Andrus via slurm-users <
> slurm-users@lists.schedmd.com> wrote:
>
>> That SIGTERM message means something is telling slurmdbd to quit.
>>
>> Check your cron jobs, maintenance scripts, etc. Slurmdbd is being told to
>> shut down. If you are running in the foreground, a ^C does that. If you run
>> a kill or killall on it, you will get that same message.
>>
>> Brian Andrus
>> On 5/30/2024 6:53 AM, Radhouane Aniba via slurm-users wrote:
>>
>> Yes, I can connect to my database using mysql --user=slurm
>> --password=slurmdbpass  slurm_acct_db, and after checking the firewall
>> question, there is no firewall blocking mysql.
>>
>> Also, here is the output of slurmdbd -D -vvv (note that I can only run this
>> with sudo):
>>
>> sudo slurmdbd -D -vvv
>> slurmdbd: debug: Log file re-opened
>> slurmdbd: debug: Munge authentication plugin loaded
>> slurmdbd: debug2: mysql_connect() called for db slurm_acct_db
>> slurmdbd: debug2: Attempting to connect to localhost:3306
>> slurmdbd: debug2: innodb_buffer_pool_size: 134217728
>> slurmdbd: debug2: innodb_log_file_size: 50331648
>> slurmdbd: debug2: innodb_lock_wait_timeout: 50
>> slurmdbd: error: Database settings not recommended values:
>> innodb_buffer_pool_size innodb_lock_wait_timeout
>> slurmdbd: Accounting storage MYSQL plugin loaded
>> slurmdbd: debug2: ArchiveDir = /tmp
>> slurmdbd: debug2: ArchiveScript = (null)
>> slurmdbd: debug2: AuthAltTypes = (null)
>> slurmdbd: debug2: AuthInfo = (null)
>> slurmdbd: debug2: AuthType = auth/munge
>> slurmdbd: debug2: CommitDelay = 0
>> slurmdbd: debug2: DbdAddr = localhost
>> slurmdbd: debug2: DbdBackupHost = (null)
>> slurmdbd: debug2: DbdHost = hannibal-hn
>> slurmdbd: debug2: DbdPort = 7032
>> slurmdbd: debug2: DebugFlags = (null)
>> slurmdbd: debug2: DebugLevel = 6
>> slurmdbd: debug2: DebugLevelSyslog = 10
>> slurmdbd: debug2: DefaultQOS = (null)
>> slurmdbd: debug2: LogFile = /var/log/slurmdbd.log
>> slurmdbd: debug2: MessageTimeout = 100
>> slurmdbd: debug2: Parameters = (null)
>> slurmdbd: debug2: PidFile = /run/slurmdbd.pid
>> slurmdbd: debug2: PluginDir = /usr/lib/x86_64-linux-gnu/slurm-wlm
>> slurmdbd: debug2: PrivateData = none
>> slurmdbd: debug2: PurgeEventAfter = 1 months*
>> slurmdbd: debug2: PurgeJobAfter = 12 months*
>> slurmdbd: debug2: PurgeResvAfter = 1 months*
>> slurmdbd: debug2: PurgeStepAfter = 1 months
>> slurmdbd: debug2: PurgeSuspendAfter = 1 months
>> slurmdbd: debug2: PurgeTXNAfter = 12 months
>> slurmdbd: debug2: PurgeUsageAfter = 24 months
>> slurmdbd: debug2: SlurmUser = root(0)
>> slurmdbd: debug2: StorageBackupHost = (null)
>> slurmdbd: debug2: StorageHost = localhost
>> slurmdbd: debug2: StorageLoc = slurm_acct_db
>> slurmdbd: debug2: StoragePort = 3306
>> slurmdbd: debug2: StorageType = accounting_storage/mysql
>> slurmdbd: debug2: StorageUser = slurm
>> slurmdbd: debug2: TCPTimeout = 2
>> slurmdbd: debug2: TrackWCKey = 0
>> slurmdbd: debug2: TrackSlurmctldDown= 0
>> slurmdbd: debug2: acct_storage_p_get_connection: request new connection 1
>> slurmdbd: debug2: Attempting to connect to localhost:3306
>> slurmdbd: slurmdbd version 19.05.5 started
>> slurmdbd: debug2: running rollup at Thu May 30 13:50:08 2024
>> slurmdbd: debug2: Everything rolled up
>>
>>
>> It goes on like this for some time and then it crashes with this message:
>>
>> slurmdbd: Terminate signal (SIGINT or SIGTERM) received
>> slurmdbd: debug: rpc_mgr shutting down
>>
>>
>> On Thu, May 30, 2024 at 8:18 AM mercan <ahmet.mer...@uhem.itu.edu.tr>
>> wrote:
>>
>>> Did you try connecting to the database using the mysql command?
>>>
>>> mysql --user=slurm --password=slurmdbpass  slurm_acct_db
>>>
>>> C. Ahmet Mercan
>>>
>>> On 30.05.2024 14:48, Radhouane Aniba via slurm-users wrote:
>>>
>>> Thank you Ahmet,
>>> I don't have a firewall active.
>>> Because slurmdbd cannot connect to the database, I am not able to get it
>>> activated through systemctl. I will share the output of slurmdbd -D -vvv
>>> shortly, but overall it always says it is trying to connect to the db, then
>>> retries a couple of times and crashes.
>>>
>>> R.
>>>
>>>
>>>
>>>
>>> On Thu, May 30, 2024 at 2:51 AM mercan <ahmet.mer...@uhem.itu.edu.tr>
>>> wrote:
>>>
>>>> Hi;
>>>>
>>>> Did you check whether you can connect to the db with your conf parameters
>>>> from head-node:
>>>>
>>>> mysql --user=slurm --password=slurmdbpass  slurm_acct_db
>>>>
>>>> Also, check and stop the firewall and SELinux, if they are running.
>>>>
>>>> Last, you can stop slurmdbd, then run it in a terminal with:
>>>>
>>>> slurmdbd -D -vvv
>>>>
>>>> Regards;
>>>>
>>>> C. Ahmet Mercan
>>>>
>>>> On 30.05.2024 00:05, Radhouane Aniba via slurm-users wrote:
>>>>
>>>> Hi everyone,
>>>> I am trying to get slurmdbd to run on my local home server but I am
>>>> really struggling.
>>>> Note: I am a novice Slurm user.
>>>> My slurmdbd always times out even though all the details in the conf
>>>> file are correct.
>>>>
>>>> My log looks like this:
>>>>
>>>> [2024-05-29T20:51:30.088] Accounting storage MYSQL plugin loaded
>>>> [2024-05-29T20:51:30.088] debug2: ArchiveDir = /tmp
>>>> [2024-05-29T20:51:30.088] debug2: ArchiveScript = (null)
>>>> [2024-05-29T20:51:30.088] debug2: AuthAltTypes = (null)
>>>> [2024-05-29T20:51:30.088] debug2: AuthInfo = (null)
>>>> [2024-05-29T20:51:30.088] debug2: AuthType = auth/munge
>>>> [2024-05-29T20:51:30.088] debug2: CommitDelay = 0
>>>> [2024-05-29T20:51:30.088] debug2: DbdAddr = localhost
>>>> [2024-05-29T20:51:30.088] debug2: DbdBackupHost = (null)
>>>> [2024-05-29T20:51:30.088] debug2: DbdHost = head-node
>>>> [2024-05-29T20:51:30.088] debug2: DbdPort = 7032
>>>> [2024-05-29T20:51:30.088] debug2: DebugFlags = (null)
>>>> [2024-05-29T20:51:30.088] debug2: DebugLevel = 6
>>>> [2024-05-29T20:51:30.088] debug2: DebugLevelSyslog = 10
>>>> [2024-05-29T20:51:30.088] debug2: DefaultQOS = (null)
>>>> [2024-05-29T20:51:30.088] debug2: LogFile = /var/log/slurmdbd.log
>>>> [2024-05-29T20:51:30.088] debug2: MessageTimeout = 100
>>>> [2024-05-29T20:51:30.088] debug2: Parameters = (null)
>>>> [2024-05-29T20:51:30.088] debug2: PidFile = /run/slurmdbd.pid
>>>> [2024-05-29T20:51:30.088] debug2: PluginDir =
>>>> /usr/lib/x86_64-linux-gnu/slurm-wlm
>>>> [2024-05-29T20:51:30.088] debug2: PrivateData = none
>>>> [2024-05-29T20:51:30.088] debug2: PurgeEventAfter = 1 months*
>>>> [2024-05-29T20:51:30.088] debug2: PurgeJobAfter = 12 months*
>>>> [2024-05-29T20:51:30.088] debug2: PurgeResvAfter = 1 months*
>>>> [2024-05-29T20:51:30.088] debug2: PurgeStepAfter = 1 months
>>>> [2024-05-29T20:51:30.088] debug2: PurgeSuspendAfter = 1 months
>>>> [2024-05-29T20:51:30.088] debug2: PurgeTXNAfter = 12 months
>>>> [2024-05-29T20:51:30.088] debug2: PurgeUsageAfter = 24 months
>>>> [2024-05-29T20:51:30.088] debug2: SlurmUser = root(0)
>>>> [2024-05-29T20:51:30.089] debug2: StorageBackupHost = (null)
>>>> [2024-05-29T20:51:30.089] debug2: StorageHost = localhost
>>>> [2024-05-29T20:51:30.089] debug2: StorageLoc = slurm_acct_db
>>>> [2024-05-29T20:51:30.089] debug2: StoragePort = 3306
>>>> [2024-05-29T20:51:30.089] debug2: StorageType =
>>>> accounting_storage/mysql
>>>> [2024-05-29T20:51:30.089] debug2: StorageUser = slurm
>>>> [2024-05-29T20:51:30.089] debug2: TCPTimeout = 2
>>>> [2024-05-29T20:51:30.089] debug2: TrackWCKey = 0
>>>> [2024-05-29T20:51:30.089] debug2: TrackSlurmctldDown= 0
>>>> [2024-05-29T20:51:30.089] debug2: acct_storage_p_get_connection:
>>>> request new connection 1
>>>> [2024-05-29T20:51:30.089] debug2: Attempting to connect to
>>>> localhost:3306
>>>> [2024-05-29T20:51:30.090] slurmdbd version 19.05.5 started
>>>> [2024-05-29T20:51:30.090] debug2: running rollup at Wed May 29 20:51:30
>>>> 2024
>>>> [2024-05-29T20:51:30.091] debug2: Everything rolled up
>>>> [2024-05-29T20:51:49.673] Terminate signal (SIGINT or SIGTERM) received
>>>> [2024-05-29T20:51:49.673] debug: rpc_mgr shutting down
>>>>
>>>>
>>>>
>>>> My config file looks like this:
>>>>
>>>> ArchiveEvents=yes
>>>> ArchiveJobs=yes
>>>> ArchiveResvs=yes
>>>> ArchiveSteps=no
>>>> ArchiveSuspend=no
>>>> ArchiveTXN=no
>>>> ArchiveUsage=no
>>>> PurgeEventAfter=1month
>>>> PurgeJobAfter=12month
>>>> PurgeResvAfter=1month
>>>> PurgeStepAfter=1month
>>>> PurgeSuspendAfter=1month
>>>> PurgeTXNAfter=12month
>>>> PurgeUsageAfter=24month
>>>> # Authentication info
>>>> AuthType=auth/munge
>>>> # slurmDBD info
>>>> DbdAddr=localhost
>>>> DbdHost=head-node
>>>> DbdPort=7032
>>>> SlurmUser=root
>>>> MessageTimeout=100
>>>> DebugLevel=5
>>>> #DefaultQOS=normal,standby
>>>> LogFile=/var/log/slurmdbd.log
>>>> PidFile=/run/slurmdbd.pid
>>>> #PrivateData=accounts,users,usage,jobs
>>>> #TrackWCKey=yes
>>>> #
>>>> # Database info
>>>> StorageType=accounting_storage/mysql
>>>> StorageHost=localhost
>>>> StoragePort=3306
>>>> StoragePass=slurmdbpass
>>>> StorageUser=slurm
>>>> StorageLoc=slurm_acct_db
>>>> I used standard names and passwords to get started and I will change them
>>>> later.
>>>>
>>>> But every time I try to start slurmdbd.service it crashes and I get
>>>> the log that I shared above.
>>>>
>>>> I use these versions
>>>>
>>>> slurmdbd -V
>>>> slurm-wlm 19.05.5
>>>> mysql Ver 15.1 Distrib 10.3.39-MariaDB, for debian-linux-gnu (x86_64)
>>>> using readline 5.2
>>>> Everything else is working properly, except that I cannot get slurmdbd to
>>>> work, and at this point I have exhausted everything I could think of to
>>>> try :) Looking for some expert insights :)
>>>>
>>>>
>>>> Any idea what I am doing wrong here? Also, I didn't compile any Slurm
>>>> package. I used the binaries from the apt repos.
>>>>
>>>> Any help will be appreciated
>>>>
>>>> Cheers
>>>>
>>>> Rad
>>>>
>>>> --
>>>>
>>>>
>>>>
>>>
>>
>> --
>> *Rad Aniba, PhD*
>>
>>
>>
>> --
>> slurm-users mailing list -- slurm-users@lists.schedmd.com
>> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>>
>
>
> --
> *Rad Aniba, PhD*
>
>

-- 
*Rad Aniba, PhD*