[slurm-users] slurmdbd not connecting to mysql (mariadb)

2024-05-29 Thread Radhouane Aniba via slurm-users
Hi everyone,
I am trying to get slurmdbd to run on my local home server, but I am really
struggling.
Note: I am a novice Slurm user.
My slurmdbd always times out even though all the details in the conf file
are correct.

My log looks like this:

[2024-05-29T20:51:30.088] Accounting storage MYSQL plugin loaded
[2024-05-29T20:51:30.088] debug2: ArchiveDir = /tmp
[2024-05-29T20:51:30.088] debug2: ArchiveScript = (null)
[2024-05-29T20:51:30.088] debug2: AuthAltTypes = (null)
[2024-05-29T20:51:30.088] debug2: AuthInfo = (null)
[2024-05-29T20:51:30.088] debug2: AuthType = auth/munge
[2024-05-29T20:51:30.088] debug2: CommitDelay = 0
[2024-05-29T20:51:30.088] debug2: DbdAddr = localhost
[2024-05-29T20:51:30.088] debug2: DbdBackupHost = (null)
[2024-05-29T20:51:30.088] debug2: DbdHost = head-node
[2024-05-29T20:51:30.088] debug2: DbdPort = 7032
[2024-05-29T20:51:30.088] debug2: DebugFlags = (null)
[2024-05-29T20:51:30.088] debug2: DebugLevel = 6
[2024-05-29T20:51:30.088] debug2: DebugLevelSyslog = 10
[2024-05-29T20:51:30.088] debug2: DefaultQOS = (null)
[2024-05-29T20:51:30.088] debug2: LogFile = /var/log/slurmdbd.log
[2024-05-29T20:51:30.088] debug2: MessageTimeout = 100
[2024-05-29T20:51:30.088] debug2: Parameters = (null)
[2024-05-29T20:51:30.088] debug2: PidFile = /run/slurmdbd.pid
[2024-05-29T20:51:30.088] debug2: PluginDir = /usr/lib/x86_64-linux-gnu/slurm-wlm
[2024-05-29T20:51:30.088] debug2: PrivateData = none
[2024-05-29T20:51:30.088] debug2: PurgeEventAfter = 1 months*
[2024-05-29T20:51:30.088] debug2: PurgeJobAfter = 12 months*
[2024-05-29T20:51:30.088] debug2: PurgeResvAfter = 1 months*
[2024-05-29T20:51:30.088] debug2: PurgeStepAfter = 1 months
[2024-05-29T20:51:30.088] debug2: PurgeSuspendAfter = 1 months
[2024-05-29T20:51:30.088] debug2: PurgeTXNAfter = 12 months
[2024-05-29T20:51:30.088] debug2: PurgeUsageAfter = 24 months
[2024-05-29T20:51:30.088] debug2: SlurmUser = root(0)
[2024-05-29T20:51:30.089] debug2: StorageBackupHost = (null)
[2024-05-29T20:51:30.089] debug2: StorageHost = localhost
[2024-05-29T20:51:30.089] debug2: StorageLoc = slurm_acct_db
[2024-05-29T20:51:30.089] debug2: StoragePort = 3306
[2024-05-29T20:51:30.089] debug2: StorageType = accounting_storage/mysql
[2024-05-29T20:51:30.089] debug2: StorageUser = slurm
[2024-05-29T20:51:30.089] debug2: TCPTimeout = 2
[2024-05-29T20:51:30.089] debug2: TrackWCKey = 0
[2024-05-29T20:51:30.089] debug2: TrackSlurmctldDown= 0
[2024-05-29T20:51:30.089] debug2: acct_storage_p_get_connection: request new connection 1
[2024-05-29T20:51:30.089] debug2: Attempting to connect to localhost:3306
[2024-05-29T20:51:30.090] slurmdbd version 19.05.5 started
[2024-05-29T20:51:30.090] debug2: running rollup at Wed May 29 20:51:30 2024
[2024-05-29T20:51:30.091] debug2: Everything rolled up
[2024-05-29T20:51:49.673] Terminate signal (SIGINT or SIGTERM) received
[2024-05-29T20:51:49.673] debug: rpc_mgr shutting down



My config file looks like this:

ArchiveEvents=yes
ArchiveJobs=yes
ArchiveResvs=yes
ArchiveSteps=no
ArchiveSuspend=no
ArchiveTXN=no
ArchiveUsage=no
PurgeEventAfter=1month
PurgeJobAfter=12month
PurgeResvAfter=1month
PurgeStepAfter=1month
PurgeSuspendAfter=1month
PurgeTXNAfter=12month
PurgeUsageAfter=24month
# Authentication info
AuthType=auth/munge
# slurmDBD info
DbdAddr=localhost
DbdHost=head-node
DbdPort=7032
SlurmUser=root
MessageTimeout=100
DebugLevel=5
#DefaultQOS=normal,standby
LogFile=/var/log/slurmdbd.log
PidFile=/run/slurmdbd.pid
#PrivateData=accounts,users,usage,jobs
#TrackWCKey=yes
#
# Database info
StorageType=accounting_storage/mysql
StorageHost=localhost
StoragePort=3306
StoragePass=slurmdbpass
StorageUser=slurm
StorageLoc=slurm_acct_db


I used standard names and passwords to get started and will change them later.

But every time I try to start slurmdbd.service it crashes, and I get the log
shared above.

I use these versions:

slurmdbd -V
slurm-wlm 19.05.5
mysql Ver 15.1 Distrib 10.3.39-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2

Everything else is working properly except that I cannot get slurmdbd to work,
and at this point I have exhausted everything I could think of :) Looking for
some expert insights :)


Any idea what I am doing wrong here? Also, I didn't compile any Slurm
package; I used the binaries from the apt repos.

Any help will be appreciated.

Cheers

Rad

--

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: slurmdbd not connecting to mysql (mariadb)

2024-05-30 Thread Radhouane Aniba via slurm-users
Thank you Ahmet,
I don't have a firewall active.
Because slurmdbd cannot connect to the database, I am not able to get it
activated through systemctl. I will share the output of slurmdbd -D -vvv
shortly, but overall it always says it is trying to connect to the db, then
retries a couple of times and crashes.

R.




On Thu, May 30, 2024 at 2:51 AM mercan wrote:

> Hi;
>
> Did you check whether you can connect to the db with your conf parameters
> from head-node:
>
> mysql --user=slurm --password=slurmdbpass  slurm_acct_db
>
> Also, check and stop firewall and selinux, if they are running.
>
> Last, you can stop slurmdbd, then run it in a terminal with:
>
> slurmdbd -D -vvv
>
> Regards;
>
> C. Ahmet Mercan
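
The connection checks suggested above could be scripted roughly as follows. This is only a sketch: it assumes bash (for its /dev/tcp redirection) and reuses the host, port, user, and database names from the posted slurmdbd.conf. The helper names port_open and db_login_ok are made up for illustration and are not part of Slurm or MariaDB.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Succeeds if a TCP connection to host $1, port $2 can be opened.
# Relies on bash's /dev/tcp redirection, so it needs bash, not plain sh.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Succeeds if the given user can log in and run a trivial query.
# Arguments: user, password, database.
db_login_ok() {
    mysql --user="$1" --password="$2" --execute='SELECT 1;' "$3" >/dev/null 2>&1
}

# Example calls with the values from the posted config (uncomment to run):
# port_open localhost 3306 && echo "port 3306 reachable"
# db_login_ok slurm slurmdbpass slurm_acct_db && echo "login works"
```

If both checks succeed, the database side is probably fine and the problem lies in how slurmdbd itself is being started.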

[slurm-users] Re: slurmdbd not connecting to mysql (mariadb)

2024-05-30 Thread Radhouane Aniba via slurm-users
Yes, I can connect to my database using mysql --user=slurm
--password=slurmdbpass  slurm_acct_db, and I checked that there is no
firewall blocking mysql.

Also, here is the output of slurmdbd -D -vvv (note that I can only run this
with sudo):

sudo slurmdbd -D -vvv
slurmdbd: debug: Log file re-opened
slurmdbd: debug: Munge authentication plugin loaded
slurmdbd: debug2: mysql_connect() called for db slurm_acct_db
slurmdbd: debug2: Attempting to connect to localhost:3306
slurmdbd: debug2: innodb_buffer_pool_size: 134217728
slurmdbd: debug2: innodb_log_file_size: 50331648
slurmdbd: debug2: innodb_lock_wait_timeout: 50
slurmdbd: error: Database settings not recommended values: innodb_buffer_pool_size innodb_lock_wait_timeout
slurmdbd: Accounting storage MYSQL plugin loaded
slurmdbd: debug2: ArchiveDir = /tmp
slurmdbd: debug2: ArchiveScript = (null)
slurmdbd: debug2: AuthAltTypes = (null)
slurmdbd: debug2: AuthInfo = (null)
slurmdbd: debug2: AuthType = auth/munge
slurmdbd: debug2: CommitDelay = 0
slurmdbd: debug2: DbdAddr = localhost
slurmdbd: debug2: DbdBackupHost = (null)
slurmdbd: debug2: DbdHost = hannibal-hn
slurmdbd: debug2: DbdPort = 7032
slurmdbd: debug2: DebugFlags = (null)
slurmdbd: debug2: DebugLevel = 6
slurmdbd: debug2: DebugLevelSyslog = 10
slurmdbd: debug2: DefaultQOS = (null)
slurmdbd: debug2: LogFile = /var/log/slurmdbd.log
slurmdbd: debug2: MessageTimeout = 100
slurmdbd: debug2: Parameters = (null)
slurmdbd: debug2: PidFile = /run/slurmdbd.pid
slurmdbd: debug2: PluginDir = /usr/lib/x86_64-linux-gnu/slurm-wlm
slurmdbd: debug2: PrivateData = none
slurmdbd: debug2: PurgeEventAfter = 1 months*
slurmdbd: debug2: PurgeJobAfter = 12 months*
slurmdbd: debug2: PurgeResvAfter = 1 months*
slurmdbd: debug2: PurgeStepAfter = 1 months
slurmdbd: debug2: PurgeSuspendAfter = 1 months
slurmdbd: debug2: PurgeTXNAfter = 12 months
slurmdbd: debug2: PurgeUsageAfter = 24 months
slurmdbd: debug2: SlurmUser = root(0)
slurmdbd: debug2: StorageBackupHost = (null)
slurmdbd: debug2: StorageHost = localhost
slurmdbd: debug2: StorageLoc = slurm_acct_db
slurmdbd: debug2: StoragePort = 3306
slurmdbd: debug2: StorageType = accounting_storage/mysql
slurmdbd: debug2: StorageUser = slurm
slurmdbd: debug2: TCPTimeout = 2
slurmdbd: debug2: TrackWCKey = 0
slurmdbd: debug2: TrackSlurmctldDown= 0
slurmdbd: debug2: acct_storage_p_get_connection: request new connection 1
slurmdbd: debug2: Attempting to connect to localhost:3306
slurmdbd: slurmdbd version 19.05.5 started
slurmdbd: debug2: running rollup at Thu May 30 13:50:08 2024
slurmdbd: debug2: Everything rolled up


It goes on like this for some time and then crashes with this message:

slurmdbd: Terminate signal (SIGINT or SIGTERM) received
slurmdbd: debug: rpc_mgr shutting down


On Thu, May 30, 2024 at 8:18 AM mercan wrote:

> Did you try to connect to the database using the mysql command?
>
> mysql --user=slurm --password=slurmdbpass  slurm_acct_db
>
> C. Ahmet Mercan

[slurm-users] Re: slurmdbd not connecting to mysql (mariadb)

2024-05-30 Thread Radhouane Aniba via slurm-users
Thank you Ahmet and Brian,

Ahmet, which conf file in particular is slurmdbd reading from? I went through
all the cnf files for mysql and cannot find the values it displays here:

slurmdbd: debug2: Attempting to connect to localhost:3306
slurmdbd: debug2: innodb_buffer_pool_size: 134217728
slurmdbd: debug2: innodb_log_file_size: 50331648
slurmdbd: debug2: innodb_lock_wait_timeout: 50
slurmdbd: error: Database settings not recommended values: innodb_buffer_pool_size innodb_lock_wait_timeout


sudo tree /etc/mysql/*
/etc/mysql/conf.d
├── mysql.cnf
└── mysqldump.cnf
/etc/mysql/debian.cnf
/etc/mysql/debian-start
/etc/mysql/FROZEN
/etc/mysql/mariadb.cnf
/etc/mysql/mariadb.conf.d
├── 50-client.cnf
├── 50-mysql-clients.cnf
├── 50-mysqld_safe.cnf
└── 50-server.cnf
/etc/mysql/my.cnf
/etc/mysql/my.cnf.fallback
/etc/mysql/mysql.cnf
/etc/mysql/mysql.conf.d
├── mysql.cnf
└── mysqld.cnf
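
The three innodb_* numbers in that output are probably not read from any .cnf file. They match MariaDB 10.3's built-in defaults (128 MiB buffer pool, 48 MiB log file, 50 s lock wait), which suggests slurmdbd asks the running server for them. A minimal sketch for checking the live values, assuming the credentials from the posted config; to_mib and show_innodb are hypothetical helper names:

```shell
# Convert a byte count to whole MiB.
to_mib() {
    echo $(( $1 / 1048576 ))
}

# Print the live values of the three variables slurmdbd reported.
# Uses the user and password from the posted slurmdbd.conf.
show_innodb() {
    mysql --user=slurm --password=slurmdbpass --execute="
        SHOW VARIABLES WHERE Variable_name IN
        ('innodb_buffer_pool_size',
         'innodb_log_file_size',
         'innodb_lock_wait_timeout');"
}

to_mib 134217728   # buffer pool size from the log, prints 128
to_mib 50331648    # log file size from the log, prints 48
```

To change them, the settings would go under a [mysqld] section in one of the server cnf files, followed by a restart of the database server. The Slurm accounting guide lists recommended values.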

On Thu, May 30, 2024 at 12:21 PM Brian Andrus via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> That SIGTERM message means something is telling slurmdbd to quit.
>
> Check your cron jobs, maintenance scripts, etc. Slurmdbd is being told to
> shut down. If you are running in the foreground, a ^C does that. If you run
> a kill or killall on it, you will get that same message.
>
> Brian Andrus

[slurm-users] Re: slurmdbd not connecting to mysql (mariadb)

2024-05-30 Thread Radhouane Aniba via slurm-users
OK, I made some progress here.

I removed and purged slurmdbd, mysql, mariadb, etc. and started from scratch.
I added the recommended mysqld settings.

I started slurmdbd manually with sudo slurmdbd -D /path/to/conf and everything
worked well.

When I tried to start the service with sudo systemctl start slurmdbd.service,
it didn't work:

sudo systemctl status slurmdbd.service
● slurmdbd.service - Slurm DBD accounting daemon
     Loaded: loaded (/etc/systemd/system/slurmdbd.service; enabled; vendor preset: enabled)
     Active: failed (Result: timeout) since Fri 2024-05-31 00:21:30 UTC; 2min 5s ago
    Process: 6258 ExecStart=/usr/sbin/slurmdbd -D /etc/slurm-llnl/slurmdbd.conf (code=exited, status=0/SUCCESS)

May 31 00:20:00 hannibal-hn systemd[1]: Starting Slurm DBD accounting daemon...
May 31 00:21:30 hannibal-hn systemd[1]: slurmdbd.service: start operation timed out. Terminating.
May 31 00:21:30 hannibal-hn systemd[1]: slurmdbd.service: Failed with result 'timeout'.
May 31 00:21:30 hannibal-hn systemd[1]: Failed to start Slurm DBD accounting daemon.

Even though it is the same command?!

Any idea?
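
One way to see why the same command behaves differently under systemd is to look at how the unit is defined. A plausible explanation, not confirmed anywhere in this thread, is a mismatch between the unit's Type= setting and the -D (foreground) flag in ExecStart: a Type=forking unit expects the process to daemonize, and slurmdbd -D never does, so systemd would wait until its start timeout and then send SIGTERM. The sketch below only inspects the unit; unit_type is a hypothetical helper that extracts Type= from unit-file text.

```shell
# Print the value of Type= from unit-file text on stdin (empty if unset).
# If Type= appears more than once, the last one wins, as in systemd.
unit_type() {
    sed -n 's/^Type=//p' | tail -n 1
}

# On the affected host (uncomment to run):
# systemctl cat slurmdbd.service | unit_type
# systemctl show -p Type -p PIDFile -p TimeoutStartUSec slurmdbd.service
```

If the type turns out to be forking while ExecStart carries -D, either side could be changed so the two agree. Which one to change is a judgment call for the administrator.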



[slurm-users] Re: slurmdbd not connecting to mysql (mariadb)

2024-05-30 Thread Radhouane Aniba via slurm-users
Running it manually through sudo slurmdbd -D /path/to/conf is very quick on
my fresh install.

Trying to start slurmdbd through systemctl takes 3 minutes, and then it
crashes and fails.

Is there an alternative to systemctl for starting slurmdbd in the
background?

But most importantly, I want to know why it takes so long through
systemctl. Maybe I can increase the timeout limit?
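
On the timeout question: systemd's per-unit start limit is set with TimeoutStartSec= in the [Service] section, and a drop-in file is the usual way to override it without editing the unit itself. The following is a sketch, assuming the unit name from this thread. The 300-second value and the write_dropin helper name are arbitrary illustrations. Raising the limit only helps if the daemon really needs longer to start, which the quick manual start makes doubtful.

```shell
# Write a drop-in that raises the start timeout into directory $1.
write_dropin() {
    mkdir -p "$1" &&
    cat > "$1/override.conf" <<'EOF'
[Service]
TimeoutStartSec=300
EOF
}

# On the affected host this would be (as root, uncomment to run):
# write_dropin /etc/systemd/system/slurmdbd.service.d
# systemctl daemon-reload
# systemctl restart slurmdbd.service
```

As for an alternative to systemctl, running slurmdbd without -D lets it put itself in the background, since -D is the flag that keeps it in the foreground.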

On Thu, May 30, 2024 at 11:54 PM Ryan Novosielski wrote:

> It may take longer to start than systemd allows for. How long does it take
> to start from the command line? It’s common to need to run it manually for
> upgrades to complete.
>
> --
> #BlackLivesMatter
> Ryan Novosielski - novos...@rutgers.edu
> Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> Office of Advanced Research Computing - MSB A555B, Newark
> Rutgers, the State University of NJ

[slurm-users] Re: slurmdbd not connecting to mysql (mariadb)

2024-05-30 Thread Radhouane Aniba via slurm-users
Yes, when I run it manually it says something like this:

[2024-05-31T00:20:01.142] Accounting storage MYSQL plugin loaded
[2024-05-31T00:20:01.146] slurmdbd version 19.05.5 started

But when I try to do it through systemctl:

[2024-05-31T00:21:30.953] Terminate signal (SIGINT or SIGTERM) received
[2024-05-31T00:21:30.953] debug:  rpc_mgr shutting down
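
Read together with the systemd journal from the earlier message, these timestamps give a useful timeline. slurmdbd reports "started" at 00:20:01, about a second after systemd began starting it at 00:20:00. The terminate signal arrives at 00:21:30, the same second systemd logs the start timeout. The gap can be computed as below; this sketch assumes a date command that accepts -d (GNU date does), and elapsed_s is a hypothetical helper.

```shell
# Seconds between two "YYYY-MM-DD HH:MM:SS" timestamps (both taken as UTC).
elapsed_s() {
    echo $(( $(date -u -d "$2" +%s) - $(date -u -d "$1" +%s) ))
}

elapsed_s "2024-05-31 00:20:00" "2024-05-31 00:21:30"   # prints 90
```

Ninety seconds is systemd's usual default start timeout, so the daemon appears to have been up and running the whole time. systemd seems never to have considered the start complete and then stopped it.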



On Fri, May 31, 2024 at 12:01 AM Ryan Novosielski wrote:

> Are you looking at the log/what appears on the screen, and do you know for
> a fact that it is all the way up (should say "version ... started"
> at the end)?
>
> If that’s not it, you could have a permissions thing or something.
>
> I do not expect you’d need to extend the timeout for a normal run. I
> suspect it is doing something.
>
> --
> #BlackLivesMatter
> Ryan Novosielski - novos...@rutgers.edu
> Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> Office of Advanced Research Computing - MSB A555B, Newark
> Rutgers, the State University of NJ

-- 
*Rad Aniba, PhD*



[slurm-users] Re: slurmdbd not connecting to mysql (mariadb)

2024-05-30 Thread Radhouane Aniba via slurm-users
I also run both commands using sudo, so I am assuming permissions should not
be the issue? My cluster user is root (I know, not good, but I'm testing
things out).


-- 
*Rad Aniba, PhD*
