OK, I made some progress here. I removed and purged slurmdbd, mysql, mariadb, etc. and started from scratch. I added the recommended mysqld settings.
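For readers following along, the settings in question are the InnoDB ones that slurmdbd complains about in the log quoted further down ("Database settings not recommended values"). A minimal sketch of such a fragment, assuming a Debian-style MariaDB layout like the one in the tree listing below; the file name and the exact values are assumptions based on commonly cited Slurm accounting guidance, not something taken from this thread, so check the documentation for your Slurm version:

```ini
# Hypothetical file: /etc/mysql/mariadb.conf.d/99-slurm.cnf (name is an assumption)
[mysqld]
# Values below are assumed, not copied from the thread.
innodb_buffer_pool_size=1024M
innodb_log_file_size=64M
innodb_lock_wait_timeout=900
```

The database server has to be restarted for these to take effect, and slurmdbd prints the values it actually sees at debug2 level, which is a quick way to confirm the fragment was picked up.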
Started slurmdbd manually:

sudo slurmdbd -D /path/to/conf

and everything worked well.

When I tried to start the service:

sudo systemctl start slurmdbd.service

it didn't work:

sudo systemctl status slurmdbd.service
● slurmdbd.service - Slurm DBD accounting daemon
Loaded: loaded (/etc/systemd/system/slurmdbd.service; enabled; vendor preset: enabled)
Active: failed (Result: timeout) since Fri 2024-05-31 00:21:30 UTC; 2min 5s ago
Process: 6258 ExecStart=/usr/sbin/slurmdbd -D /etc/slurm-llnl/slurmdbd.conf (code=exited, status=0/SUCCESS)

May 31 00:20:00 hannibal-hn systemd[1]: Starting Slurm DBD accounting daemon...
May 31 00:21:30 hannibal-hn systemd[1]: slurmdbd.service: start operation timed out. Terminating.
May 31 00:21:30 hannibal-hn systemd[1]: slurmdbd.service: Failed with result 'timeout'.
May 31 00:21:30 hannibal-hn systemd[1]: Failed to start Slurm DBD accounting daemon.

Even though it is the same command?! Any idea?

On Thu, May 30, 2024 at 5:02 PM Radhouane Aniba <arad...@gmail.com> wrote:

> Thank you Ahmet and Brian,
>
> Ahmet, which conf in particular is slurmdbd reading from? I parsed all
> the cnf files for mysql and I cannot find the data it is displaying here:
>
> slurmdbd: debug2: Attempting to connect to localhost:3306
> slurmdbd: debug2: innodb_buffer_pool_size: 134217728
> slurmdbd: debug2: innodb_log_file_size: 50331648
> slurmdbd: debug2: innodb_lock_wait_timeout: 50
> slurmdbd: error: Database settings not recommended values: innodb_buffer_pool_size innodb_lock_wait_timeout
>
> sudo tree /etc/mysql/*
> /etc/mysql/conf.d
> ├── mysql.cnf
> └── mysqldump.cnf
> /etc/mysql/debian.cnf
> /etc/mysql/debian-start
> /etc/mysql/FROZEN
> /etc/mysql/mariadb.cnf
> /etc/mysql/mariadb.conf.d
> ├── 50-client.cnf
> ├── 50-mysql-clients.cnf
> ├── 50-mysqld_safe.cnf
> └── 50-server.cnf
> /etc/mysql/my.cnf
> /etc/mysql/my.cnf.fallback
> /etc/mysql/mysql.cnf
> /etc/mysql/mysql.conf.d
> ├── mysql.cnf
> └── mysqld.cnf
>
> On Thu, May 30, 2024 at 12:21 PM Brian Andrus via slurm-users <
> slurm-users@lists.schedmd.com> wrote:
>
>> That SIGTERM message means something is telling slurmdbd to quit.
>>
>> Check your cron jobs, maintenance scripts, etc. Slurmdbd is being told to
>> shut down. If you are running in the foreground, a ^C does that. If you run
>> a kill or killall on it, you will get that same message.
>>
>> Brian Andrus
>>
>> On 5/30/2024 6:53 AM, Radhouane Aniba via slurm-users wrote:
>>
>> Yes, I can connect to my database using
>>
>> mysql --user=slurm --password=slurmdbpass slurm_acct_db
>>
>> and, after checking on the firewall question, there is no firewall
>> blocking mysql.
>>
>> Also, here is the output of slurmdbd -D -vvv (note: I can only run this
>> with sudo):
>>
>> sudo slurmdbd -D -vvv
>> slurmdbd: debug: Log file re-opened
>> slurmdbd: debug: Munge authentication plugin loaded
>> slurmdbd: debug2: mysql_connect() called for db slurm_acct_db
>> slurmdbd: debug2: Attempting to connect to localhost:3306
>> slurmdbd: debug2: innodb_buffer_pool_size: 134217728
>> slurmdbd: debug2: innodb_log_file_size: 50331648
>> slurmdbd: debug2: innodb_lock_wait_timeout: 50
>> slurmdbd: error: Database settings not recommended values: innodb_buffer_pool_size innodb_lock_wait_timeout
>> slurmdbd: Accounting storage MYSQL plugin loaded
>> slurmdbd: debug2: ArchiveDir = /tmp
>> slurmdbd: debug2: ArchiveScript = (null)
>> slurmdbd: debug2: AuthAltTypes = (null)
>> slurmdbd: debug2: AuthInfo = (null)
>> slurmdbd: debug2: AuthType = auth/munge
>> slurmdbd: debug2: CommitDelay = 0
>> slurmdbd: debug2: DbdAddr = localhost
>> slurmdbd: debug2: DbdBackupHost = (null)
>> slurmdbd: debug2: DbdHost = hannibal-hn
>> slurmdbd: debug2: DbdPort = 7032
>> slurmdbd: debug2: DebugFlags = (null)
>> slurmdbd: debug2: DebugLevel = 6
>> slurmdbd: debug2: DebugLevelSyslog = 10
>> slurmdbd: debug2: DefaultQOS = (null)
>> slurmdbd: debug2: LogFile = /var/log/slurmdbd.log
>> slurmdbd: debug2: MessageTimeout = 100
>> slurmdbd: debug2: Parameters = (null)
>> slurmdbd: debug2: PidFile = /run/slurmdbd.pid
>> slurmdbd: debug2: PluginDir = /usr/lib/x86_64-linux-gnu/slurm-wlm
>> slurmdbd: debug2: PrivateData = none
>> slurmdbd: debug2: PurgeEventAfter = 1 months*
>> slurmdbd: debug2: PurgeJobAfter = 12 months*
>> slurmdbd: debug2: PurgeResvAfter = 1 months*
>> slurmdbd: debug2: PurgeStepAfter = 1 months
>> slurmdbd: debug2: PurgeSuspendAfter = 1 months
>> slurmdbd: debug2: PurgeTXNAfter = 12 months
>> slurmdbd: debug2: PurgeUsageAfter = 24 months
>> slurmdbd: debug2: SlurmUser = root(0)
>> slurmdbd: debug2: StorageBackupHost = (null)
>> slurmdbd: debug2: StorageHost = localhost
>> slurmdbd: debug2: StorageLoc = slurm_acct_db
>> slurmdbd: debug2: StoragePort = 3306
>> slurmdbd: debug2: StorageType = accounting_storage/mysql
>> slurmdbd: debug2: StorageUser = slurm
>> slurmdbd: debug2: TCPTimeout = 2
>> slurmdbd: debug2: TrackWCKey = 0
>> slurmdbd: debug2: TrackSlurmctldDown= 0
>> slurmdbd: debug2: acct_storage_p_get_connection: request new connection 1
>> slurmdbd: debug2: Attempting to connect to localhost:3306
>> slurmdbd: slurmdbd version 19.05.5 started
>> slurmdbd: debug2: running rollup at Thu May 30 13:50:08 2024
>> slurmdbd: debug2: Everything rolled up
>>
>> It goes like this for some time and then it crashes with this message:
>>
>> slurmdbd: Terminate signal (SIGINT or SIGTERM) received
>> slurmdbd: debug: rpc_mgr shutting down
>>
>> On Thu, May 30, 2024 at 8:18 AM mercan <ahmet.mer...@uhem.itu.edu.tr>
>> wrote:
>>
>>> Did you try to connect to the database using the mysql command?
>>>
>>> mysql --user=slurm --password=slurmdbpass slurm_acct_db
>>>
>>> C. Ahmet Mercan
>>>
>>> On 30.05.2024 14:48, Radhouane Aniba via slurm-users wrote:
>>>
>>> Thank you Ahmet,
>>> I don't have a firewall active.
>>> And because slurmdbd cannot connect to the database, I am not able to
>>> get it activated through systemctl. I will share the output of
>>> slurmdbd -D -vvv shortly, but overall it always says it is trying to
>>> connect to the db, then retries a couple of times and crashes.
>>>
>>> R.
>>>
>>> On Thu, May 30, 2024 at 2:51 AM mercan <ahmet.mer...@uhem.itu.edu.tr>
>>> wrote:
>>>
>>>> Hi;
>>>>
>>>> Did you check whether you can connect to the db with your conf
>>>> parameters from the head-node:
>>>>
>>>> mysql --user=slurm --password=slurmdbpass slurm_acct_db
>>>>
>>>> Also, check and stop firewall and selinux, if they are running.
>>>>
>>>> Last, you can stop slurmdbd, then run it in a terminal with:
>>>>
>>>> slurmdbd -D -vvv
>>>>
>>>> Regards;
>>>>
>>>> C. Ahmet Mercan
>>>>
>>>> On 30.05.2024 00:05, Radhouane Aniba via slurm-users wrote:
>>>>
>>>> Hi everyone,
>>>> I am trying to get slurmdbd to run on my local home server but I am
>>>> really struggling.
>>>> Note: I am a novice Slurm user.
>>>> My slurmdbd always times out even though all the details in the conf
>>>> file are correct.
>>>>
>>>> My log looks like this:
>>>>
>>>> [2024-05-29T20:51:30.088] Accounting storage MYSQL plugin loaded
>>>> [2024-05-29T20:51:30.088] debug2: ArchiveDir = /tmp
>>>> [2024-05-29T20:51:30.088] debug2: ArchiveScript = (null)
>>>> [2024-05-29T20:51:30.088] debug2: AuthAltTypes = (null)
>>>> [2024-05-29T20:51:30.088] debug2: AuthInfo = (null)
>>>> [2024-05-29T20:51:30.088] debug2: AuthType = auth/munge
>>>> [2024-05-29T20:51:30.088] debug2: CommitDelay = 0
>>>> [2024-05-29T20:51:30.088] debug2: DbdAddr = localhost
>>>> [2024-05-29T20:51:30.088] debug2: DbdBackupHost = (null)
>>>> [2024-05-29T20:51:30.088] debug2: DbdHost = head-node
>>>> [2024-05-29T20:51:30.088] debug2: DbdPort = 7032
>>>> [2024-05-29T20:51:30.088] debug2: DebugFlags = (null)
>>>> [2024-05-29T20:51:30.088] debug2: DebugLevel = 6
>>>> [2024-05-29T20:51:30.088] debug2: DebugLevelSyslog = 10
>>>> [2024-05-29T20:51:30.088] debug2: DefaultQOS = (null)
>>>> [2024-05-29T20:51:30.088] debug2: LogFile = /var/log/slurmdbd.log
>>>> [2024-05-29T20:51:30.088] debug2: MessageTimeout = 100
>>>> [2024-05-29T20:51:30.088] debug2: Parameters = (null)
>>>> [2024-05-29T20:51:30.088] debug2: PidFile = /run/slurmdbd.pid
>>>> [2024-05-29T20:51:30.088] debug2: PluginDir = /usr/lib/x86_64-linux-gnu/slurm-wlm
>>>> [2024-05-29T20:51:30.088] debug2: PrivateData = none
>>>> [2024-05-29T20:51:30.088] debug2: PurgeEventAfter = 1 months*
>>>> [2024-05-29T20:51:30.088] debug2: PurgeJobAfter = 12 months*
>>>> [2024-05-29T20:51:30.088] debug2: PurgeResvAfter = 1 months*
>>>> [2024-05-29T20:51:30.088] debug2: PurgeStepAfter = 1 months
>>>> [2024-05-29T20:51:30.088] debug2: PurgeSuspendAfter = 1 months
>>>> [2024-05-29T20:51:30.088] debug2: PurgeTXNAfter = 12 months
>>>> [2024-05-29T20:51:30.088] debug2: PurgeUsageAfter = 24 months
>>>> [2024-05-29T20:51:30.088] debug2: SlurmUser = root(0)
>>>> [2024-05-29T20:51:30.089] debug2: StorageBackupHost = (null)
>>>> [2024-05-29T20:51:30.089] debug2: StorageHost = localhost
>>>> [2024-05-29T20:51:30.089] debug2: StorageLoc = slurm_acct_db
>>>> [2024-05-29T20:51:30.089] debug2: StoragePort = 3306
>>>> [2024-05-29T20:51:30.089] debug2: StorageType = accounting_storage/mysql
>>>> [2024-05-29T20:51:30.089] debug2: StorageUser = slurm
>>>> [2024-05-29T20:51:30.089] debug2: TCPTimeout = 2
>>>> [2024-05-29T20:51:30.089] debug2: TrackWCKey = 0
>>>> [2024-05-29T20:51:30.089] debug2: TrackSlurmctldDown= 0
>>>> [2024-05-29T20:51:30.089] debug2: acct_storage_p_get_connection: request new connection 1
>>>> [2024-05-29T20:51:30.089] debug2: Attempting to connect to localhost:3306
>>>> [2024-05-29T20:51:30.090] slurmdbd version 19.05.5 started
>>>> [2024-05-29T20:51:30.090] debug2: running rollup at Wed May 29 20:51:30 2024
>>>> [2024-05-29T20:51:30.091] debug2: Everything rolled up
>>>> [2024-05-29T20:51:49.673] Terminate signal (SIGINT or SIGTERM) received
>>>> [2024-05-29T20:51:49.673] debug: rpc_mgr shutting down
>>>>
>>>> My config file looks like this:
>>>>
>>>> ArchiveEvents=yes
>>>> ArchiveJobs=yes
>>>> ArchiveResvs=yes
>>>> ArchiveSteps=no
>>>> ArchiveSuspend=no
>>>> ArchiveTXN=no
>>>> ArchiveUsage=no
>>>> PurgeEventAfter=1month
>>>> PurgeJobAfter=12month
>>>> PurgeResvAfter=1month
>>>> PurgeStepAfter=1month
>>>> PurgeSuspendAfter=1month
>>>> PurgeTXNAfter=12month
>>>> PurgeUsageAfter=24month
>>>> # Authentication info
>>>> AuthType=auth/munge
>>>> # slurmDBD info
>>>> DbdAddr=localhost
>>>> DbdHost=head-node
>>>> DbdPort=7032
>>>> SlurmUser=root
>>>> MessageTimeout=100
>>>> DebugLevel=5
>>>> #DefaultQOS=normal,standby
>>>> LogFile=/var/log/slurmdbd.log
>>>> PidFile=/run/slurmdbd.pid
>>>> #PrivateData=accounts,users,usage,jobs
>>>> #TrackWCKey=yes
>>>> #
>>>> # Database info
>>>> StorageType=accounting_storage/mysql
>>>> StorageHost=localhost
>>>> StoragePort=3306
>>>> StoragePass=slurmdbpass
>>>> StorageUser=slurm
>>>> StorageLoc=slurm_acct_db
>>>>
>>>> I used standard names and passwords to get started and I will change
>>>> them later,
>>>>
>>>> but every time I try to start slurmdbd.service it crashes and I get
>>>> the log that I shared with you.
>>>>
>>>> I use these versions:
>>>>
>>>> slurmdbd -V
>>>> slurm-wlm 19.05.5
>>>> mysql Ver 15.1 Distrib 10.3.39-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
>>>>
>>>> Everything else is working properly except I cannot get slurmdbd to
>>>> work, and at this point I have exhausted all my possible trials :)
>>>> Looking for some expert insights :)
>>>>
>>>> Any idea what I am doing wrong here? Also, I didn't compile any Slurm
>>>> package. I used the binary from the apt repos.
>>>>
>>>> Any help will be appreciated.
>>>>
>>>> Cheers
>>>>
>>>> Rad
>>>>
>>>> --
>>>>
>>
>> --
>> *Rad Aniba, PhD*
>>
>> --
>> slurm-users mailing list -- slurm-users@lists.schedmd.com
>> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>
> --
> *Rad Aniba, PhD*

--
*Rad Aniba, PhD*
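On the systemctl timeout reported at the top of this message, one possible explanation (a guess, not confirmed anywhere in this thread) is that the unit file declares Type=forking while ExecStart runs slurmdbd with -D, which keeps the daemon in the foreground. systemd would then wait for a fork that never happens, give up when the start timeout expires, and terminate the process. The 90 seconds between "Starting" and "start operation timed out" in the status output would be consistent with systemd's default start timeout. A minimal sketch of a drop-in override under that assumption; the drop-in path is the conventional location for a unit override, and the binary path is taken from the status output:

```ini
# /etc/systemd/system/slurmdbd.service.d/override.conf
# Assumption: the installed unit uses Type=forking; verify first with
#   systemctl cat slurmdbd.service
[Service]
# With -D the daemon stays in the foreground, so treat it as a simple service.
Type=simple
# An empty ExecStart= clears the value inherited from the main unit.
ExecStart=
ExecStart=/usr/sbin/slurmdbd -D
```

After saving the file, run sudo systemctl daemon-reload and then sudo systemctl restart slurmdbd.service. The alternative under the same assumption is to keep Type=forking and drop -D from ExecStart, so that slurmdbd daemonizes and writes the PidFile named in slurmdbd.conf.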