Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Chris Samuel
On Thursday, 30 November 2017 5:28:26 PM AEDT Chris Samuel wrote:
> Are you starting it with systemctl? If so it might be taking too long for
> systemd's liking to upgrade the tables and it might kill it.
Ignore that - I skimmed your logs too quickly!
[2017-11-29T16:15:22.086] slurmdbd version

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Chris Samuel
On Thursday, 30 November 2017 3:26:25 AM AEDT Bruno Santos wrote:
> Managed to make some more progress on this. The problem seems to be that
> the service was somehow still linking to an older version of slurmdbd I had
> installed with apt. I have now hopefully fully cleaned out the old version, but
>

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Bruno Santos
Managed to make some more progress on this. The problem seems to be that the service was somehow still linking to an older version of slurmdbd I had installed with apt. I have now hopefully fully cleaned out the old version, but when I try to start the service it is getting killed somehow. Any suggestio

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Philip Kovacs
Step back from slurm and confirm that MariaDB is up and responsive.
# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.2.9-MariaDB MariaDB Server
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Bruno Santos
Hi Barbara, This is a fresh install. I have installed slurm from source on Debian stretch and am now trying to set it up correctly. MariaDB is running, but I am confused about the database configuration. I followed a tutorial (I can no longer find it) that showed me how to create the database and

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Barbara Krašovec
Did you upgrade SLURM or is it a fresh install? Are there any associations set? For instance, did you create the cluster with sacctmgr?
sacctmgr add cluster
Is the mariadb/mysql server running? Is slurmdbd running? Is it working? Try a simple test, such as:
sacctmgr show user -s
If it was an upgra
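The registration check suggested above can be sketched as a small script. This is only an illustration, not something from the thread: the helper names `cluster_name` and `cluster_registered` are made up here, the default path `/etc/slurm/slurm.conf` is an assumption (source builds often use a different prefix), and the parsing is deliberately simplistic.

```shell
#!/bin/sh
# Print the ClusterName value from a slurm.conf-style file.
# Simplification: expects the exact spelling "ClusterName" at line start.
cluster_name() {
    sed -n 's/^ClusterName=\([^[:space:]#]*\).*/\1/p' "$1" | head -n 1
}

# Succeed if cluster "$1" is already known to slurmdbd.
# -n drops the header, -P gives plain delimiter-separated output.
cluster_registered() {
    sacctmgr -n -P show cluster format=cluster | grep -qxF -- "$1"
}

# Assumed default path; adjust for your installation.
conf=${SLURM_CONF:-/etc/slurm/slurm.conf}

# Only talk to slurmdbd when sacctmgr and the config file are present.
if command -v sacctmgr >/dev/null 2>&1 && [ -r "$conf" ]; then
    name=$(cluster_name "$conf")
    if [ -z "$name" ]; then
        echo "no ClusterName found in $conf"
    elif cluster_registered "$name"; then
        echo "cluster '$name' is registered"
    else
        echo "cluster '$name' not found; try: sacctmgr add cluster $name"
    fi
fi
```

If the cluster turns out to be missing, adding it with `sacctmgr add cluster` and the name from slurm.conf is the step the message above refers to.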

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Bruno Santos
Thank you Barbara,
Unfortunately, it does not seem to be a munge problem. Munge can successfully authenticate with the nodes. I have increased the verbosity level and restarted slurmctld, and now I am getting more information about this:
> Nov 29 14:08:16 plantae slurmctld[30340]: Registering

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Barbara Krašovec
Hello, does munge work?
Check whether decoding works locally:
munge -n | unmunge
Check whether decoding works remotely:
munge -n | ssh unmunge
It seems the munge keys do not match...
See comments inline.
> On 29 Nov 2017, at 14:40, Bruno Santos wrote:
>
> I actually just managed to figure that one out.
>
> T
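The two munge checks can be wrapped in a short script, roughly as follows. This is a sketch under assumptions: the function names and the `NODE` variable are hypothetical, the remote check needs a host name between `ssh` and `unmunge`, and `BatchMode=yes` is added here only so that ssh fails instead of prompting for a password.

```shell
#!/bin/sh
# run_check LABEL COMMAND...: run a command quietly and report the result.
run_check() {
    label=$1; shift
    if "$@" >/dev/null 2>&1; then
        echo "$label: OK"
    else
        echo "$label: FAILED"
        return 1
    fi
}

# Encode a credential and decode it with the local munged.
munge_local() { munge -n | unmunge; }

# Encode locally, decode on a remote node. If the munge keys differ
# between the two hosts, unmunge on the remote side fails.
munge_remote() { munge -n | ssh -o BatchMode=yes "$1" unmunge; }

# Only run the checks when munge is installed.
if command -v munge >/dev/null 2>&1; then
    run_check "local decode" munge_local
    # NODE is a hypothetical variable naming a compute node, e.g. NODE=node01
    if [ -n "${NODE:-}" ]; then
        run_check "remote decode on $NODE" munge_remote "$NODE"
    fi
fi
```

A failed local decode usually points at munged not running, while a local success with a remote failure is more consistent with mismatched keys or clock skew.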

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Barbara Krašovec
I was struggling like crazy with this one a while ago. Then I saw this in the slurm.conf man page:
AccountingStoragePass
The password used to gain access to the database to store the accounting data. Only used for database type storage plugins, ignored otherwise. In the case of

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Bruno Santos
I actually just managed to figure that one out. The problem was that I had set AccountingStoragePass=magic in the slurm.conf file. After re-reading the documentation, it seems this is only needed if I have a different munge instance controlling the logins to the database, which I don't. So c
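For illustration, the resulting split between the two files might look roughly like this. It is a sketch, not the poster's actual configuration: the host name, user, and password values are placeholders, and the socket path is the example given in the slurm.conf man page.

```
# slurm.conf (accounting section); values are placeholders
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=dbhost
# Leave unset unless a separate MUNGE daemon handles authentication to
# slurmdbd; in that case it names that daemon's socket, not a password:
#AccountingStoragePass=/var/run/munge/global.socket.2

# slurmdbd.conf; the database credentials belong here instead
StorageType=accounting_storage/mysql
StorageUser=slurm
StoragePass=changeme
```

With slurmdbd in use, slurmctld never logs in to MariaDB directly, so a database password in slurm.conf is interpreted as something else entirely.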

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Andy Riebs
It looks like you don't have the munged daemon running.
On 11/29/2017 08:01 AM, Bruno Santos wrote:
Hi everyone, I have set up slurm to use slurm_db and all was working fine. However, I had to change the slurm.conf to play with user priority, and upon restarting, the slurmctl fails with the f