We had the same problem on our clusters. It was caused by our MySQL backup
script locking the tables (and taking too long).
Try looking at ''mod_time'' and ''control_host'' of ''cluster_table'' in
the database:
select mod_time,control_host from cluster_table;
We found that ''mod_time''
On Saturday, 10 November 2018 6:22:26 AM AEDT Brian Andrus wrote:
> There are no firewalls and I have always been able to do 'sacctmgr show
> clusters' as well as things like 'squeue -M ALL' from both the db
> server and the cluster head.
What does "sacctmgr list clusters" say for you?
Remember
There are no firewalls and I have always been able to do 'sacctmgr show
clusters' as well as things like 'squeue -M ALL' from both the db
server and the cluster head.
For now, I will have to restart slurmctld on all the clusters when there
are changes to associations. But that is definitely
On Friday, 9 November 2018 5:38:22 AM AEDT Brian Andrus wrote:
> where slurmctld is not picking up new accounts unless it is restarted.
This is usually because slurmdbd cannot connect back to the slurmctld on the
management node to do the RPC to tell it that a new account/user/etc has
appeared
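
One quick way to test that from the slurmdbd host is a plain TCP connect to the
controller. The sketch below is only an illustration: the host name
"cluster-head" is a placeholder, and 6817 is the default SlurmctldPort, so
substitute whatever control_host and port your cluster actually registered.
A successful connect only shows the port is reachable; it does not prove the
RPC itself succeeds.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder values (assumptions, not taken from the thread):
    # replace with the control_host and port of your own cluster.
    host, port = "cluster-head", 6817
    state = "reachable" if can_reach(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state} from this machine")
```

Run it on the machine where slurmdbd lives, once per cluster.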
We use sssd with realmd.
Enumeration is off.
Brian Andrus
On 11/8/2018 11:26 AM, Marcin Stolarek wrote:
I have had a very similar issue for quite some time and have been unable
to find its root cause. Are you using sssd and AD as a data source with
only a subtree of entries searched? That is my case.
Di
I have had a very similar issue for quite some time and have been unable to
find its root cause. Are you using sssd and AD as a data source with only a
subtree of entries searched? That is my case.
Did you disable user enumeration? That is also what I have. I didn’t find any
evidence that it’s related but... may
All,
I am seeing what looks like the same issue as
https://bugs.schedmd.com/show_bug.cgi?id=2119
where slurmctld is not picking up new accounts unless it is restarted.
I have 4 clusters (non-federated), all using the same slurmdbd.
When I added an association for user name=me cluster=DevOps
accou