Hello,

We are experiencing quite a number of database failures. A short while ago we 
saw an outright failure and had to restart the MariaDB database and the 
slurmdbd process. After the restart the database appeared to be working well; 
however, over the last few days I have noticed quite a number of failures (see 
the example below). Does anyone understand what might be going wrong and why, 
and whether we should be concerned, please? I understand that Slurm databases 
can get quite large relatively quickly, so I wonder if this is memory related.

Best regards,
David

[root@blue51 slurm]# less slurmdbd.log-20190506.gz | grep failed
[2019-05-05T04:00:05.603] error: mysql_query failed: 1213 Deadlock found when trying to get lock; try restarting transaction
[2019-05-05T04:00:05.606] error: Cluster i5 rollup failed
[2019-05-05T23:00:07.017] error: mysql_query failed: 1213 Deadlock found when trying to get lock; try restarting transaction
[2019-05-05T23:00:07.018] error: Cluster i5 rollup failed
[2019-05-06T00:00:13.348] error: mysql_query failed: 1213 Deadlock found when trying to get lock; try restarting transaction
[2019-05-06T00:00:13.350] error: Cluster i5 rollup failed
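
All of the failures above were logged a few seconds after the top of an hour, 
and each deadlock is followed immediately by a failed rollup. To check whether 
that pattern holds across a whole log, something like the following rough 
sketch could tally the errors by hour of day. It assumes each log entry sits on 
a single line in the format shown above; the names summarize_errors and LINE_RE 
are only illustrative and are not part of Slurm.

```python
import re
from collections import Counter

# Matches slurmdbd-style error lines such as:
# [2019-05-05T04:00:05.603] error: mysql_query failed: 1213 Deadlock found ...
# Assumption: one log entry per line (not wrapped as in an email).
LINE_RE = re.compile(
    r"^\[\d{4}-\d{2}-\d{2}T(\d{2}):\d{2}:\d{2}\.\d+\] error: (.*)$"
)


def summarize_errors(lines):
    """Count deadlock errors (MySQL error 1213) and failed rollups per hour of day."""
    deadlocks = Counter()
    rollups = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not an error line in the expected format
        hour, message = match.group(1), match.group(2)
        if "mysql_query failed: 1213" in message:
            deadlocks[hour] += 1
        elif "rollup failed" in message:
            rollups[hour] += 1
    return deadlocks, rollups


if __name__ == "__main__":
    import gzip
    import sys

    # Usage (hypothetical script name): python tally.py slurmdbd.log-20190506.gz
    with gzip.open(sys.argv[1], "rt", errors="replace") as handle:
        deadlocks, rollups = summarize_errors(handle)
    for hour in sorted(set(deadlocks) | set(rollups)):
        print(f"{hour}:00  deadlocks={deadlocks[hour]}  rollup_failures={rollups[hour]}")
```

If the counts cluster at particular hours, that may point at the rollup 
colliding with some other scheduled activity rather than at memory, but that 
is only a guess on my part.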
