Hello,

Every day we see several deadlocks in our slurmdbd log file. Each deadlock is 
always accompanied by a failed "roll up" operation. Please see below for an 
example.


We are running Slurm 18.08.0 on our cluster. As far as we know, these deadlocks 
are not adversely affecting the operation of the cluster: jobs flow through it 
every day and utilisation is consistently high. Furthermore, we do not appear 
to be losing data in the database. I'm not a database expert, so I have no 
idea where to start with this. Our local db experts have taken a look and are 
nonplussed.


I wondered if anyone in the community had any ideas, please. As an aside, I've 
just started to experiment with v19*, and it would be nice to think that these 
deadlocks will simply go away in due course (following an eventual upgrade once 
that version is a bit more mature); however, that may not be the case.


Best regards,

David


[2019-06-19T00:00:02.728] error: mysql_query failed: 1213 Deadlock found when trying to get lock; try restarting transaction
insert into "i5_assoc_usage_hour_table"
.....


[2019-06-19T00:00:02.729] error: Couldn't add assoc hour rollup
[2019-06-19T00:00:02.729] error: Cluster i5 rollup failed
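
For anyone who wants to compare how often this happens on their own system, 
here is a minimal sketch of how the deadlocks and failed rollups could be 
tallied per day. It assumes the log format shown in the excerpt above; the 
function name tally_errors and the matching patterns are illustrative only, 
and other slurmdbd versions may word these messages differently.

```python
import re
from collections import Counter

# Assumption: every slurmdbd error line starts with "[YYYY-MM-DDThh:mm:ss.mmm] error: ",
# as in the excerpt above. Continuation lines (such as the SQL text) have no timestamp.
ERROR_LINE = re.compile(r"^\[(\d{4}-\d{2}-\d{2})T[\d:.]+\] error: (.*)")


def tally_errors(lines):
    """Count deadlock errors and failed cluster rollups per calendar day."""
    deadlocks = Counter()
    rollup_failures = Counter()
    for line in lines:
        match = ERROR_LINE.match(line)
        if not match:
            continue  # skip continuation lines and non-error entries
        day, message = match.group(1), match.group(2).strip()
        if "mysql_query failed: 1213" in message:
            deadlocks[day] += 1
        elif message.startswith("Cluster") and message.endswith("rollup failed"):
            rollup_failures[day] += 1
    return deadlocks, rollup_failures


# Sample input copied from the excerpt above (SQL statement shortened as in the original).
sample = [
    "[2019-06-19T00:00:02.728] error: mysql_query failed: 1213 Deadlock found when "
    "trying to get lock; try restarting transaction",
    'insert into "i5_assoc_usage_hour_table"',
    "[2019-06-19T00:00:02.729] error: Couldn't add assoc hour rollup",
    "[2019-06-19T00:00:02.729] error: Cluster i5 rollup failed",
]

deadlocks, rollup_failures = tally_errors(sample)
for day in sorted(deadlocks):
    print(day, "deadlocks:", deadlocks[day], "failed rollups:", rollup_failures[day])
```

To run it against a real log, pass an open file object instead of the sample 
list; the path depends on the LogFile setting in slurmdbd.conf.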
