Hi Ole,
On 16/02/18 22:23, Ole Holm Nielsen wrote:
> Question: Is it safer to wait for 17.11.4 where the issue will
> presumably be solved?
I don't think the commit has been backported to 17.11.x to date.
It's in master (for 18.08) here:
commit 4a16541bf0e005e1984afd4201b97df482e269ee
Author: T
We're planning to upgrade Slurm 17.02 to 17.11 soon, so it's important
for us to test the slurmdbd and database upgrade before doing the actual
upgrade.
I've done a *successful* dry run of the database migration from 17.02 to
17.11 on an offlined compute node running CentOS 7
FYI - SchedMD has now solved the issue in the master branch.
https://github.com/SchedMD/slurm/commit/4a16541bf0e005e1984afd4201b97df482e269ee#diff-7649dde209b4e528e3ba8bb090b19f63
Best regards,
Jessica Nettelblad, UPPMAX
On Wed, Feb 14, 2018 at 3:31 PM, Bjørn-Helge Mevik wrote:
Thanks for the heads-up! We're currently running 17.02.7 with MariaDB,
and haven't seen this problem, but we are going to upgrade to 17.11 in
the not-too-far future.
--
Cheers,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
TL;DR: If you get a timeout from the Slurm database, and a longer time limit
in InnoDB doesn't help, you might want to consider loosening the lock mode
in MariaDB.
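
As a rough illustration of the two knobs mentioned in the summary above, here is a sketch of a MariaDB server config fragment. The summary does not name the variables, so both are assumptions: `innodb_lock_wait_timeout` for the InnoDB time limit and `innodb_autoinc_lock_mode` for the lock mode. The values are placeholders, not recommendations.

```ini
[mysqld]
# Assumed to be the "time limit in InnoDB": how many seconds a
# transaction waits for a row lock before giving up.
innodb_lock_wait_timeout = 900

# Assumed to be the "lock mode": 0 = traditional, 1 = consecutive,
# 2 = interleaved (the loosest of the three).
innodb_autoinc_lock_mode = 2
```

Note that `innodb_autoinc_lock_mode` can only be set at server startup, so changing it means restarting MariaDB, and mode 2 is not safe with statement-based replication.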
The long story!
So, we’ve just upgraded our main cluster to 17.11.3 and moved our database
to MariaDB. There have been some glitches and