Re: [slurm-users] MariaDB lock problems for sacctmgr delete query

2018-02-14 Thread Jessica Nettelblad
FYI - SchedMD has now solved the issue in the master branch. https://github.com/SchedMD/slurm/commit/4a16541bf0e005e1984afd4201b97df482e269ee#diff-7649dde209b4e528e3ba8bb090b19f63 Best regards, Jessica Nettelblad, UPPMAX
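For context, a hypothetical illustration of the kind of sacctmgr delete query under discussion (the user and account names are made up, not taken from the thread):

    # Illustrative only: an association delete, which slurmdbd translates
    # into DELETE/UPDATE statements against the accounting database tables.
    sacctmgr delete user name=testuser account=testaccount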

Re: [slurm-users] slurmstepd: error: Exceeded job memory limit at some point.

2018-02-14 Thread John DeSantis
Geert, Considering the following response from Loris: > Maybe once in a while a simulation really does just use more memory > than you were expecting. Have a look at the output of > > sacct -j 123456 -o jobid,maxrss,state --units=M > > with the appropriate job ID. This can certainly hap
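The same check works per array task; a minimal sketch using the placeholder job ID from Loris's message:

    # Peak resident set size (MaxRSS) for every step of job 123456;
    # array tasks show up as 123456_100, 123456_101, and so on.
    sacct -j 123456 -o jobid,maxrss,state --units=M
    # A single array task can also be queried directly:
    sacct -j 123456_100 -o jobid,maxrss,state --units=M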

Re: [slurm-users] MariaDB lock problems for sacctmgr delete query

2018-02-14 Thread Bjørn-Helge Mevik
Thanks for the heads-up! We're currently running 17.02.7 with MariaDB and haven't seen this problem, but we are going to upgrade to 17.11 in the not-too-distant future. -- Cheers, Bjørn-Helge Mevik, dr. scient, Department for Research Computing, University of Oslo

Re: [slurm-users] slurmstepd: error: Exceeded job memory limit at some point.

2018-02-14 Thread Chris Bridson (NBI)
Also consider any cached information, e.g. NFS. You won't necessarily see this, but it might be getting accounted for in the cgroup, depending on your setup/settings.
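A minimal sketch of checking this from inside a running job, assuming cgroup v1 and Slurm's usual task/cgroup hierarchy (the exact path layout depends on your setup):

    # Assumed layout: memory controller at /sys/fs/cgroup/memory,
    # with Slurm jobs under slurm/uid_<uid>/job_<jobid>.
    CG="/sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}"
    # 'cache' counts file-backed pages (e.g. NFS page cache) charged to the
    # job's cgroup; 'rss' is anonymous memory. Both count toward the limit.
    grep -E '^(total_)?(cache|rss) ' "${CG}/memory.stat"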

Re: [slurm-users] slurmstepd: error: Exceeded job memory limit at some point.

2018-02-14 Thread Loris Bennett
Geert Kapteijns writes: > Hi everyone, > > I’m running into out-of-memory errors when I specify an array job. Needless > to say, 100M should be more than enough, and increasing the allocated memory > to 1G doesn't solve the problem. I call my script as follows: sbatch --array=100-199 run_batc

[slurm-users] slurmstepd: error: Exceeded job memory limit at some point.

2018-02-14 Thread Geert Kapteijns
Hi everyone, I’m running into out-of-memory errors when I specify an array job. Needless to say, 100M should be more than enough, and increasing the allocated memory to 1G doesn't solve the problem. I call my script as follows: sbatch --array=100-199 run_batch_job. run_batch_job contains #!/bin/e
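For reference, a minimal sketch of what run_batch_job might look like; the archive truncates the original script, so the body below, including the shebang, is an assumption for illustration only:

    #!/bin/bash
    # Hypothetical reconstruction, not the poster's actual script.
    #SBATCH --mem=100M                 # the 100M limit the thread discusses
    #SBATCH --output=slurm-%A_%a.out   # %A = array job ID, %a = array task ID
    # With sbatch --array=100-199, SLURM_ARRAY_TASK_ID runs from 100 to 199.
    srun ./simulation "${SLURM_ARRAY_TASK_ID}"   # './simulation' is a placeholder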