Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-10 Thread Christopher Benjamin Coffey
We've attempted setting JobAcctGatherFrequency=task=0 and there is no change. We have settings: ProctrackType=proctrack/cgroup TaskPlugin=task/cgroup,task/affinity JobAcctGatherType=jobacct_gather/cgroup Odd ... wonder why we don't see it help. Here is how we verify: === #!/bin/bash #SBATCH --

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-09 Thread Paddy Doyle
On Wed, Jan 09, 2019 at 12:44:03PM +0100, Bjørn-Helge Mevik wrote: > Paddy Doyle writes: > > > Looking back through the mailing list, it seems that from 2015 onwards the > > recommendation from Danny was to use 'jobacct_gather/linux' instead of > > 'jobacct_gather/cgroup'. I didn't pick up on th

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-09 Thread Bjørn-Helge Mevik
Paddy Doyle writes: > Looking back through the mailing list, it seems that from 2015 onwards the > recommendation from Danny was to use 'jobacct_gather/linux' instead of > 'jobacct_gather/cgroup'. I didn't pick up on that properly, so we kept with > the cgroup version. > > Is anyone else still us

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-08 Thread Christopher Benjamin Coffey
"Looking back through the mailing list, it seems that from 2015 onwards the recommendation from Danny was to use 'jobacct_gather/linux' instead of 'jobacct_gather/cgroup'. I didn't pick up on that properly, so we kept with the cgroup version." Ahh, hmm, I need to dig up that recommenda

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-08 Thread Paddy Doyle
A small addition: I forgot to mention our JobAcct params: JobAcctGatherFrequency=task=30 JobAcctGatherType=jobacct_gather/cgroup I've done a small bit of playing around on a test cluster. Changing to 'JobAcctGatherFrequency=0' (i.e. only gather at job end) seems to then give correct values for th

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-04 Thread Christopher Benjamin Coffey
Actually we double checked and are seeing it in normal jobs too. — Christopher Coffey High-Performance Computing Northern Arizona University 928-523-1167 On 1/4/19, 9:24 AM, "slurm-users on behalf of Paddy Doyle" wrote: Hi Chris, We're seeing it on 18.08.3, so I was hoping that

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-04 Thread Paddy Doyle
Hi Chris, We're seeing it on 18.08.3, so I was hoping that it was fixed in 18.08.4 (recently upgraded from 17.02 to 18.08.3). Note that we're seeing it in regular jobs (haven't tested job arrays). I think it's cgroups-related; there's a similar bug here: https://bugs.schedmd.com/show_bug.cgi?id=

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-04 Thread Christopher Benjamin Coffey
I'm surprised no one else is seeing this issue. If you have 18.08, could you take a moment and run jobeff on a job in one of your users' job arrays? I'm guessing jobeff will show the same issue we are seeing. The issue is that usercpu is incorrect, and off by many orders of magnitude. B

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2018-12-21 Thread Christopher Benjamin Coffey
So this issue is occurring only with job arrays. — Christopher Coffey High-Performance Computing Northern Arizona University 928-523-1167 On 12/21/18, 12:15 PM, "slurm-users on behalf of Chance Bryce Carl Nelson" wrote: Hi folks, calling sacct with the usercpu flag enable

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2018-12-21 Thread Sam Hawarden
lson Sent: Saturday, 22 December 2018 08:11 To: slurm-users@lists.schedmd.com Subject: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays Hi folks, calling sacct with the usercpu flag enabled seems to provide cpu times far above expected values for job array indices. Th

[slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2018-12-21 Thread Chance Bryce Carl Nelson
Hi folks, calling sacct with the usercpu flag enabled seems to provide cpu times far above expected values for job array indices. This is also reported by seff. For example, executing the following job script: #!/bin/bash #SBATCH --job-name
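A rough way to flag the kind of inflated values described in this thread is to compare the reported UserCPU against the most CPU time the job could have used, namely Elapsed multiplied by the allocated CPU count. The sketch below is only an illustration, not something posted in the thread: the helper names are hypothetical, it assumes the time strings look like [DD-]HH:MM:SS or MM:SS.fff, and the sample values in the usage note are made up.

```python
def parse_slurm_time(text):
    """Convert a time string such as '1-02:03:04', '02:03:04' or '03:04.567' to seconds.

    Assumption: the field uses the [DD-]HH:MM:SS or MM:SS[.fff] layout.
    """
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in text.split(":")]
    # Pad missing leading fields (hours, or hours and minutes) with zeros.
    while len(parts) < 3:
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def usercpu_is_plausible(usercpu, elapsed, alloc_cpus):
    """Return True if UserCPU does not exceed Elapsed * AllocCPUS.

    Hypothetical helper: user CPU time cannot be larger than the wall-clock
    time multiplied by the number of allocated CPUs, so a False result
    suggests the accounting value is wrong.
    """
    return parse_slurm_time(usercpu) <= parse_slurm_time(elapsed) * alloc_cpus
```

For example, with made-up values, usercpu_is_plausible("00:10:00", "00:05:00", 4) passes the check, while usercpu_is_plausible("5-00:00:00", "00:05:00", 4) fails it, which is the pattern reported here.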