Re: [slurm-users] Larger jobs tend to get starved out on our cluster

2019-01-09 Thread Loris Bennett
Hi David, If your maximum run-time is more than the 2 1/2 days (3600 minutes) you have set for bf_window, you might need to increase bf_window accordingly. See the description here: https://slurm.schedmd.com/sched_config.html Cheers, Loris Baker D.J. writes: > Hello, > > A colleague intima

[slurm-users] slurmd: error: Error binding slurm stream socket: Address already in use

2019-01-09 Thread Alseny Diallo
Hello, I am trying to install slurm in a small test cluster. Just after installation the nodes were up and running, but after rebooting the machines, the following error appears: slurmd: debug: switch NONE plugin loaded slurmd: error: Error binding slurm stream socket: Address already

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu values

2019-01-09 Thread Christopher Benjamin Coffey
Thanks... looks like the bug should get some attention now that a paying site is complaining: https://bugs.schedmd.com/show_bug.cgi?id=6332 Thanks Jurij! Best, Chris — Christopher Coffey High-Performance Computing Northern Arizona University 928-523-1167 On 1/9/19, 7:24 AM, "slurm-users on

[slurm-users] Larger jobs tend to get starved out on our cluster

2019-01-09 Thread Baker D.J.
Hello, A colleague intimated that he thought that larger jobs were tending to get starved out on our slurm cluster. It's not a busy time at the moment so it's difficult to test this properly. Back in November it was not completely unusual for a larger job to have to wait up to a week to start.

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-09 Thread Paddy Doyle
On Wed, Jan 09, 2019 at 12:44:03PM +0100, Bjørn-Helge Mevik wrote: > Paddy Doyle writes: > > > Looking back through the mailing list, it seems that from 2015 onwards the > > recommendation from Danny was to use 'jobacct_gather/linux' instead of > > 'jobacct_gather/cgroup'. I didn't pick up on th

Re: [slurm-users] [Slurm 18.08.4] sacct/seff Inaccurate usercpu on Job Arrays

2019-01-09 Thread Bjørn-Helge Mevik
Paddy Doyle writes: > Looking back through the mailing list, it seems that from 2015 onwards the > recommendation from Danny was to use 'jobacct_gather/linux' instead of > 'jobacct_gather/cgroup'. I didn't pick up on that properly, so we kept with > the cgroup version. > > Is anyone else still us