Re: [slurm-users] Slurm forgetting about job dependencies

2018-12-19 Thread Eli V
Looking through the slurm.conf docs and grepping around the source code, it looks like MinJobAge might be what I need to adjust. I changed it by 3 orders of magnitude, 300 -> 300_000, on our dev cluster. I'll see how things go. On Wed, Dec 19, 2018 at 1:14 PM Eli V wrote: > > Does slurm remove job c
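
For anyone following along, the change described above would look roughly like the fragment below. This is only a sketch: the value is the one quoted in the message, MinJobAge is in seconds, and the rest of your slurm.conf is assumed to stay as it is. Whether this actually cures the dependency problem is still untested at this point in the thread.

```
# slurm.conf (fragment, illustrative only)
# MinJobAge: minimum age, in seconds, of a completed job before
# slurmctld purges its record from memory. The default is 300.
# Value below is the one from the message; pick one that fits your
# workload, since keeping more job records costs slurmctld memory.
MinJobAge=300000
```

After editing, the running value can usually be checked with `scontrol show config | grep MinJobAge`. The assumption here is that the controller has re-read its configuration first (for example via `scontrol reconfigure`).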

[slurm-users] Slurm forgetting about job dependencies

2018-12-19 Thread Eli V
Does slurm remove job completion info from its memory after a while? That might explain why I'm seeing jobs getting cancelled when their predecessor step finished OK. Below is the egrep '352209(1|2)_11' from slurmctld.log. The 3522092 job array was created with -d aftercorr:3522091. Looks li