Yes, I saw the same issue. The default for an unset DefMemPerCPU changed
from unlimited in earlier versions to 0. I just set it to 384 (MB) in
slurm.conf so that simple jobs run fine, and I make sure users always set
a sane value on submission.
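
For reference, a minimal sketch of that setup. The 384 is only the value I
happened to pick, and the 2048 and job.sh in the submission line are
placeholders for illustration, not recommendations. Both values are in
megabytes.

```
# slurm.conf: default memory per allocated CPU, in MB
DefMemPerCPU=384

# At submission, users request what the job actually needs (MB per CPU).
# job.sh is a hypothetical batch script name.
sbatch --mem-per-cpu=2048 job.sh
```

Jobs that do not request memory explicitly should then get the DefMemPerCPU
value for each allocated CPU rather than the tiny limit seen in the error.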

On Mon, Jun 11, 2018 at 6:40 PM, Roberts, John E. <jerobe...@anl.gov> wrote:
> I see this in the debug logs:
> "memory per node set to 1M in partition bdwall"
>
> I seemingly can alleviate this if I set RealMemory=foo in the Node 
> definitions, but this just seems like something that shouldn't be necessary.
> Did this become a required field after 16.05??
>
> Thanks!
> John
>
> On 6/11/18, 4:12 PM, "Roberts, John E." <jerobe...@anl.gov> wrote:
>
>     Nothing that I'd assume is incorrect:
>
>     DefMemPerNode           = UNLIMITED
>     MaxMemPerNode           = UNLIMITED
>     MemLimitEnforce         = Yes
>     PropagateResourceLimitsExcept = MEMLOCK
>
>     CPU vars aren't set and never were.
>
>     Thanks!
>     John
>
>     On 6/11/18, 4:09 PM, "slurm-users on behalf of Renfro, Michael" <slurm-users-boun...@lists.schedmd.com on behalf of ren...@tntech.edu> wrote:
>
>         Anything in particular set for DefMemPerCPU in your slurm.conf?
>
>         > On Jun 11, 2018, at 3:50 PM, Roberts, John E. <jerobe...@anl.gov> wrote:
>         >
>         > Hi,
>         >
>         >    Seeing this after an upgrade today. I now can't get any jobs to run. Things were fine before the upgrade. Any ideas?
>         >
>         >    slurmstepd: error: Job 535721 exceeded memory limit (1160 > 1024), being killed
>         >    slurmstepd: error: Exceeded job memory limit