Hi,
Thanks again for all the suggestions.
It turns out that on our cluster we can't use cgroups because of the old kernel,
but setting
JobAcctGatherParams=UsePSS
resolved the problems.
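
For anyone finding this later, here is a rough sketch of why UsePSS makes a difference for a shared read-only mmap. RSS counts every resident shared page in full for each process, while PSS divides each shared page by the number of processes mapping it. The helper below (sum_smaps_fields is a hypothetical name, not part of Slurm) just sums the Rss and Pss fields from /proc/<pid>/smaps-style text on Linux. It is only an illustration and does not show how Slurm itself gathers the numbers.

```python
import os
import re


def sum_smaps_fields(smaps_text, fields=("Rss", "Pss")):
    """Sum the requested per-mapping fields (in kB) over smaps-format text."""
    totals = {name: 0 for name in fields}
    # Matches lines such as "Rss:                 4 kB".
    pattern = re.compile(r"^(\w+):\s+(\d+)\s+kB$")
    for line in smaps_text.splitlines():
        match = pattern.match(line.strip())
        if match and match.group(1) in totals:
            totals[match.group(1)] += int(match.group(2))
    return totals


if __name__ == "__main__":
    path = "/proc/self/smaps"  # Linux only; skipped on other systems
    if os.path.exists(path):
        with open(path) as handle:
            print(sum_smaps_fields(handle.read()))
```

If several processes map the same file, the summed Rss of each one includes the whole resident part of the file, while the summed Pss values should add up to roughly one copy across them.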
Regards,
Sergey
On Fri, 2019-01-11 at 10:37 +0200, Janne Blomqvist wrote:
> On 11/01/2019
Hi Janne,
On Fri, 2019-01-11 at 10:37 +0200, Janne Blomqvist wrote:
> On 11/01/2019 08.29, Sergey Koposov wrote:
> > What is your memory limit configuration in slurm? Anyway, a few things to
> > check:
I guess the most relevant (uncommented) params I could see in
slurm.conf are
Se
Sergey Koposov writes:
> The trick is that my code uses memory mapping (i.e. mmap) of one
> single large file (~12 Gb) in each thread on each node.
> With this technique, in the past, despite the fact that the file is
> (read-only) mmapped in, say, 16 threads, the actual memory footprint was
> still ~ 12 G
On 11/01/2019 08.29, Sergey Koposov wrote:
> Hi,
>
> I've recently migrated to Slurm from PBS on our cluster. Because of that, the
> job memory limits are now strictly enforced, and that causes my code to get
> killed.
> The trick is that my code uses memory mapping (i.e. mmap) of one single large