I have actually disabled the swap partition (!), because with swap enabled the system behaves so badly that, in my experience, I have to enter the room and reset the affected machine (!). Otherwise I have to wait a long time for it to get back to normal.
When I ssh to the node as root, ulimit -a reports unlimited virtual memory. So it seems that root has an unlimited value while regular users have a limited one.

Regards,
Mahmood

On Sun, Apr 15, 2018 at 10:26 PM, Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> wrote:
> Hi Mahmood,
>
> It seems your compute node is configured with this limit:
>
> virtual memory (kbytes, -v) 72089600
>
> So when the batch job tries to set a higher limit (ulimit -v 82089600) than
> permitted by the system (72089600), this must surely get rejected, as you
> have discovered!
>
> You may want to reconfigure your compute nodes' limits, for example by
> setting the virtual memory limit to "unlimited" in your configuration. If
> the nodes has a very small RAM memory + swap space size, you might encounter
> Out Of Memory errors...
>
> /Ole
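
For reference, here is a minimal sketch of the check discussed above, written with Python's standard resource module. It is not taken from Slurm; the helper names can_set_virtual_memory_limit and current_limits_kb are hypothetical. It assumes that "ulimit -v" corresponds to RLIMIT_AS and that an unprivileged process cannot raise a limit above its hard limit. Running it as root and as a regular user on the compute node should show whether the two really see different limits.

```python
import resource


def can_set_virtual_memory_limit(requested_kb, hard_limit_kb):
    """Return True if an unprivileged process could set `ulimit -v requested_kb`.

    Both values are in kbytes; None stands for "unlimited".
    Assumption: the request is rejected when it exceeds the hard limit.
    """
    if hard_limit_kb is None:
        # An unlimited hard limit permits any request.
        return True
    if requested_kb is None:
        # Asking for "unlimited" exceeds any finite hard limit.
        return False
    return requested_kb <= hard_limit_kb


def current_limits_kb():
    """Return (soft, hard) virtual memory limits in kbytes, None if unlimited."""
    def to_kb(value):
        # getrlimit reports bytes; ulimit -v reports kbytes.
        return None if value == resource.RLIM_INFINITY else value // 1024

    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return to_kb(soft), to_kb(hard)


if __name__ == "__main__":
    soft_kb, hard_kb = current_limits_kb()
    print("soft limit (kbytes):", soft_kb if soft_kb is not None else "unlimited")
    print("hard limit (kbytes):", hard_kb if hard_kb is not None else "unlimited")
    # 82089600 is the value the batch job tried to set in this thread.
    print("ulimit -v 82089600 allowed:",
          can_set_virtual_memory_limit(82089600, hard_kb))
```

With the numbers from this thread, a request of 82089600 against a hard limit of 72089600 is refused, while the same request under an unlimited hard limit (as root reports) is accepted.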