The man page for sbatch says this about the --nice option:
--nice[=adjustment]
>Run the job with an adjusted scheduling priority within Slurm. With no
>adjustment value the scheduling priority is decreased by 100. A
>negative nice value increases the priority, otherwise decreases it.
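For illustration only, a hedged example of how that option might appear in a batch script (the adjustment of 200 and the job/executable names are made up, not from this thread):

#!/bin/bash
#SBATCH --job-name=lowprio_test    # hypothetical job name
#SBATCH --nice=200                 # lower this job's scheduling priority by 200
# Only privileged users may pass a negative value to raise priority.
srun ./my_application              # hypothetical executable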
Specifying --mem to Slurm only tells it to find a node that has that much memory,
not to enforce a limit, as far as I know. That node has that much, so Slurm finds it.
You probably want to enable UsePAM and set up the pam.d slurm files and
/etc/security/limits.conf to keep users under the 64000MB of physical memory.
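As a rough sketch of what that could look like (the values below are illustrative assumptions for a 64GB node, not settings taken from this thread), the idea being that pam_limits.so applies the limits.conf entries when the job is launched:

# in slurm.conf
UsePAM=1

# in /etc/security/limits.conf: cap every user's address space at ~62GB
# (value is in KB and deliberately leaves some headroom for the OS)
*    hard    as    65000000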
Bill,
The thing is that both the user and root see unlimited virtual memory when
they ssh directly to the node. However, when a job is submitted, the
user's limits change. That means Slurm modifies something.
The script is
#SBATCH --job-name=hvacSteadyFoam
#SBATCH --output=hvacSteadyFoam.log
#SBATCH --n
Mahmood, sorry to presume. I meant to address the root user and your ssh to the
node in your example.
At our site, we use UsePAM=1 in our slurm.conf, and our /etc/pam.d/slurm and
slurm.pam files both contain pam_limits.so, so it could be that way for you,
too. I.e., Slurm could be setting the limits via pam_limits.so.
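For reference, an /etc/pam.d/slurm that pulls in pam_limits might look roughly like this (an illustrative sketch only; check your own file rather than copying it):

auth      required   pam_localuser.so
account   required   pam_unix.so
session   required   pam_limits.so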
Excuse me... I think the problem is not pam.d.
How do you interpret the following output?
[hamid@rocks7 case1_source2]$ sbatch slurm_script.sh
Submitted batch job 53
[hamid@rocks7 case1_source2]$ tail -f hvacSteadyFoam.log
max memory size (kbytes, -m) 65536000
open files
BTW, the memory size of the node is 64GB.
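One way to see exactly which limits a job really runs with (as opposed to what an interactive ssh session shows) is to let Slurm report them itself; for example, with an arbitrary log file name:

$ sbatch --wrap="ulimit -a" --output=job_limits.log
$ cat job_limits.log        # once the job has finished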
Regards,
Mahmood
On Sun, Apr 15, 2018 at 10:56 PM, Mahmood Naderan wrote:
> I actually have disabled the swap partition (!) since the system goes
> really bad and based on my experience I have to enter the room and
> reset the affected machine (!). Oth
Are you using pam_limits.so in any of your /etc/pam.d/ configuration files?
That would be enforcing /etc/security/limits.conf for all users; those limits are
usually unlimited for root. Root is almost always allowed to do things bad enough
to crash the machine or run it out of resources. If the /etc/pam.d
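A quick, generic way to check:

# which PAM stacks load pam_limits.so
grep -r pam_limits.so /etc/pam.d/
# any non-comment entries in limits.conf that would then be enforced
grep -v '^#' /etc/security/limits.conf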
I have actually disabled the swap partition (!) since the system slows down so
badly that, based on my experience, I have to enter the room and
reset the affected machine (!). Otherwise I have to wait a long time
for it to get back to normal.
When I ssh to the node as the root user, ulimit -a says unlimited.
Hi Mahmood,
It seems your compute node is configured with this limit:
virtual memory (kbytes, -v) 72089600
So when the batch job tries to set a higher limit (ulimit -v 82089600)
than permitted by the system (72089600), this must surely get rejected,
as you have discovered!
You may
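The underlying rule (general POSIX behaviour, not specific to Slurm) is that an unprivileged process may lower its limits, or raise the soft limit up to the hard limit, but never above it. You can see both values from inside a job, for example:

ulimit -H -v            # hard limit; 72089600 on this node
ulimit -S -v            # current soft limit
ulimit -S -v 70000000   # allowed: stays at or below the hard limit
ulimit -v 82089600      # fails: would exceed the hard limit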
Hi,
The user can run "ulimit -v VALUE" on the frontend. However, when I
put that command in a slurm script, it says the operation is not
permitted!
[hamid@rocks7 case1_source2]$ ulimit -v 82089600
[hamid@rocks7 case1_source2]$ cat slurm_script.sh
#!/bin/bash
#SBATCH --job-name=hvacSteadyFoam
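If the goal is simply to run with as much virtual memory as the node permits, a hedged sketch (reusing the job name from the script above; the rest is an assumption, not the original script) would read the hard limit and raise the soft limit only up to it:

#!/bin/bash
#SBATCH --job-name=hvacSteadyFoam
#SBATCH --output=hvacSteadyFoam.log
hard=$(ulimit -H -v)     # hard cap imposed on the node (72089600 here)
ulimit -S -v "$hard"     # raise the soft limit only up to that cap
ulimit -v                # confirm what the job actually got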