Excuse me, I am confused by that.
While the cgroup value is 68GB, when I run on the terminal I see the VSZ is
about 80GB and the program runs normally.
However, with slurm on that node, I cannot run it.
how much memory are you requesting from Slurm in your job?
Why can I run it on the terminal, but not via slurm?
the purpose of slurm is to allocate resources. logging into a node "bare"
is "evading" everything slurm does.
I wonder if slurm gets the right value from the kernel's cgroup.
you have it backwards. slurm creates a cgroup for the job (step)
and uses the cgroup control to tell the kernel how much memory to
permit the job-step to use.
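
To see which limit the kernel is actually enforcing, one can read the memory
limit of the cgroup a process sits in, once from a bare login shell and once
from inside a job step. The Python sketch below is one way to do that, not
anything slurm itself ships. It assumes cgroups are mounted at /sys/fs/cgroup
and relies on the kernel's standard file names (memory.limit_in_bytes for
cgroup v1, memory.max for cgroup v2); the function names are made up for
illustration.

```python
from pathlib import Path


def limit_file_candidates(proc_cgroup_text, root="/sys/fs/cgroup"):
    """Map the contents of /proc/<pid>/cgroup to memory-limit file paths.

    Assumption: cgroups are mounted under `root` (the usual location).
    """
    candidates = []
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)  # hierarchy-id : controllers : path
        if len(parts) != 3:
            continue
        _, controllers, path = parts
        rel = path.lstrip("/")
        if controllers == "":
            # cgroup v2 (unified hierarchy): the limit lives in memory.max
            candidates.append(Path(root) / rel / "memory.max")
        elif "memory" in controllers.split(","):
            # cgroup v1: the memory controller has its own hierarchy
            candidates.append(Path(root) / "memory" / rel / "memory.limit_in_bytes")
    return candidates


def read_limit(path):
    """Return the limit in bytes, or None if unreadable or unlimited (v2 'max').

    Note: cgroup v1 reports "no limit" as a very large number, not as 'max'.
    """
    try:
        text = Path(path).read_text().strip()
    except OSError:
        return None
    return None if text == "max" else int(text)


def current_limits(pid="self"):
    """Return {limit file: bytes or None} for the given process."""
    try:
        text = Path(f"/proc/{pid}/cgroup").read_text()
    except OSError:
        return {}
    return {str(f): read_limit(f) for f in limit_file_candidates(text)}
```

Printing current_limits() from a bare login shell and again from a script
launched through slurm should show the difference: inside the job, the limit
should reflect what the job requested, provided slurm is configured to
constrain memory through cgroups. Keep in mind that the cgroup limit is
charged against pages actually in use, not against virtual address space, so
it is not directly comparable to VSZ.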
I would like to solve the problem locally for blast, and I am not seeking a
system-wide solution right now.
there's nothing unique about your system or blast (which is extremely common
on many large slurm installs).
regards, mark hahn