This is only somewhat relevant, but the scenario presents itself similarly. This is
not in a scheduler environment, but we have an interactive server that would
have ps hangs on certain tasks (top -bn1 is a way around that, BTW, if it’s
hard to even find out what the process is). For us, it appeared t
Is this parameter applied to each cgroup? Or just the system itself? Seems like
just the system itself.
—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
On 12/4/18, 10:13 AM, "slurm-users on behalf of Christopher Benjamin Coffey"
wrote:
Interesting! I'll have a look - thanks!
On 11/30/18, 1:41 AM, "slurm-users on behalf of John Hearns"
wrote:
Chris, I have delved deep into the OOM killer code and interaction with
cpusets in the past (*).
That experience is not really relevant!
However, I always recommend looking at this sysctl parameter:
min_free_kbytes
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/perform
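
For anyone who wants to check the current setting on a node, here is a minimal sketch (not from the thread). It assumes a Linux host where the sysctl is exposed at /proc/sys/vm/min_free_kbytes and where /proc/meminfo reports MemTotal; the helper names are made up for illustration. It only reads the value and shows it as a fraction of total memory, and it makes no claim about what a good value is.

```python
from pathlib import Path

# Standard procfs locations on Linux; adjust if your system differs.
MIN_FREE_PATH = Path("/proc/sys/vm/min_free_kbytes")
MEMINFO_PATH = Path("/proc/meminfo")


def parse_kbytes(text: str) -> int:
    """Parse the single integer (in kB) that the sysctl file contains."""
    return int(text.strip())


def parse_mem_total_kb(meminfo_text: str) -> int:
    """Extract the MemTotal value (in kB) from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found in meminfo text")


def reserve_fraction(min_free_kb: int, mem_total_kb: int) -> float:
    """Return min_free_kbytes as a fraction of total memory."""
    return min_free_kb / mem_total_kb


if __name__ == "__main__":
    # Only attempt the read where procfs actually provides both files.
    if MIN_FREE_PATH.exists() and MEMINFO_PATH.exists():
        min_free = parse_kbytes(MIN_FREE_PATH.read_text())
        total = parse_mem_total_kb(MEMINFO_PATH.read_text())
        pct = 100 * reserve_fraction(min_free, total)
        print(f"min_free_kbytes = {min_free} kB ({pct:.2f}% of MemTotal)")
    else:
        print("procfs files not found; is this a Linux host?")
```

The same number is available from the shell with `sysctl vm.min_free_kbytes`. Note that this is read from the system-wide vm sysctl tree, not from a per-cgroup file.
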
On 29-11-2018 19:27, Christopher Benjamin Coffey wrote:
Hi,
We've been noticing an issue with nodes from time to time that become "wedged",
or unusable. This is a state where ps and w hang. We've been looking into this
for a while, as time allows, and finally put some more effort into it
yesterday. We came across this blog which describes almost th