Looking into this further, it appears that memory.max_usage_in_bytes and
memory.usage_in_bytes also count the file (page) cache, which is very
surprising and not useful for this purpose. The total_rss field in
memory.stat shows a more accurate number.
number. Looking at that one for a real job gives me around 30 GB, which
matches my other d
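
For anyone who wants to check the same thing, here is a minimal sketch of pulling total_rss out of memory.stat. It assumes cgroup v1, where memory.stat holds one "key value" pair per line. The function names and the sample values are made up for illustration, and the cgroup directory you pass in depends on how your site mounts cgroups.

```python
from pathlib import Path


def parse_memory_stat(text):
    """Parse memory.stat content ("key value" per line) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def read_total_rss(cgroup_dir):
    """Return total_rss in bytes for the memory cgroup directory cgroup_dir.

    cgroup_dir is a placeholder: pass the directory of the job's memory
    cgroup as it is mounted on your system.
    """
    text = (Path(cgroup_dir) / "memory.stat").read_text()
    return parse_memory_stat(text)["total_rss"]


# Made-up sample values, not output from a real job.
sample = "cache 1000\nrss 200\ntotal_cache 4096\ntotal_rss 8192\n"
stats = parse_memory_stat(sample)
print("total_rss:", stats["total_rss"])
print("total_cache:", stats["total_cache"])
```

Comparing total_rss against total_cache for a running job should show how much of the usage_in_bytes figure is file cache rather than the job's own memory.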
We are using cgroups to track resource usage of our jobs. The jobs are run
in docker with docker's --cgroup-parent flag pointing at the slurm job's
cgroup. This works great for limiting memory usage.
Unfortunately the maximum memory usage, maxRSS, is not accurately reported
in sacct. While the cgr