We're a Grid Engine shop, and we have the execd/shepherds place each job in
its own cgroup with CPU and memory limits. This lets our users make
efficient use of our HPC resources whether they're running single-slot
jobs or multi-node jobs. Unfortunately we don't have a mechanism to limit
Hi Chris,
We have looked at this _a_ _lot_ on Titan:
A Multi-faceted Approach to Job Placement for Improved Performance on
Extreme-Scale Systems
https://ieeexplore.ieee.org/document/7877165/
The issue we have is small jobs "inside" large jobs interfering with the
larger jobs. The item that is