date:20180609

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-09 Thread Skylar Thompson

We're a Grid Engine shop, and we have the execd/shepherds place each job in its own cgroup with CPU and memory limits in place. This lets our users make efficient use of our HPC resources whether they're running single-slot jobs or multi-node jobs. Unfortunately we don't have a mechanism to limit

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-09 Thread Scott Atchley

Hi Chris, We have looked at this _a_ _lot_ on Titan: "A Multi-faceted Approach to Job Placement for Improved Performance on Extreme-Scale Systems" https://ieeexplore.ieee.org/document/7877165/ The issue we have is small jobs "inside" large jobs interfering with the larger jobs. The item that is