Skylar Thomson wrote:
>Unfortunately we don't have a mechanism to limit
>network usage or local scratch usage, but the former is becoming less of a
>problem with faster edge networking, and we have an opt-in bookkeeping mechanism
>for the latter that isn't enforced but works well enough to keep people happy.

That is interesting to me. At ASML I worked on setting up Quality of Service, i.e. bandwidth limits, for GPFS storage and MPI traffic. GPFS does have QoS limits built in, but these are intended to limit the background housekeeping tasks rather than user processes. Still, the concept is there. With MPI you can configure different QoS levels for different traffic.
More relevantly, I had a close discussion with Parav Pandit, who is working on the network QoS stuff. I am sure there is something more up to date than this:
https://www.openfabrics.org/images/eventpresos/2016presentations/115rdmacont.pdf

Sadly this RDMA stuff needs a recent 4-series kernel. I guess the discussion on whether or not you should go with a bleeding-edge kernel is for another time! But yes, cgroups have configurable network limits with the latest kernels.

Also, being cheeky, and I have probably mentioned them before, here is a plug for Ellexus:
https://www.ellexus.com/
Worth mentioning that I have no connection with them!
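
To make the cgroups remark a bit more concrete, here is a minimal sketch, assuming the cgroup v2 RDMA controller and its rdma.max file. As far as I understand it, that controller caps RDMA resources (HCA handles and objects) per device rather than bandwidth as such. The device name and the cgroup path below are hypothetical placeholders, and both helper functions are my own illustration, not something from the presentation above.

```python
from pathlib import Path


def format_rdma_limit(device: str, hca_handle="max", hca_object="max") -> str:
    """Build one line in the 'device key=value ...' form used by rdma.max."""
    return f"{device} hca_handle={hca_handle} hca_object={hca_object}"


def apply_rdma_limit(cgroup_dir: str, line: str) -> None:
    """Write the limit into <cgroup_dir>/rdma.max.

    Assumes cgroup v2 with the rdma controller enabled for this cgroup,
    and normally needs root.
    """
    Path(cgroup_dir, "rdma.max").write_text(line + "\n")


if __name__ == "__main__":
    # Hypothetical device name; substitute whatever your HCA is called.
    line = format_rdma_limit("mlx4_0", hca_handle=2, hca_object=2000)
    print(line)
    # Hypothetical cgroup path, left commented out so nothing is changed by accident:
    # apply_rdma_limit("/sys/fs/cgroup/hpc_job_1234", line)
```

Keeping the formatting separate from the write means you can check the line you are about to apply before touching a live system.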
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf