David Robinson created AURORA-1181:
--------------------------------------
Summary: optimize host_drain to speed up maintenance
Key: AURORA-1181
URL: https://issues.apache.org/jira/browse/AURORA-1181
Project: Aurora
Issue Type: Task
Components: Maintenance
Reporter: David Robinson
Priority: Minor
Aurora's maintenance primitives, whilst great, can be frustrating to use on
large clusters, primarily because draining hosts is slow. The host_drain
feature does accept a grouping function that can be used to drain hosts in
batches, but for large clusters we typically don't want to divide the cluster
into arbitrary groups/batches. We would prefer instead to drain everything
that was requested, where possible, without violating the SLA.
For example, with 100 hosts in need of maintenance, each running 1 task (of
many) from 100 different jobs, all 100 hosts could be drained simultaneously
without violating the SLA.
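
A minimal sketch of the idea, not Aurora's implementation: given the tasks
running on each requested host, greedily pick every host whose drain would
still leave each affected job at or above its required uptime. The function
name select_drainable, its parameters, and the 95% default are all
hypothetical and chosen for illustration only.

```python
from collections import Counter


def select_drainable(host_jobs, job_sizes, min_up_percent=95):
    """Pick hosts that could be drained at the same time (hypothetical helper).

    host_jobs: dict of host -> list of job keys, one entry per task on the host
    job_sizes: dict of job key -> total number of running tasks in that job
    min_up_percent: assumed SLA, the percentage of each job's tasks that
        must stay up while the selected hosts are drained
    """
    down = Counter()  # tasks per job that the selected hosts would take down
    selected = []
    for host in sorted(host_jobs):
        on_host = Counter(host_jobs[host])
        # Integer arithmetic avoids floating-point rounding at the boundary.
        within_sla = all(
            (job_sizes[job] - down[job] - count) * 100
            >= min_up_percent * job_sizes[job]
            for job, count in on_host.items()
        )
        if within_sla:
            down.update(on_host)
            selected.append(host)
    return selected


# The scenario from the description, assuming each job has 20 tasks in total:
# every host holds one task of a different job, so all hosts are selected.
hosts = {"host%03d" % i: ["job%03d" % i] for i in range(100)}
sizes = {"job%03d" % i: 20 for i in range(100)}
drainable = select_drainable(hosts, sizes)
```

Hosts left out of the result would have to wait for a later pass, once the
tasks from the first set have been rescheduled elsewhere. The greedy order
used here is only one possible choice and does not guarantee the largest
possible set.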