On 4/3/2013 6:33 PM, jmaher wrote:
I looked at the data used to calculate the offenders, and I found:
job type           total jobs   total duration (s)   total hours
try builders             3525             12239477   3399.85472222
try testers             71821            121294315   33692.8652778
inbound builders         7862             30877533   8577.0925
inbound testers        121641            182883638   50801.0105556
other builders          14690             26990702   7497.41722222
other testers           75170            111729324   31035.9233333
totals                 294709          486014989.0   135004.163611
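
As a rough cross-check of the table above, here is a minimal sketch that recomputes the hours and the average hours per job for each row. It assumes the duration column is in seconds (dividing by 3600 reproduces the hours column); the names WEEK and hours_per_job are made up for illustration.

```python
# Weekly totals copied from the table above: (total jobs, total duration).
# Assumption: duration is in seconds, since duration / 3600 matches the hours column.
WEEK = {
    "try builders": (3525, 12239477),
    "try testers": (71821, 121294315),
    "inbound builders": (7862, 30877533),
    "inbound testers": (121641, 182883638),
    "other builders": (14690, 26990702),
    "other testers": (75170, 111729324),
}


def hours_per_job(jobs, seconds):
    """Average wall-clock hours consumed by one job in a category."""
    return seconds / 3600 / jobs


for name, (jobs, seconds) in WEEK.items():
    avg = hours_per_job(jobs, seconds)
    print(f"{name:17s} {seconds / 3600:10.1f} hours  {avg:.2f} hours/job")

total_jobs = sum(jobs for jobs, _ in WEEK.values())
total_hours = sum(seconds for _, seconds in WEEK.values()) / 3600
print(f"{'totals':17s} {total_hours:10.1f} hours  {total_jobs} jobs")
```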
The sheriffs, releng, and I have been talking about this problem for
the last month or two, knowing that we were running way in the red. We
have a bunch of possible solutions, but we hadn't yet crunched the
numbers to see which one is best going forward. The best solution is
certainly going to be some combination of process changes and
technical optimizations. But what to focus on, and when, is
the million-dollar question.
Joel and I did some calculations:
* 200 pushes/day[1]
* 325 test jobs/push
* 25 builds/push
* .41 hours/test (on average, from above numbers)
* 1.1 hours/build (on average, based on try values from above)
Then you can approximate what the load of Kat's suggestion would look
like:
  200 pushes/day * ((325 tests/push * .41 hrs/test) + (25 builds/push * 1.1 hrs/build))
  = 32150 hrs/day
So we need 32150 compute hours per day to keep up.
Looking at the totals above for the week of data that gps provided,
we are currently running at:
  135004 hours/week / 7 days = 19286 compute hours/day
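
For anyone who wants to plug in different inputs, here is a minimal sketch of the same back-of-the-napkin estimate. The constants are the figures quoted in this mail; the function and variable names are made up for illustration.

```python
# Inputs taken from this mail; adjust them to try other scenarios.
PUSHES_PER_DAY = 200
TEST_JOBS_PER_PUSH = 325
BUILDS_PER_PUSH = 25
HOURS_PER_TEST = 0.41
HOURS_PER_BUILD = 1.1
WEEKLY_HOURS = 135004.163611  # total hours from the week of data


def required_hours_per_day(pushes, tests, builds, hrs_test, hrs_build):
    """Compute hours per day needed if every push runs every build and test."""
    return pushes * (tests * hrs_test + builds * hrs_build)


required = required_hours_per_day(
    PUSHES_PER_DAY, TEST_JOBS_PER_PUSH, BUILDS_PER_PUSH,
    HOURS_PER_TEST, HOURS_PER_BUILD,
)
current = WEEKLY_HOURS / 7  # compute hours/day we are running today

print(f"required: {required:.0f} hrs/day")
print(f"current:  {current:.0f} hrs/day")
print(f"ratio:    {required / current:.2f}x")
```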
So, while I really like Kat's proposal from a sheriffing standpoint, my
back-of-the-napkin math here makes me worry that our current
infrastructure can't support it.
The only way I see to do something like this approach would be to batch
patches together in order to reduce the number of pushes, or to run
tests intermittently, as both dbaron and jmaher mentioned.
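
To put a rough number on the batching idea, here is a sketch that turns the estimate around and asks how many full pushes per day the current capacity could absorb. It reuses the figures from this mail and assumes every push still runs all 325 test jobs and 25 builds; the names are made up for illustration.

```python
# Figures from this mail.
PUSHES_PER_DAY = 200
WEEKLY_HOURS = 135004.163611           # total hours from the week of data
HOURS_PER_PUSH = 325 * 0.41 + 25 * 1.1  # tests plus builds for one full push

current_hours_per_day = WEEKLY_HOURS / 7

# Pushes per day that today's capacity could run in full.
sustainable_pushes = current_hours_per_day / HOURS_PER_PUSH

# Average number of patches that would have to share one push
# to fit 200 pushes' worth of patches into that budget.
batch_factor = PUSHES_PER_DAY / sustainable_pushes

print(f"hours per full push:      {HOURS_PER_PUSH:.2f}")
print(f"sustainable pushes/day:   {sustainable_pushes:.0f}")
print(f"patches per batched push: {batch_factor:.2f}")
```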
We need to figure out what we can do to make this better--it's a
terrible problem. We're running way in the red, and we do need to see
what we can do to be more efficient with the infrastructure we've got
while at the same time finding ways to expand it. There are expansion
efforts underway, but expanding the physical infrastructure is not a
short-term project. This is on our goals (both releng and ateam) for Q2,
and any help crunching numbers to understand our most effective path
forward is appreciated.
Clint
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform