On Thu, Aug 30, 2018 at 4:46 PM, Nicholas Alexander <nalexan...@mozilla.com> wrote:
> Hi Sofia,
>
> On Fri, Aug 17, 2018 at 4:51 PM, Sofia Carrillo <secarrill...@gmail.com>
> wrote:
>
>> Hello all,
>>
>> My name is Sofia, and I was an intern this summer on the Build Systems
>> team. Some developers have shown interest in the work I was doing this
>> summer, so I wanted to share this with you in case you found the
>> information useful.
>>
>> One of the tools I worked on generates a report that can spit out the
>> most "expensive" files (those which change frequently and significantly
>> impact build times) of the tree. If you are interested in getting your
>> hands on this report, instructions can be found below.
>>
>> Though my internship has ended, please feel free to contact me with any
>> questions. I hope these tools are useful and can provide some insight! Also
>> feel free to contact either mshal or chmanchester in #build.
>>
>> Instructions to generate the report:
>> https://docs.google.com/presentation/d/1dhfMGhFbevkoOShjyhlBhpWyNT49MkpifHgq2292yR4/edit?usp=sharing
>>
>> My presentation slides (for more context of my project):
>> https://docs.google.com/presentation/d/1V0C6xKrHyTrJAFkSLGXdm3DpEfJqkjsV1VLcyp9QxgQ/edit?usp=sharing
>>
>
> This is really interesting work; thanks for doing it and distributing it!
>
> I have a question about the slide about "expensive files in the last 30
> days on m-c". I see `python/mozversioncontrol/mozversioncontrol/__init__.py`
> in the table. That's not an "expensive input" in my mind. Can you explain
> how that's entering the expense calculations?
>
>

The rebuild time ("Minutes" column in that chart) is calculated from the tup database by starting at the file listed, building out a subgraph of all downstream files, and summing up the runtime of all the commands in that subgraph. The "#Changes" field is the number of pushes that changed that file within the timeframe (I think this just comes from an hg pushlog query?), and those two numbers get multiplied to become the "Total".

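To make that calculation concrete, here is a rough sketch in Python. It is not the report's actual code and does not read the tup database; the graph layout, the runtimes, and all of the names (`downstream`, `rebuild_minutes`, `total_cost`, the `EXAMPLE_*` data) are made up for illustration.

```python
from collections import deque

def downstream(graph, start):
    """Return every node reachable from `start`, not counting `start` itself."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def rebuild_minutes(graph, runtime_s, start):
    """Sum the runtime of every command downstream of `start`, in minutes.

    File nodes have no entry in `runtime_s`, so they contribute zero.
    Each command is counted once even if several paths lead to it.
    """
    total_s = sum(runtime_s.get(node, 0.0) for node in downstream(graph, start))
    return total_s / 60.0

def total_cost(graph, runtime_s, start, num_changes):
    """The "Total" column: rebuild minutes multiplied by the number of pushes."""
    return rebuild_minutes(graph, runtime_s, start) * num_changes

# Hypothetical graph: edges go file -> command -> output file.
EXAMPLE_GRAPH = {
    "a.py": ["cmd_gen"],
    "cmd_gen": ["gen.h"],
    "gen.h": ["cmd_cc1", "cmd_cc2"],
    "cmd_cc1": ["x.o"],
    "cmd_cc2": ["y.o"],
    "x.o": ["cmd_link"],
    "y.o": ["cmd_link"],
    "cmd_link": ["lib"],
}
# Hypothetical command runtimes in seconds.
EXAMPLE_RUNTIME_S = {"cmd_gen": 30, "cmd_cc1": 60, "cmd_cc2": 90, "cmd_link": 120}

if __name__ == "__main__":
    print(rebuild_minutes(EXAMPLE_GRAPH, EXAMPLE_RUNTIME_S, "a.py"))
    print(total_cost(EXAMPLE_GRAPH, EXAMPLE_RUNTIME_S, "a.py", num_changes=4))
```

Because this only walks the static graph, a file near the root of the graph is charged for everything below it, whether or not those commands would really run.
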
This file looks very expensive in terms of build time because it is included by all GENERATED_FILES (via file_generate.py -> buildconfig.py -> mozbuild/base.py -> mozversioncontrol), and in turn virtually the entire rest of the DAG depends on at least one GENERATED_FILES. In practice changing this file is not quite as bad as it looks at first glance, since we can skip downstream commands of GENERATED_FILES when the outputs are unchanged. That is a runtime optimization though, which is something this report can't easily take into account.

Additionally, it seems that file changed a lot within that 30-day window because it happened to have a burst of activity then, but it's not among the worst offenders in the 60- or 180-day lists because it doesn't normally change that often. So we may find occasional spurious files like this when using shorter time windows.

By my count, changing mozversioncontrol/__init__.py *does* run about 260 commands after configure (various Python scripts, xpidl, and such) before the build system realizes it can skip the other ~3500 commands it was planning to do. So while that file is not as expensive as the report indicates, it is probably still worth investigating whether we can trim down the number of Python scripts that are consumed by file_generate.py, since any one of them will invalidate all GENERATED_FILES.

-Mike
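
For anyone unfamiliar with the "skip downstream commands when the outputs are unchanged" behavior mentioned above, here is a toy model in Python. It is only a sketch of the general idea, not how tup or the Mozilla build system implements it; `commands_run`, `make_commands`, and the file names are hypothetical.

```python
def commands_run(commands, old_outputs, changed_inputs):
    """Return the names of the commands that actually run.

    `commands` is a list of (name, inputs, output, produce) tuples in
    topological order, where `produce()` returns the new content of `output`.
    A command runs only if one of its inputs is dirty, and its output
    becomes dirty only if the content differs from the previous build.
    """
    dirty = set(changed_inputs)
    ran = []
    for name, inputs, output, produce in commands:
        if dirty.isdisjoint(inputs):
            continue  # no changed inputs, so the command is skipped
        ran.append(name)
        if produce() != old_outputs.get(output):
            dirty.add(output)  # content changed, so dependents must rerun
    return ran

def make_commands(gen_content):
    """Build a tiny hypothetical pipeline: generate -> compile -> link."""
    return [
        ("gen", ["script.py"], "gen.h", lambda: gen_content),
        ("compile", ["gen.h"], "x.o", lambda: "obj:" + gen_content),
        ("link", ["x.o"], "lib", lambda: "lib:" + gen_content),
    ]

# Output contents recorded by the previous (hypothetical) build.
OLD_OUTPUTS = {"gen.h": "v1", "x.o": "obj:v1", "lib": "lib:v1"}

if __name__ == "__main__":
    # script.py changed but regenerates identical output: only "gen" runs.
    print(commands_run(make_commands("v1"), OLD_OUTPUTS, {"script.py"}))
    # script.py changed and the generated header differs: everything reruns.
    print(commands_run(make_commands("v2"), OLD_OUTPUTS, {"script.py"}))
```

A static sum over the downstream subgraph would charge `script.py` for all three commands in both cases, which is the gap between the report's number and what a rebuild really costs.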
_______________________________________________
dev-builds mailing list
dev-builds@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-builds