[You can read a version of this with better formatting on my blog:
https://ehsanakhgari.org/blog/2017-09-21/quantum-flow-engineering-newsletter-25]
Hi everyone,
The Quantum Flow project started as a cross-functional effort to study and
fix the most serious performance issues of Firefox affecting real world
browsing use cases for the Firefox 57 release. Thanks to the hard work of
everyone who helped us along the way, we believe that we have managed to
fix a significant portion of the issues discovered in the past seven months
or so that this project has run and have managed to achieve the performance
goals that we had set for ourselves.
A Short Retrospective
Looking back at the past months, the Quantum Flow project went through
three different stages in terms of the type of issues we focused on (even
though these stages weren't consecutive in time). In the first stage, we
focused most of our energy on gathering information about performance
issues and planning work that would fit in the scope of Firefox 57. A lot
of time was spent doing profiling and measurements to gather evidence
around the most problematic areas of the code needing attention. We also
spent time thinking about what parts of our plans may or may not finish in
time for Firefox 57, and thought about which issues were the most urgent to
fix, and which ones could be deprioritized to future releases. In this
time we had a work week
<https://ehsanakhgari.org/blog/2017-04-07/quantum-flow-engineering-newsletter-4>
to consult various platform teams for help in various areas of their
expertise.
After the scope of all the work that was necessary to be done became more
clear, we knew that we needed help from other teams to oversee some aspects
of the ongoing work independently. We started to work with the Firefox
front-end team to ask for their help on reducing synchronous layout/style
flushes and scheduled timers in the UI code. That relationship grew with
their effort on reducing the browser start-up time (which has been a great
success so far!) and eventually with the Photon Performance project in
place, the amount of effort on the Quantum Flow side to keep track of the
UI performance was greatly reduced, which was a huge help. Another
successful example here was working with the layout team to improve reflow
performance <https://bugzilla.mozilla.org/show_bug.cgi?id=FastReflows>.
During our measurements we had seen a lot of reflow performance issues,
which were not easy for us to diagnose and analyze. The layout team was
really busy with a lot of ongoing work all along, but they did a great job
at keeping track of the issues we reported to them and fixing the ones which
were important. They also didn't stop there: a lot of great work happened
based on other independent investigations, which also benefited reflow
performance in these past few months.
The last stage, and perhaps the longest-running one, was a cross-functional
effort to fix the bugs we had discovered and prioritized, which I'm sure
you are all familiar with by now. There were some challenges to overcome
here. Perhaps the most obvious was the sheer number of bugs at hand. I
remember triage meetings with 50+ bugs to go through, and we had to be
careful not to miscategorize something or miss something important.
The other challenge was the amount of time we had and the number of people
who could help with fixing the bugs. The number of person-hours of help we
could get from each team depended on the existing workload of the team and
their other priorities, so sometimes it was hard to predict how much help
we could count on in a given area, especially in the earlier stages. As more
people got involved, we needed to do more communication to make sure things
kept moving forward, and nobody was blocked on something where we could
give help. Because of the number of bugs on our plate, we also always
needed more help! We continually thought about new ways of seeking help
from more people and teams. Due to the short amount of time that we had
before our final deadline for the development of Firefox 57, we had to
experiment with ideas to deal with these issues quickly, see what worked,
abandon what didn't, and rinse and repeat.
As of this writing, we have triaged 895 bugs in total
<https://bugzilla.mozilla.org/buglist.cgi?field0-0-0=status_whiteboard&list_id=13791792&query_format=advanced&type0-0-0=substring&value0-0-0=%5Bqf%3A&order=bug_id&limit=0>.
We used a three-tier priority scheme, and out of these, 277 P1 bugs
<https://bugzilla.mozilla.org/buglist.cgi?quicksearch=FIXED%20sw%3A%22[qf%3Ap1]%22>,
38 P2 bugs
<https://bugzilla.mozilla.org/buglist.cgi?quicksearch=FIXED%20sw%3A%22%5Bqf%3Ap2%5D%22>
and 54 P3 bugs
<https://bugzilla.mozilla.org/buglist.cgi?quicksearch=FIXED%20sw%3A%22%5Bqf%3Ap3%5D%22>
were fixed (as in, marked FIXED, not counting other resolutions such as
WONTFIX, DUPLICATE, etc.). We mass-moved all remaining open P1 bugs to P2
(because the definition of P1 was bugs that we want to fix for the 57
release), and we are now left with 141 P2 open bugs
<https://bugzilla.mozilla.org/buglist.cgi?quicksearch=OPEN%20sw%3A%22%5Bqf%3Ap2%5D%22>
and 133 open P3 bugs
<https://bugzilla.mozilla.org/buglist.cgi?quicksearch=OPEN%20sw%3A%22[qf%3Ap3]%22>
.
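For readers who want to reproduce these lists, here is a minimal sketch of
how the quicksearch links above appear to be put together, along with a
tally of the counts quoted in this section. The function name and the
variable names are made up for illustration; only the URL shape and the
numbers come from the text above.

```python
from urllib.parse import quote

BUGZILLA_BUGLIST = "https://bugzilla.mozilla.org/buglist.cgi"


def qf_quicksearch_url(state, priority):
    """Build a quicksearch link for bugs in `state` tagged [qf:<priority>].

    Hypothetical helper: it only mirrors the shape of the links in this
    newsletter (a state keyword followed by an `sw:` whiteboard search).
    """
    query = f'{state} sw:"[qf:{priority}]"'
    # safe="" percent-encodes every reserved character, brackets included.
    return f"{BUGZILLA_BUGLIST}?quicksearch={quote(query, safe='')}"


# Counts quoted in this newsletter.
triaged = 895
fixed = {"p1": 277, "p2": 38, "p3": 54}
still_open = {"p2": 141, "p3": 133}

total_fixed = sum(fixed.values())
total_open = sum(still_open.values())
# Bugs that are neither in the FIXED lists nor in the open P2/P3 lists.
not_covered = triaged - total_fixed - total_open

print(qf_quicksearch_url("FIXED", "p2"))
print(total_fixed, total_open, not_covered)
```

The tally only restates the figures above; it does not say anything about
how the bugs outside those lists were resolved.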
The Future of Quantum Flow
From the beginning of this project, it was clear that we were going to
discover more issues than we would have time to fix for Firefox 57, so the
fact that we have so many open bugs is a good thing. We tried to be
intentional in what we focus on first for Firefox 57, but we didn't expect
to stop there.
Now we have a large pool of existing bugs, some in progress, some in need
of owners to pick them up. But we also need more people to measure the
performance of the browser in their areas of expertise, and plan future
work to improve the existing issues. Going forward, we would like to start
experimenting with a different structure to continue the performance work
that Quantum Flow started. Instead of having a small team doing triage and
prioritization in a centralized fashion, we would like to distribute this
across the existing teams at Mozilla, to allow them to integrate the
performance work with their existing development work (if they don't
already!) and triage and fix these issues at their own pace.
So, instead of using the [qf] status whiteboard tag, we will use the
existing perf keyword
<https://bugzilla.mozilla.org/buglist.cgi?keywords=perf> in Bugzilla, and
when triaging, teams will use the normal Bugzilla Priority fields to assign
priority to the performance bugs. The existing set of open QF bugs will be
moved to use the perf keyword as well to unify how we handle these bugs.
Long term, we view performance as an aspect of quality very much like
stability and security. This means both running periodic projects to
improve on the solid foundation that we have built so far, and continually
monitoring and measuring the performance of the various aspects of the browser
to make sure regressions are caught and fixed in time, and new features are
developed with performance in mind from early on. And to do so, we need
to continue our investment in the tooling around performance, such as perfherder
<https://treeherder.mozilla.org/perf.html>, AWFY <https://arewefastyet.com/>,
Profiler <https://perf-html.io/>, arewesmoothyet
<https://arewesmoothyet.com/>, and perhaps more tools in the future.
Continuing the QF project into the future is an area of active planning. I
expect more details about the next steps for this project will be shared as
the plans continue to shape up.
On QF Newsletters
The idea of starting to write an ongoing newsletter about the Quantum Flow
project was first suggested to me by Bill McCloskey. Our first goal was to
find a way to increase the visibility into what the QF project is, and
highlight people's contributions to the effort, because we felt that, due to
the speed at which the project was spun up and its vast scope, it might be
difficult for a lot of people to understand in detail what actually happens
under the hood.
Over time, I tried to shed light on the most important aspects of the
work that happened in the project, and document the history of the
technical work that happened throughout the past seven months or so. At
its lowest level, Quantum Flow consisted of many performance bug fixes all
over the place. I think it was important to highlight that fact, but also
explain to some extent how and why some classes of bugs were investigated
in-depth. But it's also possible to not see the forest when looking at the
trees, so at times I also tried to highlight some of the ongoing higher
level efforts in addition to listing the individual fixes landing each week.
Another super important goal of mine was to give credit where it's due.
What we have accomplished for Quantum Flow (and Firefox 57) so far couldn't
have been accomplished without the help of many wonderful people in the
community. I often think that we need to take more opportunities to thank
people for the hard work that they are doing, and I took these newsletters
as my opportunity to do just that!
Also, I have always wanted to read newsletters like this about what
happens in projects and teams that I'm not actively involved in.
I'm extremely happy that there are now a
<https://wiki.mozilla.org/SecurityEngineering/Newsletter> number
<https://mozillagfx.wordpress.com/category/wr-newsletter/> of
<https://mozilla.github.io/firefox-browser-architecture/posts/2017-08-24-browser-architecture-newsletter-2.html>
newsletters <https://dolske.wordpress.com/category/photonupdate/> that
people have started writing recently, and I look forward to seeing even
more of them! Writing about what we work on is important: it helps share
knowledge across Mozilla and keeps people informed and engaged. I urge
more people to write about what they do, and I'll try to continue to do so
myself.
Before closing, it is time to give credit where it's due one last time in
this series. Firstly, I would like to thank Jan de Mooij, Mike Conley,
and Florian Quèze who helped me with collecting the credits section at the
end of the newsletters in the past few months! Also, I would like to
thank the following people who helped land the last performance
improvements for Firefox 57:
- Andrew McCreight enabled the single-compartment for all JSMs
<https://bugzilla.mozilla.org/show_bug.cgi?id=1381961>! This reduces the
memory overhead
<https://treeherder.mozilla.org/perf.html#/alerts?id=9496> incurred by
each JSM, and also side-steps the “compartment crossing” performance penalty
when JSMs access objects in one another. Here’s our ts_paint performance
test (time to first window paint in ms); the drop-off around September 16
was caused by this work!
[image: ts_paint graph around Sep 16 for Windows32/64, Linux64 and Mac]
<http://ehsanakhgari.org/wp-content/uploads/2017/09/Screenshot-from-2017-09-21-15-37-21.png>
- Jim Chen enabled the single-compartment JSMs for Android
<https://bugzilla.mozilla.org/show_bug.cgi?id=1400886>! This brought
some nice speed
<https://treeherder.mozilla.org/perf.html#/alerts?id=9542> and memory
usage <https://treeherder.mozilla.org/perf.html#/alerts?id=9586> wins on
Android, as on desktop above.
- Olli Pettay fixed a recent regression
<https://bugzilla.mozilla.org/show_bug.cgi?id=1398605> from one of my
patches <https://bugzilla.mozilla.org/show_bug.cgi?id=1392891> which was
caused by creating nsContentList objects too eagerly.
Cheers,
--
Ehsan