On Wed, May 9, 2018 at 11:01 AM, Ted Mielczarek <t...@mielczarek.org> wrote:
> On Wed, May 9, 2018, at 1:11 PM, L. David Baron wrote:
> > > mozregression won't be able to bisect into inbound branches then, but I
> > > believe we've always been expiring build artifacts created from integration
> > > branches after a few months in any case.
> > >
> > > My impression was that people use mozregression primarily for tracking down
> > > relatively recent regressions. Please correct me if I'm wrong.
> >
> > It's useful for tracking down regressions no matter how old the
> > regression is; I pretty regularly see mozregression finding useful
> > data on bugs that regressed multiple years ago.
>
> To be clear here--we still have an archive of nightly builds dating back
> to 2004, so you should be able to bisect to a single day using that. We
> haven't ever had a great policy for retaining individual CI builds like
> these tinderbox-builds. They're definitely useful, and storage is not that
> expensive, but given the number of build configurations we produce nowadays
> and the volume of changes being pushed we can't archive everything forever.

It's worth noting that once builds are deterministic, a build system is
effectively a highly advanced caching mechanism. Cache eviction is therefore
a tolerable problem: if the entry isn't in the cache, you just build again!
Artifact retention and expiration boil down to a trade-off between the cost
of storage and the convenience of accessing something immediately (as opposed
to waiting several dozen minutes to populate the cache).

The good news is that Linux Firefox builds have been effectively
deterministic (modulo PGO and some minor build details like the build time)
for several months now (thanks, glandium!). And moving to Clang on all
platforms will make it easier to achieve deterministic builds on other
platforms.

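To make the cache analogy concrete, here is a minimal sketch in Python. It
is not how our CI or any real build system is implemented. The names
input_digest, ArtifactCache and fake_build are hypothetical, and fake_build
is just a stand-in for a deterministic build step. The point is only that
when the output depends on nothing but the inputs, an expired artifact can
be recreated on demand.

```python
import hashlib
from typing import Callable, Dict


def input_digest(inputs: Dict[str, bytes]) -> str:
    """Hash every build input (name and content) into a single cache key."""
    h = hashlib.sha256()
    for name in sorted(inputs):
        # Length-prefix each field so distinct inputs cannot collide
        # simply by being concatenated differently.
        for field in (name.encode("utf-8"), inputs[name]):
            h.update(len(field).to_bytes(8, "big"))
            h.update(field)
    return h.hexdigest()


class ArtifactCache:
    """Toy artifact store: a lookup miss is handled by rebuilding."""

    def __init__(self, build: Callable[[Dict[str, bytes]], bytes]) -> None:
        self._build = build
        self._store: Dict[str, bytes] = {}
        self.rebuilds = 0  # counts how often we paid the build cost

    def get(self, inputs: Dict[str, bytes]) -> bytes:
        key = input_digest(inputs)
        if key not in self._store:
            # Cache miss (never built, or expired): build again.
            self.rebuilds += 1
            self._store[key] = self._build(inputs)
        return self._store[key]

    def expire(self, inputs: Dict[str, bytes]) -> None:
        """Simulate artifact expiration for one set of inputs."""
        self._store.pop(input_digest(inputs), None)


def fake_build(inputs: Dict[str, bytes]) -> bytes:
    # Hypothetical deterministic "build": output depends only on the inputs.
    return b"".join(inputs[name] for name in sorted(inputs))
```

Calling get(), then expire(), then get() again with the same inputs returns
identical bytes and only costs a second build. This only works while every
input is still available and the build step has no hidden dependencies.
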
The bad news is we still have many areas of CI that are not hermetic, and
attempts to retrigger Firefox build tasks in the future have a very high
possibility of failing for numerous reasons (e.g. some dependent task of the
build hits a 3rd party server that is no longer available or has deleted a
file). In other words, our CI results may not be reproducible in the future.
So if we delete an artifact, even though the build is deterministic, we may
not have all the inputs to reconstruct that result.

Making CI hermetic and reproducible far in the future is a hard problem.
There are esoteric failure scenarios like "what if we need to fetch content
from a server in 2030 but TLS 1.2 has been disabled due to a critical
vulnerability and code in the hermetic build task doesn't support TLS 1.3?"

In order to realistically achieve reproducible builds in the future, we need
to store *all* inputs somewhere reliable where they will always be
available. Version control is one possibility. A content-indexed service
like tooltool is another. (At Google, they check in the source code for
Clang, glibc, binutils, Linux, etc. into version control so all they need is
a revision and a bootstrap compiler (which I also suspect they check into
the monorepo) to rebuild the world from source.)

What I'm trying to say is we're making strides towards making builds
deterministic and reproducible far in the future. So hopefully in a few
years we won't need to be concerned about deleting old data because our
answer will be "we can easily reproduce it at any time."

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform