Colleagues, I have a patch series that adds an opt-in --enable-node-environment configure flag and, when that flag is set, uses Node (via Webpack) to generate the Activity Stream content bundle. This patch series does not try to solve a few hard problems:
1) vendoring Node modules into the tree
2) installing $topobjdir/node_modules at build time efficiently.

There's a green artifact build of my prototype at
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d138f854139f2e389867b01d2f2afe59f2975783

I owe some folks (dmose, jlaster) details on what is needed in order to land this opt-in prototype and, more importantly, how to make the prototype not opt-in. To that end, I talked to most of the build peers (chmanchester, gps, mshal, ted) yesterday in YVR. The results of that discussion were not what I expected.

First, some context. The build system is investing heavily into capturing the full dependency DAG (Directed Acyclic Graph) in order to produce correct builds. The current build backends, and in particular the dominant RecursiveMake build backend, do not capture the full DAG. Capturing the full DAG is required to use "modern" build systems like Tup, Buck, or Bazel. Any sub-component of the build must therefore either correspond to edges in the DAG (these inputs, these outputs) or, if it does its own caching and invalidation, expose its internal DAG. In the current build system, C compiler invocations are the prototype of the first situation and cargo is the prototype of the second situation. What I did not know is that the build peers are contributing code to cargo to have it expose its internal DAG, and that all of the "modern" build systems (in particular Buck) need this functionality to integrate against cargo.

Second, my appraisal of the situation. Integrating Node will be very challenging. On the one hand, |yarn install| (or |npm install|) is, like cargo, in the second situation -- it is its own build system that does its own caching and invalidation. That means that to integrate into the build system it must expose its internal DAG.
It's possible that yarn could expose its own DAG, but Node modules can define arbitrary pre- and post-install scripts, which are essential to the module ecosystem. I can't imagine us being able to capture the "leaf DAG" of every installed module -- there are no rules out at the leaves.

On the other hand, the most general form of integration (which I have been pursuing) is to enable the build system to invoke arbitrary yarn verbs (like `GENERATED_FILES[...].script = 'yarn.py'; GENERATED_FILES[...].flags = 'run arbitrary_yarn_verb'`). Arbitrary yarn verbs are, well, arbitrary -- they could be simple, like C compiler invocations, or they could be build systems in their own right, like Webpack. For arbitrary yarn verbs, I don't think it's feasible to extract DAGs from the Node ecosystem tools involved.

Third, what is to be done. The build peers most invested in the transition to a "modern" build system (here, Tup) are chmanchester and mshal. They conclude that it is not possible to integrate build systems into each other without significant work exposing internal DAGs (which we are willing to do for cargo). They instead propose that build systems not integrate but run in serial. That is, the "Node bits" either run first (and provide inputs to the rest of the build system) or run second (and consume outputs from the rest of the build system). Of course, that arrangement sacrifices parallelism and throughput, but at least the final output will be correct.

This leads me to propose that we treat |yarn install| as a separate build system that runs before the main build system. It manages its own caching and invalidation, and produces $topobjdir/node_modules. |yarn install| is intended to efficiently determine that its output is up-to-date, so perhaps the overhead of running it every time we build will be acceptable. (Otherwise, we try to find ways to invoke it less frequently.)

We then have a choice.
We can either push _all_ Node invocations into the first build system and accept what I expect to be a big performance penalty in practice; or we can restrict the Node integration in the main build system to commands that we are confident are not their own build systems.

The former is fully general but will require non-trivial effort to implement in the build system, I expect -- perhaps a new build backend, specialized to Node, and some glue code in |mach build| to manage ordering the systems. In addition, such an arrangement could never allow Node bits to depend on regular build system bits, since the Node bits would always happen first. That might make some sense right now, since all of the Node projects we're integrating are stand-alone (usually on GitHub!), but as more of the core Firefox front-end functionality leverages Node that will look worse and worse. Even exposing AppConstants.jsm to Node could be fraught (if the actual contents are required, for example to tree shake on the basis of build flags).

The latter is restrictive -- for example, we might support only Rollup but not Webpack, since Rollup is more clearly inputs-to-outputs and Webpack is more focused on incremental builds -- and requires labour to audit and add support for new tools. However, it requires fewer up-front build system modifications and is easier to transition to gradually.

Fourth, my conclusion. I prefer working within the existing build system and invoking Node commands rather than arbitrary yarn verbs. The fast path to landing this as an opt-in therefore looks like:

- adding a new "node" build tier before "pre-export" that runs |yarn install|
- restricting to audited Node-consuming commands like |node webpack| and |node rollup| in the build system.

After that we can tackle vendoring Node modules into the tree, which does not appear to have anything fundamental blocking it.

Phew! That's a wall of text.
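The fast path listed above could be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: every name here (AUDITED_NODE_TOOLS, is_audited_node_command, plan_build, and the "main" label for the existing build system) is hypothetical and is not code from the patch series or from the build system.

```python
# Hypothetical sketch only: none of these names exist in the tree.

# Tools assumed to have been audited as plain inputs-to-outputs commands.
AUDITED_NODE_TOOLS = frozenset({"webpack", "rollup"})


def is_audited_node_command(argv):
    """Return True for commands of the form `node <audited tool> ...`."""
    return len(argv) >= 2 and argv[0] == "node" and argv[1] in AUDITED_NODE_TOOLS


def plan_build(node_commands):
    """Return ordered (phase, argv) steps.

    The "node" phase stands in for the proposed tier that runs before
    "pre-export"; "main" stands in for the existing build system.
    Anything that is not an audited Node command is rejected.
    """
    # yarn install always runs first and manages its own caching.
    steps = [("node", ["yarn", "install"])]
    for argv in node_commands:
        if not is_audited_node_command(argv):
            raise ValueError("not an audited Node command: " + " ".join(argv))
        steps.append(("main", list(argv)))
    return steps
```

Under these assumptions, plan_build([["node", "rollup", "-c"]]) puts |yarn install| ahead of the Rollup step, while a command such as ["yarn", "run", "some_verb"] is refused because it is an arbitrary yarn verb rather than an audited command.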
Please correct me if I'm misunderstanding things, or if my explanations need clarification. As I said, the results of this discussion were not what I expected, so this is mostly new to me :/

I'll wait to collect some feedback on this summary before trying to figure out next steps.

Yours,
Nick
_______________________________________________
dev-builds mailing list
dev-builds@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-builds