On Fri, Aug 5, 2016 at 9:52 AM, Boris Zbarsky <bzbar...@mit.edu> wrote:

> On 8/5/16 12:20 AM, Gregory Szorc wrote:
>
>> If someone could make WebIDL or IPDL processing faster, that would help
>> people with high-core-count machines, including distributed compilation
>> environments. I believe WebIDL is the longer pole.
>>
>
> Just to double-check that we're talking about the same numbers, what I see
> on my laptop is that WebIDL generation (mach build dom/bindings/export)
> takes 15s wall-clock time or so.  That's with a hot disk cache, though.
>

> Is that comparable to the numbers you're seeing?
>

It takes ~10s on my i7-6700K.


>
> I just checked how this number breaks down over here, and it looks about
> like this, if I didn't mess up my timings:
>
> 1) 3.2s parsing the WebIDL files and building the data model.
> 2) 0.6s writing out the global WebIDL files (RegisterBindings, etc.).
> 3) 10s writing out the binding files.
> 4) 1s outside the generate_build_files() function somewhere, which means
> it's not really WebIDL processing.
>
> Parsing and building the data model is a pretty serial process.  We could
> try to parallelize it a bit if we made some significant changes to the
> parser, I guess, and parsed into multiple data models that we then merge
> together.  I don't have a good feel for how expensive this merging step
> would be.
>
> The writing out of things is, in general, an embarrassingly parallel
> task.  The reason we don't do it in parallel right now is that all the
> parallel bits need access to the output of the parser.  In the past we
> pickled this output and then deserialized it, but this turned out to have
> significant overhead, so we removed it.
>
> If there are ways we can parallelize this bit without ending up with too
> much overhead for getting the data model to all the parallel pieces, we
> might get some nice wins here.  The obvious thing with just spinning up
> multiple threads is unlikely to win due to the Python GIL, but it's worth
> at least checking, I guess.  If we use multiple processes instead, we need
> to find a quick-ish way to ship the parser output to them.


The overhead of serialization/deserialization plus process startup stands
a strong chance of canceling out any performance wins in Python.
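
For what it's worth, it would be cheap to measure what shipping the parser
output to worker processes actually costs before committing to that route.
A minimal sketch, assuming the parsed data model is a picklable object
(data_model here is just a stand-in, not the real dom/bindings object):

import pickle
import time

def measure_pickle_roundtrip(data_model):
    # data_model stands in for whatever object the WebIDL parser produces;
    # time a full serialize/deserialize round trip of it.
    start = time.time()
    blob = pickle.dumps(data_model, protocol=pickle.HIGHEST_PROTOCOL)
    mid = time.time()
    pickle.loads(blob)
    end = time.time()
    print("serialize: %.2fs  deserialize: %.2fs  size: %d bytes"
          % (mid - start, end - mid, len(blob)))

If that round trip is a meaningful fraction of the ~10s spent writing the
binding files, multiprocessing probably isn't worth it.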

Despite the GIL, Python can see nice performance wins when using multiple
threads for I/O-bound tasks (including network operations). If all the
parallel work is writing files, try plugging in a
concurrent.futures.ThreadPoolExecutor to do the file writing. Search the
tree for "ThreadPoolExecutor" to find example usage.

The build peers have also talked about introducing Rust tooling for the
build system. This WebIDL performance problem seems like a great use case
for Rust. However, I imagine rewriting the WebIDL parser in Rust could be
quite involved. If that's something someone wants to do, I don't think the
build peers will object too loudly, although there are concerns about
introducing a new build requirement. I haven't been following Rust
developments closely enough to know where we stand on that. I'd rather the
build system not be the first component that requires Rust to build
Firefox.