On 2/13/13 7:23 PM, Robert O'Callahan wrote:
> I think it's critical to examine the performance of Servo DOM+layout at its
> most vulnerable, which I think is when JS is synchronously pounding on it.
> A couple of microbenchmarks spring to mind:
>
> 1) DOM bindings performance for simple DOM operations, e.g.
>
>     for (var i = 0; i < 1000000; ++i) {
>       element.setAttribute("class", "foo");
>     }
This one should be easier. We've been looking at this microbenchmark for
a while. We need a couple of things to make this fast:
(1) The DOM representation needs to be made into a Rust struct, using
the unsafe code scheme I alluded to in my reply to Boris (a rough
sketch of what that could look like follows this list).
(2) Rust needs the manual control over stack switching that has more
or less achieved consensus on the Rust team. That way we can remove
all stack-switching costs.
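
As a concrete illustration of (1), here is a minimal, hypothetical
sketch of an element as a plain Rust struct; the field layout, the
linear attribute scan, and the binding glue are assumptions on my
part, not the actual design:

    // Hypothetical sketch only: a DOM element as a plain Rust struct,
    // so that setAttribute from the JS binding bottoms out in a direct
    // field update instead of going through a managed wrapper.
    struct Attribute {
        name: String,
        value: String,
    }

    struct Element {
        tag_name: String,
        attributes: Vec<Attribute>,
    }

    impl Element {
        // Roughly what element.setAttribute("class", "foo") would call:
        // a linear scan plus an in-place update or append.
        fn set_attribute(&mut self, name: &str, value: &str) {
            match self.attributes.iter().position(|a| a.name == name) {
                Some(i) => self.attributes[i].value = value.to_string(),
                None => self.attributes.push(Attribute {
                    name: name.to_string(),
                    value: value.to_string(),
                }),
            }
        }
    }

    fn main() {
        let mut element = Element {
            tag_name: "div".to_string(),
            attributes: Vec::new(),
        };
        // Mirror of the JS microbenchmark quoted above.
        for _ in 0..1_000_000 {
            element.set_attribute("class", "foo");
        }
        println!("{}: {} attribute(s)",
                 element.tag_name, element.attributes.len());
    }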
I suspect these two will get us down to within 2x of Gecko/WebKit, and
possibly much better.
Icing on the cake will be moving from the copy-on-write scheme to the
new immutability scheme, which will save one pointer load. Additionally,
being able to tag functions as not needing the stack growth check will
save 3 instructions.
> 2) Harder: performance of synchronous layout, e.g.
>
>     for (var i = 0; i < 100000; ++i) {
>       element.setAttribute("style", "width:" + i + "px");
>       element.getBoundingClientRect().width;
>     }
This one is harder to make performant because of the message passing
between the script and layout tasks. Once the scheduler is rewritten,
we will have better performance here.
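
To illustrate the cost, here is a small, hypothetical sketch (these
are not Servo's actual message types or API) of what the second
microbenchmark turns into under message passing: every
getBoundingClientRect() is a blocking round trip to the layout task.

    // Hypothetical message types, not Servo's actual API: each
    // getBoundingClientRect() in the loop above becomes a blocking
    // round trip from the script task to the layout task.
    use std::sync::mpsc::{channel, Sender};
    use std::thread;

    enum LayoutMsg {
        SetStyle(String),                         // e.g. "width:42px"
        GetBoundingClientRectWidth(Sender<f64>),  // carries a reply channel
    }

    fn main() {
        let (to_layout, from_script) = channel::<LayoutMsg>();

        // Stand-in for the layout task.
        let layout = thread::spawn(move || {
            let mut width = 0.0;
            for msg in from_script {
                match msg {
                    LayoutMsg::SetStyle(style) => {
                        // Real layout would parse the declaration and
                        // relayout here; this just pulls out the number.
                        width = style
                            .trim_start_matches("width:")
                            .trim_end_matches("px")
                            .trim()
                            .parse()
                            .unwrap_or(0.0);
                    }
                    LayoutMsg::GetBoundingClientRectWidth(reply) => {
                        reply.send(width).unwrap();
                    }
                }
            }
        });

        // Stand-in for the script task running the microbenchmark.
        for i in 0..100_000 {
            to_layout
                .send(LayoutMsg::SetStyle(format!("width:{}px", i)))
                .unwrap();
            let (reply_tx, reply_rx) = channel();
            to_layout
                .send(LayoutMsg::GetBoundingClientRectWidth(reply_tx))
                .unwrap();
            // The script task blocks here until layout answers.
            let _width = reply_rx.recv().unwrap();
        }

        drop(to_layout);
        layout.join().unwrap();
    }

Every iteration pays at least two task switches plus the channel
traffic; that per-query overhead is what the scheduler rewrite is
meant to shrink.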
I did a microbenchmark just now to test Brian's proposed scheme that he
is implementing. I took the naive Fibonacci function (traditionally used
to test scheduling overheads) and added two context switches, a pointer
load, and a memory barrier for every single recursive call (source is at
[1]). Here are my results:
Without the context switches (baseline):

    real    0m3.733s
    user    0m3.705s
    sys     0m0.006s

With the context switches:

    real    0m12.249s
    user    0m12.057s
    sys     0m0.021s
That's a 3.28x slowdown, which matches what projects like Cilk saw in
their papers for Fibonacci. The Fibonacci function is basically just a
load, a store, and an add, so this is essentially a test of pure
overhead versus a function call.
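
For reference, here is a rough, hypothetical sketch of the shape of
that microbenchmark; it is not the source at [1], and it substitutes
OS-level yields and an atomic fence for the lightweight task switches
and barrier in the original, so only the structure (not the absolute
numbers) carries over:

    // Hypothetical reconstruction, not the source at [1]: naive
    // Fibonacci with extra per-call overhead standing in for two
    // context switches, a pointer load, and a memory barrier.
    use std::sync::atomic::{fence, AtomicU64, Ordering};
    use std::thread;

    static SHARED: AtomicU64 = AtomicU64::new(0);

    fn per_call_overhead() {
        thread::yield_now();                    // stand-in for switch #1
        thread::yield_now();                    // stand-in for switch #2
        SHARED.fetch_add(1, Ordering::Relaxed); // the pointer load/store
        fence(Ordering::SeqCst);                // the memory barrier
    }

    fn fib(n: u64) -> u64 {
        per_call_overhead();
        if n < 2 { n } else { fib(n - 1) + fib(n - 2) }
    }

    fn main() {
        // Run under `time`, with and without the per_call_overhead()
        // call, to compare a baseline against the added overhead.
        println!("fib(30) = {}", fib(30));
    }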
I suspect that the function call overhead will be completely swamped by
the overhead of parsing the modified "width" selector alone. So I'm
feeling guardedly optimistic about this approach. I can try more
microbenchmarks later that pretend to parse the CSS selector and whatnot
if people are interested. Of course, nothing will substitute for testing
the actual implementation.
Patrick
_______________________________________________
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo