On 2/13/13 7:23 PM, Robert O'Callahan wrote:
> I think it's critical to examine the performance of Servo DOM+layout at its
> most vulnerable, which I think is when JS is synchronously pounding on it.
> A couple of microbenchmarks spring to mind:
> 1) DOM bindings performance for simple DOM operations, e.g.
>    for (var i = 0; i < 1000000; ++i) {
>      element.setAttribute("class", "foo");
>    }

This one should be easier. We've been looking at this microbenchmark for a while. We need a couple of things to make this fast:

(1) The DOM representation needs to be made into a Rust struct, using the unsafe code scheme I alluded to in my reply to Boris.

(2) Rust needs the manual control over stack switching that the Rust team has more or less reached consensus on. That way we can remove all of the stack switching costs.

I suspect these two will get us down to within 2x of Gecko/WebKit, and possibly much better.

Icing on the cake will be moving from the copy-on-write scheme to the new immutability scheme, which will save one pointer load. Additionally, being able to tag functions as not needing the stack growth check will save 3 instructions.
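To make (1) a little more concrete, here is a rough sketch in Rust of the direction. The names and fields are invented for illustration and are not the actual Servo types; the point is just that the element is plain native data, so setAttribute compiles down to an ordinary method call with no wrapper object or stack switch in between:

    use std::collections::HashMap;

    // Hypothetical element representation; the field names are made up and
    // do not match the real Servo DOM.
    struct Element {
        tag_name: String,
        attributes: HashMap<String, String>,
        // Under the unsafe binding scheme, the JS engine would hold a
        // pointer straight to this struct; that part is elided here.
    }

    impl Element {
        // setAttribute becomes an ordinary method on native data:
        // one hash-map insert, no proxy object, no stack switch.
        fn set_attribute(&mut self, name: &str, value: &str) {
            self.attributes.insert(name.to_string(), value.to_string());
        }
    }

    fn main() {
        let mut el = Element {
            tag_name: "div".to_string(),
            attributes: HashMap::new(),
        };
        // Mirror the shape of the microbenchmark above.
        for _ in 0..1_000_000 {
            el.set_attribute("class", "foo");
        }
        println!("{} class = {:?}", el.tag_name, el.attributes.get("class"));
    }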

> 2) Harder: performance of synchronous layout, e.g.
>    for (var i = 0; i < 100000; ++i) {
>      element.setAttribute("style", "width:" + i + "px");
>      element.getBoundingClientRect().width;
>    }

This one is harder to make fast because of the message passing: the layout query has to cross from the script task to the layout task and back. Once the scheduler is rewritten we will have better performance here.
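To see where the cost comes from, here is a minimal sketch of that round trip using std::sync::mpsc channels. The message names are invented and this is not the actual Servo script/layout interface; it just shows that every synchronous getBoundingClientRect() means sending a query, blocking, and waiting for layout to reply:

    use std::sync::mpsc::{channel, Sender};
    use std::thread;

    // Hypothetical query type; the real Servo layout interface differs.
    enum LayoutQuery {
        GetBoundingClientRectWidth { reply: Sender<f32> },
    }

    fn main() {
        let (to_layout, from_script) = channel::<LayoutQuery>();

        // Stand-in for the layout task.
        let layout = thread::spawn(move || {
            for msg in from_script {
                match msg {
                    LayoutQuery::GetBoundingClientRectWidth { reply } => {
                        // Pretend we reflowed and computed a width.
                        let _ = reply.send(42.0);
                    }
                }
            }
        });

        // Stand-in for the script task: each query is a blocking round trip.
        for _ in 0..100_000 {
            let (reply_tx, reply_rx) = channel();
            to_layout
                .send(LayoutQuery::GetBoundingClientRectWidth { reply: reply_tx })
                .unwrap();
            let _width: f32 = reply_rx.recv().unwrap();
        }

        drop(to_layout); // close the channel so the layout thread exits
        layout.join().unwrap();
    }

Every iteration pays that send/block/receive cycle, and that per-query overhead is exactly what a rewritten scheduler needs to shrink.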

I did a microbenchmark just now to test the scheme Brian proposed and is implementing. I took the naive Fibonacci function (traditionally used to test scheduling overheads) and added two context switches, a pointer load, and a memory barrier to every recursive call (source is at [1]). Here are my results:

Without the context switches (baseline):

real    0m3.733s
user    0m3.705s
sys     0m0.006s

With the context switches:

real    0m12.249s
user    0m12.057s
sys     0m0.021s

That's a 3.28x slowdown, which matches what projects like Cilk reported for Fibonacci in their papers. The Fibonacci function is essentially just a load, a store, and an add, so this is a test of pure overhead relative to the cost of a function call.
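For the curious, the measurement was shaped roughly like the following. This is a reconstruction, not the source at [1]: yield_now(), an atomic load, and a SeqCst fence stand in here for the two context switches, the pointer load, and the memory barrier that the real benchmark inserted.

    use std::sync::atomic::{fence, AtomicUsize, Ordering};
    use std::thread;

    static DUMMY: AtomicUsize = AtomicUsize::new(0);

    // Naive Fibonacci: the traditional scheduling-overhead microbenchmark.
    fn fib(n: u64) -> u64 {
        if n < 2 {
            return n;
        }
        // Simulated per-call overhead (rough stand-ins only): two yields in
        // place of the two context switches, an atomic load in place of the
        // pointer load, and a SeqCst fence in place of the memory barrier.
        thread::yield_now();
        thread::yield_now();
        let _ = DUMMY.load(Ordering::Relaxed);
        fence(Ordering::SeqCst);

        fib(n - 1) + fib(n - 2)
    }

    fn main() {
        // The baseline comes from the same program with the four overhead
        // lines above removed.
        println!("{}", fib(35));
    }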

I suspect that the function call overhead will be completely swamped by the cost of parsing the modified "width" declaration in the style attribute alone. So I'm feeling guardedly optimistic about this approach. I can try more microbenchmarks later that simulate the CSS parsing if people are interested. Of course, nothing will substitute for testing the actual implementation.

Patrick

