On 3/29/14 7:22 PM, Patrick Walton wrote:
> On a related note, I have been tossing around ideas today for using SIMD to match multiple selectors that have the same "shape" in parallel. For example, if we have ".foo #a" and ".bar #a" it may be possible to use the packed comparison instructions in SSE4 to match both at the same time. Obviously, this adds significant complexity and correspondingly increased maintenance burden, and its effectiveness depends on how often selectors have the same shape in the wild (if it works at all). So I'm filing it into the "potentially interesting project, not a high priority" mental bin. Could be neat though.
Just for fun, I ran some experiments using SSE4 SIMD instructions to match the four selectors `.class0 #foo`/`.class1 #foo`/`.class2 #foo`/`.class3 #foo` in parallel on random DOMs (500,000 DOM nodes with rand(0..8) random classes per node, assuming 16 classes in the stylesheet). I observed a 27% speedup on my Core i7. This is not amazing, and I suspect the reason is that the extra parallelism from the vector instructions is offset by the additional memory accesses that the SIMD instructions force on you.
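To make the idea concrete, here is a rough sketch of matching four same-shape selectors (`.classN #foo`) in one ancestor walk. It is not the code I benchmarked: the `Node` layout, the interned integer ids, and the names `cmpeq4` and `match4` are all made up for illustration, and `cmpeq4` is a portable lane-by-lane stand-in for a packed compare instruction rather than a real SSE intrinsic.

```rust
/// Hypothetical DOM node. Ids and classes are assumed to be interned
/// as small integers so that they can be compared lane-wise.
struct Node {
    id: u32,               // 0 means "no id" in this sketch
    classes: Vec<u32>,     // interned class names on this node
    parent: Option<usize>, // index of the parent in the node slice
}

/// Compare one broadcast value against four lanes at once.
/// Portable stand-in for a packed 32-bit equality compare.
fn cmpeq4(broadcast: u32, lanes: [u32; 4]) -> [bool; 4] {
    [
        broadcast == lanes[0],
        broadcast == lanes[1],
        broadcast == lanes[2],
        broadcast == lanes[3],
    ]
}

/// Match the four selectors `.c0 #id`, `.c1 #id`, `.c2 #id`, `.c3 #id`
/// against `nodes[node]`, returning one result per selector.
fn match4(nodes: &[Node], node: usize, wanted_classes: [u32; 4], wanted_id: u32) -> [bool; 4] {
    let mut matched = [false; 4];

    // The rightmost compound (`#id`) is shared by all four selectors,
    // so it is tested once.
    if nodes[node].id != wanted_id {
        return matched;
    }

    // Descendant combinator: walk the ancestors once and test every
    // ancestor class against all four wanted classes at the same time.
    let mut cur = nodes[node].parent;
    while let Some(i) = cur {
        for &class in &nodes[i].classes {
            let hit = cmpeq4(class, wanted_classes);
            for lane in 0..4 {
                matched[lane] |= hit[lane];
            }
        }
        if matched == [true; 4] {
            break; // every selector has already matched
        }
        cur = nodes[i].parent;
    }
    matched
}
```

The point of the shape restriction is visible here: because all four selectors share the `#foo` test and the same combinator, the only per-selector work is the class comparison, which is exactly the part that fits in a vector register. Each ancestor's class list still has to be loaded from memory, though, which is where I suspect the time goes.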
Of course, I should try on a snapshot of a real Web page (the HTML5 spec, perhaps), but I don't expect to do much better. 27% is not bad, but there are obviously much higher priority things to try first (e.g. multithreading or GPUs, both of which win by a lot more).
Patrick