On 15/04/2014 06:00, Patrick Walton wrote:
Hi everyone,

I'm considering using 32-bit fixed-point (16 bits for the fraction, 16
bits for the integer portion) for percentages in CSS. The reason is that
we can use the following four SSE4/AVX instructions on x86 to compute
two sides at once:

     ; xmm0 contains the style values (percentage or fixed length)
     ; xmm1 contains ~0 if the value is a percentage or 0 if the value
     ; is fixed
     ; xmm2 contains the value the percentages are relative to
     0x0000000100000b68 <+8>:     vpshufd xmm2,xmm2,0x0
     0x0000000100000b6d <+13>:    vpmulld xmm2,xmm2,xmm0
     0x0000000100000b72 <+18>:    vpsrad xmm2,xmm2,0x10
     0x0000000100000b77 <+23>:    vblendvps xmm0,xmm0,xmm2,xmm1
     ; result in xmm0
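
As a rough illustration of what the four instructions compute, here is a scalar Rust sketch of the same per-lane arithmetic. It assumes the percentage is stored as 16.16 fixed point (so 50% is 0x8000) and that the value the percentages are relative to is a plain integer length. The function name `resolve_sides` and the array layout are made up for this example and are not Servo code.

```rust
/// Hypothetical helper, not part of Servo: resolves two sides at once.
/// `style[i]` is either a fixed length or a 16.16 fixed-point percentage,
/// `mask[i]` is !0 for a percentage and 0 for a fixed length (as in xmm1),
/// `base` is the length the percentages are relative to (as in xmm2).
fn resolve_sides(style: [i32; 2], mask: [i32; 2], base: i32) -> [i32; 2] {
    let mut out = [0i32; 2];
    for i in 0..2 {
        // vpmulld: keep the low 32 bits of the product.
        let product = base.wrapping_mul(style[i]);
        // vpsrad: the arithmetic shift drops the 16 fraction bits.
        let scaled = product >> 16;
        // vblendvps selects on the sign bit of each mask lane; with an
        // all-ones or all-zeros mask this bitwise select is equivalent.
        out[i] = (scaled & mask[i]) | (style[i] & !mask[i]);
    }
    out
}

fn main() {
    // First side is 50% (0x8000 in 16.16), second side is a fixed length of 7.
    let resolved = resolve_sides([0x8000, 7], [!0, 0], 200);
    println!("{:?}", resolved); // prints "[100, 7]"
}
```

One thing the sketch makes visible: only the low 32 bits of the product are kept, so the result wraps once `base` times the raw percentage leaves the i32 range (for example 100%, i.e. 0x10000, of a length of 32768 or more). Real code would presumably need to bound its inputs or widen the multiply.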

Is the plan to somehow convince LLVM to emit these instructions, or to have assembly source code in Servo? If the latter, does this mean a second code path for other architectures? Should we have a way to test both, like we do for CPU vs. GPU rendering?


This is done in many places in layout: grep for `MaybeAuto` to find
them. Almost all places where one side is computed, another side is
computed too, and in some places, all four sides are computed, which can
be done with only one or two more instructions in the above sequence (as
XMM registers are 128 bits wide).

If we’re going through this code, it could be a good time to fix https://github.com/mozilla-servo/rust-cssparser/issues/35


I have done a simple microbenchmark and found that using SSE4/AVX to
compute two sides at once is 66% faster than doing it sequentially using
branches. This is true even if percentages are not used and all branches
are predicted correctly.
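
The benchmark code itself is not shown in this thread, but the two strategies being compared might look roughly like this in scalar Rust. The `Side` enum and both function names are hypothetical stand-ins chosen for the example (in particular, `Side` is not Servo's `MaybeAuto`), and percentages are assumed to be 16.16 fixed point.

```rust
/// Hypothetical stand-in for a style value; not Servo's `MaybeAuto`.
#[derive(Clone, Copy)]
enum Side {
    /// Percentage in 16.16 fixed point (0x10000 is 100%).
    Percentage(i32),
    /// Length that is already in final units.
    Fixed(i32),
}

/// Sequential version: one branch per side.
fn resolve_branching(side: Side, base: i32) -> i32 {
    match side {
        Side::Percentage(p) => base.wrapping_mul(p) >> 16,
        Side::Fixed(len) => len,
    }
}

/// Branch-free version: always multiply, then select with a mask.
/// This is the per-lane operation that a SIMD sequence can apply to
/// several sides at once.
fn resolve_branchless(raw: i32, is_percentage: bool, base: i32) -> i32 {
    // !0 if the value is a percentage, 0 if it is a fixed length.
    let mask = -(is_percentage as i32);
    let scaled = base.wrapping_mul(raw) >> 16;
    (scaled & mask) | (raw & !mask)
}

fn main() {
    let base = 400;
    // 25% of the base, and a fixed length of 12.
    let a = resolve_branching(Side::Percentage(0x4000), base);
    let b = resolve_branching(Side::Fixed(12), base);
    // Both strategies agree on these inputs.
    assert_eq!(a, resolve_branchless(0x4000, true, base));
    assert_eq!(b, resolve_branchless(12, false, base));
    println!("{} {}", a, b); // prints "100 12"
}
```

The branch-free form does the multiply even for fixed lengths and discards the result, which is the trade the SIMD sequence makes as well.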

Thoughts?

--
Simon Sapin