Hi everyone,
I'm considering representing percentages in CSS as 32-bit fixed-point
numbers (16 bits for the integer portion, 16 bits for the fraction). The
reason is that the following four SSE4/AVX instructions on x86 can then
resolve two sides at once:
; xmm0 contains the style values (percentage or fixed length)
; xmm1 contains ~0 if the value is a percentage or 0 if the value
; is fixed
; xmm2 contains the value the percentages are relative to
vpshufd   xmm2,xmm2,0x0        ; broadcast the low dword of xmm2 to all lanes
vpmulld   xmm2,xmm2,xmm0       ; per-lane 32-bit multiply, low 32 bits kept
vpsrad    xmm2,xmm2,0x10       ; arithmetic shift right by 16 drops the fraction bits
vblendvps xmm0,xmm0,xmm2,xmm1  ; take xmm2 where the xmm1 mask is set, else xmm0
; result in xmm0
This percentage-or-length resolution is done in many places in layout:
grep for `MaybeAuto` to find them. Almost everywhere one side is
computed, another side is computed too, and in some places all four
sides are computed, which needs only one or two more instructions than
the above sequence (as XMM registers are 128 bits wide and hold four
32-bit lanes).
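
For anyone who wants to check the arithmetic without reading the assembly, here is a rough scalar model of the sequence in Rust. It is only a sketch, not the actual layout code: the names `ONE` and `resolve_sides` are made up for illustration, the basis is assumed to be a plain integer length, and because the multiply keeps only the low 32 bits it assumes the product of fraction and basis fits in an `i32`.

```rust
/// 100% in the assumed 16.16 fixed-point format (hypothetical name).
const ONE: i32 = 1 << 16;

/// Lane-wise scalar model of the four-instruction sequence (hypothetical helper).
/// `values`: per-side style values (16.16 fraction if a percentage, else a length).
/// `masks`:  !0 for percentage lanes, 0 for fixed-length lanes.
/// `basis`:  the integer length the percentages are relative to.
fn resolve_sides(values: [i32; 4], masks: [i32; 4], basis: i32) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        // vpshufd + vpmulld: same basis in every lane, low 32 bits of the product.
        let product = values[i].wrapping_mul(basis);
        // vpsrad: arithmetic shift right by 16 drops the fraction bits.
        let scaled = product >> 16;
        // vblendvps: scaled value where the mask is set, original value otherwise.
        out[i] = (scaled & masks[i]) | (values[i] & !masks[i]);
    }
    out
}
```

For example, `resolve_sides([ONE / 2, 120, ONE / 4, 0], [!0, 0, !0, 0], 600)` gives `[300, 120, 150, 0]`: the 50% and 25% lanes are scaled against 600, and the fixed lanes pass through unchanged.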
I have done a simple microbenchmark and found that using SSE4/AVX to
compute two sides at once is 66% faster than doing it sequentially using
branches. This is true even if percentages are not used and all branches
are predicted correctly.
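
For reference, the sequential, branching version I am comparing against looks roughly like the sketch below. The `LengthOrPercentage` enum and `resolve_branchy` are hypothetical stand-ins for illustration, not the types or functions layout actually uses, and the percentage is assumed to be stored as a 16.16 fraction.

```rust
/// Hypothetical style value, for illustration only.
#[derive(Clone, Copy)]
enum LengthOrPercentage {
    /// A fixed length in integer units.
    Length(i32),
    /// A fraction in assumed 16.16 fixed point (1 << 16 == 100%).
    Percentage(i32),
}

/// Resolves one side with a branch on the variant (hypothetical helper).
fn resolve_branchy(value: LengthOrPercentage, basis: i32) -> i32 {
    match value {
        // Fixed lengths are used as they are.
        LengthOrPercentage::Length(len) => len,
        // Percentages are scaled by the basis, then the fraction bits are dropped.
        LengthOrPercentage::Percentage(frac) => frac.wrapping_mul(basis) >> 16,
    }
}
```

Each side takes one call and one branch here, whereas the SIMD sequence handles all lanes with a single blend and no branch.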
Thoughts?
Patrick
_______________________________________________
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo