On Wed, 30 Jan 2013 19:34:58 -0000, Søren Sandmann <[email protected]> wrote:
> It's simply that the speedup you got:
>
>     SNB i5-2500s: firefox-chalkboard 25.9s -> 19.6s: 1.32x speedup
>
> is rather large for a change that doesn't even introduce a fast path
> or use SIMD. That suggests either that something dumb is going on in
> pixman or in the benchmark, or that we want SIMD variants of this
> operation.

By chance, I happened to be looking at this today. I'm finding that a whopping 15% of the runtime for the whole of cairo-perf-trace is spent in a single type of composite operation in firefox-chalkboard, and it's an over_8888_8888 which can't be matched by STD_FAST_PATH, SIMPLE_NEAREST_FAST_PATH or SIMPLE_BILINEAR_FAST_PATH.

STD_FAST_PATH fails because FAST_PATH_SAMPLES_COVER_CLIP_NEAREST isn't set in the source flags, and SIMPLE_NEAREST_FAST_PATH fails because FAST_PATH_SCALE_TRANSFORM isn't set. (If you're curious, the precise source flags are 0x207ca77.)

Obviously this means that in this case, we're not getting the benefits of any platform-specific fast paths. Perhaps what we need is a "pad" equivalent of fast_composite_tiled_repeat()?

Ben
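
To make the "pad" idea concrete, here is a minimal sketch in plain C. It is not pixman code: over_pixel(), over_row() and over_row_pad() are hypothetical names, and it assumes premultiplied a8r8g8b8 pixels and a source that is only translated by whole pixels. Each destination scanline is split into a left run padded with the first source pixel, a middle run that lies entirely inside the source, and a right run padded with the last source pixel. Only the middle run needs a real row compositor, so that is where an existing platform-specific over_8888_8888 routine could in principle be reused.

```c
#include <assert.h>
#include <stdint.h>

/* Rounded (a * b) / 255 for 8-bit channels. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Hypothetical helper: OVER for one premultiplied a8r8g8b8 pixel,
 * computed per channel as src + dst * (255 - src_alpha) / 255. */
static uint32_t over_pixel(uint32_t src, uint32_t dst)
{
    uint8_t ia = (uint8_t)(255 - (src >> 24));
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = ((src >> shift) & 0xff)
                   + mul_un8((uint8_t)((dst >> shift) & 0xff), ia);
        if (c > 255)
            c = 255;                      /* saturate */
        out |= c << shift;
    }
    return out;
}

/* Stand-in for an optimised row compositor; every source sample it
 * reads is inside the source row. */
static void over_row(uint32_t *dst, const uint32_t *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = over_pixel(src[i], dst[i]);
}

/* Hypothetical "pad" wrapper for one scanline. src_x is the source x
 * coordinate that maps to dst[0]; it may be negative or lie beyond
 * src_width. The caller is assumed to have clamped the source row
 * index to [0, src_height - 1] already. */
static void over_row_pad(uint32_t *dst, int width,
                         const uint32_t *src, int src_width, int src_x)
{
    int x = 0;
    assert(src_width > 0);

    /* Left of the source: repeat the first source pixel. */
    int left = src_x < 0 ? -src_x : 0;
    if (left > width)
        left = width;
    for (; x < left; x++)
        dst[x] = over_pixel(src[0], dst[x]);

    /* Inside the source: hand the whole run to the row compositor. */
    int mid_end = src_width - src_x;      /* first dst x past the source */
    if (mid_end > width)
        mid_end = width;
    if (mid_end > x) {
        over_row(dst + x, src + src_x + x, mid_end - x);
        x = mid_end;
    }

    /* Right of the source: repeat the last source pixel. */
    for (; x < width; x++)
        dst[x] = over_pixel(src[src_width - 1], dst[x]);
}
```

For example, calling over_row_pad(dst, 7, src, 3, -2) on a 3-pixel source composites src[0] into dst[0..2], src[1] into dst[3], and src[2] into dst[4..6]. The padded runs use a single constant source pixel, so a real implementation could presumably route them to a solid-source compositor rather than the per-pixel loop used here.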
