On Sat, 19 Jan 2013 16:16:49 +0000
Ben Avison <[email protected]> wrote:

> Move the entire contents of pixman-arm-simd-asm.S to a new file;
> ultimately this will only retain the scaled operations, so it is
> named pixman-arm-simd-asm-scaled.S. Added new header file
> pixman-arm-simd-asm.h, containing the macros which are the basis of
> all the new ARMv6 implementations, although at this point in the
> series, nothing uses them and the library should be binary-identical.

[...]

More comments describing the input arguments of the "preload_line"
macro would definitely be welcome, as would a high-level overview
of the "narrow", "medium" and "wide" cases.

> +.macro preload_line    narrow_case, bpp, bpp_shift, base
> + .if bpp > 0
> +  .if narrow_case && (bpp <= dst_w_bpp)
> +        /* In these cases, each line for each channel is in either 1 or 2 cache lines */
> +        PF  bic,    WK0, base, #31
> +        PF  pld,    [WK0]
> +        PF  add,    WK1, base, X, LSL #bpp_shift
> +        PF  sub,    WK1, WK1, #1
> +        PF  bic,    WK1, WK1, #31
> +        PF  cmp,    WK1, WK0
> +        PF  beq,    90f
> +        PF  pld,    [WK1]
> +90:
> +  .else
> +        PF  bic,    WK0, base, #31
> +        PF  pld,    [WK0]
> +        PF  add,    WK1, base, X, lsl #bpp_shift
> +        PF  sub,    WK1, WK1, #1
> +        PF  bic,    WK1, WK1, #31
> +        PF  cmp,    WK1, WK0
> +        PF  beq,    92f

> +91:     PF  add,    WK0, WK0, #32
> +        PF  cmp,    WK0, WK1
> +        PF  pld,    [WK0]
> +        PF  bne,    91b

How many iterations does this loop typically run? If it tries to
preload the whole scanline (as the name of the macro implies), then
we may run into problems after the 3rd iteration.

ARM11 can only support three outstanding cache misses at a time, and
on the 4th iteration the PLD instruction will block, because it is
treated mostly as an LDR without a destination register (and as a NOP
on TLB misses).
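
To make the question concrete, here is a rough C model of the address
arithmetic in the non-narrow branch of the macro. It is only a sketch and
not part of the patch: the helper names are made up, and it assumes that
bpp_shift is log2 of the bytes per pixel, that X is a non-zero pixel count,
and that a cache line is 32 bytes (as the #31 masks suggest).

```c
#include <stdint.h>

/* Cache line size implied by the "bic ..., #31" instructions in the macro. */
#define CACHE_LINE 32u

/*
 * Hypothetical helper: how many times the "91:" loop body runs for a
 * scanline that starts at `base` and holds `x` pixels of
 * (1 << bpp_shift) bytes each. Assumes x > 0.
 */
static unsigned preload_loop_iterations(uint32_t base, uint32_t x,
                                        unsigned bpp_shift)
{
    /* bic WK0, base, #31 : cache line holding the first byte */
    uint32_t first = base & ~(CACHE_LINE - 1u);
    /* add/sub/bic into WK1 : cache line holding the last byte */
    uint32_t last = (base + (x << bpp_shift) - 1u) & ~(CACHE_LINE - 1u);
    /* The loop advances WK0 by 32 until it reaches WK1. */
    return (last - first) / CACHE_LINE;
}

/* Total PLD instructions issued: one before the loop plus one per iteration. */
static unsigned preload_pld_count(uint32_t base, uint32_t x,
                                  unsigned bpp_shift)
{
    return 1u + preload_loop_iterations(base, x, bpp_shift);
}
```

Under these assumptions, a cache-line-aligned scanline of 64 pixels at 32bpp
spans 256 bytes, so the loop runs 7 times and 8 PLDs are issued back to back,
well above the three outstanding misses mentioned above.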

TLB misses are another potential source of performance problems. If
you try to prefetch too much and too far ahead, you may move into the
next page, which is not present in the TLB, and all the nice prefetches
will be wasted.
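
A minimal sketch of the kind of check I have in mind, again in C rather than
in the macro language. The helper name is hypothetical and the 4 KiB page
size is an assumption (the smallest page size, and the common case); note
that crossing a page boundary only makes a TLB miss possible, it does not
guarantee one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size: 4 KiB small pages. */
#define PAGE_SIZE_ASSUMED 4096u

/*
 * Hypothetical helper: true if the prefetch address lies in a different
 * page than the address currently being processed, i.e. the PLD might
 * hit a page whose translation is not in the TLB.
 */
static bool prefetch_crosses_page(uint32_t current, uint32_t prefetch)
{
    return (current / PAGE_SIZE_ASSUMED) != (prefetch / PAGE_SIZE_ASSUMED);
}
```

For example, processing data at 0x1FE0 while prefetching 0x2000 crosses into
the next page, whereas prefetching 0x1FE0 from 0x1000 stays within one page.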

> +92:
> +  .endif
> + .endif
> +.endm

-- 
Best regards,
Siarhei Siamashka
_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman
