https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116654

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
FAIL: gcc.dg/vect/costmodel/ppc/costmodel-slp-12.c scan-tree-dump-times vect
"vectorizing stmts using SLP" 3 

the testcase needs adjustment (will push fix)

FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times
\\\\mlxvl\\\\M 30
FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times
\\\\mstxvl\\\\M 10

gcc.target/powerpc/p9-vec-length-full-8.c: \\mlxvl\\M found 21 times
FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times \\mlxvl\\M
30
gcc.target/powerpc/p9-vec-length-full-8.c: \\mstxvl\\M found 7 times
FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times
\\mstxvl\\M 10

(I hate this kind of testcase)

It looks like [u]int64_t and double are not using -with-len.  The difference
is that those no longer require peeling for gaps, since the target can compose
a V2D{F,I} by pieces, so we code-generate

  vect__9.333_29 = MEM <vector(2) double> [(double *)vectp_src.331_12];
  vectp_src.331_30 = vectp_src.331_12 + 16;
  _31 = MEM[(double *)vectp_src.331_30];
  vect__9.334_32 = {_31, 0.0};
  vect__9.335_33 = VEC_PERM_EXPR <vect__9.333_29, vect__9.334_32, { 0, 2 }>;

and get

.L46:   
        ld 9,0(4)
        ld 10,16(4)
        lxv 12,0(3)
        addi 4,4,32
        addi 3,3,16
        mtvsrdd 0,10,9
        xvadddp 0,0,12
        stxv 0,-16(3)
        bdnz .L46

and no epilogue.  I think that's better than -with-len.  It doesn't work
for the other sizes since we have no code to compose, say, a V4SI from
a V2SI and an SI.
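
For illustration, here is a scalar C model of the GIMPLE sequence above: one
full two-element load, one scalar load past the gap, and a permute selecting
lanes { 0, 2 }.  The function names and the loop shape (dst[i] += src[2*i])
are assumptions inferred from the assembly shown above, not copied from the
testcase, and n is assumed to be even so that no epilogue is needed.

```c
#include <stddef.h>

/* Scalar model of the vector statements in the dump above.
   Hypothetical helper, for illustration only.  */
static void load_with_gap(const double *src, double out[2])
{
    /* MEM <vector(2) double> at src: reads src[0] and src[1].  */
    double full[2] = { src[0], src[1] };
    /* Pointer bumped by 16 bytes, scalar load, then {_31, 0.0}.  */
    double piece[2] = { src[2], 0.0 };
    /* VEC_PERM_EXPR <full, piece, { 0, 2 }>: indices 0..1 pick from
       the first operand, 2..3 from the second.  */
    out[0] = full[0];
    out[1] = piece[0];
}

/* Assumed kernel shape: dst[i] += src[2*i], two lanes per iteration.
   n is assumed even; the highest element read is src[2*n - 2], the
   same as in the scalar loop, so no peeling for gaps is required.  */
static void add_strided(double *dst, const double *src, size_t n)
{
    for (size_t i = 0; i + 2 <= n; i += 2) {
        double v[2];
        load_with_gap(src + 2 * i, v);
        dst[i] += v[0];
        dst[i + 1] += v[1];
    }
}
```

With n = 4, src only needs 7 elements: the last iteration reads src[4],
src[5] and src[6] and never touches src[7], which a second full vector load
would have read.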

I'm going to adjust the expected counts in the asm-scan.
