On Tue, Sep 27, 2016 at 03:14:51PM -0600, Jeff Law wrote:
> With transposition issue addressed, the only blocker I see are some
> simple testcases we can add to the suite. They don't have to be real
> extensive. And one motivating example for the list archives, ideally
> the glibc malloc case.

Here are some testcases.


2016-09-30  Segher Boessenkool  <seg...@kernel.crashing.org>

gcc/testsuite/
	* gcc.target/powerpc/shrink-wrap-separate-0.c: New.
	* gcc.target/powerpc/shrink-wrap-separate-1.c: New.
	* gcc.target/powerpc/shrink-wrap-separate-2.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
new file mode 100644
index 0000000..dea0611
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler {#before\M.*\mmflr\M} } } */
+
+/* This tests if shrink-wrapping for separate components works.
+
+   r20 (a callee-saved register) is forced live at the start, so that we
+   get it saved in a prologue at the start of the function.
+   The link register only needs to be saved if x is non-zero; without
+   separate shrink-wrapping it would however be saved in the one prologue.
+   The test tests if the mflr insn ends up behind the prologue. */
+
+void g(void);
+
+void f(int x)
+{
+	register int r20 asm("20") = x;
+	asm("#before" : : "r"(r20));
+	if (x)
+		g();
+	asm(""); // no tailcall of g
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
new file mode 100644
index 0000000..735b606
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler {\mmflr\M.*\mbl\M.*\mmflr\M.*\mbl\M} } } */
+
+/* This tests if shrink-wrapping for separate components creates more
+   than one prologue when that is useful. In this case, it saves the
+   link register before both the call to g and the call to h. */
+
+void g(void) __attribute__((noreturn));
+void h(void) __attribute__((noreturn));
+
+void f(int x)
+{
+	if (x == 42)
+		g();
+	if (x == 31)
+		h();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c
new file mode 100644
index 0000000..b22564a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler {\mmflr\M.*\mbl\M.*\mmflr\M.*\mbl\M} } } */
+
+/* This tests if shrink-wrapping for separate components puts a prologue
+   inside a loop when that is useful. In this case, it saves the link
+   register before each call: both calls happen with probability .10,
+   so saving the link register happens with .80 per execution of f on
+   average, which is smaller than 1 which you would get if you saved
+   it outside the loop. */
+
+int *a;
+void g(void);
+
+void f(int x)
+{
+	int j;
+	for (j = 0; j < 4; j++) {
+		if (__builtin_expect(a[j], 0))
+			g();
+		asm("#" : : : "memory");
+		if (__builtin_expect(a[j], 0))
+			g();
+		a[j]++;
+	}
+}
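
For the archives, the arithmetic behind the comment in
shrink-wrap-separate-2.c can be spelled out as a small sketch. It is not
part of the patch and does not model what the compiler actually computes;
it only takes the .10 per-call probability from the comment at face value
and multiplies it out over the four loop iterations and two call sites.
The function names and parameters are made up for illustration.

```c
/* Illustrative only: hypothetical helpers, not part of the testsuite.

   Expected number of link-register saves per execution of f when the
   save is placed right before each call site inside the loop.  Each
   call site is assumed to be taken with probability p_call (the .10
   quoted in the test comment).  */
double expected_lr_saves_in_loop(int iterations, int calls_per_iteration,
                                 double p_call)
{
    return iterations * calls_per_iteration * p_call;
}

/* With a single prologue at function entry, the link register is saved
   exactly once per execution of f, whether or not g is ever called.  */
double expected_lr_saves_at_entry(void)
{
    return 1.0;
}
```

Calling expected_lr_saves_in_loop(4, 2, 0.10) gives the .80 from the
comment, which is below the 1 returned by expected_lr_saves_at_entry(),
so under these assumptions saving inside the loop is the cheaper choice.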