https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782

--- Comment #31 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
I don't know how relevant this is to the exchange2 problem yet,
but for:

---------------------------------------------------------------------
#define PROB 0.95

struct L {
  int data;
  L *next;
  L *inner;
};

template<int N>
struct S {
  static __attribute__((always_inline)) void f(L *head, int inc) {
    while (head) {
      asm volatile ("// Loop %0" :: "i" (N));
      int subinc = head->data + inc;
      if (__builtin_expect_with_probability (bool(head->inner), 0, PROB))
        S<N-1>::f(head->inner, subinc);
      head->data = subinc;
      head = head->next;
    }
  }
};

template<>
struct S<0> {
  static void f(L *, int) {
    asm volatile ("// foo" ::: "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7");
  }
};

void x(L *head) { S<10>::f(head, 1); }
---------------------------------------------------------------------

compiled on aarch64 with -O3, we always seem to spill in the outer
loops regardless of the value of PROB.  That is, the inner loops
always seem to get priority even if their bb frequencies are low.
I would have expected that moving PROB from 0.05 (inner loop very
likely to be executed) to 0.95 (inner loop very likely to be skipped)
would have moved the spills from the outer loops to the inner loops.
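
To make that expectation concrete, here is a rough sketch of the
reasoning as a toy model.  It is not GCC's actual cost function: the
names level_frequencies and cheapest_spill_level are hypothetical, and
the fixed per-loop iteration count `iters` is an assumption made purely
for illustration.

```cpp
#include <cassert>
#include <vector>

// Toy model only (hypothetical, not taken from GCC).
// Relative execution frequency of each loop nesting level, with level 0
// being the outermost loop.  Each level is entered from its parent with
// probability (1 - prob_skip) per parent iteration and is assumed to run
// `iters` iterations once entered.
std::vector<double> level_frequencies(int levels, double prob_skip,
                                      double iters) {
  assert(levels > 0 && prob_skip >= 0.0 && prob_skip <= 1.0 && iters > 0.0);
  std::vector<double> freq(levels);
  freq[0] = iters;
  for (int i = 1; i < levels; ++i)
    freq[i] = freq[i - 1] * (1.0 - prob_skip) * iters;
  return freq;
}

// Level at which a spill executed once per iteration would be cheapest
// under this model, i.e. the least frequently executed level.  Ties go
// to the outermost candidate.
int cheapest_spill_level(const std::vector<double> &freq) {
  int best = 0;
  for (int i = 1; i < static_cast<int>(freq.size()); ++i)
    if (freq[i] < freq[best])
      best = i;
  return best;
}
```

With an assumed 10 iterations per loop and three levels, a skip
probability of 0.95 makes each inner level less frequent than its
parent, so the model picks the innermost level for the spill; a skip
probability of 0.05 makes the inner levels hotter and the model picks
the outermost level instead.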

An unrelated issue on aarch64 is that IP0 and IP1 are explicitly
clobbered by call instructions, via CALL_INSN_FUNCTION_USAGE.
We have to do this because the linker is allowed to insert code
that clobbers those registers even if we “know” that the target
of the call doesn't clobber them.  However, clobbering the registers
that way prevents them from being allocated to pseudo registers that
are live across the call, even if the call is very unlikely to be executed.
Hacking a fix for that does reduce the number of spills, but doesn't
get rid of the ones that matter in exchange2.
