http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57249

            Bug ID: 57249
           Summary: Unrolling too late for inlining
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: glisse at gcc dot gnu.org

Hello,

this code is a variant of the code at
http://stackoverflow.com/questions/16493290/why-is-inlined-function-slower-than-function-pointer

typedef void (*Fn)();

long sum = 0;

inline void accu() {
  sum+=4;
}

static const Fn map[4] = {&accu, &accu, &accu, &accu};

void f(bool opt) {
  const long N = 10000000L;
  if (opt)
  {
    for (long i = 0; i < N; i++)
    {
      accu();
      accu();
      accu();
      accu();
    }
  }
  else
  {
    for (long i = 0; i < N; i++)
    {
      for (int j = 0; j < 4; j++)
        (*map[j])();
    }
  }
}


In the first loop, g++ -O3 inlines the four accu() calls in the einline pass,
and later passes reduce the whole loop to a single +=. In the second loop, the
accu() calls only become visible once the inner loop has been unrolled, and no
inlining pass runs after that point. Even if one did, the right passes would
still have to run afterwards to reduce the outer loop to sum+=160000000.
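
To make the missing step concrete, here is a minimal sketch of the second
loop as written, as it would look after the inner loop is unrolled by hand,
and as the single addition it should ideally fold to. The function names
via_table, via_table_unrolled and folded are hypothetical and used only for
illustration, and the report's map array is renamed fn_table here to avoid
any clash with std::map. This only shows that the three forms are equivalent;
it makes no claim about what any particular GCC version emits.

```cpp
#include <cassert>

typedef void (*Fn)();

long sum = 0;

inline void accu() {
  sum += 4;
}

// Same table as `map` in the report; renamed for this sketch only.
static const Fn fn_table[4] = {&accu, &accu, &accu, &accu};

const long N = 10000000L;

// Second loop of f() as written: the callees are hidden behind fn_table[j]
// until the inner loop is unrolled and j becomes a constant.
void via_table() {
  for (long i = 0; i < N; i++)
    for (int j = 0; j < 4; j++)
      (*fn_table[j])();
}

// The same loop after unrolling the inner loop by hand. Each call now
// indexes a constant array with a constant, so it can be resolved to
// accu() and becomes a candidate for inlining.
void via_table_unrolled() {
  for (long i = 0; i < N; i++) {
    (*fn_table[0])();
    (*fn_table[1])();
    (*fn_table[2])();
    (*fn_table[3])();
  }
}

// What the whole loop ideally reduces to: 4 calls * 4 per call * N
// iterations, i.e. sum += 160000000.
void folded() {
  sum += 4 * 4 * N;
}

// Hypothetical helper: resets sum, runs one variant, returns the total.
long run(void (*variant)()) {
  sum = 0;
  variant();
  return sum;
}
```

Calling run(&via_table), run(&via_table_unrolled) and run(&folded) should
each return 160000000; the three variants differ only in how much work the
optimizer has to do to discover that.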

I am not sure what the right solution is, since overly aggressive early
unrolling can be bad for other optimizations. Note that LLVM manages to
optimize the whole function to a single +=.
