https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86609

            Bug ID: 86609
           Summary: Reassociate (int) round sequences
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---

Not entirely sure what to call this optimisation.
Consider:
int f(double x, double *p)
{
  double r = __builtin_round (x);
  *p = r;
  return (int) r;
}

For aarch64, GCC with -O2 -fno-math-errno generates:
f:
        frinta  d0, d0  // 7    [c=8 l=4]  rounddf2
        str     d0, [x0]        // 8    [c=4 l=4]  *movdf_aarch64/7
        fcvtzs  w0, d0  // 14   [c=8 l=4]  fix_truncdfsi2
        ret             // 29   [c=0 l=4]  *do_return


The problem here is that frinta and fcvtzs form a dependency chain: fcvtzs
consumes the result of frinta, so the two operations cannot be done in parallel.
Clang can break the chain like so:
f:                                      // @f
// %bb.0:                               // %entry
        frinta  d1, d0
        fcvtas  w8, d0
        str     d1, [x0]
        mov     w0, w8
        ret

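Note how the two expensive operations are now independent: fcvtas rounds to
nearest with ties away from zero, so it can convert x directly instead of
waiting for the frinta result.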
I think in C terms this means transforming the above to:
int f2 (double x, double *p)
{
  double r = __builtin_round (x);
  *p = r;
  return (int)__builtin_iround (x);
}

Would this be something for the reassociation pass to do?
