https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93023

            Bug ID: 93023
           Summary: give preference to address iv without offset in ivopts
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fxue at os dot amperecomputing.com
  Target Milestone: ---

Among address IVs with the same base and index, ivopts always picks the one
with a non-zero offset. This incurs no extra cost on an architecture like X86,
which has the LEA instruction to fold the offset into the address computation.
But on ARM, one more add-with-offset instruction is required.

   X86:   lea addr_reg, base[index + offset]

   ARM:   add addr_reg, base, index
          add addr_reg, addr_reg, offset

So choosing the IV without offset can save one instruction in most situations.
Here is an example; compile it for aarch64.

  int data[100];
  int fn1 ();

  void fn2 (int b, int n)
  {
    int i;

    for (i = 0; i < n; i++, b++)
      {
        data[b + 10] = 1;
        fn1 ();
        data[b + 3] = 2;
      }
  }

Analysis of ivopts shows that those address IVs have the same in-loop cost, and
the IV without offset does have a smaller pre-loop setup cost. But since the
setup cost is averaged over the iterations, the minor cost difference is lost
to round-off in the integer division. To fix this round-off error, cost can be
represented more accurately, for example by adding a fractional part to make it
a fixed-point number.
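
The round-off effect can be illustrated with a small standalone sketch. This is
not the ivopts cost code: the function names, the 8-bit fraction width, and the
idea of dividing the setup cost by an assumed iteration count are all
hypothetical, chosen only to show how a fractional part preserves a difference
that plain integer division discards.

```c
#include <assert.h>

/* Hypothetical number of fractional bits; not a value taken from GCC. */
#define COST_FRAC_BITS 8

/* Per-iteration cost using plain integer division.  Any setup cost
   smaller than the iteration count is truncated to zero.  */
unsigned
avg_cost_int (unsigned setup, unsigned in_loop, unsigned niters)
{
  assert (niters > 0);
  return in_loop + setup / niters;
}

/* The same computation in fixed point.  The result is scaled by
   2^COST_FRAC_BITS, so the fractional share of the setup cost is kept.  */
unsigned
avg_cost_fixed (unsigned setup, unsigned in_loop, unsigned niters)
{
  assert (niters > 0);
  return (in_loop << COST_FRAC_BITS)
         + (setup << COST_FRAC_BITS) / niters;
}
```

With made-up setup costs of 4 and 5, an in-loop cost of 2, and an assumed 10
iterations, avg_cost_int returns the same value for both candidates, while
avg_cost_fixed ranks the candidate with the cheaper setup lower.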
