https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93023
Bug ID: 93023
Summary: give preference to address iv without offset in ivopts
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: fxue at os dot amperecomputing.com
Target Milestone: ---
Among address IVs with the same base and index, ivopts always picks one with a
non-zero offset. This incurs no extra cost on an architecture like X86, whose
LEA instruction can fold the offset into the address computation. But on ARM,
one more add-with-offset instruction is required.
  X86: lea addr_reg, base[index + offset]
  ARM: add addr_reg, base, index
       add addr_reg, addr_reg, offset
So choosing the IV without an offset can save one instruction in most situations.
Here is an example; compile it for aarch64.
int data[100];
int fn1 ();
void fn2 (int b, int n)
{
  int i;
  for (i = 0; i < n; i++, b++)
    {
      data[b + 10] = 1;
      fn1 ();
      data[b + 3] = 2;
    }
}
Analysis of ivopts shows that those address IVs have the same in-loop cost, and
the IV without an offset does have a smaller pre-loop setup cost. But since the
setup cost is averaged over the loop iterations, the minor cost difference is
lost to round-off in the integer division. To fix this round-off error, the cost
can be represented more accurately, for example by adding a fractional part to
make it a fixed-point number.
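The round-off can be illustrated with a minimal sketch. This is not the actual
ivopts cost code: the function names, the COST_SCALE factor, and the cost
values used below are all hypothetical, chosen only to show how a truncating
division hides a small setup-cost difference and how a fixed-point
representation keeps it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scale factor: 8 fractional bits in the fixed-point cost.  */
#define COST_SCALE 256

/* Per-iteration cost with the setup cost averaged by a truncating
   integer division.  Setup costs smaller than NITERS vanish entirely.  */
int64_t
cost_int (int64_t in_loop, int64_t setup, int64_t niters)
{
  assert (niters > 0);
  return in_loop + setup / niters;
}

/* Same computation, but scaled by COST_SCALE before dividing, so the
   fractional part of the averaged setup cost survives.  The result is
   in units of 1/COST_SCALE.  */
int64_t
cost_fixed (int64_t in_loop, int64_t setup, int64_t niters)
{
  assert (niters > 0);
  return in_loop * COST_SCALE + (setup * COST_SCALE) / niters;
}
```

With an assumed in-loop cost of 4, 10 iterations, and setup costs of 2 (IV
without offset) versus 3 (IV with offset), cost_int returns the same value for
both candidates, while cost_fixed ranks the IV without offset as cheaper.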