https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173

Jiong Wang <jiwang at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[5.0 regression] [AArch64]  |[5.0 regression] [AArch64]
                   |Performance regression due  |Can't ivopt array base
                   |to r213488                  |address while ARM can

--- Comment #11 from Jiong Wang <jiwang at gcc dot gnu.org> ---
(In reply to amker from comment #3)
> I have seen potential improvement of bzip/gzip on arm 32.  It relates to
> addressing mode which affecting loop invariant hoisting in kernel loop of
> these two benchmarks.  I once had a patch but didn't follow up that.  I
> think it's worthy of methodical investigation, rather than case by case
> changes.
> 
> Thanks,
> bin

Exactly.

The fix in LRA elimination only removes one unnecessary add instruction:

add    x1, x29, 48 
add    x0, x1, x0, sxtw
ldrb    w0, [x0, -16]

is transformed into

add    x0, x29, x0, sxtw
ldrb    w0, [x0, 32]

Pinski's case is fixed by this.
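
As a quick sanity check that the two instruction sequences above compute the same load address, here is a minimal C sketch of the arithmetic. The function names and the frame pointer value passed in are hypothetical, chosen only for illustration; the constants 48, -16 and 32 come from the assembly above.

```c
#include <assert.h>
#include <stdint.h>

/* Before the fix:
 *   add  x1, x29, 48
 *   add  x0, x1, x0, sxtw
 *   ldrb w0, [x0, -16]
 */
static uint64_t addr_before(uint64_t fp, int32_t idx)
{
    uint64_t x1 = fp + 48;
    uint64_t x0 = x1 + (uint64_t)(int64_t)idx;  /* sxtw: sign-extend the 32-bit index */
    return x0 - 16;
}

/* After the fix:
 *   add  x0, x29, x0, sxtw
 *   ldrb w0, [x0, 32]
 */
static uint64_t addr_after(uint64_t fp, int32_t idx)
{
    uint64_t x0 = fp + (uint64_t)(int64_t)idx;
    return x0 + 32;
}

/* Both forms fold to fp + idx + 32, since 48 - 16 == 32. */
static void check_same_address(uint64_t fp, int32_t idx)
{
    assert(addr_before(fp, idx) == addr_after(fp, idx));
}
```

The saving is exactly one add: the two constants are folded into the load's immediate offset.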

But for Seb's case, the base address calculation is still not hoisted out of
the loop, which is critical. If we re-associate ((virtual_fp + reg) +
offset) into ((virtual_fp + offset) + reg), then the later RTL GCSE PRE pass
will identify virtual_fp + offset as loop invariant and hoist it. But
the re-association is not always beneficial, e.g. when there are multiple uses.
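
To illustrate the two association orders at source level, here is a rough C sketch. It is not Seb's test case: the byte buffer standing in for the stack frame, the offset value and the function names are all assumptions made for illustration. An optimizing compiler may well generate the same code for both functions; the point is only to show which subexpression becomes loop invariant after re-association.

```c
#include <assert.h>
#include <stdint.h>

enum { ARRAY_OFFSET = 32 };  /* hypothetical offset of the array in the frame */

/* ((fp + i) + offset): the constant is added inside the loop,
 * so no loop-invariant base address is exposed. */
static unsigned sum_not_hoisted(const uint8_t *fp, int n)
{
    unsigned sum = 0;
    for (int i = 0; i < n; i++)
        sum += *((fp + i) + ARRAY_OFFSET);
    return sum;
}

/* ((fp + offset) + i): fp + offset does not depend on i,
 * so it can be computed once before the loop. */
static unsigned sum_hoisted(const uint8_t *fp, int n)
{
    const uint8_t *base = fp + ARRAY_OFFSET;  /* the hoisted invariant */
    unsigned sum = 0;
    for (int i = 0; i < n; i++)
        sum += base[i];
    return sum;
}

/* Both association orders read the same bytes. */
static void check_same_sum(const uint8_t *fp, int n)
{
    assert(sum_not_hoisted(fp, n) == sum_hoisted(fp, n));
}
```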

For the ARM backend, although the LRA elimination issue exists there as well,
there is no base address hoisting issue.  From the tree dump, ARM and AArch64
do get different results after the ivopt pass.

I will create a separate bugzilla for the LRA elimination issue.
