https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173
Jiong Wang <jiwang at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[5.0 regression] [AArch64]  |[5.0 regression] [AArch64]
                   |Performance regression due  |Can't ivopt array base
                   |to r213488                  |address while ARM can

--- Comment #11 from Jiong Wang <jiwang at gcc dot gnu.org> ---
(In reply to amker from comment #3)
> I have seen potential improvement of bzip/gzip on arm 32.  It relates to
> addressing mode which affecting loop invariant hoisting in kernel loop of
> these two benchmarks.  I once had a patch but didn't follow up that.  I
> think it's worthy of methodical investigation, rather than case by case
> changes.
>
> Thanks,
> bin

Exactly.

The fix in LRA elimination only removes one unnecessary add instruction:

  add     x1, x29, 48
  add     x0, x1, x0, sxtw
  ldrb    w0, [x0, -16]

is transformed into

  add     x0, x29, x0, sxtw
  ldrb    w0, [x0, 32]

Pinski's case is fixed by this.

But for Seb's case, the base address calculation is still not hoisted outside
the loop, which is critical.

If we re-associate ((virtual_fp + reg) + offset) into
((virtual_fp + offset) + reg), then the later RTL GCSE PRE pass will identify
virtual_fp + offset as loop invariant and do the hoisting. But the
re-association is not always good, for example when the intermediate value has
multiple uses.

For the ARM backend, although the LRA elimination issue is present, there is
no base address hoisting issue. From the tree dump, ARM and AArch64 do get
different results after the ivopt pass.

I will create a separate Bugzilla entry for the LRA elimination issue.
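
For illustration only, the re-association discussed above can be pictured at
the C source level. This is a rough analogy, not the RTL that GCC actually
manipulates: `fp` stands in for virtual_fp, `idx[k]` for the sign-extended
index register, and ARRAY_OFFSET for the constant offset. The function names,
FRAME_SIZE, and ARRAY_OFFSET are hypothetical and do not come from the test
cases in this bug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame layout: a byte array placed ARRAY_OFFSET bytes
   above the frame pointer inside a FRAME_SIZE-byte frame. */
enum { FRAME_SIZE = 64, ARRAY_OFFSET = 32 };

/* Shape ((fp + reg) + offset): the constant offset is applied after the
   variable index, so no sub-expression of the address is loop invariant. */
unsigned sum_unhoisted(const unsigned char *fp, const int *idx, size_t n)
{
    unsigned sum = 0;
    for (size_t k = 0; k < n; k++) {
        assert(idx[k] >= 0 && idx[k] < FRAME_SIZE - ARRAY_OFFSET);
        const unsigned char *t = fp + idx[k];   /* fp + reg   */
        sum += t[ARRAY_OFFSET];                 /* ... + offset */
    }
    return sum;
}

/* Shape ((fp + offset) + reg): fp + offset does not depend on the loop,
   so it can be computed once before the loop and reused. */
unsigned sum_hoisted(const unsigned char *fp, const int *idx, size_t n)
{
    const unsigned char *base = fp + ARRAY_OFFSET;  /* loop invariant */
    unsigned sum = 0;
    for (size_t k = 0; k < n; k++) {
        assert(idx[k] >= 0 && idx[k] < FRAME_SIZE - ARRAY_OFFSET);
        sum += base[idx[k]];                        /* base + reg */
    }
    return sum;
}
```

Both functions read the same bytes; only the grouping of the address
arithmetic differs. In the second shape the base is written out by hand here,
whereas in the compiler it would be up to a pass such as GCSE PRE to notice
the invariant and hoist it.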