[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

tianyang.chou at gmail dot com via Gcc-bugs Tue, 18 Mar 2025 20:07:27 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932


--- Comment #23 from Tianyang Chou <tianyang.chou at gmail dot com> ---
(In reply to Tianyang Chou from comment #22)
> (In reply to Tamar Christina from comment #21)
> > Thus finally fixed
> 
> Hi Tamar,
> 
> Is there any other prerequisite patches for this patch: "perform affine fold
> to unsigned on non address expressions." ?
> 
> I apply your patch to gcc-13.2.0 official and got 10% performance increase
> on aarch64, but applying the patch to gcc-12.3.0 official got 0 performance
> boost.
> 
> So I wonder if there are some related patches between gcc-12.3.0 and
> gcc-13.2.0 which bring the difference.

With the patch applied, gcc-12.3.0 official generates asm like the following
for the test case:

ldp w7, w6, [x0, 216]
ldp w5, w4, [x0, 252]
ldr w3, [x0, 288]
ldr w2, [x0, 292]
sub w7, w7, #1
sub w6, w6, #1
sub w5, w5, #1
sub w4, w4, #1
sub w3, w3, #1
stp w7, w6, [x0, 216]
stp w5, w4, [x0, 252]
add x0, x0, 324
sub w2, w2, #1
str w3, [x0, -36]
str w2, [x0, -32]

and gcc-13.2.0 official with the patch generates asm like:

mvni v3.2s, 0
ldr d2, [x0, 216]
ldr d1, [x0, 252]
ldr d0, [x0, 288]
add v2.2s, v2.2s, v3.2s
add x0, x0, 324
add v1.2s, v1.2s, v3.2s
add v0.2s, v0.2s, v3.2s
str d2, [x0, -108]
str d1, [x0, -72]
str d0, [x0, -36]

It seems the code did not get vectorized with gcc-12.3.0: it still uses six
scalar 32-bit subtracts, where gcc-13.2.0 uses three 2-lane vector adds.
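
To make the comparison concrete, here is a rough C model of what the two
sequences above compute. It is only a sketch and not the actual test case: the
function names step_scalar and step_vector are made up for illustration, and
the flat byte buffer is an assumption. Only the offsets 216, 252 and 288 and
the all-ones constant from "mvni v3.2s, 0" are taken from the listings.

```c
#include <stdint.h>
#include <string.h>

/* Byte offsets of the three 2 x 32-bit pairs, copied from the asm listings. */
static const int pair_offsets[3] = {216, 252, 288};

/* Hypothetical model of the gcc-12.3.0 output: six independent scalar
   "sub wN, wN, #1" operations, one per 32-bit element. */
void step_scalar(unsigned char *base)
{
    for (int i = 0; i < 3; i++) {
        for (int lane = 0; lane < 2; lane++) {
            uint32_t v;
            memcpy(&v, base + pair_offsets[i] + 4 * lane, sizeof v);
            v -= 1u;                      /* wraps like the hardware does */
            memcpy(base + pair_offsets[i] + 4 * lane, &v, sizeof v);
        }
    }
}

/* Hypothetical model of the gcc-13.2.0 output: each 64-bit "ldr dN" loads a
   pair, and one "add vN.2s" adds the all-ones vector (i.e. -1) to both lanes. */
void step_vector(unsigned char *base)
{
    const uint32_t all_ones[2] = {~0u, ~0u};   /* mvni v3.2s, 0 */
    for (int i = 0; i < 3; i++) {
        uint32_t pair[2];
        memcpy(pair, base + pair_offsets[i], sizeof pair);   /* ldr dN */
        pair[0] += all_ones[0];                              /* add vN.2s */
        pair[1] += all_ones[1];
        memcpy(base + pair_offsets[i], pair, sizeof pair);   /* str dN */
    }
}
```

Both functions leave the buffer in the same state, so the two listings should
be equivalent in effect and differ only in how many instructions they use.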
