> On 22 Oct 2025, at 7:51 pm, Richard Biener <[email protected]> wrote: > > External email: Use caution opening links or attachments > > > On Tue, Oct 21, 2025 at 11:55 PM Kugan Vivekanandarajah > <[email protected]> wrote: >> >> Hi Richard, >> Thanks for the review. >> >>> On 16 Oct 2025, at 11:06 pm, Richard Biener <[email protected]> >>> wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> On Wed, Oct 15, 2025 at 11:08 PM Kugan Vivekanandarajah >>> <[email protected]> wrote: >>>> >>>> Hi, >>>> >>>> This patch enables Loop Invariant Code Motion (LICM) to hoist loads that >>>> alias with stores rom the loaded value. >>>> >>>> The pattern a[i] = a[0] is common in TSVC benchmarks (s293): >>>> >>>> for (int i = 0; i < N; i++) >>>> a[i] = a[0]; >>>> >>>> Previously, GCC conservatively rejected hoisting a[0] due to potential >>>> aliasing when i==0. However, this is a "self write" - even when aliasing >>>> occurs, we're writing back the same value, making hoisting safe. >>>> >>>> The optimization checks that: >>>> 1. One reference is a load, the other is a store >>>> 2. The stored SSA value equals the loaded SSA value >>>> 3. Only simple cases with single accesses per reference >>>> >>>> This enables vectorization of these patterns by allowing the vectorizer >>>> to see the hoisted loop-invariant value. >>>> >>>> With the patch, the loop now vectorizes and generates: >>>> >>>> .L2: >>>> - ldr s31, [x1] >>>> - str s31, [x0], 4 >>>> - cmp x0, x2 >>>> + str q31, [x0], 16 >>>> + cmp x0, x1 >>>> bne .L2 >>>> >>>> gcc/ChangeLog: >>> >>> So refs_independent_p has no particular oder on the arguments it >>> receives and both could be stores. It seems to me the special-casing >>> would be better done in ref_indep_loop_p for the kind == lim_raw >>> only. Also >>> >>> + if (((ref1->loaded && !ref1->stored && ref2->stored) >>> + || (ref2->loaded && !ref2->stored && ref1->stored)) >>> + && is_self_write (ref1, ref2)) >>> + { >>> >>> you seem to imply both orders are handled, but is_self_write only >>> handles ref1 being the load. Handing this upthread makes the >>> order obvious. >>> >>> So can you see to move this up? Otherwise I think this should be OK. >>> I'll note this does not only handle the "obvious" case but also >> >> In the attached patched, I have moved based on the suggestion. >> Bootstrappped and regression tested on aarch64-linux-gnu with no New >> regressions. >> Is this OK? > > It seems you attached the old patch?
Apologies. I sent the wrong patch. Here is the correct one.
Thanks,
Kugan
>
>> Thanks,
>> Kugan
>>
>>
>>>
>>> void foo(int *p, int *q, int j)
>>> {
>>> for (int i =0; i < N; ++i)
>>> p[i] = q[j];
>>> }
>>>
>>> Richard.
>>>
>>>> * tree-ssa-loop-im.cc (is_self_write): New.
>>>> (refs_independent_p): Allow hoisting when aliasing references
>>>> form a self write pattern.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> * gcc.dg/vect/vect-licm-hoist-1.c: New.
>>>> * gcc.dg/vect/vect-licm-hoist-2.c: Likewise.
>>>>
>>>> Bootstrappped and regression tested on aarch64-linux-gnu with no New
>>>> regressions.
>>>>
>>>> Is this Ok?
>>>>
>>>> Signed-off-by: Kugan Vivekanandarajah <[email protected]>
>>>>
>>>> Thanks,
>>>> Kugan
0001-tree-optimization-v2-Allow-LICM-to-hoist-loads-in-se.patch
Description: 0001-tree-optimization-v2-Allow-LICM-to-hoist-loads-in-se.patch
