> On 23 Oct 2025, at 8:37 pm, Richard Biener <[email protected]> wrote: > > External email: Use caution opening links or attachments > > > On Wed, Oct 22, 2025 at 12:12 PM Kugan Vivekanandarajah > <[email protected]> wrote: >> >> >> >>> On 22 Oct 2025, at 7:51 pm, Richard Biener <[email protected]> >>> wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> On Tue, Oct 21, 2025 at 11:55 PM Kugan Vivekanandarajah >>> <[email protected]> wrote: >>>> >>>> Hi Richard, >>>> Thanks for the review. >>>> >>>>> On 16 Oct 2025, at 11:06 pm, Richard Biener <[email protected]> >>>>> wrote: >>>>> >>>>> External email: Use caution opening links or attachments >>>>> >>>>> >>>>> On Wed, Oct 15, 2025 at 11:08 PM Kugan Vivekanandarajah >>>>> <[email protected]> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> This patch enables Loop Invariant Code Motion (LICM) to hoist loads that >>>>>> alias with stores rom the loaded value. >>>>>> >>>>>> The pattern a[i] = a[0] is common in TSVC benchmarks (s293): >>>>>> >>>>>> for (int i = 0; i < N; i++) >>>>>> a[i] = a[0]; >>>>>> >>>>>> Previously, GCC conservatively rejected hoisting a[0] due to potential >>>>>> aliasing when i==0. However, this is a "self write" - even when aliasing >>>>>> occurs, we're writing back the same value, making hoisting safe. >>>>>> >>>>>> The optimization checks that: >>>>>> 1. One reference is a load, the other is a store >>>>>> 2. The stored SSA value equals the loaded SSA value >>>>>> 3. Only simple cases with single accesses per reference >>>>>> >>>>>> This enables vectorization of these patterns by allowing the vectorizer >>>>>> to see the hoisted loop-invariant value. >>>>>> >>>>>> With the patch, the loop now vectorizes and generates: >>>>>> >>>>>> .L2: >>>>>> - ldr s31, [x1] >>>>>> - str s31, [x0], 4 >>>>>> - cmp x0, x2 >>>>>> + str q31, [x0], 16 >>>>>> + cmp x0, x1 >>>>>> bne .L2 >>>>>> >>>>>> gcc/ChangeLog: >>>>> >>>>> So refs_independent_p has no particular oder on the arguments it >>>>> receives and both could be stores. It seems to me the special-casing >>>>> would be better done in ref_indep_loop_p for the kind == lim_raw >>>>> only. Also >>>>> >>>>> + if (((ref1->loaded && !ref1->stored && ref2->stored) >>>>> + || (ref2->loaded && !ref2->stored && ref1->stored)) >>>>> + && is_self_write (ref1, ref2)) >>>>> + { >>>>> >>>>> you seem to imply both orders are handled, but is_self_write only >>>>> handles ref1 being the load. Handing this upthread makes the >>>>> order obvious. >>>>> >>>>> So can you see to move this up? Otherwise I think this should be OK. >>>>> I'll note this does not only handle the "obvious" case but also >>>> >>>> In the attached patched, I have moved based on the suggestion. >>>> Bootstrappped and regression tested on aarch64-linux-gnu with no New >>>> regressions. >>>> Is this OK? >>> >>> It seems you attached the old patch? >> >> Apologies. I sent the wrong patch. Here is the correct one. > > This seems to be an incremental patch of some sorts. Applying some "guessing" > the overall thing looks OK, I'd probably try micro-optimizing this > timing critical > part a bit by pre-computing load_stmt outside of the refs_to_check iteration > and only call is_self_write if that was successful (lim_raw && single > access in loop). > > Please send an updated and complete patch. You might want to double-check it > before posting ;) > Apologies for the mixup again. Here is the patch with added check for ref1->loaded && ref2->stored
Bootstrapped and regression tests on aarch64-linuc-gnu with no new regressions.
Thanks,
Kugan
> Thanks,
> Richard.
>
>> Thanks,
>> Kugan
>>
>>>
>>>> Thanks,
>>>> Kugan
>>>>
>>>>
>>>>>
>>>>> void foo(int *p, int *q, int j)
>>>>> {
>>>>> for (int i =0; i < N; ++i)
>>>>> p[i] = q[j];
>>>>> }
>>>>>
>>>>> Richard.
>>>>>
>>>>>> * tree-ssa-loop-im.cc (is_self_write): New.
>>>>>> (refs_independent_p): Allow hoisting when aliasing references
>>>>>> form a self write pattern.
>>>>>>
>>>>>> gcc/testsuite/ChangeLog:
>>>>>>
>>>>>> * gcc.dg/vect/vect-licm-hoist-1.c: New.
>>>>>> * gcc.dg/vect/vect-licm-hoist-2.c: Likewise.
>>>>>>
>>>>>> Bootstrappped and regression tested on aarch64-linux-gnu with no New
>>>>>> regressions.
>>>>>>
>>>>>> Is this Ok?
>>>>>>
>>>>>> Signed-off-by: Kugan Vivekanandarajah <[email protected]>
>>>>>>
>>>>>> Thanks,
>>>>>> Kugan
>>
>>
0001-tree-optimization-Allow-LICM-to-hoist-loads-in-self-.patch
Description: 0001-tree-optimization-Allow-LICM-to-hoist-loads-in-self-.patch
