On Mon, Nov 17, 2025 at 1:03 AM Kugan Vivekanandarajah
<[email protected]> wrote:
>
>
>
> > On 23 Oct 2025, at 8:37 pm, Richard Biener <[email protected]> 
> > wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Wed, Oct 22, 2025 at 12:12 PM Kugan Vivekanandarajah
> > <[email protected]> wrote:
> >>
> >>
> >>
> >>> On 22 Oct 2025, at 7:51 pm, Richard Biener <[email protected]> 
> >>> wrote:
> >>>
> >>> External email: Use caution opening links or attachments
> >>>
> >>>
> >>> On Tue, Oct 21, 2025 at 11:55 PM Kugan Vivekanandarajah
> >>> <[email protected]> wrote:
> >>>>
> >>>> Hi Richard,
> >>>> Thanks for the review.
> >>>>
> >>>>> On 16 Oct 2025, at 11:06 pm, Richard Biener 
> >>>>> <[email protected]> wrote:
> >>>>>
> >>>>> External email: Use caution opening links or attachments
> >>>>>
> >>>>>
> >>>>> On Wed, Oct 15, 2025 at 11:08 PM Kugan Vivekanandarajah
> >>>>> <[email protected]> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> This patch enables Loop Invariant Code Motion (LICM) to hoist loads 
> >>>>>> that
> >>>>>> alias with stores rom the loaded value.
> >>>>>>
> >>>>>> The pattern a[i] = a[0] is common in TSVC benchmarks (s293):
> >>>>>>
> >>>>>> for (int i = 0; i < N; i++)
> >>>>>>  a[i] = a[0];
> >>>>>>
> >>>>>> Previously, GCC conservatively rejected hoisting a[0] due to potential
> >>>>>> aliasing when i==0. However, this is a "self write" - even when 
> >>>>>> aliasing
> >>>>>> occurs, we're writing back the same value, making hoisting safe.
> >>>>>>
> >>>>>> The optimization checks that:
> >>>>>> 1. One reference is a load, the other is a store
> >>>>>> 2. The stored SSA value equals the loaded SSA value
> >>>>>> 3. Only simple cases with single accesses per reference
> >>>>>>
> >>>>>> This enables vectorization of these patterns by allowing the vectorizer
> >>>>>> to see the hoisted loop-invariant value.
> >>>>>>
> >>>>>> With the patch, the loop now vectorizes and generates:
> >>>>>>
> >>>>>>      .L2:
> >>>>>> -       ldr     s31, [x1]
> >>>>>> -       str     s31, [x0], 4
> >>>>>> -       cmp     x0, x2
> >>>>>> +       str     q31, [x0], 16
> >>>>>> +       cmp     x0, x1
> >>>>>>      bne     .L2
> >>>>>>
> >>>>>> gcc/ChangeLog:
> >>>>>
> >>>>> So refs_independent_p has no particular oder on the arguments it
> >>>>> receives and both could be stores.  It seems to me the special-casing
> >>>>> would be better done in ref_indep_loop_p for the kind == lim_raw
> >>>>> only.  Also
> >>>>>
> >>>>> +      if (((ref1->loaded && !ref1->stored && ref2->stored)
> >>>>> +          || (ref2->loaded && !ref2->stored && ref1->stored))
> >>>>> +         && is_self_write (ref1, ref2))
> >>>>> +       {
> >>>>>
> >>>>> you seem to imply both orders are handled, but is_self_write only
> >>>>> handles ref1 being the load.  Handing this upthread makes the
> >>>>> order obvious.
> >>>>>
> >>>>> So can you see to move this up?  Otherwise I think this should be OK.
> >>>>> I'll note this does not only handle the "obvious" case but also
> >>>>
> >>>> In the attached patched, I have moved based on the suggestion.
> >>>> Bootstrappped and regression tested on aarch64-linux-gnu with no New 
> >>>> regressions.
> >>>> Is this OK?
> >>>
> >>> It seems you attached the old patch?
> >>
> >> Apologies. I sent the wrong patch. Here is the correct one.
> >
> > This seems to be an incremental patch of some sorts.  Applying some 
> > "guessing"
> > the overall thing looks OK, I'd probably try micro-optimizing this
> > timing critical
> > part a bit by pre-computing load_stmt outside of the refs_to_check iteration
> > and only call is_self_write if that was successful (lim_raw && single
> > access in loop).
> >
> > Please send an updated and complete patch.  You might want to double-check 
> > it
> > before posting ;)
> >
> Apologies for the mixup again. Here is the patch with added check for 
> ref1->loaded && ref2->stored

This is the original patch again ....

So NACK.

:/

Please triple-check what you attach.

Richard.

> Bootstrapped and regression tests on aarch64-linuc-gnu with no new 
> regressions.
>
> Thanks,
> Kugan
>
>
> > Thanks,
> > Richard.
> >
> >> Thanks,
> >> Kugan
> >>
> >>>
> >>>> Thanks,
> >>>> Kugan
> >>>>
> >>>>
> >>>>>
> >>>>> void foo(int *p, int *q, int j)
> >>>>> {
> >>>>> for  (int i =0; i < N; ++i)
> >>>>>  p[i] = q[j];
> >>>>> }
> >>>>>
> >>>>> Richard.
> >>>>>
> >>>>>>      * tree-ssa-loop-im.cc (is_self_write): New.
> >>>>>>      (refs_independent_p): Allow hoisting when aliasing references
> >>>>>>      form a self write pattern.
> >>>>>>
> >>>>>> gcc/testsuite/ChangeLog:
> >>>>>>
> >>>>>>      * gcc.dg/vect/vect-licm-hoist-1.c: New.
> >>>>>>      * gcc.dg/vect/vect-licm-hoist-2.c: Likewise.
> >>>>>>
> >>>>>> Bootstrappped and regression tested on aarch64-linux-gnu with no New
> >>>>>> regressions.
> >>>>>>
> >>>>>> Is this Ok?
> >>>>>>
> >>>>>> Signed-off-by: Kugan Vivekanandarajah <[email protected]>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Kugan
> >>
> >>
>

Reply via email to