On Mon, Mar 7, 2022 at 12:51 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
>
> > Is there a reason that only inserts to DImode registers are implemented?
> > IMO, these peepholes should also handle inserts to SImode.
>
> I wasn't able to construct a test case that produced a byte or word insert
> into an SImode register.  The front-ends and middle-end end up producing
> different code sequences, and -m32 changes the ABI so that small structs
> get passed in memory rather than in registers.
>
> Here's the expanded testcase that I investigated:
>
> struct DataCL { char a; int b; };
> struct DataWL { short a; int b; };
> struct DataIL { int a; int b; };
>
> struct DataCI { char a; short b; };
> struct DataWI { short a; short b; };
>
> char c;
> short w;
> int i;
>
> DataCL foo_cl(int idx) { return { c }; }
> DataCL bar_cl(int idx) { return { c, 0 }; }
> DataWL foo_wl(int idx) { return { w }; }
> DataWL bar_wl(int idx) { return { w, 0 }; }
> DataIL foo_il(int idx) { return { i }; }
> DataIL bar_il(int idx) { return { i, 0 }; }
>
> DataCI foo_ci(int idx) { return { c }; }
> DataCI bar_ci(int idx) { return { c, 0 }; }
> DataWI foo_wi(int idx) { return { w }; }
> DataWI bar_wi(int idx) { return { w, 0 }; }
>
>
> I agree that for completeness similar peepholes handling inserts into
> SImode would be a good thing, but these wouldn't be restricted by
> TARGET_64BIT and would therefore require additional -m32 testing.
> The DImode peepholes I can justify for stage4 as PR tree-opt/98335
> is a regression, SImode peepholes would be more of a "leap of faith".
> If you’d be willing to accept a patch without a testcase, let me know.

We have plenty of these, where we assume that if the pattern works in
one mode, it also works in other modes. So, I think that by changing
DI mode to SWI48 mode iterator, we are on the safe side. We can also
trust bootstrap and regression tests here.

IMO, the patch with SWI48 mode iterator is OK.

Thanks,
Uros.

>
> It's also a pity that subreg handling in combine doesn't allow merging
> these inserts into zero registers to be combined to zero_extends in a
> machine independent way.  My recent patch for PR 95126 (awaiting
> review) should also allow front-ends and middle-end passes more
> flexibility in optimizing small struct constructors.
>
> Thanks (as always) for reviewing patches so quickly.
>
> Roger
> --
>
>

Reply via email to