Hi Uros,

> Is there a reason that only inserts to DImode registers are implemented?
> IMO, these peepholes should also handle inserts to SImode.

I wasn't able to construct a test case that produced a byte or word insert
into an SImode register.  The front-ends and middle-end end up producing
different code sequences, and -m32 changes the ABI so that small structs
get passed in memory rather than in registers.

Here's the expanded testcase that I investigated:

struct DataCL { char a; int b; };
struct DataWL { short a; int b; };
struct DataIL { int a; int b; };

struct DataCI { char a; short b; };
struct DataWI { short a; short b; };

char c;
short w;
int i;

DataCL foo_cl(int idx) { return { c }; }
DataCL bar_cl(int idx) { return { c, 0 }; }
DataWL foo_wl(int idx) { return { w }; }
DataWL bar_wl(int idx) { return { w, 0 }; }
DataIL foo_il(int idx) { return { i }; }
DataIL bar_il(int idx) { return { i, 0 }; }

DataCI foo_ci(int idx) { return { c }; }
DataCI bar_ci(int idx) { return { c, 0 }; }
DataWI foo_wi(int idx) { return { w }; }
DataWI bar_wi(int idx) { return { w, 0 }; }


I agree that for completeness similar peepholes handling inserts into 
SImode would be a good thing, but these wouldn't be restricted by
TARGET_64BIT and would therefore require additional -m32 testing.
The DImode peepholes I can justify for stage4 as PR tree-opt/98335
is a regression, SImode peepholes would be more of a "leap of faith".
If you’d be willing to accept a patch without a testcase, let me know.

It's also a pity that subreg handling in combine doesn't allow merging
these inserts into zero registers to be combined to zero_extends in a
machine independent way.  My recent patch for PR 95126 (awaiting
review) should also allow front-ends and middle-end passes more
flexibility in optimizing small struct constructors. 

Thanks (as always) for reviewing patches so quickly.

Roger
--


Reply via email to