https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91635

--- Comment #19 from Jim Wilson <wilson at gcc dot gnu.org> ---
I did cross toolchain builds and checks for rv32i/newlib and rv64gc/glibc. 
There were no regressions.

Since the splitters exist to reduce code size, I looked at the libc and
libstdc++ sizes with and without the patch.

For the rv32i build, libstdc++ is the same size, and libc is 4 bytes smaller. 
I extracted a testcase.
int
isnanf (float x)
{
  union u { float f; int i; } u;
  u.f = x;
  int i = u.i;
  i &= 0x7fffffff;
  return i > 0x7f800000;
}
Without the patch with -O2 -S we get
        slli    a0,a0,1
        li      a5,2139095040
        srli    a0,a0,1
        sgt     a0,a0,a5
        ret
with the patch we get
        li      a5,-16777216
        slli    a0,a0,1
        sgtu    a0,a0,a5
        ret
The issue here is that the extra temporary used for the splitter that generates
the slli/srli gives combine a little more freedom, and it manages to find an
optimization that was missed before.

For the rv64gc build, libc is 38 bytes smaller and libstdc++ is 34 bytes
larger.  The libc issue appears to be the same as above.  The libstdc++ issue
is more interesting.  I extracted a testcase.
unsigned long
sub (long l)
{
  union u {
    struct s { int a : 19; unsigned int b : 13; int x; } s;
    long l;
  } u;
  u.l = l;
  return u.s.b;
}
Without the patch, with -O2 -S, we generate
        srliw   a0,a0,19
        ret
and with the patch we get
        srli    a0,a0,19
        slli    a0,a0,51
        srli    a0,a0,51
        ret
The issue here is that there is no REG_DEAD note for the temporary that we
generated. So without the patch combine generates
Successfully matched this instruction:
(set (reg:DI 78)
    (zero_extract:DI (reg:DI 82)
        (const_int 13 [0xd])
        (const_int 19 [0x13])))
and with the patch combine generates
Failed to match this instruction:
(parallel [
        (set (reg:DI 78)
            (zero_extract:DI (reg:DI 82)
                (const_int 13 [0xd])
                (const_int 19 [0x13])))
        (set (reg:DI 83)
            (and:DI (ashift:DI (reg:DI 82)
                    (const_int 32 [0x20]))
                (const_int -2251799813685248 [0xfff8000000000000])))
    ])
where reg 83 is the temporary generated by the splitter which has no REG_DEAD
note.

Since the temporary appears to be generally useful, and lack of a REG_DEAD note
causes problems, I think it would be better to modify the two define_split
patterns to have a clobber instead of calling gen_reg_rtx.  I do get the
REG_DEAD note in this case, and the correct single instruction output for the
second testcase.  The three define_insn_and_split patterns can remain the same
as they aren't causing any trouble.

Though I suppose there is a question here of whether combine will ever
accidentally substitute in a paradoxical reg for a clobber.  I would hope that
it doesn't.  I can check for that.

Since I have a new patch, and new testcases, I need to redo the builds and
checks from scratch.

Reply via email to