https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98981

--- Comment #4 from Jim Wilson <wilson at gcc dot gnu.org> ---
With this testcase

extern void sub2 (void);

void
sub (int *i, int *j)
{
  int k = *i + 1;
  *j = k;
  if (k == 0)
    sub2 ();
}

Compiling without the riscv_rtx_cost patch, I get
        lw      a5,0(a0)
        addiw   a4,a5,1
        sw      a4,0(a1)
        beq     a4,zero,.L4
compiling with the riscv_rtx_cost patch, I get
        lw      a5,0(a0)
        addiw   a4,a5,1
        sw      a4,0(a1)
        addiw   a5,a5,1
        beq     a5,zero,.L4

The problem here is that we have RTL
(insn 9 7 10 2 (set (reg:SI 76)
        (plus:SI (subreg:SI (reg:DI 78 [ *i_4(D) ]) 0)
            (const_int 1 [0x1]))) "tmp.c":6:7 3 {addsi3}
     (expr_list:REG_DEAD (reg:DI 78 [ *i_4(D) ])
        (nil)))
(insn 10 9 11 2 (set (reg/v:DI 73 [ k ])
        (sign_extend:DI (reg:SI 76))) "tmp.c":6:7 90 {extendsidi2}
     (nil))
with the SImode 76 used for the sw and the DImode 73 used for the beq.  With
the riscv_rtx_cost change, which makes the sign_extend add cheap, combine folds
the add into the sign-extend to get
(insn 9 7 10 2 (set (reg:SI 76)
        (plus:SI (subreg:SI (reg:DI 78 [ *i_4(D) ]) 0)
            (const_int 1 [0x1]))) "tmp.c":6:7 3 {addsi3}
     (nil))
(insn 10 9 11 2 (set (reg/v:DI 73 [ k ])
        (sign_extend:DI (plus:SI (subreg:SI (reg:DI 78 [ *i_4(D) ]) 0)
                (const_int 1 [0x1])))) "tmp.c":6:7 5 {*addsi3_extended}
     (expr_list:REG_DEAD (reg:DI 78 [ *i_4(D) ])
        (nil)))
and now we have two adds which is wrong.  Without the patch, combine does
nothing, then ree manages to merge the sign_extend into the add and subseg the
sw, so we end up with only the one addw and no explicit sign extend.

My take on this is that we never should have generated the SImode add in the
first place, because we don't actually have that instruction.  If we would have
generated the sign-extended add at rtl generation time, and subreged the
result, then we would have gotten the right result.  In theory.

So I think the riscv_rtx_cost change is useful, but only if combined with the
rtl generation change I proposed in comment 1.

Reply via email to