https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116466

--- Comment #4 from cui xu <leidian900 at outlook dot com> ---
(In reply to Jeffrey A. Law from comment #2)
> Looking at this, I would fully expect that in an optimizing compilation that
> the redundant extension would be eliminated.   Are you seeing the redundant
> sign extensions in the final assembly output or just in the intermediate RTL?
> 
> If the former, can you attach a testcase?
> 
> Thanks.

I made the following changes to the "addv" instruction pattern in riscv.md:

if (TARGET_64BIT && <MODE>mode == SImode)
    {
      rtx t3 = gen_reg_rtx (DImode);
      rtx t4 = gen_reg_rtx (DImode);
      rtx t5 = gen_reg_rtx (DImode);

      emit_insn (gen_addsi3 (operands[0], operands[1], operands[2]));
      if (GET_CODE (operands[1]) != CONST_INT)
        emit_insn (gen_extend_insn (t4, operands[1], DImode, SImode, 0));
      else
        t4 = operands[1];
      if (GET_CODE (operands[2]) != CONST_INT)
        emit_insn (gen_extend_insn (t5, operands[2], DImode, SImode, 0));
      else
        t5 = operands[2];
      emit_insn (gen_adddi3 (t3, t4, t5));

      PUT_MODE_RAW(operands[0], DImode);
      riscv_expand_conditional_branch (operands[3], NE, operands[0], t3);
      PUT_MODE_RAW(operands[0], SImode);
    }
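
To make the intent of this expansion easier to follow, here is a rough C model of the check it performs for SImode on RV64: the 32-bit sum is compared against the 64-bit sum of the sign-extended operands, and a mismatch means signed overflow. This is only a sketch; the helper name sadd32_overflows is hypothetical and is not part of GCC or of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC): models the comparison the
   expansion above emits for a 32-bit signed add on a 64-bit target.  */
static bool
sadd32_overflows (int32_t a, int32_t b, int32_t *result)
{
  /* Corresponds to adddi3 on the sign-extended operands: the exact
     sum, which always fits in 64 bits.  */
  int64_t wide = (int64_t) a + (int64_t) b;

  /* Corresponds to addsi3 (addw): the sum wrapped to 32 bits.  The
     unsigned-to-signed conversion relies on GCC's documented modulo
     behaviour for out-of-range values.  */
  int32_t narrow = (int32_t) ((uint32_t) a + (uint32_t) b);

  *result = narrow;

  /* The branch compares the sign-extended 32-bit result with the
     64-bit result; they differ exactly when the add overflowed.  */
  return (int64_t) narrow != wide;
}
```

For example, sadd32_overflows (INT32_MAX, 1, &r) reports overflow, while sadd32_overflows (1, 2, &r) does not and stores 3 in r.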

I also added the following define_insn to riscv.md:

(define_insn "*branchsidi"
  [(set (pc)
        (if_then_else
         (match_operator 1 "order_operator"
          [(match_operand:SI 2 "register_operand" " r")
           (match_operand:DI 3 "register_operand" " r")])
         (label_ref (match_operand 0 "" ""))
         (pc)))]
  ""
  "b%C1\t%2,%z3,%0"
  [(set_attr "type" "branch")
   (set_attr "mode" "none")])

With the above changes, unoptimized compilation no longer emits the extra
extension instructions.

For optimized compilation, the generated assembly compares as follows:
Without the changes:
 test:
        add     a5,a0,a1
        addw    a0,a0,a1
        sub     a0,a0,a5
        snez    a0,a0
        ret

With the changes:
 test:
        li      a5,0
        addw    a4,a0,a1
        add     a1,a0,a1
        bne     a4,a1,.L4
 .L2:
        sext.w  a0,a5
        ret
 .L4:
        li      a5,1
        j       .L2

I'm not sure whether the code generated with my changes is better or worse
than before; I would appreciate your guidance on this.

Here is my simple test case:

/* Compilation command: ./cc1 -march=rv64g -mabi=lp64 test.c */
int test(int a, int b)
{
    int result;
    return __builtin_sadd_overflow(a, b, &result);
}
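
As a complement to the test case, a small harness like the one below could check the return value at the boundaries of int. It is only a sketch: builtin_overflow stands in for test() above, and count_mismatches is a hypothetical helper name, not something from the GCC testsuite.

```c
#include <limits.h>

/* Stand-in for test() from the test case above.  */
static int
builtin_overflow (int a, int b)
{
  int result;
  return __builtin_sadd_overflow (a, b, &result);
}

/* Hypothetical helper: calls FN on a few boundary inputs and returns
   how many results differ from the expected overflow flag.  */
static int
count_mismatches (int (*fn) (int, int))
{
  int failures = 0;

  failures += fn (1, 2) != 0;             /* small values, no overflow */
  failures += fn (INT_MAX, 1) != 1;       /* positive overflow */
  failures += fn (INT_MIN, -1) != 1;      /* negative overflow */
  failures += fn (INT_MAX, INT_MIN) != 0; /* sum is -1, no overflow */
  failures += fn (INT_MIN, 0) != 0;       /* boundary value, no overflow */

  return failures;
}
```

Calling count_mismatches (builtin_overflow) should return 0 for a correct expansion, with or without the changes.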
