The patch for PR target/116256 significantly simplified the insn condition on these patterns and, not too surprisingly, exposed a minor code quality regression.
Specifically, the split part of the define_insn_and_split only fires after reload (because we use a match_scratch), so no later pass can combine the load-immediate with the subsequent add into an addi when the immediate fits into a simm12 field.
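For reference, "fits into a simm12 field" means the value is representable as a signed 12-bit immediate, i.e. it lies in [-2048, 2047]. A minimal standalone sketch of that test (the helper name is mine, not GCC's; it mirrors what the SMALL_OPERAND macro in riscv.h checks):

    #include <stdbool.h>
    #include <stdint.h>

    /* A value fits in RISC-V's signed 12-bit immediate field iff it
       lies in [-2048, 2047].  Biasing by 2048 turns the range test
       into a single unsigned comparison, which is essentially what
       SMALL_OPERAND in riscv.h does.  */
    static bool
    fits_simm12 (int64_t value)
    {
      return (uint64_t) value + 2048 < 4096;
    }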
This patch adjusts the split code to handle that scenario directly and emit the addi itself. With a bit more work we could also squeeze out the slli in this test, but that's out of scope right now since it isn't a regression.
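As a concrete illustration (hand-written sketch, not verbatim compiler output), the code for the equality check in the new test should change along these lines:

    # before: the post-reload split always used the scratch register
            slli    a0,a0,4
            li      a5,128          # load-immediate of the negated constant
            add     a0,a0,a5
            seqz    a0,a0

    # after: 128 fits in a simm12, so fold it into the add
            slli    a0,a0,4
            addi    a0,a0,128
            seqz    a0,a0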
Tested in my tester. Waiting on pre-commit testing to render a verdict.

jeff
	PR target/116256
gcc/
	* config/riscv/riscv.md (reassociating constant addition): Adjust
	split code to generate addi directly when possible.

gcc/testsuite/
	* gcc.target/riscv/pr116256-1.c: New test.

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 95951605fb4..84bce409bc7 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -4684,10 +4684,22 @@ (define_insn_and_split ""
   "(TARGET_64BIT && riscv_const_insns (operands[3], false) == 1)"
   "#"
   "&& reload_completed"
-  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 2)))
-   (set (match_dup 4) (match_dup 3))
-   (set (match_dup 0) (plus:DI (match_dup 0) (match_dup 4)))]
-  ""
+  [(const_int 0)]
+  "{
+    rtx x = gen_rtx_ASHIFT (DImode, operands[1], operands[2]);
+    emit_insn (gen_rtx_SET (operands[0], x));
+
+    /* If the constant fits in a simm12, use it directly as we do not
+       get another good chance to optimize things again.  */
+    if (!SMALL_OPERAND (INTVAL (operands[3])))
+      emit_move_insn (operands[4], operands[3]);
+    else
+      operands[4] = operands[3];
+
+    x = gen_rtx_PLUS (DImode, operands[0], operands[4]);
+    emit_insn (gen_rtx_SET (operands[0], x));
+    DONE;
+  }"
   [(set_attr "type" "arith")])
 
 (define_insn_and_split ""
@@ -4700,13 +4712,26 @@ (define_insn_and_split ""
   "(TARGET_64BIT && riscv_const_insns (operands[3], false) == 1)"
   "#"
   "&& reload_completed"
-  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 2)))
-   (set (match_dup 4) (match_dup 3))
-   (set (match_dup 0) (sign_extend:DI (plus:SI (match_dup 5) (match_dup 6))))]
+  [(const_int 0)]
   "{
     operands[1] = gen_lowpart (DImode, operands[1]);
     operands[5] = gen_lowpart (SImode, operands[0]);
     operands[6] = gen_lowpart (SImode, operands[4]);
+
+    rtx x = gen_rtx_ASHIFT (DImode, operands[1], operands[2]);
+    emit_insn (gen_rtx_SET (operands[0], x));
+
+    /* If the constant fits in a simm12, use it directly as we do not
+       get another good chance to optimize things again.  */
+    if (!SMALL_OPERAND (INTVAL (operands[3])))
+      emit_move_insn (operands[4], operands[3]);
+    else
+      operands[6] = operands[3];
+
+    x = gen_rtx_PLUS (SImode, operands[5], operands[6]);
+    x = gen_rtx_SIGN_EXTEND (DImode, x);
+    emit_insn (gen_rtx_SET (operands[0], x));
+    DONE;
   }"
   [(set_attr "type" "arith")])
 
diff --git a/gcc/testsuite/gcc.target/riscv/pr116256-1.c b/gcc/testsuite/gcc.target/riscv/pr116256-1.c
new file mode 100644
index 00000000000..9543716cd68
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr116256-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcb -mabi=lp64d" { target { rv64 } } } */
+
+#include <stdbool.h>
+
+bool f1(long a)
+{
+  long b = a << 4;
+  return b == -128;
+}
+
+/* We want to verify that we have generated addi
+   rather than li+add.  */
+/* { dg-final { scan-assembler-not "add\t" } } */
+/* { dg-final { scan-assembler "addi\t" } } */