On 11/18/22 14:26, Philipp Tomsich wrote:
On Fri, 18 Nov 2022 at 22:13, Jeff Law <jeffreya...@gmail.com> wrote:

On 11/9/22 16:07, Philipp Tomsich wrote:
Handling the register-const_int addition has very quickly escalated to
creating a full sign-extended 32bit constant and performing a
register-register for RISC-V in GCC so far, resulting in sequences like
(for the case of "a + 2048"):
       li      a5,4096
       addi    a5,a5,-2048
       add     a0,a0,a5

By adding an expansion for add<mode>3, we can emit optimised RTL that
matches the capabilities of RISC-V better by adding support for the
following, previously unoptimised cases:
    - addi + addi
       addi    a0,a0,2047
       addi    a0,a0,1
    - li + sh[123]add (if Zba is enabled)
       li      a5,960
       sh3add  a0,a5,a0

With this commit, we also fix up riscv_adjust_libcall_cfi_prologue()
and riscv_adjust_libcall_cfi_epilogue() to not use gen_add3_insn, as
the expander will otherwise wrap the resulting set-expression in an
insn (causing an ICE at dwarf2-time) when invoked with -msave-restore.

This closes the gap to LLVM, which has already been emitting these
optimised sequences.

Note that this benefits is perlbench (in SPEC CPU 2017), which needs
to add the constant 3840.

gcc/ChangeLog:

       * config/riscv/bitmanip.md (*shNadd): Rename.
       (riscv_shNadd<X:mode>): Expose as gen_riscv_shNadd{di/si}.
       * config/riscv/predicates.md (const_arith_shifted123_operand):
       New predicate (for constants that are a simm12, shifted by
       1, 2 or 3).
       (const_arith_2simm12_operand): New predicate (that can be
       expressed by adding 2 simm12 together).
       (addi_operand): New predicate (an immedaite operand suitable
       for the new add<mode>3 expansion).
       * config/riscv/riscv.cc (riscv_adjust_libcall_cfi_prologue):
       Don't use gen_add3_insn, where a RTX instead of an INSN is
       required (otherwise this will break as soon as we have a
       define_expand for add<mode>3).
       (riscv_adjust_libcall_cfi_epilogue): Same.
       * config/riscv/riscv.md (addsi3): Rename.
       (riscv_addsi3): New name for addsi3.
       (adddi3): Rename.
       (riscv_adddi3): New name for adddi3.
       (add<mode>3): New expander that handles the basic and fancy
       (such as li+sh[123]add, addi+addi, ...) cases for adding
       register-register and register-const_int.

gcc/testsuite/ChangeLog:

       * gcc.target/riscv/addi.c: New test.
       * gcc.target/riscv/zba-shNadd-06.c: New test.

Signed-off-by: Philipp Tomsich <philipp.toms...@vrull.eu>
---

   gcc/config/riscv/bitmanip.md                  |  2 +-
   gcc/config/riscv/predicates.md                | 28 +++++++++
   gcc/config/riscv/riscv.cc                     | 10 ++--
   gcc/config/riscv/riscv.md                     | 58 ++++++++++++++++++-
   gcc/testsuite/gcc.target/riscv/addi.c         | 39 +++++++++++++
   .../gcc.target/riscv/zba-shNadd-06.c          | 11 ++++
   6 files changed, 141 insertions(+), 7 deletions(-)
   create mode 100644 gcc/testsuite/gcc.target/riscv/addi.c
   create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shNadd-06.c



diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 171a0cdced6..289ff7470c6 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -464,6 +464,60 @@
     [(set_attr "type" "arith")
      (set_attr "mode" "DI")])

+(define_expand "add<mode>3"
+  [(set (match_operand:GPR           0 "register_operand"      "=r,r")
+     (plus:GPR (match_operand:GPR 1 "register_operand"      " r,r")
+               (match_operand:GPR 2 "addi_operand"          " r,I")))]
+  ""
+{
+  if (arith_operand (operands[2], <MODE>mode))
+    emit_insn (gen_riscv_add<mode>3 (operands[0], operands[1], operands[2]));
+  else if (const_arith_2simm12_operand (operands[2], <MODE>mode))
+    {
+      /* Split into two immediates that add up to the desired value:
+       * e.g., break up "a + 2445" into:
+       *         addi        a0,a0,2047
+       *      addi   a0,a0,398
+       */
Nit.  GNU comment style please.


+
+      HOST_WIDE_INT val = INTVAL (operands[2]);
+      HOST_WIDE_INT saturated = HOST_WIDE_INT_M1U << (IMM_BITS - 1);
+
+      if (val >= 0)
+      saturated = ~saturated;
+
+      val -= saturated;
+
+      rtx tmp = gen_reg_rtx (<MODE>mode);
Can't add<mode>3 be generated by LRA?  If so, don't you have to guard
against going into this path as we shouldn't be creating new pseudos at
that point (I know LRA can create some internally, but I don't think it
handles new ones showing up due to target expanders).


Similarly for the shifted_123 case immediately following.


If we do indeed have an issue here, I'm not sure how best to resolve.
If the output operand does not overlap with the inputs, then we're
golden and can just re-use it to form the constant.  If not,  then it's
a bit tougher.  I'm not keen to add a test of no_new_pseudos to the
operand predicate, but I don't see a better option yet.
 From a cursory glance, LRA does not try to go through gen_add3_insn,
but rather forms PLUS rtx.  This matches my expectation, as (if we
can't create a pseudo) the original add<mode>3 definition would run
into the same problem (as an immediate that doesn't satisfy the
SMALL_OPERAND test will need to be placed in a register).

I should have remembered that.  It vectors through emit_add3_insn which you and I have looked at it not terribly long ago.  We don't define addptr3, so it'll just form the expression directly rather than going through the expanders.


So, OK.


jeff


Reply via email to