On 5/24/23 17:14, Jivan Hakobyan via Gcc-patches wrote:
Subject:
[RFC] RISC-V: Eliminate extension after for *w instructions
From:
Jivan Hakobyan via Gcc-patches <gcc-patches@gcc.gnu.org>
Date:
5/24/23, 17:14

To:
gcc-patches@gcc.gnu.org


This patch tries to prevent generating unnecessary sign extension
after *w instructions like "addiw" or "divw".

The main idea of it is to add SUBREG_PROMOTED fields during expanding.

I have tested on SPEC2017 there is no regression.
Only gcc.dg/pr30957-1.c test failed.
To solve that I did some changes in loop-iv.cc, but not sure that it is
suitable.
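
As a rough illustration of the semantics being exploited (this is not part of the patch), here is a minimal C model of a *w instruction. The names model_sext_w and model_addw are hypothetical; the sketch only assumes that the *w forms operate on the low 32 bits and write back a sign-extended 64-bit result, which is why a following sext.w is redundant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of sext.w: keep the low 32 bits of X and
   sign-extend them to 64 bits.  Written without relying on
   implementation-defined narrowing conversions.  */
static int64_t
model_sext_w (int64_t x)
{
  uint32_t lo = (uint32_t) x;
  if (lo & UINT32_C (0x80000000))
    return (int64_t) lo - (INT64_C (1) << 32);
  return (int64_t) lo;
}

/* Hypothetical model of addw: 32-bit wrapping add whose result is
   written back sign-extended.  */
static int64_t
model_addw (int64_t a, int64_t b)
{
  uint32_t sum = (uint32_t) a + (uint32_t) b;
  int64_t result = model_sext_w ((int64_t) sum);

  /* The property the patch relies on: extending again changes nothing.  */
  assert (model_sext_w (result) == result);
  return result;
}
```

Marking the SImode lowpart as a signed promoted subreg is, roughly speaking, how the expanders tell the rest of the compiler about this property.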


gcc/ChangeLog:
         * config/riscv/bitmanip.md (rotrdi3): New pattern.
         (rotrsi3): Likewise.
         (rotlsi3): Likewise.
         * config/riscv/riscv-protos.h (riscv_emit_binary): New function
         declaration
         * config/riscv/riscv.cc (riscv_emit_binary): Removed static
         * config/riscv/riscv.md (addsi3): New pattern
         (subsi3): Likewise.
         (negsi2): Likewise.
         (mulsi3): Likewise.
         (<optab>si3): New pattern for any_div.
         (<optab>si3): New pattern for any_shift.
         * loop-iv.cc (get_biv_step_1):  Process src of extension when it PLUS

gcc/testsuite/ChangeLog:
         * testsuite/gcc.target/riscv/shift-and-2.c: New test
         * testsuite/gcc.target/riscv/shift-shift-2.c: New test
         * testsuite/gcc.target/riscv/sign-extend.c: New test
         * testsuite/gcc.target/riscv/zbb-rol-ror-03.c: New test


-- With the best regards Jivan Hakobyan


extend.diff

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 96d31d92670b27d495dc5a9fbfc07e8767f40976..0430af7c95b1590308648dc4d5aaea78ada71760 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -304,9 +304,9 @@
    [(set_attr "type" "bitmanip,load")
     (set_attr "mode" "HI")])
-(define_expand "rotr<mode>3"
-  [(set (match_operand:GPR 0 "register_operand")
-       (rotatert:GPR (match_operand:GPR 1 "register_operand")
+(define_expand "rotrdi3"
+  [(set (match_operand:DI 0 "register_operand")
+       (rotatert:DI (match_operand:DI 1 "register_operand")
                     (match_operand:QI 2 "arith_operand")))]
    "TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB"
The condition for this expander needs to be adjusted.

Previously it used the GPR iterator.  The GPR iterator is defined like this:


(define_mode_iterator GPR [SI (DI "TARGET_64BIT")])

Note how the DI case is conditional on TARGET_64BIT.

This impacts the HAVE_* macros that are generated from the MD file in insn-flags.h:

#define HAVE_rotrsi3 (TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB)
#define HAVE_rotrdi3 ((TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB) && (TARGET_64BIT))

Note how the rotrdi3 has the && (TARGET_64BIT) on the end.

With your change we would expose rotrdi3 independent of TARGET_64BIT which is not what we want.
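
To make the difference concrete, here is a small C sketch of how the two conditions combine. The struct and function names are hypothetical stand-ins for the real target macros; only the boolean structure is taken from the HAVE_* definitions above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the target flags; the real ones are macros
   derived from the command-line options.  */
struct target_flags
{
  bool is_64bit;
  bool zbb;
  bool xtheadbb;
  bool zbkb;
};

/* The condition string written on the expander itself.  */
static bool
have_bitmanip_rotate (const struct target_flags *t)
{
  assert (t != NULL);
  return t->zbb || t->xtheadbb || t->zbkb;
}

/* What the GPR iterator produced for DImode: the expander condition
   combined with the iterator's own TARGET_64BIT condition.  */
static bool
have_rotrdi3_via_iterator (const struct target_flags *t)
{
  return have_bitmanip_rotate (t) && t->is_64bit;
}

/* What a hand-written rotrdi3 with the unmodified condition yields.  */
static bool
have_rotrdi3_unconditional (const struct target_flags *t)
{
  return have_bitmanip_rotate (t);
}
```

For an rv32 target with Zbb the first predicate is false and the second is true, which is the unwanted exposure described above.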


Sorry I didn't catch that earlier.  I'll fix this minor problem.



@@ -544,7 +562,7 @@
        rtx t5 = gen_reg_rtx (DImode);
        rtx t6 = gen_reg_rtx (DImode);
- emit_insn (gen_addsi3 (operands[0], operands[1], operands[2]));
+      riscv_emit_binary(PLUS, operands[0], operands[1], operands[2]);
Just a note. In GCC we always emit a space between the function name and the open parenthesis for its argument list. I fixed a few of these.

@@ -867,8 +938,8 @@
emit_insn (gen_smul<mode>3_highpart (hp, operands[1], operands[2]));
        emit_insn (gen_mul<mode>3 (operands[0], operands[1], operands[2]));
-      emit_insn (gen_ashr<mode>3 (lp, operands[0],
-                                 GEN_INT (BITS_PER_WORD - 1)));
+      riscv_emit_binary(ASHIFTRT, lp, operands[0],
+                                 GEN_INT (BITS_PER_WORD - 1));
Another formatting nit. When we wrap lines for an argument list, we line up the arguments, something like this:

frobit (a, b, c,
        d, e, f);



Obviously that's not a great example as it doesn't need wrapping, but it should clearly show how we indent things in this case. I've fixed up this nit.


diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index 6c40db947f7f549303f8bb4d4f38aa98b6561bcc..bec1ea7e4ccf7291bb3dba91161f948e66c7bea9 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -637,7 +637,7 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
  {
    rtx set, rhs, op0 = NULL_RTX, op1 = NULL_RTX;
    rtx next, nextr;
-  enum rtx_code code;
+  enum rtx_code code, prev_code;
So as I mentioned earlier, PREV_CODE might be used without being initialized. I've initialized it to "UNKNOWN" which is a special RTX code which can be used for this purpose.

If we are changing a target independent file the standard is that we bootstrap and regression test on at least one primary platform such as x86_64 linux. This would have been caught by that bootstrap process as it's a pretty simple uninitialized object use to analyze.
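
For anyone unfamiliar with the idiom, here is a minimal C sketch of the sentinel initialization. The enums and the classify function are hypothetical miniatures, not the real rtx_code or loop-iv.cc code; only the idea of starting from an "unknown" value comes from the fix described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the code enum.  */
enum mini_code { MINI_UNKNOWN, MINI_PLUS, MINI_SIGN_EXTEND, MINI_ZERO_EXTEND };

enum mini_extend { MINI_NO_EXTEND, MINI_IV_SIGN, MINI_IV_ZERO };

static enum mini_extend
classify (enum mini_code outer, bool inner_is_plus)
{
  /* This sketch only handles an extension as the outer operation.  */
  assert (outer == MINI_SIGN_EXTEND || outer == MINI_ZERO_EXTEND);

  /* Start from the sentinel so the tests below are well defined even
     when the branch that assigns prev_code is not taken.  */
  enum mini_code prev_code = MINI_UNKNOWN;

  if (inner_is_plus)
    prev_code = outer;

  if (prev_code == MINI_SIGN_EXTEND)
    return MINI_IV_SIGN;
  if (prev_code == MINI_ZERO_EXTEND)
    return MINI_IV_ZERO;
  return MINI_NO_EXTEND;
}
```

Without the initializer, the path where inner_is_plus is false would read an indeterminate value, which is the kind of thing a bootstrap with warnings enabled tends to flag.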


    rtx_insn *insn = DF_REF_INSN (def);
    df_ref next_def;
    enum iv_grd_result res;
@@ -697,6 +697,23 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
        return false;
op0 = XEXP (rhs, 0);
+
+      if (GET_CODE (op0) == PLUS)
I've added a few comments before this code indicating why it was added. And WRT formatting, we strongly prefer to use a tab rather than 8 spaces. I think it's pretty annoying myself, but I live with it as it's the standard for the GCC project. I've fixed these nits as well.
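
For readers following along without the sources, the idea of the new code can be sketched on a toy expression tree. Everything below (toy_expr, toy_code, toy_step_inside_extend) is hypothetical and far simpler than real RTL; it handles only a sign extension, and it just shows the shape of the check: look through the extension, canonicalize so that the constant comes second, then require a register plus a constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy expression codes, loosely named after RTL codes.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_PLUS, TOY_SIGN_EXTEND };

struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op0;
  const struct toy_expr *op1;
  long value;			/* Only meaningful for TOY_CONST.  */
};

/* If E is an extension wrapped around (plus reg const) or
   (plus const reg), store the constant in *STEP and return true.  */
static bool
toy_step_inside_extend (const struct toy_expr *e, long *step)
{
  assert (e != NULL && step != NULL);

  if (e->code != TOY_SIGN_EXTEND
      || e->op0 == NULL
      || e->op0->code != TOY_PLUS)
    return false;

  const struct toy_expr *a = e->op0->op0;
  const struct toy_expr *b = e->op0->op1;
  if (a == NULL || b == NULL)
    return false;

  /* Canonicalize so that the constant, if any, is the second operand.  */
  if (a->code == TOY_CONST)
    {
      const struct toy_expr *tmp = a;
      a = b;
      b = tmp;
    }

  if (a->code != TOY_REG || b->code != TOY_CONST)
    return false;

  *step = b->value;
  return true;
}
```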

WRT the impacts. We haven't run this on the emulator, but we can easily look at the instruction counts for spec2017. It's a clear improvement with the biggest improvements in the xz benchmark (1-2%), with others in the 0.5% range.

There is one regression of note with how we expand comparisons against string constants, which shows up in omnetpp. I'm going to file a bug for that so it doesn't get lost. In my opinion the benefits of this patch outweigh the impact of that minor regression.

What will be really interesting will be to test the TARGET_REP_EXTENDED patch from Philipp again. I wouldn't be surprised at all if much of the benefit of TARGET_REP_EXTENDED is negated by exposing the target's actual semantics like you've done.

This regressed the zba-shNadd-05 test, which isn't a big surprise given this test was designed to test a pattern which is sensitive to the structure of 32bit arithmetic. I've adjusted the appropriate define_insn_and_split so that it matches the updated RTL.

There are a couple of other splitters in bitmanip.md which may need similar treatment. But I don't see cases for them in the testsuite, so it's hard to know if those splitters are useful and to test whether any adjustments to them are correct.


Bootstrapped and regression tested on x86 linux. Regression tested on riscv64 as well. Attached is the version I pushed to the trunk.

Thanks again,


Jeff


commit 99bfdb072e67fa3fe294d86b4b2a9f686f8d9705
Author: Jeff Law <j...@ventanamicro.com>
Date:   Wed Jun 7 13:40:16 2023 -0600

    RISC-V: Eliminate extension after for *w instructions
    
    This patch tries to prevent generating unnecessary sign extension
    after *w instructions like "addiw" or "divw".
    
    The main idea of it is to add SUBREG_PROMOTED fields during expanding.
    
    I have tested on SPEC2017 there is no regression.
    Only gcc.dg/pr30957-1.c test failed.
    To solve that I did some changes in loop-iv.cc, but not sure that it is
    suitable.
    
    gcc/ChangeLog:
            * config/riscv/bitmanip.md (rotrdi3, rotrsi3, rotlsi3): New expanders.
            (rotrsi3_sext): Expose generator.
            (rotlsi3 pattern): Hide generator.
            * config/riscv/riscv-protos.h (riscv_emit_binary): New function
            declaration.
            * config/riscv/riscv.cc (riscv_emit_binary): Removed static
            * config/riscv/riscv.md (addsi3, subsi3, negsi2): Hide generator.
            (mulsi3, <optab>si3): Likewise.
            (addsi3, subsi3, negsi2, mulsi3, <optab>si3): New expanders.
            (addv<mode>4, subv<mode>4, mulv<mode>4): Use riscv_emit_binary.
            (<u>mulsidi3): Likewise.
            (addsi3_extended, subsi3_extended, negsi2_extended): Expose generator.
            (mulsi3_extended, <optab>si3_extended): Likewise.
            (splitter for shadd feeding division): Update RTL pattern to account
            for changes in how 32 bit ops are expanded for TARGET_64BIT.
            * loop-iv.cc (get_biv_step_1): Process src of extension when it PLUS.
    
    gcc/testsuite/ChangeLog:
    
            * gcc.target/riscv/shift-and-2.c: New tests.
            * gcc.target/riscv/shift-shift-2.c: Adjust expected output.
            * gcc.target/riscv/sign-extend.c: New test.
            * gcc.target/riscv/zbb-rol-ror-03.c: Adjust expected output.
    
    Co-authored-by: Jeff Law  <j...@ventanamicro.com>

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 96d31d92670..c42e7b890db 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -47,10 +47,10 @@ (define_insn "*shNadd"
 ; implicit sign-extensions.
 (define_split
   [(set (match_operand:DI 0 "register_operand")
-       (sign_extend:DI (div:SI (plus:SI (subreg:SI (ashift:DI (match_operand:DI 1 "register_operand")
-                                                              (match_operand:QI 2 "imm123_operand")) 0)
-                                                   (subreg:SI (match_operand:DI 3 "register_operand") 0))
-               (subreg:SI (match_operand:DI 4 "register_operand") 0))))
+       (sign_extend:DI (div:SI (plus:SI (ashift:SI (subreg:SI (match_operand:DI 1 "register_operand") 0)
+                                                   (match_operand:QI 2 "imm123_operand"))
+                                        (subreg:SI (match_operand:DI 3 "register_operand") 0))
+                               (subreg:SI (match_operand:DI 4 "register_operand") 0))))
   (clobber (match_operand:DI 5 "register_operand"))]
  "TARGET_64BIT && TARGET_ZBA"
   [(set (match_dup 5) (plus:DI (ashift:DI (match_dup 1) (match_dup 2)) (match_dup 3)))
@@ -304,11 +304,11 @@ (define_insn "*zero_extendhi<GPR:mode>2_zbb"
   [(set_attr "type" "bitmanip,load")
    (set_attr "mode" "HI")])
 
-(define_expand "rotr<mode>3"
-  [(set (match_operand:GPR 0 "register_operand")
-       (rotatert:GPR (match_operand:GPR 1 "register_operand")
+(define_expand "rotrdi3"
+  [(set (match_operand:DI 0 "register_operand")
+       (rotatert:DI (match_operand:DI 1 "register_operand")
                     (match_operand:QI 2 "arith_operand")))]
-  "TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB"
+  "TARGET_64BIT && (TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB)"
 {
   if (TARGET_XTHEADBB && !immediate_operand (operands[2], VOIDmode))
     FAIL;
@@ -322,6 +322,26 @@ (define_insn "*rotrsi3"
   "ror%i2%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
+(define_expand "rotrsi3"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+       (rotatert:SI (match_operand:SI 1 "register_operand" "r")
+                    (match_operand:QI 2 "arith_operand" "rI")))]
+  "TARGET_ZBB || TARGET_ZBKB || TARGET_XTHEADBB"
+{
+  if (TARGET_XTHEADBB && !immediate_operand (operands[2], VOIDmode))
+    FAIL;
+  if (TARGET_64BIT && register_operand (operands[2], QImode))
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_rotrsi3_sext (t, operands[1], operands[2]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
 (define_insn "*rotrdi3"
   [(set (match_operand:DI 0 "register_operand" "=r")
        (rotatert:DI (match_operand:DI 1 "register_operand" "r")
@@ -330,7 +350,7 @@ (define_insn "*rotrdi3"
   "ror%i2\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
-(define_insn "*rotrsi3_sext"
+(define_insn "rotrsi3_sext"
   [(set (match_operand:DI 0 "register_operand" "=r")
        (sign_extend:DI (rotatert:SI (match_operand:SI 1 "register_operand" "r")
                                  (match_operand:QI 2 "arith_operand" "rI"))))]
@@ -338,7 +358,7 @@ (define_insn "*rotrsi3_sext"
   "ror%i2%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
-(define_insn "rotlsi3"
+(define_insn "*rotlsi3"
   [(set (match_operand:SI 0 "register_operand" "=r")
        (rotate:SI (match_operand:SI 1 "register_operand" "r")
                   (match_operand:QI 2 "register_operand" "r")))]
@@ -346,6 +366,24 @@ (define_insn "rotlsi3"
   "rol%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
+(define_expand "rotlsi3"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+       (rotate:SI (match_operand:SI 1 "register_operand" "r")
+                  (match_operand:QI 2 "register_operand" "r")))]
+  "TARGET_ZBB || TARGET_ZBKB"
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_rotlsi3_sext (t, operands[1], operands[2]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
 (define_insn "rotldi3"
   [(set (match_operand:DI 0 "register_operand" "=r")
        (rotate:DI (match_operand:DI 1 "register_operand" "r")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 9782f1794fb..38e4125424b 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -61,6 +61,7 @@ extern const char *riscv_output_return ();
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
 extern void riscv_expand_float_scc (rtx, enum rtx_code, rtx, rtx);
 extern void riscv_expand_conditional_branch (rtx, enum rtx_code, rtx, rtx);
+extern rtx riscv_emit_binary (enum rtx_code code, rtx dest, rtx x, rtx y);
 #endif
 extern bool riscv_expand_conditional_move (rtx, rtx, rtx, rtx);
 extern rtx riscv_legitimize_call_address (rtx);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 60ebd9903e5..de30bf4e567 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1415,7 +1415,7 @@ riscv_emit_set (rtx target, rtx src)
 
 /* Emit an instruction of the form (set DEST (CODE X Y)).  */
 
-static rtx
+rtx
 riscv_emit_binary (enum rtx_code code, rtx dest, rtx x, rtx y)
 {
   return riscv_emit_set (dest, gen_rtx_fmt_ee (code, GET_MODE (dest), x, y));
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index be960583101..38b8fba2a53 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -513,7 +513,7 @@ (define_insn "add<mode>3"
   [(set_attr "type" "fadd")
    (set_attr "mode" "<UNITMODE>")])
 
-(define_insn "addsi3"
+(define_insn "*addsi3"
   [(set (match_operand:SI          0 "register_operand" "=r,r")
        (plus:SI (match_operand:SI 1 "register_operand" " r,r")
                 (match_operand:SI 2 "arith_operand"    " r,I")))]
@@ -522,6 +522,24 @@ (define_insn "addsi3"
   [(set_attr "type" "arith")
    (set_attr "mode" "SI")])
 
+(define_expand "addsi3"
+  [(set (match_operand:SI          0 "register_operand" "=r,r")
+       (plus:SI (match_operand:SI 1 "register_operand" " r,r")
+                (match_operand:SI 2 "arith_operand"    " r,I")))]
+  ""
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_addsi3_extended (t, operands[1], operands[2]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
 (define_insn "adddi3"
   [(set (match_operand:DI          0 "register_operand" "=r,r")
        (plus:DI (match_operand:DI 1 "register_operand" " r,r")
@@ -545,7 +563,7 @@ (define_expand "addv<mode>4"
       rtx t5 = gen_reg_rtx (DImode);
       rtx t6 = gen_reg_rtx (DImode);
 
-      emit_insn (gen_addsi3 (operands[0], operands[1], operands[2]));
+      riscv_emit_binary (PLUS, operands[0], operands[1], operands[2]);
       if (GET_CODE (operands[1]) != CONST_INT)
        emit_insn (gen_extend_insn (t4, operands[1], DImode, SImode, 0));
       else
@@ -591,7 +609,7 @@ (define_expand "uaddv<mode>4"
        emit_insn (gen_extend_insn (t3, operands[1], DImode, SImode, 0));
       else
        t3 = operands[1];
-      emit_insn (gen_addsi3 (operands[0], operands[1], operands[2]));
+      riscv_emit_binary (PLUS, operands[0], operands[1], operands[2]);
       emit_insn (gen_extend_insn (t4, operands[0], DImode, SImode, 0));
 
       riscv_expand_conditional_branch (operands[3], LTU, t4, t3);
@@ -606,7 +624,7 @@ (define_expand "uaddv<mode>4"
   DONE;
 })
 
-(define_insn "*addsi3_extended"
+(define_insn "addsi3_extended"
   [(set (match_operand:DI               0 "register_operand" "=r,r")
        (sign_extend:DI
             (plus:SI (match_operand:SI 1 "register_operand" " r,r")
@@ -653,7 +671,7 @@ (define_insn "subdi3"
   [(set_attr "type" "arith")
    (set_attr "mode" "DI")])
 
-(define_insn "subsi3"
+(define_insn "*subsi3"
   [(set (match_operand:SI           0 "register_operand" "= r")
        (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
                  (match_operand:SI 2 "register_operand" "  r")))]
@@ -662,6 +680,24 @@ (define_insn "subsi3"
   [(set_attr "type" "arith")
    (set_attr "mode" "SI")])
 
+(define_expand "subsi3"
+  [(set (match_operand:SI           0 "register_operand" "= r")
+       (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
+                 (match_operand:SI 2 "register_operand" "  r")))]
+  ""
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_subsi3_extended (t, operands[1], operands[2]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
 (define_expand "subv<mode>4"
   [(set (match_operand:GPR            0 "register_operand" "= r")
        (minus:GPR (match_operand:GPR 1 "reg_or_0_operand" " rJ")
@@ -676,7 +712,7 @@ (define_expand "subv<mode>4"
       rtx t5 = gen_reg_rtx (DImode);
       rtx t6 = gen_reg_rtx (DImode);
 
-      emit_insn (gen_subsi3 (operands[0], operands[1], operands[2]));
+      riscv_emit_binary (MINUS, operands[0], operands[1], operands[2]);
       if (GET_CODE (operands[1]) != CONST_INT)
        emit_insn (gen_extend_insn (t4, operands[1], DImode, SImode, 0));
       else
@@ -725,7 +761,7 @@ (define_expand "usubv<mode>4"
        emit_insn (gen_extend_insn (t3, operands[1], DImode, SImode, 0));
       else
        t3 = operands[1];
-      emit_insn (gen_subsi3 (operands[0], operands[1], operands[2]));
+      riscv_emit_binary (MINUS, operands[0], operands[1], operands[2]);
       emit_insn (gen_extend_insn (t4, operands[0], DImode, SImode, 0));
 
       riscv_expand_conditional_branch (operands[3], LTU, t3, t4);
@@ -741,7 +777,7 @@ (define_expand "usubv<mode>4"
 })
 
 
-(define_insn "*subsi3_extended"
+(define_insn "subsi3_extended"
   [(set (match_operand:DI               0 "register_operand" "= r")
        (sign_extend:DI
            (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
@@ -770,7 +806,7 @@ (define_insn "negdi2"
   [(set_attr "type" "arith")
    (set_attr "mode" "DI")])
 
-(define_insn "negsi2"
+(define_insn "*negsi2"
   [(set (match_operand:SI         0 "register_operand" "=r")
        (neg:SI (match_operand:SI 1 "register_operand" " r")))]
   ""
@@ -778,7 +814,24 @@ (define_insn "negsi2"
   [(set_attr "type" "arith")
    (set_attr "mode" "SI")])
 
-(define_insn "*negsi2_extended"
+(define_expand "negsi2"
+  [(set (match_operand:SI         0 "register_operand" "=r")
+       (neg:SI (match_operand:SI 1 "register_operand" " r")))]
+  ""
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_negsi2_extended (t, operands[1]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
+(define_insn "negsi2_extended"
   [(set (match_operand:DI          0 "register_operand" "=r")
        (sign_extend:DI
         (neg:SI (match_operand:SI 1 "register_operand" " r"))))]
@@ -814,7 +867,7 @@ (define_insn "mul<mode>3"
   [(set_attr "type" "fmul")
    (set_attr "mode" "<UNITMODE>")])
 
-(define_insn "mulsi3"
+(define_insn "*mulsi3"
   [(set (match_operand:SI          0 "register_operand" "=r")
        (mult:SI (match_operand:SI 1 "register_operand" " r")
                 (match_operand:SI 2 "register_operand" " r")))]
@@ -823,6 +876,24 @@ (define_insn "mulsi3"
   [(set_attr "type" "imul")
    (set_attr "mode" "SI")])
 
+(define_expand "mulsi3"
+  [(set (match_operand:SI          0 "register_operand" "=r")
+       (mult:SI (match_operand:SI 1 "register_operand" " r")
+                (match_operand:SI 2 "register_operand" " r")))]
+  "TARGET_ZMMUL || TARGET_MUL"
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_mulsi3_extended (t, operands[1], operands[2]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
 (define_insn "muldi3"
   [(set (match_operand:DI          0 "register_operand" "=r")
        (mult:DI (match_operand:DI 1 "register_operand" " r")
@@ -868,8 +939,8 @@ (define_expand "mulv<mode>4"
 
       emit_insn (gen_smul<mode>3_highpart (hp, operands[1], operands[2]));
       emit_insn (gen_mul<mode>3 (operands[0], operands[1], operands[2]));
-      emit_insn (gen_ashr<mode>3 (lp, operands[0],
-                                 GEN_INT (BITS_PER_WORD - 1)));
+      riscv_emit_binary (ASHIFTRT, lp, operands[0],
+                        GEN_INT (BITS_PER_WORD - 1));
 
       riscv_expand_conditional_branch (operands[3], NE, hp, lp);
     }
@@ -923,7 +994,7 @@ (define_expand "umulv<mode>4"
   DONE;
 })
 
-(define_insn "*mulsi3_extended"
+(define_insn "mulsi3_extended"
   [(set (match_operand:DI              0 "register_operand" "=r")
        (sign_extend:DI
            (mult:SI (match_operand:SI 1 "register_operand" " r")
@@ -1024,7 +1095,7 @@ (define_expand "<u>mulsidi3"
   "(TARGET_ZMMUL || TARGET_MUL) && !TARGET_64BIT"
 {
   rtx temp = gen_reg_rtx (SImode);
-  emit_insn (gen_mulsi3 (temp, operands[1], operands[2]));
+  riscv_emit_binary (MULT, temp, operands[1], operands[2]);
   emit_insn (gen_<su>mulsi3_highpart (riscv_subword (operands[0], true),
                                     operands[1], operands[2]));
   emit_insn (gen_movsi (riscv_subword (operands[0], false), temp));
@@ -1055,7 +1126,7 @@ (define_expand "usmulsidi3"
   "(TARGET_ZMMUL || TARGET_MUL) && !TARGET_64BIT"
 {
   rtx temp = gen_reg_rtx (SImode);
-  emit_insn (gen_mulsi3 (temp, operands[1], operands[2]));
+  riscv_emit_binary (MULT, temp, operands[1], operands[2]);
   emit_insn (gen_usmulsi3_highpart (riscv_subword (operands[0], true),
                                     operands[1], operands[2]));
   emit_insn (gen_movsi (riscv_subword (operands[0], false), temp));
@@ -1084,7 +1155,7 @@ (define_insn "usmulsi3_highpart"
 ;;  ....................
 ;;
 
-(define_insn "<optab>si3"
+(define_insn "*<optab>si3"
   [(set (match_operand:SI             0 "register_operand" "=r")
        (any_div:SI (match_operand:SI 1 "register_operand" " r")
                    (match_operand:SI 2 "register_operand" " r")))]
@@ -1093,6 +1164,24 @@ (define_insn "<optab>si3"
   [(set_attr "type" "idiv")
    (set_attr "mode" "SI")])
 
+(define_expand "<optab>si3"
+  [(set (match_operand:SI             0 "register_operand" "=r")
+       (any_div:SI (match_operand:SI 1 "register_operand" " r")
+                   (match_operand:SI 2 "register_operand" " r")))]
+  "TARGET_DIV"
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_<optab>si3_extended (t, operands[1], operands[2]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
 (define_insn "<optab>di3"
   [(set (match_operand:DI             0 "register_operand" "=r")
        (any_div:DI (match_operand:DI 1 "register_operand" " r")
@@ -1118,7 +1207,7 @@ (define_expand "<u>divmod<mode>4"
       DONE;
   })
 
-(define_insn "*<optab>si3_extended"
+(define_insn "<optab>si3_extended"
   [(set (match_operand:DI                 0 "register_operand" "=r")
        (sign_extend:DI
            (any_div:SI (match_operand:SI 1 "register_operand" " r")
@@ -2072,7 +2161,7 @@ (define_insn "riscv_pause"
 ;; expand_shift_1 can do this automatically when SHIFT_COUNT_TRUNCATED is
 ;; defined, but use of that is discouraged.
 
-(define_insn "<optab>si3"
+(define_insn "*<optab>si3"
   [(set (match_operand:SI     0 "register_operand" "= r")
        (any_shift:SI
            (match_operand:SI 1 "register_operand" "  r")
@@ -2088,6 +2177,24 @@ (define_insn "<optab>si3"
   [(set_attr "type" "shift")
    (set_attr "mode" "SI")])
 
+(define_expand "<optab>si3"
+  [(set (match_operand:SI     0 "register_operand" "= r")
+       (any_shift:SI (match_operand:SI 1 "register_operand" "  r")
+                (match_operand:QI 2 "arith_operand"    " rI")))]
+  ""
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_<optab>si3_extend (t, operands[1], operands[2]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
 (define_insn "<optab>di3"
   [(set (match_operand:DI 0 "register_operand"     "= r")
        (any_shift:DI
@@ -2122,7 +2229,7 @@ (define_insn_and_split "*<optab><GPR:mode>3_mask_1"
   [(set_attr "type" "shift")
    (set_attr "mode" "<GPR:MODE>")])
 
-(define_insn "*<optab>si3_extend"
+(define_insn "<optab>si3_extend"
   [(set (match_operand:DI                   0 "register_operand" "= r")
        (sign_extend:DI
            (any_shift:SI (match_operand:SI 1 "register_operand" "  r")
diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index 6c40db947f7..858c5ee84f2 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -637,7 +637,7 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
 {
   rtx set, rhs, op0 = NULL_RTX, op1 = NULL_RTX;
   rtx next, nextr;
-  enum rtx_code code;
+  enum rtx_code code, prev_code = UNKNOWN;
   rtx_insn *insn = DF_REF_INSN (def);
   df_ref next_def;
   enum iv_grd_result res;
@@ -697,6 +697,27 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
        return false;
 
       op0 = XEXP (rhs, 0);
+
+      /* rv64 wraps SImode arithmetic inside an extension to DImode.
+        This matches the actual hardware semantics.  So peek inside
+        the extension and see if we have simple arithmetic that we
+        can analyze.  */
+      if (GET_CODE (op0) == PLUS)
+       {
+         rhs = op0;
+         op0 = XEXP (rhs, 0);
+         op1 = XEXP (rhs, 1);
+
+         if (CONSTANT_P (op0))
+           std::swap (op0, op1);
+
+         if (!simple_reg_p (op0) || !CONSTANT_P (op1))
+           return false;
+
+         prev_code = code;
+         code = PLUS;
+       }
+
       if (!simple_reg_p (op0))
        return false;
 
@@ -769,6 +790,11 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
       else
        *outer_step = simplify_gen_binary (code, outer_mode,
                                           *outer_step, op1);
+
+      if (prev_code == SIGN_EXTEND)
+       *extend = IV_SIGN_EXTEND;
+      else if (prev_code == ZERO_EXTEND)
+       *extend = IV_ZERO_EXTEND;
       break;
 
     case SIGN_EXTEND:
diff --git a/gcc/testsuite/gcc.target/riscv/shift-and-2.c b/gcc/testsuite/gcc.target/riscv/shift-and-2.c
index bc01e8ef992..ee9925b7498 100644
--- a/gcc/testsuite/gcc.target/riscv/shift-and-2.c
+++ b/gcc/testsuite/gcc.target/riscv/shift-and-2.c
@@ -38,5 +38,24 @@ sub6 (long i, long j)
 {
   return i << (j & 0x3f);
 }
+
+/* Test for <optab>si3_extend. */
+int
+sub7 (int i, int j) {
+  return (i << 10) & j;
+}
+
+/* Test for <optab>si3_extend. */
+unsigned
+sub8 (unsigned i, unsigned j) {
+  return (i << 10) & j;
+}
+
+/* Test for <optab>si3_extend. */
+unsigned
+sub9 (unsigned i, unsigned j) {
+  return (i >> 10) & j;
+}
+
 /* { dg-final { scan-assembler-not "andi" } } */
 /* { dg-final { scan-assembler-not "sext.w" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
index 5f93be15ac5..bc8c4ef3828 100644
--- a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
+++ b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
@@ -38,5 +38,6 @@ sub5 (unsigned int i)
 }
 /* { dg-final { scan-assembler-times "slli" 5 } } */
 /* { dg-final { scan-assembler-times "srli" 5 } } */
-/* { dg-final { scan-assembler-times "slliw" 1 } } */
-/* { dg-final { scan-assembler-times "srliw" 1 } } */
+/* { dg-final { scan-assembler-times ",40" 2 } } */ /* For sub5 test */
+/* { dg-final { scan-assembler-not "slliw" } } */
+/* { dg-final { scan-assembler-not "srliw" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sign-extend.c b/gcc/testsuite/gcc.target/riscv/sign-extend.c
new file mode 100644
index 00000000000..6f840194833
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sign-extend.c
@@ -0,0 +1,81 @@
+/* { dg-do compile { target { riscv64*-*-* } } } */
+/* { dg-options "-march=rv64gc -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+unsigned
+foo1 (unsigned x, unsigned y, unsigned z)
+{
+  return x & (y - z);
+}
+
+int
+foo2 (int x, int y, int z)
+{
+  return x & (y - z);
+}
+
+unsigned
+foo3 (unsigned x, unsigned y, unsigned z)
+{
+  return x & (y * z);
+}
+
+int
+foo4 (int x, int y, int z)
+{
+  return x & (y * z);
+}
+
+unsigned
+foo5 (unsigned x, unsigned y)
+{
+  return x & (y / x);
+}
+
+int
+foo6 (int x, int y)
+{
+  return x & (y / x);
+}
+
+unsigned
+foo7 (unsigned x, unsigned y)
+{
+  return x & (y % x);
+}
+
+int
+foo8 (int x, int y)
+{
+  return x & (y % x);
+}
+
+int
+foo9 (int x)
+{
+  return x & (-x);
+}
+
+unsigned
+foo10 (unsigned x, unsigned y)
+{
+  return x & (y + x);
+}
+
+
+unsigned
+foo11 (unsigned x)
+{
+  return x & (15 + x);
+}
+
+/* { dg-final { scan-assembler-times "subw" 2 } } */
+/* { dg-final { scan-assembler-times "addw" 1 } } */
+/* { dg-final { scan-assembler-times "addiw" 1 } } */
+/* { dg-final { scan-assembler-times "mulw" 2 } } */
+/* { dg-final { scan-assembler-times "divw" 1 } } */
+/* { dg-final { scan-assembler-times "divuw" 1 } } */
+/* { dg-final { scan-assembler-times "remw" 1 } } */
+/* { dg-final { scan-assembler-times "remuw" 1 } } */
+/* { dg-final { scan-assembler-times "negw" 1 } } */
+/* { dg-final { scan-assembler-not "sext.w" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-03.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-03.c
index b44d7fe8920..e7e5cbb9a1a 100644
--- a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-03.c
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-03.c
@@ -16,4 +16,5 @@ unsigned int ror(unsigned int rs1, unsigned int rs2)
 
 /* { dg-final { scan-assembler-times "rolw" 1 } } */
 /* { dg-final { scan-assembler-times "rorw" 1 } } */
-/* { dg-final { scan-assembler-not "and" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "and" } } */
+/* { dg-final { scan-assembler-not "sext.w" } } */
