On 5/24/23 17:14, Jivan Hakobyan via Gcc-patches wrote:
Subject: [RFC] RISC-V: Eliminate extension after for *w instructions
From: Jivan Hakobyan via Gcc-patches <gcc-patches@gcc.gnu.org>
Date: 5/24/23, 17:14
To: gcc-patches@gcc.gnu.org
This patch tries to prevent generating unnecessary sign extension
after *w instructions like "addiw" or "divw".
The main idea of it is to add SUBREG_PROMOTED fields during expanding.
I have tested on SPEC2017 there is no regression.
Only gcc.dg/pr30957-1.c test failed.
To solve that I did some changes in loop-iv.cc, but not sure that it is
suitable.
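For readers less familiar with the *w instructions, here is a minimal C
sketch of why the extra extension is redundant. It is only a model under
stated assumptions: model_addw, model_sext_w and
sext_after_addw_is_redundant are hypothetical names, not GCC or RISC-V
APIs, and the unsigned-to-signed narrowing relies on the usual
two's-complement behaviour (which GCC provides).

```c
#include <assert.h>
#include <stdint.h>

/* Model of the RV64 "addw" instruction: add the low 32 bits of the
   operands and sign-extend the 32-bit result to 64 bits.
   Assumption: converting uint32_t to int32_t wraps modulo 2^32.  */
int64_t
model_addw (int64_t a, int64_t b)
{
  uint32_t low = (uint32_t) a + (uint32_t) b;
  return (int64_t) (int32_t) low;
}

/* Model of "sext.w": sign-extend the low 32 bits of X.  */
int64_t
model_sext_w (int64_t x)
{
  return (int64_t) (int32_t) (uint32_t) x;
}

/* Return 1 when a sext.w placed after addw would not change the
   register value, i.e. the extension is redundant.  */
int
sext_after_addw_is_redundant (int64_t a, int64_t b)
{
  int64_t r = model_addw (a, b);
  return model_sext_w (r) == r;
}
```

In this model the result of model_addw is already in sign-extended form,
which is the property the SUBREG_PROMOTED marking is meant to tell the
rest of the compiler about.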
gcc/ChangeLog:
* config/riscv/bitmanip.md (rotrdi3): New pattern.
(rotrsi3): Likewise.
(rotlsi3): Likewise.
* config/riscv/riscv-protos.h (riscv_emit_binary): New function
declaration
* config/riscv/riscv.cc (riscv_emit_binary): Removed static
* config/riscv/riscv.md (addsi3): New pattern
(subsi3): Likewise.
(negsi2): Likewise.
(mulsi3): Likewise.
(<optab>si3): New pattern for any_div.
(<optab>si3): New pattern for any_shift.
* loop-iv.cc (get_biv_step_1): Process src of extension when it
PLUS
gcc/testsuite/ChangeLog:
* testsuite/gcc.target/riscv/shift-and-2.c: New test
* testsuite/gcc.target/riscv/shift-shift-2.c: New test
* testsuite/gcc.target/riscv/sign-extend.c: New test
* testsuite/gcc.target/riscv/zbb-rol-ror-03.c: New test
-- With the best regards Jivan Hakobyan
extend.diff
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 96d31d92670b27d495dc5a9fbfc07e8767f40976..0430af7c95b1590308648dc4d5aaea78ada71760 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -304,9 +304,9 @@
[(set_attr "type" "bitmanip,load")
(set_attr "mode" "HI")])
-(define_expand "rotr<mode>3"
- [(set (match_operand:GPR 0 "register_operand")
- (rotatert:GPR (match_operand:GPR 1 "register_operand")
+(define_expand "rotrdi3"
+ [(set (match_operand:DI 0 "register_operand")
+ (rotatert:DI (match_operand:DI 1 "register_operand")
(match_operand:QI 2 "arith_operand")))]
"TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB"
The condition for this expander needs to be adjusted.
Previously it used the GPR iterator. The GPR iterator is defined like this:
(define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
Note how the DI case is conditional on TARGET_64BIT.
This impacts the HAVE_* macros that are generated from the MD file in
insn-flags.h:
#define HAVE_rotrsi3 (TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB)
#define HAVE_rotrdi3 ((TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB) && (TARGET_64BIT))
Note how the rotrdi3 has the && (TARGET_64BIT) on the end.
With your change we would expose rotrdi3 independent of TARGET_64BIT
which is not what we want.
Sorry I didn't catch that earlier. I'll fix this minor problem.
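To make the difference concrete, here is a small C sketch of the two
conditions. The struct and function names are hypothetical stand-ins for
the real target flags and generated HAVE_* macros; this is an
illustration, not the contents of insn-flags.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real TARGET_* flags.  */
struct target_flags
{
  bool zbb, xtheadbb, zbkb, is_64bit;
};

/* The condition written on the expander itself.  */
static bool
rot_ext_p (const struct target_flags *t)
{
  return t->zbb || t->xtheadbb || t->zbkb;
}

/* Model of HAVE_rotrdi3 when the expander uses the GPR iterator:
   the iterator's DI condition is ANDed onto the end.  */
bool
have_rotrdi3_iterator (const struct target_flags *t)
{
  return rot_ext_p (t) && t->is_64bit;
}

/* Model of HAVE_rotrdi3 for an explicit DI expander whose condition
   does not mention TARGET_64BIT.  */
bool
have_rotrdi3_unguarded (const struct target_flags *t)
{
  return rot_ext_p (t);
}
```

For an rv32 target with Zbb, the first function returns false and the
second returns true, which is the unwanted exposure described above.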
@@ -544,7 +562,7 @@
rtx t5 = gen_reg_rtx (DImode);
rtx t6 = gen_reg_rtx (DImode);
- emit_insn (gen_addsi3 (operands[0], operands[1], operands[2]));
+ riscv_emit_binary(PLUS, operands[0], operands[1], operands[2]);
Just a note. In GCC we always emit a space between the function name
and the open parenthesis for its argument list. I fixed a few of these.
@@ -867,8 +938,8 @@
emit_insn (gen_smul<mode>3_highpart (hp, operands[1], operands[2]));
emit_insn (gen_mul<mode>3 (operands[0], operands[1], operands[2]));
- emit_insn (gen_ashr<mode>3 (lp, operands[0],
- GEN_INT (BITS_PER_WORD - 1)));
+ riscv_emit_binary(ASHIFTRT, lp, operands[0],
+ GEN_INT (BITS_PER_WORD - 1));
Another formatting nit. When we wrap lines for an argument list, we
line up the arguments. So something like this
  frobit (a, b, c,
          d, e, f);
Obviously that's not a great example as it doesn't need wrapping, but it
should clearly show how we indent things in this case. I've fixed up
this nit.
diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index 6c40db947f7f549303f8bb4d4f38aa98b6561bcc..bec1ea7e4ccf7291bb3dba91161f948e66c7bea9 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -637,7 +637,7 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
{
rtx set, rhs, op0 = NULL_RTX, op1 = NULL_RTX;
rtx next, nextr;
- enum rtx_code code;
+ enum rtx_code code, prev_code;
So as I mentioned earlier, PREV_CODE might be used without being
initialized. I've initialized it to "UNKNOWN" which is a special RTX
code which can be used for this purpose.
If we are changing a target independent file the standard is that we
bootstrap and regression test on at least one primary platform such as
x86_64 linux. This would have been caught by that bootstrap process as
it's a pretty simple uninitialized object use to analyze.
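As a rough illustration of the problem, the following C sketch mirrors the
control flow in a cut-down form. The enums and the classify function are
hypothetical stand-ins, not the real rtx codes or loop-iv.cc code; the
point is only that prev_code is read on a path where it was never
assigned unless it starts out with a sentinel value.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-ins for GCC's rtx codes.  */
enum code_model { UNKNOWN_M, PLUS_M, SIGN_EXTEND_M, ZERO_EXTEND_M };
enum extend_model { EXT_NONE, EXT_SIGN, EXT_ZERO };

enum extend_model
classify (enum code_model outer, int inner_is_plus)
{
  enum code_model code = outer;
  /* The fix: start from a sentinel so the reads below are always
     well defined, even when the block that assigns it is skipped.  */
  enum code_model prev_code = UNKNOWN_M;

  if ((outer == SIGN_EXTEND_M || outer == ZERO_EXTEND_M) && inner_is_plus)
    {
      prev_code = code;
      code = PLUS_M;
    }

  if (code != PLUS_M)
    return EXT_NONE;

  /* Reached for a plain PLUS too, where prev_code was never assigned.  */
  if (prev_code == SIGN_EXTEND_M)
    return EXT_SIGN;
  if (prev_code == ZERO_EXTEND_M)
    return EXT_ZERO;
  return EXT_NONE;
}
```

Calling classify (PLUS_M, 0) exercises the path that would otherwise read
an indeterminate value.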
rtx_insn *insn = DF_REF_INSN (def);
df_ref next_def;
enum iv_grd_result res;
@@ -697,6 +697,23 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
return false;
op0 = XEXP (rhs, 0);
+
+ if (GET_CODE (op0) == PLUS)
I've added a few comments before this code indicating why it was added.
And WRT formatting, we strongly prefer to use a tab rather than 8
spaces. I think it's pretty annoying myself, but I live with it as it's
the standard for the GCC project. I've fixed these nits as well.
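For anyone trying to follow what the loop-iv.cc change looks for, here is
a standalone C model of the idea: peek inside an extension and accept
simple "register plus constant" arithmetic. The expression type and
biv_step_through_extend are hypothetical and much simpler than real RTL;
treat this as a sketch of the matching logic, not the code that was
committed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature expression tree, standing in for RTL.  */
enum kind { K_REG, K_CONST, K_PLUS, K_SIGN_EXTEND, K_ZERO_EXTEND };

struct expr
{
  enum kind kind;
  const struct expr *op0, *op1;
  long value;			/* Only meaningful for K_CONST.  */
};

/* If E is an extension wrapping (plus reg const) or (plus const reg),
   store the constant in *STEP, store whether the extension is signed
   in *IS_SIGNED, and return 1.  Otherwise return 0.  */
int
biv_step_through_extend (const struct expr *e, long *step, int *is_signed)
{
  if (e->kind != K_SIGN_EXTEND && e->kind != K_ZERO_EXTEND)
    return 0;

  const struct expr *inner = e->op0;
  if (inner->kind != K_PLUS)
    return 0;

  const struct expr *op0 = inner->op0;
  const struct expr *op1 = inner->op1;

  /* Canonicalize so the constant, if any, is the second operand.  */
  if (op0->kind == K_CONST)
    {
      const struct expr *tmp = op0;
      op0 = op1;
      op1 = tmp;
    }

  if (op0->kind != K_REG || op1->kind != K_CONST)
    return 0;

  *step = op1->value;
  *is_signed = e->kind == K_SIGN_EXTEND;
  return 1;
}
```

The operand swap mirrors the std::swap in the patch, so a constant in
either position is accepted.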
WRT the impacts: we haven't run this on the emulator, but we can easily
look at the instruction counts for spec2017. It's a clear improvement,
with the biggest improvements in the xz benchmark (1-2%) and others in
the 0.5% range.
There is one regression of note, in omnetpp, with how we expand
comparisons against string constants. I'm going to file a bug for that
so it doesn't get lost. In my opinion the benefits of this patch
outweigh the impact of that minor regression.
What will be really interesting will be to test the TARGET_REP_EXTENDED
patch from Philipp again. I wouldn't be surprised at all if much of the
benefit of TARGET_REP_EXTENDED is negated by exposing the target's
actual semantics like you've done.
This regressed the zba-shNadd-05 test, which isn't a big surprise given
this test was designed to test a pattern which is sensitive to the
structure of 32bit arithmetic. I've adjusted the appropriate
define_insn_and_split so that it matches the updated RTL.
There are a couple of other splitters in bitmanip.md which may need
similar treatment. But I don't see cases for them in the testsuite, so
it's hard to know if those splitters are useful and to test if we've got
any adjustments to them correct.
Bootstrapped and regression tested on x86 linux. Regression tested on
riscv64 as well. Attached is the version I pushed to the trunk.
Thanks again,
Jeff
commit 99bfdb072e67fa3fe294d86b4b2a9f686f8d9705
Author: Jeff Law <j...@ventanamicro.com>
Date: Wed Jun 7 13:40:16 2023 -0600
RISC-V: Eliminate extension after for *w instructions
This patch tries to prevent generating unnecessary sign extension
after *w instructions like "addiw" or "divw".
The main idea of it is to add SUBREG_PROMOTED fields during expanding.
I have tested on SPEC2017 there is no regression.
Only gcc.dg/pr30957-1.c test failed.
To solve that I did some changes in loop-iv.cc, but not sure that it is
suitable.
gcc/ChangeLog:
* config/riscv/bitmanip.md (rotrdi3, rotrsi3, rotlsi3): New
expanders.
(rotrsi3_sext): Expose generator.
(rotlsi3 pattern): Hide generator.
* config/riscv/riscv-protos.h (riscv_emit_binary): New function
declaration.
* config/riscv/riscv.cc (riscv_emit_binary): Removed static
* config/riscv/riscv.md (addsi3, subsi3, negsi2): Hide generator.
(mulsi3, <optab>si3): Likewise.
(addsi3, subsi3, negsi2, mulsi3, <optab>si3): New expanders.
(addv<mode>4, subv<mode>4, mulv<mode>4): Use riscv_emit_binary.
(<u>mulsidi3): Likewise.
(addsi3_extended, subsi3_extended, negsi2_extended): Expose
generator.
(mulsi3_extended, <optab>si3_extended): Likewise.
(splitter for shadd feeding divison): Update RTL pattern to account
for changes in how 32 bit ops are expanded for TARGET_64BIT.
* loop-iv.cc (get_biv_step_1): Process src of extension when it
PLUS.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/shift-and-2.c: New tests.
* gcc.target/riscv/shift-shift-2.c: Adjust expected output.
* gcc.target/riscv/sign-extend.c: New test.
* gcc.target/riscv/zbb-rol-ror-03.c: Adjust expected output.
Co-authored-by: Jeff Law <j...@ventanamicro.com>
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 96d31d92670..c42e7b890db 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -47,10 +47,10 @@ (define_insn "*shNadd"
; implicit sign-extensions.
(define_split
[(set (match_operand:DI 0 "register_operand")
- (sign_extend:DI (div:SI (plus:SI (subreg:SI (ashift:DI (match_operand:DI 1 "register_operand")
- (match_operand:QI 2 "imm123_operand")) 0)
- (subreg:SI (match_operand:DI 3 "register_operand") 0))
- (subreg:SI (match_operand:DI 4 "register_operand") 0))))
+ (sign_extend:DI (div:SI (plus:SI (ashift:SI (subreg:SI (match_operand:DI 1 "register_operand") 0)
+ (match_operand:QI 2 "imm123_operand"))
+ (subreg:SI (match_operand:DI 3 "register_operand") 0))
+ (subreg:SI (match_operand:DI 4 "register_operand") 0))))
(clobber (match_operand:DI 5 "register_operand"))]
"TARGET_64BIT && TARGET_ZBA"
[(set (match_dup 5) (plus:DI (ashift:DI (match_dup 1) (match_dup 2))
(match_dup 3)))
@@ -304,11 +304,11 @@ (define_insn "*zero_extendhi<GPR:mode>2_zbb"
[(set_attr "type" "bitmanip,load")
(set_attr "mode" "HI")])
-(define_expand "rotr<mode>3"
- [(set (match_operand:GPR 0 "register_operand")
- (rotatert:GPR (match_operand:GPR 1 "register_operand")
+(define_expand "rotrdi3"
+ [(set (match_operand:DI 0 "register_operand")
+ (rotatert:DI (match_operand:DI 1 "register_operand")
(match_operand:QI 2 "arith_operand")))]
- "TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB"
+ "TARGET_64BIT && (TARGET_ZBB || TARGET_XTHEADBB || TARGET_ZBKB)"
{
if (TARGET_XTHEADBB && !immediate_operand (operands[2], VOIDmode))
FAIL;
@@ -322,6 +322,26 @@ (define_insn "*rotrsi3"
"ror%i2%~\t%0,%1,%2"
[(set_attr "type" "bitmanip")])
+(define_expand "rotrsi3"
+ [(set (match_operand:SI 0 "register_operand" "=r")
+ (rotatert:SI (match_operand:SI 1 "register_operand" "r")
+ (match_operand:QI 2 "arith_operand" "rI")))]
+ "TARGET_ZBB || TARGET_ZBKB || TARGET_XTHEADBB"
+{
+ if (TARGET_XTHEADBB && !immediate_operand (operands[2], VOIDmode))
+ FAIL;
+ if (TARGET_64BIT && register_operand (operands[2], QImode))
+ {
+ rtx t = gen_reg_rtx (DImode);
+ emit_insn (gen_rotrsi3_sext (t, operands[1], operands[2]));
+ t = gen_lowpart (SImode, t);
+ SUBREG_PROMOTED_VAR_P (t) = 1;
+ SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+ emit_move_insn (operands[0], t);
+ DONE;
+ }
+})
+
(define_insn "*rotrdi3"
[(set (match_operand:DI 0 "register_operand" "=r")
(rotatert:DI (match_operand:DI 1 "register_operand" "r")
@@ -330,7 +350,7 @@ (define_insn "*rotrdi3"
"ror%i2\t%0,%1,%2"
[(set_attr "type" "bitmanip")])
-(define_insn "*rotrsi3_sext"
+(define_insn "rotrsi3_sext"
[(set (match_operand:DI 0 "register_operand" "=r")
(sign_extend:DI (rotatert:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:QI 2 "arith_operand" "rI"))))]
@@ -338,7 +358,7 @@ (define_insn "*rotrsi3_sext"
"ror%i2%~\t%0,%1,%2"
[(set_attr "type" "bitmanip")])
-(define_insn "rotlsi3"
+(define_insn "*rotlsi3"
[(set (match_operand:SI 0 "register_operand" "=r")
(rotate:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:QI 2 "register_operand" "r")))]
@@ -346,6 +366,24 @@ (define_insn "rotlsi3"
"rol%~\t%0,%1,%2"
[(set_attr "type" "bitmanip")])
+(define_expand "rotlsi3"
+ [(set (match_operand:SI 0 "register_operand" "=r")
+ (rotate:SI (match_operand:SI 1 "register_operand" "r")
+ (match_operand:QI 2 "register_operand" "r")))]
+ "TARGET_ZBB || TARGET_ZBKB"
+{
+ if (TARGET_64BIT)
+ {
+ rtx t = gen_reg_rtx (DImode);
+ emit_insn (gen_rotlsi3_sext (t, operands[1], operands[2]));
+ t = gen_lowpart (SImode, t);
+ SUBREG_PROMOTED_VAR_P (t) = 1;
+ SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+ emit_move_insn (operands[0], t);
+ DONE;
+ }
+})
+
(define_insn "rotldi3"
[(set (match_operand:DI 0 "register_operand" "=r")
(rotate:DI (match_operand:DI 1 "register_operand" "r")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 9782f1794fb..38e4125424b 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -61,6 +61,7 @@ extern const char *riscv_output_return ();
extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
extern void riscv_expand_float_scc (rtx, enum rtx_code, rtx, rtx);
extern void riscv_expand_conditional_branch (rtx, enum rtx_code, rtx, rtx);
+extern rtx riscv_emit_binary (enum rtx_code code, rtx dest, rtx x, rtx y);
#endif
extern bool riscv_expand_conditional_move (rtx, rtx, rtx, rtx);
extern rtx riscv_legitimize_call_address (rtx);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 60ebd9903e5..de30bf4e567 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1415,7 +1415,7 @@ riscv_emit_set (rtx target, rtx src)
/* Emit an instruction of the form (set DEST (CODE X Y)). */
-static rtx
+rtx
riscv_emit_binary (enum rtx_code code, rtx dest, rtx x, rtx y)
{
return riscv_emit_set (dest, gen_rtx_fmt_ee (code, GET_MODE (dest), x, y));
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index be960583101..38b8fba2a53 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -513,7 +513,7 @@ (define_insn "add<mode>3"
[(set_attr "type" "fadd")
(set_attr "mode" "<UNITMODE>")])
-(define_insn "addsi3"
+(define_insn "*addsi3"
[(set (match_operand:SI 0 "register_operand" "=r,r")
(plus:SI (match_operand:SI 1 "register_operand" " r,r")
(match_operand:SI 2 "arith_operand" " r,I")))]
@@ -522,6 +522,24 @@ (define_insn "addsi3"
[(set_attr "type" "arith")
(set_attr "mode" "SI")])
+(define_expand "addsi3"
+ [(set (match_operand:SI 0 "register_operand" "=r,r")
+ (plus:SI (match_operand:SI 1 "register_operand" " r,r")
+ (match_operand:SI 2 "arith_operand" " r,I")))]
+ ""
+{
+ if (TARGET_64BIT)
+ {
+ rtx t = gen_reg_rtx (DImode);
+ emit_insn (gen_addsi3_extended (t, operands[1], operands[2]));
+ t = gen_lowpart (SImode, t);
+ SUBREG_PROMOTED_VAR_P (t) = 1;
+ SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+ emit_move_insn (operands[0], t);
+ DONE;
+ }
+})
+
(define_insn "adddi3"
[(set (match_operand:DI 0 "register_operand" "=r,r")
(plus:DI (match_operand:DI 1 "register_operand" " r,r")
@@ -545,7 +563,7 @@ (define_expand "addv<mode>4"
rtx t5 = gen_reg_rtx (DImode);
rtx t6 = gen_reg_rtx (DImode);
- emit_insn (gen_addsi3 (operands[0], operands[1], operands[2]));
+ riscv_emit_binary (PLUS, operands[0], operands[1], operands[2]);
if (GET_CODE (operands[1]) != CONST_INT)
emit_insn (gen_extend_insn (t4, operands[1], DImode, SImode, 0));
else
@@ -591,7 +609,7 @@ (define_expand "uaddv<mode>4"
emit_insn (gen_extend_insn (t3, operands[1], DImode, SImode, 0));
else
t3 = operands[1];
- emit_insn (gen_addsi3 (operands[0], operands[1], operands[2]));
+ riscv_emit_binary (PLUS, operands[0], operands[1], operands[2]);
emit_insn (gen_extend_insn (t4, operands[0], DImode, SImode, 0));
riscv_expand_conditional_branch (operands[3], LTU, t4, t3);
@@ -606,7 +624,7 @@ (define_expand "uaddv<mode>4"
DONE;
})
-(define_insn "*addsi3_extended"
+(define_insn "addsi3_extended"
[(set (match_operand:DI 0 "register_operand" "=r,r")
(sign_extend:DI
(plus:SI (match_operand:SI 1 "register_operand" " r,r")
@@ -653,7 +671,7 @@ (define_insn "subdi3"
[(set_attr "type" "arith")
(set_attr "mode" "DI")])
-(define_insn "subsi3"
+(define_insn "*subsi3"
[(set (match_operand:SI 0 "register_operand" "= r")
(minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
(match_operand:SI 2 "register_operand" " r")))]
@@ -662,6 +680,24 @@ (define_insn "subsi3"
[(set_attr "type" "arith")
(set_attr "mode" "SI")])
+(define_expand "subsi3"
+ [(set (match_operand:SI 0 "register_operand" "= r")
+ (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
+ (match_operand:SI 2 "register_operand" " r")))]
+ ""
+{
+ if (TARGET_64BIT)
+ {
+ rtx t = gen_reg_rtx (DImode);
+ emit_insn (gen_subsi3_extended (t, operands[1], operands[2]));
+ t = gen_lowpart (SImode, t);
+ SUBREG_PROMOTED_VAR_P (t) = 1;
+ SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+ emit_move_insn (operands[0], t);
+ DONE;
+ }
+})
+
(define_expand "subv<mode>4"
[(set (match_operand:GPR 0 "register_operand" "= r")
(minus:GPR (match_operand:GPR 1 "reg_or_0_operand" " rJ")
@@ -676,7 +712,7 @@ (define_expand "subv<mode>4"
rtx t5 = gen_reg_rtx (DImode);
rtx t6 = gen_reg_rtx (DImode);
- emit_insn (gen_subsi3 (operands[0], operands[1], operands[2]));
+ riscv_emit_binary (MINUS, operands[0], operands[1], operands[2]);
if (GET_CODE (operands[1]) != CONST_INT)
emit_insn (gen_extend_insn (t4, operands[1], DImode, SImode, 0));
else
@@ -725,7 +761,7 @@ (define_expand "usubv<mode>4"
emit_insn (gen_extend_insn (t3, operands[1], DImode, SImode, 0));
else
t3 = operands[1];
- emit_insn (gen_subsi3 (operands[0], operands[1], operands[2]));
+ riscv_emit_binary (MINUS, operands[0], operands[1], operands[2]);
emit_insn (gen_extend_insn (t4, operands[0], DImode, SImode, 0));
riscv_expand_conditional_branch (operands[3], LTU, t3, t4);
@@ -741,7 +777,7 @@ (define_expand "usubv<mode>4"
})
-(define_insn "*subsi3_extended"
+(define_insn "subsi3_extended"
[(set (match_operand:DI 0 "register_operand" "= r")
(sign_extend:DI
(minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
@@ -770,7 +806,7 @@ (define_insn "negdi2"
[(set_attr "type" "arith")
(set_attr "mode" "DI")])
-(define_insn "negsi2"
+(define_insn "*negsi2"
[(set (match_operand:SI 0 "register_operand" "=r")
(neg:SI (match_operand:SI 1 "register_operand" " r")))]
""
@@ -778,7 +814,24 @@ (define_insn "negsi2"
[(set_attr "type" "arith")
(set_attr "mode" "SI")])
-(define_insn "*negsi2_extended"
+(define_expand "negsi2"
+ [(set (match_operand:SI 0 "register_operand" "=r")
+ (neg:SI (match_operand:SI 1 "register_operand" " r")))]
+ ""
+{
+ if (TARGET_64BIT)
+ {
+ rtx t = gen_reg_rtx (DImode);
+ emit_insn (gen_negsi2_extended (t, operands[1]));
+ t = gen_lowpart (SImode, t);
+ SUBREG_PROMOTED_VAR_P (t) = 1;
+ SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+ emit_move_insn (operands[0], t);
+ DONE;
+ }
+})
+
+(define_insn "negsi2_extended"
[(set (match_operand:DI 0 "register_operand" "=r")
(sign_extend:DI
(neg:SI (match_operand:SI 1 "register_operand" " r"))))]
@@ -814,7 +867,7 @@ (define_insn "mul<mode>3"
[(set_attr "type" "fmul")
(set_attr "mode" "<UNITMODE>")])
-(define_insn "mulsi3"
+(define_insn "*mulsi3"
[(set (match_operand:SI 0 "register_operand" "=r")
(mult:SI (match_operand:SI 1 "register_operand" " r")
(match_operand:SI 2 "register_operand" " r")))]
@@ -823,6 +876,24 @@ (define_insn "mulsi3"
[(set_attr "type" "imul")
(set_attr "mode" "SI")])
+(define_expand "mulsi3"
+ [(set (match_operand:SI 0 "register_operand" "=r")
+ (mult:SI (match_operand:SI 1 "register_operand" " r")
+ (match_operand:SI 2 "register_operand" " r")))]
+ "TARGET_ZMMUL || TARGET_MUL"
+{
+ if (TARGET_64BIT)
+ {
+ rtx t = gen_reg_rtx (DImode);
+ emit_insn (gen_mulsi3_extended (t, operands[1], operands[2]));
+ t = gen_lowpart (SImode, t);
+ SUBREG_PROMOTED_VAR_P (t) = 1;
+ SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+ emit_move_insn (operands[0], t);
+ DONE;
+ }
+})
+
(define_insn "muldi3"
[(set (match_operand:DI 0 "register_operand" "=r")
(mult:DI (match_operand:DI 1 "register_operand" " r")
@@ -868,8 +939,8 @@ (define_expand "mulv<mode>4"
emit_insn (gen_smul<mode>3_highpart (hp, operands[1], operands[2]));
emit_insn (gen_mul<mode>3 (operands[0], operands[1], operands[2]));
- emit_insn (gen_ashr<mode>3 (lp, operands[0],
- GEN_INT (BITS_PER_WORD - 1)));
+ riscv_emit_binary (ASHIFTRT, lp, operands[0],
+ GEN_INT (BITS_PER_WORD - 1));
riscv_expand_conditional_branch (operands[3], NE, hp, lp);
}
@@ -923,7 +994,7 @@ (define_expand "umulv<mode>4"
DONE;
})
-(define_insn "*mulsi3_extended"
+(define_insn "mulsi3_extended"
[(set (match_operand:DI 0 "register_operand" "=r")
(sign_extend:DI
(mult:SI (match_operand:SI 1 "register_operand" " r")
@@ -1024,7 +1095,7 @@ (define_expand "<u>mulsidi3"
"(TARGET_ZMMUL || TARGET_MUL) && !TARGET_64BIT"
{
rtx temp = gen_reg_rtx (SImode);
- emit_insn (gen_mulsi3 (temp, operands[1], operands[2]));
+ riscv_emit_binary (MULT, temp, operands[1], operands[2]);
emit_insn (gen_<su>mulsi3_highpart (riscv_subword (operands[0], true),
operands[1], operands[2]));
emit_insn (gen_movsi (riscv_subword (operands[0], false), temp));
@@ -1055,7 +1126,7 @@ (define_expand "usmulsidi3"
"(TARGET_ZMMUL || TARGET_MUL) && !TARGET_64BIT"
{
rtx temp = gen_reg_rtx (SImode);
- emit_insn (gen_mulsi3 (temp, operands[1], operands[2]));
+ riscv_emit_binary (MULT, temp, operands[1], operands[2]);
emit_insn (gen_usmulsi3_highpart (riscv_subword (operands[0], true),
operands[1], operands[2]));
emit_insn (gen_movsi (riscv_subword (operands[0], false), temp));
@@ -1084,7 +1155,7 @@ (define_insn "usmulsi3_highpart"
;; ....................
;;
-(define_insn "<optab>si3"
+(define_insn "*<optab>si3"
[(set (match_operand:SI 0 "register_operand" "=r")
(any_div:SI (match_operand:SI 1 "register_operand" " r")
(match_operand:SI 2 "register_operand" " r")))]
@@ -1093,6 +1164,24 @@ (define_insn "<optab>si3"
[(set_attr "type" "idiv")
(set_attr "mode" "SI")])
+(define_expand "<optab>si3"
+ [(set (match_operand:SI 0 "register_operand" "=r")
+ (any_div:SI (match_operand:SI 1 "register_operand" " r")
+ (match_operand:SI 2 "register_operand" " r")))]
+ "TARGET_DIV"
+{
+ if (TARGET_64BIT)
+ {
+ rtx t = gen_reg_rtx (DImode);
+ emit_insn (gen_<optab>si3_extended (t, operands[1], operands[2]));
+ t = gen_lowpart (SImode, t);
+ SUBREG_PROMOTED_VAR_P (t) = 1;
+ SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+ emit_move_insn (operands[0], t);
+ DONE;
+ }
+})
+
(define_insn "<optab>di3"
[(set (match_operand:DI 0 "register_operand" "=r")
(any_div:DI (match_operand:DI 1 "register_operand" " r")
@@ -1118,7 +1207,7 @@ (define_expand "<u>divmod<mode>4"
DONE;
})
-(define_insn "*<optab>si3_extended"
+(define_insn "<optab>si3_extended"
[(set (match_operand:DI 0 "register_operand" "=r")
(sign_extend:DI
(any_div:SI (match_operand:SI 1 "register_operand" " r")
@@ -2072,7 +2161,7 @@ (define_insn "riscv_pause"
;; expand_shift_1 can do this automatically when SHIFT_COUNT_TRUNCATED is
;; defined, but use of that is discouraged.
-(define_insn "<optab>si3"
+(define_insn "*<optab>si3"
[(set (match_operand:SI 0 "register_operand" "= r")
(any_shift:SI
(match_operand:SI 1 "register_operand" " r")
@@ -2088,6 +2177,24 @@ (define_insn "<optab>si3"
[(set_attr "type" "shift")
(set_attr "mode" "SI")])
+(define_expand "<optab>si3"
+ [(set (match_operand:SI 0 "register_operand" "= r")
+ (any_shift:SI (match_operand:SI 1 "register_operand" " r")
+ (match_operand:QI 2 "arith_operand" " rI")))]
+ ""
+{
+ if (TARGET_64BIT)
+ {
+ rtx t = gen_reg_rtx (DImode);
+ emit_insn (gen_<optab>si3_extend (t, operands[1], operands[2]));
+ t = gen_lowpart (SImode, t);
+ SUBREG_PROMOTED_VAR_P (t) = 1;
+ SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+ emit_move_insn (operands[0], t);
+ DONE;
+ }
+})
+
(define_insn "<optab>di3"
[(set (match_operand:DI 0 "register_operand" "= r")
(any_shift:DI
@@ -2122,7 +2229,7 @@ (define_insn_and_split "*<optab><GPR:mode>3_mask_1"
[(set_attr "type" "shift")
(set_attr "mode" "<GPR:MODE>")])
-(define_insn "*<optab>si3_extend"
+(define_insn "<optab>si3_extend"
[(set (match_operand:DI 0 "register_operand" "= r")
(sign_extend:DI
(any_shift:SI (match_operand:SI 1 "register_operand" " r")
diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index 6c40db947f7..858c5ee84f2 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -637,7 +637,7 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
{
rtx set, rhs, op0 = NULL_RTX, op1 = NULL_RTX;
rtx next, nextr;
- enum rtx_code code;
+ enum rtx_code code, prev_code = UNKNOWN;
rtx_insn *insn = DF_REF_INSN (def);
df_ref next_def;
enum iv_grd_result res;
@@ -697,6 +697,27 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
return false;
op0 = XEXP (rhs, 0);
+
+ /* rv64 wraps SImode arithmetic inside an extension to DImode.
+ This matches the actual hardware semantics. So peek inside
+ the extension and see if we have simple arithmetic that we
+ can analyze. */
+ if (GET_CODE (op0) == PLUS)
+ {
+ rhs = op0;
+ op0 = XEXP (rhs, 0);
+ op1 = XEXP (rhs, 1);
+
+ if (CONSTANT_P (op0))
+ std::swap (op0, op1);
+
+ if (!simple_reg_p (op0) || !CONSTANT_P (op1))
+ return false;
+
+ prev_code = code;
+ code = PLUS;
+ }
+
if (!simple_reg_p (op0))
return false;
@@ -769,6 +790,11 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
else
*outer_step = simplify_gen_binary (code, outer_mode,
*outer_step, op1);
+
+ if (prev_code == SIGN_EXTEND)
+ *extend = IV_SIGN_EXTEND;
+ else if (prev_code == ZERO_EXTEND)
+ *extend = IV_ZERO_EXTEND;
break;
case SIGN_EXTEND:
diff --git a/gcc/testsuite/gcc.target/riscv/shift-and-2.c b/gcc/testsuite/gcc.target/riscv/shift-and-2.c
index bc01e8ef992..ee9925b7498 100644
--- a/gcc/testsuite/gcc.target/riscv/shift-and-2.c
+++ b/gcc/testsuite/gcc.target/riscv/shift-and-2.c
@@ -38,5 +38,24 @@ sub6 (long i, long j)
{
return i << (j & 0x3f);
}
+
+/* Test for <optab>si3_extend. */
+int
+sub7 (int i, int j) {
+ return (i << 10) & j;
+}
+
+/* Test for <optab>si3_extend. */
+unsigned
+sub8 (unsigned i, unsigned j) {
+ return (i << 10) & j;
+}
+
+/* Test for <optab>si3_extend. */
+unsigned
+sub9 (unsigned i, unsigned j) {
+ return (i >> 10) & j;
+}
+
/* { dg-final { scan-assembler-not "andi" } } */
/* { dg-final { scan-assembler-not "sext.w" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
index 5f93be15ac5..bc8c4ef3828 100644
--- a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
+++ b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
@@ -38,5 +38,6 @@ sub5 (unsigned int i)
}
/* { dg-final { scan-assembler-times "slli" 5 } } */
/* { dg-final { scan-assembler-times "srli" 5 } } */
-/* { dg-final { scan-assembler-times "slliw" 1 } } */
-/* { dg-final { scan-assembler-times "srliw" 1 } } */
+/* { dg-final { scan-assembler-times ",40" 2 } } */ /* For sub5 test */
+/* { dg-final { scan-assembler-not "slliw" } } */
+/* { dg-final { scan-assembler-not "srliw" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sign-extend.c b/gcc/testsuite/gcc.target/riscv/sign-extend.c
new file mode 100644
index 00000000000..6f840194833
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sign-extend.c
@@ -0,0 +1,81 @@
+/* { dg-do compile { target { riscv64*-*-* } } } */
+/* { dg-options "-march=rv64gc -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+unsigned
+foo1 (unsigned x, unsigned y, unsigned z)
+{
+ return x & (y - z);
+}
+
+int
+foo2 (int x, int y, int z)
+{
+ return x & (y - z);
+}
+
+unsigned
+foo3 (unsigned x, unsigned y, unsigned z)
+{
+ return x & (y * z);
+}
+
+int
+foo4 (int x, int y, int z)
+{
+ return x & (y * z);
+}
+
+unsigned
+foo5 (unsigned x, unsigned y)
+{
+ return x & (y / x);
+}
+
+int
+foo6 (int x, int y)
+{
+ return x & (y / x);
+}
+
+unsigned
+foo7 (unsigned x, unsigned y)
+{
+ return x & (y % x);
+}
+
+int
+foo8 (int x, int y)
+{
+ return x & (y % x);
+}
+
+int
+foo9 (int x)
+{
+ return x & (-x);
+}
+
+unsigned
+foo10 (unsigned x, unsigned y)
+{
+ return x & (y + x);
+}
+
+
+unsigned
+foo11 (unsigned x)
+{
+ return x & (15 + x);
+}
+
+/* { dg-final { scan-assembler-times "subw" 2 } } */
+/* { dg-final { scan-assembler-times "addw" 1 } } */
+/* { dg-final { scan-assembler-times "addiw" 1 } } */
+/* { dg-final { scan-assembler-times "mulw" 2 } } */
+/* { dg-final { scan-assembler-times "divw" 1 } } */
+/* { dg-final { scan-assembler-times "divuw" 1 } } */
+/* { dg-final { scan-assembler-times "remw" 1 } } */
+/* { dg-final { scan-assembler-times "remuw" 1 } } */
+/* { dg-final { scan-assembler-times "negw" 1 } } */
+/* { dg-final { scan-assembler-not "sext.w" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-03.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-03.c
index b44d7fe8920..e7e5cbb9a1a 100644
--- a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-03.c
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-03.c
@@ -16,4 +16,5 @@ unsigned int ror(unsigned int rs1, unsigned int rs2)
/* { dg-final { scan-assembler-times "rolw" 1 } } */
/* { dg-final { scan-assembler-times "rorw" 1 } } */
-/* { dg-final { scan-assembler-not "and" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "and" } } */
+/* { dg-final { scan-assembler-not "sext.w" } } */