From: Pan Li <[email protected]>
The avg_ceil pattern rounds towards +inf, and vaadd.vv with the
rnu (round-to-nearest-up) rounding mode matches those semantics
exactly. From the RVV spec, for the fixed-point vaadd.vv with rnu:
roundoff_signed(v, d) = (signed(v) >> d) + r
r = v[d - 1]
For vaadd, d = 1 and v is the sum of the two source elements, so we have
roundoff_signed(v, 1) = (signed(v) >> 1) + v[0]
If v[0] is 0, nothing needs to be done as there is no rounding.
If v[0] is 1, rounding takes place, and there are 2 cases.
Case 1: v is positive.
roundoff_signed(v, 1) = (signed(v) >> 1) + 1, aka round towards +inf
roundoff_signed(2 + 3, 1) = (5 >> 1) + 1 = 3
Case 2: v is negative.
roundoff_signed(v, 1) = (signed(v) >> 1) + 1, aka round towards +inf
roundoff_signed(-9 + 2, 1) = (-7 >> 1) + 1 = -4 + 1 = -3
Thus, we can leverage the vaadd with rnu directly for avg_ceil.
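The equivalence argued above can be checked exhaustively for 8-bit
elements. The following is a minimal C sketch, not part of the patch:
the helper names are hypothetical, and it assumes that >> on a negative
int is an arithmetic shift (GCC guarantees this, ISO C leaves it
implementation-defined). The reference mirrors the sequence this patch
removes: widening add, add 1, narrowing shift.

```c
#include <assert.h>
#include <stdint.h>

/* roundoff_signed (v, 1) under rnu: (signed (v) >> 1) + v[0].
   Hypothetical helper; relies on arithmetic >> for negative v.  */
int
roundoff_signed_rnu_1 (int v)
{
  return (v >> 1) + (v & 1);
}

/* Reference avg_ceil: widen, add the operands, add 1, shift right by 1.  */
int
avg_ceil_ref (int8_t a, int8_t b)
{
  return ((int) a + (int) b + 1) >> 1;
}

/* The two worked examples from the description above.  */
void
check_worked_examples (void)
{
  assert (roundoff_signed_rnu_1 (2 + 3) == 3);
  assert (roundoff_signed_rnu_1 (-9 + 2) == -3);
}

/* Count the int8_t operand pairs on which the two forms disagree.  */
int
count_mismatches (void)
{
  int mismatches = 0;
  for (int a = INT8_MIN; a <= INT8_MAX; a++)
    for (int b = INT8_MIN; b <= INT8_MAX; b++)
      if (roundoff_signed_rnu_1 (a + b)
	  != avg_ceil_ref ((int8_t) a, (int8_t) b))
	mismatches++;
  return mismatches;
}
```

Calling check_worked_examples () and then testing that
count_mismatches () returns 0 covers both the positive and the negative
case for every pair of 8-bit inputs.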
The following test suite passed for this patch series.
* The full rv64gcv regression test.
gcc/ChangeLog:
* config/riscv/autovec.md (avg<v_double_trunc>3_ceil): Add insn
expand to leverage vaadd with rnu directly.
Signed-off-by: Pan Li <[email protected]>
---
gcc/config/riscv/autovec.md | 25 ++++++-------------------
1 file changed, 6 insertions(+), 19 deletions(-)
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index a54f552a80c..5ac7b62c2cf 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2510,25 +2510,12 @@ (define_expand "avg<v_double_trunc>3_ceil"
(match_operand:<V_DOUBLE_TRUNC> 2 "register_operand")))
(const_int 1)))))]
"TARGET_VECTOR"
-{
- /* First emit a widening addition. */
- rtx tmp1 = gen_reg_rtx (<MODE>mode);
- rtx ops1[] = {tmp1, operands[1], operands[2]};
- insn_code icode = code_for_pred_dual_widen (PLUS, SIGN_EXTEND, <MODE>mode);
- riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops1);
-
- /* Then add 1. */
- rtx tmp2 = gen_reg_rtx (<MODE>mode);
- rtx ops2[] = {tmp2, tmp1, const1_rtx};
- icode = code_for_pred_scalar (PLUS, <MODE>mode);
- riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops2);
-
- /* Finally, a narrowing shift. */
- rtx ops3[] = {operands[0], tmp2, const1_rtx};
- icode = code_for_pred_narrow_scalar (ASHIFTRT, <MODE>mode);
- riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops3);
- DONE;
-})
+ {
+ insn_code icode = code_for_pred (UNSPEC_VAADD, <V_DOUBLE_TRUNC>mode);
+ riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP_VXRM_RNU, operands);
+ DONE;
+ }
+)
;; csrwi vxrm, 2
;; vaaddu.vv vd, vs2, vs1
--
2.43.0