Hi Robin,
On 09/07/2025 11:11, Robin Dapp wrote:
Hi Paul-Antoine,
+;; Intermediate pattern for vfwmacc.vf and vfwmsac.vf used by combine
+(define_insn_and_split "*extend_vf_<mode>"
+ [(set (match_operand:VWEXTF 0 "register_operand")
+ (vec_duplicate:VWEXTF
+ (float_extend:<VEL>
+ (match_operand:<VSUBEL> 1 "register_operand"))))]
+ "TARGET_VECTOR"
Looks like that needs a can_create_pseudo_p () as well.
Right, fixed in v2.
diff --git gcc/config/riscv/vector.md gcc/config/riscv/vector.md
index 6753b01db59..ddaa16cda1a 100644
--- gcc/config/riscv/vector.md
+++ gcc/config/riscv/vector.md
@@ -7267,10 +7267,10 @@ (define_insn "@pred_widen_mul_<optab><mode>_scalar"
(plus_minus:VWEXTF
(mult:VWEXTF
(float_extend:VWEXTF
- (vec_duplicate:<V_DOUBLE_TRUNC>
- (match_operand:<VSUBEL> 3 "register_operand" " f")))
- (float_extend:VWEXTF
- (match_operand:<V_DOUBLE_TRUNC> 4 "register_operand" " vr")))
+ (match_operand:<V_DOUBLE_TRUNC> 4 "register_operand" " vr"))
+ (vec_duplicate:VWEXTF
+ (float_extend:<VEL>
+ (match_operand:<VSUBEL> 3 "register_operand" " f"))))
Hmm, this is not just a reordering but changes from
(float_extend (vec_dup ...)) to (vec_dup (float_extend ...)).
Is the original pattern not used anywhere? I don't think one is more
canonical than the other. Do we fold a sequence like
vfmv.v.f
vfcvt
vfmadd
differently? Or are we just missing a test for it?
The original pattern was not exercised by any pre-existing test. I tried
but failed to come up with a testcase that would expand to
float_extend ∘ vec_duplicate
rather than
vec_duplicate ∘ float_extend.
It seems unlikely that the vectoriser would pick the former since in
most circumstances it is probably more efficient to float-extend a
scalar than a vector.
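
To illustrate, here is a minimal sketch of the kind of source loop I have in mind. The function name and the float/double type pair are made up for this example and are not taken from the patch or the testsuite. Since f is loop-invariant, I would expect the widening conversion to be applied to the scalar once, before it is broadcast, which corresponds to the vec_duplicate of float_extend shape:

```c
#include <stddef.h>

/* Hypothetical example, not part of the patch: a widening
   multiply-accumulate whose multiplier is a loop-invariant scalar.
   The narrow scalar f is widened to double, so the conversion can be
   done once on the scalar rather than on a whole vector of copies.  */
void
widen_macc (double *restrict out, const float *restrict in, float f,
            size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] += (double) f * (double) in[i];
}
```

Whether a given compiler version actually emits vfwmacc.vf for this loop depends on the options and cost model, so please take it as an illustration of the source shape only.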
That being said, one way to test the original pattern, if deemed
important to keep, might be to create a synthetic RTL testcase, similar
to testsuite/gcc.target/riscv/cset-sext-rtl.c.
What do you think?
diff --git gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop_widen_run.h gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop_widen_run.h
new file mode 100644
index 00000000000..36d7f281576
--- /dev/null
+++ gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vf_mulop_widen_run.h
@@ -0,0 +1,32 @@
+#ifndef HAVE_DEFINED_VF_MULOP_WIDEN_RUN_H
+#define HAVE_DEFINED_VF_MULOP_WIDEN_RUN_H
+
+#include <assert.h>
+
+#define N 512
+
+__attribute__((optimize("-fno-tree-vectorize")))
I would rather use an asm volatile inside the respective loop.
Changed as suggested in v2.
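
For reference, a minimal sketch of the idiom as I understand the suggestion. The function name and types are illustrative and this is not the v2 code. The idea is to keep the reference computation scalar by placing an empty asm statement with a "memory" clobber in the loop body, instead of relying on the optimize attribute:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop, not the v2 code.  The empty asm with a
   "memory" clobber acts as a barrier in every iteration, which keeps
   the loop from being vectorised without changing its result.
   __asm__ is used instead of asm so that this also builds in strict
   ISO C modes.  */
void
reference_macc (double *out, const float *in, float f, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      out[i] += (double) f * (double) in[i];
      __asm__ volatile ("" ::: "memory");
    }
}
```

The run test can then compare the output of the vectorised function against this reference element by element with assert.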
+int main ()
+{
+ T1 f[N];
+ T1 in[N];
+ T2 out[N];
+ T2 out2[N];
Trailing whitespace.
Removed in v2.
Also, seems like the CI picked up the patch but didn't run it?
I will send v2 as a new thread once it is ready, so that the CI gets a
chance to run it.
Thanks,
--
PA