Re: [PATCH] [RFC] Move STMT_VINFO_TYPE to SLP_TREE_TYPE

2025-07-18 Thread Robin Dapp
Can the risc-v people try to sort out this up to a point where I can just s/STMT_VINFO_TYPE/SLP_TREE_TYPE there? I think for us this mainly (only?) concerns the dynamic LMUL heuristic. Currently we go through all vectorized instructions of the loop's blocks, lookup the stmt_vec_info and then get

Re: [PATCH v2] RISC-V: Support RVVDImode for avg3_floor auto vect

2025-07-15 Thread Robin Dapp
The avg3_floor pattern leverages the add and shift rtl with the DOUBLE_TRUNC mode iterator. Aka, RVVDImode iterator will generate avg3rvvsimode_floor, only the element size QI, HI and SI are allowed. Thus, this patch would like to support the DImode by the standard name, with the iterator V_VLSI_

[PATCH] expand: Allow fixed-point arithmetic for RDIV_EXPR.

2025-07-15 Thread Robin Dapp
Hi, r16-2175-g5aa21765236730 introduced an assert for floating-point modes when expanding an RDIV_EXPR but forgot fixed-point modes. This patch adds ALL_FIXED_POINT_MODE_P to the assert. Bootstrap and regtest running on x86, aarch64, and power10. Regtested on rv64gcv. Regtest on arm running,

[PATCH] RISC-V: Fix vsetvl merge rule.

2025-07-14 Thread Robin Dapp
Hi, In PR120297 we fuse vsetvl e8,mf2,... vsetvl e64,m1,... into vsetvl e64,m4,... Individually, that's ok but we also change the new vsetvl's demand to "SEW only" even though the first original one demanded SEW >= 8 and ratio = 16. As we forget the ratio after the merge we find that the vse

Re: [PATCH v2] RISC-V: Vector-scalar widening multiply-(subtract-)accumulate [PR119100]

2025-07-14 Thread Robin Dapp
This pattern enables the combine pass (or late-combine, depending on the case) to merge a float_extend'ed vec_duplicate into a plus-mult or minus-mult RTL instruction. Before this patch, we have three instructions, e.g.: fcvt.s.h fa5,fa5 vfmv.v.f v24,fa5 vfmadd.vv v8,v24,v1

Re: [PATCH] expand: ICE if asked to expand RDIV with non-float type.

2025-07-14 Thread Robin Dapp
For the record, the Linaro CI notified me that this caused regressions: Produces 2 regressions: | | regressions.sum: | Running gcc:gcc.dg/dg.exp ... | FAIL: gcc.dg/pr103248.c (internal compiler error: in optab_for_tree_code, at optabs-tree.cc:85) | FAIL: gcc.dg/pr103248.c (test for excess e

[PATCH v3 4/5] vect: Misalign checks for gather/scatter.

2025-07-11 Thread Robin Dapp
This patch adds simple misalignment checks for gather/scatter operations. Previously, we assumed that those perform element accesses internally so alignment does not matter. The riscv vector spec however explicitly states that vector operations are allowed to fault on element-misaligned accesses.

[PATCH v3 1/5] ifn: Add helper functions for gather/scatter.

2025-07-11 Thread Robin Dapp
This patch adds access helpers for the gather/scatter offset and scale parameters. gcc/ChangeLog: * internal-fn.cc (expand_scatter_store_optab_fn): Use new function. (expand_gather_load_optab_fn): Ditto. (internal_fn_offset_index): Ditto. (internal_fn_scale

[PATCH v3 5/5] riscv: testsuite: Fix misalignment check.

2025-07-11 Thread Robin Dapp
This fixes a thinko in the misalignment check. If we want to check for vector misalignment support we need to load 16-byte elements, not 8-byte elements that will never be misaligned. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Fix misalignment check. --- gcc/testsuite/lib/targe

[PATCH v3 3/5] vect: Add is_gather_scatter argument to misalignment hook.

2025-07-11 Thread Robin Dapp
This patch adds an is_gather_scatter argument to the support_vector_misalignment hook. All targets but riscv do not care about alignment for gather/scatter so return true for is_gather_scatter. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_builtin_support_vector_misalignment):

[PATCH v3 2/5] vect: Add helper macros for gather/scatter.

2025-07-11 Thread Robin Dapp
This encapsulates the IFN and the builtin-function way of handling gather/scatter via three defines: GATHER_SCATTER_IFN_P GATHER_SCATTER_LEGACY_P GATHER_SCATTER_EMULATED_P and introduces a helper define for SLP operand handling as well. gcc/ChangeLog: * tree-vect-slp.cc (GATHER_SC

[PATCH v3 0/5] vect: Misalign for gather/scatter.

2025-07-11 Thread Robin Dapp
an alias pointer. I deferred that for now, though. The whole series was regtested and bootstrapped on x86, aarch64, and power10 and I built the patches individually on x86 as well as riscv. It was also regtested on rv64gcv_zvl512b. Robin Dapp (5): ifn: Add helper functions for gather/scatter. vec

[PATCH] expand: ICE if asked to expand RDIV with non-float type.

2025-07-10 Thread Robin Dapp
Hi, this patch adds asserts that ensure we only expand an RDIV_EXPR with actual float mode. It also replaces the RDIV_EXPR in setting a vectorized loop's length by EXACT_DIV_EXPR. The code in question is only used with length-control targets (riscv, powerpc, s390). Bootstrapped and regtested o

[PATCH v2] RISC-V: Make zero-stride load broadcast a tunable.

2025-07-10 Thread Robin Dapp
Hi, Changes from v1: - Use HImode broadcast instead of float broadcast, saving two conversion insns. Let's be daring and leave the thorough testing to the CI first while my own testing is in progress :) This patch makes the zero-stride load broadcast idiom dependent on a uarch-tunable "us

Re: [PATCH] RISC-V: Make zero-stride load broadcast a tunable.

2025-07-10 Thread Robin Dapp
Oh, I guess I didn't expand enough about my thought: I don't care that we have bad performance/bad code gen here if Zvfh is mandatory for RVA23 since that means not many people and core will fall into this code gen path. But RVA23 will go to this code gen path, which means we will go this path fo

Re: [PATCH] RISC-V: Make zero-stride load broadcast a tunable.

2025-07-10 Thread Robin Dapp
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 6753b01db59..866aaf1e8a0 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -1580,8 +1580,27 @@ (define_insn_and_split "*vec_duplicate" "&& 1" [(const_int 0)] { -riscv_vector::emit_vlma

[PATCH] RISC-V: Make zero-stride load broadcast a tunable.

2025-07-10 Thread Robin Dapp
Hi, This patch makes the zero-stride load broadcast idiom dependent on a uarch-tunable "use_zero_stride_load". Right now we have quite a few paths that reach a strided load and some of them are not exactly straightforward. While broadcast is relatively rare on rv64 targets it is more common on

Re: [PATCH] RISC-V: Vector-scalar widening multiply-(subtract-)accumulate [PR119100]

2025-07-10 Thread Robin Dapp
The original pattern was not exercised by any pre-existing test. I tried but failed to come up with a testcase that would expand to float_extend ∘ vec_duplicate rather than vec_duplicate ∘ float_extend. Ok, so we indeed don't have a test and the intrinsics tests unfortunately are no help

Re: [PATCH] RISC-V: Vector-scalar widening multiply-(subtract-)accumulate [PR119100]

2025-07-09 Thread Robin Dapp
Hi Paul-Antoine, +;; Intermediate pattern for vfwmacc.vf and vfwmsac.vf used by combine +(define_insn_and_split "*extend_vf_" + [(set (match_operand:VWEXTF 0 "register_operand") +(vec_duplicate:VWEXTF + (float_extend: +(match_operand: 1 "register_operand"] + "TARGET_VECTOR"

Re: [PATCH v1 0/3] RISC-V: Combine vec_duplicate + vssub.vv to vssub.vx on GR2VR cost

2025-07-09 Thread Robin Dapp
This patch would like to introduce the combine of vec_dup + vssub.vv into vssub.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 2, 15 in test. There will be two cases for the combine: Jeff has already pre-a

[PATCH] RISC-V: Do not use vsetivli for THeadVector.

2025-07-08 Thread Robin Dapp
Hi, in emit_vlmax_insn_lra we use a vsetivli for an immediate AVL. XTHeadVector does not support this, so guard appropriately. Regtested on rv64gcv_zvl512b. Regards Robin PR target/120461 gcc/ChangeLog: * config/riscv/riscv-v.cc (emit_vlmax_insn_lra): Do not emit vset

[PATCH] RISC-V: Ignore non-types in builtin function hash.

2025-07-08 Thread Robin Dapp
Hi, if a user passes a string that doesn't represent a variable we still try to compute a hash for its type. Its tree does not represent a type but just an exceptional, though. This patch just ignores it, leaving the error to the checking code later. Regtested on rv64gcv_zvl512b. Regards Rob

Re: [PATCH v3 3/4] RISC-V: Implement unsigned scalar SAT_MUL from uint128_t

2025-07-04 Thread Robin Dapp
This generally looks OK to me (including the tests). + HOST_WIDE_INT max = ((uint64_t)1 << bitsize) - 1; Wouldn't a uint64_t type for max be clearer? I guess the worst that can happen is compiling on a 32-bit host for a 64-bit target and get bitsize == 32 here. Do we even support this? If
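As a rough illustration of the question about `max`, the following C sketch computes the largest unsigned value representable in `bitsize` bits. The function name `unsigned_max_for_bitsize` is made up for this example and does not come from the patch. It assumes `bitsize` is between 1 and 63, since a shift count of 64 on a 64-bit value is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: largest unsigned value
   that fits in BITSIZE bits.  The shift is performed in a 64-bit unsigned
   type so the result does not depend on the host's width of int or long.  */
uint64_t
unsigned_max_for_bitsize (unsigned bitsize)
{
  /* Assumption of this sketch: a shift count of 64 would be undefined.  */
  assert (bitsize >= 1 && bitsize < 64);
  return ((uint64_t) 1 << bitsize) - 1;
}
```

With a `uint64_t` result the value for `bitsize == 32` is representable regardless of how wide the host's native integer types are.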

Re: [PATCH v3 0/3] RISC-V: Combine vec_duplicate + vsadd.vv to vsadd.vx on GR2VR cost

2025-07-04 Thread Robin Dapp
This patch would like to introduce the combine of vec_dup + vsadd.vv into vsadd.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 2, 15 in test. There will be two cases for the combine: OK. -- Regards Robin

Re: [PATCH] [RISC-V] Fix shift type for RVV interleaved stepped patterns [PR120356]

2025-07-02 Thread Robin Dapp
CI testing failed: https://github.com/ewlu/gcc-precommit-ci/issues/3585#issuecomment-3022157670 for sat_u_add-5-u32.c and vect-reduc-sad-1.c. These failures are compile issues that appeared due to the afdo-crossmodule-1b.c file. For some reason, in both cases the following snippets are being inserted i

Re: [PATCH v2] vect: Misalign checks for gather/scatter.

2025-07-02 Thread Robin Dapp
I'm not sure? I'd prefer some refactoring to make this more obvious (and the split between the two functions doesn't help ...). If you're sure it's all covered then ignore this comment, I can do the refactoring as followup. It just wasn't obvious to me. Ah, I think I misread your original com

Re: [PATCH v2] vect: Misalign checks for gather/scatter.

2025-07-02 Thread Robin Dapp
The else (get_group_load_store_type) can end up returning VMAT_GATHER_SCATTER and thus require the above checking as well. Isn't this already covered by if (*memory_access_type == VMAT_ELEMENTWISE || (*memory_access_type == VMAT_GATHER_SCATTER && GATHER_SCATTER_LEGACY_P (*gs_inf

Re: [PATCH] [RISC-V] Fix shift type for RVV interleaved stepped patterns [PR120356]

2025-07-01 Thread Robin Dapp
It corrects the shift type of interleaved stepped patterns for const vector expanding in LRA. The shift instruction was initially LSHIFTRT, and it seems it should still be the same type for both LRA and other cases. This is OK, thanks. -- Regards Robin

Re: [PATCH] RISC-V: Vector-scalar negate-multiply-(subtract-)accumulate [PR119100]

2025-06-29 Thread Robin Dapp
This is failing pre-commit testing: linux rv64gcv lp64d medlow multilib: FAIL: gcc.target/riscv/rvv/base/bug-4.c (internal compiler error: in extract_insn, at recog.cc:2882) FAIL: gcc.target/riscv/rvv/base/bug-4.c (test for excess errors) linux rv32gcv ilp32d medlow multilib: FAIL: gcc.target

Re: [PATCH v2 0/4] RISC-V: Combine vec_duplicate + vssubu.vv to vssubu.vx on GR2VR cost

2025-06-27 Thread Robin Dapp
Is there any way we can retrigger the test somewhere? If not, I can send a v3 series with the commit reordered and see. I don't think there's a way other than re-submitting. But if you're sure you tested properly and the CI is mistaken we can go ahead. I just wanted to make sure as with the s

Re: [PATCH v2 0/4] RISC-V: Combine vec_duplicate + vssubu.vv to vssubu.vx on GR2VR cost

2025-06-27 Thread Robin Dapp
OK. Hmm, I'm still seeing test failures in the CI. Could you check if those are valid? -- Regards Robin

[PATCH v2] vect: Misalign checks for gather/scatter.

2025-06-27 Thread Robin Dapp
Hi, Changes from v1: - Add gather_scatter argument to support_vector_misalignment. - Don't rely on DR_BASE_ALIGNMENT. - Add IFN helpers and use them. - Add gather/scatter helper macros. - Clarify is_packed handling in docs. This patch adds simple misalignment checks for gather/scatter operations

Re: [PATCH v2 0/4] RISC-V: Combine vec_duplicate + vssubu.vv to vssubu.vx on GR2VR cost

2025-06-27 Thread Robin Dapp
This patch would like to introduce the combine of vec_dup + vssubu.vv into vssubu.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 2, 15 in test. There will be two cases for the combine: OK. -- Regards Robi

Re: [PATCH] vect: Misalign checks for gather/scatter.

2025-06-27 Thread Robin Dapp
Maybe we can pass a scalar mode to the hook when we ask for SCATTER/GATHER? That might need fixups in other targets of course, but it would make it clear what we're asking for? How about an additional argument bool gather_scatter to make it more explicit? Then we could just if (gather_scatt

Re: [PATCH v1 2/4] RISC-V: Add test for vec_duplicate + vssubu.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-26 Thread Robin Dapp
Hi Pan, diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h index 2932e189186..0af8b969f47 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h +++ b/gcc/testsuite/gcc.target/riscv/rvv/auto

Re: [committed] RISC-V: Add comment and reorder the the include files in riscv.md [NFC]

2025-06-26 Thread Robin Dapp
Hi Kito, This patch adds a comment to the riscv.md file to clarify the purpose of the file and reorders the include files for better organization. this seems to have broken the build. I believe that's due to -(include "vector.md") (include "vector-crypto.md") because vector crypto depend

Re: [PATCH] RISC-V: update prepare_ternary_operands to handle the vector-scalar case [PR120828]

2025-06-26 Thread Robin Dapp
I guess I missed it when I first ran the testsuite before sending the patch for review. I rebased and re-ran the testsuite after getting approved and saw the regression. But at that point I realised Jeff had already merged it. Anyway, I'll regtest more carefully next time! The CI helps with th

Re: [PATCH] RISC-V: update prepare_ternary_operands to handle the vector-scalar case [PR120828]

2025-06-26 Thread Robin Dapp
This is a followup to 92e1893e0 "RISC-V: Add patterns for vector-scalar multiply-(subtract-)accumulate" that caused an ICE in some cases where the mult operands were wrongly swapped. This patch ensures that operands are not swapped in the vector-scalar case. This looks reasonable, so OK for the

Re: [PATCH] vect: Misalign checks for gather/scatter.

2025-06-26 Thread Robin Dapp
+ bool is_misaligned = scalar_align < inner_vectype_sz; + bool is_packed = scalar_align > 1 && is_misaligned; + + *misalignment = !is_misaligned ? 0 : inner_vectype_sz - scalar_align; + + if (targetm.vectorize.support_vector_misalignment + (TYPE_MODE (vectype), inner_
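The arithmetic in the quoted hunk can be tried in isolation. The C sketch below mirrors its three assignments. The struct and function names are invented for this example, both arguments are assumed to be in the same unit (bytes here), and the target hook call that follows in the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the three values the hunk computes.  */
struct gs_align_info
{
  bool is_misaligned;
  bool is_packed;
  unsigned misalignment;
};

/* Mirror of the quoted logic: an element counts as misaligned when its
   known alignment is smaller than the element size of the vector type.  */
struct gs_align_info
classify_element_alignment (unsigned scalar_align, unsigned inner_vectype_sz)
{
  struct gs_align_info info;

  /* Assumption of this sketch: both quantities are nonzero.  */
  assert (scalar_align > 0 && inner_vectype_sz > 0);

  info.is_misaligned = scalar_align < inner_vectype_sz;
  info.is_packed = scalar_align > 1 && info.is_misaligned;
  info.misalignment
    = info.is_misaligned ? inner_vectype_sz - scalar_align : 0;
  return info;
}
```

For example, a 4-byte element known to be only 2-byte aligned is reported as misaligned by 2 under this logic, while a fully aligned element yields a misalignment of 0.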

Re: [PATCH] vect: Misalign checks for gather/scatter.

2025-06-25 Thread Robin Dapp
This change reminds me that we lack documentation about arguments of most of the "complicated" internal functions ... I didn't mention it but I got implicitly reminded several times while writing the patch... ;) An overhaul has been on my todo list for a while but of course it never was top pr

[PATCH] vect: Misalign checks for gather/scatter.

2025-06-25 Thread Robin Dapp
Hi, this patch adds simple misalignment checks for gather/scatter operations. Previously, we assumed that those perform element accesses internally so alignment does not matter. The riscv vector spec however explicitly states that vector operations are allowed to fault on element-misaligned acc

Re: [PATCH] RISC-V: Refactor the function bitmap_union_of_preds_with_entry

2025-06-24 Thread Robin Dapp
Hi Ma Jin, thanks for looking into this, it has been on my todo list with very low priority since the vsetvl rewrite. + /* Handle case with no predecessors (including ENTRY block). */ + if (EDGE_COUNT (b->preds) == 0) { - e = EDGE_PRED (b, ix); - bitmap_copy (dst, src[e->src

Re: [PATCH] RISC-V: Add patterns for vector-scalar multiply-(subtract-)accumulate [PR119100]

2025-06-24 Thread Robin Dapp
This LGTM for the trunk. -- Regards Robin

Re: [PATCH v1 0/3] RISC-V: Combine vec_duplicate + vsaddu.vv to vsaddu.vx on GR2VR cost

2025-06-21 Thread Robin Dapp
This patch would like to introduce the combine of vec_dup + vsaddu.vv into vsaddu.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 2, 15 in test. There will be two cases for the combine: OK, thanks. -- Rega

Re: [PATCH v2] RISC-V: Fix ICE for expand_select_vldi [PR120652]

2025-06-20 Thread Robin Dapp
OK, thanks. -- Regards Robin

Re: [PATCH v1] RISC-V: Fix ICE for expand_select_vldi [PR120652]

2025-06-20 Thread Robin Dapp
Hi Pan, +(define_special_predicate "vectorization_factor_operand" + (match_code "const_int,const_poly_int")) + Does immediate_operand () work instead of a new predicate? -- Regards Robin

Re: [PATCH] RISC-V: Add generic tune as default.

2025-06-18 Thread Robin Dapp
@@ -78,6 +79,7 @@ RISCV_CORE("sifive-e31", "rv32imac", "sifive-3-series") RISCV_CORE("sifive-e34", "rv32imafc", "sifive-3-series") RISCV_CORE("sifive-e76", "rv32imafc", "sifive-7-series") +RISCV_CORE("generic", "rv64gc","generic") ^^^ Drop this and add -mtune=ge

Re: [PATCH v1] RISC-V: Refine VX combine test case 0 to avoid code duplication

2025-06-16 Thread Robin Dapp
The case 0 for vx combine def functions are mostly the same across the different test files. Thus, re-arrange them in one place to avoid code duplication. OK. -- Regards Robin

Re: [PATCH] RISC-V: testsuite: fix an obvious build error

2025-06-10 Thread Robin Dapp
OK. -- Regards Robin

Re: [PATCH] RISC-V: Add patterns for vector-scalar negate-(multiply-add/sub) [PR119100]

2025-06-10 Thread Robin Dapp
This is OK for the trunk. -- Regards Robin

Re: [PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-06-06 Thread Robin Dapp
Note it’s far from obvious to me whether for stride and gather loads the alignment of the elements loaded falls under the scalar or vector load restriction. Is this explicitly spelled out for risc-v or is that your interpretation? We have the following in the vector spec: If an element acces

Re: [PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-06-06 Thread Robin Dapp
At first I thought if we only cared about element misalignment checking the first element/pointer should be sufficient. But riscv's gathers as well as strided loads allow byte offsets rather than element-sized offsets so there could be 16-bit loads with a stride of e.g. 1 byte. Wait, no that

Re: [PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-06-06 Thread Robin Dapp
At least on aarch64, the gathers and scatters use (mem:BLK (scratch:P)), i.e. a wildcard memory access. There's no good way in RTL to represent multiple distinct locations in a single reference. (unspec on its own doesn't imply a memory access) At first I thought if we only cared about elemen

Re: [PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-06-06 Thread Robin Dapp
I think the spotted correctness issues wrt alignment/aliasing should be addressed up-front. In the end the gather/stride-load is probably an UNSPEC, so there's no MEM RTX with wrong info? How would we query the target on whether it can handle the alignment here? Usually we go through vect_suppo

Re: [PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-06-06 Thread Robin Dapp
Yes. Note I don't see we guarantee element alignment for gather/scatter either, nor do the IFNs seem to have encoding space for alignment. The effective type for TBAA seems also missing there ... Regarding vector_vector_composition_type I had a try and attached a preliminary V3. I'm not reall

Re: [PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-06-06 Thread Robin Dapp
In case the riscv strided vector load instruction has additional requirements on the loaded (scalar) element alignment then we'd have to implement this. For the moment the vectorizer will really emit scalar loads here, so that's fine (though eventually inefficient). For the strided vector load th

Re: [PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-06-05 Thread Robin Dapp
But that would not pass the alignment check either, no? In fact, I assume that for strided loads we have a scalar type as component (ptype), so we always get supported unaligned accesses here? Perhaps I'm missing something, though. What I was missing is that we're using the same element size

Re: [PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-06-05 Thread Robin Dapp
But that would not pass the alignment check either, no? In fact, I assume that for strided loads we have a scalar type as component (ptype), so we always get supported unaligned accesses here? I was thinking of the case where we have e.g. a group of 4 int8s and use a strided load with int32 el

Re: [PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-06-05 Thread Robin Dapp
So I do wonder how this interacts with vector_vector_composition_type, in fact the difference is that for strided_load we know the composition happens as part of a load, so how about instead extending this function, pass it VLS_LOAD/STORE and also consider strided_loads as composition kind there?

Re: [PATCH v1 0/4] RISC-V: Combine vec_duplicate + vdiv.vv to vdiv.vx on GR2VR cost

2025-06-03 Thread Robin Dapp
This patch would like to introduce the combine of vec_dup + vdiv.vv into vdiv.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: The series is OK, thanks.

Re: [PATCH v2 1/2] RISC-V: Add pattern for vector-scalar multiply-add/sub [PR119100]

2025-06-03 Thread Robin Dapp
This series is OK now, thanks. -- Regards Robin

Re: [PATCH] RISC-V: Support CPUs in -march.

2025-06-03 Thread Robin Dapp
1. riscv64-linux-gcc -march=rv64gc -march=foo-cpu -mtune=foo-cpu 2. riscv64-linux-gcc -march=rv64gc -march=foo-cpu 3. riscv64-linux-gcc -march=rv64gc -march=unset -mtune=unset -mcpu=foo-cpu Preference to me: - Prefer option 1. - Less prefer option 3. (acceptable but I don't like) - Strongly disli

Re: [PATCH] RISC-V: Support CPUs in -march.

2025-06-02 Thread Robin Dapp
I don't quite follow this part. IIUC the rules before this patch were -march=ISA: Generate code that requires the given ISA, without changing the tuning model. -mcpu=CPU: Generate code for the given CPU, targeting all the extensions that CPU supports and using the best known tu

Re: [PATCH] RISC-V: Support CPUs in -march.

2025-06-01 Thread Robin Dapp
This rule clearly applies to directly related options like -ffoo and -fno-foo, but it’s less obvious for unrelated pairs like -ffoo and -fbar especially when there is traditionally strong specifics. In many cases, the principle of "the most specific option wins" governs the behavior. Here

Re: [PATCH] RISC-V: Support CPUs in -march.

2025-06-01 Thread Robin Dapp
I stumbled across this change from https://github.com/riscv-non-isa/riscv-toolchain-conventions/issues/88 and I want to express my strong disagreement with this change. Perhaps I'm accustomed to Arm's behavior, but I believe using -march= to target a specific CPU isn't ideal. * -march=X: (exe

Re: [PATCH v1] RISC-V: Fix line too long format issue for autovect.md [NFC]

2025-05-31 Thread Robin Dapp
Inspired by the avg_ceil patches, notice there were even more lines too long from autovec.md. So fix those format issues. OK. -- Regards Robin

Re: [PATCH] RISC-V: Add pattern for vector-scalar multiply-add/sub [PR119100]

2025-05-30 Thread Robin Dapp
Hi Paul-Antoine, overall the patch looks reasonable to me now, provided the fr2vr followup. BTW it's the late-combine pass that performs the optimization, not the combine pass. You might still want to fix this in the commit message. Please CC patchworks...@rivosinc.com for the next version

Re: [PATCH v1 0/3] Refine the avg_ceil with fixed point vaadd

2025-05-30 Thread Robin Dapp
Looks like the CI cannot tell patch series? There are 3 patches and the CI will run for each one. Of course, the first one will have scan failure due to expanding change, but the second one reconciles them. Finally the third one will have all test passed as below, I think it indicates all test

Re: [PATCH v1 0/3] Refine the avg_ceil with fixed point vaadd

2025-05-30 Thread Robin Dapp
Similar to the avg_floor, the avg_ceil has the rounding mode towards +inf, while the vaadd.vv has the rnu which totally matches the semantics. From RVV spec, the fixed vaadd.vv with rnu, The CI shows some scan failures in vls/avg-[456].c and widen/vec-avg-rv32gcv.c. Also, the lint check complains
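To make the rounding claim concrete, here is a scalar C sketch for 8-bit signed elements. The function name is made up for this example. It assumes the usual definition of rnu (round-to-nearest-up), which for a right shift by one adds the discarded low bit back to the shifted value, so the result equals the average rounded towards +inf.

```c
#include <stdint.h>

/* Hypothetical scalar model of an averaging add with rnu rounding:
   result = (sum >> 1) + bit 0 of sum, computed in a wider type.  */
int8_t
avg_ceil_s8 (int8_t a, int8_t b)
{
  int sum = (int) a + (int) b;              /* widen, then add */

  /* Floor division by two, written without relying on the
     implementation-defined right shift of negative values.  */
  int floor_half = sum / 2 - (sum % 2 < 0);

  /* The bit shifted out; rnu adds it back.  */
  int low_bit = (sum % 2 != 0);

  return (int8_t) (floor_half + low_bit);   /* always fits in 8 bits */
}
```

Because the sum of two 8-bit values halves back into the 8-bit range, the final narrowing cannot overflow.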

Re: [PATCH] testsuite: RISC-V: Fix the typo in param-autovec-mode.c

2025-05-28 Thread Robin Dapp
This patch fixes the typo in the test case `param-autovec-mode.c` in the RISC-V autovec testsuite. The option `autovec-mode` is changed to `riscv-autovec-mode` to match the expected parameter name. OK of course :) -- Regards Robin

Re: [PATCH v1 0/3] RISC-V: Combine vec_duplicate + vmul.vv to vmul.vx on GR2VR cost

2025-05-28 Thread Robin Dapp
This patch would like to introduce the combine of vec_dup + vmul.vv into vmul.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: OK. -- Regards Robin

Re: [PATCH v2 0/3] Refine the avg_floor with fixed point vaadd

2025-05-28 Thread Robin Dapp
LGTM, thanks. -- Regards Robin

[PATCH v2 0/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-05-27 Thread Robin Dapp
The first patch makes non-SLP paths unreachable and the second one removes those entirely. The third patch does the actual strided-load work. Bootstrapped and regtested on x86 and aarch64. Regtested on rv64gcv_zvl512b. Robin Dapp (3): vect: Make non-SLP paths unreachable in strided slp

[PATCH v2 3/3] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-05-27 Thread Robin Dapp
From: Robin Dapp This patch enables strided loads for VMAT_STRIDED_SLP. Instead of building vectors from scalars or other vectors we can use strided loads directly when applicable. The current implementation limits strided loads to cases where we can load entire groups and not subsets of them

[PATCH v2 2/3] vect: Remove non-SLP paths in strided slp/elementwise.

2025-05-27 Thread Robin Dapp
This removes the non-SLP paths that were made unreachable in the previous patch. gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Remove non-SLP paths. --- gcc/tree-vect-stmts.cc | 49 -- 1 file changed, 18 insertions(+), 31 deletions(-) d

[PATCH v2 1/3] vect: Make non-SLP paths unreachable in strided slp/elementwise.

2025-05-27 Thread Robin Dapp
From: Robin Dapp This replaces if (slp) with if (1) and if (!slp) with if (0). gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Make non-SLP paths unreachable. --- gcc/tree-vect-stmts.cc | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/gcc

Re: [PATCH 2/2] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-05-27 Thread Robin Dapp
This mangles in the non-SLP path removal, can you please separate that out? So should patch 1/2 do more than it does, i.e. fully remove the non-slp paths rather than just if (0) them? -- Regards Robin

Re: [PATCH 2/2] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-05-27 Thread Robin Dapp
That would be appreciated (but is of course a larger task - I was fine with the partial thing you did). Ok. Then to move things forward I'll do a 2/3 for this one first. Once we're through the review cycle for the series I can work on the non-slp removal for the full function. -- Regards R

Re: [PATCH 2/2] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-05-27 Thread Robin Dapp
On Tue, May 27, 2025 at 2:44 PM Robin Dapp wrote: > This mangles in the non-SLP path removal, can you please separate that > out? So should patch 1/2 do more than it does, i.e. fully remove the non-slp paths rather than just if (0) them? There should be a separate 2/3 that does thi

[PATCH] RISC-V: Avoid division by zero in check_builtin_call [PR120436].

2025-05-27 Thread Robin Dapp
Hi, in check_builtin_call we eventually perform a division by zero when no vector modes are present. This patch just avoids the division in that case. Regtested on rv64gcv_zvl512b. I guess this is obvious enough that it can be pushed after the CI approves. Regards Robin PR target/1

Re: [PATCH v1 1/3] RISC-V: Leverage vaadd.vv for signed standard name avg_floor

2025-05-26 Thread Robin Dapp
-(define_expand "avg3_floor" - [(set (match_operand: 0 "register_operand") - (truncate: -(ashiftrt:VWEXTI - (plus:VWEXTI - (sign_extend:VWEXTI - (match_operand: 1 "register_operand")) - (sign_extend:VWEXTI - (match_operand: 2 "register_operand"))] +(define_expan
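The removed expander spells out avg_floor as sign-extend, add, arithmetic shift right, and truncate. A scalar C sketch of that sequence for 16-bit elements follows. The function name is invented, and the shift count of one is an assumption based on the meaning of the avg_floor standard name, since the quoted pattern is cut off before the shift operand.

```c
#include <stdint.h>

/* Hypothetical scalar model of the widening avg_floor sequence.  */
int16_t
avg_floor_s16 (int16_t a, int16_t b)
{
  /* sign_extend both operands and add in the wider type.  */
  int32_t wide_sum = (int32_t) a + (int32_t) b;

  /* Arithmetic shift right by one, expressed as floor division so the
     sketch does not depend on how the compiler shifts negative values.  */
  int32_t halved = wide_sum / 2 - (wide_sum % 2 < 0);

  /* truncate back to the element width; the value always fits.  */
  return (int16_t) halved;
}
```

Unlike the rnu-based ceiling variant, no rounding increment is added here, so an odd sum rounds towards -inf.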

Re: simple frm save/restore strategy (was Re: [PATCH 3/6] RISC-V: frm/mode-switch: remove dubious frm edge insertion before call_insn)

2025-05-26 Thread Robin Dapp
2. OK'ish: A bunch of testcases see more reads/writes as PRE of redundant read/writes is punted to later passes which obviously needs more work. 3. NOK: We lose the ability to instrument local RM writes - especially in the testsuite. e.g. a. intrinsic setting a static RM b. get_frm

Re: [PATCH v1 0/3] RISC-V: Combine vec_duplicate + vxor.vv to vxor.vx on GR2VR cost

2025-05-26 Thread Robin Dapp
OK, thanks. -- Regards Robin

Re: [PATCH v1 0/3] RISC-V: Combine vec_duplicate + vor.vv to vor.vx on GR2VR cost

2025-05-23 Thread Robin Dapp
This patch would like to introduce the combine of vec_dup + vor.vv into vor.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: OK, thanks. -- Regards Robin

Re: [PATCH 3/6] RISC-V: frm/mode-switch: remove dubious frm edge insertion before call_insn

2025-05-22 Thread Robin Dapp
AFAICT the main difference to standard mode switching is that we (ab)use it to set the rounding mode to the value it had initially, either at function entry or after a call. That's different to regular mode switching which assumes "static" rounding modes for different instructions. Standard c

Re: [PATCH] RISC-V: Add pattern for vector-scalar multiply-add/sub [PR119100]

2025-05-22 Thread Robin Dapp
Hi Paul-Antoine, Please find attached a revised version of the patch. Compared to the previous iteration, I have: * Rebased on top of Pan's work; * Updated the cost model; * Added a second pattern to handle the case where PLUS_MINUS operands are swapped; * Added compile and run tests. I boot

Re: [PATCH] RISC-V: Add autovec mode param.

2025-05-21 Thread Robin Dapp
Could you make a simple testcase that could vectorize two loops in different modes (e.g one SI and one SF) and with this param will only auto vec on loop? I added a test now in the attached v2 that checks that we vectorize with the requested mode. Right now the patch only takes away "additiona

Re: [PATCH] RISC-V: Support CPUs in -march.

2025-05-21 Thread Robin Dapp
I could imagine that is a simpler way to set the march since the march string becomes terribly long - we have an arch string more than 300 char...so I support this, although I think this should be discussed with the LLVM community, but I think it's fine to accept as a GCC extension. So LGTM, go ahead t

[PATCH] RISC-V: Support CPUs in -march.

2025-05-21 Thread Robin Dapp
Hi, This patch allows an -march string like -march=sifive-p670 in order to allow overriding a previous -march in a simple way. Suppose we have a Makefile that specifies -march=rv64gc by default. A user-specified -mcpu=sifive-p670 would be after the -march in the options string and thus only s

[PATCH] RISC-V: Default-initialize variable.

2025-05-21 Thread Robin Dapp
Hi, this patch initializes saved_vxrm_mode to VXRM_MODE_NONE. This is a warning (but no error) when building the compiler so better fix it. Regtested on rv64gcv_zvl512b. Going to commit as obvious if the CI is happy. Regards Robin gcc/ChangeLog: * config/riscv/riscv.cc (singleton_vx

[PATCH] RISC-V: Add autovec mode param.

2025-05-21 Thread Robin Dapp
Hi, This patch adds a --param=autovec-mode=. When the param is specified we make autovectorize_vector_modes return exactly this mode if it is available. This helps when testing different vectorizer settings. Regtested on rv64gcv_zvl512b. Regards Robin gcc/ChangeLog: * config/riscv/r
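
As an illustration of what such a param would be exercised on, here is a minimal sketch of two loops with different element types (one SI, one SF). The function names are hypothetical and not taken from the patch; the excerpt does not show a concrete mode name, so none is assumed here.

```c
/* Hypothetical test functions, not from the patch.  The first loop
   works on 32-bit integers (SI elements), the second on single-precision
   floats (SF elements), so a vectorizer restricted to one vector mode
   would be expected to pick at most one of them.  */

void
add_si (int *restrict a, const int *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    a[i] += b[i];
}

void
add_sf (float *restrict a, const float *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    a[i] += b[i];
}
```

Compiled once without and once with --param=autovec-mode= set to an integer or a float vector mode, the vectorizer dump should show which of the two loops was vectorized. The exact options and dump checks are left open here because the excerpt does not spell them out.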

Re: [PATCH v1 0/3] RISC-V: Combine vec_duplicate + vand.vv to vand.vx on GR2VR cost

2025-05-21 Thread Robin Dapp
This patch would like to introduce the combine of vec_dup + vand.vv into vand.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: OK, thanks. -- Regards Rob

Re: [PATCH 3/6] RISC-V: frm/mode-switch: remove dubious frm edge insertion before call_insn

2025-05-20 Thread Robin Dapp
Maybe I'm missing something there. Particularly whether or not you can know anything about frm's value after a call has returned. Normally the answer to this kind of question is a hard no. AFAICT the main difference to standard mode switching is that we (ab)use it to set the rounding mode to

[PATCH 2/2] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-05-20 Thread Robin Dapp
This patch enables strided loads for VMAT_STRIDED_SLP. Instead of building vectors from scalars or other vectors we can use strided loads directly when applicable. The current implementation limits strided loads to cases where we can load entire groups and not subsets of them. A future improveme
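
To make the access pattern concrete, here is a minimal sketch of a loop that loads whole groups at a run-time stride. The function name and signature are illustrative assumptions, not part of the patch, and whether GCC classifies this exact loop as VMAT_STRIDED_SLP depends on the target and options.

```c
/* Hypothetical example, not from the patch.  Each iteration loads a
   group of two adjacent ints from src; consecutive groups are `stride`
   elements apart, and stride is only known at run time.  The entire
   group is used, which matches the "load entire groups" restriction
   described above.  */
void
copy_strided_pairs (int *restrict dst, const int *restrict src,
                    int stride, int n)
{
  for (int i = 0; i < n; i++)
    {
      dst[2 * i] = src[i * stride];
      dst[2 * i + 1] = src[i * stride + 1];
    }
}
```

Without strided loads, the vector for each group would have to be assembled from scalar or smaller vector loads. With them, one load per vector can in principle fetch several groups directly.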

[PATCH 0/2] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-05-20 Thread Robin Dapp
The second patch adds strided-load support for strided-slp memory access. The first patch makes the respective non-slp paths unreachable. Robin Dapp (2): vect: Remove non-SLP paths in strided slp and elementwise. vect: Use strided loads for VMAT_STRIDED_SLP. gcc/internal-fn.cc

[PATCH 1/2] vect: Remove non-SLP paths in strided slp and elementwise.

2025-05-20 Thread Robin Dapp
This replaces if (slp) with if (1) and if (!slp) with if (0). gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Make non-slp paths unreachable. --- gcc/tree-vect-stmts.cc | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/gcc/tree-vect-stmts

Re: [PATCH v1 0/8] RISC-V: Combine vec_duplicate + vrsub.vv to vrsub.vx on GR2VR cost

2025-05-19 Thread Robin Dapp
The series LGTM. I didn't check all the tests in detail to be honest :) -- Regards Robin

Re: [PATCH][RFC] Allow the target to request a masked vector epilogue

2025-05-16 Thread Robin Dapp
I was thinking of adding a vectorization_mode class that would encapsulate the mode and whether to allow masking or alternatively to make the vector_modes array (and the m_suggested_epilogue_mode) a std::pair of mode and mask flag? Without having a very strong opinion (or the full background) on

Re: [PATCH v1 00/10] RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

2025-05-16 Thread Robin Dapp
Excuse the delay, I was attending the RISC-V Summit Europe. The series LGTM. -- Regards Robin

Re: [PATCH v1 0/7] RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

2025-05-12 Thread Robin Dapp
I think we need the run tests for each op combine up to a point. But for asm check, Seems we can put it together? I mean something like below: +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d --param=gpr2vr-cost=0" } */ + +#include "vx_binary.h" + +DEF_VX_BINARY_CASE_0(int3
