from:"pan2 . li"

[PATCH v1 6/8] RISC-V: Add test for vec_duplicate + vrsub.vv combine case 1 with GR2VR cost 1

2025-05-18 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Add asm check for vrsub with GR

[PATCH v1 3/8] RISC-V: Add test for vec_duplicate + vrsub.vv combine case 0 with GR2VR cost 1

2025-05-18 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Add vrsub asm dump check.

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vxor.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-05-26 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vxor.vv combine to vxor.vx, with the GR2VR cost is 0, 2 and 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vxor.vv to vxor.vx on GR2VR cost

2025-05-25 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vxor.vv into vxor.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vxor.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-05-25 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vxor.vv combine to vxor.vx, with the GR2VR cost is 0, 1 and 2. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as

[PATCH v1 0/3] Refine the avg_floor with fixed point vaadd

2025-05-26 Thread pan2 . li

From: Pan Li The spec of RVV is somehow not that clear about the difference between the float point and fixed point for the rounding that discard least-significant information. For float point which is not two's complement, the "discard least-significant information" indicates truncation round.

[PATCH v1 2/3] RISC-V: Reconcile the existing test for avg_floor

2025-05-26 Thread pan2 . li

From: Pan Li Some existing avg_floor test need updated due to change to leverage vaadd.vv directly. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/avg-1.c: Update asm check to vaadd. * gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto. * gcc.target/ris

[PATCH v1 3/3] RISC-V: Add test cases for avg_floor vaadd implementation

2025-05-26 Thread pan2 . li

From: Pan Li Add asm and run testcase for avg_floor vaadd implementation. The below test suites are passed for this patch series. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/avg.h: New test. * gcc.target/riscv/rvv/autovec/avg_dat

[PATCH v1 1/3] RISC-V: Leverage vaadd.vv for signed standard name avg_floor

2025-05-26 Thread pan2 . li

From: Pan Li The signed avg_floor totally matches the semantics of the fixed point rvv insn vaadd, with round down. Thus, leverage it directly to implement the avg_floor. The spec of RVV is somehow not that clear about the difference between the float point and fixed point for the rounding that disc
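As a rough illustration of the rounding distinction this snippet raises, here is a minimal C sketch that contrasts an average rounded down (floor) with one truncated toward zero. The function names `avg_floor_i8` and `avg_trunc_i8` are hypothetical and do not come from the patch; the sketch only models the arithmetic and says nothing about how GCC expands the pattern.

```c
#include <stdint.h>

/* Hypothetical model: average rounded toward negative infinity (floor). */
int8_t avg_floor_i8(int8_t a, int8_t b)
{
    int sum = (int)a + (int)b;   /* widened so the sum cannot overflow */
    int q = sum / 2;             /* C division truncates toward zero */
    if (sum < 0 && sum % 2 != 0)
        q -= 1;                  /* step down for negative odd sums */
    return (int8_t)q;
}

/* Hypothetical model: average truncated toward zero, for contrast. */
int8_t avg_trunc_i8(int8_t a, int8_t b)
{
    return (int8_t)(((int)a + (int)b) / 2);
}
```

The two functions differ only when the sum is negative and odd: for inputs -1 and -2 the floor version yields -2 and the truncating version yields -1.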

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vxor.vv to vxor.vx on GR2VR cost

2025-05-26 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vxor.vv to the vxor.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the
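The example code that this snippet refers to is cut off in the listing. As a hedged sketch of the general shape such a loop might take, the C function below XORs every element of an array with one scalar. When auto-vectorized, a loop like this would typically broadcast the scalar (vec_duplicate) and apply a vector-vector operation, which is the pair the patch describes folding into the vector-scalar form. The name `xor_scalar_broadcast` is hypothetical, not taken from the patch or its tests.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: XOR each element of `in` with the scalar `x`.
 * This only models the source-level loop; whether the compiler emits
 * the vector-scalar form depends on the cost described in the patch. */
void xor_scalar_broadcast(int16_t *out, const int16_t *in, int16_t x, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(in[i] ^ x);
}
```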

[PATCH v2 2/3] RISC-V: Reconcile the existing test for avg_floor

2025-05-27 Thread pan2 . li

From: Pan Li Some existing avg_floor test need updated due to change to leverage vaadd.vv directly. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/avg-1.c: Update asm check to vaadd. * gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto. * gcc.target/ris

[PATCH v2 1/3] RISC-V: Leverage vaadd.vv for signed standard name avg_floor

2025-05-27 Thread pan2 . li

From: Pan Li The signed avg_floor totally matches the semantics of the fixed point rvv insn vaadd, with round down. Thus, leverage it directly to implement the avg_floor. The spec of RVV is somehow not that clear about the difference between the float point and fixed point for the rounding that disc

[PATCH v2 0/3] Refine the avg_floor with fixed point vaadd

2025-05-27 Thread pan2 . li

From: Pan Li The spec of RVV is somehow not that clear about the difference between the float point and fixed point for the rounding that discard least-significant information. For float point which is not two's complement, the "discard least-significant information" indicates truncation round.

[PATCH v2 3/3] RISC-V: Add test cases for avg_floor vaadd implementation

2025-05-27 Thread pan2 . li

From: Pan Li Add asm and run testcase for avg_floor vaadd implementation. The below test suites are passed for this patch series. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/avg.h: New test. * gcc.target/riscv/rvv/autovec/avg_dat

[PATCH v1] RISC-V: Avoid scalar unsigned SAT_ADD test data duplication

2025-05-16 Thread pan2 . li

From: Pan Li Some of the previous scalar unsigned SAT_ADD test data are duplicated in different test files. This patch would like to move them into a shared header file, to avoid the test data duplication. The below test suites are passed for this patch series. * The rv64gcv fully regression te

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vor.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-05-22 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vor.vv combine to vor.vx, with the GR2VR cost is 0, 2 and 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add tes

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vor.vv to vor.vx on GR2VR cost

2025-05-22 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vor.vv to the vor.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the GR

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vor.vv to vor.vx on GR2VR cost

2025-05-22 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vor.vv into vor.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | ...

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vor.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-05-22 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vor.vv combine to vor.vx, with the GR2VR cost is 0, 1 and 2. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm

[PATCH v1] RISC-V: Leverage get_vector_binary_rtx_cost to avoid code dup [NFC]

2025-06-03 Thread pan2 . li

From: Pan Li Some similar code could be wrapped to func get_vector_binary_rtx_cost, thus leverage this function to avoid code duplication. The below test suites are passed for this patch series. * The rv64gcv fully regression test. gcc/ChangeLog: * config/riscv/riscv.cc (get_vector_bin

[PATCH v1 1/4] RISC-V: Combine vec_duplicate + vdiv.vv to vdiv.vx on GR2VR cost

2025-06-02 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vdiv.vv to the vdiv.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 3/3] RISC-V: Add test cases for avg_ceil vaadd implementation

2025-05-29 Thread pan2 . li

From: Pan Li Add asm and run testcase for avg_ceil vaadd implementation. The below test suites are passed for this patch series. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/avg.h: Add test helper macros. * gcc.target/riscv/rvv/au

[PATCH v1 2/3] RISC-V: Reconcile the existing test for avg_ceil

2025-05-29 Thread pan2 . li

From: Pan Li Some existing avg_ceil tests need to be updated due to the change to leverage vaadd.vv directly. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/avg-4.c: Update asm check to vaadd. * gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto. * gcc.target/ris

[PATCH v1 1/3] RISC-V: Leverage vaadd.vv for signed standard name avg_ceil

2025-05-29 Thread pan2 . li

From: Pan Li The avg_ceil has the rounding mode towards +inf, while the vaadd.vv has the rnu which totally matches the semantics. From RVV spec, the fixed vaadd.vv with rnu, roundoff_signed(v, d) = (signed(v) >> d) + r, where r = v[d - 1]. For vaadd, d = 1, then we have roundoff_signed(v, 1) = (signed(v
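The roundoff formula quoted in this snippet can be modelled in a few lines of C. The sketch below is an illustration under stated assumptions, not the patch's implementation: the helper names `asr32`, `roundoff_signed_rnu`, and `avg_ceil_i8` are hypothetical, and the arithmetic shift is written out by hand because right-shifting a negative value is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: arithmetic shift right that floors for negative v. */
static int32_t asr32(int32_t v, unsigned d)
{
    return v >= 0 ? (v >> d) : ~((~v) >> d);
}

/* Model of roundoff_signed(v, d) = (signed(v) >> d) + r with r = v[d - 1].
 * Requires 1 <= d <= 31. */
int32_t roundoff_signed_rnu(int32_t v, unsigned d)
{
    int32_t r = (int32_t)(((uint32_t)v >> (d - 1)) & 1u); /* bit d - 1 of v */
    return asr32(v, d) + r;
}

/* With d = 1 applied to the widened sum, the result rounds toward +inf. */
int8_t avg_ceil_i8(int8_t a, int8_t b)
{
    return (int8_t)roundoff_signed_rnu((int32_t)a + (int32_t)b, 1);
}
```

For d = 1 the added bit is simply the low bit of the sum, so an odd sum is pushed up to the next integer: inputs 1 and 2 give 2, and inputs -1 and -2 give -1.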

[PATCH v1 0/3] Refine the avg_ceil with fixed point vaadd

2025-05-29 Thread pan2 . li

From: Pan Li Similar to the avg_floor, the avg_ceil has the rounding mode towards +inf, while the vaadd.vv has the rnu which totally matches the semantics. From RVV spec, the fixed vaadd.vv with rnu, roundoff_signed(v, d) = (signed(v) >> d) + r, where r = v[d - 1]. For vaadd, d = 1, then we have roundof

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vmul.vv to vmul.vx on GR2VR cost

2025-05-28 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vmul.vv to the vmul.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vmul.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-05-28 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx, with the GR2VR cost is 0, 1 and 2. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vmul.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-05-28 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx, with the GR2VR cost is 0, 2 and 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vmul.vv to vmul.vx on GR2VR cost

2025-05-28 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vmul.vv into vmul.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1 4/5] RISC-V: Add test for vec_dup + vmax.vv combine case 1 with max func 0 and GR2VR cost 0, 1 and 2

2025-06-11 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vmax.vv combine to vmax.vx, with the GR2VR cost is 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check for vmax.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-

[PATCH v1 5/5] RISC-V: Add test for vec_dup + vmax.vv combine case 1 with max func 1 and GR2VR cost 0, 1 and 2

2025-06-11 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vmax.vv combine to vmax.vx, with the GR2VR cost is 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check for vmax.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-

[PATCH v1 0/5] RISC-V: Combine vec_duplicate + vmax.vv to vmax.vx on GR2VR cost

2025-06-11 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vmax.vv into vmax.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1 2/5] RISC-V: Add test for vec_dup + vmax.vv combine case 0 with max func 0 and GR2VR cost 0, 2 and 15

2025-06-11 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vmax.vv combine to vmax.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check for max func 1 vmax.vx combine. * gcc.target/riscv/rvv/autovec

[PATCH v1 1/4] RISC-V: Combine vec_duplicate + vremu.vv to vremu.vx on GR2VR cost

2025-06-09 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vremu.vv to the vremu.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if th

[PATCH v1 2/4] RISC-V: Reconcile the existing test for vremu.vx combine

2025-06-09 Thread pan2 . li

From: Pan Li Some existing vrem related test need some adjust for the asm check due to cost model. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Adjust the asm check for vremu. * gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto. S

[PATCH v1 4/4] RISC-V: Add test for vec_duplicate + vremu.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-09 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vremu.vv combine to vremu.vx, with the GR2VR cost is 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check for vremu.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx

[PATCH v1 0/4] RISC-V: Combine vec_duplicate + vremu.vv to vremu.vx on GR2VR cost

2025-06-09 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vremu.vv into vremu.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: |

[PATCH v1 3/4] RISC-V: Add test for vec_duplicate + vremu.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-09 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vremu.vv combine to vremu.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check for vremu.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx-

[PATCH v1 1/5] RISC-V: Combine vec_duplicate + vmax.vv to vmax.vx on GR2VR cost

2025-06-12 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vmax.vv to the vmax.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 3/5] RISC-V: Add test for vec_dup + vmax.vv combine case 0 with max func 1 and GR2VR cost 0, 2 and 15

2025-06-12 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vmax.vv combine to vmax.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check for max func 1 vmax.vx combine. * gcc.target/riscv/rvv/autovec

[PATCH v1 0/4] RISC-V: Combine vec_duplicate + vdiv.vv to vdiv.vx on GR2VR cost

2025-06-02 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vdiv.vv into vdiv.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1 4/4] RISC-V: Reconcile the existing test for vdiv.vx combine

2025-06-02 Thread pan2 . li

From: Pan Li Some existing vdiv related test need some adjust for the asm check. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adjust the asm check for vdiv. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto. * gcc.ta

[PATCH v1 3/4] RISC-V: Add test for vec_duplicate + vdiv.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-02 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vdiv.vv combine to vdiv.vx, with the GR2VR cost is 0, 1 and 2. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as

[PATCH v1 2/4] RISC-V: Add test for vec_duplicate + vdiv.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-02 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vdiv.vv combine to vdiv.vx, with the GR2VR cost is 0, 2 and 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a

[PATCH v1] RISC-V: Fix line too long format issue for autovec.md [NFC]

2025-05-30 Thread pan2 . li

From: Pan Li Inspired by the avg_ceil patches, notice there were even more lines too long from autovec.md. So fix that format issues. gcc/ChangeLog: * config/riscv/autovec.md: Fix line too long for sorts of pattern. Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 54

[PATCH v1] RISC-V: Fix ICE for gcc.dg/graphite/pr33576.c with rv32gcv

2025-06-04 Thread pan2 . li

From: Pan Li The div of rvv has not such insn v2 = div (vec_dup (x), v1), thus the generated rtl like that hit the unreachable assert when expand insn. This patch would like to remove op div from the binary op form (vec_dup (x), v) to avoid pattern matching by mistake. No new test introduced as

[PATCH v1 1/4] RISC-V: Combine vec_duplicate + vrem.vv to vrem.vx on GR2VR cost

2025-06-08 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vrem.vv to the vrem.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 2/4] RISC-V: Reconcile the existing test for vrem.vx combine

2025-06-08 Thread pan2 . li

From: Pan Li Some existing vrem related test need some adjust for the asm check due to cost model. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Adjust the asm check for vrem. * gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto. Si

[PATCH v1 3/4] RISC-V: Add test for vec_duplicate + vrem.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-08 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vrem.vv combine to vrem.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check for vrem.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1

[PATCH v1 0/4] RISC-V: Combine vec_duplicate + vrem.vv to vrem.vx on GR2VR cost

2025-06-08 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vrem.vv into vrem.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1 4/4] RISC-V: Add test for vec_duplicate + vrem.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-08 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vrem.vv combine to vrem.vx, with the GR2VR cost is 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check for vrem.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-

[PATCH v1 2/4] RISC-V: Add test for vec_duplicate + vdivu.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-06 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vdivu.vv combine to vdivu.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check for vdivu.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/v

[PATCH v1 4/4] RISC-V: Reconcile the existing test for vdivu.vx combine

2025-06-06 Thread pan2 . li

From: Pan Li Some existing vdiv related test need some adjust for the asm check due to cost model. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adjust the asm check for vdivu. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditt

[PATCH v1 0/4] RISC-V: Combine vec_duplicate + vdivu.vv to vdivu.vx on GR2VR cost

2025-06-06 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vdivu.vv into vdivu.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: |

[PATCH v1 1/4] RISC-V: Combine vec_duplicate + vdivu.vv to vdivu.vx on GR2VR cost

2025-06-06 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vdivu.vv to the vdivu.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if th

[PATCH v1 3/4] RISC-V: Add test for vec_duplicate + vdivu.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-06 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vdivu.vv combine to vdivu.vx, with the GR2VR cost is 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check for vdivu.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx

[PATCH v1] RISC-V: Refine VX combine test case 0 to avoid code duplication

2025-06-15 Thread pan2 . li

From: Pan Li The case 0 for vx combine def functions are most the same across the different test files. Thus, re-arrange them in one place to avoid code duplication. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Leverage helper macros to avoid code d

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vmaxu.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-14 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vmaxu.vv combine to vmaxu.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check for vmaxu.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/v

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vmaxu.vv to vmaxu.vx on GR2VR cost

2025-06-14 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vmaxu.vv into vmaxu.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 2, 15 in test. There will be two cases for the combine: Case 0:

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vmaxu.vv to vmaxu.vx on GR2VR cost

2025-06-14 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vmaxu.vv to the vmaxu.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if th

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vmaxu.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-14 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vmaxu.vv combine to vmaxu.vx, with the GR2VR cost is 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check for vmaxu.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vmin.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-16 Thread pan2 . li

From: Pan Li Add asm dump check test for vec_duplicate + vmin.vv combine to vmin.vx, with the GR2VR cost is 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check for vmin.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vmin.vv to vmin.vx on GR2VR cost

2025-06-16 Thread pan2 . li

From: Pan Li This patch would like to combine the vec_duplicate + vmin.vv to the vmin.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vmin.vv to vmin.vx on GR2VR cost

2025-06-16 Thread pan2 . li

From: Pan Li This patch would like to introduce the combine of vec_dup + vmin.vv into vmin.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 2, 15 in test. There will be two cases for the combine: Case 0: |

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vmin.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-16 Thread pan2 . li

From: Pan Li Add asm dump check and run test for vec_duplicate + vmin.vv combine to vmin.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.

[PATCH v2] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-01 Thread pan2.li--- via Gcc-patches

From: Pan Li Fix the bug of the rvv bool mode precision with the adjustment. The bits size of vbool*_t will be adjusted to [1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 isa. The adjusted mode precision of vbool*_t will help underlying pass to make t

[PATCH v3] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-02 Thread pan2.li--- via Gcc-patches

From: Pan Li Fix the bug of the rvv bool mode precision with the adjustment. The bits size of vbool*_t will be adjusted to [1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 isa. The adjusted mode precision of vbool*_t will help underlying pass to make t

[PATCH v4] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-06 Thread pan2.li--- via Gcc-patches

From: Pan Li Fix the bug of the rvv bool mode precision with the adjustment. The bits size of vbool*_t will be adjusted to [1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 isa. The adjusted mode precision of vbool*_t will help underlying pass to make t

[PATCH v5] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-07 Thread pan2.li--- via Gcc-patches

From: Pan Li Fix the bug of the rvv bool mode precision with the adjustment. The bits size of vbool*_t will be adjusted to [1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 isa. The adjusted mode precision of vbool*_t will help underlying pass to make t

[PATCH] RISC-V: Bugfix for rvv bool mode size adjustment

2023-03-07 Thread pan2.li--- via Gcc-patches

From: yes Fix the bug of the rvv bool mode size by the adjustment. Besides the mode precision (aka bit size [1, 2, 4, 8, 16, 32, 64]) of the vbool*_t, the mode size (aka byte size) will be adjusted to [1, 1, 1, 1, 2, 4, 8] according to the rvv spec 1.0 isa. The adjustment will provide correct inf
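The two lists in this snippet, bit sizes [1, 2, 4, 8, 16, 32, 64] and byte sizes [1, 1, 1, 1, 2, 4, 8], are consistent with rounding each bit size up to a whole number of bytes. The C sketch below shows that relationship only; the helper name `bytes_for_bits` is hypothetical and is not the function GCC uses for the adjustment.

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed to hold `bits` bits,
 * rounding up to a whole byte. */
size_t bytes_for_bits(size_t bits)
{
    return (bits + 7) / 8;
}
```

Applying the helper to each bit size in the first list reproduces the byte sizes in the second list.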

[PATCH] RTL: Bugfix for wrong code with v16hi compare & mask

2023-03-24 Thread pan2.li--- via Gcc-patches

From: Pan Li Fix the bug of the incorrect code generation for the below code sample. typedef unsigned short __attribute__((__vector_size__ (32))) V; typedef unsigned short u16; void foo (V m, u16 *ret) { V v = 6 > ((V) { 2049, 8 } & m); *ret = v[0]; // + a + b + c + d; } Before this patch.
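To make the intended result of the quoted sample easier to check by hand, here is a scalar C model of what lane 0 computes under GCC's vector-extension semantics, where a true comparison yields all ones in that lane and a false comparison yields zero. The function name `lane0_expected` is hypothetical; the sketch models only the first lane and makes no claim about the miscompilation described in the truncated message.

```c
#include <stdint.h>

typedef uint16_t u16;

/* Hypothetical scalar model of lane 0 of:
 *   V v = 6 > ((V) { 2049, 8 } & m);
 * A true lane becomes 0xFFFF, a false lane becomes 0. */
u16 lane0_expected(u16 m0)
{
    u16 masked = (u16)(2049u & m0);   /* lane 0 of the constant is 2049 */
    return (6 > masked) ? (u16)0xFFFF : (u16)0;
}
```

For example, an input lane of 1 masks to 1, which is below 6, so the expected lane value is 0xFFFF; an input lane of 2048 masks to 2048, so the expected lane value is 0.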

[PATCH v2] RISCV: Bugfix for wrong code with v16hi compare & mask

2023-03-25 Thread pan2.li--- via Gcc-patches

From: yes Fix the bug of the incorrect code generation for the below code sample. typedef unsigned short __attribute__((__vector_size__ (32))) V; typedef unsigned short u16; void foo (V m, u16 *ret) { V v = 6 > ((V) { 2049, 8 } & m); *ret = v[0]; // + a + b + c + d; } Before this patch. ad
