[PATCH v1 1/4] RISC-V: Combine vec_duplicate + vdiv.vv to vdiv.vx on GR2VR cost

2025-06-02 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vdiv.vv to the vdiv.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 2/4] RISC-V: Add test for vec_duplicate + vdiv.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-02 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vdiv.vv combine to vdiv.vx, with the GR2VR cost is 0, 2 and 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a

[PATCH v1 3/4] RISC-V: Add test for vec_duplicate + vdiv.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-02 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vdiv.vv combine to vdiv.vx, with the GR2VR cost is 0, 1 and 2. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as

[PATCH v1 4/4] RISC-V: Reconcile the existing test for vdiv.vx combine

2025-06-02 Thread pan2 . li
From: Pan Li Some existing vdiv-related tests need adjustment for the asm check. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adjust the asm check for vdiv. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto. * gcc.ta

[PATCH v1 0/4] RISC-V: Combine vec_duplicate + vdiv.vv to vdiv.vx on GR2VR cost

2025-06-02 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vdiv.vv into vdiv.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1] RISC-V: Fix line too long format issue for autovec.md [NFC]

2025-05-30 Thread pan2 . li
From: Pan Li Inspired by the avg_ceil patches, notice there were even more lines too long in autovec.md. So fix those format issues. gcc/ChangeLog: * config/riscv/autovec.md: Fix line too long for sorts of pattern. Signed-off-by: Pan Li --- gcc/config/riscv/autovec.md | 54

[PATCH v1 3/3] RISC-V: Add test cases for avg_ceil vaadd implementation

2025-05-29 Thread pan2 . li
From: Pan Li Add asm and run testcase for avg_ceil vaadd implementation. The below test suites are passed for this patch series. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/avg.h: Add test helper macros. * gcc.target/riscv/rvv/au

[PATCH v1 1/3] RISC-V: Leverage vaadd.vv for signed standard name avg_ceil

2025-05-29 Thread pan2 . li
From: Pan Li The avg_ceil has the rounding mode towards +inf, while the vaadd.vv has the rnu which totally matches the semantics. From the RVV spec, the fixed-point vaadd.vv with rnu: roundoff_signed(v, d) = (signed(v) >> d) + r, where r = v[d - 1]. For vaadd, d = 1, then we have roundoff_signed(v, 1) = (signed(v

[PATCH v1 2/3] RISC-V: Reconcile the existing test for avg_ceil

2025-05-29 Thread pan2 . li
From: Pan Li Some existing avg_ceil tests need to be updated due to the change to leverage vaadd.vv directly. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/avg-4.c: Update asm check to vaadd. * gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto. * gcc.target/ris

[PATCH v1 0/3] Refine the avg_ceil with fixed point vaadd

2025-05-29 Thread pan2 . li
From: Pan Li Similar to the avg_floor, the avg_ceil has the rounding mode towards +inf, while the vaadd.vv has the rnu which totally matches the semantics. From the RVV spec, the fixed-point vaadd.vv with rnu: roundoff_signed(v, d) = (signed(v) >> d) + r, where r = v[d - 1]. For vaadd, d = 1, then we have roundof
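The rounding identity quoted in the snippet above can be checked in plain C. The following is only a minimal sketch, not code from the patch series: the function names are hypothetical, 8-bit elements are chosen arbitrarily, and it assumes `>>` on a negative signed value is an arithmetic shift (true for GCC, implementation-defined in ISO C).

```c
#include <stdint.h>

/* Reference ceiling average: (a + b + 1) >> 1 computed in a wider type.
   Hypothetical helper for illustration only.  */
static int8_t avg_ceil_ref(int8_t a, int8_t b)
{
    int16_t sum = (int16_t)a + (int16_t)b;
    return (int8_t)((sum + 1) >> 1);
}

/* Same value expressed as roundoff_signed(v, 1) with rnu:
   (v >> 1) + r, where r = v[0] is the bit that the shift discards.
   Assumes an arithmetic right shift for negative v.  */
static int8_t avg_ceil_rnu(int8_t a, int8_t b)
{
    int16_t v = (int16_t)a + (int16_t)b;
    int16_t r = v & 1;
    return (int8_t)((v >> 1) + r);
}
```

Both helpers agree because adding the discarded bit back rounds an odd sum up and leaves an even sum unchanged, which is exactly what `(v + 1) >> 1` does.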

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vmul.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-05-28 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx, with the GR2VR cost is 0, 1 and 2. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vmul.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-05-28 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx, with the GR2VR cost is 0, 2 and 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vmul.vv to vmul.vx on GR2VR cost

2025-05-28 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vmul.vv to the vmul.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vmul.vv to vmul.vx on GR2VR cost

2025-05-28 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vmul.vv into vmul.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v2 3/3] RISC-V: Add test cases for avg_floor vaadd implementation

2025-05-27 Thread pan2 . li
From: Pan Li Add asm and run testcase for avg_floor vaadd implementation. The below test suites are passed for this patch series. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/avg.h: New test. * gcc.target/riscv/rvv/autovec/avg_dat

[PATCH v2 2/3] RISC-V: Reconcile the existing test for avg_floor

2025-05-27 Thread pan2 . li
From: Pan Li Some existing avg_floor tests need to be updated due to the change to leverage vaadd.vv directly. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/avg-1.c: Update asm check to vaadd. * gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto. * gcc.target/ris

[PATCH v2 1/3] RISC-V: Leverage vaadd.vv for signed standard name avg_floor

2025-05-27 Thread pan2 . li
From: Pan Li The signed avg_floor totally matches the semantics of the fixed point rvv insn vaadd, with round down. Thus, leverage it directly to implement the avg_floor. The spec of RVV is somehow not that clear about the difference between the floating point and fixed point for the rounding that disc

[PATCH v2 0/3] Refine the avg_floor with fixed point vaadd

2025-05-27 Thread pan2 . li
From: Pan Li The spec of RVV is somehow not that clear about the difference between the floating point and fixed point for the rounding that discards least-significant information. For floating point, which is not two's complement, the "discard least-significant information" indicates truncation rounding.
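The distinction drawn in the snippet above, truncation versus discarding low bits of a two's complement value, shows up only for negative odd sums. The sketch below is an illustration with hypothetical helper names, not code from the series; it assumes `>>` on a negative signed value is an arithmetic shift, as on GCC.

```c
#include <stdint.h>

/* Truncating average: C integer division rounds toward zero.
   Hypothetical helper for illustration only.  */
static int8_t avg_trunc(int8_t a, int8_t b)
{
    return (int8_t)(((int16_t)a + (int16_t)b) / 2);
}

/* Floor average: dropping bit 0 of a two's complement sum with an
   arithmetic shift rounds toward -inf, which is what avg_floor wants.  */
static int8_t avg_floor_shift(int8_t a, int8_t b)
{
    return (int8_t)(((int16_t)a + (int16_t)b) >> 1);
}
```

For inputs -1 and -2 the sum is -3: the truncating version yields -1 while the shift version yields -2. For non-negative sums the two agree.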

[PATCH v1 1/3] RISC-V: Leverage vaadd.vv for signed standard name avg_floor

2025-05-26 Thread pan2 . li
From: Pan Li The signed avg_floor totally matches the semantics of the fixed point rvv insn vaadd, with round down. Thus, leverage it directly to implement the avg_floor. The spec of RVV is somehow not that clear about the difference between the floating point and fixed point for the rounding that disc

[PATCH v1 3/3] RISC-V: Add test cases for avg_floor vaadd implementation

2025-05-26 Thread pan2 . li
From: Pan Li Add asm and run testcase for avg_floor vaadd implementation. The below test suites are passed for this patch series. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/avg.h: New test. * gcc.target/riscv/rvv/autovec/avg_dat

[PATCH v1 2/3] RISC-V: Reconcile the existing test for avg_floor

2025-05-26 Thread pan2 . li
From: Pan Li Some existing avg_floor tests need to be updated due to the change to leverage vaadd.vv directly. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/avg-1.c: Update asm check to vaadd. * gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto. * gcc.target/ris

[PATCH v1 0/3] Refine the avg_floor with fixed point vaadd

2025-05-26 Thread pan2 . li
From: Pan Li The spec of RVV is somehow not that clear about the difference between the floating point and fixed point for the rounding that discards least-significant information. For floating point, which is not two's complement, the "discard least-significant information" indicates truncation rounding.

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vxor.vv to vxor.vx on GR2VR cost

2025-05-26 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vxor.vv to the vxor.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vxor.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-05-26 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vxor.vv combine to vxor.vx, with the GR2VR cost is 0, 2 and 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vxor.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-05-25 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vxor.vv combine to vxor.vx, with the GR2VR cost is 0, 1 and 2. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vxor.vv to vxor.vx on GR2VR cost

2025-05-25 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vxor.vv into vxor.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vor.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-05-22 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vor.vv combine to vor.vx, with the GR2VR cost is 0, 1 and 2. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vor.vv to vor.vx on GR2VR cost

2025-05-22 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vor.vv to the vor.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the GR

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vor.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-05-22 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vor.vv combine to vor.vx, with the GR2VR cost is 0, 2 and 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add tes

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vor.vv to vor.vx on GR2VR cost

2025-05-22 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vor.vv into vor.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | ...

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vand.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-05-20 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vand.vv combine to vand.vx, with the GR2VR cost is 0, 2 and 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add t

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vand.vv to vand.vx on GR2VR cost

2025-05-20 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vand.vv to the vand.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vand.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-05-20 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vand.vv combine to vand.vx, with the GR2VR cost is 0, 1 and 2. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vand.vv to vand.vx on GR2VR cost

2025-05-20 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vand.vv into vand.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1 3/8] RISC-V: Add test for vec_duplicate + vrsub.vv combine case 0 with GR2VR cost 1

2025-05-18 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Add vrsub asm dump check.

[PATCH v1 6/8] RISC-V: Add test for vec_duplicate + vrsub.vv combine case 1 with GR2VR cost 1

2025-05-18 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Add asm check for vrsub with GR

[PATCH v1 7/8] RISC-V: Add test for vec_duplicate + vrsub.vv combine case 1 with GR2VR cost 2

2025-05-18 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Add asm check for vrsub with GR

[PATCH v1 8/8] RISC-V: Tweak the asm check test of vx combine on GR2VR cost [NFC]

2025-05-18 Thread pan2 . li
From: Pan Li Tweak the asm check with define T uint8_t for adding more vx test easily, as well as less possibility to make mistake. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i

[PATCH v1 5/8] RISC-V: Add test for vec_duplicate + vrsub.vv combine case 1 with GR2VR cost 0

2025-05-18 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check for vrsub case 1

[PATCH v1 2/8] RISC-V: Add test for vec_duplicate + vrsub.vv combine case 0 with GR2VR cost 0

2025-05-18 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vrsub.vv combine to vrsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add vrsub asm check. *

[PATCH v1 4/8] RISC-V: Add test for vec_duplicate + vrsub.vv combine case 0 with GR2VR cost 15

2025-05-18 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Add asm check for vrsub with GR

[PATCH v1 1/8] RISC-V: Combine vec_duplicate + vrsub.vv to vrsub.vx on GR2VR cost

2025-05-18 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vrsub.vv to the vrsub.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 0/8] RISC-V: Combine vec_duplicate + vrsub.vv to vrsub.vx on GR2VR cost

2025-05-18 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vrsub.vv into vrsub.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1] RISC-V: Avoid scalar unsigned SAT_ADD test data duplication

2025-05-16 Thread pan2 . li
From: Pan Li Some of the previous scalar unsigned SAT_ADD test data are duplicated in different test files. This patch would like to move them into a shared header file, to avoid the test data duplication. The below test suites are passed for this patch series. * The rv64gcv fully regression te

[PATCH v1 07/10] RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 0

2025-05-13 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add test cases for vsub vx combin

[PATCH v1 10/10] RISC-V: Reuse test name for vx combine test data [NFC]

2025-05-13 Thread pan2 . li
From: Pan Li For run test, we have a name like add/sub to indicate the testcase. So we can reuse this to identify the test data instead of a new one. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/a

[PATCH v1 06/10] RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 15

2025-05-13 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Add test cases for vsub vx combin

[PATCH v1 09/10] RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 2

2025-05-13 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Add test cases for vsub vx combin

[PATCH v1 03/10] RISC-V: Adjust vx combine test case to avoid name conflict

2025-05-13 Thread pan2 . li
From: Pan Li Given we will put all vx combine for int8 in a single file, we need to make sure the generate function for different types and ops has different function name. Thus, refactor the test helper macros for avoiding possible function name conflict. The below test suites are passed for t

[PATCH v1 04/10] RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 0

2025-05-13 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add vector sub vx combine

[PATCH v1 08/10] RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 1

2025-05-13 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Add test cases for vsub vx combin

[PATCH v1 02/10] RISC-V: Rename vx_vadd-* testcase to vx-* for all vx combine [NFC]

2025-05-13 Thread pan2 . li
From: Pan Li We would like to arrange all vx combine asm check test into one file for better management. Thus, rename vx_vadd-* to vx-*. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i16.c: Move to... * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: ..

[PATCH v1 05/10] RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 1

2025-05-13 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Add test cases for vsub vx combine

[PATCH v1 01/10] RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

2025-05-13 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vsub.vv to the vsub.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 00/10] RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

2025-05-13 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vsub.vv into vsub.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1 5/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 0

2025-05-11 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i16.c: New test. * gcc.target/riscv

[PATCH v1 7/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 2

2025-05-11 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i16.c: New test. * gcc.target/riscv

[PATCH v1 2/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 0

2025-05-11 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test data for v

[PATCH v1 4/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 15

2025-05-11 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i16.c: New test. * gcc.target/riscv

[PATCH v1 6/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 1 with GR2VR cost 1

2025-05-11 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i16.c: New test. * gcc.target/riscv

[PATCH v1 3/7] RISC-V: Add test for vec_duplicate + vsub.vv combine case 0 with GR2VR cost 1

2025-05-11 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i16.c: New test. * gcc.target/riscv/

[PATCH v1 0/7] RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

2025-05-11 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vsub.vv into vsub.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. There will be two cases for the combine: Case 0: | .

[PATCH v1 1/7] RISC-V: Combine vec_duplicate + vsub.vv to vsub.vx on GR2VR cost

2025-05-11 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vsub.vv to the vsub.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if the

[PATCH v1 3/5] RISC-V: Add testcases for vec_duplicate + vadd.vv combine case 1 with GR2VR cost 0

2025-05-08 Thread pan2 . li
From: Pan Li Add asm dump check for vec_duplicate + vadd.vv combine case 1 to vadd.vx. The late-combine will take action when GR2VR cost is 0, because the vmv and the vadd.vx will consume the same cost of GR2VR. Aka: Before: L1: vmv.v.x vadd.vv J L1 After: L1: vadd.vx J L1 The b

[PATCH v1 5/5] RISC-V: Add testcases for vec_duplicate + vadd.vv combine case 1 with GR2VR cost 2

2025-05-08 Thread pan2 . li
From: Pan Li Add asm dump check for vec_duplicate + vadd.vv combine case 1 to vadd.vx when the cost of GR2VR is 2. The testcases are not that tidy according to the result, but we will continue tuning the cost model for this. The below test suites are passed for this patch. * The rv64gcv full

[PATCH v1 4/5] RISC-V: Add testcases for vec_duplicate + vadd.vv combine case 1 with GR2VR cost 1

2025-05-08 Thread pan2 . li
From: Pan Li Add asm dump check for vec_duplicate + vadd.vv combine case 1 to vadd.vx when the cost of GR2VR is 1. The testcases are not that tidy according to the result, but we will continue tuning the cost model for this. The below test suites are passed for this patch. * The rv64gcv full

[PATCH v1 0/5] Add testcases for another case of vec_duplicate + vadd.vv combine

2025-05-08 Thread pan2 . li
From: Pan Li We have the testcase for vec_duplicate + vadd.vv combine as below already, aka: Before: ... vmv.v.x L1: vadd.vv J L1 ... After: ... L1: vadd.vx J L1 ... But there is still another case like below: Before: ... L1: vmv.v.x vadd.vv J L1 ... After: ...

[PATCH v1 1/5] RISC-V: Separate the test running of rvv vx_vf

2025-05-08 Thread pan2 . li
From: Pan Li The default test running in rvv.exp takes the -fno-vect-cost-model for most of these options. It is not that suitable as the vx_vf test depends on the cost-model. Thus, separate the vx_vf test cases without -fno-vect-cost-model in another options. The below test suites are passed

[PATCH v1 2/5] RISC-V: Rename VX_BINARY test helper to VX_BINARY_CASE_0

2025-05-08 Thread pan2 . li
From: Pan Li This patch would like to rename the VX_BINARY within CASE_0 suffix, as we have another case of VX_BINARY test code. Aka case 1: L1: vmv.v.x vadd.vv J L1 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Rename VX_BINARY to VX_BINARY_

[PATCH v4 6/6] RISC-V: Add testcases for vec_duplicate + vadd.vv combine when GR2VR cost 15

2025-05-06 Thread pan2 . li
From: Pan Li Add asm dump check for vec_duplicate + vadd.vv combine to vadd.vx. The late-combine will not take action when GR2VR cost is 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec

[PATCH v4 4/6] RISC-V: Add testcases for vec_duplicate + vadd.vv combine when GR2VR cost 0

2025-05-06 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vadd.vv combine to vadd.vx. Introduce new folder to hold all related testcases. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/rvv.ex

[PATCH v4 0/6] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx on GR2VR cost

2025-05-06 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vadd.vv into vadd.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. A helper function get_gr2vr_cost is introduced to make s
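As a rough illustration of the kind of loop these series target, consider adding a loop-invariant scalar to every element of an array. This is a hypothetical kernel written for this listing, not one of the test helper macros from the patches. When such a loop is auto-vectorized for RVV, the scalar has to reach a vector operand somehow: either through an explicit broadcast (`vmv.v.x`) followed by `vadd.vv`, or folded into a single `vadd.vx`. Which form the compiler emits is what the GR2VR cost is described as steering; the exact assembly for this particular function is not something the snippets above confirm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: out[i] = in[i] + x with a loop-invariant scalar x.
   The broadcast of x is the vec_duplicate the series talks about.  */
void add_scalar_i16(int16_t *restrict out, const int16_t *restrict in,
                    int16_t x, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(in[i] + x);
}
```

Assuming a GCC build that carries these patches, one would compare the assembly produced with different values of the `--param=gpr2vr-cost=` option named in patch 1/6 of the v4 series.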

[PATCH v4 5/6] RISC-V: Add testcases for vec_duplicate + vadd.vv combine when GR2VR cost 1

2025-05-06 Thread pan2 . li
From: Pan Li Add asm dump check for vec_duplicate + vadd.vv combine to vadd.vx. The late-combine will not take action when GR2VR cost is 1. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/

[PATCH v4 1/6] RISC-V: Add new option --param=gpr2vr-cost= for rvv insn

2025-05-06 Thread pan2 . li
From: Pan Li While investigating the combine from vec_dup and vop.vv into vop.vx, we need to depend on the cost of the insn that operates from the gpr to vr, for example, vadd.vx. Thus, for better control and test, we introduce a new option, aka below: --param=gpr2vr-cost= To specify the cost value

[PATCH v4 3/6] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx on GR2VR cost

2025-05-06 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vadd.vv to the vadd.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR, it will: * The pattern matching will be active by default. * The cost of GR2VR will be added to the tot

[PATCH v4 2/6] RISC-V: Add gr2vr cost helper function

2025-05-06 Thread pan2 . li
From: Pan Li After we introduced the --param=gpr2vr-cost option to set the cost value when an operation acts from gpr to vr, we would like to introduce a new helper function to get the cost of gr2vr. And then make sure all references to gr2vr go through this helper function. The helper function wi

[PATCH v1 3/5] RISC-V: Add testcases for vec_duplicate + vadd.vv combine when GR2VR cost 0

2025-05-03 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vadd.vv combine to vadd.vx. Introduce new folder to hold all related testcases. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/rvv.ex

[PATCH v1 2/5] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx on GR2VR cost

2025-05-03 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vadd.vv to the vadd.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR, it will: * The pattern matching will be active by default. * The cost of GR2VR will be added to the tot

[PATCH v1 4/5] RISC-V: Add testcases for vec_duplicate + vadd.vv combine when GR2VR cost 1

2025-05-03 Thread pan2 . li
From: Pan Li Add asm dump check for vec_duplicate + vadd.vv combine to vadd.vx. The late-combine will not take action when GR2VR cost is 1. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/

[PATCH v1 1/5] RISC-V: Add new option --param=rvv-gr2vr-cost= for rvv insn

2025-05-03 Thread pan2 . li
From: Pan Li While investigating the combine from vec_dup and vop.vv into vop.vx, we need to depend on the cost of the insn that operates from the gr to vr, for example, vadd.vx. Thus, for better control and test, we introduce a new option, aka below: --param=rvv-gr2vr-cost= To specify the cost valu

[PATCH v1 5/5] RISC-V: Add testcases for vec_duplicate + vadd.vv combine when GR2VR cost 15

2025-05-03 Thread pan2 . li
From: Pan Li Add asm dump check for vec_duplicate + vadd.vv combine to vadd.vx. The late-combine will not take action when GR2VR cost is 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec

[PATCH v3 0/5] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx on GR2VR cost

2025-05-03 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vadd.vv into vadd.vx based on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if it is non-zero, like 1 or 15 in the tests. The below test suites are passed for this patch series.

[PATCH v1 2/3] RISC-V: Add testcases for scalar unsigned integer SAT_ADD form 7

2025-04-28 Thread pan2 . li
From: Pan Li This patch will add testcase for unsigned integer SAT_ADD form 7: #define DEF_SAT_U_ADD_FMT_7(WT, T) \ T __attribute__((noinline))\ sat_u_add_##WT##_##T##_fmt_7(T x, T y) \ { \ T max = -1; \

[PATCH v1 0/3] Support form 7 of unsigned integer SAT_ADD

2025-04-28 Thread pan2 . li
From: Pan Li This patch series would like to support form 7 of the unsigned integer SAT_ADD. Unlike the other forms of SAT_ADD, form 7 will leverage a wider type to tell whether overflow occurred, aka: #define DEF_SAT_U_ADD_FMT_7(WT, T) \ T __attribute__((noinline))\ sat_u_
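
The wider-type idea described in this snippet can be sketched as below. This is only an illustration for one type pair (uint16_t as the wider type, uint8_t as the result type); the function name sat_u_add_u8_wide is hypothetical and is not the truncated macro from the patch.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only, not the macro from the patch.
   Saturating unsigned add of two 8-bit values: compute the sum in a wider
   16-bit type, where it cannot wrap, then clamp it to the 8-bit maximum. */
uint8_t sat_u_add_u8_wide(uint8_t x, uint8_t y)
{
    uint8_t max = (uint8_t)-1;                  /* all ones, i.e. 255 */
    uint16_t sum = (uint16_t)x + (uint16_t)y;   /* at most 510, fits in 16 bits */
    return sum > max ? max : (uint8_t)sum;
}
```

The comparison against max in the wider type is what detects overflow here, instead of checking for wrap-around in the narrow type.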

[PATCH v1 1/3] Match: Support form 7 for unsigned integer SAT_ADD

2025-04-28 Thread pan2 . li
From: Pan Li This patch would like to support the form 7 of the unsigned integer SAT_ADD, aka below example. #define DEF_SAT_U_ADD_FMT_7(WT, T) \ T __attribute__((noinline))\ sat_u_add_##WT##_##T##_fmt_7(T x, T y) \ { \ T max = -1;

[PATCH v1 3/3] RISC-V: Add testcases for vector unsigned integer SAT_ADD form 7

2025-04-28 Thread pan2 . li
From: Pan Li This patch will add testcase for unsigned integer SAT_ADD form 7: #define DEF_VEC_SAT_U_ADD_FMT_9(WT, T) \ void __attribute__((noinline)) \ vec_sat_u_add_##WT##_##T##_fmt_9 (T *out, T *op_1, T *o

[PATCH v1 4/4] RISC-V: Extract vector stepped for expand_const_vector [NFC]

2025-04-22 Thread pan2 . li
From: Pan Li Considering that expand_const_vector is quite long (about 500 lines) and complicated, we would like to extract the different cases into different functions. For example, the const vector stepped will be extracted into expand_const_vector_stepped. The below test suites are passed for this

[PATCH v1 3/4] RISC-V: Extract vector duplicate for expand_const_vector [NFC]

2025-04-22 Thread pan2 . li
From: Pan Li Considering that expand_const_vector is quite long (about 500 lines) and complicated, we would like to extract the different cases into different functions. For example, the const vector duplicate will be extracted into expand_const_vector_duplicate, and then expand_const_vector_duplicate

[PATCH v1 2/4] RISC-V: Extract vec_series for expand_const_vector [NFC]

2025-04-22 Thread pan2 . li
From: Pan Li Considering that expand_const_vector is quite long (about 500 lines) and complicated, we would like to extract the different cases into different functions. For example, the const vec_series will be extracted into expand_const_vec_series. The below test suites are passed for this patch.

[PATCH v1 1/4] RISC-V: Extract vec_duplicate for expand_const_vector [NFC]

2025-04-22 Thread pan2 . li
From: Pan Li Considering that expand_const_vector is quite long (about 500 lines) and complicated, we would like to extract the different cases into different functions. For example, the const vec_duplicate will be extracted into expand_const_vec_duplicate. The below test suites are passed for this p

[PATCH v1 0/4] Refactor long function expand_const_vector

2025-04-22 Thread pan2 . li
From: Pan Li Per the discussion in the PR118931 thread, expand_const_vector is quite long, with more than 500 lines, which is unfriendly for debugging and maintenance. Thus, we extract some sub-functions to make it clearer and delegate the concrete const vector expansion to those sub-functions. Aka: expan

[PATCH v2 3/3] RISC-V: Add testcases for vec_duplicate + vadd.vv combine to vadd.vx

2025-04-19 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vadd.vv combine to vadd.vx. Introduce new folder to hold all related testcases. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/rvv.ex

[PATCH v2 1/3] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx on GR2VR cost

2025-04-19 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vadd.vv to the vadd.vx. For example, see the code below. The related pattern will depend on the cost of vec_duplicate from GR2VR, it will: * The pattern matching will be inactive if GR2VR cost is zero. * The cost of GR2VR will be add

[PATCH v2 0/3] Introduce vec_dup + vadd.vv combine to vadd.vx

2025-04-19 Thread pan2 . li
From: Pan Li This patch series would like to introduce the vec_dup + vadd.vv combine to vadd.vx, based on the cost of the GR2VR. For example as below. v1 = vec_dup(x2) v2 = vec_add_vv(v3, v1) will be optimized to below in late-combine v2 = vec_add_vx(v3, x2) If and only if the cost of (vec_d
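
As a rough illustration of the source shape this combine targets, the loop below adds a scalar to every element of an array. The function name vadd_scalar_i32 is hypothetical and not taken from the patch series. When auto-vectorized for RVV (for instance with something like -O3 -march=rv64gcv), a loop like this would typically broadcast x into a vector and add it element-wise, which is the vec_dup + vadd.vv pair the series aims to fold into a single vadd.vx. Whether the fold actually happens depends on the GR2VR cost discussed in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example for illustration only, not code from the patch.
   Adds the scalar x to each of the n elements of in and stores the
   result in out. The scalar operand x is the candidate for the .vx form. */
void vadd_scalar_i32(int32_t *restrict out, const int32_t *restrict in,
                     int32_t x, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] + x;
}
```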

[PATCH v2 2/3] RISC-V: Adjust the testcases after vec_duplicate + vadd.vv combine

2025-04-19 Thread pan2 . li
From: Pan Li After we support the vec_duplicate + vadd.vv combine to vadd.vx, the existing testcases need some adjustment to the asm dump check counts. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/

[PATCH 3/3][GCC16-Stage-1] RISC-V: Add testcases for vec_duplicate + vadd.vv combine to vadd.vx

2025-04-17 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vadd.vv combine to vadd.vx. Introduce new folder to hold all related testcases. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/rvv.ex

[PATCH 2/3][GCC16-Stage-1] RISC-V: Adjust the testcases after vec_duplicate + vadd.vv combine

2025-04-17 Thread pan2 . li
From: Pan Li After we support the vec_duplicate + vadd.vv combine to vadd.vx, the existing testcases need some adjustment to the asm dump check counts. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/

[PATCH 1/3][GCC16-Stage-1] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx on GR2VR cost

2025-04-17 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vadd.vv to the vadd.vx. For example, see the code below. The related pattern will depend on the cost of vec_duplicate from GR2VR, it will: * The pattern matching will be inactive if GR2VR cost is zero. * The cost of GR2VR will be add

[PATCH v1][GCC16-Stage-1] RISC-V: Remove unnecessary frm restore volatile define_insn

2025-04-16 Thread pan2 . li
From: Pan Li After we add the frm register to the global_regs, we may no longer need the volatile define_insn to emit the frm restore insns. The cooperatively-managed global register will help to handle this, instead of emitting the volatile define_insn explicitly. gcc/ChangeLog: * config/ri

[PATCH v1] RISC-V: Refine the testcases for cond_widen_complicate-3

2025-03-12 Thread pan2 . li
From: Pan Li Rearrange the test cases of cond_widen_complicate-3 by type into different files, instead of putting all types together. Then we can easily narrow down the range when an asm check fails. The below test suites passed locally; let's wait to see what the online CI says. * The rv64gcv fully regress
