[PATCH v1] RISC-V: Adjust overlap attr after revert d3544cea63d and e65aaf8efe1

2024-04-22 Thread pan2 . li
From: Pan Li After we reverted below 2 commits, the reference to attr need some adjustment as the group_overlap is no longer available. * RISC-V: Robostify the W43, W86, W87 constraint enabled attribute * RISC-V: Rename vconstraint into group_overlap The below tests are passed for this patch.

[PATCH v1] RISC-V: Add xfail test case for highpart overlap of vext.vf

2024-04-23 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. 62685890d88 RISC-V: Support highpart overlap for vext.vf The below test suites are passed for this patc

[PATCH v1] Revert "RISC-V: Support highpart register overlap for vwcvt"

2024-04-24 Thread pan2 . li
From: Pan Li This reverts commit bdad036da32f72b84a96070518e7d75c21706dc2. --- gcc/config/riscv/constraints.md | 23 gcc/config/riscv/riscv.md | 24 gcc/config/riscv/vector-crypto.md | 21 ++-- gcc/config/riscv/vector.md

[PATCH v1] RISC-V: Add early clobber to the dest of vwsll

2024-04-24 Thread pan2 . li
From: Pan Li We missed the existing early clobber for the dest operand of the vwsll pattern when resolving the conflict from reverting register overlap. Thus add it back to the pattern. Unfortunately, we have no test to cover this part and will improve this after GCC-15 opens. The below tests are passed f

[PATCH v1] RISC-V: Add xfail test case for highpart register overlap of vwcvt

2024-04-24 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. bdad036da32 RISC-V: Support highpart register overlap for vwcvt The below test suites are passed for th

[PATCH v1] RISC-V: Add test cases for insn does not satisfy its constraints [PR114714]

2024-04-25 Thread pan2 . li
From: Pan Li We have one ICE when RVV register overlap is enabled. We reverted this feature as it is in stage 4 and there is not much time to figure out a better solution for this. Thus, for now add the related test cases, which will trigger the ICE when register overlap is enabled. This will gate the RVV

[PATCH v1] RISC-V: Fix ICE for legitimize move on subreg const_poly_move

2024-04-27 Thread pan2 . li
From: Pan Li When we build with isl, there will be an ICE for graphite in both C/C++ and Fortran. The legitimize move cannot handle the RTL below. (set (subreg:DI (reg:TI 237) 8) (subreg:DI (const_poly_int:TI [4, 2]) 8)) Then we will have an ICE similar to below: internal compiler error: in

[PATCH v2] RISC-V: Fix ICE for legitimize move on subreg const_poly_int [PR114885]

2024-04-29 Thread pan2 . li
From: Pan Li When we build with isl, there will be an ICE for graphite in both C/C++ and Fortran. The legitimize move cannot handle the RTL below. (set (subreg:DI (reg:TI 237) 8) (subreg:DI (const_poly_int:TI [4, 2]) 8)) Then we will have an ICE similar to below: internal compiler error: in

[PATCH v3] Internal-fn: Introduce new internal function SAT_ADD

2024-04-29 Thread pan2 . li
From: Pan Li Update in v3: * Rebase upstream for conflict. Update in v2: * Fix one failure for x86 bootstrap. Original log: This patch would like to add the middle-end representation for the saturating add, i.e. set the result of the add to the max on overflow. It will take the pattern similar as

[PATCH v3] DSE: Fix ICE after allow vector type in get_stored_val

2024-04-30 Thread pan2 . li
From: Pan Li Previously, we allowed vector types in get_stored_val when the read is less than or equal to the store. Unfortunately, validate_subreg treats a vector type whose size is less than a vector register as invalid, which leads to an ICE. This patch would like to fix it by filter-out th

[PATCH v4] DSE: Fix ICE after allow vector type in get_stored_val

2024-05-02 Thread pan2 . li
From: Pan Li Previously, we allowed vector types in get_stored_val when the read is less than or equal to the store. Unfortunately, validate_subreg treats a vector type whose size is less than a vector register as invalid, which leads to an ICE. This patch would like to fix it by filter-out th

[PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-06 Thread pan2 . li
From: Pan Li This patch would like to add the middle-end representation for the saturating add, i.e. set the result of the add to the max on overflow. It takes a pattern similar to the one below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have: * SAT_AD
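
The quoted pattern can be written out as plain C for uint8_t. This is a minimal sketch, not code from the patch; the function name `sat_u_add_u8` is hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Branchless unsigned saturating add for uint8_t, following the quoted
   pattern (x + y) | (-(TYPE)((TYPE)(x + y) < x)).
   The function name is hypothetical, not taken from the patch.  */
static uint8_t
sat_u_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);      /* wraps modulo 256 */
  uint8_t mask = (uint8_t) -(sum < x);  /* 0xff if the add wrapped, else 0 */
  return (uint8_t) (sum | mask);        /* all ones on overflow */
}
```

The comparison `sum < x` is true exactly when the addition wrapped, so the negated result acts as an all-ones mask that forces the maximum value.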

[PATCH v4 2/3] VECT: Support new IFN SAT_ADD for unsigned vector int

2024-05-06 Thread pan2 . li
From: Pan Li This patch depends on below scalar enabling patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html For vectorization, we leverage the existing vect pattern recog to find the pattern similar to the scalar one and let the vectorizer perform the rest for standard name usadd

[PATCH v4 3/3] RISC-V: Implement IFN SAT_ADD for both the scalar and vector

2024-05-06 Thread pan2 . li
From: Pan Li This patch depends on below middle-end enabling patches for scalar and vector. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650823.html The patch also implement the SAT_ADD in the riscv backend as the sample for b

[PATCH v1] RISC-V: Make full-vec-move1.c test robust for optimization

2024-05-08 Thread pan2 . li
From: Pan Li While investigating support for early break autovec, we noticed that the test full-vec-move1.c is optimized to 'return 0;' in the main function body, because the value of the V type is somehow a compile-time constant and the second loop is then considered as assert (true). Thus, th

[PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 scalar

2024-05-11 Thread pan2 . li
From: Pan Li For the vfw vx format RVV intrinsic, the scalar type _Float16 also requires the zvfh extension. Unfortunately, we only check the vector tree type and miss the scalar _Float16 type checking. For example: vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, size_t v

[PATCH v1 1/3] Vect: Support loop len in vectorizable early exit

2024-05-13 Thread pan2 . li
From: Pan Li This patch adds early break auto-vectorization support for target which use length on partial vectorization. Consider this following example: unsigned vect_a[802]; unsigned vect_b[802]; void test (unsigned x, int n) {  for (int i = 0; i < n; i++)  {    vect_b[i] = x + i;    i

[PATCH v1 2/3] RISC-V: Implement vectorizable early exit with vcond_mask_len

2024-05-13 Thread pan2 . li
From: Pan Li This patch depends on below middle-end implementation. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651459.html After we support the loop lens for the vectorizable, we would like to implement the feature for the RISC-V target. Given below example: unsigned vect_a[1923]; un

[PATCH v1 3/3] RISC-V: Enable vectorizable early exit test

2024-05-13 Thread pan2 . li
From: Pan Li This patch depends on below 2 patches. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651459.html https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651460.html After we supported vectorizable early exit in RISC-V, we would like to enable the gcc vect test for vectorizable ear

[committed] RISC-V: Fix format issue for trailing operator [NFC]

2024-05-13 Thread pan2 . li
From: Pan Li This patch would like to fix below format issue of trailing operator. === ERROR type #1: trailing operator (4 error(s)) === gcc/config/riscv/riscv-vector-builtins.cc:4641:39: if ((exts & RVV_REQUIRE_ELEN_FP_16) && gcc/config/riscv/riscv-vector-builtins.cc:4651:39: if ((exts & RVV_

[PATCH v1 1/2] RISC-V: Add testcases for form 3 of unsigned vector .SAT_ADD IMM

2024-08-29 Thread pan2 . li
From: Pan Li This patch would like to add test cases for the unsigned vector .SAT_ADD when one of the operand is IMM. Form 3: #define DEF_VEC_SAT_U_ADD_IMM_FMT_3(T, IMM) \ T __attribute__((noinline)) \ vec_sat_u_add_imm##IMM

[PATCH v1 2/2] RISC-V: Add testcases for form 4 of unsigned vector .SAT_ADD IMM

2024-08-29 Thread pan2 . li
From: Pan Li This patch would like to add test cases for the unsigned vector .SAT_ADD when one of the operand is IMM. Form 4: #define DEF_VEC_SAT_U_ADD_IMM_FMT_4(T, IMM) \ T __attribute__((noinline)) \ vec_sat_u_ad

[PATCH v1] RISC-V: Refactor gen zero_extend rtx for SAT_* when expand SImode in RV64

2024-08-30 Thread pan2 . li
From: Pan Li Previously, we had some special handling for both .SAT_ADD and .SAT_SUB for unsigned int. Both take care of SImode in RV64 for zero extension in a similar way. Thus refactor these two helper functions into one to avoid code duplication. The below test suites are passed for t

[PATCH v1] Vect: Support form 1 of vector signed integer .SAT_ADD

2024-08-30 Thread pan2 . li
From: Pan Li This patch would like to support the vector signed ssadd pattern for the RISC-V backend. Aka Form 1: #define DEF_VEC_SAT_S_ADD_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T *out, T *x, T *y, unsign

[PATCH v1 1/2] Match: Add int type fits check for form 1 of .SAT_SUB imm operand

2024-09-01 Thread pan2 . li
From: Pan Li This patch would like to add a strict check for the imm operand of .SAT_SUB matching. Previously we had no type checking for the imm operand, which may result in unexpected IL being caught by the .SAT_SUB pattern. We leverage int_fits_type_p here to make sure the imm operand is an int type

[PATCH v1 2/2] Match: Add int type fits check for form 2 of .SAT_SUB imm operand

2024-09-01 Thread pan2 . li
From: Pan Li This patch would like to add a strict check for the imm operand of .SAT_SUB matching. Previously we had no type checking for the imm operand, which may result in unexpected IL being caught by the .SAT_SUB pattern. We leverage int_fits_type_p here to make sure the imm operand is an int type

[PATCH v1] RISC-V: Allow IMM operand for unsigned scalar .SAT_ADD

2024-09-02 Thread pan2 . li
From: Pan Li This patch would like to allow the IMM operand of the unsigned scalar .SAT_ADD. Like the operand 0, the operand 1 of .SAT_ADD will be zero extended to Xmode before underlying code generation. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc

[PATCH v1] Match: Support form 2 for scalar signed integer .SAT_ADD

2024-09-03 Thread pan2 . li
From: Pan Li This patch would like to support the form 2 of the scalar signed integer .SAT_ADD. Aka below example: Form 2: #define DEF_SAT_S_ADD_FMT_2(T, UT, MIN, MAX) \ T __attribute__((noinline)) \ sat_s_add_##T##_fmt_2 (T x, T y) \ {

[PATCH v1 1/2] Genmatch: Support new flow for phi on condition

2024-09-04 Thread pan2 . li
From: Pan Li The gen_phi_on_cond can only support below control flow for cond from day 1. Aka: [ASCII control-flow diagram: a block holding def and cond branches to a second def block, and both flow into a PHI node] U

[PATCH v1 2/2] Match: Support form 3 for scalar signed integer .SAT_ADD

2024-09-04 Thread pan2 . li
From: Pan Li This patch would like to support the form 3 of the scalar signed integer .SAT_ADD. Aka below example: Form 3: #define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \ T __attribute__((noinline))\ sat_s_add_##T##_fmt_3 (T x, T y)

[PATCH v1] RISC-V: Fix SAT_* dump check failure due to middle-end change.

2024-09-04 Thread pan2 . li
From: Pan Li Some middle-end changes may affect the number of times .SAT_* appears. Thus, refine the dump check for SAT_* from scan-times to scan, as we only care whether .SAT_* exists. Another PATCH will perform a similar refinement; this PATCH only fixes the failed test cases. gcc

[PATCH v2 1/2] Genmatch: Support control flow graph case 1 for phi on condition

2024-09-05 Thread pan2 . li
From: Pan Li The gen_phi_on_cond can only support below control flow for cond from day 1. Aka: [ASCII control-flow diagram: a block holding def and cond branches to a second def block, and both flow into a PHI node] U

[PATCH v2 2/2] Match: Support form 3 for scalar signed integer .SAT_ADD

2024-09-05 Thread pan2 . li
From: Pan Li This patch would like to support the form 3 of the scalar signed integer .SAT_ADD. Aka below example: Form 3: #define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \ T __attribute__((noinline))\ sat_s_add_##T##_fmt_3 (T x, T y)

[PATCH v1] RISC-V: Fix asm check for Vector SAT_* due to middle-end change

2024-09-10 Thread pan2 . li
From: Pan Li The middle-end change makes the effect on the layout of the assembly for vector SAT_*. This patch would like to fix it and make it robust. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: Adjust asm check and make it robust.

[PATCH v3 1/5] Genmatch: Add control flow graph match for case 0 and case 1

2024-09-10 Thread pan2 . li
From: Pan Li The gen_phi_on_cond can only support below control flow for cond from day 1. Aka: [ASCII control-flow diagram: a block holding def and cond branches to a second def block, and both flow into a PHI node] U

[PATCH v3 3/5] Genmatch: Refine the gen_phi_on_cond by match_cond_with_binary_phi

2024-09-10 Thread pan2 . li
From: Pan Li This patch would like to leverage the match_cond_with_binary_phi to match the phi on cond, and get the true/false arg if matched. This helps a lot to simplify the implementation of gen_phi_on_cond. Before this patch: basic_block _b1 = gimple_bb (_a1); if (gimple_phi_num_args (_a1)

[PATCH v3 4/5] Match: Support form 3 for scalar signed integer .SAT_ADD

2024-09-10 Thread pan2 . li
From: Pan Li This patch would like to support the form 3 of the scalar signed integer .SAT_ADD. Aka below example: Form 3: #define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \ T __attribute__((noinline))\ sat_s_add_##T##_fmt_3 (T x, T y)

[PATCH v3 2/5] Match: Add interface match_cond_with_binary_phi for true/false arg

2024-09-10 Thread pan2 . li
From: Pan Li When matching the cond with 2 args phi node, we need to figure out which arg of phi node comes from the true edge of cond block, as well as the false edge. This patch would like to add interface to perform the action and return the true and false arg in TREE type. There will be som

[PATCH v3 5/5] RISC-V: Fix vector SAT_ADD dump check due to middle-end change

2024-09-10 Thread pan2 . li
From: Pan Li This patch would like to fix the dump check times of vector SAT_ADD. The middle-end change raises the match count from 2 to 4. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/b

[PATCH v1] RISC-V: Implement SAT_ADD for signed integer vector

2024-09-11 Thread pan2 . li
From: Pan Li This patch would like to implement the ssadd for vector integer. Aka form 1 of ssadd vector. Form 1: #define DEF_VEC_SAT_S_ADD_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T *out

[PATCH v4 1/4] Match: Add interface match_cond_with_binary_phi for true/false arg

2024-09-12 Thread pan2 . li
From: Pan Li When matching the cond with 2 args phi node, we need to figure out which arg of phi node comes from the true edge of cond block, as well as the false edge. This patch would like to add interface to perform the action and return the true and false arg in TREE type. There will be som

[PATCH v4 2/4] Genmatch: Refine the gen_phi_on_cond by match_cond_with_binary_phi

2024-09-12 Thread pan2 . li
From: Pan Li This patch would like to leverage the match_cond_with_binary_phi to match the phi on cond, and get the true/false arg if matched. This helps a lot to simplify the implementation of gen_phi_on_cond. Before this patch: basic_block _b1 = gimple_bb (_a1); if (gimple_phi_num_args (_a1)

[PATCH v4 3/4] Match: Support form 3 for scalar signed integer .SAT_ADD

2024-09-12 Thread pan2 . li
From: Pan Li This patch would like to support the form 3 of the scalar signed integer .SAT_ADD. Aka below example: Form 3: #define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \ T __attribute__((noinline))\ sat_s_add_##T##_fmt_3 (T x, T y)

[PATCH v4 4/4] RISC-V: Fix vector SAT_ADD dump check due to middle-end change

2024-09-12 Thread pan2 . li
From: Pan Li This patch would like to fix the dump check times of vector SAT_ADD. The middle-end change raises the match count from 2 to 4. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/b

[PATCH v1] RISC-V: Fix signed SAT_ADD test case for int64_t

2024-09-12 Thread pan2 . li
From: Pan Li The int8_t test for signed SAT_ADD is sat_s_add-1.c, the sat_s_add-4.c should be for int64_t. Thus, update sat_s_add-4.c for int64_t type. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_s_add-4.c: Update test for int64_t instead of int8_t. Signed-off-by: Pan Li

[PATCH v1] RISC-V: Add testcases for form 2 of signed scalar SAT_ADD

2024-09-12 Thread pan2 . li
From: Pan Li This patch would like to add testcases of the signed scalar SAT_ADD for form 2. Aka: Form 2: #define DEF_SAT_S_ADD_FMT_2(T, UT, MIN, MAX) \ T __attribute__((noinline)) \ sat_s_add_##T##_fmt_2 (T x, T y) \ {

[PATCH v1] Match: Remove unnecessary types_match for case 1 of signed SAT_ADD

2024-09-12 Thread pan2 . li
From: Pan Li Given all commutative binary operators requires types matching for both operands. Remove the types_match check for case 1 of the signed SAT_ADD, because we have (bit_xor @0 @1), which ensure the operands have the correct TREE type. The below test suites are passed for this patch. *
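
To show why an xor of the two operands appears in signed saturating-add matching, here is a minimal C sketch for int8_t. It is an illustration under stated assumptions, not the match.pd pattern or any of the numbered forms from the patches; the function name `sat_s_add_i8` is hypothetical.

```c
#include <stdint.h>

/* Signed saturating add for int8_t (hypothetical helper, for illustration).
   Overflow is detected from sign bits only, using two xors.  */
static int8_t
sat_s_add_i8 (int8_t x, int8_t y)
{
  uint8_t ux = (uint8_t) x;
  uint8_t uy = (uint8_t) y;
  uint8_t usum = (uint8_t) (ux + uy);  /* wrapped two's-complement sum */

  /* Overflow requires operands of the same sign (sign bit of x ^ y clear)
     and a wrapped sum whose sign differs from x (sign bit of sum ^ x set).  */
  int same_sign = ((ux ^ uy) & 0x80) == 0;
  int sign_flipped = ((usum ^ ux) & 0x80) != 0;

  if (same_sign && sign_flipped)
    return x < 0 ? INT8_MIN : INT8_MAX;

  return (int8_t) (x + y);  /* in range here, so the conversion is exact */
}
```

Operands of opposite sign can never overflow, which is why the sign of `x ^ y` is checked first.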

[PATCH v1 4/4] RISC-V: Add testcases for form 1 of scalar signed SAT_TRUNC

2024-10-08 Thread pan2 . li
From: Pan Li Form 1: #define DEF_SAT_S_TRUNC_FMT_1(WT, NT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_1 (WT x) \ { \ NT trunc = (NT)x;

[PATCH v1 3/4] RISC-V: Implement scalar SAT_TRUNC for signed integer

2024-10-08 Thread pan2 . li
From: Pan Li This patch would like to implement the sstrunc for scalar signed integer. Form 1: #define DEF_SAT_S_TRUNC_FMT_1(WT, NT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_1 (WT x) \ {

[PATCH v1 1/4] Match: Support form 1 for scalar signed integer SAT_TRUNC

2024-10-08 Thread pan2 . li
From: Pan Li This patch would like to support the form 1 of the scalar signed integer SAT_TRUNC. Aka below example: Form 1: #define DEF_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_1 (WT x)
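
The macro body is cut off in the snippet, so the following is only a sketch of the general semantics of a signed saturating truncation (clamp a wide value into the narrow type's range), not a reconstruction of form 1. The function name `sat_s_trunc_i16_to_i8` is hypothetical, loosely following the naming scheme in the quoted macro.

```c
#include <stdint.h>

/* Signed saturating truncation from int16_t to int8_t.
   Hypothetical helper showing the clamping semantics only; it is not
   the form 1 body from the patch, which is truncated in the listing.  */
static int8_t
sat_s_trunc_i16_to_i8 (int16_t x)
{
  if (x > INT8_MAX)
    return INT8_MAX;  /* too large: saturate to the narrow max */
  if (x < INT8_MIN)
    return INT8_MIN;  /* too small: saturate to the narrow min */
  return (int8_t) x;  /* already representable */
}
```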

[PATCH v1 2/4] Widening-Mul: Fix one bug of consume after phi node released

2024-10-08 Thread pan2 . li
From: Pan Li When trying to match saturation-related patterns on a PHI node, we may have to try each pattern for all PHI nodes of the bb. Aka: for each PHI node in bb: gphi *phi = xxx; try_match_sat_add (, phi); try_match_sat_sub (, phi); try_match_sat_trunc (, phi); The PHI node will be remov

[PATCH v1 2/2] RISC-V: Add testcases for form 2 of scalar signed SAT_TRUNC

2024-10-08 Thread pan2 . li
From: Pan Li Form 2: #define DEF_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_2 (WT x) \ { \ NT trunc = (NT)x;

[PATCH v1 1/2] Match: Support form 2 for scalar signed integer SAT_TRUNC

2024-10-08 Thread pan2 . li
From: Pan Li This patch would like to support the form 2 of the scalar signed integer SAT_TRUNC. Aka below example: Form 2: #define DEF_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_2 (WT x)

[PATCH] Match: Remove dup match pattern for signed_integer_sat_sub [PR117141]

2024-10-14 Thread pan2 . li
From: Pan Li This patch would like to fix the warning as below: /home/slyfox/dev/git/gcc/gcc/match.pd:3424:3 warning: duplicate pattern (cond^ (ne (imagpart (IFN_SUB_OVERFLOW:c@2 @0 @1)) integer_zerop) ^ /home/slyfox/dev/git/gcc/gcc/match.pd:3397:3 warning: previous pattern defined here (con

[PATCH] RISC-V: Fix UNRESOLVED testcases for SAT alu vector mode

2024-10-14 Thread pan2 . li
From: Pan Li Some saturation-related alu testcases missed the additional option for the expand check, which results in some UNRESOLVED issues. This patch would like to fix it by adding the option back, as in the other testcases. The below tests are passed for this patch. * The rv64gcv fully regression test. It

[PATCH 03/11] RISC-V: Implement vector SAT_TRUNC for signed integer

2024-10-14 Thread pan2 . li
From: Pan Li This patch would like to implement the sstrunc for vector signed integer. Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *i

[PATCH 01/11] Match: Support form 1 for vector signed integer SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li This patch would like to support the form 1 of the vector signed integer SAT_TRUNC. Aka below example: Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT#

[PATCH 09/11] RISC-V: Add testcases for form 6 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 6: #define DEF_VEC_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_6 (NT *out, WT *in, unsigned limit) \ {

[PATCH 06/11] RISC-V: Add testcases for form 3 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 3: #define DEF_VEC_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_3 (NT *out, WT *in, unsigned limit) \ {

[PATCH 07/11] RISC-V: Add testcases for form 4 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 4: #define DEF_VEC_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \ {

[PATCH 04/11] RISC-V: Add testcases for form 1 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \ {

[PATCH 02/11] Vect: Try the pattern of vector signed integer SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Almost the same as vector unsigned integer SAT_TRUNC, try to match the signed version during the vector pattern matching. The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog:

[PATCH 08/11] RISC-V: Add testcases for form 5 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 5: #define DEF_VEC_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_5 (NT *out, WT *in, unsigned limit) \ {

[PATCH 05/11] RISC-V: Add testcases for form 2 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 2: #define DEF_VEC_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_2 (NT *out, WT *in, unsigned limit) \ {

[PATCH 10/11] RISC-V: Add testcases for form 7 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 7: #define DEF_VEC_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_7 (NT *out, WT *in, unsigned limit) \ {

[PATCH 11/11] RISC-V: Add testcases for form 8 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 8: #define DEF_VEC_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_8 (NT *out, WT *in, unsigned limit) \ {

[PATCH v1 2/2] RISC-V: Add testcases for form 3 of scalar signed SAT_TRUNC

2024-10-09 Thread pan2 . li
From: Pan Li Form 3: #define DEF_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_3 (WT x) \ { \ NT trunc = (NT)x;

[PATCH v1 1/2] Match: Support form 3 for scalar signed integer SAT_TRUNC

2024-10-09 Thread pan2 . li
From: Pan Li This patch would like to support the form 3 of the scalar signed integer SAT_TRUNC. Aka below example: Form 3: #define DEF_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_3 (WT x)

[PATCH v1 1/4] Match: Support form 1 for vector signed integer SAT_SUB

2024-10-10 Thread pan2 . li
From: Pan Li This patch would like to support the form 1 of the vector signed integer SAT_SUB. Aka below example: Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1

[PATCH v1 2/4] Vect: Try the pattern of vector signed integer SAT_SUB

2024-10-10 Thread pan2 . li
From: Pan Li Almost the same as vector unsigned integer SAT_SUB, try to match the signed version during the vector pattern matching. The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog:

[PATCH v1 3/4] RISC-V: Implement vector SAT_SUB for signed integer

2024-10-10 Thread pan2 . li
From: Pan Li This patch would like to implement the sssub for vector signed integer. Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, un

[PATCH v1 4/4] RISC-V: Add testcases for form 1 of vector signed SAT_SUB

2024-10-10 Thread pan2 . li
From: Pan Li Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \ {

[PATCH v1 1/2] Match: Support form 4 for scalar signed integer SAT_TRUNC

2024-10-09 Thread pan2 . li
From: Pan Li This patch would like to support the form 4 of the scalar signed integer SAT_TRUNC. Aka below example: Form 4: #define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x)

[PATCH v1 2/2] RISC-V: Add testcases for form 4 of scalar signed SAT_TRUNC

2024-10-09 Thread pan2 . li
From: Pan Li Form 4: #define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x) \ { \ NT trunc = (NT)x;

[PATCH v1 2/3] RISC-V: Add testcases for form 3 of scalar signed SAT_SUB

2024-10-07 Thread pan2 . li
From: Pan Li Form 3: #define DEF_SAT_S_SUB_FMT_3(T, UT, MIN, MAX) \ T __attribute__((noinline)) \ sat_s_sub_##T##_fmt_3 (T x, T y) \ {\ T minus;

[PATCH v1 3/3] RISC-V: Add testcases for form 4 of scalar signed SAT_SUB

2024-10-07 Thread pan2 . li
From: Pan Li Form 4: #define DEF_SAT_S_SUB_FMT_4(T, UT, MIN, MAX) \ T __attribute__((noinline))\ sat_s_sub_##T##_fmt_4 (T x, T y) \ { \ T minus;

[PATCH v1 1/3] Match: Support form 3 and form 4 for scalar signed integer SAT_SUB

2024-10-07 Thread pan2 . li
From: Pan Li This patch would like to support the form 3 and form 4 of the scalar signed integer SAT_SUB. Aka below example: Form 3: #define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \ T __attribute__((noinline))\ sat_s_add_##T##_fmt_3 (T x, T y)

[PATCH v1 4/4] RISC-V: Add testcases for form 8 of scalar signed SAT_TRUNC

2024-10-10 Thread pan2 . li
From: Pan Li Form 8: #define DEF_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_8 (WT x) \ { \ NT trunc = (NT)x;

[PATCH v1 2/4] RISC-V: Add testcases for form 6 of scalar signed SAT_TRUNC

2024-10-10 Thread pan2 . li
From: Pan Li Form 6: #define DEF_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_6 (WT x) \ { \ NT trunc = (NT)x;

[PATCH v1 3/4] RISC-V: Add testcases for form 7 of scalar signed SAT_TRUNC

2024-10-10 Thread pan2 . li
From: Pan Li Form 7: #define DEF_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_7 (WT x) \ { \ NT trunc = (NT)x;

[PATCH v1 1/4] RISC-V: Add testcases for form 5 of scalar signed SAT_TRUNC

2024-10-10 Thread pan2 . li
From: Pan Li Form 5: #define DEF_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_5 (WT x) \ { \ NT trunc = (NT)x;

[PATCH 1/5] Internal-fn: Introduce new IFN MASK_LEN_STRIDED_LOAD{STORE}

2024-10-23 Thread pan2 . li
From: Pan Li This patch would like to introduce new IFN for strided load and store. LOAD: v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias) STORE: MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias) The IFN targets code similar to the example below: void foo (int * a, int * b, int
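
A scalar model can make the operand list of the two IFNs easier to read. The sketch below is hedged throughout: the function names, the `nlanes` parameter, and the int32_t element type are hypothetical, and two semantics are assumptions rather than facts stated in the snippet: that the stride is counted in bytes, and that lane i is active only when its mask element is set and i < len + bias.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of v = MASK_LEN_STRIDED_LOAD (ptr, stride,
   mask, len, bias).  Assumes a byte stride and that lane i is active
   when mask[i] is set and i < len + bias; inactive lanes are untouched.  */
static void
model_strided_load (int32_t *v, const void *ptr, ptrdiff_t stride,
                    const uint8_t *mask, size_t len, ptrdiff_t bias,
                    size_t nlanes)
{
  for (size_t i = 0; i < nlanes; i++)
    if (mask[i] && (ptrdiff_t) i < (ptrdiff_t) len + bias)
      memcpy (&v[i], (const char *) ptr + (ptrdiff_t) i * stride,
              sizeof v[i]);
}

/* Hypothetical scalar model of MASK_LEN_STRIDED_STORE (ptr, stride, v,
   mask, len, bias), under the same assumptions as the load model.  */
static void
model_strided_store (void *ptr, ptrdiff_t stride, const int32_t *v,
                     const uint8_t *mask, size_t len, ptrdiff_t bias,
                     size_t nlanes)
{
  for (size_t i = 0; i < nlanes; i++)
    if (mask[i] && (ptrdiff_t) i < (ptrdiff_t) len + bias)
      memcpy ((char *) ptr + (ptrdiff_t) i * stride, &v[i],
              sizeof v[i]);
}
```

With a stride of two elements and `len` set to 3, for instance, the load model reads every other element for at most the first three lanes and skips any lane whose mask element is zero.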

[PATCH 2/5] Vect: Introduce MASK_LEN_STRIDED_LOAD{STORE} to loop vectorizer

2024-10-23 Thread pan2 . li
From: Pan Li This patch would like to allow generation of MASK_LEN_STRIDED_LOAD{STORE} IR for invariant stride memory access. For example as below void foo (int * __restrict a, int * __restrict b, int stride, int n) { for (int i = 0; i < n; i++) a[i*stride] = b[i*stride] + 100; } Bef

[PATCH 3/5] RISC-V: Adjust the gather-scatter testcases due to middle-end change

2024-10-23 Thread pan2 . li
From: Pan Li After we have MASK_LEN_STRIDED_LOAD{STORE} in the middle-end, the strided cases need to be adjusted for the IR check. The below test suites are passed for this patch: * The riscv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/gather-scatter/strided

[PATCH 5/5] RISC-V: Add testcases for form 1 of MASK_LEN_STRIDED_LOAD{STORE}

2024-10-23 Thread pan2 . li
From: Pan Li Form 1: void __attribute__((noinline))\ vec_strided_load_store_##T##_form_1 (T *restrict out, T *restrict in, \ long stride, size_t size)\ {

[PATCH 4/5] RISC-V: Implement the MASK_LEN_STRIDED_LOAD{STORE}

2024-10-23 Thread pan2 . li
From: Pan Li This patch would like to implement the MASK_LEN_STRIDED_LOAD{STORE} in the RISC-V backend by leveraging the vector strided load/store insn. For example: void foo (int * __restrict a, int * __restrict b, int stride, int n) { for (int i = 0; i < n; i++) a[i*stride] = b[i*stri

[PATCH 1/4] RISC-V: Add testcases for form 2 of vector signed SAT_SUB

2024-10-11 Thread pan2 . li
From: Pan Li Form 2: #define DEF_VEC_SAT_S_SUB_FMT_2(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_sub_##T##_fmt_2 (T *out, T *op_1, T *op_2, unsigned limit) \ {

[PATCH 2/4] Match: Support form 3 for vector signed integer SAT_SUB

2024-10-11 Thread pan2 . li
From: Pan Li This patch would like to support the form 3 of the vector signed integer SAT_SUB. Aka below example: Form 3: #define DEF_VEC_SAT_S_SUB_FMT_3(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_sub_##T##_fmt_3

[PATCH 4/4] RISC-V: Add testcases for form 4 of vector signed SAT_SUB

2024-10-11 Thread pan2 . li
From: Pan Li Form 4: #define DEF_VEC_SAT_S_SUB_FMT_4(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_sub_##T##_fmt_4 (T *out, T *op_1, T *op_2, unsigned limit) \ {

[PATCH 3/4] RISC-V: Add testcases for form 3 of vector signed SAT_SUB

2024-10-11 Thread pan2 . li
From: Pan Li Form 3: #define DEF_VEC_SAT_S_SUB_FMT_3(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_sub_##T##_fmt_3 (T *out, T *op_1, T *op_2, unsigned limit) \ {

[PATCH 1/5] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-10-29 Thread pan2 . li
From: Pan Li There are several forms of the unsigned SAT_ADD. Some of them are complicated while others are cheap. This patch would like to simplify the complicated forms into the cheap ones. For example as below: From the form 4 (branch): SAT_U_ADD = (X + Y) < x ? -1 : (X + Y). To (bran

[PATCH 4/5] Match: Remove usadd_left_part_1 as it has only one reference [NFC]

2024-10-29 Thread pan2 . li
From: Pan Li Previously, we extracted the matching usadd_left_part_1 to avoid duplication. After simplifying some usadd patterns into the cheap form, there is only one reference to this matching. Thus, remove this matching pattern and unfold it at the reference place. The below test suites are pass

[PATCH 5/5] Match: Update the comments of unsigned integer SAT_ADD [NFC]

2024-10-29 Thread pan2 . li
From: Pan Li Some comments of the unsigned integer SAT_ADD matching are out of date. This patch would like to refine them. The below test suites are passed for this patch: 1. The rv64gcv fully regression tests. 2. The x86 bootstrap tests. 3. The x86 fully regression tests. gcc/ChangeLog:

[PATCH] Match: Simplify branch form 3 of unsigned SAT_ADD into branchless

2024-10-25 Thread pan2 . li
From: Pan Li There are several forms of the unsigned SAT_ADD. Some of them are complicated while others are cheap. This patch would like to simplify the complicated forms into the cheap ones. For example as below: From the form 3 (branch): SAT_U_ADD = (X + Y) >= x ? (X + Y) : -1. To (bra

[PATCH v1] Doc: Add doc for standard name mask_len_strided_load{store}m

2024-10-29 Thread pan2 . li
From: Pan Li This patch would like to add doc for the below 2 standard names. 1. strided load: v = mask_len_strided_load (ptr, stride, mask, len, bias) 2. strided store: mask_len_strided_store (ptr, stride, v, mask, len, bias) gcc/ChangeLog: * doc/md.texi: Add doc for mask_len_stried_lo

[PATCH 3/5] Match: Simplify branch form 8 of unsigned SAT_ADD into branchless

2024-10-29 Thread pan2 . li
From: Pan Li There are several forms of the unsigned SAT_ADD. Some of them are complicated while others are cheap. This patch would like to simplify the complicated forms into the cheap ones. For example as below: From the form 8 (branch): SAT_U_ADD = x > (T)(x + y) ? -1 : (x + y). To (b

[PATCH 2/5] Match: Simplify branch form 7 of unsigned SAT_ADD into branchless

2024-10-29 Thread pan2 . li
From: Pan Li There are several forms of the unsigned SAT_ADD. Some of them are complicated while others are cheap. This patch would like to simplify the complicated forms into the cheap ones. For example as below: From the form 7 (branch): SAT_U_ADD = x <= (T)(x + y) ? (x + y) : -1. To (
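
The branch forms 3, 4, 7 and 8 quoted in these entries can be checked against a branchless form by brute force. The target form is cut off in every snippet, so this sketch assumes it is the branchless pattern (x + y) | -((x + y) < x) quoted in the SAT_ADD internal-function entries of this listing. All function names are hypothetical, T is fixed to uint8_t, and the explicit (T) casts on the sums are added here because C promotes uint8_t operands to int.

```c
#include <stdint.h>

typedef uint8_t T;

/* Branch forms as quoted in the listing, with (T) casts added so the
   sum wraps in the narrow type (an adjustment made for this sketch).  */
static T form_3 (T x, T y) { return (T) (x + y) >= x ? (T) (x + y) : (T) -1; }
static T form_4 (T x, T y) { return (T) (x + y) < x ? (T) -1 : (T) (x + y); }
static T form_7 (T x, T y) { return x <= (T) (x + y) ? (T) (x + y) : (T) -1; }
static T form_8 (T x, T y) { return x > (T) (x + y) ? (T) -1 : (T) (x + y); }

/* Assumed branchless target: (x + y) | -((x + y) < x).  */
static T
branchless (T x, T y)
{
  T sum = (T) (x + y);
  return (T) (sum | (T) -(sum < x));
}

/* Return 1 if every branch form matches the branchless form for all
   65536 uint8_t operand pairs, otherwise 0.  */
static int
forms_agree (void)
{
  for (unsigned x = 0; x <= 255; x++)
    for (unsigned y = 0; y <= 255; y++)
      {
        T want = branchless ((T) x, (T) y);
        if (form_3 ((T) x, (T) y) != want
            || form_4 ((T) x, (T) y) != want
            || form_7 ((T) x, (T) y) != want
            || form_8 ((T) x, (T) y) != want)
          return 0;
      }
  return 1;
}
```

An exhaustive check is practical for 8-bit operands; for wider types the same equivalence follows from the fact that an unsigned add wraps exactly when the sum is smaller than either operand.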

[PATCH v1 5/5] Test: Add testcases for form 16 of unsigned integer SAT_ADD simplify

2024-11-05 Thread pan2 . li
From: Pan Li The phiopt2 pass will also try the gimple_simplify for the form 16 of unsigned integer SAT_ADD. Thus add the testcase to make sure it will be performed in phiopt2 pass. The below test suites are passed for this patch: 1. The rv64gcv fully regression tests. 2. The x86 bootstrap test

[PATCH v1 3/5] Test: Add testcases for form 14 of unsigned integer SAT_ADD simplify

2024-11-05 Thread pan2 . li
From: Pan Li The phiopt2 pass will also try the gimple_simplify for the form 14 of unsigned integer SAT_ADD. Thus add the testcase to make sure it will be performed in phiopt2 pass. The below test suites are passed for this patch: 1. The rv64gcv fully regression tests. 2. The x86 bootstrap test

[PATCH v1 2/5] Test: Add testcases for form 13 of unsigned integer SAT_ADD simplify

2024-11-05 Thread pan2 . li
From: Pan Li The phiopt2 pass will also try the gimple_simplify for the form 13 of unsigned integer SAT_ADD. Thus add the testcase to make sure it will be performed in phiopt2 pass. The below test suites are passed for this patch: 1. The rv64gcv fully regression tests. 2. The x86 bootstrap test
