From: Pan Li
After we reverted the 2 commits below, the references to the attr need some
adjustment, as group_overlap is no longer available.
* RISC-V: Robostify the W43, W86, W87 constraint enabled attribute
* RISC-V: Rename vconstraint into group_overlap
The tests below passed for this patch.
From: Pan Li
We reverted the patch below for register group overlap, so add the related
insn test and mark it as xfail. We will remove the xfail
once register overlap is supported in GCC-15.
62685890d88 RISC-V: Support highpart overlap for vext.vf
The below test suites are passed for this patc
From: Pan Li
This reverts commit bdad036da32f72b84a96070518e7d75c21706dc2.
---
gcc/config/riscv/constraints.md | 23
gcc/config/riscv/riscv.md | 24
gcc/config/riscv/vector-crypto.md | 21 ++--
gcc/config/riscv/vector.md
From: Pan Li
We missed the existing early clobber for the dest operand of the vwsll
pattern when resolving the conflict of the register overlap revert. Thus
add it back to the pattern. Unfortunately, we have no test to cover
this part and will improve this after GCC-15 opens.
The below tests are passed f
From: Pan Li
We reverted the patch below for register group overlap, so add the related
insn test and mark it as xfail. We will remove the xfail
once register overlap is supported in GCC-15.
bdad036da32 RISC-V: Support highpart register overlap for vwcvt
The below test suites are passed for th
From: Pan Li
We have one ICE when RVV register overlap is enabled. We reverted this
feature as we are in stage 4 and there is not much time to figure out a better
solution. Thus, for now, add the related test cases which will
trigger the ICE when register overlap is enabled.
This will gate the RVV
From: Pan Li
When we build with isl, there will be an ICE for graphite in both
C/C++ and Fortran. The legitimize move cannot handle the
RTL below.
(set (subreg:DI (reg:TI 237) 8) (subreg:DI (const_poly_int:TI [4, 2]) 8))
Then we will have an ICE similar to the one below:
internal compiler error: in
From: Pan Li
When we build with isl, there will be an ICE for graphite in both
C/C++ and Fortran. The legitimize move cannot handle the
RTL below.
(set (subreg:DI (reg:TI 237) 8) (subreg:DI (const_poly_int:TI [4, 2]) 8))
Then we will have an ICE similar to the one below:
internal compiler error: in
From: Pan Li
Update in v3:
* Rebase upstream for conflict.
Update in v2:
* Fix one failure for x86 bootstrap.
Original log:
This patch would like to add the middle-end representation for the
saturation add, that is, set the result of the add to the max on overflow.
It will take the pattern similar as
From: Pan Li
Previously, we allowed vector types in get_stored_val when the read is
less than or equal to the store. Unfortunately, validate_subreg
treats a vector type whose size is less than a vector register as
invalid. Then we will have an ICE here.
This patch would like to fix it by filter-out th
From: Pan Li
Previously, we allowed vector types in get_stored_val when the read is
less than or equal to the store. Unfortunately, validate_subreg
treats a vector type whose size is less than a vector register as
invalid. Then we will have an ICE here.
This patch would like to fix it by filter-out th
From: Pan Li
This patch would like to add the middle-end representation for the
saturation add, that is, set the result of the add to the max on overflow.
It will match a pattern similar to the one below.
SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
Taking uint8_t as an example, we will have:
* SAT_AD
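To make the formula above concrete, here is a minimal C sketch of the branchless unsigned saturating add for uint8_t. The function name sat_u_add_u8 is made up for illustration and is not taken from the patch; the explicit casts are only there because C promotes uint8_t operands to int.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
   instantiated for TYPE = uint8_t.  The name is hypothetical.  */
static uint8_t
sat_u_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);        /* wraps modulo 256 */
  uint8_t overflow = (uint8_t) (sum < x); /* 1 if the add wrapped, else 0 */
  uint8_t mask = (uint8_t) (-overflow);   /* 0xff on overflow, else 0x00 */
  return (uint8_t) (sum | mask);          /* all ones == UINT8_MAX */
}
```

When the add wraps, the comparison yields 1, its negation is an all-ones mask, and the OR forces the result to the maximum; otherwise the mask is zero and the plain sum is returned.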
From: Pan Li
This patch depends on the scalar enabling patch below:
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html
For vectorization, we leverage the existing vect pattern recog to find
the pattern similar to the scalar one and let the vectorizer perform
the rest for standard name usadd
From: Pan Li
This patch depends on the middle-end enabling patches below for scalar and vector.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650823.html
The patch also implements the SAT_ADD in the riscv backend as
the sample for b
From: Pan Li
While investigating the support of early break autovec, we noticed that
the test full-vec-move1.c will be optimized to 'return 0;' in the main
function body, because the value of V type is somehow a compile
time constant, and then the second loop will be considered as
assert (true).
Thus, th
From: Pan Li
For the vfw vx format RVV intrinsic, the scalar type _Float16 also
requires the zvfh extension. Unfortunately, we only check the
vector tree type and miss the scalar _Float16 type checking. For
example:
vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, size_t v
From: Pan Li
This patch adds early break auto-vectorization support for targets which
use length for partial vectorization. Consider the following example:
unsigned vect_a[802];
unsigned vect_b[802];
void test (unsigned x, int n)
{
for (int i = 0; i < n; i++)
{
vect_b[i] = x + i;
i
From: Pan Li
This patch depends on the middle-end implementation below.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651459.html
After we support the loop lens for the vectorizable early exit, we would like to
implement the feature for the RISC-V target. Given the example below:
unsigned vect_a[1923];
un
From: Pan Li
This patch depends on the 2 patches below.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651459.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651460.html
After we supported vectorizable early exit in RISC-V, we would like to
enable the gcc vect test for vectorizable ear
From: Pan Li
This patch would like to fix the trailing operator format issue below.
=== ERROR type #1: trailing operator (4 error(s)) ===
gcc/config/riscv/riscv-vector-builtins.cc:4641:39: if ((exts &
RVV_REQUIRE_ELEN_FP_16) &&
gcc/config/riscv/riscv-vector-builtins.cc:4651:39: if ((exts &
RVV_
From: Pan Li
This patch would like to add test cases for the unsigned vector .SAT_ADD
when one of the operands is an IMM.
Form 3:
#define DEF_VEC_SAT_U_ADD_IMM_FMT_3(T, IMM) \
T __attribute__((noinline)) \
vec_sat_u_add_imm##IMM
From: Pan Li
This patch would like to add test cases for the unsigned vector .SAT_ADD
when one of the operands is an IMM.
Form 4:
#define DEF_VEC_SAT_U_ADD_IMM_FMT_4(T, IMM) \
T __attribute__((noinline)) \
vec_sat_u_ad
From: Pan Li
Previously, we had some special handling for both .SAT_ADD and
.SAT_SUB for unsigned int. Both similarly take care of SImode
in RV64 for zero extension. Thus refactor these two helper functions
into one to avoid code duplication.
The below test suite are passed for t
From: Pan Li
This patch would like to support the vector signed ssadd pattern
for the RISC-V backend. Aka
Form 1:
#define DEF_VEC_SAT_S_ADD_FMT_1(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_add_##T##_fmt_1 (T *out, T *x, T *y, unsign
From: Pan Li
This patch would like to add a strict check for the imm operand of .SAT_SUB
matching. Previously we had no type checking for the imm operand, which
may result in unexpected IL being caught by the .SAT_SUB pattern.
We leverage int_fits_type_p here to make sure the imm operand is
an int type
From: Pan Li
This patch would like to add a strict check for the imm operand of .SAT_SUB
matching. Previously we had no type checking for the imm operand, which
may result in unexpected IL being caught by the .SAT_SUB pattern.
We leverage int_fits_type_p here to make sure the imm operand is
an int type
From: Pan Li
This patch would like to allow the IMM operand of the unsigned
scalar .SAT_ADD. Like operand 0, operand 1 of .SAT_ADD
will be zero extended to Xmode before the underlying code generation.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc
From: Pan Li
This patch would like to support the form 2 of the scalar signed
integer .SAT_ADD. Aka below example:
Form 2:
#define DEF_SAT_S_ADD_FMT_2(T, UT, MIN, MAX) \
T __attribute__((noinline)) \
sat_s_add_##T##_fmt_2 (T x, T y) \
{
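The macro body above is cut off in this listing, so the following is not the patch's form 2. It is only a minimal sketch of one common way to write a signed saturating add in C, instantiated for int8_t, to show what .SAT_ADD is expected to compute. The function name is hypothetical, and the narrowing cast relies on GCC's documented modulo behaviour for out-of-range conversions to signed types.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only, not the truncated form 2 from the patch.
   Overflow happens exactly when x and y share a sign and the wrapped
   sum has the opposite sign.  */
static int8_t
sat_s_add_i8 (int8_t x, int8_t y)
{
  /* Add in unsigned arithmetic, then narrow; GCC wraps modulo 256.  */
  int8_t sum = (int8_t) ((uint8_t) x + (uint8_t) y);

  /* The sign bit of (x ^ sum) & (y ^ sum) is set only on overflow.  */
  if (((x ^ sum) & (y ^ sum)) < 0)
    return x < 0 ? INT8_MIN : INT8_MAX;

  return sum;
}
```

The xor-based sign test mirrors the kind of check a pattern matcher can recognise, but the exact shape matched by the patch is not visible in the truncated snippet.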
From: Pan Li
From day 1, gen_phi_on_cond can only support the control flow below
for cond. Aka:
+------+
| def  |
| ...  |   +-----+
| cond |-->| def |
+------+   | ... |
   |       +-----+
   |          |
   v          |
+-----+       |
| PHI |<------+
+-----+
U
From: Pan Li
This patch would like to support the form 3 of the scalar signed
integer .SAT_ADD. Aka below example:
Form 3:
#define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \
T __attribute__((noinline))\
sat_s_add_##T##_fmt_3 (T x, T y)
From: Pan Li
Some middle-end changes may affect the number of times .SAT_* appears. Thus,
refine the dump check for SAT_* from scan-times to scan, as
we only care whether .SAT_* exists or not. There will be
another PATCH to perform a similar refinement; this PATCH only
fixes the failed test cases.
gcc
From: Pan Li
From day 1, gen_phi_on_cond can only support the control flow below
for cond. Aka:
+------+
| def  |
| ...  |   +-----+
| cond |-->| def |
+------+   | ... |
   |       +-----+
   |          |
   v          |
+-----+       |
| PHI |<------+
+-----+
U
From: Pan Li
This patch would like to support the form 3 of the scalar signed
integer .SAT_ADD. Aka below example:
Form 3:
#define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \
T __attribute__((noinline))\
sat_s_add_##T##_fmt_3 (T x, T y)
From: Pan Li
The middle-end change affects the layout of the assembly
for vector SAT_*. This patch would like to fix the asm check and make it robust.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: Adjust
asm check and make it robust.
From: Pan Li
From day 1, gen_phi_on_cond can only support the control flow below
for cond. Aka:
+------+
| def  |
| ...  |   +-----+
| cond |-->| def |
+------+   | ... |
   |       +-----+
   |          |
   v          |
+-----+       |
| PHI |<------+
+-----+
U
From: Pan Li
This patch would like to leverage the match_cond_with_binary_phi to
match the phi on cond, and get the true/false arg if matched. This
helps a lot to simplify the implementation of gen_phi_on_cond.
Before this patch:
basic_block _b1 = gimple_bb (_a1);
if (gimple_phi_num_args (_a1)
From: Pan Li
This patch would like to support the form 3 of the scalar signed
integer .SAT_ADD. Aka below example:
Form 3:
#define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \
T __attribute__((noinline))\
sat_s_add_##T##_fmt_3 (T x, T y)
From: Pan Li
When matching the cond with a 2-arg phi node, we need to figure out
which arg of the phi node comes from the true edge of the cond block, as
well as the false edge. This patch would like to add an interface
to perform the action and return the true and false args as TREE type.
There will be som
From: Pan Li
This patch would like to fix the dump check times of vector SAT_ADD. The
middle-end change raises the match count from 2 to 4.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/b
From: Pan Li
This patch would like to implement the ssadd for vector integer. Aka
form 1 of ssadd vector.
Form 1:
#define DEF_VEC_SAT_S_ADD_FMT_1(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_add_##T##_fmt_1 (T *out
From: Pan Li
When matching the cond with a 2-arg phi node, we need to figure out
which arg of the phi node comes from the true edge of the cond block, as
well as the false edge. This patch would like to add an interface
to perform the action and return the true and false args as TREE type.
There will be som
From: Pan Li
This patch would like to leverage the match_cond_with_binary_phi to
match the phi on cond, and get the true/false arg if matched. This
helps a lot to simplify the implementation of gen_phi_on_cond.
Before this patch:
basic_block _b1 = gimple_bb (_a1);
if (gimple_phi_num_args (_a1)
From: Pan Li
This patch would like to support the form 3 of the scalar signed
integer .SAT_ADD. Aka below example:
Form 3:
#define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \
T __attribute__((noinline))\
sat_s_add_##T##_fmt_3 (T x, T y)
From: Pan Li
This patch would like to fix the dump check times of vector SAT_ADD. The
middle-end change raises the match count from 2 to 4.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/b
From: Pan Li
The int8_t test for signed SAT_ADD is sat_s_add-1.c; sat_s_add-4.c
should be for int64_t. Thus, update sat_s_add-4.c for the int64_t type.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat_s_add-4.c: Update test for int64_t
instead of int8_t.
Signed-off-by: Pan Li
From: Pan Li
This patch would like to add testcases of the signed scalar SAT_ADD
for form 2. Aka:
Form 2:
#define DEF_SAT_S_ADD_FMT_2(T, UT, MIN, MAX) \
T __attribute__((noinline)) \
sat_s_add_##T##_fmt_2 (T x, T y) \
{
From: Pan Li
All commutative binary operators require matching types
for both operands. Thus, remove the types_match check for case 1 of
the signed SAT_ADD, because we have (bit_xor @0 @1), which ensures
the operands have the correct TREE type.
The test suites below passed for this patch.
*
From: Pan Li
Form 1:
#define DEF_SAT_S_TRUNC_FMT_1(WT, NT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_1 (WT x) \
{ \
NT trunc = (NT)x;
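The macro above is truncated after its first statement, so the remainder of form 1 is not visible here. As a rough sketch of what a signed saturating truncation computes, the following instantiates the idea for WT = int16_t and NT = int8_t. The function name and the comparison used after the truncation are assumptions for illustration, not the patch's code, and the narrowing cast relies on GCC's modulo behaviour for out-of-range conversions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: clamp a wide signed value into the narrow type's
   range.  Everything after the first statement is an assumption, since
   the original macro body is cut off.  */
static int8_t
sat_s_trunc_i16_to_i8 (int16_t x)
{
  int8_t trunc = (int8_t) x;      /* plain truncation, may lose bits */

  if ((int16_t) trunc == x)       /* value survived the round trip */
    return trunc;

  return x < 0 ? INT8_MIN : INT8_MAX;
}
```

Values that already fit in the narrow type pass through unchanged; everything else saturates to NT_MIN or NT_MAX depending on the sign of the input.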
From: Pan Li
This patch would like to implement the sstrunc for scalar signed
integer.
Form 1:
#define DEF_SAT_S_TRUNC_FMT_1(WT, NT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_1 (WT x) \
{
From: Pan Li
This patch would like to support the form 1 of the scalar signed
integer SAT_TRUNC. Aka below example:
Form 1:
#define DEF_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_1 (WT x)
From: Pan Li
When trying to match saturation related patterns on PHI nodes, we may have
to try each pattern for all PHI nodes of a bb. Aka:
for each PHI node in bb:
gphi *phi = xxx;
try_match_sat_add (, phi);
try_match_sat_sub (, phi);
try_match_sat_trunc (, phi);
The PHI node will be remov
From: Pan Li
Form 2:
#define DEF_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_2 (WT x) \
{ \
NT trunc = (NT)x;
From: Pan Li
This patch would like to support the form 2 of the scalar signed
integer SAT_TRUNC. Aka below example:
Form 2:
#define DEF_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_2 (WT x)
From: Pan Li
This patch would like to fix the warning as below:
/home/slyfox/dev/git/gcc/gcc/match.pd:3424:3 warning: duplicate pattern
(cond^ (ne (imagpart (IFN_SUB_OVERFLOW:c@2 @0 @1)) integer_zerop)
^
/home/slyfox/dev/git/gcc/gcc/match.pd:3397:3 warning: previous pattern
defined here
(con
From: Pan Li
Some saturation related alu testcases missed an additional option
for the expand check, which results in some UNRESOLVED issues. This
patch would like to fix it by adding the option back, as in the other
testcases.
The tests below passed for this patch.
* The rv64gcv full regression test.
It
From: Pan Li
This patch would like to implement the sstrunc for vector signed integer.
Form 1:
#define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *i
From: Pan Li
This patch would like to support the form 1 of the vector signed
integer SAT_TRUNC. Aka below example:
Form 1:
#define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT#
From: Pan Li
Form 6:
#define DEF_VEC_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT##_##WT##_fmt_6 (NT *out, WT *in, unsigned limit) \
{
From: Pan Li
Form 3:
#define DEF_VEC_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT##_##WT##_fmt_3 (NT *out, WT *in, unsigned limit) \
{
From: Pan Li
Form 4:
#define DEF_VEC_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \
{
From: Pan Li
Form 1:
#define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
{
From: Pan Li
Almost the same as vector unsigned integer SAT_TRUNC, try to match
the signed version during the vector pattern matching.
The test suites below passed for this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.
gcc/ChangeLog:
From: Pan Li
Form 5:
#define DEF_VEC_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT##_##WT##_fmt_5 (NT *out, WT *in, unsigned limit) \
{
From: Pan Li
Form 2:
#define DEF_VEC_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT##_##WT##_fmt_2 (NT *out, WT *in, unsigned limit) \
{
From: Pan Li
Form 7:
#define DEF_VEC_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT##_##WT##_fmt_7 (NT *out, WT *in, unsigned limit) \
{
From: Pan Li
Form 8:
#define DEF_VEC_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline))\
vec_sat_s_trunc_##NT##_##WT##_fmt_8 (NT *out, WT *in, unsigned limit) \
{
From: Pan Li
Form 3:
#define DEF_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_3 (WT x) \
{ \
NT trunc = (NT)x;
From: Pan Li
This patch would like to support the form 3 of the scalar signed
integer SAT_TRUNC. Aka below example:
Form 3:
#define DEF_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_3 (WT x)
From: Pan Li
This patch would like to support the form 1 of the vector signed
integer SAT_SUB. Aka below example:
Form 1:
#define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_add_##T##_fmt_1
From: Pan Li
Almost the same as vector unsigned integer SAT_SUB, try to match
the signed version during the vector pattern matching.
The test suites below passed for this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.
gcc/ChangeLog:
From: Pan Li
This patch would like to implement the sssub for vector signed integer.
Form 1:
#define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, un
From: Pan Li
Form 1:
#define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
{
From: Pan Li
This patch would like to support the form 4 of the scalar signed
integer SAT_TRUNC. Aka below example:
Form 4:
#define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x)
From: Pan Li
Form 4:
#define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x) \
{ \
NT trunc = (NT)x;
From: Pan Li
Form 3:
#define DEF_SAT_S_SUB_FMT_3(T, UT, MIN, MAX) \
T __attribute__((noinline)) \
sat_s_sub_##T##_fmt_3 (T x, T y) \
{\
T minus;
From: Pan Li
Form 4:
#define DEF_SAT_S_SUB_FMT_4(T, UT, MIN, MAX) \
T __attribute__((noinline))\
sat_s_sub_##T##_fmt_4 (T x, T y) \
{ \
T minus;
From: Pan Li
This patch would like to support the form 3 and form 4 of the scalar signed
integer SAT_SUB. Aka below example:
Form 3:
#define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \
T __attribute__((noinline))\
sat_s_add_##T##_fmt_3 (T x, T y)
From: Pan Li
Form 8:
#define DEF_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_8 (WT x) \
{ \
NT trunc = (NT)x;
From: Pan Li
Form 6:
#define DEF_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_6 (WT x) \
{ \
NT trunc = (NT)x;
From: Pan Li
Form 7:
#define DEF_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_7 (WT x) \
{ \
NT trunc = (NT)x;
From: Pan Li
Form 5:
#define DEF_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline)) \
sat_s_trunc_##WT##_to_##NT##_fmt_5 (WT x) \
{ \
NT trunc = (NT)x;
From: Pan Li
This patch would like to introduce new IFNs for strided load and store.
LOAD: v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias)
STORE: MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias)
The IFNs target code similar to the example below
void foo (int * a, int * b, int
From: Pan Li
This patch would like to allow generation of MASK_LEN_STRIDED_LOAD{STORE} IR
for invariant stride memory access. For example as below
void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
for (int i = 0; i < n; i++)
a[i*stride] = b[i*stride] + 100;
}
Bef
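To show which elements the loop above touches, here is a small self-contained C sketch. foo is copied from the example; strided_load_model is a hypothetical scalar model of a masked, length-limited strided load, written under the assumptions that the stride counts elements rather than bytes, that bias is ignored, and that inactive lanes keep their old value. It is not the IFN's definition.

```c
#include <assert.h>
#include <stddef.h>

/* The loop from the example: only every stride-th element is accessed.  */
static void
foo (int *__restrict a, int *__restrict b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i * stride] = b[i * stride] + 100;
}

/* Hypothetical scalar model (assumptions: stride in elements, no bias,
   inactive lanes of dst are left unchanged).  */
static void
strided_load_model (int *dst, const int *ptr, ptrdiff_t stride,
                    const unsigned char *mask, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      dst[i] = ptr[(ptrdiff_t) i * stride];
}
```

With stride 2 and n 4, foo writes a[0], a[2], a[4] and a[6] and leaves the odd elements alone, which is the access shape a strided vector load/store can cover in one instruction.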
From: Pan Li
After we have MASK_LEN_STRIDED_LOAD{STORE} in the middle-end, the
strided cases need to be adjusted for the IR check.
The test suites below passed for this patch:
* The riscv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/gather-scatter/strided
From: Pan Li
Form 1:
void __attribute__((noinline))\
vec_strided_load_store_##T##_form_1 (T *restrict out, T *restrict in, \
long stride, size_t size)\
{
From: Pan Li
This patch would like to implement the MASK_LEN_STRIDED_LOAD{STORE} in
the RISC-V backend by leveraging the vector strided load/store insn.
For example:
void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
for (int i = 0; i < n; i++)
a[i*stride] = b[i*stri
From: Pan Li
Form 2:
#define DEF_VEC_SAT_S_SUB_FMT_2(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_sub_##T##_fmt_2 (T *out, T *op_1, T *op_2, unsigned limit) \
{
From: Pan Li
This patch would like to support the form 3 of the vector signed
integer SAT_SUB. Aka below example:
Form 3:
#define DEF_VEC_SAT_S_SUB_FMT_3(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_sub_##T##_fmt_3
From: Pan Li
Form 4:
#define DEF_VEC_SAT_S_SUB_FMT_4(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_sub_##T##_fmt_4 (T *out, T *op_1, T *op_2, unsigned limit) \
{
From: Pan Li
Form 3:
#define DEF_VEC_SAT_S_SUB_FMT_3(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_sub_##T##_fmt_3 (T *out, T *op_1, T *op_2, unsigned limit) \
{
From: Pan Li
There are various forms of the unsigned SAT_ADD. Some of them are
complicated while others are cheap. This patch would like to simplify
the complicated forms into the cheap ones. For example:
From the form 4 (branch):
SAT_U_ADD = (X + Y) < x ? -1 : (X + Y).
To (bran
From: Pan Li
Previously, we extracted the matching usadd_left_part_1 to avoid duplication.
After we simplify some usadd patterns into the cheap form, there will be
only one reference to this matching. Thus, remove this matching pattern
and unfold it at the place of reference.
The below test suites are pass
From: Pan Li
Various comments of the unsigned integer SAT_ADD matching are not up
to date. This patch would like to refine them.
The test suites below passed for this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.
gcc/ChangeLog:
From: Pan Li
There are various forms of the unsigned SAT_ADD. Some of them are
complicated while others are cheap. This patch would like to simplify
the complicated forms into the cheap ones. For example:
From the form 3 (branch):
SAT_U_ADD = (X + Y) >= x ? (X + Y) : -1.
To (bra
From: Pan Li
This patch would like to add doc for the 2 standard names below.
1. strided load: v = mask_len_strided_load (ptr, stride, mask, len, bias)
2. strided store: mask_len_strided_store (ptr, stride, v, mask, len, bias)
gcc/ChangeLog:
* doc/md.texi: Add doc for mask_len_stried_lo
From: Pan Li
There are various forms of the unsigned SAT_ADD. Some of them are
complicated while others are cheap. This patch would like to simplify
the complicated forms into the cheap ones. For example:
From the form 8 (branch):
SAT_U_ADD = x > (T)(x + y) ? -1 : (x + y).
To (b
From: Pan Li
There are various forms of the unsigned SAT_ADD. Some of them are
complicated while others are cheap. This patch would like to simplify
the complicated forms into the cheap ones. For example:
From the form 7 (branch):
SAT_U_ADD = x <= (T)(x + y) ? (x + y) : -1.
To (
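The simplification targets are cut off in these snippets, so the exact cheap form is not shown here. As a sanity check of the idea, the sketch below writes the branch forms 3, 4, 7 and 8 quoted in this listing for T = uint8_t and compares them exhaustively against the branchless formula (x + y) | -((x + y) < x) quoted earlier in this listing. Treating that branchless form as the target is an assumption, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t T;

/* Branch forms as quoted in the snippets, with casts added because C
   promotes uint8_t operands to int.  */
static T form_3 (T x, T y) { return (T) (x + y) >= x ? (T) (x + y) : (T) -1; }
static T form_4 (T x, T y) { return (T) (x + y) < x ? (T) -1 : (T) (x + y); }
static T form_7 (T x, T y) { return x <= (T) (x + y) ? (T) (x + y) : (T) -1; }
static T form_8 (T x, T y) { return x > (T) (x + y) ? (T) -1 : (T) (x + y); }

/* Assumed cheap form: (x + y) | -((x + y) < x).  */
static T
branchless (T x, T y)
{
  T sum = (T) (x + y);
  return (T) (sum | (T) (-(T) (sum < x)));
}

/* Count inputs where any branch form disagrees with the branchless one.  */
static unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned x = 0; x <= 255; x++)
    for (unsigned y = 0; y <= 255; y++)
      {
        T r = branchless ((T) x, (T) y);
        bad += form_3 ((T) x, (T) y) != r;
        bad += form_4 ((T) x, (T) y) != r;
        bad += form_7 ((T) x, (T) y) != r;
        bad += form_8 ((T) x, (T) y) != r;
      }
  return bad;
}
```

All four branch forms test the same wrap-around condition, only with the comparison and the arms flipped, which is why they can be folded into a single cheaper form.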
From: Pan Li
The phiopt2 pass will also try the gimple_simplify for the form 16
of unsigned integer SAT_ADD. Thus add the testcase to make sure
it will be performed in phiopt2 pass.
The test suites below passed for this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap test
From: Pan Li
The phiopt2 pass will also try the gimple_simplify for the form 14
of unsigned integer SAT_ADD. Thus add the testcase to make sure
it will be performed in phiopt2 pass.
The test suites below passed for this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap test
From: Pan Li
The phiopt2 pass will also try the gimple_simplify for the form 12
of unsigned integer SAT_ADD. Thus add the testcase to make sure
it will be performed in phiopt2 pass.
The test suites below passed for this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap test