From: Pan Li
According to the semantics of the avg_floor and avg_ceil as below:
floor: op0 = (narrow) (((wide) op1 + (wide) op2) >> 1);
ceil: op0 = (narrow) (((wide) op1 + (wide) op2 + 1) >> 1);
That is, the ashiftrt should have (const_int 1) as its op2, but it seems to be missing.
Thus, add it back to align t
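The floor and ceil semantics quoted above can be sketched in plain C. This is only an illustration, assuming NT is int32_t and WT is int64_t; the helper names avg_floor_i32 and avg_ceil_i32 are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration: narrow type int32_t, wide type
   int64_t.  Right shift of a negative value is assumed to be arithmetic,
   as it is with GCC.  */
static inline int32_t
avg_floor_i32 (int32_t op1, int32_t op2)
{
  /* Widen first so the sum cannot overflow, shift by 1, then narrow.  */
  return (int32_t) (((int64_t) op1 + (int64_t) op2) >> 1);
}

static inline int32_t
avg_ceil_i32 (int32_t op1, int32_t op2)
{
  /* Same as floor, but add 1 before the shift to round towards +inf.  */
  return (int32_t) (((int64_t) op1 + (int64_t) op2 + 1) >> 1);
}
```

In both forms the shift amount is the constant 1, which corresponds to the (const_int 1) operand of the ashiftrt mentioned above.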
From: Pan Li
The previous test case doesn't use the right test helper macro;
it should be DEF_AVG_0_WRAP instead of DEF_AVG_0. We prefer the
test function name test_avg_floor_int64_t_int32_t_0 rather than
test_avg_floor_WT_NT_0 for DEF_AVG_0(WT, NT).
The below test suites are passed for
From: Pan Li
Like the avg3_floor pattern, the avg3_ceil pattern has a
similar issue: it lacks RVV DImode support.
Thus, this patch would like to support DImode by
the standard name, with the iterator V_VLSI_D.
The below test suites are passed for this patch series.
* The rv64gcv fully regr
From: Pan Li
The avg3_floor pattern leverages the add and shift rtl
with the DOUBLE_TRUNC mode iterator. That is, the RVVDImode
iterator will generate avg3rvvsimode_floor, so only the
element sizes QI, HI and SI are allowed.
Thus, this patch would like to support the DImode by
the standard name, with the it
From: Pan Li
Add the run and asm testcase for rv32 SAT_MUL, widen mul from
uint8_t, uint16_t, uint32_t to uint64_t.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat/sat_u_mul-1-u16-from-u64.c: New test.
* gcc.target/riscv/sat/sat_u_mul-1-u32-from-u64.c: New test.
* gcc.ta
From: Pan Li
The widen mul takes a source type of N bits to a
dest type of 2N bits. The previous check only focused on
HOST_WIDE_INT and did not work for QI => HI, HI => SI
and SI => DI. Thus, refine the widen mul precision
check so that the dest has twice the bits of the input.
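The refined condition described above amounts to comparing two precisions. A minimal sketch follows, assuming the precisions are available as plain bit counts; the predicate name is hypothetical and this is not the GCC implementation.

```c
#include <stdbool.h>

/* Hypothetical predicate for illustration only: a multiply counts as a
   widen mul when the dest precision is exactly twice the source
   precision, e.g. QI (8) => HI (16), HI (16) => SI (32),
   SI (32) => DI (64).  */
static bool
is_widen_mul_precision (unsigned src_prec, unsigned dest_prec)
{
  return src_prec != 0 && dest_prec == 2 * src_prec;
}
```

Unlike a comparison against one fixed width, this check does not depend on the host or target word size.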
gcc/ChangeLog:
* m
From: Pan Li
The widen mul takes a source type of N bits to a
dest type of 2N bits. The previous check only focused on
HOST_WIDE_INT and did not work for QI => HI, HI => SI
and SI => DI. Thus, refine the widen mul precision
check, that is, the dest has twice the bits of the input.
The below test suites are pas
From: Pan Li
The widen mul has a different source type on different platforms,
like rv32 or rv64. For rv32, the source of the widen mul is 32 bits,
while it is 64 bits on rv64. Thus, leveraging HOST_WIDE_INT is not
correct and results in pattern match failures on 32-bit systems
like rv32.
Thus, levera
From: Pan Li
The widen mul has a different source type on different machines,
like rv32 or rv64. The SAT_MUL pattern previously didn't work well for
backends like rv32, thus we would like to refine it
by using BITS_PER_WORD for the precision check.
The below test suites are passed for this patch:
1. The
From: Pan Li
The sat scalar run test should not require the v extension, thus
take rv32 || rv64 instead of riscv_v for the requirement.
The below test suites are passed for this patch series.
* The rv64gcv fully regression test.
* The rv32gcv fully regression test.
gcc/testsuite/ChangeLog:
From: Pan Li
rv32 doesn't support __uint128, so we get an
error like the one below during testing.
error: '__int128' is not supported on this target.
Thus, we disable the uint128_t related tests on rv32.
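One way such a guard could look in a test header is sketched below. It is an assumption about the shape of the check, not the content of sat_arith.h. __riscv_xlen is the macro predefined by RISC-V compilers and __SIZEOF_INT128__ is predefined by GCC and Clang when __int128 is available; the fallback branch for non-RISC-V hosts and the HAVE_UINT128_TESTS name are added here only so that the sketch compiles elsewhere.

```c
#include <stdint.h>

/* Illustrative guard: enable the 128-bit helpers only when the target has
   __int128.  On RISC-V that means xlen == 64; on other hosts fall back to
   __SIZEOF_INT128__ (an assumption of this sketch).  */
#if (defined (__riscv_xlen) && __riscv_xlen == 64) \
    || (!defined (__riscv_xlen) && defined (__SIZEOF_INT128__))
typedef unsigned __int128 uint128_t;
#define HAVE_UINT128_TESTS 1
#else
#define HAVE_UINT128_TESTS 0
#endif

/* Lets a caller skip the uint128_t cases without further #if blocks.  */
static inline int
have_uint128_tests (void)
{
  return HAVE_UINT128_TESTS;
}
```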
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat/sat_arith.h: Add xlen check fo
From: Pan Li
Add asm dump check and run test for vec_duplicate + vssub.vv
combine to vssub.vx, with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
From: Pan Li
Add asm dump check test for vec_duplicate + vssub.vv combine to
vssub.vx, with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gc
From: Pan Li
This patch would like to combine the vec_duplicate + vssub.vv to the
vssub.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if th
From: Pan Li
This patch would like to introduce the combine of vec_dup + vssub.vv
into vssub.vx on the cost value of GR2VR. The late-combine will take
place if the cost of GR2VR is zero, or reject the combine if non-zero
like 1, 2, 15 in test. There will be two cases for the combine:
Case 0:
From: Pan Li
Add asm dump check test for vec_duplicate + vsadd.vv combine to
vsadd.vx, with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gc
From: Pan Li
Add asm dump check and run test for vec_duplicate + vsadd.vv
combine to vsadd.vx, with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
From: Pan Li
This patch would like to combine the vec_duplicate + vsadd.vv to the
vsadd.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if th
From: Pan Li
This patch would like to introduce the combine of vec_dup + vsadd.vv
into vsadd.vx on the cost value of GR2VR. The late-combine will take
place if the cost of GR2VR is zero, or reject the combine if non-zero
like 1, 2, 15 in test. There will be two cases for the combine:
Case 0:
From: Pan Li
Add run and tree-optimized check for unsigned scalar SAT_MUL from
uint128_t.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sat/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat/sat_arith_data.h: Add test data for
run test.
* gcc.target/riscv/
From: Pan Li
This patch would like to implement the SAT_MUL scalar unsigned from
uint128_t, aka:
NT __attribute__((noinline))
sat_u_mul_##NT##_fmt_1 (NT a, NT b)
{
uint128_t x = (uint128_t)a * (uint128_t)b;
NT max = -1;
if (x > (uint128_t)(max))
return max;
else
From: Pan Li
This patch series would like to support the unsigned SAT_MUL with
the help of uint128_t. Aka:
NT __attribute__((noinline))
sat_u_mul_##NT##_fmt_1 (NT a, NT b)
{
uint128_t x = (uint128_t)a * (uint128_t)b;
NT max = -1;
if (x > (uint128_t)(max))
return max;
else
return
From: Pan Li
This patch would like to try to match the SAT_MUL during
widening-mul pass, aka below pattern.
NT __attribute__((noinline))
sat_u_mul_##NT##_fmt_1 (NT a, NT b)
{
uint128_t x = (uint128_t)a * (uint128_t)b;
NT max = -1;
if (x > (uint128_t)(max))
return max;
From: Pan Li
This patch would like to add the middle-end representation for the
unsigned saturation mul. That is, set the result of the mul to the max
on overflow.
Take uint8_t as example, we will have:
* SAT_MUL (1, 127) => 127.
* SAT_MUL (2, 127) => 254.
* SAT_MUL (3, 127) => 255.
* SAT_MUL (25
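The uint8_t examples listed above can be reproduced with a small scalar model. This is a sketch under the assumption that a uint16_t intermediate is wide enough for uint8_t inputs (the largest product is 255 * 255 = 65025); the function name sat_u_mul_u8 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of unsigned SAT_MUL for uint8_t: compute the
   product in a wider type, then clamp to the max of the narrow type.  */
static inline uint8_t
sat_u_mul_u8 (uint8_t a, uint8_t b)
{
  uint16_t x = (uint16_t) a * (uint16_t) b; /* 255 * 255 fits in 16 bits.  */
  uint8_t max = -1;                         /* All ones, i.e. 255.  */

  return x > max ? max : (uint8_t) x;
}
```

With this model, sat_u_mul_u8 (2, 127) yields 254 and sat_u_mul_u8 (3, 127) saturates to 255, which agrees with the list above.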
From: Pan Li
Add asm dump check test for vec_duplicate + vssubu.vv combine to
vssubu.vx, with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vssubu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf
From: Pan Li
Add asm dump check and run test for vec_duplicate + vssubu.vv
combine to vssubu.vx, with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
From: Pan Li
The cost model change makes the default cost of vx 2, thus
reconcile the asm check with this change.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub_trunc-1-u16.c:
Update the asm check due to cost model change.
* gcc.target/ri
From: Pan Li
This patch would like to combine the vec_duplicate + vssubu.vv to the
vssubu.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if
From: Pan Li
This patch would like to introduce the combine of vec_dup + vssubu.vv
into vssubu.vx on the cost value of GR2VR. The late-combine will take
place if the cost of GR2VR is zero, or reject the combine if non-zero
like 1, 2, 15 in test. There will be two cases for the combine:
Case 0:
From: Pan Li
Add asm dump check test for vec_duplicate + vsaddu.vv combine to
vsaddu.vx, with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vsaddu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf
From: Pan Li
Add asm dump check and run test for vec_duplicate + vsaddu.vv
combine to vsaddu.vx, with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
From: Pan Li
This patch would like to combine the vec_duplicate + vsaddu.vv to the
vsaddu.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if
From: Pan Li
This patch would like to introduce the combine of vec_dup + vsaddu.vv
into vsaddu.vx on the cost value of GR2VR. The late-combine will take
place if the cost of GR2VR is zero, or reject the combine if non-zero
like 1, 2, 15 in test. There will be two cases for the combine:
Case 0:
From: Pan Li
There will be an ICE in the expand pass; the backtrace is similar to the one below.
during RTL pass: expand
red.c: In function 'main':
red.c:20:5: internal compiler error: in require, at machmode.h:323
20 | int main() {
| ^~~~
0x2e0b1d6 internal_error(char const*, ...)
../../../gcc/
From: Pan Li
Add asm dump check test for vec_duplicate + vminu.vv combine to
vminu.vx, with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vminu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx
From: Pan Li
This patch would like to introduce the combine of vec_dup + vminu.vv
into vminu.vx on the cost value of GR2VR. The late-combine will take
place if the cost of GR2VR is zero, or reject the combine if non-zero
like 1, 2, 15 in test. There will be two cases for the combine:
Case 0:
From: Pan Li
This patch would like to combine the vec_duplicate + vminu.vv to the
vminu.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if th
From: Pan Li
Add asm dump check and run test for vec_duplicate + vminu.vv
combine to vminu.vx, with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.
From: Pan Li
Add asm dump check and run test for vec_duplicate + vmin.vv
combine to vmin.vx, with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
From: Pan Li
This patch would like to introduce the combine of vec_dup + vmin.vv
into vmin.vx on the cost value of GR2VR. The late-combine will take
place if the cost of GR2VR is zero, or reject the combine if non-zero
like 1, 2, 15 in test. There will be two cases for the combine:
Case 0:
|
From: Pan Li
Add asm dump check test for vec_duplicate + vmin.vv combine to
vmin.vx, with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vmin.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-
From: Pan Li
This patch would like to combine the vec_duplicate + vmin.vv to the
vmin.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
The case 0 def functions for the vx combine are mostly the same across
the different test files. Thus, re-arrange them in one place to
avoid code duplication.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Leverage
helper macros to avoid code d
From: Pan Li
Add asm dump check test for vec_duplicate + vmaxu.vv combine to vmaxu.vx,
with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vmaxu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx
From: Pan Li
This patch would like to introduce the combine of vec_dup + vmaxu.vv
into vmaxu.vx on the cost value of GR2VR. The late-combine will take
place if the cost of GR2VR is zero, or reject the combine if non-zero
like 1, 2, 15 in test. There will be two cases for the combine:
Case 0:
From: Pan Li
Add asm dump check test for vec_duplicate + vmaxu.vv combine to vmaxu.vx,
with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check
for vmaxu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/v
From: Pan Li
This patch would like to combine the vec_duplicate + vmaxu.vv to the
vmaxu.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if th
From: Pan Li
Add asm dump check test for vec_duplicate + vmax.vv combine to
vmax.vx, with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for max func 1 vmax.vx combine.
* gcc.target/riscv/rvv/autovec
From: Pan Li
This patch would like to combine the vec_duplicate + vmax.vv to the
vmax.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
Add asm dump check test for vec_duplicate + vmax.vv combine to vmax.vx,
with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for max func 1 vmax.vx combine.
* gcc.target/riscv/rvv/autovec
From: Pan Li
This patch would like to introduce the combine of vec_dup + vmax.vv
into vmax.vx on the cost value of GR2VR. The late-combine will take place
if the cost of GR2VR is zero, or reject the combine if non-zero like 1,
15 in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
Add asm dump check test for vec_duplicate + vmax.vv combine to vmax.vx,
with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vmax.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-
From: Pan Li
Add asm dump check test for vec_duplicate + vremu.vv combine to vremu.vx,
with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check
for vremu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-
From: Pan Li
Some existing vrem related tests need some adjustment to the
asm check due to the cost model.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Adjust the
asm check for vremu.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto.
S
From: Pan Li
This patch would like to combine the vec_duplicate + vremu.vv to the
vremu.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if th
From: Pan Li
Add asm dump check test for vec_duplicate + vremu.vv combine to vremu.vx,
with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vremu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx
From: Pan Li
This patch would like to introduce the combine of vec_dup + vremu.vv into
vremu.vx on the cost value of GR2VR. The late-combine will take place
if the cost of GR2VR is zero, or reject the combine if non-zero like 1,
15 in test. There will be two cases for the combine:
Case 0:
|
From: Pan Li
Add asm dump check test for vec_duplicate + vrem.vv combine to vrem.vx,
with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vrem.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-
From: Pan Li
Add asm dump check test for vec_duplicate + vrem.vv combine to vrem.vx,
with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for vrem.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1
From: Pan Li
This patch would like to introduce the combine of vec_dup + vrem.vv into
vrem.vx on the cost value of GR2VR. The late-combine will take place
if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
This patch would like to combine the vec_duplicate + vrem.vv to the
vrem.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
Some existing vrem related tests need some adjustment to the
asm check due to the cost model.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Adjust the
asm check for vrem.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto.
Si
From: Pan Li
Add asm dump check test for vec_duplicate + vdivu.vv combine to vdivu.vx,
with the GR2VR cost is 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vdivu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx
From: Pan Li
Some existing vdiv related tests need some adjustment to the
asm check due to the cost model.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adjust
the asm check for vdivu.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditt
From: Pan Li
Add asm dump check test for vec_duplicate + vdivu.vv combine to vdivu.vx,
with the GR2VR cost is 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check
for vdivu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/v
From: Pan Li
This patch would like to introduce the combine of vec_dup + vdivu.vv into
vdivu.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
|
From: Pan Li
This patch would like to combine the vec_duplicate + vdivu.vv to the
vdivu.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if th
From: Pan Li
RVV has no div insn of the form v2 = div (vec_dup (x), v1), thus
generated rtl like that hits the unreachable assert when
expanding the insn. This patch would like to remove op div from
the binary op form (vec_dup (x), v) to avoid matching the pattern
by mistake.
No new test introduced as
From: Pan Li
Some similar code can be wrapped into the function get_vector_binary_rtx_cost,
thus leverage this function to avoid code duplication.
The below test suites are passed for this patch series.
* The rv64gcv fully regression test.
gcc/ChangeLog:
* config/riscv/riscv.cc (get_vector_bin
From: Pan Li
This patch would like to combine the vec_duplicate + vdiv.vv to the
vdiv.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
Add asm dump check test for vec_duplicate + vdiv.vv combine to vdiv.vx,
with the GR2VR cost is 0, 2 and 15.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a
From: Pan Li
Add asm dump check test for vec_duplicate + vdiv.vv combine to vdiv.vx,
with the GR2VR cost is 0, 1 and 2.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
Some existing vdiv related tests need some adjustment to the
asm check.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adjust
the asm check for vdiv.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto.
* gcc.ta
From: Pan Li
This patch would like to introduce the combine of vec_dup + vdiv.vv into
vdiv.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
Inspired by the avg_ceil patches, we noticed there were even more
overlong lines in autovec.md. So fix those format issues.
gcc/ChangeLog:
* config/riscv/autovec.md: Fix line too long for sorts
of pattern.
Signed-off-by: Pan Li
---
gcc/config/riscv/autovec.md | 54
From: Pan Li
Add asm and run testcase for avg_ceil vaadd implementation.
The below test suites are passed for this patch series.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/avg.h: Add test helper macros.
* gcc.target/riscv/rvv/au
From: Pan Li
The avg_ceil has the rounding mode towards +inf, while
vaadd.vv with rnu totally matches the semantics. From the
RVV spec, for the fixed-point vaadd.vv with rnu,
roundoff_signed(v, d) = (signed(v) >> d) + r
r = v[d - 1]
For vaadd, d = 1, then we have
roundoff_signed(v, 1) = (signed(v
From: Pan Li
Some existing avg_floor tests need to be updated due to the change to
leverage vaadd.vv directly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Update asm check
to vaadd.
* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
* gcc.target/ris
From: Pan Li
Similar to the avg_floor, the avg_ceil has the rounding mode
towards +inf, while vaadd.vv with rnu totally matches
the semantics. From the RVV spec, for the fixed-point vaadd.vv with rnu,
roundoff_signed(v, d) = (signed(v) >> d) + r
r = v[d - 1]
For vaadd, d = 1, then we have
roundof
From: Pan Li
Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx,
with the GR2VR cost is 0, 1 and 2.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx,
with the GR2VR cost is 0, 2 and 15.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a
From: Pan Li
This patch would like to combine the vec_duplicate + vmul.vv to the
vmul.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
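The kind of source loop this combine targets can be sketched as below: a scalar x is multiplied into every element, so the vectorizer broadcasts x (vec_duplicate) before a vector-vector multiply. The function name is hypothetical and the code is not taken from the patch. Whether the compiler actually emits vmul.vx for it depends on the target options and the GR2VR cost setting, which is not verified here.

```c
#include <stdint.h>

/* Hypothetical example loop: every element of IN is multiplied by the
   same scalar X.  After vectorization this is a vec_duplicate of X
   followed by a vector-vector multiply, which is the shape the combine
   described above would turn into the vector-scalar form.  */
void
vx_mul_i32 (int32_t *restrict out, const int32_t *restrict in, int32_t x,
	    unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = in[i] * x;
}
```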
From: Pan Li
This patch would like to introduce the combine of vec_dup + vmul.vv into
vmul.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
Add asm and run testcase for avg_floor vaadd implementation.
The below test suites are passed for this patch series.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/avg.h: New test.
* gcc.target/riscv/rvv/autovec/avg_dat
From: Pan Li
Some existing avg_floor tests need to be updated due to the change to
leverage vaadd.vv directly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Update asm check
to vaadd.
* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
* gcc.target/ris
From: Pan Li
The signed avg_floor totally matches the semantics of the fixed-point
rvv insn vaadd with round-down. Thus, leverage it directly
to implement the avg_floor.
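The match between the averaging add and avg_floor can be checked with a small scalar model of the fixed-point rounding. The sketch below assumes the RVV roundoff_signed definition, (signed(v) >> d) + r, with r = 0 for round-down (rdn) and r = v[d - 1] for round-to-nearest-up (rnu). The enum and function names are hypothetical, and an arithmetic right shift of negative values is assumed, as with GCC.

```c
#include <stdint.h>

/* Hypothetical model of two of the fixed-point rounding modes.  */
enum vxrm_mode { VXRM_RNU, VXRM_RDN };

/* roundoff_signed (v, d) = (signed (v) >> d) + r, where the rounding
   increment r is bit d-1 of v for rnu and zero for rdn.  */
static inline int64_t
roundoff_signed (int64_t v, unsigned d, enum vxrm_mode mode)
{
  int64_t r = (mode == VXRM_RNU && d > 0) ? ((v >> (d - 1)) & 1) : 0;

  return (v >> d) + r;
}

/* Scalar model of a signed averaging add on int32_t: widen, add, then
   round off by d = 1.  rdn gives avg_floor, rnu gives avg_ceil.  */
static inline int32_t
vaadd_model_i32 (int32_t a, int32_t b, enum vxrm_mode mode)
{
  return (int32_t) roundoff_signed ((int64_t) a + (int64_t) b, 1, mode);
}
```

For example, vaadd_model_i32 (-3, 0, VXRM_RDN) gives -2, the floor of -1.5, while VXRM_RNU gives -1, the ceiling.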
The spec of RVV is somehow not that clear about the difference
between floating point and fixed point for the rounding that
disc