From: Pan Li
Add an asm dump check test for the vec_duplicate + vrsub.vv combine to vrsub.vx.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Add asm check
for vrsub with GR
From: Pan Li
Add an asm dump check test for the vec_duplicate + vrsub.vv combine to vrsub.vx.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Add vrsub asm
dump check.
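As a hedged illustration of the loop shape such a combine targets (the function name and element type are hypothetical, not taken from the testsuite): when the scalar is the minuend, the vectorizer broadcasts `x` and subtracts the vector from it, which is the reversed-operand form that vrsub.vx provides.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not the testsuite's helper macro.
   Each element computes x - in[i], i.e. scalar minus vector.  */
void
rsub_scalar_i16 (int16_t *restrict out, const int16_t *restrict in,
                 int16_t x, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (int16_t) (x - in[i]);
}
```

Whether the vx form is actually emitted depends on the target options and on the GR2VR cost, as the patch descriptions state.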
From: Pan Li
Add an asm dump check test for the vec_duplicate + vxor.vv combine to vxor.vx,
with GR2VR costs of 0, 2 and 15.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a
From: Pan Li
This patch introduces the combine of vec_dup + vxor.vv into
vxor.vx, depending on the GR2VR cost. Late-combine will take place if
the cost of GR2VR is zero, and the combine is rejected if it is non-zero, like 1 or 15
in the tests. There are two cases for the combine:
Case 0:
| .
From: Pan Li
Add an asm dump check test for the vec_duplicate + vxor.vv combine to vxor.vx,
with GR2VR costs of 0, 1 and 2.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
The RVV spec is not entirely clear about the difference
between floating point and fixed point for rounding that
discards least-significant information.
For floating point, which is not two's complement, "discard
least-significant information" indicates truncating rounding.
From: Pan Li
Some existing avg_floor tests need to be updated due to the change to
leverage vaadd.vv directly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Update asm check
to vaadd.
* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
* gcc.target/ris
From: Pan Li
Add asm and run test cases for the avg_floor vaadd implementation.
The test suites below passed for this patch series.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/avg.h: New test.
* gcc.target/riscv/rvv/autovec/avg_dat
From: Pan Li
The signed avg_floor exactly matches the semantics of the fixed-point
RVV insn vaadd with round-down rounding. Thus, leverage it directly
to implement avg_floor.
The RVV spec is not entirely clear about the difference
between floating point and fixed point for rounding that
disc
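A minimal reference sketch of the signed avg_floor semantics described above, assuming 16-bit elements; the function name is hypothetical and this is not the GCC expander. The sum is formed in a wider type so it cannot overflow, then halved with rounding towards -inf.

```c
#include <stdint.h>

/* Hypothetical reference model: floor ((a + b) / 2) for int16_t.  */
static int16_t
avg_floor_i16 (int16_t a, int16_t b)
{
  int32_t sum = (int32_t) a + (int32_t) b;
  /* Clearing the low bit first makes the division exact, so the
     result rounds towards -inf for negative odd sums as well.  */
  return (int16_t) ((sum - (sum & 1)) / 2);
}
```

For example, the sum of -3 and -4 is -7, and the floor of -3.5 is -4, whereas plain C division by 2 would give -3.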
From: Pan Li
This patch combines vec_duplicate + vxor.vv into
vxor.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if the
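The example code from the original message is elided in this listing. As a stand-in, a hypothetical loop of the kind these vx combines apply to might look like the sketch below; the function name and element type are assumptions. After vectorization the scalar `x` is broadcast (vec_duplicate) and XORed with the vector, which is the pair the patch folds into a single vxor.vx when the GR2VR cost allows it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not the code from the patch.
   A vector operand combined with a loop-invariant scalar.  */
void
xor_scalar_i16 (int16_t *restrict out, const int16_t *restrict in,
                int16_t x, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (int16_t) (in[i] ^ x);
}
```

The same loop shape with `|`, `*`, `/`, `%` or a max/min expression would correspond to the vor, vmul, vdiv, vrem, vmax and vmin variants listed here.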
From: Pan Li
Some of the existing scalar unsigned SAT_ADD test data is
duplicated across different test files. This patch
moves it into a shared header file to avoid the
duplication.
The test suites below passed for this patch series.
* The rv64gcv fully regression te
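For context, a minimal sketch of what a scalar unsigned SAT_ADD computes, together with one possible shape for shared test data. The names `sat_add_u8`, `sat_add_case` and `sat_add_u8_cases` are illustrative assumptions, not the contents of the header the patch adds.

```c
#include <stdint.h>

/* Hypothetical reference model of unsigned saturating add.  */
static inline uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);      /* Wraps modulo 256.  */
  return sum < x ? UINT8_MAX : sum;     /* Wrapped, so clamp.  */
}

/* Hypothetical shared test data: one table that several test
   files could include instead of repeating the same values.  */
struct sat_add_case
{
  uint8_t a, b, expect;
};

static const struct sat_add_case sat_add_u8_cases[] = {
  {0, 0, 0}, {1, 2, 3}, {200, 100, 255}, {255, 1, 255},
};
```

Keeping the table in a single header means a corrected expected value only has to be changed in one place.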
From: Pan Li
Add an asm dump check test for the vec_duplicate + vor.vv combine to vor.vx,
with GR2VR costs of 0, 2 and 15.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add tes
From: Pan Li
This patch combines vec_duplicate + vor.vv into
vor.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if the GR
From: Pan Li
This patch introduces the combine of vec_dup + vor.vv into
vor.vx, depending on the GR2VR cost. Late-combine will take place if
the cost of GR2VR is zero, and the combine is rejected if it is non-zero, like 1 or 15
in the tests. There are two cases for the combine:
Case 0:
| ...
From: Pan Li
Add an asm dump check test for the vec_duplicate + vor.vv combine to vor.vx,
with GR2VR costs of 0, 1 and 2.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm
From: Pan Li
Some similar code can be wrapped into the function get_vector_binary_rtx_cost;
leverage this function to avoid code duplication.
The test suites below passed for this patch series.
* The rv64gcv full regression test.
gcc/ChangeLog:
* config/riscv/riscv.cc (get_vector_bin
From: Pan Li
This patch combines vec_duplicate + vdiv.vv into
vdiv.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if the
From: Pan Li
Add asm and run test cases for the avg_ceil vaadd implementation.
The test suites below passed for this patch series.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/avg.h: Add test helper macros.
* gcc.target/riscv/rvv/au
From: Pan Li
Some existing avg_floor tests need to be updated due to the change to
leverage vaadd.vv directly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Update asm check
to vaadd.
* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
* gcc.target/ris
From: Pan Li
avg_ceil rounds towards +inf, and
vaadd.vv with rnu exactly matches these semantics. From the
RVV spec, for the fixed-point vaadd.vv with rnu,
roundoff_signed(v, d) = (signed(v) >> d) + r
r = v[d - 1]
For vaadd, d = 1, then we have
roundoff_signed(v, 1) = (signed(v
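The formula quoted above can be modelled in a few lines of C. This is a hedged sketch, not the GCC implementation: the function names are hypothetical, and it assumes that `>>` on a negative signed value is an arithmetic shift, which GCC documents but ISO C leaves implementation-defined.

```c
#include <stdint.h>

/* roundoff_signed (v, d) = (signed (v) >> d) + r, with r = v[d - 1]
   for the rnu rounding mode.  Requires d >= 1.  */
static int32_t
roundoff_signed_rnu (int32_t v, unsigned d)
{
  int32_t r = (v >> (d - 1)) & 1;   /* Bit d - 1 of v.  */
  return (v >> d) + r;
}

/* Hypothetical model of avg_ceil for int16_t: widen, add, then
   round off by one bit (d = 1).  */
static int16_t
avg_ceil_i16 (int16_t a, int16_t b)
{
  return (int16_t) roundoff_signed_rnu ((int32_t) a + (int32_t) b, 1);
}
```

With d = 1 the result is (v >> 1) + (v & 1), so an odd sum such as -7 yields -4 + 1 = -3, which is the ceiling of -3.5.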
From: Pan Li
Similar to avg_floor, avg_ceil rounds
towards +inf, and vaadd.vv with rnu exactly matches
these semantics. From the RVV spec, for the fixed-point vaadd.vv with rnu,
roundoff_signed(v, d) = (signed(v) >> d) + r
r = v[d - 1]
For vaadd, d = 1, then we have
roundof
From: Pan Li
This patch combines vec_duplicate + vmul.vv into
vmul.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if the
From: Pan Li
Add an asm dump check test for the vec_duplicate + vmul.vv combine to vmul.vx,
with GR2VR costs of 0, 1 and 2.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
Add an asm dump check test for the vec_duplicate + vmul.vv combine to vmul.vx,
with GR2VR costs of 0, 2 and 15.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a
From: Pan Li
This patch introduces the combine of vec_dup + vmul.vv into
vmul.vx, depending on the GR2VR cost. Late-combine will take place if
the cost of GR2VR is zero, and the combine is rejected if it is non-zero, like 1 or 15
in the tests. There are two cases for the combine:
Case 0:
| .
From: Pan Li
Add an asm dump check test for the vec_duplicate + vmax.vv combine to vmax.vx,
with GR2VR costs of 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vmax.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-
From: Pan Li
This patch introduces the combine of vec_dup + vmax.vv
into vmax.vx, depending on the GR2VR cost. Late-combine will take place
if the cost of GR2VR is zero, and the combine is rejected if it is non-zero, like 1 or
15 in the tests. There are two cases for the combine:
Case 0:
| .
From: Pan Li
Add an asm dump check test for the vec_duplicate + vmax.vv combine to vmax.vx,
with GR2VR costs of 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for max func 1 vmax.vx combine.
* gcc.target/riscv/rvv/autovec
From: Pan Li
This patch combines vec_duplicate + vremu.vv into
vremu.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if th
From: Pan Li
Some existing vrem-related tests need their
asm checks adjusted due to the cost model.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Adjust the
asm check for vremu.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto.
S
From: Pan Li
Add an asm dump check test for the vec_duplicate + vremu.vv combine to vremu.vx,
with GR2VR costs of 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vremu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx
From: Pan Li
This patch introduces the combine of vec_dup + vremu.vv into
vremu.vx, depending on the GR2VR cost. Late-combine will take place
if the cost of GR2VR is zero, and the combine is rejected if it is non-zero, like 1 or
15 in the tests. There are two cases for the combine:
Case 0:
|
From: Pan Li
Add an asm dump check test for the vec_duplicate + vremu.vv combine to vremu.vx,
with GR2VR costs of 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check
for vremu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-
From: Pan Li
This patch combines vec_duplicate + vmax.vv into
vmax.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if the
From: Pan Li
Add an asm dump check test for the vec_duplicate + vmax.vv combine to
vmax.vx, with GR2VR costs of 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for max func 1 vmax.vx combine.
* gcc.target/riscv/rvv/autovec
From: Pan Li
This patch introduces the combine of vec_dup + vdiv.vv into
vdiv.vx, depending on the GR2VR cost. Late-combine will take place if
the cost of GR2VR is zero, and the combine is rejected if it is non-zero, like 1 or 15
in the tests. There are two cases for the combine:
Case 0:
| .
From: Pan Li
Some existing vdiv-related tests need their
asm checks adjusted.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adjust
the asm check for vdiv.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto.
* gcc.ta
From: Pan Li
Add an asm dump check test for the vec_duplicate + vdiv.vv combine to vdiv.vx,
with GR2VR costs of 0, 1 and 2.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
Add an asm dump check test for the vec_duplicate + vdiv.vv combine to vdiv.vx,
with GR2VR costs of 0, 2 and 15.
The test suites below passed for this patch.
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a
From: Pan Li
Inspired by the avg_ceil patches, I noticed there were even more
overlong lines in autovec.md. Fix those formatting issues.
gcc/ChangeLog:
* config/riscv/autovec.md: Fix overlong lines for various
patterns.
Signed-off-by: Pan Li
---
gcc/config/riscv/autovec.md | 54
From: Pan Li
RVV has no div insn of the form v2 = div (vec_dup (x), v1), thus
RTL generated in that form hits the unreachable assert when
expanding the insn. This patch removes the div op from
the binary op form (vec_dup (x), v) to avoid matching the pattern
by mistake.
No new test introduced as
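To make the operand-order point concrete, here is a hedged C sketch of the two loop shapes; the function names are hypothetical. Only the first one, vector divided by scalar, has the operand order of a vector-scalar divide. The second, scalar divided by vector, is the (vec_dup (x), v) form the message says has no matching div insn.

```c
#include <stddef.h>
#include <stdint.h>

/* in[i] / x: vector dividend, scalar divisor.  */
void
div_vec_by_scalar (int32_t *restrict out, const int32_t *restrict in,
                   int32_t x, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = in[i] / x;
}

/* x / in[i]: scalar dividend, vector divisor.  Division is not
   commutative, so the operands cannot simply be swapped.  */
void
div_scalar_by_vec (int32_t *restrict out, const int32_t *restrict in,
                   int32_t x, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = x / in[i];
}
```

Callers are assumed to pass non-zero divisors; the sketch does not model RVV's defined results for division by zero.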
From: Pan Li
This patch combines vec_duplicate + vrem.vv into
vrem.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if the
From: Pan Li
Some existing vrem-related tests need their
asm checks adjusted due to the cost model.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Adjust the
asm check for vrem.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto.
Si
From: Pan Li
Add an asm dump check test for the vec_duplicate + vrem.vv combine to vrem.vx,
with GR2VR costs of 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check
for vrem.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1
From: Pan Li
This patch introduces the combine of vec_dup + vrem.vv into
vrem.vx, depending on the GR2VR cost. Late-combine will take place
if the cost of GR2VR is zero, and the combine is rejected if it is non-zero, like 1 or 15
in the tests. There are two cases for the combine:
Case 0:
| .
From: Pan Li
Add an asm dump check test for the vec_duplicate + vrem.vv combine to vrem.vx,
with GR2VR costs of 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vrem.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-
From: Pan Li
Add an asm dump check test for the vec_duplicate + vdivu.vv combine to vdivu.vx,
with GR2VR costs of 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check
for vdivu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/v
From: Pan Li
Some existing vdiv-related tests need their
asm checks adjusted due to the cost model.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adjust
the asm check for vdivu.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditt
From: Pan Li
This patch introduces the combine of vec_dup + vdivu.vv into
vdivu.vx, depending on the GR2VR cost. Late-combine will take place if
the cost of GR2VR is zero, and the combine is rejected if it is non-zero, like 1 or 15
in the tests. There are two cases for the combine:
Case 0:
|
From: Pan Li
This patch combines vec_duplicate + vdivu.vv into
vdivu.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if th
From: Pan Li
Add an asm dump check test for the vec_duplicate + vdivu.vv combine to vdivu.vx,
with GR2VR costs of 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vdivu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx
From: Pan Li
The case 0 function definitions for the vx combine are mostly the same across
the different test files. Thus, rearrange them into one place to
avoid code duplication.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Leverage
helper macros to avoid code d
From: Pan Li
Add an asm dump check test for the vec_duplicate + vmaxu.vv combine to vmaxu.vx,
with GR2VR costs of 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check
for vmaxu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/v
From: Pan Li
This patch introduces the combine of vec_dup + vmaxu.vv
into vmaxu.vx, depending on the GR2VR cost. Late-combine will take
place if the cost of GR2VR is zero, and the combine is rejected if it is non-zero,
like 1, 2 or 15 in the tests. There are two cases for the combine:
Case 0:
From: Pan Li
This patch combines vec_duplicate + vmaxu.vv into
vmaxu.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if th
From: Pan Li
Add an asm dump check test for the vec_duplicate + vmaxu.vv combine to vmaxu.vx,
with GR2VR costs of 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check
for vmaxu.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx
From: Pan Li
Add an asm dump check test for the vec_duplicate + vmin.vv combine to
vmin.vx, with GR2VR costs of 0, 1 and 2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vmin.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-
From: Pan Li
This patch combines vec_duplicate + vmin.vv into
vmin.vx, as in the example code below. The related pattern depends
on the cost of vec_duplicate from GR2VR. Late-combine will
perform the combination if the cost of GR2VR is zero, and reject it
if the
From: Pan Li
This patch introduces the combine of vec_dup + vmin.vv
into vmin.vx, depending on the GR2VR cost. Late-combine will take
place if the cost of GR2VR is zero, and the combine is rejected if it is non-zero,
like 1, 2 or 15 in the tests. There are two cases for the combine:
Case 0:
|
From: Pan Li
Add an asm dump check and a run test for the vec_duplicate + vmin.vv
combine to vmin.vx, with GR2VR costs of 0, 2 and 15.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.
From: Pan Li
Fix the RVV bool mode precision bug by adjusting the precision.
The bit size of vbool*_t will be adjusted to
[1, 2, 4, 8, 16, 32, 64] according to the RVV 1.0 ISA spec. The
adjusted mode precision of vbool*_t will help the underlying passes to
make t
From: yes
Fix the RVV bool mode size bug by adjusting the size.
Besides the mode precision (aka bit size [1, 2, 4, 8, 16, 32, 64])
of vbool*_t, the mode size (aka byte size) will be adjusted to
[1, 1, 1, 1, 2, 4, 8] according to the RVV 1.0 ISA spec. The
adjustment will provide correct inf
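The two lists above are related by rounding the bit size up to whole bytes. A small sketch of that arithmetic follows; the helper name is hypothetical and this is not how GCC's mode adjustment is written.

```c
/* Hypothetical helper: byte size needed to hold a bool mode of the
   given precision in bits, rounded up to a whole byte.  */
static unsigned
bool_mode_bytes (unsigned precision_bits)
{
  return (precision_bits + 7) / 8;
}
```

Applied to the precisions [1, 2, 4, 8, 16, 32, 64], this gives the byte sizes [1, 1, 1, 1, 2, 4, 8] quoted in the message.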
From: Pan Li
Fix the incorrect code generation for the
code sample below.
typedef unsigned short __attribute__((__vector_size__ (32))) V;
typedef unsigned short u16;
void
foo (V m, u16 *ret)
{
  V v = 6 > ((V) { 2049, 8 } & m);
  *ret = v[0]; // + a + b + c + d;
}
Before this patch.
From: yes
Fix the incorrect code generation for the
code sample below.
typedef unsigned short __attribute__((__vector_size__ (32))) V;
typedef unsigned short u16;
void
foo (V m, u16 *ret)
{
  V v = 6 > ((V) { 2049, 8 } & m);
  *ret = v[0]; // + a + b + c + d;
}
Before this patch.
ad