[Committed] RISC-V: Disable BSWAP optimization for NUNITS < 4

2023-11-23 Thread Juzhe-Zhong
While fixing bugs, I noticed a piece of odd code that looks incorrect and probably makes codegen worse. #include typedef int8_t vnx2qi __attribute__ ((vector_size (2))); #define MASK_2(X, Y) (Y) - 1 - (X), (Y) - 2 - (X) #define PERMUTE(TYPE, NUNITS)
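A minimal sketch of what the quoted MASK_2 macro produces, assuming it is used to build a permute index list as the PERMUTE name suggests. The snippet is truncated before the PERMUTE body, so that macro is not reproduced here; permute2 and reverse2 are hypothetical helpers written only for illustration and are not part of the patch. With X = 0 and Y = 2 the macro expands to the indices 1, 0, which is presumably the kind of two-element byte reversal the subject line refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Index macro quoted in the snippet: yields two descending indices
   starting X positions before the end of a Y-element vector.  */
#define MASK_2(X, Y) (Y) - 1 - (X), (Y) - 2 - (X)

/* Hypothetical helper (not from the patch): copy two elements of SRC
   into DST in the order given by the index list IDX.  */
static void
permute2 (const signed char *src, const int *idx, signed char *dst)
{
  assert (src != NULL && idx != NULL && dst != NULL);
  for (int i = 0; i < 2; i++)
    dst[i] = src[idx[i]];
}

/* Hypothetical helper: reverse a two-byte vector.  MASK_2 (0, 2)
   expands to (2) - 1 - (0), (2) - 2 - (0), i.e. the indices 1, 0.  */
static void
reverse2 (const signed char *src, signed char *dst)
{
  const int idx[2] = { MASK_2 (0, 2) };
  permute2 (src, idx, dst);
}
```

Calling reverse2 on the two bytes {10, 20} stores {20, 10}, which is the element order a 16-bit byte swap would also produce.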

[PATCH] RISC-V: Fix inconsistency among all vectorization hooks

2023-11-24 Thread Juzhe-Zhong
This patch fixes 200+ ICEs exposed by testing with rv64gc_zve64d. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112694 The root cause is that we disallow poly (1,1) size vectorization in preferred_simd_mode with the following code: - if (TARGET_MIN_VLEN < 128 && TARGET_MAX_LMUL < RVV_M2) - retu

[PATCH] RISC-V: Remove incorrect function gate gather_scatter_valid_offset_mode_p

2023-11-25 Thread Juzhe-Zhong
Coming back to review the gather/scatter code, I noticed that gather_scatter_valid_offset_mode_p looks odd. gather_scatter_valid_offset_mode_p is supposed to block vluxei64/vsuxei64 on RV32 systems. However, it fails to do that because it is passed data_mode instead of the index mode: riscv_vector::gather

[Committed] RISC-V: Fix typo

2023-11-25 Thread Juzhe-Zhong
Fix typo. Committed. gcc/ChangeLog: * config/riscv/riscv-avlprop.cc (alv_can_be_propagated_p): Fix typo. (avl_can_be_propagated_p): Ditto. (vlmax_ta_p): Ditto. --- gcc/config/riscv/riscv-avlprop.cc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/g

[Committed] RISC-V: Disable AVL propagation of slidedown instructions

2023-11-26 Thread Juzhe-Zhong
After re-checking the RVV ISA, I found that we must disallow AVL propagation not only for vrgather but also for slidedown instructions. Committed. PR target/112599 gcc/ChangeLog: * config/riscv/riscv-avlprop.cc (avl_can_be_propagated_p): Add slidedown. (vlmax_ta_p): Ditto. (

[PATCH] RISC-V: Fix VSETVL PASS regression

2023-11-27 Thread Juzhe-Zhong
This patch is a regression fix, not an optimization: trunk GCC generates more redundant vsetvl instructions than GCC-13. This is the case: bb 2: def a2 (vsetvl a2, zero) bb 3: use a2 bb 4: use a2 (vle) before this patch: bb 2: vsetvl a2 zero bb 3: vsetvl zero, zero > should be eliminate

[PATCH] RISC-V: Disallow poly (1,1) VLA SLP interleave vectorization

2023-11-28 Thread Juzhe-Zhong
This patch fixes all of the following ICEs in zve64d: FAIL: gcc.dg/vect/pr71259.c -flto -ffat-lto-objects (internal compiler error: in SET_TYPE_VECTOR_SUBPARTS, at tree.h:4248) FAIL: gcc.dg/vect/pr71259.c -flto -ffat-lto-objects (test for excess errors) FAIL: gcc.dg/vect/vect-alias-check-14.c (internal c

[PATCH] RISC-V: Support highpart register overlap for vwcvt

2023-11-29 Thread Juzhe-Zhong
Since Richard recently added support for register filters, we are able to support highpart register overlap for widening RVV instructions. This patch supports it for vwcvt intrinsics. I use real application user code for vwcvt: https://github.com/riscv/riscv-v-spec/issues/929 https://godbolt.org/z/x

[PATCH] RISC-V: Support highpart overlap for vext.vf

2023-11-29 Thread Juzhe-Zhong
PR target/112431 gcc/ChangeLog: * config/riscv/vector.md: Support highpart overlap for vext.vf2 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/unop_v_constraint-2.c: Adapt test. * gcc.target/riscv/rvv/base/pr112431-4.c: New test. * gcc.target/riscv/

[Committed] RISC-V: Rename vconstraint into group_overlap

2023-11-29 Thread Juzhe-Zhong
Fix for Robin's suggestion. gcc/ChangeLog: * config/riscv/constraints.md (TARGET_VECTOR ? V_REGS : NO_REGS): Fix constraint. * config/riscv/riscv.md (no,W21,W42,W84,W41,W81,W82): Rename vconstraint into group_overlap. (no,yes): Ditto. (none,W21,W42,W84,W43,W86,W8

[Committed] RISC-V: Support highpart overlap for floating-point widen instructions

2023-11-29 Thread Juzhe-Zhong
This patch leverages the approach of vwcvt/vext.vf2, which has been approved; the approaches are exactly the same. Tested with no regressions and committed. PR target/112431 gcc/ChangeLog: * config/riscv/vector.md: Add widening overlap. gcc/testsuite/ChangeLog: * gcc.targe

[PATCH] RISC-V: Support widening register overlap for vf4/vf8

2023-11-29 Thread Juzhe-Zhong
size_t foo (char const *buf, size_t len) { size_t sum = 0; size_t vl = __riscv_vsetvlmax_e8m8 (); size_t step = vl * 4; const char *it = buf, *end = buf + len; for (; it + step <= end;) { vint8m1_t v0 = __riscv_vle8_v_i8m1 ((void *) it, vl); it += vl; vint8m1_t v1

[PATCH V4] RISC-V: Support Dynamic LMUL Cost model

2023-09-11 Thread Juzhe-Zhong
This patch supports dynamic LMUL cost modeling with --param=riscv-autovec-lmul=dynamic. Consider the following case: void foo (int32_t *__restrict a, int32_t *__restrict b,int32_t *__restrict c, int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, int32_t *__res

[PATCH V5] RISC-V: Support Dynamic LMUL Cost model

2023-09-12 Thread Juzhe-Zhong
This patch supports dynamic LMUL cost modeling with --param=riscv-autovec-lmul=dynamic. Consider the following case: void foo (int32_t *__restrict a, int32_t *__restrict b,int32_t *__restrict c, int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2, int32_t *__res

[PATCH] RISC-V: Support VECTOR BOOL vcond_mask optab[PR111337]

2023-09-12 Thread Juzhe-Zhong
As described in this PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111337 We support VECTOR BOOL vcond_mask to fix the following ICE: 0x1a9e309 gimple_expand_vec_cond_expr ../../../../gcc/gcc/gimple-isel.cc:283 0x1a9ea56 execute ../../../../gcc/gcc/gimple-isel.cc:390 gcc/ChangeLog:

[PATCH V2] RISC-V: Support VECTOR BOOL vcond_mask optab[PR111337]

2023-09-12 Thread Juzhe-Zhong
PR target/111337 As described in this PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111337 We support VECTOR BOOL vcond_mask to fix the following ICE: 0x1a9e309 gimple_expand_vec_cond_expr ../../../../gcc/gcc/gimple-isel.cc:283 0x1a9ea56 execute ../../../../gcc/gcc/gimple-isel.cc:390 g

[committed] RISC-V: Remove redundant ABI test

2023-09-12 Thread Juzhe-Zhong
We only support and report the warning for RVV types; we don't report it for GNU vectors. So the check in this testcase is incorrect and the FAIL is bogus. Removed and committed. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/vector-abi-9.c: Removed. --- .../gcc.target/riscv/rvv/

[PATCH] RISC-V: Support VLS modes VEC_EXTRACT auto-vectorization

2023-09-13 Thread Juzhe-Zhong
This patch supports VLS modes VEC_EXTRACT to fix PR111391: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111391 I need VLS modes VEC_EXTRACT to fix this issue. I have run the whole gcc testsuite and noticed that this patch adds these 4 FAILs: FAIL: c-c++-common/vector-subscript-4.c -std=gnu++14 scan-t

[PATCH] RISC-V: Expand VLS mode to scalar mode move[PR111391]

2023-09-13 Thread Juzhe-Zhong
This patch fixes PR111391: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111391 PR target/111391 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_legitimize_move): Expand VLS to scalar move. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/slp-9.c: Adapt tes

[PATCH V2] RISC-V: Expand VLS mode to scalar mode move[PR111391]

2023-09-13 Thread Juzhe-Zhong
This patch fixes PR111391: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111391 PR target/111391 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_legitimize_move): Expand VLS to scalar move. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/slp-9.c: Adapt tes

[PATCH] RISC-V: Fix ICE in get_avl_or_vl_reg[PR111395]

2023-09-13 Thread Juzhe-Zhong
This patch fixes the ICE in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111395 PR target/111395 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (avl_info::operator==): Fix bug. (vector_insn_info::global_merge): Ditto. (vector_insn_info::get_avl_or_vl_reg): Ditto. (p

[PATCH V2] RISC-V: Fix ICE in get_avl_or_vl_reg

2023-09-14 Thread Juzhe-Zhong
Update v1 -> v2: Add available fortran compiler check in rvv-fortran.exp. This patch fixes the ICE in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111395 PR target/111395 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (avl_info::operator==): Fix ICE. (vector_insn_info::global_merge

[PATCH V3] RISC-V: Fix ICE in get_avl_or_vl_reg

2023-09-14 Thread Juzhe-Zhong
Update v1 -> v2: Add available fortran compiler check in rvv-fortran.exp. This patch fixes the ICE in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111395 Update v2 -> v3: Remove redundant format. PR target/111395 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (avl_info::operator==): Fix

[Committed] RISC-V: Format VSETVL PASS code

2023-09-14 Thread Juzhe-Zhong
gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (pass_vsetvl::global_eliminate_vsetvl_insn): Format it. --- gcc/config/riscv/riscv-vsetvl.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc index 8ec5

[PATCH V3] RISC-V: Expand VLS mode to scalar mode move[PR111391]

2023-09-14 Thread Juzhe-Zhong
This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111391 I noticed that the previous patch (V2) causes an additional execution failure of pr69719.c. This FAIL is due to a latent bug in the VSETVL PASS, so this patch includes a VSETVL PASS fix even though it's not related to PR111391.

[PATCH V4] RISC-V: Expand VLS mode to scalar mode move[PR111391]

2023-09-14 Thread Juzhe-Zhong
This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111391 PR target/111391 gcc/ChangeLog: * config/riscv/autovec.md (@vec_extract): Remove @. (vec_extract): Ditto. * config/riscv/riscv-vsetvl.cc (emit_vsetvl_insn): Fix bug. (pass_vsetvl::local_e

[PATCH] RISC-V: Support VLS modes mask operations

2023-09-14 Thread Juzhe-Zhong
This patch supports mask operations (comparison and logical). This patch fixes these FAILs in the "vect" testsuite: FAIL: gcc.dg/vect/vect-bic-bitmask-12.c -flto -ffat-lto-objects scan-tree-dump dce7 "<=\\s*.+{ 255,.+}" FAIL: gcc.dg/vect/vect-bic-bitmask-12.c scan-tree-dump dce7 "<=\\s*.+{ 255,.+}"

[PATCH] test: Remove XPASS for RISCV

2023-09-15 Thread Juzhe-Zhong
As with ARM SVE, this test produces XPASS results: XPASS: gcc.dg/Wstringop-overflow-47.c pr97027 (test for warnings, line 72) XPASS: gcc.dg/Wstringop-overflow-47.c pr97027 (test for warnings, line 77) XPASS: gcc.dg/Wstringop-overflow-47.c pr97027 note (test for warnings, line 68) on RISC-V gcc/testsui

[PATCH] test: Block slp-16.c check for target support vect_strided6

2023-09-15 Thread Juzhe-Zhong
This testcase FAILs on RISC-V because RISC-V supports vect_load_lanes with 6. FAIL: gcc.dg/vect/slp-16.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 2 FAIL: gcc.dg/vect/slp-16.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2 Since it uses vlseg6 (vect

[PATCH] test: Isolate slp-1.c check of target supports vect_strided5

2023-09-15 Thread Juzhe-Zhong
This test failed in RISC-V: FAIL: gcc.dg/vect/slp-1.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 4 FAIL: gcc.dg/vect/slp-1.c scan-tree-dump-times vect "vectorizing stmts using SLP" 4 Because this loop: /* SLP with unrolling by 8. */ for (i = 0; i < N; i

[PATCH] test: Block vect_strided5 for slp-34-big-array.c SLP check

2023-09-15 Thread Juzhe-Zhong
It failed on RISC-V since it uses vect_store_lanes with array 5. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-34-big-array.c: Block SLP check for vect_strided5. --- gcc/testsuite/gcc.dg/vect/slp-34-big-array.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuit

[PATCH] test: Block SLP check of slp-34.c for vect_strided5

2023-09-15 Thread Juzhe-Zhong
Since RISC-V uses vsseg5, which is vect_store_lanes with stride 5, it failed on RISC-V. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-34.c: Block check for vect_strided5. --- gcc/testsuite/gcc.dg/vect/slp-34.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsui

[PATCH] test: Block SLP check of slp-35.c for vect_strided5

2023-09-15 Thread Juzhe-Zhong
gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-35.c: Block SLP check for vect_strided5 targets. --- gcc/testsuite/gcc.dg/vect/slp-35.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/vect/slp-35.c b/gcc/testsuite/gcc.dg/vect/slp-35.c index 5e9f6739e1

[PATCH] internal-fn: Convert uninitialized SSA_NAME into SCRATCH rtx[PR110751]

2023-09-15 Thread Juzhe-Zhong
According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 As Richard and Richi suggested, we recognize uninitialized SSA_NAME and convert it into SCRATCH rtx if the target predicate allows SCRATCH. It can help to reduce redundant data move instructions of targets like RISC-V. Here we

[PATCH] RISC-V: Support VLS modes reduction[PR111153]

2023-09-16 Thread Juzhe-Zhong
This patch supports VLS reduction vectorization. It can optimize the current reduction vectorization codegen with current COST model. #define DEF_REDUC_PLUS(TYPE)\ TYPE __attribute__ ((noinline, noclone))\ reduc_plus_##TYPE (TYPE * __restrict a, int n) \ {

[PATCH V2] internal-fn: Support undefined rtx for uninitialized SSA_NAME

2023-09-17 Thread Juzhe-Zhong
According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 As Richard and Richi suggested, we recognize uninitialized SSA_NAME and convert it into SCRATCH rtx if the target predicate allows SCRATCH. It can help to reduce redundant data move instructions of targets like RISC-V. gcc/Cha

[PATCH V3] internal-fn: Support undefined rtx for uninitialized SSA_NAME

2023-09-17 Thread Juzhe-Zhong
According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 As Richard and Richi suggested, we recognize uninitialized SSA_NAME and convert it into SCRATCH rtx if the target predicate allows SCRATCH. It can help to reduce redundant data move instructions of targets like RISC-V. Bootstr

[Committed] RISC-V: Remove redundant codes of VLS patterns[NFC]

2023-09-17 Thread Juzhe-Zhong
Since those VLS patterns are the same as the VLA patterns, extend VI -> V_VLSI and VF -> V_VLSF, then remove the redundant VLS pattern code. gcc/ChangeLog: * config/riscv/autovec-vls.md (3): Deleted. (copysign3): Ditto. (xorsign3): Ditto. (2): Ditto. *

[PATCH] RISC-V: Remove autovec-vls.md file and clean up VLS move modes[NFC]

2023-09-18 Thread Juzhe-Zhong
V auto-vectorization. -;; Copyright (C) 2023 Free Software Foundation, Inc. -;; Contributed by Juzhe Zhong (juzhe.zh...@rivai.ai), RiVAI Technologies Ltd. - -;; This file is part of GCC. - -;; GCC is free software; you can redistribute it and/or modify -;; it under the terms of the GNU General

[Committed] RISC-V: Fix VSETVL PASS fusion bug

2023-09-18 Thread Juzhe-Zhong
There is an obvious fusion bug that is exposed by supporting more VLS patterns. With more VLS modes supported, it causes the following FAILs: FAIL: gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c execution test FAIL: gcc.target/riscv/rvv

[Committed] RISC-V: Support VLS reduction

2023-09-18 Thread Juzhe-Zhong
I noticed that the previous VLS reduction patch is missing some code, which causes multiple ICEs: FAIL: gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c (internal compiler error: in code_for_pred, at ./insn-opinit.h:1560) FAIL: gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c (internal compiler error: in code

[Committed] RISC-V: Fix bogus FAILs of vsetvl testcases

2023-09-18 Thread Juzhe-Zhong
A global code change that alters the CFG causes bogus vsetvl checking FAILs. Adapt the testcases for that change. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/avl_single-21.c: Adapt test. * gcc.target/riscv/rvv/vsetvl/avl_single-26.c: Ditto.

[PATCH] RISC-V: Remove redundant vec_duplicate pattern

2023-09-18 Thread Juzhe-Zhong
Currently, the VLS and VLA patterns are different: VLA is a define_expand and VLS is a define_insn_and_split. It makes no sense for them to use different pattern formats. Merge them into the same pattern (define_insn_and_split). It can also be helpful for the future vv -> vx fwprop optimization. gcc/ChangeLog:

[PATCH] RISC-V: Fix RVV can change mode class bug

2023-09-18 Thread Juzhe-Zhong
After supporting VLS mode conversion, the current case triggers a latent bug that we were lucky not to encounter earlier. This is a real bug in 'cprop_hardreg': orig:RVVMF8BI,16,16 new:V32BI,32,0 during RTL pass: cprop_hardreg auto.c: In function 'main': auto.c:79:1: internal compiler error: in partial_s

[PATCH V2] RISC-V: Fix RVV can change mode class bug

2023-09-18 Thread Juzhe-Zhong
After supporting VLS mode conversion, the current case triggers a latent bug that we were lucky not to encounter earlier. This is a real bug in 'cprop_hardreg': orig:RVVMF8BI,16,16 new:V32BI,32,0 during RTL pass: cprop_hardreg auto.c: In function 'main': auto.c:79:1: internal compiler error: in partial_s

[Committed] RISC-V: Support integer FMA/FNMA VLS modes autovectorization

2023-09-19 Thread Juzhe-Zhong
Simply extend the current VLA iterator and patterns. Regression passed with no difference. gcc/ChangeLog: * config/riscv/autovec.md: Add VLS modes. * config/riscv/vector.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS FMA/FNMA test

[Committed] RISC-V: Support VLS floating-point FMA/FNMA/FMS auto-vectorization

2023-09-19 Thread Juzhe-Zhong
Support VLS floating-point FMA/FNMA/FMS patterns. No regression differences after this patch. Committed. gcc/ChangeLog: * config/riscv/autovec.md: Extend VLS floating-point modes. * config/riscv/vector.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vl

[Committed] RISC-V: Support VLS unary floating-point patterns

2023-09-19 Thread Juzhe-Zhong
Extend current VLA patterns with VLS modes. Regression all passed. gcc/ChangeLog: * config/riscv/autovec.md: Extend VLS modes. * config/riscv/vector.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/def.h: Add unary test. * gcc.target/riscv/

[PATCH] RISC-V: Add FNMS floating-point VLS tests

2023-09-19 Thread Juzhe-Zhong
Add tests and committed. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/def.h: Add FNMS VLS modes tests. * gcc.target/riscv/rvv/autovec/vls/fnms-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/fnms-2.c: New test. * gcc.target/riscv/rvv/autovec/vls/fn

[Committed] RISC-V: Extend VLS modes in 'VWEXTI' iterator

2023-09-19 Thread Juzhe-Zhong
This patch extends the 'VWEXTI' iterator so that we support integer extension/integer truncate/integer average VLS patterns. This patch fixes the following FAILs: FAIL: gcc.dg/pr92301.c execution test XPASS: gcc.dg/vect/bb-slp-subgroups-3.c -flto -ffat-lto-objects scan-tree-dump-times slp2 "

[Committed] RISC-V: Fix Demand comparison bug[VSETVL PASS]

2023-09-20 Thread Juzhe-Zhong
This bug is exposed when we support VLS integer conversion patterns. FAIL: c-c++-common/torture/pr53505.c execution. This is because of an incorrect vsetvl elimination by Phase 4: 10318: 0d207057 vsetvli zero,zero,e32,m4,ta,ma 1031c: 5e003e57 vmv.v.i v28

[Committed] RISC-V: Support VLS floating-point extend/truncate

2023-09-20 Thread Juzhe-Zhong
Regression passed. Committed. gcc/ChangeLog: * config/riscv/vector-iterators.md: Extend VLS floating-point. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/widen/widen-10.c: Adapt test. * gcc.target/riscv/rvv/autovec/widen/widen-11.c: Ditto. * gcc.target

[Committed V4] internal-fn: Support undefined rtx for uninitialized SSA_NAME[PR110751]

2023-09-20 Thread Juzhe-Zhong
According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 As Richard and Richi suggested, we recognize uninitialized SSA_NAME and convert it into SCRATCH rtx if the target predicate allows SCRATCH. It can help to reduce redundant data move instructions of targets like RISC-V. Bootstr

[Committed] RISC-V: Support VLS INT <-> FP conversions

2023-09-20 Thread Juzhe-Zhong
Support INT <-> FP VLS auto-vectorization patterns. Regression passed. Committed. gcc/ChangeLog: * config/riscv/autovec.md: Extend VLS modes. * config/riscv/vector-iterators.md: Ditto. * config/riscv/vector.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/

[PATCH] RISC-V: Fix SUBREG move of VLS mode[PR111486]

2023-09-20 Thread Juzhe-Zhong
This patch fixes this bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111486 Before this patch, we can only handle (subreg:DI (reg:V8QI)) The PR ICE: during RTL pass: reload testcase.c: In function 'foo': testcase.c:8:1: internal compiler error: in require, at machmode.h:313 8 | } |

[PATCH] RISC-V: Enable undefined support for RVV auto-vectorization[PR110751]

2023-09-21 Thread Juzhe-Zhong
Now the GCC middle-end can support an undefined value, which is translated into (scratch:mode). This patch enables the RISC-V backend to use an undefined value as the ELSE value of COND_LEN_xxx/COND_xxx. Consider the following case: __attribute__((noipa)) void vrem_int8_t (int8_t * __restrict dst, int8_t * __re

[Committed] RISC-V: Support VLS mult high

2023-09-21 Thread Juzhe-Zhong
Regression passed. Committed. gcc/ChangeLog: * config/riscv/vector-iterators.md: Extend VLS modes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS mult high. * gcc.target/riscv/rvv/autovec/vls/mulh-1.c: New test. --- gcc/config/riscv/vector-

[Committed] RISC-V: Add more VLS unary tests

2023-09-21 Thread Juzhe-Zhong
Notice we are missing these tests. Committed. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/abs-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/not-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/sqrt-1.c: New test. --- .../gcc.target/riscv/rvv/autovec

[Committed] RISC-V: Add VLS integer ABS support

2023-09-21 Thread Juzhe-Zhong
Regression passed. Committed. gcc/ChangeLog: * config/riscv/autovec.md: Extend VLS modes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/abs-2.c: New test. --- gcc/config/riscv/autovec.md | 6 +- .../gcc.target/riscv/rvv/autovec/vls/abs-2.c |

[PATCH] RISC-V: Add VLS conditional patterns support

2023-09-22 Thread Juzhe-Zhong
Regression passed. Committed. gcc/ChangeLog: * config/riscv/autovec.md: Add VLS conditional patterns. * config/riscv/riscv-protos.h (expand_cond_unop): Ditto. (expand_cond_binop): Ditto. (expand_cond_ternop): Ditto. * config/riscv/riscv-v.cc (expand_cond_u

[Committed] RISC-V: Remove @ of vec_duplicate pattern

2023-09-22 Thread Juzhe-Zhong
The @ of the vec_duplicate pattern is obviously redundant. Regression passed. Committed. gcc/ChangeLog: * config/riscv/riscv-v.cc (gen_const_vector_dup): Use global expand function. * config/riscv/vector.md (@vec_duplicate): Remove @. (vec_duplicate): Ditto. --- gcc/con

[Committed] RISC-V: Add VLS unary combine patterns

2023-09-22 Thread Juzhe-Zhong
gcc/ChangeLog: * config/riscv/autovec-opt.md: Add VLS modes for conditional ABS/SQRT. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/cond_abs-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/cond_sqrt-1.c: New test. --- gcc/config/riscv/autovec-opt.md

[PATCH] RISC-V: Fix AVL/VL bug of VSETVL PASS[PR111548]

2023-09-23 Thread Juzhe-Zhong
This patch fixes an incorrect fetch of the AVL/VL reg in the VSETVL PASS. C/C++ regression passed, but gfortran has not been run yet; I am still looking for a way to run it. I will commit once the fortran regression passes. PR target/111548 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (earlies

[PATCH] RISC-V: Add opaque integer modes to fix ICE on DSE[PR111590]

2023-09-25 Thread Juzhe-Zhong
When running the fortran tests with the 'V' extension enabled on the RISC-V port, I saw multiple ICEs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111590 The root cause is in DSE: internal compiler error: in smallest_mode_for_size, at stor-layout.cc:356 0x1918f70 smallest_mode_for_size(poly_int<2u, unsigned lon

[PATCH] MATCH: Optimize COND_ADD_LEN reduction pattern

2023-09-26 Thread Juzhe-Zhong
This patch leverages this commit: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=62b505a4d5fc89 to optimize the COND_LEN_ADD reduction pattern. We are doing the optimization VEC_COND_EXPR + COND_LEN_ADD -> COND_LEN_ADD. Consider the following case: #include void pr11594 (uint64_t *restrict a, ui

[PATCH V2] MATCH: Optimize COND_ADD_LEN reduction pattern

2023-09-26 Thread Juzhe-Zhong
This patch leverages this commit: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=62b505a4d5fc89 to optimize the COND_LEN_ADD reduction pattern. We are doing the optimization VEC_COND_EXPR + COND_LEN_ADD -> COND_LEN_ADD. Consider the following case: #include void pr11594 (uint64_t *restrict a, uint

[PATCH] MATCH: Optimize COND_ADD reduction pattern

2023-09-26 Thread Juzhe-Zhong
The current COND_ADD reduction pattern can't optimize floating-point vectors. As Richard suggested: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631336.html Allow the COND_ADD reduction pattern to optimize floating-point vectors. Bootstrap and regression are running. Ok for trunk if tests pass

[PATCH V3] MATCH: Optimize COND_ADD_LEN reduction pattern

2023-09-26 Thread Juzhe-Zhong
This patch leverages this commit: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=62b505a4d5fc89 to optimize the COND_LEN_ADD reduction pattern. We are doing the optimization VEC_COND_EXPR + COND_LEN_ADD -> COND_LEN_ADD. Consider the following case: #include void pr11594 (uint64_t *restrict a, uint

[PATCH V2] MATCH: Optimize COND_ADD_LEN reduction pattern

2023-09-26 Thread Juzhe-Zhong
The current COND_ADD reduction pattern can't optimize floating-point vectors. As Richard suggested: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631336.html Allow the COND_ADD reduction pattern to optimize floating-point vectors. Bootstrap and regression are running. Ok for trunk if tests pass

[PATCH V2] MATCH: Optimize COND_ADD reduction pattern

2023-09-26 Thread Juzhe-Zhong
The current COND_ADD reduction pattern can't optimize floating-point vectors. As Richard suggested: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631336.html Allow the COND_ADD reduction pattern to optimize floating-point vectors. Bootstrap and regression are running. Ok for trunk if tests pass

[Committed] RISC-V: Fix mem-to-mem VLS move pattern[PR111566]

2023-09-26 Thread Juzhe-Zhong
Splitting the mem-to-mem insn pattern from reg-to-mem/mem-to-reg/reg-to-reg causes an ICE in RA, since RA prefers that they stay together. Now we split mem-to-mem as a pure pre-RA split pattern and only allow define_insn to match mem-to-mem VLS moves in the pre-RA stage (forbidding mem-to-mem moves after RA). Teste

[PATCH V2] RISC-V: Fix mem-to-mem VLS move pattern[PR111566]

2023-09-26 Thread Juzhe-Zhong
Splitting the mem-to-mem insn pattern from reg-to-mem/mem-to-reg/reg-to-reg causes an ICE in RA, since RA prefers that they stay together. Now we split mem-to-mem as a pure pre-RA split pattern and only allow define_insn to match mem-to-mem VLS moves in the pre-RA stage (forbidding mem-to-mem moves after RA). Teste

[PATCH V3] RISC-V: Remove mem-to-mem VLS move pattern[PR111566]

2023-09-26 Thread Juzhe-Zhong
PR target/111566 gcc/ChangeLog: * config/riscv/vector.md (*mov_mem_to_mem): Remove. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/mov-1.c: Adapt test. * gcc.target/riscv/rvv/autovec/vls/mov-10.c: Ditto. * gcc.target/riscv/rvv/autovec/vls/m

[PATCH] DSE: Fix ICE when the mode with access_size don't exist on the target[PR111590]

2023-09-26 Thread Juzhe-Zhong
When running the fortran tests with the 'V' extension enabled on the RISC-V port, I saw multiple ICEs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111590 The root cause is in DSE: internal compiler error: in smallest_mode_for_size, at stor-layout.cc:356 0x1918f70 smallest_mode_for_size(poly_int<2u, unsigned long

[PATCH] ifcvt: Fix comments

2023-09-26 Thread Juzhe-Zhong
Fix comments since original comment is confusing. gcc/ChangeLog: * tree-if-conv.cc (is_cond_scalar_reduction): Fix comments. --- gcc/tree-if-conv.cc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc index 799f071965e..a8c

[SUBREG V3 1/4] DF: Add -ftrack-subreg-liveness option

2024-05-11 Thread Juzhe-Zhong
Add new flag -ftrack-subreg-liveness to enable track-subreg-liveness. This flag is enabled at -O3/fast. Co-authored-by: Lehua Ding gcc/ChangeLog: * common.opt: Add -ftrack-subreg-liveness option. * common.opt.urls: Ditto. * doc/invoke.texi: Ditto. * opts.cc: Ditt

[SUBREG V3 3/4] IRA: Add DF_LIVE_SUBREG problem

2024-05-11 Thread Juzhe-Zhong
This patch simply replaces df_get_live_in with df_get_subreg_live_in and df_get_live_out with df_get_subreg_live_out. Co-authored-by: Lehua Ding gcc/ChangeLog: * ira-build.cc (create_bb_allocnos): Apply DF_LIVE_SUBREG data. (create_loop_allocnos): Ditto. * ira-color.c

[SUBREG V3 4/4] LRA: Apply DF_LIVE_SUBREG data

2024-05-11 Thread Juzhe-Zhong
This patch applies DF_LIVE_SUBREG to the LRA pass. More changes were made to LRA than to IRA, since LRA modifies the DF data directly. The main changes are centered on the lra-lives.cc file. Co-authored-by: Lehua Ding gcc/ChangeLog: * lra-coalesce.cc (update_live_info): App

[SUBREG V3 0/4] Add DF_LIVE_SUBREG data and apply to IRA and LRA

2024-05-11 Thread Juzhe-Zhong
x86-64 no regression. Co-authored-by: Lehua Ding Juzhe-Zhong (4): DF: Add -ftrack-subreg-liveness option DF: Add DF_LIVE_SUBREG problem IRA: Add DF_LIVE_SUBREG problem LRA: Apply DF_LIVE_SUBREG data gcc/Makefile.in | 1 + gcc/common.opt | 4 + gcc/common.opt.

[SUBREG V3 2/4] DF: Add DF_LIVE_SUBREG problem

2024-05-11 Thread Juzhe-Zhong
This patch adds a new DF problem named DF_LIVE_SUBREG. This problem extends the DF_LR problem and supports tracking the subreg liveness of multireg pseudos if these pseudos satisfy the following conditions: 1. the mode size is greater than its REGMODE_NATURAL_SIZE. 2. the reg is used in insns

[SUBREG V4 0/4] Add DF_LIVE_SUBREG data and apply to IRA and LRA

2024-05-12 Thread Juzhe-Zhong
subreg liveness tracking in the followup patches. Bootstrap and Regtested on x86-64 no regression. Co-authored-by: Lehua Ding Juzhe-Zhong (4): DF: Add -ftrack-subreg-liveness option DF: Add DF_LIVE_SUBREG problem IRA: Apply DF_LIVE_SUBREG data LRA: Apply DF_LIVE_SUBREG data gcc/M

[SUBREG V4 4/4] LRA: Apply DF_LIVE_SUBREG data

2024-05-12 Thread Juzhe-Zhong
--- gcc/lra-coalesce.cc| 27 +++- gcc/lra-constraints.cc | 109 ++--- gcc/lra-int.h | 4 + gcc/lra-lives.cc | 357 - gcc/lra-remat.cc | 8 +- gcc/lra-spills.cc | 27 +++- gcc/lra.cc | 10 +- 7 files ch

[SUBREG V4 1/4] DF: Add -ftrack-subreg-liveness option

2024-05-12 Thread Juzhe-Zhong
--- gcc/common.opt | 4 gcc/common.opt.urls | 3 +++ gcc/doc/invoke.texi | 8 gcc/opts.cc | 1 + 4 files changed, 16 insertions(+) diff --git a/gcc/common.opt b/gcc/common.opt index 40cab3cb36a..5710e817abe 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2163,6 +21

[SUBREG V4 2/4] DF: Add DF_LIVE_SUBREG problem

2024-05-12 Thread Juzhe-Zhong
--- gcc/Makefile.in | 1 + gcc/df-problems.cc | 886 ++- gcc/df.h | 159 +++ gcc/regs.h | 5 + gcc/sbitmap.cc | 98 + gcc/sbitmap.h| 2 + gcc/subreg-live-range.cc | 233 ++

[SUBREG V4 3/4] IRA: Apply DF_LIVE_SUBREG data

2024-05-12 Thread Juzhe-Zhong
--- gcc/ira-build.cc | 7 --- gcc/ira-color.cc | 8 gcc/ira-emit.cc | 12 ++-- gcc/ira-lives.cc | 7 --- gcc/ira.cc | 19 --- 5 files changed, 30 insertions(+), 23 deletions(-) diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc index ea593d5a087..2

[PATCH] RISC-V: Add RVV registers register spilling

2022-11-06 Thread juzhe . zhong
From: Ju-Zhe Zhong This patch supports RVV scalable register spilling. The prologue and epilogue handling picks up the prototype from Monk Chiang. Co-authored-by: Monk Chiang gcc/ChangeLog: * config/riscv/riscv-v.cc (emit_pred_move): Adjust for scalable register spilling. (legitimize_m

[PATCH] RISC-V: Fix bug reported by PR109535

2023-04-18 Thread juzhe . zhong
From: Ju-Zhe Zhong Fix a bug reported by google/highway, which uses RVV intrinsics: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109535 PR 109535 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (count_regno_occurrences): New function. (pass_vsetvl::cleanup_insns): Fix bug.

[PATCH] RISC-V: Fix bug of PR109535

2023-04-19 Thread juzhe . zhong
From: Ju-Zhe Zhong PR 109535 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (count_regno_occurrences): New function. (pass_vsetvl::cleanup_insns): Fix bug. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/pr109535.C: New test. * gcc.target/riscv/rvv/

[PATCH] RISC-V: Fix bug of PR109535

2023-04-19 Thread juzhe . zhong
From: Ju-Zhe Zhong Testcase coming from Kito. Co-authored-by: kito-cheng Co-authored-by: kito-cheng PR 109535 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (count_regno_occurrences): New function. (pass_vsetvl::cleanup_insns): Fix bug. gcc/testsuite/ChangeLog:

[PATCH 04/10] RISC-V: Support chunk 128

2023-04-19 Thread juzhe . zhong
From: Juzhe-Zhong RISC-V provides different VLEN configurations through different ISA extensions such as `zve32x`, `zve64x` and `v`: zve32x only guarantees a minimum VLEN of 32 bits, zve64x guarantees a minimum VLEN of 64 bits, and v guarantees a minimum VLEN of 128 bits. Current status (without

[PATCH] RISC-V: Support 128 bit vector chunk

2023-04-19 Thread juzhe . zhong
From: Juzhe-Zhong RISC-V provides different VLEN configurations through different ISA extensions such as `zve32x`, `zve64x` and `v`: zve32x only guarantees a minimum VLEN of 32 bits, zve64x guarantees a minimum VLEN of 64 bits, and v guarantees a minimum VLEN of 128 bits. Current status (without
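
The minimum-VLEN guarantees quoted in the snippet above can be written down as a small lookup. This is only an illustrative sketch: `min_vlen_bits` is a hypothetical helper name, not a function from GCC, and it covers only the three extensions the message mentions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): return the minimum VLEN in bits
   that the named vector extension guarantees, per the message above.
   Returns 0 for any extension name the message does not mention.  */
int min_vlen_bits(const char *ext)
{
    assert(ext != NULL);
    if (strcmp(ext, "zve32x") == 0)
        return 32;   /* zve32x: minimum VLEN is 32 bits */
    if (strcmp(ext, "zve64x") == 0)
        return 64;   /* zve64x: minimum VLEN is 64 bits */
    if (strcmp(ext, "v") == 0)
        return 128;  /* v: minimum VLEN is 128 bits */
    return 0;        /* not covered by this sketch */
}
```

A 128-bit chunk, as named in the patch subject, matches the smallest VLEN that `v` guarantees, while the two `zve*` configurations may have a VLEN below that.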

[PATCH] RISC-V: Add tuple type vget/vset intrinsics

2023-04-19 Thread juzhe . zhong
From: Juzhe-Zhong gcc/ChangeLog: * config/riscv/genrvv-type-indexer.cc (valid_type): Adapt for tuple type support. (inttype): Ditto. (floattype): Ditto. (main): Ditto. * config/riscv/riscv-vector-builtins-bases.cc: Ditto. * config/riscv/riscv

[PATCH 0/3] RISC-V: Basic enable RVV auto-vectorizaiton

2023-04-19 Thread juzhe . zhong
From: Ju-Zhe Zhong PATCH 1: Add compile option for RVV auto-vectorization. PATCH 2: Enable basic RVV auto-vectorization. PATCH 3: Add sanity testcases. *** BLURB HERE *** Ju-Zhe Zhong (3): RISC-V: Add auto-vectorization compile option for RVV RISC-V: Enable basic auto-vectorization for RVV

[PATCH 1/3] RISC-V: Add auto-vectorization compile option for RVV

2023-04-19 Thread juzhe . zhong
From: Ju-Zhe Zhong This patch adds two compile options for RVV auto-vectorization. 1. -param=riscv-autovec-preference= This option specifies the auto-vectorization approach for RVV. Currently, only scalable and fixed-vlmax are supported. - scalable means VLA auto-vectorization.

[PATCH 2/3] RISC-V: Enable basic auto-vectorization for RVV

2023-04-19 Thread juzhe . zhong
c.md b/gcc/config/riscv/autovec.md new file mode 100644 index 000..b5d46ff57ab --- /dev/null +++ b/gcc/config/riscv/autovec.md @@ -0,0 +1,49 @@ +;; Machine description for auto-vectorization using RVV for GNU compiler. +;; Copyright (C) 2023 Free Software Foundation, Inc. +;; Contributed by J

[PATCH 0/3] RISC-V: Basic enable RVV auto-vectorizaiton

2023-04-19 Thread juzhe . zhong
From: Ju-Zhe Zhong PATCH 1: Add compile option for RVV auto-vectorization. PATCH 2: Enable basic RVV auto-vectorization. PATCH 3: Add sanity testcases. *** BLURB HERE *** Ju-Zhe Zhong (3): RISC-V: Add auto-vectorization compile option for RVV RISC-V: Enable basic auto-vectorization for RVV

[PATCH 2/3] RISC-V: Enable basic auto-vectorization for RVV

2023-04-19 Thread juzhe . zhong
c.md b/gcc/config/riscv/autovec.md new file mode 100644 index 000..b5d46ff57ab --- /dev/null +++ b/gcc/config/riscv/autovec.md @@ -0,0 +1,49 @@ +;; Machine description for auto-vectorization using RVV for GNU compiler. +;; Copyright (C) 2023 Free Software Foundation, Inc. +;; Contributed by J

[PATCH 1/3] RISC-V: Add auto-vectorization compile option for RVV

2023-04-19 Thread juzhe . zhong
From: Ju-Zhe Zhong This patch adds two compile options for RVV auto-vectorization. 1. -param=riscv-autovec-preference= This option specifies the auto-vectorization approach for RVV. Currently, only scalable and fixed-vlmax are supported. - scalable means VLA auto-vectorization.

[PATCH 3/3] RISC-V: Add sanity testcases for RVV auto-vectorization

2023-04-19 Thread juzhe . zhong
From: Ju-Zhe Zhong This patch adds sanity tests for the basic enabling of auto-vectorization. We should make sure the compiler enables auto-vectorization strictly according to '-march'. For example, '-march=rv32gc_zve32x' cannot allow INT64 auto-vectorization, since SEW = 64 RVV instructions are illegal ins
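
As a rough illustration of the case described above, the loop below operates on 64-bit elements. Under the message's reasoning, a compiler targeting '-march=rv32gc_zve32x' would be expected to keep this loop scalar, because vectorizing it would need SEW = 64. The function name `add_i64` is hypothetical and this is not one of the testcases from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example (not taken from the patch): element-wise add of
   64-bit integers.  Vectorizing this loop with RVV requires SEW = 64,
   which the message says is not available with zve32x.  */
void add_i64(int64_t *restrict dst, const int64_t *restrict a,
             const int64_t *restrict b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

The same loop written with `int32_t` elements only needs SEW = 32, so it would be a candidate for vectorization under that '-march' setting.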
