[Bug target/110751] New: RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

            Bug ID: 110751
           Summary: RISC-V: Suport undefined value that allows VSETVL PASS
                    use TA/MA
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: xuli1 at eswincomputing dot com
  Target Milestone: ---

Created attachment 55588
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55588&action=edit
testcase

Zhong has merged two auto-vectorization patches:
https://github.com/gcc-mirror/gcc/commit/0d4dd7e07a879d6c07a33edb2799710faa95651e
https://github.com/gcc-mirror/gcc/commit/44f244e4672578be6cc513104473981790a1c164

Consider the following case:

#include <stdint.h>

__attribute__((noipa))
void vrem_int8_t (int8_t * __restrict dst, int8_t * __restrict a,
                  int8_t * __restrict b, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = a[i] % b[i];
}

vrem_int8_t:
        ble     a3,zero,.L5
.L3:
        vsetvli a5,a3,e8,m1,tu,ma    --> tu here
        vle8.v  v1,0(a1)
        vle8.v  v2,0(a2)
        sub     a3,a3,a5
        vrem.vv v1,v1,v2
        vse8.v  v1,0(a0)
        add     a1,a1,a5
        add     a2,a2,a5
        add     a0,a0,a5
        bne     a3,zero,.L3
.L5:
        ret

Currently, the value returned by the TARGET_PREFERRED_ELSE_VALUE target hook
is not ideal for RVV, because it makes the VSETVL pass use MU or TU.
We want to support an undefined value that allows the VSETVL pass to use TA/MA.

According to Zhong's advice, there are two approaches:
1. Add a new tree code representing an undefined value, like
   DEFTREECODE (UNDEF_TYPE, "undef_type", tcc_type, 0).
2. Modify the target hook TARGET_PREFERRED_ELSE_VALUE to accept a GSI
   parameter. (Currently only the aarch64 and riscv architectures implement
   this hook.) In this way, we can build an RVV intrinsic __riscv_vundefine
   in the RISC-V backend, so that the backend can automatically recognize
   the undefined value and use TA in the VSETVL instruction.

Which approach is better? Looking forward to your advice, thanks.
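To illustrate why the tail policy should not matter for this loop, here is a minimal scalar sketch in C of the strip-mined loop above. It is only a model under stated assumptions: VLMAX, enum tail_policy, emu_vrem and emu_vrem_loop are hypothetical names invented for the sketch, vl is simplified to min(n, VLMAX), and the agnostic tail is modelled as all-ones, which is just one of the behaviours the agnostic policy permits. Divisors are assumed to be nonzero, as in the C source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VLMAX 16 /* hypothetical: elements per vector register at e8,m1 */

/* Tail policy of the emulated vector unit (names are illustrative only). */
enum tail_policy { TAIL_UNDISTURBED, TAIL_AGNOSTIC };

/* Models vrem.vv on the first VL elements of VD.  Under TAIL_AGNOSTIC the
   tail is overwritten with all-ones; under TAIL_UNDISTURBED the tail keeps
   its old value.  Divisors must be nonzero.  */
static void
emu_vrem (int8_t vd[VLMAX], const int8_t va[VLMAX], const int8_t vb[VLMAX],
          size_t vl, enum tail_policy tp)
{
  assert (vl <= VLMAX);
  for (size_t i = 0; i < vl; i++)
    vd[i] = (int8_t) (va[i] % vb[i]);
  if (tp == TAIL_AGNOSTIC)
    memset (vd + vl, 0xff, VLMAX - vl);
}

/* Scalar model of the strip-mined vrem_int8_t loop from the report.  */
static void
emu_vrem_loop (int8_t *dst, const int8_t *a, const int8_t *b, size_t n,
               enum tail_policy tp)
{
  int8_t v1[VLMAX], v2[VLMAX];
  memset (v1, 0, sizeof v1);
  memset (v2, 0, sizeof v2);
  while (n > 0)
    {
      size_t vl = n < VLMAX ? n : VLMAX; /* vsetvli a5,a3,e8,m1,...  */
      memcpy (v1, a, vl);                /* vle8.v  v1,0(a1)  */
      memcpy (v2, b, vl);                /* vle8.v  v2,0(a2)  */
      emu_vrem (v1, v1, v2, vl, tp);     /* vrem.vv v1,v1,v2  */
      memcpy (dst, v1, vl);              /* vse8.v  v1,0(a0)  */
      a += vl;
      b += vl;
      dst += vl;
      n -= vl;
    }
}
```

Because the store only writes the first vl elements, the contents of dst in this model are the same under both policies, which is the reason TA would be acceptable for this loop.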
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

--- Comment #16 from xuli1 at eswincomputing dot com ---
(In reply to rguent...@suse.de from comment #12)
> On Thu, 20 Jul 2023, juzhe.zhong at rivai dot ai wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
> >
> > --- Comment #11 from JuzheZhong ---
> > (In reply to rguent...@suse.de from comment #10)
> > > On Thu, 20 Jul 2023, juzhe.zhong at rivai dot ai wrote:
> > >
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
> > > >
> > > > --- Comment #9 from JuzheZhong ---
> > > > (In reply to rguent...@suse.de from comment #8)
> > > > > On Thu, 20 Jul 2023, juzhe.zhong at rivai dot ai wrote:
> > > > >
> > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
> > > > > >
> > > > > > --- Comment #6 from JuzheZhong ---
> > > > > > (In reply to rguent...@suse.de from comment #5)
> > > > > > > On Thu, 20 Jul 2023, kito at gcc dot gnu.org wrote:
> > > > > > >
> > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
> > > > > > > >
> > > > > > > > --- Comment #4 from Kito Cheng ---
> > > > > > > > > OK, so TA is either merge or all-ones.
> > > > > > > >
> > > > > > > > Yes, your understand is correct, just few more detail is
> > > > > > > > that can be mixing with either merge or all-ones.
> > > > > > > >
> > > > > > > > e.g.
> > > > > > > >
> > > > > > > > An 4 x i32 vector with mask 1 0 1 0
> > > > > > > >
> > > > > > > > Op   = | a | b | c | d |
> > > > > > > > Mask = | 1 | 0 | 1 | 0 |
> > > > > > > >
> > > > > > > > the result could be:
> > > > > > > > | a | b | c | d |
> > > > > > > > | a | all-1 | c | d |
> > > > > > > > | a | all-1 | c | all-1 |
> > > > > > > > | a | all-1 | c | d |
> > > > > > > >
> > > > > > > >
> > > > > > > > > Not sure how you can use MA at the moment since you
> > > > > > > > > specify an existing operand in your target hook.  As far
> > > > > > > > > as I can see there's no value the target hook can provide
> > > > > > > > > that matches any of the implementation semantics?
> > > > > > > >
> > > > > > > > That's the key point - we don't know how to return an
> > > > > > > > undefined value there, we have intrinsic can generate
> > > > > > > > undefined value, but it seems impossible to generate that
> > > > > > > > within the hook.
> > > > > > >
> > > > > > > Well, neither *A nor *U can be specified currently.  As said
> > > > > > > for 'merge' we would need another operand.  And since
> > > > > > > 'unspecified' is either merge or all-ones we can't express
> > > > > > > that either.  It's not really 'undefined' either.
> > > > > > >
> > > > > > > Note this also means the proposal to define a .MASK_LOAD as
> > > > > > > zeroing masked elements is not going to work for RISC-V,
> > > > > > > instead we'd need an explicit 'else' value there as well.
> > > > > > >
> > > > > > > In fact we could follow .MASK_LOAD for .COND_* and simply
> > > > > > > omit the 'else' operand for the case of 'unspecified', no?
> > > > > > > GIMPLE would be fine omitting it, not sure whether there's
> > > > > > > precedent for optabs with optional operands?
> > > > > >
> > > > > > For RVV auto-vectorization, we define COND_LEN_* has else value
> > > > > > in the arguments. But the else value is not always the real
> > > > > > value we need to care about, this is the code from
> > > > > > vectorizable_operation:
> > > > > >
> > > > > >   if (reduc_idx >= 0)
> > > > > >     {
> > > > > >       /* Perform the operation on active elements only and take
> > > > > >          inactive elements from the reduction chain input.  */
> > > > > >       gcc_assert (!vop2);
> > > > > >       vops.quick_push (reduc_idx == 1 ? vop1 : vop0);
> > > > > >     }
> > > > > >   else
> > > > > >     {
> > > > > >       auto else_value = targetm.preferred_else_value
> > > > > >         (cond_fn, vectype, vops.length () - 1, &vops[1]);
> > > > > >       vops.quick_push (else_value);
> > > > > >     }
> > > > > >
> > > > > >
> > > > > > You can see for reduction operations, the else value is the
> > > > > > real value we need to depend on, we should use "TU"
> > > > > > (Undisturbed or merge value) in RVV.
> > > > > > Meaning the inactive elements should remain the "old" value
> > > > > > that's why we use "TU".
> > > > >
> > > > > Sure.  For the above case that's obviously correct.
> > > > >
> > > > > > However, for single binary operations for example, division,
> > > > > > we just only need to forbid the division operations of the
> > > > > > inactive elements in the hardware, we don't care the value of
> > > > > > the inactive elements value. so in this case, we want to use
> > > > > > "TA"
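
The masking semantics discussed in this thread can be sketched in plain C. This is only an illustrative scalar model, not GCC or hardware code: cond_div and valid_mask_agnostic_result are hypothetical helpers invented for the sketch. The first shows a conditional division with an explicit 'else' value, where inactive lanes are never divided. The second checks whether a result is one that a mask-agnostic operation may produce, following the description above that each inactive element may independently keep its old value or become all-ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a conditional vector division: active lanes
   get a[i] / b[i]; inactive lanes take else_val[i] and are never divided,
   so a zero divisor in an inactive lane is harmless.  */
static void
cond_div (int32_t *res, const bool *mask, const int32_t *a,
          const int32_t *b, const int32_t *else_val, size_t n)
{
  for (size_t i = 0; i < n; i++)
    res[i] = mask[i] ? a[i] / b[i] : else_val[i];
}

/* Returns true if RES is a result that a mask-agnostic operation may
   produce: active lanes must equal COMPUTED, while each inactive lane may
   independently hold either its OLD value or all-ones.  */
static bool
valid_mask_agnostic_result (const int32_t *res, const bool *mask,
                            const int32_t *computed, const int32_t *old,
                            size_t n)
{
  assert (res && mask && computed && old);
  for (size_t i = 0; i < n; i++)
    {
      if (mask[i])
        {
          if (res[i] != computed[i])
            return false;
        }
      else if (res[i] != old[i] && res[i] != -1)
        return false;
    }
  return true;
}
```

In this model, passing the old destination as else_val corresponds to the undisturbed ("merge") behaviour, and that result is also accepted by the agnostic check, since merging is one of the outcomes the agnostic policy allows.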
[Bug target/111076] RISC-V: segmentation fault during RTL pass: shorten (debug build)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111076

xuli1 at eswincomputing dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xuli1 at eswincomputing dot com

--- Comment #1 from xuli1 at eswincomputing dot com ---
This issue has been resolved.
https://github.com/gcc-mirror/gcc/commit/9f8d1d448e6c10fbad3bb41f4d7322fac8df4cd0
[Bug target/109725] [14 Regression] ICE: RTL check: expected code 'const_int', have 'reg' in riscv_print_operand, at config/riscv/riscv.cc:4430
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109725

xuli1 at eswincomputing dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xuli1 at eswincomputing dot com

--- Comment #4 from xuli1 at eswincomputing dot com ---
The gcc-13 branch has the same issue
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111161). Can I backport this
patch to gcc-13?
[Bug target/109725] [14 Regression] ICE: RTL check: expected code 'const_int', have 'reg' in riscv_print_operand, at config/riscv/riscv.cc:4430
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109725

--- Comment #5 from xuli1 at eswincomputing dot com ---
(In reply to xu...@eswincomputing.com from comment #4)
> The gcc-13 branch also has the same issue
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111161), can I backport this
> patch to gcc-13?

@kito.ch...@gmail.com @juzhe.zh...@rivai.ai @dimi...@gcc.gnu.org
[Bug target/111161] [13 Regression] ICE: RTL check: expected code 'const_int', have 'reg' in riscv_print_operand, at config/riscv/riscv.cc:4394 during build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111161

xuli1 at eswincomputing dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xuli1 at eswincomputing dot com

--- Comment #1 from xuli1 at eswincomputing dot com ---
Backported
https://github.com/gcc-mirror/gcc/commit/7f26e76c9848aeea9ec10ea701a6168464a4a9c2
to gcc-13; this should be fixed now.
[Bug target/111412] New: [release/gcc13 bug]RISC-V:ICE in phase 6 of vsetvl pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111412

            Bug ID: 111412
           Summary: [release/gcc13 bug]RISC-V:ICE in phase 6 of vsetvl
                    pass
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: xuli1 at eswincomputing dot com
  Target Milestone: ---

Created attachment 55899
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55899&action=edit
testcase

Compile the code using -march=rv64gcv -mabi=lp64d -O2:

test:
        beq     a0,zero,.L2
        lui     a5,%hi(.LC0)
        flw     fa4,%lo(.LC0)(a5)
        fmv.s.x fa5,zero
        li      a3,1
        vfmv.v.f        v1,fa4    --> No vsetvl instruction is emitted before
                                      this, causing an illegal instruction
                                      (core dumped)
.L5:
        slli    a5,a3,32
        srli    a5,a5,32
        vsetvli a5,a5,e32,m8,ta,mu
        beq     a5,zero,.L3
        mv      a4,a3
        .

Solution:
The vsetvl pass was refactored in GCC 14, and its optimization process is
more reasonable than in GCC 13, so this problem does not exist in GCC 14.
Phase 6 in GCC 13 is an optimization-only phase. It was not designed to cover
every case and leaves some hidden bugs, so we decided to remove it.
The generated code may contain some redundancy, but the program will be
correct.
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

xuli1 at eswincomputing dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #45 from xuli1 at eswincomputing dot com ---
Verified
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

xuli1 at eswincomputing dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #46 from xuli1 at eswincomputing dot com ---
closed
[Bug target/111533] [14 Regression] ICE: RTL check: expected code 'reg', have 'const_int' in rhs_regno, at rtl.h:1934
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111533

xuli1 at eswincomputing dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xuli1 at eswincomputing dot com

--- Comment #1 from xuli1 at eswincomputing dot com ---
Hi Patrick,
I can't reproduce your problem. My steps are as follows:

cd riscv_gnu_toolchian
../configure --with-arch=rv64gc --with-abi=lp64d --enable-multilib --enable-gcc-checking=rtl
make -j32

Am I doing the right thing?
[Bug target/111533] [14 Regression] ICE: RTL check: expected code 'reg', have 'const_int' in rhs_regno, at rtl.h:1934
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111533

--- Comment #3 from xuli1 at eswincomputing dot com ---
The problem has been reproduced, thank you.
[Bug target/117283] [RISC-V] Miscompilation triggered by `__riscv_vsseg7e32_v_i32m1x7`, GCC 14.2.0 at `-O1`, `-O2`, `-O3`, and `-Os`.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117283

--- Comment #4 from xuli1 at eswincomputing dot com ---
(In reply to xu...@eswincomputing.com from comment #3)
> (In reply to Yibo He from comment #1)
> > The data initialization is long, because I find that this bug is triggered
> > when long data input. If anyone has a better submission format for code like
> > this, please let me know.
>
> Could you please provide a more detailed description? Have you analyzed
> which specific part of the assembly code caused the error?

I think you should use __riscv_vssseg7e32_v_i32m1x7 instead of
__riscv_vsseg7e32_v_i32m1x7.

__riscv_vsseg7e32_v_i32m1x7(ptr_d, vb, vl);
->
__riscv_vssseg7e32_v_i32m1x7(ptr_d,1, vb, vl);
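
For readers comparing the two intrinsics, the following is a minimal scalar sketch in C of how the unit-stride and strided segment stores are assumed to address memory. It is not the intrinsic implementation: emu_vssseg7e32, emu_vsseg7e32, NF and MAXVL are hypothetical names for this sketch, and the model assumes that the stride argument is a byte offset between the starts of consecutive segments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NF 7    /* fields per segment, as in vsseg7 / vssseg7 */
#define MAXVL 8 /* hypothetical upper bound on vl for this model */

/* Strided segment store model: segment i starts at base + i * bstride
   (BSTRIDE is in bytes) and its NF 32-bit fields are stored contiguously.
   fields[f][i] is element i of field register f.  */
static void
emu_vssseg7e32 (unsigned char *base, ptrdiff_t bstride,
                int32_t fields[NF][MAXVL], size_t vl)
{
  assert (vl <= MAXVL);
  for (size_t i = 0; i < vl; i++)
    for (size_t f = 0; f < NF; f++)
      memcpy (base + (ptrdiff_t) i * bstride + f * sizeof (int32_t),
              &fields[f][i], sizeof (int32_t));
}

/* Unit-stride segment store model: segments are packed back to back, which
   in this model equals the strided form with a stride of one whole
   segment (NF * 4 bytes).  */
static void
emu_vsseg7e32 (unsigned char *base, int32_t fields[NF][MAXVL], size_t vl)
{
  emu_vssseg7e32 (base, (ptrdiff_t) (NF * sizeof (int32_t)), fields, vl);
}
```

Under these assumptions the two forms differ only in where each segment starts, so the stride value chosen by the caller determines whether segments are packed, padded, or overlapping.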
[Bug target/117283] [RISC-V] Miscompilation triggered by `__riscv_vsseg7e32_v_i32m1x7`, GCC 14.2.0 at `-O1`, `-O2`, `-O3`, and `-Os`.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117283

xuli1 at eswincomputing dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xuli1 at eswincomputing dot com

--- Comment #3 from xuli1 at eswincomputing dot com ---
(In reply to Yibo He from comment #1)
> The data initialization is long, because I find that this bug is triggered
> when long data input. If anyone has a better submission format for code like
> this, please let me know.

Could you please provide a more detailed description? Have you analyzed which
specific part of the assembly code caused the error?