[Bug target/119832] RISC-V: Redundant floating point rounding mode store/restore

2025-04-16 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119832 --- Comment #4 from Li Pan --- (In reply to Kito Cheng from comment #1) > Created attachment 61135 [details] > 0001-RISC-V-Implement-TARGET_MODE_CONFLUENCE.patch > > My working patch for this bug I think TARGET_MODE_CONFLUENCE should be a bett

[Bug target/119832] RISC-V: Redundant floating point rounding mode store/restore

2025-04-16 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119832 --- Comment #2 from Li Pan --- More details. 7 --- mode change from 10 -> 7 // NONE => DYN <<< 8 --- restore mode is dyn and prev is not call 9 --- mode change from 7 -> 9 // DYN => CALL 10 --- mode change from 10 -> 9 // NONE =>

[Bug tree-optimization/119757] [15 regression] RISC-V: ICE when building curl-8.13.0 (in operator[], at vec.h:910)

2025-04-13 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119757 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #8 from L

[Bug target/119581] Failure to use vector vandn instruction on RISC-V

2025-04-01 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119581 --- Comment #3 from Li Pan --- (In reply to Jeffrey A. Law from comment #2) > Thanks Pan. I've got an intern working in this space, and this may be a > good exercise for them. So definitely reach out before you dive in to see > if she's gotten

[Bug target/119581] Failure to use vector vandn instruction on RISC-V

2025-04-01 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119581 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #1 from L

[Bug rtl-optimization/119554] [risc-v][bug] Unusual Behavior Observed with RISC-V Vector Extension (RVV)

2025-04-01 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119554 Li Pan changed: What|Removed |Added CC||jeffreyalaw at gmail dot com,

[Bug target/119547] RISC-V: VSETVL mistakenly modified other data

2025-03-31 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547 Li Pan changed: What|Removed |Added CC||jeffreyalaw at gmail dot com,

[Bug target/119114] [14/15 regression] RISC-V: miscompile at -O3 since r14-4077-g86451305d8b

2025-03-09 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114 --- Comment #19 from Li Pan --- > No you got it wrong. > _121 will either be -1 or 0. _11 should be -1 or 0 too. > So the question is what was the VEC_EXTRACT doing the right thing? Is it 0/-1 > or 0/1? Oh, I see. Let me revisit the dump code

[Bug target/119114] [14/15 regression] RISC-V: miscompile at -O3 since r14-4077-g86451305d8b

2025-03-07 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114 --- Comment #12 from Li Pan --- (In reply to Robin Dapp from comment #9) > I suspect the problem lies somewhere here: > > _11 = .VEC_EXTRACT (mask__83.22_110, 0); > _23 = MEM[(short int *)&t + 20B]; > _24 = _23 & _132; > _25 = _24 != 0;

[Bug target/119114] [14/15 regression] RISC-V: miscompile at -O3 since r14-4077-g86451305d8b

2025-03-07 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114 --- Comment #8 from Li Pan --- 252 │ vect__81.20_52 = vect_cst__142 & _164; // {3} 253 │ mask__82.21_53 = vect__81.20_52 != { 0, 0, 0, 0, 0, 0, 0, 0 };// 0xff 254 │ _31 = mask__82.21_53 ^ mask__57.18_81; // 0xff 255 │ mask__8

[Bug target/119114] [14/15 regression] RISC-V: miscompile at -O3 since r14-4077-g86451305d8b

2025-03-07 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114 --- Comment #7 from Li Pan --- Yes, double checked, the result of tree.optimized looks right, details as below. Then should be a backend issue now. will take a look into it. 206 │[local count: 56478818]: 207 │ _114 = MEM[(short int

[Bug target/119114] [14/15 regression] RISC-V: miscompile at -O3 since r14-4077-g86451305d8b

2025-03-07 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114 --- Comment #5 from Li Pan --- (In reply to Robin Dapp from comment #4) > Very weird indeed. It looks like we're not even vectorizing? I mean, sure, > we use vector instructions but they are all broadcast from scalars? > (VMAT_INVARIANT) And

[Bug target/119114] [14/15 regression] RISC-V: miscompile at -O3 since r14-4077-g86451305d8b

2025-03-06 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114 --- Comment #2 from Li Pan --- Tweak test case for easy locating. 1 │ int b[18]; 2 │ long long al; 3 │ _Bool e; 4 │ char f = 010; 5 │ short t[18]; 6 │ 7 │ unsigned w[8][18][18][18]; 8 │ _Bool a; 9 │

[Bug target/119114] [14/15 regression] RISC-V: miscompile at -O3 since r14-4077-g86451305d8b

2025-03-06 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114 --- Comment #3 from Li Pan --- The related asm looks abnormal up to a point, there should be a reduce insn for a but actually not, the insn and flow may looks like below. 114 │1028c: cc847057vsetivlizero,8,e16,m1,ta,ma

[Bug target/119114] [14/15 regression] RISC-V: miscompile at -O3 since r14-4077-g86451305d8b

2025-03-05 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #1 from L

[Bug target/118931] [15 Regression] RISC-V: rv64gcv miscompile at -O[23] since r15-3228-g771256bcb9d

2025-02-21 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118931 --- Comment #3 from Li Pan --- This is a bug in the interleaved_stepped handling of expand_const_vector: base + i*step for the base1 series may overflow, and the overflowed bits are then ORed with the base2 series into the final result. I will prepare a fix for this.
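The overflow described in comment #3 can be illustrated with a small scalar sketch. This is not the GCC implementation: the helper names pack_pair_naive and pack_pair_masked are hypothetical, and the sketch assumes two 8-bit stepped series packed into one 16-bit lane, with series 1 in the low byte and series 2 in the high byte. The masked variant is only one possible repair, shown for contrast.

```c
#include <stdint.h>

/* Hypothetical helper (not from GCC): build pair i of an interleaved
   constant as one 16-bit lane.  Series 1 occupies the low byte and
   series 2 the high byte.  The low series is computed at full width,
   so a carry out of bit 7 leaks into the high byte.  */
uint16_t pack_pair_naive(uint8_t base1, uint8_t step1,
                         uint8_t base2, uint8_t step2, unsigned i)
{
  uint16_t lo = (uint16_t)(base1 + i * step1);        /* may exceed 0xff */
  uint16_t hi = (uint16_t)((base2 + i * step2) << 8);
  return (uint16_t)(lo | hi);
}

/* Same construction, but the low series is truncated to 8 bits before
   the OR, so the overflowed bits cannot pollute the high byte.  */
uint16_t pack_pair_masked(uint8_t base1, uint8_t step1,
                          uint8_t base2, uint8_t step2, unsigned i)
{
  uint16_t lo = (uint16_t)((base1 + i * step1) & 0xffu);
  uint16_t hi = (uint16_t)((base2 + i * step2) << 8);
  return (uint16_t)(lo | hi);
}
```

With base1 = 250, step1 = 3, base2 = 0, step2 = 1, the two helpers agree for i = 1 (both give 0x01fd). For i = 2 the low series reaches 256, so the naive version returns 0x0300 (high byte 3 instead of 2) while the masked version returns 0x0200.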

[Bug target/118931] [15 Regression] RISC-V: rv64gcv miscompile at -O[23] since r15-3228-g771256bcb9d

2025-02-20 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118931 --- Comment #2 from Li Pan --- 13 │ int main () 14 │ { 15 │ vector(16) unsigned char vect__3.5; 16 │ unsigned char a_lsm.2; 17 │ long long int _5; 18 │ vector(16) unsigned char _13; 19 │ unsigned char _29;

[Bug target/118931] [15 Regression] RISC-V: rv64gcv miscompile at -O[23] since r15-3228-g771256bcb9d

2025-02-20 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118931 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #1 from L

[Bug target/118949] [15 regression] RISC-V: Extra FRM writes since GCC-14.2 since r15-5943-gdc0dea98c96e02

2025-02-20 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118949 --- Comment #5 from Li Pan --- Thanks Vineet, update another case with explicit convert. It is unrelated to the global_reg change. 1 │ #define T float 2 │ 3 │ void func(const T * restrict a, const T * restrict b, 4 │

[Bug target/118949] RISC-V: Extra FRM writes since GCC-14.2

2025-02-20 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118949 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #2 from L

[Bug tree-optimization/116351] RISC-V ICE: in get_len_load_store_mode, at optabs-tree.cc:664

2025-02-17 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116351 --- Comment #5 from Li Pan --- (In reply to Li Pan from comment #4) > I see, I worked out another fix that is under testing, will send it out if > no surprise from test and see. My fix does not seem quite correct; I will try yours instead.

[Bug tree-optimization/116351] RISC-V ICE: in get_len_load_store_mode, at optabs-tree.cc:664

2025-02-17 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116351 --- Comment #4 from Li Pan --- I see, I worked out another fix that is under testing, will send it out if no surprise from test and see.

[Bug tree-optimization/116351] RISC-V ICE: in get_len_load_store_mode, at optabs-tree.cc:664

2025-02-16 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116351 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #2 from L

[Bug target/118540] RISC-V: ICE for unsupported target attribute

2025-02-14 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118540 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #1 from L

[Bug target/118832] RISC-V: internal compiler error: could not split insn, with V+Zbb enabled

2025-02-13 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832 --- Comment #9 from Li Pan --- (In reply to Robin Dapp from comment #8) > I think for vec_duplicate the idea is the same as for all the other splits - > keep it in simple shape so we can combine/fwprop etc. It also helps > converting e.g. > >

[Bug target/118832] RISC-V: internal compiler error: could not split insn, with V+Zbb enabled

2025-02-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832 --- Comment #7 from Li Pan --- Thanks Jeff and Robin, that makes much sense to me. However, I got a little confused about the vec_duplicate with define_insn_and_split. As I learned, define_insn_and_split equals define_insn + define_split + defi

[Bug target/118832] RISC-V: internal compiler error: could not split insn, with V+Zbb enabled

2025-02-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832 Li Pan changed: What|Removed |Added CC||juzhe.zhong at rivai dot ai --- Comment #2 fro

[Bug target/118832] RISC-V: internal compiler error: could not split insn, with V+Zbb enabled

2025-02-11 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #1 from L

[Bug target/118103] [15 Regression] GCC miscompile rvv intrinsics at `-O3`, missing the `fsrm` instruction to the recover status of frm CSR

2025-01-25 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103 --- Comment #12 from Li Pan --- (In reply to Li Pan from comment #10) > (In reply to Vineet Gupta from comment #8) > > A fix for PR/118464 is posted to list [1] which also cures this issue. > > > > [1] https://gcc.gnu.org/pipermail/gcc-patches/

[Bug target/118103] [15 Regression] GCC miscompile rvv intrinsics at `-O3`, missing the `fsrm` instruction to the recover status of frm CSR

2025-01-24 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103 --- Comment #11 from Li Pan --- TARGET_CONDITIONAL_REGISTER_USAGE can help resolve this issue; let me run the regression tests. It also looks like we no longer need emit_volatile_frm here, but that is a separate refactor for later.

[Bug target/118103] [15 Regression] GCC miscompile rvv intrinsics at `-O3`, missing the `fsrm` instruction to the recover status of frm CSR

2025-01-24 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103 --- Comment #10 from Li Pan --- (In reply to Vineet Gupta from comment #8) > A fix for PR/118464 is posted to list [1] which also cures this issue. > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674498.html Thanks Vineet, it se

[Bug target/118103] [15 Regression] GCC miscompile rvv intrinsics at `-O3`, missing the `fsrm` instruction to the recover status of frm CSR

2025-01-24 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103 --- Comment #9 from Li Pan --- (In reply to Richard Sandiford from comment #7) > The problem seems to be in the modelling of the FRM register. > CALL_USED_REGISTERS says that the register is call-clobbered/caller-save, > which means: > > (a) i

[Bug target/118103] [15 Regression] GCC miscompile rvv intrinsics at `-O3`, missing the `fsrm` instruction to the recover status of frm CSR

2025-01-24 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103 --- Comment #5 from Li Pan --- bisect late-combine2 results in some invalid combine here for main function. (insn 40 5 41 2 (set (reg:SI 11 a1 [151]) (reg:SI 69 frm)) "pr118103-simple.c":67:15 2712 {frrmsi} (nil)) (insn 41 40 7 2 (set (reg

[Bug target/118103] [15 Regression] GCC miscompile rvv intrinsics at `-O3`, missing the `fsrm` instruction to the recover status of frm CSR

2025-01-23 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103 --- Comment #4 from Li Pan --- gcc-14 has the correct behavior and mostly some middle-end change I guess. └─(11:39:07 on master⚑ ✭)──> riscv64-linux-gnu-gcc-14 --version riscv64-linux-gnu-gcc-14 (Ubuntu 14.2.0-4ubuntu2~24.04) 1

[Bug target/118103] [15 Regression] GCC miscompile rvv intrinsics at `-O3`, missing the `fsrm` instruction to the recover status of frm CSR

2025-01-23 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103 --- Comment #3 from Li Pan --- Interesting the test_example in a separate function other than main will have the frm restore insn, but there will be no such frm in main function. 62 │ test_exampe: 63 │ frrma2 64 │ fsrmi

[Bug target/118103] [15 Regression] GCC miscompile rvv intrinsics at `-O3`, missing the `fsrm` instruction to the recover status of frm CSR

2025-01-23 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103 --- Comment #2 from Li Pan --- (In reply to Li Pan from comment #1) > Ack, let me try to reproduce this. Reproduced: with compute inlined, the FRM restore is deleted somewhere. I will look into it.

[Bug target/118103] [15 Regression] GCC miscompile rvv intrinsics at `-O3`, missing the `fsrm` instruction to the recover status of frm CSR

2025-01-23 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #1 from L

[Bug target/117688] [15 Regression] RISC-V: Wrong code for .SAT_SUB

2025-01-19 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117688 --- Comment #7 from Li Pan --- Reproduced and will prepare a fix patch for this.

[Bug target/117688] [15 Regression] RISC-V: Wrong code for .SAT_SUB

2025-01-19 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117688 --- Comment #6 from Li Pan --- Ack, this looks like a code-gen issue in the RISC-V backend; let me try to reproduce it on QEMU and a dev board.

[Bug target/118075] [15 Regression] RISC-V: Miscompile at -O3 zvl 256 since r15-4746-g30435cc2610

2024-12-17 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118075 --- Comment #2 from Li Pan --- Ack and reproduced. From a rough look, it should be a memory alias problem with the strided store, because disabling the sch fixes it. I will take care of it.

[Bug target/117878] RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-13 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878 --- Comment #12 from Li Pan --- (In reply to Robin Dapp from comment #11) > I'm not really sure. For now I hope not. If we hit similar problems again > that are not easily fixable we can reconsider. Sure thing.

[Bug target/117878] RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-13 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878 --- Comment #10 from Li Pan --- (In reply to Robin Dapp from comment #9) > Should be fixed. Thanks Robin for fixing this, do we still need to do something like ix86_pre_reload_split for the risc-v backend? Which avoid the the define expand to b

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #4 from Li Pan --- (In reply to Li Pan from comment #3) >1 │ #include >2 │ >3 │ #define I_P1 16 >4 │ #define I_P2 1344 >5 │ >6 │ #define HADAMARD4(d0, d1, d2, d3, s0, s1, s2, s3) {\ >7 │

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #3 from Li Pan --- 1 │ #include 2 │ 3 │ #define I_P1 16 4 │ #define I_P2 1344 5 │ 6 │ #define HADAMARD4(d0, d1, d2, d3, s0, s1, s2, s3) {\ 7 │ int t0 = s0 + s1;\ 8 │ int t1 = s0 - s1;\

[Bug target/117990] [15 regression] RISC-V: Miscompile at -O3 zvl 256 since r15-4746-g30435cc2610

2024-12-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990 --- Comment #8 from Li Pan --- The gather load has involved this (mem:BLK (scratch)) already, thus it doesn't have this problem. BTW, does alias analysis support the complicated scenario like strided/index load (I bet we may need more info to f

[Bug target/117990] [15 regression] RISC-V: Miscompile at -O3 zvl 256 since r15-4746-g30435cc2610

2024-12-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990 --- Comment #6 from Li Pan --- Add (mem:BLK (scratch)) to strided load define_insn can help to fix this issue, as (mem:BLK (scratch)) is considered to alias all other memories. In theory, we can do even more accurate alias analysis here, like v

[Bug target/117990] [15 regression] RISC-V: Miscompile at -O3 zvl 256 since r15-4746-g30435cc2610

2024-12-10 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990 --- Comment #5 from Li Pan --- The tree optimized looks right up to a point. 5 │ int main () 6 │ { 7 │ vector(8) int vect__4.8; 8 │ vector(8) char vect__3.7; 9 │ vector(8) char D.2823; 10 │ int _5; 11 │

[Bug target/117990] [15 regression] RISC-V: Miscompile at -O3 zvl 256 since r15-4746-g30435cc2610

2024-12-10 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990 --- Comment #4 from Li Pan --- Another example to reproduce this. 1 │ #define STEP 10 2 │ 3 │ char d[225]; 4 │ int e[STEP]; 5 │ 6 │ int main() { 7 │ for (long h = 0; h < STEP; ++h) 8 │ d[h * STEP] =

[Bug target/117990] [15] RISC-V: Miscompile at -O3 zvl 256 since r15-4746-g30435cc2610

2024-12-10 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990 --- Comment #2 from Li Pan --- (In reply to Patrick O'Neill from comment #1) > -flto can be replaced with -fwhole-program: > > -march=rv64gcv_zvl256b -fwhole-program -O3 -mrvv-vector-bits=zvl test.c -o > user-config.out Confirmed, reproduced b

[Bug target/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

2024-12-10 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722 --- Comment #17 from Li Pan --- (In reply to Vineet Gupta from comment #14) > (In reply to Li Pan from comment #7) > > Created attachment 59661 [details] > > with usad pattern > > Can you please post the patch, lest we duplicate your effort. >

[Bug target/117878] RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-09 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878 --- Comment #7 from Li Pan --- This insn is introduced during reload when lra_constraints. There will be const vector like: (const_vector:V8QI [ (const_int 4 [0x4]) (const_int 12 [0xc]) (const_int 5 [0x5])

[Bug target/117878] RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-04 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878 --- Comment #6 from Li Pan --- 526.blender_r is quite long to tell, will leverage below code to investigate which can reproduce this issue too. 1 │ int *b; 2 │ inline void c(char *d, int e) { 3 │ d[0] = 0; 4 │ d[1] = e;

[Bug target/117878] RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-02 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878 --- Comment #4 from Li Pan --- (In reply to Robin Dapp from comment #3) > Generally, yes, I guess. But I'd like to understand better what exactly is > going wrong. Shouldn't emitting those "pre-RA" insns already be guarded > properly? I haven

[Bug target/117878] RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-02 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878 --- Comment #2 from Li Pan --- (In reply to Robin Dapp from comment #1) > Is this related to PR117353? Seems very similar. Yes, very similar but ice at different pass. The similar approach like ix86_pre_reload_split can fix the code example i

[Bug c/117878] New: RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-02 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878 Bug ID: 117878 Summary: RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: no

[Bug tree-optimization/88603] optimization missed for saturation arithmetic add

2024-11-26 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88603 --- Comment #4 from Li Pan --- (In reply to Andrew Pinski from comment #3) > We don't recongize saturation_add in comment #0 as a SAT_ADD still. Yes, the form like convert to widen for overflow checking is not supported for now. I will take care

[Bug middle-end/112600] Failed to optimize saturating addition using __builtin_add_overflow

2024-11-25 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600 --- Comment #26 from Li Pan --- (In reply to Uroš Bizjak from comment #25) > (In reply to Li Pan from comment #24) > > > Does upstream still have the issue mentioned above? If not, I'll add some > > test cases i386. > The issue looks fixed, f2

[Bug middle-end/112600] Failed to optimize saturating addition using __builtin_add_overflow

2024-11-21 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600 --- Comment #24 from Li Pan --- (In reply to Uroš Bizjak from comment #22) > (In reply to Li Pan from comment #21) > > > Looks the f2 can vectorized to sat_add from upstream now, may be impacted by > > recent changes. Let me add one test for th

[Bug target/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

2024-11-21 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722 --- Comment #12 from Li Pan --- (In reply to Robin Dapp from comment #11) > (In reply to Li Pan from comment #9) > > Created attachment 59663 [details] > > before_vs_after when outer loop is 128 > > Ok, that's a different loop then. I'm seeing

[Bug target/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

2024-11-21 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722 --- Comment #9 from Li Pan --- Created attachment 59663 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59663&action=edit before_vs_after when outer loop is 128

[Bug target/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

2024-11-21 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722 --- Comment #7 from Li Pan --- Created attachment 59661 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59661&action=edit with usad pattern

[Bug target/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

2024-11-21 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722 --- Comment #6 from Li Pan --- Created attachment 59660 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59660&action=edit upstream

[Bug tree-optimization/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

2024-11-20 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722 --- Comment #2 from Li Pan --- Take x86_64 perf data for 625 base, x264_pixel_satd_8x4 is the hottest func. Children Self Command Shared Object Symbol + 19.26%18.96% x264_s_base.non x264_s_base.none [.] x264_pixel

[Bug target/117594] [15] RISC-V: Miscompile at -O3

2024-11-14 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117594 --- Comment #4 from Li Pan --- I can reproduce this. └─(07:29:53 on master⚑ ✭)──> QEMU_CPU=rv64,vlen=128,rvv_ta_all_1s=true,rvv_ma_all_1s=true,v=true,vext_spec=v1.0 ~/bin/qemu/bin/qemu-riscv64 test.elf

[Bug tree-optimization/113583] Main loop in 519.lbm not vectorized.

2024-11-11 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #20 from

[Bug middle-end/112600] Failed to optimize saturating addition using __builtin_add_overflow

2024-11-11 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600 --- Comment #23 from Li Pan --- (In reply to Uroš Bizjak from comment #22) > (In reply to Li Pan from comment #21) > > > Looks the f2 can vectorized to sat_add from upstream now, may be impacted by > > recent changes. Let me add one test for th

[Bug middle-end/112600] Failed to optimize saturating addition using __builtin_add_overflow

2024-11-11 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600 --- Comment #21 from Li Pan --- (In reply to Li Pan from comment #20) > (In reply to Li Pan from comment #19) > > interesting, I will take a look for f2 after some more sat_* supported. > > RISC-V backend works well for all of above pattern but

[Bug target/116655] RISC-V: ICE with -mrvv-max-lmul=dynamic in compute_nregs_for_mode

2024-10-16 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116655 --- Comment #5 from Li Pan --- (In reply to Robin Dapp from comment #4) > Fixed. Thanks Robin, this also fixed the spec17 build failures as below for dynamic. Build errors for intrate: 502.gcc_r(base; CE), 525.x264_r(base; CE), 557.xz_r(base;

[Bug c++/116064] [15 Regression] SPEC 2017 523.xalancbmk_r failed to build

2024-10-15 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064 --- Comment #15 from Li Pan --- (In reply to Li Pan from comment #14) > > So you have to use one of those two. > > Thanks, I see, let me update the config file and have another try. -Wno-error=template-body works, thanks a lot.

[Bug c++/116064] [15 Regression] SPEC 2017 523.xalancbmk_r failed to build

2024-10-15 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064 --- Comment #14 from Li Pan --- > So you have to use one of those two. Thanks, I see, let me update the config file and have another try.

[Bug c++/116064] [15 Regression] SPEC 2017 523.xalancbmk_r failed to build

2024-10-15 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #12 from

[Bug middle-end/112600] Failed to optimize saturating addition using __builtin_add_overflow

2024-10-14 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600 --- Comment #20 from Li Pan --- (In reply to Li Pan from comment #19) > interesting, I will take a look for f2 after some more sat_* supported. RISC-V backend works well for all of above pattern but x86 failed on f2, let me dig more details for

[Bug middle-end/112600] Failed to optimize saturating addition using __builtin_add_overflow

2024-10-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600 --- Comment #19 from Li Pan --- interesting, I will take a look for f2 after some more sat_* supported.

[Bug target/116883] Compile cpp code with rv32imafc_zve32f failed

2024-10-08 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116883 --- Comment #3 from Li Pan --- I think xuli is working on this issue. As you know, the first week of Oct is the National Holiday.

[Bug tree-optimization/116861] [15 regression] ICE when building netpbm-11.2.10

2024-09-26 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116861 --- Comment #10 from Li Pan --- (In reply to Andrew Pinski from comment #9) > (In reply to Li Pan from comment #8) > > [0] psi ptr 0x7e2f8f00c000 > > [1] psi ptr 0x7e2f8f00c400 > > [2] psi ptr 0xa5a5a5a5a5a5a5a5 <=== Invalid. > > > > Looks som

[Bug tree-optimization/116861] [15 regression] ICE when building netpbm-11.2.10

2024-09-26 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116861 --- Comment #8 from Li Pan --- [0] psi ptr 0x7e2f8f00c000 [1] psi ptr 0x7e2f8f00c400 [2] psi ptr 0xa5a5a5a5a5a5a5a5 <=== Invalid. It looks like some gsi info is polluted during matching.

[Bug tree-optimization/116861] [15 regression] ICE when building netpbm-11.2.10

2024-09-26 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116861 --- Comment #7 from Li Pan --- Thanks all for reducing. Reproduced on my side; I will take a look soon.

[Bug tree-optimization/116795] [15 regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_ssa failed since r15-3708-g2545a1abb77bd6

2024-09-23 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116795 Li Pan changed: What|Removed |Added Status|RESOLVED|CLOSED --- Comment #10 from Li Pan --- Thanks

[Bug tree-optimization/116795] [15 regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_ssa failed since r15-3708-g2545a1abb77bd6

2024-09-23 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116795 --- Comment #8 from Li Pan --- (In reply to Sam James from comment #7) > (In reply to Li Pan from comment #6) > > (In reply to Sam James from comment #5) > > > Pan Li, if you set your email on Bugzilla to pa...@gcc.gnu.org, you will > > > get >

[Bug tree-optimization/116795] [15 regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_ssa failed since r15-3708-g2545a1abb77bd6

2024-09-23 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116795 --- Comment #6 from Li Pan --- (In reply to Sam James from comment #5) > Pan Li, if you set your email on Bugzilla to pa...@gcc.gnu.org, you will get > permissions to modify bugs :) Yes and Thanks. I can modify bugs, but could you please help t

[Bug target/116814] [15 Regression] ICE on libjack2-1.9.22: in expand_fn_using_insn, at internal-fn.cc:263

2024-09-23 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116814 --- Comment #1 from Li Pan --- Ack, thanks for reporting this. Should be introduced by this commit. https://github.com/gcc-mirror/gcc/commit/f2476a2649e9975d454d179145574c21d8218aee I am preparing a fix for this.

[Bug tree-optimization/116795] [15 regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_ssa failed since r15-3708-g2545a1abb77bd6

2024-09-21 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116795 --- Comment #2 from Li Pan --- Ack, and thanks for reporting, will take a look soon.

[Bug target/116278] [15] RISC-V: Miscompile at -O2 -fwrapv -fno-strict-aliasing

2024-08-17 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278 --- Comment #11 from Li Pan --- Thanks for the suggestion. I will move the run test to gcc/testsuite/gcc.c-torture/execute and leave only the asm check under riscv.

[Bug target/116280] [15 Regression] RISC-V: expected mode RVVMF8QI for operand 2 of insn pred_vwsllrvvmf4hi but got mode RVVMF2SI

2024-08-09 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116280 --- Comment #1 from Li Pan --- Looks like some typos in md files, let me take a look.

[Bug target/116278] [15] RISC-V: Miscompile at -O2 -fwrapv -fno-strict-aliasing

2024-08-07 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278 --- Comment #8 from Li Pan --- (In reply to Li Pan from comment #7) > The backend take > rtx xmode_x = gen_lowpart (Xmode, x); > > For the incoming op of .SAT_ADD, thus I think we should take lbu instead of > lb according to the ISA. During u

[Bug target/116278] [15] RISC-V: Miscompile at -O2 -fwrapv -fno-strict-aliasing

2024-08-07 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278 --- Comment #7 from Li Pan --- The backend takes rtx xmode_x = gen_lowpart (Xmode, x); for the incoming op of .SAT_ADD, thus I think we should use lbu instead of lb according to the ISA.
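
The lb versus lbu point in comment #7 can be illustrated with a minimal sketch. This is not the backend code or the reproducer from the bug. The helper names (zext_byte, sext_byte, usadd8_in_xlen) are hypothetical, XLEN is assumed to be 64, and the operand values are chosen only for illustration. It shows why an unsigned 8-bit saturating add computed in a full-width register needs a zero-extended operand (what lbu produces) rather than a sign-extended one (what lb produces).

```c
#include <assert.h>
#include <stdint.h>

/* What lbu would leave in a 64-bit register: upper bits cleared. */
uint64_t zext_byte(uint8_t b) {
    return (uint64_t)b;
}

/* What lb would leave in a 64-bit register: upper bits copy bit 7. */
uint64_t sext_byte(uint8_t b) {
    return (b & 0x80u) ? ((uint64_t)b | ~(uint64_t)0xFF) : (uint64_t)b;
}

/* Hypothetical unsigned 8-bit saturating add done on full-width registers.
   It is only correct when both inputs are already in the range 0..255,
   i.e. when they were zero-extended. */
uint8_t usadd8_in_xlen(uint64_t x, uint64_t y) {
    uint64_t sum = x + y;
    return sum > 0xFF ? (uint8_t)0xFF : (uint8_t)sum;
}

/* Illustrative check: 0xD8 is 216 unsigned, or -40 when read as signed. */
void check_extension_examples(void) {
    /* Zero-extended operand: 216 + 9 = 225, no saturation. */
    assert(usadd8_in_xlen(zext_byte(0xD8), 9) == 225);
    /* Sign-extended operand: the high bits make the sum look huge,
       so the add saturates even though 216 + 9 fits in 8 bits. */
    assert(usadd8_in_xlen(sext_byte(0xD8), 9) == 255);
}
```

Calling check_extension_examples() from a test driver exercises both cases. The sign-extended input yields a saturated 255 where 225 is expected, which is the kind of mismatch the comment attributes to using lb.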

[Bug target/116278] [15] RISC-V: Miscompile at -O2 -fwrapv -fno-strict-aliasing

2024-08-07 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278 --- Comment #6 from Li Pan --- (In reply to Andrew Pinski from comment #4) > lb a1,0(a5) // load -40 > lui a0,%hi(.LC0) > lui a4,%hi(c) > addi a5,a1,9 // a5 = -31 > slli a5,a5,48 >

[Bug target/116278] [15] RISC-V: Miscompile at -O2 -fwrapv -fno-strict-aliasing

2024-08-07 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278 --- Comment #5 from Li Pan --- Reproduced from both qemu and hardware, let me take a look.

[Bug target/116278] [15] RISC-V: Miscompile at -O2 -fwrapv -fno-strict-aliasing

2024-08-07 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278 --- Comment #3 from Li Pan --- (In reply to Kito Cheng from comment #2) > Hi Pan, could you take a look to see if it related to SAT_ADD? Ack, thanks.

[Bug target/116202] RISC-V: Miscompile at -O3 with zvl256b

2024-08-03 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116202 --- Comment #3 from Li Pan --- (In reply to Li Pan from comment #2) > Confirmed, thanks and will take care of it soon. Just prepared a fix, and will send it out if testing shows no surprises.

[Bug target/116202] RISC-V: Miscompile at -O3 with zvl256b

2024-08-03 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116202 --- Comment #2 from Li Pan --- Confirmed, thanks and will take care of it soon.

[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-29 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103 --- Comment #10 from Li Pan --- (In reply to Thomas Schwinge from comment #9) > (In reply to Li Pan from comment #7) > > confirm with you all related failures are covered. > > Yes, the testing state is restored to what it was before, thanks! >

[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-29 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103 --- Comment #7 from Li Pan --- Hi Thomas, Could you please double-check that the patch below fixes these asm check failures? https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658519.html I tested below cases for target=amdgcn-a

[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-28 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103 --- Comment #6 from Li Pan --- (In reply to Thomas Schwinge from comment #5) > (In reply to Li Pan from comment #3) > > best practice of cross > > compile gfx908 in x86 linux? > > If you only need the 'cc1' (and no assembler, linker, libc), the

[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-26 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103 --- Comment #3 from Li Pan --- Thanks Richard for the suggestion. Hi Thomas, could you please share the best practice for cross-compiling for gfx908 on x86 Linux? Then I can try following Richard's suggestion.

[Bug middle-end/115961] [15 Regression] wrong code on llvm-18.1.8 since r15-1936-g80e446e829d818 with bitfields less than the type mode precision

2024-07-16 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115961 --- Comment #5 from Li Pan --- Thanks Andrew Pinski. That makes a lot of sense to me, and I can reproduce this from upstream now. Let me file a patch for it.

[Bug middle-end/115863] [15 Regression] zlib-1.3.1 miscompilation since r15-1936-g80e446e829d818

2024-07-16 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863 --- Comment #14 from Li Pan --- Hi Uroš, > Please note two new instructions in the second asm dump. These are expanded > from .SAT_TRUNC and are not present in the first asm dump. > The problem here is that the presence of ustrunc{m}{n}2 optab

[Bug middle-end/115961] [15 Regression] wrong code on llvm-18.1.8 since r15-1936-g80e446e829d818 with bitfields less than the type mode precision

2024-07-16 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115961 --- Comment #3 from Li Pan --- Only x86 implemented the .SAT_TRUNC for scalar, so I bet it is almost the same as this https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863 ?

[Bug middle-end/115863] [15 Regression] zlib-1.3.1 miscompilation since r15-1936-g80e446e829d818

2024-07-15 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863 --- Comment #13 from Li Pan --- Thanks Richard and Bizjak. Got the point here, and let me have a try for the improvement.

[Bug middle-end/115863] [15 Regression] zlib-1.3.1 miscompilation since r15-1936-g80e446e829d818

2024-07-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863 --- Comment #8 from Li Pan --- (In reply to Richard Biener from comment #7) > (In reply to Uroš Bizjak from comment #6) > > Please note that w/o .SAT_TRUNC the compiler is able to optimize hot loop in > > compress2 to: > > > >[local count:
