[Bug target/112999] riscv: Infinite loop with mask extraction

2023-12-15 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112999 Robin Dapp changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/113249] RISC-V: regression testsuite errors -mtune=generic-ooo

2024-01-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113249 --- Comment #1 from Robin Dapp --- Yes, several (most?) of those are expected because the tests rely on the default latency model. One option is to hard code the tune in those tests. On the other hand the dump tests checking for a more or less

[Bug target/113281] [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc

2024-01-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281 --- Comment #2 from Robin Dapp --- Confirmed. Funny, we shouldn't vectorize that but really optimize to "return 0". Costing might be questionable but we also haven't optimized away the loop when comparing costs. Disregarding that, of course t

[Bug target/113247] RISC-V: Performance bug in SHA256 after enabling RVV vectorization

2024-01-09 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113247 --- Comment #1 from Robin Dapp --- Hmm, so I tried reproducing this and without a vector cost model we indeed vectorize. My qemu dynamic instruction count results are not as abysmal as yours but still bad enough (20-30% increase in dynamic inst

[Bug target/113247] RISC-V: Performance bug in SHA256 after enabling RVV vectorization

2024-01-09 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113247 --- Comment #3 from Robin Dapp --- Yes, sure and I gave a bit of detail why the values chosen there (same as aarch64) make sense to me. Using this generic vector cost model by default without adjusting the latencies is possible. I would be OK

[Bug target/113247] RISC-V: Performance bug in SHA256 after enabling RVV vectorization

2024-01-09 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113247 --- Comment #4 from Robin Dapp --- The other option is to assert that all tune models have at least a vector cost model rather than NULL... But not falling back to the builtin costs still makes sense.

[Bug target/113249] RISC-V: regression testsuite errors -mtune=generic-ooo

2024-01-09 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113249 --- Comment #4 from Robin Dapp --- > One of the reasons I've been testing things with generic-ooo is because > generic-ooo had initial vector pipelines defined. For cleaning up the > scheduler, I copied over the generic-ooo pipelines into generi

[Bug middle-end/112971] [14] RISC-V rv64gcv_zvl256b vector -O3: internal compiler error: Segmentation fault signal terminated program cc1

2024-01-10 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112971 --- Comment #22 from Robin Dapp --- Yes, going to the thread soon.

[Bug target/113247] RISC-V: Performance bug in SHA256 after enabling RVV vectorization

2024-01-10 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113247 --- Comment #9 from Robin Dapp --- I also noticed this (likely unwanted) vector snippet and wondered where it is being created. First I thought it's a vec_extract but doesn't look like it. I'm going to check why we create this. Pan, the test

[Bug c/113474] RISC-V: Fail to use vmerge.vim for constant vector

2024-01-18 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113474 --- Comment #1 from Robin Dapp --- Good catch. Looks like the ifn expander always forces into a register. That's probably necessary on all targets except riscv. diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index a07f25f3aee..e923051d5

[Bug rtl-optimization/113495] RISC-V: Time and memory awful consumption of SPEC2017 wrf benchmark

2024-01-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495 --- Comment #27 from Robin Dapp --- Following up on this: I'm seeing the same thing Patrick does. We create a lot of large non-sparse sbitmaps that amount to around 33G in total. I did local experiments replacing all sbitmaps that are not nee
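As a rough illustration of why many large dense bitmaps add up quickly, here is a minimal sketch. It assumes a dense bitmap stores one bit per possible index in 64-bit words; the helper names are hypothetical and are not GCC's sbitmap API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for one dense bitmap that can hold
   n_bits bits, rounded up to whole 64-bit words.  */
size_t dense_bitmap_bytes(size_t n_bits)
{
    size_t words = (n_bits + 63) / 64;
    return words * sizeof(uint64_t);
}

/* Hypothetical helper: total bytes for n_bitmaps dense bitmaps of equal
   size.  The cost does not depend on how many bits are actually set.  */
size_t dense_bitmaps_total_bytes(size_t n_bitmaps, size_t n_bits)
{
    return n_bitmaps * dense_bitmap_bytes(n_bits);
}
```

Under this assumption the footprint grows with the product of bitmap count and bitmap width, so a representation that only stores set bits can help when most bits are zero.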

[Bug target/113087] [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc

2024-01-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087 --- Comment #37 from Robin Dapp --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113206#c9 > Using 4a0a8dc1b88408222b88e10278017189f6144602, the spec run failed on: > zvl128b (All runtime fails): > 527.cam4 (Runtime) > 531.deepsjeng (Runtime)

[Bug target/113087] [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc

2024-01-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087 --- Comment #38 from Robin Dapp --- deepsjeng also looks ok here.

[Bug testsuite/113558] [14 regression] gcc.dg/vect/vect-outer-4c-big-array.c etc. FAIL

2024-01-23 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113558 --- Comment #2 from Robin Dapp --- Created attachment 57195 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57195&action=edit Tentative patch Ah, it looks like nothing is being vectorized at all and the second check just happened to match

[Bug target/113570] RISC-V: SPEC2017 549 fotonik3d miscompilation in autovec VLS 256 build

2024-01-23 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113570 --- Comment #2 from Robin Dapp --- I'm pretty certain this is "works as intended" and -Ofast causes the precision to be different than with -O3 (and dependent on the target). See also: It has been reported that with gfortran -Ofast -march=nat

[Bug other/113575] [14 Regression] memory hog building insn-opinit.o (i686-linux-gnu -> riscv64-linux-gnu)

2024-01-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113575 Robin Dapp changed: What|Removed |Added CC||rdapp at gcc dot gnu.org --- Comment #5

[Bug other/113575] [14 Regression] memory hog building insn-opinit.o (i686-linux-gnu -> riscv64-linux-gnu)

2024-01-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113575 --- Comment #7 from Robin Dapp --- Ok, I'm going to check.

[Bug tree-optimization/113583] New: Main loop in 519.lbm not vectorized.

2024-01-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: rdapp at gcc dot gnu.org Target Milestone: --- Target: x86_64-*-* riscv*-*-* This might be a known issue but a bugzilla search regarding lbm didn't show any

[Bug tree-optimization/113583] Main loop in 519.lbm not vectorized.

2024-01-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583 --- Comment #2 from Robin Dapp --- > It's interesting, for Clang only RISC-V can vectorize it. The full loop can be vectorized on clang x86 as well when I remove the first conditional (which is not in the snippet I posted above). So that's lik

[Bug other/113575] [14 Regression] memory hog building insn-opinit.o (i686-linux-gnu -> riscv64-linux-gnu)

2024-01-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113575 --- Comment #12 from Robin Dapp --- Created attachment 57209 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57209&action=edit Tentative I tested the attached "fix". On my machine with 13.2 host compiler it reduced the build time for insn

[Bug other/113575] [14 Regression] memory hog building insn-opinit.o (i686-linux-gnu -> riscv64-linux-gnu)

2024-01-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113575 --- Comment #14 from Robin Dapp --- Ok, running tests with the adjusted version and going to post a patch afterwards. However, during a recent run compiling insn-recog took 2G and insn-emit-7 as well as insn-emit-10 required > 1.5G each. Looks

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607 --- Comment #4 from Robin Dapp --- I cannot reproduce it either, tried with -ftree-vectorize as well as -fno-vect-cost-model.

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-26 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607 --- Comment #7 from Robin Dapp --- Yep, that one fails for me now, thanks.

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-26 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607 --- Comment #10 from Robin Dapp --- The compile farm machine I'm using doesn't have SVE. Compiling with -march=armv8-a -O3 pr113607.c -fno-vect-cost-model and running it returns 0 (i.e. ok). pr113607.c:35:5: note: vectorized 3 loops in function

[Bug tree-optimization/113583] Main loop in 519.lbm not vectorized.

2024-01-26 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583 --- Comment #9 from Robin Dapp --- (In reply to rguent...@suse.de from comment #6) > t.c:47:21: missed: the size of the group of accesses is not a power of 2 > or not equal to 3 > t.c:47:21: missed: not falling back to elementwise accesses

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-29 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607 --- Comment #16 from Robin Dapp --- Disabling vec_extract makes us operate on non-partial vectors, though, so there are a lot of differences in codegen. I'm going to have a look.

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-29 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607 --- Comment #17 from Robin Dapp --- Grasping for straws by blaming qemu ;) At some point we do the vector shift vsll.vv v1,v2,v2,v0.t but the mask v0 is all zeros: gdb: b = {0 } According to the mask-undisturbed policy set before

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-29 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607 --- Comment #18 from Robin Dapp --- Hehe no it doesn't make sense... I wrongly read a v2 as a v1. Please disregard the last message.

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-30 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607 --- Comment #19 from Robin Dapp --- What seems odd to me is that in fre5 we simplify _429 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, ... }); vect_prephitmp_129.51_282 = _429; vect_iftmp.55_287 = VEC_COND_EXPR ;

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-31 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607 --- Comment #23 from Robin Dapp --- > this is: > > _429 = mask_patt_205.47_276[i] ? vect_cst__262[i] : (vect_cst__262 << > {0,..})[i]; > vect_iftmp.55_287 = mask_patt_209.54_286[i] ? _429 [i] : vect_cst__262[i] But isn't it rather _429 = mask_
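For reference, a minimal scalar sketch of how a conditional shift such as .COND_SHL (mask, a, b, else) is generally understood element by element: the shifted value where the mask is set, the fallback operand otherwise. The function name, the fixed 32-bit element type, and the clamping of the shift amount (only there to keep the C well defined) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-element model of a conditional left shift:
   out[i] = mask[i] ? a[i] << b[i] : else_vals[i].
   The shift amount is reduced modulo 32 purely to avoid undefined
   behavior in this C sketch.  */
void cond_shl_model(const bool *mask, const uint32_t *a, const uint32_t *b,
                    const uint32_t *else_vals, uint32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = mask[i] ? a[i] << (b[i] & 31) : else_vals[i];
}
```

With an all-zero fallback operand, every inactive lane of the result is zero rather than the unshifted input, which is the distinction the discussion here turns on.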

[Bug target/115439] [15 Regression] ICEs after r15-638 on master-thumb_m55_hard_eabi

2024-06-11 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115439 Robin Dapp changed: What|Removed |Added CC||rdapp at gcc dot gnu.org --- Comment #2

[Bug target/115439] [15 Regression] ICEs after r15-638 on master-thumb_m55_hard_eabi

2024-06-11 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115439 --- Comment #6 from Robin Dapp --- Looks reasonable. That's what we were previously doing in internal-fn.cc before expanding (except operands[2]). Are you going to post a patch?

[Bug tree-optimization/115382] Wrong code with in-order conditional reduction and masked loops

2024-06-12 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382 --- Comment #7 from Robin Dapp --- Ah yes, I'm going to push the patch to 14 still.

[Bug rtl-optimization/115495] [15 Regression] ICE in smallest_mode_for_size, at stor-layout.cc:356 during combine on RISC-V rv64gcv_zvl256b at -O3

2024-06-19 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115495 --- Comment #3 from Robin Dapp --- At first it looked very weird that we need 50 (or so) instructions to expand ;; MEM [(short int *)&a] = vect_cst__21; but then I realized that all the hoops we jump through are due to possible misalignment.

[Bug tree-optimization/100756] [12 Regression] vect: Superfluous epilog created on s390x

2024-06-20 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100756 --- Comment #11 from Robin Dapp --- Just noticed this is still open due to the retargeting message. IMHO this can be closed. I'm pretty sure I erroneously used the GCC 12 target when opening the bug when it should have been trunk/GCC 13. I sup

[Bug target/115725] RISC-V: Use wrong AVL for rv64gcv_zfh_zvl512b

2024-07-01 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115725 --- Comment #4 from Robin Dapp --- Sorry, just got back from the RISC-V summit. IMHO, yes, it should be TU. We have the same thing for the not-element-0 case. I wonder why it doesn't fail with spike or qemu. Probably qemu doesn't do anything

[Bug target/115725] RISC-V: Use wrong AVL for rv64gcv_zfh_zvl512b

2024-07-01 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115725 --- Comment #5 from Robin Dapp --- > zvl128b => GOOD. > vec_set_vnx8hi_0: > vl1re16.v v1,0(a1) > vsetivli zero,1,e16,m1,ta,ma > vmv.s.x v1,a2 > vs1r.v v1,0(a0) // Only store 1 element as source code

[Bug target/115725] RISC-V: Use wrong AVL for rv64gcv_zfh_zvl512b

2024-07-01 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115725 --- Comment #7 from Robin Dapp --- I checked. It looks like qemu indeed always implicitly uses TU for vmv.s.x regardless of the actual setting. This behavior masks the bug here.
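To make the masking effect concrete, here is a minimal sketch of a scalar-to-element-0 move under the two tail policies. It assumes vl is at least 1, 16-bit elements, and an implementation that fills tail elements with all ones under the tail-agnostic policy (the RVV specification also permits leaving them unchanged). All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum tail_policy { TAIL_UNDISTURBED, TAIL_AGNOSTIC_ALL_ONES };

/* Hypothetical model of vmv.s.x on a register holding nelts 16-bit
   elements: element 0 receives x, every other element is a tail
   element.  Under TU the tail keeps its old contents; under the
   all-ones flavor of TA it is overwritten.  */
void model_vmv_s_x(uint16_t *vd, size_t nelts, uint16_t x,
                   enum tail_policy policy)
{
    if (nelts == 0)
        return;
    vd[0] = x;
    if (policy == TAIL_AGNOSTIC_ALL_ONES)
        for (size_t i = 1; i < nelts; i++)
            vd[i] = 0xffff;
}
```

A simulator that always behaves like TAIL_UNDISTURBED preserves the remaining elements even when the generated code requests the agnostic policy, so code that relies on those elements appears to work there.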

[Bug target/115725] RISC-V: Use wrong AVL for rv64gcv_zfh_zvl512b

2024-07-01 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115725 --- Comment #9 from Robin Dapp --- We already merge with operand[0], just the TU is missing as far as I can tell. I'm seeing the following output with my patch: vsetivli zero,8,e16,mf4,tu,ma vle16.v v1,0(a1) vmv.

[Bug target/115725] RISC-V: Use wrong AVL for rv64gcv_zfh_zvl512b

2024-07-01 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115725 --- Comment #11 from Robin Dapp --- > I believe it is VSETVL PASS doing the fusion, fuse all "vsetvl" according > their > demand field into a single "vsetvli" and put them since beginning. Yes, and the vsetvl fusion is very useful here.

[Bug target/115725] RISC-V: Use wrong AVL for rv64gcv_zfh_zvl512b

2024-07-02 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115725 --- Comment #14 from Robin Dapp --- Thanks Kito. In addition, I asked Daniel to have a look into the vmv.s.x policy handling. From what I saw it is special in that it currently always uses undisturbed and doesn't observe the specified policy.

[Bug target/115336] [15] rv64gcv_zvl256b miscompile at -O3

2024-07-03 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115336 --- Comment #3 from Robin Dapp --- Follow-up on this one: My workaround of emitting a vmv.v.i v[0-9],0 before any (potentially) offending masked load is not going to work universally. That's because on several instances we make use of the fact

[Bug target/115725] RISC-V: Use wrong AVL for rv64gcv_zfh_zvl512b

2024-07-05 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115725 Robin Dapp changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/115995] RISC-V: Can't generate portable RVV code for rv64gcv_zvl512b

2024-07-23 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115995 --- Comment #2 from Robin Dapp --- Hmm I can't reproduce either. riscv64-unknown-linux-gnu-gcc -march=rv64gcv_zvl512b1p0 -mabi=lp64d -O2 990128-1.c QEMU_CPU=rv64,v=true,xventanacondops=true,x-zvfh=true,zfh=true,zba=true,zbb=true,zbc=true,zicond

[Bug target/116036] [14/15] RISCV: internal compiler error: in riscv_expand_mult_with_const_int with -march=rv64idv

2024-07-23 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116036 --- Comment #2 from Robin Dapp --- Begrudgingly confirming :) Still need to figure out where to best error out for that combination. If we do it at the assertion spot the message will be output as many times as we try vector modes (like 8 or s

[Bug target/116059] [14/15 Regression] Miscompile at -O2 since r14-6420-g85c5efcffed

2024-07-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116059 --- Comment #3 from Robin Dapp --- Glad we went for rvv_ma_all_1s=true because otherwise this one would have gone unnoticed :) The -fsigned-char -fno-strict-aliasing -fwrapv look unnecessary. I see the problem without them as well, just the ou

[Bug target/116059] [14/15 Regression] Miscompile at -O2 since r14-6420-g85c5efcffed

2024-07-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116059 Robin Dapp changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rdapp at gcc dot gnu.org Last

[Bug target/114665] [14] RISC-V rv64gcv: miscompile at -O3

2024-07-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114665 Robin Dapp changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug target/114665] [14] RISC-V rv64gcv: miscompile at -O3

2024-07-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114665 Robin Dapp changed: What|Removed |Added Last reconfirmed|2024-07-24 00:00:00 | Known to fail|14.0

[Bug tree-optimization/115819] RISC-V: Failed to hoist vrsub.vx to the header of the loop

2024-07-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115819 --- Comment #6 from Robin Dapp --- (In reply to JuzheZhong from comment #4) > (In reply to Andrew Pinski from comment #1) > > This might be a cost issue. > > No. I don't it's cost issue. > It's because we suppress the hoist by incorrect POLY IN

[Bug tree-optimization/115819] RISC-V: Failed to hoist vrsub.vx to the header of the loop

2024-07-24 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115819 --- Comment #7 from Robin Dapp --- No regressions, going to commit after a while, possibly adding the previously failing test case.

[Bug target/116086] New: RISC-V: Hash mismatch with vectorized 557.xz_r at zvl128b and LMUL=m2

2024-07-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
-code Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: rdapp at gcc dot gnu.org CC: jeremy.bennett at embecosm dot com, juzhe.zhong at rivai dot ai, law at gcc dot

[Bug target/116086] RISC-V: Hash mismatch with vectorized 557.xz_r at zvl128b and LMUL=m2

2024-07-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116086 --- Comment #1 from Robin Dapp --- The following reproduces the problem for me, though not very minimal yet: typedef unsigned int uint32_t; typedef unsigned long long uint64_t; typedef struct { uint64_t length; uint64_t state[8]; u

[Bug target/116086] RISC-V: Hash mismatch with vectorized 557.xz_r at zvl128b and LMUL=m2

2024-07-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116086 --- Comment #2 from Robin Dapp --- Reduced a bit: typedef unsigned int uint32_t; typedef unsigned long long uint64_t; typedef struct { uint64_t length; uint64_t state[8]; uint32_t curlen; unsigned char buf[128]; } sha512_state;

[Bug target/116036] [14/15 only] RISCV: internal compiler error: in riscv_expand_mult_with_const_int with -march=rv64idv

2024-07-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116036 Robin Dapp changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Known to fail|15.0

[Bug tree-optimization/115819] RISC-V: Failed to hoist vrsub.vx to the header of the loop

2024-07-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115819 Robin Dapp changed: What|Removed |Added Ever confirmed|1 |0 Status|NEW

[Bug target/116086] RISC-V: Hash mismatch with vectorized 557.xz_r at zvl128b and LMUL=m2

2024-07-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116086 --- Comment #4 from Robin Dapp --- Probably because I left out a crucial detail ;) It only happens starting with vlen=256 in qemu.

[Bug target/116086] RISC-V: Hash mismatch with vectorized 557.xz_r at zvl128b and LMUL=m2

2024-07-26 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116086 --- Comment #6 from Robin Dapp --- Ah, thanks for reducing. I didn't get much further with cvise yesterday. What were your settings for it? The reduced test case is great because it is easy to analyze and uncovers a fairly significant problem

[Bug target/116086] RISC-V: Hash mismatch with vectorized 557.xz_r at zvl128b and LMUL=m2

2024-07-26 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116086 --- Comment #7 from Robin Dapp --- Ok, if done right, i.e. without introducing a new bug, both the reduced case as well as the original case show the same behavior with respect to the fix. Also, xz calculates the proper hash, phew. I sent a fir

[Bug target/116125] RISC-V: Does not fully checking for overlapping memory regions

2024-07-29 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116125 Robin Dapp changed: What|Removed |Added Known to fail||14.1.0 Status|UNCONFIRMED

[Bug middle-end/117173] can_vec_perm_const_p does not consider costs

2024-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117173 --- Comment #2 from Robin Dapp --- In x264, before the optimization we have: _42 = VEC_PERM_EXPR ; ... _44 = VEC_PERM_EXPR ; _45 = VEC_PERM_EXPR ; The first one (_42) is "monotonic" and can be implemented by a vmerge. This implies a load and

[Bug middle-end/117173] New: can_vec_perm_const_p does not consider costs

2024-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: rdapp at gcc dot gnu.org CC: law at gcc dot gnu.org, rguenth at gcc dot gnu.org Target Milestone: --- I only noticed this on riscv but it's actually a target-independent issue. In match.pd:109

[Bug tree-optimization/116578] vectorizer SLP transition issues / dependences

2024-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116578 Bug 116578 depends on bug 116655, which changed state. Bug 116655 Summary: RISC-V: ICE with -mrvv-max-lmul=dynamic in compute_nregs_for_mode https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116655 What|Removed |Ad

[Bug target/116655] RISC-V: ICE with -mrvv-max-lmul=dynamic in compute_nregs_for_mode

2024-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116655 Robin Dapp changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/117566] New: RISC-V: Enable VLS tests in testsuite for various targets

2024-11-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: rdapp at gcc dot gnu.org CC: ewlu at rivosinc dot com, kito.cheng at gmail dot com, law at gcc dot gnu.org, palmer at dabbelt dot com

[Bug target/117565] New: RISC-V: Make ABI implicit for non-default march

2024-11-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
: target Assignee: unassigned at gcc dot gnu.org Reporter: rdapp at gcc dot gnu.org CC: kito.cheng at gmail dot com, law at gcc dot gnu.org, palmer at dabbelt dot com, vineetg at rivosinc dot com Target Milestone: --- Target: riscv

[Bug target/117563] New: RISC-V: -mcpu is ignored when -march has been specified.

2024-11-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: rdapp at gcc dot gnu.org CC: kito.cheng at gmail dot com, law at gcc dot gnu.org, palmer at dabbelt dot com, vineetg at rivosinc dot com Target Milestone

[Bug target/117353] [15 regression] RISC-V: ICE when building libcrypt since r15-3228-g771256bcb9ddc4

2024-10-31 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117353 --- Comment #5 from Robin Dapp --- The issue is that we expand a const-vector move (using a left shift, among others) during LRA, where we must not create pseudos, and the expansion does create them. Likely just missing a can_create_pseudo_p check somewhere.

[Bug target/112109] Missing riscv vectorized strcmp (and other) expanders

2024-09-18 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112109 --- Comment #6 from Robin Dapp --- Should we close this? I think all of the routines are in or are we missing something still? What's IMHO still a TODO is to honor TARGET_MAX_LMUL for some of the builtins that came first. memcpy for example a

[Bug target/116611] Inefficient mix of contiguous and load-lane vectorization due to missing permutes

2024-09-30 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116611 --- Comment #6 from Robin Dapp --- Hmm, the RTL follows the gimple code pretty well and those vect_array.27[0] = vect__2.17_71; become subreg-subreg moves. vect_array.27 is only dead after the v10 use. How should it ideally work? Could we r

[Bug target/116611] Inefficient mix of contiguous and load-lane vectorization due to missing permutes

2024-09-30 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116611 --- Comment #8 from Robin Dapp --- (In reply to Richard Biener from comment #7) > (In reply to Robin Dapp from comment #6) > > Hmm, the RTL follows the gimple code pretty well and those > >vect_array.27[0] = vect__2.17_71; > > become subreg-

[Bug target/117769] RISC-V: Worse codegen in x264_pixel_satd_8x4

2024-11-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117769 --- Comment #1 from Robin Dapp --- The SLP vec_perm patch has gone upstream since then, which seems closely related as it specifically targets SATD's permutes. Surprised to see a higher icount, though.

[Bug target/117769] RISC-V: Worse codegen in x264_pixel_satd_8x4

2024-11-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117769 --- Comment #3 from Robin Dapp --- Ok, I see. Those x264 functions are sensitive to alignment. Right now the only tune model to enable it by default is generic-ooo. But the commit you mentioned cannot have been OK then either?

[Bug target/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

2024-11-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722 --- Comment #13 from Robin Dapp --- I don't fully understand yet :) So the full-register moves are undesirable, I agree. When accumulating with a widening op they seem unavoidable, though. The only alternative would be to split out the extens

[Bug target/117657] [15 Regression][gcn] ICE during in-tree newlib build: error: unrecognizable insn

2024-11-19 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657 --- Comment #7 from Robin Dapp --- Thanks, I was just about to write that I managed to build and would start to look into it.

[Bug target/117657] [15 Regression][gcn] ICE during in-tree newlib build: error: unrecognizable insn

2024-11-18 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657 --- Comment #3 from Robin Dapp --- My first mistake was not squashing the commits as they depend on each other, sorry about that. The latest error you showed should be the "correct" one. I couldn't test gcn but Andrew's latest test seems to hav

[Bug target/117709] [15 regression] maskload else case generating wrong code

2024-11-20 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117709 --- Comment #1 from Robin Dapp --- We're not emitting the VEC_COND because type_mode_padding_p is false. My check compares the precision of the scalar type vs the precision of the vector mode's inner mode which is both 32 here: int vs E_V64SImod

[Bug target/117709] [15 regression] maskload else case generating wrong code

2024-11-20 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117709 --- Comment #2 from Robin Dapp --- This is the code: vect__23.27_8 = .MASK_GATHER_LOAD (&MEM [(void *)&k + -88B], { 0, -15, -30, -45, -60, -75, -90, -105, -120, -135, -150, -165, -180, -195, -210, -225, -240, -255, -270, -285, -300, -315, -3

[Bug target/117657] [15 Regression][gcn] ICE during in-tree newlib build: error: unrecognizable insn

2024-11-19 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117657 --- Comment #10 from Robin Dapp --- The last version of the patch "should not" have changed anything fundamentally, just adding more safe guards but well :) Let me know if I can do something.

[Bug target/117594] [15] RISC-V: Miscompile at -O3

2024-11-14 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117594 --- Comment #2 from Robin Dapp --- What's the expected output of the latter test case? I'm seeing 36 no matter what I try, -O3, -O2 without 'v', etc. Even with an x86 GCC. And, looking at the loop for (unsigned j = 0; j < (z[i] ?: 10); j +=

[Bug target/115336] [15] rv64gcv_zvl256b miscompile at -O3

2024-11-18 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115336 Robin Dapp changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/116059] [14/15 Regression] Miscompile at -O2 since r14-6420-g85c5efcffed

2024-11-18 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116059 Robin Dapp changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/116242] [meta-bug] Tracker for zvl issues in RISC-V

2024-11-18 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116242 Bug 116242 depends on bug 115336, which changed state. Bug 115336 Summary: [15] rv64gcv_zvl256b miscompile at -O3 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115336 What|Removed |Added

[Bug target/116242] [meta-bug] Tracker for zvl issues in RISC-V

2024-11-18 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116242 Bug 116242 depends on bug 116059, which changed state. Bug 116059 Summary: [14/15 Regression] Miscompile at -O2 since r14-6420-g85c5efcffed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116059 What|Removed |Add

[Bug target/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

2024-11-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722 --- Comment #3 from Robin Dapp --- First, pixel_sad_4x4 is not very hot, 8x8 and 16x16 are. Second, we are vectorizing this, but with -mno-vector-strict-align. IMHO we don't need to synthesize a usad pattern.

[Bug target/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

2024-11-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722 --- Comment #5 from Robin Dapp --- If it's better then OK. Can you show an example?

[Bug c/117804] RISC-V: Worse codegen in mc_chroma of x264

2024-11-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117804 --- Comment #1 from Robin Dapp --- The problem is that at combine-time we don't know the range of cA, cB etc anymore so we can't just combine those to the two-operand widening pattern. I have a local patch that introduces a widening FMA operati

[Bug target/117878] RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-02 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878 --- Comment #3 from Robin Dapp --- Generally, yes, I guess. But I'd like to understand better what exactly is going wrong. Shouldn't emitting those "pre-RA" insns already be guarded properly? I haven't looked into it in detail - isn't there a

[Bug target/117878] RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-02 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878 --- Comment #1 from Robin Dapp --- Is this related to PR117353? Seems very similar.

[Bug target/118140] [14/15 Regression] RISC-V: Miscompile with -march=rv64gcv_zvl256b -O3 since r14-5076-g01c18f58d37

2024-12-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118140 --- Comment #9 from Robin Dapp --- Already during ifcvt we do Setting value number of _46 to 1 (changed) Replaced _44 <= 1 with 1 in all uses of _46 = _44 <= 1; Value numbering stmt = _41 = _3; Setting value number of _41 to _3 (changed

[Bug tree-optimization/118140] [14/15 Regression] ifcvt miscompiles program at -O3 since r14-5076-g01c18f58d37 for riscv and aarch64 (with SVE)

2024-12-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118140 --- Comment #15 from Robin Dapp --- (In reply to Andrew Pinski from comment #14) > (In reply to Robin Dapp from comment #13) > > The #if 0 shouldn't be necessary, right? > > Correct, it is the same testcase as comment #7 except the plain char

[Bug tree-optimization/118140] [14/15 Regression] ifcvt miscompiles program at -O3 since r14-5076-g01c18f58d37 for riscv and aarch64 (with SVE)

2024-12-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118140 --- Comment #13 from Robin Dapp --- (In reply to Andrew Pinski from comment #11) > I see sometimes we check the return value of maybe_resimplify_conditional_op > and sometimes does not. > > E.g. in try_conditional_simplification we don't check

[Bug target/118734] New: RISC-V: Vector broadcast via strided load.

2025-02-03 Thread rdapp at gcc dot gnu.org via Gcc-bugs
: target Assignee: unassigned at gcc dot gnu.org Reporter: rdapp at gcc dot gnu.org CC: jeffreyalaw at gmail dot com, juzhe.zhong at rivai dot ai, kito.cheng at gmail dot com, palmer at dabbelt dot com, pan2.li at intel dot com

[Bug target/115458] [15 regression] [RISC-V] ICE in lra_split_hard_reg_for, at lra-assigns.cc:1868 unable to find a register to spill since r15-518-g99b1daae18c095

2025-02-05 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115458 --- Comment #12 from Robin Dapp --- Some "findings" below but I don't have the feeling I'm much closer to anything actionable. At some point we're trying to split a live range of an RVVM8QI register (v16, hard regno = 112) for the reload insn

[Bug target/115703] [15 Regression] rv64gcv_zvl256b miscompile since r15-1579-g792f97b44ff

2025-02-06 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115703 --- Comment #4 from Robin Dapp --- The problem appears to be Fuse curr info since prev info compatible with it: prev_info: VALID (insn 438, bb 2) Demand fields: demand_ge_sew demand_non_zero_avl SEW=32, VLMUL=m1, RATIO

[Bug target/112853] RISC-V: RVV: SPEC2017 525.x264 regression

2025-02-06 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112853 Robin Dapp changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/115703] [15 Regression] rv64gcv_zvl256b miscompile since r15-1579-g792f97b44ff

2025-02-05 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115703 --- Comment #2 from Robin Dapp --- > I don't see anything wrong with this move on RTL. Maybe there is something > wrong going on the pass which is emitting the vsetivli instructions. Yes, indeed. With --param=vsetvl-strategy=simple the output

[Bug target/115703] [15 Regression] rv64gcv_zvl256b miscompile since r15-1579-g792f97b44ff

2025-02-05 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115703 --- Comment #3 from Robin Dapp --- For me this doesn't occur on the trunk anymore and I bisected the working change to: r15-3459-gcbea72b265e4c9 Author: Raphael Moreira Zinsly Date: Wed Sep 4 17:21:24 2024 -0600 [PATCH 1/3] RISC-V: Impr

[Bug target/116242] [meta-bug] Tracker for zvl issues in RISC-V

2025-02-06 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116242 Bug 116242 depends on bug 112853, which changed state. Bug 112853 Summary: RISC-V: RVV: SPEC2017 525.x264 regression https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112853 What|Removed |Added -

[Bug target/118832] RISC-V: internal compiler error: could not split insn, with V+Zbb enabled

2025-02-12 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832 --- Comment #4 from Robin Dapp --- From a cursory look the following shifts might also be vulnerable: (riscv-v.cc:1528) else { /* { 1, 3, 2, 6, ... }. */ rtx tmp2 = gen_reg_rtx
