[Bug target/114160] ICE on RISCV (-mcpu=thead-c906) when building glibc in dwarf2out_frame_debug_cfa_offset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114160 Christoph Müllner changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2024-03-18 --- Comment #4 from Christoph Müllner --- I now have permission. Thanks Sam!
[Bug target/114160] ICE on RISCV (-mcpu=thead-c906) when building glibc in dwarf2out_frame_debug_cfa_offset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114160 Christoph Müllner changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from Christoph Müllner --- Closing as fixed.
[Bug target/114194] ICE when using std::unique_ptr with xtheadvector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114194 Christoph Müllner changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org Last reconfirmed||2024-03-21 CC||cmuellner at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Target Milestone|--- |14.0 --- Comment #7 from Christoph Müllner --- Thanks for reporting and providing several minimal reproducers. I can reproduce the issue and have further analyzed it. During the analysis, I've noticed that not only memset-zero (clear-memory) is affected, but all memset expansions (e.g. `memset(p, 3, 15)`). I also have a potential fix that will be sent to the list once the testing run is completed.
[Bug target/114194] ICE when using std::unique_ptr with xtheadvector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114194 Christoph Müllner changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #8 from Christoph Müllner --- Closing as resolved (the fix has been pushed on master).
[Bug target/116131] [14/15 Regression] RISC-V: Unrecognizable insn with xtheadmemidx on rv32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116131 --- Comment #3 from Christoph Müllner --- After passing the tests, I've posted the patch on the mailing list: https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658726.html
[Bug target/116033] [14 only] RISC-V: -march=rv64gv_xtheadmemidx generates illegal vse8.v insn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116033 Christoph Müllner changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #4 from Christoph Müllner --- The fix was accepted and has been pushed to master on Jul 25: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a86c0cb9379e7b86625908a0250cf698276e9e02 For GCC 14 we had to wait until GCC 14.2 was released and the backport has just been pushed on releases/gcc-14: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=eccf707e5ceb7e405ffe4edfbcae2f769b8386cf Closing this ticket as resolved. Patrick, thank you for reporting!
[Bug target/116131] [14/15 Regression] RISC-V: Unrecognizable insn with xtheadmemidx on rv32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116131 --- Comment #5 from Christoph Müllner --- I've prepared a patchset that eliminates the optimization patterns for XThead(F)MemIdx, which produce the non-canonical MEMs. As a side-effect, this change also fixes the issue reported here. However, it also triggers another ICE (in the case of enabled XThead(F)MemIdx and XTheadFmv/Zfa), which is addressed in the last patch of the series: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659676.html https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659677.html https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659678.html
[Bug rtl-optimization/116353] [15 Regression] ICE on glibc-2.39: RTL pass: ce2, in expand_simple_binop, at optabs.cc:1264 since r15-2890-g72c9b5f438f22c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116353 --- Comment #7 from Christoph Müllner ---
> To add on to the info provided by Manolis, this is the diff for the proposed
> fix:
>
> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> index 3e25f30b67e..da59c907891 100644
> --- a/gcc/ifcvt.cc
> +++ b/gcc/ifcvt.cc
> @@ -3938,8 +3938,10 @@ bb_ok_for_noce_convert_multiple_sets (basic_block
> test_bb, unsigned *cost)
>        rtx src = SET_SRC (set);
>
>        /* Do not handle anything involving memory loads/stores since it might
> -         violate data-race-freedom guarantees.  */
> -      if (!REG_P (dest) || contains_mem_rtx_p (src))
> +         violate data-race-freedom guarantees.  Make sure we can force SRC
> +         to a register as that may be needed in try_emit_cmove_seq.  */
> +      if (!REG_P (dest) || contains_mem_rtx_p (src)
> +          || !noce_can_force_operand (src))
>          return false;
>
>        /* Destination and source must be appropriate.  */

I've successfully bootstrapped the proposed change on top of master with `--enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto` on x86-64 and aarch64.

So the change is:
Tested-by: Christoph Müllner
[Bug rtl-optimization/116349] [15 regression] ICE in expand_simple_binop, at optabs.cc:1264 when building libgo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116349 --- Comment #7 from Christoph Müllner --- (In reply to seurer from comment #6) > I am seeing this same failure in doing a bootstrap build during stage 2 on > powerpc64: A fix that is confirmed to work on AArch64 and x86-64 has been posted here (see PR116353): https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660245.html
[Bug target/114673] New: RISC-V: "L" constraint cannot be used for lui in inline asm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114673

Bug ID: 114673
Summary: RISC-V: "L" constraint cannot be used for lui in inline asm
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: cmuellner at gcc dot gnu.org
Target Milestone: ---

The RISC-V-specific "L" constraint is neither documented nor tested. In constraints.md it is defined as "A U-type 20-bit signed immediate.". It tests if the value is a constant int that satisfies LUI_OPERAND(), i.e. a value with the lowest 12 bits zero.

One obvious use-case is to use "L" for "lui" in inline asm. However, it does not work as expected:

long getB()
{
  //lui a0,0x1800
  return 3<<23; //0x01800000
}

long getB_asm_i()
{
  long reg;
  //lui a0,0x1800
  asm("lui %0, %1" : "=r"(reg) : "i"((3<<23) >> 12));
  return reg;
}

long getB_asm_L()
{
  long reg;
  //Assembler error: lui expression not in range 0..1048575
  asm("lui %0, %1" : "=r"(reg) : "L"(3ul<<23));
  return reg;
}

long getB_asm_Lshift()
{
  long reg;
  //Compiler error: impossible constraint in 'asm'
  asm("lui %0, %1" : "=r"(reg) : "L"((3<<23) >> 12));
  return reg;
}

The "L" constraint was introduced as part of the initial RISC-V port. I could not find any tests/documentation, so I am unsure if it can be fixed or if a new constraint should be introduced.

My preferred fix would be to shift the provided constant right by 12 if it satisfies LUI_OPERAND(), so that getB_asm_L() would work.
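The preferred fix can be illustrated outside of GCC. The following is a minimal sketch, not the GCC implementation: `lui_operand_p()` is a simplified stand-in for LUI_OPERAND() (low 12 bits zero and representable as a sign-extended 32-bit value), and `lui_imm20()` is a hypothetical helper that performs the proposed shift by 12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for LUI_OPERAND(): the low 12 bits must be zero
   and the value must fit into a sign-extended 32-bit word.  */
static bool lui_operand_p(int64_t value)
{
    return (value & 0xfff) == 0
           && value >= INT32_MIN && value <= INT32_MAX;
}

/* Hypothetical helper: the 20-bit field the assembler expects for
   "lui", i.e. the constant shifted right by 12.  */
static uint32_t lui_imm20(int64_t value)
{
    assert(lui_operand_p(value));
    return (uint32_t)(((uint64_t)value >> 12) & 0xfffff);
}
```

With these definitions, `3 << 23` is accepted and maps to the immediate 0x1800 that `lui a0,0x1800` uses, while the already shifted value `(3 << 23) >> 12` is rejected because its low 12 bits are not zero, which matches the "impossible constraint" error for getB_asm_Lshift().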
[Bug middle-end/111111] omnetpp: ICEs with dump flags, PGO and LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111111 Christoph Müllner changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |FIXED --- Comment #2 from Christoph Müllner --- This can't be reproduced anymore (retested with master and releases/gcc-14).
[Bug target/111501] RISC-V: non-optimal casting when shifting
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111501 Christoph Müllner changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org CC||cmuellner at gcc dot gnu.org --- Comment #3 from Christoph Müllner --- I noticed this a while ago as well (when working on the XTheadB* stuff). This can be addressed with an insn_and_split for zero_extract. I even wrote a patch for that back then, but forgot to send it out. I've rebased/retested it now and will send it once the release is out. Btw, LLVM is catching all of these cases.
[Bug target/111501] RISC-V: non-optimal casting when shifting
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111501 Christoph Müllner changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from Christoph Müllner --- Closing this, as it has been fixed on master.
[Bug rtl-optimization/115344] New: Missing loop counter reversal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115344

Bug ID: 115344
Summary: Missing loop counter reversal
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: cmuellner at gcc dot gnu.org
Target Milestone: ---

Let's take a simple for-loop with an unknown bound:

void bar ();
void foo1 (int n)
{
  for (int i = 0; i < n; i++)
    {
      bar ();
    }
}

We see that two variables are in the program, but we could eliminate the loop variable `i` as follows:

void bar ();
void foo2 (int n)
{
  while (n)
    {
      bar ();
      n--;
    }
}

Optimizing the loop as above has the following benefits:
- No need for a register for the loop variable `i`
- No need for an additional slot in the stack frame
- No need for instructions to save/restore the loop variable register in the prologue/epilogue
- No need for an initialization instruction for the loop variable `i` (to zero)

LLVM does this transformation on (at least) x86-64, RISC-V (rv64gc), and AArch64 with -O3, but GCC does not. Tests have been done with trunk and older GCC releases (I've tested down to GCC 4.4).

Related bug tickets:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=22041 (open - with uses of the loop counter as an array index)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31238 (closed - fixed for GCC 4.5.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40886 (closed - fixed for GCC 4.5.0)

GCC AArch64 -O3:

foo1:
        cmp     w0, 0
        ble     .L6
        stp     x29, x30, [sp, -32]!
        mov     x29, sp
        stp     x19, x20, [sp, 16]
        mov     w20, w0
        mov     w19, 0
.L3:
        add     w19, w19, 1
        bl      bar
        cmp     w20, w19
        bne     .L3
        ldp     x19, x20, [sp, 16]
        ldp     x29, x30, [sp], 32
        ret
.L6:
        ret
foo2:
        cbz     w0, .L18
        stp     x29, x30, [sp, -32]!
        mov     x29, sp
        str     x19, [sp, 16]
        mov     w19, w0
.L12:
        bl      bar
        subs    w19, w19, #1
        bne     .L12
        ldr     x19, [sp, 16]
        ldp     x29, x30, [sp], 32
        ret
.L18:
        ret

LLVM AArch64 -O3:

foo1:                                   // @foo1
        cmp     w0, #1
        b.lt    .LBB0_4
        stp     x29, x30, [sp, #-32]!   // 16-byte Folded Spill
        str     x19, [sp, #16]          // 8-byte Folded Spill
        mov     x29, sp
        mov     w19, w0
.LBB0_2:                                // =>This Inner Loop Header: Depth=1
        bl      bar
        subs    w19, w19, #1
        b.ne    .LBB0_2
        ldr     x19, [sp, #16]          // 8-byte Folded Reload
        ldp     x29, x30, [sp], #32     // 16-byte Folded Reload
.LBB0_4:
        ret
foo2:                                   // @foo2
        cbz     w0, .LBB1_4
        stp     x29, x30, [sp, #-32]!   // 16-byte Folded Spill
        str     x19, [sp, #16]          // 8-byte Folded Spill
        mov     x29, sp
        mov     w19, w0
.LBB1_2:                                // =>This Inner Loop Header: Depth=1
        bl      bar
        subs    w19, w19, #1
        b.ne    .LBB1_2
        ldr     x19, [sp, #16]          // 8-byte Folded Reload
        ldp     x29, x30, [sp], #32     // 16-byte Folded Reload
.LBB1_4:
        ret

GCC RISC-V -O3 -march=rv64gc:

foo1:
        ble     a0,zero,.L6
        addi    sp,sp,-32
        sd      s0,16(sp)
        sd      s1,8(sp)
        sd      ra,24(sp)
        mv      s1,a0
        li      s0,0
.L3:
        addiw   s0,s0,1
        call    bar
        bne     s1,s0,.L3
        ld      ra,24(sp)
        ld      s0,16(sp)
        ld      s1,8(sp)
        addi    sp,sp,32
        jr      ra
.L6:
        ret
foo2:
        beq     a0,zero,.L18
        addi    sp,sp,-16
        sd      s0,0(sp)
        sd      ra,8(sp)
        mv      s0,a0
.L12:
        addiw   s0,s0,-1
        call    bar
        bne     s0,zero,.L12
        ld      ra,8(sp)
        ld      s0,0(sp)
        addi    sp,sp,16
        jr      ra
.L18:
        ret

LLVM RISC-V -O3 -march=rv64gc:

foo1:                                   # @foo1
        blez    a0, .LBB0_4
        addi    sp, sp, -16
        sd      ra, 8(sp)               # 8-byte Folded Spill
        sd      s0, 0(sp)               # 8-byte Folded Spill
        mv      s0, a0
.LBB0_2:                                # =>This Inner Loop Header: Depth=1
        call    bar
        addiw   s0, s0, -1
        bnez    s0, .LBB0_2
        ld      ra, 8(sp)               # 8-byte Folded Reload
        ld      s0, 0(sp)               # 8-byte Folded Reload
        addi    sp, sp, 16
.LBB0_4:
        ret
foo2:                                   # @foo2
        beqz    a0, .LBB1_4
        addi    sp, sp
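The transformation described in this report can be checked with a small stand-alone program. This is a minimal sketch under stated assumptions: `count_up()`, `count_down()`, and `calls_made()` are hypothetical names, and `bar()` is replaced by a counting stub. Note that foo1 and foo2 as written only agree for n >= 0 (foo1 runs zero iterations for a negative `n`, whereas `while (n)` would keep decrementing), so the reversed form below keeps an explicit guard, similar to the `ble`/`b.lt` checks visible in the compiler output for foo1.

```c
#include <assert.h>

static int calls;                       /* counts invocations of bar() */
static void bar(void) { calls++; }

/* Up-counting form, as in foo1: needs both `i` and `n` while looping.  */
static void count_up(int n)
{
    for (int i = 0; i < n; i++)
        bar();
}

/* Down-counting form: only `n` stays live inside the loop.  The guard
   keeps the behaviour of the up-counting form for n <= 0.  */
static void count_down(int n)
{
    if (n <= 0)
        return;
    do {
        bar();
    } while (--n != 0);
}

/* Hypothetical helper: number of bar() calls made by f(n).  */
static int calls_made(void (*f)(int), int n)
{
    calls = 0;
    f(n);
    return calls;
}
```

Comparing `calls_made(count_up, n)` with `calls_made(count_down, n)` for a few values of `n`, including zero and negative ones, shows that both forms call `bar()` the same number of times.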
[Bug target/115554] New: RISC-V: ICE in case of multiple target-arch attributes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115554

Bug ID: 115554
Summary: RISC-V: ICE in case of multiple target-arch attributes
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: cmuellner at gcc dot gnu.org
Target Milestone: ---

Minimal reproducers (for target-arch):

extern
__attribute__((target("arch=+zba")))
__attribute__((target("arch=+zbb")))
void foo(void);

extern
__attribute__((target("arch=+zbb")))
__attribute__((target("arch=+zbb")))
void bar(void);

The ICE is a bug. If multiple target-arch attributes should not be allowed, then an error message is the right solution.

Allowing multiple target-X attributes is problematic, as can be seen for foo(). I.e., does the second attribute amend or replace the previous one? However, accepting multiple target-X attributes if they are equal (like for bar) could be done.

The assertion was added in the GCC 14 cycle (commit 9941f0295a1). GCC 14 and 15 are affected. GCC 13 is not affected (we don't have RISC-V target-arch attributes in GCC 13).

The ICE looks like this:

$ riscv64-unknown-linux-gnu-gcc bar.c -c
bar.c:4:1: internal compiler error: in riscv_func_target_put, at common/config/riscv/riscv-common.cc:521
    4 | void foo(void);
      | ^~~~
0xc15306 riscv_func_target_put(tree_node*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
        /home/cm/src/gcc/riscv-mainline/gcc/common/config/riscv/riscv-common.cc:521
0x18234d3 riscv_process_target_attr
        /home/cm/src/gcc/riscv-mainline/gcc/config/riscv/riscv-target-attr.cc:370
0x182334c riscv_process_target_attr
        /home/cm/src/gcc/riscv-mainline/gcc/config/riscv/riscv-target-attr.cc:314
0x182363d riscv_option_valid_attribute_p(tree_node*, tree_node*, tree_node*, int)
        /home/cm/src/gcc/riscv-mainline/gcc/config/riscv/riscv-target-attr.cc:389
0xd560ee handle_target_attribute
        /home/cm/src/gcc/riscv-mainline/gcc/c-family/c-attribs.cc:5915
0xc24d04 decl_attributes(tree_node**, tree_node*, int, tree_node*)
        /home/cm/src/gcc/riscv-mainline/gcc/attribs.cc:900
0xc2bbed c_decl_attributes
        /home/cm/src/gcc/riscv-mainline/gcc/c/c-decl.cc:5501
0xc43b77 start_decl(c_declarator*, c_declspecs*, bool, tree_node*, bool, unsigned int*)
        /home/cm/src/gcc/riscv-mainline/gcc/c/c-decl.cc:5647
0xcb4d73 c_parser_declaration_or_fndef
        /home/cm/src/gcc/riscv-mainline/gcc/c/c-parser.cc:2773
0xcc158b c_parser_external_declaration
        /home/cm/src/gcc/riscv-mainline/gcc/c/c-parser.cc:2053
0xcc1fb5 c_parser_translation_unit
        /home/cm/src/gcc/riscv-mainline/gcc/c/c-parser.cc:1907
0xcc1fb5 c_parse_file()
        /home/cm/src/gcc/riscv-mainline/gcc/c/c-parser.cc:27303
0xd3b9c1 c_common_parse_file()
        /home/cm/src/gcc/riscv-mainline/gcc/c-family/c-opts.cc:1322
[Bug target/115554] RISC-V: ICE in case of multiple target-arch attributes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115554 --- Comment #1 from Christoph Müllner --- Forgot to mention: The ICE is triggered by an assertion in riscv_func_target_put(), which ensures we don't have more than one target-arch attribute in one function declaration.
[Bug target/115562] New: RISC-V: ICE because of reused fndecl with target-arch attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115562

Bug ID: 115562
Summary: RISC-V: ICE because of reused fndecl with target-arch attribute
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: cmuellner at gcc dot gnu.org
Target Milestone: ---

Minimal (?) reproducer:

$ cat foo-copy.c
void foo (void);

__attribute__((target("arch=+zbb")))
void*
memcpy (void *d, const void *, unsigned long)
{
  return d;
}

__attribute__((target("arch=+zbb"))) void fun0(void) {}
__attribute__((target("arch=+zbb"))) void fun1(void) {}
__attribute__((target("arch=+zbb"))) void fun2(void) {}
__attribute__((target("arch=+zbb"))) void fun3(void) {}
__attribute__((target("arch=+zbb"))) void fun4(void) {}
__attribute__((target("arch=+zbb"))) void fun5(void) {}
__attribute__((target("arch=+zbb"))) void fun6(void) {}
__attribute__((target("arch=+zbb"))) void fun7(void) {}
__attribute__((target("arch=+zbb"))) void fun8(void) {}
__attribute__((target("arch=+zbb"))) void fun9(void) {}
__attribute__((target("arch=+zbb"))) void fun10(void) {}
__attribute__((target("arch=+zbb"))) void fun11(void) {}
__attribute__((target("arch=+zbb"))) void fun12(void) {}

This is similar to PR115554, but triggers the assertion in riscv_func_target_put() because when processing `fun12` the fndecl is equal to the previously processed fndecl of `memcpy`. I.e., the assumption that the fndecl pointer can be used as an identifier (or comparable for the hash-table) does not hold.

Like PR115554, this bug is part of GCC14 and on the master branch. The ICE looks the same as for PR115554 (the same assertion is triggered).

To analyze this issue, I've extended riscv_func_target_put() like this:

+  if (*target_info_slot)
+    {
+      inform (loc, "Hash collision detected:");
+      inform (loc, " old function: %qE (%p)", (*target_info_slot)->fn_decl, (*target_info_slot)->fn_decl);
+      inform (loc, " old attributes: %s", (*target_info_slot)->fn_target_name.c_str());
+      inform (loc, " new function: %qE", fn_decl);
+      inform (loc, " new attributes: %s", fn_target_name.c_str ());
+    }
+  else
+    {
+      inform (loc, "Adding target attributes to function:");
+      inform (loc, " new function: %qE (%p)", fn_decl, fn_decl);
+      inform (loc, " new attributes: %s", fn_target_name.c_str ());
+    }
   gcc_assert (!*target_info_slot);

Additionally, I've included tree.h and added "location_t loc" as parameter of this function. This gives the following output on the reproducer above:

$ /opt/riscv-mainline/bin/riscv64-unknown-linux-gnu-gcc -c foo-copy.c
foo-copy.c:5:1: note: Adding target attributes to function:
    5 | memcpy (void *d, const void *, unsigned long)
      | ^~
foo-copy.c:5:1: note: new function: 'memcpy' (0x7f295879e200) // first appearance
foo-copy.c:5:1: note: new attributes: rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0_zbb1p0
foo-copy.c:10:43: note: Adding target attributes to function:
   10 | __attribute__((target("arch=+zbb"))) void fun0(void) {}
      | ^~~~
foo-copy.c:10:43: note: new function: 'fun0' (0x7f295879e400)
foo-copy.c:10:43: note: new attributes: rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0_zbb1p0
[...]
foo-copy.c:22:43: note: Adding target attributes to function:
   22 | __attribute__((target("arch=+zbb"))) void fun11(void) {}
      | ^
foo-copy.c:22:43: note: new function: 'fun11' (0x7f295879ef00)
foo-copy.c:22:43: note: new attributes: rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0_zbb1p0
foo-copy.c:23:43: note: Hash collision detected:
   23 | __attribute__((target("arch=+zbb"))) void fun12(void) {}
      | ^
foo-copy.c:23:43: note: old function: 'fun12' (0x7f295879e200) // same address!
foo-copy.c:23:43: note: old attributes: rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0_zbb1p0
foo-copy.c:23:43: note: new function: 'fun12'
foo-copy.c:23:43: note: new attributes: rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0_zbb1p0
foo-copy.c:23:1: internal compiler error: in riscv_func_target_put, at common/config/riscv/riscv-common.cc:536
   23 | __attribute__((target("arch=+zbb"))) void fun12(void) {}
      | ^

As can be seen in the example above, fndecl of `memcpy` has the address 0x7f295879e200, which is equal to the address of fndecl of `fun12`.

Note that even small adjustments to the source will break the reproducer. Therefore, I could not rename `memcpy` to something different.
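Why an address is a fragile key can be shown without GCC. The following is a minimal sketch, not GCC code, and it rests on an assumption the report does not state explicitly: that the storage of the earlier `memcpy` node was released and handed out again for `fun12`. All names (`struct decl`, `decl_alloc()`, `decl_free()`, `table_put()`) are hypothetical, and a tiny fixed pool stands in for the allocator so that the address reuse is deterministic.

```c
#include <assert.h>
#include <stddef.h>

struct decl { const char *name; };

/* Tiny pool: a released slot is handed out again by the next
   allocation, so two different declarations can share one address.  */
static struct decl pool[2];
static int in_use[2];

static struct decl *decl_alloc(const char *name)
{
    for (int i = 0; i < 2; i++)
        if (!in_use[i]) {
            in_use[i] = 1;
            pool[i].name = name;
            return &pool[i];
        }
    return NULL;
}

static void decl_free(struct decl *d) { in_use[d - pool] = 0; }

/* Side table keyed by node address, never told about decl_free().  */
struct entry { const struct decl *key; const char *attrs; };
static struct entry table[8];
static size_t n_entries;

/* Returns 1 if the entry was added and 0 if the key is already present
   (the situation the assertion in the report trips over).  */
static int table_put(const struct decl *key, const char *attrs)
{
    for (size_t i = 0; i < n_entries; i++)
        if (table[i].key == key)
            return 0;
    assert(n_entries < sizeof table / sizeof table[0]);
    table[n_entries].key = key;
    table[n_entries].attrs = attrs;
    n_entries++;
    return 1;
}
```

Registering a node, releasing it, and then allocating a second node yields the same address, so the second `table_put()` reports a collision although the two declarations are unrelated. Keying on something that outlives the node, or removing the entry when the node is released, would avoid the stale match.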
[Bug target/115562] RISC-V: ICE because of reused fndecl with target-arch attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115562 --- Comment #1 from Christoph Müllner --- This issue was discovered while analyzing a build issue with a patchset to introduce optimized string processing routines for RISC-V in glibc. See also: https://sourceware.org/pipermail/libc-alpha/2024-June/157627.html
[Bug target/115554] RISC-V: ICE in case of multiple target-arch attributes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115554 Christoph Müllner changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Christoph Müllner --- Fixed upstream with: * https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=aa8e2de78cae4dca7f9b0efe0685f3382f9ecb9a * https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=61c21a719e205f70bd046c6a0275d1a3fd6341a4 Backported to GCC-14: * https://gcc.gnu.org/git?p=gcc.git;a=commit;h=0e1f599d637668bba0b2890f4cd81e7fb70473bc * https://gcc.gnu.org/git?p=gcc.git;a=commit;h=b3cff8357e9dce680a20406698fa9dadfe04997d
[Bug target/115562] RISC-V: ICE because of reused fndecl with target-arch attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115562 Christoph Müllner changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #2 from Christoph Müllner --- Fixed upstream with: * https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=aa8e2de78cae4dca7f9b0efe0685f3382f9ecb9a * https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=61c21a719e205f70bd046c6a0275d1a3fd6341a4 Backported to GCC-14: * https://gcc.gnu.org/git?p=gcc.git;a=commit;h=0e1f599d637668bba0b2890f4cd81e7fb70473bc * https://gcc.gnu.org/git?p=gcc.git;a=commit;h=b3cff8357e9dce680a20406698fa9dadfe04997d
[Bug target/116035] [14/15] RISC-V: -march=rv64g_xtheadmemidx_zba generates illegal lwu insn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116035 Christoph Müllner changed: What|Removed |Added Last reconfirmed||2024-07-23 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Christoph Müllner --- Thanks for reporting. Seems like a Zba INSN is matching and causing some troubles. I'll prepare a fix.
[Bug target/116035] [14/15] RISC-V: -march=rv64g_xtheadmemidx_zba generates illegal lwu insn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116035 --- Comment #2 from Christoph Müllner --- Proposed fix has been posted on the mailing list: https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658091.html
[Bug target/116035] [14/15] RISC-V: -march=rv64g_xtheadmemidx_zba generates illegal lwu insn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116035 --- Comment #4 from Christoph Müllner --- I understood this as GCC 14 and 15 (i.e., master) show this issue. Testing with GCC 13 shows: error: '-march=rv64g_xtheadmemidx_zba': unexpected ISA string at end: 'zba' The issue does not apply to GCC 13 or older because the affected extensions were not supported back then. By the way, thanks for reminding me of the GCC 14 backport.
[Bug target/116033] [14/15] RISC-V: -march=rv64gv_xtheadmemidx generates illegal vse8.v insn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116033 Christoph Müllner changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-07-24 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org --- Comment #1 from Christoph Müllner ---
I've prepared a patch that disables pre-/post-modify addressing if RVV is enabled:
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658119.html

The underlying issue is outlined in the commit message.

We are confronted with the following optimization from auto_inc_dec (-O3), when RVV and XTheadMemIdx are enabled:

```
(insn 23 20 27 3 (set (mem:V4QI (reg:DI 136 [ ivtmp.13 ]) [0 MEM [(char *)_39]+0 S4 A32])
        (reg:V4QI 168)) "gcc/testsuite/gcc.target/riscv/pr116033.c":12:27 3183 {*movv4qi}
     (nil))
(insn 40 39 41 3 (set (reg:DI 136 [ ivtmp.13 ])
        (plus:DI (reg:DI 136 [ ivtmp.13 ])
            (const_int 20 [0x14]))) 5 {adddi3}
     (nil))
>
(insn 23 20 27 3 (set (mem:V4QI (post_modify:DI (reg:DI 136 [ ivtmp.13 ])
                (plus:DI (reg:DI 136 [ ivtmp.13 ])
                    (const_int 20 [0x14]))) [0 MEM [(char *)_39]+0 S4 A32])
        (reg:V4QI 168)) "gcc/testsuite/gcc.target/riscv/pr116033.c":12:27 3183 {*movv4qi}
     (expr_list:REG_INC (reg:DI 136 [ ivtmp.13 ])
        (nil)))
```

One solution would be to introduce a target hook to check if a certain type can be used for pre-/post-modify optimizations. However, it will be hard to justify such a hook if only a single RISC-V vendor extension requires that. Therefore, this patch takes a more drastic approach and disables pre-/post-modify addressing if TARGET_VECTOR is set. This results in not emitting pre-/post-modify instructions from XTheadMemIdx if RVV is enabled.
[Bug target/116033] [14/15] RISC-V: -march=rv64gv_xtheadmemidx generates illegal vse8.v insn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116033 --- Comment #2 from Christoph Müllner --- Jeff Law claimed that th_classify_address() is likely missing a mode check. I checked that before, and there is a mode check there. But, after this comment, I challenged the test and indeed: if (!(INTEGRAL_MODE_P (mode) && GET_MODE_SIZE (mode).to_constant () <= 8)) return false; INTEGRAL_MODE_P() includes vector modes. So, the proper fix for this issue is to ensure that GET_MODE_CLASS (MODE) == MODE_INT is fulfilled. I adjusted the patch and provided a v2: https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658130.html
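The difference between the two mode checks can be illustrated with a small mock. This is a sketch only and does not use GCC headers: `enum mode_class`, `struct mode`, and the three functions are hypothetical stand-ins, and the set of classes accepted by `integral_mode_p()` is an assumption that merely models the observation above that integer vector modes pass the INTEGRAL_MODE_P() test.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock subset of mode classes (hypothetical names).  */
enum mode_class {
    MC_INT,
    MC_PARTIAL_INT,
    MC_COMPLEX_INT,
    MC_VECTOR_BOOL,
    MC_VECTOR_INT,
    MC_FLOAT
};

/* Mock machine mode: its class and its size in bytes.  */
struct mode { enum mode_class cls; unsigned size; };

/* Models a broad "integral" test that also accepts vector classes.  */
static bool integral_mode_p(struct mode m)
{
    return m.cls == MC_INT || m.cls == MC_PARTIAL_INT
           || m.cls == MC_COMPLEX_INT || m.cls == MC_VECTOR_BOOL
           || m.cls == MC_VECTOR_INT;
}

/* Check as quoted in the comment: a 4-byte integer vector passes.  */
static bool mode_ok_before(struct mode m)
{
    return integral_mode_p(m) && m.size <= 8;
}

/* Stricter check: only scalar integer modes pass.  */
static bool mode_ok_after(struct mode m)
{
    return m.cls == MC_INT && m.size <= 8;
}
```

For a mode shaped like V4QI (`{ MC_VECTOR_INT, 4 }`), `mode_ok_before()` returns true while `mode_ok_after()` returns false; a scalar mode shaped like DI (`{ MC_INT, 8 }`) passes both.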
[Bug target/116035] [14/15] RISC-V: -march=rv64g_xtheadmemidx_zba generates illegal lwu insn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116035 Christoph Müllner changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Christoph Müllner --- The patch got accepted and has been merged: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=9817d29cd66762893782a52b2c304c5083bc0023 A GCC 14 backport was accepted as well and has been merged: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ab0386679fef35c544d139270436c63026e00ff2 Thanks again for reporting!
[Bug target/116131] [14/15 Regression] RISC-V: Unrecognizable insn with xtheadmemidx on rv32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116131 --- Comment #2 from Christoph Müllner ---
Thank you for reporting!

A first analysis showed that adding more extensions does not change anything. E.g. rv32gc_xtheadmemidx also triggers the error. However, rv64i_xtheadmemidx is not affected. Also, the optimization level has no impact on the issue.

(In reply to Jeffrey A. Law from comment #1)
> Looks like non-canonical RTL to me. Inside a MEM that shift should have
> been turned into a multiply.

I agree, but I don't think it is part of the problem, because in th_memidx_classify_address_index() we expect ASHIFTs.

When looking at the XTheadMemIdx implementation, we have two relevant files:
* gcc/config/riscv/thead.md has the optimization pattern (th_memidx_*)
* gcc/config/riscv/thead.cc processes them (th_memidx_classify_address_index())

In the particular case, th_memidx_I_c creates the optimized INSN:

(insn 18 14 0 2 (set (mem:SI (plus:SI (reg/f:SI 141)
                (ashift:SI (subreg:SI (reg:DI 134 [ a.0_1 ]) 0)
                    (const_int 2 [0x2]))) [0 S4 A32])
        (reg:SI 143 [ b ])) "":4:17 -1
     (nil))

The goal is obviously to generate a th.srw instruction.

The issue here is the subreg, which comes from the following INSN:

(insn 9 7 10 2 (set (reg:SI 139)
        (ashift:SI (subreg:SI (reg:DI 134 [ a.0_1 ]) 0)
            (const_int 2 [0x2]))) "gcc/testsuite/gcc.target/riscv/pr116131.c":12:8 294 {*ashlsi3}
     (expr_list:REG_DEAD (reg:DI 134 [ a.0_1 ])
        (nil)))

An easy fix is to reject subregs (confirmed to work):

diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
index a47fe6f28b8..b95959d6827 100644
--- a/gcc/config/riscv/thead.md
+++ b/gcc/config/riscv/thead.md
@@ -758,6 +758,7 @@ (define_insn_and_split "*th_memidx_I_c"
         (match_operand:X 3 "register_operand" "r")))
         (match_operand:TH_M_ANYI 0 "register_operand" "r"))]
   "TARGET_XTHEADMEMIDX
+   && !SUBREG_P (operands[1])
    && CONST_INT_P (operands[2])
    && pow2p_hwi (INTVAL (operands[2]))
    && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)"

A better alternative is to allow this subreg. I've prepared a patch and will send it once the tests are done.
[Bug target/116131] [14/15 Regression] RISC-V: Unrecognizable insn with xtheadmemidx on rv32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116131 Christoph Müllner changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 Last reconfirmed||2024-07-30 Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org
[Bug target/116590] unrecognized opcode th.vmv8r.v th.vfrec7.v when compiling for risc-v xtheadvector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116590 Christoph Müllner changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |cooper.qu at linux dot alibaba.com Ever confirmed|0 |1 Last reconfirmed|2024-09-04 00:00:00 |2024-10-17
[Bug target/116591] internal compiler error: in extract_insn when compiling for risc-v xtheadvector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116591 Christoph Müllner changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Keywords||ice-on-valid-code Last reconfirmed||2024-10-17 Assignee|unassigned at gcc dot gnu.org |cooper.qu at linux dot alibaba.com Ever confirmed|0 |1
[Bug target/116593] internal compiler error: in get_attr_type, at config/riscv/riscv.md:28048 with -O2 -O3 when compiling for risc-v xtheadvector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116593 Christoph Müllner changed: What|Removed |Added Target|Riscv |riscv Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |cooper.qu at linux dot alibaba.com Last reconfirmed||2024-10-17
[Bug target/116347] [13/14/15 only] RISC-V: Duplicate entries for -mtune in --target-help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116347 Christoph Müllner changed: What|Removed |Added Last reconfirmed||2024-10-17 CC||cmuellner at gcc dot gnu.org --- Comment #1 from Christoph Müllner --- Currently, the string "thead-c906" is used as an identifier for a CPU (RISCV_CORE) and a tuning (RISCV_TUNE). The help message lists under "valid arguments for -mtune= option" all tuning identifiers followed by all CPU identifiers. I don't think changing the identifier is the right thing to do, as it would break people's build scripts. Instead, I think it would be better to make the help-string-generator aware of such cases. Looking into the code, this should not be too hard: riscv_get_valid_option_values() in gcc/common/config/riscv/riscv-common.cc needs to be adjusted (in case OPT_mtune_) to avoid adding duplicates to the result vector. The function vec_safe_iterate() should help to iterate over existing entries. The duplication check needs to use strcmp() as the vector elements are const-char-pointers.
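The duplicate check suggested above boils down to a linear scan with strcmp() before appending. The following is a minimal stand-alone sketch, not a patch: `add_unique()` is a hypothetical helper that works on a plain array of strings instead of GCC's vec API, and the identifier "some-tune" in the usage note is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append NAME to LIST (holding COUNT of CAP entries) unless an equal
   string is already present.  Returns the new entry count.  The
   comparison uses strcmp() because equal identifiers may live at
   different addresses.  */
static size_t add_unique(const char **list, size_t count, size_t cap,
                         const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(list[i], name) == 0)
            return count;               /* duplicate: nothing to add */

    assert(count < cap);
    list[count] = name;
    return count + 1;
}
```

Adding "some-tune", then "thead-c906" from the tuning table, then "thead-c906" again from the CPU table leaves two entries in the list, so the help text would print "thead-c906" only once.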
[Bug target/116720] [13/14/15 Regression] RISC-V: Unrecognizable insn with xtheadmemidx on rv32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116720 Christoph Müllner changed: What|Removed |Added Last reconfirmed||2024-10-17 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED
[Bug target/111565] ICE: in riscv_expand_strcmp_scalar, at config/riscv/riscv-string.cc:382 with -mcpu=thead-c906 -minline-strncmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111565 Christoph Müllner changed: What|Removed |Added CC||cmuellner at gcc dot gnu.org Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org Last reconfirmed||2024-10-17 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Christoph Müllner --- GCC 14 (I just reproduced on GCC 14.1 and 14.2) is affected. Master (GCC 15) does not have any issues.
[Bug target/116347] [13/14/15 only] RISC-V: Duplicate entries for -mtune in --target-help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116347 Christoph Müllner changed: What|Removed |Added Last reconfirmed|2024-10-17 00:00:00 |2024-10-22 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #2 from Christoph Müllner ---
I just noticed that there exists a proposal to address this on the list from mid August:
https://patchwork.sourceware.org/project/gcc/patch/20240819081442.1955204-1-shiyul...@iscas.ac.cn/

This patch adds the postfix "-series" to tuning identifiers, which are already used as CPU identifiers (e.g. "thead-c906" -> "thead-c906-series"). Jeff questioned if CPU core identifiers should be listed (and accepted) as strings for -mtune. Palmer wrote that he would have a look.

Here's a quick overview of what other backends do with mcpu/mtune:
* aarch64|arm|rs6000/PowerPC: mtune and mcpu flags accept the same identifiers. mtune selects the tuning struct. mcpu additionally sets the enabled extensions (similar to march).
* riscv: same as above, but additional identifiers for tuning structs exist that are accepted for mtune.
* mips: No -mcpu flag. mtune selects the tuning struct.
* x86: mcpu is deprecated and behaves like mtune. mtune sets the tuning struct. march selects extensions and tuning and accepts the same identifiers as mtune.

I still think that simply suppressing duplicates when generating the help text would be the solution with the least user impact.
[Bug c/109393] Very trivial address calculation does not fold
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109393 Christoph Müllner changed: What|Removed |Added CC||cmuellner at gcc dot gnu.org Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #11 from Christoph Müllner --- This has been fixed upstream.
[Bug tree-optimization/114326] Missed optimization for A || B when !B implies A.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326 Christoph Müllner changed: What|Removed |Added CC||cmuellner at gcc dot gnu.org Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #5 from Christoph Müllner --- Fixed on master.
[Bug tree-optimization/117830] [15 Regression] Miscompilation of 464.h264ref at -O2 -march=generic since r15-5563-g1c4d39ada33d36
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117830 --- Comment #5 from Christoph Müllner ---
Thank you for reporting this! I can reproduce this issue on x86_64 (I did not test on other architectures). I have also confirmed that the suspected change (1c4d39ada33d) causes this by validating that reverting the change fixes the miscompare.
An initial analysis showed that we have a total of four blends in CPU2006's h264:
* 3x build_base_gcc43-64bit./block.c.213t.forwprop4
* 1x build_base_gcc43-64bit./macroblock.c.213t.forwprop4
Looking closer at the dump files of forwprop, the issue becomes apparent: In find_sad_16x16 (macroblock.c), we merge two sequences that both utilize three of four lanes.
_230 = VEC_PERM_EXPR ;
_238 = VEC_PERM_EXPR ;
vect__108.3193_321 = _238 - _230;
vect__107.3192_225 = _230 + _238;
_317 = VEC_PERM_EXPR ; // { 0, 5, 2, 7 } could be narrowed to { 0, 5, 0, 4 }
_263 = VEC_PERM_EXPR ;
_294 = VEC_PERM_EXPR ;
vect__109.3191_252 = _294 - _263;
vect__104.3190_257 = _263 + _294;
_247 = VEC_PERM_EXPR ; // { 0, 5, 2, 7 } could be narrowed to { 0, 5, 0, 4 }
This means the check if we utilize less than half of the lanes in a sequence is wrong. Looking into the code shows that this is indeed the case. I already have a fix that is currently being tested.
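To illustrate why two sequences that each use three of four lanes must not be merged, here is a minimal sketch of a lane-budget check. It is an assumption-laden model, not the forwprop code: `used_lanes` and `fits_in_one_vector` are hypothetical names, and a sequence is modelled simply as a per-lane "used" flag.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not GCC code): number of lanes a sequence uses.
std::size_t used_lanes(const std::vector<bool>& lanes) {
  return static_cast<std::size_t>(std::count(lanes.begin(), lanes.end(), true));
}

// Two sequences can only share one vector if their used lanes fit into it
// together; with an equal split this means at most half of the lanes each.
bool fits_in_one_vector(const std::vector<bool>& a, const std::vector<bool>& b) {
  assert(a.size() == b.size());
  return used_lanes(a) + used_lanes(b) <= a.size();
}
```

With four lanes, two sequences that each use three lanes need six slots, so the check rejects them, while two sequences using two lanes each are accepted.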
[Bug tree-optimization/117830] [15 Regression] Miscompilation of 464.h264ref at -O2 -march=generic since r15-5563-g1c4d39ada33d36
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117830 --- Comment #6 from Christoph Müllner --- Patch on list: https://gcc.gnu.org/pipermail/gcc-patches/2024-December/672065.html
[Bug target/116347] [13/14/15 only] RISC-V: Duplicate entries for -mtune in --target-help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116347 Christoph Müllner changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #3 from Christoph Müllner --- Patch on list: https://gcc.gnu.org/pipermail/gcc-patches/2024-December/672062.html
[Bug tree-optimization/117830] [15 Regression] Miscompilation of 464.h264ref at -O2 -march=generic since r15-5563-g1c4d39ada33d36
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117830 Christoph Müllner changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #7 from Christoph Müllner --- The patch was approved and has been pushed to master: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=eee2891312a9b42acabcc82739604c9fa8421757
[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 117830, which changed state. Bug 117830 Summary: [15 Regression] Miscompilation of 464.h264ref at -O2 -march=generic since r15-5563-g1c4d39ada33d36 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117830 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/118149] [15 regression] ICE when building lsp-plugins-1.2.14 (mmap: Cannot allocate memory in forwprop) since r15-5563
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118149 Christoph Müllner changed: What|Removed |Added Last reconfirmed||2024-12-20 Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 --- Comment #8 from Christoph Müllner --- Thanks for reporting! I've analyzed this, and indeed, this got fixed with the recent fix for PR117830. When calculating the lane allocation for the blended sequence, we did: while (lane_assignment[l] != 0) l++; That got fixed so that we won't access out of bounds. I've sent a patch that adds the reduced testcase to the test suite.
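The quoted loop runs past the end of the array when no entry is zero. A bounds-checked variant could look like the sketch below; it is an illustration under assumptions, not the fix that went into GCC. The name `first_free_lane` and the use of `std::vector` and `std::optional` (C++17) are choices made here for a self-contained example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper (not GCC code): index of the first unassigned lane
// (value 0), or std::nullopt if every lane is already taken.
std::optional<std::size_t> first_free_lane(const std::vector<unsigned>& lane_assignment) {
  for (std::size_t l = 0; l < lane_assignment.size(); ++l) {
    // The loop condition bounds l, so the access never leaves the array.
    if (lane_assignment[l] == 0)
      return l;
  }
  return std::nullopt;
}
```

Returning an empty optional forces the caller to handle the "no free lane" case explicitly instead of continuing with an out-of-range index.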
[Bug target/116347] [13/14/15 only] RISC-V: Duplicate entries for -mtune in --target-help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116347 Christoph Müllner changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #4 from Christoph Müllner --- Patch was accepted and has been pushed on master: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=8af296c290216e03bc20e7291e64c19e0d94cfd6
[Bug tree-optimization/118149] [15 regression] ICE when building lsp-plugins-1.2.14 (mmap: Cannot allocate memory in forwprop) since r15-5563
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118149 Christoph Müllner changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #9 from Christoph Müllner --- The new tests pushed on master: https://gcc.gnu.org/pipermail/gcc-patches/2024-December/672123.html
[Bug other/117728] [15 regression] new test case gcc.dg/tree-ssa/satd-hadamard.c from r15-5563-g1c4d39ada33d36 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117728 Christoph Müllner changed: What|Removed |Added Last reconfirmed||2024-11-21 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Christoph Müllner --- Thanks for reporting! This is an issue with the test case and not with the code generation. The test expects the transformation to succeed, but there are valid reasons why it fails on some platforms. Similar failures could occur in the other tests (vector-8.c and vector-9.c). I'll send out a patch that limits the target architecture to aarch64 and x86-64, where we know the tests pass.
[Bug testsuite/117728] [15 regression] new test case gcc.dg/tree-ssa/satd-hadamard.c from r15-5563-g1c4d39ada33d36 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117728 Christoph Müllner changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #2 from Christoph Müllner --- Should be fixed with https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ae0d842f3e7a119b21a000824b10920614088684
[Bug rtl-optimization/117922] [15 Regression] 1000% compilation time slow down on the testcase from pr26854
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117922 --- Comment #17 from Christoph Müllner ---
I reproduced the slow-down with a recent master on a 5950X:
* no-mem-fold-offset: 4m58.226s
* mem-fold-offset: 11m19.311s (+127%)
More details from -ftime-report:
* no-mem-fold-offset: df reaching defs : 9.34 ( 3%) 0 ( 0%)
* mem-fold-offset: df reaching defs : 381.40 ( 55%) 0 ( 0%)
A look at the detailed time report (-ftime-report -ftime-report-details) shows:
Time variable wall GGC
[...]
phase opt and generate : 682.81 ( 99%) 6175M ( 97%)
[...]
callgraph functions expansion : 646.99 ( 94%) 5695M ( 89%)
[...]
fold mem offsets : 1.73 ( 0%) 679k ( 0%)
`- CFG verifier: 2.10 ( 0%) 0 ( 0%)
`- df use-def / def-use chains : 2.32 ( 0%) 0 ( 0%)
`- df reaching defs: 370.68 ( 54%) 0 ( 0%)
`- verify RTL sharing : 0.05 ( 0%) 0 ( 0%)
[...]
TOTAL : 690.06 6365M
I read this as "fold mem offset utilizes 0% of memory", so there is no issue with the memory footprint. To confirm this, `time -v` was used:
* no-mem-fold-offset: Maximum resident set size (kbytes): 15563684
* mem-fold-offset: Maximum resident set size (kbytes): 15564364
I looked at the pass, and a few things could be cleaned up in the pass itself (e.g., redundant calls). However, that won't change anything in the observed performance. The time-consuming part is UD+DU DF analysis for the whole function. Even if the pass were to "return 0" right after doing nothing but the analysis, we would end up with the same run time (confirmed by measurement).
The pass operates on BB-granularity, so DF analysis of the whole function provides more information than needed. When going through the documentation, I came across df_set_blocks(), which I expected to reduce the problem significantly. So, I moved the df_analyze() call into the FOR_ALL_BB_FN() loop, right after a call to df_set_blocks(), with the intent to only have a single block set per iteration. However, that triggered a few ICEs in DF, and once they were bypassed, it ended up in practical non-termination (i.e. df_set_blocks() does not make the calls to df_analyze() significantly cheaper).
My conclusion: This can only be fixed by not using DF analysis and implementing a pass-specific analysis. So far, I have not found a good solution for this. But I haven't looked at all the suggestions in detail. Can someone help me find what Paolo referenced as "the multiple definitions DF problem that was introduced for fwprop in 2009"?
[Bug tree-optimization/117079] [15 Regression] FAIL: gcc.target/i386/pr105493.c since r15-2820-gab18785840d7b8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117079 --- Comment #4 from Christoph Müllner ---
The reason that we don't have "MEM " in the dump anymore is that we now have "MEM ". Further, the size of the function in the test case shrinks from 225 instructions down to 109 (almost all vector instructions).
I tried to measure a performance difference on my 5950X (-march=native) when calling the test function four times in a loop with 1024l * 1024 * 1024 * 1024 iterations. However, I did not see enough evidence to claim that the new code is better (memory bandwidth is probably the limit):
* old: 4m34.405s, 4m47.825s, 4m38.187s
* new: 4m34.722s, 4m34.936s, 4m34.922s
I propose to fix the failing test case by fixing the test condition. A patch for that is on the list: https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673551.html
FWIW, here is a small code change that will bring back the old behavior for analysis:
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -2595,7 +2595,7 @@ out:
   auto_vec two_op_perm_indices[2];
   vec two_op_scalar_stmts[2] = {vNULL, vNULL};
-  if (two_operators && oprnds_info.length () == 2 && group_size > 2)
+  if (false && two_operators && oprnds_info.length () == 2 && group_size > 2)
     {
       unsigned idx = 0;
       hash_map seen;
[Bug tree-optimization/118487] [15 Regression] ICE tree check: expected vector_cst, have ssa_name in vector_cst_encoded_nelts, at tree.h:4683 since r15-5563-g1c4d39ada33d36
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118487 Christoph Müllner changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #2 from Christoph Müllner --- I can reproduce this ICE. The issue comes from the uninitialized mask (or selector) of the vector shuffle (or permutation). Uninitialized means that values might exceed the number of possible elements. The documentation states, "The elements of mask are considered modulo N in the single-operand case and modulo 2*N in the two-operand case." However, we don't perform this modulo-operation on the indices found in the mask. I will fix this.
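The modulo rule quoted from the documentation can be illustrated on plain integers. This sketch only models the documented semantics; it does not operate on GCC trees, and `normalize_mask` is a hypothetical name introduced for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not GCC code): reduce each selector element modulo N
// in the single-operand case and modulo 2*N in the two-operand case.
std::vector<unsigned> normalize_mask(const std::vector<unsigned>& mask,
                                     unsigned nelts, bool two_operands) {
  assert(nelts > 0);
  const unsigned modulus = two_operands ? 2 * nelts : nelts;
  std::vector<unsigned> out;
  out.reserve(mask.size());
  for (unsigned index : mask)
    out.push_back(index % modulus);  // always a valid lane index afterwards
  return out;
}
```

For a four-element vector, an out-of-range selector element such as 9 refers to lane 1 in both cases (9 mod 4 and 9 mod 8), while 5 refers to lane 1 of the only operand in the single-operand case and to lane 1 of the second operand in the two-operand case.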
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 117079, which changed state. Bug 117079 Summary: [15 Regression] FAIL: gcc.target/i386/pr105493.c since r15-2820-gab18785840d7b8 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117079 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/117079] [15 Regression] FAIL: gcc.target/i386/pr105493.c since r15-2820-gab18785840d7b8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117079 Christoph Müllner changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from Christoph Müllner --- Fixed in https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=120a37008222bf6fe17658af3d1ba1b384642905
[Bug tree-optimization/118487] [15 Regression] ICE tree check: expected vector_cst, have ssa_name in vector_cst_encoded_nelts, at tree.h:4683 since r15-5563-g1c4d39ada33d36
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118487 Christoph Müllner changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #4 from Christoph Müllner --- Fixed with https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b42eeef63a7e88f90e6ecab9c541b96146759b8c Thanks for reporting!
[Bug tree-optimization/118487] [15 Regression] ICE tree check: expected vector_cst, have ssa_name in vector_cst_encoded_nelts, at tree.h:4683 since r15-5563-g1c4d39ada33d36
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118487 --- Comment #3 from Christoph Müllner --- My initial comment about the need to sanitize the mask elements of VEC_PERM_EXPR was correct, but there is nothing to be done for that, because this is handled by ccp1. The ICE reported here comes from the issue of not checking the TREE_CODE of the mask tree. I've sent a fix for that to the list: https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673703.html While analysing this, I noticed that we make redundant calls to to_constant(), which is addressed here: https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673702.html
[Bug target/119587] RISC-V: XTheadMemIdx: ICE on valid code with asm operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119587 Christoph Müllner changed: What|Removed |Added CC||cmuellner at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Keywords||ice-on-valid-code Last reconfirmed||2025-04-02
[Bug target/119587] New: RISC-V: XTheadMemIdx: ICE on valid code with asm operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119587
Bug ID: 119587
Summary: RISC-V: XTheadMemIdx: ICE on valid code with asm operands
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: cmuellner at gcc dot gnu.org
Target Milestone: ---
Bohan Lei reported an ICE along with a patch [1] that attempts to fix it.
Reproducer:
// gcc -Ofast -march=rv64gc_xtheadmemidx
int a;
int **b;
int** c () {
  int **e = &b[(unsigned)(long)&a];
  __asm__ ("" : "+A"(*e));
  return 0;
}
Replacing "return 0" with "return e" avoids the ICE. The underlying issue is that the combiner's output cannot be lowered later on (which triggers the ICE in LRA). Bohan's patch attempts to address the ICE with a splitter. However, it has not yet been decided whether that's the right way. Jeff Law has started a discussion in [2]. Since this has already become more than a simple fix, I'm opening a ticket here.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2025-March/678933.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2025-April/679950.html