[Bug c++/119102] New: GCC 15.0 'import std;' fails with Ofast (not with O3) due to some openmp internal error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119102

Bug ID: 119102
Summary: GCC 15.0 'import std;' fails with Ofast (not with O3) due to some openmp internal error
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: igor.machado at gmail dot com
Target Milestone: ---

I successfully used GCC 15.0 to compile several programs with CXX Modules, but I noticed that it fails with -Ofast, which does not happen with -O0, -O1, -O2 and -O3. The error somehow mentions -fopenmp, which I don't use.

The code is trivial, just:

file: main.cpp
```
import std;
int main() { return 0; }
```

Compiling with this command FAILS:

g++-15 -std=c++23 -Ofast -fmodules -fsearch-include-path bits/std.cc main.cpp -o example

```
In module imported at ./main.cpp:1:1:
std: error: module contains OpenMP, use ‘-fopenmp’ to enable
std: error: failed to read compiled module: Bad file data
std: note: compiled module file is ‘gcm.cache/std.gcm’
std: fatal error: returning to the gate for a mechanical issue
compilation terminated.
```

This works fine:

g++-15 -std=c++23 -O3 -fmodules -fsearch-include-path bits/std.cc main.cpp -o example

Even when `-fopenmp` is enabled, the compiler breaks:

```
$ g++-15 -std=c++23 -O3 -fopenmp -fmodules -freport-bug -fsearch-include-path bits/std.cc main.cpp -o example]
/usr/include/c++/15/bits/std.cc:37:8: internal compiler error: in decl_node, at cp/module.cc:8808
37 | export module std;
   |^~
0x73d907e2a1c9 __libc_start_call_main
        ../sysdeps/nptl/libc_start_call_main.h:58
0x73d907e2a28a __libc_start_main_impl
        ../csu/libc-start.c:360
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.
```

Version is:

g++-15 (Ubuntu 15-20250213-1ubuntu1) 15.0.1 20250213 (experimental) [master r15-7502-g26baa2c09b3]

Operating System is Ubuntu 24.04, with gcc-15 from Ubuntu 25.04 repo:
- deb http://cz.archive.ubuntu.com/ubuntu plucky main universe

I know that CXX Modules and "import std;" are still quite experimental, but it seemed strange to trigger an internal compiler error, which is why I'm reporting it.

Good luck!
[Bug c++/119102] GCC 15.0 'import std;' fails with Ofast (not with O3) due to some openmp internal error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119102

--- Comment #1 from Igor Machado Coelho ---
Note that if first building the std module with O3 and then linking with Ofast, it works fine:

    g++-15 -std=c++23 -O3 -fmodules -fsearch-include-path bits/std.cc main.cpp -o example
    # this generates gcm.cache/std.gcm

Then... use Ofast:

    g++-15 -std=c++23 -Ofast -fmodules main.cpp -o example

This does not break the compiler... so it's indeed something related to compiling the std.cc with Ofast.
[Bug middle-end/118874] [15 regression] ICE in copy_rtx, at rtl.cc:372
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118874 --- Comment #6 from Iain Sandoe --- is this related to or maybe a dup of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117364 ?
[Bug c++/119073] [15 Regression] ICE in cp_gimplify_expr, at cp/cp-gimplify.cc:911 with temporary vector in range-for with -std=c++23 since r15-7481
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119073

Jason Merrill changed:

What|Removed|Added
Assignee|unassigned at gcc dot gnu.org|jason at gcc dot gnu.org
Status|NEW|ASSIGNED
[Bug middle-end/118874] [15 regression] ICE in copy_rtx, at rtl.cc:372
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118874 --- Comment #7 from Jakub Jelinek --- Most likely yes.
[Bug fortran/103391] [12/13/14/15 Regression] ICE: gimplification failed since r7-4021-g574284e9c49687d8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103391

--- Comment #9 from paul.richard.thomas at gmail dot com ---
That was a question at the end, not a statement :-) I cannot see anything wrong with the test case but wondered if one of the more eagle-eyed of us could see a standardese problem with it.

Have you had any experience with ChatGPT or similar? I was wondering whether or not it is up to the resolution of standard questions.

Cheers

Paul

On Mon, 3 Mar 2025 at 14:34, vehre at gcc dot gnu.org < gcc-bugzi...@gcc.gnu.org> wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103391
>
> Andre Vehreschild changed:
>
> What|Removed|Added
> Assignee|unassigned at gcc dot gnu.org|vehre at gcc dot gnu.org
> Status|NEW|ASSIGNED
>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.
[Bug fortran/77872] [12/13/14/15 Regression] ICE in gfc_conv_descriptor_token, at fortran/trans-array.c:305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77872

Andre Vehreschild changed:

What|Removed|Added
Status|ASSIGNED|WAITING

--- Comment #15 from Andre Vehreschild ---
Awaiting review at: https://gcc.gnu.org/pipermail/fortran/2025-March/061822.html
[Bug rtl-optimization/119071] [12/13/14/15 Regression] Miscompile at -O2 since r10-7268
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119071

--- Comment #10 from Jakub Jelinek ---
cse1 optimizes insn 15 away in:

(insn 10 9 11 2 (parallel [ (set (reg:SI 84 [ _3 ]) (plus:SI (reg:SI 96) (reg:SI 92 [ _18 ]))) (clobber (reg:CC 17 flags)) ]) 185 {*addsi_1} (nil))
(insn 11 10 12 2 (set (reg:CCZ 17 flags) (compare:CCZ (reg:SI 84 [ _3 ]) (const_int 1 [0x1]))) 11 {*cmpsi_1} (nil))
(insn 12 11 13 2 (set (reg:QI 98) (ne:QI (reg:CCZ 17 flags) (const_int 0 [0]))) 732 {*setcc_qi} (nil))
(insn 13 12 14 2 (set (reg:SI 97) (zero_extend:SI (reg:QI 98))) 119 {*zero_extendqisi2} (nil))
(insn 14 13 15 2 (set (reg:SI 89 [ _15 ]) (reg:SI 97)) 67 {*movsi_internal} (nil))
(insn 15 14 16 2 (set (reg:CCZ 17 flags) (compare:CCZ (reg:SI 84 [ _3 ]) (const_int 1 [0x1]))) 11 {*cmpsi_1} (nil))
(insn 16 15 17 2 (set (reg:QI 100) (eq:QI (reg:CCZ 17 flags) (const_int 0 [0]))) 732 {*setcc_qi} (nil))

which looks reasonable, nothing clobbers flags in between and insn 11 sets it to the same value.

I think things go wrong during combine. Before that revision we have

(insn 5 2 6 2 (set (reg:CCZ 17 flags) (compare:CCZ (mem/c:SI (symbol_ref:DI ("a") [flags 0x2] ) [1 a+0 S4 A32]) (const_int -2 [0xfffe]))) 11 {*cmpsi_1} (nil))
...
(insn 18 17 19 2 (set (reg:SI 97) (eq:SI (reg:CCZ 17 flags) (const_int 0 [0]))) 731 {*setcc_si_1_movzbl} (expr_list:REG_DEAD (reg:CC 17 flags) (nil)))
(insn 19 18 20 2 (set (reg:SI 102) (ne:SI (reg:CCZ 17 flags) (const_int 0 [0]))) 731 {*setcc_si_1_movzbl} (expr_list:REG_DEAD (reg:CC 17 flags) (nil)))
(insn 20 19 21 2 (parallel [ (set (reg:SI 93 [ ]) (minus:SI (reg:SI 102) (reg:SI 97))) (clobber (reg:CC 17 flags)) ]) 254 {*subsi_1} (expr_list:REG_DEAD (reg:SI 102) (expr_list:REG_DEAD (reg:SI 97) (expr_list:REG_UNUSED (reg:CC 17 flags) (nil)
(insn 21 20 25 2 (set (mem/c:SI (symbol_ref:DI ("b") [flags 0x2] ) [1 b+0 S4 A32]) (reg:SI 93 [ ])) 67 {*movsi_internal} (nil))

which feels reasonable. The testcase has UB when a == -2 (left shift by -1) and otherwise sets b to 1.

But starting with r10-7268 the combiner combines this into

(insn 5 2 6 2 (set (reg:CCZ 17 flags) (compare:CCZ (mem/c:SI (symbol_ref:DI ("a") [flags 0x2] ) [1 a+0 S4 A32]) (const_int -2 [0xfffe]))) 11 {*cmpsi_1} (expr_list:REG_UNUSED (reg:CC 17 flags) (nil)))
...
(insn 20 19 21 2 (set (reg:SI 93 [ ]) (const_int 0 [0])) 67 {*movsi_internal} (nil))
(insn 21 20 25 2 (set (mem/c:SI (symbol_ref:DI ("b") [flags 0x2] ) [1 b+0 S4 A32]) (reg:SI 93 [ ])) 67 {*movsi_internal} (nil))

which is wrong for the non-UB case of a not being -2.
[Bug rtl-optimization/119099] [15 regression] Compile-time hang in ext-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119099

Jeffrey A. Law changed:

What|Removed|Added
Ever confirmed|0|1
Last reconfirmed||2025-03-03
Status|UNCONFIRMED|ASSIGNED

--- Comment #3 from Jeffrey A. Law ---
Bi-directional dataflow is notoriously hard to get correct and I have zero confidence this code handles that reasonably. I thought I had some checks for this, though I don't immediately see them.

I see 2-3 WTF things going on, but as Alexey noted, the key one is the expansion and contraction of the sets.
[Bug rtl-optimization/119071] [12/13/14/15 Regression] Miscompile at -O2 since r10-7268
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119071

Jakub Jelinek changed:

What|Removed|Added
CC||segher at gcc dot gnu.org

--- Comment #11 from Jakub Jelinek ---
So, I think the problematic combination is

Trying 13, 18, 17 -> 19:
   13: r97:SI=flags:CCZ!=0
   18: {r101:SI=-r97:SI;clobber flags:CC;}
      REG_UNUSED flags:CC
   17: r99:SI=flags:CCZ==0
      REG_DEAD flags:CCZ
   19: {r102:SI=r99:SI<
[Bug rtl-optimization/118739] [15 Regression] wrong code at -O{s,3} with "-fno-tree-forwprop -fno-tree-vrp" on x86_64-linux-gnu since r15-268-g9dbff9c05520a7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118739

--- Comment #18 from GCC Commits ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:a92dc3fe31c95d56019b2fb95a58414bca06241f

commit r15-7793-ga92dc3fe31c95d56019b2fb95a58414bca06241f
Author: Uros Bizjak
Date: Wed Feb 12 11:19:57 2025 +0100

combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]

The combine pass is trying to combine:

Trying 16, 22, 21 -> 23:
   16: r104:QI=flags:CCNO>0
   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
      REG_UNUSED flags:CC
   21: r119:QI=flags:CCNO<=0
      REG_DEAD flags:CCNO
   23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
      REG_DEAD r120:QI
      REG_DEAD r119:QI
      REG_UNUSED flags:CC

and creates the following two insn sequence:

modifying insn i2    22: r104:QI=flags:CCNO>0
      REG_DEAD flags:CC
deferring rescan insn with uid = 22.
modifying insn i3    23: r110:QI=flags:CCNO<=0
      REG_DEAD flags:CC
deferring rescan insn with uid = 23.

where the REG_DEAD note in i2 is not correct, because the flags register is still referenced in i3. In try_combine() megafunction, we have this part:

--cut here--
    /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
    if (i3notes)
      distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
                        elim_i2, elim_i1, elim_i0);
    if (i2notes)
      distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
                        elim_i2, elim_i1, elim_i0);
    if (i1notes)
      distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
                        elim_i2, local_elim_i1, local_elim_i0);
    if (i0notes)
      distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
                        elim_i2, elim_i1, local_elim_i0);
    if (midnotes)
      distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
                        elim_i2, elim_i1, elim_i0);
--cut here--

where the compiler distributes REG_UNUSED note from i2:

   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
      REG_UNUSED flags:CC

via distribute_notes() using the following:

--cut here--
          /* Otherwise, if this register is used by I3, then this register
             now dies here, so we must put a REG_DEAD note here unless there
             is one already.  */
          else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
                   && ! (REG_P (XEXP (note, 0))
                         ? find_regno_note (i3, REG_DEAD,
                                            REGNO (XEXP (note, 0)))
                         : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
            {
              PUT_REG_NOTE_KIND (note, REG_DEAD);
              place = i3;
            }
--cut here--

Flags register is used in I3, but there already is a REG_DEAD note in I3. The above condition doesn't trigger and continues in the "else" part where REG_DEAD note is put to I2. The proposed solution corrects the above logic to trigger every time the register is referenced in I3, avoiding the "else" part.

PR rtl-optimization/118739

gcc/ChangeLog:

* combine.cc (distribute_notes) : Correct the logic when the register is used by I3.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr118739.c: New test.
[Bug middle-end/118874] [15 regression] ICE in copy_rtx, at rtl.cc:372
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118874 --- Comment #5 from Jakub Jelinek --- I'm unsure if this should be a P1, the P1-ish part on this is solely that a test was added that ICEs, but the test ICEd before for several years as well.
[Bug tree-optimization/117919] [14/15 Regression] ICE: in propagate, at gimple-ssa-sccopy.cc:625 with -O -fno-tree-forwprop -fnon-call-exceptions --param=early-inlining-insns=192
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117919

--- Comment #7 from GCC Commits ---
The releases/gcc-14 branch has been updated by Filip Kastl :

https://gcc.gnu.org/g:6ffbc711afbda9446df51fd2b542ecd61853283d

commit r14-11373-g6ffbc711afbda9446df51fd2b542ecd61853283d
Author: Filip Kastl
Date: Sun Mar 2 06:39:17 2025 +0100

gimple: sccopy: Prune removed statements from SCCs [PR117919]

While writing the sccopy pass I didn't realize that 'replace_uses_by ()' can remove portions of the CFG. This happens when replacing arguments of some statement results in the removal of an EH edge. Because of this sccopy can then work with GIMPLE statements that aren't part of the IR anymore. In PR117919 this triggered an assertion within the pass which assumes that statements the pass works with are reachable.

This patch tells the pass to notice when a statement isn't in the IR anymore and remove it from its worklist.

PR tree-optimization/117919

gcc/ChangeLog:

* gimple-ssa-sccopy.cc (scc_copy_prop::propagate): Prune statements that 'replace_uses_by ()' removed.

gcc/testsuite/ChangeLog:

* g++.dg/pr117919.C: New test.

Signed-off-by: Filip Kastl
(cherry picked from commit 5349aa2accdf34a7bf9cabd1447878aaadfc0e87)
[Bug middle-end/118874] [15 regression] ICE in copy_rtx, at rtl.cc:372
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118874

--- Comment #8 from Iain Sandoe ---
The comments in PR117364 lead me to believe that this is a problem down-stream of the FE that happens to be exposed frequently by coroutines (since we need to populate because of the phasing required of that with the initial suspend). Eric seemed to agree that NRVO could just as easily result in the same circumstance.

As of now I am not sure how to proceed... if this is to become a P1 we need to discuss how to meet the constraints.
[Bug middle-end/118874] [15 regression] ICE in copy_rtx, at rtl.cc:372
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118874

--- Comment #9 from Jakub Jelinek ---
Looking at the simpler

    struct C { void *p; explicit C (void *p) : p(p) {} };
    C foo (int i, void *p) { C c (p); return c; }

test, -O2 -m32 vs. -O2 -m64 -mptr64 the reason why is used in the first case and not in the latter is in want_nrvo_p. can_do_nrvo_p is true in both cases, but aggregate_value_p (functype, current_function_decl) is true only in the former case but not in the latter.

So, the question is what is going on during the coroutine handling that NRVO is still used in a function which uses pretty much the same class.
[Bug ipa/119093] ICE: in function_and_variable_visibility, at ipa-visibility.cc:715 with weakref to target_clone
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119093

Richard Biener changed:

What|Removed|Added
Keywords||ice-on-valid-code

--- Comment #1 from Richard Biener ---
We need an ice-on-dubious-code ;)
[Bug fortran/118747] [15 Regression]: seg fault on accessing an elemental procedure dummy argument's deferred-length component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118747

--- Comment #6 from GCC Commits ---
The master branch has been updated by Andre Vehreschild :

https://gcc.gnu.org/g:43c11931acc50f3a44efb485b03e6a8d44df97e0

commit r15-7789-g43c11931acc50f3a44efb485b03e6a8d44df97e0
Author: Andre Vehreschild
Date: Wed Feb 26 14:30:13 2025 +0100

Fortran: Fix regression on double free on elemental function [PR118747]

Fix a regression where adding a temporary variable inserted a copy of the argument to the elemental function. That copy was then later used to free allocated memory, but the freeing was not tracked in the source array correctly.

PR fortran/118747

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_array_ctor_element): Remove copy to temporary variable.
* trans-expr.cc (gfc_conv_procedure_call): Use references to array members instead of copies when freeing after use. Formatting fix.

gcc/testsuite/ChangeLog:

* gfortran.dg/alloc_comp_auto_array_4.f90: New test.
[Bug go/119098] GO built from GCC 14 sources no longer works when installing libgo23 build from GCC 15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119098

--- Comment #2 from Richard Biener ---
In the Makefile I see

version.go: s-version; @true
s-version: Makefile
	rm -f version.go.tmp
	echo "package sys" > version.go.tmp
	echo 'const GccgoToolDir = "$(libexecsubdir)"' >> version.go.tmp
	echo 'const StackGuardMultiplierDefault = 1' >> version.go.tmp
	$(SHELL) $(srcdir)/mvifdiff.sh version.go.tmp version.go
	$(STAMP) $@
[Bug go/119098] GO built from GCC 14 sources no longer works when installing libgo23 build from GCC 15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119098 --- Comment #3 from Richard Biener --- So for GCC 15 I suggest to bump the SONAME. But this behavior really looks odd with bad separation of compiler driver and runtime?
[Bug c++/119076] [15 Regression] ICE with Segmentation fault with modules due to char array in a template
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119076 --- Comment #10 from Jakub Jelinek --- Created attachment 60643 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60643&action=edit gcc15-pr119076.patch Untested fix.
[Bug target/119083] Remove SSE_FIRST_REG from ix86_class_likely_spilled_p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119083

--- Comment #7 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #5)
> (In reply to H.J. Lu from comment #3)
> > Created attachment 60640 [details]
> > A patch to remove SSE_FIRST_REG from ix86_class_likely_spilled_p
> >
> > Hongtao, can you measure its impact on SPEC CPU2017?
>
> Sure.

No big impact for this.
[Bug c/118983] I'm using the gcc comes from the Ubuntu 20.04, but it faied to compile a C program
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118983 --- Comment #3 from Andrew Pinski --- *** Bug 119095 has been marked as a duplicate of this bug. ***
[Bug c/119095] GCC in Ubuntu 20.04, 22.04 and 24.04 all have this problem.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119095

Andrew Pinski changed:

What|Removed|Added
Resolution|---|DUPLICATE
Status|UNCONFIRMED|RESOLVED

--- Comment #1 from Andrew Pinski ---
Without a full testcase, it is hard to help you. Also PR 118983 was the one you filed. Plus we don't support GCC that is provided by a distro; it is the distro that supports it.

*** This bug has been marked as a duplicate of bug 118983 ***
[Bug target/119090] [MAME] [Model 1] 3D graphics are full of glitches if built with CXXFLAGS="-march=native -mtune=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119090 --- Comment #3 from Manuel Alfayate --- WHAT "-march=native -mtune=native" ENABLES ON MY SYSTEM gcc -march=native -mtune=native -Q --help=target The following options are target specific: -m128bit-long-double [enabled] -m16 [disabled] -m32 [disabled] -m3dnow [disabled] -m3dnowa [disabled] -m64 [enabled] -m80387 [enabled] -m8bit-idiv [disabled] -m96bit-long-double [disabled] -mabi=sysv -mabm [enabled] -maccumulate-outgoing-args[disabled] -maddress-mode= long -madx [enabled] -maes [enabled] -malign-data= compat -malign-double[disabled] -malign-functions=0 -malign-jumps=0 -malign-loops=0 -malign-stringops [enabled] -mamx-bf16[disabled] -mamx-complex [disabled] -mamx-fp16[disabled] -mamx-int8[disabled] -mamx-tile[disabled] -mandroid [disabled] -mapx-features= none -mapx-inline-asm-use-gpr32[disabled] -mapxf[disabled] -march= znver4 -masm=att -mavx [enabled] -mavx10.1 -mavx10.1-256 -mavx10.1-256 [disabled] -mavx10.1-512 [disabled] -mavx2[enabled] -mavx256-split-unaligned-load [disabled] -mavx256-split-unaligned-store[disabled] -mavx5124fmaps[disabled] -mavx5124vnniw[disabled] -mavx512bf16 [enabled] -mavx512bitalg[enabled] -mavx512bw[enabled] -mavx512cd[enabled] -mavx512dq[enabled] -mavx512er[disabled] -mavx512f [enabled] -mavx512fp16 [disabled] -mavx512ifma [enabled] -mavx512pf[disabled] -mavx512vbmi [enabled] -mavx512vbmi2 [enabled] -mavx512vl[enabled] -mavx512vnni [enabled] -mavx512vp2intersect [disabled] -mavx512vpopcntdq [enabled] -mavxifma [disabled] -mavxneconvert[disabled] -mavxvnni [disabled] -mavxvnniint16[disabled] -mavxvnniint8 [disabled] -mbionic [disabled] -mbmi [enabled] -mbmi2[enabled] -mbranch-cost=<0,5> 3 -mcall-ms2sysv-xlogues[disabled] -mcet-switch [disabled] -mcld [disabled] -mcldemote[disabled] -mclflushopt [enabled] -mclwb[enabled] -mclzero [enabled] -mcmodel= [default] -mcmpccxadd [disabled] -mcpu= -mcrc32 [enabled] -mcx16[enabled] -mdaz-ftz [disabled] -mdirect-extern-access[enabled] 
-mdispatch-scheduler [disabled] -mdump-tune-features [disabled] -menqcmd [disabled] -mevex512 [enabled] -mf16c[enabled] -mfancy-math-387 [enabled] -mfentry [disabled] -mfentry-name= -mfentry-section= -mfma [enabled] -mfma4[disabled] -mforce-drap [disabled] -mforce-indirect-call [disabled] -mfp-ret-in-387 [enabled] -mfpmath= sse -mfsgsbase[enabled] -mfunction-retur
[Bug c++/99538] ICE: in maybe_add_lambda_conv_op, at cp/lambda.c:1037
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99538

Simon Martin changed:

What|Removed|Added
Status|NEW|ASSIGNED
Assignee|unassigned at gcc dot gnu.org|simartin at gcc dot gnu.org

--- Comment #2 from Simon Martin ---
Working on this one.
[Bug lto/119067] [14/15 Regression] ICE when building firefox-135.0.1 with LTO (tree check: expected none of vector_type, have vector_type in odr_types_equivalent_p, at ipa-devirt.cc:1262)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119067

--- Comment #12 from GCC Commits ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:f22e89167b3abfbf6d67f42fc4d689d8ffdc1810

commit r15-7790-gf22e89167b3abfbf6d67f42fc4d689d8ffdc1810
Author: Richard Biener
Date: Mon Mar 3 09:54:15 2025 +0100

ipa/119067 - bogus TYPE_PRECISION check on VECTOR_TYPE

odr_types_equivalent_p can end up using TYPE_PRECISION on vector types which is a no-go. The following instead uses TYPE_VECTOR_SUBPARTS for vector types so we also end up comparing the number of vector elements.

PR ipa/119067

* ipa-devirt.cc (odr_types_equivalent_p): Check TYPE_VECTOR_SUBPARTS for vectors.

* g++.dg/lto/pr119067_0.C: New testcase.
* g++.dg/lto/pr119067_1.C: Likewise.
[Bug lto/119067] [14 Regression] ICE when building firefox-135.0.1 with LTO (tree check: expected none of vector_type, have vector_type in odr_types_equivalent_p, at ipa-devirt.cc:1262)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119067

Richard Biener changed:

What|Removed|Added
Known to work||15.0
Summary|[14/15 Regression] ICE when building firefox-135.0.1 with LTO (tree check: expected none of vector_type, have vector_type in odr_types_equivalent_p, at ipa-devirt.cc:1262)|[14 Regression] ICE when building firefox-135.0.1 with LTO (tree check: expected none of vector_type, have vector_type in odr_types_equivalent_p, at ipa-devirt.cc:1262)

--- Comment #13 from Richard Biener ---
Fixed on trunk so far, I have queued a patch to sync hashing and streaming for GCC 16.
[Bug lto/119067] [14/15 Regression] ICE when building firefox-135.0.1 with LTO (tree check: expected none of vector_type, have vector_type in odr_types_equivalent_p, at ipa-devirt.cc:1262)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119067

Richard Biener changed:

What|Removed|Added
Priority|P3|P2

--- Comment #11 from Richard Biener ---
So the odr_types_equivalent_p code is obviously broken, but more interesting is why we didn't merge the two at this point identical vector types. This is because their SCC (size one) hash is different, 4263663699 vs 4287848316 and that is because of how we hash modes vs. how we stream them:

  if (CODE_CONTAINS_STRUCT (code, TS_TYPE_COMMON))
    {
      hstate.add_hwi (TYPE_MODE (t));

but

  /* For offloading, avoid streaming out TYPE_MODE for aggregate type since
     it may be host-specific. For eg, aarch64 uses OImode for ARRAY_TYPE
     whose size is 256-bits, which is not representable on accelerator.
     Instead stream out VOIDmode, and while streaming-in, recompute
     appropriate TYPE_MODE for accelerator.  */
  if (lto_stream_offload_p
      && (AGGREGATE_TYPE_P (expr) || VECTOR_TYPE_P (expr)))
    bp_pack_machine_mode (bp, VOIDmode);
  /* for VECTOR_TYPE, TYPE_MODE reevaluates the mode using target_flags
     not necessary valid in a global context.
     Use the raw value previously set by layout_type.  */
  else
    bp_pack_machine_mode (bp, TYPE_MODE_RAW (expr));

I have a fix for the ICE, leaving the two identical type copies around.
[Bug c/119092] Add support for clang/LLVM builtin __builtin_{reduce,elementwise}_*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119092

Richard Biener changed:

What|Removed|Added
Component|middle-end|c

--- Comment #3 from Richard Biener ---
__builtin_elementwise_atan - what do they do? Use __has_builtin and then assume there's a library implementation (with what ABI?). IMO _iff_ we want to support those the only practical way at the moment is to lower them in the frontend to scalar operations on vector extracts and build up a vector from the results.

The reduction builtins look somewhat more useful (what does openCL have here?), so I'd rather track both in separate bugreports.
[Bug c/119095] New: GCC in Ubuntu 20.04, 22.04 and 24.04 all have this problem.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119095

Bug ID: 119095
Summary: GCC in Ubuntu 20.04, 22.04 and 24.04 all have this problem.
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: wzis at hotmail dot com
Target Milestone: ---

wzsgcreg.c:237:6: internal compiler error: in subspan, at input.h:68
237 | st.st_size, ptime, line);
    | ^~
0x7f8f35858082 __libc_start_main
        ../csu/libc-start.c:308

In my testing the issue happens on Ubuntu 20.04 and 24.04, which means it affects GCC 9.4.0 all the way to 13.3.0.

The C statement that it complains about is the following:

sprintf(key, CERT_FMT_NORMAL, licCRC, MaxCertSize, sha384sumf(line), st.st_size, ptime, line);

Just for your info, this program has been compiled on many other versions of Linux, and on AIX, Solaris and MacOS, with no issue.

I submitted this bug a few days ago but can no longer find it. In that report I was told that it is a duplicate, that the original one is fixed, and that I should talk to Ubuntu; I asked Ubuntu about it, but they did not recognize it.
[Bug tree-optimization/119096] New: Loop with conditional, cast and reduction vectorized incorrectly with AVX-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119096

Bug ID: 119096
Summary: Loop with conditional, cast and reduction vectorized incorrectly with AVX-512
Product: gcc
Version: 14.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: someone12469 at gmail dot com
Target Milestone: ---

The following C program, which should output 0, outputs 8 when compiled with gcc -O2 -mavx512f on 64-bit Linux.

    int printf(const char *, ...);

    long sum(int* A, int* B) {
        long total = 0;
        for(int j = 0; j < 16; j++)
            if((A[j] > 0) & (B[j] > 0))
                total += (long)A[j];
        return total;
    }

    int main() {
        int A[16] = { 1,1,1,1, 1,1,1,1, 1,1,1,1, 1,1,1,1 };
        int B[16] = { };
        printf("%ld\n", sum(A, B));
    }

The singular & is intentional and significant, in the original program it was used to hint that the read is safe to get better vectorization.

In the resulting assembly for sum:

    sum:
    .LFB0:
        .cfi_startproc
        vmovdqu32 (%rdi), %zmm0
        vpxor %xmm2, %xmm2, %xmm2
        vmovdqu32 (%rsi), %zmm3
        vpcmpd $6, %zmm2, %zmm0, %k1
        vextracti64x4 $0x1, %zmm0, %ymm1
        vpmovsxdq %ymm0, %zmm0
        vpmovsxdq %ymm1, %zmm1
        vpcmpd $6, %zmm2, %zmm3, %k1{%k1}
        vmovdqa64 %zmm1, %zmm2
        kshiftrw $8, %k1, %k1
        vpaddq %zmm1, %zmm0, %zmm2{%k1}
        vextracti64x4 $0x1, %zmm2, %ymm1
        vpaddq %ymm2, %ymm1, %ymm1
        vextracti128 $0x1, %ymm1, %xmm0
        vpaddq %xmm1, %xmm0, %xmm0
        vpsrldq $8, %xmm0, %xmm1
        vpaddq %xmm1, %xmm0, %xmm0
        vmovq %xmm0, %rax
        vzeroupper
        ret
        .cfi_endproc

the main issue appears to be the "vpaddq %zmm1, %zmm0, %zmm2{%k1}", which keeps the value from the lower half when the upper half is masked, even if the lower half is masked as well.

Tested on x86_64-pc-linux-gnu on gcc 14.2.1 and a local build of the latest commit.

Since the bug reporting instructions insist, the output of gcc -v for my distribution's 14.2.1:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/14.2.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --enable-languages=ada,c,c++,d,fortran,go,lto,m2,objc,obj-c++,rust --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://gitlab.archlinux.org/archlinux/packaging/packages/gcc/-/issues --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.2.1 20240910 (GCC)
[Bug c++/119097] New: Modules references internal linkage entity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119097

Bug ID: 119097
Summary: Modules references internal linkage entity
Product: gcc
Version: 14.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: hypengwip at gmail dot com
Target Milestone: ---

When a variable is declared inside an unnamed namespace in a header file and then used in a struct definition, it causes a compilation error due to internal linkage. The issue occurs when the struct is used both in a header (.h) and inside a module (.cppm). However, using the same variable in a function inside a module (.cppm) does not produce any error.

```
// error: ‘struct A’ references internal linkage entity 'constexpr const int {anonymous}::default_val'

// part1.h
#pragma once
namespace {
    static constexpr const int default_val { 0 };
}
// if define A here
struct A {
    int value { default_val }; // failed
    A() : value(default_val) {} // failed
    static auto a() { return default_val; } // failed
};

// part1.cppm
module;
#include "part1.h"
export module mymod;
// if define A here
struct A {
    int value { default_val }; // failed
    A() : value(default_val) {} // ok
    static auto a() { return default_val; } // ok
};
export auto a() { return default_val; } // ok
export void part1_fun(A) {}
```
[Bug tree-optimization/119096] [14/15 regression] Loop with conditional, cast and reduction vectorized incorrectly with AVX-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119096

--- Comment #2 from Andrew Pinski ---
The vectorizer looks ok though:

  mask_patt_37.15_52 = [vec_unpack_lo_expr] mask__9.14_51;
  mask_patt_37.15_53 = [vec_unpack_hi_expr] mask__9.14_51;
  vect_patt_36.18_57 = .COND_ADD (mask_patt_37.15_52, vect__10.16_54, vect_total_21.17_56, vect__10.16_54);
  vect_patt_36.18_58 = .COND_ADD (mask_patt_37.15_53, vect__10.16_55, vect_patt_36.18_57, vect__10.16_55);

/* A ? B : B -> B. */
(simplify
 (cnd @0 @1 @1)
 @1)

Confirmed, I think COND_ADD folding goes wrong.
[Bug tree-optimization/119096] Loop with conditional, cast and reduction vectorized incorrectly with AVX-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119096

--- Comment #1 from Andrew Pinski ---
Gimple level:

  vect__4.8_45 = MEM [(int *)A_15(D)];
  vect__10.16_54 = [vec_unpack_lo_expr] vect__4.8_45;
  vect__10.16_55 = [vec_unpack_hi_expr] vect__4.8_45;
  mask__5.9_46 = vect__4.8_45 > { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  vect__7.12_49 = MEM [(int *)B_16(D)];
  mask__8.13_50 = vect__7.12_49 > { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  mask__9.14_51 = mask__5.9_46 & mask__8.13_50;
  mask_patt_37.15_53 = [vec_unpack_hi_expr] mask__9.14_51; // Only use the upper half
  vect_patt_36.18_58 = .COND_ADD (mask_patt_37.15_53, vect__10.16_54, vect__10.16_55, vect__10.16_55);
  _60 = .REDUC_PLUS (vect_patt_36.18_58); [tail call]
[Bug tree-optimization/119096] [14/15 regression] Loop with conditional, cast and reduction vectorized incorrectly with AVX-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119096

Sam James changed:

           What    |Removed                     |Added
   Target Milestone|---                         |14.3
      Known to work|                            |13.3.1
      Known to fail|                            |14.2.1, 15.0
            Summary|Loop with conditional, cast |[14/15 regression] Loop
                   |and reduction vectorized    |with conditional, cast and
                   |incorrectly with AVX-512    |reduction vectorized
                   |                            |incorrectly with AVX-512
[Bug tree-optimization/119096] [14/15 regression] Loop with conditional, cast and reduction vectorized incorrectly with AVX-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119096

Andrew Pinski changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2025-03-03
     Ever confirmed|0                           |1
[Bug go/119098] New: GO built from GCC 14 sources no longer works when installing libgo23 build from GCC 15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119098

Bug ID: 119098
Summary: GO built from GCC 14 sources no longer works when installing libgo23 build from GCC 15
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: go
Assignee: ian at airs dot com
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

We face the issue that when the compiler runtime, which includes libgo23, is updated from the GCC 14 provided one to the GCC 15 provided one (the libgo SONAME is unchanged), the GO compiler built from GCC 14 no longer works. The symptom is

[   25s] go: no such tool "cgo"

I suspect that somehow (parts of?) the path to cgo are built into this shared library instead of the driver for whatever reason. Possibly the GO runtime knows how to compile?!

The strace difference between the shlibs when invoking gccgo-14 is thus

- 29430 newfstatat(AT_FDCWD, "/usr/lib64/gcc/x86_64-suse-linux/14/cgo",
+ 29833 newfstatat(AT_FDCWD, "/usr/lib64/gcc/x86_64-suse-linux/15/cgo",

It does feel somewhat like a deja-vu ..?
[Bug go/119098] GO built from GCC 14 sources no longer works when installing libgo23 build from GCC 15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119098

Richard Biener changed:

           What    |Removed                     |Added
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=108057

--- Comment #1 from Richard Biener ---
PR108057 also had the 'no such tool "cgo"' issue, but after other type issues.
[Bug tree-optimization/119096] [14/15 regression] Loop with conditional, cast and reduction vectorized incorrectly with AVX-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119096

--- Comment #3 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #2)
> The vectorizer looks ok though:
>   mask_patt_37.15_52 = [vec_unpack_lo_expr] mask__9.14_51;
>   mask_patt_37.15_53 = [vec_unpack_hi_expr] mask__9.14_51;
>   vect_patt_36.18_57 = .COND_ADD (mask_patt_37.15_52, vect__10.16_54,
> vect_total_21.17_56, vect__10.16_54);
>   vect_patt_36.18_58 = .COND_ADD (mask_patt_37.15_53, vect__10.16_55,
> vect_patt_36.18_57, vect__10.16_55);
>
>
> /* A ? B : B -> B.  */
> (simplify
>  (cnd @0 @1 @1)
>  @1)
>
> Confirmed, I think COND_ADD folding goes wrong.

Wait maybe the original COND_ADD is incorrect. I can't remember how COND_ADD works.
I thought it was `mask_patt_37.15_52 ? (vect__10.16_54+vect_total_21.17_56) : vect__10.16_54` if so then the original COND_ADD is wrong.
[Bug target/119090] [MAME] [Model 1] 3D graphics are full of glitches if built with CXXFLAGS="-march=native -mtune=native"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119090 --- Comment #2 from Richard Biener --- Also can you specify what 'native' maps to for you? What processor do you have?
[Bug rtl-optimization/119099] [15 regression] Compile-time hang in ext-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119099

Sam James changed:

           What    |Removed                     |Added
                 CC|                            |law at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
            Summary|Compile-time hang in DCE    |[15 regression]
                   |                            |Compile-time hang in
                   |                            |ext-dce
           Keywords|                            |compile-time-hog
   Target Milestone|---                         |15.0

--- Comment #1 from Sam James ---
We have at least one other PR about ext-dce's use of df.
[Bug tree-optimization/119096] [14/15 regression] Loop with conditional, cast and reduction vectorized incorrectly with AVX-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119096

--- Comment #5 from Richard Biener ---
I'm testing

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index dc15b955aad..52533623cab 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -9064,7 +9064,7 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
            new_stmt = gimple_build_call_internal (internal_fn (code),
                                                   op.num_ops,
                                                   vop[0], vop[1], vop[2],
-                                                  vop[1]);
+                                                  vop[reduc_index]);
          else
            new_stmt = gimple_build_assign (vec_dest, tree_code (op.code),
                                            vop[0], vop[1], vop[2]);
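To make the effect of the else operand concrete, here is a small scalar model. It assumes the per-lane semantics discussed in this bug, i.e. `.COND_ADD (mask, a, b, else)` yields `mask ? a + b : else`; the names `cond_add_lane` and `masked_accumulate` are made up for this sketch and do not exist in GCC.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One lane of the assumed .COND_ADD (mask, a, b, els) semantics:
// a + b where the mask is set, els otherwise.
constexpr long cond_add_lane(bool mask, long a, long b, long els) {
    return mask ? a + b : els;
}

// Masked accumulation of `value` into `total`, lane by lane.
// else_is_accumulator == true  passes the running total as the else value;
// else_is_accumulator == false passes the addend as the else value.
template <std::size_t N>
std::array<long, N> masked_accumulate(const std::array<bool, N>& mask,
                                      const std::array<long, N>& value,
                                      const std::array<long, N>& total,
                                      bool else_is_accumulator) {
    std::array<long, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        const long els = else_is_accumulator ? total[i] : value[i];
        out[i] = cond_add_lane(mask[i], value[i], total[i], els);
    }
    return out;
}
```

With the accumulator as the else value, a masked-off lane keeps its running total. With the addend as the else value, that lane's total is overwritten by the addend, which is the kind of wrong result a reduction would then sum up.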
[Bug fortran/68241] [meta-bug] [F03] Deferred-length character
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68241

Bug 68241 depends on bug 118747, which changed state.

Bug 118747 Summary: [15 Regression]: seg fault on accessing an elemental procedure dummy argument's deferred-length component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118747

           What    |Removed                     |Added
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED
[Bug tree-optimization/119057] [15 regression] ICE at -O{2,3} with "-fno-tree-vrp -fno-tree-forwprop" on x86_64-linux-gnu: in operator[], at vec.h:910 since r15-1055
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119057

--- Comment #4 from GCC Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:758de6263dfc7ba8701965fa468691ac23cb7eb5

commit r15-7791-g758de6263dfc7ba8701965fa468691ac23cb7eb5
Author: Richard Biener
Date:   Mon Mar 3 13:21:53 2025 +0100

    tree-optimization/119057 - bogus double reduction detection

    We are detecting a cycle as double reduction where the inner loop
    cycle has extra out-of-loop uses.  This clashes at least with
    assumptions from the SLP discovery code which says the cycle
    isn't reachable from another SLP instance.  It also was not intended
    to support this case, in fact with GCC 14 we seem to generate wrong
    code here.

            PR tree-optimization/119057
            * tree-vect-loop.cc (check_reduction_path): Add argument
            specifying whether we're analyzing the inner loop of a
            double reduction.  Do not allow extra uses outside of the
            double reduction cycle in this case.
            (vect_is_simple_reduction): Adjust.

            * gcc.dg/vect/pr119057.c: New testcase.
[Bug tree-optimization/119057] [12/13/14 regression] ICE at -O{2,3} with "-fno-tree-vrp -fno-tree-forwprop" on x86_64-linux-gnu: in operator[], at vec.h:910 since r15-1055
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119057

Richard Biener changed:

           What    |Removed                     |Added
      Known to work|                            |15.0
            Summary|[15 regression] ICE at      |[12/13/14 regression] ICE
                   |-O{2,3} with "-fno-tree-vrp |at -O{2,3} with
                   |-fno-tree-forwprop" on      |"-fno-tree-vrp
                   |x86_64-linux-gnu: in        |-fno-tree-forwprop" on
                   |operator[], at vec.h:910    |x86_64-linux-gnu: in
                   |since r15-1055              |operator[], at vec.h:910
                   |                            |since r15-1055
           Keywords|                            |fixed-but-no-testcase,
                   |                            |wrong-code
           Priority|P1                          |P2
   Target Milestone|15.0                        |12.5

--- Comment #5 from Richard Biener ---
So the issue goes back much longer, it was first partly fixed by r11-4865-g2686de5617bfb5.

I'm queueing this for backports, even without having a wrong-code testcase (the outer loop use lacks an inner loop reduction "epilogue").
[Bug rtl-optimization/119099] New: Compile-time hang in DCE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119099

Bug ID: 119099
Summary: Compile-time hang in DCE
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: alexey.merzlyakov at samsung dot com
Target Milestone: ---

Created attachment 60644
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60644&action=edit
Reduced test-case

The following code, c-reduced from a csmith-generated testcase, causes GCC to hang at -O2 at compile time:

int a, b;
void func2(short);
void func1() {
  while (1) {
    int loc = 8;
    while (1) {
      func2(loc);
      if (a)
        loc = 3;
      else if (b)
        break;
      loc |= a;
    }
  }
}

The hang appears on trunk and affects at least the RISC-V 64/32, ARM 64/32, powerpc64 and MRISC32 targets. Here is the example link on godbolt for RV64: https://godbolt.org/z/eWxfW3q1K

GCC hangs in the ext-dce pass, where "df_worklist_dataflow_doublequeue()" loops forever on the following basic blocks taken from the double-worklist queue:
BB7->BB6->BB4->BB3->BB5->worklists swap->BB7->BB6->BB4->BB3->BB5->worklists swap ...

The dataflow solver keeps serving each BB node while the BB sets are still changing. The key point for the solver in deciding whether to continue or not is the state of the "changed" variable, which is set by the "con_fun_n" function. For the ext-dce case, this is a pointer to "ext_dce_rd_transfer_n()". It initializes the "livenow" bitmap from the "livein[]" states, processes it and finally writes "livenow" back to "livein[current BB]".

For the selected basic block, the "livein" state flip-flops each time the loop returns to processing this BB. E.g. in the example above for BB=4:

Breakpoint 1, ext_dce_rd_transfer_n (bb_index=4) at ext-dce.cc:980
980       return true;
(gdb) p/x (&livein[bb_index])->first->next->bits
$8 = {0xf30f000, 0xff0}
Continuing...

worklist <-> pending swap

Breakpoint 2, df_worklist_dataflow_doublequeue (...) at df-core.cc:1097
1097      std::swap (pending, worklist);
Continuing...

(gdb) p/x (&livein[bb_index])->first->next->bits
$11 = {0xf70f000, 0xff0}
Continuing...

So the livein[4] bits flip 0xf30f000 -> 0xf70f000 -> 0xf30f000 -> ... forever.

In other words, ext-dce + df-core can be treated as a finite-state machine whose states act on the "livenow" and "livein" bitmaps. On the given testcase the machine loops forever, flip-flopping (widening/narrowing) the "livein" states with "livenow". The dataflow solver algorithm will never reach the empty-worklist state in this case.

The crux of all of the above seems to be the "livein" state, which can change in two directions (be widened or narrowed), thus allowing the machine to loop forever. If so, the solution could be to allow the "livein" bitmap to change in only one direction: in our case, to expand. That way "ext_dce_rd_transfer_n()" also guarantees that the dataflow solver converges to its final state.

The proposed solution could be as follows:

@@ -1094,8 +1094,13 @@ ext_dce_rd_transfer_n (int bb_index)
      the generic dataflow code that something changed.  */
   if (!bitmap_equal_p (&livein[bb_index], livenow))
     {
-      bitmap_copy (&livein[bb_index], livenow);
-      return true;
+      bitmap tmp = BITMAP_ALLOC (NULL);
+      bitmap_and (tmp, &livein[bb_index], livenow);
+      if (!bitmap_equal_p (tmp, livenow))
+        {
+          bitmap_ior_into (&livein[bb_index], livenow);
+          return true;
+        }
     }

   return false;

It allows "livein[bb_index]" only to be widened and returns true only if it really changed.

The patch works for me locally and passed simple internal GCC testing. If this idea is considered acceptable, I will do more thorough testing on different targets and prepare a final version for the mailing list.
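The convergence argument can be illustrated with a toy fixed-point loop. This is only a sketch under simplifying assumptions (a single block, a 32-bit live set, and an artificial transfer function that swaps two bits); `transfer` and `solve` are hypothetical names and none of this is GCC code.

```cpp
#include <cassert>
#include <cstdint>

// Artificial transfer function: swaps the two low bits of the live set,
// so the states 0b01 and 0b10 map to each other.
constexpr std::uint32_t transfer(std::uint32_t live) {
    return ((live & 1u) << 1) | ((live >> 1) & 1u);
}

// Iterates until nothing changes. Returns the number of updates performed,
// or -1 if max_iters is reached without finding a fixed point.
inline int solve(std::uint32_t livein, bool monotone, int max_iters = 1000) {
    for (int i = 0; i < max_iters; ++i) {
        const std::uint32_t livenow = transfer(livein);
        if (monotone) {
            // Widen only: report a change just when livenow adds new bits.
            if ((livenow & ~livein) == 0)
                return i;
            livein |= livenow;
        } else {
            // Copy: the set may grow or shrink, so it can oscillate.
            if (livenow == livein)
                return i;
            livein = livenow;
        }
    }
    return -1;
}
```

Starting from `0b01`, the copying variant alternates between `0b01` and `0b10` until the cap is hit, while the widening variant reaches `0b11` after one update and stops, since a set that only grows within a finite universe must eventually stop changing.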
[Bug fortran/118747] [15 Regression]: seg fault on accessing an elemental procedure dummy argument's deferred-length component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118747

Andre Vehreschild changed:

           What    |Removed                     |Added
         Resolution|---                         |FIXED
             Status|WAITING                     |RESOLVED
[Bug tree-optimization/116125] [12/13/14/15 Regression] Does not fully checking for overlapping memory regions with the vectorizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116125

Richard Sandiford changed:

           What    |Removed                     |Added
           Assignee|unassigned at gcc dot gnu.org|rsandifo at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from Richard Sandiford ---
Will have a look (but might be a few days).
[Bug ipa/118318] [15 regression] ICE when building firefox-134.0 with PGO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118318 --- Comment #18 from Martin Jambor --- I have proposed the patch on the mailing list: https://inbox.sourceware.org/gcc-patches/ri6bjui45il@virgil.suse.cz/T/#u
[Bug c++/119082] GCC Incorrectly Accepts Explicit Destructor Call for Scalar Type in constexpr Context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119082

--- Comment #3 from Jonathan Wakely ---
(In reply to qurong from comment #0)
> GCC 12.4/13.3/11.4 erroneously compiles code

So you already figured out that this bug was fixed in GCC 14?
[Bug tree-optimization/118976] [12/13/14/15 regression] Correctness Issue: SVE vectorization results in data corruption when cpu has 128bit vectors but compiled with -mcpu=neoverse-v1 (which is only f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118976

Richard Sandiford changed:

           What    |Removed                     |Added
           Assignee|unassigned at gcc dot gnu.org|rsandifo at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #15 from Richard Sandiford ---
Oops, yes, a typo indeed.
[Bug tree-optimization/119057] [15 regression] ICE at -O{2,3} with "-fno-tree-vrp -fno-tree-forwprop" on x86_64-linux-gnu: in operator[], at vec.h:910 since r15-1055
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119057

--- Comment #3 from Richard Biener ---
So there's code in check_reduction_path that checks for additional uses of the path defs that explicitly allows out-of-loop uses for the "tail":

      /* Check there's only a single stmt the op is used on.  For the
         not value-changing tail and the last stmt allow out-of-loop uses.
         ???  We could relax this and handle arbitrary live stmts by
         forcing a scalar epilogue for example.  */
...
          else if (!is_gimple_debug (op_use_stmt)
                   && (*code != ERROR_MARK
                       || flow_bb_inside_loop_p (loop,
                                                 gimple_bb (op_use_stmt))))
            FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
              cnt++;

we have

   [local count: 955630224]:
  # a.1_28 = PHI <_2(9), a.1_6(3)>
  # b_lsm.15_27 = PHI <_25(9), b_lsm.15_33(3)>      <---
  # c_lsm.17_3 = PHI <_20(9), c_lsm.17_26(3)>

   [local count: 7731917314]:
  # d.6_31 = PHI <_14(10), 0(4)>
  # b_lsm.15_13 = PHI <_12(10), b_lsm.15_27(4)>     <---
  # ivtmp_23 = PHI
  b.3_9 = (unsigned int) b_lsm.15_13;               <---
  _11 = b.3_9 | e.4_10;                             <--- (*)
  _12 = (int) _11;                                  <---
  _14 = d.6_31 + 1;
  ivtmp_22 = ivtmp_23 - 1;
  if (ivtmp_22 != 0)
    goto ; [89.00%]
  else
    goto ; [11.00%]

   [local count: 955630224]:
  # _51 = PHI <_11(5)>                              (*)
  # _25 = PHI <_12(5)>                              <---
  c.9_18 = (unsigned int) c_lsm.17_3;
  _19 = _51 | c.9_18;

with the (*) def being used outside of (the inner) loop. That's undesirable behavior for the double-reduction case. Testing a patch.
[Bug rtl-optimization/119099] [15 regression] Compile-time hang in ext-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119099

Richard Biener changed:

           What    |Removed                     |Added
           Priority|P3                          |P1

--- Comment #2 from Richard Biener ---
I also noticed this odd thing (without a testcase), and questioned whether it would converge. So yes, making the solution either only grow or only shrink sounds like a correct fix for this (of course making the thing even slower).

It might be, of course, that this was intended to happen by the way the thing is written and it is just not working out. In this case the "fix" would be a workaround for the real bug.

Not converging is definitely a P1.
[Bug tree-optimization/119096] [14/15 regression] Loop with conditional, cast and reduction vectorized incorrectly with AVX-512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119096

Richard Biener changed:

           What    |Removed                     |Added
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from Richard Biener ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Andrew Pinski from comment #2)
> > The vectorizer looks ok though:
> >   mask_patt_37.15_52 = [vec_unpack_lo_expr] mask__9.14_51;
> >   mask_patt_37.15_53 = [vec_unpack_hi_expr] mask__9.14_51;
> >   vect_patt_36.18_57 = .COND_ADD (mask_patt_37.15_52, vect__10.16_54,
> > vect_total_21.17_56, vect__10.16_54);
> >   vect_patt_36.18_58 = .COND_ADD (mask_patt_37.15_53, vect__10.16_55,
> > vect_patt_36.18_57, vect__10.16_55);
> >
> >
> > /* A ? B : B -> B.  */
> > (simplify
> >  (cnd @0 @1 @1)
> >  @1)
> >
> > Confirmed, I think COND_ADD folding goes wrong.
>
> Wait maybe the original COND_ADD is incorrect. I can't remember how COND_ADD
> works. I thought it was `mask_patt_37.15_52 ?
> (vect__10.16_54+vect_total_21.17_56) : vect__10.16_54` if so then the
> original COND_ADD is wrong.

Yes, I think the .COND_ADD handling fails to handle the single-use-def chain optimization.
[Bug ipa/118785] [15 Regression] ICE when building vpl-gpu-rt (during IPA pass, ICE in decompose, at wide-int.h:1049)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118785 --- Comment #13 from GCC Commits --- The master branch has been updated by Martin Jambor : https://gcc.gnu.org/g:d05b64bdd048ffb7f72d97553888934a9bcd13fa commit r15-7792-gd05b64bdd048ffb7f72d97553888934a9bcd13fa Author: Martin Jambor Date: Mon Mar 3 14:53:03 2025 +0100 ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118785) Since we construct arithmetic jump functions even when there is a type conversion in between the operation encoded in the jump function and when it is passed in a call argument, the IPA propagation phase must also perform the operation and conversion in two steps. IPA-VR had actually been doing it even before for binary operations but, as PR 118756 exposes, not in the case on unary operations. This patch adds the necessary step to rectify that. Like in the scalar constant case, we depend on expr_type_first_operand_type_p to determine the type of the result of the arithmetic operation. On top this, the patch special-cases ABSU_EXPR because it looks useful an so that the PR testcase exercises the added code-path. This seems most appropriate for stage 4, long term we should probably stream the types, probably after also encoding them with a string of expr_eval_op rather than what we have today. A check for expr_type_first_operand_type_p was also missing in the handling of binary ops and the intermediate value_range was initialized with a wrong type, so I also fixed this. gcc/ChangeLog: 2025-02-24 Martin Jambor PR ipa/118785 * ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Handle non-conversion unary operations separately before doing any conversions. Check expr_type_first_operand_type_p for non-unary operations too. Fix type of op_res. gcc/testsuite/ChangeLog: 2025-02-24 Martin Jambor PR ipa/118785 * g++.dg/lto/pr118785_0.C: New test.
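As a scalar illustration of why the operation and the conversion have to be applied as two separate steps in the right order, consider the sketch below. It is not modelled on ipa-cp.cc; the types (`int` operand, `unsigned char` parameter), the choice of `std::abs` and both function names are assumptions made only for this example.

```cpp
#include <cassert>
#include <cstdlib>

// Step 1: perform the unary operation in the operand's own type (int).
// Step 2: convert the result to the parameter type (unsigned char).
inline unsigned char op_then_convert(int x) {
    const int r = std::abs(x);
    return static_cast<unsigned char>(r);
}

// The other order: convert to the narrower type first and only then apply
// the operation. For negative inputs this gives a different value.
inline unsigned char convert_then_op(int x) {
    const unsigned char narrowed = static_cast<unsigned char>(x);
    return static_cast<unsigned char>(std::abs(static_cast<int>(narrowed)));
}
```

For `x = -1` the first function returns 1 and the second returns 255, so any range derived for the callee's parameter depends on which type the operation is evaluated in.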
[Bug tree-optimization/118756] tree-ssa-loop-ivopts.cc:1156: Function defined but not used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118756 --- Comment #6 from GCC Commits --- The master branch has been updated by Martin Jambor : https://gcc.gnu.org/g:d05b64bdd048ffb7f72d97553888934a9bcd13fa commit r15-7792-gd05b64bdd048ffb7f72d97553888934a9bcd13fa Author: Martin Jambor Date: Mon Mar 3 14:53:03 2025 +0100 ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118785) Since we construct arithmetic jump functions even when there is a type conversion in between the operation encoded in the jump function and when it is passed in a call argument, the IPA propagation phase must also perform the operation and conversion in two steps. IPA-VR had actually been doing it even before for binary operations but, as PR 118756 exposes, not in the case on unary operations. This patch adds the necessary step to rectify that. Like in the scalar constant case, we depend on expr_type_first_operand_type_p to determine the type of the result of the arithmetic operation. On top this, the patch special-cases ABSU_EXPR because it looks useful an so that the PR testcase exercises the added code-path. This seems most appropriate for stage 4, long term we should probably stream the types, probably after also encoding them with a string of expr_eval_op rather than what we have today. A check for expr_type_first_operand_type_p was also missing in the handling of binary ops and the intermediate value_range was initialized with a wrong type, so I also fixed this. gcc/ChangeLog: 2025-02-24 Martin Jambor PR ipa/118785 * ipa-cp.cc (ipa_vr_intersect_with_arith_jfunc): Handle non-conversion unary operations separately before doing any conversions. Check expr_type_first_operand_type_p for non-unary operations too. Fix type of op_res. gcc/testsuite/ChangeLog: 2025-02-24 Martin Jambor PR ipa/118785 * g++.dg/lto/pr118785_0.C: New test.
[Bug ipa/118785] [15 Regression] ICE when building vpl-gpu-rt (during IPA pass, ICE in decompose, at wide-int.h:1049)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118785

Martin Jambor changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #14 from Martin Jambor ---
Fixed.
[Bug ipa/118318] [15 regression] ICE when building firefox-134.0 with PGO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118318 --- Comment #19 from Sam James --- Thank you both. I wanted to have a go but was a bit lost.
[Bug c++/103379] ICE: tree check: expected class 'type', have 'declaration' (namespace_decl) in comptypes, at cp/typeck.c:1544
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103379

Jonathan Wakely changed:

           What    |Removed                     |Added
   Last reconfirmed|2021-11-23 00:00:00         |2025-3-3

--- Comment #4 from Jonathan Wakely ---
This is an ice-on-invalid C++23 example that produces the same error:

template constexpr auto&& forward_like(U&& x) noexcept { return x; }

template class C {
  int value{};
  template friend constexpr auto&& get(this Self&& z) noexcept {
    return std::forward_like(z.value);
  }
};

(invalid because you can't use deducing this on a non-member function)
[Bug target/119100] New: RISC-V: missed opportunities for vector-scalar instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100

Bug ID: 119100
Summary: RISC-V: missed opportunities for vector-scalar instructions
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: parras at gcc dot gnu.org
Target Milestone: ---

Created attachment 60645
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60645&action=edit
Source reduced from 554.roms

A number of RVV instructions have two variants: vector-vector and vector-scalar. For instance, vfmadd.vv and vfmadd.vf: the latter accepts one scalar operand. However, SPEC2017's 554.roms shows that many opportunities of emitting the vector-scalar variant are missed.

$ riscv64-linux-gnu-gfortran -S -Ofast -mabi=lp64d -march=rv64gcv_zvl256b_zba_zbb_zbs_zicond -mrvv-vector-bits=zvl rho_eos_tile.F90 -o rho_eos_tile.riscv64.s
$ cat rho_eos_tile.riscv64.s
...
(1)     vfmv.v.f        v6,fa0
        vlse64.v        v2,0(t0),s2
        vmv.v.i         v5,0
(2)     vfmadd.vv       v9,v6,v7
...

Here (1) and (2) could be combined into:

        vfmadd.vf       v9,fa0,v7

In RTL terms, it means combining:

(set (reg:RVVM1DF 516)
     (vec_duplicate:RVVM1DF (reg:DF 517)))

into:

(set (reg:RVVM1DF 515)
     (plus:RVVM1DF
       (mult:RVVM1DF (reg:RVVM1DF 362 [ vect_M.84_273.156 ])
                     (reg:RVVM1DF 516))
       (reg:RVVM1DF 519)))

I have a draft patch dealing with the simple case where both instructions live in the same basic block. However, the vec_duplicate often gets hoisted to the loop preamble before reaching the combine pass.
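For readers without the 554.roms source, the general shape of loop involved can be sketched as follows: a loop-invariant scalar multiplied into every element of a vectorizable multiply-add. This is an assumed, simplified stand-in written in C++ rather than the Fortran from the attachment, and `scaled_add` is a hypothetical name; which instructions a given GCC version emits for it has not been verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y[i] = a * x[i] + y[i], where `a` is a loop-invariant scalar.
// When vectorized, `a` either has to be broadcast into a vector register
// (the vec_duplicate in the RTL above) or can be consumed directly by a
// vector-scalar multiply-add instruction.
inline void scaled_add(double a, const std::vector<double>& x,
                       std::vector<double>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    for (std::size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```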
[Bug fortran/103391] [12/13/14/15 Regression] ICE: gimplification failed since r7-4021-g574284e9c49687d8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103391

Andre Vehreschild changed:

           What    |Removed                     |Added
           Assignee|unassigned at gcc dot gnu.org|vehre at gcc dot gnu.org
             Status|NEW                         |ASSIGNED
[Bug fortran/119101] New: Function compiled with Gflortran appears to produce a pointer that points at itself.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119101

Bug ID: 119101
Summary: Function compiled with Gflortran appears to produce a pointer that points at itself.
Product: gcc
Version: 13.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: David.Applegate at global dot amentum.com
Target Milestone: ---

Created attachment 60646
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60646&action=edit
Source file

This bug occurs on gfortran 13.2 and 14.2, but not 7.3, 8.5 or 11.5. OS was Oracle Linux 8 and Oracle Linux 9.

One of the modules in the fortran code has two return functions: getBC at line 142 and getBC2 at line 165. As far as I can tell these should produce identical results, but you will see from the program output that they do not. Running ddd I was able to see that a circular pointer reference was created at line 97 for the incorrect output.

I think the fortran I have written is valid, but even if not, ideally some sort of compilation error when compiled or a runtime error when run would be handy. It took me a long time to track down the issue.

This is the compiler version output:

COLLECT_GCC=/project/connectflow/gcc/linux64-13.2.0/bin/gcc
COLLECT_LTO_WRAPPER=/export/project/connectflow/gcc/linux64-13.2.0/bin/../libexec/gcc/x86_64-pc-linux-gnu/13.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-13.2.0/configure --prefix=/users/davida/SOFTWARE/gcc-13.2.0_install/ --disable-multilib --enable-languages=c,c++,fortran
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.2.0 (GCC)

This is the compilation command (-save-temps was tried but didn't produce anything):

(base) [davida@ada ~]$ /project/connectflow/gcc/linux64-13.2.0/bin/gfortran -Wall -Wextra -fsanitize=address,undefined -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations test.f90

No output was produced to the terminal.

This was the output when the program was run:

(base) [davida@ada ~]$ ./a.out
DIRICHLET DEFAULT DEFAULT DEFAULT
[Bug target/119100] RISC-V: missed opportunities for vector-scalar instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100

--- Comment #1 from Richard Biener ---
Don't late-combine and/or forwprop lack the single-BB restriction? Also, when the vec_duplicate is hoisted out of a loop, doesn't this then become only a question of register pressure in the vector vs. scalar register sets?
[Bug tree-optimization/119070] gcc15 incorrectly reporting negative array-bounds errors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119070

--- Comment #7 from Taylor Hutt ---
(In reply to Andrew Pinski from comment #6)
> You could do:
>
>    struct_1 *v1 = &global_0.f_2_0;
>    asm("":"+r"(v1));
>    unsigned char *v2 = (unsigned char *)v1;
>
> to hide from GCC that the address of v2 is related to a global variable.
> And that should get rid of the warning too.
>
> But otherwise this is undefined code.

Why not cast the pointer to uintptr_t at the point of the undefined behavior pointer arithmetic?
[Bug libstdc++/119089] FAIL: 23_containers/vector/debug/assign4_backtrace_neg.cc -std=gnu++17 (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119089

John David Anglin changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #10 from John David Anglin ---
Resolved.
[Bug middle-end/118874] [15 regression] ICE in copy_rtx, at rtl.cc:372
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118874

--- Comment #12 from Jakub Jelinek ---
(In reply to Iain Sandoe from comment #10)
> In the coroutine handling to deal with
> https://eel.is/c++draft/dcl.fct.def.coroutine#7
>
> we unconditionally create the return object in the slot - if we
> create it somewhere else, that causes us to produce an unexpected additional
> copy.

If the return type has non-trivial copy ctor, then it is TREE_ADDRESSABLE type and so aggregate_value_p is true. This PR is just about the corner cases where the return type doesn't need to be returned in memory.
[Bug middle-end/118874] [15 regression] ICE in copy_rtx, at rtl.cc:372
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118874

--- Comment #13 from Jakub Jelinek ---
That check_return_expr call in

  /* Without a relevant location, bad conversions in check_return_expr
     result in unusable diagnostics, since there is not even a mention
     of the relevant function.  Here we carry out the first part of
     finish_return_expr().  */
  input_location = fn_start;
  r = check_return_expr (get_ro, &no_warning, &dangling);
  input_location = UNKNOWN_LOCATION;
  gcc_checking_assert (!dangling);

actually calls want_nrvo_p and that returns false. But guess check_return_expr isn't expecting first argument like TARGET_EXPR, normal user code would return some VAR_DECL, not a TARGET_EXPR. So perhaps add before that call something like

  if (!aggregate_return_p (fn_return_type, current_function_decl))
    {
      tree var = build_local_temp (fn_return_type);
      // Perhaps pushdecl it or else arrange it to be in BLOCK_VARS?
      // Emit INIT_EXPR for it from get_ro.
      get_ro = var;
    }

so that you initialize the temp var instead of RESULT_DECL directly?
[Bug rtl-optimization/119071] [12/13/14/15 Regression] Miscompile at -O2 since r10-7268
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119071

--- Comment #13 from Uroš Bizjak ---
(In reply to Sam James from comment #12)
> This works for me on trunk. Did Uros' r15-7793-ga92dc3fe31c95d fix it?

Yes, this is the same issue.
[Bug rtl-optimization/119071] [12/13/14/15 Regression] Miscompile at -O2 since r10-7268
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119071

Sam James changed:

           What    |Removed                     |Added
                 CC|                            |uros at gcc dot gnu.org

--- Comment #12 from Sam James ---
This works for me on trunk. Did Uros' r15-7793-ga92dc3fe31c95d fix it?
[Bug fortran/119101] Function compiled with Gflortran appears to produce a pointer that points at itself.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119101

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
      Known to fail|                            |13.3.0, 14.2.0
           Keywords|                            |wrong-code
      Known to work|                            |11.5.0, 12.4.1, 13.3.1,
                   |                            |14.2.1, 15.0
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2025-03-03
     Ever confirmed|0                           |1

--- Comment #1 from anlauf at gcc dot gnu.org ---
I can confirm this with the release versions 13.3.0, 14.2.0, which print:

DIRICHLET DEFAULT DEFAULT DEFAULT

but at r12-10972, r13-9407, r14-11370, and 15-trunk I get:

DIRICHLET DEFAULT DIRICHLET DEFAULT

So it apparently has been fixed on these branches in the meantime.

Are you able to update and verify that it is fixed for you, too?
[Bug target/101507] ICE for gcc.dg/Wstringop-overflow-69.c with -march=iwmmxt (internal compiler error: maximum number of generated reload insns per insn achieved (90))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101507

--- Comment #2 from Vladimir Makarov ---
Sorry, I've tried gcc-12, gcc-13, gcc-14, a trunk dated Aug 1, and today's trunk, but I did not manage to reproduce the error. Probably it was fixed by some LRA patch (there have been a lot of them since 2021).
[Bug middle-end/118874] [15 regression] ICE in copy_rtx, at rtl.cc:372
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118874

--- Comment #11 from Jakub Jelinek ---
In the

#include <coroutine>

struct B {
  bool await_ready () const noexcept;
  void await_suspend (std::coroutine_handle<> h) const noexcept;
  void await_resume () const noexcept;
};

struct C {
  struct promise_type {
    const char *value;
    std::suspend_never initial_suspend ();
    std::suspend_always final_suspend () noexcept;
    void return_value (const char *v);
    void unhandled_exception ();
    C get_return_object () { return C{this}; }
  };
  promise_type *p;
  explicit C (promise_type *p) : p(p) {}
  const char *get ();
};

C
bar (bool x)
{
  if (x)
    co_await B{};
  co_return "foobar";
}

testcase (-O2 -m64 -mptr64 -fcoroutines -std=c++23) I see can_do_nrvo_p return twice false when functype is RECORD_TYPE C, once in get_return_object and once in bar function; the reason it returns false is that retval is not a VAR_DECL, but TARGET_EXPR in both cases.

The coroutines.cc code refers to DECL_RESULT unconditionally then. I think the problematic INIT_EXPR is created by

#5  0x02090641 in build2 (code=INIT_EXPR, tt=, arg0=, arg1=) at ../../gcc/tree.cc:5199
#6  0x00df98f8 in build2_loc (loc=84139394, code=INIT_EXPR, type=, arg0=, arg1=) at ../../gcc/tree.h:4825
#7  0x0120f81e in cp_build_init_expr (loc=84139394, target=, init=) at ../../gcc/cp/typeck2.cc:2820
#8  0x00d8a860 in cp_build_init_expr (t=, i=) at ../../gcc/cp/cp-tree.h:8600
#9  0x012027cb in check_return_expr (retval=, no_warning=0x7fffd14f, dangling=0x7fffd14e) at ../../gcc/cp/typeck.cc:11513
#10 0x00e23d13 in cp_coroutine_transform::build_ramp_function (this=0x3c0bb60) at ../../gcc/cp/coroutines.cc:5186

Obviously, for the aggregate_value_p (TREE_TYPE (TREE_TYPE (current_function_decl)), current_function_decl) case what the code does is what we want. But I guess as these 2 PRs show, for !aggregate_value_p we want to just initialize a temporary VAR_DECL and then have GIMPLE_RETURN which returns that VAR_DECL.
[Bug rtl-optimization/119071] [12/13/14/15 Regression] Miscompile at -O2 since r10-7268
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119071 --- Comment #14 from Jakub Jelinek --- Indeed, r15-7793-ga92dc3fe31c95d56019b2fb95a58414bca06241f fixed this. I'll prepare a patch with the testcases.
[Bug c/119104] Unclear documentation for [[gnu::nonnull_if_nonzero]]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119104 --- Comment #1 from Andrew Pinski --- Non-zero and zero here are runtime values of that argument, rather than compile-time characteristics of it. Maybe just: If the runtime value of the integral argument is zero, the pointer argument can be null; or if it is non-zero, the pointer argument must not be null.
[Bug target/119083] Remove SSE_FIRST_REG from ix86_class_likely_spilled_p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119083 --- Comment #8 from H.J. Lu --- Created attachment 60647 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60647&action=edit A patch to remove CREG and BREG from ix86_class_likely_spilled_p Hongtao, can you measure its impact on SPEC CPU 2017?
[Bug c++/118924] [12/13/14/15 regression] Wrong code at -O2 and above leading to uninitialized accesses on aarch64-linux-gnu since r10-917-g3b47da42de621c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118924

Martin Jambor changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |jamborm at gcc dot gnu.org

--- Comment #13 from Martin Jambor ---
(In reply to rguent...@suse.de from comment #10)
[...]
>
> And still SRA should not use a random RHS "model" to build a new
> LHS access, most definitely not when the original aggregate LHS
> isn't TBAA compatible with it.

That could be accomplished by:

diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index c26559edc66..f780285254f 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -3451,7 +3451,7 @@ create_total_scalarization_access (struct access *parent, HOST_WIDE_INT pos,
   access->grp_write = parent->grp_write;
   access->grp_total_scalarization = 1;
   access->grp_hint = 1;
-  access->grp_same_access_path = path_comparable_for_same_access (expr);
+  access->grp_same_access_path = 0;
   access->reverse = reverse_storage_order_for_component_p (expr);

   access->next_sibling = next_sibling;

Which works for the testcase but I am afraid it might not be sufficient. If there was a way to actually create a pre-SRA access to an individual element of the array with the wrong (int) type in the function and there wasn't any with the other type, then, SRA not being flow sensitive pass, would happily use the type again because it would not be "random" any more.

> The array assignment from the front-end is good enough for the
> middle-end as far as IL type hygiene is concerned given the
> element types are useless-type-convertible.

It is quite evil :-)

What would be a good predicate to detect such compatible but TBAA-different assignments, if there is one? Because I think we need to prevent building of references "according to a model" for all scalar replacements under them.
[Bug c++/118924] [12/13/14/15 regression] Wrong code at -O2 and above leading to uninitialized accesses on aarch64-linux-gnu since r10-917-g3b47da42de621c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118924 --- Comment #14 from Martin Jambor ---
So something like the following - which is completely untested, the type test may be a wrong one, I'd like to think this through a little more before actually proposing this, but any comments still welcome:

diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index c26559edc66..88b350800ce 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -979,6 +979,7 @@ create_access (tree expr, gimple *stmt, bool write)
   access->type = TREE_TYPE (expr);
   access->write = write;
   access->grp_unscalarizable_region = unscalarizable_region;
+  access->grp_same_access_path = true;
   access->stmt = stmt;
   access->reverse = reverse;

@@ -1522,6 +1523,10 @@ build_accesses_from_assign (gimple *stmt)
   racc = build_access_from_expr_1 (rhs, stmt, false);
   lacc = build_access_from_expr_1 (lhs, stmt, true);

+  bool tbaa_hazard
+    = (TYPE_MAIN_VARIANT (TREE_TYPE (lhs))
+       == TYPE_MAIN_VARIANT (TREE_TYPE (rhs)));
+
   if (lacc)
     {
       lacc->grp_assignment_write = 1;
@@ -1536,6 +1541,8 @@ build_accesses_from_assign (gimple *stmt)
           bitmap_set_bit (cannot_scalarize_away_bitmap,
                           DECL_UID (lacc->base));
         }
+      if (tbaa_hazard)
+        lacc->grp_same_access_path = false;
     }

   if (racc)
@@ -1555,6 +1562,8 @@ build_accesses_from_assign (gimple *stmt)
         }
       if (storage_order_barrier_p (lhs))
         racc->grp_unscalarizable_region = 1;
+      if (tbaa_hazard)
+        racc->grp_same_access_path = false;
     }

   if (lacc && racc
@@ -2396,7 +2405,7 @@ sort_and_splice_var_accesses (tree var)
       bool grp_partial_lhs = access->grp_partial_lhs;
       bool first_scalar = is_gimple_reg_type (access->type);
       bool unscalarizable_region = access->grp_unscalarizable_region;
-      bool grp_same_access_path = true;
+      bool grp_same_access_path = access->grp_same_access_path;
       bool bf_non_full_precision
         = (INTEGRAL_TYPE_P (access->type)
            && TYPE_PRECISION (access->type) != access->size
@@ -2432,7 +2441,8 @@ sort_and_splice_var_accesses (tree var)
           return NULL;
         }

-      grp_same_access_path = path_comparable_for_same_access (access->expr);
+      if (grp_same_access_path)
+        grp_same_access_path = path_comparable_for_same_access (access->expr);

       j = i + 1;
       while (j < access_count)
[Bug c/119104] New: Unclear documentation for [[gnu::nonnull_if_nonzero]]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119104

            Bug ID: 119104
           Summary: Unclear documentation for [[gnu::nonnull_if_nonzero]]
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: alx at kernel dot org
  Target Milestone: ---

The documentation about [[gnu::nonnull]] says:

nonnull_if_nonzero
nonnull_if_nonzero (arg-index, arg2-index)
    The nonnull_if_nonzero attribute is a conditional version of the nonnull attribute. It has two arguments, the first argument shall be argument index of a pointer argument which must be in some cases non-null and the second argument shall be argument index of an integral argument (other than boolean). If the integral argument is zero, the pointer argument can be null, if it is non-zero, the pointer argument must not be null.

    extern void *
    my_memcpy (void *dest, const void *src, size_t len)
      __attribute__((nonnull (1, 2)));
    extern void *
    my_memcpy2 (void *dest, const void *src, size_t len)
      __attribute__((nonnull_if_nonzero (1, 3),
                     nonnull_if_nonzero (2, 3)));

    With these declarations, it is invalid to call my_memcpy (NULL, NULL, 0); or to call my_memcpy2 (NULL, NULL, 4); but it is valid to call my_memcpy2 (NULL, NULL, 0);. This attribute should be used on declarations which have e.g. an exception for zero sizes, in which case null may be passed.

It says what happens when the value is 0. It says what happens when the value is nonzero. But these are rarely passed as constant expressions, so the compiler will most of the time not be able to determine if it is zero or nonzero. What's the behavior when a variable that the compiler cannot know if it's zero or not is passed? Does it trigger the diagnostics documented for [[gnu::nonnull]] or not?
[Bug c++/117061] Error on use of parameter in lambda outside function body
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117061 --- Comment #4 from eczbek.void at gmail dot com --- Constructors too :( ``` template struct S { S(int x) requires(requires { [x] { x; }; }) {} }; ``` ``` : In lambda function: :3:41: error: use of parameter outside function body before ';' token [-Wtemplate-body] 3 | S(int x) requires(requires { [x] { x; }; }) {} | ^ ```
[Bug target/118996] Should TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P return false for x86-64?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118996 --- Comment #16 from Hongtao Liu ---
(In reply to Hongtao Liu from comment #14)
> (In reply to H.J. Lu from comment #13)
> > (In reply to H.J. Lu from comment #11)
> > > Created attachment 60609 [details]
> > > An untested patch
> >
> > Hongtao, do you have SPEC CPU2017 data on this patch?
>
> I haven't since #c9, assume the new patch fix the issue?
> I'll start a test.

No big impact (all benchmarks change by less than 1%) for both -march=x86-64-v3 -O2 and -march=icelake-server -Ofast -funroll-loops -flto on ICX.
[Bug target/119083] Remove SSE_FIRST_REG from ix86_class_likely_spilled_p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119083 --- Comment #9 from Hongtao Liu --- (In reply to H.J. Lu from comment #8) > Created attachment 60647 [details] > A patch to remove CREG and BREG from ix86_class_likely_spilled_p > > Hongtao, can you measure its impact on SPEC CPU 2017? Ok.
[Bug ipa/119009] [15 regression] AArch64: Commit 'Node clones share order' (r15-6345-g0895aef01c64c3) causes regression in Snappy workload for -mcpu=neoverse-v2 with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119009 --- Comment #3 from Michal Jireš ---
Thanks a lot for the script. I have reproduced it:

# bad3714b - before my patch
BM_UIOVecSink/0 33.8 us 33.8 us 20659 bytes_per_second=2.82508G/s html
# 0895aef0 - my patch
BM_UIOVecSink/0 41.0 us 41.0 us 16890 bytes_per_second=2.32381G/s html

However current trunk shows the opposite:

# 3605e057 - trunk
BM_UIOVecSink/0 33.7 us 33.7 us 20161 bytes_per_second=2.82955G/s html
# revert patch
BM_UIOVecSink/0 39.9 us 39.9 us 17399 bytes_per_second=2.38832G/s html

Is it still a problem on your machine with current trunk?

Perf record/report of:
snappy_benchmark --benchmark_filter=BM_UIOVecSink/0 --benchmark_min_warmup_time=5 --benchmark_time_unit=us
shows regression in functions:
61.46% void snappy::SnappyDecompressor::DecompressAllTags(snappy::SnappyIOVecWriter*)
25.65% snappy::(anonymous namespace)::IncrementalCopy(char const*, char*, char*, char*)

relevant symbols:
_ZN6snappy18SnappyDecompressor17DecompressAllTagsINS_17SnappyIOVecWriterEEEvPT_
_ZN6snappy12_GLOBAL__N_1L15IncrementalCopyEPKcPcS3_S3_
are identical outside of address changes.

Changing alignment of DecompressAllTags with asm("nop; nop") or __attribute__((aligned(128))) removes the regression.

19,023,629 branch-misses:u # bad3714b
53,781,446 branch-misses:u # 0895aef0

The underlying problem seems to be branch misses caused by different alignment, but I cannot pinpoint any specific instruction(s) as a source. I am not sure we can reliably prevent this. In any case, reliable solution would be unrelated to my patch.
[Bug fortran/103391] [12/13/14/15 Regression] ICE: gimplification failed since r7-4021-g574284e9c49687d8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103391 --- Comment #11 from anlauf at gcc dot gnu.org --- (In reply to anlauf from comment #10) > ChatGPT seems to be no real help. I just tried it on comment#7, and it said: > > "Conclusion > > The original code is not standard-conforming because it performs intrinsic > assignment to a pointer array, which is not allowed by the Fortran standard. > Changing the pointer to an allocatable array resolves this issue." Torturing ChatGPT, i.e. telling it several times that its analysis is wrong, and giving some hints, I finally get: "Conclusion ✅ The assignment f%a = x is standard-conforming, as long as f%a is properly allocated before assignment. So, my original claim that the assignment was invalid was incorrect. You were right to question it!" :-)
[Bug c++/119102] GCC 15.0 'import std;' fails with Ofast (not with O3) due to some openmp internal error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119102 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |NEW Keywords||c++23, rejects-valid Ever confirmed|0 |1 Last reconfirmed||2025-03-03 --- Comment #2 from Jonathan Wakely --- (In reply to Igor Machado Coelho from comment #0) > I know that CXX Modules and "import std;" is still quite experimental, but > it seemed strange to generate a compiler bug, that's why I'm reporting. Yes, thanks for reporting it - we need to fix things like this to make it less experimental! I see the same behaviour, so confirmed.
[Bug c++/119102] GCC 15.0 'import std;' fails with Ofast (not with O3) due to some openmp internal error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119102 --- Comment #3 from Jonathan Wakely ---
Comparing the preprocessed source for the std.cc module definition file, I see lots of lines with a simd attribute added when compiled with -Ofast:

--- std-O3.ii 2025-03-03 17:20:32.885607902 +
+++ std-Ofast.ii 2025-03-03 17:20:09.410578347 +
@@ -103248,49 +103248,49 @@
 # 313 "/usr/include/math.h" 2 3 4
 # 1 "/usr/include/bits/mathcalls.h" 1 3 4
 # 53 "/usr/include/bits/mathcalls.h" 3 4
- extern double acos (double __x) noexcept (true); extern double __acos (double __x) noexcept (true);
+__attribute__ ((__simd__ ("notinbranch"))) extern double acos (double __x) noexcept (true); extern double __acos (double __x) noexcept (true);

The attribute is added as a result of glibc doing:

/* Get machine-dependent vector math functions declarations. */
#include <bits/math-vector.h>

which does:

#if defined __x86_64__ && defined __FAST_MATH__
# if defined _OPENMP && _OPENMP >= 201307
/* OpenMP case. */
# define __DECL_SIMD_x86_64 _Pragma ("omp declare simd notinbranch")
# elif __GNUC_PREREQ (6,0)
/* W/o OpenMP use GCC 6.* __attribute__ ((__simd__)). */
# define __DECL_SIMD_x86_64 __attribute__ ((__simd__ ("notinbranch")))
# endif
[Bug tree-optimization/119103] New: Very suboptimal AVX2 code generation of simple shift loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103

            Bug ID: 119103
           Summary: Very suboptimal AVX2 code generation of simple shift loop
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gcc at haasn dot dev
  Target Milestone: ---

== Summary ==

On x86_64 with -mavx2, GCC has a very hard time optimizing a shift by a small unsigned unknown, even if I add knowledge that the shift amount is sufficiently small. In particular, GCC always chooses vpslld instead of vpsllw, and there seems to be no way to convince it otherwise short of hand written asm or intrinsics.

See demonstration here: https://godbolt.org/z/4YobqhsG4

== Code ==

#include <stdint.h>

void lshift(uint16_t *x, uint8_t amount)
{
    if (amount > 15)
        __builtin_unreachable();

    for (int i = 0; i < 16; i++)
        x[i] <<= amount;
}

== Output of `gcc -O3 -mavx2 -ftree-vectorize` ==

lshift:
        vmovdqu ymm1, YMMWORD PTR [rdi]
        movzx eax, sil
        vmovq xmm2, rax
        vpmovzxwd ymm0, xmm1
        vextracti128 xmm1, ymm1, 0x1
        vpmovzxwd ymm1, xmm1
        vpslld ymm0, ymm0, xmm2
        vpslld ymm1, ymm1, xmm2
        vpxor xmm2, xmm2, xmm2
        vpblendw ymm0, ymm2, ymm0, 85
        vpblendw ymm2, ymm2, ymm1, 85
        vpackusdw ymm0, ymm0, ymm2
        vpermq ymm0, ymm0, 216
        vmovdqu YMMWORD PTR [rdi], ymm0
        vzeroupper
        ret

== Expected result ==

lshift:
        vmovdqu ymm1, YMMWORD PTR [rdi]
        movzx esi, sil
        vmovd xmm0, esi
        vpsllw ymm0, ymm1, xmm0
        vmovdqu YMMWORD PTR [rdi], ymm0
        vzeroupper
        ret

Compiled from:

void lshift(uint16_t *x, uint8_t amount)
{
    __m256i data = _mm256_loadu_si256((__m256i *) x);
    __m128i shift_amount = _mm_cvtsi32_si128(amount);
    __m256i shifted = _mm256_sll_epi16(data, shift_amount);
    _mm256_storeu_si256((__m256i *) x, shifted);
}
[Bug fortran/101577] [Interop] TYPE with BIND(C): Reject empty TYPE with zero components
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101577 anlauf at gcc dot gnu.org changed: What|Removed |Added Target Milestone|--- |15.0 Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #3 from anlauf at gcc dot gnu.org --- Fixed in gcc-15.
[Bug fortran/32630] [meta-bug] ISO C binding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32630 Bug 32630 depends on bug 101577, which changed state. Bug 101577 Summary: [Interop] TYPE with BIND(C): Reject empty TYPE with zero components https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101577 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug target/119100] RISC-V: missed opportunities for vector-scalar instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119100 --- Comment #2 from Jeffrey A. Law --- It's even more complicated than that. You have to consider that there can be a cost to move data across the units. I.e., it may actually be cheaper to use the variant that broadcasts the value across a vector (vv form) rather than using a value from the scalar int/fp register file (vf/vi forms). It really depends on the uarch behavior. Profitability may also depend on how many other similar cases are nearby. At least in our uarch we have the concept of a "scalar source buffer" where these values are queued up speculatively from the scalar units into a limited-size buffer for consumption on the vector units. If you don't fill up that buffer, then the vf/vi forms are likely profitable, but if you fill up the buffer, then you're going to stall various things waiting for that buffer to drain and make entries available. My general sense is that we probably want to default towards the vf/vi forms, but I don't have empirical data to back that up yet. Paul -- have you run your patch on any design? And if so what did you run and what was the performance delta before/after?
[Bug c/119104] Unclear documentation for [[gnu::nonnull_if_nonzero]]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119104 --- Comment #2 from Alejandro Colomar ---
(In reply to Andrew Pinski from comment #1)
> Non-zero and zero here are runtime values of that argument, rather than
> compile-time characteristics of it.
>
> Maybe just:
> If the runtime value of the integral argument is zero, the pointer argument
> can be null; or if it is non-zero, the pointer argument must not be null.

Hi Andrew,

They are run-time properties, but the analyzer still warns about them with [[gnu::nonnull]]. I'm worried that this new attribute might reduce the number of diagnostics, which would be a bad thing IMO.

Indeed, I have been able to install gcc-15 from Debian experimental, and my worries seem to be confirmed.

alx@debian:~/tmp$ cat foo.c
#include <stdlib.h>

[[gnu::nonnull]]
void f(void *);
void g(void *);
[[gnu::nonnull_if_nonzero(1, 2)]]
void h(void *, int);

int
main(int argc, char *[])
{
	void *p;

	p = malloc(100);
	f(p);
	free(p);

	p = malloc(100);
	g(p);
	free(p);

	p = malloc(100);
	h(p, argc);
	free(p);
}
alx@debian:~/tmp$ gcc-15 -Wall -Wextra -fanalyzer -S foo.c
foo.c: In function ‘main’:
foo.c:15:9: warning: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Wanalyzer-possible-null-argument]
   15 |         f(p);
      |         ^~~~
  ‘main’: events 1-2
   14 |         p = malloc(100);
      |             ^~~~~~~~~~~
      |             |
      |             (1) this call could return NULL
   15 |         f(p);
      |         ~~~~
      |         |
      |         (2) ⚠️ argument 1 (‘p’) from (1) could be NULL where non-null expected
foo.c:4:6: note: argument 1 of ‘f’ must be non-null
    4 | void f(void *);
      |      ^

This is a regression for memcpy(3) et al. There was a diagnostic with -fanalyzer when it was marked [[gnu::nonnull]], and we're losing that with [[gnu::nonnull_if_nonzero]].

I've been trying to convince Joseph, Aaron, and the C Committee that it was a terrible mistake to allow a null pointer here, precisely for this worry, and it seems my worries were correct.
[Bug fortran/103391] [12/13/14/15 Regression] ICE: gimplification failed since r7-4021-g574284e9c49687d8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103391 --- Comment #12 from Andre Vehreschild --- Mhhh, when one needs to know the "correct answer" to get it from an AI, what help is the AI then?
[Bug libstdc++/119089] FAIL: 23_containers/vector/debug/assign4_backtrace_neg.cc -std=gnu++17 (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119089 Xi Ruoyao changed: What|Removed |Added CC||xry111 at gcc dot gnu.org --- Comment #12 from Xi Ruoyao --- (In reply to Jonathan Wakely from comment #11) > (In reply to John David Anglin from comment #9) > > In addition to regenerating the gcc fixincludes, I believe gcc needs > > rebuilding as the initializer is used in gthr.h. > > I *think* it's an ABI-compatible change. At least it had better be! So > existing code shouldn't need to be recompiled. It seems so. We (Linux From Scratch) have some users who complained about the fixincludes issue but their GCC is just fine after removing the stale "fixed" headers, without being rebuilt.
[Bug tree-optimization/119103] shift not demotated when shift amount range is known
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #4 from Hongtao Liu --- vect_recog_over_widening_pattern could be extended with range info for this?
[Bug tree-optimization/119103] shift not demotated when shift amount range is known
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103 --- Comment #5 from Hongtao Liu --- (In reply to Hongtao Liu from comment #4) > vect_recog_over_widening_pattern could be extended with range info for this? Looks like vectorizer already have range_info from vect_determine_precisions_from_range
[Bug c/119095] GCC in Ubuntu 20.04, 22.04 and 24.04 all have this problem.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119095 Xi Ruoyao changed: What|Removed |Added CC||xry111 at gcc dot gnu.org --- Comment #2 from Xi Ruoyao --- (In reply to wzis from comment #0) > I submitted the bug a few days ago, but I couldn't find it any more Use the "Open bugs reported by me" link on https://gcc.gnu.org/bugzilla/. > but I asked Ubuntu for this, they didn't recognize it. It's their problem then, and it's not a valid reason to spam the upstream bug tracker.
[Bug middle-end/118874] [15 regression] ICE in copy_rtx, at rtl.cc:372
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118874 --- Comment #10 from Iain Sandoe --- In the coroutine handling to deal with https://eel.is/c++draft/dcl.fct.def.coroutine#7 we unconditionally create the return object in the slot - if we create it somewhere else, that causes us to produce an unexpected additional copy. So, I suppose, the difference is the unconditional use (it's not clear to me at the moment how to avoid that - the intent (AFAIU) is that the return object is available to the coroutine body (including initial suspend)).
[Bug c++/119102] GCC 15.0 'import std;' fails with Ofast (not with O3) due to some openmp internal error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119102 --- Comment #4 from Jonathan Wakely --- We're taking the non-OpenMP branch there, but the error from GCC seems to be incorrectly referring to OpenMP. The docs for attribute simd say: If the attribute is specified and #pragma omp declare simd is present on a declaration and the -fopenmp or -fopenmp-simd switch is specified, then the attribute is ignored.
[Bug tree-optimization/119103] Very suboptimal AVX2 code generation of simple shift loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2025-03-03

--- Comment #2 from Andrew Pinski ---
# RANGE [irange] int [0, 65535] MASK 0x VALUE 0x0
_5 = (intD.6) _4;
# RANGE [irange] int [0, 15] MASK 0xf VALUE 0x0
_6 = (intD.6) amount_11(D);
# RANGE [irange] int [0, 2147450880] MASK 0x7fff VALUE 0x0
_7 = _5 << _6;
_8 = (short unsigned intD.18) _7;

That should be able to reduce down to just:
_8 = _4 << _6;

Since _6 has a range for [0,15] so we know it is defined.

I suspect once that happens the other part will be optimized.
[Bug tree-optimization/119103] shift not demotated when shift amount range is known
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Very suboptimal AVX2 code   |shift not demotated when
                   |generation of simple shift  |shift amount range is known
                   |loop                        |

--- Comment #3 from Andrew Pinski ---
RTL handles &0xf but if the range is there we don't optimize it:

E.g. it can be shown by:
```
#include <stdint.h>
void lshift(uint16_t *x, uint8_t amount)
{
  x[0] = x[0] << (amount&0xF);
}

void lshift1(uint16_t *x, uint8_t amount)
{
  if (amount >= 16) __builtin_unreachable();
  x[0] = x[0] << (amount&0xF);
}
```
[Bug fortran/101577] [Interop] TYPE with BIND(C): Reject empty TYPE with zero components
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101577 --- Comment #2 from GCC Commits --- The master branch has been updated by Harald Anlauf : https://gcc.gnu.org/g:f9f16b9f74b767ca799a82f25be66a5fed25756d commit r15-7798-gf9f16b9f74b767ca799a82f25be66a5fed25756d Author: Harald Anlauf Date: Sun Mar 2 22:20:28 2025 +0100 Fortran: reject empty derived type with bind(C) attribute [PR101577] PR fortran/101577 gcc/fortran/ChangeLog: * symbol.cc (verify_bind_c_derived_type): Generate error message for derived type with no components in standard conformance mode, indicating that this is a GNU extension. gcc/testsuite/ChangeLog: * gfortran.dg/empty_derived_type.f90: Adjust dg-options. * gfortran.dg/empty_derived_type_2.f90: New test.