[Bug c++/59739] missed optimization: attribute ((pure)) with aggregate returns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59739 Andrew Pinski changed: What|Removed |Added CC||marco at technoboredom dot net --- Comment #6 from Andrew Pinski --- *** Bug 45115 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/45115] pure functions returning structs are not optimized.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45115 Andrew Pinski changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #5 from Andrew Pinski --- Dup of bug 59739; even though that is a newer bug it has some more analysis and even been assigned (will add the C++20 testcase there too). *** This bug has been marked as a duplicate of bug 59739 ***
[Bug rust/114629] rust-ast-resolve-expr contains bloated code for funny_error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114629 --- Comment #7 from Marc Poulhiès --- There's no language spec yet, it's WIP: https://github.com/rust-lang/rust/issues/113527 Currently, the reference is rustc and the goal is to match its current behavior. I think the frontend is not listed because it's currently not enabled/working-enough.
[Bug c++/59739] missed optimization: attribute ((pure)) with aggregate returns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59739 --- Comment #7 from Andrew Pinski --- From PR 108635 (via PR 45115), a C++20 testcase where we should be able to optimize into one call of `operator<=>`:
#include <compare>
struct S {
  std::weak_ordering operator<=>(const S&) const __attribute__((const));
};
int compare3way(S& a, S& b) {
  return (a < b) ? -1 : (a > b) ? 1 : 0;
}
[Bug tree-optimization/114647] missing DSE when looping over a VLA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114647 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #2 from Richard Biener ---
  /* If we visit this PHI by following a backedge then we have to
     make sure ref->ref only refers to SSA names that are invariant
     with respect to the loop represented by this PHI node.  */
  if (dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt),
                      gimple_bb (use_stmt))
      && !for_each_index (ref->ref ? &ref->ref : &ref->base,
                          check_name, gimple_bb (use_stmt)))
    return DSE_STORE_LIVE;
We could make this bail-out "delayed" until we hit the next possible use in the loop (of which there is none).
[Bug target/114639] [riscv] ICE in create_pre_exit, at mode-switching.cc:451
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114639 --- Comment #8 from Li Pan --- Found an even simpler reproducer:
#include <riscv_vector.h>
extern unsigned long get_vl ();
vbool16_t test (vuint64m4_t a) {
  unsigned long b;
  return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ());
}
../__RISC-V_INSTALL___RV64/bin/riscv64-unknown-elf-g++ -O3 -march=rv64gcv -c ref.c -S -o -
acc22d56e140220e7dc6c138918cb6754b6d1c0b enabled the vector ABI by default and triggers this assert in create_pre_exit. Replacing get_vl () with a local variable bypasses the issue. Will continue to investigate.
[Bug middle-end/114628] ICE with _BitInt returns_twice (and computed gotos) with -g and optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114628 --- Comment #4 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:7dd1f9d2ec422173f490d91b9173d4fa5d32d909 commit r14-9859-g7dd1f9d2ec422173f490d91b9173d4fa5d32d909 Author: Jakub Jelinek Date: Tue Apr 9 09:28:27 2024 +0200 bitint: Don't move debug stmts from before returns_twice calls [PR114628] Debug stmts are allowed by the verifier before the returns_twice calls. More importantly, they don't have a lhs, so the current handling of arg_stmts statements to force them on the edges ICEs. The following patch just keeps them where they were before. 2024-04-09 Jakub Jelinek PR middle-end/114628 * gimple-lower-bitint.cc (gimple_lower_bitint): Keep debug stmts before returns_twice calls as is, don't push them into arg_stmts vector/move to edges. * gcc.dg/bitint-105.c: New test.
[Bug middle-end/114628] ICE with _BitInt returns_twice (and computed gotos) with -g and optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114628 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #5 from Jakub Jelinek --- Fixed.
[Bug c/88058] gcc fails to detect use of out of scope variable ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88058 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Target Milestone|--- |12.0 Status|NEW |RESOLVED --- Comment #3 from Andrew Pinski --- GCC 12 adds: : In function 'f': :16:9: warning: dangling pointer 'p2' to 'buf' may be used [-Wdangling-pointer=] 16 | g( p2); // Bang ! | ^~ :12:22: note: 'buf' declared here 12 | char buf[ 10]; | ^~~ Which makes this a dup of bug 63272. *** This bug has been marked as a duplicate of bug 63272 ***
[Bug middle-end/63272] GCC should warn when using pointer to dead scoped variable within the same function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63272 --- Comment #10 from Andrew Pinski --- *** Bug 88058 has been marked as a duplicate of this bug. ***
[Bug c/89990] request warning: Use of out of scope compound literals
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89990 Bug 89990 depends on bug 88058, which changed state. Bug 88058 Summary: gcc fails to detect use of out of scope variable ? https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88058 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE
[Bug middle-end/63272] GCC should warn when using pointer to dead scoped variable within the same function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63272 Andrew Pinski changed: What|Removed |Added CC||dgilbert at redhat dot com --- Comment #11 from Andrew Pinski --- *** Bug 89990 has been marked as a duplicate of this bug. ***
[Bug middle-end/87403] [Meta-bug] Issues that suggest a new warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403 Bug 87403 depends on bug 89990, which changed state. Bug 89990 Summary: request warning: Use of out of scope compound literals https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89990 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE
[Bug c++/114580] Bogus warning on if constexpr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114580 --- Comment #2 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:cfed80b9e4f562c99679739548df9369117dd791 commit r14-9861-gcfed80b9e4f562c99679739548df9369117dd791 Author: Jakub Jelinek Date: Tue Apr 9 09:31:42 2024 +0200 c++: Fix up maybe_warn_for_constant_evaluated calls [PR114580] When looking at maybe_warn_for_constant_evaluated for the trivial infinite loops patch, I've noticed that it can emit weird diagnostics for if constexpr in templates, first warn that std::is_constant_evaluated() always evaluates to false (because the function template is not constexpr) and then during instantiation warn that std::is_constant_evaluated() always evaluates to true (because it is used in if constexpr condition). Now, only the latter is actually true, even when the if constexpr is in a non-constexpr function, it will still always evaluate to true. So, the following patch fixes it to call maybe_warn_for_constant_evaluated always with IF_STMT_CONSTEXPR_P (if_stmt) as the second argument rather than true if it is if constexpr with non-dependent condition etc. 2024-04-09 Jakub Jelinek PR c++/114580 * semantics.cc (finish_if_stmt_cond): Call maybe_warn_for_constant_evaluated with IF_STMT_CONSTEXPR_P (if_stmt) as the second argument, rather than true/false depending on if it is if constexpr with non-dependent constant expression with bool type. * g++.dg/cpp2a/is-constant-evaluated15.C: New test.
[Bug c/89990] request warning: Use of out of scope compound literals
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89990 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW |RESOLVED --- Comment #6 from Andrew Pinski --- The warning is now included in GCC 12. And this makes this a dup of bug 63272. *** This bug has been marked as a duplicate of bug 63272 ***
[Bug c++/114580] Bogus warning on if constexpr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114580 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Jakub Jelinek --- Fixed.
[Bug c++/114480] [12/13/14 Regression] g++: internal compiler error: Segmentation fault signal terminated program cc1plus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480 Richard Biener changed: What|Removed |Added Keywords|needs-bisection | --- Comment #29 from Richard Biener --- (In reply to Richard Biener from comment #28) > Bisecting the original 11->12 regression would be nice. r12-6329-g4f6bc28fc7dd86 - the correctness fix for PR66139 / PR52320
[Bug middle-end/114627] undefined behavior in tree-profile.cc while compiling gcc.misc-tests/gcov-18.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114627 --- Comment #2 from GCC Commits --- The master branch has been updated by Jørgen Kvalsvik : https://gcc.gnu.org/g:a2447556a5405d2cde20afc134b90cd1d199ce04 commit r14-9864-ga2447556a5405d2cde20afc134b90cd1d199ce04 Author: Jørgen Kvalsvik Date: Mon Apr 8 15:19:55 2024 +0200 Generate constant at start of loop, without UB Generating the constants used for recording the edges taken for condition coverage would trigger undefined behavior when an expression had exactly 64 (== sizeof (1ULL)) conditions, as it would generate the constant for the next iteration at the end of the loop body, even if there was never a next iteration. By moving the check and constant generation to the top of the loop and hoisting the increment flag there is no opportunity for UB. PR middle-end/114627 gcc/ChangeLog: * tree-profile.cc (instrument_decisions): Generate constant at the start of loop.
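The pattern the commit describes, generating the constant at the top of the loop instead of at the end of the body, can be sketched as follows. This is not the code from tree-profile.cc: `condition_masks` is a hypothetical helper, and the limit of 64 is assumed to be the bit width of the mask type.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not GCC code: returns one single-bit mask per
// condition, for at most 64 conditions.
std::vector<std::uint64_t> condition_masks(unsigned n) {
  std::vector<std::uint64_t> masks;
  for (unsigned i = 0; i < n && i < 64; ++i) {
    // The shift is evaluated only when iteration i really runs, so the
    // shift count stays within [0, 63]. Preparing the mask for i + 1 at
    // the end of the body would evaluate a shift by 64 after the 64th
    // condition, which is undefined behavior for a 64-bit type.
    masks.push_back(std::uint64_t{1} << i);
  }
  return masks;
}
```

With exactly 64 conditions the last mask has only the top bit set, and no shift by 64 is ever evaluated.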
[Bug gcov-profile/114599] [14 Regression] ICE: SIGSEGV in bitmap_set_bit(bitmap_head*, int) (bitmap.cc:975) with -O2 -fcondition-coverage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114599 --- Comment #9 from GCC Commits --- The master branch has been updated by Jørgen Kvalsvik : https://gcc.gnu.org/g:2daeb89d6f025d6daf7e560575863b3280120be8 commit r14-9863-g2daeb89d6f025d6daf7e560575863b3280120be8 Author: Jørgen Kvalsvik Date: Mon Apr 8 09:28:27 2024 +0200 Add tree-inlined gconds to caller cond->expr map Properly add the condition -> expression mapping of inlined gconds from the caller into the callee map. This is a fix for PR114599 that works beyond fixing the segfault, as the previous fix copied references to the source gconds, not the deep copied ones that end up in the callee body. The new test checks this, both in the case of a callee without conditions (which triggered the segfault), and a test that shows that conditions are properly mapped, and not mixed. PR middle-end/114599 gcc/ChangeLog: * tree-inline.cc (copy_bb): Copy cond_uids into callee. (prepend_lexical_block): Remove outdated comment. (add_local_variables): Remove bad cond_uids copy. gcc/testsuite/ChangeLog: * gcc.misc-tests/gcov-19.c: New test.
[Bug target/114639] [riscv] ICE in create_pre_exit, at mode-switching.cc:451
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114639 --- Comment #9 from Uroš Bizjak --- (In reply to Andrew Pinski from comment #2) > /* If we didn't see a full return value copy, verify that there >is a plausible reason for this. If some, but not all of the >return register is likely spilled, we can expect that there >is a copy for the likely spilled part. */ This part of the mode-switching pass is a real PITA. The trick here is with the calculation of forced_late_switch (but please see N.b. comment at the beginning of the function where some failed assumptions are described).
[Bug middle-end/93041] GCC 10 removes an infinite loop and causes a null pointer to dereferenced
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93041 Xi Ruoyao changed: What|Removed |Added CC||xry111 at gcc dot gnu.org --- Comment #12 from Xi Ruoyao --- (In reply to Ganton from comment #11) > A related assigned task is: > [C++26] P2809R3 - Trivial infinite loops are not undefined behavior > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114462 Note that even with P2809R3 the test case in this ticket is still invalid, as explained in comment 9.
[Bug rtl-optimization/112560] [14 Regression] ICE in try_combine on pr112494.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112560 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #8 from Richard Biener --- Fixed I suppose.
[Bug rtl-optimization/112560] [14 Regression] ICE in try_combine on pr112494.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112560 --- Comment #9 from Uroš Bizjak --- (In reply to Richard Biener from comment #8) > Fixed I suppose. Yes - I plan to backport the patch to at least gcc-13.
[Bug target/114639] [riscv] ICE in create_pre_exit, at mode-switching.cc:451
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114639 --- Comment #10 from Li Pan --- The riscv backend's
#define FUNCTION_VALUE_REGNO_P(N) ((N) == GP_RETURN || (N) == FP_RETURN)
doesn't honor vector modes. Then the part below
  if (!targetm.calls.function_value_regno_p (copy_start))
    copy_num = 0;
  else
    copy_num = hard_regno_nregs (copy_start,
                                 GET_MODE (copy_reg));
will have copy_num == 0 and then take a different code path. Let me run the full riscv regression test for this fix first.
[Bug target/113233] LoongArch: target options from LTO objects not respected during linking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233 --- Comment #12 from GCC Commits --- The master branch has been updated by LuluCheng : https://gcc.gnu.org/g:8657d76d583f0f87000e9003ba75922f2bbe4455 commit r14-9866-g8657d76d583f0f87000e9003ba75922f2bbe4455 Author: Yang Yujie Date: Mon Apr 8 16:45:13 2024 +0800 LoongArch: Enable switchable target This patch fixes the back-end context switching in cases where functions should be built with their own target contexts instead of the global one, such as LTO linking and functions with target attributes (TBD). PR target/113233 gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_reg_init): Reinitialize the loongarch_regno_mode_ok cache. (loongarch_option_override): Same. (loongarch_save_restore_target_globals): Restore target globals. (loongarch_set_current_function): Restore the target contexts for functions. (TARGET_SET_CURRENT_FUNCTION): Define. * config/loongarch/loongarch.h (SWITCHABLE_TARGET): Enable switchable target context. * config/loongarch/loongarch-builtins.cc (loongarch_init_builtins): Initialize all builtin functions at startup. (loongarch_expand_builtin): Turn assertion of builtin availability into a test. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Define condition loongarch_sx_as. * gcc.dg/lto/pr113233_0.c: New test.
[Bug libstdc++/114645] std::chrono::current_zone ignores $TZ on Linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114645 --- Comment #8 from Jonathan Wakely --- Yes, if an application assumes that chrono::current_zone matches $TZ, that's a bug in the application. None of libstdc++, LLVM libc++, MSVC STL or the date/tz.h reference implementation uses $TZ for chrono::current_zone, and I don't see how we could do so without breaking the guarantee that locate_zone(current_zone()->name()) works. The lifetime and ownership of the pointer returned by current_zone would also be unclear if it didn't return one of the IANA zones owned by a tzdb object. The C++ library is extensible outside of namespace std, for example: https://github.com/HowardHinnant/date/blob/master/include/date/ptz.h (that uses the types and constants from the date namespace defined in date/tz.h but that can be replaced by namespace date { using namespace std::chrono; }) We could provide something similar as an extension, but it wouldn't be used automatically by chrono::current_zone, because POSIX time zones (as used by libc) are not IANA zones.
[Bug tree-optimization/111293] [14 Regression] Missed Dead Code Elimination since r14-3414-g0cfc9c953d0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111293 Mikael Morin changed: What|Removed |Added CC||mikael at gcc dot gnu.org --- Comment #3 from Mikael Morin --- pre replaces usages of the 'e' global with a set of 'pretmp' and 'prephitmp' ssa registers. With gcc-13, the value of 'e' is reloaded directly after the call to 'foo', and that value is joined with a phi in the next bb: [local count: 7761689018]: foo (); pretmp_25 = e; goto ; [100.00%] ... [local count: 8963573811]: # j_28 = PHI <1(10), j_30(24), 1(22), j_30(25), 1(26), j_30(20)> # i_29 = PHI # prephitmp_38 = PHI d.7_22 = d; _23 = d.7_22 + 1; d = _23; if (_23 != 0) goto ; [94.50%] else goto ; [5.50%] With gcc-14, the value of 'e' is reloaded later in the next bb, causing a dependency on 'e', even on paths not calling 'foo': [local count: 7761689018]: foo (); goto ; [100.00%] ... [local count: 8963573796]: # i_28 = PHI d.7_22 = d; _23 = d.7_22 + 1; d = _23; pretmp_10 = e; if (_23 != 0) goto ; [94.50%] else goto ; [5.50%] Later on this prevents copyprop from simplifying the 'pretmp' and 'prephitmp' values to 3 and remove the branch calling 'foo'.
[Bug libstdc++/114645] std::chrono::current_zone ignores $TZ on Linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114645 --- Comment #9 from Hristo Venev --- I stumbled upon this comment in the library you linked: https://github.com/HowardHinnant/date/blob/0e65940a7fbc4ed617a1ee111a60311eccbead9a/include/date/tz.h#L35 That comment is wrong in its explanation of the mechanism used to determine the local time zone on Linux. However, it clearly shows that the intent is to match the platform's "local time" as closely as reasonably possible. The implementation also has some comments: https://github.com/HowardHinnant/date/blob/0e65940a7fbc4ed617a1ee111a60311eccbead9a/src/tz.cpp#L3936 The intent seems to be clear -- apply a lot of heuristics to try to match what libc would do as closely as possible. Even on Linux there are no guarantees whatsoever that it is possible to extract an IANA time zone from /etc/localtime. In fact, the problem is exactly identical to that with $TZ, if not worse -- $TZ is normally an IANA time zone name, whereas /etc/localtime is a symlink (but sometimes a hardlink or a copy) of a file in some OS-specific directory (sometimes, but not always, /usr/share/zoneinfo) where the name of the file relative to the base directory is an IANA time zone name.
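The heuristic described above, recovering a zone name from the path that /etc/localtime points to, might look like the following sketch. It is not the code from date/tz.cpp or libstdc++: `zone_from_target` is a hypothetical helper, the base directory is an assumption passed in by the caller, and the hardlink and copy cases mentioned in the comment are not handled at all.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: given the target of a symlink such as /etc/localtime
// and an assumed zoneinfo base directory, return the path relative to that
// base, which is expected to be an IANA zone name.
std::optional<std::string> zone_from_target(
    std::string_view target,
    std::string_view base = "/usr/share/zoneinfo/") {
  // Only a target strictly below the base directory yields a name.
  if (target.size() <= base.size() ||
      target.substr(0, base.size()) != base)
    return std::nullopt;
  return std::string(target.substr(base.size()));
}
```

A target outside the assumed base directory gives no result, which is the failure mode the comment points out: the base directory is OS-specific, so the heuristic carries no guarantee.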
[Bug libstdc++/114645] std::chrono::current_zone ignores $TZ on Linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114645 --- Comment #10 from Xi Ruoyao --- > rust's `chrono` Note that this is really a bad example because of CVE-2020-26235.
[Bug tree-optimization/111293] [14 Regression] Missed Dead Code Elimination since r14-3414-g0cfc9c953d0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111293 --- Comment #4 from Mikael Morin --- For what it's worth, adding -fno-tree-vrp "fixes" this and enables removal of the call to 'foo' with trunk. Here is a minimal revert of the regressing revision, but it may just make the problem latent.
diff --git a/gcc/gimple-range-phi.cc b/gcc/gimple-range-phi.cc
index 01900a35b32..9fa9fe83ce0 100644
--- a/gcc/gimple-range-phi.cc
+++ b/gcc/gimple-range-phi.cc
@@ -386,14 +386,6 @@ phi_analyzer::process_phi (gphi *phi)
 		      m_work.safe_push (arg);
 		      continue;
 		    }
-		  // More than 2 outside names is too complicated.
-		  if (m_num_extern >= 2)
-		    {
-		      cycle_p = false;
-		      break;
-		    }
-		  m_external[m_num_extern] = arg;
-		  m_ext_edge[m_num_extern++] = gimple_phi_arg_edge (phi_stmt, x);
 		}
 	      else if (code == INTEGER_CST)
 		{
@@ -402,12 +394,15 @@ phi_analyzer::process_phi (gphi *phi)
 				wi::to_wide (arg));
 		  init_range.union_ (val);
 		}
-	      else
+	      // More than 2 outside names/CONST is too complicated.
+	      if (m_num_extern >= 2)
 		{
-		  // Everything else terminates the cycle.
 		  cycle_p = false;
 		  break;
 		}
+
+	      m_external[m_num_extern] = arg;
+	      m_ext_edge[m_num_extern++] = gimple_phi_arg_edge (phi_stmt, x);
 	    }
 	}
[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #76 from Jan Hubicka --- There is still a problem with loop bounds. I am testing a patch for that, and then we should (finally) be safe.
[Bug libstdc++/114645] std::chrono::current_zone ignores $TZ on Linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114645 --- Comment #11 from Hristo Venev --- I never said that reading $TZ is easy, just that not doing it is (in my opinion) wrong.
[Bug ipa/113291] [14 Regression] compilation never (?) finishes with recursive always_inline functions at -O and above since r14-2172
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113291 --- Comment #8 from Jan Hubicka --- I am not sure this ought to be P1:
- the compilation technically is finite, but not in reasonable time
- it is possible to adjust the testcase (do early inlining manually) and get the same infinite build on release branches
- if you ask for an inline bomb, you get it.
But after some more testing, I do not see a reasonably easy way to get better diagnostics. So I will retest the patch from #6 and go ahead with it.
[Bug middle-end/114653] New: Not vectoring the loop with openmp reduction.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653
Bug ID: 114653
Summary: Not vectoring the loop with openmp reduction.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: kugan at gcc dot gnu.org
Target Milestone: ---
Created attachment 57910 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57910&action=edit testcase
The main loop in the attached test case is not vectorized with -fopenmp. It gets vectorized with -fopenmp-simd. In the case of -fopenmp, the reduction variables lax, lay, laz get assigned to an array. The data reference calculation for this seems to fail. See:
offset from base address: (ssizetype) ((sizetype) _20 * 4)
constant offset from base address: 0
step: 0
base alignment: 16
base misalignment: 0
offset alignment: 4
step alignment: 128
base_object: D.4806[_20]
Creating dr for D.4808[_20]
analyze_innermost: Applying pattern match.pd:219, generic-match-1.cc:3190
test.cpp:37:9: missed: failed: evolution of offset is not affine.
command used: test.cpp -Ofast -fopenmp -mcpu=neoverse-v2
gcc -v:
Using built-in specs.
COLLECT_GCC=/home/kvivekananda/install/bin/gcc
COLLECT_LTO_WRAPPER=/home/kvivekananda/install/libexec/gcc/aarch64-unknown-linux-gnu/14.0.1/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../gcc/configure --enable-multiarch=yes --enable-languages=c,c++,fortran,lto --disable-bootstrap --prefix=/home/kvivekananda/install
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240314 (experimental) (GCC)
[Bug target/113233] LoongArch: target options from LTO objects not respected during linking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233 --- Comment #13 from Xi Ruoyao --- Will we backport the fix to 13 and 12?
[Bug target/114576] [14 regression] VEX-prefixed AES instruction without AVX enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114576 --- Comment #5 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:a79d13a01f8cbb99fb45bf3f3ffc62c99ee0b05e commit r14-9869-ga79d13a01f8cbb99fb45bf3f3ffc62c99ee0b05e Author: Jakub Jelinek Date: Tue Apr 9 12:35:18 2024 +0200 i386: Fix aes/vaes patterns [PR114576] On Wed, Apr 19, 2023 at 02:40:59AM +, Jiang, Haochen via Gcc-patches wrote: > > > (define_insn "aesenc" > > > - [(set (match_operand:V2DI 0 "register_operand" "=x,x") > > > - (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x") > > > - (match_operand:V2DI 2 "vector_operand" "xBm,xm")] > > > + [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") > > > + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") > > > + (match_operand:V2DI 2 "vector_operand" > > > + "xBm,xm,vm")] > > > UNSPEC_AESENC))] > > > - "TARGET_AES" > > > + "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" > > >"@ > > > aesenc\t{%2, %0|%0, %2} > > > + vaesenc\t{%2, %1, %0|%0, %1, %2} > > > vaesenc\t{%2, %1, %0|%0, %1, %2}" > > > - [(set_attr "isa" "noavx,avx") > > > + [(set_attr "isa" "noavx,aes,avx512vl") > > Shouldn't it be vaes_avx512vl and then remove " || (TARGET_VAES && > > TARGET_AVX512VL)" from condition. > > Since VAES should not imply AES, we need that "|| (TARGET_VAES && > TARGET_AVX512VL)" > > And there is no need to add vaes_avx512vl since the last alternative will only > be hit when there is no aes. When there is no aes, the pattern will need vaes > and avx512vl both or we could not use this pattern. avx512vl here is just like > a placeholder. As the following testcase shows, the above change was incorrect. 
Using aes isa for the second alternative is obviously wrong, aes is enabled whenever -maes is, regardless of -mavx or -mno-avx, so the above change means that for -maes -mno-avx RA can choose, either it matches the first alternative with the dup operand, or it matches the second one (but that is of course wrong because vaesenc VEX encoded insn needs AES & AVX CPUID). The big question is if "Since VAES should not imply AES" is the case or not. Looking around at what LLVM does on godbolt, seems since clang 6 which added -mvaes support -mvaes there implies -maes, but GCC treats those two independent. Now, if we'd take the LLVM path of making -mvaes imply -maes and -mno-aes imply -mno-vaes, then we should probably just revert the above patch and tweak common/config/i386/ to do the implications (+ add the testcase from this patch). If we keep the current behavior, where AES and VAES are completely independent extensions, then we need to do more changes as the following patch attempts to do. We should use the aesenc etc. insns for noavx as before, we know at that point that TARGET_AES must be true because (TARGET_VAES && TARGET_AVX512VL) won't be true when !TARGET_AVX - TARGET_AVX512VL implies TARGET_AVX. For the second alternative, i.e. the AVX AES VEX or VAES AVX512F EVEX case without using %xmm16+/EGPR regs, the patch uses avx isa, but we need to emit {evex} prefix in the assembly if AES ISA is not enabled. For the last alternative, we need to use a new vaes_avx512vl isa attribute, because the %xmm16+/EGPR support is there only if both VAES and AVX512VL is enabled, not just AVX and AES. Still, I wonder if -mvaes shouldn't imply at least -mavx512f and -mno-avx512f shouldn't imply -mno-vaes, because otherwise can't see how it could use 512-bit registers (this part not done in the patch). 2024-04-09 Jakub Jelinek PR target/114576 * config/i386/i386.md (isa): Remove aes, add vaes_avx512vl. (enabled): Remove aes isa check, add vaes_avx512vl. 
* config/i386/sse.md (aesenc, aesenclast, aesdec, aesdeclast): Use jm instead of m for second alternative and emit {evex} prefix for it if !TARGET_AES. Use noavx,avx,vaes_avx512vl isa attribute. (vaesdec_, vaesdeclast_, vaesenc_, vaesenclast_): Add second alternative with x instead of v and jm instead of m. * gcc.target/i386/aes-pr114576.c: New test.
[Bug target/114576] [14 regression] VEX-prefixed AES instruction without AVX enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114576 Jakub Jelinek changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #6 from Jakub Jelinek --- Fixed.
[Bug c++/114654] New: Alias template cannot be found
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114654 Bug ID: 114654 Summary: Alias template cannot be found Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: fchelnokov at gmail dot com Target Milestone: --- This program ``` template struct S {}; template using A = S; template using B = A<[]{}>; using C = B; ``` looks valid and it is accepted by Clang and MSVC, but GCC complains: :10:11: error: 'B' does not name a type 10 | using C = B; Online demo: https://gcc.godbolt.org/z/P4ozcd6oW Related discussion: https://stackoverflow.com/q/78292459/7325599
[Bug rust/114629] rust-ast-resolve-expr contains bloated code for funny_error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114629 --- Comment #8 from Iain Sandoe --- (In reply to Andrew Pinski from comment #5) > (In reply to Pierre-Emmanuel Patry from comment #2) > While you are at it, it would be useful to add a link to the rust langauge > specification (like there is for almost all other languages [I see > objective-C is not listed]) to https://gcc.gnu.org/readings.html . This is getting a bit off-topic for the current PR - but, for the record, I am not aware of any formal spec for Objective-C/C++ - the API is described in Apple's developer documentation and compliance is assessed (at least by me) in terms of "do we implement the things that we claim, in the same way as the system compiler"?
[Bug c/52534] gcc doesn't detect incorrect expression in call to va_start
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52534 --- Comment #5 from Jakub Jelinek --- Note, if we warn, we shouldn't warn for C23 or later, because one can pass anything there, like 3 arguments, or that (unsigned int)n_args, or just one, etc. And __builtin_va_start (ap, 0) is what is used regardless of the passed argument in that case.
[Bug c++/114480] [12/13/14 Regression] g++: internal compiler error: Segmentation fault signal terminated program cc1plus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480 --- Comment #30 from Richard Biener --- Temporarily memory goes to out-of-SSA computing liveness (2GB), after RTL expansion it drops to ~900MB peak again. Then df-init-at-O0 bumps to 1.3GB and then IRA goes all the way to >3GB. We're using lra_simple_p because num_used_regs is 343033 and last_basic_block is 117142. It's df_analyze () which brings memory usage to 2.8GB. After ira_build () it's 3.4GB. There's 1.5GB used in LR problem data bitmaps, we are not separating the local computed bitmaps from the global solution (the DF machinery doesn't have a separate "local compute" free, possibly the finalize hook can be used for this). Though the local compute bitmaps only use around 20MB. The DF bitmap obstack itself is quite small as well. Of course it's all because we fail to implement the initializers EH cleanup in a more sensible way. Complete "elements" could be processed in a loop over the containers elements while only a single partially under-construction element would need to be handled "inline".
[Bug analyzer/114472] [14 Regression] ICE: in falls_short_of_p, at analyzer/store.cc:365 (in exceeds_p, at analyzer/store.cc:342) with -fanalyzer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114472 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- strncpy (..., -1) is an always UB case; the testcase has UB already on &s - 3, but get_next_bit_offset of -32 suggests that it is somehow adding the -3 * BITS_PER_UNIT offset from the source and the -1 * BITS_PER_UNIT from the size. That is wrong: first of all, strncpy copies at most that many bytes from the source, and in a valid program there would need to be a '\0' far before that, as one can't do pointer arithmetic in char * past half of the address space; plus the size isn't negative, it is positive 0xffffffffffffffffULL on x86-64. The reason strncpy would be UB with a last argument in [LONG_MAX + 1UL, ULONG_MAX] is that it then has to fill the rest of the buffer with '\0's, and again the pointer arithmetic isn't well defined in that case.
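The point that the size argument is not negative follows from the conversion of -1 to the unsigned type size_t. A minimal sketch of that conversion alone, without calling strncpy (which would be UB, as the comment says); `size_from_minus_one` is a hypothetical name:

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// The value a size_t parameter receives when the caller passes -1.
constexpr std::size_t size_from_minus_one() {
  return static_cast<std::size_t>(-1);
}

// Conversion to an unsigned type is modular, so -1 becomes the largest
// representable value rather than a negative size.
static_assert(size_from_minus_one() ==
                  std::numeric_limits<std::size_t>::max(),
              "-1 converts to the largest size_t value");
static_assert(size_from_minus_one() > 0, "the converted size is positive");
```

On a target with a 64-bit size_t that value is the 0xffffffffffffffff mentioned in the comment, so an analysis that treats the size as -1 * BITS_PER_UNIT starts from the wrong number.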
[Bug gcov-profile/113765] [14 Regression] ICE: autofdo: val-profiler-threads-1.c compilation, error: probability of edge from entry block not initialized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113765 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #9 from Richard Biener --- Fixed.
[Bug gcov-profile/114601] ICE: SIGSEGV in hash_table_mod1 (hash-table.h:344) with -fcondition-coverage -finstrument-functions-once
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114601 --- Comment #2 from GCC Commits --- The master branch has been updated by Jørgen Kvalsvik : https://gcc.gnu.org/g:dd78e6a3cbd8f7c678d90ca0d05787faeb2e9c9a commit r14-9870-gdd78e6a3cbd8f7c678d90ca0d05787faeb2e9c9a Author: Jørgen Kvalsvik Date: Tue Apr 9 13:39:03 2024 +0200 Guard function->cond_uids access [PR114601] PR114601 shows that it is possible to reach the condition_uid lookup without having also created the fn->cond_uids, through compiler-generated conditionals. Consider all lookups on non-existing maps misses, which they are from the perspective of the source code, to avoid the NULL access. PR gcov-profile/114601 gcc/ChangeLog: * tree-profile.cc (condition_uid): Guard fn->cond_uids access. gcc/testsuite/ChangeLog: * gcc.misc-tests/gcov-pr114601.c: New test.
[Bug target/114639] [riscv] ICE in create_pre_exit, at mode-switching.cc:451
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114639 --- Comment #11 from Li Pan --- (In reply to Li Pan from comment #10) > The #define FUNCTION_VALUE_REGNO_P(N) ((N) == GP_RETURN || (N) == FP_RETURN) > of the riscv backend doesn't honor vector mode. Then the below part > > 370 if (!targetm.calls.function_value_regno_p > (copy_start)) > > 371 copy_num = 0; > > 372 else > 373 copy_num = hard_regno_nregs (copy_start, > 374 GET_MODE (copy_reg)); > > will have copy_num == 0 and then went to a different code path. > > Let me run fully riscv regression test for this fix first. Maybe misunderstand here, need to double-check the vector ABI for return values.
[Bug c++/114654] Alias template cannot be found
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114654 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #1 from Patrick Palka --- dup *** This bug has been marked as a duplicate of bug 92707 ***
[Bug c++/92707] type alias on type alias on lambda in unevaluated context does not work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92707 Patrick Palka changed: What|Removed |Added CC||fchelnokov at gmail dot com --- Comment #2 from Patrick Palka --- *** Bug 114654 has been marked as a duplicate of this bug. ***
[Bug lto/114655] New: -flto=4 at link time does not override -flto=auto from compile time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114655 Bug ID: 114655 Summary: -flto=4 at link time does not override -flto=auto from compile time Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- It looks like we now cleverly "merge" IL and link-time -flto, preferring =auto or larger =N. That makes it no longer possible to debug with fewer jobs active. IMO link time -flto should override any compile-time setting.
[Bug lto/114655] -flto=4 at link time does not override -flto=auto from compile time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114655 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2024-04-09 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1
[Bug lto/114655] -flto=4 at link time does not override -flto=auto from compile time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114655 Richard Biener changed: What|Removed |Added Known to work||11.4.0 Known to fail||12.1.0 --- Comment #1 from Richard Biener --- This changed in r12-824-g3cbcb5d0cfcd17, GCC 11 still "works".
[Bug middle-end/114653] Not vectorizing the loop with openmp reduction.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114653 Richard Biener changed: What|Removed |Added Blocks||53947 Keywords||openmp Summary|Not vectoring the loop with |Not vectorizing the loop |openmp reduction. |with openmp reduction. --- Comment #1 from Richard Biener --- Note the cited problem isn't the problem. There's several OMP reduction vectorizing PRs already. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 [Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980 --- Comment #14 from Michael Matz --- (In reply to Kewen Lin from comment #13) > (In reply to Giuliano Belinassi from comment #12) > > With your patch we have: > > > > > .LPFE0: > > > ... > > Which seems what is expected. > > Hi Giuliano, thanks for your time on testing it! Could you kindly help to > explain a bit on why "In such way we can't use the this space to place a > trampoline to the new function"? Is it due to inefficient code like needing > more branches? > > global entry: > [b localentry] > L1: > [patched code] > localentry: > [b L1] > > Or some other reason which makes it unused at all? Hmm? But this is not how the global-to-local hand-off is implemented (and expected by tooling): a fall-through. The global entry sets up the GOT register, there simply is no '[b localentry]'. If you mean to imply that also the '[b localentry]' should be patched in at live-patch application time (and hence the GOT setup would need to be moved to still somewhere else), then you have the problem that (in the not-yet-patched case) as long as the L1-nops sit between global and local entry they will always be executed when the global entry is called. That's wasteful. Additionally tooling will be surprised if the address difference between global and local entry isn't exactly 8 (i.e. two instructions). The psABI allows for different values, of course. But I'm willing to bet that there are bugs in the wild when different values would be actually used. So, the nops-between-gep-and-lep could probably be somehow made to work with userspace live patching, but your most recent patch here makes this all moot. It generates exactly the sequence we want: a single nop at the LEP, and a configurable patching area outside of, but near to, the function (here: in front of the GEP).
[Bug target/114656] New: ~5% slowdown of 538.imagick_r on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114656 Bug ID: 114656 Summary: ~5% slowdown of 538.imagick_r on aarch64 Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization, needs-bisection Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: pheeck at gcc dot gnu.org Blocks: 26163 Target Milestone: --- Host: aarch64-gnu-linux Target: aarch64-gnu-linux As seen here https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=585.507.0 between commits r14-9649-gbb04a11418f54c r14-9728-g6fc84f680d098f there was a 5% exec time slowdown of 538.imagick_r SPEC2017 benchmark on aarch64 with compilation options -Ofast 5% slowdown can be also seen with -Ofast -march=native https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=584.507.0 The CPU is Ampere Altra Neoverse N1. Btw here are the same graphs including also data for GCC13. It is clear that this isn't a regression against GCC13. https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.7=882.507.0&plot.8=585.507.0&; https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.7=881.507.0&plot.8=584.507.0&; Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 [Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
[Bug lto/114655] -flto=4 at link time does not override -flto=auto from compile time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114655 --- Comment #2 from GCC Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:32fb04adae90a0ea68e64e8fc3cb04b613b2e9f3 commit r14-9872-g32fb04adae90a0ea68e64e8fc3cb04b613b2e9f3 Author: Richard Biener Date: Tue Apr 9 14:25:57 2024 +0200 lto/114655 - -flto=4 at link time doesn't override -flto=auto at compile time The following adjusts -flto option processing in lto-wrapper to have link-time -flto override any compile time setting. PR lto/114655 * lto-wrapper.cc (merge_flto_options): Add force argument. (merge_and_complain): Do not force here. (run_gcc): But here to make the link-time -flto option override any compile-time one.
[Bug lto/114655] [12/13 Regression] -flto=4 at link time does not override -flto=auto from compile time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114655 Richard Biener changed: What|Removed |Added Summary|-flto=4 at link time does |[12/13 Regression] -flto=4 |not override -flto=auto |at link time does not |from compile time |override -flto=auto from ||compile time Priority|P3 |P2 Known to work||14.0 Target Milestone|--- |12.4 --- Comment #3 from Richard Biener --- Fixed on trunk so far, queued for backporting.
[Bug c/114657] New: Invalid type conversion from some _BitInt bit-fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114657 Bug ID: 114657 Summary: Invalid type conversion from some _BitInt bit-fields Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: juuso.alasuutari at gmail dot com Target Milestone: --- Created attachment 57911 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57911&action=edit Preprocessed test program source Summary --- GCC's type conversion from some bit-precise integer-typed bit-fields does not match the description in the C23 standard. Expected vs. observed behavior -- Based on footnote on page 47 of ISO/IEC 9899/2024 6.3.1.1 (n3320.pdf), the converted type of a _BitInt bit-field should match the original bit-precise integer: "E.g. unsigned _BitInt(7): 2 is a bit-field that can hold the values 0, 1, 2, 3, and converts to unsigned _BitInt(7)." My expectation would then be to see the following output, which does in fact happen when compiling with clang-19: $ ./test unsigned _BitInt(7) This is what GCC does: $ ./test unsigned _BitInt(2) Output of gcc-14 -v --- Using built-in specs. 
COLLECT_GCC=gcc-14 COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-linux-gnu/14/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 14-20240330-1' --with-bugurl=file:///usr/share/doc/gcc-14/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --prefix=/usr --with-gcc-major-version-only --program-suffix=-14 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-libstdcxx-backtrace --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-14-14-20240330/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/reproducible-path/gcc-14-14-20240330/debian/tmp-gcn/usr --enable-offload-defaulted --without-cuda-driver --enable-checking=yes,extra,rtl --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.1 20240330 (experimental) [master r14-9728-g6fc84f680d0] (Debian 14-20240330-1) Compilation command --- gcc-14 -std=gnu23 -Wall -Wextra -Wpedantic -save-temps -o test test.c
[Bug c/114657] Invalid type conversion from some _BitInt bit-fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114657 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org, ||jsm28 at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- I believe Joseph said this isn't well specified in the C standard, see https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625765.html If you use something like + 0uwb it should be the type after the integral promotions and so what the standard specifies for integral promotions of the _BitInt bit-fields.
[Bug c/114657] Invalid type conversion from some _BitInt bit-fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114657 --- Comment #2 from Jakub Jelinek --- In particular, 6.5.1.1/2 says "The type of the controlling expression is the type of the expression as if it had undergone an lvalue conversion, array to pointer conversion, or function to pointer conversion." but doesn't list integer promotions (if those were to occur, you'd e.g. never be able to match a char, signed char, unsigned char, short, unsigned short types with _Generic because everything would be promoted to int).
[Bug target/114656] ~5% slowdown of 538.imagick_r on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114656 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- Can you try to revert r14-9692 if that commit isn't the cause?
[Bug fortran/113956] [13/14 Regression] ice in gfc_trans_pointer_assignment, at fortran/trans-expr.cc:10524
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113956 --- Comment #7 from GCC Commits --- The master branch has been updated by Paul Thomas : https://gcc.gnu.org/g:88aea122a7ee639230bf17a9eda4bf8a5eb7e282 commit r14-9873-g88aea122a7ee639230bf17a9eda4bf8a5eb7e282 Author: Paul Thomas Date: Tue Apr 9 15:23:46 2024 +0100 Fortran: Fix ICE in gfc_trans_pointer_assignment [PR113956] 2024-04-09 Paul Thomas gcc/fortran PR fortran/113956 * trans-expr.cc (gfc_trans_pointer_assignment): Remove assert causing the ICE since it was unnecessary. gcc/testsuite/ PR fortran/113956 * gfortran.dg/pr113956.f90: New test.
[Bug fortran/114535] [13/14 regression] ICE with elemental finalizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114535 --- Comment #5 from GCC Commits --- The master branch has been updated by Paul Thomas : https://gcc.gnu.org/g:de82b0cf981e49a0bda957c0ac31146b17407e23 commit r14-9874-gde82b0cf981e49a0bda957c0ac31146b17407e23 Author: Paul Thomas Date: Tue Apr 9 15:27:28 2024 +0100 Fortran: Fix ICE in trans-stmt.cc(gfc_trans_call) [PR114535] 2024-04-09 Paul Thomas gcc/fortran PR fortran/114535 * resolve.cc (resolve_symbol): Remove last chunk that checked for finalization of unreferenced symbols. gcc/testsuite/ PR fortran/114535 * gfortran.dg/pr114535d.f90: New test. * gfortran.dg/pr114535iv.f90: Additional source.
[Bug c/114657] Invalid type conversion from some _BitInt bit-fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114657 --- Comment #3 from Joseph S. Myers --- https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2958.htm has my analysis of the various notions of "type" used in relation to bit-fields and the questions of what expressions are considered to have special properties associated with referring to a bit-field.
[Bug driver/114658] New: branch "releases/gcc-13" builds "gcc version 14.0.1 (experimental)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114658 Bug ID: 114658 Summary: branch "releases/gcc-13" builds "gcc version 14.0.1 (experimental)" Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: driver Assignee: unassigned at gcc dot gnu.org Reporter: felix-gcc at fefe dot de Target Milestone: --- Not sure how and where to file this bug, sorry. I'm trying to build the current stable release branch, i.e. 13.2 with bug fixes from git. So I do git checkout releases/gcc-13 and build gcc, but the result doesn't say it is gcc 13.2.1, it says it's gcc 14.0.1 (experimental). Shouldn't this branch contain the non-experimental version?
[Bug driver/114658] branch "releases/gcc-13" builds "gcc version 14.0.1 (experimental)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114658 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- That doesn't seem possible. Are you sure you've correctly checked out the releases/gcc-13 branch? That branch should contain gcc/BASE-VER with 13.2.1 in it and the resulting compiler should report that, like in: ./xgcc -B ./ -v Reading specs from ./specs COLLECT_GCC=./xgcc COLLECT_LTO_WRAPPER=./lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=default,ada,obj-c++,lto,go,d,m2 --enable-checking=yes,rtl,extra --enable-libstdcxx-backtrace=yes --with-isl=/usr/src/isl-0.24/obji/ Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 13.2.1 20240329 (GCC)
[Bug target/114639] [riscv] ICE in create_pre_exit, at mode-switching.cc:451
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114639 --- Comment #12 from Li Pan --- #include extern unsigned long get_vl (); #if 0 #else vint32m1_t test (vint32m1_t a) { unsigned b; return __riscv_vadd_vx_i32m1 (a, b, get_vl ()); // No ICE } vbool16_t test (vuint64m4_t a) { unsigned long b; return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ()); // ICE } #endif This comes from the part below: !(targetm.class_likely_spilled_p (REGNO_REG_CLASS (ret_start))); For RVV, the reg_class sizes are listed below. Because the Vector Mask class has only one register, it is considered likely spilled, as the hook TARGET_CLASS_LIKELY_SPILLED_P by default returns true if reg_class_size[class] == 1. I am not sure whether overriding the TARGET_CLASS_LIKELY_SPILLED_P hook for riscv is a reasonable fix; still trying to understand TARGET_CLASS_LIKELY_SPILLED_P... panli-reg_class_size[0]=0 panli-reg_class_size[1]=14 panli-reg_class_size[2]=26 panli-reg_class_size[3]=32 panli-reg_class_size[4]=32 panli-reg_class_size[5]=2 panli-reg_class_size[6]=1 <= VM panli-reg_class_size[7]=31 <= VD panli-reg_class_size[8]=32 <= V panli-reg_class_size[9]=98
[Bug c/114659] New: gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659 Bug ID: 114659 Summary: gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN Product: gcc Version: 13.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: bruno at clisp dot org Target Milestone: --- Created attachment 57912 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57912&action=edit test case tf.c In the two attached test cases, gcc miscompiles a __builtin_memcpy invocation. In the first test case, the data type is a 'float' (4 bytes). In the second test case, the data type is a 'double' (8 bytes). A value of this data type exists in memory, given as *x and *y. A modified copy of this value, convert_snan_to_qnan(value), exists also in the stack, among the local variables. gcc implements the __builtin_memcpy operation by accessing convert_snan_to_qnan(value) instead of the original value. How to reproduce: $ gcc-version 13.2.0 -m32 -Wall tf.c $ ./a.out ; echo $? 0 $ gcc-version 13.2.0 -m32 -Wall -O2 tf.c $ ./a.out ; echo $? 1 $ gcc-version 13.2.0 -m32 -Wall td.c $ ./a.out ; echo $? 0 $ gcc-version 13.2.0 -m32 -Wall -O2 td.c $ ./a.out ; echo $? 
1 Analysis: $ gcc-version 13.2.0 -m32 -Wall -O2 -S tf.c tf.c has this function: int my_totalorderf (float const *x, float const *y) { int xs = __builtin_signbit (*x); int ys = __builtin_signbit (*y); if (!xs != !ys) return xs; int xn = __builtin_isnan (*x); int yn = __builtin_isnan (*y); if (!xn != !yn) return !xn == !xs; if (!xn) return *x <= *y; unsigned int extended_sign = -!!xs; union { unsigned int i; float f; } xu = {0}, yu = {0}; __builtin_memcpy (&xu.f, x, sizeof (float)); __builtin_memcpy (&yu.f, y, sizeof (float)); return (xu.i ^ extended_sign) <= (yu.i ^ extended_sign); } tf.s looks like this: my_totalorderf: pushl %ebx subl $8, %esp ;; int xs = __builtin_signbit (*x); movl 16(%esp), %eax flds (%eax) fsts (%esp) ;; [%esp+0] := convert_snan_to_qnan(*x) fxam fnstsw %ax movl %eax, %edx movl 20(%esp), %eax andl $512, %edx ;; int ys = __builtin_signbit (*y); flds (%eax) sete %cl fsts 4(%esp) ;; [%esp+4] := convert_snan_to_qnan(*y) fxam fnstsw %ax testb $2, %ah sete %al ;; if (!xs != !ys) cmpb %al, %cl jne .L12 ;; int xn = __builtin_isnan (*x); fxch %st(1) fucomi %st(0), %st fxch %st(1) setnp %bl ;; int yn = __builtin_isnan (*y); fucomip %st(0), %st setnp %al ;; if (!xn != !yn) cmpb %al, %bl jne .L11 fstp %st(0) flds (%esp) fucomi %st(0), %st jp .L9 flds 4(%esp) xorl %edx, %edx fcomip %st(1), %st fstp %st(0) setnb %dl jmp .L6 .p2align 4,,10 .p2align 3 .L12: fstp %st(0) fstp %st(0) .L6: addl $8, %esp movl %edx, %eax popl %ebx ret .p2align 4,,10 .p2align 3 .L11: fucomip %st(0), %st setp %dl addl $8, %esp xorl %ecx, %edx popl %ebx movzbl %dl, %edx movl %edx, %eax ret .p2align 4,,10 .p2align 3 .L9: fstp %st(0) negl %edx ;; computes -xs movl (%esp), %eax ;; fetches convert_snan_to_qnan(*x) instead of *x movl 4(%esp), %ebx ;; fetches convert_snan_to_qnan(*y) instead of *y sbbl %edx, %edx ;; computes extended_sign = -!!xs; xorl %edx, %eax ;; computes (xu.i ^ extended_sign) xorl %ebx, %edx ;; computes (yu.i ^ extended_sign) cmpl %eax, %edx ;; compares (xu.i ^ extended_sign) and (xu.i ^ extended_sign) setnb %dl
movzbl %dl, %edx jmp .L6 As you can see, (%esp) and 4(%esp) contain *not* the original *x and *y respectively, but the result of an flds/fsts instruction pair, that is, convert_snan_to_qnan(*x) and convert_snan_to_qnan(*y), respectively. See https://lists.gnu.org/archive/html/bug-gnulib/2023-10/msg00060.html for some background about these instructions on i386. The analysis of td.c is similar; here the value is stored to memory through an fldl/fstl pair.
[Bug c/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659 --- Comment #1 from Bruno Haible --- Created attachment 57913 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57913&action=edit test case td.c
[Bug c/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659 Bruno Haible changed: What|Removed |Added Build||x86_64-linux-gnu Host||x86_64-linux-gnu Target||x86_64-linux-gnu --- Comment #2 from Bruno Haible --- Note: "gcc-version 13.2.0" just invokes gcc-13.2.0, which I built from source.
[Bug c/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659 --- Comment #3 from Bruno Haible --- Also reproducible in 64-bit mode, with '-mfpmath=387': $ gcc -mfpmath=387 -Wall tf.c $ ./a.out ; echo $? 0 $ gcc -mfpmath=387 -Wall -O2 tf.c $ ./a.out ; echo $? 1 $ gcc -mfpmath=387 -Wall td.c $ ./a.out ; echo $? 0 $ gcc -mfpmath=387 -Wall -O2 td.c $ ./a.out ; echo $? 1
[Bug middle-end/114660] New: Exponentiating by squaring not performed for x * y * y * y * y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114660 Bug ID: 114660 Summary: Exponentiating by squaring not performed for x * y * y * y * y Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: antoshkka at gmail dot com Target Milestone: --- For the following code: int mul(int x, int y) { return x * y * y * y * y; } with -O2 GCC produces the following assembly: mul(int, int): mov eax, edi imul eax, esi imul eax, esi imul eax, esi imul eax, esi ret However, more optimal code could be generated with fewer multiplications: mul(int, int): mov eax, edi imul esi, esi imul eax, esi imul eax, esi ret Godbolt playground: https://godbolt.org/z/6dP11jPfx
[Bug debug/113566] btf: incorrect BTF_KIND_DATASEC entries for variables which are optimized out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113566 --- Comment #1 from GCC Commits --- The master branch has been updated by David Faust : https://gcc.gnu.org/g:8075477f81ae8d0abf64b80dfbd179151f91b417 commit r14-9876-g8075477f81ae8d0abf64b80dfbd179151f91b417 Author: David Faust Date: Mon Apr 8 11:10:41 2024 -0700 btf: emit symbol refs in DATASEC entries only for BPF [PR114608] The behavior introduced in fa60ac54964 btf: Emit labels in DATASEC bts_offset entries. is only fully correct when compiling for the BPF target with BPF CO-RE enabled. In other cases, depending on optimizations, it can result in an incorrect symbol reference in the entry bts_offset field for a symbol which may not be emitted at all, causing link-time undefined symbol reference errors like in PR114608. The offending bts_offset field of BTF_KIND_DATASEC entries is in reality only currently useful to consumers of BTF information for BPF programs anyway. Correct the regression by only emitting symbol references in these entries when compiling for the BPF target. For other targets, the behavior returns to that prior to fa60ac54964. The underlying cause is related to PR 113566 "btf: incorrect BTF_KIND_DATASEC entries for variables which are optimized out." A complete fix for 113566 is more involved and unsuitable for stage 4, but will be addressed in the near future. gcc/ PR debug/114608 * btfout.cc (btf_asm_datasec_entry): Only emit a symbol reference when generating BTF for BPF CO-RE target. gcc/testsuite/ PR debug/114608 * gcc.dg/debug/btf/btf-datasec-1.c: Check bts_offset symbol references only for BPF target. * gcc.dg/debug/btf/btf-datasec-2.c: Likewise. * gcc.dg/debug/btf/btf-pr106773.c: Likewise.
[Bug debug/114608] [14 Regression] Undefined reference in output asm with -fipa-reference -fipa-reference-addressable -fsection-anchors -gbtf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114608 --- Comment #3 from GCC Commits --- The master branch has been updated by David Faust : https://gcc.gnu.org/g:8075477f81ae8d0abf64b80dfbd179151f91b417 commit r14-9876-g8075477f81ae8d0abf64b80dfbd179151f91b417 Author: David Faust Date: Mon Apr 8 11:10:41 2024 -0700 btf: emit symbol refs in DATASEC entries only for BPF [PR114608] The behavior introduced in fa60ac54964 btf: Emit labels in DATASEC bts_offset entries. is only fully correct when compiling for the BPF target with BPF CO-RE enabled. In other cases, depending on optimizations, it can result in an incorrect symbol reference in the entry bts_offset field for a symbol which may not be emitted at all, causing link-time undefined symbol reference errors like in PR114608. The offending bts_offset field of BTF_KIND_DATASEC entries is in reality only currently useful to consumers of BTF information for BPF programs anyway. Correct the regression by only emitting symbol references in these entries when compiling for the BPF target. For other targets, the behavior returns to that prior to fa60ac54964. The underlying cause is related to PR 113566 "btf: incorrect BTF_KIND_DATASEC entries for variables which are optimized out." A complete fix for 113566 is more involved and unsuitable for stage 4, but will be addressed in the near future. gcc/ PR debug/114608 * btfout.cc (btf_asm_datasec_entry): Only emit a symbol reference when generating BTF for BPF CO-RE target. gcc/testsuite/ PR debug/114608 * gcc.dg/debug/btf/btf-datasec-1.c: Check bts_offset symbol references only for BPF target. * gcc.dg/debug/btf/btf-datasec-2.c: Likewise. * gcc.dg/debug/btf/btf-pr106773.c: Likewise.
[Bug middle-end/114660] Exponentiating by squaring not performed for x * y * y * y * y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114660 --- Comment #1 from Antony Polukhin --- The above godbolt link is for an old version of GCC; here is the one for 14.0: https://godbolt.org/z/dTPYY1T9W
[Bug c/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659 --- Comment #4 from Bruno Haible --- Related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58416
[Bug c/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659 --- Comment #5 from Bruno Haible --- Related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93271
[Bug target/114431] bpf: GCC generates unverifiable code for systemd restrict_fs_bpf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114431 David Faust changed: What|Removed |Added CC||david.faust at oracle dot com --- Comment #7 from David Faust --- Patch above had some fallout for non-BPF targets. I pushed the below to fix that. The behavior for BPF is unchanged. https://gcc.gnu.org/g:8075477f81ae8d0abf64b80dfbd179151f91b417 commit r14-9876-g8075477f81ae8d0abf64b80dfbd179151f91b417 Author: David Faust Date: Mon Apr 8 11:10:41 2024 -0700 btf: emit symbol refs in DATASEC entries only for BPF [PR114608]
[Bug middle-end/114661] New: Bit operations not optimized to multiplication
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114661 Bug ID: 114661 Summary: Bit operations not optimized to multiplication Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: antoshkka at gmail dot com Target Milestone: --- Consider the example: unsigned mul(unsigned char c) { if (c > 3) __builtin_unreachable(); return c << 18 | c << 15 | c << 12 | c << 9 | c << 6 | c << 3 | c; } GCC with -O2 generates the following assembly: mul(unsigned char): movzx edi, dil lea edx, [rdi+rdi*8] lea eax, [0+rdx*8] mov ecx, edx sal edx, 15 or eax, edi sal ecx, 9 or eax, ecx or eax, edx ret However it could be optimized to just: mul(unsigned char): imul eax, edi, 299593 ret Compiling with -Os does not help. Godbolt playground: https://godbolt.org/z/YszzMbovK P.S.: without `c << 18 | c << 15 |` the bit operations are transformed to multiplication.
[Bug target/114656] ~5% slowdown of 538.imagick_r on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114656 --- Comment #2 from Filip Kastl --- (In reply to Jakub Jelinek from comment #1) > Can you try to revert r14-9692 if that commit isn't the cause? I have tried reverting r14-9692 and that indeed removed the slowdown. The benchmark ran as fast as on r14-9649-gbb04a11418f54c. So it seems that r14-9692 is the cause of this slowdown.
[Bug libstdc++/114645] std::chrono::current_zone ignores $TZ on Linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114645 --- Comment #12 from Jonathan Wakely --- (In reply to Hristo Venev from comment #9) > I stumbled upon this comment in the library you linked: > > https://github.com/HowardHinnant/date/blob/ > 0e65940a7fbc4ed617a1ee111a60311eccbead9a/include/date/tz.h#L35 > > That comment is wrong in its explanation of the mechanism used to determine > the local time zone on Linux. However, it clearly shows that the intent is > to match the platform's "local time" as closely as reasonably possible. The intent is for current_zone() to be "the time zone which the computer has set as its local time zone", not the time zone that _the process_ has set via TZ. That's /etc/localtime on GNU/Linux and many unixes. Which is what the comment says. $TZ allows you to override it per-process (and even change it during the lifetime of a process by using setenv and tzset). We don't support that for current_zone(). > The implementation also has some comments: > > https://github.com/HowardHinnant/date/blob/ > 0e65940a7fbc4ed617a1ee111a60311eccbead9a/src/tz.cpp#L3936 > > The intent seems to be clear -- apply a lot of heuristics to try to match > what libc would do as closely as possible. The intent is to infer an IANA time zone from the /etc/localtime symlink, if possible. If the intent was to match libc, it would look at $TZ. I've discussed this exact question with the author of that library (which is the origin of the std::chrono components too). What I said in comment 8 above is paraphrasing what he said. > Even on Linux there are no guarantees whatsoever that it is possible to > extract a IANA time zone from /etc/localtime. And so current_zone() can fail. 
> In fact, the problem is > exactly identical to that with $TZ, if not worse -- $TZ is normally an IANA > time zone name, whereas /etc/localtime is a symlink (but sometimes a > hardlink or a copy) of a file in some OS-specific directory (sometimes, but > not always, /usr/share/zoneinfo) where the name of the file relative to the > base directory is a IANA time zone name. If $TZ is an IANA name then you can just look that name up with locate_zone, it's easy. If $TZ is a POSIX time zone spec, things are more complicated. So the most we could do is handle the easy case, but not in a thread-safe way (because the environment is mutable and not synchronized). So we could support something that is already easy for users to do, by introducing possible data races into the program. That doesn't seem like a good trade off to me. Just do the easy thing yourself.
[Bug libfortran/114646] libgfortran still doesn't define GTHREAD_USE_WEAK to 0 for newer glibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114646

--- Comment #17 from Sunil Pandey ---
(In reply to H.J. Lu from comment #10)
> Created attachment 57906 [details]
> A patch
>
> I am testing this.

This patch resolved my static testing issue.
[Bug libstdc++/114645] std::chrono::current_zone ignores $TZ on Linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114645

--- Comment #13 from Hristo Venev ---
> $TZ allows you to override it per-process (and even change it during the
> lifetime of a process by using setenv and tzset). We don't support that for
> current_zone().

/etc/localtime can also change.

> The intent is to infer an IANA time zone from the /etc/localtime symlink, if
> possible. If the intent was to match libc, it would look at $TZ. I've
> discussed this exact question with the author of that library (which is the
> origin of the std::chrono components too). What I said in comment 8 above is
> paraphrasing what he said.

Point taken. Still, do you have any explanation for why this behavior was chosen?

> Just do the easy thing yourself.

The easy thing being to fix all applications that currently use or will ever use current_zone(). Fun times ahead...
[Bug driver/114658] branch "releases/gcc-13" builds "gcc version 14.0.1 (experimental)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114658

--- Comment #2 from felix-gcc at fefe dot de ---
I'm probably doing something really stupid wrong, sorry for the noise.

Here's what I'm doing:

$ git checkout releases/gcc-13
Switched to branch 'releases/gcc-13'
$ git branch
  master
* releases/gcc-13
$ cat gcc/BASE-VER
14.0.1
[Bug driver/114658] branch "releases/gcc-13" builds "gcc version 14.0.1 (experimental)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114658

felix-gcc at fefe dot de changed:

What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|--- |INVALID

--- Comment #3 from felix-gcc at fefe dot de ---
ok it looks like it was my fault (surprise) and I fixed it.

Here's what I did:

$ git checkout master
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
$ git branch -D releases/gcc-13
Deleted branch releases/gcc-13 (was 32fb04adae9).
$ git checkout releases/gcc-13
Updating files: 100% (40334/40334), done.
branch 'releases/gcc-13' set up to track 'origin/releases/gcc-13'.
Switched to a new branch 'releases/gcc-13'
$ cat gcc/BASE-VER
13.2.1

Sorry again for the noise. Hope this helps the next git noob :)
[Bug tree-optimization/114660] Exponentiating by squaring not performed for x * y * y * y * y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114660 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Component|middle-end |tree-optimization --- Comment #2 from Andrew Pinski --- We don't do as much reassociation as we should with signed integers due to overflow. If you use -fwrapv, you get the reassociation; I am 99% sure there is a dup for this bug too.
[Bug tree-optimization/114660] Exponentiating by squaring not performed for x * y * y * y * y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114660

Andrew Pinski changed:

What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed||2024-04-09
Ever confirmed|0 |1

--- Comment #3 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #2)
> We don't do as much reassociation as we should with signed integers due to
> overflow. If you use -fwrapv, you get the reassociation; I am 99% sure there
> is a dup for this bug too.

I should say we also do it for unsigned already (see PR 95867); in the -fwrapv case we just treat signed similarly to unsigned here.

Anyway, what needs to happen is that we need 2 levels of gimple, one with signed integer overflow behavior and then one with wrapping behavior. RTL does not distinguish between signed and unsigned behaviors for many operations (plus and multiply), so we get some optimizations there but not all.
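As an illustration of the transformation under discussion (a sketch with made-up function names, not compiler output), the two forms below compute the same value for unsigned operands, because unsigned arithmetic wraps and is therefore freely reassociable:

```cpp
#include <cassert>

// Four multiplications, evaluated left to right as written in the source.
unsigned mul_chain(unsigned x, unsigned y)
{
  return x * y * y * y * y;
}

// Three multiplications: square y once, then square the result.
unsigned mul_by_squaring(unsigned x, unsigned y)
{
  unsigned y2 = y * y;
  return x * (y2 * y2);
}

// Spot-check that both forms agree, including when the product wraps.
void check_agreement()
{
  assert(mul_chain(3u, 5u) == 1875u);
  assert(mul_by_squaring(3u, 5u) == 1875u);
  assert(mul_chain(7u, 100000u) == mul_by_squaring(7u, 100000u));
}
```

With signed int the rewrite is not generally valid without -fwrapv: for x == 0 and y == 65536 (assuming 32-bit int) the left-to-right chain never overflows, since every partial product is 0, whereas y * y on its own does.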
[Bug testsuite/114642] new test case gcc.dg/debug/btf/btf-datasec-3.c from r14-6195-gb8cf266f4ca4ff fails for 32 bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114642

--- Comment #4 from GCC Commits ---
The master branch has been updated by David Faust:

https://gcc.gnu.org/g:639215c5eb6c56ba3830cd868d1d3ddd700b4c90

commit r14-9878-g639215c5eb6c56ba3830cd868d1d3ddd700b4c90
Author: David Faust
Date:   Mon Apr 8 13:33:48 2024 -0700

    btf: improve btf-datasec-3.c test [PR114642]

    This test failed on powerpc --target_board=unix'{-m32}' because two
    variables were not placed in sections where the test silently (and
    incorrectly) assumed they would be.

    The important thing for the test is only that BTF_KIND_DATASEC entries
    are NOT generated for the extern variable declarations without an
    explicit section attribute. Make the test more robust by placing the
    non-extern variables in explicit sections, and invert the checks to
    more accurately verify what we care about in this test.

    gcc/testsuite/
            PR testsuite/114642
            * gcc.dg/debug/btf/btf-datasec-3.c: Make test more robust on
            different architectures.
[Bug target/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659

Andrew Pinski changed:

What|Removed |Added
Component|middle-end |target

--- Comment #6 from Andrew Pinski ---
I doubt there is much to be done here. It is an x87 issue: we store the x87 register-stack register to the stack to get the 32-bit (or 64-bit) version, and then load it into a GPR.

  float t = *x;
  float t1 = *y;
  __builtin_memcpy (&xu.f, &t, sizeof (float));
  __builtin_memcpy (&xu.f, &t1, sizeof (float));

Produces exactly the same issue.
[Bug target/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659

Andrew Pinski changed:

What|Removed |Added
See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56831, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57484

--- Comment #7 from Andrew Pinski ---
Much more related to PR 56831 and PR 57484 rather than the other two ...
[Bug lto/114662] New: [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Bug ID: 114662
Summary: [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: seurer at gcc dot gnu.org
Target Milestone: ---

g:1e3312a25a7b34d6e3f549273e1674c7114e4408, r14-9841-g1e3312a25a7b34

I am seeing this on our big endian machines for -m32

make -k check-gcc RUNTESTFLAGS="--target_board=unix'{-m32}' lto.exp=*"

FAIL: gcc-dg-lto-pr113359-2-01.exe scan-wpa-ipa-dump icf "Semantic equality hit:geta/.*getb/"
FAIL: gcc.dg/lto/pr113359-2 c_lto_pr113359-2_0.o assemble, -O2 -flto -fno-strict-aliasing -fno-ipa-cp --disable-tree-esra -fdump-ipa-icf-details
FAIL: gcc.dg/lto/pr113359-2 c_lto_pr113359-2_0.o-c_lto_pr113359-2_1.o execute -O2 -flto -fno-strict-aliasing -fno-ipa-cp --disable-tree-esra -fdump-ipa-icf-details

commit 1e3312a25a7b34d6e3f549273e1674c7114e4408 (HEAD)
Author: Martin Jambor
Date:   Mon Apr 8 18:53:23 2024 +0200

    ICF&SRA: Make ICF and SRA agree on padding
[Bug tree-optimization/114660] Exponentiating by squaring not performed for x * y * y * y * y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114660

--- Comment #4 from Jakub Jelinek ---
I think we've been discussing an idea of turning on flag_wrapv very late among the GIMPLE passes and reassociating again. RTL also kind of assumes flag_wrapv: there is no difference between signed and unsigned addition, subtraction or non-widening multiplication. Though the question is whether that wouldn't screw up range info for use during expansion too much.

Another option is to rewrite into unsigned operations only what we've successfully reassociated in the late reassoc pass.
[Bug target/110027] [11/12/13/14 regression] Stack objects with extended alignments (vectors etc) misaligned on detect_stack_use_after_return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110027

Jakub Jelinek changed:

What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #17 from Jakub Jelinek ---
Both of the posted patches are incorrect; this needs to be fixed in asan_emit_stack_protection, accounting for the different offsets[0] that occurs when a stack pointer guard is created. I'll deal with it tomorrow.
[Bug debug/112878] ICE: in ctf_add_slice, at ctfc.cc:499 with _BitInt > 255 in a struct and -gctf1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112878

--- Comment #3 from Indu Bhagat ---
The limit of 255 is somewhat arbitrary but we need to follow it for now, because libctf has a check in ctf_add_slice () in libctf/ctf-create.c:

  if ((ep->cte_bits > 255) || (ep->cte_offset > 255))
    return (ctf_set_typed_errno (fp, ECTF_SLICEOVERFLOW));
  ...
  slice.cts_bits = ep->cte_bits;
  slice.cts_offset = ep->cte_offset;

The CTF generation in GCC does not have a mechanism to roll back an already added type. In the testcase presented in the PR, we hit a representation limit in CTF slices (for a member of a struct) and ICE, after the type for the struct (CTF_K_STRUCT) has already been added to the container.

To exit gracefully in GCC instead, one option is to simply check explicitly that both the offset and the size of the bitfield are <= 255. If the check fails, we emit the member with type CTF_K_UNKNOWN instead.
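The proposed check can be sketched in isolation as follows. This is not GCC's ctfc code: `MemberKind`, `kCtfSliceMax`, `classify_bitfield` and `check_classification` are hypothetical names used only to show the decision, with the 255 limit taken from the libctf check quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for "emit a CTF slice" and "emit CTF_K_UNKNOWN".
enum class MemberKind { Slice, Unknown };

// Limit enforced by ctf_add_slice () in libctf for both fields.
constexpr std::uint32_t kCtfSliceMax = 255;

// Decide how a bitfield member would be emitted: a slice only when both
// the size in bits and the bit offset are representable.
constexpr MemberKind classify_bitfield(std::uint32_t bits, std::uint32_t offset)
{
  return (bits <= kCtfSliceMax && offset <= kCtfSliceMax)
             ? MemberKind::Slice
             : MemberKind::Unknown;
}

// Spot-check the boundary on both fields.
void check_classification()
{
  assert(classify_bitfield(255, 0) == MemberKind::Slice);
  assert(classify_bitfield(256, 0) == MemberKind::Unknown);
  assert(classify_bitfield(8, 300) == MemberKind::Unknown);
}
```

Doing the check before any type is added avoids the need for a roll-back mechanism, which is the point made above.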
[Bug c++/114663] New: Several contracts test cases fail with -fsanitize=undefined -fsanitize-trap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114663

Bug ID: 114663
Summary: Several contracts test cases fail with -fsanitize=undefined -fsanitize-trap
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: testsuite-fail, wrong-code
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: iains at gcc dot gnu.org
CC: jason at gcc dot gnu.org
Target Milestone: ---

I found this while working on -funreachable-traps (but the failure equally occurs with -fsanitize=undefined -fsanitize-trap)

FAIL: g++.dg/contracts/contracts10.C execution test
FAIL: g++.dg/contracts/contracts18.C execution test
FAIL: g++.dg/contracts/contracts19.C execution test
FAIL: g++.dg/contracts/contracts2.C execution test

Initial analysis is that somehow the lowering of the contracts code is exploiting UB [which has a large measure of irony if true] to make these cases pass, for example contracts2.C optimised tree dump contains:

;; Function main (main, funcdef_no=0, decl_uid=2531, cgraph_uid=1, symbol_order=0)

int main ()
{
  int x;
  int D.2551;
  const struct D.2542;
  int _2;

  :
  x_1 = 1;
  if (x_1 < 0)
    goto ; [INV]
  else
    goto ; [INV]

  :
  __builtin_unreachable ();

  :
  if (x_1 <= 0)
    goto ; [INV]
  else
    goto ; [INV]

  :
  =

When (by default) the __builtin_unreachable () is replaced with nothing (i.e. it falls through) the test case passes.

When we replace the __builtin_unreachable () with a trap (either using the ubsan or -funreachable-traps style) the test case fails with the trap.

This seems unlikely to be what was intended (or if it was intended it's terribly fragile); I'm labelling it wrong code for now.

Similar code patterns exist in the other cases mentioned.
[Bug rtl-optimization/114664] New: -fno-omit-frame-pointer causes an ICE during the build of the greenlet package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

Bug ID: 114664
Summary: -fno-omit-frame-pointer causes an ICE during the build of the greenlet package
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bergner at gcc dot gnu.org
Target Milestone: ---

Current builds of the greenlet package on one specific distro are seeing an ICE on multiple architectures (ppc64le & riscv64) when being built with -fno-omit-frame-pointer. The upstream github issue is here:

https://github.com/python-greenlet/greenlet/issues/395

A minimized test case on Power is:

bergner@ltcden2-lp1:$ cat bug.c
void
bug (void)
{
  __asm__ volatile ("" : : : "r31");
}
bergner@ltcden2-lp1:$ /opt/gcc-nightly/trunk/bin/gcc -S -fno-omit-frame-pointer bug.c
bug.c: In function ‘bug’:
bug.c:5:1: error: 31 cannot be used in ‘asm’ here
    5 | }
      | ^
bug.c:5:1: error: 31 cannot be used in ‘asm’ here

This is not a regression, as all gcc's I have easy access to (back to gcc v8) ICE the same way. The code that is ICEing here is in ira.c:ira_setup_eliminable_regset():

  /* Build the regset of all eliminable registers and show we can't
     use those that we already know won't be eliminated.  */
  for (i = 0; i < (int) ARRAY_SIZE (eliminables); i++)
    {
      bool cannot_elim
        = (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to)
           || (eliminables[i].to == STACK_POINTER_REGNUM && frame_pointer_needed));

      if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from))
        {
          SET_HARD_REG_BIT (eliminable_regset, eliminables[i].from);

          if (cannot_elim)
            SET_HARD_REG_BIT (ira_no_alloc_regs, eliminables[i].from);
        }
      else if (cannot_elim)
        error ("%s cannot be used in %<asm%> here",
               reg_names[eliminables[i].from]);
      else
        df_set_regs_ever_live (eliminables[i].from, true);
    }

On Power, targetm.can_eliminate(r31,r1) returns true (ie, the port will allow us to eliminate r31 into r1) even in the face of -fno-omit-frame-pointer, but it's the RA specific test (eliminables[i].to == STACK_POINTER_REGNUM && frame_pointer_needed) that is catching us here.

The question I have is, is it legal to mention the hard frame pointer register in an asm clobber list when using -fno-omit-frame-pointer? Ie, is this user error or should the compiler be able to handle this?
[Bug debug/112878] ICE: in ctf_add_slice, at ctfc.cc:499 with _BitInt > 255 in a struct and -gctf1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112878

--- Comment #4 from Andrew Pinski ---
Another option is to output a sorry message and then suspend this until libctf gets fixed/changed.
[Bug target/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659

--- Comment #8 from Bruno Haible ---
(In reply to Andrew Pinski from comment #6)
> I doubt there is much to be done here.

I see it as incorrect modeling of the x87 hardware, together with a missing distinction in the common subexpression elimination / aliasing analysis.

In detail:

* Incorrect modeling of the x87 hardware: The compiler seems to assume that
    flds MEM_LOCATION_1
    fsts MEM_LOCATION_2
  will result in MEM_LOCATION_2 having the same value as MEM_LOCATION_1. This is wrong; this is not how the x87 hardware behaves. The actual result is:
    *MEM_LOCATION_2 = convert_snan_to_qnan(*MEM_LOCATION_1).

* In the common subexpression elimination / aliasing analysis, the compiler seems to keep track of a set of memory locations MEM_LOCATION_1, ..., MEM_LOCATION_n which have the same value. In fact, this set needs to be partitioned into two sets: a subset which contains the same value, and the complementary subset which contains convert_snan_to_qnan(value). In other words, each element of the set needs to be annotated with a bit that tells whether the value has been subject to convert_snan_to_qnan.
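To make the convert_snan_to_qnan notation concrete, here is a bit-level model for IEEE 754 binary32 as used on x86. It is a sketch under stated assumptions, not compiler or hardware code: it assumes the invalid-operation exception is masked, and every name in it is introduced here for illustration only. It works on raw bit patterns so that it behaves the same on any host, x87 or not.

```cpp
#include <cassert>
#include <cstdint>

// binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
constexpr std::uint32_t kExpMask  = 0x7f800000u;
constexpr std::uint32_t kFracMask = 0x007fffffu;
constexpr std::uint32_t kQuietBit = 0x00400000u; // most significant fraction bit

// A NaN has an all-ones exponent and a nonzero fraction.
constexpr bool is_nan(std::uint32_t bits)
{
  return (bits & kExpMask) == kExpMask && (bits & kFracMask) != 0;
}

// On x86 a signaling NaN is a NaN whose quiet bit is clear.
constexpr bool is_snan(std::uint32_t bits)
{
  return is_nan(bits) && (bits & kQuietBit) == 0;
}

// Model of the flds/fsts round trip: a signaling NaN comes back with its
// quiet bit set; every other bit pattern is unchanged.
constexpr std::uint32_t convert_snan_to_qnan(std::uint32_t bits)
{
  return is_snan(bits) ? (bits | kQuietBit) : bits;
}

// Spot-check an sNaN, a qNaN, an ordinary value and infinity.
void check_model()
{
  assert(convert_snan_to_qnan(0x7fa00000u) == 0x7fe00000u); // sNaN is quieted
  assert(convert_snan_to_qnan(0x7fc00000u) == 0x7fc00000u); // qNaN unchanged
  assert(convert_snan_to_qnan(0x3f800000u) == 0x3f800000u); // 1.0f unchanged
  assert(convert_snan_to_qnan(0x7f800000u) == 0x7f800000u); // +inf unchanged
}
```

In these terms, a copy done purely with integer loads and stores is the identity function on the bits, while a copy routed through the x87 register stack is convert_snan_to_qnan; the two agree on everything except signaling NaNs.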
[Bug target/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659 --- Comment #9 from Bruno Haible --- (In reply to Andrew Pinski from comment #7) > Much more related to PR 56831 and PR 57484 rather than the other two ... Well, bug #56831 is more about function calls and the ABI, whereas this bug here and bug #58416 and bug #93271 are about the compiler picking a memory location which holds convert_snan_to_qnan(value) rather than a memory location which holds the original value.
[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664 --- Comment #1 from Andrew Pinski --- Let me find the dups ...
[Bug lto/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Edwin Lu changed:

What|Removed |Added
CC||ewlu at rivosinc dot com, patrick at rivosinc dot com

--- Comment #1 from Edwin Lu ---
We are also seeing this for rv32 targets on Linux and newlib:
https://github.com/patrick-rivos/gcc-postcommit-ci/issues/746#issuecomment-2045727038