[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
             Target|                            |riscv

--- Comment #4 from Richard Biener ---
(In reply to JuzheZhong from comment #1)
> Oh. I see we have cond_xxx pattern for VLS modes.
>
> like V64HImode. But we don't support partial vectorization for VLS modes.
>
> VLS modes are supposed to be used as SIMD GNU vectorization.
>
> As long as COND_XXX is enabled, loop vectorizer considers target support
> partial vectorization with mask and since no while_ult, then go through
> AVX512 partial vectorization.

I think the bug is in the AVX512 code where it probably lacks some guards.
But in theory even with RVV you can do mask based vectorization of
partial loops, the AVX512 code doesn't require .WHILE_ULT but instead
uses regular compares.

I don't think you should work around this by disabling RVV patterns here.

I can have a look later what happens.

> It seems that for conditional operations, I should use backend RTL PASS to
> work around that.
[Bug c/112339] ICE with clang::no_sanitize and -fsanitize=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112339 --- Comment #4 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:533241c6c60bc7c9f7dc47a94e94b5eed1b370e6 commit r14-5265-g533241c6c60bc7c9f7dc47a94e94b5eed1b370e6 Author: Jakub Jelinek Date: Thu Nov 9 09:05:54 2023 +0100 attribs: Fix ICE with -Wno-attributes= [PR112339] The following testcase ICEs, because with -Wno-attributes=foo::no_sanitize (but generally any other non-gnu namespace and some gnu well known attribute name within that other namespace) the FEs don't really parse attribute arguments of such attribute, but lookup_attribute_spec is non-NULL with NULL handler and such attributes are added to DECL_ATTRIBUTES or TYPE_ATTRIBUTES and then when e.g. middle-end does lookup_attribute on a particular attribute and expects the attribute to mean something and/or have a particular verified arguments, it can crash when seeing the foreign attribute in there instead. The following patch fixes that by never adding ignored attributes to DECL_ATTRIBUTES/TYPE_ATTRIBUTES, previously that was the case just for attributes in ignored namespace (where lookup_attribute_space returned NULL). We don't really know anything about those attributes, so shouldn't pretend we know something about them, especially when the arguments are error_mark_node or NULL instead of something that would have been parsed. And it would be really weird if we normally ignore say [[clang::unused]] attribute, but when people use -Wno-attributes=clang::unused we actually treated it as gnu::unused. All the user asked for is suppress warnings about that attribute being unknown. The first hunk is just playing safe, I'm worried people could -Wno-attributes=gnu:: and get various crashes with known GNU attributes not being actually parsed and recorded (or worse e.g. when we tweak standard attributes into GNU attributes and we wouldn't add those). 
The -Wno-attributes= documentation says that it suppresses warning about
unknown attributes, so I think -Wno-attributes=gnu:: should prevent warning
about say [[gnu::foobarbaz]] attribute, but not about [[gnu::unused]] because
the latter is a known attribute. The routine would return true for any scoped
attribute in the ignored namespace, with the change it ignores only unknown
attributes in ignored namespace, known ones in there will be ignored only if
they have max_length of -2 (e.g. with -Wno-attributes=gnu::
-Wno-attributes=gnu::foobarbaz).

2023-11-09  Jakub Jelinek

        PR c/112339
        * attribs.cc (attribute_ignored_p): Only return true for
        attr_namespace_ignored_p if as is NULL.
        (decl_attributes): Never add ignored attributes.

        * c-c++-common/ubsan/Wno-attributes-1.c: New test.
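The rule described in the commit message above can be modelled in a few lines. This is only a minimal sketch for a scoped attribute, not the code in attribs.cc: `AttrSpec`, `attribute_ignored` and `example_usage` are hypothetical names, and the only details taken from the message are that an unknown attribute is ignored when its namespace is ignored, and a known one only when its spec has a max_length of -2. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>

// Hypothetical stand-in for an attribute spec entry; only the field the
// commit message mentions is modelled.
struct AttrSpec {
    int max_length;  // -2 is the "explicitly ignored" marker from the message
};

// Toy model of the post-patch rule for a scoped attribute:
//  * known attribute (spec present): ignored only if max_length == -2;
//  * unknown attribute (no spec): ignored if its namespace is ignored.
inline bool attribute_ignored(const std::optional<AttrSpec>& spec,
                              const std::string& ns,
                              const std::set<std::string>& ignored_namespaces) {
    if (spec.has_value())
        return spec->max_length == -2;
    return ignored_namespaces.count(ns) != 0;
}

// Usage, mirroring the -Wno-attributes=gnu:: discussion.
inline void example_usage() {
    const std::set<std::string> ignored{"gnu"};
    assert(attribute_ignored(std::nullopt, "gnu", ignored));  // [[gnu::foobarbaz]]
    assert(!attribute_ignored(AttrSpec{1}, "gnu", ignored));  // [[gnu::unused]]
    assert(attribute_ignored(AttrSpec{-2}, "gnu", ignored));  // explicitly listed
}
```

Under this model a known attribute such as [[gnu::unused]] keeps its meaning even when its whole namespace is listed, which is the behaviour the message argues for.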
[Bug libstdc++/91910] Debug mode: there is a racing condition between destructors of iterator and the associated container.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91910 --- Comment #11 from CVS Commits --- The releases/gcc-13 branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:0fc5cc6e5a2dfd7dbdd20bc27f975eed155ffb91 commit r13-8020-g0fc5cc6e5a2dfd7dbdd20bc27f975eed155ffb91 Author: Jonathan Wakely Date: Mon Sep 11 16:42:54 2023 +0100 libstdc++: Remove unconditional use of atomics in Debug Mode The fix for PR 91910 (r10-3426-gf7a3a382279585) introduced unconditional uses of atomics into src/c++11/debug.cc, which causes linker errors for arm4t where GCC emits an unresolved reference to __sync_synchronize. By making the uses of atomics depend on _GLIBCXX_HAS_GTHREADS we can avoid those unconditional references to __sync_synchronize for targets where the atomics are unnecessary. As a minor performance optimization we can also check the __gnu_cxx::__is_single_threaded function to avoid atomics for single-threaded programs even where they don't cause linker errors. libstdc++-v3/ChangeLog: * src/c++11/debug.cc (acquire_sequence_ptr_for_lock): New function. (reset_sequence_ptr): New function. (_Safe_iterator_base::_M_detach) (_Safe_local_iterator_base::_M_detach): Replace bare atomic_load with acquire_sequence_ptr_for_lock. (_Safe_iterator_base::_M_reset): Replace bare atomic_store with reset_sequence_ptr. (cherry picked from commit 4a2766ed00a47904dc8b85bf0538aa116d8e658b)
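The idea in the commit above, paying for memory ordering only when other threads can exist, can be sketched as follows. This is an illustration under stated assumptions, not the code in src/c++11/debug.cc: `g_single_threaded` is a hypothetical flag standing in for a check such as __gnu_cxx::__is_single_threaded, `Sequence`, `load_sequence_ptr` and `clear_sequence_ptr` are made-up names, and a relaxed std::atomic access stands in for the plain, barrier-free access. It assumes a C++17 compiler.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical flag; a real implementation would query the runtime instead.
inline bool g_single_threaded = true;

// Hypothetical stand-in for the container an iterator is attached to.
struct Sequence { int id; };

// Read the sequence pointer. Only request acquire ordering when another
// thread could be racing with us; otherwise a relaxed load is enough.
inline Sequence* load_sequence_ptr(const std::atomic<Sequence*>& p) {
    if (g_single_threaded)
        return p.load(std::memory_order_relaxed);
    return p.load(std::memory_order_acquire);
}

// Detach from the sequence, again choosing the ordering by thread state.
inline void clear_sequence_ptr(std::atomic<Sequence*>& p) {
    if (g_single_threaded)
        p.store(nullptr, std::memory_order_relaxed);
    else
        p.store(nullptr, std::memory_order_release);
}

inline void example_usage() {
    Sequence s{1};
    std::atomic<Sequence*> p{&s};
    assert(load_sequence_ptr(p) == &s);
    clear_sequence_ptr(p);
    assert(load_sequence_ptr(p) == nullptr);
}
```

Both branches return the same value; the difference is only in which barriers the compiler may need to emit, which is what matters on targets where those barriers become unresolved library calls.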
[Bug libstdc++/111172] Dead code in std::get for variant?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111172

--- Comment #3 from CVS Commits ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:9847a4e9f025e66bfd09e0174e8bd32aa01939af

commit r13-8021-g9847a4e9f025e66bfd09e0174e8bd32aa01939af
Author: Jonathan Wakely
Date:   Tue Sep 12 21:28:38 2023 +0100

    libstdc++: Remove non-void static assertions in variant's std::get [PR111172]

    A void template argument would cause a substitution failure when trying
    to form a reference for the return type, so the function body would
    never be instantiated.

    libstdc++-v3/ChangeLog:

            PR libstdc++/111172
            * include/std/variant (get): Remove !is_void static assertions.

    (cherry picked from commit d19bdf8874059457fdfe50a9e14dad8f8b8cecbb)
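The reasoning in the commit message can be checked with a small stand-alone example. This is a sketch with hypothetical names (`Holder`, `get_as`, `can_get`), not the std::variant code: it shows that a function template whose return type is `T&` drops out of overload resolution for `T = void`, so a `!is_void` assertion inside its body can never fire. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical holder, standing in for std::variant.
struct Holder { int value; };

// The return type T& cannot be formed when T is void, so get_as<void> is
// discarded during substitution and its body is never instantiated.
template <typename T>
T& get_as(Holder& h) {
    static_assert(!std::is_void_v<T>, "unreachable for void");  // dead check
    return h.value;
}

// Detection idiom: true when get_as<T>(Holder&) is a valid expression.
template <typename T, typename = void>
struct can_get : std::false_type {};

template <typename T>
struct can_get<T, std::void_t<decltype(get_as<T>(std::declval<Holder&>()))>>
    : std::true_type {};

static_assert(can_get<int>::value, "int& can be formed");
static_assert(!can_get<void>::value, "void& is a substitution failure");

inline void example_usage() {
    Holder h{7};
    assert(get_as<int>(h) == 7);
}
```

If the static_assert were reachable for void, the second static_assert at namespace scope would be a hard error rather than a quiet `false`.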
[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

--- Comment #5 from JuzheZhong ---
(In reply to Richard Biener from comment #4)
> (In reply to JuzheZhong from comment #1)
> > Oh. I see we have cond_xxx pattern for VLS modes.
> >
> > like V64HImode. But we don't support partial vectorization for VLS modes.
> >
> > VLS modes are supposed to be used as SIMD GNU vectorization.
> >
> > As long as COND_XXX is enabled, loop vectorizer considers target support
> > partial vectorization with mask and since no while_ult, then go through
> > AVX512 partial vectorization.
>
> I think the bug is in the AVX512 code where it probably lacks some guards.
> But in theory even with RVV you can do mask based vectorization of
> partial loops, the AVX512 code doesn't require .WHILE_ULT but instead
> uses regular compares.
>
> I don't think you should work around this by disabling RVV patterns here.
>
> I can have a look later what happens.
>
> > It seems that for conditional operations, I should use backend RTL PASS to
> > work around that.

Thanks a lot, Richi.

I was about to disable the cond_xxx patterns or add cond_len_xxx patterns to
work around this issue.

Actually, we always apply partial vectorization on VLA modes.
We always use VLS modes for SIMD GNU vectorization.

We enable cond_xxx for VLS modes to handle conditional operations, which make
use of the match.pd vectorizations.

Here is an example:
https://godbolt.org/z/csx995anE

You can see that with cond_div on VLS modes we get much better codegen.

Anyway, I really appreciate you taking care of this issue!
[Bug libstdc++/21769] per-file control over PCH inclusion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21769

--- Comment #9 from CVS Commits ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:5d036ff51e2491401b9a64705bfd7f7467764260

commit r13-8029-g5d036ff51e2491401b9a64705bfd7f7467764260
Author: Jonathan Wakely
Date:   Wed Aug 16 21:46:05 2023 +0100

    libstdc++: Fix testsuite no_pch directive

    The { dg-add-options no_pch } directive is supposed to add a macro
    definition that invalidates the PCH file, and ensures that the #include
    directives in the test file are processed as written. But the proc that
    adds the options actually removes all existing options, cancelling out
    any previous dg-options directive. This means that using no_pch will
    cause FAILs in a file that relies on other options set by an earlier
    dg-options.

    The no_pch directive was added for PR libstdc++/21769 where Janis
    suggested adding it as return "$flags -D__GLIBCXX__=" but what was
    actually committed didn't include the $flags so replaced them.

    Additionally, using no_pch only prevents the precompiled version of
    <bits/stdc++.h> from being included, it doesn't prevent the
    non-precompiled version being included by -include bits/stdc++.h in the
    test flags. Use regsub to filter that out of the options as well.

    libstdc++-v3/ChangeLog:

            * testsuite/lib/dg-options.exp (add_options_for_no_pch): Remove
            any "-include bits/stdc++.h" from options and add the macro to
            the existing options instead of replacing them.

    (cherry picked from commit 91315f23ba127ea4d1a584023bae34e143f6eb8c)
[Bug libgcc/65833] Attempting to convert 128 bit integers to 128 bit decimal floating-point results in an unresolved symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65833 --- Comment #3 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:f172b9d38db426d2b102c0f9c1fd58672acc6c9b commit r14-5266-gf172b9d38db426d2b102c0f9c1fd58672acc6c9b Author: Jakub Jelinek Date: Thu Nov 9 09:14:07 2023 +0100 libgcc: Add {unsigned ,}__int128 <-> _Decimal{32,64,128} conversion support [PR65833] The following patch adds the missing {unsigned ,}__int128 <-> _Decimal{32,64,128} conversion support into libgcc.a on top of the _BitInt support (doing it without that would be larger amount of code and I hope all the targets which support __int128 will eventually support _BitInt, after all it is a required part of C23) and because it is in libgcc.a only, it doesn't hurt that much if it is added for some architectures only in GCC 15. Initially I thought about doing this on the compiler side, but doing it on the library side seems to be easier and more -Os friendly. The tests currently require bitint effective target, that can be removed when all the int128 targets support bitint. 2023-11-09 Jakub Jelinek PR libgcc/65833 libgcc/ * config/t-softfp (softfp_bid_list): Add {U,}TItype <-> _Decimal{32,64,128} conversions. * soft-fp/floattisd.c: New file. * soft-fp/floattidd.c: New file. * soft-fp/floattitd.c: New file. * soft-fp/floatuntisd.c: New file. * soft-fp/floatuntidd.c: New file. * soft-fp/floatuntitd.c: New file. * soft-fp/fixsdti.c: New file. * soft-fp/fixddti.c: New file. * soft-fp/fixtdti.c: New file. * soft-fp/fixunssdti.c: New file. * soft-fp/fixunsddti.c: New file. * soft-fp/fixunstdti.c: New file. gcc/testsuite/ * gcc.dg/dfp/int128-1.c: New test. * gcc.dg/dfp/int128-2.c: New test. * gcc.dg/dfp/int128-3.c: New test. * gcc.dg/dfp/int128-4.c: New test.
[Bug c++/112455] New: befriending a lambda closure type doesn't grant access to the lambda body
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112455 Bug ID: 112455 Summary: befriending a lambda closure type doesn't grant access to the lambda body Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: richard-gccbugzilla at metafoo dot co.uk Target Milestone: --- Testcase: class C; auto x = [](MyC *p) { return p->n; }; class C { int n; friend decltype(x); } c; int k = x(&c); This appears to be valid, and Clang, MSVC, and EDG accept, but GCC reports an access error. The templated operator() of the lambda is a member function of a friend of C, so should have access to C::n.
[Bug middle-end/112444] [14 regression] ICE when buliding libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444 --- Comment #8 from Richard Biener --- OK, so the reason is that we value-number tmp_46 in [local count: 482002707]: if (0 != 0) goto ; [33.00%] else goto ; [67.00%] [local count: 322941815]: tmp_5 = .DEFERRED_INIT (1, 2, &"tmp"[0]); ... goto ; [local count: 159060893]: ... tmp_2 = .DEFERRED_INIT (1, 2, &"tmp"[0]); ... tmp_46 = PHI to tmp_2 because we use ssa_undefined_value_p () to check whether we are dealing with an undefined value. And that returns true for tmp_5. This makes us pick tmp_2 which we treat as VARYING since we didn't visit it and we don't trust the not-executable state of its incoming edge (a missed optimization, guess I can look at that as well). tmp_2 is also considered undefined. We then have /* If we saw only undefined values and VN_TOP use one of the undefined values. */ else if (sameval == VN_TOP) result = seen_undef ? seen_undef : sameval; and "one of" puts us in an unlucky situation here. I do have a sensible fix around this I think.
[Bug target/112443] [12/13/14 Regression] Misoptimization of _mm256_blendv_epi8 intrinsic on avx512bw+avx512vl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112443 --- Comment #2 from Alexander Grund --- I can confirm that the suggested patch can be applied to 12.2.0 and fixes the issue I observed
Re: [Bug ada/112446] New: Switch -gnatyz included in -gnatyg
> "gnatmake --help" states that -gnatyg is equivalent to -gnatydISux, but > in fact the new switch -gnatyz (check parentheses not required by operator > precedence rules) is included. > > If this is deliberate, the help information should say so. This is indeed deliberate, thanks for reporting! Arno
[Bug ada/112446] Switch -gnatyz included in -gnatyg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112446 --- Comment #1 from charlet at adacore dot com --- > "gnatmake --help" states that -gnatyg is equivalent to -gnatydISux, but > in fact the new switch -gnatyz (check parentheses not required by operator > precedence rules) is included. > > If this is deliberate, the help information should say so. This is indeed deliberate, thanks for reporting! Arno
[Bug debug/107231] [13/14 Regression] c-c++-common/goacc/kernels-loop-g.c: '-fcompare-debug' failure (length)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107231 Thomas Schwinge changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |aoliva at gcc dot gnu.org --- Comment #4 from Thomas Schwinge --- Resolved by Alexandre's recent commit r14-5257-g61d2b4746300a604469df15789194d0a7c73791b "skip debug stmts when assigning locus discriminators", thanks! Is this desirable to also cherry-pick into GCC 13?
[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-11-09
                 CC|                            |rsandifo at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #6 from Richard Biener ---
So in fact RVV with its single-bit element mask and the ability to
produce it from a V64QImode unsigned LT compare (but not from V64SImode?)
is supposed to be able to handle the "AVX512" style masking as far as
checking in vect_verify_full_masking_avx512 is concerned.

What I failed to implement (and check) is that the mask types have an
integer mode, thus we run into

  if (known_eq (TYPE_VECTOR_SUBPARTS (rgm->type),
                TYPE_VECTOR_SUBPARTS (vectype)))
    return rgm->controls[index];

  /* Split the vector if needed.  Since we are dealing with integer mode
     masks with AVX512 we can operate on the integer representation
     performing the whole vector shifting.  */
  unsigned HOST_WIDE_INT factor;
  bool ok = constant_multiple_p (TYPE_VECTOR_SUBPARTS (rgm->type),
                                 TYPE_VECTOR_SUBPARTS (vectype), &factor);
  gcc_assert (ok);
  gcc_assert (GET_MODE_CLASS (TYPE_MODE (rgm->type)) == MODE_INT);

It would be fine if we didn't need to split the 64 element mask into two
halves for a V32SImode vector op we need to mask here.

We try to look at the subset of the mask by converting it to a same size
integer type, right-shift it, truncate and convert back to the mask type.

That might or might not be possible with RVV masks (might or might not be
the "optimal" way to do things).
We can "fix" this by doing

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a544bc9b059..c7a92354578 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -11034,24 +11034,24 @@ vect_get_loop_mask (loop_vec_info loop_vinfo,
       bool ok = constant_multiple_p (TYPE_VECTOR_SUBPARTS (rgm->type),
                                      TYPE_VECTOR_SUBPARTS (vectype), &factor);
       gcc_assert (ok);
-      gcc_assert (GET_MODE_CLASS (TYPE_MODE (rgm->type)) == MODE_INT);
       tree mask_type = truth_type_for (vectype);
-      gcc_assert (GET_MODE_CLASS (TYPE_MODE (mask_type)) == MODE_INT);
       unsigned vi = index / factor;
       unsigned vpart = index % factor;
       tree vec = rgm->controls[vi];
       gimple_seq seq = NULL;
       vec = gimple_build (&seq, VIEW_CONVERT_EXPR,
-                          lang_hooks.types.type_for_mode
-                               (TYPE_MODE (rgm->type), 1), vec);
+                          lang_hooks.types.type_for_size
+                            (GET_MODE_BITSIZE (TYPE_MODE (rgm->type))
+                             .to_constant (), 1), vec);
       /* For integer mode masks simply shift the right bits into position.  */
       if (vpart != 0)
         vec = gimple_build (&seq, RSHIFT_EXPR, TREE_TYPE (vec), vec,
                             build_int_cst (integer_type_node,
                                            (TYPE_VECTOR_SUBPARTS (vectype)
                                             * vpart)));
-      vec = gimple_convert (&seq, lang_hooks.types.type_for_mode
-                                   (TYPE_MODE (mask_type), 1), vec);
+      vec = gimple_convert (&seq, lang_hooks.types.type_for_size
+                                   (GET_MODE_BITSIZE (TYPE_MODE (mask_type))
+                                    .to_constant (), 1), vec);
       vec = gimple_build (&seq, VIEW_CONVERT_EXPR, mask_type, vec);
       if (seq)
         gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);

which then generates the "expected" partial vector code. If you don't want
partial vectors for VLS modes then I guess we could also enhance the
vector_modes "iteration" to allow the target to override
--param vect-partial-vector-usage on a per-mode base.

Or I can simply not "fix" the code above but instead add an integer mode
check to vect_verify_full_masking_avx512.

But as said, in principle this scheme works.
That fix would be

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a544bc9b059..0b364ac1c6e 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -1462,7 +1462,10 @@ vect_verify_full_masking_avx512 (loop_vec_info loop_vinfo)
       if (!mask_type)
         continue;

-      if (TYPE_PRECISION (TREE_TYPE (mask_type)) != 1)
+      /* For now vect_get_loop_mask only supports integer mode masks
+         when we need to split it.  */
+      if (GET_MODE_CLASS (TYPE_MODE (mask_type)) != MODE_INT
+          || TYPE_PRECISION (TREE_TYPE (mask_type)) != 1)
         {
           ok = false;
           break;
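The mask splitting discussed in this comment (reinterpret the mask as an integer, shift the wanted part down, truncate) can be illustrated with ordinary integers. This is a toy sketch, not the GIMPLE that vect_get_loop_mask builds: `sub_mask` is a hypothetical function, each 64-element one-bit mask is modelled as a `uint64_t`, and the consumer is assumed to need 32-element masks, so the factor is 2. The `vi` and `vpart` names follow the snippet quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Each control holds a 64-element mask with one bit per element. A
// 32-element consumer needs half of a control, selected by `index`.
inline std::uint32_t sub_mask(const std::vector<std::uint64_t>& controls,
                              unsigned index) {
    const unsigned nunits = 32;             // elements in the narrower mask
    const unsigned factor = 64 / nunits;    // sub-masks per control (2)
    const unsigned vi = index / factor;     // which control to read
    const unsigned vpart = index % factor;  // which part of that control
    assert(vi < controls.size());

    std::uint64_t bits = controls[vi];      // the "view-convert" to an integer
    if (vpart != 0)
        bits >>= nunits * vpart;            // shift the wanted bits into place
    return static_cast<std::uint32_t>(bits);  // truncate to the narrow mask
}

inline void example_usage() {
    const std::vector<std::uint64_t> controls{0xAAAAAAAA55555555ULL};
    assert(sub_mask(controls, 0) == 0x55555555u);  // low half
    assert(sub_mask(controls, 1) == 0xAAAAAAAAu);  // high half
}
```

The scheme relies on the mask having an integer representation that can be shifted, which is the property the added MODE_INT check guards.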
[Bug libstdc++/112452] : operator|(_Range&& __r, _Self&& __self) should return decltype(auto)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112452 Jonathan Wakely changed: What|Removed |Added CC||ppalka at gcc dot gnu.org --- Comment #1 from Jonathan Wakely --- c.f. https://cplusplus.github.io/LWG/issue3981
[Bug libstdc++/111172] Dead code in std::get for variant?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111172

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|14.0                        |13.3
[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

--- Comment #7 from JuzheZhong ---
breakpoint.vect_record_loop_mask (loop_vinfo, masks, ncopies * vec_num,

(gdb) p vectype->type_common.mode
$1 = E_V64HImode

From my observation, it seems to be V64HImode.

I tried your patch locally, and it fixes the ICE now.

Thanks!
[Bug sanitizer/109882] sanitizer/common_interface_defs.h bogusly defines __has_feature
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109882 --- Comment #10 from Jonathan Wakely --- The fix has been committed upstream now.
[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

--- Comment #8 from JuzheZhong ---
I think RVV won't use vec_pack/vec_unpack for masks, since we always use len
as the loop control.

I think it's fine to just disable it when the target doesn't support split
mask operations, as is the case for RVV.
[Bug sanitizer/109882] sanitizer/common_interface_defs.h bogusly defines __has_feature
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109882 Sam James changed: What|Removed |Added See Also||https://github.com/llvm/llv ||m-project/pull/66628 --- Comment #11 from Sam James --- We could really do with a general sync of libsanitizer as well.
[Bug modula2/111956] Many powerpc platforms do _not_ have support for IEEE754 long double
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111956 Thomas Koenig changed: What|Removed |Added CC||tkoenig at gcc dot gnu.org --- Comment #11 from Thomas Koenig --- A remark - gfortran handles 128-bit reals on POWER as well, it might be a good idea to look into libgfortran's configure scripts.
[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #9 from Richard Biener --- OK, I'll include it in my next round of testing.
[Bug middle-end/112444] [14 regression] ICE when buliding libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444 --- Comment #9 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:8ebcea91e24964ec52ca2caf9f8585f3a785f7d5 commit r14-5276-g8ebcea91e24964ec52ca2caf9f8585f3a785f7d5 Author: Richard Biener Date: Thu Nov 9 09:41:10 2023 +0100 tree-optimization/112444 - avoid bougs PHI value-numbering With .DEFERRED_INIT ssa_undefined_value_p () can return true for values we did not visit (because they proved unreachable) but are not .VN_TOP. Avoid using those as value which, because they are not visited, are assumed to be defined outside of the region. PR tree-optimization/112444 * tree-ssa-sccvn.cc (visit_phi): Avoid using not visited defs as undefined vals. * gcc.dg/torture/pr112444.c: New testcase.
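One way to picture the change described in this commit is as a selection rule over the PHI arguments that look undefined. The following is a toy model only: `PhiArg` and `pick_undef` are hypothetical, and the policy shown (prefer an undefined value whose definition was visited, and fall back to a non-visited one only if nothing else is available) is an assumption about one plausible reading of "avoid using not visited defs as undefined vals", not a transcription of visit_phi in tree-ssa-sccvn.cc. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy model of one PHI argument; all fields are hypothetical.
struct PhiArg {
    int name;        // SSA version, e.g. 5 for tmp_5
    bool undefined;  // what an undefined-value query reports for it
    bool visited;    // whether value numbering visited its definition
};

// Choose which undefined argument represents the PHI when only undefined
// values were seen. A visited definition replaces a non-visited one.
inline std::optional<int> pick_undef(const std::vector<PhiArg>& args) {
    std::optional<int> seen_undef;
    bool seen_undef_visited = false;
    for (const PhiArg& arg : args) {
        if (!arg.undefined)
            continue;
        if (!seen_undef || (!seen_undef_visited && arg.visited)) {
            seen_undef = arg.name;
            seen_undef_visited = arg.visited;
        }
    }
    return seen_undef;
}

inline void example_usage() {
    // tmp_2 was not visited, tmp_5 was: the visited one wins in either order.
    assert(pick_undef({{2, true, false}, {5, true, true}}) == 5);
    assert(pick_undef({{5, true, true}, {2, true, false}}) == 5);
}
```

With a plain "first undefined value seen" rule, the first call would return 2, which corresponds to the unlucky "one of" choice described in comment #8.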
[Bug middle-end/112444] [14 regression] ICE when buliding libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #10 from Richard Biener --- Fixed.
[Bug c++/112456] New: Diagnostic for [[nodiscard]] on a constructor could be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112456 Bug ID: 112456 Summary: Diagnostic for [[nodiscard]] on a constructor could be improved Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: diagnostic Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: redi at gcc dot gnu.org Target Milestone: --- struct S { [[nodiscard]] S() { } }; void f() { S(); } This correctly warns: nod.cc: In function 'void f()': nod.cc:7:6: warning: ignoring return value of 'S::S()', declared with attribute 'nodiscard' [-Wunused-result] 7 | S(); | ^ nod.cc:2:17: note: declared here 2 | [[nodiscard]] S() { } | ^ But the text could be improved because a constructor does not have a return value. Maybe "ignoring temporary object constructed by 'S::S()', declared with ...
[Bug c++/106851] [modules] Name conflict for exported using-declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106851

Nathaniel Shead changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nathanieloshead at gmail dot com

--- Comment #2 from Nathaniel Shead ---
This behaviour should be as expected, right? The 'using' is trying to bring
both names into the same scope (the global namespace), irrespective of the
fact that we're also exporting that new declaration. (That is, removing the
'export' keywords from this test case gives the exact same result.)

That said, perhaps it would be helpful for the error message to point to the
using-declaration it actually conflicts with, rather than the definition that
said using-declaration points to.
[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #10 from Richard Biener --- commit 8863a7990e9f0cd49c8900605a2c75a0e8886e85 (origin/master, origin/HEAD) Author: Richard Biener Date: Thu Nov 9 11:44:07 2023 +0100 tree-optimization/112450 - avoid AVX512 style masking for BImode masks The following avoids running into the AVX512 style masking code for RVV which would theoretically be able to handle it if I were not relying on integer mode maskness in vect_get_loop_mask. While that's easy to fix (patch in PR), the preference is to not have AVX512 style masking for RVV, thus the following. * tree-vect-loop.cc (vect_verify_full_masking_avx512): Check we have integer mode masks as required by vect_get_loop_mask.
[Bug target/112413] Wrong switch jump table offset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112413 Mikael Pettersson changed: What|Removed |Added CC||mikpelinux at gmail dot com --- Comment #4 from Mikael Pettersson --- Does the `.balignw` filler disappear if you drop `-malign-int`?
[Bug c/112457] New: Possible better vectorization of different reduction min/max reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457

            Bug ID: 112457
           Summary: Possible better vectorization of different reduction
                    min/max reduction
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

Hi, Richard.

GCC-14 has almost all features of RVV.

I am planning to work on improving the GCC loop vectorizer in GCC-15.

Fixing the TSVC FAILs is one of my plans.

Currently we can vectorize the following case:

int idx = 0;
int max = 0;

void foo (int n, int * __restrict a){
  for (int i = 0; i < n; ++i) {
    max = max < a[i] ? a[i] : max;
  }
}

However, if we change the case as follows, vectorization fails:

void foo2 (int n, int * __restrict a){
  for (int i = 0; i < n; ++i) {
    if (max < a[i]) {
      max = a[i];
    } else
      max = max;
  }
}

I also noticed another interesting possible vectorization enhancement,
inspired by this LLVM patch:

https://reviews.llvm.org/D143465

A more advanced case, taken from the LLVM patch, is the vectorization of a
reduction with an index:

void foo3 (int n, int * __restrict a){
  for (int i = 0; i < n; ++i) {
    if (max < a[i]) {
      idx = i;
      max = a[i];
    }
  }
}

I wonder whether this is a valuable optimization? If yes, I will add it to my
TODO list.

Thanks.
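To make the "reduction with index" case concrete, here is a scalar emulation of how such a loop could be split into independent lanes and recombined. It is a hand-written sketch under assumptions, not what GCC or LLVM generate: `kLanes`, `scalar_max_with_index` and `lane_max_with_index` are hypothetical names, the initial max and idx are passed in instead of being globals, and the epilogue picks the smallest index among equal maxima so that the result matches the scalar loop.

```cpp
#include <cassert>
#include <utility>
#include <vector>

constexpr int kLanes = 4;  // hypothetical vector width

// Reference: the foo3 loop from the report, returning {max, idx}.
inline std::pair<int, int> scalar_max_with_index(const std::vector<int>& a,
                                                 int max, int idx) {
    for (int i = 0; i < static_cast<int>(a.size()); ++i)
        if (max < a[i]) { idx = i; max = a[i]; }
    return {max, idx};
}

// Lane-wise version: lane l sees elements l, l + kLanes, l + 2 * kLanes, ...
inline std::pair<int, int> lane_max_with_index(const std::vector<int>& a,
                                               int max0, int idx0) {
    int lane_max[kLanes];
    int lane_idx[kLanes];
    for (int l = 0; l < kLanes; ++l) { lane_max[l] = max0; lane_idx[l] = -1; }

    for (int i = 0; i < static_cast<int>(a.size()); ++i) {
        const int l = i % kLanes;
        if (lane_max[l] < a[i]) { lane_max[l] = a[i]; lane_idx[l] = i; }
    }

    // Epilogue: reduce across lanes; on equal maxima keep the lowest index.
    int max = max0;
    int best_idx = -1;  // -1 means no lane improved on max0
    for (int l = 0; l < kLanes; ++l) {
        if (lane_idx[l] < 0)
            continue;
        if (max < lane_max[l]
            || (max == lane_max[l] && lane_idx[l] < best_idx)) {
            max = lane_max[l];
            best_idx = lane_idx[l];
        }
    }
    return {max, best_idx < 0 ? idx0 : best_idx};
}

inline void example_usage() {
    const std::vector<int> a{3, 9, 2, 9, 1, 9};
    assert(lane_max_with_index(a, 0, 0) == scalar_max_with_index(a, 0, 0));
}
```

The extra tie-breaking step in the epilogue is the part that distinguishes an index reduction from a plain max reduction.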
[Bug c/112457] Possible better vectorization of different reduction min/max reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457 --- Comment #1 from JuzheZhong --- Reference: https://godbolt.org/z/9M1jWzMdx
[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450 --- Comment #11 from CVS Commits --- The master branch has been updated by Pan Li : https://gcc.gnu.org/g:83f66d90af69837f7c8fc88f8afb7074d4555394 commit r14-5278-g83f66d90af69837f7c8fc88f8afb7074d4555394 Author: Juzhe-Zhong Date: Thu Nov 9 20:00:38 2023 +0800 RISC-V: Add PR112450 test to avoid regression ICE has been fixed by Richard:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450. Add test to avoid future regression. Committed. PR target/112450 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr112450.c: New test.
[Bug target/112443] [12/13/14 Regression] Misoptimization of _mm256_blendv_epi8 intrinsic on avx512bw+avx512vl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112443 --- Comment #3 from Alexander Grund --- > I can confirm that the suggested patch can be applied to 12.2.0 and fixes > the issue I observed Also tested 12.1, 12.3, 13.1, 13.2 with this patch and it works (as expected) too
[Bug tree-optimization/112458] New: SLP permute optimization issue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112458 Bug ID: 112458 Summary: SLP permute optimization issue Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- I'm facing an ICE on the vect-slp-only branch when compiling gcc.target/i386/pr98928.c with --param vect-single-lane-slp=1, I get during GIMPLE pass: vect /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c: In function 'main': /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: internal compiler error: in operator[], at vec.h:910 0x1a69a9e vec::operator[](unsigned int) /space/rguenther/src/gcc-clean/gcc/vec.h:910 0x1a646f4 vec::operator[](unsigned int) /space/rguenther/src/gcc-clean/gcc/vec.h:1599 0x1a4e5b2 vect_optimize_slp_pass::change_vec_perm_layout(_slp_tree*, vec, va_heap, vl_ptr>&, int, unsigned int) /space/rguenther/src/gcc-clean/gcc/tree-vect-slp.cc:4779 0x1a4e746 vect_optimize_slp_pass::internal_node_cost(_slp_tree*, int, unsigned int) /space/rguenther/src/gcc-clean/gcc/tree-vect-slp.cc:4827 0x1a503c7 vect_optimize_slp_pass::forward_pass() /space/rguenther/src/gcc-clean/gcc/tree-vect-slp.cc:5419 0x1a527d5 vect_optimize_slp_pass::run() /space/rguenther/src/gcc-clean/gcc/tree-vect-slp.cc:5953 and the issue is we end up in change_vec_perm_layout for /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: node 0x462f8d0 (max_nunits=1, refcnt=1) vector(8) float /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: op: VEC_PERM_EXPR /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: { } /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: lane permutation { 0[0] 1[0] } /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: children 0x462ee30 0x462ef40 where the first 
child is a constant def: /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: node (constant) 0x462ee30 (max_nunits=1, refcnt=1) vector(8) float /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: { 0.0 } and the second child is internal: /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: node 0x462ef40 (max_nunits=16, refcnt=1) vector(8) float /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: op template: patt_53 = patt_26 ? _93 : 0.0; /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: stmt 0 patt_53 = patt_26 ? _93 : 0.0; /space/rguenther/src/gcc-clean/gcc/testsuite/gcc.target/i386/pr98928.c:11:6: note: children 0x462efc8 0x462f5a0 0x462f738 now, for the constant def child the partition number is -1 and thus 4777 slp_tree in_node = SLP_TREE_CHILDREN (node)[entry.first]; 4778 unsigned int in_partition_i = m_vertices[in_node->vertex].partition; 4779 this_in_layout_i = m_partitions[in_partition_i].layout; we crash here. I wonder where to fend this off or where exactly we assume all children of a VEC_PERM are either internal or not. I've seen /* Check that the child nodes support the chosen layout. Checking the first child is enough, since any second child would have the same shape. */ auto first_child = SLP_TREE_CHILDREN (node)[0]; if (in_layout_i > 0 && !is_compatible_layout (first_child, in_layout_i)) return -1; but it doesn't apply here since in_layout_i is -1.
[Bug tree-optimization/112457] Possible better vectorization of different reduction min/max reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |tree-optimization
             Blocks|                            |53947

--- Comment #2 from Richard Biener ---
Well, this is because MAX_EXPR detection fails when store motion inserts flags (the max = max is elided) to avoid store-data races. Also when using -Ofast we avoid this but then the next phiopt comes too late to discover MAX after store motion is applied.

The more practical example is

int foo2 (int max, int n, int * __restrict a)
{
  for (int i = 0; i < n; ++i)
    if (max < a[i])
      {
        max = a[i];
      }
  return max;
}

and that's handled OK. For your second example, index reduction, there are already bug reports.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
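For illustration, the two scalar shapes of a max reduction can be written side by side. This is only a sketch: max_if mirrors the foo2 loop from the comment, while max_select is a hypothetical variant that assigns a conditional expression on every iteration, a form commonly recognised as a maximum. Both compute the same value; which of them a given GCC release vectorizes depends on the version and the flags.

```c
/* Max reduction written with a guarded store, as in foo2 above.  */
int
max_if (int max, int n, const int *restrict a)
{
  for (int i = 0; i < n; ++i)
    if (max < a[i])
      max = a[i];
  return max;
}

/* Hypothetical alternative: an unconditional assignment of a select.  */
int
max_select (int max, int n, const int *restrict a)
{
  for (int i = 0; i < n; ++i)
    max = max < a[i] ? a[i] : max;
  return max;
}
```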
[Bug c++/112455] befriending a lambda closure type doesn't grant access to the lambda body
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112455 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #1 from Andrew Pinski --- Dup of bug 102791. *** This bug has been marked as a duplicate of bug 102791 ***
[Bug c++/102791] Friend declaration of lambda function is ignored
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102791 Andrew Pinski changed: What|Removed |Added CC||richard-gccbugzilla@metafoo ||.co.uk --- Comment #2 from Andrew Pinski --- *** Bug 112455 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/111133] SLP of scatters not implemented
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111133 --- Comment #2 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:fd8e5f3c430f37c99ddcc00fcafc1a12b3475a3a commit r14-5280-gfd8e5f3c430f37c99ddcc00fcafc1a12b3475a3a Author: Richard Biener Date: Wed Nov 8 13:14:59 2023 +0100 Refactor x86 decl based scatter vectorization, prepare SLP The following refactors the x86 decl based scatter vectorization similar to what I did to the gather path. This prepares scatters for SLP as well, mainly single-lane since there are multiple missing bits to support multi-lane scatters. Tested extensively on the SLP-only branch which has the ability to force SLP even for single lanes. PR tree-optimization/111133 * tree-vect-stmts.cc (vect_build_scatter_store_calls): Remove and refactor to ... (vect_build_one_scatter_store_call): ... this new function. (vectorizable_store): Use vect_check_scalar_mask to record the SLP node for the mask operand. Code generate scatters with builtin decls from the main scatter vectorization path and prepare that for SLP. * tree-vect-slp.cc (vect_get_operand_map): Do not look at the VDEF to decide between scatter or gather since that doesn't work for patterns. Use the LHS being an SSA_NAME or not instead.
[Bug tree-optimization/111133] multi-lane SLP of scatters not implemented
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111133 Richard Biener changed: What|Removed |Added Summary|SLP of scatters not |multi-lane SLP of scatters |implemented |not implemented --- Comment #3 from Richard Biener --- Single-lane SLP scatters now work (well, verified on x86 so far).
[Bug driver/111605] Cross compilation doesn't work with `-fuse-ld=mold`
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111605 --- Comment #13 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:1c6d6b34b112b52566ebde49afef3e6eb747ef90 commit r14-5281-g1c6d6b34b112b52566ebde49afef3e6eb747ef90 Author: Tatsuyuki Ishi Date: Mon Oct 16 14:04:12 2023 +0900 Do not prepend target triple to -fuse-ld=lld,mold. lld and mold are platform-agnostic and not prefixed with target triple. Prepending the target triple makes it less likely to find the intended linker executable. A potential breaking change is that we no longer try to search for triple-prefixed lld/mold binaries anymore. However, since there doesn't seem to be support to build LLVM or mold with triple-prefixed executable names, it seems better to just not bother with that case. PR driver/111605 * collect2.cc (main): Do not prepend target triple to -fuse-ld=lld,mold.
[Bug driver/111605] Cross compilation doesn't work with `-fuse-ld=mold`
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111605 Richard Biener changed: What|Removed |Added Known to work||14.0 Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #14 from Richard Biener --- Works on trunk now, queued for backporting up to GCC 12.
[Bug libstdc++/112453] : __take_of_repeat_view/__drop_of_repeat_view should forwards __r._M_value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112453 Patrick Palka changed: What|Removed |Added Target Milestone|--- |13.3 Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2023-11-09 Keywords||rejects-valid Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org --- Comment #1 from Patrick Palka --- Confirmed, thanks for catching and reporting this.
[Bug libstdc++/112452] : operator|(_Range&& __r, _Self&& __self) should return decltype(auto)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112452 Patrick Palka changed: What|Removed |Added Last reconfirmed||2023-11-09 Assignee|unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1
[Bug fortran/112459] New: gfortran -w option causes derived-type finalization at creation time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112459

            Bug ID: 112459
           Summary: gfortran -w option causes derived-type finalization at creation time
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bardeau at iram dot fr
  Target Milestone: ---

Hi everyone,

with gfortran version 13.2.0, the -w compilation switch modifies the code behavior at execution time. This was not the case with e.g. gfortran 12.1.0.

~> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/home/bardeau/Softs/gcc-13.2.0/libexec/gcc/x86_64-pc-linux-gnu/13.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../srcdir/configure --with-gmp=/home/bardeau/Softs/gcc-deps --prefix=/home/bardeau/Softs/gcc-13.2.0 --enable-languages=c,c++,fortran --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.2.0 (GCC)

Code example:

module mymod
  type mysubtype
    integer(kind=4), allocatable :: a(:)
  end type mysubtype
  type :: mytype
    integer :: i
    type(mysubtype) :: sub
  contains
    final :: mytype_final
  end type mytype
contains
  subroutine mysubtype_final(sub)
    type(mysubtype), intent(inout) :: sub
    print *,'MYSUBTYPE>FINAL'
    if (allocated(sub%a)) deallocate(sub%a)
  end subroutine mysubtype_final
  subroutine mytype_final(typ)
    type(mytype), intent(inout) :: typ
    print *,"MYTYPE>FINAL"
    call mysubtype_final(typ%sub)
  end subroutine mytype_final
end module mymod
!
program myprog
  use mymod
  type(mytype), pointer :: c
  print *,"Before allocation"
  allocate(c)
  print *,"After allocation"
end program myprog

Compilation and execution:

~> gfortran -w test1.f90 -o test1 && ./test1
Before allocation
MYTYPE>FINAL
MYSUBTYPE>FINAL
After allocation

The problem is that the FINAL procedure (mytype_final) is invoked at the time the c variable is allocated, which is unexpected. Plus, this behavior is random. If the "sub" component is removed from "mytype", mytype_final is not invoked anymore. I also have a much more complex example where the program crashes because the evaluation "allocated(sub%a)" is incorrect and leads to deallocation of the unallocated "sub%a". All these behaviors are correlated to the presence of the -w option.

In order to be complete, I must say that the -w option is not described in https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gfortran/Error-and-Warning-Options.html but it is suggested in -fallow-argument-mismatch (documented here: https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gfortran/Fortran-Dialect-Options.html ). In practice it can be used for example with:

subroutine test
  call foo()
  call foo(1)
end subroutine test

~> gfortran -c -fallow-argument-mismatch -w test2.f90

which shows no warning thanks to -w.
[Bug ada/111813] Inconsistent limit in Ada.Calendar.Formatting
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111813 --- Comment #1 from CVS Commits --- The master branch has been updated by Marc Poulhiès : https://gcc.gnu.org/g:a80daa2e52ab8fd8a83eec1379b4a5d4187a1162 commit r14-5282-ga80daa2e52ab8fd8a83eec1379b4a5d4187a1162 Author: Simon Wright Date: Mon Oct 16 14:32:43 2023 +0100 Fix PR ada/111813 (Inconsistent limit in Ada.Calendar.Formatting) The description of the second Value function (returning Duration) (ARM 9.6.1(87)) doesn't place any limitation on the Elapsed_Time parameter's value, beyond "Constraint_Error is raised if the string is not formatted as described for Image, or the function cannot interpret the given string as a Duration value". It would seem reasonable that Value and Image should be consistent, in that any string produced by Image should be accepted by Value. Since Image must produce a two-digit representation of the Hours, there's an implication that its Elapsed_Time parameter should be less than 100.0 hours (the ARM merely says that in that case the result is implementation-defined). The current implementation of Value raises Constraint_Error if the Elapsed_Time parameter is greater than or equal to 24 hours. This patch removes the restriction, so that the Elapsed_Time parameter must only be less than 100.0 hours. 2023-10-15 Simon Wright PR ada/111813 gcc/ada/ * libgnat/a-calfor.adb (Value (2)): Allow values of parameter Elapsed_Time greater than or equal to 24 hours, by doing the hour calculations in Natural rather than Hour_Number (0 .. 23). Calculate the result directly rather than by using Seconds_Of (whose Hour parameter is of type Hour_Number). If an exception occurs of type Constraint_Error, re-raise it rather than raising a new CE. gcc/testsuite/ * gnat.dg/calendar_format_value.adb: New test.
[Bug fortran/112407] [13/14 Regression] Fix for PR37336 triggers an ICE in gfc_format_decoder while constructing a vtab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112407

--- Comment #6 from Paul Thomas ---
(In reply to Tomáš Trnka from comment #5)
> I'm looking forward to any more information on the root cause.

I have failed to produce a compact reproducer that resembles your bug. In fact, you will note the first comment in the reproducer below, which is a bit ironic :-).

You will note the commented out assignment and select type block. These generate the exact error, i.e. whenever 'new_t' appears in a variable expression the error is triggered.

I am deeply puzzled and will have another go at achieving some enlightenment tomorrow.

Paul

module m
  private new_t

  type s
    procedure(),pointer,nopass :: op
  end type

  type :: t
    integer :: i
    type (s) :: s
  contains
    procedure :: new_t
    procedure :: bar
    procedure :: add_t
    generic :: new => new_t, bar
    generic, public :: assignment(=) => add_t
    final :: final_t
  end type

  integer :: i = 0, finals = 0

contains
!  recursive subroutine new_t (arg1, arg2) ! gfortran doesn't detect the recursion
  subroutine new_t (arg1, arg2)            ! in 'new_t'! Other brands do.
    class(t), intent(out) :: arg1
    type(t), intent(in)  :: arg2
    i = i + 1

    !arg1%s%op => new_t                    ! This generates the error

    !select type (arg1)                    ! As does this
    !  type is (t)
    !arg1 = t(arg1%i,s(new_t))
    !end select

    print *, "new_t"
    if (i .ge. 10) return

    !arg1 = arg2                           ! gfortran does not detect the recursion

    if (arg1%i .ne. arg2%i) then           ! According to F2018(8.5.10), arg1 should be
      arg1%i = arg2%i                      ! undefined on invocation, unless any sub-components
      call arg1%new(arg2)                  ! are default initialised. gfortran sets arg1%i = 0
    endif                                  ! gfortran misses this recursion
  end
  subroutine bar(arg)
    class(t), intent(out) :: arg
    call arg%new(t(42, s(new_t)))
  end
  subroutine add_t (arg1, arg2)
    class(t), intent(out) :: arg1
    type(t), intent(in)  :: arg2
    call arg1%new (arg2)
  end
  impure elemental subroutine final_t (arg1)
    type(t), intent(in) :: arg1
    finals = finals + 1
  end
end

  use m
  class(t), allocatable :: x
  allocate(x)
  call x%new()                 ! gfortran outputs 10*'new_t'
  print *, x%i, i, finals      !    -||-          0 10 11
!
! The other brands output 2*'new_t' + 42 2 3
end
[Bug fortran/112460] New: ICE with parameterized derived types (incorrect code, should be rejected)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112460

            Bug ID: 112460
           Summary: ICE with parameterized derived types (incorrect code, should be rejected)
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: juergen.reuter at desy dot de
  Target Milestone: ---

This is probably known (then it can be marked as duplicate), but let me report it nevertheless. The following code should be rejected, but leads to an ICE:

   27 |   print *, cc
      |   1
internal compiler error: Segmentation fault: 11

Reproducer:

module color_propagator
  implicit none
  private
  public :: open_epsilon, closed_epsilon
  type :: open_epsilon
     integer, dimension(2) :: i
  end type open_epsilon
  type :: closed_epsilon
     integer, dimension(3) :: i
  end type closed_epsilon
  public :: t
  type :: t (n_in, n_out)
     integer, len :: n_in = 0, n_out = 0
     logical :: is_ghost = .false.
     integer, dimension(n_in) :: in
     integer, dimension(n_out) :: out
     type(open_epsilon), dimension(:), allocatable :: open_eps, open_eps_bar
     type(closed_epsilon), dimension(:), allocatable :: closed_eps, closed_eps_bar
  end type t
end module color_propagator
program foo
  use color_propagator
  type(t(n_in=2,n_out=1)), save :: aa
  type(t(n_in=1,n_out=2)), save :: bb
  type(t), dimension(2), save :: cc
  cc = [aa, bb]
  print *, cc
end program foo
[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154 --- Comment #74 from CVS Commits --- The master branch has been updated by Tamar Christina : https://gcc.gnu.org/g:2d44ab221f64f01fc676be0da1a6774740d713c6 commit r14-5283-g2d44ab221f64f01fc676be0da1a6774740d713c6 Author: Tamar Christina Date: Thu Nov 9 13:58:59 2023 + middle-end: expand copysign handling from lockstep to nested iters various optimizations in match.pd only happened on COPYSIGN in lock step which means they exclude IFN_COPYSIGN. COPYSIGN however is restricted to only the C99 builtins and so doesn't work for vectors. The patch expands these optimizations to work as nested iters. This is needed for the second patch which will add the testcase. gcc/ChangeLog: PR tree-optimization/109154 * match.pd: expand existing copysign optimizations.
[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154 --- Comment #75 from CVS Commits --- The master branch has been updated by Tamar Christina : https://gcc.gnu.org/g:3f176e1adc6bc9cc2c21222d776b51d9f43cb66b commit r14-5284-g3f176e1adc6bc9cc2c21222d776b51d9f43cb66b Author: Tamar Christina Date: Thu Nov 9 13:59:39 2023 + middle-end: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154] This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more canonical and allows a target to expand this sequence efficiently. Such sequences are common in scientific code working with gradients. There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x)) which I remove since this is a less efficient form. The testsuite is also updated in light of this. gcc/ChangeLog: PR tree-optimization/109154 * match.pd: Add new neg+abs rule, remove inverse copysign rule. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.dg/fold-copysign-1.c: Updated. * gcc.dg/pr55152-2.c: Updated. * gcc.dg/tree-ssa/abs-4.c: Updated. * gcc.dg/tree-ssa/backprop-6.c: Updated. * gcc.dg/tree-ssa/copy-sign-2.c: Updated. * gcc.dg/tree-ssa/mult-abs-2.c: Updated. * gcc.target/aarch64/fneg-abs_1.c: New test. * gcc.target/aarch64/fneg-abs_2.c: New test. * gcc.target/aarch64/fneg-abs_3.c: New test. * gcc.target/aarch64/fneg-abs_4.c: New test. * gcc.target/aarch64/sve/fneg-abs_1.c: New test. * gcc.target/aarch64/sve/fneg-abs_2.c: New test. * gcc.target/aarch64/sve/fneg-abs_3.c: New test. * gcc.target/aarch64/sve/fneg-abs_4.c: New test.
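The equivalence that this canonicalization relies on can be checked at the source level. The sketch below is not taken from the patch or its testsuite; the helper names are hypothetical, and it assumes IEEE-754 binary64 doubles so that the raw bits can be compared through a 64-bit integer.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bits of X (assumes a 64-bit IEEE-754 double).  */
static uint64_t
double_bits (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits;
}

/* The form the new rule matches: negate the absolute value.  */
double
neg_abs (double x)
{
  return -fabs (x);
}

/* The canonical form after the patch: force the sign bit on.  */
double
copysign_minus_one (double x)
{
  return copysign (x, -1.0);
}

/* Nonzero when both forms give bit-identical results for X.  */
int
same_result (double x)
{
  return double_bits (neg_abs (x)) == double_bits (copysign_minus_one (x));
}
```

Comparing bit patterns rather than values matters for signed zeros: 0.0 and -0.0 compare equal, yet both forms are expected to return -0.0 for either input.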
[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154 --- Comment #76 from CVS Commits --- The master branch has been updated by Tamar Christina : https://gcc.gnu.org/g:f30ecd8050444fb902ab66b4600c590908861fdf commit r14-5285-gf30ecd8050444fb902ab66b4600c590908861fdf Author: Tamar Christina Date: Thu Nov 9 14:00:20 2023 + ifcvt: Add support for conditional copysign This adds a masked variant of copysign. Nothing very exciting, just the general machinery to define and use a new masked IFN. Bootstrapped and regtested on aarch64-none-linux-gnu with no issues. Note: This patch is part of a patch series, and tests for it are added in the AArch64 patch that adds support for the optab. gcc/ChangeLog: PR tree-optimization/109154 * internal-fn.def (COPYSIGN): New. * match.pd (UNCOND_BINARY, COND_BINARY): Map IFN_COPYSIGN to IFN_COND_COPYSIGN. * optabs.def (cond_copysign_optab, cond_len_copysign_optab): New.
[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154 --- Comment #79 from CVS Commits --- The master branch has been updated by Tamar Christina : https://gcc.gnu.org/g:ffd40d3b233d63c925cceb0dcd5a4fc8925e2993 commit r14-5288-gffd40d3b233d63c925cceb0dcd5a4fc8925e2993 Author: Tamar Christina Date: Thu Nov 9 14:18:48 2023 + AArch64: Use SVE unpredicated LOGICAL expressions when Advanced SIMD inefficient [PR109154] SVE has much bigger immediate encoding range for bitmasks than Advanced SIMD has and so on a system that is SVE capable if we need an Advanced SIMD Inclusive-OR by immediate and would require a reload then use an unpredicated SVE ORR instead. This has both speed and size improvements. gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64.md (3): Add SVE split case. * config/aarch64/aarch64-simd.md (ior3): Likewise. * config/aarch64/predicates.md(aarch64_orr_imm_sve_advsimd): New. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.target/aarch64/sve/fneg-abs_1.c: Updated. * gcc.target/aarch64/sve/fneg-abs_2.c: Updated. * gcc.target/aarch64/sve/fneg-abs_4.c: Updated.
[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154 --- Comment #77 from CVS Commits --- The master branch has been updated by Tamar Christina : https://gcc.gnu.org/g:2ea13fb9c0b56e9b8c0425d101cf81437a5200cf commit r14-5286-g2ea13fb9c0b56e9b8c0425d101cf81437a5200cf Author: Tamar Christina Date: Thu Nov 9 14:02:21 2023 + AArch64: Add special patterns for creating DI scalar and vector constant 1 << 63 [PR109154] This adds a way to generate special sequences for creation of constants for which we don't have single-instruction sequences and which would normally have led to a GP -> FP transfer or a literal load. The patch starts out by adding support for creating 1 << 63 using fneg (mov 0). gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64-protos.h (aarch64_simd_special_constant_p, aarch64_maybe_generate_simd_constant): New. * config/aarch64/aarch64-simd.md (*aarch64_simd_mov, *aarch64_simd_mov): Add new codegen for special constants. * config/aarch64/aarch64.cc (aarch64_extract_vec_duplicate_wide_int): Take optional mode. (aarch64_simd_special_constant_p, aarch64_maybe_generate_simd_constant): New. * config/aarch64/aarch64.md (*movdi_aarch64): Add new codegen for special constants. * config/aarch64/constraints.md (Dx): new. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.target/aarch64/fneg-abs_1.c: Updated. * gcc.target/aarch64/fneg-abs_2.c: Updated. * gcc.target/aarch64/fneg-abs_4.c: Updated. * gcc.target/aarch64/dbl_mov_immediate_1.c: Updated.
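Why fneg (mov 0) yields 1 << 63 follows from the floating-point encoding: negating +0.0 flips only the sign bit. A small sketch in portable C, assuming IEEE-754 binary64 doubles; neg_zero_bits is a hypothetical helper and says nothing about the instructions GCC emits.

```c
#include <stdint.h>
#include <string.h>

/* Raw bits of negated +0.0, assuming IEEE-754 binary64 doubles.  */
uint64_t
neg_zero_bits (void)
{
  double zero = 0.0;
  double neg_zero = -zero;      /* the "fneg" of an all-zero value */
  uint64_t bits;
  memcpy (&bits, &neg_zero, sizeof bits);
  return bits;
}
```

Under that assumption the function returns a value with only bit 63 set, which is the integer constant the commit message refers to.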
[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154 --- Comment #80 from CVS Commits --- The master branch has been updated by Tamar Christina : https://gcc.gnu.org/g:ed2e058c58ab064fe3a26bc4a47a5d0a47350f97 commit r14-5289-ged2e058c58ab064fe3a26bc4a47a5d0a47350f97 Author: Tamar Christina Date: Thu Nov 9 14:04:57 2023 + AArch64: Handle copysign (x, -1) expansion efficiently copysign (x, -1) is effectively fneg (abs (x)) which on AArch64 can be most efficiently done by doing an OR of the signbit. The middle-end will optimize fneg (abs (x)) now to copysign as the canonical form and so this optimizes the expansion. If the target has an inclusive-OR that takes an immediate, then the transformed instruction is both shorter and faster. For those that don't, the immediate has to be separately constructed, but this still ends up being faster as the immediate construction is not on the critical path. Note that this is part of another patch series, the additional testcases are mutually dependent on the match.pd patch. As such the tests are added there instead of here. gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64.md (copysign3): Handle copysign (x, -1). * config/aarch64/aarch64-simd.md (copysign3): Likewise. * config/aarch64/aarch64-sve.md (copysign3): Likewise.
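The "OR of the signbit" mentioned above can be written out in portable C. This is a sketch under the assumption of IEEE-754 binary64 doubles; set_sign_bit is a hypothetical name and the code only models the arithmetic, not the AArch64 instruction selection.

```c
#include <stdint.h>
#include <string.h>

/* Model of copysign (x, -1.0) as an integer OR with the sign-bit mask,
   assuming IEEE-754 binary64 doubles.  */
double
set_sign_bit (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  bits |= UINT64_C (1) << 63;   /* the immediate the OR instruction needs */
  memcpy (&x, &bits, sizeof x);
  return x;
}
```

The mask is the only constant involved, which matches the commit's remark that on targets without an OR-immediate form the constant can be built off the critical path.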
[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154 --- Comment #78 from CVS Commits --- The master branch has been updated by Tamar Christina : https://gcc.gnu.org/g:830460d67a10549939602ba323ea3fa65fb7de20 commit r14-5287-g830460d67a10549939602ba323ea3fa65fb7de20 Author: Tamar Christina Date: Thu Nov 9 14:03:04 2023 + AArch64: Add movi for 0 moves for scalar types [PR109154] Following the Neoverse N/V and Cortex-A optimization guides SIMD 0 immediates should be created with a movi of 0. At the moment we generate an `fmov .., xzr` which is slower and requires a GP -> FP transfer. gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64.md (*mov_aarch64, *movsi_aarch64, *movdi_aarch64): Add new w -> Z case. * config/aarch64/iterators.md (Vbtype): Add QI and HI. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.target/aarch64/fneg-abs_2.c: Updated. * gcc.target/aarch64/fneg-abs_4.c: Updated. * gcc.target/aarch64/dbl_mov_immediate_1.c: Updated.
[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154 --- Comment #81 from CVS Commits --- The master branch has been updated by Tamar Christina : https://gcc.gnu.org/g:e01c2eeb2b654abc82378e204da8327bcdaf05dc commit r14-5290-ge01c2eeb2b654abc82378e204da8327bcdaf05dc Author: Tamar Christina Date: Thu Nov 9 14:05:40 2023 + AArch64: Add SVE implementation for cond_copysign. This adds an implementation for masked copysign along with an optimized pattern for masked copysign (x, -1). gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64-sve.md (cond_copysign): New. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.target/aarch64/sve/fneg-abs_5.c: New test.
[Bug tree-optimization/109154] [13 regression] jump threading de-optimizes nested floating point comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

Tamar Christina changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[13/14 regression] jump     |[13 regression] jump
                   |threading de-optimizes      |threading de-optimizes
                   |nested floating point       |nested floating point
                   |comparisons                 |comparisons
             Status|NEW                         |RESOLVED
   Target Milestone|13.3                        |14.0
         Resolution|---                         |FIXED

--- Comment #82 from Tamar Christina ---
This should give better performance than GCC-12. The patches are not backportable, so closing as resolved in GCC-14.
[Bug target/112308] [14 Regression] GCN: 'error: literal operands are not supported' for 'v_add_co_u32'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112308 Andrew Stubbs changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2023-11-09 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |ams at gcc dot gnu.org
[Bug ada/112461] New: [14 regression] Simple return inside extended return loses updates to return object value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112461

            Bug ID: 112461
           Summary: [14 regression] Simple return inside extended return loses updates to return object value
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ada
          Assignee: unassigned at gcc dot gnu.org
          Reporter: simon at pushface dot org
                CC: dkm at gcc dot gnu.org
  Target Milestone: ---

In a complicated extended return[1] with this structure

   -- Calculate the sum of the natural numbers up to & including the
   -- given limit.
   function Add (Up_To : Natural) return Natural is
      Round : Natural := 0;
   begin
      return Result : Natural := 0 do
         loop
            Result := Result + Round;
            if Round = Up_To then
               return;
            end if;
            Round := Round + 1;
         end loop;
      end return;
   end Add;

what was returned was the equivalent of the initial value (here, 0) rather than the value as updated in the loop (NB! this simple loop doesn’t fail, I only include it as an example, since I don’t have a simple reproducer).

The problem was introduced after 20231008.

[1] https://github.com/alire-project/alire/blob/a69ac7c7a24590bdfe1ca77bcb60386551989696/src/alire/alire-properties-from_toml.adb#L14
[Bug c++/112456] Diagnostic for [[nodiscard]] on a constructor could be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112456 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org Last reconfirmed||2023-11-09 Status|UNCONFIRMED |NEW Ever confirmed|0 |1
[Bug target/112462] New: RISC-V zicond cost model enhancements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112462 Bug ID: 112462 Summary: RISC-V zicond cost model enhancements Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- Currently the costing of zicond always returns COSTS_N_INSNS (1) which can be inaccurate. I see two primary issues that need to be fixed. First, for conditions which are not equality comparisons against zero the expander will need to emit a sCC insn. That additional instruction needs to be included in the cost. Second, the expander needs to look at the true/false arms and potentially emit additional code because of the limitations of the czero instruction. Those additional instructions need to be included in the cost as well. It's unclear if we should refactor the expander logic so that its basic structure can be used to drive costing as well as expansion logic or if we should just mirror the basic structure with new code and keep it in sync with the expander logic.
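The second issue, the extra code forced by the limitations of czero, can be made concrete by modelling the two Zicond instructions in C. This is a sketch based on the instruction semantics as I understand them from the extension (czero.eqz writes zero when the condition register is zero, czero.nez writes zero when it is nonzero); the function names are hypothetical and this is not the GCC expander or its cost hook.

```c
#include <stdint.h>

/* czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1.  */
static uint64_t
czero_eqz (uint64_t rs1, uint64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* czero.nez rd, rs1, rs2: rd = (rs2 != 0) ? 0 : rs1.  */
static uint64_t
czero_nez (uint64_t rs1, uint64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* cond ? a : 0 maps onto a single czero.  */
uint64_t
select_or_zero (uint64_t cond, uint64_t a)
{
  return czero_eqz (a, cond);
}

/* A general cond ? a : b needs two czeros plus an OR.  */
uint64_t
select_general (uint64_t cond, uint64_t a, uint64_t b)
{
  return czero_eqz (a, cond) | czero_nez (b, cond);
}
```

In this model a select with a zero arm is one operation while the general select is three, which is the kind of difference a flat COSTS_N_INSNS (1) cannot express. A condition that is not a comparison against zero would add the sCC step on top.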
[Bug libstdc++/111667] [C++23] Implement P1132R8, out_ptr - a scalable output pointer abstraction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111667 Jonathan Wakely changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org --- Comment #2 from Jonathan Wakely --- Mine.
[Bug modula2/112110] fails to build on freebsd when compiling wrapclock.cc in wrapclock_timezone attempting to return timezone
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112110 --- Comment #4 from CVS Commits --- The releases/gcc-13 branch has been updated by Gaius Mulley : https://gcc.gnu.org/g:61025fbaf989a57ebf44f76d397fb895be0210ac commit r13-8033-g61025fbaf989a57ebf44f76d397fb895be0210ac Author: Gaius Mulley Date: Thu Nov 9 16:14:43 2023 + PR modula2/112110: fails to build on freebsd when compiling wrapclock.cc This patch fixes a mangled #if #endif conditional section within wrapclock.cc. The conditional section in wrapclock_timezone should return 0 rather than return timezone. libgm2/ChangeLog: PR modula2/112110 * libm2iso/wrapclock.cc (timezone): Return 0 if unable to get the timezone from the tm struct. Signed-off-by: Gaius Mulley
[Bug c/112442] Segfault from casting a ptr when using -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442 --- Comment #10 from Xi Ruoyao --- (In reply to Adam Andersson from comment #9) > I was sure I had tried -fno-strict-aliasing without any difference, but I > guessed I messed up somehow. Sorry about that. > > Still, is it not strange that -Wall doesn't generate a warning about this > then? -Wall only enables -Wstrict-aliasing=3 which may have false negatives. -Wstrict-aliasing=1 or -Wstrict-aliasing=2 warns about this, but generally they can produce many false positives (as they are documented). Generally it's impossible to make a reliable way to detect aliasing violation at compile time. For runtime checking LLVM folks were developing a Type Sanitizer (https://llvm.org/devmtg/2017-10/slides/Finkel-The%20Type%20Sanitizer.pdf) but the development seems stalled now. Thus we document "try -fno-strict-aliasing" in the "new bug" page as a "not so bad" way to rule out aliasing issues (it's only "not so bad", not "very good" because it may still hide real bugs).
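For readers unsure what an aliasing violation looks like in practice, here is a minimal sketch. It is not the reporter's code, which is not shown in this comment; float_bits is a hypothetical helper and the example assumes a 32-bit IEEE-754 float.

```c
#include <stdint.h>
#include <string.h>

/* Well-defined way to read the bits of a float: copy the bytes.
   Writing *(uint32_t *) &f instead would access a float object through
   an lvalue of an incompatible type, which the aliasing rules make
   undefined and which -O2 (which enables -fstrict-aliasing) is allowed
   to miscompile.  */
uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}
```

Compilers generally turn a fixed-size memcpy like this into a plain register move, so the conforming version usually costs nothing at run time.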
[Bug c/112463] New: ternary operator / -Wsign-compare inconsistency
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112463

            Bug ID: 112463
           Summary: ternary operator / -Wsign-compare inconsistency
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vincent-gcc at vinc17 dot net
  Target Milestone: ---

-Wsign-compare is described in the man page as follows:

    -Wsign-compare
        Warn when a comparison between signed and unsigned values could produce an incorrect result when the signed value is converted to unsigned. In C++, this warning is also enabled by -Wall. In C, it is also enabled by -Wextra.

But it can emit a warning even in the absence of comparisons between signed and unsigned values. For instance, it can appear due to the 2nd and 3rd operands of the ternary operator (these operands are not compared, just selected from the value of the first operand). This affects the warning output by -Wextra.

Consider the following C code:

#include <stdio.h>
int main (void)
{
  for (int c = -1; c <= 1; c++)
    {
      long long
        i = c == 0 ? 0LL : (c >= 0 ? 1U : -1),
        j = c >= 0 ? (c == 0 ? 0LL : 1U) : -1;
      printf ("i = %lld\nj = %lld\n", i, j);
    }
  return 0;
}

(which shows that the ternary operator is not associative due to type conversions).

With -Wextra, I get:

ternary-op.c: In function ‘main’:
ternary-op.c:7:43: warning: operand of ‘?:’ changes signedness from ‘int’ to ‘unsigned int’ due to unsignedness of other operand [-Wsign-compare]
    7 |         i = c == 0 ? 0LL : (c >= 0 ? 1U : -1),
      |                                           ^~

But the "-Wsign-compare" is incorrect as there are no comparisons between signed and unsigned values. Only -Wsign-conversion should trigger a warning.
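The non-associativity noted in the report comes from where the usual arithmetic conversions are applied. The two initializers can be isolated as functions to make this visible; pick_i and pick_j are hypothetical names for this sketch, and the stated results assume that long long is wider than unsigned int, as on common platforms.

```c
#include <limits.h>

/* Inner ?: has operands 1U and -1, so its type is unsigned int and -1
   becomes UINT_MAX before the outer ?: converts it to long long.  */
long long
pick_i (int c)
{
  return c == 0 ? 0LL : (c >= 0 ? 1U : -1);
}

/* Inner ?: has operands 0LL and 1U, so its type is long long; the outer
   ?: then converts -1 straight to long long and the value stays -1.  */
long long
pick_j (int c)
{
  return c >= 0 ? (c == 0 ? 0LL : 1U) : -1;
}
```

For c = -1 the first function therefore yields UINT_MAX as a long long while the second yields -1; for c = 0 and c = 1 they agree. No signed/unsigned comparison is involved in either, which is the point of the report.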
[Bug tree-optimization/112464] New: [14 Regression] ICE avx512 with -ftrapv since r14-5076
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112464

            Bug ID: 112464
           Summary: [14 Regression] ICE avx512 with -ftrapv since r14-5076
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mjires at suse dot cz
                CC: rdapp at gcc dot gnu.org
  Target Milestone: ---

Following reduction of testcase avx-vandpd-1.c causes ICE with -mavx512dq -ftrapv
Bisection points to r14-5076

$ cat included.c
long *e;
int n, i, err;
void fn() {
  for (; i < n; i++)
    if (e[i])
      err++;
}

$ gcc included.c -Ofast -mavx512dq -ftrapv
during GIMPLE pass: vect
included.c: In function ‘fn’:
included.c:3:6: internal compiler error: in vect_finish_replace_stmt, at tree-vect-stmts.cc:1353
    3 | void fn() {
      |      ^~
0x91fc95 vect_finish_replace_stmt(vec_info*, _stmt_vec_info*, gimple*)
        /home/mjires/git/GCC/master/gcc/tree-vect-stmts.cc:1353
0x1296bd6 vectorize_fold_left_reduction
        /home/mjires/git/GCC/master/gcc/tree-vect-loop.cc:7191
0x1296bd6 vect_transform_reduction(_loop_vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, gimple**, _slp_tree*)
        /home/mjires/git/GCC/master/gcc/tree-vect-loop.cc:8456
0x1fac0f5 vect_transform_stmt(vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, _slp_tree*, _slp_instance*)
        /home/mjires/git/GCC/master/gcc/tree-vect-stmts.cc:13069
0x1287d1f vect_transform_loop_stmt
        /home/mjires/git/GCC/master/gcc/tree-vect-loop.cc:11325
0x12acf13 vect_transform_loop(_loop_vec_info*, gimple*)
        /home/mjires/git/GCC/master/gcc/tree-vect-loop.cc:11770
0x12ee5a1 vect_transform_loops
        /home/mjires/git/GCC/master/gcc/tree-vectorizer.cc:1006
0x12eebac try_vectorize_loop_1
        /home/mjires/git/GCC/master/gcc/tree-vectorizer.cc:1152
0x12eebac try_vectorize_loop
        /home/mjires/git/GCC/master/gcc/tree-vectorizer.cc:1182
0x12eef34 execute
        /home/mjires/git/GCC/master/gcc/tree-vectorizer.cc:1298
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.

$ gcc -v
Using built-in specs.
COLLECT_GCC=/home/mjires/built/master/bin/gcc
COLLECT_LTO_WRAPPER=/home/mjires/built/master/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/mjires/git/GCC/master/configure --prefix=/home/mjires/built/master --disable-bootstrap --enable-languages=c,c++,fortran,lto --disable-multilib --disable-libsanitizer --enable-checking
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231109 (experimental) (GCC)
[Bug target/97503] Suboptimal use of cntlzw and cntlzd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97503 --- Comment #6 from Uroš Bizjak --- (In reply to LIU Hao from comment #4) > Are there any reasons why this was not done for 64? > (https://gcc.godbolt.org/z/7vddPdxaP) There is zero-extension from the result of __builtin_clzll that confuses optimizers.
[Bug target/97503] Suboptimal use of cntlzw and cntlzd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97503 --- Comment #7 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #6) > (In reply to LIU Hao from comment #4) > > Are there any reasons why this was not done for 64? > > (https://gcc.godbolt.org/z/7vddPdxaP) > > There is zero-extension from the result of __builtin_clzll that confuses > optimizers. Actually, sign-extension, but the result is never sign-extended.
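A minimal sketch of the extension discussed above, assuming a 64-bit target and GCC's `__builtin_clzll`. The function names are hypothetical and only illustrate why the widening is harmless: the builtin returns an `int` in the range 0..63 for non-zero input, so sign-extending and zero-extending it to 64 bits give the same value.

```c
#include <assert.h>

/* Widen the int result through a signed 64-bit type (sign-extension).
   The builtin is undefined for x == 0, so callers must pass x != 0. */
static unsigned long long clz_sign_extended(unsigned long long x)
{
    int n = __builtin_clzll(x);
    return (unsigned long long)(long long)n;
}

/* Widen the same result through an unsigned type (zero-extension). */
static unsigned long long clz_zero_extended(unsigned long long x)
{
    unsigned int n = (unsigned int)__builtin_clzll(x);
    return (unsigned long long)n;
}
```

Both helpers agree for every non-zero input, which is the property an optimizer would need to prove in order to drop the extension.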
[Bug libstdc++/112348] [C++23] defect in struct hash<basic_stacktrace<_Allocator>>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112348 --- Comment #1 from vincenzo Innocente --- This patch works for me diff --git a/libstdc++-v3/include/std/stacktrace b/libstdc++-v3/include/std/stacktrace index da0e48d3532..9a0d0b16068 100644 --- a/libstdc++-v3/include/std/stacktrace +++ b/libstdc++-v3/include/std/stacktrace @@ -797,7 +797,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION size_t operator()(const basic_stacktrace<_Allocator>& __st) const noexcept { - hash __h; + hash __h; size_t __val = _Hash_impl::hash(__st.size()); for (const auto& __f : __st) __val = _Hash_impl::__hash_combine(__h(__f), __val);
[Bug libgcc/112465] New: libgcc: aarch64: lse runtime does not work with big data segments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112465 Bug ID: 112465 Summary: libgcc: aarch64: lse runtime does not work with big data segments Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libgcc Assignee: unassigned at gcc dot gnu.org Reporter: jemarch at gcc dot gnu.org Target Milestone: --- While compiling and linking the STREAM benchmark (http://www.cs.virginia.edu/stream/ref.html) on aarch64 with very big arrays, this happens: $ gcc -O2 -DSTREAM_ARRAY_SIZE=178956970 -mcmodel=large -fopenmp -o stream.4gb stream.c libgcc.a(lse-init.o): in function `init_have_lse_atomics': (.text.startup+0x14): relocation truncated to fit: R_AARCH64_ADR_PREL_PG_HI21 against `.bss' libgcc.a(ldadd_4_1.o): in function `__aarch64_ldadd4_relax': (.text+0x4): relocation truncated to fit: R_AARCH64_ADR_PREL_PG_HI21 against symbol `__aarch64_have_lse_atomics' defined in .bss section in collect2: error: ld returned 1 exit status The LSE machinery in libgcc relies on the fact that the global __aarch64_have_lse_atomics is reachable within 4GiB. This is due to code like this: .macro JUMP_IF_NOT_LSE label adrp x(tmp0), __aarch64_have_lse_atomics ldrb w(tmp0), [x(tmp0), :lo12:__aarch64_have_lse_atomics] cbz w(tmp0), \label .endm That macro is put in the prologue of all LSE routines in libgcc (such as __aarch64_ldadd4_relax in the little reproducer below), and the same addressing is used in the initialization routine, also part of libgcc: static void __attribute__((constructor (90))) init_have_lse_atomics (void) { unsigned long hwcap = __getauxval (AT_HWCAP); __aarch64_have_lse_atomics = (hwcap & HWCAP_ATOMICS) != 0; } The code compiled for the last assignment in that function also makes use of an instruction sequence using adrp. The addressing mode implemented by adrp+ldrb can only reach +-4GiB.
In the stream.c benchmark, and also in this little reproducer: static int foo; static double a[178956970],b[178956970],c[178956970]; int main () { #pragma omp atomic foo++; return foo + a[0] + b[0] + c[0]; } The variables a, b and c get allocated as bss. Now, it happens that __aarch64_have_lse_atomics also goes to the bss: /* Define the symbol gating the LSE implementations. */ _Bool __aarch64_have_lse_atomics __attribute__((visibility("hidden"), nocommon)); But _after_ a, b and c. So it is the offset of __aarch64_have_lse_atomics within the bss that is overflowing the relocation for the adrp instruction.
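The size arithmetic behind the overflow can be sketched as follows. This is only an illustration under stated assumptions: `double` is taken to be 8 bytes, the three arrays are assumed to be laid out back to back in .bss ahead of the flag, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Element count used in the report (-DSTREAM_ARRAY_SIZE=178956970). */
#define SKETCH_ARRAY_ELEMS 178956970ULL

/* Assumed size of one double on aarch64. */
#define SKETCH_DOUBLE_BYTES 8ULL

/* Reach of adrp-based addressing as described in the report: 4GiB. */
#define SKETCH_ADRP_REACH (1ULL << 32)

/* Total bytes occupied by the arrays a, b and c. */
static uint64_t arrays_bytes(void)
{
    return 3ULL * SKETCH_ARRAY_ELEMS * SKETCH_DOUBLE_BYTES;
}

/* Headroom left below the 4GiB limit once the arrays are accounted for. */
static uint64_t headroom_bytes(void)
{
    return SKETCH_ADRP_REACH - arrays_bytes();
}
```

Under these assumptions the three arrays take 4294967280 bytes, which leaves 16 bytes of headroom below 4GiB, so a symbol placed after them is very likely out of reach of code that sits before the start of .bss.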
[Bug target/112465] libgcc: aarch64: lse runtime does not work with big data segments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112465 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-11-09 Ever confirmed|0 |1 Component|libgcc |target Keywords||link-failure Target||aarch64 --- Comment #1 from Andrew Pinski --- Confirmed.
[Bug tree-optimization/112464] [14 Regression] ICE avx512 with -ftrapv since r14-5076
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112464 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |14.0
[Bug c++/89867] internal compiler error: in layout_type, at stor-layout.c:2578
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89867 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org --- Comment #7 from Marek Polacek --- I hit this ICE with: int f (auto(__attribute__((unused)) i));
[Bug tree-optimization/112464] [14 Regression] ICE avx512 with -ftrapv since r14-5076
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112464 --- Comment #1 from Robin Dapp --- We fail at: void vect_finish_replace_stmt (vec_info *vinfo, stmt_vec_info stmt_info, gimple *vec_stmt) { gimple *scalar_stmt = vect_orig_stmt (stmt_info)->stmt; gcc_assert (gimple_get_lhs (scalar_stmt) == gimple_get_lhs (vec_stmt)); where scalar_stmt = _ifc__40 = .COND_ADD (_22, err_lsm.9_10, 1, err_lsm.9_10); and patt_7 = stmp_patt_7.23_123 + stmp_patt_7.23_124; It happens when we expand the reduction into separate unconditional statements.
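For readers unfamiliar with the internal function in that statement, here is a scalar model of what `.COND_ADD (cond, a, b, else)` computes, applied to the reduction from the testcase. It is a sketch of the semantics only, not vectorizer code, and the function names are hypothetical.

```c
#include <assert.h>

/* Scalar model of .COND_ADD (cond, a, b, els): a + b when cond holds,
   otherwise the fallback value els. */
static int cond_add_model(int cond, int a, int b, int els)
{
    return cond ? a + b : els;
}

/* The loop from the reported testcase, with the conditional increment
   written in the if-converted form quoted above. */
static int count_nonzero(const long *e, int n)
{
    int err = 0;
    for (int i = 0; i < n; i++)
        err = cond_add_model(e[i] != 0, err, 1, err);
    return err;
}
```

The assertion quoted above compares the left-hand side of this conditional statement with the left-hand side of the unconditional add that replaces it.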
[Bug fortran/112459] gfortran -w option causes derived-type finalization at creation time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112459 anlauf at gcc dot gnu.org changed: What|Removed |Added CC||pault at gcc dot gnu.org --- Comment #1 from anlauf at gcc dot gnu.org --- When I compile the code with -std=f2008, I get: pr112459.f90:28:13: 28 | allocate(c) | 1 Warning: The structure constructor at (1) has been finalized. This feature was removed by f08/0011. Use -std=f2018 or -std=gnu to eliminate the finalization. It behaves as you expect if I specify -std=gnu or -std=f2018. Trying several combinations, it appears the following variants work: -std=gnu -std=f2018 -std=f2018 -w and these "fail": -w -std=f2008 -std=f2008 -w -std=gnu -w Note that default is -std=gnu . Now I wonder how -w interferes with -std=gnu ...
[Bug c/112449] Arithmetic operations can produce signaling NaNs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449 --- Comment #9 from joseph at codesourcery dot com --- To quote the C23 DIS, "This annex does not require the full support for signaling NaNs specified in IEC 60559. This annex uses the term NaN, unless explicitly qualified, to denote quiet NaNs.". Support for signaling NaNs is indicated by FE_SNANS_ALWAYS_SIGNAL in <fenv.h>, which glibc makes sure to define only if __SUPPORT_SNAN__ (which is defined by GCC if -fsignaling-nans). If -fsignaling-nans is not used, you should not expect consistency in whether a signaling NaN is handled differently from a quiet NaN (including whether optimizations might be applied that result in a signaling NaN result from an operation that can't produce such a result with IEEE signaling NaN semantics).
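A small sketch of how a program could test for the two macros mentioned above at compile time. The helper names are hypothetical. Whether either macro is defined depends on the compiler flags and the C library in use, so no particular result is implied.

```c
#include <assert.h>
#include <fenv.h>

/* Returns 1 if the C library advertises signaling NaN support through
   FE_SNANS_ALWAYS_SIGNAL, 0 otherwise. */
static int snan_support_advertised(void)
{
#ifdef FE_SNANS_ALWAYS_SIGNAL
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the compiler defines __SUPPORT_SNAN__ (GCC does so when
   -fsignaling-nans is given), 0 otherwise. */
static int compiler_snan_enabled(void)
{
#ifdef __SUPPORT_SNAN__
    return 1;
#else
    return 0;
#endif
}
```

Code that depends on signaling NaN behaviour can check these before relying on it, and should otherwise treat the distinction between signaling and quiet NaNs as unspecified.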
[Bug target/112454] csinc (csel is though) is not being used when there is matches twice
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112454 --- Comment #1 from Andrew Pinski --- here is another testcase which shows the issue with pulling the constant one out of the loop when it could have been merged with the csel to use csinc: ``` int f(int *a, int n, int *b, int d) { for(int i = 0; i < n; i++) b[i] = a[i] == 100 ? 1 : d; return 0; } ```
[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415 --- Comment #29 from John David Anglin --- The miscompilation is in compiler_visit_expr: (gdb) r The program being debugged has been started already. Start it from the beginning? (y or n) y Starting program: /home/dave/debian/python3.11/python3.11-3.11.6/build-static/Programs/_freeze_module importlib._bootstrap ../Lib/importlib/_bootstrap.py Python/frozen_modules/importlib._bootstrap.h warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available. Breakpoint 2, compiler_jump_if (c=0xf8f02508, e=0x5763f8, next=0xfaeaa908, cond=0) at ../Python/compile.c:2898 2898{ (gdb) watch *0xfaea51b8 Watchpoint 3: *0xfaea51b8 (gdb) c Continuing. Watchpoint 3: *0xfaea51b8 Old value = -85046408 New value = 43 0x0019c688 in compiler_visit_expr (e=0x576308, c=0xf8f02508) at ../Python/compile.c:5968 5968SET_LOC(c, e); (gdb) bt #0 0x0019c688 in compiler_visit_expr (e=0x576308, c=0xf8f02508) at ../Python/compile.c:5968 #1 compiler_call_helper (c=0xf8f02508, n=0, args=, keywords=0x0) at ../Python/compile.c:5138 #2 0x0019ec70 in compiler_visit_expr (e=, c=0xf8f02508) at ../Python/compile.c:5969 #3 compiler_jump_if (c=0xf8f02508, e=, next=0x0, cond=) at ../Python/compile.c:2988 #4 0x001a0770 in compiler_if (s=0x0, c=0x5763c0) at ../Python/compile.c:3090 #5 compiler_visit_stmt (c=0x5763c0, s=0x0) at ../Python/compile.c:4118 #6 0x001a1378 in compiler_for (s=0x0, c=0x5763c0) at ../Python/compile.c:3124 #7 compiler_visit_stmt (c=0x5763c0, s=0x0) at ../Python/compile.c:4114 #8 0x001a3170 in compiler_function (c=0x2, s=, is_async=) at ../Python/compile.c:2670 #9 0x001a3438 in compiler_body (c=0x0, stmts=0x5763c0) at ../Python/compile.c:2180 #10 0x001a5cdc in compiler_mod (mod=0x0, c=0xf8f02528) at ../Python/compile.c:2197 #11 _PyAST_Compile (mod=0x0, filename=0xf8f02528, flags=, optimize=, arena=) at ../Python/compile.c:581 #12 0x001dea00 in Py_CompileStringObject (optimize=0, flags=0x5763c0, 
start=0, filename=0x2, str=0x0) at ../Python/pythonrun.c:1799 #13 Py_CompileStringExFlags (str=0x0, filename_str=, start=0, --Type for more, q to quit, c to continue without paging-- flags=0x5763c0, optimize=) at ../Python/pythonrun.c:1812 #14 0x000167a4 in compile_and_marshal (text=0x0, name=0x2 ) at ../Programs/_freeze_module.c:125 #15 main (argc=0, argv=) at ../Programs/_freeze_module.c:230 (gdb) diass $pc-16,$pc+16 Undefined command: "diass". Try "help". (gdb) disass $pc-16,$pc+16 Dump of assembler code from 0x19c678 to 0x19c698: 0x0019c678 : ldw 14(r25),ret1 0x0019c67c : ldw 18(r25),r31 0x0019c680 : ldw 1c(r25),ret0 0x0019c684 : stw r23,0(r22) => 0x0019c688 : stw ret1,0(r21) 0x0019c68c : stw r31,0(r20) 0x0019c690 : b,l 0x198d58 ,rp 0x0019c694 : stw ret0,0(r19) End of assembler dump. The code at 0x0019c688 clobbers the value at c->u->u_ste: (gdb) p/x $r21 $35 = 0xfaea51b8 (gdb) p/x *c $36 = {c_filename = 0xfaed9480, c_st = 0xfaeafd10, c_future = 0xfaef7030, c_flags = 0xf8f02544, c_optimize = 0x0, c_interactive = 0x0, c_nestlevel = 0x2, c_const_cache = 0xfae81280, u = 0xfaea51b8, c_stack = 0xfae57a88, c_arena = 0xfaec0c90} (gdb) p/x *c->u $37 = {u_ste = 0x2b, u_name = 0xfae7ff80, u_qualname = 0xfae7ff80, u_scope_type = 0x2, u_consts = 0xfaeaa7f8, u_names = 0xfaeaa7d0, u_varnames = 0xfaeaa780, u_cellvars = 0xfaeaa7a8, u_freevars = 0xfaeaa758, u_private = 0x0, u_argcount = 0x2, u_posonlyargcount = 0x0, u_kwonlyargcount = 0x0, u_blocks = 0xfaeaa908, u_curblock = 0xfaeaa868, u_nfblocks = 0x1, u_fblock = {{fb_type = 0x1, fb_block = 0xfaeaa840, fb_exit = 0xfaeaa8b8, fb_datum = 0x0}, {fb_type = 0x0, fb_block = 0x0, fb_exit = 0x0, fb_datum = 0x0} }, u_firstlineno = 0x28, u_lineno = 0x2b, u_col_offset = 0xb, u_end_lineno = 0x2b, u_end_col_offset = 0x20, u_need_new_implicit_block = 0x0} (gdb) p/x $r23 $38 = 0x2b #define SET_LOC(c, x) \ (c)->u->u_lineno = (x)->lineno; \ (c)->u->u_col_offset = (x)->col_offset; \ (c)->u->u_end_lineno = (x)->end_lineno; \ 
(c)->u->u_end_col_offset = (x)->end_col_offset; (gdb) p/x *e $40 = {kind = 0x18, v = {BoolOp = {op = 0xfaeb8b60, values = 0x1}, NamedExpr = {target = 0xfaeb8b60, value = 0x1}, BinOp = { left = 0xfaeb8b60, op = 0x1, right = 0x0}, UnaryOp = {op = 0xfaeb8b60, operand = 0x1}, Lambda = {args = 0xfaeb8b60, body = 0x1}, IfExp = { test = 0xfaeb8b60, body = 0x1, orelse = 0x0}, Dict = {keys = 0xfaeb8b60, values = 0x1}, Set = {elts = 0xfaeb8b60}, ListComp = {elt = 0xfaeb8b60, generators = 0x1}, SetComp = {elt = 0xfaeb8b60, generators = 0x1}, DictComp = {key = 0xfaeb8b60, value = 0x1, generators = 0x0}, GeneratorExp = {elt = 0xfaeb8b60, generators = 0x1}, Await = { value = 0xfaeb8b60}, Yield = {value = 0xfaeb8b60}, YieldFrom = {
[Bug modula2/110779] SysClock can not read the clock
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110779 Gaius Mulley changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #19 from Gaius Mulley --- Many thanks for spotting this error - will fix!
[Bug target/112465] libgcc: aarch64: lse runtime does not work with big data segments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112465 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #2 from Wilco --- -mcmodel=large is not well supported in general (no support for PIC/PIE, not well optimized or tested). The newly designed medium model will be far better, but until that is implemented it is best to use -mcpu=native and only use -mcmodel=large if there is no other option.
[Bug fortran/112459] gfortran -w option causes derived-type finalization at creation time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112459 Paul Thomas changed: What|Removed |Added Last reconfirmed||2023-11-09 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Blocks||37336 --- Comment #2 from Paul Thomas --- It has been on my TODO list to partially revert the finalization of structure and array constructors so that gfortran behaves like nagfor and only "knows" about F2018. Finalization of constructors per se was removed in F2018 for very good reasons! I am still working on the CLASS variants of PR99065. Perhaps I should put that on one side and clean up finalization? @Sebastien - I will have to see what the -w switch does and why it should affect the code behaviour. I can confirm the behaviour that you describe, as Harald has already done. The F2003 and F2008 specific parts are blocked by: if (!gfc_notification_std (GFC_STD_F2018_DEL) && Evidently switching off the warnings with -w causes this to return true! Cheers Paul Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37336 [Bug 37336] [F03] Finish derived-type finalization
[Bug rtl-optimization/110215] RA fails to allocate register when loop invariant lives across calls and eh
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110215 --- Comment #7 from CVS Commits --- The master branch has been updated by Vladimir Makarov : https://gcc.gnu.org/g:a99f6bb142bc4506dcb8aa2b7722310ad92e4528 commit r14-5294-ga99f6bb142bc4506dcb8aa2b7722310ad92e4528 Author: Vladimir N. Makarov Date: Thu Nov 9 08:51:15 2023 -0500 [IRA]: Fixing conflict calculation from region landing pads. The following patch fixes conflict calculation from exception landing pads. The previous patch processed only one newly created landing pad. Besides it was wrong, it also resulted in large memory consumption by IRA. gcc/ChangeLog: PR rtl-optimization/110215 * ira-lives.cc: (add_conflict_from_region_landing_pads): New function. (process_bb_node_lives): Use it.
[Bug target/112426] sched1 pessimizes codegen on aarch64 by increasing register pressure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112426 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #4 from Wilco --- That first REG_DEAD note after scheduling looks wrong: 15: x0:DI=r93:DI+0x10 REG_DEAD r93:DI 8: [r93:DI]=r98:DI REG_DEAD r98:DI 9: [r93:DI+0x8]=r99:DI
[Bug target/112426] sched1 pessimizes codegen on aarch64 by increasing register pressure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112426 --- Comment #5 from Andrew Pinski --- (In reply to Wilco from comment #4) > That first REG_DEAD note after scheduling looks wrong: > >15: x0:DI=r93:DI+0x10 > REG_DEAD r93:DI > 8: [r93:DI]=r98:DI > REG_DEAD r98:DI > 9: [r93:DI+0x8]=r99:DI IIRC REG_DEADs are updated via df before IRA so they can be ignored here.
[Bug target/112103] [14 regression] gcc.target/powerpc/rlwinm-0.c fails after r14-4941-gd1bb9569d70304
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112103 --- Comment #1 from Segher Boessenkool --- Those are: $ diff -up rlwinm-0.s{.12,} --- rlwinm-0.s.12 2023-11-09 18:28:49.362639203 + +++ rlwinm-0.s 2023-11-09 18:30:46.422896735 + @@ -6747,7 +6747,7 @@ f_1_16_31: .LFB345: .cfi_startproc rlwinm 3,3,1,16,31 - rlwinm 3,3,0,0x + rldicl 3,3,0,32 blr .long 0 .byte 0,0,0,0,0,0,0,0 @@ -7645,7 +7645,7 @@ f_1_24_31: .LFB390: .cfi_startproc rlwinm 3,3,1,24,31 - rlwinm 3,3,0,0xff + rldicl 3,3,0,32 blr .long 0 .byte 0,0,0,0,0,0,0,0 @@ -11235,7 +11235,7 @@ f_2_16_31: .LFB570: .cfi_startproc rlwinm 3,3,2,16,31 - rlwinm 3,3,0,0x + rldicl 3,3,0,32 blr .long 0 .byte 0,0,0,0,0,0,0,0 @@ -12133,7 +12133,7 @@ f_2_24_31: .LFB615: .cfi_startproc rlwinm 3,3,2,24,31 - rlwinm 3,3,0,0xff + rldicl 3,3,0,32 blr .long 0 .byte 0,0,0,0,0,0,0,0 @@ -15722,7 +15722,7 @@ f_7_16_31: .LFB795: .cfi_startproc rlwinm 3,3,7,16,31 - rlwinm 3,3,0,0x + rldicl 3,3,0,32 blr .long 0 .byte 0,0,0,0,0,0,0,0 @@ -16620,7 +16620,7 @@ f_7_24_31: .LFB840: .cfi_startproc rlwinm 3,3,7,24,31 - rlwinm 3,3,0,0xff + rldicl 3,3,0,32 blr .long 0 .byte 0,0,0,0,0,0,0,0 @@ -20207,7 +20207,7 @@ f_8_16_31: .LFB1020: .cfi_startproc rlwinm 3,3,8,16,31 - rlwinm 3,3,0,0x + rldicl 3,3,0,32 blr .long 0 .byte 0,0,0,0,0,0,0,0 @@ -24691,7 +24691,7 @@ f_9_16_31: .LFB1245: .cfi_startproc rlwinm 3,3,9,16,31 - rlwinm 3,3,0,0x + rldicl 3,3,0,32 blr .long 0 .byte 0,0,0,0,0,0,0,0 @@ -29174,7 +29174,7 @@ f_15_16_31: .LFB1470: .cfi_startproc rlwinm 3,3,15,16,31 - rlwinm 3,3,0,0x + rldicl 3,3,0,32 blr .long 0 .byte 0,0,0,0,0,0,0,0 @@ -67092,4 +67092,4 @@ f_31_31_31: .cfi_endproc .LFE3375: .size f_31_31_31,.-.L.f_31_31_31 - .ident "GCC: (GNU) 12.0.1 20220406 (experimental)" + .ident "GCC: (GNU) 14.0.0 20231103 (experimental)"
[Bug target/112426] sched1 pessimizes codegen on aarch64 by increasing register pressure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112426 --- Comment #6 from Andrew Pinski --- (In reply to Andrew Pinski from comment #5) > IIRC REG_DEADs are updated via df before IRA so they can be ignored here. Yes see ira in ira.cc: df_note_add_problem (); That will recompute the REG_DEAD (removing all old ones too).
[Bug target/112103] [14 regression] gcc.target/powerpc/rlwinm-0.c fails after r14-4941-gd1bb9569d70304
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112103 --- Comment #2 from Segher Boessenkool --- In all those cases the code is perfectly fine, but also in all of those cases the code is still suboptimal: the rldicl is just as superfluous as the second rlwinm was! :-)
[Bug jit/112466] New: Add support for getting the supported CPU features
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112466 Bug ID: 112466 Summary: Add support for getting the supported CPU features Product: gcc Version: 13.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: jit Assignee: dmalcolm at gcc dot gnu.org Reporter: bouanto at zoho dot com Target Milestone: --- I have an incoming patch for this that I'll post soon.
[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415 --- Comment #30 from John David Anglin --- 0x0019c684 <+588>: stw r23,0(r22) => 0x0019c688 <+592>: stw ret1,0(r21) 0x0019c68c <+596>: stw r31,0(r20) 0x0019c690 <+600>: b,l 0x198d58 ,rp 0x0019c694 <+604>: stw ret0,0(r19) These instructions are in a loop: /* No * or ** args, so can use faster calling sequence */ for (i = 0; i < nelts; i++) { expr_ty elt = asdl_seq_GET(args, i); assert(elt->kind != Starred_kind); VISIT(c, expr, elt); } r21 is clobbered by VISIT call. Value is okay in first iteration. The initialization instructions are outside the loop: 0x0019c638 <+512>: ldo 184(r19),r22 0x0019c63c <+516>: ldw 184(r19),r14 0x0019c640 <+520>: ldo 188(r19),r21 0x0019c644 <+524>: ldw 188(r19),r13 0x0019c648 <+528>: ldo 18c(r19),r20 0x0019c64c <+532>: ldw 18c(r19),r12 0x0019c650 <+536>: ldw 190(r19),r11
[Bug libstdc++/112467] New: [14 Regression] libstdc++ fails to build on clang: bits/stl_bvector.h:189:23: error: '__assume__' attribute cannot be applied to a statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112467 Bug ID: 112467 Summary: [14 Regression] libstdc++ fails to build on clang: bits/stl_bvector.h:189:23: error: '__assume__' attribute cannot be applied to a statement Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: slyfox at gcc dot gnu.org Target Milestone: --- The build failure was probably introduced by r14-5260-ge39b3e02c27bd7 and shows up on clang-16 as: // $ cat a.cc #include <vector> $ clang++ -c a.cc In file included from a.cc:1: In file included from /<>/gcc-14.0.0/include/c++/14.0.0/vector:67: /<>/gcc-14.0.0/include/c++/14.0.0/bits/stl_bvector.h:189:23: error: '__assume__' attribute cannot be applied to a statement __attribute__ ((__assume__ (__ofst < unsigned(_S_word_bit; ^~ 1 error generated. I think it happens because `clang` implements a different `assume` attribute than `gcc`: https://clang.llvm.org/docs/AttributeReference.html#assume
[Bug c++/96213] GCC doesn't complain about ill-formed non-dependent template default argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96213 Patrick Palka changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED CC||ppalka at gcc dot gnu.org Target Milestone|--- |13.0 --- Comment #3 from Patrick Palka --- Looks like we reject the first testcase since r13-6380, and the second since r11-434, so we can close this PR. Adding a new regression test doesn't seem necessary in either case since these are manifestations of more general issues (recognition of non-dependent variable template-ids and instantiation of non-dependent decltype) that are already captured by existing tests.
[Bug libstdc++/112467] [14 Regression] libstdc++ fails to build on clang: bits/stl_bvector.h:189:23: error: '__assume__' attribute cannot be applied to a statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112467 --- Comment #1 from Andrew Pinski --- GCC/GNUC's assume is the same as the way C++23 attribute is defined ... Looks like clang decided to implement an attribute assume which is totally different ...
[Bug libstdc++/112467] [14 Regression] libstdc++ fails to build on clang: bits/stl_bvector.h:189:23: error: '__assume__' attribute cannot be applied to a statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112467 --- Comment #2 from Sergei Trofimovich --- Filed a feature request on `clang` side to consider implementing it: https://github.com/llvm/llvm-project/issues/71858 Meanwhile would it be reasonable to enable the attribute only for `gcc`?
[Bug libstdc++/112467] [14 Regression] libstdc++ fails to build on clang: bits/stl_bvector.h:189:23: error: '__assume__' attribute cannot be applied to a statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112467 --- Comment #3 from Andrew Pinski --- (In reply to Sergei Trofimovich from comment #2) > Meanwhile would it be reasonable to enable the attribute only for `gcc`? Or rather for !__CLANG__ :).
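A hedged sketch of such a guard, written in C rather than as the actual libstdc++ change. Note that the macro clang predefines is spelled `__clang__` in lowercase. The `SKETCH_ASSUME` macro and the `bit_mask` helper are hypothetical names, and the version check assumes the statement form of the attribute is available from GCC 13 onwards.

```c
#include <assert.h>

/* Use the GNU assume attribute only on real GCC; clang also defines
   __GNUC__, so it has to be excluded explicitly. */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 13
# define SKETCH_ASSUME(expr) __attribute__((__assume__(expr)))
#else
# define SKETCH_ASSUME(expr) ((void)0)
#endif

/* Build a single-bit mask; the caller must pass an offset below 64. */
static unsigned long long bit_mask(unsigned int ofst)
{
    SKETCH_ASSUME(ofst < 64u);
    return 1ULL << ofst;
}
```

On compilers where the attribute is not used, the macro expands to a no-op, so the function behaves the same and only loses the optimization hint.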
[Bug c++/105996] [11 Regression] reinterpret_cast in constexpr failure creating a pair with a function pointer of class parent
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105996 Patrick Palka changed: What|Removed |Added See Also|https://gcc.gnu.org/bugzill | |a/show_bug.cgi?id=102637| CC||officesamurai at gmail dot com --- Comment #10 from Patrick Palka --- *** Bug 102637 has been marked as a duplicate of this bug. ***
[Bug c++/102637] "Error: ‘reinterpret_cast’ is not a constant expression" when no reinterpret_cast is involved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102637 Patrick Palka changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||ppalka at gcc dot gnu.org Resolution|--- |DUPLICATE See Also|https://gcc.gnu.org/bugzill | |a/show_bug.cgi?id=105996| --- Comment #2 from Patrick Palka --- looks like this is pretty much a dup of PR105996, which has been fixed for 10.5/11.5/12.3/13 *** This bug has been marked as a duplicate of bug 105996 ***
[Bug c++/101603] [meta-bug] pointer to member functions issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101603 Bug 101603 depends on bug 102637, which changed state. Bug 102637 Summary: "Error: ‘reinterpret_cast’ is not a constant expression" when no reinterpret_cast is involved https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102637 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE
[Bug tree-optimization/112468] New: [14 Regression] Missed phi-opt after recent change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112468 Bug ID: 112468 Summary: [14 Regression] Missed phi-opt after recent change Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- This change: commit 3f176e1adc6bc9cc2c21222d776b51d9f43cb66b (HEAD) Author: Tamar Christina Date: Thu Nov 9 13:59:39 2023 + middle-end: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154] This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more canonical and allows a target to expand this sequence efficiently. Such sequences are common in scientific code working with gradients. There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x)) which I remove since this is a less efficient form. The testsuite is also updated in light of this. gcc/ChangeLog: PR tree-optimization/109154 * match.pd: Add new neg+abs rule, remove inverse copysign rule. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.dg/fold-copysign-1.c: Updated. * gcc.dg/pr55152-2.c: Updated. * gcc.dg/tree-ssa/abs-4.c: Updated. * gcc.dg/tree-ssa/backprop-6.c: Updated. * gcc.dg/tree-ssa/copy-sign-2.c: Updated. * gcc.dg/tree-ssa/mult-abs-2.c: Updated. * gcc.target/aarch64/fneg-abs_1.c: New test. * gcc.target/aarch64/fneg-abs_2.c: New test. * gcc.target/aarch64/fneg-abs_3.c: New test. * gcc.target/aarch64/fneg-abs_4.c: New test. * gcc.target/aarch64/sve/fneg-abs_1.c: New test. * gcc.target/aarch64/sve/fneg-abs_2.c: New test. * gcc.target/aarch64/sve/fneg-abs_3.c: New test. * gcc.target/aarch64/sve/fneg-abs_4.c: New test. Is causing a testsuite regression on moxie-elf. This is a scan dump failure, so you don't need a full toolchain, just a cross compiler. moxie-sim: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"
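The rewrite described in that commit message can be illustrated at the source level. This is a sketch of the equivalence only, not of the match.pd rule itself, and the function names are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* The form the commit message calls fneg (fabs (x)). */
static double neg_abs(double x)
{
    return -fabs(x);
}

/* The form it is canonicalized to: copysign (x, -1). */
static double neg_abs_copysign(double x)
{
    return copysign(x, -1.0);
}
```

Both functions return the magnitude of `x` with the sign bit set, so they agree for positive, negative and zero inputs.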
[Bug tree-optimization/112468] [14 Regression] Missed phi-opt after recent change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112468 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |14.0
[Bug libstdc++/112453] <ranges>: __take_of_repeat_view/__drop_of_repeat_view should forwards __r._M_value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112453 --- Comment #2 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:d63282fa5b587f1b994210212f236b998a332995 commit r14-5297-gd63282fa5b587f1b994210212f236b998a332995 Author: Patrick Palka Date: Thu Nov 9 15:15:08 2023 -0500 libstdc++: Fix forwarding in __take/drop_of_repeat_view [PR112453] We need to respect the value category of the repeat_view passed to these two functions when accessing the view's _M_value member. This revealed that the space-efficient partial specialization of __box lacks && overloads of operator* to match those of the primary template (inherited from std::optional). PR libstdc++/112453 libstdc++-v3/ChangeLog: * include/std/ranges (__detail::__box<_Tp>::operator*): Define && overloads as well. (__detail::__take_of_repeat_view): Forward __r when accessing its _M_value member. (__detail::__drop_of_repeat_view): Likewise. * testsuite/std/ranges/repeat/1.cc (test07): New test. Reviewed-by: Jonathan Wakely
[Bug tree-optimization/112402] [11/12/13/14 Regression] Path splitting causes if-conversion miss
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112402 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org Last reconfirmed||2023-11-09 Ever confirmed|0 |1 --- Comment #3 from Andrew Pinski --- I have a patch which moves split paths to RTL like it should be. Waiting on legal though.
[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415 --- Comment #31 from Jeffrey A. Law --- IIRC r21 is call-clobbered. So I guess the question turns into what was the sequence before f-m-o got involved -- was it assuming r21 would be preserved, or did f-m-o make r21 live across the call?