[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #34 from Richard Biener ---
I can confirm this observation on Zen2. Note perf still records STLF failures for these cases it just seems that the penalties are well hidden with the high store load on the caller side for small NUM? I'm not sure how well CPUs handle OOO execution across calls here, but I'm guessing that for cray there's only dependent instructions on the STLF failing loads while for your test the result is always stored to memory. Not sure if there's a good way to add in a "serializing" instruction instead of a store - if we'd write vector code directly we'd return an accumulation vector and pass that as input to further iterations but that makes it difficult to compare to the scalar variant.

If we look at

double
NUM2
scalar: 2.67811
vec: 2.46635: no!!
vecn: 2.22982

then we do see a slight penalty to the case with successful STLF, but I suspect the main load of the test is the 9 vector stores in the caller. What's odd though is

NUM4
scalar: 3.19169
vec: 8.2489: penalty
vecn: 2.25086

we still have the "same" assembly in foo, just using %ymm instead of %xmm

I'll also note that foo2n vs foo2 access stores of different distance:

void __attribute__ ((noipa))
foo (TYPE* x, TYPE* y, TYPE* __restrict p)
{
  p[0] = x[0] + y[0];
  p[1] = x[1] + y[1];
}

vs.

void __attribute__ ((noipa))
foo (TYPE* x, TYPE* y, TYPE* __restrict p)
{
  p[0] = x[15] + y[15];
  p[1] = x[16] + y[16];
}

shouldn't the former access x[14] and x[15]?

Also on Zen2 using 512 byte vector stores in main() causes them to be decomposed to 128 byte vector stores, not in generic vector lowering which should choose 256 byte vector stores but during RTL expansion. So we have to avoid this, otherwise the vecn cases with larger vector sizes will fail to STLF as well.
With the two possible issues resolved I get

char
NUM2   scalar: 2.61746  vec: 6.99399  vecn: 2.17881
NUM4   scalar: 3.04455  vec: 5.6571   vecn: 2.17512
NUM8   scalar: 3.99576  vec: 5.64829  vecn: 2.18647
NUM16  scalar: 5.71159  vec: 5.70879  vecn: 2.222

short
NUM2   scalar: 2.63836  vec: 5.92917  vecn: 2.22295
NUM4   scalar: 3.07966  vec: 5.93041  vecn: 2.22694
NUM8   scalar: 4.14134  vec: 6.16279  vecn: 2.29287
NUM16  scalar: 5.96713  vec: 5.91371  vecn: 2.29854

int
NUM2   scalar: 2.74058  vec: 2.51288  vecn: 2.28018
NUM4   scalar: 3.22811  vec: 2.53454  vecn: 2.30637
NUM8   scalar: 4.14464  vec: 6.84145  vecn: 2.30211
NUM16  scalar: 5.97653  vec: 7.28825  vecn: 2.52693

int64_t
NUM2   scalar: 2.75497  vec: 2.51353  vecn: 2.29852
NUM4   scalar: 3.20552  vec: 8.02914  vecn: 2.28612
NUM8   scalar: 4.1486   vec: 8.40673  vecn: 2.54104
NUM16  scalar: 5.96569  vec: 8.03334  vecn: 2.98774

float
NUM2   scalar: 2.74666  vec: 2.53057  vecn: 2.29079
NUM4   scalar: 3.22499  vec: 2.52525  vecn: 2.29374
NUM8   scalar: 4.12471  vec: 7.33367  vecn: 2.30114
NUM16  scalar: 6.27016  vec: 7.78154  vecn: 2.53966

double
NUM2   scalar: 2.76049  vec: 2.52339  vecn: 2.31286
NUM4   scalar: 3.25052  vec: 8.09372  vecn: 2.31465
NUM8   scalar: 4.19226  vec: 8.90108  vecn: 2.56059
NUM16  scalar: 6.32366  vec: 8.22693  vecn: 3.00417

Note Zen2 has comparatively few entries in the store queue, 22 when SMT is enabled (the 44 are statically partitioned).

What I take away from this is that modern OOO archs do not benefit much from short sequences of low-lane vectorized code (here in particular NUM2) since there's a good chance there's enough resources to carry out the scalar variant in parallel.
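For readers who have not seen the benchmark, the access pattern being measured can be sketched roughly as follows. This is only an illustration under assumptions: `elem_t`, `kNum`, `add_adjacent` and `run_once` are made-up names standing in for the test's TYPE/NUM macros and foo, and `noinline` is used here where the original uses `noipa`. Whether a given compiler turns the callee's loads into a single vector load, and whether that load then fails to forward from the caller's narrower stores, depends on the target and options, so the sketch shows the shape of the code and not the penalty itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the benchmark's TYPE and NUM macros.
using elem_t = double;
constexpr std::size_t kNum = 2;

// Callee: reads kNum adjacent elements from each input. An SLP vectorizer
// may combine each group of adjacent loads into one wider vector load.
__attribute__((noinline))
void add_adjacent(const elem_t* x, const elem_t* y, elem_t* __restrict p) {
    for (std::size_t i = 0; i < kNum; ++i)
        p[i] = x[i] + y[i];
}

// Caller: fills the inputs with one store per element right before the call.
// If these stay narrow scalar stores and the callee uses a wide load, the
// load cannot be satisfied from a single store-queue entry.
elem_t run_once(elem_t a, elem_t b) {
    elem_t x[kNum], y[kNum], p[kNum];
    for (std::size_t i = 0; i < kNum; ++i) {
        x[i] = a + static_cast<elem_t>(i);
        y[i] = b + static_cast<elem_t>(i);
    }
    add_adjacent(x, y, p);
    elem_t sum = 0;
    for (std::size_t i = 0; i < kNum; ++i)
        sum += p[i];
    return sum;
}
```

Timing many calls of `run_once` with and without `-ftree-slp-vectorize` would be the analogue of the scalar and vec columns above.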
[Bug d/103528] [12 regression] d21 doesn't build on Solaris
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103528 --- Comment #13 from CVS Commits --- The master branch has been updated by Rainer Orth : https://gcc.gnu.org/g:1375e2b62332351a8f9c928421cd1ea8b53c5127 commit r12-7610-g1375e2b62332351a8f9c928421cd1ea8b53c5127 Author: Rainer Orth Date: Fri Mar 11 09:37:44 2022 +0100 libphobos: Enable on Solaris/SPARC or with /bin/as [PR 103528] libphobos is currently only enabled on Solaris/x86 with gas. As discovered when gdc was switched to the dmd frontend, this initially broke bootstrap for the other Solaris configurations. However, it's now well possible to enable it both for Solaris/x86 with as and Solaris/SPARC (both as and gas) since the original problems (x86 as linelength limit, among others) are long gone. The following patch does just that. Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11 (both as and gas) with gdc 9.3.0 (x86) resp. 9.4.0 (sparc, configured with --enable-libphobos) as bootstrap compilers. 2021-12-01 Rainer Orth libphobos: PR d/103528 * configure.ac : Remove gas requirement. * configure: Regenerate. * configure.tgt (sparc*-*-solaris2.11*): Mark supported.
[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #35 from Hongtao.liu ---
(In reply to Richard Biener from comment #34)
> I can confirm this observation on Zen2. Note perf still records STLF
> failures

The penalty is much higher on Znver3 than on Zen2 for the same case (v2df).
[Bug testsuite/104732] gcc.target/i386/pr100711-1.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104732 --- Comment #7 from ro at CeBiTec dot Uni-Bielefeld.DE --- > --- Comment #6 from Roger Sayle --- > This should now be fixed on mainline. Rainer please let me know if you notice > any remaining issues on solaris/x86. I've now run bootstraps with and without your patch: no regressions apart from the usual flakey tests. Thanks!
[Bug c++/104872] Memory corruption in Coroutine with POD type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104872

--- Comment #1 from Benjamin Buch ---
More minimal version:

https://godbolt.org/z/aEv13e38a

```cpp
#include <coroutine>
#include <iostream>
#include <string>
#include <string_view>

using namespace std::literals;

class logging_string{
public:
    logging_string(std::string_view text) :text_(text) {
        std::cout << " view: " << this << " " << text_ << std::endl;
    }

    logging_string(logging_string&& other) {
        std::cout << " move: " << this << " <= " << &other << " new <= " << other.text_ << std::endl;
        text_ = std::move(other.text_);
    }

    ~logging_string(){
        std::cout << " destruct: " << this << " " << text_ << std::endl;
    }

    logging_string& operator=(logging_string&& other){
        std::cout << "move-assign: " << this << " <= " << &other << " " << text_ << " <= " << other.text_ << std::endl;
        text_ = std::move(other.text_);
        return *this;
    }

private:
    std::string text_;
};

struct wrapper{
    // wrapper() = default;
    // wrapper(std::string_view text) :filename(text) {}
    // wrapper(wrapper&&) = default;
    // wrapper& operator=(wrapper&&) = default;
    // ~wrapper() = default;

    logging_string filename;
};

struct generator{
    struct promise_type;
    using handle_type = std::coroutine_handle<promise_type>;

    struct promise_type{
        wrapper value{"default"sv};

        generator get_return_object(){
            return handle_type::from_promise(*this);
        }

        std::suspend_always initial_suspend(){ return {}; }
        std::suspend_always final_suspend()noexcept{ return {}; }
        void unhandled_exception(){}

        std::suspend_always yield_value(wrapper&& new_value){
            value = std::move(new_value);
            return {};
        }
    };

    generator(handle_type h) : handle(h) {}
    ~generator(){ handle.destroy(); }

    handle_type handle;
};

static generator generate(){
    co_yield {"generate"sv};
}

int main(){
    auto gen = generate();
    gen.handle();
}
```
[Bug c++/104872] Memory corruption in Coroutine with POD type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104872

--- Comment #2 from Benjamin Buch ---
To work around it, it is enough to define the wrapper constructor so that it builds a string.

```cpp
wrapper(std::string text): filename(std::move(text)) {}
```

https://godbolt.org/z/9za7hfjs8

```cpp
#include <coroutine>
#include <iostream>
#include <string>

using namespace std::literals;

class logging_string{
public:
    logging_string(std::string text): text_(std::move(text)) {
        std::cout << " view: " << this << " " << text_ << std::endl;
    }

    logging_string(logging_string&& other) {
        std::cout << " move: " << this << " <= " << &other << " new <= " << other.text_ << std::endl;
        text_ = std::move(other.text_);
    }

    ~logging_string(){
        std::cout << " destruct: " << this << " " << text_ << std::endl;
    }

    logging_string& operator=(logging_string&& other){
        std::cout << "move-assign: " << this << " <= " << &other << " " << text_ << " <= " << other.text_ << std::endl;
        text_ = std::move(other.text_);
        return *this;
    }

private:
    std::string text_;
};

struct wrapper{
    // wrapper(std::string text): filename(std::move(text)) {}

    logging_string filename;
};

struct generator{
    struct promise_type;
    using handle_type = std::coroutine_handle<promise_type>;

    struct promise_type{
        wrapper value{"default"s};

        generator get_return_object(){
            return handle_type::from_promise(*this);
        }

        std::suspend_always initial_suspend(){ return {}; }
        std::suspend_always final_suspend()noexcept{ return {}; }
        void unhandled_exception(){}

        std::suspend_always yield_value(wrapper&& new_value){
            value = std::move(new_value);
            return {};
        }
    };

    generator(handle_type h) : handle(h) {}
    ~generator(){ handle.destroy(); }

    handle_type handle;
};

static generator generate(){
    co_yield {"generate"s};
}

int main(){
    auto gen = generate();
    gen.handle();
}
```
[Bug target/104879] New: [nvptx] Use .common directive (available starting ptx isa version 5.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104879 Bug ID: 104879 Summary: [nvptx] Use .common directive (available starting ptx isa version 5.0) Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: vries at gcc dot gnu.org Target Milestone: --- We currently have in the nvptx port in nvptx_option_override: ... /* Set flag_no_common, unless explicitly disabled. We fake common using .weak, and that's not entirely accurate, so avoid it unless forced. */ if (!OPTION_SET_P (flag_no_common)) flag_no_common = 1; ... and in nvptx_output_aligned_decl: ... /* If this is public, it is common. The nearest thing we have to common is weak. */ fprintf (file, "\t%s", TREE_PUBLIC (decl) ? ".weak " : ""); ... [ There's also some optimisation note related to .common: ... /* Buffer needed to broadcast across workers and vectors. This is used for both worker-neutering and worker broadcasting, and vector-neutering and boardcasting when vector_length > 32. It is shared by all functions emitted. The buffer is placed in shared memory. It'd be nice if PTX supported common blocks, because then this could be shared across TUs (taking the largest size). */ ... but I'm not sure whether this is safe, perhaps this would require a mutex to make it safe. ] Starting with ptx isa 5.0, we have the .common directive available in ptx. However, it "can be used only on variables with .global storage", so it's somewhat limited.
[Bug rtl-optimization/104154] [12 Regression] Another ICE due to recent ifcvt changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #11 from Jakub Jelinek --- Assuming fixed. If not, please reopen. That said, arc is neither primary nor secondary, so it shouldn't be P1 but P4.
[Bug target/104335] [12 regression] build failure if go is included in languages after r12-6747
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104335 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #11 from Jakub Jelinek --- Is this fixed?
[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #36 from Richard Biener ---
As additional observation for the c-ray case we end up with

  [local count: 1073741824]:
  vect_ray_orig_x_87.270_173 = MEM [(double *)&ray];
  _170 = BIT_FIELD_REF ;
  _171 = BIT_FIELD_REF ;
  # DEBUG D#93 => ray.orig.x
  # DEBUG ray$orig$x => D#93
  # DEBUG D#92 => ray.orig.y
  # DEBUG ray$orig$y => D#92
  ray$orig$z_89 = ray.orig.z;
  # DEBUG ray$orig$z => ray$orig$z_89
  vect_ray_dir_x_90.266_178 = MEM [(double *)&ray + 24B];
  _175 = BIT_FIELD_REF ;
  _176 = BIT_FIELD_REF ;

so we load as vector but will need both lanes for scalar code pieces we couldn't vectorize (live lanes). It's somewhat difficult to reverse the vectorization decision at that point - we need the final idea on what stmts we vectorize to compute live lanes and we need to know which operands are vectorized to tell whether we can vectorize a stmt.

But at least for loads we eventually could use scalar loads and a CTOR "late". There's also code in GIMPLE forwprop that can decompose vector loads feeding BIT_FIELD_REFs but it only does that if there's no other use of the vector (in this case of course there is - a single for the first and two for the second).

There is not much value in the vectorization we do in this function (when manually fixing the STLF issue the speed is as good as with the scalar code). We cost

ray.dir.x 1 times scalar_load costs 12 in body
ray.dir.y 1 times scalar_load costs 12 in body

vs.

ray.dir.x 1 times unaligned_load (misalign -1) costs 12 in body
ray.dir.x 1 times vec_to_scalar costs 4 in epilogue
ray.dir.y 1 times vec_to_scalar costs 4 in epilogue

which is probably OK, with SSE it's two loads vs one load + move + unpck, with AVX we can elide the move (but a move is free), the disadvantage of the vector load is the higher latency on the high part (plus of course the STLF hit).

Since the vectorizer doesn't prune individual stmts because of costs but only throws away the whole opportunity if the overall cost doesn't seem profitable it's difficult to optimally handle this on the costing side I think. Instead the vectorizer should somehow be directed to use scalar loads + vector construction if likely STLF fails are detected.

For example the following mitigates the issue for c-ray without resorting to "late" adjustments via costs but instead by changing the vectorization strategy for possibly affected loads using target independent and likely flawed heuristics. A full exercise of the cumulative-args machinery might be able to tell how (parts) of a PARM_DECL are passed. Whether the caller will end up using wider moves with %xmm remains a guess of course. What's also completely missing is an idea how far from function entry this vectorization happens - for c-ray it would be enough to restrict this to loads in BB 2 for example.

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 5c9e8cfefa5..4f07e5ddc61 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2197,7 +2197,24 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
   /* Stores can't yet have gaps.  */
   gcc_assert (slp_node || vls_type == VLS_LOAD || gap == 0);
 
-  if (slp_node)
+  if (!loop_vinfo
+      && vls_type == VLS_LOAD
+      && TREE_CODE (DR_BASE_ADDRESS (first_dr_info->dr)) == ADDR_EXPR
+      && (TREE_CODE (TREE_OPERAND (DR_BASE_ADDRESS (first_dr_info->dr), 0))
+          == PARM_DECL)
+      /* Assume that for a power of two number of elements the aggregate
+         move to the stack is using larger moves at the caller side.  */
+      && !pow2p_hwi (group_size))
+    {
+      /* When doing BB vectorizing force loads from function parameters
+         (???  that are passed in memory and stored in pieces likely
+         causing STLF failures) to be done elementwise.  */
+      /* ???  Note this will cause vectorization to fail because of
+         the fear of underestimating the cost of elementwise accesses,
+         see the end of get_load_store_type.  */
+      *memory_access_type = VMAT_ELEMENTWISE;
+    }
+  else if (slp_node)
     {
       /* For SLP vectorization we directly vectorize a subchain
          without permutation.  */
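
The condition in the patch can be read in isolation as a small decision function. The following is a standalone model of that decision only, not GCC code: `LoadGroup`, `AccessKind`, `is_pow2` and `choose_access` are hypothetical names, the boolean fields stand in for the `loop_vinfo`, `vls_type` and `PARM_DECL` checks, and `is_pow2` is only loosely modelled on the role `pow2p_hwi` plays in the patch (this version treats zero as not a power of two).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true for 1, 2, 4, 8, ...; false for 0 in this model.
constexpr bool is_pow2(std::uint64_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

// Hypothetical summary of the properties the patch inspects.
struct LoadGroup {
    bool in_loop;              // stands in for loop_vinfo != nullptr
    bool is_load;              // stands in for vls_type == VLS_LOAD
    bool base_is_parm;         // base address is &PARM_DECL
    std::uint64_t group_size;  // number of elements in the access group
};

enum class AccessKind { Contiguous, Elementwise };

// Force elementwise access for basic-block vectorized loads from a
// parameter when the group size is not a power of two; otherwise keep
// whatever contiguous strategy would have been chosen.
constexpr AccessKind choose_access(const LoadGroup& g) {
    if (!g.in_loop && g.is_load && g.base_is_parm && !is_pow2(g.group_size))
        return AccessKind::Elementwise;
    return AccessKind::Contiguous;
}
```

Under this model a three-element group loaded from a parameter in straight-line code is forced elementwise, while a four-element group is left alone on the assumption that the caller used wider moves.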
[Bug c++/104876] untranslated strings in diagnostic about failed mapper
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104876 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Jonathan Wakely --- dup *** This bug has been marked as a duplicate of bug 104709 ***
[Bug c++/104709] A translated error message will include untanslated parts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104709 Jonathan Wakely changed: What|Removed |Added CC||roland.illig at gmx dot de --- Comment #3 from Jonathan Wakely --- *** Bug 104876 has been marked as a duplicate of this bug. ***
[Bug c++/104877] missing standard gnu++20 in diagnostic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104877 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2022-03-11 Keywords||diagnostic See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=90183 --- Comment #1 from Jonathan Wakely --- Maybe we should avoid referring to specific -std options and just say "since C++20 or with -fconcepts" (see also PR 90183).
[Bug rtl-optimization/104869] [12 Regression] Miscompilation of qt5-qtdeclarative since r12-6342-ge7a7dbb5ca5dd696
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104869 Jakub Jelinek changed: What|Removed |Added Last reconfirmed||2022-03-11 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #2 from Jakub Jelinek --- Reduced self-contained testcase for the testsuite: // PR rtl-optimization/104869 // { dg-do run } // { dg-options "-O2 -fvisibility=hidden" } // { dg-require-visibility "" } struct QBasicAtomicInteger { [[gnu::noipa]] int loadRelaxed() { return 1; } }; struct RefCount { bool deref() { int count = atomic.loadRelaxed(); if (count) return false; return deref(); } QBasicAtomicInteger atomic; }; struct QArrayData { RefCount ref; }; struct QString { ~QString(); QArrayData d; }; int ok; QString::~QString() { d.ref.deref(); } struct Label { bool isValid() { return generator; } int *generator; int index; }; struct ControlFlow; struct Codegen { [[gnu::noipa]] bool visit(); ControlFlow *controlFlow; }; struct ControlFlow { enum UnwindType { EE }; struct UnwindTarget { Label linkLabel; }; ControlFlow *parent; UnwindType unwindTarget_type; UnwindTarget unwindTarget() { QString label; ControlFlow *flow = this; while (flow) { Label l = getUnwindTarget(unwindTarget_type, label); if (l.isValid()) return {l}; flow = flow->parent; } return UnwindTarget(); } [[gnu::noipa]] Label getUnwindTarget(UnwindType, QString &) { Label l = { &ok, 0 }; return l; } }; [[gnu::noipa]] void foo(int) { ok = 1; } [[gnu::noipa]] bool Codegen::visit() { if (!controlFlow) return false; ControlFlow::UnwindTarget target = controlFlow->unwindTarget(); if (target.linkLabel.isValid()) foo(2); return false; } int main() { ControlFlow cf = { nullptr, ControlFlow::UnwindType::EE }; Codegen c = { &cf }; c.visit(); if (!ok) __builtin_abort (); }
[Bug middle-end/104880] New: regression ICE in expand_expr_addr_expr_1, at expr.c:8231
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104880 Bug ID: 104880 Summary: regression ICE in expand_expr_addr_expr_1, at expr.c:8231 Product: gcc Version: 11.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: dimitar.yordanov at sap dot com Target Milestone: --- Hi, I hit an ICE with the following reduced example: cat > foo.cc << EOF class c { long b; }; class B { public: typedef void *d; }; class aa { public: aa(B::d, char *); }; class e : public B { public: e(); }; __uint128_t f; struct g { struct h : c { h(__uint128_t &i) : c(reinterpret_cast(i)) {} __uint128_t ad(); }; }; class n : g { public: n(int); void j() { __uint128_t a; h k(a); __atomic_compare_exchange_n(&f, &a, k.ad(), true, 3, 0); } }; int l; class m : e { void ar() { n b(l); b.j(); } virtual c bd() { aa(d(&m::ar), ""); } }; void o() { new m; } EOF g++ -mcx16 -O2 -c foo.cc during RTL pass: expand foo.cc: In member function ‘void m::ar()’: foo.cc:29:32: internal compiler error: in expand_expr_addr_expr_1, at expr.c:8231 29 | __atomic_compare_exchange_n(&f, &a, k.ad(), true, 3, 0); Regression appeared first with: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=eb72dc663e9070b281be83a80f6f838a3a878822 Best regards Dimitar
[Bug target/104208] -mlong-double-64 should override a previous -mabi=ibmlongdouble
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104208 --- Comment #13 from Florian Weimer --- Thanks, I can confirm that we can build the glibc test suite once more in Fedora rawhide. (Fedora only needs the GCC 12 fix, it's our first GCC version with the float128 default).
[Bug other/102664] contrib/gcc-git-customization.sh uses echo -n, which isn't portable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102664 --- Comment #36 from Richard Earnshaw --- Can this be closed now?
[Bug rtl-optimization/104869] [12 Regression] Miscompilation of qt5-qtdeclarative since r12-6342-ge7a7dbb5ca5dd696
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104869 --- Comment #3 from Jakub Jelinek --- I've tried to figure out where the former (GCC 10) all_uses_available_at checking has gone into (i.e. where we actually check that the propagation is possible, that something doesn't overwrite or clobber the src from the def_insn in between the def_insn and use_insn), but I can't find it, must be hidden somewhere. Richard, could you please have a look or at least hint at where to look for it?
[Bug other/102664] contrib/gcc-git-customization.sh uses echo -n, which isn't portable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102664 Martin Liška changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #37 from Martin Liška --- I guess so.
[Bug middle-end/104880] [11/12 Regression] ICE in expand_expr_addr_expr_1, at expr.c:8231 since r11-165-geb72dc663e9070
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104880 Martin Liška changed: What|Removed |Added Summary|regression ICE in |[11/12 Regression] ICE in |expand_expr_addr_expr_1, at |expand_expr_addr_expr_1, at |expr.c:8231 |expr.c:8231 since ||r11-165-geb72dc663e9070 Status|UNCONFIRMED |NEW Target Milestone|--- |11.3 Last reconfirmed||2022-03-11 CC||marxin at gcc dot gnu.org, ||rguenth at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Martin Liška --- Thanks for the report.
[Bug middle-end/104880] [11/12 Regression] ICE in expand_expr_addr_expr_1, at expr.c:8231 since r11-165-geb72dc663e9070
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104880 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Keywords||ice-on-valid-code --- Comment #2 from Richard Biener --- Interestingly it only fails with -fno-checking, with -fchecking it works fine. A diff -fchecking vs -fno-checking reveals --- a/t.cc.031t.einline 2022-03-11 13:42:58.925057781 +0100 +++ b/t.cc.031t.einline 2022-03-11 13:43:04.713135650 +0100 @@ -47,25 +47,25 @@ void n::j (struct n * const this) { struct h k(address-taken); - __int128 unsigned a(address-taken); + __int128 unsigned a; __int128 unsigned _1; __int128 unsigned _5; : k(address-taken) ={v} {CLOBBER}; - k(address-taken).D.2425 = MEM[(const struct c &)&a(address-taken)]; + k(address-taken).D.2425 = MEM[(const struct c &)&a]; _5 = g::h::ad (&k(address-taken)); : _1 = _5; - __atomic_compare_exchange_16 (&f(address-taken), &a(address-taken), _1, 1, 3, 0); - a(address-taken) ={v} {CLOBBER(eol)}; + __atomic_compare_exchange_16 (&f(address-taken), &a, _1, 1, 3, 0); + a ={v} {CLOBBER(eol)}; k(address-taken) ={v} {CLOBBER(eol)}; return; : : - a(address-taken) ={v} {CLOBBER(eol)}; + a ={v} {CLOBBER(eol)}; k(address-taken) ={v} {CLOBBER(eol)}; resx 1 where the -fno-checking case loses the TREE_ADDRESSABLE flag as part of early inlining. We have if (optimize_atomic_compare_exchange_p (stmt)) { /* For __atomic_compare_exchange_N if the second argument is &var, don't mark var addressable; if it becomes non-addressable, we'll rewrite it into ATOMIC_COMPARE_EXCHANGE call. */ tree arg = gimple_call_arg (stmt, 1); gimple_call_set_arg (stmt, 1, null_pointer_node); gimple_ior_addresses_taken (addresses_taken, stmt); gimple_call_set_arg (stmt, 1, arg); } but that rewriting doesn't actually happen, that's because 'a' has partial defs (but we cannot rely on that here). So this optimization looks premature. 
What restores the TREE_ADDRESSABLE bit with -fchecking is verify_ssa_operands which also fails to diagnose a missing bit (and to restore the non-set status here).
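
As background for why the second argument matters here: `__atomic_compare_exchange_n` takes the expected value by address and writes the current value back through that pointer when the comparison fails. A minimal sketch, assuming a plain `int` instead of the `__uint128_t` from the report (so no `-mcx16` is needed) and using the strong variant to keep the behaviour deterministic; `try_update` is a hypothetical wrapper name.

```cpp
#include <cassert>

// Returns true if *target equalled 'expected' and was replaced by 'desired'.
// On failure the builtin stores the value it observed into 'expected',
// which is why the variable's address has to be passed.
bool try_update(int* target, int& expected, int desired) {
    return __atomic_compare_exchange_n(target, &expected, desired,
                                       /*weak=*/false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
}
```

In the reduced testcase the corresponding local is `a`, whose address is passed as `&a`.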
[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #37 from Hongtao.liu ---
> There is not much value in the vectorization we do in this function
> (when manually fixing the STLF issue the speed is as good as with the
> scalar code). We cost
>
> ray.dir.x 1 times scalar_load costs 12 in body
> ray.dir.y 1 times scalar_load costs 12 in body

Still, from a target-related perspective, instead of adding cost for the STLF penalty, maybe we should just reduce the cost of scalar_load if it's from a parm_decl, because there's probably STLF.
[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #38 from rguenther at suse dot de ---
On Fri, 11 Mar 2022, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
>
> --- Comment #37 from Hongtao.liu ---
> > There is not much value in the vectorization we do in this function
> > (when manually fixing the STLF issue the speed is as good as with the
> > scalar code). We cost
> >
> > ray.dir.x 1 times scalar_load costs 12 in body
> > ray.dir.y 1 times scalar_load costs 12 in body
> Still from an target-related perspective, instead of adding cost for STLF
> penalty, maybe we should just reduce cost of scalar_load if it's from
> parm_decl
> because there's probably STLF.

That's an interesting idea - it would eventually also improve the case where the argument is passed in register(s) but we fail to realize that.

I'll see if I get around to prototype some argument classification in the vectorizer (looking how hard it is to use INIT_CUMULATIVE_ARGS in a context where we are not expanding to RTL), unfortunately stack passing is done by code in function.cc (plus extra target hooks of course), but it might be easy enough to figure alignment and size at least (and whether arguments are passed on the stack or not).
[Bug libstdc++/104881] New: Document libstdc++ ABI evolution for experimental features
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104881 Bug ID: 104881 Summary: Document libstdc++ ABI evolution for experimental features Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: documentation Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: redi at gcc dot gnu.org Target Milestone: --- It might be helpful if the API evolution page at https://gcc.gnu.org/onlinedocs/libstdc++/manual/api.html contained details of breaking changes to experimental features. For example, gcc-9.1 was the first release with non-experimental C++17 support, and had ABI changes to std::variant compared with previous releases (which is why you shouldn't mix experimental features from different releases). r8-1649-g705037247447f4 made the copy constructor trivial in some cases, making 8.1.0 incompatible with 7.x r9-6369-g669a6fdcb436ae rewrote large parts of std::variant, making 9.1.0 incompatible with 8.x (and r9-7106-g038bc9bfd6dfd9 and r9-6715-gda97b98ad34145 made further changes). Similarly, gcc-8.1.0 had triviality changes for std::optional (r8-6100), and maybe again for 9.1.0 (r9-4229). There needs to be a caveat about the documentation of those changes being incomplete, so users don't assume that if it's not mentioned there, it's safe to use across versions. It's not supported, and the doc should not be taken as a guarantee of what can be relied on.
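
The triviality changes mentioned above are the kind of property that can be observed directly with type traits. A small sketch, assuming a C++17 compiler; `copy_is_trivial` is a hypothetical helper name, and the values it yields depend on the standard library release being compiled against, which is exactly why mixing releases is a problem.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical shorthand for the trait that the cited commits affected.
template <class T>
constexpr bool copy_is_trivial = std::is_trivially_copy_constructible_v<T>;

// Whether these are true depends on the library version in use; two
// translation units built against releases that disagree would see
// different copy semantics for the same type.
constexpr bool variant_of_scalars_trivial = copy_is_trivial<std::variant<int, double>>;
constexpr bool optional_of_int_trivial = copy_is_trivial<std::optional<int>>;
constexpr bool variant_with_string_trivial = copy_is_trivial<std::variant<int, std::string>>;
```

Printing or `static_assert`ing these constants in a build against each release is one way to check for the incompatibilities the API evolution page would document.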
[Bug target/97106] [nvptx] Issues with weak aliases introduced by C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106

--- Comment #2 from Tom de Vries ---
Created attachment 52606
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52606&action=edit
Tentative patch

With this patch and:
- current trunk
- misa default set to sm_75 (so 3.1 multilib disabled, because sm_75 is not supported there)
- patch "middle-end: Support ABIs that pass FP values as wider integers" to fix the build with sm_75 default
- Tentative patch "[nvptx] Add warp sync at simt exit" from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783#c6

I get a 100% pass rate on an sm_75 board for the ovo testsuite:
...
+ ./ovo.sh report
>> Overall result for test_result/2022-03-11_13-10_delia
pass rate(%)  test(#)  success(#)  compilation error(#)  runtime error(#)  wrong value(#)  timeout(#)
100%          910      910         0                     0                 0               0
...
[Bug target/104762] [12 Regression] x86_64 538.imagick_r 8%-28% regressions and 10% 525.x264_r regressions after r12-7319-g90d693bdc9d718
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104762 --- Comment #4 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:69619acd8d9b5856f5af6e5323d9c7c4ec9ad08f commit r12-7612-g69619acd8d9b5856f5af6e5323d9c7c4ec9ad08f Author: Richard Biener Date: Fri Mar 11 11:51:13 2022 +0100 target/104762 - vectorization costs of CONSTRUCTORs After accounting for GPR -> XMM move cost for vec_construct the base cost needs adjustments to not double-cost those. This also lowers the cost when such move is not necessary. 2022-03-11 Richard Biener PR target/104762 * config/i386/i386.cc (ix86_builtin_vectorization_cost): Do not cost the first lane of SSE pieces as inserts for vec_construct.
[Bug middle-end/104880] [11 Regression] ICE in expand_expr_addr_expr_1, at expr.c:8231 since r11-165-geb72dc663e9070
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104880 --- Comment #3 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:eb5edcf3f3ae008a1c55c88f08a886a5f350a759 commit r12-7613-geb5edcf3f3ae008a1c55c88f08a886a5f350a759 Author: Richard Biener Date: Fri Mar 11 14:09:33 2022 +0100 tree-optimization/104880 - update-address-taken and cmpxchg The following addresses optimistic non-addressable marking of an argument of __atomic_compare_exchange_n which broke when I added DECL_NOT_GIMPLE_REG_P since we cannot guarantee we can rewrite it when TREE_ADDRESSABLE is unset. Instead we have to restore TREE_ADDRESSABLE in that case. 2022-03-11 Richard Biener PR tree-optimization/104880 * tree-ssa.cc (execute_update_address_taken): Remember if we optimistically made something not addressable and prepare to undo it. * g++.dg/opt/pr104880.cc: New testcase.
[Bug target/104762] [12 Regression] x86_64 538.imagick_r 8%-28% regressions and 10% 525.x264_r regressions after r12-7319-g90d693bdc9d718
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104762 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #5 from Richard Biener --- Fixed.
[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 104762, which changed state. Bug 104762 Summary: [12 Regression] x86_64 538.imagick_r 8%-28% regressions and 10% 525.x264_r regressions after r12-7319-g90d693bdc9d718 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104762 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug target/101929] [12 Regression] r12-7319 regress x264_r by 4% on CLX.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101929 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #10 from Richard Biener --- Should be fixed with r12-7612.
[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 101929, which changed state. Bug 101929 Summary: [12 Regression] r12-7319 regress x264_r by 4% on CLX. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101929 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug c++/104623] [11/12 Regression] ICE in cp_parser_skip_to_pragma_eol, at cp/parser.cc:4107
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104623 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- Created attachment 52607 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52607&action=edit gcc12-pr104623.patch The problem is that the function purges the tokens, from the token after CPP_PRAGMA until CPP_PRAGMA_EOL (the latter inclusive). But CPP_PRAGMA is not purged. So, next time we tentatively parse it we ICE because CPP_PRAGMA isn't followed by CPP_PRAGMA_EOL anymore (because purged tokens are ignored). I have tried various things, purging also the pragma_tok token or changing second argument of cp_parser_skip_to_pragma_eol to NULL in cp_parser_skip_to_end_of_statement and cp_parser_skip_to_end_of_block_or_statement but both regressed various testcases. This at least in quick check-g++ testing didn't regress anything.
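
The failure mode described above can be illustrated with a toy token stream. This is a deliberately simplified model and not the C++ front end's lexer: `Kind`, `Token`, `purge_after_pragma` and `pragma_has_visible_eol` are hypothetical names, and the only behaviour taken from the comment is that tokens after the pragma up to and including its end-of-line token are purged, the pragma token itself is not, and purged tokens are ignored on a later parse.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { Pragma, Other, PragmaEol };

struct Token {
    Kind kind;
    bool purged = false;
};

// Purge everything after the pragma token up to and including its EOL,
// leaving the pragma token itself in place.
void purge_after_pragma(std::vector<Token>& toks, std::size_t pragma_idx) {
    for (std::size_t i = pragma_idx + 1; i < toks.size(); ++i) {
        toks[i].purged = true;
        if (toks[i].kind == Kind::PragmaEol)
            break;
    }
}

// A later parse that skips purged tokens: does the pragma still have a
// visible EOL before the stream ends or another pragma starts?
bool pragma_has_visible_eol(const std::vector<Token>& toks, std::size_t pragma_idx) {
    for (std::size_t i = pragma_idx + 1; i < toks.size(); ++i) {
        if (toks[i].purged)
            continue;
        if (toks[i].kind == Kind::PragmaEol)
            return true;
        if (toks[i].kind == Kind::Pragma)
            return false;
    }
    return false;
}
```

After `purge_after_pragma` the pragma token survives but its end-of-line token is no longer visible, which is the inconsistent state a second tentative parse trips over.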
[Bug target/104882] New: [12 Regression] MVE: Wrong code at -O2 since r12-1434-g046a3beb1673bf4a61c131373b6a5e84158e92bf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104882

Bug ID: 104882
Summary: [12 Regression] MVE: Wrong code at -O2 since r12-1434-g046a3beb1673bf4a61c131373b6a5e84158e92bf
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: acoplan at gcc dot gnu.org
Target Milestone: ---

Created attachment 52608 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52608&action=edit
broken assembly output

The following code:

int i;
char src[1072];
char dst[72];
int main()
{
  for (i = 0; i < 128; i++)
    src[i] = i;
  __builtin_memcpy(dst, src, 7);
  for (i = 0; i < 7; i++)
    if (dst[i] != i)
      __builtin_abort();
}

is miscompiled at -O2 since vectorization was enabled at -O2. With -O2 -ftree-vectorize, it is miscompiled earlier, starting with:

commit 046a3beb1673bf4a61c131373b6a5e84158e92bf
Author: Christophe Lyon
Date: Thu Jun 3 15:35:50 2021

    arm: Auto-vectorization for MVE: add pack/unpack patterns

It looks like we do some dubious packing of vector elements before storing to src. If I change the last loop to print the elements of dst instead, I see:

0 8 4 12 1 9 5

it should of course print: 0 1 2 3 4 5 6.

The broken code is attached. The testcase above was reduced from gcc/testsuite/gcc.c-torture/execute/memcpy-1.c.
[Bug libstdc++/104866] [12 Regression] this_thread_sleep.h doesn't compile if _GLIBCXX_NO_SLEEP is defined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104866 --- Comment #3 from dv at vollmann dot ch --- Thanks. Tested for AVR, works :-)
[Bug target/104882] [12 Regression] MVE: Wrong code at -O2 since r12-1434-g046a3beb1673bf4a61c131373b6a5e84158e92bf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104882 --- Comment #1 from Alex Coplan --- For completeness, the options -march=armv8.1-m.main+mve -mfloat-abi=hard -O2 -ftree-vectorize are required to reproduce.
[Bug libstdc++/104870] [12 Regression] fast_float doesn't work for 16-bit size_t, but is used anyway by floating_from_chars.cc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104870 --- Comment #7 from dv at vollmann dot ch --- Tested for AVR, works :-) Thanks, Detlef
[Bug c/92209] Imprecise column number for -Wstrict-prototypes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92209 Eric Gallager changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=82922 URL||https://gcc.gnu.org/piperma ||il/gcc-patches/2022-March/5 ||91595.html Keywords||patch --- Comment #3 from Eric Gallager --- A patch has been posted to the gcc-patches mailing list: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591595.html (In reply to Eric Gallager from comment #2) > this would become more important if -Wstrict-prototypes is enabled by > default or under some umbrella flag, as has been proposed in other bugs (see, for example, bug 82922)
[Bug other/102664] contrib/gcc-git-customization.sh uses echo -n, which isn't portable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102664 --- Comment #38 from Eric Gallager --- (In reply to Martin Liška from comment #37) > I guess so. Yep, I can confirm this is fixed now; thanks!
[Bug debug/104778] [12 Regression] ICE in simplify_subreg, at simplify-rtx.cc:7324
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104778 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- Sorry, still can't reproduce that. The --help=target output is very similar to yours, just (your -> mine):

- -mcmodel=small
- -mprioritize-restricted-insns= 1
+ -mprioritize-restricted-insns= 0
- -mprofile-kernel [disabled]
- -mstack-protector-guard= tls
+ -mstack-protector-guard= global
[Bug c++/104867] Base class matching ignores type of `auto` template parameter
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104867 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org Status|UNCONFIRMED |NEW Last reconfirmed||2022-03-11 Ever confirmed|0 |1 --- Comment #1 from Patrick Palka --- Confirmed, this never worked.
[Bug target/104868] [12 Regression] powerpc: Compiling libgfortran with -flto failing with GCC 12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868 --- Comment #6 from Matheus Castanho --- I can still reproduce the issue after applying the patch in previous comment.
[Bug target/97106] [nvptx] Issues with weak aliases introduced by C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97106 --- Comment #3 from Tom de Vries ---
With this additionally:
...
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 1a89c1bc77f..2e1a2dad9fe 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -968,7 +968,8 @@ static void
 write_fn_proto_1 (std::stringstream &s, bool is_defn,
                   const char *name, const_tree decl)
 {
-  write_fn_marker (s, is_defn, TREE_PUBLIC (decl), name);
+  if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl)) == NULL)
+    write_fn_marker (s, is_defn, TREE_PUBLIC (decl), name);
 
   /* PTX declaration.  */
   if (DECL_EXTERNAL (decl))
...
I get a simplified libgomp/testsuite/libgomp.c-c++-common/pr96390.c to work. That is, using only one level of alias indirection:
...
#pragma omp target map(from:n)
  n = bar ();
...

With the original, two level alias indirection I get:
...
libgomp: Link error log ptxas fatal : Internal error: alias to unknown symbol
...

That seems to be consistent with the ptx manual, which dictates:
...
.alias fAlias, fAliasee;

Identifier fAliasee is a function symbol which must be defined in the same module as .alias declaration.
...
[Bug debug/104778] [12 Regression] ICE in simplify_subreg, at simplify-rtx.cc:7324
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104778 --- Comment #4 from Jakub Jelinek --- Ah, I can reproduce with additional -fpie. We should have never accepted the --enable-default-pie mess, that is a maintenance nightmare.
[Bug target/99754] [sse2] new _mm_loadu_si16 and _mm_loadu_si32 implemented incorrectly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99754 Peter Cordes changed: What|Removed |Added CC||peter at cordes dot ca --- Comment #2 from Peter Cordes --- Can we get this patch applied soon? There aren't any other strict-aliasing-safe movd load intrinsics, but this one won't be portably usable while there are buggy GCC versions around. Until then, code should probably use something like inline __m128i movd(void *p){ return _mm_castps_si128(_mm_load_ss((const float*)p)); } (Which believe it or not is strict-aliasing safe even on integer data. At least it should be; last I tested it was across compilers, except maybe on ICC. Would have to double-check there.)
[Bug libstdc++/104883] New: should define all std::errc enumerators
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104883

Bug ID: 104883
Summary: should define all std::errc enumerators
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: redi at gcc dot gnu.org
Target Milestone: ---

Currently we only define std::errc enumerators when the OS defines the corresponding errno macro:

#ifdef EOVERFLOW
      value_too_large = EOVERFLOW,
#endif

Which causes errors when we rely on that being present, e.g. for AVR:

/home/jwakely/src/gcc/build-avr/avr/libstdc++-v3/include/charconv: In function ‘std::to_chars_result std::__detail::__to_chars(char*, char*, _Tp, int)’:
/home/jwakely/src/gcc/build-avr/avr/libstdc++-v3/include/charconv:132:28: error: ‘value_too_large’ is not a member of ‘std::errc’; did you mean ‘file_too_large’?
  132 |       __res.ec = errc::value_too_large;
      |                        ^~~
      |                        file_too_large

And src/filesystem/ops-common.h does this to workaround the fact that std::errc::not_supported isn't always defined:

  inline error_code
  __unsupported() noexcept
  {
#if defined ENOTSUP
    return std::make_error_code(std::errc::not_supported);
#elif defined EOPNOTSUPP
    // This is supposed to be for socket operations
    return std::make_error_code(std::errc::operation_not_supported);
#else
    return std::make_error_code(std::errc::invalid_argument);
#endif
  }

We should consider defining all the enumerators unconditionally, picking values outside the range used by the OS for constants, e.g.

#ifdef EOVERFLOW
      value_too_large = EOVERFLOW,
#else
      value_too_large = 1001,
#endif

The tricky part is picking a value range that doesn't clash with the OS. For the OS-specific headers such as config/os/mingw32-w64/error_constants.h we can just inspect and make an educated choice. For config/os/generic/error_constants.h maybe we want to do something in configure or with the preprocessor to find the largest value among all the errno macros that *are* defined, and add 100.
Or just take a gamble and assume the OS uses small numbers and we can start from 1000, or 32000, or something.
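The "largest defined errno macro plus 100" idea can be sketched with a constexpr function rather than configure machinery. This is only an illustration under stated assumptions: the names largest_known_errno, fallback_base and sketch_errc are made up here and are not libstdc++ identifiers, only three errno macros are listed where a real header would have to enumerate all of them, and C++14 constexpr is assumed.

```cpp
#include <cerrno>

// Hypothetical helper: largest value among the errno macros that are defined.
// Only three macros are checked here; a real header would need the full list.
constexpr int largest_known_errno()
{
    int m = 0;
#ifdef EDOM
    m = (EDOM > m) ? EDOM : m;
#endif
#ifdef ERANGE
    m = (ERANGE > m) ? ERANGE : m;
#endif
#ifdef EOVERFLOW
    m = (EOVERFLOW > m) ? EOVERFLOW : m;
#endif
    return m;
}

// Fallback values start 100 above the largest value the OS is known to use.
constexpr int fallback_base = largest_known_errno() + 100;

// Stand-in for std::errc with a single enumerator, defined unconditionally.
enum class sketch_errc {
#ifdef EOVERFLOW
    value_too_large = EOVERFLOW,
#else
    value_too_large = fallback_base + 1,
#endif
};
```

With this shape the enumerator always exists, so code such as the charconv line quoted above would compile whether or not the target defines EOVERFLOW. Whether the chosen range really avoids clashes still depends on how complete the macro list is.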
[Bug target/99754] [sse2] new _mm_loadu_si16 and _mm_loadu_si32 implemented incorrectly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99754 --- Comment #3 from Peter Cordes --- Wait a minute, the current implementation of _mm_loadu_si32 isn't strict-aliasing or alignment safe!!! That defeats the purpose for its existence as something to use instead of _mm_cvtsi32_si128( *(int*)p ); The current code contains a deref of a plain (int*). It should be using something like

typedef int unaligned_aliasing_int __attribute__((aligned(1),may_alias));
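To illustrate how a typedef of that kind would be used, here is a minimal sketch. It assumes GCC or Clang (both attributes are GNU extensions) and a 4-byte int; the function name load_int_any is hypothetical and this is not the code from the patch attached to the PR.

```cpp
#include <cassert>

// GNU extension: a type that may alias any other type and carries no
// alignment requirement beyond 1 byte.
typedef int unaligned_aliasing_int __attribute__((aligned(1), may_alias));

// Hypothetical helper: read an int from any address, regardless of the
// declared type of the underlying object or its alignment.
inline int load_int_any(const void *p)
{
    assert(p != nullptr);
    return *static_cast<const unaligned_aliasing_int *>(p);
}
```

Dereferencing through the attributed type tells the compiler both that the access may alias other types and that it must not assume natural alignment, which is what a plain (int*) dereference fails to express.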
[Bug d/104835] [12 Regression] libphobos fails to build on mips64el-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104835 Iain Buclaw changed: What|Removed |Added See Also||https://github.com/dlang/dm ||d/pull/13805 --- Comment #4 from Iain Buclaw --- Raised pull request in upstream dmd. Why the new version works but the old doesn't is anyone's guess though. It's the difference between:
---
T tmp = T(args);
__builtin_memcpy(p, (void*)&tmp, sizeof(T));
---
and
---
new(p) T(args);
---
As far as I can tell, both should end up with the same outcome.
[Bug debug/104778] [12 Regression] ICE in simplify_subreg, at simplify-rtx.cc:7324 since r12-1202-g9080a3bf232978
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104778 Martin Liška changed: What|Removed |Added Summary|[12 Regression] ICE in |[12 Regression] ICE in |simplify_subreg, at |simplify_subreg, at |simplify-rtx.cc:7324|simplify-rtx.cc:7324 since ||r12-1202-g9080a3bf232978 CC||guihaoc at gcc dot gnu.org --- Comment #5 from Martin Liška --- Started with r12-1202-g9080a3bf232978.
[Bug target/99754] [sse2] new _mm_loadu_si16 and _mm_loadu_si32 implemented incorrectly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99754 Jakub Jelinek changed: What|Removed |Added Last reconfirmed||2022-03-11 Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #4 from Jakub Jelinek --- Created attachment 52609 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52609&action=edit gcc12-pr99754.patch Full untested patch.
[Bug target/104868] [12 Regression] powerpc: Compiling libgfortran with -flto failing with GCC 12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868 Michael Meissner changed: What|Removed |Added CC||meissner at gcc dot gnu.org --- Comment #7 from Michael Meissner --- Created attachment 52610 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52610&action=edit Patch to fix extendidti2 constraint on power10
[Bug target/104868] [12 Regression] powerpc: Compiling libgfortran with -flto failing with GCC 12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868 --- Comment #8 from Michael Meissner --- Matheus, try the patch I just attached to the PR that I posted to the gcc-patches mailing list.
[Bug c++/103328] [11/12 Regression] ICE in remap_gimple_stmt, at tree-inline.c:1921 since r11-7419-g0f161cc8494cf728
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103328 --- Comment #20 from Benno Evers --- Created attachment 52611 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52611&action=edit Possible fix
[Bug target/104868] [12 Regression] powerpc: Compiling libgfortran with -flto failing with GCC 12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868 --- Comment #9 from Matheus Castanho --- That one works. Thanks!
[Bug rtl-optimization/104814] [10/11/12 Regression] ifcvt: Deleting live variable in IF-CASE-2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104814 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Last reconfirmed||2022-03-11 Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 --- Comment #4 from Jakub Jelinek --- Created attachment 52612 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52612&action=edit gcc12-pr104814.patch Untested fix.
[Bug c++/104873] Bug in overload resolution for constrained class templates with deduction guides
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104873 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org --- Comment #1 from Patrick Palka --- Reduced testcase: template concept C = true; template requires C struct S { S(...); }; template S(S) -> S>; using type = decltype(S(S())); using type = S; // Clang accepts, GCC demands S> The difference is that the copy deduction candidate generated by GCC is unconstrained: template S(S) -> S; whereas Clang presumably attaches the constraints of the class template to the candidate: template requires C S(S) -> S; This also occurs for guides synthesized from a constructor: template concept C = true; template requires C struct S { S(); S(T); // #1 }; template S(T) -> S>; // #2 using type = decltype(S(0)); // Clang selects guide for #1, GCC selects #2 using type = S; // therefore Clang accepts, GCC demands S> Is it true that a deduction guide inherits the constraints of the class template? It's not clear to me according to [over.match.class.deduct].
[Bug c/104884] New: functions miss their 'ret' instruction (and fall through) in certain cases with '-O3' under x86-84
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104884 Bug ID: 104884 Summary: functions miss their 'ret' instruction (and fall through) in certain cases with '-O3' under x86-84 Product: gcc Version: 11.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zamfofex at twdb dot moe Target Milestone: --- Created attachment 52613 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52613&action=edit code in C function is wrongly executed It seems as if certain loops that rely on undefined behavior cause generated functions to be allowed to fall through (i.e. they don’t include an appropriate ‘ret’ instruction as necessary) with ‘-O3’, at least under x86‐64. Even though they do make use of undefined behavior, letting the code execution fall through to subsequent code in the executable seems potentially really dangerous to me, unless I’m missing something (which is not impossible). In the attached ‘main.c’ file, the code in the function ‘bar’ should never be executed, yet it somehow is. If the loop from the function ‘foo’ is placed within an exported function, the generated function body is empty, and upon calling, the execution falls through and likely causes a segfault.
[Bug debug/104778] [12 Regression] ICE in simplify_subreg, at simplify-rtx.cc:7324 since r12-1202-g9080a3bf232978
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104778 Jakub Jelinek changed: What|Removed |Added Priority|P3 |P1 CC||vmakarov at gcc dot gnu.org --- Comment #6 from Jakub Jelinek ---
This seems to go wrong during LRA. In *.ira we have just:

(debug_insn 352 199 201 19 (var_location:SI D#2 (reg:SI 227)) -1 (nil))

but then LRA turns that into:

(debug_insn 352 199 201 19 (var_location:SI D#2 (plus:SI (plus:SI (minus:SI (subreg:SI (reg:HI 130 [ _15 ]) 0)
 (subreg:SI (reg:DI 325 [orig:131 _16 ] [131]) 4))
 (subreg:SI (reg:HI 130 [ _15 ]) 0))
 (zero_extend:SI (subreg:QI (mem/c:SI (reg:SI 304) [0 S4 A32]) 3 -1 (nil))

and later lra_substitute_pseudo turns that into:

(debug_insn 352 199 201 19 (var_location:SI D#2 (plus:SI (plus:SI (minus:SI (subreg:SI (subreg:HI (reg:SI 270 [ a+4 ]) 2) 0)
 (subreg:SI (reg:DI 325 [orig:131 _16 ] [131]) 4))
 (subreg:SI (subreg:HI (reg:SI 270 [ a+4 ]) 2) 0))
 (zero_extend:SI (subreg:QI (mem/c:SI (reg:SI 304) [0 S4 A32]) 3 -1 (nil))

and later another lra_substitute_pseudo turns that into:

(debug_insn 352 199 201 19 (var_location:SI D#2 (plus:SI (plus:SI (minus:SI (subreg:SI (const_int 1 [0x1]) 0)
 (subreg:SI (reg:DI 325 [orig:131 _16 ] [131]) 4))
 (subreg:SI (const_int 1 [0x1]) 0))
 (zero_extend:SI (subreg:QI (mem/c:SI (reg:SI 304) [0 S4 A32]) 3 -1 (nil))

I think we can deal in debug_insns with subreg inside of subreg, but we certainly can't deal with a subreg with VOIDmode operand (ditto for ZERO_EXTEND/SIGN_EXTEND etc.). E.g. combine.cc (subst) has:

  if (GET_CODE (x) == SUBREG && CONST_SCALAR_INT_P (new_rtx))
    {
      machine_mode mode = GET_MODE (x);
      x = simplify_subreg (GET_MODE (x), new_rtx,
                           GET_MODE (SUBREG_REG (x)),
                           SUBREG_BYTE (x));
      if (! x)
        x = gen_rtx_CLOBBER (mode, const0_rtx);
    }
  else if (CONST_SCALAR_INT_P (new_rtx)
           && (GET_CODE (x) == ZERO_EXTEND
               || GET_CODE (x) == SIGN_EXTEND
               || GET_CODE (x) == FLOAT
               || GET_CODE (x) == UNSIGNED_FLOAT))
    {
      x = simplify_unary_operation (GET_CODE (x), GET_MODE (x),
                                    new_rtx,
                                    GET_MODE (XEXP (x, 0)));
      if (!x)
        return gen_rtx_CLOBBER (VOIDmode, const0_rtx);
    }
  /* CONST_INTs shouldn't be substituted into PRE_DEC, PRE_MODIFY
     etc. arguments, otherwise we can ICE before trying to recog
     it.  See PR104446.  */
  else if (CONST_SCALAR_INT_P (new_rtx)
           && GET_RTX_CLASS (GET_CODE (x)) == RTX_AUTOINC)
    return gen_rtx_CLOBBER (VOIDmode, const0_rtx);
  else
    SUBST (XEXP (x, i), new_rtx);

for this (note, RTX_AUTOINC shouldn't appear in debug insns, but the rest can).
[Bug middle-end/26374] Compile failure on long double
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26374 --- Comment #22 from beebe at math dot utah.edu ---
Yesterday, I got a Fedora 36 PPC64LE VM up, and left it installing hundreds of packages overnight. QEMU 4.2.1 on Ubuntu 20.04 picks a default CPU type of POWER9. Alas, the ISO installation gets nicely into a normal boot, then dies with a kernel panic. I changed the default CPU type to POWER8, and that led to a successful installation.

This morning, I built my feature test package

http://www.math.utah.edu/pub/features/

and got this nice report back:

LDBL_DECIMAL_DIG = 36
LDBL_DIG         = 33
LDBL_EPSILON     = 0x1p-112
LDBL_HAS_SUBNORM = 1
LDBL_MANT_DIG    = 113
LDBL_MAX         = 0x1.p+16383
LDBL_MAX_10_EXP  = 4932
LDBL_MAX_EXP     = 16384
LDBL_MIN         = 0x1p-16382
LDBL_MIN_10_EXP  = -4931
LDBL_MIN_EXP     = -16381
LDBL_NORM_MAX    = [not defined]
LDBL_TRUE_MIN    = 0x0.0001p-16382
HUGE_VALL        = inf
...
Computed long double limits:
smallest floating-point number: 0.e+00 == 2^(-16494) [IEEE 754 smallest 128-bit subnormal]
machine epsilon: 1.92592994e-34 == 2^(-112) [IEEE 754 128-bit conformant]
long double appears to be 128-bit value stored in 128-bit (16-byte) field

That indicates that long double implemented as doubled double is now history on this platform, and that is good news for me. Thanks for telling me about this change for Fedora on PowerPC.

---
- Nelson H. F. Beebe                    Tel: +1 801 581 5254 -
- University of Utah                    FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB    Internet e-mail: be...@math.utah.edu -
- 155 S 1400 E RM 233                   be...@acm.org be...@computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
---
[Bug c++/84964] [9/10/11/12 Regression] ICE in expand_call, at calls.c:4540
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84964 --- Comment #17 from CVS Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:098c538ae8c0c5e281d9191a6b54ffe38b624ef3 commit r12-7614-g098c538ae8c0c5e281d9191a6b54ffe38b624ef3 Author: Roger Sayle Date: Fri Mar 11 17:35:21 2022 + [Committed] Update g++.dg/other/pr84964.C for ia32 (and similar) targets. The "sorry, unimplemented" message in the new g++.dg/other/pr84964.C is apparently dependent upon whether the target passes multi-gigabyte arguments on the stack. This tweaks the testcase to just confirm that it no longer ICEs, not the specific set of warnings/errors triggered. 2022-03-11 Roger Sayle gcc/testsuite/ChangeLog PR c++/84964 * g++.dg/other/pr84964.C: Tweak test to check for the ICE, not for the (target-dependent) sorry.
[Bug c/104884] functions miss their 'ret' instruction (and fall through) in certain cases with '-O3' under x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104884 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Andrew Pinski --- The behavior is undefined so gcc decides the code is unreachable and optimizes it that way. Use -fwrapv if you want it to be defined code. Or use -fsanitize=undefined to detect the undefined behavior at runtime.
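The attached main.c is not reproduced in this thread, so the following is only a guess at the general shape of the problem: the -fwrapv suggestion implies a loop whose exit condition depends on signed integer overflow. A minimal sketch of writing such a loop without undefined behavior, by checking before the arithmetic rather than after it (the function name doublings_until_overflow is invented for illustration):

```cpp
#include <cassert>
#include <climits>

// Counts how many times a positive value can be doubled while staying
// representable as int. The bound is tested *before* multiplying, so the
// multiplication never overflows and no flag such as -fwrapv is needed.
inline int doublings_until_overflow(int x)
{
    assert(x > 0);
    int n = 0;
    while (x <= INT_MAX / 2) {
        x *= 2;
        ++n;
    }
    return n;
}
```

A version that instead doubles first and then tests for a negative result relies on wraparound, and the optimizer is free to treat that test as always false.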
[Bug tree-optimization/98335] [9/10/11/12 Regression] Poor code generation for partial struct initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98335 --- Comment #8 from CVS Commits ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:c5288df751f9ecd11898dec5f2a7b6b03267f79e

commit r12-7615-gc5288df751f9ecd11898dec5f2a7b6b03267f79e
Author: Roger Sayle
Date: Fri Mar 11 17:46:50 2022 +

    PR tree-optimization/98335: Improvements to DSE's compute_trims.

    This patch is the main middle-end piece of a fix for PR tree-opt/98335, which is a code-quality regression affecting mainline. The issue occurs in DSE's (dead store elimination's) compute_trims function that determines where a store to memory can be trimmed. In the testcase given in the PR, this function notices that the first byte of a DImode store is dead, and replaces the 8-byte store at (aligned) offset zero, with a 7-byte store at (unaligned) offset one. Most architectures can store a power-of-two bytes (up to a maximum) in single instruction, so writing 7 bytes requires more instructions than writing 8 bytes. This patch follows Jakub Jelinek's suggestion in comment 5, that compute_trims needs improved heuristics.

    On x86_64-pc-linux-gnu with -O2 the new test case in the PR goes from:

            movl    $0, -24(%rsp)
            movabsq $72057594037927935, %rdx
            movl    $0, -21(%rsp)
            andq    -24(%rsp), %rdx
            movq    %rdx, %rax
            salq    $8, %rax
            movb    c(%rip), %al
            ret

    to

            xorl    %eax, %eax
            movb    c(%rip), %al
            ret

    2022-03-11  Roger Sayle
                Richard Biener

    gcc/ChangeLog
            PR tree-optimization/98335
            * builtins.cc (get_object_alignment_2): Export.
            * builtins.h (get_object_alignment_2): Likewise.
            * tree-ssa-alias.cc (ao_ref_alignment): New.
            * tree-ssa-alias.h (ao_ref_alignment): Declare.
            * tree-ssa-dse.cc (compute_trims): Improve logic deciding whether to align head/tail, writing more bytes but using fewer store insns.

    gcc/testsuite/ChangeLog
            PR tree-optimization/98335
            * g++.dg/pr98335.C: New test case.
            * gcc.dg/pr86010.c: New test case.
            * gcc.dg/pr86010-2.c: New test case.
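The observation that a trimmed 7-byte store can cost more than the original 8-byte one can be made concrete with a toy cost model. The sketch below is not the logic in compute_trims; it assumes every store must be a naturally aligned power-of-two size of at most max_store bytes, which is stricter than x86 (the sequence above uses two overlapping 4-byte stores). The function name stores_needed is hypothetical.

```cpp
#include <cassert>

// Toy model: number of naturally aligned power-of-two stores needed to
// write the byte range [offset, offset + size).
inline unsigned stores_needed(unsigned offset, unsigned size,
                              unsigned max_store = 8)
{
    // max_store must be a nonzero power of two.
    assert(max_store != 0 && (max_store & (max_store - 1)) == 0);
    unsigned count = 0;
    while (size > 0) {
        unsigned chunk = max_store;
        // Shrink the store until it fits the remaining bytes and the
        // current offset is aligned to it; this always stops at 1.
        while (chunk > size || offset % chunk != 0)
            chunk /= 2;
        offset += chunk;
        size -= chunk;
        ++count;
    }
    return count;
}
```

Under this model the untrimmed store, stores_needed(0, 8), takes one instruction, while the trimmed range, stores_needed(1, 7), takes three (1 + 2 + 4 bytes), which is the kind of trade-off the improved heuristic weighs when it decides to keep the head or tail aligned.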
[Bug debug/104778] [12 Regression] ICE in simplify_subreg, at simplify-rtx.cc:7324 since r12-1202-g9080a3bf232978
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104778 --- Comment #7 from Jakub Jelinek --- Created attachment 52614 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52614&action=edit gcc12-pr104778.patch Untested fix.
[Bug tree-optimization/98335] [9/10/11/12 Regression] Poor code generation for partial struct initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98335 --- Comment #9 from CVS Commits ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:251ea6dfbdb4448875e41081682bb3aa451b5729

commit r12-7616-g251ea6dfbdb4448875e41081682bb3aa451b5729
Author: Roger Sayle
Date: Fri Mar 11 17:57:12 2022 +

    PR tree-optimization/98335: New peephole2 xorl;movb -> movzbl

    This patch is the backend piece of my proposed fix to PR tree-opt/98335, to allow C++ partial struct initialization to be as efficient/optimized as full struct initialization. With the middle-end patch just posted to gcc-patches, the test case in the PR compiles on x86_64-pc-linux-gnu with -O2 to:

            xorl    %eax, %eax
            movb    c(%rip), %al
            ret

    with this additional peephole2 (actually four peephole2s):

            movzbl  c(%rip), %eax
            ret

    2022-03-11  Roger Sayle

    gcc/ChangeLog
            PR tree-optimization/98335
            * config/i386/i386.md (peephole2): Eliminate redundant insv. Combine movl followed by movb. Transform xorl followed by a suitable movb or movw into the equivalent movz[bw]l.

    gcc/testsuite/ChangeLog
            PR tree-optimization/98335
            * g++.target/i386/pr98335.C: New test case.
            * gcc.target/i386/pr98335.c: New test case.
[Bug middle-end/104885] New: ICE in compiling new test case g++.dg/other/pr84964.C after r12-7607-ga717376e99fb33
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104885

Bug ID: 104885
Summary: ICE in compiling new test case g++.dg/other/pr84964.C after r12-7607-ga717376e99fb33
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: seurer at gcc dot gnu.org
Target Milestone: ---

Tried g:a717376e99fb33ba3b06bd8122e884f4b63a60c9, r12-7607-ga717376e99fb33

make -k check-gcc RUNTESTFLAGS="dg.exp=g++.dg/other/pr84964.C"

FAIL: g++.dg/other/pr84964.C -std=c++98 (internal compiler error: Aborted signal terminated program cc1plus)
FAIL: g++.dg/other/pr84964.C -std=c++98 (test for warnings, line 6)
FAIL: g++.dg/other/pr84964.C -std=c++98 1 blank line(s) in output
FAIL: g++.dg/other/pr84964.C -std=c++98 (test for excess errors)
FAIL: g++.dg/other/pr84964.C -std=c++14 (internal compiler error: Aborted signal terminated program cc1plus)
FAIL: g++.dg/other/pr84964.C -std=c++14 (test for warnings, line 6)
FAIL: g++.dg/other/pr84964.C -std=c++14 1 blank line(s) in output
FAIL: g++.dg/other/pr84964.C -std=c++14 (test for excess errors)
FAIL: g++.dg/other/pr84964.C -std=c++17 (internal compiler error: Segmentation fault signal terminated program cc1plus)
FAIL: g++.dg/other/pr84964.C -std=c++17 (test for warnings, line 6)
FAIL: g++.dg/other/pr84964.C -std=c++17 (test for excess errors)
FAIL: g++.dg/other/pr84964.C -std=c++20 (internal compiler error: Segmentation fault signal terminated program cc1plus)
FAIL: g++.dg/other/pr84964.C -std=c++20 (test for warnings, line 6)
FAIL: g++.dg/other/pr84964.C -std=c++20 (test for excess errors)

commit a717376e99fb33ba3b06bd8122e884f4b63a60c9 (HEAD, refs/bisect/bad)
Author: Roger Sayle
Date: Thu Mar 10 23:49:15 2022 +

    PR c++/84964: Middle-end patch to expand_call for ICE after sorry.
[Bug c/104884] functions miss their 'ret' instruction (and fall through) in certain cases with '-O3' under x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104884 --- Comment #2 from zamfofex at twdb dot moe --- Isn’t it dangerous to allow the control to accidentally leak out of a function like that, though? If a function is written in a way that expects the compiler to not allow the control to leak out of the function and it is compiled with ‘-O3’, then that can lead to code being executed in unexpected situations, which feels like a fairly significant attack vector. Imagine e.g. the code in the executable immediately after the function definition contains some kind of secretive functionality, and the program is being run as some kind of server. That could allow attackers to exploit that compiler behavior in programs that use undefined behavior by accident.
[Bug c/104884] functions miss their 'ret' instruction (and fall through) in certain cases with '-O3' under x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104884 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- It is dangerous to invoke undefined behavior. An optimizing compiler optimizes on the assumption that UB doesn't happen. Please read https://blog.regehr.org/archives/213
[Bug c++/95153] Arrays of 'const void *' should not be copyable in C++20
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95153 Barry Revzin changed: What|Removed |Added CC||barry.revzin at gmail dot com --- Comment #5 from Barry Revzin --- Just to follow up, gcc trunk right now does this:

int main() {
  int a[3]{};
  void const* b[3](a); // ok

  void const* c[3]{};
  void const* d[3](c); // error: array must be initialized with a brace-enclosed initializer
}

If 'd' is actually an error, then it's not because of that, so at the very least, the diagnostic is wrong.
[Bug c/104884] functions miss their 'ret' instruction (and fall through) in certain cases with '-O3' under x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104884 --- Comment #4 from zamfofex at twdb dot moe --- I would have expected that compilers would at least try to avoid behaving in a way that is deemed dangerous, even when a program causes undefined behavior. But to be honest, I understand that it’s difficult to determine what is “dangerous” concretely in some cases, and also that it’s not the compiler’s job to fix buggy code, so the current behavior doesn’t seem unreasonable after all. Thank you for the clarification, and apologies for the noise.
[Bug c++/104873] Bug in overload resolution for constrained class templates with deduction guides
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104873 --- Comment #2 from Patrick Palka --- Here's a testcase that seems to argue for including the class constraints in the implicit guide's constraints (which already contain the rewritten constraints of the constructor): template struct A { static_assert(!__is_same(T, void)); static constexpr bool value = true; }; template requires (!__is_same(T, void)) struct B { B(T*) requires A::value; // #1 B(T); }; void* p; using type = decltype(B(p)); using type = B; By not doing so we end up checking the two sets of constraints (the class's and the constructor's) "out of order", which causes a hard error during constraint checking of the guide for #1.
[Bug target/104816] -fcf-protection=branch should generate endbr instead of notrack jumps
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104816 H.J. Lu changed: What|Removed |Added Status|NEW |WAITING --- Comment #8 from H.J. Lu --- (In reply to Joao Moreira from comment #0) > When -fcf-protection=branch is used, the compiler will generate jump tables > where the indirect jump is prefixed with the NOTRACK prefix, so it can jump > to non-ENDBR targets. Yet, for NOTRACK prefixes to work, the NOTRACK > specific enable bit must be set, what renders the binary broken on any > environment where this is not the case. In fact, having NOTRACK disabled was > a design choice for the Linux kernel CET support > [https://lkml.org/lkml/2022/3/7/1068]. > > With the above, the compiler should generate jump tables with ENDBRs, for > proper correctness. And, if security regarding the additional ENDBRs is a > concern, the code can be explicitly compiled with -fno-jump-tables. There is an undocumented option: -mcet-switch. It does exactly what you are looking for. Currently it is off by default. We can document it and turn it on by default.
[Bug c++/104642] Add __builtin_trap() for missing return at -O0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104642 --- Comment #4 from Jonathan Wakely --- PR 104884 is another "why is undefined behaviour so surprising?" case for -funreachable-traps
[Bug target/104816] -fcf-protection=branch should generate endbr instead of notrack jumps
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104816 --- Comment #9 from H.J. Lu --- Created attachment 52615 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52615&action=edit A patch
[Bug debug/104778] [12 Regression] ICE in simplify_subreg, at simplify-rtx.cc:7324 since r12-1202-g9080a3bf232978
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104778 --- Comment #8 from Andrew Pinski --- (In reply to Jakub Jelinek from comment #4) > Ah, I can reproduce with additional -fpie. > We should have never accepted the --enable-default-pie mess, that is a > maintenance nightmare. Well, if they had used -freport-bug and attached that preprocessed source, figuring out the exact options would not have been an issue.
[Bug translation/104552] Mistakes in strings to be translated in GCC 12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104552 --- Comment #36 from CVS Commits --- The master branch has been updated by Iain Buclaw : https://gcc.gnu.org/g:7a6ba7c7cb6ff5ac9bbcc747bd5fad957b78fa0a commit r12-7617-g7a6ba7c7cb6ff5ac9bbcc747bd5fad957b78fa0a Author: Iain Buclaw Date: Fri Mar 11 21:59:57 2022 +0100 d: Fix mistakes in strings to be translated [PR104552] Address comments made in PR104552 about documented D language options. gcc/d/ChangeLog: PR translation/104552 * lang.opt (fdump-cxx-spec=): Fix typo in argument handle. (fpreview=fixaliasthis): Quote `alias this' as code.
[Bug c++/104008] [11/12 Regression] New g++ folly compile error since r11-7931-ga2531859bf5bf6cf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104008 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org --- Comment #8 from Marek Polacek --- Since r11-7931, we strip_typedefs in TYPE_PACK_EXPANSION. In this test that results in IsOneOf being turned into disjunction<>, but then make_pack_expansion -> find_parameter_packs_r won't find OtherHolders.
[Bug c/104886] New: -Wdangling-pointer= prints internal MEM and (D) names in warnings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104886
Bug ID: 104886
Summary: -Wdangling-pointer= prints internal MEM and (D) names in warnings
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: slyfox at gcc dot gnu.org
Target Milestone: ---

Noticed internal names on libuv-1.43.0. Here is the minimal example:

//$ cat a.c
struct uv_stream_s {
  int io_watcher;
  void *write_completed_queue[];
};
void uv__write_callbacks(struct uv_stream_s *stream) {
  int pq;
  if (*(int **)stream->write_completed_queue)
    ;
  *(int **)stream->write_completed_queue[0] = &pq;
}

$ gcc -std=gnu89 -fno-strict-aliasing -c a.c -Wall -O2
a.c: In function 'uv__write_callbacks':
a.c:10:45: warning: storing the address of local variable 'pq' in '*(int **)MEM[(int * *)stream_2(D) + 8B]' [-Wdangling-pointer=]
   10 |   *(int **)stream->write_completed_queue[0] = &pq;
      |                                           ~~^
a.c:7:7: note: 'pq' declared here
    7 |   int pq;
      |       ^~
a.c:8:7: note: '((int **)stream)[1]' declared here
    8 |   if (*(int **)stream->write_completed_queue)
      |       ^~

Note: "note: " looks like a reasonable expression while "warning: " contains internal names.

$ gcc -v
Using built-in specs.
COLLECT_GCC=/<>/gcc-12.0.0/bin/gcc
COLLECT_LTO_WRAPPER=/<>/gcc-12.0.0/libexec/gcc/x86_64-unknown-linux-gnu/12.0.1/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with:
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.0.1 20220306 (experimental) (GCC)
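For comparison with the reduced libuv case, the pattern that -Wdangling-pointer= is about can be sketched with a plain struct member as the destination. The names `holder`, `leak_local` and `use_then_clear` are hypothetical and not taken from libuv; whether and how GCC words a diagnostic for this variant is not claimed here.

```c
#include <assert.h>
#include <stddef.h>

struct holder {
    int *slot;                /* caller-visible storage for a pointer */
};

/* Leaves the address of a local behind in *h. Once the function
   returns, h->slot dangles; this is the pattern the warning targets. */
void leak_local(struct holder *h)
{
    int local = 0;
    h->slot = &local;
}

/* Uses the address only while 'local' is alive, then clears it so
   that nothing dangling escapes to the caller. */
int use_then_clear(struct holder *h)
{
    int local = 7;
    h->slot = &local;
    int seen = *h->slot;      /* fine: 'local' is still in scope */
    h->slot = NULL;
    return seen;
}
```

Compiling this with -Wall -O2 on a GCC 12 snapshot would be one way to check how the destination expression is printed when no flexible array member is involved.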
[Bug tree-optimization/104886] -Wdangling-pointer= prints internal MEM and (D) names in warnings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104886 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2022-03-11 Status|UNCONFIRMED |NEW See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104077 Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Confirmed, the warning itself looks correct, just the formatting of the warning seems bogus in this case.
[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708 --- Comment #34 from CVS Commits --- The releases/gcc-11 branch has been updated by Michael Meissner : https://gcc.gnu.org/g:6f581f90e3757392a510f11279e2daf5fcfdefa8 commit r11-9649-g6f581f90e3757392a510f11279e2daf5fcfdefa8 Author: Michael Meissner Date: Fri Mar 11 18:41:20 2022 -0500 Revert __SIZEOF__IBM128__ and __SIZEOF_FLOAT128__ patch. 2022-03-05 Michael Meissner gcc/ PR target/99708 * config/rs6000/rs6000-c.c: Revert patch from 2022-03-05. gcc/testsuite/ PR target/99708 * gcc.target/powerpc/pr99708.c: Revert patch from 2022-03-05.
[Bug tree-optimization/101895] [11/12 Regression] SLP Vectorizer change pushes VEC_PERM_EXPR into bad location spoiling further optimization opportunities
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101895 Roger Sayle changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |roger at nextmovesoftware dot com CC||roger at nextmovesoftware dot com --- Comment #5 from Roger Sayle --- Patch proposed. https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591644.html
[Bug target/99708] __SIZEOF_FLOAT128__ not defined on powerpc64le-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708 --- Comment #35 from CVS Commits --- The releases/gcc-10 branch has been updated by Michael Meissner : https://gcc.gnu.org/g:8437794102e86a1bd5f2257aa95ea76890810a28 commit r10-10493-g8437794102e86a1bd5f2257aa95ea76890810a28 Author: Michael Meissner Date: Fri Mar 11 19:09:20 2022 -0500 Revert __SIZEOF__IBM128__ and __SIZEOF_FLOAT128__ patch. 2022-03-05 Michael Meissner gcc/ PR target/99708 * config/rs6000/rs6000-c.c: Revert 2022-03-05 patch. gcc/testsuite/ PR target/99708 * gcc.target/powerpc/pr99708.c: Revert 2022-03-05 patch.
[Bug target/55690] On some targets thread_fence is not a compiler barrier when memmodel != MEMMODEL_SEQ_CST
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55690 Joseph changed: What|Removed |Added CC||schuchart at hlrs dot de --- Comment #1 from Joseph --- Came across this today with 7.3.0 on ia64 and found that it is fixed in 8.3.0. I don't have 8.1.0 and 8.2.0 available so cannot test when exactly it was resolved. I guess the ticket can be closed though.
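The kind of code that relies on a non-seq_cst fence also acting as a compiler barrier can be sketched with C11 atomics. This is an illustration under stated assumptions, not the test case from the PR: the names `payload`, `ready`, `producer` and `consumer` are hypothetical, and the sketch is exercised single-threaded only.

```c
#include <assert.h>
#include <stdatomic.h>

static int payload;          /* ordinary, non-atomic data */
static atomic_int ready;     /* publication flag, zero-initialized */

/* Writer side: the release fence has to keep the plain store to
   'payload' ahead of the flag store, both in hardware and in the
   compiler's own instruction ordering. */
void producer(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader side: returns -1 until the flag is seen, then reads the
   payload after an acquire fence. */
int consumer(void)
{
    if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        return -1;
    atomic_thread_fence(memory_order_acquire);
    return payload;
}
```

If the fence were not treated as a compiler barrier, the compiler would be free to sink the store to `payload` below the flag store, which is the failure mode the bug title describes.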
[Bug target/104868] [12 Regression] powerpc: Compiling libgfortran with -flto failing with GCC 12
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868 --- Comment #10 from CVS Commits --- The master branch has been updated by Michael Meissner : https://gcc.gnu.org/g:3cb27b85a7b977958d53e1a29596ba211d21dde2 commit r12-7620-g3cb27b85a7b977958d53e1a29596ba211d21dde2 Author: Michael Meissner Date: Fri Mar 11 19:47:09 2022 -0500 Fix DImode to TImode sign extend issue PR target/104868 had an issue where my code that updated the DImode to TImode sign extension for power10 failed. In looking at the failure message, the reason is when extendditi2 tries to split the insn, it generates an insn that does not satisfy its constraints: (set (reg:V2DI 65 1) (vec_duplicate:V2DI (reg:DI 0))) The reason is vsx_splat_v2di does not allow GPR register 0 when it will be generating a mtvsrdd instruction. In the definition of the mtvsrdd instruction, if the RA register is 0, it means clear the upper 64 bits of the vector instead of moving register GPR 0 to those bits. When I wrote the extendditi2 pattern, I forgot that mtvsrdd had that behavior so I used a 'r' constraint instead of 'b'. In the rare case where the value is in GPR register 0, this split will fail. This patch uses the right constraint for extendditi2. 2022-03-11 Michael Meissner gcc/ PR target/104868 * config/rs6000/vsx.md (extendditi2): Use a 'b' constraint when moving from a GPR register to an Altivec register.
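The RA rule described in the commit message can be modelled in a few lines of C. This is a software sketch of the operand semantics as stated above, not GCC code; `struct v2di` and `model_mtvsrdd` are hypothetical names, and the register file is just an array.

```c
#include <assert.h>
#include <stdint.h>

struct v2di {
    uint64_t hi;   /* upper 64 bits of the vector register */
    uint64_t lo;   /* lower 64 bits of the vector register */
};

/* Model of the operand rule: RA == 0 does not name GPR 0, it means
   "clear the upper 64 bits". RB is always read as a register. */
struct v2di model_mtvsrdd(unsigned ra, unsigned rb, const uint64_t gpr[32])
{
    struct v2di result;
    result.hi = (ra == 0) ? 0 : gpr[ra];
    result.lo = gpr[rb];
    return result;
}
```

With a value held in GPR 0, passing 0 as RA yields a zero upper half rather than that value, which is why a splat of GPR 0 cannot be expressed with this instruction and the 'r' constraint had to become 'b'.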
[Bug bootstrap/104887] New: mold linker is not detected properly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104887 Bug ID: 104887 Summary: mold linker is not detected properly Product: gcc Version: 11.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: rui314 at gmail dot com Target Milestone: --- GCC 11.2.0 fails to bootstrap if `ld` is `ld.mold` (https://github.com/rui314/mold). This is because gcc's configure script has no idea what `ld.mold` is and considers it a very old linker that doesn't support COMDAT. GCC 12's configure script correctly recognizes mold. Can you backport the fix to support mold in the GCC 11 series?
[Bug bootstrap/104887] [9/10/11 only] mold linker is not detected properly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104887 Andrew Pinski changed: What|Removed |Added Summary|mold linker is not detected |[9/10/11 only] mold linker |properly|is not detected properly Target Milestone|--- |11.3 --- Comment #1 from Andrew Pinski --- >Can you backport the fix to support mold in the GCC 11 series? I doubt it since it might cause other issues. Plus GCC 12 will be out in about a month. Each distro can decide to backport it if they want though.
[Bug bootstrap/104887] [9/10/11 only] mold linker is not detected properly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104887 --- Comment #2 from Rui Ueyama --- What kind of regression are you worried about?
[Bug tree-optimization/102586] [12 Regression] ICE in clear_padding_type, at gimple-fold.c:4798 since r12-3433-ga25e0b5e6ac8a77a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102586 --- Comment #28 from Jason Merrill --- (In reply to Qing Zhao from comment #27) > Does this issue only exist with -flifetime-dse=2? > When -flifetime-dse=2, the call to __builtin_clear_padding should be > inserted AFTER the start point of the constructor of the object, otherwise > it’s dead and will be eliminated by DSE. > And with -flifetime-dse=2, the padding initialization should be done by C++ > FE instead of middle end. > Is this understanding correct? Yes, -flifetime-dse=2 conflicts with -ftrivial-auto-var-init in that both are trying to define the initial state of the variable. Perhaps -ftrivial-auto-var-init should lower to -flifetime-dse=1. Or change build_clobber_this to do the specified initialization instead of a clobber, though that would mean doing the initialization for any object of that type, not just automatics.
[Bug fortran/104888] New: diagnostics use non-idiomatic '%s'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104888 Bug ID: 104888 Summary: diagnostics use non-idiomatic '%s' Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: diagnostic Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: roland.illig at gmx dot de Target Milestone: --- fortran/openmp.cc says: > selector '%s' not allowed for context selector set '%s' at %C One year ago, the message contained the idiomatic %qs. Why was it changed in ae3c4e521dd0b66db712639298cd08331d62f315?
[Bug fortran/104888] diagnostics use non-idiomatic '%s'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104888 --- Comment #1 from Roland Illig --- While here: > expected : at %C The quotes around the %<:%> are missing.
[Bug fortran/104888] diagnostics use non-idiomatic '%s'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104888 --- Comment #2 from Roland Illig --- While here: > "'omp_allocator_handle_kind' kind at %L" Should this be uppercase instead?
[Bug fortran/104888] diagnostics use non-idiomatic '%s'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104888 --- Comment #3 from Roland Illig --- While here: > DEPEND clause of depobj Should DEPOBJ be uppercase?
[Bug d/104889] New: [12 Regression] D frontend fails to link on x86_64-linux-gnux32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104889
Bug ID: 104889
Summary: [12 Regression] D frontend fails to link on x86_64-linux-gnux32
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: d
Assignee: ibuclaw at gdcproject dot org
Reporter: doko at gcc dot gnu.org
Target Milestone: ---

seen with trunk 20220302 on x86_64-linux-gnu during stage1, with x86_64-linux-gnux32-gdc-10 as the bootstrap compiler.

/usr/bin/ld: auto-profile.o: in function `__gnu_cxx::new_allocator > >::allocate(unsigned int, void const*)':
/usr/include/c++/11/ext/new_allocator.h:116: undefined reference to `std::__throw_bad_array_new_length()'
/usr/bin/ld: auto-profile.o: in function `__gnu_cxx::new_allocator > >::allocate(unsigned int, void const*)':
/usr/include/c++/11/ext/new_allocator.h:116: undefined reference to `std::__throw_bad_array_new_length()'
/usr/bin/ld: auto-profile.o: in function `__gnu_cxx::new_allocator > >::allocate(unsigned int, void const*)':
/usr/include/c++/11/ext/new_allocator.h:116: undefined reference to `std::__throw_bad_array_new_length()'
/usr/bin/ld: auto-profile.o: in function `__gnu_cxx::new_allocator const, autofdo::function_instance*> > >::allocate(unsigned int, void const*)':
/usr/include/c++/11/ext/new_allocator.h:116: undefined reference to `std::__throw_bad_array_new_length()'
/usr/bin/ld: auto-profile.o: in function `__gnu_cxx::new_allocator > >::allocate(unsigned int, void const*)':
/usr/include/c++/11/ext/new_allocator.h:116: undefined reference to `std::__throw_bad_array_new_length()'
/usr/bin/ld: auto-profile.o:/usr/include/c++/11/ext/new_allocator.h:116: more undefined references to `std::__throw_bad_array_new_length()' follow
collect2: error: ld returned 1 exit status
make[5]: *** [../../src/gcc/d/Make-lang.in:228: d21] Error 1
[Bug d/104889] [12 Regression] D frontend fails to link on x86_64-linux-gnux32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104889 --- Comment #1 from Andrew Pinski --- > /usr/include/c++/11/ext/new_allocator.h > x86_64-linux-gnux32-gdc-10 Hmm, mixing the library from GCC 11 but compiling with g++-10
[Bug fortran/104888] diagnostics use non-idiomatic '%s'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104888 --- Comment #4 from Roland Illig --- While here: > requiries typo: should be requires
[Bug d/104889] [12 Regression] D frontend fails to link on x86_64-linux-gnux32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104889 --- Comment #2 from Matthias Klose --- > Hmm, mixing the library from GCC 11 but compiling with g++-10 ok, I'll check that
[Bug fortran/104888] diagnostics use non-idiomatic '%s'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104888 --- Comment #5 from Roland Illig --- Related, in trans-openmp.cc: > "specified at %L " The space at the end is too much.
[Bug c++/97198] __is_constructible(int[], int) should return true
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97198 Zhihao Yuan changed: What|Removed |Added CC||lichray at gmail dot com --- Comment #5 from Zhihao Yuan --- Encountered this today. In case I cannot show up when discussing LWG3486, my use case is that C(in_place_type, a, b, c) should "just work." It's up to C how to deal with it. In my case, it's new T[].