[Bug tree-optimization/110760] slp introduces new overflow arithmetic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110760

Richard Biener changed:

What            |Removed                       |Added
Assignee        |unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Last reconfirmed|                              |2023-07-21
Status          |UNCONFIRMED                   |ASSIGNED
Ever confirmed  |0                             |1
Keywords        |                              |wrong-code

--- Comment #5 from Richard Biener ---
Vector types follow the same rules as scalar types because we eventually lower them to scalar ops. So yes, I think this is a bug.

Now, it will be difficult to exploit since the values are not actually used. One trick would be to CSE with a later

  tem = (unsigned)b[1] - (unsigned)c[1];

where for scalar code FRE would eliminate the above with the earlier a[2] doing

  tem = (unsigned)a[2];

but it currently doesn't do that for vectors and as the vectorizer introduces the wrong behavior we are not going to decompose that to scalars either.

The particular case in question should be easy to fix though.
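As background for the comment above: the casts to unsigned matter because signed overflow is undefined in C, while unsigned arithmetic wraps modulo 2^N. The following is only a minimal sketch of that difference; the helper name `wrapping_sub` is hypothetical and is not taken from the PR's test case.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: do the subtraction in unsigned arithmetic so that
   it wraps modulo 2^N instead of overflowing a signed type.  */
unsigned wrapping_sub (int b, int c)
{
  return (unsigned) b - (unsigned) c;
}

/* Small self-check of the wrapping behaviour.  */
void self_check (void)
{
  /* INT_MIN - 1 would overflow as a signed subtraction; the unsigned
     form wraps to the value of INT_MAX.  */
  assert (wrapping_sub (INT_MIN, 1) == (unsigned) INT_MAX);
  /* 5 - 7 wraps to UINT_MAX - 1, i.e. -2 modulo 2^N.  */
  assert (wrapping_sub (5, 7) == UINT_MAX - 1u);
}
```

A transformation that rewrites such unsigned operations into signed ones introduces overflow that the source did not have, which is the kind of problem the bug title describes.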
[Bug middle-end/110612] text-art: four clang warnings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110612 --- Comment #4 from David Binderman --- (In reply to David Malcolm from comment #3) > Thanks for filing this. You are welcome. > I believe all of these should be fixed by the above commit; please let me > know if any such warnings remain. I don't think this one has been mentioned already: gcc/analyzer/access-diagram.cc:944:26: warning: private field 'm_col_widths' is not used [-Wunused-private-field] $ grep m_col_widths access-diagram.cc m_col_widths (col_widths), m_cell_sizes (m_col_widths, m_row_heights), table_dimension_sizes &m_col_widths; // Reference to shared column widths m_col_widths (col_widths) table_dimension_sizes &m_col_widths; m_col_widths for (auto w : m_col_widths->m_requirements) = new x_aligned_table_widget (std::move (t), m_theme, *m_col_widths); = new x_aligned_x_ruler_widget (*this, m_theme, *m_col_widths); int canvas_w = m_col_widths->m_requirements[table_x]; = (m_col_widths->m_requirements[table_x] * fixed_point) / bit_size; m_col_widths->m_requirements[min_idx] += 1; std::unique_ptr m_col_widths; $
[Bug tree-optimization/110742] [14 Regression] cc1plus ICE "Floating point exception" during SLP and profiled bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110742 --- Comment #13 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:9a8782e63790842d1bfa03e12eecf73c4aaeb1f8 commit r14-2697-g9a8782e63790842d1bfa03e12eecf73c4aaeb1f8 Author: Richard Biener Date: Thu Jul 20 13:09:17 2023 +0200 tree-optimization/110742 - fix latent issue with permuting existing vectors When we materialize a layout we push edge permutes to constant/external defs without checking we can actually do so. For externals defined by vector stmts rather than scalar components we can't. PR tree-optimization/110742 * tree-vect-slp.cc (vect_optimize_slp_pass::get_result_with_layout): Do not materialize an edge permutation in an external node with vector defs. (vect_slp_analyze_node_operations_1): Guard purely internal nodes better. * g++.dg/torture/pr110742.C: New testcase.
[Bug tree-optimization/110742] [14 Regression] cc1plus ICE "Floating point exception" during SLP and profiled bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110742 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #14 from Richard Biener --- Fixed.
[Bug tree-optimization/80574] GCC fail to optimize nested ternary
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574

--- Comment #11 from CVS Commits ---
The trunk branch has been updated by Andrew Pinski:

https://gcc.gnu.org/g:6d449531a60b56ed0f4aeb640aa9e46e4ec35208

commit r14-2698-g6d449531a60b56ed0f4aeb640aa9e46e4ec35208
Author: Andrew Pinski
Date: Thu Jul 20 17:36:29 2023 -0700

    MATCH: Add Max<Max<a,b>,a> -> Max<a,b> simplification

    This adds a simple match pattern to simplify `max<max<a,b>,a>` to `max<a,b>`. Reassociation handles this already (r0-77700-ge969dbde29bfd396259357) but seems like we should be able to handle this even before reassociation.

    This fixes part of PR tree-optimization/80574 but more work is needed to fix it the rest of the way. The original testcase there is fixed but the RTL level is what fixes it the rest of the way.

    OK? Bootstrapped and tested on x86_64-linux-gnu.

    gcc/ChangeLog:

    * match.pd (minmax<minmax<a,b>,a>->minmax<a,b>): New transformation.

    gcc/testsuite/ChangeLog:

    * gcc.dg/tree-ssa/reassoc-12.c: Disable all of the passes that enable match-and-simplify.
    * gcc.dg/tree-ssa/minmax-23.c: New test.
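The identity behind this simplification can be checked at the source level. The sketch below is an illustration only, assuming plain `int` operands; `max_int` and `nested_max` are hypothetical names and this is not the content of minmax-23.c.

```c
#include <assert.h>

/* Plain integer maximum.  */
int max_int (int a, int b)
{
  return a > b ? a : b;
}

/* The unsimplified nested form: max (max (a, b), a).  */
int nested_max (int a, int b)
{
  return max_int (max_int (a, b), a);
}

/* Check over a small range that the nested form always equals
   the single max (a, b).  */
void self_check (void)
{
  for (int a = -3; a <= 3; a++)
    for (int b = -3; b <= 3; b++)
      assert (nested_max (a, b) == max_int (a, b));
}
```

The outer max with `a` is redundant because max (a, b) is already at least `a`.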
[Bug tree-optimization/88540] Issues with vectorization of min/max operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88540 --- Comment #8 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:9f8f37f5490076b10436993fb90d18092a960922 commit r14-2699-g9f8f37f5490076b10436993fb90d18092a960922 Author: Richard Biener Date: Thu Jul 13 08:58:58 2023 +0200 tree-optimization/88540 - FP x > y ? x : y if-conversion without -ffast-math The following makes sure that FP x > y ? x : y style max/min operations are if-converted at the GIMPLE level. While we can neither match it to MAX_EXPR nor .FMAX as both have different semantics with IEEE than the ternary ?: operation we can make sure to maintain this form as a COND_EXPR so backends have the chance to match this to instructions their ISA offers. The patch does this in phiopt where we recognize min/max and instead of giving up when we have to honor NaNs we alter the generated code to a COND_EXPR. This resolves PR88540 and we can then SLP vectorize the min operation for its testcase. It also resolves part of the regressions observed with the change matching bit-inserts of bit-field-refs to vec_perm. Expansion from a COND_EXPR rather than from compare-and-branch gcc.target/i386/pr54855-9.c by producing extra moves while the corresponding min/max operations are now already synthesized by RTL expansion, register selection isn't optimal. This can be also provoked without this change by altering the operand order in the source. I have XFAILed that part of the test. PR tree-optimization/88540 * tree-ssa-phiopt.cc (minmax_replacement): Do not give up with NaNs but handle the simple case by if-converting to a COND_EXPR. * gcc.target/i386/pr88540.c: New testcase. * gcc.target/i386/pr54855-9.c: XFAIL check for redundant moves. * gcc.target/i386/pr54855-12.c: Adjust. * gcc.target/i386/pr54855-13.c: Likewise. * gcc.target/i386/pr110170.c: Likewise. * gcc.dg/tree-ssa/split-path-12.c: Likewise.
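To see why the ternary form cannot simply be treated as a symmetric maximum when NaNs are honored, consider the sketch below. It is an illustration under the assumption of default IEEE semantics (no -ffast-math); `ternary_max` and `is_nan` are hypothetical helper names, not part of the patch.

```c
#include <assert.h>
#include <math.h>   /* NAN macro only; no libm functions are called */

/* The source-level form discussed in the commit message.  */
double ternary_max (double x, double y)
{
  return x > y ? x : y;
}

/* A value is NaN exactly when it compares unequal to itself.  */
int is_nan (double v)
{
  return v != v;
}

void self_check (void)
{
  /* Any comparison with NaN is false, so the second operand is chosen.  */
  assert (ternary_max (NAN, 1.0) == 1.0);
  /* With the operands swapped the result is NaN: the ternary is not
     symmetric in its operands.  */
  assert (is_nan (ternary_max (1.0, NAN)));
}
```

Because the result depends on operand order when a NaN is involved, keeping the operation as a COND_EXPR preserves exactly this behaviour, as the commit message explains.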
[Bug tree-optimization/80874] gcc does not emit cmov for minmax
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80874 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2023-07-21 Ever confirmed|0 |1 Component|target |tree-optimization Depends on||67962 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org --- Comment #4 from Andrew Pinski --- This is basically PR 67962. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67962 [Bug 67962] Optimization opportunity with conditional swap to two MIN/MAX in phiopt
[Bug tree-optimization/80574] GCC fail to optimize nested ternary
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80574 --- Comment #12 from Andrew Pinski --- Note the original example in comment #0 is now optimized for GCC 14 but only at the RTL level rather than the gimple level.
[Bug target/110741] vec_ternarylogic intrinsic generates incorrect code on POWER10 target when compiled with GCC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110741 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org CC||linkw at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2023-07-21 --- Comment #1 from Kewen Lin --- Confirmed.
[Bug libfortran/110759] [14 Regression] IEEE Fortran change broke RISC-V linux build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110759 --- Comment #11 from Francois-Xavier Coudert --- Thanks Andrew for fixing it, my mistake. Apologies to everyone.
[Bug target/110741] vec_ternarylogic intrinsic generates incorrect code on POWER10 target when compiled with GCC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110741

Kewen Lin changed:

What|Removed |Added
CC  |        |bergner at gcc dot gnu.org,
    |        |segher at gcc dot gnu.org

--- Comment #2 from Kewen Lin ---
It exposed an issue with the output format of xxeval's VSX operands, which can be fixed with:

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0c269e4e8d9..1a87f1c0b63 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6586,7 +6586,7 @@ (define_insn "xxeval"
           (match_operand:QI 4 "u8bit_cint_operand" "n")]
          UNSPEC_XXEVAL))]
    "TARGET_POWER10"
-  "xxeval %0,%1,%2,%3,%4"
+  "xxeval %x0,%x1,%x2,%x3,%4"
    [(set_attr "type" "vecperm")
     (set_attr "prefixed" "yes")])
[Bug middle-end/110761] New: No warning for an uninitialized variable when used within its own initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110761

Bug ID: 110761
Summary: No warning for an uninitialized variable when used within its own initialization
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: jwzeng at nuaa dot edu.cn
Target Milestone: ---

Link to the Compiler Explorer: https://godbolt.org/z/ra7P3EcMn

There is no warning for a variable that is uninitialized when used within its own initialization.

$ cat test.c
int main(void) {
    int a = 1, b = 2;
    int g = (a % ((~0) && b)) && g;
}
$
$ gcc-trunk -Wall -Wextra -pedantic -c test.c
$
$ clang-16.0.0 -Weverything -pedantic -c test.c
test.c:4:34: warning: variable 'g' is uninitialized when used within its own initialization [-Wuninitialized]
    int g = (a % ((~0) && b)) && g;
        ~                        ^
1 warning generated.
$
$ gcc-trunk -v
gcc (GCC) 14.0.0 20230619 (experimental) [master r14-1917-gf8e0270272]
$ clang-16.0.0 -v
clang version 16.0.0
Target: x86_64-unknown-linux-gnu

Would it be better to output warning information like clang? In gcc-11.4 and earlier versions, a corresponding warning is emitted.
[Bug tree-optimization/110755] [13/14 Regression] Wrong optimization of fabs on ppc64el at -O1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110755

--- Comment #5 from Jakub Jelinek ---
A big hammer solution might be to treat flag_rounding_math in frange::set the same as !HONOR_SIGNED_ZEROS, i.e. always extend [0, x] ranges to [-0, x] and [y, -0] to [y, 0] because we don't know what the rounding will do:

-  else if (!HONOR_SIGNED_ZEROS (m_type))
+  else if (!HONOR_SIGNED_ZEROS (m_type) || flag_rounding_math)
     {
       if (real_iszero (&m_max, 1))
         m_max.sign = 0;
       if (real_iszero (&m_min, 0))
         m_min.sign = 1;
     }

Though, such a change would affect even say operator_abs handling where we even for flag_rounding_math are guaranteed the sign will be positive (unless -fno-signed-zeros, in that case it is right we don't assume anything).

Or do it in range_operator::fold_range? Or some other spot? Generally, operations like neg, abs, comparisons should be fine, but +-*/ at least in the fold_range direction probably need to do that.
[Bug target/106952] Missed optimization: x < y ? x : y not lowered to minss
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106952

Richard Biener changed:

What    |Removed                    |Added
CC      |                           |rguenth at gcc dot gnu.org,
        |                           |uros at gcc dot gnu.org
Status  |ASSIGNED                   |NEW
Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org

--- Comment #5 from Richard Biener ---
After the latest fixes we still fail to recognize min/max early for float < 0.0 ? float : 0.0 because prepare_cmp_insn doesn't push the FP 0.0 constant to a reg since RTX cost for this seems to be zero.

We then call insn_operand_matches which ultimately fails in ix86_fp_comparison_operator as ix86_fp_comparison_strategy is IX86_FPCMP_COMI here and ix86_trivial_fp_comparison_operator for

  (lt (reg/v:SF 110 [ t2 ]) (const_double:SF 0.0 [0x0.0p+0]))

returns false.

If I fix things so we try

  (gt (const_double:SF 0.0 [0x0.0p+0]) (reg:SF ..))

then maybe_legitimize_operands "breaks" things here since it forces the cond operand to a register but not the comparison operand so ix86_expand_fp_movcc again FAILs.

I'm not sure why the x86 backend allows any CONST_DOUBLE as part of comparisons (during expansion only?). This and maybe special-handling of rtx_cost with this special constant and LT/GT code makes the first compares not recognized as MIN/MAX.

The rest is fixed now.

Patch for trying (gt ..):

diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 32ff379ffc3..3ff8ba88bbb 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -4607,6 +4607,14 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code comparison, rtx size,
       break;
     }

+  if (FLOAT_MODE_P (mode))
+    {
+      prepare_cmp_insn (y, x, swap_condition (comparison),
+                        size, unsignedp, methods, ptest, pmode);
+      if (*ptest)
+        return;
+    }
+
   if (methods != OPTAB_LIB_WIDEN)
     goto fail;
[Bug target/110170] Sub-optimal conditional jumps in conditional-swap with floating point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110170 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED|RESOLVED --- Comment #19 from Richard Biener --- Fixed now.
[Bug tree-optimization/67962] Optimization opportunity with conditional swap to two MIN/MAX in phiopt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67962 --- Comment #4 from Richard Biener --- (In reply to Andrew Pinski from comment #3) > Mine, but for gcc 13. The main problem I see if two cmov might be slower > than a branch on x86_64 processors. Two cmov definitely, a min/max pair not. Now, phiopt will turn : if (y_16(D) < x_17(D)) goto ; [INV] else goto ; [INV] : : # x_14 = PHI if (y_16(D) < x_17(D)) goto ; [INV] else goto ; [INV] : : # y_15 = PHI into the desired pair but fails for the equivalent : if (y_16(D) < x_17(D)) goto ; [INV] else goto ; [INV] : : # x_14 = PHI # y_15 = PHI We do value-replacement for more than one PHI but not others, not exactly sure why. We could dry-run convert all PHIs and only if that succeeds and the condition goes away perform the transforms. Of course some transforms might still not be profitable then.
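At the source level, the transformation this PR asks for amounts to replacing a conditional swap with a MIN/MAX pair. The sketch below illustrates that equivalence for `int` operands only; the function names are hypothetical and the code is not the GIMPLE quoted in the comment.

```c
#include <assert.h>

/* Conditional swap: afterwards *x <= *y.  */
void cond_swap (int *x, int *y)
{
  if (*y < *x)
    {
      int t = *x;
      *x = *y;
      *y = t;
    }
}

/* The same result expressed as a min/max pair with no swap.  */
void minmax_pair (int *x, int *y)
{
  int lo = *y < *x ? *y : *x;   /* MIN (x, y) */
  int hi = *y < *x ? *x : *y;   /* MAX (x, y) */
  *x = lo;
  *y = hi;
}

/* Check that both forms agree over a small range of inputs.  */
void self_check (void)
{
  for (int a = -2; a <= 2; a++)
    for (int b = -2; b <= 2; b++)
      {
        int x1 = a, y1 = b, x2 = a, y2 = b;
        cond_swap (&x1, &y1);
        minmax_pair (&x2, &y2);
        assert (x1 == x2 && y1 == y2);
      }
}
```

Whether the min/max form is profitable depends on the target, as the comment notes for two cmov versus a min/max pair.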
[Bug middle-end/110754] assume create spurious load for volatile variable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110754 --- Comment #6 from Martin Uecker --- One should check whether there is a similar issue with atomics, at least regarding accidentally introducing memory ordering or so.
[Bug tree-optimization/110755] [13/14 Regression] Wrong optimization of fabs on ppc64el at -O1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110755 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #6 from Jakub Jelinek --- Created attachment 55594 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55594&action=edit gcc14-pr110755.patch Untested patch.
[Bug target/110762] New: inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 Bug ID: 110762 Summary: inappropriate use of SSE (or AVX) insns for v2sf mode operations Product: gcc Version: 13.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: jbeulich at suse dot com Target Milestone: --- Perhaps related to work done for bug 95046, this code typedef float __attribute__((vector_size(8))) v2sf_t; typedef float __attribute__((vector_size(16))) v4sf_t; v2sf_t test(v4sf_t x, v4sf_t y) { v2sf_t x2, y2; __builtin_ia32_storelps(&x2, x); __builtin_ia32_storelps(&y2, y); return x2 + y2; } compiled for a 64-bit target with -O2 translates to a single addps (besides the ret instruction of course), coming from *mmx_addv2sf3. This cannot be right: The contents of the upper halves of both registers aren't known at this point, so the extra care mentioned in 95046 does not look to be applied here.
[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717 --- Comment #11 from Jakub Jelinek --- To handle this in generic code, I think expand_expr_real_2 would need to notice this case of << operand of arithmetic >> by same amount and tell that to expand_variable_shift -> expand_shift_1 -> expand_binop somehow. Wasn't there a proposal for SEXT_EXPR at some point?
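The pattern referred to here, a left shift followed by an arithmetic right shift by the same amount, is the usual way to write a sign extension of the low bits. The sketch below is an assumption-laden illustration: it uses a single 64-bit word rather than the double-word case of the PR, the name `sext_via_shifts` is hypothetical, and it relies on GCC's documented behaviour for converting to a signed type and for right-shifting negative values.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low (64 - n) bits of x, for 0 <= n < 64.
   The left shift is done on an unsigned value to avoid undefined
   behaviour; the right shift is arithmetic on GCC.  */
int64_t sext_via_shifts (int64_t x, unsigned n)
{
  return (int64_t) ((uint64_t) x << n) >> n;
}

void self_check (void)
{
  /* Low 5 bits 11111 have the sign bit set: result is -1.  */
  assert (sext_via_shifts (0x1F, 59) == -1);
  /* Low 5 bits 01111 are non-negative: result is 15.  */
  assert (sext_via_shifts (0x0F, 59) == 15);
  /* Low 5 bits 10000 give the most negative 5-bit value.  */
  assert (sext_via_shifts (0x10, 59) == -16);
}
```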
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #1 from Richard Biener --- So what's the issue? That this is wrong for -ftrapping-math? Or that the return value has undefined contents in the upper half? (I don't think the ABI specifies how V2SF is returned)
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 Richard Biener changed: What|Removed |Added Keywords||wrong-code Last reconfirmed||2023-07-21 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #2 from Richard Biener --- The (insn 13 4 14 2 (set (reg:V2SF 20 xmm0 [orig:91 x2 ] [91]) (vec_select:V2SF (reg:V4SF 20 xmm0 [94]) (parallel [ (const_int 0 [0]) (const_int 1 [0x1]) ]))) "t.c":10:12 4394 {sse_storelps} (nil)) insns are gone in split after reload.
[Bug middle-end/110761] No warning for an uninitialized variable when used within its own initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110761 Richard Biener changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #1 from Richard Biener --- Because g is only used conditionally and that condition is false. Does clang diagnose int g = 0 && g; ? If you write int g = 1 && g; we diagnose the use.
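The reasoning can be traced with the values from the report: `(~0) && b` is 1 for nonzero `b`, and `a % 1` is 0, so the left operand of the outer `&&` is false and the right operand is never evaluated. The sketch below makes that explicit; `left_operand` and `guarded` are hypothetical helpers, and the null pointer is used only to show that the right-hand side is not read.

```c
#include <assert.h>
#include <stddef.h>

/* Left operand of the outer && in the report's test.c.
   Only meaningful for nonzero b (otherwise it divides by zero).  */
int left_operand (int a, int b)
{
  return a % ((~0) && b);
}

/* Same shape as the initializer: *g is read only if the left
   operand is nonzero, because && short-circuits.  */
int guarded (int a, int b, const int *g)
{
  return left_operand (a, b) && *g;
}

void self_check (void)
{
  /* (~0) && 2 is 1, and 1 % 1 is 0.  */
  assert (left_operand (1, 2) == 0);
  /* The pointer is never dereferenced since the left operand is 0.  */
  assert (guarded (1, 2, NULL) == 0);
}
```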
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #3 from Uroš Bizjak --- (In reply to Richard Biener from comment #1) > So what's the issue? That this is wrong for -ftrapping-math? Or that the > return value has undefined contents in the upper half? (I don't think the > ABI specifies how V2SF is returned) __m64 is classified as SSE class, returned in XMM register.
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Comment #4 from Alexander Monakov --- In addition to FPU exception issue, it's also a performance trap due to handling of accidental denormals in upper halves.
[Bug testsuite/110763] New: FAIL: gcc.dg/ubsan/object-size-dyn.c -O2 execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110763 Bug ID: 110763 Summary: FAIL: gcc.dg/ubsan/object-size-dyn.c -O2 execution test Product: gcc Version: 13.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- I see this testcase FAILing because it ends up returning uninitialized memory. This can be seen when enabling glibc malloc perturbing. The 'off' method ands with zero to avoid this. int __attribute__ ((noinline)) dyn (int size, int i) { __builtin_printf ("dyn\n"); fflush (stdout); int *alloc = __builtin_calloc (size, sizeof (int)); int ret = alloc[i]; __builtin_free (alloc); return ret; } ... int main (void) { int ret = dyn (2, 2); ret |= off (4, 4, 0); return ret;
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comment #5 from Segher Boessenkool --- (In reply to Richard Biener from comment #2) > The > >(insn 13 4 14 2 (set (reg:V2SF 20 xmm0 [orig:91 x2 ] [91]) > (vec_select:V2SF (reg:V4SF 20 xmm0 [94]) > (parallel [ > (const_int 0 [0]) > (const_int 1 [0x1]) > ]))) "t.c":10:12 4394 {sse_storelps} > (nil)) > > insns are gone in split after reload. Insns 13 and 14 are deleted by split2, yes. Although the very next insn (15) obviously uses the regs (20 and 21) those insns set?!
[Bug c++/102854] [OpenMP] Bogus "initializer expression refers to iteration variable" when using templates
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102854

Tobias Burnus changed:

What      |Removed     |Added
Resolution|---         |FIXED
Status    |UNCONFIRMED |RESOLVED

--- Comment #5 from Tobias Burnus ---
Close as FIXED - based on
* my previous comment (comment 4)
* based on the following:

(In reply to Jakub Jelinek from comment #2)
> WIP patch. Clearly still more work is needed, apparently pointer iterators
> in non-rectangular loops are rejected, like:

(example that is committed as part of PR106449)

> and enabling it result in ICEs during omp-expand.c. Furthermore, for both
> pointer and random access iterator non-rect loops,

I think we cannot really use non-rectangular loops with random-access iterators. It works fine as long as the outer and the inner loop access different variables, an example for this is libgomp.c++/collapse-2.C
→ https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgomp/testsuite/libgomp.c++/collapse-2.C;hb=HEAD

But I currently see no way to use in the outer loop
* a C++ iterator
* a range-based for loop
* a C-style index-variable loop
and then to use 'outer' in any way, except by having yet another classic C-style loop. Everything else I came up with ends up using an expression involving the outer iteration variable and not the bare variable. But that's rejected ("expression refers to iteration variable" error).

> I should verify we only
> allow the var-outer, var-outer + a2, a2 + var-outer and var-outer - a2 forms
> and no others and test code generation.

I believe that's what r12-4733-g2084b5f42a4432da8b0625f9c669bf690ec46468 does in c-c++-common/gomp/loop-9.c
→ https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/testsuite/c-c++-common/gomp/loop-9.c;hb=HEAD

(Admittedly only for "initializer expression" and not for "condition expression" or "increment expression", but I can confirm that at least the condition error works.)
[Bug middle-end/110764] New: [12/13/14 Regression] False positive -Warray-bounds warning swapping std::thread::id
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110764 Bug ID: 110764 Summary: [12/13/14 Regression] False positive -Warray-bounds warning swapping std::thread::id Product: gcc Version: 12.3.1 Status: UNCONFIRMED Keywords: diagnostic Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: redi at gcc dot gnu.org Target Milestone: --- #include #include #include void g(int n); void g(int n) { std::cout << "consume: " << n << std::endl; } int main(void) { std::vector threads(1); // The compiler error comes from the two threads below. threads[0] = std::thread(g, 42); threads[1] = std::thread(g, 42); threads[0].join(); threads[1].join(); return (0); } Compiled with -Warray-bounds -O2 this gives a false positive: In file included from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/exception_ptr.h:43, from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/exception:168, from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/ios:39, from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/ostream:38, from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/iostream:39, from max.cc:1: In function ‘std::_Require >, std::is_move_constructible<_Tp>, std::is_move_assignable<_Tp> > std::swap(_Tp&, _Tp&) [with _Tp = thread::id]’, inlined from ‘void std::thread::swap(std::thread&)’ at /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/std_thread.h:171:16, inlined from ‘std::thread& std::thread::operator=(std::thread&&)’ at /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/std_thread.h:165:11, inlined from ‘int main()’ at max.cc:15:35: /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/move.h:205:7: warning: array subscript 1 is outside array bounds of ‘std::thread [1]’ [-Warray-bounds] 205 | __a = _GLIBCXX_MOVE(__b); | ^~~ In file included from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/x86_64-pc-linux-gnu/bits/c++allocator.h:33, from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/allocator.h:46, from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/string:41, from 
/home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/locale_classes.h:40, from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/ios_base.h:41, from /home/jwakely/gcc/12.1.0/include/c++/12.1.0/ios:42: In member function ‘_Tp* std::__new_allocator<_Tp>::allocate(size_type, const void*) [with _Tp = std::thread]’, inlined from ‘static _Tp* std::allocator_traits >::allocate(allocator_type&, size_type) [with _Tp = std::thread]’ at /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/alloc_traits.h:464:28, inlined from ‘std::_Vector_base<_Tp, _Alloc>::pointer std::_Vector_base<_Tp, _Alloc>::_M_allocate(std::size_t) [with _Tp = std::thread; _Alloc = std::allocator]’ at /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/stl_vector.h:378:33, inlined from ‘void std::_Vector_base<_Tp, _Alloc>::_M_create_storage(std::size_t) [with _Tp = std::thread; _Alloc = std::allocator]’ at /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/stl_vector.h:395:44, inlined from ‘std::_Vector_base<_Tp, _Alloc>::_Vector_base(std::size_t, const allocator_type&) [with _Tp = std::thread; _Alloc = std::allocator]’ at /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/stl_vector.h:332:26, inlined from ‘std::vector<_Tp, _Alloc>::vector(size_type, const allocator_type&) [with _Tp = std::thread; _Alloc = std::allocator]’ at /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/stl_vector.h:552:47, inlined from ‘int main()’ at max.cc:12:39: /home/jwakely/gcc/12.1.0/include/c++/12.1.0/bits/new_allocator.h:137:55: note: at offset 8 into object of size 8 allocated by ‘operator new’ 137 | return static_cast<_Tp*>(_GLIBCXX_OPERATOR_NEW(__n * sizeof(_Tp))); | ^ This started with r12-1992
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #6 from Richard Biener --- (In reply to Segher Boessenkool from comment #5) > (In reply to Richard Biener from comment #2) > > The > > > >(insn 13 4 14 2 (set (reg:V2SF 20 xmm0 [orig:91 x2 ] [91]) > > (vec_select:V2SF (reg:V4SF 20 xmm0 [94]) > > (parallel [ > > (const_int 0 [0]) > > (const_int 1 [0x1]) > > ]))) "t.c":10:12 4394 {sse_storelps} > > (nil)) > > > > insns are gone in split after reload. > > Insns 13 and 14 are deleted by split2, yes. Although the very next insn > (15) obviously uses the regs (20 and 21) those insns set?! set_noop_p returns true for it ...
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #7 from Richard Biener --- I guess for the specific usage we need to wrap this in an UNSPEC?
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #8 from Richard Biener --- OTOH the set isn't noop for the xmm0 hardreg (it zeros the upper parts)
[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717 --- Comment #12 from Segher Boessenkool --- (In reply to Jakub Jelinek from comment #9) > Wonder how many important targets provide double-word shift patterns vs. > ones which expand it through generic code. Very long ago rs6000 had special code for this. That was sub-optimal in other ways, and the generic code generated almost ideal code (sometimes an extra data movement insn). > powerpc probably could be improved: > foo: > srwi 9,4,5 > mr 10,9 > rlwimi 4,9,5,0,31-5 > rlwimi 10,3,27,0,31-27 > srawi 3,10,27 > blr This is hugely worse than what we used to do, it seems? GCC 8 did srdi 9,4,5 rldimi 9,3,59,0 rldimi 4,9,5,0 sradi 3,9,59 blr GCC 9 started with the unnecessary move. But we should get only one insert insn in any case!
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #9 from jbeulich at suse dot com --- (In reply to Richard Biener from comment #1) > So what's the issue? That this is wrong for -ftrapping-math? Even without that option MXCSR may be modified for reasons contained to just the upper halves of the registers. > Or that the > return value has undefined contents in the upper half? (I don't think the > ABI specifies how V2SF is returned) This part is fine, aiui.
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #10 from Uroš Bizjak --- (In reply to Richard Biener from comment #7) > I guess for the specific usage we need to wrap this in an UNSPEC? Probably, so a MOVQ xmm, xmm insn should be emitted for __builtin_ia32_storelps (AKA _mm_storel_pi), so the top 64bits will be cleared. There is already *vec_concatv4sf_0 that looks appropriate to implement the move.
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #11 from Richard Biener --- (In reply to Richard Biener from comment #2) > The > >(insn 13 4 14 2 (set (reg:V2SF 20 xmm0 [orig:91 x2 ] [91]) > (vec_select:V2SF (reg:V4SF 20 xmm0 [94]) > (parallel [ > (const_int 0 [0]) > (const_int 1 [0x1]) > ]))) "t.c":10:12 4394 {sse_storelps} > (nil)) > > insns are gone in split after reload. The opinion is that the above insn leaves the upper half of xmm0 undefined.
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #12 from Richard Biener --- _mm_storel_pi could be implemented using __builtin_shufflevector these days. Which shows exactly the same issue: typedef float __attribute__((vector_size(8))) v2sf_t; typedef float __attribute__((vector_size(16))) v4sf_t; v2sf_t test(v4sf_t x, v4sf_t y) { v2sf_t x2, y2; x2 = __builtin_shufflevector (x, x, 0, 1); y2 = __builtin_shufflevector (y, x, 0, 1); return x2 + y2; } expands to (insn 7 4 8 2 (set (reg:DI 88) (vec_select:DI (subreg:V2DI (reg/v:V4SF 85 [ x ]) 0) (parallel [ (const_int 0 [0]) ]))) "t.c":7:5 -1 (nil)) (insn 8 7 9 2 (set (reg:DI 89) (vec_select:DI (subreg:V2DI (reg/v:V4SF 86 [ y ]) 0) (parallel [ (const_int 0 [0]) ]))) "t.c":8:5 -1 (nil)) (insn 9 8 10 2 (set (reg:V2SF 87) (plus:V2SF (subreg:V2SF (reg:DI 88) 0) (subreg:V2SF (reg:DI 89) 0))) "t.c":12:12 -1 (nil)) and is recognized by the same set_noop_p code. On GIMPLE we have x2_2 = BIT_FIELD_REF ; y2_4 = BIT_FIELD_REF ; _5 = x2_2 + y2_4;
[Bug libstdc++/110077] [14 regression] libstdc++-abi/abi_check FAILs on Solaris
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110077 --- Comment #18 from ro at CeBiTec dot Uni-Bielefeld.DE --- > --- Comment #17 from Jonathan Wakely --- > I hope this is fixed now. It is indeed. Thanks a lot.
[Bug c++/110765] New: [13 regression] constraints on parameters of derived type in CRTP base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110765 Bug ID: 110765 Summary: [13 regression] constraints on parameters of derived type in CRTP base Product: gcc Version: 13.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: h2+bugs at fsfe dot org Target Milestone: --- Created attachment 55595 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55595&action=edit GCC12's intermediate code; build with -std=c++20 I have large parts of library code that fail to build with GCC13, although they build with GCC10-12 and Clang-16. I have attached GCC12's intermediate code that compiles cleanly with GCC12 but fails to build with GCC13. The error is the same as when building the actual code. See logs below. Playing a bit around with the code seems to indicate that constrained friend declarations in the CRTP base that references the derived class could be the problem: Triggering code: ```cpp friend constexpr auto tag_invoke(custom::to_char, derived_type const a) noexcept requires(requires { { a.to_char() }; }) { return a.to_char(); } ``` Workaround: ```cpp friend constexpr auto tag_invoke(custom::to_char, std::same_as auto const a) noexcept requires(requires { { a.to_char() }; }) { return a.to_char(); } ``` It seems that GCC13 tries to check the constraints on the derived class earlier than previous GCCs and Clang, and apparently before the type is complete. 
Passes: % g++12 --version g++12 (FreeBSD Ports Collection) 12.2.0 %g++12 -std=c++20 a-test.ii % Fails: % g++13 --version g++13 (FreeBSD Ports Collection) 13.1.1 20230701 % g++13 -std=c++20 a-test.ii In file included from /usr/local/lib/gcc12/include/c++/bits/stl_pair.h:60, from /usr/local/lib/gcc12/include/c++/bits/stl_algobase.h:64, from /usr/local/lib/gcc12/include/c++/vector:60, from /home/hannes/devel/biocpp-core/include/bio/alphabet/nucleotide/dna4.hpp:16, from test.cpp:1: /usr/local/lib/gcc12/include/c++/type_traits:1522:12: error: expected identifier before '__is_nothrow_convertible' 1522 | struct __is_nothrow_convertible |^~~~ /usr/local/lib/gcc12/include/c++/type_traits:1522:12: error: expected unqualified-id before '__is_nothrow_convertible' /usr/local/lib/gcc12/include/c++/type_traits:3083:66: error: template argument 2 is invalid 3083 | __is_nothrow_convertible> | ^~ In file included from /usr/local/lib/gcc12/include/c++/memory:76, from /home/hannes/devel/biocpp-core/include/bio/meta/detail/type_inspection.hpp:21, from /home/hannes/devel/biocpp-core/include/bio/alphabet/concept.hpp:22, from /home/hannes/devel/biocpp-core/include/bio/alphabet/base.hpp:20, from /home/hannes/devel/biocpp-core/include/bio/alphabet/nucleotide/nucleotide_base.hpp:16, from /home/hannes/devel/biocpp-core/include/bio/alphabet/nucleotide/dna4.hpp:18: /usr/local/lib/gcc12/include/c++/bits/unique_ptr.h:536:15: error: expected identifier before '__remove_cv' 536 | using __remove_cv = typename remove_cv<_Up>::type; | ^~~ /usr/local/lib/gcc12/include/c++/bits/unique_ptr.h:536:27: error: expected '(' before '=' token 536 | using __remove_cv = typename remove_cv<_Up>::type; | ^ | ( /usr/local/lib/gcc12/include/c++/bits/unique_ptr.h:536:27: error: expected type-specifier before '=' token /usr/local/lib/gcc12/include/c++/bits/unique_ptr.h:536:27: error: expected unqualified-id before '=' token /usr/local/lib/gcc12/include/c++/bits/unique_ptr.h:542:69: error: wrong number of template 
arguments (1, should be 2) 542 | __not_, __remove_cv<_Up>>> >; | ^~ /usr/local/lib/gcc12/include/c++/type_traits:635:12: note: provided for 'template struct std::is_same' 635 | struct is_same; |^~~ /usr/local/lib/gcc12/include/c++/bits/unique_ptr.h:542:71: error: template argument 1 is invalid 542 | __not_, __remove_cv<_Up>>> >; | ^ /usr/local/lib/gcc12/include/c++/bits/unique_ptr.h:542:73: error: template argument 2 is invalid 542 | __not_, __remove_cv<_Up>>> >; | ^ /usr/local/lib/gcc12/include/c++/type_traits: In instantiation of 'struct std::is_nothrow_destructible': /home/hannes/devel/biocpp-core/include/bio/alphabet/base.hpp:260:27: recursively required from 'consteval auto bio::alp
[Bug libfortran/110651] libgfortran.spec links twice with libgcc spec
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110651

--- Comment #3 from ro at CeBiTec dot Uni-Bielefeld.DE ---
> --- Comment #2 from ro at CeBiTec dot Uni-Bielefeld.DE ---
>> --- Comment #1 from Iain Sandoe ---
>> (In reply to Rainer Orth from comment #0)
>>> When bootstrapping current trunk on macOS 14.0 beta 3 with Xcode 15 beta 4,
>>> every single fortran link test FAILs like
>>
>>> * Get rid of %(libgcc) in libgfortran.spec.in.
>>>
>>> * Include it conditionally depending on a configure test.
>>
>> Hmm .. I thought we already had configure tests to customise the spec for
>> Darwin?
>> FX?
>
> We do: @LIBM@ is handled that way.

FWIW, I've now removed the %(libgcc) from libgfortran.spec.in locally: on
all of *-*-solaris2.11, x86_64-pc-unix-gnu, and x86_64-apple-darwin23.0.0
there were no regressions.

It seems ever more important to understand why it was introduced in the
first place, before even considering adding it conditionally.
[Bug tree-optimization/41320] XFAIL gcc.dg/tree-ssa/forwprop-12.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41320 --- Comment #5 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:8cbdb2e4d64461d8a19e033bd33b585187059d8a commit r14-2708-g8cbdb2e4d64461d8a19e033bd33b585187059d8a Author: Richard Biener Date: Fri Jul 21 13:55:43 2023 +0200 tree-optimization/41320 - remove bogus XFAILed testcase gcc.dg/tree-ssa/forwprop-12.c looks for reconstruction of an ARRAY_REF from pointer arithmetic and dereference. That's not safe because ARRAY_REFs carry special semantics we later exploit during data dependence analysis. The following removes the testcase, closing the bug as WONTFIX. PR tree-optimization/41320 * gcc.dg/tree-ssa/forwprop-12.c: Remove.
[Bug tree-optimization/41320] XFAIL gcc.dg/tree-ssa/forwprop-12.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41320 Richard Biener changed: What|Removed |Added Resolution|--- |WONTFIX Status|ASSIGNED|RESOLVED --- Comment #6 from Richard Biener --- the desired transform is undesirable.
[Bug middle-end/55266] vector expansion: 24 movs for 4 adds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55266

Richard Biener changed:

           What    |Removed                     |Added
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Richard Biener ---
The original issue is fixed.

f:
.LFB0:
        .cfi_startproc
        movapd  (%rdi), %xmm2
        movapd  16(%rdi), %xmm1
        movapd  %xmm2, %xmm0
        addpd   %xmm2, %xmm0
        addpd   %xmm2, %xmm0
        movaps  %xmm0, (%rdi)
        movapd  %xmm1, %xmm0
        addpd   %xmm1, %xmm0
        addpd   %xmm1, %xmm0
        movaps  %xmm0, 16(%rdi)
        ret

the issue in comment#4 as well I think:

_Z5dotd1Dv4_fS_:
.LFB3:
        .cfi_startproc
        movaps  %xmm1, %xmm3
        pxor    %xmm2, %xmm2
        movhlps %xmm0, %xmm2
        cvtps2pd        %xmm0, %xmm0
        cvtps2pd        %xmm2, %xmm1
        pxor    %xmm2, %xmm2
        movhlps %xmm3, %xmm2
        cvtps2pd        %xmm3, %xmm3
        cvtps2pd        %xmm2, %xmm2
        mulpd   %xmm3, %xmm0
        mulpd   %xmm2, %xmm1
        addpd   %xmm0, %xmm1
        movapd  %xmm1, %xmm0
        unpckhpd        %xmm1, %xmm0
        addpd   %xmm1, %xmm0
        cvtsd2ss        %xmm0, %xmm0
        ret
        .cfi_endproc
.LFE3:
        .size   _Z5dotd1Dv4_fS_, .-_Z5dotd1Dv4_fS_
        .p2align 4
        .globl  _Z5dotd2Dv4_fS_
        .type   _Z5dotd2Dv4_fS_, @function
_Z5dotd2Dv4_fS_:
.LFB4:
        .cfi_startproc
        movaps  %xmm1, %xmm3
        cvtps2pd        %xmm0, %xmm4
        pxor    %xmm2, %xmm2
        movhlps %xmm0, %xmm2
        pxor    %xmm0, %xmm0
        movhlps %xmm3, %xmm0
        cvtps2pd        %xmm2, %xmm2
        cvtps2pd        %xmm1, %xmm1
        cvtps2pd        %xmm0, %xmm0
        mulpd   %xmm4, %xmm1
        mulpd   %xmm0, %xmm2
        addpd   %xmm2, %xmm1
        movapd  %xmm1, %xmm0
        unpckhpd        %xmm1, %xmm0
        addpd   %xmm1, %xmm0
        cvtsd2ss        %xmm0, %xmm0
        ret
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 55266, which changed state. Bug 55266 Summary: vector expansion: 24 movs for 4 adds https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55266 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug middle-end/88670] [meta-bug] generic vector extension issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670 Bug 88670 depends on bug 55266, which changed state. Bug 55266 Summary: vector expansion: 24 movs for 4 adds https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55266 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/110766] New: ICE on valid code at -O3 on x86_64-linux-gnu: in gimple_phi_arg_def_from_edge, at gimple.h:4699
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110766

            Bug ID: 110766
           Summary: ICE on valid code at -O3 on x86_64-linux-gnu: in gimple_phi_arg_def_from_edge, at gimple.h:4699
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

It seems to be related to the recently fixed PR 110669.

Compiler Explorer: https://godbolt.org/z/85drx858c

[524] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/home/suz/suz-local/software/local/gcc-trunk/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap --enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk --enable-sanitizers --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230721 (experimental) [master r14-924-gd709841ae0f] (GCC)
[525] %
[525] % gcctk -O3 small.c
during GIMPLE pass: sccp
small.c: In function ‘main’:
small.c:4:5: internal compiler error: in gimple_phi_arg_def_from_edge, at gimple.h:4699
    4 | int main() {
      |     ^~~~
0x80fe21 gimple_phi_arg_def_from_edge(gphi const*, edge_def const*)
        ../../gcc-trunk/gcc/gimple.h:4699
0x811332 gimple_phi_arg_def_from_edge(gphi const*, edge_def const*)
        ../../gcc-trunk/gcc/tree.h:3700
0x811332 final_value_replacement_loop(loop*)
        ../../gcc-trunk/gcc/tree-scalar-evolution.cc:3733
0x118f795 execute
        ../../gcc-trunk/gcc/tree-ssa-loop.cc:411
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
[526] %
[526] % cat small.c
int a, b, c, e;
short d, f;
int g(int h) { return h > a ? h : h << a; }
int main() {
  while (e) {
    b = 0;
    for (; b < 3; b++)
      if (c) {
        e = g(1);
        f = e | d;
      }
    d = 0;
  }
  return 0;
}
[Bug target/110767] New: vec_{fm,}{addsub,subadd} are missing for AVX512 modes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110767 Bug ID: 110767 Summary: vec_{fm,}{addsub,subadd} are missing for AVX512 modes Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- See PR54939 for a testcase.
[Bug tree-optimization/54939] Very poor vectorization of loops with complex arithmetic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939

Richard Biener changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
                 CC|                            |crazylht at gmail dot com

--- Comment #14 from Richard Biener ---
With SSE4.2 we now get

.L3:
        movupd  (%rdx,%rax), %xmm0
        movupd  (%rcx,%rax), %xmm4
        movapd  %xmm0, %xmm1
        palignr $8, %xmm0, %xmm0
        mulpd   %xmm3, %xmm1
        mulpd   %xmm2, %xmm0
        addpd   %xmm4, %xmm1
        addsubpd        %xmm0, %xmm1
        movups  %xmm1, (%rcx,%rax)
        addq    $16, %rax
        cmpq    %rsi, %rax
        jne     .L3

with AVX and FMA

.L4:
        vmovupd (%rdx,%rax), %ymm0
        vmovapd %ymm4, %ymm1
        vfmadd213pd     (%rcx,%rax), %ymm0, %ymm1
        vpermilpd       $5, %ymm0, %ymm0
        vmulpd  %ymm3, %ymm0, %ymm0
        vaddsubpd       %ymm0, %ymm1, %ymm1
        vmovupd %ymm1, (%rcx,%rax)
        addq    $32, %rax
        cmpq    %rsi, %rax
        jne     .L4

so I'd say fixed. But. With AVX512 we now get

.L4:
        vmovupd (%rdi,%rax), %zmm0
        vmovapd %zmm7, %zmm2
        vmovapd %zmm4, %zmm6
        vfmadd213pd     (%rcx,%rax), %zmm0, %zmm2
        vpermilpd       $85, %zmm0, %zmm0
        vfmadd132pd     %zmm0, %zmm2, %zmm6
        vfnmadd132pd    %zmm4, %zmm2, %zmm0
        vmovapd %zmm6, %zmm0{%k1}
        vmovupd %zmm0, (%rcx,%rax)
        addq    $64, %rax
        cmpq    %rax, %rsi
        jne     .L4

it's odd that this only happens with -mprefer-vector-width=512 though. Do we
possibly miss vec_{fm,}{addsub,subadd} for those? Looks like so. Tracking in
PR110767. The vectorizer side is fixed.
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 54939, which changed state. Bug 54939 Summary: Very poor vectorization of loops with complex arithmetic https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/37021] Fortran Complex reduction / multiplication not vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37021 Bug 37021 depends on bug 54939, which changed state. Bug 54939 Summary: Very poor vectorization of loops with complex arithmetic https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug target/84361] Fails to use vfmaddsub* for complex multiplication
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84361 Richard Biener changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #3 from Richard Biener --- Really a duplicate of PR81904 which tracks a little bit more. *** This bug has been marked as a duplicate of bug 81904 ***
[Bug tree-optimization/54939] Very poor vectorization of loops with complex arithmetic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54939 Bug 54939 depends on bug 84361, which changed state. Bug 84361 Summary: Fails to use vfmaddsub* for complex multiplication https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84361 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE
[Bug target/81904] FMA and addsub instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904 --- Comment #3 from Richard Biener --- *** Bug 84361 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 84361, which changed state. Bug 84361 Summary: Fails to use vfmaddsub* for complex multiplication https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84361 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #13 from Uroš Bizjak --- I think we should put all partial vector V2SF operations under !flag_trapping_math.
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

--- Comment #18 from rsandifo at gcc dot gnu.org ---
I'd understood LLVM's undef as essentially being “unspecified”, or
“unspecified bit-pattern” to quote the docs. It doesn't indicate undefined
behaviour in the C/C++ sense:

  Undefined values are useful because they indicate to the compiler that
  the program is well defined no matter what value is used.

And I think that's what we want here. The reason we have
TARGET_PREFERRED_ELSE_VALUE is that the vectoriser sometimes doesn't care
what values the inactive lanes of the result have. The else value can be
anything without affecting the validity of the program. So if we had undef,
we wouldn't need the hook.

I think the same thing applies to a VEC_PERM_EXPR that only selects from the
first vector. We canonicalise that by duplicating the vector input, but IMO
an undef second operand would be more accurate.

An undef value would also allow us to represent “don't care” indices in a
permute index vector, such as -1 in a __builtin_shuffle call. (There were
times when I wanted the same thing in the vectoriser too, but I can't
remember where.) There again, a separate “care/don't care” mask might be
better for VLA.

ACLE provides “svundef” functions that have essentially the same semantics
as LLVM's undef.

So I think it would be useful to be able to access the semantics outside of
these particular IFNs.
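As a rough illustration of the "else value" idea in the comment above, here is a scalar C++ model of a masked load. It is only a sketch: `Vec4`, `Mask4`, `masked_load` and `sum_active` are made-up names for this example, not GCC internals, and a real IFN_MASK_LOAD works on vector registers rather than arrays. The point it tries to show is that a consumer which only reads active lanes gives the same answer whatever else value is picked.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;    // hypothetical 4-lane vector
using Mask4 = std::array<bool, 4>;  // hypothetical per-lane predicate

// Scalar model of a masked load: active lanes read memory, inactive
// lanes take `else_value`.
Vec4 masked_load(const int *src, const Mask4 &mask, int else_value)
{
    assert(src != nullptr);
    Vec4 result{};
    for (std::size_t i = 0; i < result.size(); ++i)
        result[i] = mask[i] ? src[i] : else_value;
    return result;
}

// A consumer that only looks at active lanes, so its result cannot
// depend on the else value chosen for the load.
int sum_active(const Vec4 &v, const Mask4 &mask)
{
    int sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (mask[i])
            sum += v[i];
    return sum;
}
```

Calling `masked_load` twice with different else values and feeding both results to `sum_active` with the same mask gives equal sums, which is the sense in which the else value "can be anything" here.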
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #14 from Alexander Monakov --- That seems undesirable in light of comment #4, you'd risk creating a situation when -fno-trapping-math is unpredictably slower when denormals appear in dirty upper halves.
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 --- Comment #19 from Richard Biener --- (In reply to rsand...@gcc.gnu.org from comment #18) > I'd understood LLVM's undef as essentially being “unspecified”, or > “unspecified bit-pattern” to quote the docs. It doesn't indicate undefined > behaviour in the C/C++ sense: > > Undefined values are useful because they indicate to the compiler that > the program is well defined no matter what value is used. > > And I think that's what we want here. The reason we have > TARGET_PREFERRED_ELSE_VALUE is that the vectoriser sometimes doesn't care > what values the inactive lanes of the result have. The else value can be > anything without affecting the validity of the program. So if we had undef, > we wouldn't need the hook. > > I think the same thing applies to a VEC_PERM_EXPR that only selects from the > first vector. We canonicalise that by duplicating the vector input, but IMO > an undef second operand would be more accurate. > > An undef value would also allow us to represent “don't care” indices in a > permute index vector, such as -1 in a __builtin_shuffle call. (There were > times when I wanted the same thing in the vectoriser too, but I can't > remember where.) There again, a separate “care/don't care” mask might be > better for VLA. > > ACLE provides “svundef” functions that have essentially the same semantics > as LLVM's undef. > > So I Think it would be useful to be able to access the semantics outside of > these particular IFNs. Sure, I can kind of see the usefulness elsewhere. Just for this particular issue it doesn't seem necessary to sit down and design this when we can represent it like we do for MASK_LOAD (omit the 'else' value). As noted above we have the use-case of a not undefined 'else' value. But I agree, in theory we could drop the target hook and omit the 'else' value when we don't need any particular one. 
So what I want to point out is that we're fine without for MASK_LOAD so we should be fine without elsewhere as well. In other context we discussed specifying zero for MASK_LOAD masked elements so we can for example CSE better. CSE with UNDEF might be possible as well, but I'm not sure what LLVM's undef would allow and whether it's defined rigidly enough.
[Bug libstdc++/109921] [13 Regression] c++17/floating_from_chars.cc: compile error: ‘from_chars_strtod’ was not declared in this scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109921 Bug 109921 depends on bug 110077, which changed state. Bug 110077 Summary: [14 regression] libstdc++-abi/abi_check FAILs on Solaris https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110077 What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED
[Bug libstdc++/110077] [14 regression] libstdc++-abi/abi_check FAILs on Solaris
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110077 Jonathan Wakely changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED|RESOLVED --- Comment #19 from Jonathan Wakely --- Great, thanks for confirming it!
[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762 --- Comment #15 from jbeulich at suse dot com --- (In reply to Richard Biener from comment #12) > _mm_storel_pi could be implemented using __builtin_shufflevector these days. > Which shows exactly the same issue: (also related to comment 10) I don't think the problem is how the registers are filled (and in my example I simply used the first approach that came to mind and worked). The problem is that the arithmetic insn assumes the upper parts to not hold certain special values (or pairs thereof). Aiui one could create the exact same situation with inline assembly instead of any of the builtins. This isn't any different from using 512-bit operations for more narrow vectors when AVX512VL isn't enabled. Afaict such uses are carefully avoided for floating point vectors, and are used only in a limited number of cases on integer vectors (Hongtao recently asked me to not go any further in that direction either).
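The concern about the upper parts can be modelled in plain C++ without any intrinsics. The sketch below is an illustration only: `add_raises_overflow` is a hypothetical helper, and the loop over `lanes` merely stands in for a 2-lane versus a 4-lane hardware add. It assumes the floating-point status flags are readable through `<cfenv>` on the target, and uses `volatile` so the additions happen at run time.

```cpp
#include <cassert>
#include <cfenv>
#include <cfloat>

// Add the first `lanes` elements of x and y one by one and report whether
// any addition raised FE_OVERFLOW.  Only lanes 0 and 1 are kept as the
// V2SF-style result; higher lanes model the "dirty" upper half.
bool add_raises_overflow(const float *x, const float *y, int lanes, float *out)
{
    assert(lanes >= 2);
    std::feclearexcept(FE_ALL_EXCEPT);
    for (int i = 0; i < lanes; ++i) {
        volatile float a = x[i];
        volatile float b = y[i];
        volatile float sum = a + b;  // volatile keeps this from being folded
        if (i < 2)
            out[i] = sum;
    }
    return std::fetestexcept(FE_OVERFLOW) != 0;
}
```

With upper lanes holding `FLT_MAX`, the 2-lane call reports no overflow while the 4-lane call does, even though both produce the same two result lanes.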
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

--- Comment #20 from rsandifo at gcc dot gnu.org ---
(In reply to Richard Biener from comment #19)
> Sure, I can kind of see the usefulness elsewhere. Just for this particular
> issue it doesn't seem necessary to sit down and design this when we can
> represent it like we do for MASK_LOAD (omit the 'else' value).

Yeah, that's fair. For the ifn->optab interface, I think it'd be natural to
use an actual rtx rather than a null pointer, since e.g. predicates are not
set up to handle nulls. So perhaps we should start the process there. We
could add an UNDEF rtl code that is initially only used for the ifn->optab
interface, and expand it as we find new use cases. We can grow the semantics
based on those use cases and based on LLVM's experience.

> In other context we discussed specifying zero for MASK_LOAD masked elements
> so we can for example CSE better. CSE with UNDEF might be possible as well,
> but I'm not sure what LLVM's undef would allow and whether it's defined
> rigidly enough.

One of the main optimisations I wanted from that was:

  a = IFN_MASK_LOAD (…, mask)
  b = VEC_COND_EXPR

→

  a = IFN_MASK_LOAD (…, mask)
  b = a

which wouldn't be valid for undef.
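To make that last point concrete, here is a small scalar model in C++. It rests on an assumption: the operands of the VEC_COND_EXPR are not visible in the archived text, so the sketch assumes the select keeps `a` where the mask is set and produces zero elsewhere. `masked_load` and `select_or_zero` are hypothetical names, and an arbitrary constant stands in for "undef" lane contents.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;
using Mask4 = std::array<bool, 4>;

// Scalar model of a masked load with an explicit else value.
Vec4 masked_load(const int *src, const Mask4 &mask, int else_value)
{
    assert(src != nullptr);
    Vec4 result{};
    for (std::size_t i = 0; i < result.size(); ++i)
        result[i] = mask[i] ? src[i] : else_value;
    return result;
}

// Assumed shape of the select: v[i] where the mask is set, 0 elsewhere.
Vec4 select_or_zero(const Mask4 &mask, const Vec4 &v)
{
    Vec4 result{};
    for (std::size_t i = 0; i < result.size(); ++i)
        result[i] = mask[i] ? v[i] : 0;
    return result;
}
```

If the load guarantees zeros in inactive lanes, `select_or_zero(mask, a)` equals `a` and the select can be dropped. If inactive lanes may hold anything, the two can differ, so the simplification is no longer safe.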
[Bug testsuite/110756] [14 Regression] commit g:92d1425ca78 causes failures in g++.dg/gomp/pr58567.C
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110756 --- Comment #3 from Patrick Palka --- Whoops, sorry for not catching this.. I agree with Andrew, and your proposed patch looks good to me FWIW
[Bug tree-optimization/110768] New: [14 Regression] Dead Code Elimination Regression since r14-2623-gc11a3aedec2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110768

            Bug ID: 110768
           Summary: [14 Regression] Dead Code Elimination Regression since r14-2623-gc11a3aedec2
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: theodort at inf dot ethz.ch
  Target Milestone: ---

https://godbolt.org/z/GsTMz1G9c

Given the following code:

void foo(void);
static int a, b;
int main() {
  {
    short c = 45127;
    char d;
    b = 0;
    for (; b <= 3; b++) {
      if (b)
        continue;
      d = 0;
      for (; d <= 3; d++) {
        if (!(((c) >= -20409) && ((c) <= 1))) {
          __builtin_unreachable();
        }
        if (~(0 == a) & 1)
          return b;
        c = 0;
        for (; c <= 0; c++)
          a = 3;
      }
    }
    foo();
  }
}

gcc-trunk -Os does not eliminate the call to foo:

main:
        xorl    %r9d, %r9d
        movl    a(%rip), %eax
        xorl    %edx, %edx
        xorl    %esi, %esi
        movl    %r9d, b(%rip)
        xorl    %ecx, %ecx
.L2:
        cmpl    $4, %edx
        je      .L31
        testl   %edx, %edx
        jne     .L3
        testl   %eax, %eax
        je      .L4
.L24:
        testb   %sil, %sil
        je      .L5
        xorl    %r8d, %r8d
        movl    %r8d, b(%rip)
.L5:
        testb   %cl, %cl
        je      .L27
        movl    %eax, a(%rip)
        jmp     .L27
.L4:
        movl    $3, %eax
        movb    $1, %cl
        jmp     .L24
.L3:
        incl    %edx
        movb    $1, %sil
        jmp     .L2
.L31:
        pushq   %rdi
        testb   %sil, %sil
        je      .L9
        movl    $4, b(%rip)
.L9:
        testb   %cl, %cl
        je      .L10
        movl    %eax, a(%rip)
.L10:
        call    foo
        xorl    %eax, %eax
        popq    %rdx
        ret
.L27:
        xorl    %eax, %eax
        ret

gcc-13.1.0 -Os eliminates the call to foo:

main:
        xorl    %eax, %eax
        cmpl    $0, a(%rip)
        movl    %eax, b(%rip)
        jne     .L2
        movl    $3, a(%rip)
.L2:
        xorl    %eax, %eax
        ret

Bisects to r14-2623-gc11a3aedec2
[Bug c++/110765] [13 regression] constraints on parameters of derived type in CRTP base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110765

Patrick Palka changed:

           What    |Removed                     |Added
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |ppalka at gcc dot gnu.org

--- Comment #1 from Patrick Palka ---
Thanks for the bug report. This looks like a dup of PR109751, which is an
unexpected consequence of the CWG2596 resolution (which we proactively
implement IIUC).

A general workaround is to turn the constrained hidden friend into a
template, so that its constraints don't get immediately checked upon
instantiation of its enclosing class (at which point the derived class is
still incomplete).

    template friend constexpr auto
    tag_invoke(custom::to_char, derived_type const a) noexcept
      requires(requires { { a.to_char() }; })
    {
      return a.to_char();
    }

*** This bug has been marked as a duplicate of bug 109751 ***
[Bug c++/109751] [13/14 Regression] boost iterator_interface fails concept check starting in gcc-13
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109751 Patrick Palka changed: What|Removed |Added CC||h2+bugs at fsfe dot org --- Comment #22 from Patrick Palka --- *** Bug 110765 has been marked as a duplicate of this bug. ***
[Bug c/110769] New: ICE in adjust_loop_info_after_peeling, at tree-ssa-loop-ivcanon.cc:1023
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110769

            Bug ID: 110769
           Summary: ICE in adjust_loop_info_after_peeling, at tree-ssa-loop-ivcanon.cc:1023
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: shaohua.li at inf dot ethz.ch
  Target Milestone: ---

Looks like a recent regression.

Compiler explorer: https://godbolt.org/z/eMh54dYKz

$ cat a.c
int a;
int b(unsigned d) {
  int c = 0;
  for (; d; c++)
    d >>= 1;
  return c;
}
int main() {
  a = 0;
  for (; b(31) + a > 21; a = a + (unsigned)8)
    ;
  for (;;)
    ;
}
$
$ gcc-tk -O3 a.c
during GIMPLE pass: ch_vect
crash_0_reduced.c: In function ‘main’:
crash_0_reduced.c:8:5: internal compiler error: in adjust_loop_info_after_peeling, at tree-ssa-loop-ivcanon.cc:1023
    8 | int main() {
      |     ^~~~
0x216183e internal_error(char const*, ...)
        ???:0
0x9d1376 fancy_abort(char const*, int, char const*)
        ???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
$
$ gcc-tk -v
Using built-in specs.
COLLECT_GCC=gcc-tk
COLLECT_LTO_WRAPPER=/zdata/shaoli/compilers/ccbuilder-compilers/gcc-8cbdb2e4d64461d8a19e033bd33b585187059d8a/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --disable-multilib --disable-bootstrap --enable-languages=c,c++ --prefix=/zdata/shaoli/compilers/ccbuilder-compilers/gcc-8cbdb2e4d64461d8a19e033bd33b585187059d8a
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230721 (experimental) (GCC)
$
[Bug c++/110765] [13 regression] constraints on parameters of derived type in CRTP base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110765 --- Comment #2 from Hannes Hauswedell --- Thanks for the quick reply. I agree it is the same problem.
[Bug target/110066] [RISC-V] Segment fault if compiled with -static -pg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110066

Aurelien Jarno changed:

           What    |Removed                     |Added
                 CC|                            |aurelien at aurel32 dot net

--- Comment #4 from Aurelien Jarno ---
This is also reproducible with the tst-gmon-static test from glibc 2.38,
when compiled with GCC 13, while the test passes fine with GCC 12.

Some very basic debugging shows that the problem is triggered by using
crtbeginT.o from GCC 13. The test passes when compiling everything with
GCC 13, but using crtbeginT.o from GCC 12.

The backtrace is the following:

Program received signal SIGSEGV, Segmentation fault.
0x000516e2 in classify_object_over_fdes ()
(gdb) bt
#0 0x000516e2 in classify_object_over_fdes ()
#1 0x00052690 in __register_frame_info ()
#2 0x00010570 in frame_dummy ()
#3 0x00010872 in call_init (envp=0x3ff3e0, argv=0x3ff3c8, argc=2) at libc-start.c:189
#4 __libc_start_main_impl (main=0x10448 , argc=2, argv=0x3ff3c8, init=, fini=, rtld_fini=, stack_end=) at libc-start.c:355
#5 0x00010488 in _start () at ../sysdeps/riscv/start.S:61
[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

--- Comment #13 from Segher Boessenkool ---
So. Before expand we have

  _6 = (__int128) x_3(D);
  x.0_1 = _6 << 59;
  _2 = x.0_1 >> 59;
  _4 = (__int128 unsigned) _2;
  return _4;

That should have been optimised better :-(

The RTL code it expands to sets the same pseudo multiple times. Bad bad bad.
This hampers many optimisations. Like:

(insn 6 3 7 2 (set (reg:DI 124)
        (lshiftrt:DI (reg:DI 129 [ x+8 ])
            (const_int 5 [0x5]))) "110717.c":6:11 299 {lshrdi3}
     (nil))
(insn 7 6 8 2 (set (reg:DI 132)
        (ashift:DI (reg:DI 128 [ x ])
            (const_int 59 [0x3b]))) "110717.c":6:11 289 {ashldi3}
     (nil))
(insn 8 7 9 2 (set (reg:DI 132)
        (ior:DI (reg:DI 124)
            (reg:DI 132))) "110717.c":6:11 233 {*booldi3}
     (nil))

(They are subregs right after expand, totally unreadable; this is after
subreg1, slightly more readable, but essentially the same code still).

The web pass eventually gets rid of the double set in this case.

Because the shift-left-then-right survives all the way to combine, it (being
the greedy bastard that it is) will use the combiner patterns rs6000 has for
multi-precision shifts, before it would notice the two (multiprecision!)
shifts together are largely a no-op, so you get stuck at a local optimum.
Par for the course for combine :-/
[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

--- Comment #14 from Andrew Pinski ---
(In reply to Segher Boessenkool from comment #13)
> So. Before expand we have
>
> _6 = (__int128) x_3(D);
> x.0_1 = _6 << 59;
> _2 = x.0_1 >> 59;

Jakub is trying to emulate this using shifts but this is better using
bitfields just to get the truncation:

#ifdef __SIZEOF_INT128__
#define type __int128
#else
#define type long long
#endif

struct f
{
#ifdef __SIZEOF_INT128__
  __int128 t:(128-59);
#else
  long long t:(64-27);
#endif
};

unsigned type
foo (unsigned type x)
{
  struct f t;
  t.t = x;
  return t.t;
}
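For readers who want to check that the two forms agree, here is a small C++ sketch of the 64-bit variant (the `long long t:(64-27)` branch above). The helper names `sext_by_shifts` and `sext_by_bitfield` are made up for this example. It relies on behaviour that GCC and Clang provide and that C++20 guarantees: out-of-range conversions to signed types wrap, and right shifts of negative values are arithmetic.

```cpp
#include <cstdint>

constexpr int kDrop = 27;  // number of high bits discarded, as in (64-27)

// Shift form: move the low 37 bits to the top, then shift them back with
// sign extension.  The left shift is done on an unsigned value so that it
// cannot overflow a signed type.
long long sext_by_shifts(std::uint64_t x)
{
    return static_cast<long long>(x << kDrop) >> kDrop;
}

// Bit-field form: storing into a 37-bit signed field truncates, and
// reading it back sign-extends.
long long sext_by_bitfield(std::uint64_t x)
{
    struct field {
        long long t : (64 - kDrop);
    };
    field f;
    f.t = static_cast<long long>(x);
    return f.t;
}
```

Both functions return the low 37 bits of `x` interpreted as a signed number, so for example an input with only bit 36 set yields a negative result from either one.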
[Bug c/102989] Implement C2x's n2763 (_BitInt)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 Jakub Jelinek changed: What|Removed |Added Attachment #55592|0 |1 is obsolete|| --- Comment #87 from Jakub Jelinek --- Created attachment 55596 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55596&action=edit gcc14-bitint-wip.patch large/huge _BitInt __builtin_{add,sub}_overflow mostly implemented (I've left 2 spots to finish - gcc_unreachable () - which only trigger rarely). Though, e.g. in bitint-41.c test still t113sub t122mul t125mul t127mul t160sub t171mul t174mul t176mul functions abort, so to be debugged next week, then ubsan, inline asm and then hopefully submit.
[Bug target/110758] [14 Regression] 8% hmmer regression on zen1/3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56); g:d76d19c9bc5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110758

--- Comment #2 from Jan Hubicka ---
> I suspect this is most likely the profile updates changes ...

Quite possibly. The goal of this exercise is to figure out whether there are
bugs in the profile estimate, whether passes somehow prefer a broken
profile, or whether it is just bad luck.

Looking at sphinx and fatigue, it seems that LRA really may prefer increased
profile counts in a peeled vectorized loop, since it does not understand
that putting a spill on the critical path through the dependency graph of
the code is not good for out-of-order execution.
[Bug middle-end/110757] [14 Regression] 7% parest regression on zen3 -Ofast -march=native -flto between g:4dbb3af1efe55174 (2023-07-14 00:54) and g:a5088dc3f5ef73c8 (2023-07-17 03:24)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757 Martin Jambor changed: What|Removed |Added CC||jamborm at gcc dot gnu.org --- Comment #1 from Martin Jambor --- The first (2%) slowdown seems to be due to r14-2524-gaa6741ef2e0c31 (Turn TODO_rebuild_frequencies to a pass), I'm now bisecting the bigger one.
[Bug c++/110382] [13/14 Regression] internal compiler error: in verify_ctor_sanity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110382

--- Comment #2 from Marek Polacek ---
Cleaned-up test:

struct S {
  double a = 0;
};

constexpr double
g ()
{
  S arr[1];
  S s = arr[0];
  return s.a;
}

int main()
{
  return g ();
}

Note that changing

  double a = 0;

to

  double a = 0.;

gets rid of the ICE!
[Bug target/110727] [14 Regression] gcc.target/aarch64/sve/aarch64-sve.exp has two new failures since commit 061f74c0673
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110727 --- Comment #1 from CVS Commits --- The master branch has been updated by Jan Hubicka : https://gcc.gnu.org/g:a31ef26b056d0c4f0a9f08b6eb81456ea257298e commit r14-2716-ga31ef26b056d0c4f0a9f08b6eb81456ea257298e Author: Jan Hubicka Date: Fri Jul 21 19:38:26 2023 +0200 Avoid scaling flat loop profiles of vectorized loops As discussed, when vectorizing loop with static profile, it is not always good idea to divide the header frequency by vectorization factor because the profile may not realistically represent the expected number of iterations. Since in such cases we default to relatively low iteration counts (based on average for spec2k17), this will make vectorized loop body look cold. This patch makes vectorizer to look for flat profiles and only possibly reduce the profile by known upper bound on iteration counts. gcc/ChangeLog: PR target/110727 * tree-vect-loop.cc (scale_profile_for_vect_loop): Avoid scaling flat profiles by vectorization factor. (vect_transform_loop): Check for flat profiles.
[Bug middle-end/110316] [11/12/13/14 Regression] g++.dg/ext/timevar1.C and timevar2.C fail erratically
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110316 --- Comment #3 from Andrew Pinski --- See also the https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625180.html thread, which is about exactly this issue. Basically, after inlining, a fused multiply-subtract is now being used, and that causes the issues. This is why it is not seen on x86 (without using --with-cpu=).
[Bug c++/110106] [11/12/13/14 Regression] ICE on noexcept(noexcept(...)) with optional
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110106 --- Comment #5 from CVS Commits --- The trunk branch has been updated by Marek Polacek : https://gcc.gnu.org/g:e36d1994051122fc6e1f8c728fbd109a59e0a822 commit r14-2717-ge36d1994051122fc6e1f8c728fbd109a59e0a822 Author: Marek Polacek Date: Tue Jul 18 16:02:21 2023 -0400 c++: fix ICE with is_really_empty_class [PR110106] is_really_empty_class is liable to crash when it gets an incomplete or dependent type. Since r11-557, we pass the yet-uninstantiated class type S<0> of the PARM_DECL s to is_really_empty_class -- because of the potential_rvalue_constant_expression -> is_rvalue_constant_expression change in cp_parser_constant_expression. Here we're not parsing a template so we did not check COMPLETE_TYPE_P as we should. It should work to complete the type before checking COMPLETE_TYPE_P. PR c++/110106 gcc/cp/ChangeLog: * constexpr.cc (potential_constant_expression_1): Try to complete the type when !processing_template_decl. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/noexcept80.C: New test.
[Bug libstdc++/110653] Support std::stoi etc. without C99 APIs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110653 --- Comment #20 from dave.anglin at bell dot net --- On 2023-07-19 6:10 a.m., redi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110653 > > --- Comment #17 from Jonathan Wakely --- > (In reply to Jonathan Wakely from comment #16) >> PASS: 21_strings/basic_string/numeric_conversions/char/stoll.cc (test for >> excess errors) >> PASS: 21_strings/basic_string/numeric_conversions/char/stoll.cc execution >> test > Oops, sorry, not that one! As mentioned, that will be UNSUPPORTED for > hpux11.11 Yes, stoll and stoull tests are UNSUPPORTED. We also have: UNSUPPORTED: 21_strings/basic_string/numeric_conversions/char/stold.cc The rest of the non wide character conversion tests pass. The stoll and stoull tests pass when dg-require-string-conversions is 1. The stold test fails, I think because it returns LDBL_MAX instead of HUGE_VALL (inf). See _GLIBCXX_HAVE_BROKEN_STRTOLD comment in /config/os/hpux/os_defines.h. There is a problem with std::stof. It throws an out of range exception for 0. It needs to check for 0 value.
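The std::stof problem reported above (an out-of-range exception for 0) is the kind of bug that appears when a float conversion is built on strtod and the underflow test does not exempt zero. What follows is only a minimal sketch under that assumption: the name stof_via_strtod and the exact range policy are hypothetical and are not taken from libstdc++.

```cpp
#include <cerrno>
#include <cfloat>
#include <cmath>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a stof built on strtod; NOT the libstdc++ code.
float stof_via_strtod(const std::string& s) {
    const char* begin = s.c_str();
    char* end = nullptr;
    errno = 0;
    const double d = std::strtod(begin, &end);
    if (end == begin)              // no characters were converted
        throw std::invalid_argument("stof_via_strtod");
    if (errno == ERANGE)           // out of range even for double
        throw std::out_of_range("stof_via_strtod");
    const double mag = std::fabs(d);
    // Zero is exactly representable as a float, so it must be exempt from
    // the "smaller than FLT_MIN" underflow test. Infinities and NaNs that
    // strtod parsed without error are passed through unchanged.
    if (std::isfinite(d) && (mag > FLT_MAX || (mag != 0.0 && mag < FLT_MIN)))
        throw std::out_of_range("stof_via_strtod");
    return static_cast<float>(d);
}
```

With the `mag != 0.0` guard in place, stof_via_strtod("0") returns 0.0f; dropping the guard makes the same call throw std::out_of_range, which matches the symptom described in the comment.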
[Bug libstdc++/110653] Support std::stoi etc. without C99 APIs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110653 --- Comment #21 from Jonathan Wakely --- (In reply to dave.anglin from comment #20) > The stoll and stoull tests pass when dg-require-string-conversions is 1. > The stold test > fails, I think because it returns LDBL_MAX instead of HUGE_VALL (inf). See > _GLIBCXX_HAVE_BROKEN_STRTOLD comment in /config/os/hpux/os_defines.h. Aha. > There is a problem with std::stof. It throws an out of range exception for > 0. It needs to > check for 0 value. Oops :-) I'll fix both of these on Monday - thanks for the testing and analysis.
[Bug c++/110382] [13/14 Regression] internal compiler error: in verify_ctor_sanity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110382 --- Comment #3 from Marek Polacek --- (In reply to Marek Polacek from comment #2) > Note that changing > double a = 0; > to > double a = 0.; > gets rid of the ICE! ...because the latter means the {} is reduced_constant_expression_p and TREE_CONSTANT, so we never call cxx_eval_bare_aggregate.
[Bug target/110770] New: bpf: add pseudoc assembly dialect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110770 Bug ID: 110770 Summary: bpf: add pseudoc assembly dialect Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: cupertino.miranda at oracle dot com Target Milestone: --- LLVM supports a different assembly dialect. We should support the same assembly format in GCC. A reference for these instructions can be found in: https://gcc.gnu.org/wiki/BPFBackEnd
[Bug target/110770] bpf: add pseudoc assembly dialect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110770 --- Comment #1 from CVS Commits --- The master branch has been updated by Cupertino Miranda : https://gcc.gnu.org/g:77d0f9ec3809b4d2e32c36069b6b9239d301c030 commit r14-2720-g77d0f9ec3809b4d2e32c36069b6b9239d301c030 Author: Cupertino Miranda Date: Mon Jul 17 17:42:42 2023 +0100 bpf: pseudo-c assembly dialect support New pseudo-c BPF assembly dialect already supported by clang and widely used in the linux kernel. gcc/ChangeLog: PR target/110770 * config/bpf/bpf.opt: Added option -masm=. * config/bpf/bpf-opts.h (enum bpf_asm_dialect): New type. * config/bpf/bpf.cc (bpf_print_register): New function. (bpf_print_register): Support pseudo-c syntax for registers. (bpf_print_operand_address): Likewise. * config/bpf/bpf.h (ASM_SPEC): handle -masm=. (ASSEMBLER_DIALECT): Define. * config/bpf/bpf.md: Added pseudo-c templates. * doc/invoke.texi (-masm=): New eBPF option item.
[Bug middle-end/110757] [14 Regression] 7% parest regression on zen3 -Ofast -march=native -flto between g:4dbb3af1efe55174 (2023-07-14 00:54) and g:a5088dc3f5ef73c8 (2023-07-17 03:24)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110757 --- Comment #2 from Martin Jambor --- The second slow-down of 4.5% was caused by r14-2546-g061f74c06735e1: 061f74c06735e1fa35b910ae0bcf01b61a74ec23 is the first bad commit commit 061f74c06735e1fa35b910ae0bcf01b61a74ec23 Author: Jan Hubicka Date: Sun Jul 16 23:56:59 2023 +0200 Fix profile update in scale_profile_for_vect_loop When vectorizing 4 times, we sometimes do for <4x vectorized body> for <2x vectorized body> for <1x vectorized body> Here the second two fors, which handle the epilogue, never iterate. Currently the vectorizer thinks that the middle for iterates twice. This turns out to be caused by scale_profile_for_vect_loop, which uses niter_for_unrolled_loop.
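The loop structure described in that commit message can be illustrated with a small scalar model. This is only a sketch: count_trips and LoopTrips are hypothetical names, not GCC code, and "never iterates" is read here as "the body runs at most once, so the back edge is never taken".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model (not GCC code): how often each loop body runs when
// n scalar iterations are split into a 4x main loop and 2x/1x epilogues.
struct LoopTrips {
    std::size_t main4x;  // executions of the 4x vectorized body
    std::size_t epi2x;   // executions of the 2x vectorized epilogue body
    std::size_t epi1x;   // executions of the 1x epilogue body
};

LoopTrips count_trips(std::size_t n) {
    LoopTrips t{0, 0, 0};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) ++t.main4x;  // leaves n % 4 iterations
    for (; i + 2 <= n; i += 2) ++t.epi2x;   // at most 3 left: runs 0 or 1 time
    for (; i < n; ++i) ++t.epi1x;           // at most 1 left: runs 0 or 1 time
    // Neither epilogue ever takes its back edge, so a profile that assumes
    // the middle loop iterates twice overestimates it.
    assert(t.epi2x <= 1 && t.epi1x <= 1);
    return t;
}
```

For example, n = 7 gives one execution of each body, and n = 8 gives two executions of the 4x body and none of either epilogue.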
[Bug c++/110771] New: [14 regression] g++.dg/gomp/pr58567.C fails after r14-2655-g92d1425ca78040
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110771 Bug ID: 110771 Summary: [14 regression] g++.dg/gomp/pr58567.C fails after r14-2655-g92d1425ca78040 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: seurer at gcc dot gnu.org Target Milestone: --- g:92d1425ca7804000cfe8aa635cf363a87d362d75, r14-2655-g92d1425ca78040 make -k check-gcc RUNTESTFLAGS="gomp.exp=g++.dg/gomp/pr58567.C" FAIL: g++.dg/gomp/pr58567.C -std=c++98 (test for excess errors) FAIL: g++.dg/gomp/pr58567.C -std=c++14 (test for excess errors) FAIL: g++.dg/gomp/pr58567.C -std=c++17 (test for excess errors) FAIL: g++.dg/gomp/pr58567.C -std=c++20 (test for excess errors) # of expected passes 4 # of unexpected failures 4 FAIL: g++.dg/gomp/pr58567.C -std=c++20 (test for excess errors) Excess errors: /home/seurer/gcc/git/gcc-test2/gcc/testsuite/g++.dg/gomp/pr58567.C:8:3: error: invalid type for iteration variable 'i' commit 92d1425ca7804000cfe8aa635cf363a87d362d75 (HEAD) Author: Patrick Palka Date: Wed Jul 19 16:10:20 2023 -0400 c++: redundant targ coercion for var/alias tmpls
[Bug tree-optimization/110669] [14 regression] ICE in gcc.dg/torture/pr105132.c since r14-2515-gb77161e60bce7b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110669 --- Comment #11 from CVS Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:cfe53af09364d94fb86013f85ef598a1d47e0657 commit r14-2721-gcfe53af09364d94fb86013f85ef598a1d47e0657 Author: Roger Sayle Date: Fri Jul 21 20:37:59 2023 +0100 PR c/110699: Defend against error_mark_node in gimplify.cc. This patch resolves PR c/110699, an ICE-after-error regression, by adding a check that the array type isn't error_mark_node in gimplify_compound_lval. 2023-07-21 Roger Sayle Richard Biener gcc/ChangeLog PR c/110699 * gimplify.cc (gimplify_compound_lval): If the array's type is error_mark_node then return GS_ERROR. gcc/testsuite/ChangeLog PR c/110699 * gcc.dg/pr110699.c: New test case.
[Bug c/110699] [12/13/14 Regression] internal compiler error: tree check: expected array_type, have error_mark in array_ref_low_bound, at tree.cc:12754
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110699 --- Comment #2 from CVS Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:cfe53af09364d94fb86013f85ef598a1d47e0657 commit r14-2721-gcfe53af09364d94fb86013f85ef598a1d47e0657 Author: Roger Sayle Date: Fri Jul 21 20:37:59 2023 +0100 PR c/110699: Defend against error_mark_node in gimplify.cc. This patch resolves PR c/110699, an ICE-after-error regression, by adding a check that the array type isn't error_mark_node in gimplify_compound_lval. 2023-07-21 Roger Sayle Richard Biener gcc/ChangeLog PR c/110699 * gimplify.cc (gimplify_compound_lval): If the array's type is error_mark_node then return GS_ERROR. gcc/testsuite/ChangeLog PR c/110699 * gcc.dg/pr110699.c: New test case.
[Bug testsuite/110756] [14 Regression] commit g:92d1425ca78 causes failures in g++.dg/gomp/pr58567.C
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110756 Andrew Pinski changed: What|Removed |Added CC||seurer at gcc dot gnu.org --- Comment #4 from Andrew Pinski --- *** Bug 110771 has been marked as a duplicate of this bug. ***
[Bug c++/110771] [14 regression] g++.dg/gomp/pr58567.C fails after r14-2655-g92d1425ca78040
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110771 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #1 from Andrew Pinski --- Dup. *** This bug has been marked as a duplicate of bug 110756 ***
[Bug target/110770] bpf: add pseudoc assembly dialect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110770 Andrew Pinski changed: What|Removed |Added Target||bpf Severity|normal |enhancement
[Bug tree-optimization/110641] [14 Regression] ICE in adjust_loop_info_after_peeling, at tree-ssa-loop-ivcanon.cc:1023
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110641 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2023-07-21 --- Comment #1 from Andrew Pinski --- Confirmed.
[Bug tree-optimization/110769] ICE in adjust_loop_info_after_peeling, at tree-ssa-loop-ivcanon.cc:1023
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110769 Andrew Pinski changed: What|Removed |Added Depends on||110641 --- Comment #1 from Andrew Pinski --- Most likely a dup of bug 110641. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110641 [Bug 110641] [14 Regression] ICE in adjust_loop_info_after_peeling, at tree-ssa-loop-ivcanon.cc:1023
[Bug tree-optimization/110766] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: in gimple_phi_arg_def_from_edge, at gimple.h:4699
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110766 Andrew Pinski changed: What|Removed |Added Summary|ICE on valid code at -O3 on x86_64-linux-gnu: in gimple_phi_arg_def_from_edge, at gimple.h:4699 |[14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: in gimple_phi_arg_def_from_edge, at gimple.h:4699 Target Milestone|--- |14.0 Keywords||ice-on-valid-code See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110669
[Bug tree-optimization/110766] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: in gimple_phi_arg_def_from_edge, at gimple.h:4699
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110766 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-07-21 Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Confirmed. Note I also double checked the others which I had marked as a dup of bug 110669 to make sure they were fixed.
[Bug c++/102051] [coroutines] ICE in gimplify_var_or_parm_decl, at gimplify.c:2848
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102051 Harris Hancock changed: What|Removed |Added CC||harris.hancock at gmail dot com --- Comment #3 from Harris Hancock --- I have observed this bug, too, while trying to compile Cap'n Proto with coroutine support using GCC 10, 11, and 12. As Iain found, the preprocessed file which Vitali uploaded compiles for me, at least on various versions of x86-64 gcc available in Compiler Explorer. However, Vitali's reduced example, available in the first Compiler Explorer link he posted (https://godbolt.org/z/M5nfEPMKj), readily reproduces the ICE, both in Compiler Explorer on gcc trunk and locally for me using GCC 12 as distributed by Ubuntu Jammy (g++-12 (Ubuntu 12.1.0-2ubuntu1~22.04) 12.1.0), using `g++-12 -o the-file.o -c the-file.i -std=c++20`. The ICE disappears if we change the line ~Own() noexcept(false); to ~Own() noexcept(true); So, this appears to be a coroutine bug related to noexcept(false) destructors, similar to Bug 95822.
[Bug tree-optimization/110768] [14 Regression] Dead Code Elimination Regression since r14-2623-gc11a3aedec2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110768 Andrew Pinski changed: What|Removed |Added Keywords||missed-optimization Target Milestone|--- |14.0
[Bug c++/102051] [coroutines] ICE in gimplify_var_or_parm_decl, at gimplify.c:2848
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102051 --- Comment #4 from Harris Hancock --- Created attachment 55597 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55597&action=edit Reduced example which reproduces the ICE I'm attaching the reduced example from Vitali's first Compiler Explorer link: https://godbolt.org/z/M5nfEPMKj Compile with: g++ -o bug-102051-reduced-example.o -c bug-102051-reduced-example.cpp -std=c++20
[Bug middle-end/110764] [12/13/14 Regression] False positive -Warray-bounds warning swapping std::thread::id
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110764 --- Comment #1 from Andrew Pinski --- Hmm, this code seems to have undefined behavior. If I change 1 to 2, as you are accessing 2 elements via operator[], there is no warning ... The warning seems correct and even says that subscript 1 is above the array bounds of [1]. The location of the warning seems off, but I don't think the warning in this case is a false positive. Maybe it is due to the reduced testcase though.
[Bug middle-end/110764] [12/13/14 Regression] False positive -Warray-bounds warning swapping std::thread::id
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110764 --- Comment #2 from Andrew Pinski --- So reading the original bug report, it is almost definitely an incorrectly reduced testcase.