[Bug gcov-profile/118607] New: gcov-tool merge weights must be integers, while help command suggest users to use floats
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118607 Bug ID: 118607 Summary: gcov-tool merge weights must be integers, while help command suggest users to use floats Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: gcov-profile Assignee: unassigned at gcc dot gnu.org Reporter: gigor-ads at yandex dot ru Target Milestone: --- Hello! I tried to set merge weights, and the gcov-tool help output suggests using floating-point values as weights: merge [options] Merge coverage file contents -o, --output Output directory -v, --verbose Verbose mode -w, --weight Set weights (float point values) That text comes from these lines: https://github.com/gcc-mirror/gcc/blob/8c93a8aa67f12c8e03eb7fd90f671a03ae46935b/gcc/gcov-tool.cc#L164-L173 However, in the gcov-tool implementation the weights are integers: https://github.com/gcc-mirror/gcc/blob/8c93a8aa67f12c8e03eb7fd90f671a03ae46935b/gcc/gcov-tool.cc#L200 https://github.com/gcc-mirror/gcc/blob/8c93a8aa67f12c8e03eb7fd90f671a03ae46935b/gcc/gcov-tool.cc#L215 So floating-point values entered by users are not handled properly, and integer values must be used.
The patch is simple: diff --git a/gcc/gcov-tool.cc b/gcc/gcov-tool.cc index d8f2790b110..dceb8af03ad 100644 --- a/gcc/gcov-tool.cc +++ b/gcc/gcov-tool.cc @@ -169,7 +169,7 @@ print_merge_usage_message (int error_p) fnotice (file, " merge [options] Merge coverage file contents\n"); fnotice (file, "-o, --output Output directory\n"); fnotice (file, "-v, --verbose Verbose mode\n"); - fnotice (file, "-w, --weight Set weights (float point values)\n"); + fnotice (file, "-w, --weight Set weights (integer values)\n"); } static const struct option merge_options[] = @@ -240,7 +240,7 @@ print_merge_stream_usage_message (int error_p) fnotice (file, " merge-stream [options] [] Merge coverage stream file (or stdin)\n" "and coverage file contents\n"); fnotice (file, "-v, --verbose Verbose mode\n"); - fnotice (file, "-w, --weight Set weights (float point values)\n"); + fnotice (file, "-w, --weight Set weights (integer values)\n"); } static const struct option merge_stream_options[] =
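For illustration, here is a minimal sketch of why a fractional weight cannot survive integer parsing. It assumes the weights are read with an integer sscanf format such as "%d,%d"; that format and the helper name parse_weights are assumptions made for this example and are not quoted from gcov-tool.cc.

```cpp
#include <cstdio>

// Hypothetical helper, not GCC code: parse "w1,w2" with an integer format.
// Returns the number of fields sscanf managed to convert (0, 1 or 2).
int parse_weights(const char *arg, int *w1, int *w2)
{
    return std::sscanf(arg, "%d,%d", w1, w2);
}
```

With an argument such as "0.5,1.5", the first %d conversion stops at the decimal point, so only one field is converted and its value is 0. The second weight is left untouched.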
[Bug rtl-optimization/118591] [lra][avr] Wrong code with -mlra in pr43879-3.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118591 --- Comment #4 from GCC Commits --- The master branch has been updated by Georg-Johann Lay : https://gcc.gnu.org/g:6f4592ae95eed53dc3a370f98c04a8f25f007811 commit r15-7123-g6f4592ae95eed53dc3a370f98c04a8f25f007811 Author: Georg-Johann Lay Date: Wed Jan 22 12:02:16 2025 +0100 AVR: Add test cases for PR118591. gcc/testsuite/ PR rtl-optimization/118591 * gcc.target/avr/torture/pr118591-1.c: New test. * gcc.target/avr/torture/pr118591-2.c: New test.
[Bug analyzer/115662] Feature request: support for linking SARIF files together
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115662 --- Comment #1 from David Malcolm --- The top-level object in a .sarif file is a sarifLog, and this contains zero or more runs: https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/sarif-v2.1.0-errata01-os-complete.html#_Toc141790732 So one could "link" multiple files by combining all the logs into one big "log" object. Note that a log object has an inlineExternalProperties property (section 3.13.5), which perhaps could be used to consolidate repeated information in the log objects.
[Bug target/118560] [15 regression] ICE when building powerpc-unknown-linux-gnu cross-compiler since r15-7008
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118560 Thomas Schwinge changed: What|Removed |Added CC||tschwinge at gcc dot gnu.org --- Comment #9 from Thomas Schwinge --- (In reply to GCC Commits from comment #6) > commit r15-7083-g07f62ed9a7b09951f83855e19d41641b098190b1 > Author: Vladimir N. Makarov > Date: Mon Jan 20 17:08:50 2025 -0500 > > [PR118560][LRA]: Fix typo in checking secondary memory mode for the reg > class For the record, this commit also cures a timeout regression on powerpc64le-unknown-linux-gnu, that we recently had acquired with commit r15-7008-g9f009e8865cda01310c52f7ec8bdaa3c557a2745 "[PR118067][LRA]: Check secondary memory mode for the reg class": {+WARNING: gcc.target/powerpc/pr79916.c (test for excess errors) program timed out.+} [-PASS:-]{+FAIL:+} gcc.target/powerpc/pr79916.c (test for excess errors) This is now back to PASS.
[Bug target/118597] [15 Regression] gcc.dg/vect/vect-fncall-mask.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118597 Thiago Jung Bauermann changed: What|Removed |Added CC||thiago.bauermann at linaro dot org --- Comment #1 from Thiago Jung Bauermann --- Our CI bisected it to commit r15-6945-gea1deefe54ea1c . https://linaro.atlassian.net/browse/GNU-1503 It also detected that spec2k6 433.milc with -Os increased in size by 4% from 98540 to 102636 bytes.
[Bug target/118597] [15 Regression] gcc.dg/vect/vect-fncall-mask.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118597 --- Comment #2 from Thiago Jung Bauermann --- (Thanks Christophe Lyon for pointing out this bugzilla to me).
[Bug target/116256] [15 Regression] RISC-V: testsuite failures since late-combine-pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116256 --- Comment #6 from Jeffrey A. Law --- So what's left here is the dup-{1,2,3} cases. IMHO this all ties back to the constant synthesis problem. They can be fixed by removing the mvconst_internal pattern -- but that leads to a new set of regressions which don't look tractable to solve. A great example would be and-shift32.c, but there are others. The fundamental problem is that exposing synthesis to combine means combine has to look at more instructions, and it's limited in its search depth to 4. The mvconst_internal pattern is acting like a bridge to allow other combine patterns to trigger. In and-shift32.c we would need 5 insn combination support to bring together all the necessary insns to optimize that case without mvconst_internal. combine doesn't handle REG_EQUAL notes well and fixing that looks fairly painful. Basically I don't see a path right now to remove mvconst_internal without significant combiner surgery; worse yet, that surgery will hit the REG_DEAD note distribution code, which is one of the hairier parts of combine. Trying to tackle the dup-{1,2,3} cases after reload is doomed to failure IMHO because we re-use the output register from synthesis as a scratch in the synthesis sequence. That inhibits the post-reload optimizers significantly, primarily reload_cse, but also vsetvl optimization for larger vector lengths. I think all that tends to argue that a local cprop (but not full cse) pass may be the only path forward here, which I'll explore next.
[Bug target/116448] gcc.target/arm/vfp-1.c uses the wrong instructions on Cortex-M55
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116448 --- Comment #2 from Torbjorn SVENSSON --- Thanks Richard. I'll give it a try with -Os and create a patch with it.
[Bug tree-optimization/118605] gcc/tree-assume.cc:108: dangling field problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118605 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- Agreed, given that typedef class bitmap_head *bitmap; even if what it binds to was in scope, it really doesn't make much sense to use reference to the bitmap_head pointer.
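A minimal sketch of the point about references to a pointer typedef follows. Only the shape of the typedef is taken from the comment above; the holder struct, its member and make_holder are hypothetical names, and this is not the code at tree-assume.cc:108.

```cpp
// Stand-ins for GCC's types; only the typedef shape mirrors the comment.
struct bitmap_head { int bits = 0; };
typedef bitmap_head *bitmap;

// Hypothetical holder: it stores the pointer itself. A 'bitmap &m_bits'
// member would instead refer to whichever variable held the pointer and
// would dangle as soon as that variable went out of scope.
struct holder
{
    bitmap m_bits;
    explicit holder(bitmap b) : m_bits(b) {}
};

// 'b' is a by-value parameter that dies when the function returns; the
// holder stays valid because it copied the pointer rather than binding to 'b'.
holder make_holder(bitmap b)
{
    return holder(b);
}
```

Since a bitmap is already a pointer, copying it is as cheap as a reference and avoids tying the field's lifetime to a local variable.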
[Bug ipa/118125] [15 Regression] 7-16% slowdown of 510.parest_r on x86-64(-v3) since r15-6110-g92e0e0f8177530
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118125 --- Comment #2 from Martin Jambor --- On top of r15-7055-g459816efa13d9d I added a patch adding a dbg_counter to limit the updates. The slow-down is caused by two updates of value ranges in jump functions, both of which are necessary to get the slow-down: +***dbgcnt: lower limit 2175 reached for ipa_update_vr.*** +***dbgcnt: upper limit 2175 reached for ipa_update_vr.*** +ipa-prop: Updating jump function VR of _M_create_storage/62058 -> allocate.constprop/58360 1 based on info from __ct_base /216 -> _M_create_storage/62058 1 (which is being inlined). + After intersecting: [irange] size_type [1, +INF] and [irange] size_type [0, 1152921504606846975] MASK 0xfff VALUE 0x0, setting it to: [irange] size_type [1, 1152921504606846975] and later +***dbgcnt: lower limit 2175 reached for ipa_update_vr.*** +***dbgcnt: upper limit 2175 reached for ipa_update_vr.*** +ipa-prop: Updating jump function VR of _M_create_storage/62058 -> allocate.constprop/58360 1 based on info from __ct_base /216 -> _M_create_storage/62058 1 (which is being inlined).
+ After intersecting: [irange] size_type [1, +INF] and [irange] size_type [0, 1152921504606846975] MASK 0xfff VALUE 0x0, setting it to: [irange] size_type [1, 1152921504606846975] + Apart from slight inlining ordering differences, this leads to the following differences in inlining results for hottest function that has slowed down (excluding differences only in symtab_node UIDs): -- --- sum-fast2025-01-22 14:34:31.711998384 +0100 +++ sum-slow2025-01-22 14:35:05.579500645 +0100 @@ -1,12 +1,12 @@ IPA function summary for solve/39387 inlinable fp_expression - global time: 584936.726562 + global time: 584902.237305 self size: 55 - global size: 1305 - min size: 1296 + global size: 1301 + min size: 1292 self stack: 648 global stack:1194 estimated growth:329 -size:891.50, time:571399.283203 +size:889.50, time:571399.283203 size:3.00, time:2.00, executed if:(not inlined) size:0.50, time:0.50, executed if:(op1 not sra candidate) && (not inlined), nonconst if:(op1 not sra candidate) && (op1[ref offset: 576] changed) && (not inlined) size:0.50, time:0.50, executed if:(op1 not sra candidate), nonconst if:(op1 not sra candidate) && (op1[ref offset: 576] changed) @@ -298,18 +298,18 @@ __ct_base /66382 inlined freq:17.25 cross module Stack frame offset 1194, callee self size 0 -reinit.constprop/66851 inlined +reinit.constprop/66419 inlined freq:17.25 Stack frame offset 1194, callee self size 0 operator new []/151 function body not available freq:5.69 loop depth: 2 size: 3 time: 12 operator delete []/146 function body not available freq:3.04 loop depth: 2 size: 2 time: 11 - _ZN6dealii6VectorIdE6reinitEjb.part.0/66852 inlined + _ZN6dealii6VectorIdE6reinitEjb.part.0/66420 inlined freq:5.86 Stack frame offset 1194, callee self size 0 -operator delete []/146 function body not available - freq:3.14 loop depth: 2 size: 2 time: 11 +__builtin_unreachable/57309 unreachable + freq:0.00 cross module loop depth: 2 size: 0 time: 0 predicate: (false) __dt_base /7005 call is unlikely and code 
size would grow freq:0.00 cross module loop depth: 2 size: 2 time: 11 callee size: 7 stack: 0 op0 is compile time invariant -- I.e. one call of operator delete[](void*) from function dealii::Vector::reinit(unsigned int, bool) is determined to never be executed.
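The range update quoted in the dump can be reproduced with plain interval arithmetic. The sketch below assumes a 64-bit unsigned size_type, so that +INF is the maximum 64-bit value, and it ignores the MASK/VALUE bit information; urange and intersect are hypothetical names, not GCC's irange API.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical closed interval [lo, hi] over an unsigned 64-bit size_type.
struct urange
{
    std::uint64_t lo;
    std::uint64_t hi;
};

// Intersection of two overlapping closed intervals: the larger lower bound
// and the smaller upper bound.
urange intersect(urange a, urange b)
{
    return { std::max(a.lo, b.lo), std::min(a.hi, b.hi) };
}
```

Intersecting [1, +INF] with [0, 1152921504606846975] keeps the lower bound 1 from the first range and the upper bound from the second, which is 2^60 - 1.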
[Bug middle-end/118608] [14/15 regression][mips64] Lack of sign extension with -Os after r14-6915
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118608 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |14.3
[Bug target/115439] [15 Regression] ICEs after r15-638 on master-thumb_m55_hard_eabi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115439 Christophe Lyon changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2025-01-22
[Bug target/115439] [15 Regression] ICEs after r15-638 on master-thumb_m55_hard_eabi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115439 --- Comment #8 from Christophe Lyon --- Patch proposal: https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673791.html
[Bug c/118606] New: gcc/omp-general.cc:3294: Possible precedence problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118606 Bug ID: 118606 Summary: gcc/omp-general.cc:3294: Possible precedence problem Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: dcb314 at hotmail dot com Target Milestone: --- clang says: gcc/omp-general.cc:3294:9: warning: overloaded operator >> has higher precedence than comparison operator [-Woverloaded-shift-op-parentheses] Source code is if ((variants[i].score - 1) >> l <= (variants[i+1].score - 1) >> l) Maybe better code if (((variants[i].score - 1) >> l) <= ((variants[i+1].score - 1) >> l))
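As a small check of the precedence rule behind the warning: shift operators bind more tightly than relational operators, so the two spellings should compute the same result, and the extra parentheses only make the intent explicit. In this sketch a and b are hypothetical stand-ins for variants[i].score and variants[i+1].score, written as plain unsigned integers rather than the type used in omp-general.cc.

```cpp
// Unparenthesized form, as in the reported source line.
bool cmp_unparenthesized(unsigned long a, unsigned long b, unsigned l)
{
    return (a - 1) >> l <= (b - 1) >> l;
}

// Parenthesized form, as in the suggested replacement.
bool cmp_parenthesized(unsigned long a, unsigned long b, unsigned l)
{
    return ((a - 1) >> l) <= ((b - 1) >> l);
}
```

Overloaded operators keep the precedence of the built-in ones, so the same equivalence holds when the operands have a class type with its own operator>>.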
[Bug c++/118199] [15 regression] -fno-elide-constructors vs no_unique_address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118199 --- Comment #19 from GCC Commits --- The master branch has been updated by Simon Martin : https://gcc.gnu.org/g:e13e751d8144c9cfb7a9f1cd38119d1fa4ab38cf commit r15-7122-ge13e751d8144c9cfb7a9f1cd38119d1fa4ab38cf Author: Simon Martin Date: Wed Jan 22 10:44:32 2025 +0100 c++: Clear TARGET_EXPR_ELIDING_P when forced to use a copy constructor due to __no_unique_address__ [PR118199] We currently fail with a checking assert upon the following valid code when using -fno-elide-constructors === cut here === struct d { ~d(); }; d &b(); struct f { [[__no_unique_address__]] d e; }; struct h : f { h() : f{b()} {} } i; === cut here === The problem is that split_nonconstant_init_1 detects that it cannot elide the copy constructor due to __no_unique_address__ but does not clear TARGET_EXPR_ELIDING_P, and due to -fno-elide-constructors, we trip on a checking assert in cp_gimplify_expr. This patch fixes this by making sure that we clear TARGET_EXPR_ELIDING_P if we determine that we have to keep the copy constructor due to __no_unique_address__. An alternative would be to just check for elide_constructors in that assert, but I think it'd lose most of its value if we did so. PR c++/118199 gcc/cp/ChangeLog: * typeck2.cc (split_nonconstant_init_1): Clear TARGET_EXPR_ELIDING_P if we need to use a copy constructor because of __no_unique_address__. gcc/testsuite/ChangeLog: * g++.dg/init/no-elide3.C: New test.
[Bug c++/118199] [15 regression] -fno-elide-constructors vs no_unique_address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118199 Simon Martin changed: What|Removed |Added Resolution|--- |FIXED Known to work||15.0 Known to fail|15.0| Status|ASSIGNED|RESOLVED --- Comment #20 from Simon Martin --- Fixed in GCC 15.
[Bug target/118485] [15 Regression] gnat fails to build on m68k-linux-gnu-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118485 Eric Botcazou changed: What|Removed |Added Last reconfirmed||2025-01-22 Ever confirmed|0 |1 Status|UNCONFIRMED |WAITING --- Comment #3 from Eric Botcazou --- So that's specific to 32-bit architectures?
[Bug rtl-optimization/118610] [15 regression] gcc.dg/pr85467.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118610 Rainer Orth changed: What|Removed |Added Target Milestone|--- |15.0
[Bug rtl-optimization/118610] New: [15 regression] gcc.dg/pr85467.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118610 Bug ID: 118610 Summary: [15 regression] gcc.dg/pr85467.c FAILs Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ro at gcc dot gnu.org CC: vmakarov at gcc dot gnu.org Target Milestone: --- Target: sparc*-sun-solaris2.11 Between 20250117 (b5a069203fc074ab75d994c4a7e0f2db6a0a00fd) and 20250120 (10e98638998745ebc3888a20e661a8364e88ea3a), dozens of tests regressed on Solaris/SPARC (both 32 and 64-bit) like +FAIL: gcc.dg/pr85467.c (internal compiler error: maximum number of generated reload insns per insn achieved (90)) +FAIL: gcc.dg/pr85467.c (test for excess errors) Most of them were fixed by commit 07f62ed9a7b09951f83855e19d41641b098190b1 Author: Vladimir N. Makarov Date: Mon Jan 20 17:08:50 2025 -0500 [PR118560][LRA]: Fix typo in checking secondary memory mode for the reg class but this particular test still remains: Excess errors: during RTL pass: reload /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/pr85467.c:15:1: internal compiler error: maximum number of generated reload insns per insn achieved (90) 0x20bba77 internal_error(char const*, ...) /vol/gcc/src/hg/master/local/gcc/diagnostic-global-context.cc:517 0xe56b73 lra_constraints(bool) /vol/gcc/src/hg/master/local/gcc/lra-constraints.cc:5457 0xe3d183 lra(__FILE*, int) /vol/gcc/src/hg/master/local/gcc/lra.cc:2449 0xde503b do_reload /vol/gcc/src/hg/master/local/gcc/ira.cc:5977 0xde503b execute /vol/gcc/src/hg/master/local/gcc/ira.cc:6165
[Bug target/118485] [15 Regression] gnat fails to build on m68k-linux-gnu-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118485 Matthias Klose changed: What|Removed |Added Status|WAITING |UNCONFIRMED Ever confirmed|1 |0 --- Comment #4 from Matthias Klose --- I don't see any other gnat failures on 32bit archs. But I think m68k is the only big-endian one.
[Bug c++/101603] [meta-bug] pointer to member functions issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101603 Bug 101603 depends on bug 118509, which changed state. Bug 118509 Summary: [14 regression] Front-end produced uninitialized memory reference when compiling Nektar since r15-4595-gb25d3201b6338d https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118509 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug c++/118509] [14 regression] Front-end produced uninitialized memory reference when compiling Nektar since r15-4595-gb25d3201b6338d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118509 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #16 from Jakub Jelinek --- Temporary fix committed to 14.2.1, so it is fixed there as well, albeit at the cost of emitting extra assignments in GIMPLE (which, at least with -O and above, are most likely optimized away).
[Bug c++/118604] gcc/cp/parser.cc:51316: Non clear code produces clang warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118604 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org CC||jakub at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2025-01-22 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Jakub Jelinek --- Created attachment 60239 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60239&action=edit gcc15-pr118604.patch So far lightly tested patch. Also adjust the C FE for consistency.
[Bug target/116448] gcc.target/arm/vfp-1.c uses the wrong instructions on Cortex-M55
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116448 Richard Earnshaw changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2025-01-22 Ever confirmed|0 |1 --- Comment #1 from Richard Earnshaw --- This is a basic test of the ISA. It's probably more appropriate to use -Os so that the per-cpu instruction costs are ignored.
[Bug middle-end/118443] [Meta bug] Bugs triggered by and blocking more smtgcc testing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118443 Bug 118443 depends on bug 117186, which changed state. Bug 117186 Summary: [12/13 Regression] aarch64 wrong code for (a < b) < (b < a) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117186 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug c++/118590] [14/15 regression] ICE with acc enter data copyin and dependent types since r14-7033-g1413af02d62182
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118590 --- Comment #10 from Jakub Jelinek --- Note that it isn't enough, though. This approach of building something and only checking later whether it is valid has its drawbacks. // PR c++/118590 // { dg-do compile } template struct A { int z; }; template struct B { char *w; A y; }; template void foo (B &x) { A c = x.y; #pragma acc enter data copyin(__uint128_t[0 : c.z]) } void bar (B &x) { foo (x); } ICEs during instantiation: 0x3045d59 internal_error(char const*, ...) ../../gcc/diagnostic-global-context.cc:517 0x11c640e crash_signal ../../gcc/toplev.cc:322 0x7a6adf tsubst_decl ../../gcc/cp/pt.cc:15720 0x7a979f tsubst(tree_node*, tree_node*, int, tree_node*) ../../gcc/cp/pt.cc:16415 0x7caf85 tsubst_expr(tree_node*, tree_node*, int, tree_node*) ../../gcc/cp/pt.cc:22131 0x7bf112 tsubst_stmt ../../gcc/cp/pt.cc:19978
[Bug rtl-optimization/117186] [12/13/14 Regression] aarch64 wrong code for (a < b) < (b < a)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117186 --- Comment #11 from GCC Commits --- The releases/gcc-14 branch has been updated by Richard Sandiford : https://gcc.gnu.org/g:3228df20cfa3581015dc32657eb17d6f24af3104 commit r14-11237-g3228df20cfa3581015dc32657eb17d6f24af3104 Author: Richard Sandiford Date: Wed Jan 22 15:23:54 2025 + rtl: Remove invalid compare simplification [PR117186] g:d882fe5150fbbeb4e44d007bb4964e5b22373021, posted at https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html , added code to treat: (set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0))) as a nop. This PR shows that that isn't always correct. The compare in the set above is between two 0/1 booleans (at least on STORE_FLAG_VALUE==1 targets), whereas the unknown comparison that produced the incoming (reg:CC cc) is unconstrained; it could be between arbitrary integers, or even floats. The fold is therefore replacing a cc that is valid for both signed and unsigned comparisons with one that is only known to be valid for signed comparisons. (gt (compare (gt cc 0) (lt cc 0) 0) does simplify to: (gt cc 0) but: (gtu (compare (gt cc 0) (lt cc 0) 0) does not simplify to: (gtu cc 0) The optimisation didn't come with a testcase, but it was added for i386's cmpstrsi, now cmpstrnsi. That probably doesn't matter as much as it once did, since it's now conditional on -minline-all-stringops. But the patch is almost 25 years old, so whatever the original motivation was, it seems likely that other things now rely on it. It therefore seems better to try to preserve the optimisation on rtl rather than get rid of it. To do that, we need to look at how the result of the outer compare is used. 
We'd therefore be looking at four instructions (the gt, the lt, the compare, and the use of the compare), but combine already allows that for 3-instruction combinations thanks to: /* If the source is a COMPARE, look for the use of the comparison result and try to simplify it unless we already have used undobuf.other_insn. */ When applied to boolean inputs, a comparison operator is effectively a boolean logical operator (AND, ANDNOT, XOR, etc.). simplify_logical_relational_operation already had code to simplify logical operators between two comparison results, but: * It only handled IOR, which doesn't cover all the cases needed here. The others are easily added. * It treated comparisons of integers as having an ORDERED/UNORDERED result. Therefore: * it would not treat "true for LT + EQ + GT" as "always true" for comparisons between integers, because the mask excluded the UNORDERED condition. * it would try to convert "true for LT + GT" into LTGT even for comparisons between integers. To prevent an ICE later, the code used: /* Many comparison codes are only valid for certain mode classes. */ if (!comparison_code_valid_for_mode (code, mode)) return 0; However, this used the wrong mode, since "mode" is here the integer result of the comparisons (and the mode of the IOR), not the mode of the things being compared. Thus the effect was to reject all floating-point-only codes, even when comparing floats. I think instead the code should detect whether the comparison is between integer values and remove UNORDERED from consideration if so. It then always produces a valid comparison (or an always true/false result), and so comparison_code_valid_for_mode is not needed. In particular, "true for LT + GT" becomes NE for comparisons between integers but remains LTGT for comparisons between floats. * There was a missing check for whether the comparison inputs had side effects. 
While there, it also seemed worth extending simplify_logical_relational_operation to unsigned comparisons, since that makes the testing easier. As far as that testing goes: the patch exhaustively tests all combinations of integer comparisons in: (cmp1 (cmp2 X Y) (cmp3 X Y)) for the 10 integer comparisons, giving 1000 fold attempts in total. It then tries all combinations of (X in {-1,0,1} x Y in {-1,0,1}) on the result of the fold, giving 9 checks per fold, or 9000 in total. That's probably more than is typical for self-tests, but it seems to complete in negligible time, even for -O0 builds. gcc/ PR rtl-optimization/117186 * rtl.h (simplify_context::simplify_logical_relational_operation): Add an invert0_p parameter. * simplify-rtx.cc (unsigned_comparison_to_mask): New function. (mask_to_unsigned_comparison): Likewise. (comparison_code_valid_for_mode): Delete. (simplify_contex
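The signed/unsigned asymmetry described in the commit message can be checked on plain integers. This is only a model, under the assumption that the incoming cc came from comparing two 32-bit integers x and y; the function names are hypothetical and nothing here is taken from simplify-rtx.cc.

```cpp
#include <cstdint>

// Signed "greater than" applied to the 0/1 results of (x > y) and (x < y),
// modelling (gt (compare (gt cc 0) (lt cc 0)) 0).
bool gt_of_bools(std::int32_t x, std::int32_t y)
{
    return static_cast<int>(x > y) > static_cast<int>(x < y);
}

// Signed "greater than" on the original operands, modelling (gt cc 0).
bool gt_of_cc(std::int32_t x, std::int32_t y)
{
    return x > y;
}

// Unsigned "greater than" applied to the same 0/1 results,
// modelling (gtu (compare (gt cc 0) (lt cc 0)) 0).
bool gtu_of_bools(std::int32_t x, std::int32_t y)
{
    return static_cast<unsigned>(x > y) > static_cast<unsigned>(x < y);
}

// Unsigned "greater than" on the original operands, modelling (gtu cc 0).
bool gtu_of_cc(std::int32_t x, std::int32_t y)
{
    return static_cast<std::uint32_t>(x) > static_cast<std::uint32_t>(y);
}
```

Over X, Y in {-1, 0, 1} the two signed forms agree everywhere, while the unsigned forms differ for inputs such as x = -1, y = 0, where -1 is the largest value when viewed as unsigned.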
[Bug target/118184] [14 backport] glibc regression on aarch64 due to early_ra deleting movti instruction since r15-5422-g279475fd7236a9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118184 --- Comment #13 from GCC Commits --- The releases/gcc-14 branch has been updated by Richard Sandiford : https://gcc.gnu.org/g:51761b3b8b98e1b9ca02ae293de00644da83b85d commit r14-11236-g51761b3b8b98e1b9ca02ae293de00644da83b85d Author: Richard Sandiford Date: Wed Jan 22 15:23:54 2025 + aarch64: Detect word-level modification in early-ra [PR118184] REGMODE_NATURAL_SIZE is set to 64 bits for everything except VLA SVE modes. This means that it's possible to modify (say) the highpart of a TI pseudo or a V2DI pseudo independently of the lowpart. Modifying such highparts requires a reload if the highpart ends up in the upper 64 bits of an FPR, since RTL semantics do not allow the highpart of a single hard register to be modified independently of the lowpart. early-ra missed a check for this case, which meant that it effectively treated an assignment to (subreg:DI (reg:TI R) 0) as an assignment to the whole of R. gcc/ PR target/118184 * config/aarch64/aarch64-early-ra.cc (allocno_assignment_is_rmw): New function. (early_ra::record_insn_defs): Mark the live range information as untrustworthy if an assignment would change part of an allocno but preserve the rest. gcc/testsuite/ * gcc.dg/torture/pr118184.c: New test.
[Bug rtl-optimization/117186] [12/13 Regression] aarch64 wrong code for (a < b) < (b < a)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117186 Richard Sandiford changed: What|Removed |Added Summary|[12/13/14 Regression] |[12/13 Regression] aarch64 |aarch64 wrong code for (a < |wrong code for (a < b) < (b |b) < (b < a)|< a) Known to fail|14.1.0 | Known to work||14.2.1 Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #12 from Richard Sandiford --- Fixed on GCC 14 branch. Not planning to backport further.
[Bug c++/118396] [15 regression] -O1+ leads to reading uninitialized data when virtual destructor is present since r15-6369-gfa99002538bc91
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118396 --- Comment #16 from GCC Commits --- The trunk branch has been updated by Marek Polacek : https://gcc.gnu.org/g:cb828691fe692f9df002a2e3757a1aec68857e85 commit r15-7127-gcb828691fe692f9df002a2e3757a1aec68857e85 Author: Marek Polacek Date: Tue Jan 21 14:48:46 2025 -0500 c++: further tweak to cxx_eval_outermost_constant_expr [PR118396] This patch adds an error in a !allow_non_constant case when the initializer/object types don't match. PR c++/118396 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_outermost_constant_expr): Add an error call when !allow_non_constant. Reviewed-by: Jason Merrill
[Bug rtl-optimization/118611] New: LRA inserts unneeded reload on FMA chain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611 Bug ID: 118611 Summary: LRA inserts unneeded reload on FMA chain Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization, ra Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org Target Milestone: --- Target: aarch64* The following example: #include <arm_neon.h> float32x4_t bad (float32x4_t x, float32x4_t c0, float32x4_t c1, float32x4_t c3, float32x4_t c2) { float32x4_t z2 = vmulq_f32 (x, x); float32x4_t p1 = vfmaq_laneq_f32 (c1, z2, c3, 0); float32x4_t p2 = vfmaq_laneq_f32 (c2, z2, c3, 2); // Mov is inserted to save P1. (Correct behaviour) float32x4_t p5 = vfmaq_f32 (p1, z2, p1); float32x4_t p6 = vfmaq_f32 (p1, z2, p2); // Mov is inserted to save P5, which is only used once. (Unneeded) float32x4_t y = vfmaq_f32 (p5, x, p6); return vfmaq_f32 (c0, x, y); } compiled with -O3 generates: bad: fmul v31.4s, v0.4s, v0.4s fmla v2.4s, v31.4s, v3.s[0] fmla v4.4s, v31.4s, v3.s[2] mov v30.16b, v2.16b fmla v30.4s, v31.4s, v2.4s fmla v2.4s, v31.4s, v4.4s mov v31.16b, v30.16b fmla v31.4s, v0.4s, v2.4s fmla v1.4s, v0.4s, v31.4s mov v0.16b, v1.16b ret where the second MOV is unneeded because v30 isn't live after the FMA. It seems that we know the lifetime (insn 17 16 18 2 (set (reg:V4SF 101 [ _8 ]) (fma:V4SF (reg:V4SF 118 [ x ]) (reg:V4SF 102 [ _9 ]) (reg:V4SF 103 [ _10 ]))) "":11639:10 2407 {fmav4sf4} (expr_list:REG_DEAD (reg:V4SF 103 [ _10 ]) (expr_list:REG_DEAD (reg:V4SF 102 [ _9 ]) (nil but still: Choosing alt 0 in insn 17: (0) =w (1) w (2) w (3) 0 {fmav4sf4} Creating newreg=125 from oldreg=103, assigning class FP_REGS to r125 17: r125:V4SF={r118:V4SF*r102:V4SF+r125:V4SF} REG_DEAD r103:V4SF REG_DEAD r102:V4SF Inserting insn reload before: 33: r125:V4SF=r103:V4SF Inserting insn reload after: 34: r101:V4SF=r125:V4SF
[Bug ipa/118125] [15 Regression] 7-16% slowdown of 510.parest_r on x86-64(-v3) since r15-6110-g92e0e0f8177530
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118125 --- Comment #3 from Martin Jambor --- Unfortunately I have lost access to the machine where I was debugging this due to some networking issue. Just before that, I discovered an extra SLP vectorization in the slow version of the hottest function. I'll try to get back to this soon-ish.
[Bug target/118184] [14 backport] glibc regression on aarch64 due to early_ra deleting movti instruction since r15-5422-g279475fd7236a9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118184 Richard Sandiford changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #14 from Richard Sandiford --- Fixed
[Bug rtl-optimization/118608] New: [14/15 regression][mips64] Lack of sign extension with -Os after r14-6915
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118608 Bug ID: 118608 Summary: [14/15 regression][mips64] Lack of sign extension with -Os after r14-6915 Product: gcc Version: 14.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: mateuszmar2 at gmail dot com Target Milestone: --- Hi, mips64-octeon2-linux-gnu-gcc v14.2.0 and newer generates wrong instructions for following code compiled with -Os: #include #include #define COUNT 10 typedef unsigned short u16; typedef unsigned int u32; typedef struct NeedleAddress { u16 nId; u16 mId; } NeedleAddress; u32 __attribute__ ((noinline)) prepareNeedle(const u16 upper, const u16 lower) { u32 needleAddress = 0; NeedleAddress *const addr = (NeedleAddress*)(&needleAddress); addr->mId = upper; addr->nId = lower; return needleAddress; } const u32* __attribute__ ((noinline)) findNeedle(const u32 needle, const u32* begin, const u32* end) { while ( begin != end && needle != *begin ) { ++begin; } return begin; } int main() { u32 needle = prepareNeedle(0xDCBA, 0xABCD); u32 haystack[COUNT] = {}; for (int i = 0; i < COUNT; i++) haystack[i] = needle; const u32* result = findNeedle(needle, haystack, haystack + COUNT); if (result == haystack + COUNT) printf("Wrong!\n"); else printf("Good!\n"); return 0; } We noticed this problem after an upgrade from gcc12.2.0 to gcc14.2.0. It seems that sign extension is not done by gcc14 in contrast to gcc12 which does it. The main difference seems to be in findNeedle() function. 
Its loop is executed COUNT times in case of gcc14 because needle is never equal to *begin: "needle != *begin" comparison in assembly and dump of registers compiled by gcc12: 0x00012b88 : 12120005beq s0,s2,0x12ba0 0x00012b8c : dfbf0028ld ra,40(sp) 0x00012b90 : 8e02lw v0,0(s0) => 0x00012b94 : 1451000abne v0,s1,0x12bc0 0x00012b98 : df998090ld t9,-32624(gp) (gdb) info registers zero at v0 v1 R0 0001 abcddcba 0001 a0 a1 a2 a3 R4 abcddcba 0001200112a0 0001200112c8 00fff7f7 a4 a5 a6 a7 R8 0004 00fff7fa0b40 t0 t1 t2 t3 R12 000120011290 0003 00fff7fe6fff s0 s1 s2 s3 R16 0001200112a0 abcddcba 0001200112c8 00012c60 s4 s5 s6 s7 R20 00012910 00ffcc78 00fff7ffae80 00fff7ffb7d8 t8 t9 k0 k1 R24 00012b4c gp sp s8 ra R28 000120018ca0 00ffcaa0 000120010c88 00012988 status lo hi badvaddr 9cf3 0028 000120011008 cause pc 00800024 00012b94 fcsr fir restart 00f3 "needle != *begin" comparison in assembly and dump of registers compiled by gcc14: 0x00aa0c58 : 12120005beq s0,s2,0xaa0c70 0x00aa0c5c : dfbf0028ld ra,40(sp) 0x00aa0c60 : 8e02lw v0,0(s0) => 0x00aa0c64 : 1451000abne v0,s1,0xaa0c90 0x00aa0c68 : df998098ld t9,-32616(gp) (gdb) info registers zero at v0 v1 R0 0001 abcddcba 0001 a0 a1 a2 a3 R4 abcddcba 00ab12a0 00ab12c8 00fff7f7 a4 a5 a6
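To make the sign-extension issue concrete: on MIPS64 the lw instruction sign-extends the loaded 32-bit word into the 64-bit register, and bne compares full registers. The sketch below models the two register contents with 64-bit integers; the helper names are hypothetical, and it assumes the mismatch is between a sign-extended and a non-sign-extended copy of the same 32-bit needle.

```cpp
#include <cstdint>

// 64-bit register image of a 32-bit value after sign extension (what lw yields).
std::uint64_t sign_extend32(std::uint32_t v)
{
    // Flip the sign bit, then subtract it again in 64-bit unsigned arithmetic;
    // the wrap-around fills the upper 32 bits with copies of bit 31.
    return (static_cast<std::uint64_t>(v) ^ 0x80000000ULL) - 0x80000000ULL;
}

// 64-bit register image of the same value with the upper 32 bits cleared.
std::uint64_t zero_extend32(std::uint32_t v)
{
    return static_cast<std::uint64_t>(v);
}

// A full-register comparison, as bne performs it.
bool regs_differ(std::uint64_t a, std::uint64_t b)
{
    return a != b;
}
```

For the needle 0xabcddcba from the test case, bit 31 is set, so the two images differ in their upper halves and a 64-bit comparison never reports equality. A value with bit 31 clear, such as 0x12345678, would compare equal either way, which is why the choice of needle matters.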
[Bug tree-optimization/118605] gcc/tree-assume.cc:108: dangling field problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118605 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 Last reconfirmed||2025-01-22 --- Comment #3 from Jakub Jelinek --- Created attachment 60240 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60240&action=edit gcc15-pr118605.patch Lightly tested patch.
[Bug target/118609] New: gcc.target/i386/amxmovrs-t2rpntlvw-2.c etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118609 Bug ID: 118609 Summary: gcc.target/i386/amxmovrs-t2rpntlvw-2.c etc. FAIL Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ro at gcc dot gnu.org CC: haochen.jiang at intel dot com, lin1.hu at intel dot com Target Milestone: --- Target: i?86-*-*, x86_64-*-* When switching gas from 2.43 to 2.43.90, two tests FAIL on all x86 targets for 64-bit: +FAIL: gcc.target/i386/amxmovrs-t2rpntlvw-2.c (test for excess errors) Excess errors: /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/amxmovrs-t2rpntlvw-2.c:39: Warning: operand 2 `%tmm1' implicitly denotes `%tmm0' to `%tmm1' group in `t2rpntlvwz1' /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/amxmovrs-t2rpntlvw-2.c:53: Warning: operand 2 `%tmm1' implicitly denotes `%tmm0' to `%tmm1' group in `t2rpntlvwz1t1' +FAIL: gcc.target/i386/amxtranspose-2rpntlvw-2.c (test for excess errors) Excess errors: /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/amxtranspose-2rpntlvw-2.c:35: Warning: operand 2 `%tmm1' implicitly denotes `%tmm0' to `%tmm1' group in `t2rpntlvwz1' That gas warning was introduced in binutils commit commit 3c17b69fa1ac3b5c820caf5431532aa79e1e28cf Author: Jan Beulich Date: Mon Nov 18 11:45:34 2024 +0100 x86: generalize "implicit quad group" handling No idea where the problem is, though, but with the binutils 2.44 release imminent, this should be fixed somehow. With gas 2.43, the tests were UNSUPPORTED.
[Bug target/118609] gcc.target/i386/amxmovrs-t2rpntlvw-2.c etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118609 Rainer Orth changed: What|Removed |Added Target Milestone|--- |15.0
[Bug c/118606] gcc/omp-general.cc:3294: Possible precedence problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118606 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- What is confusing about that? Is that any different from a non-overloaded operator? We have combine.cc: && nonzero_bits (XEXP (varop, 1), int_result_mode) >> count == 0 combine.cc: && const_op >> i == 0 simplify-rtx.cc: if (mask >> count == INTVAL (trueop1) in other places (sure, it isn't overloaded there, but should have the same precedence regardless of being overloaded or not).
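The precedence point can be checked with a small C sketch. The function names are hypothetical and the code is not taken from combine.cc or simplify-rtx.cc; it only shows that the shift is evaluated before the equality test, with or without the extra parentheses.

```c
#include <stdbool.h>

/* Written as in the quoted GCC sources: shift binds tighter than ==,
   so this is (mask >> count) == expected.  */
bool
shifted_equals_plain (unsigned mask, unsigned count, unsigned expected)
{
  return mask >> count == expected;
}

/* Same test with the grouping spelled out.  */
bool
shifted_equals_parenthesized (unsigned mask, unsigned count, unsigned expected)
{
  return (mask >> count) == expected;
}

/* What the expression would mean if == bound tighter instead.  */
unsigned
shift_by_comparison (unsigned mask, unsigned count, unsigned expected)
{
  return mask >> (count == expected);
}
```

The first two functions always agree. The third is a different computation, which is the misreading the extra parentheses would rule out for a human reader.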
[Bug c/118403] uninitialized warning with automatic union
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118403 --- Comment #14 from Stephen Hemminger --- (In reply to Sam James from comment #13) > (In reply to Stephen Hemminger from comment #12) > > What does `gcc --version` give? $ gcc-15 --version gcc-15 (GCC) 15.0.0 2024 (experimental)
[Bug c++/118590] [14/15 regression] ICE with acc enter data copyin and dependent types since r14-7033-g1413af02d62182
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118590 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #9 from Jakub Jelinek --- Created attachment 60241 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60241&action=edit gcc15-pr118590.patch Patch so far tested with make check RUNTESTFLAGS=c++.exp in libgomp and make check-g++ RUNTESTFLAGS='gomp.exp goacc.exp goacc-gomp.exp' in gcc.
[Bug tree-optimization/118605] gcc/tree-assume.cc:108: dangling field problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118605 Andrew Pinski changed: What|Removed |Added Component|c |tree-optimization --- Comment #1 from Andrew Pinski --- I suspect m_parm_list should just be bitmap instead. It is a pointer after all.
[Bug rtl-optimization/118591] [lra][avr] Wrong code with -mlra in pr43879-3.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118591 --- Comment #3 from Georg-Johann Lay --- Created attachment 60238 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60238&action=edit C99 test case that fails on ordinary AVRs (not avrtiny) This test case fails on ordinary AVRs like -mmcu=atmega128. It passes more arguments and clobbers some regs so that b lives in a stack slot: __attribute__((noipa)) void func2 (long long a1, long long a2, long b) { static unsigned char count = 0; if (b != count++) __builtin_abort (); } int main (void) { for (long b = 0; b < 5; ++b) { __asm ("" ::: "r5", "r9"); func2 (0, 0, b); } return 0; } $ avr-gcc -mmcu=atmega128 -dumpbase "" -save-temps -dp -Os -mlra ... The argument preparation for b reads: .L4: ldd r24,Y+4 ; 63 [c=4 l=1] movqi_insn/3 push r24 ; 9 [c=4 l=1] pushqi1/0 ldd r24,Y+4 ; 64 [c=4 l=1] movqi_insn/3 push r24 ; 11 [c=4 l=1] pushqi1/0 ldd r24,Y+4 ; 65 [c=4 l=1] movqi_insn/3 push r24 ; 13 [c=4 l=1] pushqi1/0 ldd r24,Y+4 ; 66 [c=4 l=1] movqi_insn/3 push r24 ; 15 [c=4 l=1] pushqi1/0 What's also strange is that the code has a frame size of 4 and consequently, b lives in Y[1]:SI. However, after the call of func2, the code accesses Y[5]:SI that was never initialized: /* prologue: function */ /* frame size = 4 */ /* stack size = 4 */ .L__stack_usage = 4 # long b = 0; std Y+1,__zero_reg__ ; 115 [c=4 l=1] movqi_insn/2 std Y+2,__zero_reg__ ; 116 [c=4 l=1] movqi_insn/2 std Y+3,__zero_reg__ ; 117 [c=4 l=1] movqi_insn/2 std Y+4,__zero_reg__ ; 118 [c=4 l=1] movqi_insn/2 .L4: # Crippled passing of b. Seems like LRA thinks that PUSH changes Y. ldd r24,Y+4 ; 63 [c=4 l=1] movqi_insn/3 push r24 ; 9 [c=4 l=1] pushqi1/0 ldd r24,Y+4 ; 64 [c=4 l=1] movqi_insn/3 push r24 ; 11 [c=4 l=1] pushqi1/0 ldd r24,Y+4 ; 65 [c=4 l=1] movqi_insn/3 push r24 ; 13 [c=4 l=1] pushqi1/0 ldd r24,Y+4 ; 66 [c=4 l=1] movqi_insn/3 push r24 ; 15 [c=4 l=1] pushqi1/0 # Prepare long long args a1 and a2. ... 
rcall func2 ; 32 [c=0 l=1] call_insn/1 # Following access is wrong. As it seems, regalloc # thinks that the PUSHes above changed the frame pointer? ldd r24,Y+5 ; 102 [c=4 l=1] movqi_insn/3 ldd r25,Y+6 ; 103 [c=4 l=1] movqi_insn/3 ldd r26,Y+7 ; 104 [c=4 l=1] movqi_insn/3 ldd r27,Y+8 ; 105 [c=4 l=1] movqi_insn/3 adiw r24,1 ; 84 [c=16 l=3] *addsi3/1 adc r26,__zero_reg__ adc r27,__zero_reg__ std Y+5,r24 ; 106 [c=4 l=1] movqi_insn/2 std Y+6,r25 ; 107 [c=4 l=1] movqi_insn/2 std Y+7,r26 ; 108 [c=4 l=1] movqi_insn/2 std Y+8,r27 ; 109 [c=4 l=1] movqi_insn/2 ; SP += 4 ; 35 [c=8 l=4] *addhi3_sp pop __tmp_reg__ pop __tmp_reg__ pop __tmp_reg__ pop __tmp_reg__ # After popping b, loading b is from the correct offset: ldd r24,Y+1 ; 110 [c=4 l=1] movqi_insn/3 ldd r25,Y+2 ; 111 [c=4 l=1] movqi_insn/3 ldd r26,Y+3 ; 112 [c=4 l=1] movqi_insn/3 ldd r27,Y+4 ; 113 [c=4 l=1] movqi_insn/3 sbiw r26,0 ; 87 [c=28 l=3] *cmpsi/2 sbci r25,hi8(5) sbci r24,lo8(5) brne .L4 ; 88 [c=4 l=1] branch ...
[Bug c++/118590] [14/15 regression] ICE with acc enter data copyin and dependent types since r14-7033-g1413af02d62182
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118590 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #8 from Jakub Jelinek --- I think we need something like --- gcc/cp/typeck.cc.jj 2025-01-21 16:26:04.154690509 +0100 +++ gcc/cp/typeck.cc 2025-01-22 15:09:09.487161968 +0100 @@ -4867,6 +4867,11 @@ tree build_omp_array_section (location_t loc, tree array_expr, tree index, tree length) { + if (TREE_CODE (array_expr) == TYPE_DECL + || type_dependent_expression_p (array_expr)) + return build3_loc (loc, OMP_ARRAY_SECTION, NULL_TREE, array_expr, index, + length); + tree type = TREE_TYPE (array_expr); gcc_assert (type); type = non_reference (type); If array_expr is a type dependent expression, then all attempts to do something about its type are bogus. The type can be NULL or something the code doesn't really handle. But unfortunately type_dependent_expression_p will ICE if called on a TYPE_DECL which happens on pr67522.C. It will also ICE if array_expr is TYPE_P, but unsure if that can appear there. Sadly diagnostics for array sections is done much later...
[Bug rtl-optimization/118067] [15 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1860 (unable to find a register to spill) {*lshrhi3_1} with -O -fno-split-wide-types -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118067 --- Comment #19 from Uroš Bizjak --- (In reply to Eric Botcazou from comment #18) > The latest change made for this PR has introduced a number of regressions on > the SPARC of the form: FAOD, the referred "latest change" is: commit r15-6968-gd9835825b3d7193b3d6669174f4386be2cb1 Author: Vladimir N. Makarov Date: Thu Jan 16 12:17:31 2025 -0500 [PR118067][LRA]: Use the right mode to evaluate secondary memory reload ?
[Bug rtl-optimization/118067] [15 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1860 (unable to find a register to spill) {*lshrhi3_1} with -O -fno-split-wide-types -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118067 Eric Botcazou changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED --- Comment #20 from Eric Botcazou --- Most of them have been fixed by: https://gcc.gnu.org/g:07f62ed9a7b09951f83855e19d41641b098190b1 commit r15-7083-g07f62ed9a7b09951f83855e19d41641b098190b1 Author: Vladimir N. Makarov Date: Mon Jan 20 17:08:50 2025 -0500 [PR118560][LRA]: Fix typo in checking secondary memory mode for the reg class The patch for PR118067 wrongly checked hard reg set subset. It worked for the equal sets as in PR118067. But it was wrong in other cases as in PR118560 (inordinate compile time). gcc/ChangeLog: PR target/118560 * lra-constraints.cc (invalid_mode_reg_p): Exchange args in hard_reg_set_subset_p call.
[Bug rtl-optimization/118610] [15 regression] gcc.dg/pr85467.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118610 Bug 118610 depends on bug 118067, which changed state. Bug 118067 Summary: [15 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1860 (unable to find a register to spill) {*lshrhi3_1} with -O -fno-split-wide-types -mavx512f https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118067 What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED
[Bug rtl-optimization/117868] [avr][lra] Wrong code with -mlra in simd-1.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117868 Thomas Schwinge changed: What|Removed |Added CC||tschwinge at gcc dot gnu.org, ||vmakarov at gcc dot gnu.org --- Comment #7 from Thomas Schwinge --- (In reply to GCC Commits from comment #6) > commit r15-7078-g5cd4605141b8b45cab95e4de8005c69273071107 > Author: Denis Chertykov > Date: Tue Jan 21 00:27:04 2025 +0400 > > [PR117868][LRA]: Restrict the reuse of spill slots For the record, this commit also cures a timeout regression on powerpc64le-unknown-linux-gnu, that we recently had acquired with commit r15-7008-g9f009e8865cda01310c52f7ec8bdaa3c557a2745 "[PR118067][LRA]: Check secondary memory mode for the reg class": {+WARNING: gcc.target/powerpc/pr79916.c (test for excess errors) program timed out.+} [-PASS:-]{+FAIL:+} gcc.target/powerpc/pr79916.c (test for excess errors) This is now back to PASS.
[Bug rtl-optimization/117868] [avr][lra] Wrong code with -mlra in simd-1.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117868 --- Comment #8 from Thomas Schwinge --- (In reply to me from comment #7) > (In reply to GCC Commits from comment #6) > > commit r15-7078-g5cd4605141b8b45cab95e4de8005c69273071107 > > Author: Denis Chertykov > > Date: Tue Jan 21 00:27:04 2025 +0400 > > > > [PR117868][LRA]: Restrict the reuse of spill slots > > For the record, this commit also cures [...] Please disregard; operator error, I still had the other commit reverted in my testing tree.
[Bug c/118606] gcc/omp-general.cc:3294: Possible precedence problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118606 Eric Gallager changed: What|Removed |Added Status|UNCONFIRMED |NEW CC||egallager at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2025-01-22 --- Comment #2 from Eric Gallager --- I think the additional parentheses would be helpful; confirmed.
[Bug c++/118604] New: gcc/cp/parser.cc:51316: Non clear code produces clang warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118604 Bug ID: 118604 Summary: gcc/cp/parser.cc:51316: Non clear code produces clang warning Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: dcb314 at hotmail dot com Target Milestone: --- clang says: gcc/cp/parser.cc:51316:11: warning: logical not is only applied to the left hand side of this comparison [-Wlogical-not-parentheses] Source code is if (!strcmp (p, "when") == 0 && !default_p) Maybe better code: if (strcmp (p, "when") != 0 && !default_p)
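To see that the suggested rewrite keeps the behaviour, here is a small C sketch comparing the two spellings. The function names are hypothetical, and the first version adds parentheses around the negation only to make the existing parse explicit (the source in parser.cc has none).

```c
#include <stdbool.h>
#include <string.h>

/* How the current condition parses: ! applies to strcmp's result
   first, and the resulting 0/1 is then compared with 0.  */
bool
not_when_as_parsed (const char *p)
{
  return (!strcmp (p, "when")) == 0;
}

/* The spelling suggested in the report.  */
bool
not_when_suggested (const char *p)
{
  return strcmp (p, "when") != 0;
}
```

Both return true exactly when p is not "when", so the change only affects readability and the clang warning, not the result.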
[Bug c++/118199] [15 regression] -fno-elide-constructors vs no_unique_address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118199 Simon Martin changed: What|Removed |Added Known to work||12.4.0, 13.3.0, 14.2.0 --- Comment #18 from Simon Martin --- Set "Known to work" according to https://godbolt.org/z/4oPxhKKx8
[Bug c/118605] New: gcc/tree-assume.cc:108: dangling field problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118605 Bug ID: 118605 Summary: gcc/tree-assume.cc:108: dangling field problem Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: dcb314 at hotmail dot com Target Milestone: --- clang says: gcc/tree-assume.cc:108:67: warning: binding reference member 'm_parm_list' to stack allocated parameter 'p' [-Wdangling-field] Source code is assume_query::assume_query (function *f, bitmap p) : m_parm_list (p), Maybe better code: assume_query::assume_query (function *f, const bitmap & p) : m_parm_list (p),
[Bug rtl-optimization/118067] [15 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1860 (unable to find a register to spill) {*lshrhi3_1} with -O -fno-split-wide-types -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118067 Eric Botcazou changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- CC||ebotcazou at gcc dot gnu.org --- Comment #18 from Eric Botcazou --- The latest change made for this PR has introduced a number of regressions on the SPARC of the form: internal compiler error: maximum number of generated reload insns per insn achieved (90) for example: FAIL: gcc.dg/torture/pr103181.c -O0 (internal compiler error: maximum number of generated reload insns per insn achieved (90)
[Bug rtl-optimization/118610] [15 regression] gcc.dg/pr85467.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118610 Eric Botcazou changed: What|Removed |Added Last reconfirmed||2025-01-22 Ever confirmed|0 |1 CC||ebotcazou at gcc dot gnu.org Status|UNCONFIRMED |NEW Depends on||118067 --- Comment #1 from Eric Botcazou --- Indeed, I have reopened PR rtl-optimization/118067 Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118067 [Bug 118067] [15 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1860 (unable to find a register to spill) {*lshrhi3_1} with -O -fno-split-wide-types -mavx512f
[Bug c++/55004] [meta-bug] constexpr issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55004 Bug 55004 depends on bug 82304, which changed state. Bug 82304 Summary: GCC compiles constexpr function with double reinterpret_cast in a constant context https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82304 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug c++/82304] GCC compiles constexpr function with double reinterpret_cast in a constant context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82304 Marek Polacek changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED CC||mpolacek at gcc dot gnu.org --- Comment #4 from Marek Polacek --- Looks fixed.
[Bug ipa/118097] [15 regression] recent bug with -O2, but not -O1 since r15-6294-g96fb71883d438b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118097 --- Comment #33 from Martin Jambor --- I have proposed a patch on the mailing list: https://inbox.sourceware.org/gcc-patches/ri6frlax0fz@virgil.suse.cz/T/#u
[Bug middle-end/118594] expansion of bf16 to float should not produce (subreg (mem ())
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118594 --- Comment #8 from Richard Sandiford --- I was going to say that force_subreg should call force_operand: diff --git a/gcc/explow.cc b/gcc/explow.cc index 7799a98053b..3f378174268 100644 --- a/gcc/explow.cc +++ b/gcc/explow.cc @@ -759,7 +759,9 @@ force_subreg (machine_mode outermode, rtx op, auto *start = get_last_insn (); op = copy_to_mode_reg (innermode, op); rtx res = simplify_gen_subreg (outermode, op, innermode, byte); - if (!res) + if (res) +res = force_operand (res, NULL_RTX); + else delete_insns_since (start); return res; } But that doesn't work because force_operand considers the subreg to be valid (and general_operand agrees). So if this isn't a regression, it might be better to wait until GCC 16 and disallow all (subreg (mem)) for INSN_SCHEDULING. (Or completely, if possible.)
[Bug tree-optimization/118602] Missed optimization for (0xc0 & c) == 0x80
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118602 --- Comment #4 from Peter Damianov --- From what I can tell this applies to all integers: bool f (int c) { return (0xc000 & c) == 0x8000; } has the same transformation applied. What clang does is valid because the function is checking the following bit pattern: 10XX the biggest value satisfying this bit pattern is: 1011 which is -64. And the other end of the range doesn't need to be considered, because: 1000 is INT8_MIN. Clang 19 generates the slightly different: f(char): and dil, -64 neg dil seto al ret Which still spares doing a comparison, but is a little bigger. I think it's doing the mask, then negating and checking for overflow. Only INT_MIN will overflow if negated, so INT_MIN is the only value that will satisfy the bit pattern after all the other bits are masked off. This second transformation is probably generalizable to more circumstances. I don't know if it's worth filing a separate bug for that one. GCC doesn't do either of these things. I did try searching for dups and couldn't find any, but I have a vague memory of reading something similar here before...
[Bug rtl-optimization/117868] [avr][lra] Wrong code with -mlra in simd-1.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117868 Denis Chertykov changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #9 from Denis Chertykov --- Fixed
[Bug middle-end/56183] [meta-bug][avr] Problems with register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56183 Bug 56183 depends on bug 117868, which changed state. Bug 117868 Summary: [avr][lra] Wrong code with -mlra in simd-1.c https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117868 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug target/113934] Switch avr to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 Bug 113934 depends on bug 117868, which changed state. Bug 117868 Summary: [avr][lra] Wrong code with -mlra in simd-1.c https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117868 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug target/118497] [15 Regression] Worse code generated on i686-linux since r15-1619
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118497 --- Comment #3 from Vladimir Makarov --- (In reply to Andrew Pinski from comment #2) > I wonder if the patch in r15-2810-g3c67a0fa1dd39a3378deb854a7fef0ff7fe38004 > (which was reverted due to a bootstrap failure on aarch64) fixes this one > too .. No, reverting the patch did not fix it. The problem happens before LRA. Simply a pseudo is assigned to ax in RA instead of bx as it was before. Something became wrong with the hard reg cost calculation. I'll look at this. Cost calculation is sensitive code and can trigger new testsuite failures on different targets. So it will need a lot of testing but I hope to fix it this week.
[Bug c/118614] New: [riscv] Naked function attribute on riscv optimizes away C conditional
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118614 Bug ID: 118614 Summary: [riscv] Naked function attribute on riscv optimizes away C conditional Product: gcc Version: 14.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: charlie at rivosinc dot com Target Milestone: --- riscv GCC assumes that when the s0 register is used as a function pointer it will always be non-zero. This causes the body of an if statement that is predicated on this s0 variable to always execute. I understand that the wording in the docs for naked functions says: "While using extended asm or a mixture of basic asm and C code may appear to work, they cannot be depended upon to work reliably and are not supported." However, since this issue appears to be constrained to the sp and s0 registers, I was wondering what was going on here. Here is the code: void __attribute__((__naked__)) ret_from_fork() { register int (*fn)(void *) asm("s0"); register void *fn_arg asm("s1"); if (fn) fn(fn_arg); } outputs: ret_from_fork: mv a0,s1 jalr s0 A godbolt link containing this: https://godbolt.org/z/h3cv6e18K.
[Bug c++/117827] [12/13/14/15 regression] Incorrect destructor calls after array-new-expression since r12-6328-gbeaee0a871b648
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117827 --- Comment #4 from Jakub Jelinek --- That regressed FAIL: g++.dg/eh/aggregate1.C -std=c++98 execution test FAIL: g++.dg/eh/aggregate1.C -std=c++11 execution test FAIL: g++.dg/eh/aggregate1.C -std=c++14 execution test FAIL: g++.dg/eh/aggregate1.C -std=c++17 execution test FAIL: g++.dg/eh/aggregate1.C -std=c++20 execution test FAIL: g++.dg/eh/aggregate1.C -std=c++23 execution test FAIL: g++.dg/eh/aggregate1.C -std=c++26 execution test FAIL: g++.dg/eh/ref-temp2.C -std=c++11 execution test FAIL: g++.dg/eh/ref-temp2.C -std=c++14 execution test FAIL: g++.dg/eh/ref-temp2.C -std=c++17 execution test FAIL: g++.dg/eh/ref-temp2.C -std=c++20 execution test FAIL: g++.dg/eh/ref-temp2.C -std=c++23 execution test FAIL: g++.dg/eh/ref-temp2.C -std=c++26 execution test FAIL: g++.dg/init/aggr7-eh3.C -std=c++98 execution test FAIL: g++.dg/init/aggr7-eh3.C -std=c++11 execution test FAIL: g++.dg/init/aggr7-eh3.C -std=c++14 execution test FAIL: g++.dg/init/array66.C -std=c++98 (test for excess errors) FAIL: g++.dg/init/aggr7-eh3.C -std=c++17 execution test FAIL: g++.dg/init/aggr7-eh3.C -std=c++20 execution test FAIL: g++.dg/init/aggr7-eh3.C -std=c++23 execution test FAIL: g++.dg/init/aggr7-eh3.C -std=c++26 execution test so back to the drawing board.
[Bug tree-optimization/118605] gcc/tree-assume.cc:108: dangling field problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118605 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=63181 --- Comment #4 from Andrew Pinski --- Note the missing warning in the GCC's C++ front-end is recorded as PR 63181.
[Bug target/118597] [15 Regression] gcc.dg/vect/vect-fncall-mask.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118597 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2025-01-22 See Also||https://linaro.atlassian.ne ||t/browse/GNU-1503 --- Comment #3 from Andrew Pinski --- .
[Bug c++/115769] Implement CWG 2867 - Order of initialization for structured bindings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115769 --- Comment #11 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6db9d4e954bff3dfd926c7c9b71e41e47b7089c8 commit r15-7128-g6db9d4e954bff3dfd926c7c9b71e41e47b7089c8 Author: Jakub Jelinek Date: Wed Jan 22 19:36:36 2025 +0100 c++: Implement for static locals CWG 2867 - Order of initialization for structured bindings [PR115769] On Wed, Aug 14, 2024 at 10:06:24AM +0200, Jakub Jelinek wrote: > Though, now that I think about it again, perhaps what we could do instead > is just make sure the _ZGVZ3barvEDC1x1y1z1wE initialization doesn't have > a CLEANUP_POINT_EXPR in it and wrap both the _ZGVZ3barvEDC1x1y1z1wE > and cp_finish_decomp created stuff into a single CLEANUP_POINT_EXPR. > That way, perhaps _ZGVZ3barvEDC1x1y1z1wE could be initialized by one thread > and _ZGVZ3barvE1x by a different, but the temporaries from _ZGVZ3barvEDC1x1y1z1wE > initialization would be only destructed after the _ZGVZ3barvE1w guard > was released by the thread which initialized _ZGVZ3barvEDC1x1y1z1wE. Here is the I believe ABI compatible version, which uses the separate guard variables, so different structured binding variables can be initialized in different threads, but the thread that did the artificial base initialization will keep temporaries live at least until the last guard variable is released (i.e. when even that variable has been initialized). 2025-01-22 Jakub Jelinek PR c++/115769 * decl.cc: Partially implement CWG 2867 - Order of initialization for structured bindings. (cp_finish_decl): If need_decomp_init, for function scope structure binding bases, temporarily clear stmts_are_full_exprs_p before calling expand_static_init, after it call cp_finish_decomp and wrap code emitted by both into maybe_cleanup_point_expr_void and ensure cp_finish_decomp isn't called again. * g++.dg/DRs/dr2867-3.C: New test. * g++.dg/DRs/dr2867-4.C: New test.
[Bug rtl-optimization/118502] Add shrink wrapping testcase for vector::push_back
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118502 --- Comment #4 from Andrew Pinski --- (In reply to Peter Bergner from comment #3) > It's nice to see Surya's patch finally actually helping other targets than > rs6000. That said... > > > (In reply to Andrew Pinski from comment #2) > > Patch for the testcase posted: > > https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673875.html > > I lost the email to your post, so Thanks for the typo fixes, I will fix them locally and make sure they are included before I get to push it.
[Bug middle-end/118614] [riscv] Naked function attribute on riscv optimizes away C conditional
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118614 Andrew Pinski changed: What|Removed |Added Component|c |middle-end --- Comment #1 from Andrew Pinski --- I suspect this is just invalid. Based on the documentation of naked and based on the documentation of register asm variables.
[Bug tree-optimization/118602] Missed optimization for (0xc0 & c) == 0x80
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118602 --- Comment #5 from Peter Damianov --- I meant to say: 1011 is -65 in the previous comment.
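With the corrected bound of -65, both rewrites can be checked exhaustively for the 8-bit case. The following C sketch uses hypothetical function names and __builtin_sub_overflow, a GCC/Clang builtin, to stand in for the neg/seto pair; it is an illustration of the equivalence, not of what either compiler emits.

```c
#include <stdint.h>

/* The test from the bug title, applied to a signed 8-bit value.  */
int
top_bits_are_10 (int8_t c)
{
  return (0xc0 & c) == 0x80;
}

/* Range form: values matching 10xxxxxx run from INT8_MIN to -65,
   so a single signed compare is enough.  */
int
below_minus_64 (int8_t c)
{
  return c < -64;
}

/* Mask-and-negate form: after masking with -64 only INT8_MIN
   overflows when negated in 8 bits.  */
int
negated_mask_overflows (int8_t c)
{
  int8_t masked = (int8_t) (c & -64);
  int8_t negated;
  return __builtin_sub_overflow ((int8_t) 0, masked, &negated);
}

/* Number of 8-bit inputs on which the three forms disagree.  */
int
count_disagreements (void)
{
  int bad = 0;
  for (int v = INT8_MIN; v <= INT8_MAX; v++)
    {
      int8_t c = (int8_t) v;
      int expect = top_bits_are_10 (c);
      if (below_minus_64 (c) != expect
          || negated_mask_overflows (c) != expect)
        bad++;
    }
  return bad;
}
```

count_disagreements walks all 256 inputs, so a zero result covers the whole char domain. Whether the same holds for wider masks such as the int example in comment #4 would need the analogous check at that width.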
[Bug fortran/118613] maxval/minval may evaluate argument too often
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118613 --- Comment #2 from anlauf at gcc dot gnu.org --- (In reply to anlauf from comment #1) > The following partial patch seems to fix the rank-2 cases here by forcing > a temporary that gets reused: This prints: 4 0 4 0 6 0. 4 0. 6 0. 4 0. The rank-1 case persists.
[Bug fortran/118613] maxval/minval may evaluate argument too often
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118613 --- Comment #1 from anlauf at gcc dot gnu.org --- The following partial patch seems to fix the rank-2 cases here by forcing a temporary that gets reused: diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index afbec5b2752..b852339bfaf 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -6710,6 +6710,7 @@ gfc_conv_intrinsic_minmaxval (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_copy_loopinfo_to_se (&arrayse, &loop); arrayse.ss = arrayss; gfc_conv_expr_val (&arrayse, arrayexpr); + arrayse.expr = gfc_evaluate_now (arrayse.expr, &arrayse.pre); gfc_add_block_to_block (&block, &arrayse.pre); gfc_init_block (&block2);
[Bug rtl-optimization/118562] [15 Regression] SEGV in late-combine (rtl_ssa::function_info::remove_use)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118562 Richard Sandiford changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #5 from Richard Sandiford --- Oops, seems to be a botched phi update. Testing a fix.
[Bug fortran/118613] maxval/minval may evaluate argument too often
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118613 --- Comment #3 from anlauf at gcc dot gnu.org --- Adding a second temporary reduces the evaluation count for the rank-1 case further: diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index afbec5b2752..0cdc886a715 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -6710,6 +6710,7 @@ gfc_conv_intrinsic_minmaxval (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_copy_loopinfo_to_se (&arrayse, &loop); arrayse.ss = arrayss; gfc_conv_expr_val (&arrayse, arrayexpr); + arrayse.expr = gfc_evaluate_now (arrayse.expr, &arrayse.pre); gfc_add_block_to_block (&block, &arrayse.pre); gfc_init_block (&block2); @@ -6816,6 +6817,7 @@ gfc_conv_intrinsic_minmaxval (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_copy_loopinfo_to_se (&arrayse, &loop); arrayse.ss = arrayss; gfc_conv_expr_val (&arrayse, arrayexpr); + arrayse.expr = gfc_evaluate_now (arrayse.expr, &arrayse.pre); gfc_add_block_to_block (&block, &arrayse.pre); /* MIN_EXPR/MAX_EXPR has unspecified behavior with NaNs or This gives: 4 0 4 0 5 0. 4 0. 5 0. 4 0. The remaining superfluous evaluation seems to be a bug in the algorithm as described in trans-intrinsic.cc: 2) Array mask is used and NaNs need to be supported, rank 1: limit = Infinity; nonempty = false; S = from; while (S <= to) { if (mask[S]) { nonempty = true; if (a[S] <= limit) goto lab; } S++; } limit = nonempty ? NaN : huge (limit); lab: while (S <= to) { if(mask[S]) limit = min (a[S], limit); S++; } 3) NaNs need to be supported, but it is known at compile time or cheaply at runtime whether array is nonempty or not, rank 1: limit = Infinity; S = from; while (S <= to) { if (a[S] <= limit) goto lab; S++; } limit = (from <= to) ? NaN : huge (limit); lab: while (S <= to) { limit = min (a[S], limit); S++; } We should increment S either after the comparison in the first while-loop, or increment it directly after the label lab.
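The extra evaluation in case 3 can be reproduced outside the compiler with a C sketch. Everything here is illustrative: elem_reads and read_elem are hypothetical stand-ins for an argument with observable evaluation, min_keep_limit stands in for MIN_EXPR, and storing the element into limit before the jump in the second variant is an assumption of this sketch about how "increment S after the comparison" would be made to work, not something the comment spells out.

```c
#include <float.h>
#include <math.h>

/* Counts element reads; stands in for a side effect that makes
   repeated evaluation of the argument observable.  */
int elem_reads;

static double
read_elem (const double *a, int i)
{
  elem_reads++;
  return a[i];
}

/* Stand-in for min: keeps limit when x is NaN.  */
static double
min_keep_limit (double x, double limit)
{
  return x < limit ? x : limit;
}

/* Case 3 as described: the element that ends the first loop is
   read again by the second loop.  */
double
min_as_described (const double *a, int n)
{
  double limit = INFINITY;
  int s = 0;
  while (s < n)
    {
      if (read_elem (a, s) <= limit)
        goto lab;
      s++;
    }
  limit = (n > 0) ? NAN : DBL_MAX;
lab:
  while (s < n)
    {
      limit = min_keep_limit (read_elem (a, s), limit);
      s++;
    }
  return limit;
}

/* Variant that advances S right after the read and carries the
   value over in limit, so each element is read once.  */
double
min_single_read (const double *a, int n)
{
  double limit = INFINITY;
  int s = 0;
  while (s < n)
    {
      double v = read_elem (a, s);
      s++;
      if (v <= limit)
        {
          limit = v;
          goto lab;
        }
    }
  limit = (n > 0) ? NAN : DBL_MAX;
lab:
  while (s < n)
    {
      limit = min_keep_limit (read_elem (a, s), limit);
      s++;
    }
  return limit;
}
```

For a four-element array without NaNs, min_as_described performs five reads and min_single_read four, which is in line with the 5 versus 4 counts printed above. The all-NaN and empty cases fall through the first loop in both variants and give NaN and DBL_MAX respectively.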
[Bug rtl-optimization/118611] LRA inserts unneeded reload on FMA chain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611 Andrew Pinski changed: What|Removed |Added CC||pinskia at gcc dot gnu.org Depends on||82237 --- Comment #1 from Andrew Pinski --- I think this is the same as PR 82237. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82237 [Bug 82237] [AArch64] Destructive operations result in poor register allocation after scheduling
[Bug rtl-optimization/118611] LRA inserts unneeded reload on FMA chain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611 --- Comment #2 from Andrew Pinski --- (In reply to Andrew Pinski from comment #1) > I think this is the same as PR 82237. Or at least related.
[Bug rtl-optimization/118611] LRA inserts unneeded reload on FMA chain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #2) > (In reply to Andrew Pinski from comment #1) > > I think this is the same as PR 82237. > > Or at least related. I'm not sure, in this one the instructions have a strict dependency chain, so scheduling can't do anything different here. This one seems to be more about how lra has resolved the tie for the 0 constraint in the instructions.
[Bug c++/118396] [15 regression] -O1+ leads to reading uninitialized data when virtual destructor is present since r15-6369-gfa99002538bc91
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118396 --- Comment #17 from Carlos Galvez --- (FYI I'm trying to test this but now I get out-of-memory errors when trying to compile a single .cpp file, will test with a newer patch in case it got fixed)
[Bug tree-optimization/118612] New: return value loaded despite noreturn attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118612

            Bug ID: 118612
           Summary: return value loaded despite noreturn attribute
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

The following code is, as can be seen, stand-alone with the definition of
error from glibc inlined. The generated code for -O2 (gcc 15, 14, haven't
tested others):

<f>:
   0:   85 ff                   test   %edi,%edi
   2:   75 03                   jne    7
   4:   31 c0                   xor    %eax,%eax
   6:   c3                      ret
   7:   ba 00 00 00 00          mov    $0x0,%edx
   c:   50                      push   %rax
   d:   31 f6                   xor    %esi,%esi
   f:   bf 01 00 00 00          mov    $0x1,%edi
  14:   31 c0                   xor    %eax,%eax
  16:   e8 00 00 00 00          call   1b

The problem is at address 0x14. This is the return value of 'f' but, as the
compiler correctly determines, 'error' will not return and therefore this load
is unnecessary.

extern void __error_alias(int __status, int __errnum, const char *__format, ...)
  __asm__("error") __attribute__ ((__format__ (__printf__, 3, 4)));
extern void __error_noreturn(int __status, int __errnum, const char *__format, ...)
  __asm__("error") __attribute__ ((__noreturn__, __format__ (__printf__, 3, 4)));
extern inline __attribute__ ((__gnu_inline__)) void
error (int __status, int __errnum, const char *__format, ...)
{
  if (__builtin_constant_p (__status) && __status != 0)
    __error_noreturn (__status, __errnum, __format, __builtin_va_arg_pack ());
  else
    __error_alias (__status, __errnum, __format, __builtin_va_arg_pack ());
}

int f(int a)
{
  if (a)
    error(1, 0, "string");
  return 0;
}
[Bug tree-optimization/118612] return value loaded despite noreturn attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118612

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek ---
Isn't xor %eax,%eax "no floating point ... arguments" for the ... call?
[Bug tree-optimization/118612] return value loaded despite noreturn attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118612

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED
             Target|                            |x86_64

--- Comment #2 from Andrew Pinski ---
eax here is not the return value at all. Rather, since error is a varargs
function, it specifies how many floating point registers are in use.
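A small C sketch of the point made in the two comments above. Under the x86-64 System V calling convention, the caller of a variadic function passes in %al an upper bound on the number of vector registers used for arguments. The helpers sum_ints and sum_doubles and the two callers are hypothetical names made up for illustration; the instructions mentioned in the comments are what one would typically expect when the calls are not inlined, not guaranteed compiler output.

```c
#include <stdarg.h>

/* Hypothetical variadic helper that takes only int arguments. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Hypothetical variadic helper that takes double arguments. */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}

/* No floating-point arguments: on x86-64 the caller is expected to
   zero %al before the call, e.g. with xor %eax,%eax. */
int call_without_fp(void)
{
    return sum_ints(3, 1, 2, 3);
}

/* Two doubles are passed in vector registers: on x86-64 the caller
   is expected to load 2 into %al before the call. */
double call_with_fp(void)
{
    return sum_doubles(2, 1.5, 2.5);
}
```

Compiling this with -O2 -fno-inline and inspecting the assembly for the two callers should show the different %eax setup before each call.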
[Bug fortran/118613] New: maxval/minval may evaluate argument too often
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118613 Bug ID: 118613 Summary: maxval/minval may evaluate argument too often Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: anlauf at gcc dot gnu.org Target Milestone: --- Created attachment 60242 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60242&action=edit Testcase Found while looking into pr118580. The argument of minval/maxval may be evaluated more often than expected. The attached program prints: 4 0 4 0 6 0. 5 0. 6 0. 5 0. Expected: 4 0 4 0 4 0. 4 0. 4 0. 4 0. The latter is confirmed by other brands (e.g. Intel).
[Bug target/118280] [14/15 Regression] __atomic_test_and_set in Microblaze are broken (exposed by r14-4286)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118280

--- Comment #11 from Thomas Petazzoni ---
Is there any update about this patch? Should we consider Microblaze
unmaintained/broken, and possibly drop support for it?
[Bug middle-end/118614] [riscv] Naked function attribute on riscv optimizes away C conditional
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118614

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Andrew Pinski ---
This has nothing to do with naked. Rather, it has to do with `Specifying
Registers for Local Variables`.

https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Local-Register-Variables.html

```
Defining a register variable does not reserve the register. Other than when
invoking the Extended asm, the contents of the specified register are not
guaranteed.
```

That is, this is an invalid use of them.
[Bug middle-end/118594] expansion of bf16 to float should not produce (subreg (mem ())
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118594 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #10 from Jakub Jelinek --- Can't the caller if it sees it is a MEM use adjust_address with 0 offset instead of simplify_*subreg?
[Bug rtl-optimization/118502] Add shrink wrapping testcase for vector::push_back
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118502

Peter Bergner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bergner at gcc dot gnu.org

--- Comment #3 from Peter Bergner ---
It's nice to see Surya's patch finally actually helping other targets than
rs6000. That said...

(In reply to Andrew Pinski from comment #2)
> Patch for the testcase posted:
> https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673875.html

I lost the email to your post, so

+// The fasth path is just checking if there is enough space and doing a few stores.

s/The fasth/the fast/

+// We want to verify that shrink wrapping happen always.

s/happen always/always happens/
[Bug middle-end/118594] expansion of bf16 to float should not produce (subreg (mem ())
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118594

--- Comment #11 from Andrew Pinski ---
(In reply to Richard Sandiford from comment #8)
> But that doesn't work because force_operand considers the subreg to be valid
> (and general_operand agrees). So if this isn't a regression, it might be
> better to wait until GCC 16 and disallow all (subreg (mem)) for
> INSN_SCHEDULING. (Or completely, if possible.)

Yes, it is not a regression as far as I can tell. The code started to be
accepted/ICEing when the conversion code was added to expr.cc.

(In reply to Jakub Jelinek from comment #10)
> Can't the caller if it sees it is a MEM use adjust_address with 0 offset
> instead of simplify_*subreg?

force_lowpart_subreg might be able to do that and might work too.

And yes, I think both things are GCC 16 material at this point. I think we
should disallow subreg of mem completely and should do the adjust_address
change.
[Bug rtl-optimization/118615] [15 Regression] Bootstrap failure on aarch64 after r15-2810
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118615

--- Comment #5 from Andrew Pinski ---
(In reply to Jakub Jelinek from comment #4)
> My bet is on r15-6661 then.

That was my next guess anyways and you would be correct, that is:
r15-6661-gc5db3f50bdf34e works

Now to start bisecting when the difference in behavior between selftest and
bootstrap comparison started to show up. And then to understand why turning
on/off the scheduling pass before RA would make a difference.
[Bug target/118497] [15 Regression] Worse code generated on i686-linux since r15-1619
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118497 Sam James changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |vmakarov at gcc dot gnu.org Last reconfirmed||2025-01-23 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED
[Bug tree-optimization/118572] [15 regression] wrong code for expression ((0x80 & c) != 0) && ((0xc0 & c) == 0x80)) since r15-6893
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118572 Alexandre Oliva changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |aoliva at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #9 from Alexandre Oliva --- Mine
[Bug target/118609] gcc.target/i386/amxmovrs-t2rpntlvw-2.c etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118609 --- Comment #1 from Haochen Jiang --- I am going to revise the testcase through the thread: https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674157.html Dup with PR118270
[Bug target/118270] [15 Regression] Many AVX10.2 test failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118270 Sam James changed: What|Removed |Added CC||ro at gcc dot gnu.org --- Comment #3 from Sam James --- *** Bug 118609 has been marked as a duplicate of this bug. ***
[Bug target/118270] [15 Regression] Many AVX10.2 test failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118270 Sam James changed: What|Removed |Added Keywords||patch Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |haochen.jiang at intel dot com Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2025-01-23 --- Comment #4 from Sam James --- https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674157.html
[Bug target/118609] gcc.target/i386/amxmovrs-t2rpntlvw-2.c etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118609 Sam James changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED Keywords||testsuite-fail --- Comment #2 from Sam James --- . *** This bug has been marked as a duplicate of bug 118270 ***
[Bug fortran/118613] maxval/minval may evaluate argument too often
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118613 --- Comment #4 from anlauf at gcc dot gnu.org --- Created attachment 60243 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60243&action=edit Preliminary patch This does the job and fixes the rank-1 case. Regtests ok. Still need to adjust the documentation of the (corrected) algorithm.
[Bug target/118595] [15 regression] RISC-V: gfortran/c-interop test execution failures on RVV zvl > 128b since r15-3228-g771256bcb9d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118595

Sam James changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |patrick at rivosinc dot com
           Keywords|                            |testsuite-fail, wrong-code
            Summary|[15 regression] RISC-V:     |[15 regression] RISC-V:
                   |gfortran/c-interop test     |gfortran/c-interop test
                   |execution failures on RVV   |execution failures on RVV
                   |zvl > 128b                  |zvl > 128b since
                   |                            |r15-3228-g771256bcb9d
   Target Milestone|---                         |15.0
[Bug c++/117827] [12/13/14/15 regression] Incorrect destructor calls after array-new-expression since r12-6328-gbeaee0a871b648
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117827

--- Comment #3 from Jakub Jelinek ---
--- gcc/cp/init.cc.jj   2025-01-02 11:47:10.771493771 +0100
+++ gcc/cp/init.cc      2025-01-22 17:56:48.268704123 +0100
@@ -4530,6 +4530,7 @@ build_vec_init (tree base, tree maxindex
   tree obase = base;
   bool xvalue = false;
   bool errors = false;
+  bool clear_rval = false;
   location_t loc = (init ? cp_expr_loc_or_input_loc (init)
                     : location_of (base));
 
@@ -4720,6 +4721,10 @@ build_vec_init (tree base, tree maxindex
         errors = true;
       TARGET_EXPR_CLEANUP (iterator_targ) = e;
       CLEANUP_EH_ONLY (iterator_targ) = true;
+      /* Signal that we want to clear rval near the end of the statement
+         expression so that the the build_vec_delete_1 cleanup does nothing
+         after the whole construction succeeded.  */
+      clear_rval = true;
 
       /* Since we push this cleanup before doing any initialization, cleanups
          for any temporaries in the initialization are naturally within our
@@ -5096,7 +5101,23 @@ build_vec_init (tree base, tree maxindex
 
   /* The value of the array initialization is the array itself, RVAL
      is a pointer to the first element.  */
-  finish_stmt_expr_expr (rval, stmt_expr);
+  if (clear_rval)
+    {
+      /* If there is a build_vec_delete_1 cleanup on rval, make sure
+         to return the value of rval but clear the rval variable so that
+         the cleanup does nothing when reaching this.  So, emit
+         ({ ... base = rval; rval = nullptr; base; })  */
+      finish_expr_stmt (cp_build_modify_expr (input_location, base,
+                                              NOP_EXPR, rval,
+                                              tf_warning_or_error));
+      finish_expr_stmt (cp_build_modify_expr (input_location, rval,
+                                              NOP_EXPR,
+                                              build_zero_cst (ptype),
+                                              tf_warning_or_error));
+      finish_stmt_expr_expr (base, stmt_expr);
+    }
+  else
+    finish_stmt_expr_expr (rval, stmt_expr);
 
   stmt_expr = finish_init_stmts (is_global, stmt_expr, compound_stmt);
 
fixes this for me, but totally untested so far.
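The `({ ... base = rval; rval = nullptr; base; })` idea in the patch can be shown by a loose C analogy using the GCC cleanup attribute. This is only a sketch of the hand-off pattern, not of what build_vec_init emits: free_ptr, frees and make_array are hypothetical names, and the real cleanup in the patch is an EH-only one that runs destructors rather than freeing memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter: how often the cleanup actually released memory. */
static int frees;

/* Cleanup handler: does nothing when the guarded pointer is null. */
static void free_ptr(int **p)
{
    if (*p) {
        free(*p);
        frees++;
    }
}

/* Builds an array of n ints; 'fail' simulates an error during
   construction.  On success the pointer is handed to 'base' and the
   guarded variable is cleared, so the cleanup has nothing left to do. */
int *make_array(size_t n, int fail)
{
    __attribute__((cleanup(free_ptr))) int *rval = malloc(n * sizeof *rval);

    if (!rval || fail)
        return NULL;            /* cleanup releases the partial work */

    for (size_t i = 0; i < n; i++)
        rval[i] = 0;

    int *base = rval;           /* base = rval;     */
    rval = NULL;                /* rval = nullptr;  */
    return base;                /* base;            */
}
```

A successful call leaves frees unchanged and the caller owns the result; a failing call bumps frees once and returns NULL.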
[Bug target/118595] [15 regression] RISC-V: gfortran/c-interop test execution failures on RVV zvl > 128b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118595 --- Comment #1 from Edwin Lu --- bisected to r15-3228-g771256bcb9d as the first bad commit
[Bug rtl-optimization/116028] [15 regression] gcc.dg/pr10474.c test failure since r15-1619-g3b9b8d6cfdf593
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116028 Bug 116028 depends on bug 118615, which changed state. Bug 118615 Summary: [15 Regression] Bootstrap failure on aarch64 after r15-2810 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118615 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED