[Bug target/101933] Unloaded dll with global std::mutex causes exe to crash on exit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101933

Paul Jackson changed:

What|Removed |Added
Resolution|--- |INVALID
Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Paul Jackson ---
I debugged it a bit more, and I found out that:

1. It's happening when exceptions are involved.
2. It's actually a bug of TDM-GCC.

For details, please see my second comment in the GitHub issue:
https://github.com/jmeubank/tdm-gcc/issues/38#issuecomment-912876481
[Bug libstdc++/102199] New: is_default_constructible incorrect for an inner type with NSDMI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102199

Bug ID: 102199
Summary: is_default_constructible incorrect for an inner type with NSDMI
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: eyalroz1 at gmx dot com
Target Milestone: ---

Stackoverflow discussion: https://stackoverflow.com/q/69050558/1593077
Related LLVM bug: https://bugs.llvm.org/show_bug.cgi?id=38374
GodBolt: https://godbolt.org/z/snPf7Ks4W

Consider the following program:

    #include <type_traits>

    struct outer {
        struct inner {
            // inner() { }
            unsigned int x = 0;
        };
        //static_assert(std::is_default_constructible<inner>::value,
        //    "not default ctorable - inside");
    };
    static_assert(std::is_default_constructible<outer::inner>::value,
        "not default ctorable - outside");

It compiles. But if we uncomment the first static_assert - it evaluates to false. Mind you: Not because struct inner is incomplete; it is simply deemed to not be default-constructible.

But - it _is_ default constructible. And if we add a method to struct outer which default-constructs an inner, it will work. Also note that if we uncomment the explicit default ctor in the definition of struct inner, both asserts pass.

clang++ seems to exhibit this too (also with -stdlib=libc++).

I'm not sure whether this is an actual bug in the library, or whether the standard mandates this in some freakish way, but - it's just wrong.
[Bug tree-optimization/93540] Attributes pure and const not working with aggregate return types, even trivial ones
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93540

Trass3r changed:

What|Removed |Added
CC||trass3r at gmail dot com

--- Comment #2 from Trass3r ---
I also ran into this when trying to optimize trivial but expensive getter functions, e.g. returning shared_ptr.

    struct Foo {
        int operator+(const Foo& f);
        int a;
    };

    [[gnu::const]] Foo foo(); // int instead of Foo works

    auto testfunction() {
        return foo() + foo(); // results in 2 calls
    }
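A self-contained variant of the reproducer, as a sketch only. The names make_int, make_foo, sum_ints, sum_foos and check_const_sums are made up for illustration, and the definitions are added just so the snippet links; in a real experiment they would live in another translation unit so the compiler has nothing but the attribute to go on. Comparing the -O2 assembly of sum_ints and sum_foos is one way to look for the difference the report describes.

```cpp
#include <cassert>

// Illustrative aggregate; operator+ is made const so it works on temporaries.
struct Foo {
    int a;
    int operator+(const Foo& f) const { return a + f.a; }
};

// Both functions take no arguments and touch no global state, so the
// const attribute is a truthful promise here.
[[gnu::const]] int make_int();
[[gnu::const]] Foo make_foo();

int make_int() { return 21; }
Foo make_foo() { return Foo{21}; }

// Scalar return: the case the report says is folded to a single call.
int sum_ints() { return make_int() + make_int(); }

// Aggregate return: the case the report says still results in two calls.
int sum_foos() { return make_foo() + make_foo(); }

// The observable result is the same either way; only the call count differs.
void check_const_sums() {
    assert(sum_ints() == 42);
    assert(sum_foos() == 42);
}
```

Because both functions really are const, the program's result does not depend on whether the second call is removed, which is what makes the attribute safe to apply in this sketch.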
[Bug middle-end/32911] Function __attribute__ ((idempotent))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32911 Trass3r changed: What|Removed |Added CC||trass3r at gmail dot com --- Comment #7 from Trass3r --- OpenGL's bind functions are another example. They don't return anything so can't be marked pure/const but any subsequent calls with the same arguments are redundant.
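Since no such attribute exists, the redundancy described above has to be removed by hand today. The following is a minimal caller-side sketch, not an OpenGL API: fake_bind, BindCache and bind_calls_after_demo are hypothetical names, and fake_bind merely counts how often it is reached.

```cpp
#include <cassert>

// Hypothetical stand-in for a state-setting call such as a bind function.
static int g_bind_calls = 0;
static unsigned g_bound = 0;
void fake_bind(unsigned id) {
    g_bound = id;
    ++g_bind_calls;
}

// Remembers the last argument and drops a call that repeats it, which is
// the elimination an "idempotent" attribute would let the compiler do.
class BindCache {
public:
    void bind(unsigned id) {
        if (has_last_ && last_ == id)
            return;              // same argument as last time: skip
        fake_bind(id);
        last_ = id;
        has_last_ = true;
    }

private:
    unsigned last_ = 0;
    bool has_last_ = false;
};

// Issues three bind requests, one of them redundant, and reports how many
// reached fake_bind.
int bind_calls_after_demo() {
    g_bind_calls = 0;
    BindCache cache;
    cache.bind(7);
    cache.bind(7);   // dropped by the cache
    cache.bind(9);
    return g_bind_calls;
}
```

The cache is only valid if nothing else changes the bound state behind its back, which is the same promise the proposed attribute would have to encode.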
[Bug c/102200] New: ice in put_ref, at pointer-query.cc:1351
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102200

Bug ID: 102200
Summary: ice in put_ref, at pointer-query.cc:1351
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: dcb314 at hotmail dot com
Target Milestone: ---

For this C source code:

    long try_extension_len;
    void try_extension_str() {
      char *curr = try_extension_str;
      char end = sizeof try_extension_str;
      while (try_extension_len) {
        if (curr < end)
          *curr = ';';
        if (curr > &end)
          curr = &end;
      }
    }

compiled with recent gcc trunk and compiler flag -O2, does this:

    during GIMPLE pass: strlen
    bug754.c: In function ‘try_extension_str’:
    bug754.c:2:6: internal compiler error: in put_ref, at pointer-query.cc:1351
        2 | void try_extension_str() {
          |      ^
    0xc7696c pointer_query::put_ref(tree_node*, access_ref const&, int)
            ../../trunk.git/gcc/pointer-query.cc:1351

The bug first seems to occur sometime between git hash 7a6f40d0452ec76e and 9695e1c23be5b5c5. Only 21 commits.
[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487 --- Comment #25 from Daniel Berlin --- This seems like a bad idea, and is impossible in general. The whole point of the attributes is to tell the compiler things are pure/const in cases it can't already prove. It can already prove a lot, and doesn't need help in most of the simple examples being given (in other bugs). You are basically going to warn in the cases the compiler can't prove it (IE sees something it thinks makes the function not pure/const), and those are *exactly* the cases the attribute exists for - where the compiler doesn't know, but you do.
[Bug tree-optimization/102200] [12 Regression] ice in put_ref, at pointer-query.cc:1351
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102200 Andrew Pinski changed: What|Removed |Added Keywords||ice-on-valid-code Target Milestone|--- |12.0 Component|c |tree-optimization Summary|ice in put_ref, at |[12 Regression] ice in |pointer-query.cc:1351 |put_ref, at ||pointer-query.cc:1351
[Bug c++/102199] is_default_constructible incorrect for an inner type with NSDMI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102199 Andrew Pinski changed: What|Removed |Added Component|libstdc++ |c++ --- Comment #1 from Andrew Pinski --- This comes down to when the struct is complete.
[Bug c++/102199] is_default_constructible incorrect for an inner type with NSDMI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102199

--- Comment #2 from Andrew Pinski ---
This is because the following is still valid C++11:

    struct outer {
        struct inner {
            // inner() { }
            unsigned int x = y;
        };
        static constexpr int y = 10;
    };

That is, inner is not completed until outer is completed.
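To make the point about completeness concrete, here is a small sketch built on the snippet from the comment. The helper default_inner_x is a made-up name, and the static_assert is placed outside outer, where the trait can be answered because both classes are complete at that point.

```cpp
#include <cassert>
#include <type_traits>

struct outer {
    struct inner {
        // The default member initializer may name a member of outer that
        // is declared further down, because the initializer is only
        // processed once outer is complete.
        unsigned int x = y;
    };
    static constexpr int y = 10;
};

// Outside outer, inner's initializer has been processed, so the query
// about default construction has a definite answer.
static_assert(std::is_default_constructible<outer::inner>::value,
              "inner is default constructible once outer is complete");

// Default-constructs an inner and returns the value its initializer produced.
unsigned int default_inner_x() {
    outer::inner v{};
    return v.x;
}
```

The value picked up by x comes from a declaration that follows inner in the source, which is why the compiler cannot finish inner's implicit constructor before reaching the closing brace of outer.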
[Bug c++/102201] New: Accepts invalid C++98 with nested class and sizeof of outer's non-static field
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102201

Bug ID: 102201
Summary: Accepts invalid C++98 with nested class and sizeof of outer's non-static field
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: accepts-invalid
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---

Take:

    struct outer {
        struct inner {
            inner() : x(sizeof(y)) { }
            unsigned int x;
        };
        int y;
    };

- CUT

The above code is valid C++11 but invalid C++98 because the field y is non-static.
[Bug c++/102199] is_default_constructible incorrect for an inner type with NSDMI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102199 --- Comment #3 from Eyal Rozenberg --- Andrew: What you're saying would be plausible if g++ would find the structure to be incomplete. It does not. The completeness check passes; and it is why adding the explicit default ctor makes the asserting pass - despite your rationale applying to that case just as well.
[Bug tree-optimization/102200] [12 Regression] ice in put_ref, at pointer-query.cc:1351
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102200

Andrew Pinski changed:

What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Last reconfirmed||2021-09-04

--- Comment #1 from Andrew Pinski ---
(In reply to David Binderman from comment #0)
> The bug first seems to occur sometime between git hash 7a6f40d0452ec76e
> and 9695e1c23be5b5c5. Only 21 commits.

Most likely r12-3300-ece28da924dd

Confirmed.
[Bug tree-optimization/102196] -Wmaybe-uninitialized: Maybe generate helpful hints?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102196 --- Comment #6 from Jan-Benedict Glaw --- Calling the compiler again with just adding -fanalyzer doesn't add more information to the output. Do I need to turn on extra warnings to enable static analysis for access to possibly uninitialized variables?
[Bug c++/101355] incorrect `this' in destructor calls when compiling coroutines with ubsan
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101355 Dan Klishch changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #5 from Dan Klishch --- GCC stopped instrumenting destructors in this particular case, so I guess the bug is fixed. https://godbolt.org/z/KGa6aGf5x
[Bug c++/102201] Accepts invalid C++98 with nested class and sizeof of outer's non-static field
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102201

Harald van Dijk changed:

What|Removed |Added
CC||harald at gigawatt dot nl

--- Comment #1 from Harald van Dijk ---
This doesn't need inner classes, a simpler reproducer is:

    struct S { int i; };
    int j = sizeof S::i;

gcc accepts this in all modes ever since the C++11 rule for non-static members in unevaluated contexts was implemented (4.4). clang says in C++98 mode:

    test.cc:2:19: error: invalid use of non-static data member 'i'
    int j = sizeof S::i;
                   ~~~^
    1 error generated.
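Both reproducers from this bug can be checked together when compiled as C++11 or later, where naming a non-static data member inside an unevaluated operand is allowed. This is a sketch only; inner_x and check_sizeof_member are made-up helper names.

```cpp
#include <cassert>

struct S { int i; };

// sizeof does not evaluate its operand, so no S object is needed (C++11 on).
static_assert(sizeof S::i == sizeof(int), "member size without an object");

struct outer {
    struct inner {
        // y is a non-static member of outer; it is only named, never read.
        inner() : x(sizeof(y)) {}
        unsigned int x;
    };
    int y;
};

// Returns the value stored by inner's constructor.
unsigned int inner_x() { return outer::inner().x; }

void check_sizeof_member() {
    assert(inner_x() == sizeof(int));
}
```

Under -std=c++98 the same source is expected to be rejected by a conforming compiler, which is the diagnostic the report says GCC does not emit.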
[Bug c/29970] mixing ({...}) with VLA leads to massive breakage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29970 --- Comment #13 from Martin Uecker --- The remaining problem with constant index 0 for the patch mentioned above, appears to be related to fold_binary_loc which transforms (a + (x, 0)) to (x, a) which breaks if 'x' depends on something in 'a'.
[Bug target/56309] conditional moves instead of compare and branch result in almost 2x slower code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309 Peter Cordes changed: What|Removed |Added CC||peter at cordes dot ca --- Comment #36 from Peter Cordes --- Related: a similar case of cmov being a worse choice, for a threshold condition with an array input that happens to already be sorted: https://stackoverflow.com/questions/28875325/gcc-optimization-flag-o3-makes-code-slower-than-o2 GCC with -fprofile-generate / -fprofile-use does correctly decide to use branches. GCC7 and later (including current trunk) with -O3 -fno-tree-vectorize de-optimizes by putting the CMOV on the critical path, instead of as part of creating a zero/non-zero input for the ADD. PR82666. If you do allow full -O3, then vectorization is effective, though.
[Bug target/56309] conditional moves instead of compare and branch result in almost 2x slower code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309 --- Comment #37 from Peter Cordes --- Correction, PR82666 is that the cmov on the critical path happens even at -O2 (with GCC7 and later). Not just with -O3 -fno-tree-vectorize. Anyway, that's related, but probably separate from choosing to do if-conversion or not after inlining.
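For readers who have not followed the linked question, the loop under discussion has roughly the following shape. This is a sketch under assumptions: sum_at_least and check_sum_at_least are illustrative names, not code from the bug report, and whether the compiler emits a branch or a cmov for the condition depends on version and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Adds up the elements that reach the threshold. With sorted input the
// condition is false for a prefix of the array and true for the rest, so
// a conditional branch predicts almost perfectly.
std::int64_t sum_at_least(const int* data, std::size_t n, int threshold) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (data[i] >= threshold)
            sum += data[i];   // the add forms the loop-carried dependency
    }
    return sum;
}

void check_sum_at_least() {
    const int sorted[] = {1, 100, 128, 200, 255};
    assert(sum_at_least(sorted, 5, 128) == 583);
}
```

Inspecting the -O2 output of sum_at_least for a conditional move that feeds the accumulator directly is one way to see the critical-path issue mentioned in the two comments.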
[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487

--- Comment #26 from Federico Kircheis ---
As multiple people have commented on this ticket, I do not know to whom the last message was addressed, but I would like to give my opinion on it again, as I would really like to use those attributes in non-toy projects.

> This seems like a bad idea

I think there are valid use-cases for those warnings.

> and is impossible in general

Let me quote myself:

> ... a warning that even only works for trivial case is much better than
> nothing, because at least I know I can safely use the attribute for some
> functions as a contract to the caller, and have it checked.

There are now two possible outcomes if a compiler emits a warning.

1) I look at the definition, and *gasp*, the compiler is actually right. The function was pure before, but the last changes made it impure. Either I did not realize it, or I forgot to change the function declaration. Thank you GCC for making me aware of the issue, I'll fix it.

2) I look at the definition and think that GCC is wrong. I know better, and the function is pure. I can either try to simplify the function in such a way that GCC does not complain anymore (which might be a good idea), or I can use a pragma to ignore this one warning (and comment why it's ignored), or remove the attribute altogether, as GCC might call the function multiple times if it thinks it's impure (see example at the end).

In the first approach, I can still benefit from warnings if the function changes again. In the second case I can't, but at least I can still grep the entire codebase and check periodically which warnings have been disabled locally, just like I do for other warnings. In the third case yes, I would probably report a bug with a minimal example. This (hopefully) would improve GCC's analysis capabilities.

> The whole point of the attributes is to tell the compiler things are
> pure/const in cases it can't already prove.

That does not mean that it is not useful to let it do the check, *especially if it can prove that the attribute is used incorrectly*, but even if it can't prove anything. Also see the example at the end for why this is not completely true.

> It can already prove a lot, and doesn't need help in most of the simple
> examples being given (in other bugs).

But programmers (at least in most of the use-cases I've seen) need that kind of support. I would like to know if a function has side effects. It's great if the compiler can see it automatically, but when reading and writing code, especially code not written by me or maintained by multiple authors, we might want to restrict the functionality of some functions.

For side-effect free functions, the attributes const and pure are great, but using them is more harmful, because if used wrongly they introduce UB, thus

1) they do not really document if a function is pure, as there is no tooling checking if the statement is true
2) they introduce bugs that no-one can explain (see at the end).

Thus a comment "this function is pure" is by contrast much better, as it does not introduce UB, but we all know that this kind of comment does not age well. Thus in the end, they get ignored because they are not trustworthy, and one always needs to look at the implementation.

> You are basically going to warn in the cases the compiler can't prove it [...]

And for many use-cases it is fine. Also the second example I gave:

    // bar.hpp
    [[gnu::const]] int get_value();

    // bar.cpp
    int get_value(){static int i = 0; return ++i;}

    // foo.cpp
    int foo(){
        int i = get_value();
        int j = get_value();
        return i+j;
    }

The compiler will still optimize the call to get_value (unless it is able to see the definition of get_value and see that there are side effects).

Thus, if the function is marked pure, the compiler

* will not call it a second time if it does not see the implementation of `get_value`
* will call it a second time if it sees the implementation of `get_value` and notices it is not pure.

This is one of those bugs that no-one can explain, as simply moving code (making a function, for example, inline, or moving it to another file), or changing the optimization level, changes the behavior of the program.

Thus, given main.cpp

    [[gnu::const]] int foo(); // foo.cpp
    int main(){
        int i = foo();
        int j = foo();
        return i+j;
    }

how many times is GCC going to call foo? If GCC thinks that the function is pure, then only once. If it thinks it is not pure, twice. I have no idea what GCC thinks, because there are no diagnostics for it! And look, it does not even matter if foo is pure or not, it matters whether GCC thinks it is pure or not.

I can similarly tell GCC to inline functions, but if GCC doesn't, at least it will tell me it didn't (warning: 'always_inline' function might not be inlinable [-Wattributes]).

We can of course say "those attributes are only for those people that really know better", but as the compiler is alread
[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487 --- Comment #27 from Federico Kircheis --- Edit: sorry, my last comment about what GCC thinks is wrong. GCC seems to follow the gnu::pure/gnu::const directive to the letter, it does not ignore it when it sees the implementation of the function, thus my comment about information are already available can be ignored.
[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487 --- Comment #28 from Federico Kircheis --- >Edit: sorry, my last comment about what GCC thinks is wrong. Unless it is going to inline the function call, in that case the attributes are as-if ignored (at least the case I've tested with GCC 11.2).
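As a point of reference for the discussion, here is a sketch of the two attributes applied where the promise actually holds. The function names are illustrative. The distinction follows the GCC documentation: a const function depends only on its argument values, while a pure function has no side effects but may read memory, for example through a pointer argument.

```cpp
#include <cassert>

// const: the result depends only on the value of x.
[[gnu::const]] int square(int x);

// pure: no side effects, but the result depends on the memory p points to,
// so const would be the wrong promise here.
[[gnu::pure]] int sum(const int* p, int n);

int square(int x) { return x * x; }

int sum(const int* p, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += p[i];
    return s;
}

// The compiler may evaluate square(x) once and reuse the result; because
// the attribute is truthful, the answer is the same either way.
int twice_square(int x) { return square(x) + square(x); }

void check_pure_const() {
    const int a[] = {1, 2, 3};
    assert(twice_square(3) == 18);
    assert(sum(a, 3) == 6);
}
```

GCC also offers -Wsuggest-attribute=pure and -Wsuggest-attribute=const, which go in the opposite direction from the warning requested in this bug: they point out unannotated functions that look like candidates for the attributes rather than checking annotated ones.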
[Bug tree-optimization/93040] gcc doesn't optimize unaligned accesses to a 16-bit value on the x86 as well as it does a 32-bit value (or clang)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93040 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |10.0
[Bug tree-optimization/89809] movzwl is not utilized when uint16_t is loaded with bit-shifts (while memcpy does)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89809 Andrew Pinski changed: What|Removed |Added Known to fail||9.4.0 Resolution|--- |DUPLICATE Status|NEW |RESOLVED Known to work||10.1.0 Target Milestone|--- |10.0 --- Comment #4 from Andrew Pinski --- Fixed for GCC 10. Dup of bug 93040. *** This bug has been marked as a duplicate of bug 93040 ***
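The two ways of loading a 16-bit value that the bug title contrasts can be sketched as follows. The function names are made up; the shift form reads the bytes as little-endian regardless of the host, while the memcpy form uses the host's byte order, so the two agree only on little-endian targets such as x86.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assembles the value from individual bytes, low byte first.
std::uint16_t load_le16_shift(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Copies the two bytes into the value; compilers typically turn this into
// a single (possibly unaligned) 16-bit load.
std::uint16_t load_u16_memcpy(const unsigned char* p) {
    std::uint16_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

void check_load16() {
    const unsigned char bytes[] = {0x34, 0x12};
    assert(load_le16_shift(bytes) == 0x1234);
}
```

Comparing the -O2 output of both functions on x86-64 before and after GCC 10 is one way to observe the fix recorded in this message.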
[Bug tree-optimization/93040] gcc doesn't optimize unaligned accesses to a 16-bit value on the x86 as well as it does a 32-bit value (or clang)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93040 Andrew Pinski changed: What|Removed |Added CC||nok.raven at gmail dot com --- Comment #6 from Andrew Pinski --- *** Bug 89809 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/89811] uint32_t load is not recognized if shifts are done in a fixed-size loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89811 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Last reconfirmed|2019-03-25 00:00:00 |2021-9-4
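The bug title refers to byte-wise loads written as a loop with a fixed trip count. A minimal sketch of that shape, assuming a little-endian byte order and using the made-up names load_le32_loop and check_load32, looks like this:

```cpp
#include <cassert>
#include <cstdint>

// Builds a 32-bit value from four bytes, low byte first. The trip count is
// a compile-time constant, so the loop can be fully unrolled before the
// load-merging optimization looks at it.
std::uint32_t load_le32_loop(const unsigned char* p) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return v;
}

void check_load32() {
    const unsigned char bytes[] = {0x78, 0x56, 0x34, 0x12};
    assert(load_le32_loop(bytes) == 0x12345678u);
}
```

Ideally this compiles to a single 32-bit load on a little-endian target; the report is about cases where it does not.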
[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487

--- Comment #29 from Daniel Berlin ---
Let me try to explain a different way:

The only functions GCC can warn about are those that don’t need the attributes in the first place.

The way any warning would work is to detect whether it is pure/const, and then see how the user marked it. So anything it can properly detect as right or wrong didn’t need an attribute to begin with - the compiler could already tell if it was pure/const.

Rather than tell the user they got it wrong, you might as well tell the user to remove the attribute because it isn’t necessary and won’t be necessary.

This is precisely why attributes are meant for when you are sure you know more than the compiler can tell, and *no other time*.

It is a tool for experts.

Giving a bunch of really contrived examples where users may update things wrong doesn’t seem like a good motivation to make a warning that can only possibly have a really high false positive rate.

The same logic applies to a lot of expert-use-only attributes. It is assumed you know what you are doing, because the compiler can’t accurately tell you that you are wrong.
[Bug middle-end/90424] memcpy into vector builtin not optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90424 Andrew Pinski changed: What|Removed |Added Last reconfirmed|2019-05-13 00:00:00 |2021-9-4 Severity|normal |enhancement Component|target |middle-end --- Comment #8 from Andrew Pinski --- Happens on aarch64 also.
[Bug target/85539] x86_64: loads are not always narrowed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85539

Andrew Pinski changed:

What|Removed |Added
Resolution|--- |FIXED
See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92180
Known to fail||10.3.0
Status|NEW |RESOLVED
Known to work||11.1.0

--- Comment #3 from Andrew Pinski ---

    Trying 6 -> 7:
        6: r86:DI=[r87:DI]
          REG_DEAD r87:DI
        7: r85:SI=r86:DI#0
          REG_DEAD r86:DI
    Successfully matched this instruction:
    (set (reg:SI 85 [ *p_3(D) ])
        (mem:SI (reg:DI 87) [1 *p_3(D)+0 S4 A64]))
    allowing combination of insns 6 and 7
    original costs 4 + 4 = 8
    replacement cost 4
    deferring deletion of insn with uid = 6.
    modifying insn i3     7: r85:SI=[r87:DI]
          REG_DEAD r87:DI
    deferring rescan insn with uid = 7.
    starting the processing of deferred insns
    rescanning insn with uid = 7.
    ending the processing of deferred insns

This is because cse no longer props the subreg into the last move:

    (insn 7 6 8 2 (set (reg:SI 85)
            (subreg:SI (reg:DI 86) 0)) "/app/example.cpp":7:13 67 {*movsi_internal}
         (nil))
    (insn 8 7 12 2 (set (reg:SI 83 [ ])
            (reg:SI 85)) "/app/example.cpp":7:13 67 {*movsi_internal}
         (nil))
    (insn 12 8 13 2 (set (reg/i:SI 0 ax)
            (reg:SI 83 [ ])) "/app/example.cpp":8:1 67 {*movsi_internal}
         (nil))

And this was due to the patch which fixes PR 92180 and it was an expected outcome too.
[Bug middle-end/91899] Merge constant literals
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91899 Andrew Pinski changed: What|Removed |Added Resolution|--- |WONTFIX Status|NEW |RESOLVED --- Comment #6 from Andrew Pinski --- You need to use -fmerge-all-constants and the linker will merge them.
[Bug target/93885] Spurious instruction kshiftlw issued
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93885 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Ever confirmed|0 |1 Last reconfirmed||2021-09-04 Status|UNCONFIRMED |NEW --- Comment #1 from Andrew Pinski --- Confirmed, this is due to UNSPEC_MASKOP on the shift which most likely can be removed these days.
[Bug tree-optimization/94834] Failure to optimize loop bswap pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94834 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW |RESOLVED --- Comment #3 from Andrew Pinski --- This is a dup of bug 89811. *** This bug has been marked as a duplicate of bug 89811 ***
[Bug tree-optimization/89811] uint32_t load is not recognized if shifts are done in a fixed-size loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89811 Andrew Pinski changed: What|Removed |Added CC||gabravier at gmail dot com --- Comment #3 from Andrew Pinski --- *** Bug 94834 has been marked as a duplicate of this bug. ***
[Bug target/95974] AArch64 arm_neon.h stores interfere with gimple optimisations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95974 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Severity|normal |enhancement Last reconfirmed||2021-09-04 --- Comment #1 from Andrew Pinski --- Confirmed, maybe adding some access attributes will help this.
[Bug target/88473] AVX512: constant folding on mask does not remove unnecessary instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88473 Andrew Pinski changed: What|Removed |Added Blocks||93885 --- Comment #7 from Andrew Pinski --- The UNSPEC_MASKOP ones are still there. PR 93885 is the same issue. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93885 [Bug 93885] Spurious instruction kshiftlw issued
[Bug target/97286] GCC sometimes uses an extra xmm register for the destination of _mm_blend_ps
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97286 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Keywords||ra
[Bug tree-optimization/99082] manual bit-field creation followed by manual extraction does not always produce good code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99082 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug target/93346] gcc does not generate BZHI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93346 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |10.0
[Bug target/82298] x86 BMI: no peephole for BZHI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82298 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |10.0 Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #1 from Andrew Pinski --- Fixed in GCC 10. Dup of bug 93346. *** This bug has been marked as a duplicate of bug 93346 ***
[Bug target/93346] gcc does not generate BZHI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93346 Andrew Pinski changed: What|Removed |Added CC||peter at cordes dot ca --- Comment #8 from Andrew Pinski --- *** Bug 82298 has been marked as a duplicate of this bug. ***
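BZHI (part of BMI2) zeroes the bits of its source operand from a given index upward. The usual source-level idiom for that is a mask built from a shift, sketched below. This is an assumption about the kind of code involved, since the test cases from the two reports are not quoted in these messages; keep_low_bits and check_keep_low_bits are made-up names.

```cpp
#include <cassert>
#include <cstdint>

// Keeps the low n bits of x and clears the rest.
// n must be in [0, 31] so that the shift is well defined.
std::uint32_t keep_low_bits(std::uint32_t x, unsigned n) {
    return x & ((std::uint32_t{1} << n) - 1u);
}

void check_keep_low_bits() {
    assert(keep_low_bits(0xFFu, 4) == 0xFu);
    assert(keep_low_bits(0xABCDu, 0) == 0u);
    assert(keep_low_bits(0xFFFFFFFFu, 31) == 0x7FFFFFFFu);
}
```

Compiling with -O2 -mbmi2 and looking for a bzhi instruction in keep_low_bits is one way to check whether a given GCC version recognizes the pattern.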
[Bug middle-end/92080] Missed CSE of _mm512_set1_epi8(c) with _mm256_set1_epi8(c)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92080

Andrew Pinski changed:

What|Removed |Added
Last reconfirmed|2019-10-14 00:00:00 |2021-9-4
Severity|normal |enhancement

--- Comment #5 from Andrew Pinski ---
This gives good code:

    #include <immintrin.h>

    __m512i sinkz;
    __m256i sinky;
    void foo(char c) {
        __m512i a = _mm512_set1_epi8(c);
        sinkz = a;
        sinky = *((__m256i*)&a);
    }
[Bug target/94789] Failure to take advantage of shift operand semantics to turn subtraction into negate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94789 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487

--- Comment #30 from Federico Kircheis ---
It seems to me we are not going to agree as we tend to repeat ourselves, let's see if we go around and around in circles or if it is more like a spiral ;)

Your view is more about the compiler, how it is interpreting the attributes and thus why the warning is unneeded; mine is more about the developers writing and (most importantly) reading the code.

> The only functions GCC can warn about are those that don’t need the attributes in the first place. The way any warning would work is to detect whether it is pure/const, and then see how the user marked it. So anything it can properly detect as right or wrong didn’t need an attribute to begin with - the compiler could already tell if it was pure/const

My knowledge about how GCC (or other compilers) works is very limited, but if the function is implemented in another

* translation unit
* library
* pre-compiled library
* pre-compiled library created by another compiler

does GCC know it can avoid calling it multiple times?

Whole-program-optimization might help in some of those cases (I admit I have no idea; can the linker remove multiple function calls and replace them with a variable?), but depending on the project size it might add up to a lot in terms of compile time.

So even for simple functions, where GCC can clearly determine purity, it can be useful to add the attribute.

And even assuming that whole-program-optimization helps in most of those cases (which do not depend on the complexity or length of a function), how does someone know if adding those attributes to a function that is pure makes sense or not?

Adding pure to `inline int answer_of_life(){return 42;}` might not make any difference (both for programmers and compiler, because of its simplicity and because it is inline), but where should the line be drawn?

Should I mark my functions (with something else, since you are also suggesting that the attribute might do more harm than good), add dummy tests for all of them, and check in the generated assembly if GCC recognizes them as pure and elides the second call? There must surely be a better way, but I currently know no other.

> Rather than tell the user they got it wrong, you might as well tell the user to remove the attribute because it isn’t necessary and won’t be necessary.

No, removing it as unnecessary would be wrong. Then you can no longer tell the difference between functions that are pure by accident and by design. And you can no longer prevent a pure function from becoming non-pure, except by reading the code.

It is useful for programmers (yes, even if they look at the code), even for those functions where GCC does not need the attribute.

> Giving a bunch of really contrived examples where users may update things wrong doesn’t seem like a good motivation to make a warning that can only possibly have a really high false positive rate.

Just adding a "printf" statement for debugging, or increasing/decreasing a global counter, invalidates the pure attribute. Thus by trying to understand/analyze a bug, another is added.

> It is a tool for experts.

And I see no harm in making it more developer-friendly.

Why would that be a bad idea, as you claimed previously?

Because it is difficult to implement? I do not know if it is, but that would not make it a bad idea.

Because of false positives? Developers can handle them, case-by-case by documenting and disabling (or ignoring) the diagnostic, or globally by not turning the diagnostic on. Just like any other diagnostic.

Because it adds nothing from a compiler perspective? I'm still not convinced that it has no added value, especially when interacting with "extern" code/libraries.

But it definitely has some value for developers. It's part of the API of a function, just like declaring the member function of a class const (or the parameter of a function). Adding const might even prevent some optimizations, and leads to code duplication when one needs overloads (like for operator[] in container-like classes), but from a developer perspective it's great. It helps to catch errors. Of course one could never use it; for the compiler it would be the same.

And it would not invalidate its original use-case, thus it would still be possible to use those attributes like today if someone wants to; they would not even need to change a thing.
[Bug tree-optimization/95433] Failure to completely optimize simple compare after operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95433 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Target Milestone|--- |11.0 Status|NEW |RESOLVED --- Comment #8 from Andrew Pinski --- Fixed in GCC 11 by the commits.
[Bug middle-end/19987] [meta-bug] fold missing optimizations in general
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19987 Bug 19987 depends on bug 95433, which changed state. Bug 95433 Summary: Failure to completely optimize simple compare after operations https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95433 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug rtl-optimization/97603] Failure to optimize out compare into reuse of subtraction result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97603 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug rtl-optimization/94798] Failure to optimize subtraction and 0 literal properly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94798 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Severity|normal |enhancement Last reconfirmed||2021-09-04
[Bug tree-optimization/85375] possible missed optimisation / regression from 6.3 with while (__builtin_ffs(x) && x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85375

Andrew Pinski changed:

What|Removed |Added
Known to fail||10.3.0
See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95527
Target Milestone|--- |11.0
Resolution|--- |FIXED
Status|NEW |RESOLVED
Known to work||11.1.0

--- Comment #3 from Andrew Pinski ---
After r11-1080 (PR 95527), __builtin_ffs(x) && x becomes just x != 0 and is optimized. So yes, fixed for GCC 11.
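The simplification rests on the fact that __builtin_ffs returns 0 exactly when its argument is 0, so the first half of the condition adds nothing to the second. A small sketch that checks the equivalence on a few values follows; the function names are made up for illustration.

```cpp
#include <cassert>

// The condition as written in the bug title.
bool cond_original(int x) { return __builtin_ffs(x) && x; }

// What it reduces to: ffs(x) is non-zero if and only if x is non-zero.
bool cond_simplified(int x) { return x != 0; }

void check_ffs_equivalence() {
    const int samples[] = {0, 1, -1, 8, 1 << 30};
    for (int x : samples)
        assert(cond_original(x) == cond_simplified(x));
}
```

With the fix in place, both functions are expected to compile to the same comparison against zero at -O2.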
[Bug tree-optimization/85316] [meta-bug] VRP range propagation missed cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85316 Bug 85316 depends on bug 85375, which changed state. Bug 85375 Summary: possible missed optimisation / regression from 6.3 with while (__builtin_ffs(x) && x) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85375 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/95527] Failure to optimize __builtin_ffs == 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95527 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Target Milestone|--- |11.0 Status|NEW |RESOLVED --- Comment #6 from Andrew Pinski --- Fixed.
[Bug middle-end/19987] [meta-bug] fold missing optimizations in general
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19987 Bug 19987 depends on bug 95527, which changed state. Bug 95527 Summary: Failure to optimize __builtin_ffs == 0 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95527 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug target/98453] aarch64: Missed opportunity for STP for vec_duplicate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98453 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2021-09-05 Severity|normal |enhancement Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Confirmed. Plus these functions too: typedef double v2df __attribute__((vector_size (16))); typedef float v2sf __attribute__((vector_size (8))); void food (v2df *x, double a) { v2df tmp = {a, a}; *x = tmp; } void foof (v2sf *x, float a) { v2sf tmp = {a, a}; *x = tmp; }
[Bug tree-optimization/94846] Failure to optimize jnc+inc into adc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94846 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement --- Comment #5 from Andrew Pinski --- After r12-897 (which added a late sink pass), we get the following in .optimized: if (_10 != 0) goto ; [50.00%] else goto ; [50.00%] [local count: 536870913]: _2 = _1 + 1; [local count: 1073741824]: # prephitmp_11 = PHI <_1(2), _2(3)> # _13 = PHI <_1(2), _2(3)> *p_5(D) = _13; return prephitmp_11; Notice how prephitmp_11 and _13 are the same, but no RTL optimizer handles that.
[Bug tree-optimization/98357] Bounds check not eliminated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98357 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Target Milestone|--- |12.0 Status|NEW |RESOLVED --- Comment #4 from Andrew Pinski --- Fixed on the trunk by some of the improvements to VRP (range).
[Bug tree-optimization/85316] [meta-bug] VRP range propagation missed cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85316 Bug 85316 depends on bug 98357, which changed state. Bug 98357 Summary: Bounds check not eliminated https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98357 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/93326] switch optimisation of multiple jumptables into a lookup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93326 --- Comment #6 from Andrew Pinski --- (In reply to Andrew Pinski from comment #5) > So for the -fPIC case, we don't want to increase the number of runtime > relocations done. The number of runtime locations will happen in the > constable load table. I think we don't want to change that. And that is PR 99383.
[Bug tree-optimization/99383] No tree-switch-conversion under PIC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99383 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93326, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36881
[Bug tree-optimization/99383] No tree-switch-conversion under PIC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99383 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW |RESOLVED --- Comment #8 from Andrew Pinski --- Dup of bug 84011. *** This bug has been marked as a duplicate of bug 84011 ***
[Bug tree-optimization/84011] Optimize switch table with run-time relocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84011 Andrew Pinski changed: What|Removed |Added CC||jengelh at inai dot de --- Comment #14 from Andrew Pinski --- *** Bug 99383 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/95410] Failure to optimize compare next to and properly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95410 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug tree-optimization/93745] Redundant store not eliminated with intermediate instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93745 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug target/101059] v4sf reduction not optimal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101059 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug tree-optimization/66646] small loop turned into memmove because of tree ldist
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66646 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Last reconfirmed|2015-06-24 00:00:00 |2021-9-4
[Bug target/102202] New: Inefficent expansion of memset when range is [0,1]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102202 Bug ID: 102202 Summary: Inefficent expansion of memset when range is [0,1] Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Target: x86_64-*-*

Take:

    void g(int a, char *d)
    {
      if (a < 0 || a > 1) __builtin_unreachable();
      __builtin_memset(d, 0, a);
    }

- CUT -

GCC compiles on x86_64 to:

    g(int, char*):
            .cfi_startproc
            testl   %edi, %edi
            je      .L1
            xorl    %eax, %eax
    .L2:
            movl    %eax, %edx
            addl    $1, %eax
            movb    $0, (%rsi,%rdx)
            cmpl    %edi, %eax
            jb      .L2
    .L1:
            ret

This is better than what clang/LLVM/ICC produce, but the loop is not needed, as a will be either 0 or 1 and we already jump around the loop. Here is another example not using __builtin_unreachable:

    void g(int a, char *d) { __builtin_memset(d, 0, a&1); }
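To make the point about the unnecessary loop concrete, here is a minimal sketch comparing the reported function with a hand-written loop-free form. The names clear_builtin and clear_branch are hypothetical; the second function is only what one would expect an ideal expansion to behave like, not what any GCC version is known to emit.

```cpp
// Form from the report: the length is known to be 0 or 1.
void clear_builtin(int a, char *d) {
  if (a < 0 || a > 1) __builtin_unreachable();
  __builtin_memset(d, 0, a);
}

// Hand-written equivalent (hypothetical): a single conditional byte store,
// no loop needed because at most one byte is ever written.
void clear_branch(int a, char *d) {
  if (a) *d = 0;
}
```

For a equal to 0 or 1 the two functions leave the buffer in the same state, so the loop in the generated code adds nothing.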
[Bug target/102202] Inefficent expansion of memset when range is [0,1]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102202 --- Comment #1 from Andrew Pinski --- Likewise for memcpy: typedef decltype(sizeof(0)) size_t; void g(size_t a, char *d, char *e) { __builtin_memcpy(d, e, a&1); }
[Bug target/102202] Inefficent expansion of memset when range is [0,1]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102202 --- Comment #2 from Andrew Pinski --- I wonder if we could do this expansion at the gimple level, though introducing branches might not be welcome for some.
[Bug target/102203] New: __builtin_memset and __builtin_memcpy could be expanded inline if range is known to be small
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102203 Bug ID: 102203 Summary: __builtin_memset and __builtin_memcpy could be expanded inline if range is known to be small Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Target: aarch64*-*-* Take: typedef decltype(sizeof(0)) size_t; void g(size_t a, char *d, char *e) { if (a>16)__builtin_unreachable(); __builtin_memcpy(d, e, a); } - CUT This could be inlined like it is on x86_64.
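One common way to expand a small bounded copy inline is to use a few size classes with overlapping head and tail moves. The sketch below is an illustration of that general technique only; the function name copy_upto16 is hypothetical, and nothing here is claimed to match what GCC emits on x86_64 or would emit on aarch64.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies n bytes, assuming n <= 16.
// Each size class loads a head chunk and a tail chunk (which may overlap)
// and then stores both, so no loop is required.
void copy_upto16(char *d, const char *e, std::size_t n) {
  if (n >= 8) {
    std::uint64_t head, tail;
    std::memcpy(&head, e, 8);
    std::memcpy(&tail, e + n - 8, 8);
    std::memcpy(d, &head, 8);
    std::memcpy(d + n - 8, &tail, 8);
  } else if (n >= 4) {
    std::uint32_t head, tail;
    std::memcpy(&head, e, 4);
    std::memcpy(&tail, e + n - 4, 4);
    std::memcpy(d, &head, 4);
    std::memcpy(d + n - 4, &tail, 4);
  } else if (n >= 2) {
    std::uint16_t head, tail;
    std::memcpy(&head, e, 2);
    std::memcpy(&tail, e + n - 2, 2);
    std::memcpy(d, &head, 2);
    std::memcpy(d + n - 2, &tail, 2);
  } else if (n == 1) {
    *d = *e;
  }
}
```

The fixed-size std::memcpy calls are the portable way to express unaligned loads and stores; compilers typically turn each of them into a single move.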
[Bug rtl-optimization/80301] Sub-optimal code with an array of structs offsetted inside a struct global on x86/x86_64 at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80301 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement --- Comment #4 from Andrew Pinski --- We are able to do the 2->2 combine now (after r9-2064): Trying 9 -> 10: 9: {r87:DI=r86:DI+0x2;clobber flags:CC;} REG_DEAD r86:DI REG_UNUSED flags:CC 10: flags:CCZ=cmp([r87:DI*0x8+`m'],r83:SI) Failed to match this instruction: (parallel [ (set (reg:CCZ 17 flags) (compare:CCZ (mem:SI (plus:DI (mult:DI (reg:DI 86 [ indexD.2442 ]) (const_int 8 [0x8])) (const:DI (plus:DI (symbol_ref:DI ("m") [flags 0x2] ) (const_int 16 [0x10] [1 mD.2375.sD.2374[index_4(D)].aD.2372+0 S4 A64]) (reg:SI 83 [ ]))) (set (reg:DI 87) (plus:DI (reg:DI 86 [ indexD.2442 ]) (const_int 2 [0x2]))) ]) Failed to match this instruction: (parallel [ (set (reg:CCZ 17 flags) (compare:CCZ (mem:SI (plus:DI (mult:DI (reg:DI 86 [ indexD.2442 ]) (const_int 8 [0x8])) (const:DI (plus:DI (symbol_ref:DI ("m") [flags 0x2] ) (const_int 16 [0x10] [1 mD.2375.sD.2374[index_4(D)].aD.2372+0 S4 A64]) (reg:SI 83 [ ]))) (set (reg:DI 87) (plus:DI (reg:DI 86 [ indexD.2442 ]) (const_int 2 [0x2]))) ]) Successfully matched this instruction: (set (reg:DI 87) (plus:DI (reg:DI 86 [ indexD.2442 ]) (const_int 2 [0x2]))) Successfully matched this instruction: (set (reg:CCZ 17 flags) (compare:CCZ (mem:SI (plus:DI (mult:DI (reg:DI 86 [ indexD.2442 ]) (const_int 8 [0x8])) (const:DI (plus:DI (symbol_ref:DI ("m") [flags 0x2] ) (const_int 16 [0x10] [1 mD.2375.sD.2374[index_4(D)].aD.2372+0 S4 A64]) (reg:SI 83 [ ]))) allowing combination of insns 9 and 10 original costs 4 + 13 = 17 replacement costs 4 + 13 = 17 modifying insn i2 9: r87:DI=r86:DI+0x2 deferring rescan insn with uid = 9. modifying insn i310: flags:CCZ=cmp([r86:DI*0x8+const(`m'+0x10)],r83:SI) REG_DEAD r86:DI deferring rescan insn with uid = 10. But then we don't sink the add into the conditional and do the combine there. 
The code we get now is:

    func(unsigned int):
            movl    %edi, %edx
            movq    %rdx, %rax
            leaq    2(%rdx), %rcx
            cmpl    %edx, m+16(,%rdx,8)
            je      .L1
            movl    m+4(,%rcx,8), %eax
    .L1:
            ret
[Bug tree-optimization/85406] Unnecessary blend when vectorizing short-cutted calculations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85406 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement --- Comment #7 from Andrew Pinski --- I noticed that clang/LLVM does not do this either, nor does ICC.
[Bug ipa/84312] Variadic function without named argument not inlined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84312 Andrew Pinski changed: What|Removed |Added Known to fail||9.4.0 Resolution|--- |FIXED Status|NEW |RESOLVED CC||marxin at gcc dot gnu.org Component|tree-optimization |ipa See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70929 Severity|normal |enhancement Known to work||10.1.0 --- Comment #2 from Andrew Pinski --- Fixed in GCC 10 by r10-.
[Bug middle-end/84756] Multiplication done twice just to get upper and lower parts of product
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84756 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2021-09-05 Component|target |middle-end Status|UNCONFIRMED |NEW Severity|normal |enhancement Ever confirmed|0 |1 --- Comment #2 from Andrew Pinski --- Confirmed, we should be able to do part (all?) of this at the gimple level: _3 = a_6(D) w* b_7(D); _4 = _3 >> 64; _5 = (long unsigned int) _4; *upper_9(D) = _5; _11 = a_6(D) * b_7(D); return _11; (long unsigned int)_3 is the same as _11.
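The observation that the truncated widening product equals the plain product can be sketched as follows. The function name mul_hi_lo is hypothetical, and unsigned __int128 is a GCC/Clang extension available on 64-bit targets; the original testcase from the bug is not reproduced here.

```cpp
// Hypothetical helper: returns the low 64 bits of a * b and stores the high
// 64 bits through upper, using a single widening multiplication.
unsigned long long mul_hi_lo(unsigned long long a, unsigned long long b,
                             unsigned long long *upper) {
  unsigned __int128 wide = (unsigned __int128)a * b;  // like "a w* b" in the dump
  *upper = (unsigned long long)(wide >> 64);          // high half
  return (unsigned long long)wide;                    // low half, same as a * b
}
```

Because the low half of the 128-bit product is identical to the wrapped 64-bit product, a second multiplication is redundant.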
[Bug c++/98869] Allowing mapping this in OpenMP target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98869 --- Comment #3 from Ye Luo --- This doesn't work with gcc 11.2 but works on devel/omp/gcc-11 branch.
[Bug c++/102204] New: OpenMP offload map type restriction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102204 Bug ID: 102204 Summary: OpenMP offload map type restriction Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: xw111luoye at gmail dot com Target Milestone: ---

With branch devel/omp/gcc-11 I'm getting

    /home/yeluo/opt/qmcpack/build_rtx3060_gcc_offload_real/src/config.h:42:29: error: array section does not have mappable type in ‘map’ clause
       42 | #define PRAGMA_OFFLOAD(x) _Pragma(x)
          | ^~~
    /home/yeluo/opt/qmcpack/src/Particle/SoaDistanceTableAAOMPTarget.h:84:5: note: in expansion of macro ‘PRAGMA_OFFLOAD’
       84 | PRAGMA_OFFLOAD("omp target enter data map(to : this[:1])")
          | ^~
    In file included from /home/yeluo/opt/qmcpack/src/Particle/createDistanceTableAAOMPTarget.cpp:19:
    /home/yeluo/opt/qmcpack/src/Particle/SoaDistanceTableAAOMPTarget.h:31:8: note: type ‘qmcplusplus::SoaDistanceTableAAOMPTarget’ with virtual members is not mappable
       31 | struct SoaDistanceTableAAOMPTarget : public DTD_BConds, public DistanceTableData
          |^~~

because SoaDistanceTableAAOMPTarget is a derived class and there is virtual function overriding. https://github.com/QMCPACK/qmcpack/blob/1a7af8e589726a91da94e5f6ad8b4e8d9e2acd4d/src/Particle/SoaDistanceTableAAOMPTarget.h#L31

In my case virtual functions are never called in the offload region, and I map "this[:1]" for easy access to a fixed data set. So I'm expecting just a bitwise copy to the device. Please remove this restriction.
[Bug target/85324] missing constant propagation on SSE/AVX conversion intrinsics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85324 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Last reconfirmed|2018-04-11 00:00:00 |2021-9-4
[Bug target/102205] New: vec + 1 could be done as vec - (-1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102205 Bug ID: 102205 Summary: vec + 1 could be done as vec - (-1) Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Target: x86_64

Take:

    template <typename T> using V [[gnu::vector_size(16)]] = T;
    auto a1(V<int> b) { return 1 + b; }

CUT

Currently GCC produces:

    a1(int __vector(4)):
            paddd   .LC0(%rip), %xmm0
            ret
            .cfi_endproc
    .LFE0:
            .size   a1(int __vector(4)), .-a1(int __vector(4))
            .section        .rodata.cst16,"aM",@progbits,16
            .align 16
    .LC0:
            .long   1
            .long   1
            .long   1
            .long   1

But it might be best if GCC produces (like LLVM):

    a1(int __vector(4)):
            pcmpeqd %xmm1, %xmm1
            psubd   %xmm1, %xmm0
            retq
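The pcmpeqd/psubd idea can be expressed at the source level with GNU vector extensions. This is only a sketch of the arithmetic identity: the typedef v4si and the function names are hypothetical, and the code relies on the GCC/Clang rule that a vector comparison yields -1 in every lane where it is true.

```cpp
// Hypothetical names; uses GNU vector extensions (GCC and Clang).
typedef int v4si __attribute__((vector_size(16)));

// Straightforward form: needs a vector constant of ones.
v4si add_one(v4si b) {
  return b + 1;
}

// Alternative form: b == b is true in every lane, so it yields all-ones
// (that is, -1) per lane, and subtracting -1 adds 1.
v4si sub_all_ones(v4si b) {
  v4si all_ones = (b == b);
  return b - all_ones;
}
```

The second form avoids loading a constant from memory, which is the advantage the report attributes to the LLVM output.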
[Bug middle-end/86085] I/O built-ins considered argument clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86085 Andrew Pinski changed: What|Removed |Added Last reconfirmed|2018-06-13 00:00:00 |2021-9-4 Severity|normal |enhancement Component|tree-optimization |middle-end
[Bug middle-end/86085] I/O built-ins considered argument clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86085 --- Comment #3 from Andrew Pinski --- I thought builtin_fnspec and friends would have optimized this case but no. In fact starting with GCC 10, f even regresses, starting with r10-2814.
[Bug tree-optimization/85116] std::min_element does not optimize well with inlined predicate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85116 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Last reconfirmed|2018-03-29 00:00:00 |2021-9-4 Component|libstdc++ |tree-optimization
[Bug tree-optimization/86339] DOM does not handle RHS COND_EXPRs well
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86339 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug tree-optimization/86241] duplicate strlen-like snprintf calls not folded
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86241 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Severity|normal |enhancement Status|UNCONFIRMED |NEW Last reconfirmed||2021-09-05 --- Comment #3 from Andrew Pinski --- Confirmed.
[Bug tree-optimization/86604] phiopt missed optimization of conditional add
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86604 Andrew Pinski changed: What|Removed |Added CC||pinskia at gcc dot gnu.org Severity|normal |enhancement --- Comment #2 from Andrew Pinski --- Maybe something like: (simplify (cond (ne bool@0 integer_zerop) (plus @1 integer_onep) @1) (plus (convert @0) @1)) Where bool is defined to be a var that is in the range of [0,1]. This seems like what LLVM does.
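In source terms, the suggested pattern rewrites a conditional increment as an unconditional add of the condition. A minimal sketch, with hypothetical function names and a C++ bool standing in for "a var that is in the range of [0,1]":

```cpp
// Hypothetical names, for illustration only.

// Before: select between x + 1 and x.
int cond_add(bool c, int x) {
  return c ? x + 1 : x;
}

// After: add the condition converted to an integer (0 or 1).
int plain_add(bool c, int x) {
  return x + static_cast<int>(c);
}
```

Both functions compute the same result whenever x + 1 does not overflow, which is the case the conditional form requires anyway when c is true.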
[Bug tree-optimization/89043] strcat (strcpy (d, a), b) not folded to stpcpy (strcpy (d, a), b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89043 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug ipa/88231] aligned functions laid down inefficiently
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88231 Andrew Pinski changed: What|Removed |Added Severity|minor |enhancement
[Bug target/91103] AVX512 vector element extract uses more than 1 shuffle instruction; VALIGND can grab any element
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91103 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug rtl-optimization/52082] Memory loads not rematerialized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52082 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement --- Comment #3 from Andrew Pinski --- One thing I noticed that LLVM does to reduce the register pressure is: (z ? v4 [k] : v3 [k]) gets pulled out of the loop such that it is: tmpaddr = z ? v4 : v3; and then inside the loop it does: (tmpaddr)[k] GCC still has (I changed the bb order just so it is easier to see what is going on): if (z_39(D) != 0) goto ; [50.00%] else goto ; [50.00%] [local count: 5427362]: _21 = v3.3_18 + _157; iftmp.1_40 = *_21; goto ; [100.00%] [local count: 5427362]: _17 = v4.2_14 + _157; iftmp.1_41 = *_17; [local count: 10854724]: # m_8 = PHI if (m_8 != 0B) goto ; [94.50%] else goto ; [5.50%] We should be able to do something similar, it seems, and need two fewer registers: one to hold z and one to hold either v3 or v4. This won't be enough for this testcase, but it will be something.
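The hoisting described in comment #3 can be sketched on a toy loop. The functions below are hypothetical and much simpler than the testcase in the bug; they only show the shape of the transformation, where the loop-invariant choice between v3 and v4 is made once before the loop.

```cpp
#include <cstddef>

// Before: the select on z is evaluated on every iteration, so z, v3 and v4
// all have to stay live inside the loop.
long sum_select_in_loop(bool z, const int *v3, const int *v4, std::size_t n) {
  long s = 0;
  for (std::size_t k = 0; k < n; ++k)
    s += z ? v4[k] : v3[k];
  return s;
}

// After: the base address is chosen once; only tmpaddr is live in the loop.
long sum_select_hoisted(bool z, const int *v3, const int *v4, std::size_t n) {
  const int *tmpaddr = z ? v4 : v3;
  long s = 0;
  for (std::size_t k = 0; k < n; ++k)
    s += tmpaddr[k];
  return s;
}
```

In the hoisted version the loop body needs one pointer instead of a flag plus two pointers, which is the register saving the comment refers to.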
[Bug middle-end/91409] Missed optimization on `labels as values` expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91409 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Component|target |middle-end
[Bug target/93396] [RX] tail call optimization does not work with indirect call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93396 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug target/93737] inline memmove for insertion into small arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93737 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2021-09-05 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Severity|normal |enhancement --- Comment #5 from Andrew Pinski --- Confirmed.
[Bug tree-optimization/93560] strstr(s, s) not folded to s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93560 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement Status|UNCONFIRMED |NEW Last reconfirmed||2021-09-05 Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Confirmed, LLVM does this.
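The fold in the bug title follows from the definition of strstr: every string occurs in itself at offset 0. A tiny sketch, with a hypothetical wrapper name:

```cpp
#include <cstring>

// Hypothetical wrapper: searching a string for itself always matches at the
// very beginning, so the call could be replaced by s.
const char *find_self(const char *s) {
  return std::strstr(s, s);
}
```

This also holds for the empty string, since strstr with an empty needle returns the haystack pointer.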
[Bug tree-optimization/93556] lower mempcpy to memcpy when result is unused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93556 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2021-09-05 Severity|normal |enhancement --- Comment #1 from Andrew Pinski --- Confirmed.
[Bug tree-optimization/93539] memmove over self with result of string function not eliminated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93539 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2021-09-05 Ever confirmed|0 |1 Severity|normal |enhancement Depends on||82991 Status|UNCONFIRMED |NEW See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82991 --- Comment #1 from Andrew Pinski --- Confirmed, PR 82991 is related and will most likely solve this too. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82991 [Bug 82991] memcpy and strcpy return value can be assumed to be equal to first argument
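The property named in PR 82991 is guaranteed by the C library specification: memcpy and strcpy return their destination argument. The sketch below uses hypothetical wrapper names just to show that property; it is not the testcase from either bug.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrappers: the returned pointer is always equal to d.
char *copy_string(char *d, const char *s) {
  return std::strcpy(d, s);
}

void *copy_bytes(void *d, const void *s, std::size_t n) {
  return std::memcpy(d, s, n);
}
```

Once the optimizer knows the result equals the first argument, a later memmove from that result onto the same destination is a copy of a buffer onto itself and can be dropped.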
[Bug rtl-optimization/93525] Left shift and arithmetic shift could be futher simplified in simplify-rtx.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93525 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug tree-optimization/82911] missing strlen optimization for strncpy with constant strings and constant bound
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82911 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2021-09-05 Status|UNCONFIRMED |NEW Severity|normal |enhancement --- Comment #1 from Andrew Pinski --- Confirmed. A related testcase is:

    void f1 (char *d, char *e, bool b)
    {
      d[2] = 0;
      if (__builtin_strlen (d) > 2) // not eliminated but could be
        __builtin_abort ();
    }

where the strlen's range should be [0,2]. Maybe we can add a class to the ranger for strings and do the optimization that way instead, so that after the null store to d[2] the range for the string becomes [0,2].
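The range argument can be checked with a small sketch. The function name len_after_store is hypothetical; the point is only that once d[2] is a terminator, strlen cannot scan past index 2.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: after the store, a null byte exists at index 2, so
// the returned length is always 0, 1 or 2. The caller must pass a buffer of
// at least 3 bytes.
std::size_t len_after_store(char *d) {
  d[2] = 0;
  return std::strlen(d);
}
```

A comparison such as len_after_store(d) > 2 is therefore always false, which is why the abort call in the testcase is dead code.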
[Bug tree-optimization/83190] missing strlen optimization of the empty string
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83190 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug middle-end/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 Andrew Pinski changed: What|Removed |Added Component|libstdc++ |middle-end Severity|normal |enhancement
[Bug ipa/88971] Branch optimization inconsistency (missed optimization)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2021-09-05 Status|UNCONFIRMED |NEW CC||marxin at gcc dot gnu.org Ever confirmed|0 |1 Component|middle-end |ipa --- Comment #9 from Andrew Pinski --- This is just an inlining problem:

    #include <iostream>
    #include <string>

    struct Data {
      int i;
      int j;
    };

    class Test {
    public:
      template <typename T> void CheckAndPrint(const T &t);
    };

    template <typename T>
    inline void Test::CheckAndPrint(const T &t) {
      asm volatile ("mfence" ::: "memory");
      if (t.j > 0) {
        std::string a = "<";
        std::string b = "<";
      }
      asm volatile ("mfence" ::: "memory");
    }

    int main() {
      Data data;
      std::cin >> data.i;
      std::cin >> data.j;
      Test t;
      t.CheckAndPrint(data);
      return 0;
    }

--- CUT

If we get rid of one of the std::string objects in Test::CheckAndPrint, we are able to inline the constructor (std::__cxx11::basic_string::basic_string<> ) and everything goes away. One thing I noticed is: [local count: 1073612976]: _24 = __builtin_strlen (__s_7(D)); ... if (_24 > 15) I wonder if we could have an IPA pass which clones this function based on the strlen of the string that gets passed as the second argument.
[Bug tree-optimization/86708] strlen of an empty aggregate element or member string not folded
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86708 Andrew Pinski changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|--- |10.0 --- Comment #3 from Andrew Pinski --- Fixed by r10-2769. Specifically this part of the patch: * gimple-fold.c (fold_nonarray_ctor_reference): Return a STRING_CST for missing initializers.
[Bug tree-optimization/83819] [meta-bug] missing strlen optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83819 Bug 83819 depends on bug 86708, which changed state. Bug 86708 Summary: strlen of an empty aggregate element or member string not folded https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86708 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug middle-end/82170] gcc optimizes int range-checking poorly on x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82170 Andrew Pinski changed: What|Removed |Added Last reconfirmed|2017-09-14 00:00:00 |2021-9-4 Severity|normal |enhancement Component|target |middle-end