[Bug middle-end/115863] [15 Regression] zlib-1.3.1 miscompilation since r15-1936-g80e446e829d818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863 --- Comment #18 from Hu Lin --- (In reply to Uroš Bizjak from comment #17) > (In reply to Hongtao Liu from comment #16) > > > Unfortunately, x86 has no vector mode .SAT_TRUNC instruction. > > No, AVX512 supports both signed and unsigned saturation > Indeed. > > BTW: PACKUSmn (despite the name) is not what we are looking for. Indeed.
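A scalar sketch may help readers follow the distinction drawn in this comment. The function names are invented for illustration and are not GCC internals. The second function is only a per-element model of a PACKUS-style pack as I understand it (signed input, unsigned-saturated output), which would explain why it is not an unsigned saturating truncation.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating truncation: the input is interpreted as unsigned,
// so every value above 255 clamps to 255.
static std::uint8_t sat_trunc_u16_to_u8(std::uint16_t x) {
    return x > 0xFF ? std::uint8_t{0xFF} : static_cast<std::uint8_t>(x);
}

// Hypothetical scalar model of a PACKUS-style narrowing: the input is
// interpreted as signed, negatives clamp to 0, large values clamp to 255.
static std::uint8_t packus_like_s16_to_u8(std::int16_t x) {
    if (x < 0) return 0;
    if (x > 0xFF) return 0xFF;
    return static_cast<std::uint8_t>(x);
}
```

The two agree for inputs in 0..255 but differ for the bit pattern 0xFFFF: read as unsigned it is 65535 and saturates to 255, read as signed it is -1 and clamps to 0.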
[Bug target/115950] Missed SVE fold to INCP
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115950 --- Comment #3 from ktkachov at gcc dot gnu.org --- (In reply to Andrew Pinski from comment #2) > Hmm actually there are patterns there but they are not matching. Something > seems to be going wrong with define_insn_and_rewrite ... The MD pattern requires a (const_int SVE_KNOWN_PTRUE) in one of its operands but the attempted match has (const_int 0) i.e. SVE_MAYBE_NOT_PTRUE which blocks matching.
[Bug tree-optimization/115766] [12/13/14 Regression] wrong code at optimization levels -O2, -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115766 --- Comment #8 from Sam James --- Are you sure the reduced one is accurate? For me, it behaves the same with -O0..-O3 for GCC. For Clang, it has the same behaviour as GCC with -O0, and "passes" with > -O0.
[Bug tree-optimization/115868] [14 Regression] ICE: in exact_div, at poly-int.h:2156
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115868 --- Comment #3 from GCC Commits --- The releases/gcc-14 branch has been updated by Richard Biener : https://gcc.gnu.org/g:c58bede01c06c84f0b36881fafd1e5d6456a38f4 commit r14-10443-gc58bede01c06c84f0b36881fafd1e5d6456a38f4 Author: Richard Biener Date: Thu Jul 11 09:56:56 2024 +0200 tree-optimization/115868 - ICE with .MASK_CALL in simdclone The following adjusts mask recording which didn't take into account that we can merge call arguments from two vectors like _50 = {vect_d_1.253_41, vect_d_1.254_43}; _51 = VIEW_CONVERT_EXPR(mask__19.257_49); _52 = (unsigned int) _51; _53 = _Z3bazd.simdclone.7 (_50, _52); _54 = BIT_FIELD_REF <_53, 256, 0>; _55 = BIT_FIELD_REF <_53, 256, 256>; The testcase g++.dg/vect/pr68762-2.cc exercises this on x86_64 with partial vector usage enabled and AVX512 support. PR tree-optimization/115868 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Correctly compute the number of mask copies required for vect_record_loop_mask. (cherry picked from commit abf3964711f05b6858d9775c3595ec2b45483e14)
[Bug tree-optimization/105769] [11/12/13/14/15 Regression] program segmentation fault with -ftree-vectorize and nested lambdas
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105769 Richard Biener changed: What|Removed |Added Known to fail||12.4.0, 13.3.0, 14.1.0 Known to work||10.5.0 --- Comment #17 from Richard Biener --- I'm not actually seeing the problematic use of the hoisted address - the address value itself is stored and the trick of looking at SSA uses defs to pick up indirect address uses later doesn't work here as the only use is in the vector CTOR: _15 = (long unsigned int) &bias; _10 = (long unsigned int) &cov_jn; _12 = {_10, _15}; ... bias ={v} {CLOBBER(bob)}; but _12 is only used in MEM [(void *)&D.5715 + 32B] = _12; and then maybe indirectly __ct_comp (_14, &D.5715.__est); I can fix the miscompile with the following patch - we're treating all CLOBBER kinds as invalidating earlier mentions. I'm not sure that's really necessary and it's definitely harmful when there are hoisted address mentions. It also explains that -fstack-reuse=none doesn't help as the gimplifier only inserts CLOBBER_STORAGE_END clobbers. I'm also allowing CLOBBER_OBJECT_END here. I do not remember whether we discussed doing sth like this instead of the special SSA use handling we added? diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc index eef565eddb5..92968075b04 100644 --- a/gcc/cfgexpand.cc +++ b/gcc/cfgexpand.cc @@ -632,6 +632,13 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict) that are COMPONENT_REFs. */ if (!VAR_P (lhs)) continue; + tree cl = gimple_assign_rhs1 (stmt); + /* When the clobber is possibly a object/storage start do not +ignore previous mentions at this point. Those might +include hoisted address uses. */ + if (CLOBBER_KIND (cl) != CLOBBER_STORAGE_END + && CLOBBER_KIND (cl) != CLOBBER_OBJECT_END) + continue; if (DECL_RTL_IF_SET (lhs) == pc_rtx && (v = decl_to_stack_part->get (lhs))) bitmap_clear_bit (work, *v);
[Bug tree-optimization/115382] Wrong code with in-order conditional reduction and masked loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382 --- Comment #8 from GCC Commits --- The releases/gcc-14 branch has been updated by Richard Biener : https://gcc.gnu.org/g:bf64404280a90715d1228edef0d5756e81635a64 commit r14-10444-gbf64404280a90715d1228edef0d5756e81635a64 Author: Robin Dapp Date: Fri Jun 7 14:36:41 2024 +0200 vect: Merge loop mask and cond_op mask in fold-left reduction [PR115382]. Currently we discard the cond-op mask when the loop is fully masked which causes wrong code in gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c when compiled with -O3 -march=cascadelake --param vect-partial-vector-usage=2. This patch ANDs both masks. gcc/ChangeLog: PR tree-optimization/115382 * tree-vect-loop.cc (vectorize_fold_left_reduction): Use prepare_vec_mask. * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Remove static of prepare_vec_mask. * tree-vectorizer.h (prepare_vec_mask): Export. (cherry picked from commit 2b438a0d2aa80f051a09b245a58f643540d4004b)
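The effect of discarding the cond-op mask can be modelled in scalar code. This is only a sketch under my own assumptions: the function names are hypothetical, and the "buggy" variant simply adds every lane that the loop mask enables, which may not match the exact code the vectorizer generated.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// In-order (fold-left) reduction over one vector's worth of lanes.
// A lane contributes only if it is active in BOTH masks, which is the
// behaviour the commit describes as ANDing the two masks.
static double fold_left_masked(double acc, const double* v,
                               const bool* loop_mask, const bool* cond_mask,
                               std::size_t lanes) {
    for (std::size_t i = 0; i < lanes; ++i)
        if (loop_mask[i] && cond_mask[i])
            acc += v[i];
    return acc;
}

// Model of the wrong code: the condition mask is ignored, so lanes whose
// condition is false are still added.
static double fold_left_loop_mask_only(double acc, const double* v,
                                       const bool* loop_mask,
                                       std::size_t lanes) {
    for (std::size_t i = 0; i < lanes; ++i)
        if (loop_mask[i])
            acc += v[i];
    return acc;
}
```

Signed zero makes the difference visible even when the skipped value is 0.0: starting from -0.0 and adding +0.0 yields +0.0 under the default rounding mode, so an addition that should have been masked out changes the sign of the result.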
[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908

--- Comment #10 from Matthias Kretz (Vir) ---
(In reply to Richard Biener from comment #9)
> One issue with
>
> V load3(const unsigned long* ptr)
> {
>   V ret = {};
>   __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
>
> is that we cannot load a vector worth of data from ptr because that might
> trap

Unless the target has a masked load instruction (e.g. AVX512) or ptr is known to be aligned to at least 16 Bytes (in which case we know there cannot be a page boundary at ptr + 24 Bytes). No? In this specific example, ptr is pointing to a 32-Byte vector object.

The library can do this and it makes a difference:

  if (__builtin_object_size(ptr, 0) >= 4 * sizeof(T))
    __builtin_memcpy(&ret, ptr, 4 * sizeof(T));
  else
    __builtin_memcpy(&ret, ptr, 3 * sizeof(T));
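A self-contained sketch of the object-size branch from this comment follows. The struct V4 is a hypothetical stand-in for the vector type V. One deliberate deviation, which is my assumption and not something stated in the thread: the sketch passes type 2 to __builtin_object_size instead of 0. GCC documents that an unknown size is reported as (size_t)-1 for types 0 and 1 and as 0 for types 2 and 3, so with type 2 the unknown case selects the three-element copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the four-element vector type V.
struct V4 { unsigned long e[4]; };

// Load three elements. If the compiler can prove that at least four
// elements are readable at ptr, copy all four so the load can be one
// full-width access; otherwise copy exactly three.
inline V4 load3(const unsigned long* ptr) {
    V4 ret = {};
    if (__builtin_object_size(ptr, 2) >= 4 * sizeof(unsigned long))
        std::memcpy(&ret, ptr, 4 * sizeof(unsigned long));
    else
        std::memcpy(&ret, ptr, 3 * sizeof(unsigned long));
    return ret;
}
```

Which branch is taken depends on inlining and optimization level, so the fourth element of the result is either zero or the fourth source element. Callers should rely only on the first three.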
[Bug c++/115964] New: GCC accepts invalid program with explicit object member function overloads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115964

Bug ID: 115964
Summary: GCC accepts invalid program with explicit object member function overloads
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: jlame646 at gmail dot com
Target Milestone: ---

The following invalid program is accepted by gcc. Demo: https://godbolt.org/z/dG4qEKzb5

```
struct C {
    void j(this const C);
    void j() const ; //gcc ok, clang: nope, edg:ok
    void f(this C );
    void f(C); //gcc ok, clang ok, edg:nope
};
```

Both these overloads are ill-formed as explained here: https://stackoverflow.com/questions/78758215/explicit-member-function-discrepancies-between-different-compilers
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #67 from uecker at gcc dot gnu.org --- (In reply to Andrew Church from comment #66) > (In reply to Andrew Church from comment #65) > > As one of the advocates for this behavior, it stems (at least in my case) > > from pre-C23 code in which [[attribute]] syntax was not available. If > > [[nodiscard]] suppresses the warning, I'd accept that as a solution. > > Premature reply (apologies) - I somehow gave myself the impression that > [[nodiscard]] could be put at the place of use. Since it's something the > library writer has to do, I think this is still not enough from the library > user's point of view. I agree that this is an argument for having a compiler switch. But also the library could switch to "discard" or add a condition that lets the user of the library choose it. As a last resort, on the compiler side one can already use a pragma to turn off the warning at a specific point. So I am a bit unsure about whether the flag is worth having.
[Bug tree-optimization/105769] [11/12/13/14/15 Regression] program segmentation fault with -ftree-vectorize and nested lambdas
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105769 --- Comment #18 from Richard Biener --- (In reply to Richard Biener from comment #17) > I'm not actually seeing the problematic use of the hoisted address - the > address value itself is stored and the trick of looking at SSA uses defs to > pick up > indirect address uses later doesn't work here as the only use is in the > vector CTOR: > > _15 = (long unsigned int) &bias; > _10 = (long unsigned int) &cov_jn; > _12 = {_10, _15}; > ... > bias ={v} {CLOBBER(bob)}; > > but _12 is only used in > > MEM [(void *)&D.5715 + 32B] = _12; > > and then maybe indirectly > > __ct_comp (_14, &D.5715.__est); > > I can fix the miscompile with the following patch - we're treating all > CLOBBER kinds as invalidating earlier mentions. I'm not sure that's > really necessary and it's definitely harmful when there are hoisted > address mentions. It also explains that -fstack-reuse=none doesn't > help as the gimplifier only inserts CLOBBER_STORAGE_END clobbers. > I'm also allowing CLOBBER_OBJECT_END here. > > I do not remember whether we discussed doing sth like this instead of the > special SSA use handling we added? > > diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc > index eef565eddb5..92968075b04 100644 > --- a/gcc/cfgexpand.cc > +++ b/gcc/cfgexpand.cc > @@ -632,6 +632,13 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, > bool for_conflict) > that are COMPONENT_REFs. */ > if (!VAR_P (lhs)) > continue; > + tree cl = gimple_assign_rhs1 (stmt); > + /* When the clobber is possibly a object/storage start do not > +ignore previous mentions at this point. Those might > +include hoisted address uses. */ > + if (CLOBBER_KIND (cl) != CLOBBER_STORAGE_END > + && CLOBBER_KIND (cl) != CLOBBER_OBJECT_END) > + continue; > if (DECL_RTL_IF_SET (lhs) == pc_rtx > && (v = decl_to_stack_part->get (lhs))) > bitmap_clear_bit (work, *v); It breaks g++.dg/opt/pr86214-1.C and gcc.target/i386/stack-check-17.c
[Bug tree-optimization/104515] [11/12/13/14/15 Regression] trivially-destructible destructors interfere with loop optimization - maybe related to lifetime-dse.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104515 --- Comment #9 from Richard Biener --- I think not re-emitting the clobber on the exit might be OK semantically - it's at most a missed optimization for a store. I think it's also OK for the stack slot sharing logic as any object becoming live after a clobber would need to have a mention in between the clobber and the exit. I'll note that if that's not OK then duplicating the clobber wouldn't be either. As we don't know whether the object we store to ends its lifetime (it's only a may-alias) we can't do both - extend its lifetime and preserve the original one. We could maybe move the clobber (duplicating it to each edge but removing it from the loop body). I'm testing a patch.
[Bug c++/96717] -flifetime-dse=2 breaks webkit-gtk-2.28.4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96717 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #10 from Richard Biener --- This was fixed a long time ago.
[Bug tree-optimization/96881] [11 Regression] Clobbers on NULL vs. DCE since r8-1519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96881 Bug 96881 depends on bug 96717, which changed state. Bug 96717 Summary: -flifetime-dse=2 breaks webkit-gtk-2.28.4 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96717 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/96722] [8/9 Regression] Clobbers on NULL since r8-1519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96722 Bug 96722 depends on bug 96717, which changed state. Bug 96717 Summary: -flifetime-dse=2 breaks webkit-gtk-2.28.4 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96717 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug c++/96721] [11 Regression] pseudo-destructor calls on pointers since r11-2238
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96721 Bug 96721 depends on bug 96717, which changed state. Bug 96717 Summary: -flifetime-dse=2 breaks webkit-gtk-2.28.4 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96717 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/104515] [11/12/13/14/15 Regression] trivially-destructible destructors interfere with loop optimization - maybe related to lifetime-dse.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104515 --- Comment #10 from Richard Biener --- Testcase from PR96717, also fixed with my patch in testing. #include void pop_many(std::vector& v, unsigned n) { for (unsigned i = 0; i < n; ++i) { v.pop_back(); } }
[Bug tree-optimization/115868] [14 Regression] ICE: in exact_div, at poly-int.h:2156
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115868 Richard Biener changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #4 from Richard Biener --- Fixed now.
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 115868, which changed state. Bug 115868 Summary: [14 Regression] ICE: in exact_div, at poly-int.h:2156 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115868 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/115382] Wrong code with in-order conditional reduction and masked loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Target Milestone|--- |14.2 Resolution|--- |FIXED --- Comment #9 from Richard Biener --- Fixed now.
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 115382, which changed state. Bug 115382 Summary: Wrong code with in-order conditional reduction and masked loops https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug testsuite/110445] [14 Regression] FAIL: gcc.dg/vect/slp-46.c with AVX2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110445 Richard Biener changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|14.2|15.0 --- Comment #5 from Richard Biener --- So fixed. I'm not planning to backport.
[Bug other/115958] [15 regression] varasm.cc:8546:27: error: comparison of integer expressions of different signedness since r15-2039-g9964edfb4abdec
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115958 Richard Biener changed: What|Removed |Added Target Milestone|--- |15.0
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #68 from Andrew Church ---
(In reply to uecker from comment #67)
> But also the library could switch to "discard" or add a condition that
> lets the user of the library choose it.

The issue here is that the library user has no control over what the library author chooses to do. If the library author does not make that change, the user currently has no recourse (other than the pragma workaround you suggest).

> As a last resort, on the compiler side one can already use a pragma to turn
> off the warning at a specific point.

While true, this also introduces both a compiler-specific hack and a lot of verboseness around what ought to be a simple declaration of "I wish to ignore this return value", and I feel like most code authors who encounter this problem are more likely to add -Wno-unused-result to their compiler flags (thus losing the check everywhere) than do a whole

    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wunused-result"
    system(foo);
    #pragma GCC diagnostic pop

just to make that one instance go away.

I do agree that "(void)" is very idiomatic, and something like a [[discard]] statement attribute (which would silence warnings for both __attribute__((wur)) and [[nodiscard]]) would make the intent clearer. Perhaps something to suggest for a future version of the C standard?
[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908

--- Comment #11 from rguenther at suse dot de ---
On Wed, 17 Jul 2024, mkretz at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
>
> --- Comment #10 from Matthias Kretz (Vir) ---
> (In reply to Richard Biener from comment #9)
> > One issue with
> >
> > V load3(const unsigned long* ptr)
> > {
> >   V ret = {};
> >   __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
> >
> > is that we cannot load a vector worth of data from ptr because that might
> > trap
>
> Unless the target has a masked load instruction (e.g. AVX512) or ptr is known
> to be aligned to at least 16 Bytes (in which case we know there cannot be a
> page boundary at ptr + 24 Bytes). No? In this specific example, ptr is
> pointing to a 32-Byte vector object.

Sure but here we have no alignment info available (at most 8 byte alignment from the pointer type). I don't think introducing a .MASK_LOAD for the purpose of eliding a memcpy is a good thing to do (locally, just taking into account the memcpy on its own).

> The library can do this and it makes a difference:
>
>   if (__builtin_object_size(ptr, 0) >= 4 * sizeof(T))
>     __builtin_memcpy(&ret, ptr, 4 * sizeof(T));
>   else
>     __builtin_memcpy(&ret, ptr, 3 * sizeof(T));

I see, but that's then of course after inlining. In my former C++ times I've used template metaprogramming to implement this as an unrolled element-by-element copy (emitting a loop would be possible as well, of course).
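One way the unrolled element-by-element copy mentioned at the end of this comment could look is sketched below. It is not taken from the thread: the helper names are hypothetical, and the sketch assumes C++17 for the fold expression.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper: expands to one assignment per index in the pack,
// so no runtime loop is emitted in the source.
template <typename T, std::size_t... I>
inline void copy_unrolled_impl(T* dst, const T* src,
                               std::index_sequence<I...>) {
    // Fold over the comma operator: dst[0] = src[0], dst[1] = src[1], ...
    ((dst[I] = src[I]), ...);
}

// Copies exactly N elements from src to dst, with N fixed at compile time.
template <std::size_t N, typename T>
inline void copy_unrolled(T* dst, const T* src) {
    copy_unrolled_impl(dst, src, std::make_index_sequence<N>{});
}
```

A call such as copy_unrolled<3>(dst, src) touches only the first three elements of each array, so it never reads past a three-element source.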
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #69 from Jonathan Wakely ---
(In reply to Andrew Church from comment #68)
> I do agree that "(void)" is very idiomatic, and something like a [[discard]]
> statement attribute (which would silence warnings for both
> __attribute__((wur)) and [[nodiscard]]) would make the intent clearer.
> Perhaps something to suggest for a future version of the C standard?

Maybe you want:

    [[maybe_unused]] auto _ = foo();

This is already in C23.
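For context, here is the suggested idiom in a complete translation unit. The comment refers to C23; this sketch uses C++17, where the same spelling is valid. The names next_id and bump are hypothetical.

```cpp
#include <cassert>

// Hypothetical function whose result callers are expected to use.
[[nodiscard]] static int next_id(int* counter) { return ++*counter; }

static void bump(int* counter) {
    // Initialising a variable with the result counts as using it, so no
    // discarded-value warning applies; [[maybe_unused]] then silences the
    // unused-variable warning for the variable itself.
    [[maybe_unused]] auto _ = next_id(counter);
}
```

The call still runs for its side effect. Only the returned value is ignored.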
[Bug c++/115960] gcc throws an error when I use Optional in c++17.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115960 --- Comment #4 from Jonathan Wakely --- > The problem seems to be that, despite passing "-std=c++17", it doesn't use > c++17 > header files for the Optional identifier. Why should it? The name "Optional" is not part of any C++ standard.
[Bug c++/115965] New: Stack smashing depending on order of declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965

Bug ID: 115965
Summary: Stack smashing depending on order of declaration
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: nathan.teodosio at canonical dot com
Target Milestone: ---

If I execute the binary I get

--->
% ./e
*** stack smashing detected ***: terminated
Aborted (core dumped)
<---

However, no error is raised if I swap lines 17 (where a and b are declared) and 18 (where c is declared), or if I move either a or b definition to after c.

Valgrind says:

--->
% valgrind -s --track-origins=yes --leak-check=full --show-leak-kinds=all ./e
==173999== Memcheck, a memory error detector
==173999== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==173999== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==173999== Command: ./e
==173999==
==173999== Conditional jump or move depends on uninitialised value(s)
==173999==    at 0x1092FA: main (in /tmp/e)
==173999== Uninitialised value was created by a stack allocation
==173999==    at 0x109244: main (in /tmp/e)
==173999==
*** stack smashing detected ***: terminated
==173999==
==173999== Process terminating with default action of signal 6 (SIGABRT): dumping core
==173999==    at 0x4928B1C: __pthread_kill_implementation (pthread_kill.c:44)
==173999==    by 0x4928B1C: __pthread_kill_internal (pthread_kill.c:78)
==173999==    by 0x4928B1C: pthread_kill@@GLIBC_2.34 (pthread_kill.c:89)
==173999==    by 0x48CF26D: raise (raise.c:26)
==173999==    by 0x48B28FE: abort (abort.c:79)
==173999==    by 0x48B37B5: __libc_message_impl.cold (libc_fatal.c:132)
==173999==    by 0x49C0C18: __fortify_fail (fortify_fail.c:24)
==173999==    by 0x49C1EA3: __stack_chk_fail (stack_chk_fail.c:24)
==173999==    by 0x109300: main (in /tmp/e)
==173999==
==173999== HEAP SUMMARY:
==173999== in use at exit: 0 bytes in 0 blocks
==173999== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==173999==
==173999== All heap blocks were freed -- no leaks are possible
==173999==
==173999== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==173999==
==173999== 1 errors in context 1 of 1:
==173999== Conditional jump or move depends on uninitialised value(s)
==173999==    at 0x1092FA: main (in /tmp/e)
==173999== Uninitialised value was created by a stack allocation
==173999==    at 0x109244: main (in /tmp/e)
==173999==
==173999== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Aborted (core dumped)
<---
[Bug c++/115965] Stack smashing depending on order of declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965 --- Comment #1 from Nathan Teodosio --- Created attachment 58690 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58690&action=edit Preprocessed file (compressed with Gzip)
[Bug c++/115965] Stack smashing depending on order of declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965 --- Comment #2 from Nathan Teodosio --- Created attachment 58691 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58691&action=edit Source file
[Bug tree-optimization/115959] [15 Regression] rv64gcv ICE: segfault during GIMPLE pass: vect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115959 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #2 from Richard Biener --- I will have a look, (gdb) p debug (cond_node) t.c:3:6: note: node (constant) 0x5a76f60 (max_nunits=1, refcnt=1) vector(4) unsigned char t.c:3:6: note: { 0 } not sure how this happened, it seems STMT_VINFO_REDUC_IDX got "off". t.c:3:6: note: node 0x5a76e30 (max_nunits=4, refcnt=2) vector(4) int t.c:3:6: note: op template: patt_31 = _4 != 0 ? t_14 : 0; t.c:3:6: note: [l] stmt 0 patt_31 = _4 != 0 ? t_14 : 0; t.c:3:6: note: children 0x5a76ec8 0x5a76f60 0x5a76ff8 0x5a77090 The stmts reduc_idx is 1 which is OK. Ah, but we have four children for this frankenstein COND_EXPR which still has a GENERIC first operand. Partial transitions haunt us here ...
[Bug target/115954] Alignment of _Atomic structs incompatible between GCC and LLVM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115954 --- Comment #12 from Wilco --- This came out of the AArch64 Atomic ABI design work: https://github.com/ARM-software/abi-aa/pull/256
[Bug middle-end/115527] incorrect folding of __builtin_clear_padding()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115527 --- Comment #11 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:8b5919bae11754f4b65a17e63663d3143f9615ac commit r15-2090-g8b5919bae11754f4b65a17e63663d3143f9615ac Author: Jakub Jelinek Date: Wed Jul 17 11:38:33 2024 +0200 gimple-fold: Fix up __builtin_clear_padding lowering [PR115527] The builtin-clear-padding-6.c testcase fails as clear_padding_type doesn't correctly recompute the buf->size and buf->off members after expanding clearing of an array using a runtime loop. buf->size should be in that case the offset after which it should continue with next members or padding before them modulo UNITS_PER_WORD and buf->off that offset minus buf->size. That is what the code was doing, but with off being the start of the loop cleared array, not its end. So, the last hunk in gimple-fold.cc fixes that. When adding the testcase, I've noticed that the c-c++-common/torture/builtin-clear-padding-* tests, although clearly written as runtime tests to test the builtins at runtime, didn't have { dg-do run } directive and were just compile tests because of that. When adding that to the tests, builtin-clear-padding-1.c was already failing without that clear_padding_type hunk too, but builtin-clear-padding-5.c was still failing even after the change. That is due to a bug in clear_padding_flush which the patch fixes as well - when clear_padding_flush is called with full=true (that happens at the end of the whole __builtin_clear_padding or on those array padding clears done by a runtime loop), it wants to flush all the pending padding clearings rather than just some. 
If it is at the end of the whole object, it decreases wordsize when needed to make sure the code never writes including RMW cycles to something outside of the object: if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize) > (unsigned HOST_WIDE_INT) buf->sz) { gcc_assert (wordsize > 1); wordsize /= 2; i -= wordsize; continue; } but if it is full==true flush in the middle, this doesn't happen, but we still process just the buffer bytes before the current end. If that end is not on a wordsize boundary, e.g. on the builtin-clear-padding-5.c test the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18, nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones might be true, so in some spots we just didn't emit any clearing in that last chunk. 2024-07-17 Jakub Jelinek PR middle-end/115527 * gimple-fold.cc (clear_padding_flush): Introduce endsize variable and use it instead of wordsize when comparing it against nonzero_last. (clear_padding_type): Increment off by sz. * c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run directive. * c-c++-common/torture/builtin-clear-padding-2.c: Likewise. * c-c++-common/torture/builtin-clear-padding-3.c: Likewise. * c-c++-common/torture/builtin-clear-padding-4.c: Likewise. * c-c++-common/torture/builtin-clear-padding-5.c: Likewise. * c-c++-common/torture/builtin-clear-padding-6.c: New test.
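For readers unfamiliar with the builtin, the following runtime check illustrates what __builtin_clear_padding is expected to do. It is a sketch only, loosely in the spirit of the torture tests named above but not copied from them: the struct and function name are hypothetical, and it assumes a GCC release that provides the builtin (GCC 11 or later, as far as I know).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// On typical ABIs there are padding bytes between c and i.
struct S { char c; int i; };

// Returns true if, after __builtin_clear_padding, every byte between the
// end of c and the start of i is zero and the members are unchanged.
static bool padding_cleared_after_builtin() {
    S s;
    std::memset(&s, 0xff, sizeof s);  // fill the object, padding included
    s.c = 1;
    s.i = 2;
    __builtin_clear_padding(&s);      // zero the padding bits only

    unsigned char bytes[sizeof s];
    std::memcpy(bytes, &s, sizeof s); // inspect the object representation
    for (std::size_t k = sizeof(char); k < offsetof(S, i); ++k)
        if (bytes[k] != 0)
            return false;
    return s.c == 1 && s.i == 2;
}
```

Because the check executes code, it only tests the builtin when the file is built and run, which is the point the commit makes about the missing { dg-do run } directives.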
[Bug other/115958] [15 regression] varasm.cc:8546:27: error: comparison of integer expressions of different signedness since r15-2039-g9964edfb4abdec
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115958 --- Comment #1 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:74bcef4cf16b35fe64767c1e8e529bdd229841a3 commit r15-2091-g74bcef4cf16b35fe64767c1e8e529bdd229841a3 Author: Jakub Jelinek Date: Wed Jul 17 11:40:03 2024 +0200 varasm: Fix bootstrap after the .base64 changes [PR115958] Apparently there is a -Wsign-compare warning if ptrdiff_t has precision of int, then (t - s + 1 + 2) / 3 * 4 has int type while cnt unsigned int. This doesn't warn if ptrdiff_t has larger precision, say on x86_64 it is 64-bit and so (t - s + 1 + 2) / 3 * 4 has long type and cnt unsigned int. And it doesn't warn when using older binutils (in my tests I've used new binutils on x86_64 and old binutils on i686). Anyway, earlier condition guarantees that t - s is at most 256-ish and t >= s by construction, so we can just cast it to (unsigned) to avoid the warning. 2024-07-17 Jakub Jelinek PR other/115958 * varasm.cc (default_elf_asm_output_ascii): Cast t - s to unsigned to avoid -Wsign-compare warnings.
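The type issue described in this commit can be reduced to a few lines. The sketch below is not the code from varasm.cc: the helper names, the inclusive s..t range, and the direction of the comparison are my assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of base64 characters needed for the bytes
// s..t inclusive, i.e. ceil(n / 3) * 4 with n = t - s + 1.
static unsigned int base64_len(const char* s, const char* t) {
    // t - s has type ptrdiff_t. Without the cast, the expression is int
    // where ptrdiff_t is int and long where it is 64-bit. Casting first
    // makes the whole expression unsigned int on every target.
    return ((unsigned int)(t - s) + 1 + 2) / 3 * 4;
}

// Hypothetical comparison against an unsigned counter. Both operands are
// unsigned int, so -Wsign-compare has nothing to report.
static bool fits(const char* s, const char* t, unsigned int cnt) {
    return base64_len(s, t) <= cnt;
}
```

The cast is only safe because, as the commit message notes, t >= s holds by construction and the difference is small. A negative difference would wrap to a large unsigned value.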
[Bug tree-optimization/114966] fails to optimize avx2 in-register permute written with std::experimental::simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966 --- Comment #5 from Hongtao Liu --- I saw pass_eras optimize BIT_FIELD_REF of big memory into load from small memory Created a replacement for D.161366 offset: 0, size: 64: SR.20D.170101 Created a replacement for D.161366 offset: 64, size: 64: SR.21D.170102 Created a replacement for D.161366 offset: 128, size: 64: SR.22D.170103 Created a replacement for D.161547 offset: 0, size: 256: SR.23D.170104 _8 = BIT_FIELD_REF ; _9 = BIT_FIELD_REF ; _10 = BIT_FIELD_REF ; _11 = {0, _8, _9, _10}; to SR.20_3 = MEM [(struct simd *)&data]; SR.21_13 = MEM [(struct simd *)&data + 8B]; SR.22_14 = MEM [(struct simd *)&data + 16B]; _7 = SR.20_3; _8 = SR.21_13; _9 = SR.22_14; _10 = {0, _7, _8, _9}; So I guess for the later GCC somehow can't be sure the whole 256-bit memory is valid and fail to optimize it with vec_perm_expr?
[Bug tree-optimization/114966] fails to optimize avx2 in-register permute written with std::experimental::simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966 --- Comment #6 from rguenther at suse dot de --- On Wed, 17 Jul 2024, liuhongt at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966 > > --- Comment #5 from Hongtao Liu --- > I saw pass_eras optimize BIT_FIELD_REF of big memory into load from small > memory > > > Created a replacement for D.161366 offset: 0, size: 64: SR.20D.170101 > Created a replacement for D.161366 offset: 64, size: 64: SR.21D.170102 > Created a replacement for D.161366 offset: 128, size: 64: SR.22D.170103 > Created a replacement for D.161547 offset: 0, size: 256: SR.23D.170104 > > > _8 = BIT_FIELD_REF *)&D.159286].D.158970._M_data, 64, 0>; > _9 = BIT_FIELD_REF *)&D.159286].D.158970._M_data, 64, 64>; > _10 = BIT_FIELD_REF *)&D.159286].D.158970._M_data, 64, 128>; > _11 = {0, _8, _9, _10}; > > to > > SR.20_3 = MEM [(struct simd *)&data]; > SR.21_13 = MEM [(struct simd *)&data + 8B]; > SR.22_14 = MEM [(struct simd *)&data + 16B]; > _7 = SR.20_3; > _8 = SR.21_13; > _9 = SR.22_14; > _10 = {0, _7, _8, _9}; > > > So I guess for the later GCC somehow can't be sure the whole 256-bit memory is > valid and fail to optimize it with vec_perm_expr? I think the above would be a candidate for SLP vectorization of the vector CTOR. A specific example we don't handle right now of course. Or alternatively by instruction combination in simplify_vector_constructor.
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #70 from Andrew Church ---
(In reply to Jonathan Wakely from comment #69)
> Maybe you want:
>
> [[maybe_unused]] auto _ = foo();

If I could apply that attribute to the value itself, i.e.:

    [[maybe_unused]] foo();

that would do what I want. Since it only applies to a declaration, we're still left with idiosyncratic code ("auto _ =" instead of "(void)") which ideally should not be needed.
[Bug middle-end/115887] ICE: in gsi_insert_on_edge_immediate, at gimple-iterator.cc:849 with -O -fnon-call-exceptions -finstrument-functions and _BitInt()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115887 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #2 from Jakub Jelinek --- Created attachment 58692 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58692&action=edit gcc15-pr115887.patch Untested fix.
[Bug bootstrap/115951] [15 Regression] pgo+lto enabled bootstrap fails building gnat (ICE in fold_stmt, at gimple-range-fold.cc:701)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115951 --- Comment #5 from Matthias Klose --- A new build survived on x86_64-linux-gnu. Will wait for the results on other architectures.
[Bug target/115966] New: [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966 Bug ID: 115966 Summary: [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64 Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: needs-bisection, wrong-code Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: pheeck at gcc dot gnu.org Blocks: 26163 Target Milestone: --- Host: x86_64-linux Target: x86_64-linux Compiling the 403.gcc CPU SPEC 2006 benchmark with -g -Ofast -march=native -fpermissive (403.gcc no longer compiles without -fpermissive) on an x86_64 machine results in a miscompilation. Here is what the benchmark reports: *** Miscompare of scilab.s; for details see /home/gcc/buildworker/source/cpu2006/benchspec/CPU2006/403.gcc/run/run_peak_ref_amd64-m64-mine./scilab.s.mis 157474: .long 1764174565 .long 1248582442 ^ 157475: .long 1072684140 .long 1072430610 ^ 157476: .long 103477976 .long 3882853149 ^ 157477: .long 1072638526 .long 1072384995 ^ 157478: .long 2763002310 .long 3786780864 ^ 157479: .long 1072546777 .long 1072746777 ^ 157484: .long 579723672 .long 1095315795 ^ 157485: .long 1072020921 .long 1072274451 ^ 157486: .long 1075822880 .long 1120461514 ^ 157487: .long 1071776283 .long 1071400834 ^ I've seen this on AMD Zen4, Zen3 and Zen2 machines and on an Intel Ice Lake machine. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 [Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966 Filip Kastl changed: What|Removed |Added Target Milestone|--- |15.0
[Bug tree-optimization/115959] [15 Regression] rv64gcv ICE: segfault during GIMPLE pass: vect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115959 --- Comment #3 from GCC Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:24689b84b8ec0c74c2b9a72ec4fb467069806bda commit r15-2093-g24689b84b8ec0c74c2b9a72ec4fb467069806bda Author: Richard Biener Date: Wed Jul 17 11:42:13 2024 +0200 tree-optimization/115959 - ICE with SLP condition reduction The following fixes how during reduction epilogue generation we gather conditional compares for condition reductions, thereby following the reduction chain via STMT_VINFO_REDUC_IDX. The issue is that SLP nodes for COND_EXPRs can have either three or four children dependent on whether we have legacy GENERIC expressions in the transitional pattern GIMPLE for the COND_EXPR condition. PR tree-optimization/115959 * tree-vect-loop.cc (vect_create_epilog_for_reduction): Get at the REDUC_IDX child in a safer way for COND_EXPR nodes. * gcc.dg/vect/pr115959.c: New testcase.
[Bug tree-optimization/115959] [15 Regression] rv64gcv ICE: segfault during GIMPLE pass: vect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115959 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #4 from Richard Biener --- Fixed.
[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966 --- Comment #1 from Richard Biener --- I think we have a few other gcc miscompiles now (but from SPEC CPU 2017), so this one is probably related. Does -fno-strict-aliasing fix it?
[Bug target/80881] Implement Windows native TLS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80881 --- Comment #24 from Julian Waters --- Thanks for the patch, I've been looking through it these past few days. While the simpler parts of it I can manage, I'm struggling terribly with understanding the RTL shifting code in legitimize_tls_address and the RTL templates in the machine definitions file (i386.md to be specific). Do you happen to know how to read the RTL code in the patch? I definitely need some help with figuring out how it works mechanically
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #71 from Petr Skocik --- An Ignore macro that works everywhere a (void) cast syntactically works (i.e., even on void types, for whatever reason) is easy:

    #define IGN$(Val) (__extension__({ \
        __auto_type IGN$ = _Generic((typeof(Val)*)0, \
            void*: ((void)(Val),0), default: Val); (void)IGN$; })) ///

    __attribute((warn_unused_result)) int getInt(void);
    void getVoid(void);
    void ign_test(void){
        getInt(); //warning
        getVoid(); //no warning
        (void)getInt(); //traditionally with a warning
        (void)getVoid(); //no warning
        IGN$(getInt()); //no warning
        IGN$(getVoid()); //no warning
    }

https://godbolt.org/z/4qa8TcWMM

(This can easily be done without __auto_type (use typeof instead) or without (__extension__({ })) too (use do ;while(0)).)

Would strongly prefer if the current semantics of warn_unused_result were not broken by a late "correction". The time for a discussion on the semantics of warn_unused_result in combination with a void cast is long gone. It has long been established that simple (void) casts do NOT silence warn_unused_result. Let's not break code that expects such semantics. A conditional compiler flag to enable void casts to silence WUR might be in order, however, considering that clang disregards the established semantics and a void cast does silence WUR on clang (https://godbolt.org/z/4qa8TcWMM).
[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966 --- Comment #2 from Filip Kastl --- (In reply to Richard Biener from comment #1) > I think we have a few other gcc miscompiles now (but from SPEC CPU 2017), so > this one is probably related. Does -fno-strict-aliasing fix it? Just tested it, -fno-strict-aliasing doesn't help
[Bug middle-end/115967] New: ubsan: shift exponent 64 is too large for 64-bit type HOST_WIDE_INT in ext-dce.cc on line 600 since r15-1901-g98914f9eba5f19
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115967 Bug ID: 115967 Summary: ubsan: shift exponent 64 is too large for 64-bit type HOST_WIDE_INT in ext-dce.cc on line 600 since r15-1901-g98914f9eba5f19 Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: jamborm at gcc dot gnu.org CC: law at gcc dot gnu.org Blocks: 63426 Target Milestone: --- Host: x86_64-linux Target: x86_64-linux

Undefined behavior sanitizer reports a failure when running Fortran testcase gfortran.dg/ieee/large_1.f90 at -O2 and higher:

    /home/mjambor/gcc/mine/src/gcc/ext-dce.cc:600:15: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int'
    /home/mjambor/gcc/mine/src/gcc/ext-dce.cc:404:23: runtime error: left shift of negative value -1
    FAIL: gfortran.dg/ieee/large_1.f90 -O2 (test for excess errors)

The failure is present since the introduction of the source file ext-dce.cc with commit r15-1901-g98914f9eba5f19 (Jeff Law: [to-be-committed][RISC-V][V3] DCE analysis for extension elimination)

One way to reproduce the issue is to bootstrap GCC with Fortran enabled and with --with-build-config=bootstrap-ubsan and then run the test case as usual. It is however much easier to (on an x86_64-linux at least) simply apply the following patch and then run

    make -k check-gfortran RUNTESTFLAGS="ieee.exp=large_1.f90"

    --- a/gcc/ext-dce.cc
    +++ b/gcc/ext-dce.cc
    @@ -597,6 +597,7 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap live_tmp)
           bit = subreg_lsb (y).to_constant ();
           if (dst_mask)
             {
    +          gcc_assert (bit < 64);
               dst_mask <<= bit;
               if (!dst_mask)
                 dst_mask = -0x1ULL;

Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426 [Bug 63426] [meta-bug] Issues found with -fsanitize=undefined
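The report above comes down to a left shift whose count equals the width of the operand, which is undefined behavior in C and C++. The following is only a minimal sketch of how such a shift can be guarded; it is not the fix applied to ext-dce.cc. The helper names `shift_mask_left` and `shifted_or_all_ones` are hypothetical, and the all-ones fallback merely mirrors the `if (!dst_mask) dst_mask = -0x1ULL;` lines visible in the diff context.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only: shift a 64-bit mask left by
// `bit` positions without ever executing a shift whose count is >= 64,
// which would be undefined behavior for a 64-bit operand.
inline std::uint64_t shift_mask_left(std::uint64_t mask, unsigned bit)
{
    // A count of 64 or more would move every bit out of the value, so
    // return 0 explicitly instead of evaluating `mask << bit`.
    if (bit >= 64)
        return 0;
    return mask << bit;
}

// Mirrors the shape of the code in the diff context: the caller only gets
// here with a non-empty mask, and an empty result falls back to all ones.
inline std::uint64_t shifted_or_all_ones(std::uint64_t mask, unsigned bit)
{
    assert(mask != 0);  // corresponds to the `if (dst_mask)` guard
    const std::uint64_t shifted = shift_mask_left(mask, bit);
    return shifted != 0 ? shifted : ~std::uint64_t{0};
}
```

With this shape, a count of 64 takes the explicit branch and is handled like any other shift that clears the mask, so the sanitizer has nothing to report.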
[Bug c++/115965] Stack smashing depending on order of declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965 --- Comment #3 from Richard Biener --- I think that's somewhat expected.
[Bug c++/115965] Stack smashing depending on order of declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965 --- Comment #4 from Jonathan Wakely --- Just to be clear on the bug being reported, are you expecting the error to be detected in all cases?
[Bug c++/115965] Stack smashing depending on order of declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965 --- Comment #5 from Nathan Teodosio --- In none of them. Or am I overlooking a buffer overrun here? Also with Clang I get no stack smashing even with -fstack-protector-all. In any case I fail to see why that would be dependent on which of the array definitions in main come first.
[Bug c++/115963] P3144R2 is not yet completely implemented with a nullptr constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115963 --- Comment #2 from Jiang An --- > The question becomes is that an oversight of P3144R2 or not and should a null > pointer > constant be valid always since that was never undefined or even had a chance > of being > undefined. CWG2392 (https://cplusplus.github.io/CWG/issues/2392.html) might be related. In order to keep deleting (constant) null pointer well-formed, we may need to choose one of following strategies: 1. Make a delete-expression whose operand (after conversion, if any) is a pointer to an incomplete class type potentially constant evaluated. So that we can distinguish constant null pointer operand in unevaluated operands (e.g. in decltype). 2. Always accept such a "bad" delete-expression in unevaluated operands. Both options reject non-constant and constant but non-null operands. It seems that GCC is behaving like option 1 (rejecting `decltype(delete std::declval())` while accepting `delete static_cast(nullptr)`). However, it doesn't seem possible to make it conforming to accept potentially-evaluated `delete reinterpret_cast(std::uintptr_t{})`, because the operand is not a constant (sub)expression. On the other hand, I tend to believe "the object being deleted has incomplete class type" is equivalent to "the pointed to type of the operand is an incomplete class type", but it's also not very clear whether such reading is correct.
[Bug fortran/84246] [11/12/13/14/15 Regression] [Coarray] ICE in conv_caf_send, at fortran/trans-intrinsic.c:1950
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84246 Andre Vehreschild changed: What|Removed |Added Status|ASSIGNED|WAITING --- Comment #9 from Andre Vehreschild --- Patch proposed: https://gcc.gnu.org/pipermail/fortran/2024-July/060692.html Waiting for review.
[Bug c/106800] Expose more vector extensions in C
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106800 --- Comment #2 from Richard Biener --- I have something in the works.
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #72 from Segher Boessenkool --- The correct way to not get the warning about unused results, is to _do_ use the function return value, of course, as I explained in #c18 already. Like: if (foo()) { /* The return value of foo can be ignored here because X and Y. */ } You *always* should explain why you can ignore it here (not just *that* you can, that's not an explanation, that's merely a statement), anyway, so this gives a nicely readable flow.
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #73 from Zdenek Sojka --- See MISRA C:2012 Rule 17.7: "... If the return value of a function is intended not to be used explicitly, it should be cast to the void type. ..." It would be helpful if gcc could be used to write MISRA-compliant code, or at least if it wouldn't generate compilation warnings when the programmer is targeting MISRA-compliancy. I understand this might be a very complex topic, due to gcc being free software, and other companies possibly providing / selling the tool qualification.
[Bug tree-optimization/114440] Fail to recognize a chain of lane-reduced operations for loop reduction vect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114440 --- Comment #2 from GCC Commits --- The master branch has been updated by Feng Xue : https://gcc.gnu.org/g:178cc419512f7e358f88dfe2336625aa99cd7438 commit r15-2096-g178cc419512f7e358f88dfe2336625aa99cd7438 Author: Feng Xue Date: Wed May 29 17:22:36 2024 +0800 vect: Support multiple lane-reducing operations for loop reduction [PR114440] For lane-reducing operation(dot-prod/widen-sum/sad) in loop reduction, current vectorizer could only handle the pattern if the reduction chain does not contain other operation, no matter the other is normal or lane-reducing. This patches removes some constraints in reduction analysis to allow multiple arbitrary lane-reducing operations with mixed input vectypes in a loop reduction chain. For example: int sum = 1; for (i) { sum += d0[i] * d1[i]; // dot-prod sum += w[i]; // widen-sum sum += abs(s0[i] - s1[i]); // sad } The vector size is 128-bit vectorization factor is 16. Reduction statements would be transformed as: vector<4> int sum_v0 = { 0, 0, 0, 1 }; vector<4> int sum_v1 = { 0, 0, 0, 0 }; vector<4> int sum_v2 = { 0, 0, 0, 0 }; vector<4> int sum_v3 = { 0, 0, 0, 0 }; for (i / 16) { sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = WIDEN_SUM (w_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = SAD (s0_v0[i: 0 ~ 7 ], s1_v0[i: 0 ~ 7 ], sum_v0); sum_v1 = SAD (s0_v1[i: 8 ~ 15], s1_v1[i: 8 ~ 15], sum_v1); sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy } sum_v = sum_v0 + sum_v1 + sum_v2 + sum_v3; // = sum_v0 + sum_v1 2024-03-22 Feng Xue gcc/ PR tree-optimization/114440 * tree-vectorizer.h (vectorizable_lane_reducing): New function declaration. * tree-vect-stmts.cc (vect_analyze_stmt): Call new function vectorizable_lane_reducing to analyze lane-reducing operation. 
* tree-vect-loop.cc (vect_model_reduction_cost): Remove cost computation code related to emulated_mixed_dot_prod. (vectorizable_lane_reducing): New function. (vectorizable_reduction): Allow multiple lane-reducing operations in loop reduction. Move some original lane-reducing related code to vectorizable_lane_reducing. (vect_transform_reduction): Adjust comments with updated example. gcc/testsuite/ PR tree-optimization/114440 * gcc.dg/vect/vect-reduc-chain-1.c * gcc.dg/vect/vect-reduc-chain-2.c * gcc.dg/vect/vect-reduc-chain-3.c * gcc.dg/vect/vect-reduc-chain-dot-slp-1.c * gcc.dg/vect/vect-reduc-chain-dot-slp-2.c * gcc.dg/vect/vect-reduc-chain-dot-slp-3.c * gcc.dg/vect/vect-reduc-chain-dot-slp-4.c * gcc.dg/vect/vect-reduc-dot-slp-1.c
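The layout described in the commit message can be modelled in plain scalar code. The sketch below is not vectorizer output: each `sum_vN` vector is collapsed to a single `int`, the element types (`int8_t`, `int16_t`, `uint8_t`) are assumptions chosen only so that the three operations widen to `int`, and the function names are hypothetical. It follows the commit's example in that the dot-product and widen-sum feed the first accumulator, while the SAD of each block of 16 elements is split between the first two.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Scalar reference: one accumulator carries the whole reduction chain.
inline int reduce_chain_scalar(const std::int8_t* d0, const std::int8_t* d1,
                               const std::int16_t* w,
                               const std::uint8_t* s0, const std::uint8_t* s1,
                               std::size_t n)
{
    int sum = 1;
    for (std::size_t i = 0; i < n; ++i) {
        sum += d0[i] * d1[i];            // dot-prod
        sum += w[i];                     // widen-sum
        sum += std::abs(s0[i] - s1[i]);  // sad
    }
    return sum;
}

// Same reduction with four partial sums standing in for sum_v0..sum_v3.
inline int reduce_chain_split(const std::int8_t* d0, const std::int8_t* d1,
                              const std::int16_t* w,
                              const std::uint8_t* s0, const std::uint8_t* s1,
                              std::size_t n)
{
    // Only the first partial sum carries the initial value 1.
    std::array<int, 4> acc = {1, 0, 0, 0};
    constexpr std::size_t vf = 16;  // vectorization factor from the example
    std::size_t i = 0;
    for (; i + vf <= n; i += vf) {
        for (std::size_t k = 0; k < vf; ++k) {
            acc[0] += d0[i + k] * d1[i + k];  // DOT_PROD into sum_v0
            acc[0] += w[i + k];               // WIDEN_SUM into sum_v0
            // SAD: elements 0..7 feed sum_v0, elements 8..15 feed sum_v1.
            acc[k < 8 ? 0 : 1] += std::abs(s0[i + k] - s1[i + k]);
        }
        // acc[2] and acc[3] are plain copies in this layout.
    }
    // Scalar tail for element counts that are not a multiple of 16.
    for (; i < n; ++i)
        acc[0] += d0[i] * d1[i] + w[i] + std::abs(s0[i] - s1[i]);
    // Final reduction across the partial sums.
    return acc[0] + acc[1] + acc[2] + acc[3];
}
```

Because integer addition is associative, both functions return the same value for inputs that do not overflow `int`; the split form only changes which accumulator each contribution lands in.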
[Bug tree-optimization/114440] Fail to recognize a chain of lane-reduced operations for loop reduction vect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114440 --- Comment #3 from GCC Commits --- The master branch has been updated by Feng Xue : https://gcc.gnu.org/g:db3c8c9726d0bafbb9f85b6d7027fe83602643e7 commit r15-2097-gdb3c8c9726d0bafbb9f85b6d7027fe83602643e7 Author: Feng Xue Date: Wed May 29 17:28:14 2024 +0800 vect: Optimize order of lane-reducing operations in loop def-use cycles When transforming multiple lane-reducing operations in a loop reduction chain, originally, corresponding vectorized statements are generated into def-use cycles starting from 0. The def-use cycle with smaller index, would contain more statements, which means more instruction dependency. For example: int sum = 1; for (i) { sum += d0[i] * d1[i]; // dot-prod sum += w[i]; // widen-sum sum += abs(s0[i] - s1[i]); // sad sum += n[i]; // normal } Original transformation result: for (i / 16) { sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = WIDEN_SUM (w_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = SAD (s0_v0[i: 0 ~ 7 ], s1_v0[i: 0 ~ 7 ], sum_v0); sum_v1 = SAD (s0_v1[i: 8 ~ 15], s1_v1[i: 8 ~ 15], sum_v1); sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy ... } For a higher instruction parallelism in final vectorized loop, an optimal means is to make those effective vector lane-reducing ops be distributed evenly among all def-use cycles. Transformed as the below, DOT_PROD, WIDEN_SUM and SADs are generated into disparate cycles, instruction dependency among them could be eliminated. 
for (i / 16) { sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0); sum_v1 = sum_v1; // copy sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = sum_v0; // copy sum_v1 = WIDEN_SUM (w_v1[i: 0 ~ 15], sum_v1); sum_v2 = sum_v2; // copy sum_v3 = sum_v3; // copy sum_v0 = sum_v0; // copy sum_v1 = sum_v1; // copy sum_v2 = SAD (s0_v2[i: 0 ~ 7 ], s1_v2[i: 0 ~ 7 ], sum_v2); sum_v3 = SAD (s0_v3[i: 8 ~ 15], s1_v3[i: 8 ~ 15], sum_v3); ... } 2024-03-22 Feng Xue gcc/ PR tree-optimization/114440 * tree-vectorizer.h (struct _stmt_vec_info): Add a new field reduc_result_pos. * tree-vect-loop.cc (vect_transform_reduction): Generate lane-reducing statements in an optimized order.
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #74 from Florian Weimer --- (In reply to Zdenek Sojka from comment #73) > See MISRA C:2012 Rule 17.7: > "... If the return value of a function is intended not to be used > explicitly, it should be cast to the void type. ..." > > It would be helpful if gcc could be used to write MISRA-compliant code, or > at least if it wouldn't generate compilation warnings when the programmer is > targeting MISRA-compliancy. Doesn't this (interpretation of MISRA) mean that compliant code cannot use __attribute__((warn_unused_result))? That doesn't require any GCC changes.
[Bug c++/115968] New: g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968 Bug ID: 115968 Summary: g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1) Product: gcc Version: 14.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: summersnow9403 at gmail dot com Target Milestone: --- The code example can be found on godbolt: https://godbolt.org/z/qPs4zhb3j The two functions test_eigen and test_eigen2 are expected to output the same result. However, when compiled with gcc 12 and above versions with -O2, the program prints all zeros for test_eigen. When compiled with gcc 12 and above without -O2, or any versions of clang with -O2, the program behaves correctly.
[Bug c++/115968] g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Andrew Pinski --- const auto x See https://eigen.tuxfamily.org/dox/TopicPitfalls.html
[Bug c++/115968] g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968 --- Comment #2 from Andrew Pinski --- See C++11 and the auto keyword section of https://eigen.tuxfamily.org/dox/TopicPitfalls.html The problem is that eval() returns a temporary object (in this case a MatrixXd) which is then referenced by the Transpose<> expression. However, this temporary is deleted right after the first line, and then the C expression references a dead object.
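The lifetime problem described in this comment can be reproduced in miniature without Eigen. The sketch below uses hypothetical stand-in types (`Mat2`, `TransposeView`, `add`), not Eigen's API: a lazy view stores only a pointer to the matrix it reads from, so it must not outlive that matrix. The problematic shape is left as a comment on purpose, because executing it would be undefined behavior.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a small dense matrix (row-major 2x2).
struct Mat2 {
    std::array<double, 4> v{};
    double operator()(std::size_t r, std::size_t c) const { return v[r * 2 + c]; }
};

// A lazy "expression" that only remembers which matrix to read from.
class TransposeView {
public:
    explicit TransposeView(const Mat2& m) : m_(&m) {}
    double operator()(std::size_t r, std::size_t c) const { return (*m_)(c, r); }
private:
    const Mat2* m_;  // non-owning: the viewed matrix must outlive the view
};

// Returns its result by value, so a call expression yields a temporary.
inline Mat2 add(const Mat2& a, const Mat2& b)
{
    Mat2 out;
    for (std::size_t i = 0; i < out.v.size(); ++i)
        out.v[i] = a.v[i] + b.v[i];
    return out;
}

// Problematic shape (not compiled here):
//   auto view = TransposeView(add(a, b));  // the temporary dies at the ';'
//   double x = view(0, 1);                 // reads a dead object

// Safe shape: keep the evaluated sum alive in a named object first.
inline double transposed_sum_at(const Mat2& a, const Mat2& b,
                                std::size_t r, std::size_t c)
{
    const Mat2 sum = add(a, b);     // owns the data
    const TransposeView view(sum);  // the view refers to a live object
    return view(r, c);
}
```

The same reasoning applies to the reported code: naming the concrete matrix type for the evaluated intermediate, instead of letting `auto` deduce an expression type that refers to a temporary, keeps the data alive for as long as it is read.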
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #75 from Zdenek Sojka --- (In reply to Florian Weimer from comment #74) > (In reply to Zdenek Sojka from comment #73) > > See MISRA C:2012 Rule 17.7: > > "... If the return value of a function is intended not to be used > > explicitly, it should be cast to the void type. ..." > > > > It would be helpful if gcc could be used to write MISRA-compliant code, or > > at least if it wouldn't generate compilation warnings when the programmer is > > targeting MISRA-compliancy. > > Doesn't this (interpretation of MISRA) mean that compliant code cannot use > __attribute__((warn_unused_result))? That doesn't require any GCC changes. With the current state of things, yes. MISRA suggests adding the (void) cast, that does not suppress the warning. For me the ideal state would be to have a -Wwarn-any-unused-result, to consider all functions as having the "__attribute__((__warn_unused_result__))" attribute, with the option of (void) cast to suppress the warning. Just the following sentence of the MISRA C:2012 explains the "(void)" cast as the way of preventing dead code; the example from comment #72: > if (foo()) { > /* The return value of foo can be ignored here because X and Y. */ > } creates a condition that needs to be covered, even though it might not be possible to trigger either FALSE or TRUE outcome. Of course there are other solutions (and possible justifications); I just wanted to show that the (void) cast of unused function result might not be that uncommon.
[Bug c++/115968] g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968 --- Comment #3 from HanatoK --- (In reply to Andrew Pinski from comment #2) > See C++11 and the auto keyword section of > https://eigen.tuxfamily.org/dox/TopicPitfalls.html > > > The problem is that eval() returns a temporary object (in this case a > MatrixXd) which is then referenced by the Transpose<> expression. However, > this temporary is deleted right after the first line, and then the C > expression references a dead object. Thanks! I can confirm that with the clang sanitizers "-fsanitize=address,undefined,leak" I encounter the "stack-use-after-scope" error, so the clang result is actually printed from out-of-scope variables. I think you are right that I misuse the auto keyword.
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #76 from Jakub Jelinek --- (void) casts not quieting the warning was an intentional request when the warning has been added, I really don't think it is a good idea to change that. The fact that clang people can't properly implement Perhaps you can ask glibc to recategorize some of the declarations to use [[nodiscard]] instead of __attribute__((__warn_unused_result__)), IMHO it is helpful to have different badnesses of ignoring the result, WUR attribute should be used for the cases where it is always or pretty much always a very severe bug, while nodiscard can be used for the lighter cases (using the result is nice to have, but usually nothing wrong will happen if it is ignored). E.g. ignoring return value of realloc is pretty much always a bad idea and just (void) realloc (...); is something that shouldn't be supported.
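The distinction drawn in this comment can be sketched as follows. The function names are hypothetical and the bodies are placeholders; the commented-out lines describe GCC's behavior as discussed in this PR and are not executed. The sketch assumes a GNU-compatible C++17 compiler.

```cpp
// Hypothetical function whose result should essentially never be ignored.
// With the GNU attribute, GCC warns even when the call is cast to void.
__attribute__((warn_unused_result)) inline int must_check_status()
{
    return 0;
}

// Hypothetical function whose result is merely nice to look at.
// With the standard attribute, a cast to void discards it quietly.
[[nodiscard]] inline int should_check_status()
{
    return 0;
}

inline int demo()
{
    // must_check_status();        // GCC: warning, result unused
    // (void)must_check_status();  // GCC: still warns, as discussed in this PR
    const int rc = must_check_status();  // using the value satisfies the attribute

    // should_check_status();      // warning, result discarded
    (void)should_check_status();   // explicit cast to void: no warning

    return rc;
}
```

This is the split suggested above: the stricter attribute for calls where ignoring the result is almost always a bug, the standard one for lighter cases.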
[Bug rtl-optimization/115876] [15 regression] ext-dce.cc has ubsan issues; shifting negative values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115876 Andrew Pinski changed: What|Removed |Added CC||jamborm at gcc dot gnu.org --- Comment #10 from Andrew Pinski --- *** Bug 115967 has been marked as a duplicate of this bug. ***
[Bug middle-end/115967] ubsan: shift exponent 64 is too large for 64-bit type HOST_WIDE_INT in ext-dce.cc on line 600 since r15-1901-g98914f9eba5f19
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115967 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Andrew Pinski --- Already recorded as PR 115876 . *** This bug has been marked as a duplicate of bug 115876 ***
[Bug other/63426] [meta-bug] Issues found with -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426 Bug 63426 depends on bug 115967, which changed state. Bug 115967 Summary: ubsan: shift exponent 64 is too large for 64-bit type HOST_WIDE_INT in ext-dce.cc on line 600 since r15-1901-g98914f9eba5f19 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115967 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE
[Bug c++/111890] ICE in tsubst_friend_function with friend function declared inside a concept constrainted class inside a template class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111890 --- Comment #7 from GCC Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:247335823f420eb1dd56f4bf32ac78d441f5ccc2 commit r15-2098-g247335823f420eb1dd56f4bf32ac78d441f5ccc2 Author: Patrick Palka Date: Wed Jul 17 11:08:35 2024 -0400 c++: constrained partial spec type context [PR111890] maybe_new_partial_specialization wasn't propagating TYPE_CONTEXT when creating a new class type corresponding to a constrained partial spec, which do_friend relies on via template_class_depth to distinguish a template friend from a non-template friend, and so in the below testcase we were incorrectly instantiating the non-template operator+ as if it were a template leading to an ICE. PR c++/111890 gcc/cp/ChangeLog: * pt.cc (maybe_new_partial_specialization): Propagate TYPE_CONTEXT to the newly created partial specialization. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-partial-spec15.C: New test. Reviewed-by: Jason Merrill
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #77 from Andrew Church --- (In reply to Segher Boessenkool from comment #72) > if (foo()) { > /* The return value of foo can be ignored here because X and Y. */ > } This is just another idiom, with "if(){}" replacing "(void)"; it does not directly indicate that the value is unused, as a hypothetical [[discard]] would do. I would even argue that it is worse because (as Zdenek points out) it adds a branch which would either need to be tested, potentially requiring additional failure injection logic to trigger the failing case, or documented as not needing to be covered by a test. In general, I would consider any code structure with no behavioral effect but a semantic side-effect (including casting to void, assigning to an unused variable, or testing in a conditional with an empty block) a code smell, and would prefer an explicit [[discard]] to make the intent clear. Given that we have no [[discard]], I still hold that cast-to-void is the best existing option due both to conciseness and to widespread recognition of its intent. There's also an argument to be made that allowing the warning to be bypassed with if(){} or assignment to an unused variable is weakening the original intent behind WUR, as Jakub mentions. (In reply to Jakub Jelinek from comment #76) > (void) casts not quieting the warning was an intentional request when the > warning has been added, I really don't think it is a good idea to change > that. This is why I initially suggested a compiler option (-Wunused-result=strict) to select the behavior. It could of course be coded in reverse, defaulting to the current behavior and having e.g. -Wunused-result=lax to inhibit WUR warnings. 
The fundamental problem with the request behind this feature (in particular, with the fact that the request comes from a library author) is that the end user of the compiler is the library user, not the library author, and if the end user considers the warnings useless, they will find one or another way around them, however much collateral damage (in the form of missed errors) that may cause. Given that, I think it's reasonable to offer a middle-ground option that lets the end user reject the library author's original intent of forcing return value usage but retain the ability to check for accidentally unused return values.
[Bug c++/115965] Stack smashing depending on order of declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965 Andrew Pinski changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #6 from Andrew Pinski --- (In reply to Nathan Teodosio from comment #5) > In none of them. Or am I overlooking a buffer overrun here? You definitely are overlooking one.

    for (size_t i = 0; i < size; i += hn::Lanes(d)) {
        hn::Store(x, d, x_array + i);

hn::Lanes(d) is 4, so you are storing elements 0,1,2,3 and then 4,5,6,7. But x_array has only 5 elements, so the stores to 5,6,7 are out of bounds. >In any case I fail to see why that would be dependent on which of the array >definitions in main come first. Because -fstack-protector-all only checks one place in the stack rather than after each array. As a tie breaker, the arrays are laid out on the stack in the order in which the user declared them, so you only get the stack smashing error when the overrun array happens to be at the end. With -fsanitize=address all arrays have a redzone and you get the following error message, which is independent of the order of the arrays since all loads and stores are checked.
= ==1==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x782bede00030 at pc 0x00401b3e bp 0x7ffeb05deab0 sp 0x7ffeb05deaa8 READ of size 16 at 0x782bede00030 thread T0 #0 0x401b3d in _mm_load_si128(long long __vector(2) const*) /opt/compiler-explorer/gcc-trunk-20240717/lib/gcc/x86_64-linux-gnu/15.0.0/include/emmintrin.h:701 #1 0x401b3d in Load > /opt/compiler-explorer/libs/highway/trunk/hwy/ops/x86_128-inl.h:2069 #2 0x401311 in MulAddLoop(int const*, int const*, unsigned long, int*) /app/example.cpp:11 #3 0x401954 in main /app/example.cpp:22 #4 0x782befa29d8f (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 490fef8403240c91833978d494d39e537409b92e) #5 0x782befa29e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 490fef8403240c91833978d494d39e537409b92e) #6 0x401104 in _start (/app/output.s+0x401104) (BuildId: 4af5893bdf93a048dba77151f2e0b5e5a0ee46bd) Address 0x782bede00030 is located in stack of thread T0 at offset 48 in frame #0 0x4014cf in main /app/example.cpp:18 This frame has 3 object(s): [32, 52) 'a' (line 19) <== Memory access at offset 48 partially overflows this variable [96, 116) 'b' (line 19) [160, 180) 'c' (line 20) HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: stack-buffer-overflow /opt/compiler-explorer/gcc-trunk-20240717/lib/gcc/x86_64-linux-gnu/15.0.0/include/emmintrin.h:701 in _mm_load_si128(long long __vector(2) const*) Shadow bytes around the buggy address: 0x782beddffd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x782beddffe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x782beddffe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x782beddfff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x782beddfff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0x782bede0: f1 f1 f1 f1 00 00[04]f2 f2 f2 f2 f2 00 00 04 f2 0x782bede00080: f2 f2 f2 f2 00 00 04 f3 
f3 f3 f3 f3 00 00 00 00 0x782bede00100: f1 f1 f1 f1 f1 f1 01 f2 00 00 f2 f2 f8 f8 f2 f2 0x782bede00180: f8 f8 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 0x782bede00200: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 0x782bede00280: f5 f5 f5 f5 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user:f7 Container overflow: fc Array cookie:ac Intra object redzone:bb ASan internal: fe Left alloca redzone: ca Right alloca redzone:cb ==1==ABORTING
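The overrun diagnosed above follows from stepping through a 5-element array in chunks of 4. The sketch below is plain C++ rather than Highway code: `kLanes` stands in for `hn::Lanes(d)`, the function names are hypothetical, and the `mul * x + add` update is an assumption based on the `MulAddLoop` name in the backtrace. It shows how far a chunked loop reaches and one common way to stay in bounds, a scalar loop for the remainder.

```cpp
#include <cstddef>

// Assumption: matches hn::Lanes(d) == 4 from the comment above.
constexpr std::size_t kLanes = 4;

// How many elements a loop touches if it always handles kLanes at a time,
// as the reported loop does. For size 5 this reaches past the array.
inline std::size_t elements_touched_by_full_vectors(std::size_t size)
{
    std::size_t touched = 0;
    for (std::size_t i = 0; i < size; i += kLanes)
        touched = i + kLanes;
    return touched;
}

// Bounds-respecting variant: whole chunks first, then a scalar remainder.
inline void mul_add_loop_checked(const int* mul, const int* add,
                                 std::size_t size, int* x)
{
    std::size_t i = 0;
    // Only enter the chunked loop while a full chunk still fits.
    for (; i + kLanes <= size; i += kLanes)
        for (std::size_t lane = 0; lane < kLanes; ++lane)
            x[i + lane] = mul[i + lane] * x[i + lane] + add[i + lane];
    // Handle the leftover elements one at a time.
    for (; i < size; ++i)
        x[i] = mul[i] * x[i] + add[i];
}
```

For `size == 5` the first function returns 8, which corresponds to the elements 5, 6 and 7 named in the comment; the checked variant touches exactly 5.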
[Bug c++/115965] Stack smashing depending on order of declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965 --- Comment #7 from Andrew Pinski --- Note that valgrind cannot always catch buffer overruns in this case because it cannot easily add a redzone (a buffer used to detect overruns) around stack arrays. This is why -fsanitize=address is more powerful than both of the other two here.
[Bug tree-optimization/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 --- Comment #7 from GCC Commits --- The master branch has been updated by Tamar Christina : https://gcc.gnu.org/g:0135a90de5a99b51001b6152d8b548151ebfa1c3 commit r15-2099-g0135a90de5a99b51001b6152d8b548151ebfa1c3 Author: Tamar Christina Date: Wed Jul 17 16:22:14 2024 +0100 middle-end: fix 0 offset creation and folding [PR115936] As shown in PR115936 SCEV and IVOPTS create an invalid IV when the IV is a pointer type: ivtmp.39_65 = ivtmp.39_59 + 0B; where the IVs are DI mode and the offset is a pointer. This comes from this weird candidate: Candidate 8: Var befor: ivtmp.39_59 Var after: ivtmp.39_65 Incr POS: before exit test IV struct: Type: sizetype Base: 0 Step: 0B Biv:N Overflowness wrto loop niter: No-overflow This IV was always created; it just ended up not being used. This is created by SCEV. simple_iv_with_niters in the case where no CHREC is found creates an IV with base == ev, offset == 0; however in this case EV is a POINTER_PLUS_EXPR and so the type is a pointer. It ends up creating an unusable expression. gcc/ChangeLog: PR tree-optimization/115936 * tree-scalar-evolution.cc (simple_iv_with_niters): Use sizetype for pointers.
[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966 Andrew Pinski changed: What|Removed |Added CC||pinskia at gcc dot gnu.org --- Comment #3 from Andrew Pinski --- Does -fno-ext-dce fix it? There are a few bugs that Jeff has been working on in ext-dce, so I wonder if this might be one of them. Also, do you have a last known working revision?
[Bug tree-optimization/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #8 from Tamar Christina --- Fixed, thanks for the report. Bug is latent on branches so won't backport for now.
[Bug middle-end/115887] ICE: in gsi_insert_on_edge_immediate, at gimple-iterator.cc:849 with -O -fnon-call-exceptions -finstrument-functions and _BitInt()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115887 --- Comment #3 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:5104fe4c7808a66ed3041a8da8e4720585cc8a1f commit r15-2101-g5104fe4c7808a66ed3041a8da8e4720585cc8a1f Author: Jakub Jelinek Date: Wed Jul 17 17:32:21 2024 +0200 bitint: Use gsi_insert_on_edge rather than gsi_insert_on_edge_immediate [PR115887] The following testcase ICEs on x86_64-linux, because we try to gsi_insert_on_edge_immediate a statement on an edge which already has statements queued with gsi_insert_on_edge, and the deferral has been intentional so that we don't need to deal with cfg changes in between. The following patch uses the delayed insertion as well. 2024-07-17 Jakub Jelinek PR middle-end/115887 * gimple-lower-bitint.cc (gimple_lower_bitint): Use gsi_insert_on_edge instead of gsi_insert_on_edge_immediate and set edge_insertions to true. * gcc.dg/bitint-108.c: New test.
[Bug c++/115754] [14 Regression] C++26 ICE on constexpr new
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115754 --- Comment #4 from GCC Commits --- The releases/gcc-14 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:297ea7e5bb5c4d92cf3fe29182d432694de858cc commit r14-10445-g297ea7e5bb5c4d92cf3fe29182d432694de858cc Author: Jakub Jelinek Date: Tue Jul 2 22:09:58 2024 +0200 c++: Fix ICE on constexpr placement new [PR115754] C++26 is making in P2747R2 paper placement new constexpr. While working on a patch for that, I've noticed we ICE starting with GCC 14 on the following testcase. The problem is that e.g. for the void * to sometype * casts checks, we really assume the casts have their operand constant evaluated as prvalue, but on the testcase the cast itself is evaluated with vc_discard and that means op can end up e.g. a VAR_DECL which the later code doesn't like and asserts on. If the result type is void, we don't really need the cast operand for anything, so can use vc_discard for the recursive call, VIEW_CONVERT_EXPR can appear on the lhs, so we need to honor the lval but otherwise the patch uses vc_prvalue. I'd like to get this patch in before the rest of P2747R2 implementation, so that it can be backported to 14.2 later on. 2024-07-02 Jakub Jelinek Jason Merrill PR c++/115754 * constexpr.cc (cxx_eval_constant_expression) : For conversions to void, pass vc_discard to the recursive call and otherwise for tcode other than VIEW_CONVERT_EXPR pass vc_prvalue. * g++.dg/cpp26/pr115754.C: New test. (cherry picked from commit 1250540a98e0a1dfa4d7834672d88d8543ea70b1)
[Bug middle-end/115887] ICE: in gsi_insert_on_edge_immediate, at gimple-iterator.cc:849 with -O -fnon-call-exceptions -finstrument-functions and _BitInt()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115887 --- Comment #4 from GCC Commits --- The releases/gcc-14 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:01dfc5b4add9a5ed48c46f6b25cde6e55b9f3ff1 commit r14-10447-g01dfc5b4add9a5ed48c46f6b25cde6e55b9f3ff1 Author: Jakub Jelinek Date: Wed Jul 17 17:32:21 2024 +0200 bitint: Use gsi_insert_on_edge rather than gsi_insert_on_edge_immediate [PR115887] The following testcase ICEs on x86_64-linux, because we try to gsi_insert_on_edge_immediate a statement on an edge which already has statements queued with gsi_insert_on_edge, and the deferral has been intentional so that we don't need to deal with cfg changes in between. The following patch uses the delayed insertion as well. 2024-07-17 Jakub Jelinek PR middle-end/115887 * gimple-lower-bitint.cc (gimple_lower_bitint): Use gsi_insert_on_edge instead of gsi_insert_on_edge_immediate and set edge_insertions to true. * gcc.dg/bitint-108.c: New test. (cherry picked from commit 5104fe4c7808a66ed3041a8da8e4720585cc8a1f)
[Bug middle-end/115527] incorrect folding of __builtin_clear_padding()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115527 --- Comment #12 from GCC Commits --- The releases/gcc-14 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:d668f875985cf61d3a898d95cf01df90a720e5c2 commit r14-10446-gd668f875985cf61d3a898d95cf01df90a720e5c2 Author: Jakub Jelinek Date: Wed Jul 17 11:38:33 2024 +0200 gimple-fold: Fix up __builtin_clear_padding lowering [PR115527] The builtin-clear-padding-6.c testcase fails as clear_padding_type doesn't correctly recompute the buf->size and buf->off members after expanding clearing of an array using a runtime loop. buf->size should be in that case the offset after which it should continue with next members or padding before them modulo UNITS_PER_WORD and buf->off that offset minus buf->size. That is what the code was doing, but with off being the start of the loop cleared array, not its end. So, the last hunk in gimple-fold.cc fixes that. When adding the testcase, I've noticed that the c-c++-common/torture/builtin-clear-padding-* tests, although clearly written as runtime tests to test the builtins at runtime, didn't have { dg-do run } directive and were just compile tests because of that. When adding that to the tests, builtin-clear-padding-1.c was already failing without that clear_padding_type hunk too, but builtin-clear-padding-5.c was still failing even after the change. That is due to a bug in clear_padding_flush which the patch fixes as well - when clear_padding_flush is called with full=true (that happens at the end of the whole __builtin_clear_padding or on those array padding clears done by a runtime loop), it wants to flush all the pending padding clearings rather than just some. 
If it is at the end of the whole object, it decreases wordsize when needed to make sure the code never writes including RMW cycles to something outside of the object: if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize) > (unsigned HOST_WIDE_INT) buf->sz) { gcc_assert (wordsize > 1); wordsize /= 2; i -= wordsize; continue; } but if it is full==true flush in the middle, this doesn't happen, but we still process just the buffer bytes before the current end. If that end is not on a wordsize boundary, e.g. on the builtin-clear-padding-5.c test the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18, nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones might be true, so in some spots we just didn't emit any clearing in that last chunk. 2024-07-17 Jakub Jelinek PR middle-end/115527 * gimple-fold.cc (clear_padding_flush): Introduce endsize variable and use it instead of wordsize when comparing it against nonzero_last. (clear_padding_type): Increment off by sz. * c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run directive. * c-c++-common/torture/builtin-clear-padding-2.c: Likewise. * c-c++-common/torture/builtin-clear-padding-3.c: Likewise. * c-c++-common/torture/builtin-clear-padding-4.c: Likewise. * c-c++-common/torture/builtin-clear-padding-5.c: Likewise. * c-c++-common/torture/builtin-clear-padding-6.c: New test. (cherry picked from commit 8b5919bae11754f4b65a17e63663d3143f9615ac)
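For readers unfamiliar with the builtin being fixed here, the following is a minimal sketch of what `__builtin_clear_padding` is expected to do on a simple struct: zero the padding bytes and leave the members alone. The struct `padded` and the helper `nonzero_padding_after_clear` are hypothetical names used only for illustration, and the sketch assumes a compiler that provides the builtin (it returns -1 otherwise). It does not exercise the runtime-loop array case that the commit actually repairs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct padded {
    char c;   /* typically followed by padding up to the alignment of i */
    int i;
};

/* Returns the number of nonzero padding bytes between c and i after
   __builtin_clear_padding, or -1 if the builtin is unavailable. */
static int nonzero_padding_after_clear(void)
{
#if defined(__has_builtin)
# if __has_builtin(__builtin_clear_padding)
    struct padded s;
    unsigned char bytes[sizeof s];
    int count = 0;

    memset(&s, 0xff, sizeof s);   /* make every byte, padding included, nonzero */
    s.c = 'x';
    s.i = 42;
    __builtin_clear_padding(&s);  /* zero the padding, leave c and i alone */
    assert(s.c == 'x' && s.i == 42);

    /* Inspect the object representation byte by byte. */
    memcpy(bytes, &s, sizeof s);
    for (size_t k = sizeof s.c; k < offsetof(struct padded, i); k++)
        if (bytes[k] != 0)
            count++;
    return count;
# else
    return -1;
# endif
#else
    return -1;
#endif
}
```

On a compiler with a correct implementation the function should return 0. The bug discussed above concerned cases where some padding bytes were left uncleared.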
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #78 from Segher Boessenkool --- (In reply to Andrew Church from comment #77) > (In reply to Segher Boessenkool from comment #72) > > if (foo()) { > > /* The return value of foo can be ignored here because X and Y. */ > > } > > This is just another idiom, with "if(){}" replacing "(void)"; it does not > directly indicate that the value is unused, That is not what I said either, of course. It *does* give the author a great place to add commentary why the return value is not actually used here. I.e. the only thing that *actually matters*: it encourages one human (the author) to communicate important information to the code reader. Code is for humans to read (and write)! If you care about the compiler first, you are doing it Wrong(tm). > as a hypothetical [[discard]] > would do. I would even argue that it is worse because (as Zdenek points > out) it adds a branch which would either need to be tested, potentially > requiring additional failure injection logic to trigger the failing case, or > documented as not needing to be covered by a test. If your coverage testing framework does not handle empty BBs specially, get a better coverage framework. > There's also an argument to be made that allowing the warning to be bypassed > with if(){} or assignment to an unused variable is weakening the original > intent behind WUR, as Jakub mentions. It does the opposite, as I have explained many times now. I haven't seen Jakub say anything like you say btw. (I find the unused var thing just clumsy, noisy, inelegant, and distracting. Not something someone who cares about readable code would ever do. There are much better options!) 
> The fundamental problem with the request behind this feature (in particular, > with the fact that the request comes from a library author) is that the end > user of the compiler is the library user, not the library author, and if the > end user considers the warnings useless, they will find one or another way > around them, however much collateral damage (in the form of missed errors) > that may cause. Given that, I think it's reasonable to offer a > middle-ground option that lets the end user reject the library author's > original intent of forcing return value usage but retain the ability to > check for accidentally unused return values. The warning is exactly for cases like realloc(). If the user finds the warning useless in that case, the user is a fool. If someone (the user, the author, anyone) used warn_unused_result where it is not appropriate, just fix *that*. The attribute is specifically for cases where not looking at the result value is a big (often hard to find) bug, or even a security problem. In cases where you just have some silly coding standard with silly rules you want to sillily follow, well, that is your own problem!
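The `if (foo()) {}` idiom debated in this comment can be sketched as follows. `set_leds` and `show_progress` are hypothetical names, and the attribute syntax assumes GCC or a compatible compiler. The sketch only shows the shape of the idiom and takes no side in the discussion.

```c
#include <assert.h>

/* Hypothetical function whose result the library author wants checked. */
__attribute__((warn_unused_result))
static int set_leds(int mask)
{
    return (mask & ~0x7) ? -1 : 0;   /* -1 for bits outside the three LEDs */
}

static int calls_made;   /* counts calls, for demonstration only */

static void show_progress(int step)
{
    assert(step >= 0);
    /* A bare "(void) set_leds(step);" still triggers -Wunused-result in GCC,
       which is what this bug report is about. */
    if (set_leds(step)) {
        /* The return value can be ignored here: the LEDs are purely
           cosmetic and a failure to set them changes nothing else. */
    }
    calls_made++;
}
```

The empty body gives the author a place to explain why the result is ignored, which is the point made in the comment above. The objection raised in the thread is that it also introduces a branch.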
[Bug target/90616] Suboptimal code generated for accessing an aligned array.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90616 --- Comment #2 from GCC Commits --- The master branch has been updated by Georg-Johann Lay : https://gcc.gnu.org/g:e21fef7da92ef36af1e1b020ae5f35ef4f3c3fce commit r15-2102-ge21fef7da92ef36af1e1b020ae5f35ef4f3c3fce Author: Georg-Johann Lay Date: Thu Jul 4 12:08:34 2024 +0200 AVR: target/90616 - Improve adding constants that are 0 mod 256. This patch introduces a new insn that works as an insn combine pattern for (plus:HI (zero_extend:HI (reg:QI)) (const_0mod256_operand:HI)) which requires at most 2 instructions. When the input register operand is already in HImode, the addhi3 printer only adds the hi8 part when it sees a SYMBOL_REF or CONST aligned to at least 256 bytes. (The CONST_INT case was already handled). gcc/ PR target/90616 * config/avr/predicates.md (const_0mod256_operand): New predicate. * config/avr/constraints.md (Cp8): New constraint. * config/avr/avr.md (*aligned_add_symbol): New insn. * config/avr/avr.cc (avr_out_plus_symbol) [HImode]: When op2 is a multiple of 256, there is no need to add / subtract the lo8 part. (avr_rtx_costs_1) [PLUS && HImode]: Return expected costs for new insn *aligned_add_symbol as it applies.
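The arithmetic behind this optimization can be sketched in portable C: when the 16-bit constant is a multiple of 256, its low byte is zero, so the low byte of the sum is just the 8-bit operand and no carry reaches the high byte. `add_aligned` is a hypothetical helper written for illustration. It is not the code the AVR backend emits.

```c
#include <assert.h>
#include <stdint.h>

/* Add an 8-bit index to a 16-bit base that is a multiple of 256.
   Since the low byte of base is zero, the low byte of the sum is just idx
   and no carry into the high byte can occur. */
static uint16_t add_aligned(uint8_t idx, uint16_t base)
{
    assert((base & 0xff) == 0);
    uint8_t lo = idx;                    /* lo8 of the result */
    uint8_t hi = (uint8_t) (base >> 8);  /* hi8 of the result */
    return (uint16_t) ((uint16_t) hi << 8 | lo);
}
```

For example, `add_aligned(0x34, 0x1200)` yields `0x1234`, the same value as a full 16-bit addition, using only two byte moves.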
[Bug target/90616] Suboptimal code generated for accessing an aligned array.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90616 Georg-Johann Lay changed: What|Removed |Added Build|amd64-portbld-freebsd10.4 | Priority|P3 |P5 Target Milestone|--- |15.0 Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #3 from Georg-Johann Lay --- Added in v15.
[Bug middle-end/115527] incorrect folding of __builtin_clear_padding()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115527 --- Comment #13 from GCC Commits --- The releases/gcc-11 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:3eec2d768d72944ed209b51ba60455d751b9aede commit r11-11580-g3eec2d768d72944ed209b51ba60455d751b9aede Author: Jakub Jelinek Date: Wed Jul 17 11:38:33 2024 +0200 gimple-fold: Fix up __builtin_clear_padding lowering [PR115527] The builtin-clear-padding-6.c testcase fails as clear_padding_type doesn't correctly recompute the buf->size and buf->off members after expanding clearing of an array using a runtime loop. buf->size should be in that case the offset after which it should continue with next members or padding before them modulo UNITS_PER_WORD and buf->off that offset minus buf->size. That is what the code was doing, but with off being the start of the loop cleared array, not its end. So, the last hunk in gimple-fold.cc fixes that. When adding the testcase, I've noticed that the c-c++-common/torture/builtin-clear-padding-* tests, although clearly written as runtime tests to test the builtins at runtime, didn't have { dg-do run } directive and were just compile tests because of that. When adding that to the tests, builtin-clear-padding-1.c was already failing without that clear_padding_type hunk too, but builtin-clear-padding-5.c was still failing even after the change. That is due to a bug in clear_padding_flush which the patch fixes as well - when clear_padding_flush is called with full=true (that happens at the end of the whole __builtin_clear_padding or on those array padding clears done by a runtime loop), it wants to flush all the pending padding clearings rather than just some. 
If it is at the end of the whole object, it decreases wordsize when needed to make sure the code never writes including RMW cycles to something outside of the object: if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize) > (unsigned HOST_WIDE_INT) buf->sz) { gcc_assert (wordsize > 1); wordsize /= 2; i -= wordsize; continue; } but if it is full==true flush in the middle, this doesn't happen, but we still process just the buffer bytes before the current end. If that end is not on a wordsize boundary, e.g. on the builtin-clear-padding-5.c test the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18, nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones might be true, so in some spots we just didn't emit any clearing in that last chunk. 2024-07-17 Jakub Jelinek PR middle-end/115527 * gimple-fold.c (clear_padding_flush): Introduce endsize variable and use it instead of wordsize when comparing it against nonzero_last. (clear_padding_type): Increment off by sz. * c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run directive. * c-c++-common/torture/builtin-clear-padding-2.c: Likewise. * c-c++-common/torture/builtin-clear-padding-3.c: Likewise. * c-c++-common/torture/builtin-clear-padding-4.c: Likewise. * c-c++-common/torture/builtin-clear-padding-5.c: Likewise. * c-c++-common/torture/builtin-clear-padding-6.c: New test. (cherry picked from commit 8b5919bae11754f4b65a17e63663d3143f9615ac)
[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966 --- Comment #4 from Filip Kastl --- My last known to work is r15-1566-gfd536b8412d4da. And yes, -fno-ext-dce does fix the issue!
[Bug target/115526] [14/15 regression] invalid assember emitted for alpha, "Error: duplicate !tlsgd!62" since r14-5109-ga291237b628f41
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526 --- Comment #23 from GCC Commits --- The master branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:0841fd4c42ab053be951b7418233f0478282d020 commit r15-2104-g0841fd4c42ab053be951b7418233f0478282d020 Author: Uros Bizjak Date: Wed Jul 17 18:11:26 2024 +0200 alpha: Fix duplicate !tlsgd!62 assemble error [PR115526] Add missing "cannot_copy" attribute to instructions that have to stay in 1-1 correspondence with another insn. PR target/115526 gcc/ChangeLog: * config/alpha/alpha.md (movdi_er_high_g): Add cannot_copy attribute. (movdi_er_tlsgd): Ditto. (movdi_er_tlsldm): Ditto. (call_value_osf_): Ditto. gcc/testsuite/ChangeLog: * gcc.target/alpha/pr115526.c: New test.
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #79 from Andrew Church --- (In reply to Segher Boessenkool from comment #78) > If someone (the user, the author, anyone) used warn_unused_result where it is > not appropriate, just fix *that*. The attribute is specifically for cases > where not looking at the result value is a big (often hard to find) bug, The issue here is that the library user _cannot_ (realistically) fix improper usage of WUR by the library author. The intent of -Wunused-result=... is to offer a low-resistance path with fewer side effects than just a blanket -Wno-unused-result. > or even a security problem. The question of whether ignoring a return value from a function is a security problem is rarely a static determination. Does the following function raise a security problem? void spawn_command(const char *cmd) { (void) system(cmd); } In some cases certainly, but if cmd is just setting keyboard LEDs to indicate progress, probably not. Only the library user knows for sure, so the library author should not be using WUR here (though the weaker [[nodiscard]] would arguably be appropriate). If glibc had stuck to just using WUR on realloc(), this entire discussion would probably never have arisen, because everyone can agree that ignoring the return value from realloc() is an error (or a deliberate sticking-out-of-the-tongue to show that there's exactly one case it's safe to ignore the return value from realloc(), which is when it's called with a size of zero, and _that_ is a case I'll happily disregard.)
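The realloc() case that both sides agree on can be sketched as follows. The helper `grow` is a hypothetical name. The point is only that the return value carries both the success indication and the possibly moved pointer, so it cannot be dropped. Overflow of `n * sizeof **buf` is ignored here for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Grow *buf to hold n ints.  Returns 0 on success, -1 on failure;
   on failure *buf is unchanged and still owned by the caller. */
static int grow(int **buf, size_t n)
{
    assert(buf != NULL && n > 0);
    int *tmp = realloc(*buf, n * sizeof **buf);
    if (tmp == NULL)
        return -1;      /* the old block is still valid */
    *buf = tmp;         /* the old pointer may now be stale */
    return 0;
}
```

Calling `realloc(*buf, ...)` and discarding the result would leave the caller holding a pointer that may have been freed, which is the kind of hard-to-find bug the attribute is aimed at.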
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 --- Comment #80 from Segher Boessenkool --- (In reply to Andrew Church from comment #79) > (In reply to Segher Boessenkool from comment #78) > > If someone (the user, the author, anyone) used warn_unused_result where it > > is > > not appropriate, just fix *that*. The attribute is specifically for cases > > where not looking at the result value is a big (often hard to find) bug, > > The issue here is that the library user _cannot_ (realistically) fix > improper usage of WUR by the library author. This is exactly the same problem as any other problematic bug in library code. Sane users vote with their feet in such cases, if a bug report (or whatever equivalent) does not easily get it satisfactorily fixed.
[Bug target/115526] [14/15 regression] invalid assember emitted for alpha, "Error: duplicate !tlsgd!62" since r14-5109-ga291237b628f41
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526 --- Comment #24 from GCC Commits --- The releases/gcc-14 branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:3a963d441a68797956a5f67dcb351b2dbd4ac1d0 commit r14-10448-g3a963d441a68797956a5f67dcb351b2dbd4ac1d0 Author: Uros Bizjak Date: Wed Jul 17 18:11:26 2024 +0200 alpha: Fix duplicate !tlsgd!62 assemble error [PR115526] Add missing "cannot_copy" attribute to instructions that have to stay in 1-1 correspondence with another insn. PR target/115526 gcc/ChangeLog: * config/alpha/alpha.md (movdi_er_high_g): Add cannot_copy attribute. (movdi_er_tlsgd): Ditto. (movdi_er_tlsldm): Ditto. (call_value_osf_): Ditto. gcc/testsuite/ChangeLog: * gcc.target/alpha/pr115526.c: New test. (cherry picked from commit 0841fd4c42ab053be951b7418233f0478282d020)
[Bug tree-optimization/107200] False positive -Wdangling-pointer?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107200 --- Comment #5 from Andrew Pinski --- See https://libeigen.gitlab.io/docs/TopicPitfalls.html section "C++11 and the auto keyword" explicitly.
[Bug c++/115291] armv8-a GCC emits float32x2_t loads from uninitialized stack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115291 Andrew Pinski changed: What|Removed |Added Resolution|INVALID |DUPLICATE --- Comment #4 from Andrew Pinski --- . *** This bug has been marked as a duplicate of bug 107200 ***
[Bug tree-optimization/107200] False positive -Wdangling-pointer?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107200 Andrew Pinski changed: What|Removed |Added CC||akihiko.odaki at daynix dot com --- Comment #6 from Andrew Pinski --- *** Bug 115291 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/107200] False positive -Wdangling-pointer?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107200 Andrew Pinski changed: What|Removed |Added CC||summersnow9403 at gmail dot com --- Comment #7 from Andrew Pinski --- *** Bug 115968 has been marked as a duplicate of this bug. ***
[Bug c++/115968] g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968 Andrew Pinski changed: What|Removed |Added Resolution|INVALID |DUPLICATE --- Comment #4 from Andrew Pinski --- . *** This bug has been marked as a duplicate of bug 107200 ***
[Bug c++/111890] ICE in tsubst_friend_function with friend function declared inside a concept constrainted class inside a template class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111890 --- Comment #8 from Harry Butterworth --- I applied the patch to 14.1.0 and my code compiles now. Thanks.
[Bug target/115526] [14/15 regression] invalid assember emitted for alpha, "Error: duplicate !tlsgd!62" since r14-5109-ga291237b628f41
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526 --- Comment #25 from GCC Commits --- The releases/gcc-13 branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:37bd7d5c4e17c97d2b7d50f630b1cf8b347a31f4 commit r13-8920-g37bd7d5c4e17c97d2b7d50f630b1cf8b347a31f4 Author: Uros Bizjak Date: Wed Jul 17 18:11:26 2024 +0200 alpha: Fix duplicate !tlsgd!62 assemble error [PR115526] Add missing "cannot_copy" attribute to instructions that have to stay in 1-1 correspondence with another insn. PR target/115526 gcc/ChangeLog: * config/alpha/alpha.md (movdi_er_high_g): Add cannot_copy attribute. (movdi_er_tlsgd): Ditto. (movdi_er_tlsldm): Ditto. (call_value_osf_): Ditto. gcc/testsuite/ChangeLog: * gcc.target/alpha/pr115526.c: New test. (cherry picked from commit 0841fd4c42ab053be951b7418233f0478282d020)
[Bug tree-optimization/111150] (vec CMP vec) != (vec CMP vec) should just produce (vec CMP vec) ^ (vec CMP vec)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111150 --- Comment #1 from GCC Commits --- The trunk branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:44fcc1ca11e7ea35dc9fb25a5317346bc1eaf7b2 commit r15-2106-g44fcc1ca11e7ea35dc9fb25a5317346bc1eaf7b2 Author: Eikansh Gupta Date: Wed May 22 23:28:48 2024 +0530 MATCH: Simplify (a ? x : y) eq/ne (b ? x : y) [PR111150] This patch adds match pattern for `(a ? x : y) eq/ne (b ? x : y)`. In forwprop1 pass, depending on the type of `a` and `b`, GCC produces `vec_cond` or `cond_expr`. Based on the observation that `(x != y)` is TRUE, the pattern can be optimized to produce `(a^b ? TRUE : FALSE)`. The patch adds match pattern for a, b: (a ? x : y) != (b ? x : y) --> (a^b) ? TRUE : FALSE (a ? x : y) == (b ? x : y) --> (a^b) ? FALSE : TRUE (a ? x : y) != (b ? y : x) --> (a^b) ? FALSE : TRUE (a ? x : y) == (b ? y : x) --> (a^b) ? TRUE : FALSE PR tree-optimization/111150 gcc/ChangeLog: * match.pd (`(a ? x : y) eq/ne (b ? x : y)`): New pattern. (`(a ? x : y) eq/ne (b ? y : x)`): New pattern. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr111150.c: New test. * gcc.dg/tree-ssa/pr111150-1.c: New test. * g++.dg/tree-ssa/pr111150.C: New test. Signed-off-by: Eikansh Gupta
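The identities behind this pattern can be checked exhaustively for scalar conditions, assuming `x != y` as the commit message does. The function `count_mismatches` is a hypothetical checker written for illustration. It works on plain ints rather than GCC's `vec_cond`/`cond_expr` trees. With the same arm order on both sides the comparison reduces to `a ^ b`, and with the arms swapped on the right-hand side it reduces to the negation of `a ^ b`.

```c
#include <assert.h>
#include <stdbool.h>

/* Count the (a, b) combinations for which any rewritten form disagrees
   with the original comparison.  Requires x != y. */
static int count_mismatches(int x, int y)
{
    assert(x != y);
    int bad = 0;
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++) {
            bool x_or = (a ^ b) != 0;
            /* Same arm order on both sides. */
            if (((a ? x : y) != (b ? x : y)) != x_or)
                bad++;
            if (((a ? x : y) == (b ? x : y)) != !x_or)
                bad++;
            /* Arms swapped on the right-hand side: the result is negated. */
            if (((a ? x : y) != (b ? y : x)) != !x_or)
                bad++;
            if (((a ? x : y) == (b ? y : x)) != x_or)
                bad++;
        }
    return bad;
}
```

A return value of 0 for any pair of distinct `x` and `y` confirms that all four rewrites agree with the original expressions.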
[Bug tree-optimization/111150] (vec CMP vec) != (vec CMP vec) should just produce (vec CMP vec) ^ (vec CMP vec)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111150 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |15.0
[Bug tree-optimization/111150] (vec CMP vec) != (vec CMP vec) should just produce (vec CMP vec) ^ (vec CMP vec)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111150 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #2 from Andrew Pinski --- Fixed.
[Bug c++/115964] GCC accepts invalid program with explicit object member function overloads
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115964 Marek Polacek changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-07-17 Status|UNCONFIRMED |NEW CC||mpolacek at gcc dot gnu.org --- Comment #1 from Marek Polacek --- Confirmed, I guess. I hope there isn't a DR changing this to be well-formed that I missed :). Thanks for the report.
[Bug c++/110343] [C++26] P2558R2 - Add @, $, and ` to the basic character set
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110343 Jakub Jelinek changed: What|Removed |Added CC||redi at gcc dot gnu.org --- Comment #9 from Jakub Jelinek --- I've tried to understand the preprocessor issue mentioned in the paper, but am confused about what the right behavior is and why. Consider

#define STR(x) #x
const char *a = "\u00b7";
const char *b = STR(\u00b7);
const char *c = "\u0041";
const char *d = STR(\u0041);
const char *e = STR(a\u00b7);
const char *f = STR(a\u0041);
const char *g = STR(a \u00b7);
const char *h = STR(a \u0041);
const char *i = "\u066d";
const char *j = STR(\u066d);
const char *k = "\u0040";
const char *l = STR(\u0040);
const char *m = STR(a\u066d);
const char *n = STR(a\u0040);
const char *o = STR(a \u066d);
const char *p = STR(a \u0040);

Neither clang nor gcc emit any diagnostics on the a, c, i and k initializers, those are certainly valid. g++ emits with -pedantic-errors errors on all the others, while clang++ on the ones with STR involving \u0041, \u0040 and a\u066d. The chosen values are \u0040 '@' as something being changed by this paper, \u0041 'A', \u00b7 as an example of character which is pedantically valid in identifiers if not at the start and \u066d as something pedantically not valid in identifiers. Now, https://eel.is/c++draft/lex.charset#6 says that UCN used outside of a string/character literal which corresponds to basic character set character (or control character) is ill-formed, that would make d, f, h cases invalid for C++ and l, n, p cases invalid for C++26. https://eel.is/c++draft/lex.name states which characters can appear at the start of the identifier and which can appear after the start. And https://eel.is/c++draft/lex.pptoken states that preprocessing-token is either identifier, or tons of other things, or "each non-whitespace character that cannot be one of the above" Then https://eel.is/c++draft/lex.pptoken#1 says that this last category is invalid if the preprocessing token is being converted into token. 
And https://eel.is/c++draft/lex.pptoken#2 includes "If any character not in the basic character set matches the last category, the program is ill-formed." Now, e.g. for the C++23 STR(\u0040) case, \u0040 is there not in the basic character set, so valid outside of the literals (not the case anymore in C++26), but it isn't nondigit and doesn't have XID_Start property, so it isn't IMHO an identifier and so must be the "each non-whitespace character that cannot be one of the above" case. Why doesn't the above mentioned https://eel.is/c++draft/lex.pptoken#2 sentence make that invalid? Ignoring that, I'd say it would be then stringized and that feels like it is what clang++ is doing. Now, e.g. for the STR(a\u066d) case, I wonder why that isn't lexed as a identifier followed by \u066d "each non-whitespace character that cannot be one of the above" token and stringified similarly, clang++ rejects that. What GCC libcpp seems to be doing is that if that forms_identifier_p calls _cpp_valid_utf8 or _cpp_valid_ucn with an argument which tells it is first or second+ in identifier, and e.g. _cpp_valid_ucn then for UCNs valid in string literals calls else if (identifier_pos) { int validity = ucn_valid_in_identifier (pfile, result, nst); if (validity == 0) cpp_error (pfile, CPP_DL_ERROR, "universal character %.*s is not valid in an identifier", (int) (str - base), base); else if (validity == 2 && identifier_pos == 1) cpp_error (pfile, CPP_DL_ERROR, "universal character %.*s is not valid at the start of an identifier", (int) (str - base), base); } so basically all those invalid in identifiers cases emit an error and pretend to be valid in identifiers, rather than what e.g. 
_cpp_valid_utf8 does for C but not for C++ and only for the chars completely invalid in identifiers rather than just valid in identifiers but not at the start: /* In C++, this is an error for invalid character in an identifier because logically, the UTF-8 was converted to a UCN during translation phase 1 (even though we don't physically do it that way). In C, this byte rather becomes grammatically a separate token. */ if (CPP_OPTION (pfile, cplusplus)) cpp_error (pfile, CPP_DL_ERROR, "extended character %.*s is not valid in an identifier", (int) (*pstr - base), base); else { *pstr = base; return false; } The comment doesn't really match what is done in recent C++ versions because there UCNs are translated to characters and not the other way around.
[Bug c++/115900] [14/15 Regression] constexpr object modification during construction gives "Modifying a const object is not allowed in a constant expression"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115900 --- Comment #8 from GCC Commits --- The trunk branch has been updated by Marek Polacek : https://gcc.gnu.org/g:d890b04197fb0ddba4fbfb32f88e266fa27e02f3 commit r15-2108-gd890b04197fb0ddba4fbfb32f88e266fa27e02f3 Author: Marek Polacek Date: Wed Jul 17 11:19:32 2024 -0400 c++: wrong error initializing empty class [PR115900] In r14-409, we started handling empty bases first in cxx_fold_indirect_ref_1 so that we don't need to recurse and waste time. This caused a bogus "modifying a const object" error. I'm appending my analysis from the PR, but basically, cxx_fold_indirect_ref now returns a different object than before, and we mark the wrong thing as const, but since we're initializing an empty object, we should avoid setting the object constness. ~~ Pre-r14-409: we're evaluating the call to C::C(), which is in the body of B::B(), which is the body of D::D(&d): C::C ((struct C *) this, NON_LVALUE_EXPR <0>) It's a ctor so we get here: 3118 /* Remember the object we are constructing or destructing. */ 3119 tree new_obj = NULL_TREE; 3120 if (DECL_CONSTRUCTOR_P (fun) || DECL_DESTRUCTOR_P (fun)) 3121 { 3122 /* In a cdtor, it should be the first `this' argument. 3123 At this point it has already been evaluated in the call 3124 to cxx_bind_parameters_in_call. */ 3125 new_obj = TREE_VEC_ELT (new_call.bindings, 0); new_obj=(struct C *) &d.D.2656 3126 new_obj = cxx_fold_indirect_ref (ctx, loc, DECL_CONTEXT (fun), new_obj); new_obj=d.D.2656.D.2597 We proceed to evaluate the call, then we get here: 3317 /* At this point, the object's constructor will have run, so 3318 the object is no longer under construction, and its possible 3319 'const' semantics now apply. Make a note of this fact by 3320 marking the CONSTRUCTOR TREE_READONLY. 
*/ 3321 if (new_obj && DECL_CONSTRUCTOR_P (fun)) 3322 cxx_set_object_constness (ctx, new_obj, /*readonly_p=*/true, 3323 non_constant_p, overflow_p); new_obj is still d.D.2656.D.2597, its type is "C", cxx_set_object_constness doesn't set anything as const. This is fine. After r14-409: on line 3125, new_obj is (struct C *) &d.D.2656 as before, but we go to cxx_fold_indirect_ref_1: 5739 if (is_empty_class (type) 5740 && CLASS_TYPE_P (optype) 5741 && lookup_base (optype, type, ba_any, NULL, tf_none, off)) 5742 { 5743 if (empty_base) 5744 *empty_base = true; 5745 return op; type is C, which is an empty class; optype is "const D", and C is a base of D. So we return the VAR_DECL 'd'. Then we get to cxx_set_object_constness with object=d, which is const, so we mark the constructor READONLY. Then we're evaluating A::A() which has ((A*)this)->data = 0; we evaluate the LHS to d.D.2656.a, for which the initializer is {.D.2656={.a={.data=}}} which is TREE_READONLY and 'd' is const, so we think we're modifying a const object and fail the constexpr evaluation. PR c++/115900 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_call_expression): Set new_obj to NULL_TREE if cxx_fold_indirect_ref set empty_base to true. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/constexpr-init23.C: New test.