[Bug go/106266] New: Libgo fails with recent glibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106266 Bug ID: 106266 Summary: Libgo fails with recent glibc Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: go Assignee: ian at airs dot com Reporter: marxin at gcc dot gnu.org Target Milestone: --- Fails with: [ 4342s] /home/abuild/rpmbuild/BUILD/gcc-13.0.0+git194331/obj-x86_64-suse-linux/./gcc/xgcc -B/home/abuild/rpmbuild/BUILD/gcc-13.0.0+git194331/obj-x86_64-suse-linux/./gcc/ -B/usr/x86_64-suse-linux/bin/ -B/usr/x86_64-suse-linux/lib/ -isystem /usr/x86_64-suse-linux/include -isystem /usr/x86_64-suse-linux/sys-include -DHAVE_CONFIG_H -I. -I../../../libgo -I ../../../libgo/runtime -I../../../libgo/../libffi/include -I../libffi/include -pthread -L../libatomic/.libs -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O -fdump-go-spec=tmp-gen-sysinfo.go -std=gnu99 -S -o sysinfo.s ../../../libgo/sysinfo.c [ 4342s] In file included from /usr/include/linux/fs.h:19, [ 4342s] from ../../../libgo/sysinfo.c:162: [ 4342s] /usr/include/linux/mount.h:95:6: error: redeclaration of 'enum fsconfig_command' [ 4342s]95 | enum fsconfig_command { [ 4342s] | ^~~~ [ 4342s] In file included from ../../../libgo/sysinfo.c:141: [ 4342s] /usr/include/sys/mount.h:189:6: note: originally defined here [ 4342s] 189 | enum fsconfig_command [ 4342s] | ^~~~ It's very similar to the following libsanitizer issue: https://github.com/llvm/llvm-project/issues/56421
[Bug go/106266] Libgo fails with recent glibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106266 Martin Liška changed: What|Removed |Added Last reconfirmed||2022-07-12 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Target Milestone|--- |13.0
[Bug target/106265] RISC-V SPEC2017 507.cactu code bloat due to address generation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106265 --- Comment #6 from Andrew Waterman --- To be clear, `li rx, 4096' isn't unsupported: it's a very-much-supported idiom for `lui rx, 1`. On Mon, Jul 11, 2022 at 11:45 PM rguenth at gcc dot gnu.org via Gcc-bugs wrote: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106265 > > --- Comment #5 from Richard Biener --- > So why do we even emit unsupported 'li 4096' and leave it to the linker to > "optimize(?)"? At least the cost of this should be reflected - IIRC powerpc > recently got improvements for similar cases by changing the targets rtx_cost > hook to properly const SET from CONST_INT so that CSE doesn't leave so many > sets from constants around. > > OTOH LRA rematerialization also could be the culprit, thinking rematerializing > the constant is cheaper than spilling a register holding it.
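As a rough illustration of why `li rx, 4096` is just another spelling of `lui rx, 1`, the sketch below splits a 32-bit constant into the upper 20 bits that `lui` would load and the signed 12-bit remainder that a following `addi` would add. The names `li_parts` and `split_li` are made up for this example, and it models only the RV32 case; it is not the GNU assembler's actual expansion code.

```c
#include <stdint.h>

/* Hypothetical helper type: the two immediates of a lui/addi pair. */
struct li_parts {
    uint32_t hi20;  /* immediate for lui (upper 20 bits) */
    int32_t  lo12;  /* immediate for addi, in [-2048, 2047] */
};

/* Split VALUE so that (hi20 << 12) + lo12 == VALUE modulo 2^32.
   Adding 0x800 before shifting compensates for addi sign-extending
   its 12-bit immediate. */
struct li_parts split_li(uint32_t value)
{
    struct li_parts p;
    p.hi20 = ((value + 0x800u) >> 12) & 0xFFFFFu;
    p.lo12 = (int32_t)(value - (p.hi20 << 12));
    return p;
}
```

For 4096 this gives `hi20 == 1` and `lo12 == 0`, so no `addi` is needed and a single `lui rx, 1` materialises the constant.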
[Bug preprocessor/106267] New: #pragma GCC diagnostic ignored not preserved for a -Wattribute-alias warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106267 Bug ID: 106267 Summary: #pragma GCC diagnostic ignored not preserved for a -Wattribute-alias warning Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: preprocessor Assignee: unassigned at gcc dot gnu.org Reporter: marxin at gcc dot gnu.org Target Milestone: --- For the following code snippet isolated from Linux kernel the ignored -Wattribute-alias is not preserved: $ cat posix-stubs.c #define __diag_GCC(version, severity, s) \ __diag_GCC_##version(__diag_GCC_##severity s) #define __diag_GCC_ignore ignored #define __diag_str1(s) #s #define __diag_str(s) __diag_str1(s) #define __diag(s) _Pragma(__diag_str(GCC diagnostic s)) #define __diag_GCC_8(s) __diag(s) #define __diag_pop() __diag(pop) #define __diag_push() __diag(push) #define __diag_ignore(compiler, version, option, comment) \ __diag_##compiler(version, ignore, option) #define SYSCALL_ALIAS_PROTO(a) \ __diag_push(); \ __diag_ignore(GCC, 8, "-Wattribute-alias", \ "Alias to nonimplemented syscall"); \ typeof(a) a __attribute__((alias("sys_ni_posix_timers"))); \ __diag_pop() int sys_timer_create(int); void sys_ni_posix_timers(void) {} #define SYS_NI(name) SYSCALL_ALIAS_PROTO(sys_##name) SYS_NI(timer_create) $ gcc posix-stubs.c -c -Werror posix-stubs.c:23:42: error: ‘sys_timer_create’ alias between functions of incompatible types ‘int(int)’ and ‘void(void)’ [-Werror=attribute-alias=] 23 | #define SYS_NI(name) SYSCALL_ALIAS_PROTO(sys_##name) | ^~~~ posix-stubs.c:17:19: note: in definition of macro ‘SYSCALL_ALIAS_PROTO’ 17 | typeof(a) a __attribute__((alias("sys_ni_posix_timers"))); \ | ^ posix-stubs.c:24:1: note: in expansion of macro ‘SYS_NI’ 24 | SYS_NI(timer_create) | ^~ posix-stubs.c:21:6: note: aliased declaration here 21 | void sys_ni_posix_timers(void) {} | ^~~ cc1: all warnings being treated as errors While using pre-processed input works correctly: $ gcc posix-stubs.c -c -Werror -E > posix-stubs.i && gcc posix-stubs.i -c -Werror
[Bug preprocessor/106267] #pragma GCC diagnostic ignored not preserved for a -Wattribute-alias warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106267 Martin Liška changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2022-07-12 Status|UNCONFIRMED |NEW
[Bug fortran/106268] New: [suboptimal] Remove unnecessary loops releated to fortran compare to ifort
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106268 Bug ID: 106268 Summary: [suboptimal] Remove unnecessary loops releated to fortran compare to ifort Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- For the kernel's inner loop body, GCC generates a loop while ICC does not; see the details at https://godbolt.org/z/G77nKnf8W. ``` DO i = 1, 1 do l = 1, ADM_lall rhogw_vm(:,ADM_kmin ,l) = 0.D0 rhogw_vm(:,ADM_kmax+1,l) = 0.D0 enddo enddo ```
[Bug preprocessor/106267] #pragma GCC diagnostic ignored not preserved for a -Wattribute-alias warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106267 --- Comment #1 from Martin Liška --- Simplified test-case: #define SYSCALL_ALIAS_PROTO(a) \ _Pragma ("GCC diagnostic push"); \ _Pragma ("GCC diagnostic ignored \"-Wattribute-alias\""); \ typeof(a) a __attribute__((alias("sys_ni_posix_timers"))); \ _Pragma ("GCC diagnostic pop"); int sys_timer_create(int); void sys_ni_posix_timers(void) {} SYSCALL_ALIAS_PROTO(sys_timer_create)
[Bug preprocessor/106267] #pragma GCC diagnostic ignored not preserved for a -Wattribute-alias warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106267 Martin Liška changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW |RESOLVED --- Comment #2 from Martin Liška --- Oh, it's fixed on master since r13-1596-g0587cef3d7962a8b. *** This bug has been marked as a duplicate of bug 97498 ***
[Bug preprocessor/97498] #pragma GCC diagnostic ignored "-Wunused-function" inconsistent
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97498 Martin Liška changed: What|Removed |Added CC||marxin at gcc dot gnu.org --- Comment #5 from Martin Liška --- *** Bug 106267 has been marked as a duplicate of this bug. ***
[Bug preprocessor/97498] #pragma GCC diagnostic ignored "-Wunused-function" inconsistent
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97498 --- Comment #6 from Martin Liška --- Am I correct that it's not something for backport to GCC 12 or any older version?
[Bug preprocessor/97498] #pragma GCC diagnostic ignored "-Wunused-function" inconsistent
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97498 --- Comment #7 from Martin Liška --- (In reply to Martin Liška from comment #6) > Am I correct that it's not something for backport to GCC 12 or any older > version? We actually speak about 2 modified lines, so can we backport it, please?
[Bug tree-optimization/99416] s211 benchmark of TSVC is vectorized by icc and not by gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99416 --- Comment #4 from Richard Biener --- (In reply to Richard Biener from comment #3) > Note it's only the outer loop that confuses us here. With that removed we > have > the following because of yet another "heuristic" to disable distribution. In fact we first analyze the whole nest but then continue to look at the inner loop only, so this isn't really an issue. The fusing because of shared memory refs happens only because of the double use of d[i]; b[i], b[i-1] or b[i+1] are not detected as problematic for distribution (the "same memory object" check isn't working as intended). Fuse partitions because they have shared memory refs: Part 1: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 15, 16 Part 2: 0, 1, 5, 9, 10, 11, 12, 13, 14, 15, 16 Note that the intersection of both partitions includes half of the stmts (0, 1, 5, 9, 15, 16) that would be duplicated (5 is the d[i] load) while the other half is different. To defeat the final fusing reason we need a positive motivation, like tracking whether we know a partition can or cannot be vectorized (or whether we are not sure). For the partition containing the b[i], b[i+1] dependence distance of 1 we know we cannot vectorize (with a VF > 0).
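To make the discussion of partitions a little more concrete, here is a hand-written sketch of what distributing the loop would mean at the source level. It assumes the s211 kernel has the usual TSVC shape (two statements sharing the d[i] load, one reading b[i-1] and the other reading b[i+1]); the function names are invented for this example, and the second function is a manual rewrite, not something GCC currently produces.

```c
#include <stddef.h>

/* Fused form: both statements live in one loop (assumed shape of s211). */
void s211_fused(float *a, float *b, const float *c, const float *d,
                const float *e, size_t n)
{
    for (size_t i = 1; i + 1 < n; i++) {
        a[i] = b[i - 1] + c[i] * d[i];
        b[i] = b[i + 1] - e[i] * d[i];
    }
}

/* Hand-distributed form: the b[] update runs to completion first, then the
   a[] computation reads the already-updated b[i - 1].  The d[i] load is
   duplicated across the two loops. */
void s211_distributed(float *a, float *b, const float *c, const float *d,
                      const float *e, size_t n)
{
    for (size_t i = 1; i + 1 < n; i++)
        b[i] = b[i + 1] - e[i] * d[i];
    for (size_t i = 1; i + 1 < n; i++)
        a[i] = b[i - 1] + c[i] * d[i];
}
```

Both forms compute the same values because, in the fused loop, a[i] always sees the b[i-1] written by the previous iteration, while b[i] always reads a b[i+1] that has not been overwritten yet.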
[Bug fortran/106268] [suboptimal] Remove unnecessary loops releated to fortran compare to ifort
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106268 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Keywords||missed-optimization Last reconfirmed||2022-07-12 Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- I think ICC elides the outer loop but keeps the inner loop, whereas GCC unrolls the inner loop and keeps the outer loop. There's no "benchmark loop" elimination in GCC yet; instead we rely on code sinking which, at the moment, cannot sink the memset calls out of the outer loop. #include <string.h> void foo (int *mem, int n) { for (int i = 0; i < 1000; ++i) memset (mem, 0, n * sizeof (int)); } ICC probably manages to elide the loop around the memset here.
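A small sketch of what eliding such a "benchmark loop" amounts to at the source level: every iteration stores the same zero bytes and nothing reads the buffer in between, so 1000 calls leave memory in the same state as one. The function names below are hypothetical, and the second function is a hand-written equivalent, not the output of either compiler.

```c
#include <stddef.h>
#include <string.h>

/* The loop as written in the comment: 1000 identical memset calls. */
void clear_repeated(int *mem, int n)
{
    for (int i = 0; i < 1000; ++i)
        memset(mem, 0, (size_t)n * sizeof(int));
}

/* What remains once the redundant outer loop is removed. */
void clear_once(int *mem, int n)
{
    memset(mem, 0, (size_t)n * sizeof(int));
}
```

The rewrite is only valid because the trip count is known to be at least one and the memset arguments do not change between iterations.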
[Bug target/106253] [13 Regression] ICE in vect_transform_loops, at tree-vectorizer.cc:1032
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106253 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org Target|aarch64-linux-gnu |aarch64-linux-gnu, ||arm-none-linux-gnueabihf --- Comment #6 from Tamar Christina --- Same problem happens with Armhf when building libc. during GIMPLE pass: vect In file included from sha256.c:213: ./sha256-block.c: In function ‘__sha256_process_block’: ./sha256-block.c:6:1: internal compiler error: in vect_transform_loops, at tree-vectorizer.cc:1032 6 | __sha256_process_block (const void *buffer, size_t len, struct sha256_ctx *ctx) | ^~
[Bug c/45840] Enhance __builtin_object_size to return useful result when applied to T (*p)[N]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45840 Jonathan Wakely changed: What|Removed |Added CC||redi at gcc dot gnu.org --- Comment #2 from Jonathan Wakely --- For C++ I expected this to give a sensible answer: size_t f(char (&s)[10]) { return __builtin_object_size(s, 0); } It's true that you can explicitly cast a different type to char(&)[10] but any attempt to access another type through that would (I think) be undefined. Presumably __builtin_object_size is being used alongside some access ... if not, who cares what it returns? Obviously this example is contrived (the array size is known here) but it matters for e.g. std::istream& read(std::istream& in, char (&buf)[10]) { return in >> buf; } The operator>> in libstdc++ uses __builtin_object_size to avoid overflow here, but it doesn't work because the size of char(&)[10] is not known. If the function is called like read(std::cin, reinterpret_cast<char(&)[10]>(x)) then either x is a char array with at least 10 bytes (in which case it's harmless to treat it as a smaller array) or it's not (in which case writing up to 10 bytes to it is undefined anyway). By extension, maybe this should give a known size too (but maybe this is a separate enhancement request): size_t f(char* s) { s[9] = '\0'; return __builtin_object_size(s, 0); } If s[9] isn't valid, we already have UB. So in the absence of anything more precise, 10 is a reasonable lower bound for the array size.
[Bug tree-optimization/105860] [10/11/12/13 Regression] Miscompilation causing clobbered union contents since r10-918-gc56c86024f8fba0c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105860 --- Comment #9 from CVS Commits --- The releases/gcc-11 branch has been updated by Martin Jambor : https://gcc.gnu.org/g:16afe2e2862f3dd93c711d7f8d436dee23c6c34d commit r11-10144-g16afe2e2862f3dd93c711d7f8d436dee23c6c34d Author: Martin Jambor Date: Tue Jul 12 13:16:35 2022 +0200 tree-sra: Fix union handling in build_reconstructed_reference As the testcase in PR 105860 shows, the code that tries to re-use the handled_component chains in SRA can be horribly confused by unions, where it thinks it has found a compatible structure under which it can chain the references, but in fact it found the type it was looking for elsewhere in a union and generated a write to a completely wrong part of an aggregate. I don't remember whether the plan was to support unions at all in build_reconstructed_reference but it can work, to an extent, if we make sure that we start the search only outside the outermost union, which is what the patch does (and the extra testcase verifies). Additionally, this commit also contains squashed in it a backport of b984b84cbe4bf026edef2ba37685f3958a1dc1cf which fixes the testcase gcc.dg/tree-ssa/alias-access-path-13.c for many 32-bit targets. gcc/ChangeLog: 2022-07-01 Martin Jambor PR tree-optimization/105860 * tree-sra.c (build_reconstructed_reference): Start expr traversal only just below the outermost union. gcc/testsuite/ChangeLog: 2022-07-01 Martin Jambor PR tree-optimization/105860 * gcc.dg/tree-ssa/alias-access-path-13.c: New test. * gcc.dg/tree-ssa/pr105860.c: Likewise. (cherry picked from commit b110e5283e368b5377e04766e4ff82cd52634208)
[Bug c++/106269] New: the "operator delete" selection does not follow c++ spec
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106269 Bug ID: 106269 Summary: the "operator delete" selection does not follow c++ spec Product: gcc Version: rust/master Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: chumarshal at foxmail dot com Target Milestone: --- #include #include void operator delete[]( void* ptr ) noexcept { std::cout << "delete with 1 parameters==" << std::endl; ::operator delete(ptr); } void operator delete[]( void* ptr, std::size_t size ) noexcept { std::cout << "delete with 2 parameters==" << std::endl; ::operator delete(ptr); } int main() { int* p = new int[2]; p[0] = 1; p[1] = 2; delete[] p; } The result is: delete with 1 parameters== But it does not follow c++14 spec as follow: C++14 Standard (ISO/IEC 14882:2014), Section 5.3.5, Paragraph 10: "If the type is complete and if deallocation function lookup finds both a usual deallocation function with only a pointer parameter and a usual deallocation function with both a pointer parameter and a size parameter, then the selected deallocation function shall be the one with two parameters. Otherwise, the selected deallocation function shall be the function with one parameter."
[Bug tree-optimization/106268] [suboptimal] Remove unnecessary loops releated to fortran compare to ifort
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106268 --- Comment #2 from vfdff --- It seems to be different for the C version; see https://godbolt.org/z/vc1edYKhf for details. In your case above, ICC does not elide the outer loop either.
[Bug c++/106269] the "operator delete" selection does not follow c++ spec
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106269 Jonathan Wakely changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #1 from Jonathan Wakely --- This is the correct behaviour, see https://wg21.link/cwg1788 which changed the standard.
[Bug target/106253] [13 Regression] ICE in vect_transform_loops, at tree-vectorizer.cc:1032
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106253 --- Comment #7 from Richard Biener --- Btw, I can see FAIL: gcc.dg/vect/vect-rounding-lceil.c (internal compiler error: in vect_transform_loops, at tree-vectorizer.cc:1032) FAIL: gcc.dg/vect/vect-rounding-lfloor.c (internal compiler error: in vect_transform_loops, at tree-vectorizer.cc:1032) FAIL: gfortran.dg/g77/20010430.f -O2 (internal compiler error: in vect_transform_loops, at tree-vectorizer.cc:1032) (NINT user) when testing aarch64-linux. Those are all this aarch64 builtin issue (and probably also reproduce on the arm backend side).
[Bug target/106253] [13 Regression] ICE in vect_transform_loops, at tree-vectorizer.cc:1032
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106253 --- Comment #8 from CVS Commits --- The trunk branch has been updated by Richard Sandiford : https://gcc.gnu.org/g:00eab0c654e09c8a0f1b1a3b1c7bff8764e64991 commit r13-1647-g00eab0c654e09c8a0f1b1a3b1c7bff8764e64991 Author: Richard Sandiford Date: Tue Jul 12 14:09:44 2022 +0100 Add internal functions for iround etc. [PR106253] The PR is about the aarch64 port using an ACLE built-in function to vectorise a scalar function call, even though the ECF_* flags for the ACLE function didn't match the ECF_* flags for the scalar call. To some extent that kind of difference is inevitable, since the ACLE intrinsics are supposed to follow the behaviour of the underlying instruction as closely as possible. Also, using target-specific builtins has the drawback of limiting further gimple optimisation, since the gimple optimisers won't know what the function does. We handle several other maths functions, including round, floor and ceil, by defining directly-mapped internal functions that are linked to the associated built-in functions. This has two main advantages: - it means that, internally, we are not restricted to the set of scalar types that happen to have associated C/C++ functions - the functions (and thus the underlying optabs) extend naturally to vectors This patch takes the same approach for the remaining functions handled by aarch64_builtin_vectorized_function. gcc/ PR target/106253 * predict.h (insn_optimization_type): Declare. * predict.cc (insn_optimization_type): New function. * internal-fn.def (IFN_ICEIL, IFN_IFLOOR, IFN_IRINT, IFN_IROUND) (IFN_LCEIL, IFN_LFLOOR, IFN_LRINT, IFN_LROUND, IFN_LLCEIL) (IFN_LLFLOOR, IFN_LLRINT, IFN_LLROUND): New internal functions. * internal-fn.cc (unary_convert_direct): New macro. (expand_convert_optab_fn): New function. (expand_unary_convert_optab_fn): New macro. (direct_unary_convert_optab_supported_p): Likewise. 
* optabs.cc (expand_sfix_optab): Pass insn_optimization_type to convert_optab_handler. * config/aarch64/aarch64-protos.h (aarch64_builtin_vectorized_function): Delete. * config/aarch64/aarch64-builtins.cc (aarch64_builtin_vectorized_function): Delete. * config/aarch64/aarch64.cc (TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Delete. * config/i386/i386.cc (ix86_optab_supported_p): Handle lround_optab. * config/i386/i386.md (lround2): Remove optimize_insn_for_size_p test. gcc/testsuite/ PR target/106253 * gcc.target/aarch64/vect_unary_1.c: Add tests for iroundf, llround, iceilf, llceil, ifloorf, llfloor, irintf and llrint. * gfortran.dg/vect/pr106253.f: New test.
[Bug target/106253] [13 Regression] ICE in vect_transform_loops, at tree-vectorizer.cc:1032
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106253 rsandifo at gcc dot gnu.org changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot gnu.org --- Comment #9 from rsandifo at gcc dot gnu.org --- Fixed for aarch64. I'll do the same thing for arm.
[Bug c++/105989] Coroutine frame space for temporaries in a co_await expression is not reused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105989 --- Comment #4 from Michal Jankovič --- Comment on attachment 53273 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53273 Experimental patch implementing the proposed transformation diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc index edb3b706ddc..ed1ac4decaf 100644 --- a/gcc/cp/coroutines.cc +++ b/gcc/cp/coroutines.cc @@ -1997,6 +1997,7 @@ struct local_var_info bool is_static; bool has_value_expr_p; location_t def_loc; + vec *field_access_path; }; /* For figuring out what local variable usage we have. */ @@ -2009,6 +2010,26 @@ struct local_vars_transform hash_map *local_var_uses; }; +/* Build a COMPONENT_REF chain for accessing a nested variable in the coroutine + frame. */ +static tree +build_local_var_frame_access_expr (local_vars_transform *lvt, + local_var_info *local_var) +{ + tree access_expr = lvt->actor_frame; + + for (tree path_elem_id : *local_var->field_access_path) +{ + tree path_elem_member = lookup_member ( + TREE_TYPE (access_expr), path_elem_id, 1, 0, tf_warning_or_error); + access_expr = build3_loc ( + lvt->loc, COMPONENT_REF, TREE_TYPE (path_elem_member), + access_expr, path_elem_member, NULL_TREE); +} + + return access_expr; +} + static tree transform_local_var_uses (tree *stmt, int *do_subtree, void *d) { @@ -2040,12 +2061,7 @@ transform_local_var_uses (tree *stmt, int *do_subtree, void *d) if (local_var.field_id == NULL_TREE) continue; /* Wasn't used. 
*/ - tree fld_ref - = lookup_member (lvd->coro_frame_type, local_var.field_id, -/*protect=*/1, /*want_type=*/0, -tf_warning_or_error); - tree fld_idx = build3_loc (lvd->loc, COMPONENT_REF, TREE_TYPE (lvar), -lvd->actor_frame, fld_ref, NULL_TREE); + tree fld_idx = build_local_var_frame_access_expr (lvd, &local_var); local_var.field_idx = fld_idx; SET_DECL_VALUE_EXPR (lvar, fld_idx); DECL_HAS_VALUE_EXPR_P (lvar) = true; @@ -3873,14 +3889,24 @@ analyze_fn_parms (tree orig) /* Small helper for the repetitive task of adding a new field to the coro frame type. */ +static void +coro_make_frame_entry_id (tree *field_list, tree id, tree fld_type, + location_t loc) +{ + tree decl = build_decl (loc, FIELD_DECL, id, fld_type); + DECL_CHAIN (decl) = *field_list; + *field_list = decl; +} + +/* Same as coro_make_frame_entry_id, but creates an identifier from string. */ + static tree coro_make_frame_entry (tree *field_list, const char *name, tree fld_type, location_t loc) { tree id = get_identifier (name); - tree decl = build_decl (loc, FIELD_DECL, id, fld_type); - DECL_CHAIN (decl) = *field_list; - *field_list = decl; + coro_make_frame_entry_id (field_list, id, fld_type, loc); + return id; } @@ -3894,6 +3920,8 @@ struct local_vars_frame_data location_t loc; bool saw_capture; bool local_var_seen; + tree orig; + vec *field_access_path; }; /* A tree-walk callback that processes one bind expression noting local @@ -3912,6 +3940,21 @@ register_local_var_uses (tree *stmt, int *do_subtree, void *d) if (TREE_CODE (*stmt) == BIND_EXPR) { + tree scope_field_id = NULL_TREE; + if (lvd->nest_depth != 0) + { + /* Create identifier under which fields for this bind-expression will +be accessed. 
*/ + char *scope_field_name + = xasprintf ("_Scope%u_%u", lvd->nest_depth, lvd->bind_indx); + scope_field_id = get_identifier (scope_field_name); + free (scope_field_name); + + vec_safe_push (lvd->field_access_path, scope_field_id); + } + + tree scope_variables = NULL_TREE; + tree lvar; unsigned serial = 0; for (lvar = BIND_EXPR_VARS (*stmt); lvar != NULL; @@ -3980,17 +4023,99 @@ register_local_var_uses (tree *stmt, int *do_subtree, void *d) /* TODO: Figure out if we should build a local type that has any excess alignment or size from the original decl. */ - local_var.field_id = coro_make_frame_entry (lvd->field_list, buf, + + local_var.field_id = coro_make_frame_entry (&scope_variables, buf, lvtype, lvd->loc); free (buf); /* We don't walk any of the local var sub-trees, they won't contain any bind exprs. */ + + local_var.field_access_path = make_tree_vector_copy ( + lvd->field_access_path); + vec_safe_push (local_var.field_access_path, local_var.field_id); } + + unsigned bind_indx = lvd->bind_indx; + tree* parent_field_list = lvd->field_list; + + /* Collect the scope structs of child bind-expressions when recursing. */ + tree child_scopes = NULL_TREE; +
[Bug c++/66290] wrong location for -Wunused-macros
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66290 --- Comment #4 from Lewis Hyatt --- OK, I understand now why done_lexing is necessary, plenty of places call back into libcpp after lexing, e.g. to interpret strings, and this may generate warnings. I think that one line patch is the way to go then. diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index b9f01a65ed7..25a3c50de8e 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1283,6 +1283,7 @@ c_common_finish (void) /* For performance, avoid tearing down cpplib's internal structures with cpp_destroy (). */ + done_lexing = false; cpp_finish (parse_in, deps_stream); if (deps_stream && deps_stream != out_stream && deps_stream != stdout
[Bug c++/105989] Coroutine frame space for temporaries in a co_await expression is not reused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105989 Michal Jankovič changed: What|Removed |Added Attachment #53273|0 |1 is obsolete|| --- Comment #5 from Michal Jankovič --- Created attachment 53290 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53290&action=edit Patch implementing the proposed transformation Submitted patch implementing the proposed transformation, along with a test case.
[Bug c++/105912] internal compiler error: in extract_call_expr, at cp/call.cc:7114
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105912 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org
[Bug target/106270] New: [Aarch64] -mlong-calls should be provided on aarch64 for users with large applications
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106270 Bug ID: 106270 Summary: [Aarch64] -mlong-calls should be provided on aarch64 for users with large applications Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: qinzhao at gcc dot gnu.org Target Milestone: --- Created attachment 53291 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53291&action=edit tar ball for the small testing case In the aarch64 backend, aarch64_is_long_call_p always returns FALSE by default; as a result, direct calls (bl) are generated for all calls. For smaller applications, generating direct calls by default gives good run-time performance. However, for larger applications, when the size of the .text section exceeds a certain limit, the linker fails with the following error: relocation truncated to fit: R_AARCH64_CALL26 against symbol This can be shown with the attached tar ball for a small testing case: 1. download the tar ball, and untar it; 2. cd aarch64_long_call 3. update the "GCC" in "build.sh", "assemble.sh" to your own gcc (gcc 8 and above all have the same issue). 4. sh do_all.sh gcc -S bar.c foo.c patching file bar.s Hunk #1 succeeded at 21 (offset -2 lines). gcc -c bar.s foo.s ld: warning: cannot find entry symbol _start; defaulting to 1000 bar.o: In function `bar': bar.c:(.text+0x10): relocation truncated to fit: R_AARCH64_CALL26 against symbol `foo' defined in .text section in foo.o
[Bug target/106270] [Aarch64] -mlong-calls should be provided on aarch64 for users with large applications
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106270 --- Comment #1 from qinzhao at gcc dot gnu.org --- from Jose Marchesi: We looked at this issue and these are our findings. - When this problem happens: When the linker (ld) fails to insert a veneer (that transforms the immediate bl jumps generated by GCC into an indirect call) in range of the original bl instruction. This happens when the program contains .text sections bigger than ~128MiB, or when the distance between a bl instruction and the beginning or end of the containing section exceeds ~128MiB. Note that this also happens with linker-generated PLT sections when linking shared objects, much like the veneers. - Proposed solutions: We can think of two complementary solutions: a) At the moment the aarch64 linker seems to only know how to insert a veneer _after_ the section that contains the branch/call instruction. We could make the linker smarter so it knows how to insert the veneer _before_ the section containing the branch/call instruction. This would make things better, but obviously would not work in all cases. b) So, in any case, we propose to add a new option to GCC (aarch64 specific) that will make GCC generate indirect branch instructions (blr) for non-PLT calls to symbols not having local binding/scope: -mlong-calls -> Make aarch64_is_long_call_p return true -> This makes GCC generate blr instead of bl when compiling non-PIC and calls to symbol references with non-local binding. -> This will have an impact on performance, but this will _not_ impact inter-section calls nor calls to local objects.
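The ~128MiB figure follows from the encoding of the bl instruction, which holds a signed 26-bit offset counted in 4-byte words. The sketch below checks whether a direct call between two addresses fits that encoding; the helper name `bl_in_range` is invented for this example and the code is not taken from the linker.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if a bl at address FROM can reach TO directly.
   The encodable byte offsets are the multiples of 4 in
   [-2^27, 2^27 - 4], i.e. roughly +/-128MiB. */
bool bl_in_range(uint64_t from, uint64_t to)
{
    int64_t delta = (int64_t)(to - from);

    if (delta % 4 != 0)
        return false;  /* instructions are 4-byte aligned */
    return delta >= -(INT64_C(1) << 27)
        && delta <= (INT64_C(1) << 27) - 4;
}
```

When this check fails and no veneer can be placed within reach, the linker reports the R_AARCH64_CALL26 truncation shown in the original report.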
[Bug tree-optimization/106249] [13 Regression] ICE in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:645
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106249 --- Comment #4 from Arseny Solokha --- Finally, a C testcase, and w/o -funreachable-traps: void foo (double *arr) { int i, j; for (i = 0; i < 4; ++i) for (j = 0; j < 4; ++j) arr[j] = 0; for (i = 1; i < 4; ++i) for (j = 0; j < 4; ++j) arr[j] = 1.0 / (i + 1); } % gcc-13.0.0 -O1 -floop-unroll-and-jam --param unroll-jam-min-percent=0 -c o87rfyb9.c during GIMPLE pass: unrolljam o87rfyb9.c: In function 'foo': o87rfyb9.c:2:1: internal compiler error: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:645 2 | foo (double *arr) | ^~~ 0x772a0b check_loop_closed_ssa_def /var/tmp/portage/sys-devel/gcc-13.0.0_p20220710/work/gcc-13-20220710/gcc/tree-ssa-loop-manip.cc:645 0x1064e6f check_loop_closed_ssa_bb /var/tmp/portage/sys-devel/gcc-13.0.0_p20220710/work/gcc-13-20220710/gcc/tree-ssa-loop-manip.cc:670 0x1066116 verify_loop_closed_ssa(bool, loop*) /var/tmp/portage/sys-devel/gcc-13.0.0_p20220710/work/gcc-13-20220710/gcc/tree-ssa-loop-manip.cc:695 0x1066116 verify_loop_closed_ssa(bool, loop*) /var/tmp/portage/sys-devel/gcc-13.0.0_p20220710/work/gcc-13-20220710/gcc/tree-ssa-loop-manip.cc:679 0x1068849 checking_verify_loop_closed_ssa /var/tmp/portage/sys-devel/gcc-13.0.0_p20220710/work/gcc-13-20220710/gcc/tree-ssa-loop-manip.h:34 0x1068849 tree_transform_and_unroll_loop(loop*, unsigned int, tree_niter_desc*, void (*)(loop*, void*), void*) /var/tmp/portage/sys-devel/gcc-13.0.0_p20220710/work/gcc-13-20220710/gcc/tree-ssa-loop-manip.cc:1431 0x1cf56cc tree_loop_unroll_and_jam /var/tmp/portage/sys-devel/gcc-13.0.0_p20220710/work/gcc-13-20220710/gcc/gimple-loop-jam.cc:595
[Bug target/106270] [Aarch64] -mlong-calls should be provided on aarch64 for users with large applications
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106270 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #2 from Wilco --- GCC will crash well before reaching 128 MBytes of .text. So what is the real underlying problem? Note that GCC could split huge .text sections automatically to allow insertion of linker veneers every 128MB. So -mlong-calls is simply an incorrect solution for a problem that doesn't exist yet...
[Bug target/106270] [Aarch64] -mlong-calls should be provided on aarch64 for users with large applications
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106270 --- Comment #3 from Jose E. Marchesi --- Wilco: The assessment in comment 1 was extracted from an internal discussion on an issue that is still under investigation. We are certainly hitting a can't-reach-the-linker-generated-veneer problem, but it is not fully clear to us how, since it is getting difficult to get proper reproducers. In any case, the idea of having the compiler split the text section is interesting, and a much better solution than -mlong-calls since it wouldn't involve generating unnecessary indirect branches. But how would the back-end keep track of the size of the code it generates? Using insn size attributes?
[Bug target/106271] New: Bootstrap on RISC-V on Ubuntu 22.04 LTS: bits/libc-header-start.h: No such file or directory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106271 Bug ID: 106271 Summary: Bootstrap on RISC-V on Ubuntu 22.04 LTS: bits/libc-header-start.h: No such file or directory Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- I thought I would give the new gcc92 machine a spin and tried bootstrapping gcc on it. Configure was done with ../Gcc/configure --disable-multilib --prefix=$HOME --enable-languages=c,c++,fortran and the last few lines of output were /home/tkoenig/trunk-bin/./gcc/xgcc -B/home/tkoenig/trunk-bin/./gcc/ -B/home/tkoenig/riscv64-unknown-linux-gnu/bin/ -B/home/tkoenig/riscv64-unknown-linux-gnu/lib/ -isystem /home/tkoenig/riscv64-unknown-linux-gnu/include -isystem /home/tkoenig/riscv64-unknown-linux-gnu/sys-include -fno-checking -g -O2 -O2 -g -O2 -DIN_GCC-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -fPIC -I. -I. -I../.././gcc -I../../../Gcc/libgcc -I../../../Gcc/libgcc/. -I../../../Gcc/libgcc/../gcc -I../../../Gcc/libgcc/../include -DHAVE_CC_TLS -o _ashrdi3.o -MT _ashrdi3.o -MD -MP -MF _ashrdi3.dep -DL_ashrdi3 -c ../../../Gcc/libgcc/libgcc2.c -fvisibility=hidden -DHIDE_EXPORTS In file included from ../../../Gcc/libgcc/../gcc/tsystem.h:87, from ../../../Gcc/libgcc/libgcc2.c:27: /usr/include/stdio.h:27:10: fatal error: bits/libc-header-start.h: No such file or directory 27 | #include | ^~ In file included from ../../../Gcc/libgcc/../gcc/tsystem.h:87, from ../../../Gcc/libgcc/libgcc2.c:27: /usr/include/stdio.h:27:10: fatal error: bits/libc-header-start.h: No such file or directory 27 | #include | ^~ compilation terminated. compilation terminated. 
In file included from ../../../Gcc/libgcc/../gcc/tsystem.h:87, from ../../../Gcc/libgcc/libgcc2.c:27: /usr/include/stdio.h:27:10: fatal error: bits/libc-header-start.h: No such file or directory 27 | #include | ^~ compilation terminated. make[3]: *** [Makefile:501: _negdi2.o] Error 1 make[3]: *** Waiting for unfinished jobs make[3]: *** [Makefile:501: _lshrdi3.o] Error 1 make[3]: *** [Makefile:501: _ashldi3.o] Error 1 In file included from ../../../Gcc/libgcc/../gcc/tsystem.h:87, from ../../../Gcc/libgcc/libgcc2.c:27: /usr/include/stdio.h:27:10: fatal error: bits/libc-header-start.h: No such file or directory 27 | #include | ^~ compilation terminated. make[3]: *** [Makefile:501: _ashrdi3.o] Error 1
[Bug target/106270] [Aarch64] -mlong-calls should be provided on aarch64 for users with large applications
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106270 --- Comment #4 from Qing Zhao ---
> On Jul 12, 2022, at 1:02 PM, wilco at gcc dot gnu.org wrote:
>
> Note that GCC could split huge .text sections automatically to allow insertion
> of linker veneers every 128MB.

Does GCC do this by default? Is any option needed for this functionality?
[Bug target/106271] Bootstrap on RISC-V on Ubuntu 22.04 LTS: bits/libc-header-start.h: No such file or directory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106271 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |WAITING Ever confirmed|0 |1 Last reconfirmed||2022-07-12 --- Comment #1 from Andrew Pinski --- I suspect configure is not detecting multi-arch correctly or --disable-multilib interacting with multi-arch support which causes things to be broken. OR multi-arch support is not in the riscv backend yet.
[Bug fortran/106049] ICE in gfc_simplify_pack, at fortran/simplify.cc:6481
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106049 --- Comment #4 from CVS Commits --- The master branch has been updated by Harald Anlauf : https://gcc.gnu.org/g:6e9d5dfc2911e3acc6039ebfe3837e7ba4be197f commit r13-1650-g6e9d5dfc2911e3acc6039ebfe3837e7ba4be197f Author: Harald Anlauf Date: Tue Jul 5 22:20:05 2022 +0200 Fortran: error recovery simplifying PACK with invalid arguments [PR106049] gcc/fortran/ChangeLog: PR fortran/106049 * simplify.cc (is_constant_array_expr): A non-zero-sized constant array shall have a non-empty constructor. When the constructor is empty or missing, treat as non-constant. gcc/testsuite/ChangeLog: PR fortran/106049 * gfortran.dg/pack_simplify_1.f90: New test.
[Bug fortran/106049] ICE in gfc_simplify_pack, at fortran/simplify.cc:6481
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106049 anlauf at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from anlauf at gcc dot gnu.org --- Fixed.
[Bug target/106270] [Aarch64] -mlong-calls should be provided on aarch64 for users with large applications
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106270 --- Comment #5 from Wilco --- (In reply to Jose E. Marchesi from comment #3)
> Wilco: The assessment in comment 1 was extracted from an internal discussion
> on an issue that is still under investigation. We are certainly hitting a
> can't-reach-the-linker-generated-veneer problem, but it is not fully clear to
> us how, since it is getting difficult to get proper reproducers.

It is worth checking you're using a recent binutils since old ones had a bug in the veneer code (https://sourceware.org/bugzilla/show_bug.cgi?id=25665). You can hit offset limits easily if you use a linker script which places text sections very far apart. As the example shows, incorrect use of alignment directives can cause issues as well. Ideally the assembler should give a warning if there is a text section larger than 127 MB.

> In any case, the idea of having the compiler split the text section is
> interesting, and a much better solution than -mlong-calls since it wouldn't
> involve generating unnecessary indirect branches.
>
> But how would the back-end keep track of the size of the code it generates?
> Using insn size attributes?

Yes, GCC tracks branch ranges. CBZ and TBZ have a small range and are automatically handled if out of range. IIRC GCC doesn't yet extend Bcc, so if a single function is over 1MB, GCC won't be able to compile it.
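The 128MB and 1MB figures in this thread follow from the immediate widths of the A64 PC-relative branch encodings. A rough sketch of that arithmetic, not taken from GCC or binutils: the helper names `branch_reach_bytes` and `fits_branch` are made up for illustration, and the field widths assumed are imm26 for B/BL, imm19 for B.cond/CBZ/CBNZ and imm14 for TBZ/TBNZ, each scaled by the 4-byte instruction size.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kMiB = 1024 * 1024;

// Hypothetical helper: backward reach in bytes of a PC-relative branch whose
// offset is a signed immediate of `imm_bits` bits, counted in 4-byte units.
constexpr std::int64_t branch_reach_bytes(unsigned imm_bits) {
    return (std::int64_t{1} << (imm_bits - 1)) * 4;
}

// Hypothetical helper: true if `offset` (target minus branch address) can be
// encoded directly. The forward limit is one instruction short of the
// backward limit because the immediate is two's complement.
constexpr bool fits_branch(std::int64_t offset, unsigned imm_bits) {
    return offset % 4 == 0
        && offset >= -branch_reach_bytes(imm_bits)
        && offset <= branch_reach_bytes(imm_bits) - 4;
}

static_assert(branch_reach_bytes(26) == 128 * kMiB, "B/BL: +/-128 MiB");
static_assert(branch_reach_bytes(19) == 1 * kMiB, "B.cond/CBZ/CBNZ: +/-1 MiB");
static_assert(branch_reach_bytes(14) == 32 * 1024, "TBZ/TBNZ: +/-32 KiB");
```

A call whose target fails `fits_branch(offset, 26)` is the case where the linker has to route the branch through a veneer, which in turn must itself be within range of the call site.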
[Bug target/106270] [Aarch64] -mlong-calls should be provided on aarch64 for users with large applications
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106270 --- Comment #6 from Wilco --- (In reply to Qing Zhao from comment #4)
> > On Jul 12, 2022, at 1:02 PM, wilco at gcc dot gnu.org wrote:
> >
> > Note that GCC could split huge .text sections automatically to allow
> > insertion of linker veneers every 128MB.
>
> Does GCC do this by default? Is any option needed for this functionality?

No, currently it is not able to reach this limit, but once it can, it should be done automatically.
[Bug c/106272] New: clang build: new warning ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106272 Bug ID: 106272 Summary: clang build: new warning ? Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: dcb314 at hotmail dot com Target Milestone: --- Recent gcc builds with clang say this: libcpp/include/line-map.h:1882:12: warning: moving a temporary object prevents copy elision [-Wpessimizing-move] I have little idea what that means, but the line of code is return std::move (label_text (buffer, true)); This new warning appears about 700 times, so it might be important.
[Bug preprocessor/106272] clang build: new warning ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106272 Jonathan Wakely changed: What|Removed |Added Last reconfirmed||2022-07-12 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #1 from Jonathan Wakely --- (In reply to David Binderman from comment #0) > This new warning appears about 700 times, so it might be important. It's not. But the move is useless and shouldn't be there.
[Bug target/106273] New: [13 Regression] wrong code with -Og -march=cascadelake (due to ANDN?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106273 Bug ID: 106273 Summary: [13 Regression] wrong code with -Og -march=cascadelake (due to ANDN?) Product: gcc Version: 13.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 53292 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53292&action=edit reduced testcase Output: $ x86_64-pc-linux-gnu-gcc -Og -march=cascadelake testcase.c $ ./a.out Aborted The value is 10 instead of 5. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r13-1649-20220712164051-gcab411a2b4b-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/13.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r13-1649-20220712164051-gcab411a2b4b-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 13.0.0 20220712 (experimental) (GCC)
[Bug target/106273] [13 Regression] wrong code with -Og -march=cascadelake (due to ANDN?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106273 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |13.0
[Bug preprocessor/106272] clang build: new warning ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106272 --- Comment #2 from David Binderman --- I just noticed a similar warning four lines earlier: libcpp/include/line-map.h:1876:12: warning: moving a temporary object prevents copy elision [-Wpessimizing-move] Source code is return std::move (label_text (const_cast (buffer), false)); About 700 mentions of this one, as well.
[Bug lto/106274] New: Loss of macro tracking information with -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106274 Bug ID: 106274 Summary: Loss of macro tracking information with -flto Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: lhyatt at gcc dot gnu.org CC: marxin at gcc dot gnu.org Target Milestone: ---
This is related to PR101551 but easier to demonstrate a testcase for:

===
#define X(p) p == 0
int f(void *) __attribute__((nonnull));
int f(void *p) {
    return X(p);
}

When compiled without -flto, there is information on the macro expansion in the diagnostics:

$ gcc -c t5.c -Wnonnull-compare
/home/lewis/t5.c: In function ‘f’:
/home/lewis/t5.c:1:16: warning: nonnull argument ‘p’ compared to NULL [-Wnonnull-compare]
    1 | #define X(p) p == 0
      |                ^
/home/lewis/t5.c:4:12: note: in expansion of macro ‘X’
    4 |     return X(p);
      |            ^

However, if you add -flto, you don't get the extra information:

$ gcc -c t5.c -Wnonnull-compare -flto
/home/lewis/t5.c: In function ‘f’:
/home/lewis/t5.c:1:16: warning: nonnull argument ‘p’ compared to NULL [-Wnonnull-compare]
    1 | #define X(p) p == 0
      |                ^

The reason is that this warning is generated after the ipa_free_lang_data pass, and that does this:

  /* If we are the LTO frontend we have freed lang-specific data already.  */
  if (in_lto_p
      || (!flag_generate_lto && !flag_generate_offload))
    {
      /* Rebuild type inheritance graph even when not doing LTO to get
         consistent profile data.  */
      rebuild_type_inheritance_graph ();
      return 0;
    }
  ...
  /* Reset diagnostic machinery.  */
  tree_diagnostics_defaults (global_dc);

With -flto, flag_generate_lto is true, so it doesn't return early, and proceeds to the last line, which resets the diagnostic finalizer to default_diagnostic_finalizer, which is not aware of virtual locations.

PR101551 is about more or less the same thing except it's the other case that prevents this function from returning early (flag_generate_offload == true). I am not sure to what extent they are otherwise related.

Is it possible to avoid resetting the diagnostics machinery in either of these cases? Thanks...
[Bug target/106265] RISC-V SPEC2017 507.cactu code bloat due to address generation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106265 --- Comment #7 from Vineet Gupta --- (In reply to Richard Biener from comment #5)
> So why do we even emit unsupported 'li 4096' and leave it to the linker to
> "optimize(?)"?

li 4096 is really a pseudo-op; LUI is used to build 32-bit constants. For this problem, whether LI or LUI is used is just a detail.

> OTOH LRA rematerialization also could be the culprit, thinking rematerializing
> the constant is cheaper than spilling a register holding it.

Thanks for the pointer. I tried disabling it with -fno-lra-remat and it generates exactly the same code.
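For readers unfamiliar with why `li rx, 4096` and `lui rx, 1` are the same thing: LUI places a 20-bit immediate in bits 31:12, and ADDI then adds a sign-extended 12-bit immediate. A small sketch of how a 32-bit constant splits into those two immediates follows. This is illustrative arithmetic only, not the GCC or binutils implementation; `LuiAddi`, `split_li` and `recombine` are hypothetical names, and the RV64 sign-extension of the LUI result is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: the two immediates that could expand `li rd, value`
// for a 32-bit constant into `lui rd, hi20` followed by `addi rd, rd, lo12`.
struct LuiAddi {
    std::uint32_t hi20;  // immediate for LUI (ends up in bits 31:12)
    std::int32_t lo12;   // sign-extended 12-bit immediate for ADDI
};

// Rebuild the 32-bit constant from the two immediates (modulo 2^32).
inline std::uint32_t recombine(const LuiAddi& parts) {
    return (parts.hi20 << 12) + static_cast<std::uint32_t>(parts.lo12);
}

inline LuiAddi split_li(std::uint32_t value) {
    // ADDI sign-extends its immediate, so a low part >= 0x800 is negative...
    std::int32_t lo = static_cast<std::int32_t>(value & 0xFFFu);
    if (lo >= 0x800) lo -= 0x1000;
    // ...and the high part is rounded up by 0x800 to compensate.
    const std::uint32_t hi = ((value + 0x800u) >> 12) & 0xFFFFFu;
    const LuiAddi parts{hi, lo};
    assert(recombine(parts) == value);
    return parts;
}
```

With this split, 4096 gives `hi20 == 1` and `lo12 == 0`, so the ADDI is unnecessary and a single `lui rx, 1` suffices.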
[Bug preprocessor/106272] clang build: new warning ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106272 Eric Gallager changed: What|Removed |Added Keywords||build, diagnostic CC||egallager at gcc dot gnu.org --- Comment #3 from Eric Gallager --- Note that GCC has its own version of -Wpessimizing-move, too... any idea why clang's version of the flag catches it, but gcc's doesn't?
[Bug target/106271] Bootstrap on RISC-V on Ubuntu 22.04 LTS: bits/libc-header-start.h: No such file or directory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106271 --- Comment #2 from Thomas Koenig --- (In reply to Andrew Pinski from comment #1) > I suspect configure is not detecting multi-arch correctly or > --disable-multilib interacting with multi-arch support which causes things > to be broken. > OR multi-arch support is not in the riscv backend yet. I get the same result without --disable-multilib.
[Bug libstdc++/106275] New: unordered_map with std::string key, std::hash, and custom equality predicate weirdness
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106275 Bug ID: 106275 Summary: unordered_map with std::string key, std::hash, and custom equality predicate weirdness Product: gcc Version: 12.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: cuzdav at gmail dot com Target Milestone: ---
g++ 12.1 (on Linux). Does not occur on GCC 11 or earlier.

std::unordered_map<std::string, int, std::hash<std::string>, CustomPred> acts strangely: finds do not seem to use the hasher properly, and seem to use a linear search, invoking the equality predicate against every key. A custom hasher implemented in terms of std::hash<std::string> fixes it, as does using the default equality predicate. It also does not happen with a key type of, say, "int". I've only seen it for std::string (in my limited experimentation).

I added output to the predicate to indicate when it's called, and it shows excessive calls printed when the "SHOWBUG" macro is defined.

#include <functional>
#include <iostream>
#include <string>
#include <unordered_map>

struct EqualToWrapper{
    bool operator()(const std::string& key1, const std::string& key2) const {
        std::cout << "equal_to(key1=" << key1 << ", key2=" << key2 << ")\n";
        return std::equal_to<>{}(key1, key2);
    }
};

#ifdef SHOWBUG
using UsageMap = std::unordered_map<std::string, int, std::hash<std::string>, EqualToWrapper>;
#else
struct MyHash : public std::hash<std::string> {};
using UsageMap = std::unordered_map<std::string, int, MyHash, EqualToWrapper>;
#endif

int main() {
    UsageMap m;
    m.insert(std::make_pair("A", 111));
    m.insert(std::make_pair("B", 222));
    m.insert(std::make_pair("C", 333));
    m.insert(std::make_pair("D", 444));
    m.insert(std::make_pair("E", 555));
    m.insert(std::make_pair("F", 666));
    m.find("foo");
}

With a custom equality predicate and my derived-from-std::hash hasher, output on g++ 12 is:

equal_to(key1=C, key2=A)

When run with -DSHOWBUG macro defined, output is:

equal_to(key1=B, key2=A)
equal_to(key1=C, key2=B)
equal_to(key1=C, key2=A)
equal_to(key1=D, key2=B)
equal_to(key1=D, key2=C)
equal_to(key1=D, key2=A)
equal_to(key1=E, key2=B)
equal_to(key1=E, key2=D)
equal_to(key1=E, key2=C)
equal_to(key1=E, key2=A)
equal_to(key1=F, key2=E)
equal_to(key1=F, key2=B)
equal_to(key1=F, key2=D)
equal_to(key1=F, key2=C)
equal_to(key1=F, key2=A)
equal_to(key1=foo, key2=F)
equal_to(key1=foo, key2=E)
equal_to(key1=foo, key2=B)
equal_to(key1=foo, key2=D)
equal_to(key1=foo, key2=C)
equal_to(key1=foo, key2=A)

On Godbolt: https://godbolt.org/z/GP5dox1qs

$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/opt/imc/gcc-12.1.0/libexec/gcc/x86_64-pc-linux-gnu/12.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-12.1.0/configure --prefix=/opt/imc/gcc-12.1.0 --enable-languages=c,c++,fortran,lto --disable-multilib --with-build-time-tools=/build/INSTALLDIR//opt/imc/gcc-12.1.0/bin --enable-libstdcxx-time=rt
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.1.0 (GCC)
[Bug libstdc++/106275] unordered_map with std::string key, std::hash, and custom equality predicate weirdness
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106275 --- Comment #1 from Chris Uzdavinis --- Created attachment 53293 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53293&action=edit preprocessed code
[Bug tree-optimization/106246] [13 Regression] powerpc-darwin9 bootstrap fails after r13-1575-gcf3a120084e946 ICE vect_transform_loops, at tree-vectorizer.cc:1032
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106246 Iain Sandoe changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #8 from Iain Sandoe --- indeed, fixed by r13-1603-g415d2c38edad thanks.
[Bug libstdc++/106248] [11/12/13 Regression] operator>>std::basic_istream at boundary condition behave differently in different opt levels
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106248 --- Comment #8 from CVS Commits --- The master branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:5ae74944af1de032d4a27fad4a2287bd3a2163fd commit r13-1651-g5ae74944af1de032d4a27fad4a2287bd3a2163fd Author: Jonathan Wakely Date: Tue Jul 12 11:18:47 2022 +0100 libstdc++: Check for EOF if extraction avoids buffer overflow [PR106248] In r11-2581-g17abcc77341584 (for LWG 2499) I added overflow checks to the pre-C++20 operator>>(istream&, char*) overload. Those checks can cause extraction to stop after filling the buffer, where previously it would have tried to extract another character and stopped at EOF. When that happens we no longer set eofbit in the stream state, which is consistent with the behaviour of the new C++20 overload, but is an observable and unexpected change in the C++17 behaviour. What makes it worse is that the behaviour change is dependent on optimization, because __builtin_object_size is used to detect the buffer size and that only works when optimizing. To avoid the unexpected and optimization-dependent change in behaviour, set eofbit manually if we stopped extracting because of the buffer size check, but had reached EOF anyway. If the stream's rdstate() != goodbit or width() is non-zero and smaller than the buffer, there's nothing to do. Otherwise, we filled the buffer and need to check for EOF, and maybe set eofbit. The new check is guarded by #ifdef __OPTIMIZE__ because otherwise __builtin_object_size is useless. There's no point compiling and emitting dead code that can't be eliminated because we're not optimizing. We could add extra checks that the next character in the buffer is not whitespace, to detect the case where we stopped early and prevented a buffer overflow that would have happened otherwise. That would allow us to assert or set badbit in the stream state when undefined behaviour was prevented. 
However, those extra checks would increase the size of the function, potentially reducing the likelihood of it being inlined, and so making the buffer size detection less reliable. It seems preferable to prevent UB and silently truncate, rather than miss the UB and allow the overflow to happen. libstdc++-v3/ChangeLog: PR libstdc++/106248 * include/std/istream [C++17] (operator>>(istream&, char*)): Set eofbit if we stopped extracting at EOF. * testsuite/27_io/basic_istream/extractors_character/char/pr106248.cc: New test. * testsuite/27_io/basic_istream/extractors_character/wchar_t/pr106248.cc: New test.
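The boundary case this commit addresses is an input that exactly fills the destination buffer. A minimal sketch of that situation follows; `ExtractResult` and `extract_exact_fit` are names invented here, not part of the new testsuite files. Whether `eof` ends up set depends on the -std level, the optimization level and the libstdc++ version, as the commit message explains, so the sketch records the flag without assuming its value.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper type: what was extracted and the resulting stream state.
struct ExtractResult {
    std::string text;
    bool eof;
    bool fail;
};

// "abc" plus the terminating NUL exactly fills a 4-byte buffer, so the
// buffer-size check and the end of input are reached at the same time.
inline ExtractResult extract_exact_fit() {
    std::istringstream in("abc");
    char buf[4];
    in >> buf;  // pre-C++20: operator>>(istream&, char*); C++20: array overload
    return ExtractResult{std::string(buf), in.eof(), in.fail()};
}
```

Printing `extract_exact_fit().eof` from builds at -O0 and -O2 with -std=c++17 is one way to observe the optimization-dependent difference described above on an affected release.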
[Bug libstdc++/106248] [11/12 Regression] operator>>std::basic_istream at boundary condition behave differently in different opt levels
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106248 Jonathan Wakely changed: What|Removed |Added Known to fail||12.1.0 Summary|[11/12/13 Regression] |[11/12 Regression] |operator>>std::basic_istrea |operator>>std::basic_istrea |m at boundary condition |m at boundary condition |behave differently in |behave differently in |different opt levels|different opt levels --- Comment #9 from Jonathan Wakely --- Fixed on trunk only so far.
[Bug libstdc++/106275] unordered_map with std::string key, std::hash, and custom equality predicate weirdness
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106275 --- Comment #2 from Jonathan Wakely --- The difference in behaviour is due to r12-6272-ge3ef832a9e8d6a
[Bug libstdc++/106275] unordered_map with std::string key, std::hash, and custom equality predicate weirdness
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106275 --- Comment #3 from Jonathan Wakely ---
  template<typename _Hash>
    struct _Hashtable_hash_traits
    {
      static constexpr std::size_t
      __small_size_threshold() noexcept
      { return std::__is_fast_hash<_Hash>::value ? 0 : 20; }
    };

This new trait determines whether to just do a linear search in a small container:

  template<typename _Key, typename _Value, typename _Alloc,
           typename _ExtractKey, typename _Equal,
           typename _Hash, typename _RangeHash, typename _Unused,
           typename _RehashPolicy, typename _Traits>
    auto
    _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
               _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
    find(const key_type& __k)
    -> iterator
    {
      if (size() <= __small_size_threshold())
        {
          for (auto __it = begin(); __it != end(); ++__it)
            if (this->_M_key_equals(__k, *__it._M_cur))
              return __it;
          return end();
        }

      __hash_code __code = this->_M_hash_code(__k);
      std::size_t __bkt = _M_bucket_index(__code);
      return iterator(_M_find_node(__bkt, __k, __code));
    }

The __is_fast_hash trait is true for std::hash<std::string> but false for your custom hash function.
[Bug libstdc++/106275] unordered_map with std::string key, std::hash, and custom equality predicate weirdness
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106275 --- Comment #4 from Jonathan Wakely --- (In reply to Jonathan Wakely from comment #3)
> The __is_fast_hash trait is true for std::hash<std::string> but false for
> your custom hash function.

Oops, I mean the other way around! std::hash<std::string> is considered slow, your custom hash function is not.
[Bug libstdc++/106275] unordered_map with std::string key, std::hash, and custom equality predicate weirdness
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106275 --- Comment #5 from Jonathan Wakely --- This "bug" (i.e. doing a linear search for small containers using slow hash functions) makes the find operation more than twice as fast:

BM_std_hash          4.67 ns    4.66 ns   150610579
BM_custom_hash       12.0 ns    12.0 ns    57529241

See PR 68303 for more details.
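The reason a hasher that merely derives from std::hash<std::string> changes the behaviour is that the trait is keyed on the exact hasher type. The toy model below is not libstdc++ code: `is_fast_hash_model`, `small_size_threshold`, `uses_linear_scan` and `DerivedHash` are made-up names, and only the threshold of 20 is taken from the snippet quoted in comment #3.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>

// Toy stand-in for the library trait: every hasher counts as fast...
template <typename Hash>
struct is_fast_hash_model : std::true_type {};

// ...except this exact type. A class derived from it is a different type and
// therefore falls back to the primary template above.
template <>
struct is_fast_hash_model<std::hash<std::string>> : std::false_type {};

struct DerivedHash : std::hash<std::string> {};

template <typename Hash>
constexpr std::size_t small_size_threshold() {
    return is_fast_hash_model<Hash>::value ? 0 : 20;
}

// True when a lookup would scan all elements with the equality predicate
// instead of hashing the key first.
template <typename Hash>
constexpr bool uses_linear_scan(std::size_t size) {
    return size <= small_size_threshold<Hash>();
}
```

Under this model the six-element map from the report scans linearly with `std::hash<std::string>` and hashes with `DerivedHash`, which matches the two outputs shown in comment #0.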
[Bug preprocessor/106272] clang build: new warning ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106272 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org --- Comment #4 from Marek Polacek --- (In reply to Eric Gallager from comment #3) > Note that GCC has its own version of -Wpessimizing-move, too... any idea why > clang's version of the flag catches it, but gcc's doesn't? No, but I'm going to reduce libcpp/line-map.ii to create a testcase and file a bug and since it's my warning, maybe even fix it.
[Bug preprocessor/106272] clang build: new warning ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106272 Marek Polacek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |mpolacek at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #5 from Marek Polacek --- And while at it, why don't I fix this one.
[Bug c++/106276] New: Missing -Wpessimizing-move warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106276 Bug ID: 106276 Summary: Missing -Wpessimizing-move warning Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: mpolacek at gcc dot gnu.org Target Milestone: ---
Looks like we're missing a warning on the std::move line here:

namespace std {
template <typename _Tp> _Tp &&move(_Tp &&);
}
char take_buffer;
struct label_text {
  label_text take() { return std::move(label_text(&take_buffer)); }
  label_text(char *);
};

$ xclang++ -c -Wall -W line-map.ii
line-map.ii:6:30: warning: moving a temporary object prevents copy elision [-Wpessimizing-move]
  label_text take() { return std::move(label_text(&take_buffer)); }
                             ^
line-map.ii:6:30: note: remove std::move call here
  label_text take() { return std::move(label_text(&take_buffer)); }
                             ^~~
$ xg++ -c -Wall -W line-map.ii
$
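To see what the warning is about, one can count move constructions with and without the std::move call. The sketch below uses a made-up type `Tracked` and made-up function names (nothing here comes from libcpp), and assumes C++17 or a compiler that performs copy elision by default. A compiler implementing -Wpessimizing-move may warn on `make_with_move`, which is the point of the example.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how often its move constructor runs.
struct Tracked {
    static int moves;
    explicit Tracked(int) {}
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::moves = 0;

// std::move turns the temporary into an xvalue, so the return object has to
// be move-constructed from it and elision is no longer possible.
inline Tracked make_with_move() { return std::move(Tracked(1)); }

// Returning the prvalue directly lets it be constructed in place.
inline Tracked make_plain() { return Tracked(1); }

inline int moves_with_move() {
    Tracked::moves = 0;
    Tracked t = make_with_move();
    (void)t;
    return Tracked::moves;
}

inline int moves_plain() {
    Tracked::moves = 0;
    Tracked t = make_plain();
    (void)t;
    return Tracked::moves;
}
```

Under those assumptions `moves_with_move()` reports one move and `moves_plain()` reports none, which is why the std::move in the reduced testcase is useless at best.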
[Bug preprocessor/106272] clang build: new warning ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106272 Marek Polacek changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=106276 --- Comment #6 from Marek Polacek --- (In reply to Marek Polacek from comment #4) > (In reply to Eric Gallager from comment #3) > > Note that GCC has its own version of -Wpessimizing-move, too... any idea why > > clang's version of the flag catches it, but gcc's doesn't? > > No, but I'm going to reduce libcpp/line-map.ii to create a testcase and file > a bug and since it's my warning, maybe even fix it. Bug 106276.
[Bug middle-end/106277] New: missed-optimization: redundant movzx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106277 Bug ID: 106277 Summary: missed-optimization: redundant movzx Product: gcc Version: 12.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: jl1184 at duke dot edu Target Milestone: ---
I came across this when examining a loop that runs slower than I expected. It involves explicit and implicit conversions between 8-bit and 32/64-bit values, and as I looked through the generated assembly using Godbolt compiler explorer, I found lots of movzx instructions that don't seem to break a dependency or play a role in correctness, not to mention that many use the same register, like "movzx eax, al", which cannot be eliminated.

I then tried some simple examples on Godbolt with x86-64 GCC 12.1, and found that this behavior is persistent and easily reproducible, even when I specify "-march=skylake". Here's an example:

#include <stdint.h>

int add2bytes(uint8_t* a, uint8_t* b) {
    return uint8_t(*a + *b);
}

gcc -O3 gives:

add2bytes(unsigned char*, unsigned char*):
        movzx   eax, BYTE PTR [rsi]
        add     al, BYTE PTR [rdi]
        movzx   eax, al
        ret

The first movzx here breaks the dependency on the old eax value, but what is the second movzx doing? I don't think there's any dependency it can break, and it shouldn't affect the result either.

I also asked this on Stack Overflow, and Peter Cordes has a great response (https://stackoverflow.com/a/72953035/14730360) explaining how this extra movzx is bad for the vast majority of x86-64 processors. IMHO newer versions of GCC should give newer processors more weight in performance tradeoffs. Probably -mtune=generic in a later GCC shouldn't care about P6-family partial-register stalls. In practice, very few people should still be using those CPUs to run newly compiled software.

Godbolt link with code for examples: https://godbolt.org/z/4n6ezaav7

Here's another example closer to what I was originally examining:

int foo(uint8_t* a, uint8_t i, uint8_t j) {
    return a[a[i] | a[j]];
}

gcc -O3 gives:

foo(unsigned char*, unsigned char, unsigned char):
        movzx   esi, sil
        movzx   edx, dl
        movzx   eax, BYTE PTR [rdi+rsi]
        or      al, BYTE PTR [rdi+rdx]
        movzx   eax, al
        movzx   eax, BYTE PTR [rdi+rax]
        ret

As was discussed in the Stack Overflow post, the first 2 movzx should be changed to use different registers so that some CPUs can benefit from mov elimination. The "movzx eax, al" just seems unnecessary. The upper bits of RAX should already be cleared, and the dependency of RAX on the "or" is not something that "movzx eax, al" can break. So I think it's better to just do "movzx eax, byte ptr [rdi + rax]" after the "or". Or maybe even better, just use "mov eax, byte ptr [rdi + rax]" since EAX should already be free and cleaned in upper bits at this point.
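For reference, the value the first function has to return is the 8-bit wrapped sum, zero-extended to int. The sketch below restates that in portable C++ with const-qualified parameters; `add2bytes_masked` is a name introduced here for comparison and does not appear in the report.

```cpp
#include <cassert>
#include <cstdint>

// Same computation as in the report: the sum is formed in int, truncated to
// 8 bits by the cast, then zero-extended back to int for the return value.
inline int add2bytes(const std::uint8_t* a, const std::uint8_t* b) {
    return std::uint8_t(*a + *b);
}

// Equivalent formulation with an explicit mask instead of the cast.
inline int add2bytes_masked(const std::uint8_t* a, const std::uint8_t* b) {
    return (*a + *b) & 0xFF;
}
```

For inputs 200 and 100 both functions return 44 (300 modulo 256). In the assembly quoted above, the leading movzx already zeroes bits 8 and up of eax and the byte-sized add only writes al, which is the reporter's argument for the trailing "movzx eax, al" being redundant.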
[Bug tree-optimization/106237] [13 regression] serveral tests begin ICEing starting with r13-1575-gcf3a120084e946
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106237 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|WAITING |RESOLVED --- Comment #3 from Richard Biener --- (In reply to seurer from comment #2)
> They use whichever mcpu matches the machine.
>
> The ICEs are fixed but there is a different problem introduced with your fix
> g:79f18ac6b7ab7744fcf8937ea4bc0c40f3efc629, r13-1599-g79f18ac6b7ab77

That just exposed what previously ICEd I think.

> make -k check-gcc RUNTESTFLAGS="powerpc.exp=gcc.target/powerpc/pr56605.c"
> FAIL: gcc.target/powerpc/pr56605.c scan-rtl-dump-times combine
> "\\(compare:CC \\((?:and|zero_extend):(?:[SD]I) \\((?:sub)?reg:[SD]I" 1
>
> which might just be the test case needing updating I suppose. It occurs on
> the same machines as the original problem. It still fails with current
> trunk.

I can see this with a cross to ppc64le as well; the pattern matches two times. It's not clear to me what the testcase intends to test, since it lacks a comment :/

c3d2600cfb476^ also exhibits this problem, and so does d2a89809^, so this problem must have existed for a longer time and seems unrelated to the issue at hand. Can you properly bisect and open a different bug report for this?
[Bug lto/106274] Loss of macro tracking information with -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106274 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2022-07-13 Version|unknown |12.1.0 Keywords||diagnostic, lto Status|UNCONFIRMED |NEW --- Comment #1 from Richard Biener --- I think the point is that in free-lang-data we are bulldozering over data structures that the frontend might not be happy about (in the attempt to make the streamed IL small), so we try to reset all callbacks into frontend code that might crash. I'm not sure to what extent this is still required with respect to the diagnostic context though - you'd have to try. There's also the old long-standing TODO to perform this "frontend scrapping" also when not using -flto just to save on memory for the followup optimization (but this runs into the same issue that late diagnostics then appear "mangled").