GCC 12.0.0 Status Report (2021-10-01), Stage 3 to start Nov 15th
Status
======

The GCC development branch is open for general development (Stage 1),
but the two-month general bugfixing period (Stage 3) is ahead, with
historical data telling us to expect it to start Nov 15th and last
through the Christmas holidays.

Take the quality data below with a big grain of salt - most of the new
P3 classified bugs will become P1 or P2 (generally every regression
against GCC 11 is to be considered P1 if it concerns primary or
secondary platforms).

Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               15   +  15
P2              282   +  33
P3              193   + 159
P4              202   +   2
P5               25   +   1
--------        ---   -----------------------
Total P1-P3     490   + 207
Total           717   + 209

Previous Report
===============

https://gcc.gnu.org/pipermail/gcc/2021-April/235831.html
RE: [PATCH][GCC] aarch64: add armv9-a to -march
> -----Original Message-----
> From: Przemyslaw Wirkus
> Sent: Wednesday, September 22, 2021 9:33 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw; Richard Sandiford; Marcus Shawcroft; Kyrylo Tkachov
> Subject: [PATCH][GCC] aarch64: add armv9-a to -march
>
> Patch is adding new command line option 'armv9-a' to -march.
>
> OK for master?

Ok.
Thanks,
Kyrill

>
> gcc/ChangeLog:
>
> 2021-09-22  Przemyslaw Wirkus
>
> 	* config/aarch64/aarch64-arches.def (AARCH64_ARCH): Added
> 	armv9-a.
> 	* config/aarch64/aarch64.h (AARCH64_FL_V9): New.
> 	(AARCH64_FL_FOR_ARCH9): New flags for Armv9-A.
> 	(AARCH64_ISA_V9): New ISA flag.
[patch] Fix ICE with stack checking emulation at -O2
Hi,

this is a regression present on mainline, 11 and 10 branches: on bare-metal
platforms, the Ada compiler emulates stack checking (it is required by the
language and tested by ACATS) in the runtime via the stack_check_libfunc hook
of the RTL middle-end.  Calls to the function are generated as libcalls but
they now require a proper function type at -O2 or above.

Tested on powerpc-elf, OK for mainline, 11 and 10 branches?


2021-10-01  Eric Botcazou

	* explow.c: Include langhooks.h.
	(set_stack_check_libfunc): Build a proper function type.

-- 
Eric Botcazou

diff --git a/gcc/explow.c b/gcc/explow.c
index b6da277f689..a35423f5d16 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "recog.h"
 #include "diagnostic-core.h"
 #include "stor-layout.h"
+#include "langhooks.h"
 #include "except.h"
 #include "dojump.h"
 #include "explow.h"
@@ -1641,8 +1642,14 @@ set_stack_check_libfunc (const char *libfunc_name)
 {
   gcc_assert (stack_check_libfunc == NULL_RTX);
   stack_check_libfunc = gen_rtx_SYMBOL_REF (Pmode, libfunc_name);
+  tree ptype
+    = Pmode == ptr_mode
+      ? ptr_type_node
+      : lang_hooks.types.type_for_mode (Pmode, 1);
+  tree ftype
+    = build_function_type_list (void_type_node, ptype, NULL_TREE);
   tree decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL,
-                          get_identifier (libfunc_name), void_type_node);
+                          get_identifier (libfunc_name), ftype);
   DECL_EXTERNAL (decl) = 1;
   SET_SYMBOL_REF_DECL (stack_check_libfunc, decl);
 }
Re: [PATCH 1v2/3][vect] Add main vectorized loop unrolling
On Thu, 30 Sep 2021, Andre Vieira (lists) wrote:

> Hi,
>
> >> That just forces trying the vector modes we've tried before. Though I
> >> might need to revisit this now I think about it. I'm afraid it might be
> >> possible for this to generate an epilogue with a vf that is not lower
> >> than that of the main loop, but I'd need to think about this again.
> >>
> >> Either way I don't think this changes the vector modes used for the
> >> epilogue. But maybe I'm just missing your point here.
> > Yes, I was referring to the above which suggests that when we vectorize
> > the main loop with V4SF but unroll then we try vectorizing the
> > epilogue with V4SF as well (but not unrolled).  I think that's
> > premature (not sure if you try V8SF if the main loop was V4SF but
> > unrolled 4 times).
>
> My main motivation for this was because I had a SVE loop that vectorized
> with both VNx8HI, then V8HI which beat VNx8HI on cost, then it decided to
> unroll V8HI by two and skipped using VNx8HI as a predicated epilogue which
> would've been the best choice.

I see, yes - for fully predicated epilogues it makes sense to consider
the same vector mode as for the main loop anyways (independent of
whether we're unrolling or not).  One could argue that with an
unrolled V4SImode main loop a predicated V8SImode epilogue would also
be a good match (but then somehow costing favored the unrolled V4SI
over the V8SI for the main loop...).

> So that is why I decided to just 'reset' the vector_mode selection. In a
> scenario where you only have the traditional vector modes it might make
> less sense.
>
> Just realized I still didn't add any check to make sure the epilogue has a
> lower VF than the previous loop, though I'm still not sure that could
> happen. I'll go look at where to add that if you agree with this.

As said above, it only needs a lower VF in case the epilogue is not
fully masked - otherwise the same VF would be OK.

> >> I can move it there, it would indeed remove the need for the change to
> >> vect_update_vf_for_slp, the change to
> >> vect_determine_partial_vectors_and_peeling would still be required I
> >> think. It is meant to disable using partial vectors in an unrolled loop.
> > Why would we disable the use of partial vectors in an unrolled loop?
> The motivation behind that is that the overhead caused by generating
> predicates for each iteration will likely be too much for it to be
> profitable to unroll. On top of that, when dealing with low iteration
> count loops, if executing one predicated iteration would be enough we now
> still need to execute all other unrolled predicated iterations, whereas if
> we keep them unrolled we skip the unrolled loops.

OK, I guess we're not factoring in costs when deciding on predication
but go for it if it's generally enabled and possible.

With the proposed scheme we'd then cost the predicated not unrolled
loop against a not predicated unrolled loop which might be a bit
apples vs. oranges also because the target made the unroll decision
based on the data it collected for the predicated loop.

> > Sure but I'm suggesting you keep the not unrolled body as one way of
> > costed vectorization but then if the target says "try unrolling"
> > re-do the analysis with the same mode but a larger VF.  Just like
> > we iterate over vector modes you'll now iterate over pairs of
> > vector mode + VF (unroll factor).  It's not about re-using the costing
> > it's about using costing that is actually relevant and also to avoid
> > targets inventing two distinct separate costings - a target (powerpc)
> > might already compute load/store density and other stuff for the main
> > costing so it should have an idea whether doubling or triplicating is OK.
> >
> > Richard.
> Sounds good! I changed the patch to determine the unrolling factor later,
> after all analysis has been done and retry analysis if an unrolling factor
> larger than 1 has been chosen for this loop and vector_mode.
>
> gcc/ChangeLog:
>
>         * doc/tm.texi: Document TARGET_VECTORIZE_UNROLL_FACTOR.
>         * doc/tm.texi.in: Add entries for TARGET_VECTORIZE_UNROLL_FACTOR.
>         * params.opt: Add vect-unroll and vect-unroll-reductions
>         parameters.

What's the reason to add the --params?  It looks like this makes
us unroll with a static number short-cutting the target.

IMHO that's never going to be a great thing - but what we could do
is look at loop->unroll and try to honor that (factoring in that
the vectorization factor is already the times we unroll).

So I'd leave those params out for now, the user would have a much
more fine-grained way to control this with the unroll pragma.

Adding a max-vect-unroll parameter would be another thing but that
would apply after the targets or pragma decision.

>         * target.def: Define hook TARGET_VECTORIZE_UNROLL_FACTOR.

I still do not like the new target hook - as said I'd like to
make you have the finish_cost hook allow the target to specify
a s
Re: [patch] Fix ICE with stack checking emulation at -O2
On Fri, Oct 1, 2021 at 10:17 AM Eric Botcazou via Gcc-patches wrote:
>
> Hi,
>
> this is a regression present on mainline, 11 and 10 branches: on bare-metal
> platforms, the Ada compiler emulates stack checking (it is required by the
> language and tested by ACATS) in the runtime via the stack_check_libfunc hook
> of the RTL middle-end.  Calls to the function are generated as libcalls but
> they now require a proper function type at -O2 or above.
>
> Tested on powerpc-elf, OK for mainline, 11 and 10 branches?

OK, though I wonder if you could get away with using
build_function_type (void_type_node, NULL_TREE); aka
a non-prototype void f().

Did you track down what changed the requirement?

Thanks,
Richard.

>
> 2021-10-01  Eric Botcazou
>
>         * explow.c: Include langhooks.h.
>         (set_stack_check_libfunc): Build a proper function type.
>
> --
> Eric Botcazou
Re: [patch] Fix ICE with stack checking emulation at -O2
> OK though I wonder if you could get away with using
> build_function_type (void_type_node, NULL_TREE); aka
> a non-prototype void f().

See below.

> Did you track down what changed the requirement?

The new function-abi.cc module, so I'd rather have a correct prototype.

-- 
Eric Botcazou
Re: [patch] Fix ICE with stack checking emulation at -O2
On Fri, Oct 1, 2021 at 10:30 AM Eric Botcazou wrote:
>
> > OK though I wonder if you could get away with using
> > build_function_type (void_type_node, NULL_TREE); aka
> > a non-prototype void f().
>
> See below.
>
> > Did you track down what changed the requirement?
>
> The new function-abi.cc module, so I'd rather have a correct prototype.

I see, yes that makes sense.

Thanks,
Richard.

> --
> Eric Botcazou
>
>
[committed] openmp: Add alloc_align attribute to omp_aligned_*alloc and testcase for omp_realloc
Hi!

This patch adds alloc_align attribute to omp_aligned_{,c}alloc so that if
the first argument is constant, GCC can assume requested alignment.

Additionally, it adds testsuite coverage for omp_realloc which I haven't
managed to write in the patch from yesterday.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-10-01  Jakub Jelinek

	* omp.h.in (omp_aligned_alloc, omp_aligned_calloc): Add
	__alloc_align__ (1) attribute.
	* testsuite/libgomp.c-c++-common/alloc-9.c: New test.

--- libgomp/omp.h.in.jj	2021-09-30 09:29:56.738900869 +0200
+++ libgomp/omp.h.in	2021-09-30 10:49:33.85699 +0200
@@ -306,7 +306,7 @@ extern void *omp_aligned_alloc (__SIZE_T
                                 omp_allocator_handle_t
                                 __GOMP_DEFAULT_NULL_ALLOCATOR)
   __GOMP_NOTHROW __attribute__((__malloc__, __malloc__ (omp_free),
-                                __alloc_size__ (2)));
+                                __alloc_size__ (2), __alloc_align__ (1)));
 extern void *omp_calloc (__SIZE_TYPE__, __SIZE_TYPE__,
                          omp_allocator_handle_t __GOMP_DEFAULT_NULL_ALLOCATOR)
   __GOMP_NOTHROW __attribute__((__malloc__, __malloc__ (omp_free),
@@ -315,7 +315,7 @@ extern void *omp_aligned_calloc (__SIZE_
                                  omp_allocator_handle_t
                                  __GOMP_DEFAULT_NULL_ALLOCATOR)
   __GOMP_NOTHROW __attribute__((__malloc__, __malloc__ (omp_free),
-                                __alloc_size__ (2, 3)));
+                                __alloc_size__ (2, 3), __alloc_align__ (1)));
 extern void *omp_realloc (void *, __SIZE_TYPE__,
                           omp_allocator_handle_t __GOMP_DEFAULT_NULL_ALLOCATOR,
                           omp_allocator_handle_t __GOMP_DEFAULT_NULL_ALLOCATOR)
--- libgomp/testsuite/libgomp.c-c++-common/alloc-9.c.jj	2021-09-30 13:06:28.794494653 +0200
+++ libgomp/testsuite/libgomp.c-c++-common/alloc-9.c	2021-09-30 15:26:42.717931994 +0200
@@ -0,0 +1,271 @@
+#include <omp.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+const omp_alloctrait_t traits2[]
+= { { omp_atk_alignment, 16 },
+    { omp_atk_sync_hint, omp_atv_default },
+    { omp_atk_access, omp_atv_default },
+    { omp_atk_pool_size, 1024 },
+    { omp_atk_fallback, omp_atv_default_mem_fb },
+    { omp_atk_partition, omp_atv_environment } };
+omp_alloctrait_t traits3[]
+= { { omp_atk_sync_hint, omp_atv_uncontended },
+    { omp_atk_alignment, 32 },
+    { omp_atk_access, omp_atv_all },
+    { omp_atk_pool_size, 512 },
+    { omp_atk_fallback, omp_atv_allocator_fb },
+    { omp_atk_fb_data, 0 },
+    { omp_atk_partition, omp_atv_default } };
+const omp_alloctrait_t traits4[]
+= { { omp_atk_alignment, 128 },
+    { omp_atk_pool_size, 1024 },
+    { omp_atk_fallback, omp_atv_null_fb } };
+
+int
+main ()
+{
+  int *volatile p = (int *) omp_alloc (3 * sizeof (int), omp_default_mem_alloc);
+  int *volatile q;
+  int *volatile r;
+  omp_alloctrait_t traits[3]
+    = { { omp_atk_alignment, 64 },
+        { omp_atk_fallback, omp_atv_null_fb },
+        { omp_atk_pool_size, 4096 } };
+  omp_alloctrait_t traits5[2]
+    = { { omp_atk_fallback, omp_atv_null_fb },
+        { omp_atk_pool_size, 4096 } };
+  omp_allocator_handle_t a, a2;
+
+  if ((((uintptr_t) p) % __alignof (int)) != 0)
+    abort ();
+  p[0] = 1;
+  p[1] = 2;
+  p[2] = 3;
+  p = (int *) omp_realloc (p, 4 * sizeof (int), omp_default_mem_alloc, omp_default_mem_alloc);
+  if ((((uintptr_t) p) % __alignof (int)) != 0 || p[0] != 1 || p[1] != 2 || p[2] != 3)
+    abort ();
+  p[0] = 4;
+  p[1] = 5;
+  p[2] = 6;
+  p[3] = 7;
+  p = (int *) omp_realloc (p, 2 * sizeof (int), omp_default_mem_alloc, omp_default_mem_alloc);
+  if ((((uintptr_t) p) % __alignof (int)) != 0 || p[0] != 4 || p[1] != 5)
+    abort ();
+  p[0] = 8;
+  p[1] = 9;
+  if (omp_realloc (p, 0, omp_null_allocator, omp_default_mem_alloc) != NULL)
+    abort ();
+  p = (int *) omp_realloc (NULL, 2 * sizeof (int), omp_default_mem_alloc, omp_null_allocator);
+  if ((((uintptr_t) p) % __alignof (int)) != 0)
+    abort ();
+  p[0] = 1;
+  p[1] = 2;
+  p = (int *) omp_realloc (p, 5 * sizeof (int), omp_default_mem_alloc, omp_default_mem_alloc);
+  if ((((uintptr_t) p) % __alignof (int)) != 0 || p[0] != 1 || p[1] != 2)
+    abort ();
+  p[0] = 3;
+  p[1] = 4;
+  p[2] = 5;
+  p[3] = 6;
+  p[4] = 7;
+  omp_free (p, omp_null_allocator);
+  omp_set_default_allocator (omp_default_mem_alloc);
+  if (omp_realloc (NULL, 0, omp_null_allocator, omp_null_allocator) != NULL)
+    abort ();
+  p = (int *) omp_alloc (sizeof (int), omp_null_allocator);
+  if ((((uintptr_t) p) % __alignof (int)) != 0)
+    abort ();
+  p[0] = 3;
+  p = (int *) omp_realloc (p, 3 * sizeof (int), omp_null_allocator, omp_null_allocator);
+  if ((((uintptr_t) p) % __alignof (int)) != 0 || p[0] != 3)
+    abort ();
+  p[0] = 4;
+  p[1] = 5;
+  p[2] = 6;
+  if (omp_realloc (p, 0, omp_null_allocator, omp_get_default_allocator ()) != NULL)
+    abort ();
+  a = om
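
For readers unfamiliar with the attribute, here is a minimal sketch (not
libgomp code) of a function declared the way the patch declares
omp_aligned_alloc: alignment in argument 1, size in argument 2.  The names
demo_aligned_alloc and demo_is_aligned are made up for illustration, and the
wrapper simply forwards to C11 aligned_alloc; only the GNU C attributes
themselves come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper, not part of libgomp.  alloc_align (1) tells the
   compiler that the result is aligned to the value of argument 1;
   alloc_size (2) says the result points to at least argument-2 bytes.  */
__attribute__ ((malloc, alloc_size (2), alloc_align (1)))
void *
demo_aligned_alloc (size_t alignment, size_t size)
{
  /* The alignment must be a nonzero power of two.  */
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);
  /* C11 aligned_alloc wants the size to be a multiple of the alignment,
     so round the request up.  */
  size_t rounded = (size + alignment - 1) / alignment * alignment;
  return aligned_alloc (alignment, rounded);
}

/* Hypothetical helper: nonzero if P is a multiple of ALIGNMENT.  */
int
demo_is_aligned (const void *p, size_t alignment)
{
  return ((uintptr_t) p % alignment) == 0;
}
```

With a constant first argument, e.g. demo_aligned_alloc (64, 100), a compiler
that honors alloc_align may fold a later alignment check on the result to
true instead of testing it at run time.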
[committed] openmp: Avoid PLT relocations for omp_* symbols in libgomp
Hi!

This patch avoids the following relocations:
readelf -Wr libgomp.so.1.0.0 | grep omp_
000470e0  02070007 R_X86_64_JUMP_SLOT  0001d9d0 omp_fulfill_event@@OMP_5.0.1 + 0
00047170  00b80007 R_X86_64_JUMP_SLOT  e760 omp_display_env@@OMP_5.1 + 0
000471e0  00e80007 R_X86_64_JUMP_SLOT  f910 omp_get_initial_device@@OMP_4.5 + 0
00047280  01950007 R_X86_64_JUMP_SLOT  00015940 omp_get_active_level@@OMP_3.0 + 0
000472c8  020d0007 R_X86_64_JUMP_SLOT  00035210 omp_get_team_num@@OMP_4.0 + 0
000472f0  01470007 R_X86_64_JUMP_SLOT  00035200 omp_get_num_teams@@OMP_4.0 + 0
by using ialias{,_call,_redirect} macros as needed.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

We still have many acc_* PLT relocations, could somebody please fix those?
readelf -Wr libgomp.so.1.0.0 | grep acc_
00046fb8  01ed0006 R_X86_64_GLOB_DAT   00036350 acc_prof_unregister@@OACC_2.5.1 + 0
00046fd8  00a40006 R_X86_64_GLOB_DAT   00035f30 acc_prof_register@@OACC_2.5.1 + 0
00046fe0  01d10006 R_X86_64_GLOB_DAT   00035ee0 acc_prof_lookup@@OACC_2.5.1 + 0
00047058  01dd0007 R_X86_64_JUMP_SLOT  00031f40 acc_create_async@@OACC_2.5 + 0
00047068  01150007 R_X86_64_JUMP_SLOT  0002fc60 acc_get_property@@OACC_2.6 + 0
00047070  01fb0007 R_X86_64_JUMP_SLOT  00032ce0 acc_wait_all@@OACC_2.0 + 0
00047080  00650007 R_X86_64_JUMP_SLOT  0002f990 acc_on_device@@OACC_2.0 + 0
00047088  00ae0007 R_X86_64_JUMP_SLOT  00032140 acc_attach_async@@OACC_2.6 + 0
00047090  02190007 R_X86_64_JUMP_SLOT  0002f550 acc_get_device_type@@OACC_2.0 + 0
00047098  01cb0007 R_X86_64_JUMP_SLOT  00032090 acc_copyout_finalize@@OACC_2.5 + 0
000470a8  00520007 R_X86_64_JUMP_SLOT  00031f80 acc_copyin@@OACC_2.0 + 0
000470b8  01ad0007 R_X86_64_JUMP_SLOT  00032030 acc_delete_finalize@@OACC_2.5 + 0
000470e8  01090007 R_X86_64_JUMP_SLOT  00031f00 acc_create@@OACC_2.0 + 0
000470f8  00590007 R_X86_64_JUMP_SLOT  00032b70 acc_wait_async@@OACC_2.0 + 0
00047110  01310007 R_X86_64_JUMP_SLOT  00032860 acc_async_test@@OACC_2.0 + 0
00047118  01ff0007 R_X86_64_JUMP_SLOT  0002f720 acc_get_device_num@@OACC_2.0 + 0
00047128  01910007 R_X86_64_JUMP_SLOT  00032020 acc_delete_async@@OACC_2.5 + 0
00047130  01d20007 R_X86_64_JUMP_SLOT  0002efa0 acc_shutdown@@OACC_2.0 + 0
00047150  00d7 R_X86_64_JUMP_SLOT  00031f00 acc_present_or_create@@OACC_2.0 + 0
00047188  01920007 R_X86_64_JUMP_SLOT  00031910 acc_is_present@@OACC_2.0 + 0
00047190  01aa0007 R_X86_64_JUMP_SLOT  0002fca0 acc_get_property_string@@OACC_2.6 + 0
000471d0  01bf0007 R_X86_64_JUMP_SLOT  00032120 acc_update_self_async@@OACC_2.5 + 0
00047200  02050007 R_X86_64_JUMP_SLOT  00032e00 acc_wait_all_async@@OACC_2.0 + 0
00047208  00a60007 R_X86_64_JUMP_SLOT  00031790 acc_deviceptr@@OACC_2.0 + 0
00047218  00750007 R_X86_64_JUMP_SLOT  00032000 acc_delete@@OACC_2.0 + 0
00047238  01e90007 R_X86_64_JUMP_SLOT  0002f3a0 acc_set_device_type@@OACC_2.0 + 0
00047240  01f60007 R_X86_64_JUMP_SLOT  0002ef20 acc_init@@OACC_2.0 + 0
00047248  01880007 R_X86_64_JUMP_SLOT  00032060 acc_copyout@@OACC_2.0 + 0
00047258  021f0007 R_X86_64_JUMP_SLOT  00032a80 acc_wait@@OACC_2.0 + 0
00047270  01bc0007 R_X86_64_JUMP_SLOT  00032100 acc_update_self@@OACC_2.0 + 0
00047288  01140007 R_X86_64_JUMP_SLOT  00032080 acc_copyout_async@@OACC_2.5 + 0
00047290  013d0007 R_X86_64_JUMP_SLOT  0002f850 acc_set_device_num@@OACC_2.0 + 0
000472a8  00c50007 R_X86_64_JUMP_SLOT  000320e0 acc_update_device_async@@OACC_2.5 + 0
000472c0  01460007 R_X86_64_JUMP_SLOT  00031fc0 acc_copyin_async@@OACC_2.5 + 0
000472f8  006a0007 R_X86_64_JUMP_SLOT  0002f310 acc_get_num_devices@@OACC_2.0 + 0
00047350  02170007 R_X86_64_JUMP_SLOT  00031f80 acc_present_or_copyin@@OACC_2.0 + 0
00047360  02090007 R_X86_64_JUMP_SLOT  000320c0 acc_update_device@@OACC_2.0 + 0
00047380  00840007 R_X86_64_JUMP_SLOT
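
The general technique behind avoiding such relocations can be sketched as
follows, under the assumption that libgomp's ialias macros boil down to a
hidden alias of the exported function (the macro definitions are not shown
in this message).  All demo_* names are hypothetical.  Calls made inside
the library go through the hidden alias, which binds locally, so no PLT
slot is needed for them.  This relies on GNU C attributes and an ELF
target.

```c
/* Exported entry point, hypothetical stand-in for an omp_* function.  */
int
demo_get_level (void)
{
  return 3;
}

/* Hidden alias for the same function.  Being hidden, it cannot be
   interposed from outside the library, so a call through it can be
   resolved at link time without going through the PLT.  */
extern __typeof (demo_get_level) demo_ialias_get_level
  __attribute__ ((alias ("demo_get_level"), visibility ("hidden")));

/* An internal caller uses the alias rather than the exported name.  */
int
demo_get_level_plus_one (void)
{
  return demo_ialias_get_level () + 1;
}
```

Built into a shared library with -fPIC, a call to the exported name from
demo_get_level_plus_one would typically show up as an R_X86_64_JUMP_SLOT
entry like those listed above, while the call through the alias would not.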
[committed] openmp: Differentiate between order(concurrent) and order(reproducible:concurrent)
Hi!

While OpenMP 5.1 implies order(concurrent) is the same thing as
order(reproducible:concurrent), this is going to change in OpenMP 5.2, where
essentially order(concurrent) means nothing is stated on whether it is
reproducible or unconstrained (and is determined by other means, e.g. for/do
with schedule static or runtime with static being selected is implicitly
reproducible, distribute with dist_schedule static is implicitly
reproducible, loop is implicitly reproducible) and when the modifier is
specified explicitly, it overrides the implicit behavior either way.
And, when order(reproducible:concurrent) is used with e.g. schedule(dynamic)
or some other schedule that is by definition not reproducible, it is the
implementation's duty to ensure it is reproducible, either by remembering how
it scheduled some loop and then replaying the same schedule when seeing loops
with the same directive/schedule/number of iterations, or by overriding the
schedule to some reproducible one.

This patch doesn't implement the 5.2 wording just yet, but in the FEs
differentiates between the 3 states - no explicit modifier, explicit
reproducible or explicit unconstrained, so that the middle-end can easily
switch any time.  Instead it follows the 5.1 wording where both
order(concurrent) (implicit or explicit) or order(reproducible:concurrent)
imply reproducibility.
And, it implements the easier method, when for/do should be reproducible, it
just chooses static schedule.  order(concurrent) implies no OpenMP APIs in
the loop body nor threadprivate vars, so the exact scheduling isn't (easily
at least) observable.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-10-01  Jakub Jelinek

gcc/
	* tree.h (OMP_CLAUSE_ORDER_REPRODUCIBLE): Define.
	* tree-pretty-print.c (dump_omp_clause) <case OMP_CLAUSE_ORDER>: Print
	reproducible: for OMP_CLAUSE_ORDER_REPRODUCIBLE.
	* omp-general.c (omp_extract_for_data): If OMP_CLAUSE_ORDER is seen
	without OMP_CLAUSE_ORDER_UNCONSTRAINED, overwrite sched_kind to
	OMP_CLAUSE_SCHEDULE_STATIC.
gcc/c-family/
	* c-omp.c (c_omp_split_clauses): Also copy
	OMP_CLAUSE_ORDER_REPRODUCIBLE.
gcc/c/
	* c-parser.c (c_parser_omp_clause_order): Set
	OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier.
gcc/cp/
	* parser.c (cp_parser_omp_clause_order): Set
	OMP_CLAUSE_ORDER_REPRODUCIBLE for explicit reproducible: modifier.
gcc/fortran/
	* gfortran.h (gfc_omp_clauses): Add order_reproducible bitfield.
	* dump-parse-tree.c (show_omp_clauses): Print REPRODUCIBLE: for it.
	* openmp.c (gfc_match_omp_clauses): Set order_reproducible for
	explicit reproducible: modifier.
	* trans-openmp.c (gfc_trans_omp_clauses): Set
	OMP_CLAUSE_ORDER_REPRODUCIBLE for order_reproducible.
	(gfc_split_omp_clauses): Also copy order_reproducible.
gcc/testsuite/
	* gfortran.dg/gomp/order-5.f90: Adjust scan-tree-dump-times regexps.
libgomp/
	* testsuite/libgomp.c-c++-common/order-reproducible-1.c: New test.
	* testsuite/libgomp.c-c++-common/order-reproducible-2.c: New test.

--- gcc/tree.h.jj	2021-09-22 09:29:01.049814034 +0200
+++ gcc/tree.h	2021-09-30 16:17:53.051099703 +0200
@@ -1718,6 +1718,9 @@ class auto_suppress_location_wrappers
 /* True for unconstrained modifier on order(concurrent) clause.  */
 #define OMP_CLAUSE_ORDER_UNCONSTRAINED(NODE) \
   (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_ORDER)->base.public_flag)
+/* True for reproducible modifier on order(concurrent) clause.  */
+#define OMP_CLAUSE_ORDER_REPRODUCIBLE(NODE) \
+  TREE_PROTECTED (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_ORDER))
 
 #define OMP_CLAUSE_REDUCTION_CODE(NODE)	\
   (OMP_CLAUSE_RANGE_CHECK (NODE, OMP_CLAUSE_REDUCTION, \
--- gcc/tree-pretty-print.c.jj	2021-09-22 09:29:01.051814006 +0200
+++ gcc/tree-pretty-print.c	2021-09-30 16:18:42.713406739 +0200
@@ -1165,6 +1165,8 @@ dump_omp_clause (pretty_printer *pp, tre
       pp_string (pp, "order(");
       if (OMP_CLAUSE_ORDER_UNCONSTRAINED (clause))
         pp_string (pp, "unconstrained:");
+      else if (OMP_CLAUSE_ORDER_REPRODUCIBLE (clause))
+        pp_string (pp, "reproducible:");
       pp_string (pp, "concurrent)");
       break;
 
--- gcc/omp-general.c.jj	2021-09-01 11:37:41.966556334 +0200
+++ gcc/omp-general.c	2021-09-30 16:36:56.599142089 +0200
@@ -193,6 +193,7 @@ omp_extract_for_data (gomp_for *for_stmt
                     == GF_OMP_FOR_KIND_DISTRIBUTE;
   bool taskloop = gimple_omp_for_kind (for_stmt)
                   == GF_OMP_FOR_KIND_TASKLOOP;
+  bool order_reproducible = false;
   tree iterv, countv;
 
   fd->for_stmt = for_stmt;
@@ -277,10 +278,25 @@ omp_extract_for_data (gomp_for *for_stmt
             && !OMP_CLAUSE__SCANTEMP__CONTROL (t))
           fd->have_nonctrl_scantemp = true;
         break;
+      case OMP_CLAUSE_ORDER:
+        /* FIXME: For OpenMP 5.2 this should change to
+           if (OMP_C
[committed] Fix bb-slp-pr97709.c after computed goto change
From: Andrew Pinski

Looks like I tested the change for bb-slp-pr97709.c on an older tree
which did not have the error message so I had missed one more
place where the change was needed.

Anyways committed after testing to make sure the testcase passes now.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/bb-slp-pr97709.c: Fix for computed goto pointers.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr97709.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr97709.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr97709.c
index d0f3d05..56ec0f6 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr97709.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr97709.c
@@ -17,7 +17,7 @@ g:
   d = 0;
 h:
   c = 1;
-  goto *a;
+  goto *(void*)(__INTPTR_TYPE__)a;
 i:
   {
     struct b b = {c, d};
-- 
1.8.3.1
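
The cast in the hunk above suggests that the operand of the computed goto
in the testcase is not of pointer type.  A minimal sketch of the same
pattern, assuming the GNU C labels-as-values extension: a label address is
kept in an integer and cast back to void * before the jump.  The function
name demo_dispatch and its labels are made up and are not taken from the
testcase.

```c
#include <stdint.h>

/* Hypothetical example: jump to one of two labels whose addresses are
   stored in a table, going through an integer on the way.  */
int
demo_dispatch (int which)
{
  /* &&label yields the address of a label (GNU C extension).  */
  static void *targets[] = { &&first, &&second };

  /* Keep the address in an integer, as an integer-typed variable would.  */
  intptr_t a = (intptr_t) targets[which != 0];

  /* A computed goto takes a pointer operand, so cast back explicitly.  */
  goto *(void *) a;

first:
  return 10;
second:
  return 20;
}
```

Calling demo_dispatch (0) takes the first label; any nonzero argument takes
the second.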
[Patch] Add/update libgomp.fortran/alloc-*.f90 [Re: [committed] openmp: Add alloc_align attribute to omp_aligned_*alloc and testcase for omp_realloc]
... and attached the Fortran version of the C/C++ testcase.

OK?

Tobias

On 01.10.21 10:59, Jakub Jelinek wrote:

2021-10-01  Jakub Jelinek

	* omp.h.in (omp_aligned_alloc, omp_aligned_calloc): Add
	__alloc_align__ (1) attribute.
	* testsuite/libgomp.c-c++-common/alloc-9.c: New test.

-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

Add/update libgomp.fortran/alloc-*.f90

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/alloc-10.f90: Fix alignment check.
	* testsuite/libgomp.fortran/alloc-7.f90: Fix array access.
	* testsuite/libgomp.fortran/alloc-8.f90: Likewise.
	* testsuite/libgomp.fortran/alloc-11.f90: New test for omp_realloc,
	based on libgomp.c-c++-common/alloc-9.c.

 libgomp/testsuite/libgomp.fortran/alloc-10.f90 |   4 +-
 libgomp/testsuite/libgomp.fortran/alloc-11.f90 | 301 +
 libgomp/testsuite/libgomp.fortran/alloc-7.f90  |  14 +-
 libgomp/testsuite/libgomp.fortran/alloc-8.f90  |   2 +-
 4 files changed, 311 insertions(+), 10 deletions(-)

diff --git a/libgomp/testsuite/libgomp.fortran/alloc-10.f90 b/libgomp/testsuite/libgomp.fortran/alloc-10.f90
index 060c16f312b..3eab8598dec 100644
--- a/libgomp/testsuite/libgomp.fortran/alloc-10.f90
+++ b/libgomp/testsuite/libgomp.fortran/alloc-10.f90
@@ -134,7 +134,7 @@ program main
   ip(420 / c_sizeof (0)) = 6
   q = omp_aligned_calloc (8_c_size_t, 24_c_size_t, 32_c_size_t, a2)
   call c_f_pointer (q, iq, [768 / c_sizeof (0)])
-  if (mod (TRANSFER (p, iptr), 16) /= 0) &
+  if (mod (TRANSFER (q, iptr), 16) /= 0) &
     stop 18
   do i = 1, 768 / c_sizeof (0)
     if (iq(i) /= 0) &
@@ -144,7 +144,7 @@ program main
   iq(768 / c_sizeof (0)) = 8
   r = omp_aligned_calloc (8_c_size_t, 64_c_size_t, 8_c_size_t, a2)
   call c_f_pointer (r, ir, [512 / c_sizeof (0)])
-  if (mod (TRANSFER (p, iptr), 8) /= 0) &
+  if (mod (TRANSFER (r, iptr), 8) /= 0) &
     stop 20
   do i = 1, 512 / c_sizeof (0)
     if (ir(i) /= 0) &
diff --git a/libgomp/testsuite/libgomp.fortran/alloc-11.f90 b/libgomp/testsuite/libgomp.fortran/alloc-11.f90
new file mode 100644
index 000..22b4f92a336
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/alloc-11.f90
@@ -0,0 +1,301 @@
+! { dg-additional-sources alloc-7.c }
+! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
+module m
+  use omp_lib
+  use iso_c_binding
+  implicit none
+
+  type (omp_alloctrait), parameter :: traits2(*) &
+    = [ omp_alloctrait (omp_atk_alignment, 16), &
+        omp_alloctrait (omp_atk_sync_hint, omp_atv_default), &
+        omp_alloctrait (omp_atk_access, omp_atv_default), &
+        omp_alloctrait (omp_atk_pool_size, 1024), &
+        omp_alloctrait (omp_atk_fallback, omp_atv_default_mem_fb), &
+        omp_alloctrait (omp_atk_partition, omp_atv_environment)]
+  type (omp_alloctrait) :: traits3(7) &
+    = [ omp_alloctrait (omp_atk_sync_hint, omp_atv_uncontended), &
+        omp_alloctrait (omp_atk_alignment, 32), &
+        omp_alloctrait (omp_atk_access, omp_atv_all), &
+        omp_alloctrait (omp_atk_pool_size, 512), &
+        omp_alloctrait (omp_atk_fallback, omp_atv_allocator_fb), &
+        omp_alloctrait (omp_atk_fb_data, 0), &
+        omp_alloctrait (omp_atk_partition, omp_atv_default)]
+
+  type (omp_alloctrait), parameter :: traits4(*) &
+    = [ omp_alloctrait (omp_atk_alignment, 128), &
+        omp_alloctrait (omp_atk_pool_size, 1024), &
+        omp_alloctrait (omp_atk_fallback, omp_atv_null_fb)]
+
+  interface
+    integer(c_int) function get__alignof_int () bind(C)
+      import :: c_int
+    end
+  end interface
+end module m
+
+program main
+  use m
+  implicit none (external, type)
+  type(c_ptr) :: p, q, r
+  integer, pointer, contiguous :: ip(:), iq(:), ir(:)
+  type (omp_alloctrait) :: traits(3)
+  type (omp_alloctrait) :: traits5(2)
+  integer (omp_allocator_handle_kind) :: a, a2
+  integer (c_ptrdiff_t) :: iptr
+
+  traits = [ omp_alloctrait (omp_atk_alignment, 64), &
+             omp_alloctrait (omp_atk_fallback, omp_atv_null_fb), &
+             omp_alloctrait (omp_atk_pool_size, 4096)]
+  traits5 = [ omp_alloctrait (omp_atk_fallback, omp_atv_null_fb), &
+              omp_alloctrait (omp_atk_pool_size, 4096)]
+
+  p = omp_alloc (3 * c_sizeof (0), omp_default_mem_alloc)
+  call c_f_pointer (p, ip, [3])
+  if (mod (TRANSFER (p, iptr), get__alignof_int ()) /= 0) &
+    stop 1
+  ip(1) = 1
+  ip(2) = 2
+  ip(3) = 3
+  p = omp_realloc (p, 4 * c_sizeof (0), omp_default_mem_alloc, omp_default_mem_alloc)
+  call c_f_pointer (p, ip, [4])
+  if (mod (TRANSFER (p, iptr), get__alignof_int ()) /= 0 &
+      .or. ip(1) /= 1 .or. ip(2) /= 2 .or. ip(3) /= 3) &
+    stop 2
+  ip(1) = 4
+  ip(2) = 5
+  ip(3) = 6
+  ip(4) = 7
+  p = omp_realloc (p, 2 * c_sizeof (0), omp_default_mem_al
[Patch] Add libgomp.fortran/order-reproducible-*.f90 [Re: [committed] openmp: Differentiate between order(concurrent) and order(reproducible:concurrent)]
On 01.10.21 11:03, Jakub Jelinek wrote:

2021-10-01  Jakub Jelinek

libgomp/
	* testsuite/libgomp.c-c++-common/order-reproducible-1.c: New test.
	* testsuite/libgomp.c-c++-common/order-reproducible-2.c: New test.

Attached is the Fortran version of the two patches – the Fortran FE
modifications were already in Jakub's patch.

Tobias

-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

Add libgomp.fortran/order-reproducible-*.f90

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/order-reproducible-1.f90: New test
	based on libgomp.c-c++-common/order-reproducible-1.c.
	* testsuite/libgomp.fortran/order-reproducible-2.f90: Likewise.

 .../libgomp.fortran/order-reproducible-1.f90 | 70 ++
 .../libgomp.fortran/order-reproducible-2.f90 | 36 +++
 2 files changed, 106 insertions(+)

diff --git a/libgomp/testsuite/libgomp.fortran/order-reproducible-1.f90 b/libgomp/testsuite/libgomp.fortran/order-reproducible-1.f90
new file mode 100644
index 000..2b852ebc70b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/order-reproducible-1.f90
@@ -0,0 +1,70 @@
+program main
+  implicit none
+  interface
+    subroutine usleep(t) bind(C, name="my_usleep")
+      use iso_c_binding
+      integer(c_int), value :: t
+    end subroutine
+  end interface
+
+  integer :: a(128)
+  integer :: i
+
+  !$omp teams num_teams(5)
+    !$omp loop bind(teams)
+    do i = 1, 128
+      a(i) = i
+      if (i == 0) then
+        call usleep (20)
+      else if (i == 17) then
+        call usleep (40)
+      end if
+    end do
+    !$omp loop bind(teams)
+    do i = 1, 128
+      a(i) = a(i) + i
+    end do
+  !$omp end teams
+  do i = 1, 128
+    if (a(i) /= 2 * i) &
+      stop 1
+  end do
+  !$omp teams num_teams(5)
+    !$omp loop bind(teams) order(concurrent)
+    do i = 1, 128
+      a(i) = a(i) * 2
+      if (i == 1) then
+        call usleep (20)
+      else if (i == 13) then
+        call usleep (40)
+      end if
+    end do
+    !$omp loop bind(teams) order(concurrent)
+    do i = 1, 128
+      a(i) = a(i) + i
+    end do
+  !$omp end teams
+  do i = 1, 128
+    if (a(i) /= 5 * i) &
+      stop 2
+  end do
+  !$omp teams num_teams(5)
+    !$omp loop bind(teams) order(reproducible:concurrent)
+    do i = 1, 128
+      a(i) = a(i) * 2
+      if (i == 3) then
+        call usleep (20)
+      else if (i == 106) then
+        call usleep (40)
+      end if
+    end do
+    !$omp loop bind(teams) order(reproducible:concurrent)
+    do i = 1, 128
+      a(i) = a(i) + i
+    end do
+  !$omp end teams
+  do i = 1, 128
+    if (a(i) /= 11 * i) &
+      stop 3
+  end do
+end program main
diff --git a/libgomp/testsuite/libgomp.fortran/order-reproducible-2.f90 b/libgomp/testsuite/libgomp.fortran/order-reproducible-2.f90
new file mode 100644
index 000..af18c82f700
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/order-reproducible-2.f90
@@ -0,0 +1,36 @@
+! { dg-additional-sources my-usleep.c }
+program main
+  implicit none
+  interface
+    subroutine usleep(t) bind(C, name="my_usleep")
+      use iso_c_binding
+      integer(c_int), value :: t
+    end subroutine
+  end interface
+
+  integer :: a(128)
+  integer :: i
+
+  !$omp parallel num_threads(8)
+    !$omp barrier
+    !$omp do schedule (dynamic, 2) order(reproducible:concurrent)
+    do i = 1, 128
+      a(i) = i
+      if (i == 1) then
+        call usleep (20)
+      else if (i == 18) then
+        call usleep (40)
+      end if
+    end do
+    !$omp end do nowait
+    !$omp do schedule (dynamic, 2) order(reproducible:concurrent)
+    do i = 1, 128
+      a(i) = a(i) + i
+    end do
+    !$omp end do nowait
+  !$omp end parallel
+  do i = 1, 128
+    if (a(i) /= 2 * i) &
+      stop
+  end do
+end program main
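
The second test depends on two consecutive nowait loops handing the same
iterations to the same threads.  Since the committed patch is described as
getting reproducibility by simply choosing the static schedule, the effect
can be sketched in C with an explicit schedule(static).  This is an
illustrative sketch only, not the committed order-reproducible-2.c; the
names DEMO_N and demo_reproducible_loops are made up.  Build with -fopenmp;
without it the pragmas are ignored and the loops just run serially.

```c
enum { DEMO_N = 128 };

/* Hypothetical check: returns the number of elements that end up
   different from 2 * (i + 1); 0 means every element was written by the
   first loop before the second loop read it.  */
int
demo_reproducible_loops (void)
{
  int a[DEMO_N];
  int bad = 0;

#pragma omp parallel num_threads(8)
  {
    int i;

    /* No barrier after this loop because of nowait.  */
#pragma omp for schedule(static) nowait
    for (i = 0; i < DEMO_N; i++)
      a[i] = i + 1;

    /* Same iteration count and same static schedule, so each thread gets
       the same iterations as above and only touches elements it has
       already initialized itself.  */
#pragma omp for schedule(static) nowait
    for (i = 0; i < DEMO_N; i++)
      a[i] += i + 1;
  }

  for (int i = 0; i < DEMO_N; i++)
    if (a[i] != 2 * (i + 1))
      bad++;
  return bad;
}
```

With a dynamic schedule and no reproducibility guarantee, a thread could
reach an element in the second loop before another thread had written it in
the first, which is the situation the test guards against.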
Re: [Patch] Add/update libgomp.fortran/alloc-*.f90 [Re: [committed] openmp: Add alloc_align attribute to omp_aligned_*alloc and testcase for omp_realloc]
On Fri, Oct 01, 2021 at 11:32:24AM +0200, Tobias Burnus wrote:
> libgomp/ChangeLog:
>
>         * testsuite/libgomp.fortran/alloc-10.f90: Fix alignment check.
>         * testsuite/libgomp.fortran/alloc-7.f90: Fix array access.
>         * testsuite/libgomp.fortran/alloc-8.f90: Likewise.
>         * testsuite/libgomp.fortran/alloc-11.f90: New test for omp_realloc,
>         based on libgomp.c-c++-common/alloc-9.c.
>
>  libgomp/testsuite/libgomp.fortran/alloc-10.f90 |   4 +-
>  libgomp/testsuite/libgomp.fortran/alloc-11.f90 | 301 +
>  libgomp/testsuite/libgomp.fortran/alloc-7.f90  |  14 +-
>  libgomp/testsuite/libgomp.fortran/alloc-8.f90  |   2 +-
>  4 files changed, 311 insertions(+), 10 deletions(-)

LGTM.

        Jakub
Re: [Patch] Add libgomp.fortran/order-reproducible-*.f90 [Re: [committed] openmp: Differentiate between order(concurrent) and order(reproducible:concurrent)]
On Fri, Oct 01, 2021 at 11:34:15AM +0200, Tobias Burnus wrote:
> On 01.10.21 11:03, Jakub Jelinek wrote:
> > 2021-10-01  Jakub Jelinek
> >
> > libgomp/
> >         * testsuite/libgomp.c-c++-common/order-reproducible-1.c: New test.
> >         * testsuite/libgomp.c-c++-common/order-reproducible-2.c: New test.
>
> Attached is the Fortran version of the two patches – the Fortran FE
> modifications were already in Jakub's patch.
>
> Tobias
>
> Add libgomp.fortran/order-reproducible-*.f90
>
> libgomp/ChangeLog:
>
>         * testsuite/libgomp.fortran/order-reproducible-1.f90: New test
>         based on libgomp.c-c++-common/order-reproducible-1.c.
>         * testsuite/libgomp.fortran/order-reproducible-2.f90: Likewise.
>
>  .../libgomp.fortran/order-reproducible-1.f90  | 70 ++
>  .../libgomp.fortran/order-reproducible-2.f90  | 36 +++
>  2 files changed, 106 insertions(+)
>
> diff --git a/libgomp/testsuite/libgomp.fortran/order-reproducible-1.f90 b/libgomp/testsuite/libgomp.fortran/order-reproducible-1.f90
> new file mode 100644
> index 000..2b852ebc70b
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/order-reproducible-1.f90
> @@ -0,0 +1,70 @@

No ! { dg-additional-sources my-usleep.c } here?  How does it work then?
And no my-usleep.c in the patch.

> +program main
> +  implicit none
> +  interface
> +    subroutine usleep(t) bind(C, name="my_usleep")
> +      use iso_c_binding
> +      integer(c_int), value :: t
> +    end subroutine
> +  end interface

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/order-reproducible-2.f90
> @@ -0,0 +1,36 @@
> +! { dg-additional-sources my-usleep.c }
> +program main
> +  implicit none
> +  interface
> +    subroutine usleep(t) bind(C, name="my_usleep")
> +      use iso_c_binding
> +      integer(c_int), value :: t
> +    end subroutine
> +  end interface

        Jakub
[PATCH] gcov: make profile merging smarter
Support merging of profiles that are built from different .o files
but belong to the same source file. Moreover, a checksum is verified
during profile merging, so we can safely combine such profiles.

The patch bootstraps on x86_64-linux-gnu and survives regression tests.
I'm going to install the patch if there are no comments about it.

Thanks,
Martin

        PR gcov-profile/90364

gcc/ChangeLog:

        * coverage.c (build_info): Emit checksum to the global variable.
        (build_info_type): Add new field for checksum.
        (coverage_obj_finish): Pass object_checksum.
        (coverage_init): Use 0 as checksum for .gcno files.
        * gcov-dump.c (dump_gcov_file): Dump also new checksum field.
        * gcov.c (read_graph_file): Read also checksum.

libgcc/ChangeLog:

        * libgcov-driver.c (merge_one_data): Skip timestamp and verify
        checksums.
        (write_one_data): Write also checksum.
        * libgcov-util.c (read_gcda_file): Read also checksum field.
        * libgcov.h (struct gcov_info): Add new field.
---
 gcc/coverage.c          | 50 -
 gcc/gcov-dump.c         |  9 
 gcc/gcov.c              |  5 +
 libgcc/libgcov-driver.c |  8 +--
 libgcc/libgcov-util.c   |  3 +++
 libgcc/libgcov.h        |  1 +
 6 files changed, 54 insertions(+), 22 deletions(-)

diff --git a/gcc/coverage.c b/gcc/coverage.c
index 10d7f8366cb..4467f1eaa5c 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -129,16 +129,7 @@ static const char *const ctr_names[GCOV_COUNTERS] = {
 #undef DEF_GCOV_COUNTER
 
 /* Forward declarations. */
-static void read_counts_file (void);
 static tree build_var (tree, tree, int);
-static void build_fn_info_type (tree, unsigned, tree);
-static void build_info_type (tree, tree);
-static tree build_fn_info (const struct coverage_data *, tree, tree);
-static tree build_info (tree, tree);
-static bool coverage_obj_init (void);
-static vec *coverage_obj_fn
-(vec *, tree, struct coverage_data const *);
-static void coverage_obj_finish (vec *);
 
 /* Return the type node for gcov_type. */
 
@@ -218,6 +209,9 @@ read_counts_file (void)
   tag = gcov_read_unsigned ();
   bbg_file_stamp = crc32_unsigned (bbg_file_stamp, tag);
 
+  /* Read checksum. */
+  gcov_read_unsigned ();
+
   counts_hash = new hash_table (10);
   while ((tag = gcov_read_unsigned ()))
     {
@@ -935,6 +929,12 @@ build_info_type (tree type, tree fn_info_ptr_type)
   DECL_CHAIN (field) = fields;
   fields = field;
 
+  /* Checksum. */
+  field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
+                      get_gcov_unsigned_t ());
+  DECL_CHAIN (field) = fields;
+  fields = field;
+
   /* Filename */
   field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
                       build_pointer_type (build_qualified_type
@@ -977,7 +977,7 @@ build_info_type (tree type, tree fn_info_ptr_type)
    function info objects. */
 
 static tree
-build_info (tree info_type, tree fn_ary)
+build_info (tree info_type, tree fn_ary, unsigned object_checksum)
 {
   tree info_fields = TYPE_FIELDS (info_type);
   tree merge_fn_type, n_funcs;
@@ -996,13 +996,19 @@ build_info (tree info_type, tree fn_ary)
   /* next -- NULL */
   CONSTRUCTOR_APPEND_ELT (v1, info_fields, null_pointer_node);
   info_fields = DECL_CHAIN (info_fields);
-
+
   /* stamp */
   CONSTRUCTOR_APPEND_ELT (v1, info_fields,
                           build_int_cstu (TREE_TYPE (info_fields),
                                           bbg_file_stamp));
   info_fields = DECL_CHAIN (info_fields);
 
+  /* Checksum. */
+  CONSTRUCTOR_APPEND_ELT (v1, info_fields,
+                          build_int_cstu (TREE_TYPE (info_fields),
+                                          object_checksum));
+  info_fields = DECL_CHAIN (info_fields);
+
   /* Filename */
   da_file_name_len = strlen (da_file_name);
   filename_string = build_string (da_file_name_len + 1, da_file_name);
@@ -1214,7 +1220,8 @@ coverage_obj_fn (vec *ctor, tree fn,
    function objects from CTOR. Generate the gcov_info initializer. */
 
 static void
-coverage_obj_finish (vec *ctor)
+coverage_obj_finish (vec *ctor,
+                     unsigned object_checksum)
 {
   unsigned n_functions = vec_safe_length (ctor);
   tree fn_info_ary_type = build_array_type
@@ -1231,7 +1238,7 @@ coverage_obj_finish (vec *ctor)
   varpool_node::finalize_decl (fn_info_ary);
 
   DECL_INITIAL (gcov_info_var)
-    = build_info (TREE_TYPE (gcov_info_var), fn_info_ary);
+    = build_info (TREE_TYPE (gcov_info_var), fn_info_ary, object_checksum);
   varpool_node::finalize_decl (gcov_info_var);
 }
 
@@ -1300,7 +1307,6 @@ coverage_init (const char *filename)
   strcpy (da_file_name + prefix_len + len, GCOV_DATA_SUFFIX);
 
   bbg_file_stamp = local_tick;
-
   if (flag_auto_profile)
     read_autofdo_file ();
   else if (flag_branch_probabilities)
@@ -1328,6 +1334,8 @@ coverage_init (const char *filename)
           gcov_write_unsigned (GCOV_NOTE_MAG
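
As a reading aid for the merge rule described in the patch summary (skip the timestamp, verify the checksum, then combine), here is a minimal sketch in plain C. It is not libgcov code: `struct profile`, its fields and `merge_profile` are hypothetical names invented for illustration, and the real on-disk layout and merge functions differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one profile record.
   This is NOT the layout used by libgcov.  */
struct profile
{
  uint32_t stamp;       /* Differs between separately built .o files.  */
  uint32_t checksum;    /* Identifies the instrumented code itself.  */
  size_t n_counters;
  uint64_t counters[4];
};

/* Merge SRC into DST.  The stamp is deliberately ignored; only the
   checksum and the counter count decide whether merging is safe.
   Returns false and leaves DST untouched on a mismatch.  */
static bool
merge_profile (struct profile *dst, const struct profile *src)
{
  assert (dst != NULL && src != NULL);

  if (dst->checksum != src->checksum
      || dst->n_counters != src->n_counters)
    return false;

  for (size_t i = 0; i < dst->n_counters; i++)
    dst->counters[i] += src->counters[i];
  return true;
}
```

In this sketch two records with different stamps but equal checksums are summed, while a record with a different checksum is rejected rather than silently overwriting the existing data.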
RE: [PATCH][GCC] aarch64: add armv9-a to -march
> > Subject: [PATCH][GCC] aarch64: add armv9-a to -march
> >
> > Patch is adding new command line option 'armv9-a' to -march.
> >
> > OK for master?
>
> Ok.

commit f0688d42c9b74a6999548ff2e79ae440b049b87f

> Thanks,
> Kyrill
>
> >
> > gcc/ChangeLog:
> >
> > 2021-09-22  Przemyslaw Wirkus
> >
> >         * config/aarch64/aarch64-arches.def (AARCH64_ARCH): Added
> >         armv9-a.
> >         * config/aarch64/aarch64.h (AARCH64_FL_V9): New.
> >         (AARCH64_FL_FOR_ARCH9): New flags for Armv9-A.
> >         (AARCH64_ISA_V9): New ISA flag.
[committed] wwwdocs: gcc-12/changes.html: Simplify AVX512-FP16 news
Just some editorial changes to simplify things.

Pushed.

Gerald
---
 htdocs/gcc-12/changes.html | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 0e2962ee..45e87ea4 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -169,14 +169,14 @@ a work-in-progress.
 IA-32/x86-64
 
-  New ISA extension support for Intel AVX512-FP16 was added to GCC.
+  New ISA extension support for Intel AVX512-FP16 was added.
   AVX512FP16 intrinsics are available via the -mavx512fp16
   compiler switch.
 
-  For both C and C++, The _Float16 type is supported on
+  For both C and C++ the _Float16 type is supported on
   x86 systems with SSE2 enabled. Without {-mavx512fp16},
-  all operations will be emulated by software emulation and the
-  float instructions.
+  all operations will be emulated in software and float
+  instructions.
 
-- 
2.33.0
Re: [PATCH] [GCC12] Mention Intel AVX512-FP16 and _Float16 support.
On Fri, 24 Sep 2021, Hongtao Liu via Gcc-patches wrote:
> +  New ISA extension support for Intel AVX512-FP16 was added to GCC.
> +  AVX512FP16 intrinsics are available [...]

So, is it AVX512-FP16 or AVX512FP16?

Gerald
Re: [PATCH] gcov: make profile merging smarter
On Fri, Oct 1, 2021 at 11:55 AM Martin Liška wrote:
>
> Support merging of profiles that are built from a different .o files
> but belong to the same source file. Moreover, a checksum is verified
> during profile merging and so we can safely combine such profile.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> I'm going to install the patch if there are no comments about it.

Is the ->stamp field now unused?

I wonder whether crc32 is good enough or whether we need to enable
-fprofile-correction by default, for example?  But -fprofile-correction
is about -fprofile-use, not profile merging, right?  What does the
latter do upon mismatch?

Alternatively, would it be possible to keep multiple sets of data in
the file, one for each 'stamp'?  How does the profile-use step figure
out a mismatch here, or does it not care whether it mixes input with
'different stamp'?

The current behavior is also somewhat odd:

> ./a.out
libgcov profiling error:/tmp/t.gcda:overwriting an existing profile data with a different timestamp

but it should probably say 'warning' since it indeed simply overwrites
data instead of merging it.

I wonder if we can simply drop the stamp check altogether and make the
checks that match up the data against the number of counters warn and
overwrite the data instead of failing fatally?  The actual merging of
data would need to be delayed, but at least until the first actual
merge was done we could simply proceed?

Richard.

> Thanks,
> Martin
>
>         PR gcov-profile/90364
>
> gcc/ChangeLog:
>
>         * coverage.c (build_info): Emit checksum to the global variable.
>         (build_info_type): Add new field for checksum.
>         (coverage_obj_finish): Pass object_checksum.
>         (coverage_init): Use 0 as checksum for .gcno files.
>         * gcov-dump.c (dump_gcov_file): Dump also new checksum field.
>         * gcov.c (read_graph_file): Read also checksum.
>
> libgcc/ChangeLog:
>
>         * libgcov-driver.c (merge_one_data): Skip timestamp and verify
>         checksums.
>         (write_one_data): Write also checksum.
>         * libgcov-util.c (read_gcda_file): Read also checksum field.
>         * libgcov.h (struct gcov_info): Add new field.
> ---
> [...]
[committed] wwwdocs: Consistently use 32-bit instead of 32bit
Just a little thing I noticed in one of the recent commits.

Pushed.
---
 htdocs/gcc-12/changes.html | 2 +-
 htdocs/gcc-8/changes.html  | 2 +-
 htdocs/news.html           | 4 ++--
 htdocs/news/sparc.html     | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 45e87ea4..4f7bbd33 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -55,7 +55,7 @@ a work-in-progress.
 The hppa[12] wwwdocs:*-*-hpux10* and hppa[12]*-*-hpux11*
-configurations targeting 32bit PA-RISC with HP-UX have been obsoleted and
+configurations targeting 32-bit PA-RISC with HP-UX have been obsoleted and
 will be removed in a future release.
 The support for the m32r-*-linux*, m32rle-*-linux*,
diff --git a/htdocs/gcc-8/changes.html b/htdocs/gcc-8/changes.html
index d9404170..84f32d3e 100644
--- a/htdocs/gcc-8/changes.html
+++ b/htdocs/gcc-8/changes.html
@@ -300,7 +300,7 @@ f () { /* Do something. */; }
 Fixed illegal addresses generated from address expressions which refer
 only to offset 0.
 Fixed a bug with reg+offset addressing on 32b segments.
-In 'large' mode, the offset is treated as 32bits unless it's
+In 'large' mode, the offset is treated as 32-bit unless it's
 in global, read-only or kernarg address space.
 Fixed a crash caused sometimes by calls with more than 4 arguments.
diff --git a/htdocs/news.html b/htdocs/news.html
index a4acd823..24eab163 100644
--- a/htdocs/news.html
+++ b/htdocs/news.html
@@ -1559,7 +1559,7 @@ improvement work were and will be of great use.
 September 21, 1999
 Nick Clifton of Cygnus Solutions has donated support for the Fujitsu
-FR30 processor. The FR30 is a low-cost 32bit cpu intended for larger
+FR30 processor. The FR30 is a low-cost 32-bit CPU intended for larger
 embedded applications. It has a simple load/store architecture, 16
 general registers and a variable length instruction set.
@@ -1762,7 +1762,7 @@ are expected in the future.
 January 21, 1999
 Cygnus donates support for the PowerPC
-750 processor. The PPC750 is a 32bit superscalar implementation of the
+750 processor. The PPC750 is a 32-bit superscalar implementation of the
 PowerPC family manufactured by both Motorola and IBM. The PPC750 is
 targeted at high end Macs as well as high end embedded applications.
diff --git a/htdocs/news/sparc.html b/htdocs/news/sparc.html
index fcfb8cfa..4c379a06 100644
--- a/htdocs/news/sparc.html
+++ b/htdocs/news/sparc.html
@@ -138,7 +138,7 @@ improve long term maintainability of the compiler. Details follow.
    3) Full support for nearly all features of the new 64-bit SPARC ELF V9 ABI.
       This includes support for all meaningful code models, including
       MediumLow, MediumMiddle, MediunAny (both old and new for
-      backwards compatibility with older GCC versions), and 32bit.
+      backwards compatibility with older GCC versions), and 32-bit.
 
    4) Tremendously improved support for instruction level parallelism
       on UltraSPARC. Using some new pieces of infrastructure added to
-- 
2.33.0
Re: [PATCH] gcov: make profile merging smarter
On 10/1/21 12:17, Richard Biener wrote:
> On Fri, Oct 1, 2021 at 11:55 AM Martin Liška wrote:
> >
> > Support merging of profiles that are built from a different .o files
> > but belong to the same source file. Moreover, a checksum is verified
> > during profile merging and so we can safely combine such profile.
> >
> > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> > I'm going to install the patch if there are no comments about it.
>
> Is the ->stamp field now unused?

No, it's still used for pairing of .gcda and .gcno files when --coverage
is used.

> I wonder whether crc32 is good enough or whether we need to enable

Dunno. We can alternatively use a stronger hashing function, maybe inchash?

> -fprofile-correction by default for example?

Probably not.

> But -fprofile-correction is about -fprofile-use, not profile merging,
> right?  What does the latter do upon mismatch?

It prints the 'libgcov profiling error'.

> Alternatively would it be possible to keep multiple sets of data in the
> file, one for each 'stamp'?

Yes, we can theoretically do it, but I'm not planning to work on that now.

> How does the profile-use step figure a mismatch here, or does it not
> care whether it mixes input with 'different stamp'?

Based on the function id, it verifies cfg_checksum and lineno_checksum.

> The current behavior is also somewhat odd:
>
> > ./a.out
> libgcov profiling error:/tmp/t.gcda:overwriting an existing profile data with a different timestamp
>
> but it should probably say 'warning' since it indeed simply overwrites
> data instead of merging it.

Yes, I can prepare an incremental patch for it.

> I wonder if we can simply drop the stamp check alltogether and make
> the checks that match up the data against the number of counters
> behave as to warning and overwriting the data instead of failing
> fatally?  The actual merging of data would need to be delayed, but at
> least until the first actual merge was done we could simply proceed?

Do you mean merging only the functions that have a correct checksum, and
bailing out for the functions that don't?

Thanks for the ideas.
Martin

> Richard.
>
> > Thanks,
> > Martin
> >
> >         PR gcov-profile/90364
> >
> > gcc/ChangeLog:
> >
> >         * coverage.c (build_info): Emit checksum to the global variable.
> >         (build_info_type): Add new field for checksum.
> >         (coverage_obj_finish): Pass object_checksum.
> >         (coverage_init): Use 0 as checksum for .gcno files.
> >         * gcov-dump.c (dump_gcov_file): Dump also new checksum field.
> >         * gcov.c (read_graph_file): Read also checksum.
> >
> > libgcc/ChangeLog:
> >
> >         * libgcov-driver.c (merge_one_data): Skip timestamp and verify
> >         checksums.
> >         (write_one_data): Write also checksum.
> >         * libgcov-util.c (read_gcda_file): Read also checksum field.
> >         * libgcov.h (struct gcov_info): Add new field.
> > ---
> > [...]
Re: [PATCH] Replace VRP threader with a hybrid forward threader.
On Fri, 24 Sep 2021, Aldy Hernandez via Gcc-patches wrote:
> This patch implements the new hybrid forward threader and replaces the
> embedded VRP threader with it.

I'm not sure this is the right one of the patches to follow up on, but
between Jeff writing

  "Note we've got massive failures in the tester starting sometime
  yesterday and I suspect all the threader work. So I'm going to slow
  down on reviews of that code as we stabilize stuff."

in another thread and you

  "There seems to be a memory consumption issue on 32 bit hosts after
  the hybrid threader patchset. I'm having a hard time reproducing..."

in yet another, I can report that my i586-unknown-freebsd11 nightly
tester started to fail on Sep 28 at 00:40 UTC, still failed Sep 29 and
Sep 30, and successfully passed last night.

Failures were all at the same point in all-stage2-gcc:

  cc1plus: out of memory allocating 65536 bytes after a total of 0 bytes
  gmake[3]: *** [Makefile:1136: insn-emit.o] Error 1

  cc1plus: out of memory allocating 65536 bytes after a total of 0 bytes
  gmake[3]: *** [Makefile:1136: insn-emit.o] Error 1

  cc1plus: out of memory allocating 86776 bytes after a total of 0 bytes
  gmake[3]: *** [Makefile:1136: insn-emit.o] Error 1

Is this under control now, or was last night just a lucky one?

Since that reproduced somewhat regularly, how may I be able to help?

Gerald
Re: [PATCH] c: [PR32122] Require pointer types for computed gotos
On Mon, Sep 20, 2021 at 12:15 AM apinski--- via Gcc-patches wrote:
>
> From: Andrew Pinski
>
> So GCC has always accepted non-pointer types in computed gotos but
> that was wrong based on the documentation:
> Any expression of type void * is allowed.
>
> So this fixes the problem by requiring the type to
> be a pointer type.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
>         PR c/32122
>
> gcc/c/ChangeLog:
>
>         * c-parser.c (c_parser_statement_after_labels): Pass
>         the c_expr instead of the tree to c_finish_goto_ptr.
>         * c-typeck.c (c_finish_goto_ptr): Change the second
>         argument type to c_expr.
>         * c-tree.h (c_finish_goto_ptr): Likewise.
>         Error out if the expression was not of a pointer type.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/comp-goto-5.c: New test.
>         * gcc.dg/comp-goto-6.c: New test.
> ---
>  gcc/c/c-parser.c                   |  2 +-
>  gcc/c/c-tree.h                     |  2 +-
>  gcc/c/c-typeck.c                   | 11 ++-
>  gcc/testsuite/gcc.dg/comp-goto-5.c | 11 +++
>  gcc/testsuite/gcc.dg/comp-goto-6.c |  6 ++
>  5 files changed, 29 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/comp-goto-5.c
>  create mode 100644 gcc/testsuite/gcc.dg/comp-goto-6.c
>
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index fb1399e300d..bcd8a05489f 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -6141,7 +6141,7 @@ c_parser_statement_after_labels (c_parser *parser, bool *if_p,
>           c_parser_consume_token (parser);
>           val = c_parser_expression (parser);
>           val = convert_lvalue_to_rvalue (loc, val, false, true);
> -         stmt = c_finish_goto_ptr (loc, val.value);
> +         stmt = c_finish_goto_ptr (loc, val);
>         }
>        else
>         c_parser_error (parser, "expected identifier or %<*%>");
> diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
> index d50d0cb7f2d..a046c6b0926 100644
> --- a/gcc/c/c-tree.h
> +++ b/gcc/c/c-tree.h
> @@ -746,7 +746,7 @@ extern tree c_finish_expr_stmt (location_t, tree);
>  extern tree c_finish_return (location_t, tree, tree);
>  extern tree c_finish_bc_stmt (location_t, tree, bool);
>  extern tree c_finish_goto_label (location_t, tree);
> -extern tree c_finish_goto_ptr (location_t, tree);
> +extern tree c_finish_goto_ptr (location_t, c_expr val);
>  extern tree c_expr_to_decl (tree, bool *, bool *);
>  extern tree c_finish_omp_construct (location_t, enum tree_code, tree, tree);
>  extern tree c_finish_oacc_data (location_t, tree, tree);
> diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
> index 49d1bb067a0..b472e448011 100644
> --- a/gcc/c/c-typeck.c
> +++ b/gcc/c/c-typeck.c
> @@ -10783,10 +10783,19 @@ c_finish_goto_label (location_t loc, tree label)
>     the GOTO. */
>
>  tree
> -c_finish_goto_ptr (location_t loc, tree expr)
> +c_finish_goto_ptr (location_t loc, c_expr val)
>  {
> +  tree expr = val.value;
>    tree t;
>    pedwarn (loc, OPT_Wpedantic, "ISO C forbids %");
> +  if (expr != error_mark_node
> +      && !POINTER_TYPE_P (TREE_TYPE (expr))
> +      && !null_pointer_constant_p (expr))
> +    {
> +      error_at (val.get_location (),
> +                "computed goto must be pointer type");
> +      expr = build_zero_cst (ptr_type_node);
> +    }
>    expr = c_fully_fold (expr, false, NULL);
>    expr = convert (ptr_type_node, expr);
>    t = build1 (GOTO_EXPR, void_type_node, expr);
> diff --git a/gcc/testsuite/gcc.dg/comp-goto-5.c b/gcc/testsuite/gcc.dg/comp-goto-5.c
> new file mode 100644
> index 000..d487729a5d4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/comp-goto-5.c
> @@ -0,0 +1,11 @@
> +/* PR c/32122 */
> +/* { dg-do compile } */
> +/* { dg-options "" } */
> +
> +enum {a=1};
> +void foo()
> +{
> +  goto *
> +        a; /* { dg-error "computed goto must be pointer type" } */
> +}
> +
> diff --git a/gcc/testsuite/gcc.dg/comp-goto-6.c b/gcc/testsuite/gcc.dg/comp-goto-6.c
> new file mode 100644
> index 000..497f6cd76ca
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/comp-goto-6.c
> @@ -0,0 +1,6 @@
> +/* PR c/32122 */
> +/* { dg-do compile } */
> +/* { dg-options "" } */
> +void foo(void *a) { goto *1000; } /* { dg-error "computed goto must be pointer type" } */
> +void foo1(void *a) { goto *a; }
> +
> --
> 2.17.1
>

Maybe add to one of the testcases a test to ensure that the
cast-to-void workaround works successfully?
e.g.
void foo2(void *a) { goto *(void *)1000; } /* { dg-bogus "computed goto must be pointer type" } */
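
For readers unfamiliar with the extension under discussion, the following is a small sketch of a valid computed goto using GCC's labels-as-values feature, where `&&label` yields a `void *`. The function `run` and its three-opcode encoding are made up for illustration and are not part of the patch or its testcases. Every operand of `goto *` here already has pointer type, so code like this should be unaffected by the stricter check proposed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-opcode interpreter: 0 = halt, 1 = increment,
   2 = double.  Requires the GNU C labels-as-values extension.  */
static int
run (const unsigned char *ops)
{
  /* Each &&label has type void *, which is what "goto *" expects.  */
  static void *const dispatch[] = { &&op_halt, &&op_inc, &&op_dbl };
  int acc = 0;

  assert (ops != NULL);

  /* Opcodes are assumed to be in the range 0..2.  */
  goto *dispatch[*ops++];

op_inc:
  acc += 1;
  goto *dispatch[*ops++];

op_dbl:
  acc *= 2;
  goto *dispatch[*ops++];

op_halt:
  return acc;
}
```

By contrast, per the testcases in the patch, `goto *1000;` is meant to be rejected, while the explicitly cast form `goto *(void *)1000;` suggested in the reply has pointer type and should still be accepted.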
Re: [PATCH] Replace VRP threader with a hybrid forward threader.
On 10/1/21 12:55 PM, Gerald Pfeifer wrote:
> On Fri, 24 Sep 2021, Aldy Hernandez via Gcc-patches wrote:
> > This patch implements the new hybrid forward threader and replaces the
> > embedded VRP threader with it.
>
> I'm not sure this is the right of the patches to follow-up around this,
> but between Jeff writing
>
>   "Note we've got massive failures in the tester starting sometime
>   yesterday and I suspect all the threader work. So I'm going to slow
>   down on reviews of that code as we stabilize stuff."

Most of this has been resolved.  Some of the failures came from
out-of-tree patches Jeff had in his tree, and others were tests that
needed adjustments on other architectures.  Both have been fixed.

That being said, the visium & bfin embedded targets have some failures
I've yet to look at.

On a similar topic, disallowing threading paths that cross loops has
brought some problems that I'm looking at.

> in another thread and you
>
>   "There seems to be a memory consumption issue on 32 bit hosts after
>   the hybrid threader patchset. I'm having a hard time reproducing..."

This has been fixed by:

commit 64dd46dbc682fbbc03a74e0298f7ac471c5e80f2
Author: Aldy Hernandez
Date:   Thu Sep 30 02:19:36 2021 +0200

    Plug memory leak in hybrid_threader.

    Tested on x86-64 Linux.

Aldy
Re: [PATCH] c: [PR32122] Require pointer types for computed gotos
On Fri, Oct 1, 2021 at 4:03 AM Eric Gallager via Gcc-patches wrote: > > On Mon, Sep 20, 2021 at 12:15 AM apinski--- via Gcc-patches > wrote: > > > > From: Andrew Pinski > > > > So GCC has always accepted non-pointer types in computed gotos but > > that was wrong based on the documentation: > > Any expression of type void * is allowed. > > > > So this fixes the problem by requiring the type to > > be a pointer type. > > > > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. > > > > PR c/32122 > > > > gcc/c/ChangeLog: > > > > * c-parser.c (c_parser_statement_after_labels): Pass > > the c_expr instead of the tree to c_finish_goto_ptr. > > * c-typeck.c (c_finish_goto_ptr): Change the second > > argument type to c_expr. > > * c-tree.h (c_finish_goto_ptr): Likewise. > > Error out if the expression was not of a pointer type. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/comp-goto-5.c: New test. > > * gcc.dg/comp-goto-6.c: New test. > > --- > > gcc/c/c-parser.c | 2 +- > > gcc/c/c-tree.h | 2 +- > > gcc/c/c-typeck.c | 11 ++- > > gcc/testsuite/gcc.dg/comp-goto-5.c | 11 +++ > > gcc/testsuite/gcc.dg/comp-goto-6.c | 6 ++ > > 5 files changed, 29 insertions(+), 3 deletions(-) > > create mode 100644 gcc/testsuite/gcc.dg/comp-goto-5.c > > create mode 100644 gcc/testsuite/gcc.dg/comp-goto-6.c > > > > diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c > > index fb1399e300d..bcd8a05489f 100644 > > --- a/gcc/c/c-parser.c > > +++ b/gcc/c/c-parser.c > > @@ -6141,7 +6141,7 @@ c_parser_statement_after_labels (c_parser *parser, > > bool *if_p, > > c_parser_consume_token (parser); > > val = c_parser_expression (parser); > > val = convert_lvalue_to_rvalue (loc, val, false, true); > > - stmt = c_finish_goto_ptr (loc, val.value); > > + stmt = c_finish_goto_ptr (loc, val); > > } > > else > > c_parser_error (parser, "expected identifier or %<*%>"); > > diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h > > index d50d0cb7f2d..a046c6b0926 100644 > > --- a/gcc/c/c-tree.h > > +++ 
b/gcc/c/c-tree.h > > @@ -746,7 +746,7 @@ extern tree c_finish_expr_stmt (location_t, tree); > > extern tree c_finish_return (location_t, tree, tree); > > extern tree c_finish_bc_stmt (location_t, tree, bool); > > extern tree c_finish_goto_label (location_t, tree); > > -extern tree c_finish_goto_ptr (location_t, tree); > > +extern tree c_finish_goto_ptr (location_t, c_expr val); > > extern tree c_expr_to_decl (tree, bool *, bool *); > > extern tree c_finish_omp_construct (location_t, enum tree_code, tree, > > tree); > > extern tree c_finish_oacc_data (location_t, tree, tree); > > diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c > > index 49d1bb067a0..b472e448011 100644 > > --- a/gcc/c/c-typeck.c > > +++ b/gcc/c/c-typeck.c > > @@ -10783,10 +10783,19 @@ c_finish_goto_label (location_t loc, tree label) > > the GOTO. */ > > > > tree > > -c_finish_goto_ptr (location_t loc, tree expr) > > +c_finish_goto_ptr (location_t loc, c_expr val) > > { > > + tree expr = val.value; > >tree t; > >pedwarn (loc, OPT_Wpedantic, "ISO C forbids %"); > > + if (expr != error_mark_node > > + && !POINTER_TYPE_P (TREE_TYPE (expr)) > > + && !null_pointer_constant_p (expr)) > > +{ > > + error_at (val.get_location (), > > + "computed goto must be pointer type"); > > + expr = build_zero_cst (ptr_type_node); > > +} > >expr = c_fully_fold (expr, false, NULL); > >expr = convert (ptr_type_node, expr); > >t = build1 (GOTO_EXPR, void_type_node, expr); > > diff --git a/gcc/testsuite/gcc.dg/comp-goto-5.c > > b/gcc/testsuite/gcc.dg/comp-goto-5.c > > new file mode 100644 > > index 000..d487729a5d4 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/comp-goto-5.c > > @@ -0,0 +1,11 @@ > > +/* PR c/32122 */ > > +/* { dg-do compile } */ > > +/* { dg-options "" } */ > > + > > +enum {a=1}; > > +void foo() > > +{ > > + goto * > > +a; /* { dg-error "computed goto must be pointer type" } */ > > +} > > + > > diff --git a/gcc/testsuite/gcc.dg/comp-goto-6.c > > b/gcc/testsuite/gcc.dg/comp-goto-6.c > > new file mode 100644 
> > index 000..497f6cd76ca > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/comp-goto-6.c > > @@ -0,0 +1,6 @@ > > +/* PR c/32122 */ > > +/* { dg-do compile } */ > > +/* { dg-options "" } */ > > +void foo(void *a) { goto *1000; } /* { dg-error "computed goto must be > > pointer type" } */ > > +void foo1(void *a) { goto *a; } > > + > > -- > > 2.17.1 > > > > Maybe add to one of the testcases a test to ensure that the > cast-to-void workaround works successfully? > e.g. > void foo2(void *a) { goto *(void *)1000; } /* { dg-bogus "computed > goto must be pointer type" } */ There actually were a few testcases which needed to be f
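For context, here is a minimal sketch of the GNU computed-goto extension that the new check guards. The operand of `goto *` is expected to be a pointer, normally a label address taken with `&&`. The interpreter is purely illustrative: `run_program`, the opcode names, and the `DISPATCH` macro are invented here and appear in neither the patch nor the testsuite.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative opcodes for a tiny accumulator machine (hypothetical names).
enum opcode : unsigned char { OP_INC = 0, OP_DEC = 1, OP_HALT = 2 };

// Runs a byte-code program using computed gotos (GNU labels-as-values
// extension, accepted by GCC and Clang).  The program must end in OP_HALT.
int run_program (const unsigned char *code)
{
  // Every entry is a label address, i.e. a genuine pointer, which is what
  // the "computed goto must be pointer type" diagnostic asks for.
  static void *const dispatch[] = { &&do_inc, &&do_dec, &&do_halt };
  int acc = 0;
  std::size_t pc = 0;

#define DISPATCH()                    \
  do                                  \
    {                                 \
      assert (code[pc] <= OP_HALT);   \
      goto *dispatch[code[pc++]];     \
    }                                 \
  while (0)

  DISPATCH ();

do_inc:
  ++acc;
  DISPATCH ();
do_dec:
  --acc;
  DISPATCH ();
do_halt:
#undef DISPATCH
  return acc;
}
```

By contrast, `goto *1000;` (as in comp-goto-6.c) passes a plain integer and is what the patch now rejects.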
RE: [PATCH] aarch64: Improve size heuristic for cpymem expansion
> -Original Message- > From: Gcc-patches bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe > LYON via Gcc-patches > Sent: Thursday, September 30, 2021 2:51 PM > To: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] aarch64: Improve size heuristic for cpymem expansion > > > On 29/09/2021 12:20, Kyrylo Tkachov via Gcc-patches wrote: > > Hi all, > > > > Similar to my previous patch for setmem this one does the same for the > cpymem expansion. > > We count the number of ops emitted and compare it against the > alternative of just calling > > the library function when optimising for size. > > For the code: > > void > > cpy_127 (char *out, char *in) > > { > >__builtin_memcpy (out, in, 127); > > } > > > > void > > cpy_128 (char *out, char *in) > > { > >__builtin_memcpy (out, in, 128); > > } > > > > we now emit a call to memcpy (with an extra MOV-immediate instruction > for the size) instead of: > > cpy_127(char*, char*): > > ldp q0, q1, [x1] > > stp q0, q1, [x0] > > ldp q0, q1, [x1, 32] > > stp q0, q1, [x0, 32] > > ldp q0, q1, [x1, 64] > > stp q0, q1, [x0, 64] > > ldr q0, [x1, 96] > > str q0, [x0, 96] > > ldr q0, [x1, 111] > > str q0, [x0, 111] > > ret > > cpy_128(char*, char*): > > ldp q0, q1, [x1] > > stp q0, q1, [x0] > > ldp q0, q1, [x1, 32] > > stp q0, q1, [x0, 32] > > ldp q0, q1, [x1, 64] > > stp q0, q1, [x0, 64] > > ldp q0, q1, [x1, 96] > > stp q0, q1, [x0, 96] > > ret > > > > which is a clear code size win. Speed optimisation heuristics remain > unchanged. > > Bootstrapped and tested on aarch64-none-linux-gnu. > > Pushing to trunk. > > > > Thanks, > > Kyrill > > > > 2021-09-29 Kyrylo Tkachov > > > > * config/aarch64/aarch64.c (aarch64_expand_cpymem): Count > number of > > emitted operations and adjust heuristic for code size. > > > > 2021-09-29 Kyrylo Tkachov > > > > * gcc.target/aarch64/cpymem-size.c: New test. > > > Hi Kyrill, > > Just to mention that the new test fails with -mabi=ilp32... Drat, thanks for letting me know. Pushing fix to trunk. 
Kyrill gcc/testsuite/ * gcc.target/aarch64/cpymem-size.c: Adjust scan for ilp32. > > > Thanks, > > > Christophe > > test.patch Description: test.patch
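The size trade-off described in this thread can be illustrated with a rough instruction-count model. This is only a sketch under stated assumptions, not the logic in aarch64_expand_cpymem: the function names and the constant `libcall_ops` are hypothetical, and the model only covers copies of at least 16 bytes done with Q registers, matching the listings quoted above.

```cpp
#include <cassert>

// Assumed cost of the library call when optimising for size: one
// MOV-immediate for the length plus the branch to memcpy.
constexpr unsigned libcall_ops = 2;

// Number of load/store instructions for an inline copy of N bytes using
// 16-byte Q registers (hypothetical model, requires N >= 16).
unsigned inline_copy_ops (unsigned n)
{
  assert (n >= 16);
  unsigned ops = (n / 32) * 2;   // One LDP + one STP per 32-byte chunk.
  unsigned rem = n % 32;
  if (rem > 16)
    ops += 4;                    // LDR/STR, then an overlapping LDR/STR.
  else if (rem > 0)
    ops += 2;                    // One (possibly overlapping) LDR/STR.
  return ops;
}

// True when the inline expansion is larger than the assumed libcall.
bool prefer_libcall_for_size (unsigned n)
{
  return inline_copy_ops (n) > libcall_ops;
}
```

Under this model the 127-byte copy costs 10 instructions and the 128-byte copy costs 8, which agrees with the cpy_127 and cpy_128 sequences in the quoted message (excluding the final ret).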
[PATCH] ubsan: Move INT_MIN / -1 instrumentation from -fsanitize=integer-divide-by-zero to -fsanitize=signed-integer-overflow [PR102515]
Hi! As noted by Richi, in clang INT_MIN / -1 is instrumented under -fsanitize=signed-integer-overflow rather than -fsanitize=integer-divide-by-zero as we did and doing it in the former makes more sense, as it is overflow during division rather than division by zero. I've verified on godbolt that clang behaved that way since 3.2-ish times or so when sanitizers were added. Furthermore, we've been using -f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float -fsanitize=float-divide-by-zero instrumentation _abort suffix. The case where INT_MIN / -1 is instrumented by one sanitizer and x / 0 by another one when both are enabled is slightly harder if the -f{,no-}sanitize-recover={integer-divide-by-zero,signed-integer-overflow} flags differ, then we need to emit both __ubsan_handle_divrem_overflow and __ubsan_handle_divrem_overflow_abort calls guarded by their respective checks rather than one guarded by check1 || check2. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2021-10-01 Jakub Jelinek Richard Biener PR sanitizer/102515 gcc/ * doc/invoke.texi (-fsanitize=integer-divide-by-zero): Remove INT_MIN / -1 division detection from here ... (-fsanitize=signed-integer-overflow): ... and add it here. gcc/c-family/ * c-ubsan.c (ubsan_instrument_division): Check the right flag_sanitize_recover bit, depending on which sanitization is done. Sanitize INT_MIN / -1 under SANITIZE_SI_OVERFLOW rather than SANITIZE_DIVIDE. If both SANITIZE_SI_OVERFLOW and SANITIZE_DIVIDE is enabled, neither check is known to be false and flag_sanitize_recover bits for those two aren't the same, emit both __ubsan_handle_divrem_overflow and __ubsan_handle_divrem_overflow_abort calls. gcc/c/ * c-typeck.c (build_binary_op): Call ubsan_instrument_division for division even for SANITIZE_SI_OVERFLOW. gcc/cp/ * typeck.c (cp_build_binary_op): Call ubsan_instrument_division for division even for SANITIZE_SI_OVERFLOW. 
gcc/testsuite/ * c-c++-common/ubsan/div-by-zero-3.c: Use -fsanitize=signed-integer-overflow instead of -fsanitize=integer-divide-by-zero. * c-c++-common/ubsan/div-by-zero-5.c: Likewise. * c-c++-common/ubsan/div-by-zero-4.c: Likewise. Add -fsanitize-undefined-trap-on-error. * c-c++-common/ubsan/float-div-by-zero-2.c: New test. * c-c++-common/ubsan/overflow-div-1.c: New test. * c-c++-common/ubsan/overflow-div-2.c: New test. * c-c++-common/ubsan/overflow-div-3.c: New test. --- gcc/doc/invoke.texi.jj 2021-09-29 10:07:41.841880681 +0200 +++ gcc/doc/invoke.texi 2021-09-30 20:21:37.736147295 +0200 @@ -15229,7 +15229,7 @@ ISO C90 and C99, etc. @item -fsanitize=integer-divide-by-zero @opindex fsanitize=integer-divide-by-zero -Detect integer division by zero as well as @code{INT_MIN / -1} division. +Detect integer division by zero. @item -fsanitize=unreachable @opindex fsanitize=unreachable @@ -15261,7 +15261,8 @@ returning a value. This option works in @opindex fsanitize=signed-integer-overflow This option enables signed integer overflow checking. We check that the result of @code{+}, @code{*}, and both unary and binary @code{-} -does not overflow in the signed arithmetics. Note, integer promotion +does not overflow in the signed arithmetics. This also detects +@code{INT_MIN / -1} signed division. Note, integer promotion rules must be taken into account. That is, the following is not an overflow: @smallexample --- gcc/c-family/c-ubsan.c.jj 2021-01-04 10:25:50.355103357 +0100 +++ gcc/c-family/c-ubsan.c 2021-09-30 20:18:53.273433013 +0200 @@ -39,8 +39,9 @@ along with GCC; see the file COPYING3. tree ubsan_instrument_division (location_t loc, tree op0, tree op1) { - tree t, tt; + tree t, tt, x = NULL_TREE; tree type = TREE_TYPE (op0); + enum sanitize_code flag = SANITIZE_DIVIDE; /* At this point both operands should have the same type, because they are already converted to RESULT_TYPE. 
@@ -58,24 +59,42 @@ ubsan_instrument_division (location_t lo op1, build_int_cst (type, 0)); else if (TREE_CODE (type) == REAL_TYPE && sanitize_flags_p (SANITIZE_FLOAT_DIVIDE)) -t = fold_build2 (EQ_EXPR, boolean_type_node, -op1, build_real (type, dconst0)); +{ + t = fold_build2 (EQ_EXPR, boolean_type_node, + op1, build_real (type, dconst0)); + flag = SANITIZE_FLOAT_DIVIDE; +} else -return NULL_TREE; +t = NULL_TREE; /* We check INT_MIN / -1 only for signed types. */ if (TREE_CODE (type) == INTEGER_TYPE - && sanitize_flags_p (SANITIZE_DIVIDE) + && sanitize_flags_p (SANITIZE_SI_OVERFLOW) && !TYPE_UNSIGNED (type)) { - tree x; tt = fold_build2 (EQ_EXPR, boolean_type_node, unshare_expr (op1),
RE: [PATCH][GCC] aarch64: enable cortex-a510 CPU
Hi Przemek, > -Original Message- > From: Przemyslaw Wirkus > Sent: Wednesday, September 22, 2021 9:35 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Earnshaw ; Richard Sandiford > ; Marcus Shawcroft > ; Kyrylo Tkachov > Subject: [PATCH][GCC] aarch64: enable cortex-a510 CPU > > Patch is adding 'cortex-a510' to -mcpu command line option. > > gcc/ChangeLog: > > 2021-09-02 Przemyslaw Wirkus > > * config/aarch64/aarch64-cores.def (AARCH64_CORE): New > Cortex-A510 core. > * config/aarch64/aarch64-tune.md: Regenerate. > * doc/invoke.texi: Update docs. +/* Arm9.0-A Architecture Processors. */ Typo, should be "Armv9.0-a". + +/* Arm ('A') cores. */ +AARCH64_CORE("cortex-a510", cortexa510, cortexa55, 9A, AARCH64_FL_FOR_ARCH9 | AARCH64_FL_SVE2_BITPERM | AARCH64_FL_MEMTAG | AARCH64_FL_I8MM | AARCH64_FL_BF16, neoversen2, 0x41, 0xd46, -1) + We'll need to update the tuning anyway once we do it properly, but for now I think for the COSTS field (4th to last) we should go with cortexa53 rather than neoversen2. Ok with those changes. Thanks, Kyrill
Re: [PATCH] ubsan: Move INT_MIN / -1 instrumentation from -fsanitize=integer-divide-by-zero to -fsanitize=signed-integer-overflow [PR102515]
On Fri, 1 Oct 2021, Jakub Jelinek wrote: > Hi! > > As noted by Richi, in clang INT_MIN / -1 is instrumented under > -fsanitize=signed-integer-overflow rather than > -fsanitize=integer-divide-by-zero as we did and doing it in the former > makes more sense, as it is overflow during division rather than division > by zero. > I've verified on godbolt that clang behaved that way since 3.2-ish times or > so when sanitizers were added. > Furthermore, we've been using > -f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float > -fsanitize=float-divide-by-zero instrumentation _abort suffix. > The case where INT_MIN / -1 is instrumented by one sanitizer and > x / 0 by another one when both are enabled is slightly harder if > the -f{,no-}sanitize-recover={integer-divide-by-zero,signed-integer-overflow} > flags differ, then we need to emit both __ubsan_handle_divrem_overflow > and __ubsan_handle_divrem_overflow_abort calls guarded by their respective > checks rather than one guarded by check1 || check2. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Maybe the change is worth mentioning in changes.html or maybe it's just a bugfix... Thanks, Richard. > 2021-10-01 Jakub Jelinek > Richard Biener > > PR sanitizer/102515 > gcc/ > * doc/invoke.texi (-fsanitize=integer-divide-by-zero): Remove > INT_MIN / -1 division detection from here ... > (-fsanitize=signed-integer-overflow): ... and add it here. > gcc/c-family/ > * c-ubsan.c (ubsan_instrument_division): Check the right > flag_sanitize_recover bit, depending on which sanitization > is done. Sanitize INT_MIN / -1 under SANITIZE_SI_OVERFLOW > rather than SANITIZE_DIVIDE. If both SANITIZE_SI_OVERFLOW > and SANITIZE_DIVIDE is enabled, neither check is known > to be false and flag_sanitize_recover bits for those two > aren't the same, emit both __ubsan_handle_divrem_overflow > and __ubsan_handle_divrem_overflow_abort calls. 
> gcc/c/ > * c-typeck.c (build_binary_op): Call ubsan_instrument_division > for division even for SANITIZE_SI_OVERFLOW. > gcc/cp/ > * typeck.c (cp_build_binary_op): Call ubsan_instrument_division > for division even for SANITIZE_SI_OVERFLOW. > gcc/testsuite/ > * c-c++-common/ubsan/div-by-zero-3.c: Use > -fsanitize=signed-integer-overflow instead of > -fsanitize=integer-divide-by-zero. > * c-c++-common/ubsan/div-by-zero-5.c: Likewise. > * c-c++-common/ubsan/div-by-zero-4.c: Likewise. Add > -fsanitize-undefined-trap-on-error. > * c-c++-common/ubsan/float-div-by-zero-2.c: New test. > * c-c++-common/ubsan/overflow-div-1.c: New test. > * c-c++-common/ubsan/overflow-div-2.c: New test. > * c-c++-common/ubsan/overflow-div-3.c: New test. > > --- gcc/doc/invoke.texi.jj2021-09-29 10:07:41.841880681 +0200 > +++ gcc/doc/invoke.texi 2021-09-30 20:21:37.736147295 +0200 > @@ -15229,7 +15229,7 @@ ISO C90 and C99, etc. > > @item -fsanitize=integer-divide-by-zero > @opindex fsanitize=integer-divide-by-zero > -Detect integer division by zero as well as @code{INT_MIN / -1} division. > +Detect integer division by zero. > > @item -fsanitize=unreachable > @opindex fsanitize=unreachable > @@ -15261,7 +15261,8 @@ returning a value. This option works in > @opindex fsanitize=signed-integer-overflow > This option enables signed integer overflow checking. We check that > the result of @code{+}, @code{*}, and both unary and binary @code{-} > -does not overflow in the signed arithmetics. Note, integer promotion > +does not overflow in the signed arithmetics. This also detects > +@code{INT_MIN / -1} signed division. Note, integer promotion > rules must be taken into account. That is, the following is not an > overflow: > @smallexample > --- gcc/c-family/c-ubsan.c.jj 2021-01-04 10:25:50.355103357 +0100 > +++ gcc/c-family/c-ubsan.c2021-09-30 20:18:53.273433013 +0200 > @@ -39,8 +39,9 @@ along with GCC; see the file COPYING3. 
> tree > ubsan_instrument_division (location_t loc, tree op0, tree op1) > { > - tree t, tt; > + tree t, tt, x = NULL_TREE; >tree type = TREE_TYPE (op0); > + enum sanitize_code flag = SANITIZE_DIVIDE; > >/* At this point both operands should have the same type, > because they are already converted to RESULT_TYPE. > @@ -58,24 +59,42 @@ ubsan_instrument_division (location_t lo >op1, build_int_cst (type, 0)); >else if (TREE_CODE (type) == REAL_TYPE > && sanitize_flags_p (SANITIZE_FLOAT_DIVIDE)) > -t = fold_build2 (EQ_EXPR, boolean_type_node, > - op1, build_real (type, dconst0)); > +{ > + t = fold_build2 (EQ_EXPR, boolean_type_node, > +op1, build_real (type, dconst0)); > + flag = SANITIZE_FLOAT_DIVIDE; > +} >else > -return NULL_TREE; > +t = NULL_TREE; > >/* We check INT_MIN / -1 only for sign
Re: [PATCH] ubsan: Move INT_MIN / -1 instrumentation from -fsanitize=integer-divide-by-zero to -fsanitize=signed-integer-overflow [PR102515]
On Fri, Oct 01, 2021 at 01:53:37PM +0200, Richard Biener wrote:
> OK. Maybe the change is worth mentioning in changes.html or maybe it's
> just a bugfix...

I think it is worth mentioning there. And so I probably wouldn't try to
backport it, except perhaps the recover fix for float-divide-by-zero.

	Jakub
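To make the distinction in this thread concrete, the sketch below separates the two ways a signed integer division can go wrong. It is an illustration only, not the code ubsan emits; `DivCheck`, `classify_int_division`, and `checked_div` are hypothetical names.

```cpp
#include <cassert>
#include <climits>

enum class DivCheck { Ok, DivideByZero, SignedOverflow };

// Classifies lhs / rhs the way the patch splits the instrumentation:
// x / 0 belongs to -fsanitize=integer-divide-by-zero, while INT_MIN / -1
// is an overflow and moves to -fsanitize=signed-integer-overflow.
DivCheck classify_int_division (int lhs, int rhs)
{
  if (rhs == 0)
    return DivCheck::DivideByZero;
  if (lhs == INT_MIN && rhs == -1)
    return DivCheck::SignedOverflow;   // Result would be INT_MAX + 1.
  return DivCheck::Ok;
}

// Divides only after both checks have passed.
int checked_div (int lhs, int rhs)
{
  assert (classify_int_division (lhs, rhs) == DivCheck::Ok);
  return lhs / rhs;
}
```

Because the two conditions are now owned by different sanitizers, each can carry its own -fsanitize-recover setting, which is why the patch may need to emit both the recovering and the aborting handler call.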
RE: [PATCH][GCC] aarch64: enable cortex-a710 CPU
> -Original Message- > From: Przemyslaw Wirkus > Sent: Wednesday, September 22, 2021 9:37 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Earnshaw ; Richard Sandiford > ; Marcus Shawcroft > ; Kyrylo Tkachov > Subject: [PATCH][GCC] aarch64: enable cortex-a710 CPU > > Patch is adding 'cortex-a710' to -mcpu command line option. > > gcc/ChangeLog: > > 2021-09-02 Przemyslaw Wirkus > > * config/aarch64/aarch64-cores.def (AARCH64_CORE): New > Cortex-A710 core. > * config/aarch64/aarch64-tune.md: Regenerate. > * doc/invoke.texi: Update docs. diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def index 478f7e1c8145365f42f43ad94d90c633aae66ebd..a8027e92fa8f7554e2b19d00f7c85c6ed48a92e5 100644 --- a/gcc/config/aarch64/aarch64-cores.def +++ b/gcc/config/aarch64/aarch64-cores.def @@ -166,4 +166,6 @@ AARCH64_CORE("cortex-r82", cortexr82, cortexa53, 8R, AARCH64_FL_FOR_ARCH8_R, cor /* Arm ('A') cores. */ AARCH64_CORE("cortex-a510", cortexa510, cortexa55, 9A, AARCH64_FL_FOR_ARCH9 | AARCH64_FL_SVE2_BITPERM | AARCH64_FL_MEMTAG | AARCH64_FL_I8MM | AARCH64_FL_BF16, neoversen2, 0x41, 0xd46, -1) +AARCH64_CORE("cortex-a710", cortexa710, cortexa55, 9A, AARCH64_FL_FOR_ARCH9 | AARCH64_FL_SVE2_BITPERM | AARCH64_FL_MEMTAG | AARCH64_FL_I8MM | AARCH64_FL_BF16, neoversen2, 0x41, 0xd47, -1) + Again, we'd need to revisit big-core scheduling properly at some point, but for now I think for the scheduling field (3rd) we should use cortexa57 rather than cortexa55. Ok with that change. Thanks, Kyrill
RE: [PATCH][GCC] aarch64: enable cortex-x2 CPU
> -Original Message- > From: Przemyslaw Wirkus > Sent: Wednesday, September 22, 2021 9:38 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Earnshaw ; Richard Sandiford > ; Marcus Shawcroft > ; Kyrylo Tkachov > Subject: [PATCH][GCC] aarch64: enable cortex-x2 CPU > > Patch is adding 'cortex-x2' to -mcpu command line option. > > OK for master? > > gcc/ChangeLog: > > 2021-09-02 Przemyslaw Wirkus > > * config/aarch64/aarch64-cores.def (AARCH64_CORE): New > Cortex-X2 core. > * config/aarch64/aarch64-tune.md: Regenerate. > * doc/invoke.texi: Update docs. diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def index a8027e92fa8f7554e2b19d00f7c85c6ed48a92e5..34d9646ab6a32a19e7cd09d95594b59278d02920 100644 --- a/gcc/config/aarch64/aarch64-cores.def +++ b/gcc/config/aarch64/aarch64-cores.def @@ -168,4 +168,6 @@ AARCH64_CORE("cortex-a510", cortexa510, cortexa55, 9A, AARCH64_FL_FOR_ARCH9 | AARCH64_CORE("cortex-a710", cortexa710, cortexa55, 9A, AARCH64_FL_FOR_ARCH9 | AARCH64_FL_SVE2_BITPERM | AARCH64_FL_MEMTAG | AARCH64_FL_I8MM | AARCH64_FL_BF16, neoversen2, 0x41, 0xd47, -1) +AARCH64_CORE("cortex-x2", cortexx2, cortexa55, 9A, AARCH64_FL_FOR_ARCH9 | AARCH64_FL_SVE2_BITPERM | AARCH64_FL_MEMTAG | AARCH64_FL_I8MM | AARCH64_FL_BF16, neoversen2, 0x41, 0xd48, -1) + Let's use cortexa57 for scheduling here for now. Thanks, Kyrill
RE: [PATCH][GCC] aarch64: enable cortex-x2 CPU
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, October 1, 2021 1:17 PM > To: Przemyslaw Wirkus ; gcc- > patc...@gcc.gnu.org > Cc: Richard Earnshaw ; Richard Sandiford > ; Marcus Shawcroft > > Subject: RE: [PATCH][GCC] aarch64: enable cortex-x2 CPU > > > > > -Original Message- > > From: Przemyslaw Wirkus > > Sent: Wednesday, September 22, 2021 9:38 AM > > To: gcc-patches@gcc.gnu.org > > Cc: Richard Earnshaw ; Richard Sandiford > > ; Marcus Shawcroft > > ; Kyrylo Tkachov > > > Subject: [PATCH][GCC] aarch64: enable cortex-x2 CPU > > > > Patch is adding 'cortex-x2' to -mcpu command line option. > > > > OK for master? > > > > gcc/ChangeLog: > > > > 2021-09-02 Przemyslaw Wirkus > > > > * config/aarch64/aarch64-cores.def (AARCH64_CORE): New > > Cortex-X2 core. > > * config/aarch64/aarch64-tune.md: Regenerate. > > * doc/invoke.texi: Update docs. > diff --git a/gcc/config/aarch64/aarch64-cores.def > b/gcc/config/aarch64/aarch64-cores.def > index > a8027e92fa8f7554e2b19d00f7c85c6ed48a92e5..34d9646ab6a32a19e7cd09d9 > 5594b59278d02920 100644 > --- a/gcc/config/aarch64/aarch64-cores.def > +++ b/gcc/config/aarch64/aarch64-cores.def > @@ -168,4 +168,6 @@ AARCH64_CORE("cortex-a510", cortexa510, > cortexa55, 9A, AARCH64_FL_FOR_ARCH9 | > > AARCH64_CORE("cortex-a710", cortexa710, cortexa55, 9A, > AARCH64_FL_FOR_ARCH9 | AARCH64_FL_SVE2_BITPERM | > AARCH64_FL_MEMTAG | AARCH64_FL_I8MM | AARCH64_FL_BF16, > neoversen2, 0x41, 0xd47, -1) > > +AARCH64_CORE("cortex-x2", cortexx2, cortexa55, 9A, > AARCH64_FL_FOR_ARCH9 | AARCH64_FL_SVE2_BITPERM | > AARCH64_FL_MEMTAG | AARCH64_FL_I8MM | AARCH64_FL_BF16, > neoversen2, 0x41, 0xd48, -1) > + > > Let's use cortexa57 for scheduling here for now. I should have said, ok with that change. Kyrill > Thanks, > Kyrill
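The review comments in these three threads refer to fields by position ("scheduling field (3rd)", "COSTS field (4th to last)"). The sketch below uses an X-macro to make that positional layout visible. It is a standalone illustration, not aarch64-cores.def: the parameter names are my reading of the field order, `DEMO_CORES`, `AS_ENTRY`, `core_entry`, and `find_core` are hypothetical, the FLAGS argument is replaced by a placeholder `0`, and the entries show the values as they would look with Kyrill's requested changes applied.

```cpp
#include <cassert>
#include <cstring>

struct core_entry
{
  const char *name;    // -mcpu= spelling.
  const char *sched;   // 3rd field: scheduling model.
  const char *arch;
  const char *costs;   // 4th-to-last field: tuning/cost tables.
  unsigned imp;        // Implementer ID.
  unsigned part;       // Part number.
  int variant;
};

// Positional order assumed here:
//   NAME, IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART, VARIANT
#define DEMO_CORES(X)                                                       \
  X ("cortex-a510", cortexa510, cortexa55, 9A, 0, cortexa53, 0x41, 0xd46, -1) \
  X ("cortex-a710", cortexa710, cortexa57, 9A, 0, neoversen2, 0x41, 0xd47, -1) \
  X ("cortex-x2", cortexx2, cortexa57, 9A, 0, neoversen2, 0x41, 0xd48, -1)

// Expands one list entry into a table row; IDENT and FLAGS are ignored.
#define AS_ENTRY(NAME, IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART, VARIANT) \
  { NAME, #SCHED, #ARCH, #COSTS, IMP, PART, VARIANT },

static const core_entry demo_cores[] = { DEMO_CORES (AS_ENTRY) };

// Looks a core up by its -mcpu= name; returns nullptr if unknown.
const core_entry *find_core (const char *name)
{
  assert (name != nullptr);
  for (const core_entry &core : demo_cores)
    if (std::strcmp (core.name, name) == 0)
      return &core;
  return nullptr;
}
```

Keeping one positional list and expanding it with different macros is what lets a single .def file feed several generated tables, which is also why a misplaced argument silently lands in the wrong field.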
[COMMITTED] Remove shadowed oracle field.
The m_oracle field in the path solver was shadowing the base class. This was causing subtle problems while calculating outgoing edges between blocks, because the query object being passed did not have an oracle set. This should further improve our solving ability. Tested on x86-64 Linux. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::compute_ranges): Use get_path_oracle. * gimple-range-path.h (class path_range_query): Remove shadowed m_oracle field. (path_range_query::get_path_oracle): New. --- gcc/gimple-range-path.cc | 2 +- gcc/gimple-range-path.h | 3 +-- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc index a29d5318ca9..422abfddb8f 100644 --- a/gcc/gimple-range-path.cc +++ b/gcc/gimple-range-path.cc @@ -480,7 +480,7 @@ path_range_query::compute_ranges (const vec &path, if (m_resolve) { add_copies_to_imports (); - m_oracle->reset_path (); + get_path_oracle ()->reset_path (); compute_relations (path); } diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h index cf49c6dc086..5f4e73a5949 100644 --- a/gcc/gimple-range-path.h +++ b/gcc/gimple-range-path.h @@ -38,7 +38,6 @@ public: bool range_of_expr (irange &r, tree name, gimple * = NULL) override; bool range_of_stmt (irange &r, gimple *, tree name = NULL) override; bool unreachable_path_p (); - path_oracle *oracle () { return m_oracle; } void dump (FILE *) override; void debug (); @@ -46,6 +45,7 @@ private: bool internal_range_of_expr (irange &r, tree name, gimple *); bool defined_outside_path (tree name); void range_on_path_entry (irange &r, tree name); + path_oracle *get_path_oracle () { return (path_oracle *)m_oracle; } // Cache manipulation. void set_cache (const irange &r, tree name); @@ -85,7 +85,6 @@ private: auto_bitmap m_imports; gimple_ranger &m_ranger; non_null_ref m_non_null; - path_oracle *m_oracle; // Current path position. unsigned m_pos; -- 2.31.1
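The bug class fixed here is easy to reproduce in isolation. In the sketch below every type is a simplified stand-in invented for illustration (`oracle`, `path_oracle_demo`, `query_base`, `shadowing_query`, `fixed_query`); only the shape of the fix, a casting accessor over the inherited field, mirrors the patch.

```cpp
#include <cassert>

struct oracle
{
  virtual ~oracle () = default;
};

struct path_oracle_demo : oracle
{
  bool reset_called = false;
  void reset_path () { reset_called = true; }
};

class query_base
{
public:
  // What generic code sees when it is handed the query object.
  oracle *visible_oracle () const { return m_oracle; }

protected:
  oracle *m_oracle = nullptr;
};

// Buggy variant: the derived field hides query_base::m_oracle, so the
// base-class pointer is never set.
class shadowing_query : public query_base
{
public:
  explicit shadowing_query (path_oracle_demo *o) : m_oracle (o) {}
  bool has_private_oracle () const { return m_oracle != nullptr; }

private:
  path_oracle_demo *m_oracle;   // Shadows the inherited field.
};

// Fixed variant: store into the inherited field and cast on access.
class fixed_query : public query_base
{
public:
  explicit fixed_query (path_oracle_demo *o) { m_oracle = o; }
  void compute () { get_path_oracle ()->reset_path (); }

private:
  path_oracle_demo *get_path_oracle ()
  {
    assert (m_oracle != nullptr);
    return static_cast<path_oracle_demo *> (m_oracle);
  }
};
```

With the shadowing variant, code that only knows about `query_base` sees a null oracle even though the derived object holds a valid one, which matches the symptom described above.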
[PATCH] Pass relations down to range_operator::op[12]_range.
It looks like we don't pass relations down to the op[12]_range operators. This is causing problems when implementing some relational magic for the shift operators. Andrew, this looks like an oversight. If so, how does this look? Tested on x86-64 Linux. gcc/ChangeLog: * gimple-range-gori.cc (gimple_range_calc_op1): Add relation argument. (gimple_range_calc_op2): Same. (gori_compute::compute_operand1_range): Pass relation to gimple_range_calc_op*. (gori_compute::compute_operand2_range): Same. --- gcc/gimple-range-gori.cc | 28 +--- 1 file changed, 21 insertions(+), 7 deletions(-) diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc index 4a1ade7f921..c7cfb71d849 100644 --- a/gcc/gimple-range-gori.cc +++ b/gcc/gimple-range-gori.cc @@ -59,7 +59,8 @@ gimple_range_calc_op1 (irange &r, const gimple *stmt, const irange &lhs_range) bool gimple_range_calc_op1 (irange &r, const gimple *stmt, - const irange &lhs_range, const irange &op2_range) + const irange &lhs_range, const irange &op2_range, + relation_kind rel) { // Unary operation are allowed to pass a range in for second operand // as there are often additional restrictions beyond the type which @@ -72,7 +73,7 @@ gimple_range_calc_op1 (irange &r, const gimple *stmt, return true; } return gimple_range_handler (stmt)->op1_range (r, type, lhs_range, -op2_range); +op2_range, rel); } // Calculate what we can determine of the range of this statement's @@ -82,7 +83,8 @@ gimple_range_calc_op1 (irange &r, const gimple *stmt, bool gimple_range_calc_op2 (irange &r, const gimple *stmt, - const irange &lhs_range, const irange &op1_range) + const irange &lhs_range, const irange &op1_range, + relation_kind rel) { tree type = TREE_TYPE (gimple_range_operand2 (stmt)); // An empty range is viral. 
@@ -92,7 +94,7 @@ gimple_range_calc_op2 (irange &r, const gimple *stmt, return true; } return gimple_range_handler (stmt)->op2_range (r, type, lhs_range, -op1_range); +op1_range, rel); } // Return TRUE if GS is a logical && or || expression. @@ -1000,6 +1002,12 @@ gori_compute::compute_operand1_range (irange &r, gimple *stmt, int_range_max op1_range, op2_range; tree op1 = gimple_range_operand1 (stmt); tree op2 = gimple_range_operand2 (stmt); + relation_kind rel; + + if (op1 && op2) +rel = src.query_relation (op1, op2); + else +rel = VREL_NONE; // Fetch the known range for op1 in this block. src.get_operand (op1_range, op1); @@ -1008,7 +1016,7 @@ gori_compute::compute_operand1_range (irange &r, gimple *stmt, if (op2) { src.get_operand (op2_range, op2); - if (!gimple_range_calc_op1 (r, stmt, lhs, op2_range)) + if (!gimple_range_calc_op1 (r, stmt, lhs, op2_range, rel)) return false; } else @@ -1016,7 +1024,7 @@ gori_compute::compute_operand1_range (irange &r, gimple *stmt, // We pass op1_range to the unary operation. Nomally it's a // hidden range_for_type parameter, but sometimes having the // actual range can result in better information. - if (!gimple_range_calc_op1 (r, stmt, lhs, op1_range)) + if (!gimple_range_calc_op1 (r, stmt, lhs, op1_range, rel)) return false; } @@ -1077,12 +1085,18 @@ gori_compute::compute_operand2_range (irange &r, gimple *stmt, int_range_max op1_range, op2_range; tree op1 = gimple_range_operand1 (stmt); tree op2 = gimple_range_operand2 (stmt); + relation_kind rel; + + if (op1 && op2) +rel = src.query_relation (op1, op2); + else +rel = VREL_NONE; src.get_operand (op1_range, op1); src.get_operand (op2_range, op2); // Intersect with range for op2 based on lhs and op1. - if (!gimple_range_calc_op2 (r, stmt, lhs, op1_range)) + if (!gimple_range_calc_op2 (r, stmt, lhs, op1_range, rel)) return false; unsigned idx; -- 2.31.1
[PATCH] Handle EQ_EXPR relation for operator_lshift.
Knowing that X << X is non-zero means X is also non-zero. This patch
teaches this to range-ops.

As usual, the bit twiddling experts could come up with all sorts of
fancy enhancements in this area, and we welcome all patches :).

I will push this pending tests.

gcc/ChangeLog:

	PR tree-optimization/102546
	* range-op.cc (operator_lshift::op1_range): Handle EQ_EXPR
	relation.
---
 gcc/range-op.cc                          | 19 ---
 gcc/testsuite/gcc.dg/tree-ssa/pr102546.c | 23 +++
 2 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102546.c

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 5e37133026d..53f3be4266e 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2075,9 +2075,14 @@ operator_lshift::op1_range (irange &r,
                             tree type,
                             const irange &lhs,
                             const irange &op2,
-                            relation_kind rel ATTRIBUTE_UNUSED) const
+                            relation_kind rel) const
 {
   tree shift_amount;
+  int_range<2> adjust (type);
+
+  if (rel == EQ_EXPR && !lhs.contains_p (build_zero_cst (type)))
+    adjust.set_nonzero (type);
+
   if (op2.singleton_p (&shift_amount))
     {
       wide_int shift = wi::to_wide (shift_amount);
@@ -2086,10 +2091,11 @@ operator_lshift::op1_range (irange &r,
       if (wi::ge_p (shift, wi::uhwi (TYPE_PRECISION (type),
                                      TYPE_PRECISION (op2.type ())),
                     UNSIGNED))
-        return false;
+        goto done;
       if (shift == 0)
         {
           r = lhs;
+          r.intersect (adjust);
           return true;
         }
 
@@ -2126,9 +2132,16 @@ operator_lshift::op1_range (irange &r,
 
       if (utype != type)
         range_cast (r, type);
+      r.intersect (adjust);
       return true;
     }
-  return false;
+
+ done:
+  if (adjust.varying_p ())
+    return false;
+
+  r = adjust;
+  return true;
 }
 
 bool
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102546.c b/gcc/testsuite/gcc.dg/tree-ssa/pr102546.c
new file mode 100644
index 000..4bd98747732
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102546.c
@@ -0,0 +1,23 @@
+// { dg-do compile }
+// { dg-options "-O3 -fdump-tree-optimized" }
+
+static int a;
+static char b, c, d;
+void bar(void);
+void foo(void);
+
+int main() {
+    int f = 0;
+    for (; f <= 5; f++) {
+        bar();
+        b = b && f;
+        d = f << f;
+        if (!(a >= d || f))
+            foo();
+        c = 1;
+        for (; c; c = 0)
+            ;
+    }
+}
+
+// { dg-final { scan-tree-dump-not "foo" "optimized" } }
-- 
2.31.1
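The fact this patch relies on can be checked by brute force for small values. The sketch below is illustrative only: the function names are hypothetical, and `op1_known_nonzero` models just the new EQ_EXPR rule in isolation, not the full `operator_lshift::op1_range` logic.

```cpp
#include <cassert>
#include <cstdint>

// Computes x << x in 64 bits so that nothing overflows for x < 32.
std::uint64_t shift_by_self (std::uint32_t x)
{
  assert (x < 32);
  return static_cast<std::uint64_t> (x) << x;
}

// Verifies, for every x in [0, 31], that (x << x) != 0 implies x != 0.
bool nonzero_result_implies_nonzero_operand ()
{
  for (std::uint32_t x = 0; x < 32; ++x)
    if (shift_by_self (x) != 0 && x == 0)
      return false;
  return true;
}

// The rule the patch adds, reduced to booleans: when op1 == op2 and the
// LHS range excludes zero, op1 can be narrowed to a non-zero range.
bool op1_known_nonzero (bool operands_equal, bool lhs_may_be_zero)
{
  return operands_equal && !lhs_may_be_zero;
}
```

In the pr102546.c test this is what lets the optimizer conclude that `a >= d || f` cannot be false in a way that reaches `foo`, so the call disappears from the optimized dump.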
Re: [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction
On Thu, Sep 16, 2021 at 8:13 PM Di Zhao OS wrote:
>
> Sorry about updating on this after so long. It took me much time to work out a
> new plan and pass the tests.
>
> The new idea is to use one variable to represent a set of equal variables at
> some basic-block. This variable is called an "equivalence head" or "equiv-head"
> in the code. (There's no longer an "equivalence map".)
>
> - Initially an SSA_NAME's "equivalence head" is its value number. Temporary
>   equivalence heads are recorded as unary NOP_EXPR results in the vn_nary_op_t
>   map. Besides, when inserting into the vn_nary_op_t map, put the new result at
>   the front of the vn_pval list, so that when searching for a variable's
>   equivalence head, the first result represents the largest equivalence set at
>   the current location.
> - In vn_ssa_aux_t, maintain a list of references to valid_info->nary entry.
>   For recorded equivalences, the reference is result->entry; for normal N-ary
>   operations, the reference is operand->entry.
> - When recording equivalences, if one side A is constant or has more refs, make
>   it the new equivalence head of the other side B. Traverse B's ref-list, if a
>   variable C's previous equiv-head is B, update to A. And re-insert B's n-ary
>   operations by replacing B with A.
> - When inserting and looking for the results of n-ary operations, insert and
>   lookup by the operands' equiv-heads.
>
> So except for the refs in vn_ssa_aux_t, this scheme uses the original
> infrastructure to its best. Quadratic search time is avoided at the cost of some
> re-insertions.
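The head-selection rule in the quoted list can be sketched with a small standalone structure. This is a simplified model under loud assumptions, not the patch's implementation: `equiv_heads` is a hypothetical class, variables are plain indices, equivalences are global rather than scoped to a basic block, constants are not modelled, and instead of walking B's ref-list the sketch scans every variable.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class equiv_heads
{
public:
  explicit equiv_heads (std::size_t n) : m_head (n), m_refs (n, 0)
  {
    for (std::size_t i = 0; i < n; ++i)
      m_head[i] = i;   // Initially every variable is its own head.
  }

  // Notes one more use of V (stand-in for an entry in its ref-list).
  void add_ref (std::size_t v)
  {
    assert (v < m_refs.size ());
    ++m_refs[v];
  }

  std::size_t head (std::size_t v) const
  {
    assert (v < m_head.size ());
    return m_head[v];
  }

  // Records A == B: the head with more refs survives, and every variable
  // whose head was the other one is re-pointed at the survivor.
  void record_equiv (std::size_t a, std::size_t b)
  {
    std::size_t keep = head (a);
    std::size_t drop = head (b);
    if (keep == drop)
      return;
    if (m_refs[keep] < m_refs[drop])
      std::swap (keep, drop);
    for (std::size_t &h : m_head)
      if (h == drop)
        h = keep;
    m_refs[keep] += m_refs[drop];
  }

private:
  std::vector<std::size_t> m_head;
  std::vector<unsigned> m_refs;
};
```

Lookups then go through `head (v)` before hashing an operation, which is the sketch's analogue of inserting and looking up n-ary operations by the operands' equiv-heads.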
> Test results on SPEC2017 intrate (counts and percentages):
>
>                 | more bb removed | more stmt removed | more nv_nary_ops
>                 | at fre1         | at fre1           | inserted
>                 | count |    %    | count |     %     | count |    %
> ----------------+-------+---------+-------+-----------+-------+--------
> 500.perlbench_r |    64 |   1.98% |   103 |     0.19% | 11260 | 12.16%
> 502.gcc_r       |   671 |   4.80% |   317 |     0.23% | 13964 |  6.09%
> 505.mcf_r       |     5 |  35.71% |     9 |     1.40% |    32 |  2.52%
> 520.omnetpp     |   132 |   5.45% |    39 |     0.11% |  1895 |  3.91%
> 523.xalancbmk_r |   238 |   3.26% |   313 |     0.36% |  1417 |  1.27%
> 525.x264_r      |     4 |   1.36% |    27 |     0.11% |  1752 |  6.78%
> 531.deepsjeng_r |     1 |   3.45% |     2 |     0.14% |   228 |  8.67%
> 541.leela_r     |     2 |   0.63% |     0 |     0.00% |    92 |  1.14%
> 548.exchange2_r |     0 |   0.00% |     3 |     0.04% |    49 |  1.03%
> 557.xz_r        |     0 |   0.00% |     3 |     0.07% |   272 |  7.55%
>
> There're more basic_blocks and statements removed compared with last
> implementation, the reasons are:
> 1) "CONST op CONST" simplification is included. It is missed in previous
>    patch.
> 2) By inserting RHS of statements on equiv-heads, more N-ary operations can be
>    simplified. One example is in 'ssa-fre-97.c' in the patch file.
>
> While jump-threading & vrp also utilize temporary equivalences (so some of the
> newly removed blocks and statements can also be covered by them), I think this
> patch is a supplement, in cases when jump threading cannot take place (the
> original example), or value number info needs to be involved (the
> 'ssa-fre-97.c' example).
>
> Fixed the former issue with non-iterate mode.
>
> About recording the temporary equivalences generated by PHIs (i.e. the
> 'record_equiv_from_previous_cond' stuff), I have to admit it looks strange and
> the code size is large, but I haven't found a better way yet. Consider a piece
> of CFG like the one below, if we want to record x==x2 on the true edge when
> processing bb1, the location (following current practice) will be bb2. But
> that is not useful at bb5 or bb6, because bb2 doesn't dominate them.
And I can't > find a place to record x==x1 when processing bb1. > If we can record things on edges rather than blocks, say x==x1 on 1->3 and > x==x2 on 1->2, then perhaps with an extra check for "a!=0", x2 can be a valid > equiv-head for x since bb5. But I think it lacks efficiency and is not > persuasive. It is more efficient to find a valid previous predicate when > processing bb4, because the vn_nary_op_t will be fetched anyway. > -- > | if (a != 0) | bb1 > -- > f | \ t > |--- > || bb2 | > |--- > | / > - > | x = PHI | bb3 > - >| > >| >-- >| if (a != 0) | bb4 >-- >|f \t > - --- > bb7 | where | | bb5 | ==> where "x==x2" is recorded now > | "x==x1" is| --- > | recorded |\ > | now |... > - | >
[PATCH][GCC][committed] aarch64: fix AARCH64_FL_V9 flag value
This patch fixes the AARCH64_FL_V9 flag value, which was wrongly set due to a
merge error: it used bit 41, the same bit as AARCH64_FL_LS64. The flag now
uses bit 43.

Committed as obvious.

gcc/ChangeLog:

	* config/aarch64/aarch64.h (AARCH64_FL_V9): Update value.

---

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 6908b8f4a16..2792bb29adb 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -230,8 +230,6 @@ extern unsigned aarch64_architecture_version;
 /* Pointer Authentication (PAUTH) extension. */
 #define AARCH64_FL_PAUTH (1ULL << 40)
-/* Armv9.0-A. */
-#define AARCH64_FL_V9 (1ULL << 41) /* Armv9.0-A Architecture. */
 /* 64-byte atomic load/store extensions. */
 #define AARCH64_FL_LS64 (1ULL << 41)
@@ -239,6 +237,9 @@ extern unsigned aarch64_architecture_version;
 /* Armv8.7-a architecture extensions. */
 #define AARCH64_FL_V8_7 (1ULL << 42)
+/* Armv9.0-A. */
+#define AARCH64_FL_V9 (1ULL << 43) /* Armv9.0-A Architecture. */
+
 /* Has FP and SIMD. */
 #define AARCH64_FL_FPSIMD (AARCH64_FL_FP | AARCH64_FL_SIMD)
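Why the old value was wrong can be shown with a few constants. The sketch below copies the bit positions from the diff but shortens the names (`FL_PAUTH`, `FL_LS64`, `FL_V8_7`, `FL_V9_BROKEN`, `FL_V9_FIXED`, and `has_flag` are hypothetical stand-ins); it is an illustration, not code from aarch64.h.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t FL_PAUTH = 1ULL << 40;
constexpr std::uint64_t FL_LS64 = 1ULL << 41;
constexpr std::uint64_t FL_V8_7 = 1ULL << 42;
constexpr std::uint64_t FL_V9_BROKEN = 1ULL << 41;   // Value before the fix.
constexpr std::uint64_t FL_V9_FIXED = 1ULL << 43;    // Value after the fix.

// Tests whether FLAG is present in the ISA bitmask.
constexpr bool has_flag (std::uint64_t isa, std::uint64_t flag)
{
  return (isa & flag) != 0;
}

// With the old value, enabling LS64 alone also "enabled" Armv9.0-A.
static_assert (has_flag (FL_LS64, FL_V9_BROKEN),
               "bit 41 was shared by two features");

// With the new value, the flag no longer aliases its neighbours.
static_assert (!has_flag (FL_PAUTH | FL_LS64 | FL_V8_7, FL_V9_FIXED),
               "bit 43 is distinct from bits 40-42");
```

A compile-time check of this kind, asserting that each flag is a distinct single bit, is one way such merge collisions could be caught early; whether GCC has such a check is not something this thread states.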
Re: [PATCH] c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]
On 9/30/21 14:24, nick huang wrote:
>> You may need to run contrib/gcc-git-customization.sh to get the git
>> gcc-verify command.
>
> I re-setup and can use git gcc-verify. Now I can see it rejects because I
> forgot to add a description of modified file. Now that it passes gcc-verify
> and I attach the changelog as attachment.
>
> Thank you again for your patient explanation and help!

You're welcome, thanks for your patience as well!

Unfortunately, git gcc-verify still fails with this version:

ERR: line should start with a tab: "PR c++/101783"
ERR: line should start with a tab: "* tree.c (cp_build_qualified_type_real): Excluding typedef from error"
ERR: line should start with a tab: "PR c++/101783"
ERR: line should start with a tab: "* g++.dg/parse/pr101783.C: New test."

It might work better to attach the output of git format-patch.

Also, your commit subject line is too long, at 83 characters: It must be under 75 characters, and preferably closer to 50. I might shorten it to

c++: cv-qualified ref introduced by typedef [PR101783]

> * tree.c (cp_build_qualified_type_real): Excluding typedef from error

A change description in the ChangeLog should use present tense ("Exclude"), have a period at the end, and line wrap at 75 characters like the rest of the commit message. So,

	* tree.c (cp_build_qualified_type_real): Exclude typedef from error.

> + ([dcl.type.decltype]),in which case the cv-qualifiers are ignored.
> + */

We usually don't put */ on its own line.

Jason
Re: [PATCH] c++: Implement C++20 -Wdeprecated-array-compare [PR97573]
On 9/30/21 17:56, Marek Polacek wrote:
> On Thu, Sep 30, 2021 at 03:34:24PM -0400, Jason Merrill wrote:
>> On 9/30/21 10:50, Marek Polacek wrote:
>>> This patch addresses one of my leftovers from GCC 11. C++20 introduced
>>> [depr.array.comp]: "Equality and relational comparisons between two
>>> operands of array type are deprecated." so this patch adds
>>> -Wdeprecated-array-compare (enabled by default in C++20).
>>
>> Why not enable it by default in all modes? It was always pretty dubious
>> code.
>
> Sure, it could be done, but it kind of complicates things: we'd probably
> need a different option and a different message because it seems incorrect
> to say "deprecated" in e.g. C++17 when this was only deprecated in C++20.

The warning could say "deprecated in C++20", which is always true?

> I'd rather not add another option; if it stays -Wdeprecated-array-compare
> but -Wno-deprecated doesn't turn it off that also seems weird.
>
> I could rename it to -Warray-compare, enable by -Wall, and only
> append "is deprecated" to the warning message in C++20. Does that seem
> better?

That sounds fine too.

>>> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>>>
>>> 	PR c++/97573
>>>
>>> gcc/c-family/ChangeLog:
>>>
>>> 	* c-opts.c (c_common_post_options): In C++20, turn on
>>> 	-Wdeprecated-array-compare.
>>> 	* c.opt (Wdeprecated-array-compare): New option.
>>>
>>> gcc/cp/ChangeLog:
>>>
>>> 	* typeck.c (do_warn_deprecated_array_compare): New.
>>> 	(cp_build_binary_op): Call it for equality and relational comparisons.
>>>
>>> gcc/ChangeLog:
>>>
>>> 	* doc/invoke.texi: Document -Wdeprecated-array-compare.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 	* g++.dg/tree-ssa/pr15791-1.C: Add dg-warning.
>>> 	* g++.dg/cpp2a/array-comp1.C: New test.
>>> 	* g++.dg/cpp2a/array-comp2.C: New test.
>>> 	* g++.dg/cpp2a/array-comp3.C: New test.
--- gcc/c-family/c-opts.c | 5 gcc/c-family/c.opt| 4 +++ gcc/cp/typeck.c | 28 +++ gcc/doc/invoke.texi | 19 - gcc/testsuite/g++.dg/cpp2a/array-comp1.C | 34 +++ gcc/testsuite/g++.dg/cpp2a/array-comp2.C | 31 + gcc/testsuite/g++.dg/cpp2a/array-comp3.C | 29 +++ gcc/testsuite/g++.dg/tree-ssa/pr15791-1.C | 2 +- 8 files changed, 150 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp1.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp2.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp3.C diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index 3eaab5e1530..00b52cc5e12 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -962,6 +962,11 @@ c_common_post_options (const char **pfilename) warn_deprecated_enum_float_conv, cxx_dialect >= cxx20 && warn_deprecated); + /* -Wdeprecated-array-compare is enabled by default in C++20. */ + SET_OPTION_IF_UNSET (&global_options, &global_options_set, + warn_deprecated_array_compare, + cxx_dialect >= cxx20 && warn_deprecated); + /* Declone C++ 'structors if -Os. */ if (flag_declone_ctor_dtor == -1) flag_declone_ctor_dtor = optimize_size; diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 9c151d19870..a4f0ea68594 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -540,6 +540,10 @@ Wdeprecated C C++ ObjC ObjC++ CPP(cpp_warn_deprecated) CppReason(CPP_W_DEPRECATED) ; Documented in common.opt +Wdeprecated-array-compare +C++ ObjC++ Var(warn_deprecated_array_compare) Warning +Warn about deprecated comparisons between two operands of array type. + Wdeprecated-copy C++ ObjC++ Var(warn_deprecated_copy) Warning LangEnabledBy(C++ ObjC++, Wextra) Mark implicitly-declared copy operations as deprecated if the class has a diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c index a2398dbe660..1e3a41104d6 100644 --- a/gcc/cp/typeck.c +++ b/gcc/cp/typeck.c @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. 
If not see #include "attribs.h" #include "asan.h" #include "gimplify.h" +#include "tree-pretty-print.h" static tree cp_build_addr_expr_strict (tree, tsubst_flags_t); static tree cp_build_function_call (tree, tree, tsubst_flags_t); @@ -4725,6 +4726,21 @@ do_warn_enum_conversions (location_t loc, enum tree_code code, tree type0, } } +/* Warn about C++20 [depr.array.comp] array comparisons: "Equality + and relational comparisons between two operands of array type are + deprecated." */ + +static inline void +do_warn_deprecated_array_compare (location_t location, tree_code code, + tree op0, tree op1) +{ + if (warning_at (location, OPT_Wdeprecated_array_compare, + "comparison between two arrays is deprecated")) +inform (location, "use unary %<+%> which decays operands to pointers " + "or %<&%D[0] %s &
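
For readers wondering why this comparison is called dubious: `a == b` on two arrays compares the addresses of their first elements, never their contents. The sketch below spells out the two intents explicitly. The helper names (same_array, same_contents, array_compare_demo) are illustrative and do not come from the patch. The unary `+` form is the one the proposed diagnostic note suggests.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// True only if both arguments are the very same array object. The unary '+'
// makes the array-to-pointer decay explicit, so no array-vs-array
// comparison is written.
template <typename T, std::size_t N>
bool same_array(const T (&a)[N], const T (&b)[N])
{
  return +a == +b;
}

// True if the two arrays hold equal elements, which is usually what
// code that writes `a == b` actually wanted.
template <typename T, std::size_t N>
bool same_contents(const T (&a)[N], const T (&b)[N])
{
  return std::equal(std::begin(a), std::end(a), std::begin(b));
}

inline void array_compare_demo()
{
  int a[3] = {1, 2, 3};
  int b[3] = {1, 2, 3};
  assert(same_contents(a, b));  // equal element by element
  assert(!same_array(a, b));    // but two distinct objects
  assert(same_array(a, a));
}
```

Writing `+a == +b` or `&a[0] == &b[0]` keeps the address comparison but makes it visibly intentional, while std::equal covers the element-wise case.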
Re: [PATCH] Pass relations down to range_operator::op[12]_range.
On 10/1/21 8:43 AM, Aldy Hernandez wrote:
> It looks like we don't pass relations down to the op[12]_range operators.
> This is causing problems when implementing some relational magic for the
> shift operators.
>
> Andrew, this looks like an oversight. If so, how does this look?

Hrm. It's at least partial. It will utilize relations as they exist at the query point, but they would not include any relations introduced by the unwind sequence.

At the if, there is no relation c_1 < a_2. It's true we are checking the true edge, and in theory the relation should be registered on that true edge. Perhaps I need to register the relation on the edge from the stmt before resolving the stmt, rather than after as we currently do. Let me think about that for a bit.

Andrew
[PATCH] c++: unifying equal NONTYPE_ARGUMENT_PACKs [PR102547]
Here during partial ordering of the two partial specializations we end up in unify with parm=arg=NONTYPE_ARGUMENT_PACK, and crash shortly thereafter because uses_template_parms calls potential_constant_expression which doesn't handle NONTYPE_ARGUMENT_PACK. This patch fixes this by checking dependent_template_arg_p instead of uses_template_parms when parm==arg, which does handle NONTYPE_ARGUMENT_PACK. We could also perhaps fix uses_template_parms / inst_dep_expr_p to better handle NONTYPE_ARGUMENT_PACK, but interestingly none of our existing tests exercise calling those functions on NONTYPE_ARGUMENT_PACK, so such a fix would be seemingly moot. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk/11? PR c++/102547 gcc/cp/ChangeLog: * pt.c (unify): Check dependent_template_arg_p instead of uses_template_parms when parm==arg. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/variadic-partial2.C: New test. --- gcc/cp/pt.c | 2 +- .../g++.dg/cpp0x/variadic-partial2.C | 16 ++ .../g++.dg/cpp0x/variadic-partial2a.C | 22 +++ 3 files changed, 39 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 1dcdffe322a..59c00c77a30 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -23587,7 +23587,7 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict, even if ARG == PARM, since we won't record unifications for the template parameters. We might need them if we're trying to figure out which of two things is more specialized. 
*/ - if (arg == parm && !uses_template_parms (parm)) + if (arg == parm && !dependent_template_arg_p (parm)) return unify_success (explain_p); /* Handle init lists early, so the rest of the function can assume diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C new file mode 100644 index 000..df61f26a3c1 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C @@ -0,0 +1,16 @@ +// PR c++/102547 +// { dg-do compile { target c++11 } } + +template +struct vals { }; + +template +struct vals_client { }; + +template +struct vals_client, T> { }; + +template +struct vals_client, void> { }; + +template struct vals_client, void>; //- "sorry, unimplemented..., ICE" diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C new file mode 100644 index 000..cc0ea488ad3 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C @@ -0,0 +1,22 @@ +// PR c++/102547 +// { dg-do compile { target c++11 } } +// A version of variadic-partial2.C where the partial ordering is performed +// on function templates instead of class templates. + +template +struct vals { }; + +template +void f(V, T) { }; + +template +void f(vals, T) { }; + +template +void f(vals, char) { }; + +template void f(vals<1, 2>, char); //- "sorry, unimplemented..., ICE" + +int main() { + f(vals<1, 3>{}, 'a'); //- "sorry, unimplemented..., ICE" +} -- 2.33.0.610.gcefe983a32
[PATCH][pushed] options: fix concat of options.
This fixes a fairly obvious error I made when concatenating merged_decoded_options.

make check -k RUNTESTFLAGS="i386.exp" works fine now.

Martin

	PR target/102552

gcc/c-family/ChangeLog:

	* c-common.c (parse_optimize_options): decoded_options[0] is
	used for program name, so merged_decoded_options should also
	respect that.
---
 gcc/c-family/c-common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 7b99a5546ea..5845c675e85 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5912,9 +5912,10 @@ parse_optimize_options (tree args, bool attr_p)
   cl_decoded_option *merged_decoded_options
     = XNEWVEC (cl_decoded_option, merged_decoded_options_count);
 
+  /* Note the first decoded_options is used for the program name. */
   for (unsigned i = 0; i < save_opt_count; ++i)
-    merged_decoded_options[i] = save_opt_decoded_options[i];
-  for (unsigned i = 0; i < decoded_options_count; ++i)
+    merged_decoded_options[i + 1] = save_opt_decoded_options[i];
+  for (unsigned i = 1; i < decoded_options_count; ++i)
     merged_decoded_options[save_opt_count + i] = decoded_options[i];
 
   /* And apply them. */
-- 
2.33.0
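
The indexing in the fix is easy to misread, so here is a minimal sketch of the same layout using std::vector<std::string> in place of cl_decoded_option arrays. The function name merge_options is hypothetical. The sketch also assumes the merged array has save_opt_count + decoded_options_count elements, which the hunk above does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// `saved` holds only real options; `decoded` reserves slot 0 for the program
// name. The result reserves slot 0 as well, followed by the saved options
// and then decoded[1..].
std::vector<std::string>
merge_options(const std::vector<std::string>& saved,
              const std::vector<std::string>& decoded)
{
  assert(!decoded.empty());  // slot 0 (program name) must exist

  // Assumed size: saved.size() + decoded.size(), one slot of which is index 0.
  std::vector<std::string> merged(saved.size() + decoded.size());

  // Slot 0 is left as an empty placeholder in this sketch.
  for (std::size_t i = 0; i < saved.size(); ++i)
    merged[i + 1] = saved[i];
  for (std::size_t i = 1; i < decoded.size(); ++i)
    merged[saved.size() + i] = decoded[i];
  return merged;
}

inline void merge_options_demo()
{
  const std::vector<std::string> saved = {"-O2", "-fno-inline"};
  const std::vector<std::string> decoded = {"<program>", "-O3"};
  const std::vector<std::string> merged = merge_options(saved, decoded);
  assert(merged.size() == 4);
  assert(merged[0].empty());
  assert(merged[1] == "-O2");
  assert(merged[2] == "-fno-inline");
  assert(merged[3] == "-O3");
}
```

With the pre-fix loops (both starting at 0 and the first writing to merged[i]), the first saved option would land in slot 0 and the program-name entry of decoded would be copied into the middle of the option list.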
Re: [PATCH] c++: unifying equal NONTYPE_ARGUMENT_PACKs [PR102547]
On 10/1/21 09:46, Patrick Palka wrote: Here during partial ordering of the two partial specializations we end up in unify with parm=arg=NONTYPE_ARGUMENT_PACK, and crash shortly thereafter because uses_template_parms calls potential_constant_expression which doesn't handle NONTYPE_ARGUMENT_PACK. This patch fixes this by checking dependent_template_arg_p instead of uses_template_parms when parm==arg, which does handle NONTYPE_ARGUMENT_PACK. We could also perhaps fix uses_template_parms / inst_dep_expr_p to better handle NONTYPE_ARGUMENT_PACK, Please. but interestingly none of our existing tests exercise calling those functions on NONTYPE_ARGUMENT_PACK, so such a fix would be seemingly moot. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk/11? PR c++/102547 gcc/cp/ChangeLog: * pt.c (unify): Check dependent_template_arg_p instead of uses_template_parms when parm==arg. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/variadic-partial2.C: New test. --- gcc/cp/pt.c | 2 +- .../g++.dg/cpp0x/variadic-partial2.C | 16 ++ .../g++.dg/cpp0x/variadic-partial2a.C | 22 +++ 3 files changed, 39 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 1dcdffe322a..59c00c77a30 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -23587,7 +23587,7 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict, even if ARG == PARM, since we won't record unifications for the template parameters. We might need them if we're trying to figure out which of two things is more specialized. 
*/ - if (arg == parm && !uses_template_parms (parm)) + if (arg == parm && !dependent_template_arg_p (parm)) return unify_success (explain_p); /* Handle init lists early, so the rest of the function can assume diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C new file mode 100644 index 000..df61f26a3c1 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C @@ -0,0 +1,16 @@ +// PR c++/102547 +// { dg-do compile { target c++11 } } + +template +struct vals { }; + +template +struct vals_client { }; + +template +struct vals_client, T> { }; + +template +struct vals_client, void> { }; + +template struct vals_client, void>; //- "sorry, unimplemented..., ICE" diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C new file mode 100644 index 000..cc0ea488ad3 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C @@ -0,0 +1,22 @@ +// PR c++/102547 +// { dg-do compile { target c++11 } } +// A version of variadic-partial2.C where the partial ordering is performed +// on function templates instead of class templates. + +template +struct vals { }; + +template +void f(V, T) { }; + +template +void f(vals, T) { }; + +template +void f(vals, char) { }; + +template void f(vals<1, 2>, char); //- "sorry, unimplemented..., ICE" + +int main() { + f(vals<1, 3>{}, 'a'); //- "sorry, unimplemented..., ICE" +}
[committed] libstdc++: Replace try-catch in std::list::merge to avoid O(N) size
The current std::list::merge code calls size() before starting to merge any elements, so that the _M_size members can be updated after the merge finishes. The work is done in a try-block so that the sizes can still be updated in an exception handler if any element comparison throws. The _M_size members only exist for the cxx11 ABI, so the initial call to size() and the try-catch are only needed for that ABI. For the old ABI the size() call performs an O(N) list traversal to get a value that isn't even used, and catching exceptions just to rethrow them isn't needed either. This refactors the merge functions to remove the try-catch block and use an RAII type instead. For the cxx11 ABI that type's destructor updates the list sizes, and for the old ABI it's a no-op. libstdc++-v3/ChangeLog: * include/bits/list.tcc (list::merge): Remove call to size() and try-catch block. Use _Finalize_merge instead. * include/bits/stl_list.h (list::_Finalize_merge): New scope guard type to update _M_size members after a merge. Tested x86_64-linux. Committed to trunk. commit b8d42cfa84fb31e592411e6cea41bdde980c51d7 Author: Jonathan Wakely Date: Wed Sep 29 20:46:55 2021 libstdc++: Replace try-catch in std::list::merge to avoid O(N) size The current std::list::merge code calls size() before starting to merge any elements, so that the _M_size members can be updated after the merge finishes. The work is done in a try-block so that the sizes can still be updated in an exception handler if any element comparison throws. The _M_size members only exist for the cxx11 ABI, so the initial call to size() and the try-catch are only needed for that ABI. For the old ABI the size() call performs an O(N) list traversal to get a value that isn't even used, and catching exceptions just to rethrow them isn't needed either. This refactors the merge functions to remove the try-catch block and use an RAII type instead. 
For the cxx11 ABI that type's destructor updates the list sizes, and for the old ABI it's a no-op. libstdc++-v3/ChangeLog: * include/bits/list.tcc (list::merge): Remove call to size() and try-catch block. Use _Finalize_merge instead. * include/bits/stl_list.h (list::_Finalize_merge): New scope guard type to update _M_size members after a merge. diff --git a/libstdc++-v3/include/bits/list.tcc b/libstdc++-v3/include/bits/list.tcc index 0ce4c47a90e..62b0ba1063a 100644 --- a/libstdc++-v3/include/bits/list.tcc +++ b/libstdc++-v3/include/bits/list.tcc @@ -416,29 +416,22 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER iterator __last1 = end(); iterator __first2 = __x.begin(); iterator __last2 = __x.end(); - const size_t __orig_size = __x.size(); - __try { - while (__first1 != __last1 && __first2 != __last2) - if (*__first2 < *__first1) - { - iterator __next = __first2; - _M_transfer(__first1, __first2, ++__next); - __first2 = __next; - } - else - ++__first1; - if (__first2 != __last2) - _M_transfer(__last1, __first2, __last2); - this->_M_inc_size(__x._M_get_size()); - __x._M_set_size(0); - } - __catch(...) 
+ const _Finalize_merge __fin(*this, __x, __first2); + + while (__first1 != __last1 && __first2 != __last2) + if (*__first2 < *__first1) + { + iterator __next = __first2; + _M_transfer(__first1, __first2, ++__next); + __first2 = __next; + } + else + ++__first1; + if (__first2 != __last2) { - const size_t __dist = std::distance(__first2, __last2); - this->_M_inc_size(__orig_size - __dist); - __x._M_set_size(__dist); - __throw_exception_again; + _M_transfer(__last1, __first2, __last2); + __first2 = __last2; } } } @@ -463,30 +456,22 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER iterator __last1 = end(); iterator __first2 = __x.begin(); iterator __last2 = __x.end(); - const size_t __orig_size = __x.size(); - __try - { - while (__first1 != __last1 && __first2 != __last2) - if (__comp(*__first2, *__first1)) - { - iterator __next = __first2; - _M_transfer(__first1, __first2, ++__next); - __first2 = __next; - } - else - ++__first1; - if (__first2 != __last2) - _M_transfer(__last1, __first2, __last2); - this->_M_inc_size(__x._M_get_size()); - __x._M_set_size(0); -
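
The idea of replacing try/catch with a destructor can be shown outside libstdc++. The sketch below is a toy model, not the actual _Finalize_merge: CountedList, FinalizeMerge and merge_into are hypothetical names, and the guard reads the remaining source length from std::list::size() purely to keep the example short. It shows that the cached sizes end up correct whether the merge finishes normally or a comparison throws part-way through.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <stdexcept>

// Toy container with a hand-maintained size, standing in for _M_size.
struct CountedList
{
  std::list<int> items;
  std::size_t cached_size = 0;
};

// Scope guard: on destruction, moves the count of transferred elements from
// the source's cached size to the destination's. Runs on normal exit and
// during stack unwinding alike.
class FinalizeMerge
{
public:
  FinalizeMerge(CountedList& dest, CountedList& src)
  : dest_(dest), src_(src), src_size_before_(src.cached_size) { }

  FinalizeMerge(const FinalizeMerge&) = delete;
  FinalizeMerge& operator=(const FinalizeMerge&) = delete;

  ~FinalizeMerge()
  {
    const std::size_t remaining = src_.items.size();
    dest_.cached_size += src_size_before_ - remaining;
    src_.cached_size = remaining;
  }

private:
  CountedList& dest_;
  CountedList& src_;
  std::size_t src_size_before_;
};

// Merges sorted `src` into sorted `dest`; `comp` may throw.
template <typename Compare>
void merge_into(CountedList& dest, CountedList& src, Compare comp)
{
  if (&dest == &src)
    return;
  const FinalizeMerge fin(dest, src);
  auto first1 = dest.items.begin();
  while (first1 != dest.items.end() && !src.items.empty())
    {
      if (comp(src.items.front(), *first1))
        dest.items.splice(first1, src.items, src.items.begin());
      else
        ++first1;
    }
  if (!src.items.empty())
    dest.items.splice(dest.items.end(), src.items);
}

inline void merge_demo()
{
  CountedList dest; dest.items = {1, 4, 6};    dest.cached_size = 3;
  CountedList src;  src.items  = {2, 3, 5, 7}; src.cached_size  = 4;
  bool threw = false;
  try
    {
      merge_into(dest, src, [](int a, int b) -> bool {
        if (a == 5) throw std::runtime_error("comparison failed");
        return a < b;
      });
    }
  catch (const std::runtime_error&)
    { threw = true; }
  assert(threw);
  assert(dest.cached_size == 5 && dest.items.size() == 5);
  assert(src.cached_size == 2 && src.items.size() == 2);
}
```

The merge loop itself contains no exception handling at all, which is the point of the refactoring described in the commit message. A configuration with no cached size to maintain can give the guard an empty destructor.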
[committed] libstdc++: Fix _ForwardIteratorConcept for __gnu_debug::vector
The recent changes to the _GLIBCXX_CONCEPT_CHECKS checks for forward iterators don't work for vector iterators in debug mode, because the _Safe_iterator specializations don't match the special cases I added for _Bit_iterator and _Bit_const_iterator. This refactors the _ForwardIteratorReferenceConcept class template to identify vector iterators using a new trait, which also works for debug iterators. libstdc++-v3/ChangeLog: * include/bits/boost_concept_check.h (_Is_vector_bool_iterator): New trait to identify vector iterators, including debug ones. (_ForwardIteratorReferenceConcept): Add default template argument using _Is_vector_bool_iterator and use it in partial specialization for the vector cases. (_Mutable_ForwardIteratorReferenceConcept): Likewise. * testsuite/24_iterators/operations/prev_neg.cc: Adjust dg-error line number. Tested x86_64-linux. Committed to trunk. commit c67339d12653c33f85f8141789d7a7cf38831cbd Author: Jonathan Wakely Date: Thu Sep 30 11:25:15 2021 libstdc++: Fix _ForwardIteratorConcept for __gnu_debug::vector The recent changes to the _GLIBCXX_CONCEPT_CHECKS checks for forward iterators don't work for vector iterators in debug mode, because the _Safe_iterator specializations don't match the special cases I added for _Bit_iterator and _Bit_const_iterator. This refactors the _ForwardIteratorReferenceConcept class template to identify vector iterators using a new trait, which also works for debug iterators. libstdc++-v3/ChangeLog: * include/bits/boost_concept_check.h (_Is_vector_bool_iterator): New trait to identify vector iterators, including debug ones. (_ForwardIteratorReferenceConcept): Add default template argument using _Is_vector_bool_iterator and use it in partial specialization for the vector cases. (_Mutable_ForwardIteratorReferenceConcept): Likewise. * testsuite/24_iterators/operations/prev_neg.cc: Adjust dg-error line number. 
diff --git a/libstdc++-v3/include/bits/boost_concept_check.h b/libstdc++-v3/include/bits/boost_concept_check.h index 71c99c13e93..81352518c50 100644 --- a/libstdc++-v3/include/bits/boost_concept_check.h +++ b/libstdc++-v3/include/bits/boost_concept_check.h @@ -47,11 +47,19 @@ namespace std _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_VERSION +_GLIBCXX_BEGIN_NAMESPACE_CONTAINER struct _Bit_iterator; struct _Bit_const_iterator; +_GLIBCXX_END_NAMESPACE_CONTAINER _GLIBCXX_END_NAMESPACE_VERSION } +namespace __gnu_debug +{ + template +class _Safe_iterator; +} + namespace __gnu_cxx _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_VERSION @@ -478,10 +486,32 @@ struct _Aux_require_same<_Tp,_Tp> { typedef _Tp _Type; }; _ValueT __val() const; }; -#pragma GCC diagnostic push -#pragma GCC diagnostic ignored "-Wunused-variable" + template + struct _Is_vector_bool_iterator + { static const bool __value = false; }; - template +#ifdef _GLIBCXX_DEBUG + namespace __cont = ::std::_GLIBCXX_STD_C; +#else + namespace __cont = ::std; +#endif + + // Trait to identify vector::iterator + template <> + struct _Is_vector_bool_iterator<__cont::_Bit_iterator> + { static const bool __value = true; }; + + // And for vector::const_iterator. + template <> + struct _Is_vector_bool_iterator<__cont::_Bit_const_iterator> + { static const bool __value = true; }; + + // And for __gnu_debug::vector iterators too. + template + struct _Is_vector_bool_iterator<__gnu_debug::_Safe_iterator<_It, _Seq, _Tag> > + : _Is_vector_bool_iterator<_It> { }; + + template ::__value> struct _ForwardIteratorReferenceConcept { void __constraints() { @@ -493,7 +523,7 @@ struct _Aux_require_same<_Tp,_Tp> { typedef _Tp _Type; }; } }; - template + template ::__value> struct _Mutable_ForwardIteratorReferenceConcept { void __constraints() { @@ -503,26 +533,22 @@ struct _Aux_require_same<_Tp,_Tp> { typedef _Tp _Type; }; } }; - // vector::iterator is not a real forward reference, but pretend it is. 
- template <> - struct _ForwardIteratorReferenceConcept + // vector iterators are not real forward iterators, but we ignore that. + template + struct _ForwardIteratorReferenceConcept<_Tp, true> { void __constraints() { } }; - // vector::iterator is not a real forward reference, but pretend it is. - template <> - struct _Mutable_ForwardIteratorReferenceConcept + // vector iterators are not real forward iterators, but we ignore that. + template + struct _Mutable_ForwardIteratorReferenceConcept<_Tp, true> { void __constraints() { } }; - // And vector::const iterator too. - template <> - struct _ForwardIteratorReferenceConcept - { -void __constraints() { } - }; +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-W
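
The pattern used here (a trait that recognises a small set of types, looks through a wrapper template, and then selects a partial specialization via a defaulted bool parameter) can be sketched in isolation. All names below are illustrative stand-ins, not the libstdc++ ones, and the sketch uses C++11 facilities where the real header has to stay C++98-compatible.

```cpp
#include <type_traits>

struct BitIterator { };       // stand-in for _Bit_iterator
struct BitConstIterator { };  // stand-in for _Bit_const_iterator

// Stand-in for a debug-mode wrapper around an underlying iterator.
template <typename It, typename Seq>
class SafeIterator { };

// Primary trait: not a vector<bool> iterator.
template <typename T>
struct is_vector_bool_iterator : std::false_type { };

template <>
struct is_vector_bool_iterator<BitIterator> : std::true_type { };

template <>
struct is_vector_bool_iterator<BitConstIterator> : std::true_type { };

// Debug wrappers defer to the iterator they wrap.
template <typename It, typename Seq>
struct is_vector_bool_iterator<SafeIterator<It, Seq>>
: is_vector_bool_iterator<It> { };

// Concept-check class: the defaulted bool parameter picks the specialization.
template <typename T, bool = is_vector_bool_iterator<T>::value>
struct ForwardReferenceCheck
{ static constexpr bool checked = true; };   // real constraints would go here

// vector<bool>-style iterators: skip the reference requirement.
template <typename T>
struct ForwardReferenceCheck<T, true>
{ static constexpr bool checked = false; };
```

Because the wrapper specialization inherits from the trait of the wrapped type, one partial specialization of the check covers plain and debug iterators alike, instead of needing an explicit specialization per iterator type.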
[committed] libstdc++: Add noexcept to istream_iterator and ostream_iterator
libstdc++-v3/ChangeLog: * include/bits/stream_iterator.h (istream_iterator): Add noexcept to constructors and non-throwing member functions and friend functions. (ostream_iterator): Likewise. Tested x86_64-linux. Committed to trunk. commit 901fa4cc27ce693b361220818732556bfa586eea Author: Jonathan Wakely Date: Thu Sep 30 14:39:36 2021 libstdc++: Add noexcept to istream_iterator and ostream_iterator libstdc++-v3/ChangeLog: * include/bits/stream_iterator.h (istream_iterator): Add noexcept to constructors and non-throwing member functions and friend functions. (ostream_iterator): Likewise. diff --git a/libstdc++-v3/include/bits/stream_iterator.h b/libstdc++-v3/include/bits/stream_iterator.h index d74c158f342..5a132319111 100644 --- a/libstdc++-v3/include/bits/stream_iterator.h +++ b/libstdc++-v3/include/bits/stream_iterator.h @@ -65,6 +65,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION public: /// Construct end of input stream iterator. _GLIBCXX_CONSTEXPR istream_iterator() + _GLIBCXX_NOEXCEPT_IF(is_nothrow_default_constructible<_Tp>::value) : _M_stream(0), _M_value(), _M_ok(false) {} /// Construct start of input stream iterator. 
@@ -73,6 +74,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { _M_read(); } istream_iterator(const istream_iterator& __obj) + _GLIBCXX_NOEXCEPT_IF(is_nothrow_copy_constructible<_Tp>::value) : _M_stream(__obj._M_stream), _M_value(__obj._M_value), _M_ok(__obj._M_ok) { } @@ -91,7 +93,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_NODISCARD const _Tp& - operator*() const + operator*() const _GLIBCXX_NOEXCEPT { __glibcxx_requires_cond(_M_ok, _M_message(__gnu_debug::__msg_deref_istream) @@ -101,7 +103,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_NODISCARD const _Tp* - operator->() const { return std::__addressof((operator*())); } + operator->() const _GLIBCXX_NOEXCEPT + { return std::__addressof((operator*())); } istream_iterator& operator++() @@ -126,7 +129,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION private: bool - _M_equal(const istream_iterator& __x) const + _M_equal(const istream_iterator& __x) const _GLIBCXX_NOEXCEPT { // Ideally this would just return _M_stream == __x._M_stream, // but code compiled with old versions never sets _M_stream to null. @@ -148,6 +151,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_NODISCARD friend bool operator==(const istream_iterator& __x, const istream_iterator& __y) + _GLIBCXX_NOEXCEPT { return __x._M_equal(__y); } #if __cpp_impl_three_way_comparison < 201907L @@ -156,13 +160,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_NODISCARD friend bool operator!=(const istream_iterator& __x, const istream_iterator& __y) + _GLIBCXX_NOEXCEPT { return !__x._M_equal(__y); } #endif #if __cplusplus > 201703L && __cpp_lib_concepts [[nodiscard]] friend bool - operator==(const istream_iterator& __i, default_sentinel_t) + operator==(const istream_iterator& __i, default_sentinel_t) noexcept { return !__i._M_stream; } #endif }; @@ -200,7 +205,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION public: /// Construct from an ostream. 
- ostream_iterator(ostream_type& __s) + ostream_iterator(ostream_type& __s) _GLIBCXX_NOEXCEPT : _M_stream(std::__addressof(__s)), _M_string(0) {} /** @@ -213,11 +218,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __s Underlying ostream to write to. * @param __c CharT delimiter string to insert. */ - ostream_iterator(ostream_type& __s, const _CharT* __c) + ostream_iterator(ostream_type& __s, const _CharT* __c) _GLIBCXX_NOEXCEPT : _M_stream(std::__addressof(__s)), _M_string(__c) { } /// Copy constructor. - ostream_iterator(const ostream_iterator& __obj) + ostream_iterator(const ostream_iterator& __obj) _GLIBCXX_NOEXCEPT : _M_stream(__obj._M_stream), _M_string(__obj._M_string) { } #if __cplusplus >= 201103L @@ -240,15 +245,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_NODISCARD ostream_iterator& - operator*() + operator*() _GLIBCXX_NOEXCEPT { return *this; } ostream_iterator& - operator++() + operator++() _GLIBCXX_NOEXCEPT { return *this; } ostream_iterator& - operator++(int) + operator++(int) _GLIBCXX_NOEXCEPT { return *this; } };
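
The conditional form used for the istream_iterator constructors follows a common pattern: a wrapper's constructor is noexcept exactly when the corresponding operation on the stored value type is. A minimal sketch under that assumption, with hypothetical names (ValueHolder, ThrowingDefault) and plain `noexcept(...)` in place of the _GLIBCXX_NOEXCEPT_IF macro:

```cpp
#include <string>
#include <type_traits>

// Holds a T by value, like istream_iterator holds its last-read _M_value.
template <typename T>
class ValueHolder
{
public:
  // Non-throwing only if value-initializing T cannot throw.
  ValueHolder() noexcept(std::is_nothrow_default_constructible<T>::value)
  : value_() { }

  // Non-throwing only if copying T cannot throw.
  ValueHolder(const ValueHolder& other)
    noexcept(std::is_nothrow_copy_constructible<T>::value)
  : value_(other.value_) { }

  // Returning a reference to the stored value never throws.
  const T& get() const noexcept { return value_; }

private:
  T value_;
};

// A type whose default constructor is allowed to throw.
struct ThrowingDefault
{
  ThrowingDefault() noexcept(false) { }
};
```

For example, ValueHolder<int> is nothrow default and copy constructible, while ValueHolder<std::string> is not nothrow copy constructible because copying a std::string may allocate.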
[committed] libstdc++: Add missing header to test
We need to include (or one of the containers) to get a definition for std::begin. libstdc++-v3/ChangeLog: * testsuite/25_algorithms/is_permutation/2.cc: Include . Tested x86_64-linux. Committed to trunk. commit 94311bf34704ebecf745043fe2df03df201052fe Author: Jonathan Wakely Date: Fri Oct 1 12:55:53 2021 libstdc++: Add missing header to test We need to include (or one of the containers) to get a definition for std::begin. libstdc++-v3/ChangeLog: * testsuite/25_algorithms/is_permutation/2.cc: Include . diff --git a/libstdc++-v3/testsuite/25_algorithms/is_permutation/2.cc b/libstdc++-v3/testsuite/25_algorithms/is_permutation/2.cc index 8d15c22f593..252226ca08f 100644 --- a/libstdc++-v3/testsuite/25_algorithms/is_permutation/2.cc +++ b/libstdc++-v3/testsuite/25_algorithms/is_permutation/2.cc @@ -20,6 +20,7 @@ // 25.2.12 [alg.is_permutation] Is permutation #include +#include #include #include
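
The header name has been lost from the message text above. The standard header that declares std::begin and std::end is <iterator> (the container headers declare them too), so the sketch below assumes that is the one meant. The function name is illustrative and is not taken from the test file.

```cpp
#include <algorithm>  // std::is_permutation
#include <iterator>   // std::begin / std::end, including the array overloads

// <algorithm> alone is not guaranteed to declare std::begin, so code that
// calls it on a built-in array should include <iterator> (or a container
// header) itself rather than rely on a transitive include.
inline bool arrays_are_permutations()
{
  const int a[] = {1, 2, 3, 4};
  const int b[] = {4, 3, 2, 1};
  return std::is_permutation(std::begin(a), std::end(a), std::begin(b));
}
```

Relying on another header to pull in the declaration tends to work until the library's internal includes are tidied up, which is presumably what exposed the missing include in this test.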
[committed] libstdc++: Define basic_regex::multiline for non-strict modes
The regex_constants::multiline constant is defined for non-strict C++11 and C++14 modes, on the basis that the feature is a DR (even though it was really a new feature addition to C++17 and probably shouldn't have gone through the issues list). This makes the basic_regex::multiline constant defined consistently with the regex_constants::multiline one. For strict C++11 and C++14 mode we don't define them, because multiline is not a reserved name in those standards. libstdc++-v3/ChangeLog: * include/bits/regex.h (basic_regex::multiline): Define for non-strict C++11 and C++14 modes. * include/bits/regex_constants.h (regex_constants::multiline): Add _GLIBCXX_RESOLVE_LIB_DEFECTS comment. Tested x86_64-linux. Committed to trunk. commit 17374dab3eefd282977ad90743c9aff97f2e41ce Author: Jonathan Wakely Date: Fri Oct 1 14:06:42 2021 libstdc++: Define basic_regex::multiline for non-strict modes The regex_constants::multiline constant is defined for non-strict C++11 and C++14 modes, on the basis that the feature is a DR (even though it was really a new feature addition to C++17 and probably shouldn't have gone through the issues list). This makes the basic_regex::multiline constant defined consistently with the regex_constants::multiline one. For strict C++11 and C++14 mode we don't define them, because multiline is not a reserved name in those standards. libstdc++-v3/ChangeLog: * include/bits/regex.h (basic_regex::multiline): Define for non-strict C++11 and C++14 modes. * include/bits/regex_constants.h (regex_constants::multiline): Add _GLIBCXX_RESOLVE_LIB_DEFECTS comment. 
diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h index 664944b0ef2..ff908da3e94 100644 --- a/libstdc++-v3/include/bits/regex.h +++ b/libstdc++-v3/include/bits/regex.h @@ -424,7 +424,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 static constexpr flag_type awk = regex_constants::awk; static constexpr flag_type grep = regex_constants::grep; static constexpr flag_type egrep = regex_constants::egrep; -#if __cplusplus >= 201703L +#if __cplusplus >= 201703L || !defined __STRICT_ANSI__ static constexpr flag_type multiline = regex_constants::multiline; #endif ///@} diff --git a/libstdc++-v3/include/bits/regex_constants.h b/libstdc++-v3/include/bits/regex_constants.h index af689ff93af..0fd2879c817 100644 --- a/libstdc++-v3/include/bits/regex_constants.h +++ b/libstdc++-v3/include/bits/regex_constants.h @@ -171,6 +171,8 @@ namespace regex_constants static_cast(1 << _S_egrep); #if __cplusplus >= 201703L || !defined __STRICT_ANSI__ + // _GLIBCXX_RESOLVE_LIB_DEFECTS + // 2503. multiline option should be added to syntax_option_type /** * Specifies that the `^` anchor matches at the beginning of a line, * and the `$` anchor matches at the end of a line, not only at the
Re: [PATCH] libiberty: prevent null dereferencing on dlang_type
On Thu, Sep 23, 2021 at 8:55 AM Jeff Law via Gcc-patches wrote:
>
>
>
> On 9/23/2021 4:17 AM, ibuclaw--- via Gcc-patches wrote:
> >> On 22/09/2021 03:31 Luís Ferreira wrote:
> >>
> >>
> >> This patch prevents dereferencing a null reference on a crafted
> >> malformed mangled name, often causing SIGSEGV to be raised.
> >>
> > OK, seems reasonable to me.
> I pushed this to the trunk.
>
> Thanks,
> jeff
>

This caused:

FAIL at line 997: unknown demangling style _D00
FAIL at line 1001: unknown demangling style _D01_D
FAIL at line 1005: unknown demangling style _D9223372036854775817
FAIL at line 1009: unknown demangling style _D1az
FAIL at line 1013: unknown demangling style _D1aN
FAIL at line 1017: unknown demangling style _D1aF
FAIL at line 1021: unknown demangling style _D1aM
FAIL at line 1025: unknown demangling style _D1aFZNz
FAIL at line 1029: unknown demangling style _D1aFNzZv
FAIL at line 1033: unknown demangling style _D4testFDX
FAIL at line 1037: unknown demangling style _D5__T0aZv
FAIL at line 1041: unknown demangling style _D10__T4testYZv
FAIL at line 1045: unknown demangling style _D4testFBaZv
FAIL at line 1049: unknown demangling style _D8__T4test
FAIL at line 1053: unknown demangling style _D10__T4testVi
FAIL at line 1057: unknown demangling style _D10__T4testVai
...
FAIL at line 1445: unknown demangling style _D3mod4funcFZ__T6nestedTiZQkMFNaNbNiNfZi
FAIL at line 1449: unknown demangling style _D3mod4funcFZ__T6nestedTiZ4__S1QpMFNaNbNiNfZi
FAIL at line 1452: unknown demangling style _D6mangle__T8fun21753VSQv6S21753S1f_DQBj10__lambda71MFNaNbNiNfZvZQCbQp
./test-demangle: 359 tests, 115 failures
make[5]: *** [Makefile:55: check-d-demangle] Error 1

-- 
H.J.
Re: [PATCH] c++: unifying equal NONTYPE_ARGUMENT_PACKs [PR102547]
On Fri, 1 Oct 2021, Jason Merrill wrote:

> On 10/1/21 09:46, Patrick Palka wrote:
> > Here during partial ordering of the two partial specializations we end
> > up in unify with parm=arg=NONTYPE_ARGUMENT_PACK, and crash shortly
> > thereafter because uses_template_parms calls potential_constant_expression
> > which doesn't handle NONTYPE_ARGUMENT_PACK.
> >
> > This patch fixes this by checking dependent_template_arg_p instead of
> > uses_template_parms when parm==arg, which does handle NONTYPE_ARGUMENT_PACK.
> > We could also perhaps fix uses_template_parms / inst_dep_expr_p to better
> > handle NONTYPE_ARGUMENT_PACK,
>
> Please.

Sounds good, like the following then?  Passes light testing; bootstrap and
regtest in progress.

-- >8 --

	PR c++/102547

gcc/cp/ChangeLog:

	* pt.c (instantiation_dependent_expression_p): Sidestep checking
	potential_constant_expression on NONTYPE_ARGUMENT_PACK.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/variadic-partial2.C: New test.
	* g++.dg/cpp0x/variadic-partial2a.C: New test.

--- gcc/cp/pt.c | 4 +++- .../g++.dg/cpp0x/variadic-partial2.C | 16 ++ .../g++.dg/cpp0x/variadic-partial2a.C | 22 +++ 3 files changed, 41 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 1dcdffe322a..643204103c5 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -27705,7 +27705,9 @@ instantiation_dependent_expression_p (tree expression) { return (instantiation_dependent_uneval_expression_p (expression) || (processing_template_decl - && potential_constant_expression (expression) + && expression != NULL_TREE + && (TREE_CODE (expression) == NONTYPE_ARGUMENT_PACK + || potential_constant_expression (expression)) && value_dependent_expression_p (expression))); } diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C new file mode 100644 index 000..df61f26a3c1 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C @@ -0,0 +1,16 @@ +// PR c++/102547 +// { dg-do compile { target c++11 } } + +template +struct vals { }; + +template +struct vals_client { }; + +template +struct vals_client, T> { }; + +template +struct vals_client, void> { }; + +template struct vals_client, void>; //- "sorry, unimplemented..., ICE" diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C new file mode 100644 index 000..cc0ea488ad3 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C @@ -0,0 +1,22 @@ +// PR c++/102547 +// { dg-do compile { target c++11 } } +// A version of variadic-partial2.C where the partial ordering is performed +// on function templates instead of class templates. 
+ +template +struct vals { }; + +template +void f(V, T) { }; + +template +void f(vals, T) { }; + +template +void f(vals, char) { }; + +template void f(vals<1, 2>, char); //- "sorry, unimplemented..., ICE" + +int main() { + f(vals<1, 3>{}, 'a'); //- "sorry, unimplemented..., ICE" +} -- 2.33.0.610.gcefe983a32
Re: [PATCH] libiberty: prevent null dereferencing on dlang_type
Hi,

Yes, I'm sorry, I forgot to add the --format=dlang parameter. This patch
fixes it:
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580544.html

On Fri, 2021-10-01 at 07:23 -0700, H.J. Lu wrote:
> On Thu, Sep 23, 2021 at 8:55 AM Jeff Law via Gcc-patches wrote:
> >
> > On 9/23/2021 4:17 AM, ibuclaw--- via Gcc-patches wrote:
> > > > On 22/09/2021 03:31 Luís Ferreira wrote:
> > > >
> > > > This patch prevents dereferencing a null reference on a crafted
> > > > malformed mangled name, often causing SIGSEGV to be raised.
> > > >
> > > OK, seems reasonable to me.
> > I pushed this to the trunk.
> >
> > Thanks,
> > jeff
> >
>
> This caused:
>
> FAIL at line 997: unknown demangling style _D00
> FAIL at line 1001: unknown demangling style _D01_D
> FAIL at line 1005: unknown demangling style _D9223372036854775817
> FAIL at line 1009: unknown demangling style _D1az
> FAIL at line 1013: unknown demangling style _D1aN
> FAIL at line 1017: unknown demangling style _D1aF
> FAIL at line 1021: unknown demangling style _D1aM
> FAIL at line 1025: unknown demangling style _D1aFZNz
> FAIL at line 1029: unknown demangling style _D1aFNzZv
> FAIL at line 1033: unknown demangling style _D4testFDX
> FAIL at line 1037: unknown demangling style _D5__T0aZv
> FAIL at line 1041: unknown demangling style _D10__T4testYZv
> FAIL at line 1045: unknown demangling style _D4testFBaZv
> FAIL at line 1049: unknown demangling style _D8__T4test
> FAIL at line 1053: unknown demangling style _D10__T4testVi
> FAIL at line 1057: unknown demangling style _D10__T4testVai
> ...
> FAIL at line 1445: unknown demangling style
> _D3mod4funcFZ__T6nestedTiZQkMFNaNbNiNfZi
> FAIL at line 1449: unknown demangling style
> _D3mod4funcFZ__T6nestedTiZ4__S1QpMFNaNbNiNfZi
> FAIL at line 1452: unknown demangling style
> _D6mangle__T8fun21753VSQv6S21753S1f_DQBj10__lambda71MFNaNbNiNfZvZQCbQp
> ./test-demangle: 359 tests, 115 failures
> make[5]: *** [Makefile:55: check-d-demangle] Error 1
>

--
Sincerely,
Luís Ferreira @ lsferreira.net
Re: [PATCH] Handle EQ_EXPR relation for operator_lshift.
Well, after talking with Andrew it seems that X << Y being non-zero
also implies X is non-zero.  So we don't even need relationals here.

So, I leave gori relationals in his capable hands, while I test this
much simpler patch which fixes the PR with no additional
infrastructure ;-).

Will push pending tests.
Aldy

On Fri, Oct 1, 2021 at 2:43 PM Aldy Hernandez wrote:
>
> Knowing that X << X is non-zero means X is also non-zero.  This patch
> teaches this to range-ops.
>
> As usual, the bit twiddling experts could come up with all sorts of
> fancy enhancements in this area, and we welcome all patches :).
>
> I will push this pending tests.
>
> gcc/ChangeLog:
>
>         PR tree-optimization/102546
>         * range-op.cc (operator_lshift::op1_range): Handle EQ_EXPR
>         relation.
> ---
>  gcc/range-op.cc                          | 19 ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr102546.c | 23 +++
>  2 files changed, 39 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102546.c
>
> diff --git a/gcc/range-op.cc b/gcc/range-op.cc
> index 5e37133026d..53f3be4266e 100644
> --- a/gcc/range-op.cc
> +++ b/gcc/range-op.cc
> @@ -2075,9 +2075,14 @@ operator_lshift::op1_range (irange &r,
>                                 tree type,
>                                 const irange &lhs,
>                                 const irange &op2,
> -                               relation_kind rel ATTRIBUTE_UNUSED) const
> +                               relation_kind rel) const
>  {
>    tree shift_amount;
> +  int_range<2> adjust (type);
> +
> +  if (rel == EQ_EXPR && !lhs.contains_p (build_zero_cst (type)))
> +    adjust.set_nonzero (type);
> +
>    if (op2.singleton_p (&shift_amount))
>      {
>        wide_int shift = wi::to_wide (shift_amount);
> @@ -2086,10 +2091,11 @@ operator_lshift::op1_range (irange &r,
>        if (wi::ge_p (shift, wi::uhwi (TYPE_PRECISION (type),
>                                       TYPE_PRECISION (op2.type ())),
>                      UNSIGNED))
> -        return false;
> +        goto done;
>        if (shift == 0)
>          {
>            r = lhs;
> +          r.intersect (adjust);
>            return true;
>          }
>
> @@ -2126,9 +2132,16 @@ operator_lshift::op1_range (irange &r,
>
>        if (utype != type)
>          range_cast (r, type);
> +      r.intersect (adjust);
>        return true;
>      }
> -  return false;
> +
> + done:
> +  if (adjust.varying_p ())
> +    return false;
> +
> +  r = adjust;
> +  return true;
>  }
>
>  bool
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102546.c
> b/gcc/testsuite/gcc.dg/tree-ssa/pr102546.c
> new file mode 100644
> index 000..4bd98747732
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102546.c
> @@ -0,0 +1,23 @@
> +// { dg-do compile }
> +// { dg-options "-O3 -fdump-tree-optimized" }
> +
> +static int a;
> +static char b, c, d;
> +void bar(void);
> +void foo(void);
> +
> +int main() {
> +    int f = 0;
> +    for (; f <= 5; f++) {
> +        bar();
> +        b = b && f;
> +        d = f << f;
> +        if (!(a >= d || f))
> +            foo();
> +        c = 1;
> +        for (; c; c = 0)
> +            ;
> +    }
> +}
> +
> +// { dg-final { scan-tree-dump-not "foo" "optimized" } }
> --
> 2.31.1
>

From fa11285b9ff1d75c877369c1df7760c3f76a4fe5 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez
Date: Fri, 1 Oct 2021 13:05:36 +0200
Subject: [PATCH] [PR102546] X << Y being non-zero implies X is also non-zero.

This patch teaches this to range-ops.

Tested on x86-64 Linux.

gcc/ChangeLog:

	PR tree-optimization/102546
	* range-op.cc (operator_lshift::op1_range): Teach range-ops that
	X << Y being non-zero implies X is also non-zero.
--- gcc/range-op.cc | 18 ++ gcc/testsuite/gcc.dg/tree-ssa/pr102546.c | 23 +++ 2 files changed, 37 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102546.c diff --git a/gcc/range-op.cc b/gcc/range-op.cc index 5e37133026d..2baca4a197f 100644 --- a/gcc/range-op.cc +++ b/gcc/range-op.cc @@ -2078,6 +2078,12 @@ operator_lshift::op1_range (irange &r, relation_kind rel ATTRIBUTE_UNUSED) const { tree shift_amount; + + if (!lhs.contains_p (build_zero_cst (type))) +r.set_nonzero (type); + else +r.set_varying (type); + if (op2.singleton_p (&shift_amount)) { wide_int shift = wi::to_wide (shift_amount); @@ -2089,21 +2095,24 @@ operator_lshift::op1_range (irange &r, return false; if (shift == 0) { - r = lhs; + r.intersect (lhs); return true; } // Work completely in unsigned mode to start. tree utype = type; + int_range_max tmp_range; if (TYPE_SIGN (type) == SIGNED) { int_range_max tmp = lhs; utype = unsigned_type_for (type); range_cast (tmp, utype); - op_rshift.fold_range (r, utype, tmp, op2); + op_rshift.fold_range (tmp_range, utype, tmp, op2); } else - op_rshift.fold_ra
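For illustration only (not part of the patch): a small sketch that checks the arithmetic fact the patch relies on, namely that X << Y being non-zero implies X is non-zero. It models an 8-bit type with in-range shift counts; the function names are made up for this example and have nothing to do with range-op.cc.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks, for every 8-bit X and every shift count Y in [0, 8),
// that a non-zero (X << Y), truncated to 8 bits, implies a non-zero X.
bool nonzero_shift_implies_nonzero_operand()
{
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 8; ++y)
      {
        std::uint8_t shifted = static_cast<std::uint8_t>(x << y);
        if (shifted != 0 && x == 0)
          return false;
      }
  return true;
}

// The converse does not hold: set bits can be shifted out of the type,
// so a non-zero X can still give a zero X << Y.
bool converse_has_counterexample()
{
  std::uint8_t x = 0x80;
  return x != 0 && static_cast<std::uint8_t>(x << 1) == 0;
}
```

This is why the implication can only be used in one direction: a non-zero LHS narrows op1 to non-zero, but a zero LHS says nothing about op1.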
Re: [RFC][Patch][middle-end/PR102359]Not add initialization for READONLY variables with -ftrivial-auto-var-init
> On Sep 30, 2021, at 2:31 PM, Jason Merrill wrote:
>
> On 9/30/21 11:42, Qing Zhao wrote:
>>> On Sep 30, 2021, at 1:54 AM, Richard Biener wrote:
>>>
>>> On Thu, 30 Sep 2021, Jason Merrill wrote:
>>>
>>>> On 9/29/21 17:30, Qing Zhao wrote:
>>>>> Hi,
>>>>>
>>>>> PR102359 (ICE gimplification failed since r12-3433-ga25e0b5e6ac8a77a)
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102359
>>>>>
>>>>> Is due to -ftrivial-auto-var-init adding initialization for READONLY
>>>>> variable “this” in the following routine: (t.cpp.005t.original)
>>>>>
>>>>> ===
>>>>>
>>>>> ;; Function A::foo():: (null)
>>>>> ;; enabled by -tree-original
>>>>>
>>>>> {
>>>>>   const struct A * const this [value-expr: &__closure->__this];
>>>>>     const struct A * const this [value-expr: &__closure->__this];
>>>>>   return = (double) ((const struct A *) this)->a;
>>>>> }
>>>>> ===
>>>>>
>>>>> However, in the above routine, “this” is NOT marked as READONLY, but its
>>>>> value-expr “&__closure->__this” is marked as READONLY.
>>>>>
>>>>> There are two major issues:
>>>>>
>>>>> 1. In the routine “is_var_need_auto_init”, we should exclude “decl” that
>>>>> is marked as READONLY;
>>>>> 2. In the C++ FE, “this” should be marked as READONLY.
>>>>>
>>>>> The ideal solution will be:
>>>>>
>>>>> 1. Fix “is_var_need_auto_init” to exclude TREE_READONLY (decl);
>>>>> 2. Fix C++ FE to mark “this” as TREE_READONLY (decl)==true;
>>>>>
>>>>> Not sure whether it’s hard for C++ FE to fix the 2nd issue or not?
>>>>>
>>>>> In the case it’s not a quick fix in C++FE, I proposed the following fix in
>>>>> middle end:
>>>>>
>>>>> Let me know your comments or suggestions on this.
>>>>>
>>>>> Thanks a lot for the help.
>>>>
>>>> I'd think is_var_need_auto_init should be false for any variable with
>>>> DECL_HAS_VALUE_EXPR_P, as they aren't really variables, just ways of
>>>> naming objects that are initialized elsewhere.
>>>
>>> IIRC handling variables with DECL_HAS_VALUE_EXPR_P is necessary to
>>> auto-init VLAs, otherwise I tend to agree - would we handle those
>>> when we see a DECL_EXPR then?
>>
>> The current implementation is:
>>
>> gimplify_decl_expr:
>>
>> For each DECL_EXPR “decl”
>>   If (VAR_P (decl) && !DECL_EXTERNAL (decl))
>>     {
>>       if (is_vla (decl))
>>         gimplify_vla_decl (decl, …); /* existing handling: create a
>>                                         VALUE_EXPR for this vla decl */
>>       …
>>       if (has_explicit_init (decl))
>>         {
>>           …; /* existing handling. */
>>         }
>>       else if (is_var_need_auto_init (decl)) /* New code. */
>>         {
>>           gimple_add_init_for_auto_var (….); /* new code. */
>>           ...
>>         }
>>     }
>>
>> Since the “DECL_VALUE_EXPR (decl)” is NOT a DECL_EXPR, it will not be
>> scanned and added initialization.
>>
>> if we do not add initialization for a decl that has DECL_VALUE_EXPR, then
>> the “DECL_VALUE_EXPR (decl)” will not be added an initialization either. We
>> will miss adding initializations for these decls.
>>
>> So, I think that the current implementation is correct.
>>
>> And if C++ FE will not mark “this” as READONLY, only mark
>> DECL_VALUE_EXPR(this) as READONLY, the proposed fix is correct too.
>>
>> Let me know your opinion on this.
>
> The problem with this test is not whether the 'this' proxy is marked
> READONLY, the problem is that you're trying to initialize lambda capture
> proxies at all; the lambda capture objects were already initialized when
> forming the closure object.  So this test currently aborts with
> -ftrivial-auto-var-init=zero because you "initialize" the i capture field to
> 0 after it was previously initialized to 42:
>
> int main()
> {
>   int i = 42;
>   auto l = [=]() mutable { return i; };
>   if (l() != i)
>     __builtin_abort ();
> }
>
> I believe the same issue applies to the proxy variables in coroutines that
> work much like lambdas.

So, how should the middle end determine that a variable is “proxy variable”?

Have all “proxy variables” been initialized by C++ FE already?

>
> You can't just assume that a VAR_DECL with DECL_VALUE_EXPR is uninitialized.

So, all the VAR_DECLs with DECL_VALUE_EXPR (except the ones created by “gimplify_decl_expr”) are initialized by FE already?

>
> Since there's already VLA handling in gimplify_decl_expr, you could remember
> whether you added DECL_VALUE_EXPR in that function, and only then do the
> initialization.

Yes, if we can guarantee that all the VAR_DECLs with DECL_VALUE_EXPR created from FEs have been initialized already by FE, we can fix this issue this way.

thanks.

Qing

>
> Jason
>
[PATCH, v4] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]
On Thu, Sep 30, 2021 at 03:01:49PM -0400, Jason Merrill wrote:
> > After fixing the incomplete std::strong_ordering spaceship-synth8.C is now
> > accepted, but I'm afraid I'm getting lost in this - clang++ rejects that
> > testcase instead complaining that D has <=> operator, but has it pure
> > virtual.
>
> Ah, I think we need to add LOOKUP_NO_VIRTUAL to the flags variable, as we do
> in do_build_copy_assign.  I suppose it wouldn't hurt to add LOOKUP_DEFAULTED
> as well.

I've tried that (see patch below), but those changes made no difference
for spaceship-synth8.C, either in build_comparison_op or in
genericize_spaceship; the testcase is still accepted instead of rejected.

> > +  if (special_function_p (fn) == sfk_comparison)
> > +    {
> > +      tree lhs = DECL_ARGUMENTS (fn);
> > +      if (is_this_parameter (lhs))
> > +        lhs = cp_build_fold_indirect_ref (lhs);
> > +      else
> > +        lhs = convert_from_reference (lhs);
> > +      tree ctype = TYPE_MAIN_VARIANT (TREE_TYPE (lhs));
> > +      /* If the comparison type is still incomplete, don't synthesize the
> > +         method, just see if it is not implicitly deleted.  */
> > +      if (!COMPLETE_TYPE_P (ctype))
> > +        {
> > +          push_deferring_access_checks (dk_no_deferred);
> > +          build_comparison_op (fn, false, tf_none);
> > +          pop_deferring_access_checks ();
> > +          return !DECL_MAYBE_DELETED (fn);
> > +        }
> > +    }
> > +
> >    ++function_depth;
> >    synthesize_method (fn);
> >    --function_depth;
>
> Let's factor this (from the added code to here) into a
> maybe_synthesize_method in method.c.  That way build_comparison_op can stay
> static.

Ok, done.

In addition, I've added the testcases from PR98712 to the patch.

I'm still worried that maybe_synthesize_method, if called e.g. from
maybe_instantiate_noexcept before the class is COMPLETE_TYPE_P, could
end up deducing a wrong return type, one that e.g. doesn't take into
account base classes.  I tried that with the spaceship-synth13.C
testcase, but there maybe_instantiate_noexcept isn't called early, and
I'm out of ideas how to do that.  Also, do maybe_instantiate_noexcept
callers care just about the return type of the method or also about
whether something in the body could throw?

2021-10-01  Jakub Jelinek

	PR c++/98712
	PR c++/102490
	* cp-tree.h (maybe_synthesize_method): Declare.
	* method.c (genericize_spaceship): Use
	LOOKUP_NORMAL | LOOKUP_NONVIRTUAL | LOOKUP_DEFAULTED instead of
	LOOKUP_NORMAL for flags.
	(comp_info): Remove defining member.
	(comp_info::comp_info): Remove complain argument, don't initialize
	defining.
	(build_comparison_op): Add defining argument.  Adjust comp_info
	construction.  Use defining instead of info.defining.  Assert that
	if defining, ctype is a complete type.  Use
	LOOKUP_NORMAL | LOOKUP_NONVIRTUAL | LOOKUP_DEFAULTED instead of
	LOOKUP_NORMAL for flags.
	(synthesize_method, maybe_explain_implicit_delete,
	explain_implicit_non_constexpr): Adjust build_comparison_op callers.
	(maybe_synthesize_method): New function.
	* class.c (check_bases_and_members): Don't call defaulted_late_check
	for sfk_comparison.
	(finish_struct_1): Call it here instead after class has been
	completed.
	* pt.c (maybe_instantiate_noexcept): Call maybe_synthesize_method
	instead of synthesize_method.

	* g++.dg/cpp2a/spaceship-synth8.C (std::strong_ordering): Provide
	more complete definition.
	(std::strong_ordering::less, std::strong_ordering::equal,
	std::strong_ordering::greater): Define.
	* g++.dg/cpp2a/spaceship-synth12.C: New test.
	* g++.dg/cpp2a/spaceship-synth13.C: New test.
	* g++.dg/cpp2a/spaceship-eq11.C: New test.
	* g++.dg/cpp2a/spaceship-eq12.C: New test.
	* g++.dg/cpp2a/spaceship-eq13.C: New test.

--- gcc/cp/cp-tree.h.jj 2021-10-01 10:24:37.500266902 +0200 +++ gcc/cp/cp-tree.h2021-10-01 16:40:08.880795343 +0200 @@ -7013,6 +7013,7 @@ extern void explain_implicit_non_constex extern bool deduce_inheriting_ctor (tree); extern bool decl_remember_implicit_trigger_p (tree); extern void synthesize_method (tree); +extern void maybe_synthesize_method(tree); extern tree lazily_declare_fn (special_function_kind, tree); extern tree skip_artificial_parms_for (const_tree, tree); --- gcc/cp/method.c.jj 2021-10-01 10:24:58.312971997 +0200 +++ gcc/cp/method.c 2021-10-01 16:46:18.018588835 +0200 @@ -1098,7 +1098,7 @@ genericize_spaceship (location_t loc, tr tree gt = lookup_comparison_result (tag, type, 1); - int flags = LOOKUP_NORMAL; + int flags = LOOKUP_NORMAL | LOOKUP_NONVIRTUAL | LOOKUP_DEFAULTED; tsubst_flags_t complain = tf_none; tree comp; @@ -1288,21 +1288,16 @@ struct co
Re: [PATCH] c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]
> gcc-verify still fails with this version:
>
> > ERR: line should start with a tab: "PR c++/101783"
> > ERR: line should start with a tab: "* tree.c
> > (cp_build_qualified_type_real): Excluding typedef from error"
> > ERR: line should start with a tab: "PR c++/101783"
> > ERR: line should start with a tab: "* g++.dg/parse/pr101783.C: New
> > test."
> It might work better to attach the output of git format-patch.

Sorry for my clumsy copy/paste from the git commit message. I now attach
the git format-patch output. For your convenience, I also attach the
original commit message file that I passed to git commit -F.

> Also, your commit subject line is too long, at 83 characters: It must be
> under 75 characters, and preferably closer to 50.  I might shorten it to

Please go ahead. I will pay attention to this next time. Thank you!

> A change description in the ChangeLog should use present tense
> ("Exclude"), have a period at the end, and line wrap at 75 characters
> like the rest of the commit message.  So,
>
> * tree.c (cp_build_qualified_type_real): Exclude typedef from
> error.
>

Modified as suggested.

> > + ([dcl.type.decltype]),in which case the cv-qualifiers are ignored.
> > + */
>
> We usually don't put */ on its own line.

Adjusted.

Once again I thank you for your patience and really appreciate it.

On Fri, Oct 1, 2021 at 9:29 AM Jason Merrill via Gcc-patches wrote:
>
> On 9/30/21 14:24, nick huang wrote:
> >>> You may need to run contrib/gcc-git-customization.sh to get the git
> >>> gcc-verify command.
> > I re-setup and can use git gcc-verify. Now I can see it rejects because I
> > forgot to add a
> > description of modified file. Now that it passes gcc-verify and I attach
> > the changelog
> > as attachment.
> >
> > Thank you again for your patient explanation and help!
>
> You're welcome, thanks for your patience as well!  Unfortunately, git
> gcc-verify still fails with this version:
>
> > ERR: line should start with a tab: "PR c++/101783"
> > ERR: line should start with a tab: "* tree.c
> > (cp_build_qualified_type_real): Excluding typedef from error"
> > ERR: line should start with a tab: "PR c++/101783"
> > ERR: line should start with a tab: "* g++.dg/parse/pr101783.C: New
> > test."
>
> It might work better to attach the output of git format-patch.
>
> Also, your commit subject line is too long, at 83 characters: It must be
> under 75 characters, and preferably closer to 50.  I might shorten it to
>
> c++: cv-qualified ref introduced by typedef [PR101783]
>
> > * tree.c (cp_build_qualified_type_real): Excluding typedef from error
>
> A change description in the ChangeLog should use present tense
> ("Exclude"), have a period at the end, and line wrap at 75 characters
> like the rest of the commit message.  So,
>
> * tree.c (cp_build_qualified_type_real): Exclude typedef from
> error.
>
> > + ([dcl.type.decltype]),in which case the cv-qualifiers are ignored.
> > + */
>
> We usually don't put */ on its own line.
>
> Jason
>

--
nick huang/qingzhe huang
http://www.staroceans.com
http://www.staroceans.com/english.htm

The root cause of this bug is that it considers reference with
cv-qualifiers as an error by generating value for variable "bad_quals".
However, this is not correct for case of typedef. Here I quote spec
[dcl.ref]/1 :
"Cv-qualified references are ill-formed except when the cv-qualifiers
are introduced through the use of a typedef-name ([dcl.typedef],
[temp.param]) or decltype-specifier ([dcl.type.decltype]),
in which case the cv-qualifiers are ignored."

2021-09-30  qingzhe huang

gcc/cp/ChangeLog:
	PR c++/101783
	* tree.c (cp_build_qualified_type_real): Exclude typedef from
	error.

gcc/testsuite/ChangeLog:
	PR c++/101783
	* g++.dg/parse/pr101783.C: New test.

From e592a475030d99647de736d294cb3c6a7588af49 Mon Sep 17 00:00:00 2001 From: qingzhe huang Date: Fri, 1 Oct 2021 10:46:35 -0400 Subject: [PATCH] The root cause of this bug is that it considers reference with cv-qualifiers as an error by generating value for variable "bad_quals". However, this is not correct for case of typedef. Here I quote spec [dcl.ref]/1 : "Cv-qualified references are ill-formed except when the cv-qualifiers are introduced through the use of a typedef-name ([dcl.typedef], [temp.param]) or decltype-specifier ([dcl.type.decltype]), in which case the cv-qualifiers are ignored." 2021-09-30 qingzhe huang gcc/cp/ChangeLog: PR c++/101783 * tree.c (cp_build_qualified_type_real): Exclude typedef from error. gcc/testsuite/ChangeLog: PR c++/101783 * g++.dg/parse/pr101783.C: New test. --- gcc/cp/tree.c | 9 - gcc/testsuite/g++.dg/parse/pr101783.C | 5 + 2 files changed, 13 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/parse/pr101783.C diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c inde
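For illustration only (this is not the testcase from the patch, whose contents are not shown here): a minimal sketch of the [dcl.ref]/1 rule quoted above, in which cv-qualifiers applied to a reference through a typedef-name are ignored rather than diagnosed. The names int_ref and bump_through_const_ref are made up for this example.

```cpp
#include <cassert>
#include <type_traits>

typedef int& int_ref;

// [dcl.ref]/1: cv-qualifiers introduced through a typedef-name are ignored,
// so "const int_ref" names the same type as "int&".
static_assert(std::is_same<const int_ref, int&>::value,
              "const applied via typedef is dropped");
static_assert(std::is_same<volatile int_ref, int&>::value,
              "volatile applied via typedef is dropped");

// R is a plain int& despite the const, so writing through it is allowed.
int bump_through_const_ref(int& x)
{
  const int_ref r = x;
  assert(&r == &x);
  r += 1;
  return x;
}
```

Writing `int& const r = x;` directly, without the typedef, is the ill-formed case that the rule still rejects.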
Re: [PATCH] c++: Implement C++20 -Wdeprecated-array-compare [PR97573]
On 9/30/21 8:50 AM, Marek Polacek via Gcc-patches wrote:
> This patch addresses one of my leftovers from GCC 11.  C++20 introduced
> [depr.array.comp]: "Equality and relational comparisons between two
> operands of array type are deprecated." so this patch adds
> -Wdeprecated-array-compare (enabled by default in C++20).

A warning like this would be useful in C as well even though there
array equality is not deprecated (though relational expressions
involving distinct objects are undefined).  Recently, while working
on my -Waddress enhancement to warn for more impossible null pointer
tests, I noticed Clang warns for some of these equality tests in both
languages (it issues -Wtautological-compare).

Rather than referring to deprecation, if a new option is necessary,
I would suggest choosing a name for it that reflects the problem
the warning (and presumably the deprecation in C++) tries to prevent.
That said, since GCC already has both -Waddress and
-Wtautological-compare for these problems, the warning could be issued
under either of these.

Martin

> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>
> 	PR c++/97573
>
> gcc/c-family/ChangeLog:
>
> 	* c-opts.c (c_common_post_options): In C++20, turn on
> 	-Wdeprecated-array-compare.
> 	* c.opt (Wdeprecated-array-compare): New option.
>
> gcc/cp/ChangeLog:
>
> 	* typeck.c (do_warn_deprecated_array_compare): New.
> 	(cp_build_binary_op): Call it for equality and relational comparisons.
>
> gcc/ChangeLog:
>
> 	* doc/invoke.texi: Document -Wdeprecated-array-compare.
>
> gcc/testsuite/ChangeLog:
>
> 	* g++.dg/tree-ssa/pr15791-1.C: Add dg-warning.
> 	* g++.dg/cpp2a/array-comp1.C: New test.
> 	* g++.dg/cpp2a/array-comp2.C: New test.
> 	* g++.dg/cpp2a/array-comp3.C: New test.
--- gcc/c-family/c-opts.c | 5 gcc/c-family/c.opt| 4 +++ gcc/cp/typeck.c | 28 +++ gcc/doc/invoke.texi | 19 - gcc/testsuite/g++.dg/cpp2a/array-comp1.C | 34 +++ gcc/testsuite/g++.dg/cpp2a/array-comp2.C | 31 + gcc/testsuite/g++.dg/cpp2a/array-comp3.C | 29 +++ gcc/testsuite/g++.dg/tree-ssa/pr15791-1.C | 2 +- 8 files changed, 150 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp1.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp2.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp3.C diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index 3eaab5e1530..00b52cc5e12 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -962,6 +962,11 @@ c_common_post_options (const char **pfilename) warn_deprecated_enum_float_conv, cxx_dialect >= cxx20 && warn_deprecated); + /* -Wdeprecated-array-compare is enabled by default in C++20. */ + SET_OPTION_IF_UNSET (&global_options, &global_options_set, + warn_deprecated_array_compare, + cxx_dialect >= cxx20 && warn_deprecated); + /* Declone C++ 'structors if -Os. */ if (flag_declone_ctor_dtor == -1) flag_declone_ctor_dtor = optimize_size; diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 9c151d19870..a4f0ea68594 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -540,6 +540,10 @@ Wdeprecated C C++ ObjC ObjC++ CPP(cpp_warn_deprecated) CppReason(CPP_W_DEPRECATED) ; Documented in common.opt +Wdeprecated-array-compare +C++ ObjC++ Var(warn_deprecated_array_compare) Warning +Warn about deprecated comparisons between two operands of array type. + Wdeprecated-copy C++ ObjC++ Var(warn_deprecated_copy) Warning LangEnabledBy(C++ ObjC++, Wextra) Mark implicitly-declared copy operations as deprecated if the class has a diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c index a2398dbe660..1e3a41104d6 100644 --- a/gcc/cp/typeck.c +++ b/gcc/cp/typeck.c @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. 
If not see #include "attribs.h" #include "asan.h" #include "gimplify.h" +#include "tree-pretty-print.h" static tree cp_build_addr_expr_strict (tree, tsubst_flags_t); static tree cp_build_function_call (tree, tree, tsubst_flags_t); @@ -4725,6 +4726,21 @@ do_warn_enum_conversions (location_t loc, enum tree_code code, tree type0, } } +/* Warn about C++20 [depr.array.comp] array comparisons: "Equality + and relational comparisons between two operands of array type are + deprecated." */ + +static inline void +do_warn_deprecated_array_compare (location_t location, tree_code code, + tree op0, tree op1) +{ + if (warning_at (location, OPT_Wdeprecated_array_compare, + "comparison between two arrays is deprecated")) +inform (location, "use unary %<+%> which decays operands to pointers " + "or %<&%D[0] %s &%D[0]%> to compare the addresses
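For illustration only (not part of the patch): a minimal sketch of the two things a user usually means by `a == b` on arrays, written so that neither operand has array type at the comparison. The first helper follows the unary `+` suggestion in the inform message above; the second compares contents instead. Both helper names are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Address comparison made explicit: unary + decays each array to a pointer
// to its first element, so no operand has array type at the ==.
template <typename T, std::size_t N>
bool same_array(const T (&a)[N], const T (&b)[N])
{
  return +a == +b;
}

// Element-wise comparison, which is often what was actually intended.
template <typename T, std::size_t N>
bool same_contents(const T (&a)[N], const T (&b)[N])
{
  return std::equal(std::begin(a), std::end(a), std::begin(b));
}
```

For two distinct arrays holding equal elements, same_array is false while same_contents is true, which is the confusion the deprecation is presumably meant to surface.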
PING^4 [PATCH] x86: Update memcpy/memset inline strategies for -mtune=generic
On Mon, Sep 20, 2021 at 10:06 AM H.J. Lu wrote:
>
> On Mon, Sep 13, 2021 at 6:38 AM H.J. Lu wrote:
> >
> > On Tue, Sep 7, 2021 at 8:01 PM H.J. Lu wrote:
> > >
> > > On Sun, Aug 22, 2021 at 8:28 AM H.J. Lu wrote:
> > > >
> > > > On Tue, Mar 23, 2021 at 09:19:38AM +0100, Richard Biener wrote:
> > > > > On Tue, Mar 23, 2021 at 3:41 AM Hongyu Wang wrote:
> > > > > >
> > > > > > > Hongyu, please collect code size differences on SPEC CPU 2017 and
> > > > > > > eembc.
> > > > > >
> > > > > > Here is code size difference for this patch
> > > > >
> > > > > Thanks, nothing too bad although slightly larger impacts than
> > > > > envisioned.
> > > > >
> > > >
> > > > PING.
> > > >
> > > > OK for master branch?
> > > >
> > > > Thanks.
> > > >
> > > > H.J.
> > > > ---
> > > > Simplify memcpy and memset inline strategies to avoid branches for
> > > > -mtune=generic:
> > > >
> > > > 1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
> > > >    load and store for up to 16 * 16 (256) bytes when the data size is
> > > >    fixed and known.
> > > > 2. Inline only if data size is known to be <= 256.
> > > >    a. Use "rep movsb/stosb" with simple code sequence if the data size
> > > >       is a constant.
> > > >    b. Use loop if data size is not a constant.
> > > > 3. Use memcpy/memset library function if data size is unknown or > 256.
> > > >
> > >
> > > PING:
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577889.html
> > >
> >
> > PING.  This should fix:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102294
> >
>
> PING.
>

Any comments or objections to this patch?

--
H.J.
Re: [RFC][Patch][middle-end/PR102359]Not add initialization for READONLY variables with -ftrivial-auto-var-init
On 10/1/21 10:54, Qing Zhao wrote:
>> On Sep 30, 2021, at 2:31 PM, Jason Merrill wrote:
>>
>> On 9/30/21 11:42, Qing Zhao wrote:
>>>> On Sep 30, 2021, at 1:54 AM, Richard Biener wrote:
>>>>
>>>> On Thu, 30 Sep 2021, Jason Merrill wrote:
>>>>
>>>>> On 9/29/21 17:30, Qing Zhao wrote:
>>>>>> Hi,
>>>>>>
>>>>>> PR102359 (ICE gimplification failed since r12-3433-ga25e0b5e6ac8a77a)
>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102359
>>>>>>
>>>>>> Is due to -ftrivial-auto-var-init adding initialization for READONLY
>>>>>> variable “this” in the following routine: (t.cpp.005t.original)
>>>>>>
>>>>>> ===
>>>>>>
>>>>>> ;; Function A::foo():: (null)
>>>>>> ;; enabled by -tree-original
>>>>>>
>>>>>> {
>>>>>>   const struct A * const this [value-expr: &__closure->__this];
>>>>>>     const struct A * const this [value-expr: &__closure->__this];
>>>>>>   return = (double) ((const struct A *) this)->a;
>>>>>> }
>>>>>> ===
>>>>>>
>>>>>> However, in the above routine, “this” is NOT marked as READONLY, but its
>>>>>> value-expr "&__closure->__this” is marked as READONLY.
>>>>>>
>>>>>> There are two major issues:
>>>>>>
>>>>>> 1. In the routine “is_var_need_auto_init”, we should exclude “decl” that is
>>>>>> marked as READONLY;
>>>>>> 2. In the C++ FE, “this” should be marked as READONLY.
>>>>>>
>>>>>> The idea solution will be:
>>>>>>
>>>>>> 1. Fix “is_var_need_auto_init” to exclude TREE_READONLY (decl);
>>>>>> 2. Fix C++ FE to mark “this” as TREE_READONLY (decl)==true;
>>>>>>
>>>>>> Not sure whether it’s hard for C++ FE to fix the 2nd issue or not?
>>>>>>
>>>>>> In the case it’s not a quick fix in C++FE, I proposed the following fix in
>>>>>> middle end:
>>>>>>
>>>>>> Let me know your comments or suggestions on this.
>>>>>>
>>>>>> Thanks a lot for the help.
>>>>>
>>>>> I'd think is_var_need_auto_init should be false for any variable with
>>>>> DECL_HAS_VALUE_EXPR_P, as they aren't really variables, just ways of naming
>>>>> objects that are initialized elsewhere.
>>>>
>>>> IIRC handing variables with DECL_HAS_VALUE_EXPR_P is necessary to
>>>> auto-init VLAs, otherwise I tend to agree - would we handle those
>>>> when we see a DECL_EXPR then?
>>>
>>> The current implementation is:
>>>
>>> gimplify_decl_expr:
>>>
>>> For each DECL_EXPR “decl”
>>>    If (VAR_P (decl) && !DECL_EXTERNAL (decl))
>>>      {
>>>        if (is_vla (decl))
>>>          gimplify_vla_decl (decl, …);  /* existing handling: create a VALUE_EXPR for this vla decl*/
>>>        …
>>>        if (has_explicit_init (decl))
>>>          {
>>>            …; /* existing handling. */
>>>          }
>>>        else if (is_var_need_auto_init (decl))  /*. New code. */
>>>          {
>>>            gimple_add_init_for_auto_var (….);  /* new code. */
>>>            ...
>>>          }
>>>      }
>>>
>>> Since the “DECL_VALUE_EXPR (decl)” is NOT a DECL_EXPR, it will not be
>>> scanned and added initialization.
>>>
>>> if we do not add initialization for a decl that has DECL_VALUE_EXPR, then
>>> the “DECL_VALUE_EXPR (decl)” will not be added an initialization either.
>>> We will miss adding initializations for these decls.
>>>
>>> So, I think that the current implementation is correct.
>>>
>>> And if C++ FE will not mark “this” as READONLY, only mark
>>> DECL_VALUE_EXPR(this) as READONLY, the proposed fix is correct too.
>>>
>>> Let me know your opinion on this.
>>
>> The problem with this test is not whether the 'this' proxy is marked
>> READONLY, the problem is that you're trying to initialize lambda capture
>> proxies at all; the lambda capture objects were already initialized when
>> forming the closure object. So this test currently aborts with
>> -ftrivial-auto-var-init=zero because you "initialize" the i capture field
>> to 0 after it was previously initialized to 42:
>>
>> int main()
>> {
>>   int i = 42;
>>   auto l = [=]() mutable { return i; };
>>   if (l() != i)
>>     __builtin_abort ();
>> }
>>
>> I believe the same issue applies to the proxy variables in coroutines that
>> work much like lambdas.
>
> So, how should the middle end determine that a variable is “proxy variable”?

In the front end, is_capture_proxy will identify a lambda capture proxy
variable. But that won't be true for the similar proxies used by coroutines.

> Have all “proxy variables” been initialized by C++ FE already?

Yes.

>> You can't just assume that a VAR_DECL with DECL_VALUE_EXPR is uninitialized.
>
> So, all the VAR_DECLs with DECL_VALUE_EXPR (except the ones created by
> “gimplify_decl_expr”) are initialized by FE already?

In general I'd expect them to refer to previously created objects which may
or may not have been initialized, but if they haven't been, the place to deal
with that is at their previous creation.

>> Since there's already VLA handling in gimplify_decl_expr, you could
>> remember whether you added DECL_VALUE_EXPR in that function, and only then
>> do the initialization.
>
> Yes, if we can guarantee that all the VAR_DECLs with DECL_VALUE_EXPR created
> from FEs have been initialized already by FE, we can fix this issue as this
> way.

Or more generally, check whether the argument to gimplify_decl_expr has
DECL_VALUE_EXPR when we enter the function, and don't do the initialization in
that case.

Jason
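For readers outside GCC, the last suggestion in this message (look at DECL_VALUE_EXPR on entry, before the VLA handling has a chance to add one) can be modelled in a few lines. This is only a sketch and not GCC code: `ToyDecl`, its fields and `toy_gimplify_decl_expr` are hypothetical names that stand in for the tree predicates mentioned in the thread.

```cpp
#include <cassert>

// Hypothetical stand-in for a VAR_DECL. The flags only mimic the predicates
// discussed in the thread; they are not real GCC data structures.
struct ToyDecl {
  bool is_var;             // models VAR_P (decl)
  bool is_external;        // models DECL_EXTERNAL (decl)
  bool has_value_expr;     // models DECL_HAS_VALUE_EXPR_P (decl)
  bool is_vla;             // the decl is a variable-length array
  bool has_explicit_init;  // the decl was written with an initializer
};

// Returns true if this toy gimplifier would add an automatic initialization.
bool toy_gimplify_decl_expr(ToyDecl &decl) {
  if (!decl.is_var || decl.is_external)
    return false;

  // Record the flag on entry: a decl that already carries a value-expr here
  // is a proxy for an object created (and initialized) somewhere else.
  const bool proxy_on_entry = decl.has_value_expr;

  // Models the existing VLA handling, which itself attaches a value-expr.
  if (decl.is_vla)
    decl.has_value_expr = true;

  if (decl.has_explicit_init)
    return false;

  // Only decls that were not proxies on entry get the automatic init.
  return !proxy_on_entry;
}
```

Under this model a lambda capture proxy is skipped, while a VLA still gets its initialization even though it ends up with a value-expr by the time the check is made.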
[PATCH 4/4] Update c-c++-common/tsan/atomic_stack.c
Print out from __tsan_atomic32_fetch_add was removed by

commit da7a5c09c86c3f639c63ce8843d6f21c915ae1c6
Author: Dmitry Vyukov
Date:   Wed Jul 28 16:57:39 2021 +0200

    tsan: don't print __tsan_atomic* functions in report stacks

    Currently __tsan_atomic* functions do FuncEntry/Exit using caller PC
    and then use current PC (pointing to __tsan_atomic* itself) during
    memory access handling. As the result the top function in reports
    involving atomics is __tsan_atomic* and the next frame points to user code.

    Remove FuncEntry/Exit in atomic functions and use caller PC
    during memory access handling. This removes __tsan_atomic*
    from the top of report stacks, so that they point right to user code.

    The motivation for this is performance.
    Some atomic operations are very hot (mostly loads),
    so removing FuncEntry/Exit is beneficial.
    This also reduces thread trace consumption (1 event instead of 3).

    __tsan_atomic* at the top of the stack is not necessary
    and does not add any new information. We already say
    "atomic write of size 4", "__tsan_atomic32_store" does not add
    anything new.

    It also makes reports consistent between atomic and non-atomic
    accesses. For normal accesses we say "previous write" and point
    to user code; for atomics we say "previous atomic write" and now
    also point to user code.

    Reviewed By: vitalybuka

    Differential Revision: https://reviews.llvm.org/D106966

	* c-c++-common/tsan/atomic_stack.c: Don't expect print out from
	__tsan_atomic32_fetch_add.
---
 gcc/testsuite/c-c++-common/tsan/atomic_stack.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/tsan/atomic_stack.c b/gcc/testsuite/c-c++-common/tsan/atomic_stack.c
index 746afa7b466..09ac7c72245 100644
--- a/gcc/testsuite/c-c++-common/tsan/atomic_stack.c
+++ b/gcc/testsuite/c-c++-common/tsan/atomic_stack.c
@@ -31,5 +31,4 @@ int main() {
 /* { dg-output "WARNING: ThreadSanitizer: data race.*(\n|\r\n|\r)" } */
 /* { dg-output " Atomic write of size 4.*" } */
-/* { dg-output "#0 __tsan_atomic32_fetch_add.*" } */
-/* { dg-output "#1 Thread1.*" } */
+/* { dg-output "#0 Thread1.*" } */
--
2.31.1
[PATCH 3/4] libsanitizer: Bump asan/tsan versions
Bump asan/tsan versions for upstream commits:

commit f1bb30a4956f83e46406d6082e5d376ce65391e0
Author: Vitaly Buka
Date:   Thu Aug 26 10:25:09 2021 -0700

    [sanitizer] No THREADLOCAL in qsort and bsearch

    qsort can reuse qsort_r if available.
    bsearch always passes key as the first comparator argument, so we
    can use it to wrap the original comparator.

    Differential Revision: https://reviews.llvm.org/D108751

commit d77b476c1953bcb0a608b2d6a4f2dd9fe0b43967
Author: Dmitry Vyukov
Date:   Mon Aug 2 16:52:53 2021 +0200

    tsan: avoid extra call indirection in unaligned access functions

    Currently unaligned access functions are defined in tsan_interface.cpp
    and do a real call to MemoryAccess. This means we have a real call
    and no read/write constant propagation.

    Unaligned memory access can be quite hot for some programs
    (observed on some compression algorithms with ~90% of unaligned accesses).

    Move them to tsan_interface_inl.h to avoid the additional call
    and enable constant propagation.
    Also reorder the actual store and memory access handling for
    __sanitizer_unaligned_store callbacks to enable tail calling in MemoryAccess.

    Depends on D107282.

    Reviewed By: vitalybuka, melver

commit 97795be22f634667ce7a022398c59ccc9f7440eb
Author: Dmitry Vyukov
Date:   Fri Jul 30 08:35:11 2021 +0200

    tsan: optimize test-only barrier

    The updated lots_of_threads.c test with 300 threads
    started running for too long on machines with low
    hardware parallelism (e.g. taskset -c 0-1).
    On lots of CPUs it finishes in ~2 secs. But with
    taskset -c 0-1 it runs for hundreds of seconds
    effectively spinning in the barrier in the sleep loop.

    We now have the handy futex API in sanitizer_common.
    Use it instead of the passive spin loop.
    It makes the test run only faster with taskset -c 0-1,
    it runs for ~1.5 secs, while with full parallelism
    it still runs for ~2 secs (but consumes less CPU time).

    Depends on D107131.

    Reviewed By: vitalybuka
---
 libsanitizer/asan/libtool-version | 2 +-
 libsanitizer/tsan/libtool-version | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libsanitizer/asan/libtool-version b/libsanitizer/asan/libtool-version
index 2cd4546d1b9..7a2a23f2b56 100644
--- a/libsanitizer/asan/libtool-version
+++ b/libsanitizer/asan/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-7:0:0
+8:0:0
diff --git a/libsanitizer/tsan/libtool-version b/libsanitizer/tsan/libtool-version
index 79dfeeea15f..6fa8162dd20 100644
--- a/libsanitizer/tsan/libtool-version
+++ b/libsanitizer/tsan/libtool-version
@@ -3,4 +3,4 @@
 # a separate file so that version updates don't involve re-running
 # automake.
 # CURRENT:REVISION:AGE
-1:0:0
+2:0:0
--
2.31.1
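As background on what the CURRENT:REVISION:AGE triple in libtool-version controls: on GNU/Linux, libtool derives the installed file suffix as (CURRENT - AGE).AGE.REVISION and the soname major number as CURRENT - AGE. The sketch below works that arithmetic through for the values in this patch. The helper names (`LibtoolVersion`, `parse_libtool_version`, `linux_soname_major`, `linux_version_suffix`) are made up for illustration and are not part of libtool or GCC, and other platforms map the triple differently.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper type: the three fields of a libtool version triple.
struct LibtoolVersion {
  int current;
  int revision;
  int age;
};

// Parses "CURRENT:REVISION:AGE". Returns false on malformed input or on a
// triple that libtool would reject (negative fields, or AGE > CURRENT).
bool parse_libtool_version(const std::string &text, LibtoolVersion &out) {
  int consumed = 0;
  if (std::sscanf(text.c_str(), "%d:%d:%d%n", &out.current, &out.revision,
                  &out.age, &consumed) != 3)
    return false;
  if (consumed != static_cast<int>(text.size()))
    return false;
  return out.current >= 0 && out.revision >= 0 && out.age >= 0 &&
         out.age <= out.current;
}

// Soname major number as computed for GNU/Linux targets.
int linux_soname_major(const LibtoolVersion &v) { return v.current - v.age; }

// Full version suffix of the installed file for GNU/Linux targets.
std::string linux_version_suffix(const LibtoolVersion &v) {
  return std::to_string(v.current - v.age) + "." + std::to_string(v.age) +
         "." + std::to_string(v.revision);
}
```

With AGE left at 0, bumping CURRENT from 7 to 8 moves libasan from soname major 7 to 8 (suffix 8.0.0), and likewise libtsan from 1 to 2, so binaries linked against the old libraries will not silently pick up the new ones.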
[PATCH 1/4] libsanitizer: Merge with upstream
Merged revision: 1c2e5fd66ea27d0c51360ba4e22099124a915562 --- libsanitizer/MERGE|2 +- libsanitizer/asan/asan_fuchsia.cpp| 35 +- libsanitizer/asan/asan_globals.cpp| 33 +- libsanitizer/asan/asan_interceptors.cpp | 18 +- libsanitizer/asan/asan_interceptors.h | 52 +- libsanitizer/asan/asan_mapping.h |2 +- libsanitizer/asan/asan_report.cpp | 10 +- libsanitizer/asan/asan_rtl.cpp| 18 +- libsanitizer/asan/asan_stats.cpp | 10 +- libsanitizer/asan/asan_thread.cpp |4 +- libsanitizer/hwasan/Makefile.am |3 +- libsanitizer/hwasan/Makefile.in | 12 +- libsanitizer/hwasan/hwasan.cpp|3 +- libsanitizer/hwasan/hwasan.h | 25 +- .../hwasan/hwasan_allocation_functions.cpp| 24 + libsanitizer/hwasan/hwasan_allocator.cpp | 58 +- libsanitizer/hwasan/hwasan_dynamic_shadow.cpp |9 + libsanitizer/hwasan/hwasan_fuchsia.cpp| 23 + libsanitizer/hwasan/hwasan_interceptors.cpp | 70 +- .../hwasan/hwasan_interface_internal.h| 48 - libsanitizer/hwasan/hwasan_linux.cpp | 147 +- libsanitizer/hwasan/hwasan_report.cpp | 82 +- ...wasan_setjmp.S => hwasan_setjmp_aarch64.S} | 21 +- libsanitizer/hwasan/hwasan_setjmp_x86_64.S| 80 + libsanitizer/hwasan/hwasan_thread.cpp |2 +- libsanitizer/hwasan/hwasan_type_test.cpp |2 +- .../include/sanitizer/asan_interface.h|2 +- .../include/sanitizer/common_interface_defs.h |2 +- .../include/sanitizer/dfsan_interface.h |3 +- .../include/sanitizer/linux_syscall_hooks.h | 2120 + .../include/sanitizer/tsan_interface.h|3 + .../interception/interception_win.cpp | 48 +- libsanitizer/lsan/lsan_allocator.h|2 +- libsanitizer/lsan/lsan_common.cpp | 12 +- .../sanitizer_common/sancov_flags.inc |2 +- .../sanitizer_common/sanitizer_addrhashmap.h |2 +- .../sanitizer_allocator_primary64.h | 14 +- .../sanitizer_allocator_size_class_map.h |8 +- libsanitizer/sanitizer_common/sanitizer_asm.h |4 +- .../sanitizer_atomic_clang_mips.h |2 +- .../sanitizer_common/sanitizer_common.h | 20 +- .../sanitizer_common_interceptors.inc | 652 ++--- .../sanitizer_common_interceptors_format.inc | 10 +- 
...izer_common_interceptors_netbsd_compat.inc |4 +- .../sanitizer_common_nolibc.cpp |1 + .../sanitizer_common_syscalls.inc | 1559 +++- .../sanitizer_coverage_fuchsia.cpp|8 +- .../sanitizer_coverage_libcdep_new.cpp| 65 +- .../sanitizer_common/sanitizer_file.cpp | 15 + .../sanitizer_common/sanitizer_file.h |2 + .../sanitizer_common/sanitizer_flag_parser.h |2 +- .../sanitizer_common/sanitizer_flags.inc |4 + .../sanitizer_common/sanitizer_fuchsia.cpp| 41 - .../sanitizer_interceptors_ioctl_netbsd.inc |2 +- .../sanitizer_interface_internal.h|7 +- .../sanitizer_internal_defs.h | 46 +- .../sanitizer_common/sanitizer_libc.cpp | 12 + .../sanitizer_common/sanitizer_libc.h |5 +- .../sanitizer_common/sanitizer_libignore.cpp | 33 +- .../sanitizer_common/sanitizer_libignore.h| 37 +- .../sanitizer_common/sanitizer_linux.cpp | 83 +- .../sanitizer_linux_libcdep.cpp |4 - .../sanitizer_local_address_space_view.h |2 +- .../sanitizer_common/sanitizer_mac.cpp| 41 +- libsanitizer/sanitizer_common/sanitizer_mac.h | 20 - .../sanitizer_common/sanitizer_mutex.cpp | 186 ++ .../sanitizer_common/sanitizer_mutex.h| 325 +-- .../sanitizer_common/sanitizer_platform.h | 25 +- .../sanitizer_platform_interceptors.h | 27 +- .../sanitizer_platform_limits_freebsd.cpp |4 + .../sanitizer_platform_limits_freebsd.h | 164 +- .../sanitizer_platform_limits_linux.cpp | 61 +- .../sanitizer_platform_limits_netbsd.cpp |1 + .../sanitizer_platform_limits_netbsd.h|1 + .../sanitizer_platform_limits_posix.cpp | 25 +- .../sanitizer_platform_limits_posix.h | 32 +- .../sanitizer_platform_limits_solaris.cpp |1 + .../sanitizer_platform_limits_solaris.h |1 + .../sanitizer_common/sanitizer_posix.h|7 +- .../sanitizer_posix_libcdep.cpp |2 + .../sanitizer_common/sanitizer_printf.cpp | 37 +- .../sanitizer_signal_interceptors.inc | 12 +- .../sanitizer_common/sanitizer_solaris.cpp| 22 - .../sanitizer_common/sanitizer_stacktrace.cpp | 22 +- .../sanitizer_stacktrace_libcdep.cpp |2 +- .../sanitizer_stacktrace_printer.cpp | 11 +- 
.../sanitizer_stacktrace_s
[PATCH 0/4] libsanitizer: Merge with upstream commit 1c2e5fd66ea
Merge with upstream commit: commit 1c2e5fd66ea27d0c51360ba4e22099124a915562 Author: peter klausler Date: Wed Sep 15 08:28:48 2021 -0700 [flang] Enforce constraint: defined ass't in WHERE must be elemental A defined assignment subroutine invoked in the context of a WHERE statement or construct must necessarily be elemental (C1032). Differential Revision: https://reviews.llvm.org/D109932 H.J. Lu (4): libsanitizer: Merge with upstream libsanitizer: Apply local patches libsanitizer: Bump asan/tsan versions Update c-c++-common/tsan/atomic_stack.c .../c-c++-common/tsan/atomic_stack.c |3 +- libsanitizer/MERGE|2 +- libsanitizer/asan/asan_fuchsia.cpp| 35 +- libsanitizer/asan/asan_globals.cpp| 14 +- libsanitizer/asan/asan_interceptors.cpp | 18 +- libsanitizer/asan/asan_interceptors.h | 45 +- libsanitizer/asan/asan_report.cpp | 10 +- libsanitizer/asan/asan_rtl.cpp| 18 +- libsanitizer/asan/asan_stats.cpp | 10 +- libsanitizer/asan/asan_thread.cpp |4 +- libsanitizer/asan/libtool-version |2 +- libsanitizer/hwasan/Makefile.am |3 +- libsanitizer/hwasan/Makefile.in | 12 +- libsanitizer/hwasan/hwasan.cpp|3 +- libsanitizer/hwasan/hwasan.h | 25 +- .../hwasan/hwasan_allocation_functions.cpp| 24 + libsanitizer/hwasan/hwasan_allocator.cpp | 58 +- libsanitizer/hwasan/hwasan_dynamic_shadow.cpp |9 + libsanitizer/hwasan/hwasan_fuchsia.cpp| 23 + libsanitizer/hwasan/hwasan_interceptors.cpp | 70 +- .../hwasan/hwasan_interface_internal.h| 48 - libsanitizer/hwasan/hwasan_linux.cpp | 147 +- libsanitizer/hwasan/hwasan_report.cpp | 82 +- ...wasan_setjmp.S => hwasan_setjmp_aarch64.S} | 21 +- libsanitizer/hwasan/hwasan_setjmp_x86_64.S| 80 + libsanitizer/hwasan/hwasan_thread.cpp |2 +- libsanitizer/hwasan/hwasan_type_test.cpp |2 +- .../include/sanitizer/asan_interface.h|2 +- .../include/sanitizer/common_interface_defs.h |2 +- .../include/sanitizer/dfsan_interface.h |3 +- .../include/sanitizer/linux_syscall_hooks.h | 2120 + .../include/sanitizer/tsan_interface.h|3 + .../interception/interception_win.cpp | 
48 +- libsanitizer/lsan/lsan_allocator.h|2 +- libsanitizer/lsan/lsan_common.cpp | 12 +- .../sanitizer_common/sancov_flags.inc |2 +- .../sanitizer_common/sanitizer_addrhashmap.h |2 +- .../sanitizer_allocator_primary64.h | 14 +- .../sanitizer_allocator_size_class_map.h |8 +- libsanitizer/sanitizer_common/sanitizer_asm.h |4 +- .../sanitizer_atomic_clang_mips.h |2 +- .../sanitizer_common/sanitizer_common.h | 20 +- .../sanitizer_common_interceptors.inc | 652 ++--- .../sanitizer_common_interceptors_format.inc | 10 +- ...izer_common_interceptors_netbsd_compat.inc |4 +- .../sanitizer_common_nolibc.cpp |1 + .../sanitizer_common_syscalls.inc | 1559 +++- .../sanitizer_coverage_fuchsia.cpp|8 +- .../sanitizer_coverage_libcdep_new.cpp| 65 +- .../sanitizer_common/sanitizer_file.cpp | 15 + .../sanitizer_common/sanitizer_file.h |2 + .../sanitizer_common/sanitizer_flag_parser.h |2 +- .../sanitizer_common/sanitizer_flags.inc |4 + .../sanitizer_common/sanitizer_fuchsia.cpp| 41 - .../sanitizer_interceptors_ioctl_netbsd.inc |2 +- .../sanitizer_interface_internal.h|7 +- .../sanitizer_internal_defs.h | 46 +- .../sanitizer_common/sanitizer_libc.cpp | 12 + .../sanitizer_common/sanitizer_libc.h |5 +- .../sanitizer_common/sanitizer_libignore.cpp | 33 +- .../sanitizer_common/sanitizer_libignore.h| 37 +- .../sanitizer_common/sanitizer_linux.cpp | 83 +- .../sanitizer_local_address_space_view.h |2 +- .../sanitizer_common/sanitizer_mac.cpp| 29 +- .../sanitizer_common/sanitizer_mutex.cpp | 186 ++ .../sanitizer_common/sanitizer_mutex.h| 325 +-- .../sanitizer_common/sanitizer_platform.h | 25 +- .../sanitizer_platform_interceptors.h | 27 +- .../sanitizer_platform_limits_freebsd.cpp |4 + .../sanitizer_platform_limits_freebsd.h | 164 +- .../sanitizer_platform_limits_linux.cpp | 56 +- .../sanitizer_platform_limits_netbsd.cpp |1 + .../sanitizer_platform_limits_netbsd.h|1 + .../sanitizer_platform_limits_posix.cpp | 25 +- .../sanitizer_platform_limits_posix.h | 30 +- 
.../sanitizer_platform_limits_solaris.cpp |1 + .../sanitizer_platform_limits_solaris.h |1 + .../sanitizer_co
[PATCH 2/4] libsanitizer: Apply local patches
--- libsanitizer/asan/asan_globals.cpp| 19 -- libsanitizer/asan/asan_interceptors.h | 7 ++- libsanitizer/asan/asan_mapping.h | 2 +- .../sanitizer_linux_libcdep.cpp | 4 .../sanitizer_common/sanitizer_mac.cpp| 12 +-- libsanitizer/sanitizer_common/sanitizer_mac.h | 20 +++ .../sanitizer_platform_limits_linux.cpp | 5 - .../sanitizer_platform_limits_posix.h | 2 +- .../sanitizer_common/sanitizer_stacktrace.cpp | 17 +++- libsanitizer/tsan/tsan_rtl_ppc64.S| 1 + libsanitizer/ubsan/ubsan_flags.cpp| 1 + libsanitizer/ubsan/ubsan_handlers.cpp | 15 ++ libsanitizer/ubsan/ubsan_handlers.h | 8 libsanitizer/ubsan/ubsan_platform.h | 2 ++ 14 files changed, 85 insertions(+), 30 deletions(-) diff --git a/libsanitizer/asan/asan_globals.cpp b/libsanitizer/asan/asan_globals.cpp index 9bf378f6207..763d3c6d2c0 100644 --- a/libsanitizer/asan/asan_globals.cpp +++ b/libsanitizer/asan/asan_globals.cpp @@ -154,23 +154,6 @@ static void CheckODRViolationViaIndicator(const Global *g) { } } -// Check ODR violation for given global G by checking if it's already poisoned. -// We use this method in case compiler doesn't use private aliases for global -// variables. -static void CheckODRViolationViaPoisoning(const Global *g) { - if (__asan_region_is_poisoned(g->beg, g->size_with_redzone)) { -// This check may not be enough: if the first global is much larger -// the entire redzone of the second global may be within the first global. -for (ListOfGlobals *l = list_of_all_globals; l; l = l->next) { - if (g->beg == l->g->beg && - (flags()->detect_odr_violation >= 2 || g->size != l->g->size) && - !IsODRViolationSuppressed(g->name)) -ReportODRViolation(g, FindRegistrationSite(g), - l->g, FindRegistrationSite(l->g)); -} - } -} - // Clang provides two different ways for global variables protection: // it can poison the global itself or its private alias. 
In former // case we may poison same symbol multiple times, that can help us to @@ -216,8 +199,6 @@ static void RegisterGlobal(const Global *g) { // where two globals with the same name are defined in different modules. if (UseODRIndicator(g)) CheckODRViolationViaIndicator(g); -else - CheckODRViolationViaPoisoning(g); } if (CanPoisonMemory()) PoisonRedZones(*g); diff --git a/libsanitizer/asan/asan_interceptors.h b/libsanitizer/asan/asan_interceptors.h index 047b044c8bf..105c672cc24 100644 --- a/libsanitizer/asan/asan_interceptors.h +++ b/libsanitizer/asan/asan_interceptors.h @@ -81,7 +81,12 @@ void InitializePlatformInterceptors(); #if ASAN_HAS_EXCEPTIONS && !SANITIZER_WINDOWS && !SANITIZER_SOLARIS && \ !SANITIZER_NETBSD # define ASAN_INTERCEPT___CXA_THROW 1 -# define ASAN_INTERCEPT___CXA_RETHROW_PRIMARY_EXCEPTION 1 +# if ! defined(ASAN_HAS_CXA_RETHROW_PRIMARY_EXCEPTION) \ + || ASAN_HAS_CXA_RETHROW_PRIMARY_EXCEPTION +# define ASAN_INTERCEPT___CXA_RETHROW_PRIMARY_EXCEPTION 1 +# else +# define ASAN_INTERCEPT___CXA_RETHROW_PRIMARY_EXCEPTION 0 +# endif # if defined(_GLIBCXX_SJLJ_EXCEPTIONS) || (SANITIZER_IOS && defined(__arm__)) # define ASAN_INTERCEPT__UNWIND_SJLJ_RAISEEXCEPTION 1 # else diff --git a/libsanitizer/asan/asan_mapping.h b/libsanitizer/asan/asan_mapping.h index e5a7f2007ae..4b0037fced3 100644 --- a/libsanitizer/asan/asan_mapping.h +++ b/libsanitizer/asan/asan_mapping.h @@ -165,7 +165,7 @@ static const u64 kAArch64_ShadowOffset64 = 1ULL << 36; static const u64 kRiscv64_ShadowOffset64 = 0xd; static const u64 kMIPS32_ShadowOffset32 = 0x0aaa; static const u64 kMIPS64_ShadowOffset64 = 1ULL << 37; -static const u64 kPPC64_ShadowOffset64 = 1ULL << 44; +static const u64 kPPC64_ShadowOffset64 = 1ULL << 41; static const u64 kSystemZ_ShadowOffset64 = 1ULL << 52; static const u64 kSPARC64_ShadowOffset64 = 1ULL << 43; // 0x800 static const u64 kFreeBSD_ShadowOffset32 = 1ULL << 30; // 0x4000 diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp 
b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp index 7ce9e25da34..fc5619e4b37 100644 --- a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp +++ b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp @@ -759,9 +759,13 @@ u32 GetNumberOfCPUs() { #elif SANITIZER_SOLARIS return sysconf(_SC_NPROCESSORS_ONLN); #else +#if defined(CPU_COUNT) cpu_set_t CPUs; CHECK_EQ(sched_getaffinity(0, sizeof(cpu_set_t), &CPUs), 0); return CPU_COUNT(&CPUs); +#else + return 1; +#endif #endif } diff --git a/libsanitizer/sanitizer_common/sanitizer_mac.cpp b/libsanitizer/sanitizer_common/sanitizer_mac.cpp index b8839f197d2..fa077a129c2 100644 --- a/libsanitizer/sanitizer_common/sanitizer_mac.cpp +++ b/libsanitizer/sanitizer_common/sa
Re: [PATCH] c++: unifying equal NONTYPE_ARGUMENT_PACKs [PR102547]
On 10/1/21 10:26, Patrick Palka wrote:
> On Fri, 1 Oct 2021, Jason Merrill wrote:
>
>> On 10/1/21 09:46, Patrick Palka wrote:
>>> Here during partial ordering of the two partial specializations we end
>>> up in unify with parm=arg=NONTYPE_ARGUMENT_PACK, and crash shortly
>>> thereafter because uses_template_parms calls potential_constant_expression
>>> which doesn't handle NONTYPE_ARGUMENT_PACK.
>>>
>>> This patch fixes this by checking dependent_template_arg_p instead of
>>> uses_template_parms when parm==arg, which does handle
>>> NONTYPE_ARGUMENT_PACK.
>>> We could also perhaps fix uses_template_parms / inst_dep_expr_p to better
>>> handle NONTYPE_ARGUMENT_PACK,
>>
>> Please.
>
> Sounds good, like the following then? Passes light testing, bootstrap
> and regtest on progress.
>
> -- >8 --
>
> 	PR c++/102547
>
> gcc/cp/ChangeLog:
>
> 	* pt.c (instantiation_dependent_expression_p): Sidestep checking
> 	potential_constant_expression on NONTYPE_ARGUMENT_PACK.
>
> gcc/testsuite/ChangeLog:
>
> 	* g++.dg/cpp0x/variadic-partial2.C: New test.
> 	* g++.dg/cpp0x/variadic-partial2a.C: New test.
> ---
>  gcc/cp/pt.c | 4 +++-
>  .../g++.dg/cpp0x/variadic-partial2.C | 16 ++
>  .../g++.dg/cpp0x/variadic-partial2a.C | 22 +++
>  3 files changed, 41 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C
>
> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index 1dcdffe322a..643204103c5 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -27705,7 +27705,9 @@ instantiation_dependent_expression_p (tree expression)
>  {
>    return (instantiation_dependent_uneval_expression_p (expression)
>  	  || (processing_template_decl
> -	      && potential_constant_expression (expression)
> +	      && expression != NULL_TREE
> +	      && (TREE_CODE (expression) == NONTYPE_ARGUMENT_PACK
> +		  || potential_constant_expression (expression))

I'd prefer to loop over the elements of the pack, either here or (probably
better) in potential_constant_expression.

>  	      && value_dependent_expression_p (expression)));
>  }
> diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C
> new file mode 100644
> index 000..df61f26a3c1
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C
> @@ -0,0 +1,16 @@
> +// PR c++/102547
> +// { dg-do compile { target c++11 } }
> +
> +template
> +struct vals { };
> +
> +template
> +struct vals_client { };
> +
> +template
> +struct vals_client, T> { };
> +
> +template
> +struct vals_client, void> { };
> +
> +template struct vals_client, void>; //- "sorry, unimplemented..., ICE"
> diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C
> new file mode 100644
> index 000..cc0ea488ad3
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C
> @@ -0,0 +1,22 @@
> +// PR c++/102547
> +// { dg-do compile { target c++11 } }
> +// A version of variadic-partial2.C where the partial ordering is performed
> +// on function templates instead of class templates.
> +
> +template
> +struct vals { };
> +
> +template
> +void f(V, T) { };
> +
> +template
> +void f(vals, T) { };
> +
> +template
> +void f(vals, char) { };
> +
> +template void f(vals<1, 2>, char); //- "sorry, unimplemented..., ICE"
> +
> +int main() {
> +  f(vals<1, 3>{}, 'a'); //- "sorry, unimplemented..., ICE"
> +}
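The archive has dropped the template parameter lists from the quoted tests, so the following is not the committed testcase. It is a minimal sketch of the same kind of code, assuming a non-type parameter pack for `vals` and two partial specializations of `vals_client` that must be ordered against each other. The parameter names and the `which` members are added here purely for illustration, so the selected specialization can be checked at compile time.

```cpp
#include <cassert>

// Assumed shape: a class template taking a pack of non-type arguments.
template <int... Vs>
struct vals {};

// Primary template.
template <class V, class T>
struct vals_client {
  static constexpr int which = 0;
};

// Partial specialization #1: any two-element vals, any T.
template <int V0, int V1, class T>
struct vals_client<vals<V0, V1>, T> {
  static constexpr int which = 1;
};

// Partial specialization #2: any two-element vals, T fixed to void.
// When both #1 and #2 match, partial ordering has to pick #2 as the more
// specialized one, which means unifying the vals<...> arguments of the two
// specializations against each other.
template <int V0, int V1>
struct vals_client<vals<V0, V1>, void> {
  static constexpr int which = 2;
};

static_assert(vals_client<int, void>::which == 0, "primary template");
static_assert(vals_client<vals<1, 2>, int>::which == 1, "only #1 matches");
static_assert(vals_client<vals<1, 2>, void>::which == 2, "#2 wins ordering");
```

A conforming compiler accepts this and selects #2 for `vals_client<vals<1, 2>, void>`.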
Re: [PATCH] c++: Implement C++20 -Wdeprecated-array-compare [PR97573]
On Fri, Oct 01, 2021 at 09:16:53AM -0600, Martin Sebor wrote:
> On 9/30/21 8:50 AM, Marek Polacek via Gcc-patches wrote:
> > This patch addresses one of my leftovers from GCC 11. C++20 introduced
> > [depr.array.comp]:
> > "Equality and relational comparisons between two operands of array type are
> > deprecated."
> > so this patch adds -Wdeprecated-array-compare (enabled by default in C++20).
>
> A warning like this would be useful in C as well even though there
> array equality is not deprecated (though relational expressions
> involving distinct objects are undefined). Recently, while working
> on my -Waddress enhancement to "warn for more impossible null
> pointer tests, I noticed Clang warns for some of these equality
> tests in both languages (it issues -Wtautological-compare).

I'll look into adding this warning to the C FE; it should be trivial.

> Rather that referring to deprecation, if one is necessary, I would
> suggest to choose a name for the option that reflects the problem
> the warning (and presumably the deprecation in C++) tries to prevent.
> That said, since GCC already has both -Waddress and -Wtautological-
> compare for these problems, the warning could be issued under either
> of these.

In my previous email I suggested -Warray-compare -- I wanted to avoid the
"deprecated" part outside C++20. I also noticed the -Wtautological warning
clang emits but I don't have time to look into it. It probably won't
warn for arrays declared with __attribute__((weak)) so -Warray-compare still
makes sense.

> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> >
> > 	PR c++/97573
> >
> > gcc/c-family/ChangeLog:
> >
> > 	* c-opts.c (c_common_post_options): In C++20, turn on
> > 	-Wdeprecated-array-compare.
> > 	* c.opt (Wdeprecated-array-compare): New option.
> >
> > gcc/cp/ChangeLog:
> >
> > 	* typeck.c (do_warn_deprecated_array_compare): New.
> > 	(cp_build_binary_op): Call it for equality and relational comparisons.
> > > > gcc/ChangeLog: > > > > * doc/invoke.texi: Document -Wdeprecated-array-compare. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/tree-ssa/pr15791-1.C: Add dg-warning. > > * g++.dg/cpp2a/array-comp1.C: New test. > > * g++.dg/cpp2a/array-comp2.C: New test. > > * g++.dg/cpp2a/array-comp3.C: New test. > > --- > > gcc/c-family/c-opts.c | 5 > > gcc/c-family/c.opt| 4 +++ > > gcc/cp/typeck.c | 28 +++ > > gcc/doc/invoke.texi | 19 - > > gcc/testsuite/g++.dg/cpp2a/array-comp1.C | 34 +++ > > gcc/testsuite/g++.dg/cpp2a/array-comp2.C | 31 + > > gcc/testsuite/g++.dg/cpp2a/array-comp3.C | 29 +++ > > gcc/testsuite/g++.dg/tree-ssa/pr15791-1.C | 2 +- > > 8 files changed, 150 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp1.C > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp2.C > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/array-comp3.C > > > > diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c > > index 3eaab5e1530..00b52cc5e12 100644 > > --- a/gcc/c-family/c-opts.c > > +++ b/gcc/c-family/c-opts.c > > @@ -962,6 +962,11 @@ c_common_post_options (const char **pfilename) > >warn_deprecated_enum_float_conv, > >cxx_dialect >= cxx20 && warn_deprecated); > > + /* -Wdeprecated-array-compare is enabled by default in C++20. */ > > + SET_OPTION_IF_UNSET (&global_options, &global_options_set, > > + warn_deprecated_array_compare, > > + cxx_dialect >= cxx20 && warn_deprecated); > > + > > /* Declone C++ 'structors if -Os. 
*/ > > if (flag_declone_ctor_dtor == -1) > > flag_declone_ctor_dtor = optimize_size; > > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > > index 9c151d19870..a4f0ea68594 100644 > > --- a/gcc/c-family/c.opt > > +++ b/gcc/c-family/c.opt > > @@ -540,6 +540,10 @@ Wdeprecated > > C C++ ObjC ObjC++ CPP(cpp_warn_deprecated) CppReason(CPP_W_DEPRECATED) > > ; Documented in common.opt > > +Wdeprecated-array-compare > > +C++ ObjC++ Var(warn_deprecated_array_compare) Warning > > +Warn about deprecated comparisons between two operands of array type. > > + > > Wdeprecated-copy > > C++ ObjC++ Var(warn_deprecated_copy) Warning LangEnabledBy(C++ ObjC++, > > Wextra) > > Mark implicitly-declared copy operations as deprecated if the class has a > > diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c > > index a2398dbe660..1e3a41104d6 100644 > > --- a/gcc/cp/typeck.c > > +++ b/gcc/cp/typeck.c > > @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. If not see > > #include "attribs.h" > > #include "asan.h" > > #include "gimplify.h" > > +#include "tree-pretty-print.h" > > static tree cp_build_addr_expr_strict (tree, tsubst_flags_t); > > static tree cp_build_fun
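To illustrate why [depr.array.comp] deprecates these comparisons: both operands decay to pointers, so `a == b` on two arrays compares addresses and says nothing about the elements. The sketch below spells out both meanings explicitly so it compiles cleanly in any dialect. `same_array` and `arrays_equal` are illustrative names and are not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// What the deprecated `lhs == rhs` on two arrays actually computes: after
// array-to-pointer decay it compares the addresses of the first elements.
template <class T, std::size_t N>
bool same_array(const T (&lhs)[N], const T (&rhs)[N]) {
  return &lhs[0] == &rhs[0];
}

// What is usually intended: element-wise equality of two arrays of the
// same length.
template <class T, std::size_t N>
bool arrays_equal(const T (&lhs)[N], const T (&rhs)[N]) {
  return std::equal(std::begin(lhs), std::end(lhs), std::begin(rhs));
}
```

Two distinct arrays with identical contents give false from `same_array` and true from `arrays_equal`, which is the confusion a warning on the bare `a == b` spelling is meant to catch.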
Re: [PATCH] libiberty: testsuite: add missing format on d-demangle-expected
On Wed, Sep 29, 2021 at 5:51 PM Luís Ferreira wrote:
>
> This patch adds a missing format parameter that prevents d-demangle-expected
> test collection from running successfully.
>
> Signed-off-by: Luís Ferreira
> ---
>  libiberty/testsuite/d-demangle-expected | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/libiberty/testsuite/d-demangle-expected b/libiberty/testsuite/d-demangle-expected
> index 799f4724b72..44a3649c429 100644
> --- a/libiberty/testsuite/d-demangle-expected
> +++ b/libiberty/testsuite/d-demangle-expected
> @@ -991,6 +991,7 @@ _D88
>  _D5__T1aZv
>  _D5__T1aZv
>  #
> +--format=dlang
>  _D00
>  _D00
>  #
> --
> 2.33.0
>

This counts as an obvious fix. Please check it in.

Thanks.

--
H.J.
Re: [PATCH] c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]
On 10/1/21 11:10, Nick Huang wrote:
>> gcc-verify still fails with this version:
>>
>> ERR: line should start with a tab: "PR c++/101783"
>> ERR: line should start with a tab: "* tree.c (cp_build_qualified_type_real): Excluding typedef from error"
>> ERR: line should start with a tab: "PR c++/101783"
>> ERR: line should start with a tab: "* g++.dg/parse/pr101783.C: New test."
>>
>> It might work better to attach the output of git format-patch.
>
> Sorry for my clumsy copy/paste from git commit message. I now attach
> git format-patch output file as attachment. Also maybe for a little
> convenience of your work, I also attach the original commit message
> file when I do git commit -F.

Thanks, but that isn't necessary; it should be the same in the
format-patch output, except...

> From e592a475030d99647de736d294cb3c6a7588af49 Mon Sep 17 00:00:00 2001
> From: qingzhe huang
> Date: Fri, 1 Oct 2021 10:46:35 -0400
> Subject: [PATCH] The root cause of this bug is that it considers reference with
>  cv-qualifiers as an error by generating value for variable "bad_quals".
>  However, this is not correct for case of typedef. Here I quote spec
>  [dcl.ref]/1 : "Cv-qualified references are ill-formed except when the
>  cv-qualifiers are introduced through the use of a typedef-name
>  ([dcl.typedef], [temp.param]) or decltype-specifier ([dcl.type.decltype]),
>  in which case the cv-qualifiers are ignored."

...the subject line for the commit should be the first line of the
commit message, followed by a blank line, followed by the description of
the patch; without the subject line, git format-patch thought your whole
description was the subject of the patch.

I've corrected this and pushed the patch, thanks!

Jason
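For reference, the [dcl.ref]/1 rule quoted in the commit message can be checked with a few static assertions. This is a minimal sketch and not the pr101783.C testcase from the patch. The names `int_ref`, `add_const_via_alias` and `bump` are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

typedef int &int_ref;

// `const` applied through a typedef-name naming a reference type is
// ignored, so the resulting type is still plain int&.
static_assert(std::is_same<const int_ref, int &>::value,
              "cv-qualifier on a typedef'd reference is ignored");

// The same holds when the reference type arrives via a template parameter.
template <class T>
struct add_const_via_alias {
  typedef const T type;
};

static_assert(std::is_same<add_const_via_alias<int &>::type, int &>::value,
              "cv-qualifier on a reference-typed parameter is ignored");

// Because the const is dropped, the parameter is an ordinary int& and the
// function may modify the referenced object.
int bump(const int_ref r) { return ++r; }
```

Writing `int & const r` directly is ill-formed. Only qualifiers introduced through a typedef-name, a template parameter or a decltype-specifier are silently dropped.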
Re: [RFC][Patch][middle-end/PR102359]Not add initialization for READONLY variables with -ftrivial-auto-var-init
> On Oct 1, 2021, at 10:33 AM, Jason Merrill wrote: > > On 10/1/21 10:54, Qing Zhao wrote: >>> On Sep 30, 2021, at 2:31 PM, Jason Merrill wrote: >>> >>> On 9/30/21 11:42, Qing Zhao wrote: > On Sep 30, 2021, at 1:54 AM, Richard Biener wrote: > > On Thu, 30 Sep 2021, Jason Merrill wrote: > >> On 9/29/21 17:30, Qing Zhao wrote: >>> Hi, >>> >>> PR102359 (ICE gimplification failed since r12-3433-ga25e0b5e6ac8a77a) >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102359 >>> >>> Is due to -ftrivial-auto-var-init adding initialization for READONLY >>> variable “this” in the following routine: (t.cpp.005t.original) >>> >>> === >>> >>> ;; Function A::foo():: (null) >>> ;; enabled by -tree-original >>> >>> { >>> const struct A * const this [value-expr: &__closure->__this]; >>>const struct A * const this [value-expr: &__closure->__this]; >>> return = (double) ((const struct A *) this)->a; >>> } >>> === >>> >>> However, in the above routine, “this” is NOT marked as READONLY, but its >>> value-expr "&__closure->__this” is marked as READONLY. >>> >>> There are two major issues: >>> >>> 1. In the routine “is_var_need_auto_init”, we should exclude “decl” >>> that is >>> marked as READONLY; >>> 2. In the C++ FE, “this” should be marked as READONLY. >>> >>> The idea solution will be: >>> >>> 1. Fix “is_var_need_auto_init” to exclude TREE_READONLY (decl); >>> 2. Fix C++ FE to mark “this” as TREE_READONLY (decl)==true; >>> >>> Not sure whether it’s hard for C++ FE to fix the 2nd issue or not? >>> >>> In the case it’s not a quick fix in C++FE, I proposed the following fix >>> in >>> middle end: >>> >>> Let me know your comments or suggestions on this. >>> >>> Thanks a lot for the help. >> >> I'd think is_var_need_auto_init should be false for any variable with >> DECL_HAS_VALUE_EXPR_P, as they aren't really variables, just ways of >> naming >> objects that are initialized elsewhere. 
> > IIRC handing variables with DECL_HAS_VALUE_EXPR_P is necessary to > auto-init VLAs, otherwise I tend to agree - would we handle those > when we see a DECL_EXPR then? The current implementation is: gimplify_decl_expr: For each DECL_EXPR “decl” If (VAR_P (decl) && !DECL_EXTERNAL (decl)) { if (is_vla (decl)) gimplify_vla_decl (decl, …); /* existing handling: create a VALUE_EXPR for this vla decl*/ … if (has_explicit_init (decl)) { …; /* existing handling. */ } else if (is_var_need_auto_init (decl)) /*. New code. */ { gimple_add_init_for_auto_var (….); /* new code. */ ... } } Since the “DECL_VALUE_EXPR (decl)” is NOT a DECL_EXPR, it will not be scanned and added initialization. if we do not add initialization for a decl that has DECL_VALUE_EXPR, then the “DECL_VALUE_EXPR (decl)” will not be added an initialization either. We will miss adding initializations for these decls. So, I think that the current implementation is correct. And if C++ FE will not mark “this” as READONLY, only mark DECL_VALUE_EXPR(this) as READONLY, the proposed fix is correct too. Let me know your opinion on this. >>> >>> The problem with this test is not whether the 'this' proxy is marked >>> READONLY, the problem is that you're trying to initialize lambda capture >>> proxies at all; the lambda capture objects were already initialized when >>> forming the closure object. So this test currently aborts with >>> -ftrivial-auto-var-init=zero because you "initialize" the i capture field >>> to 0 after it was previously initialized to 42: >>> >>> int main() >>> { >>> int i = 42; >>> auto l = [=]() mutable { return i; }; >>> if (l() != i) >>>__builtin_abort (); >>> } >>> >>> I believe the same issue applies to the proxy variables in coroutines that >>> work much like lambdas. > >> So, how should the middle end determine that a variable is “proxy variable”? > > In the front end, is_capture_proxy will identify a lambda capture proxy > variable. 
But that won't be true for the similar proxies used by coroutines. Does this mean that in middle end, especially in gimplification phase, there is Not a simple way to determine whether a variable is a proxy variable? > >> Have all “proxy variables” been initialized by C++ FE already? > > Yes. > >>> You can't just assume that a VAR_DECL with DECL_VALUE_EXPR is uninitialized. >> So, all the VAR_DECLs with DECL_VALUE_EXPR (except the ones created by >> “gimplify_decl_expr”) are initialized by FE already? > > In general I'd expect them to refer to previously created objects which may > or may not have been initiali
[PATCH 00/11] OpenMP: Deep struct dereferences
This is a series of patches to support deep struct dereferences for OpenMP 5.0
(i.e. with multiple arrow operators, "a->b[foo]->c[lo:hi]"). Apart from a couple
of general bug fixes, the main parts of this comprise:

 1. Topological sorting of OMP clauses by base pointer dependencies.

 2. Hoisting of struct sibling list handling out of gimplify_scan_omp_clauses.

These patches replace and continue from the last part of the previously-posted
series:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577219.html

and (still) depend on parts 1 through 7 of that series.

The patches have been bootstrapped & regression tested individually with
offloading to NVPTX (though some prior to the most recent rebase).

OK?

Thanks,

Julian

Julian Brown (11):
  libgomp: Release device lock on cbuf error path
  Remove base_ind/base_ref handling from extract_base_bit_offset
  OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer)
  Remove omp_target_reorder_clauses
  OpenMP/OpenACC: Hoist struct sibling list handling in gimplification
  OpenMP: Allow array ref components for C & C++
  OpenMP: Fix non-zero attach/detach bias for struct dereferences
  Not for committing: noisy topological sorting output
  Not for committing: noisy sibling-list handling output
  Not for committing: noisy mapping-group taxonomy
  OpenMP/OpenACC: [WIP] Add gcc_unreachable to apparently-dead path in build_struct_comp_nodes

 gcc/c-family/c-common.h                       |    1 +
 gcc/c-family/c-omp.c                          |   42 +
 gcc/c/c-typeck.c                              |   15 +-
 gcc/cp/semantics.c                            |   17 +-
 gcc/gimplify.c                                | 2479 -
 gcc/omp-low.c                                 |    7 +-
 gcc/testsuite/g++.dg/goacc/member-array-acc.C |    2 +-
 gcc/testsuite/g++.dg/gomp/target-3.C          |    4 +-
 gcc/testsuite/g++.dg/gomp/target-lambda-1.C   |    6 +-
 gcc/testsuite/g++.dg/gomp/target-this-2.C     |    2 +-
 gcc/testsuite/g++.dg/gomp/target-this-3.C     |    4 +-
 gcc/testsuite/g++.dg/gomp/target-this-4.C     |    4 +-
 libgomp/target.c                              |    5 +-
 libgomp/testsuite/libgomp.c++/baseptrs-3.C    |  182 ++
 .../libgomp.c-c++-common/baseptrs-1.c         |   50 +
 .../libgomp.c-c++-common/baseptrs-2.c         |   70 +
 16 files changed, 2151 insertions(+), 739 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c++/baseptrs-3.C
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/baseptrs-1.c
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/baseptrs-2.c

--
2.29.2
[PATCH 01/11] libgomp: Release device lock on cbuf error path
This patch releases the device lock on a sanity-checking error path in
transfer combining (cbuf) handling in libgomp:target.c. This shouldn't happen
when handling well-formed mapping clauses, but erroneous clauses can currently
cause a hang if the condition triggers.

Tested with offloading to NVPTX. OK?

2021-09-29  Julian Brown

libgomp/
        * target.c (gomp_copy_host2dev): Release device lock on cbuf
        error path.
---
 libgomp/target.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgomp/target.c b/libgomp/target.c
index 65bb40100e5..84c6fdf2c47 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -385,7 +385,10 @@ gomp_copy_host2dev (struct gomp_device_descr *devicep,
           else if (cbuf->chunks[middle].start <= doff)
             {
               if (doff + sz > cbuf->chunks[middle].end)
-                gomp_fatal ("internal libgomp cbuf error");
+                {
+                  gomp_mutex_unlock (&devicep->lock);
+                  gomp_fatal ("internal libgomp cbuf error");
+                }
               memcpy ((char *) cbuf->buf + (doff - cbuf->chunks[0].start),
                       h, sz);
               return;
--
2.29.2
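The pattern in this patch (drop the lock first, then report the fatal condition) can be sketched outside libgomp roughly as follows. This is only an illustration under stated assumptions: Chunk, copy_into_chunk and the two helper functions are hypothetical names, std::mutex stands in for the device lock, and an exception stands in for gomp_fatal so that the state of the lock can be observed afterwards. The extra lower-bound check is also an addition of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for one chunk of a transfer-combining buffer.
struct Chunk
{
  std::size_t start;
  std::size_t end;
};

// Copy SZ bytes from SRC into BUF at device offset DOFF.  HELD is the lock
// the caller already owns.  If the sanity check fails, the lock is released
// first and only then is the error raised, mirroring the order
// "unlock, then gomp_fatal" used in the patch.
void
copy_into_chunk (std::unique_lock<std::mutex> &held, std::vector<char> &buf,
                 const Chunk &chunk, std::size_t doff, const void *src,
                 std::size_t sz)
{
  assert (held.owns_lock ());
  if (doff < chunk.start || doff + sz > chunk.end)
    {
      held.unlock ();
      throw std::runtime_error ("internal cbuf error");
    }
  std::memcpy (buf.data () + (doff - chunk.start), src, sz);
}

// Returns true if the lock is no longer held after a failing copy.
bool
lock_released_after_error ()
{
  std::mutex device_lock;
  std::unique_lock<std::mutex> held (device_lock);
  std::vector<char> buf (8);
  const Chunk chunk = {0, 8};
  const char data[16] = {};
  try
    {
      // 4 + 16 runs past the end of the 8-byte chunk.
      copy_into_chunk (held, buf, chunk, 4, data, sizeof data);
    }
  catch (const std::runtime_error &)
    {
      return !held.owns_lock ();
    }
  return false;
}

// Returns true if a well-formed copy succeeds and leaves the lock held.
bool
lock_kept_after_success ()
{
  std::mutex device_lock;
  std::unique_lock<std::mutex> held (device_lock);
  std::vector<char> buf (8);
  const Chunk chunk = {0, 8};
  const char data[4] = {'a', 'b', 'c', 'd'};
  copy_into_chunk (held, buf, chunk, 2, data, sizeof data);
  return held.owns_lock () && buf[2] == 'a' && buf[5] == 'd';
}
```

Calling lock_released_after_error () and lock_kept_after_success () exercises both paths. The point of the ordering is that whatever runs as a consequence of the error can still take the lock, which is presumably what the hang mentioned above comes down to.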
[PATCH 02/11] Remove base_ind/base_ref handling from extract_base_bit_offset
In preparation for follow-up patches extending struct dereference handling for OpenMP, this patch removes base_ind/base_ref handling from gimplify.c:extract_base_bit_offset. This arguably simplifies some of the code around the callers of the function also, though subsequent patches modify those parts further. OK for mainline? Thanks, Julian 2021-09-29 Julian Brown gcc/ * gimplify.c (extract_base_bit_offset): Remove BASE_IND, BASE_REF and OPENMP parameters. (strip_indirections): New function. (build_struct_group): Update calls to extract_base_bit_offset. Rearrange indirect/reference handling accordingly. Use extracted base instead of passed-in decl when grouping component accesses together. --- gcc/gimplify.c | 109 ++--- 1 file changed, 57 insertions(+), 52 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 92f8a7b4073..ece22b7a4ae 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -8641,9 +8641,8 @@ build_struct_comp_nodes (enum tree_code code, tree grp_start, tree grp_end, has array type, else return NULL. 
*/ static tree -extract_base_bit_offset (tree base, tree *base_ind, tree *base_ref, -poly_int64 *bitposp, poly_offset_int *poffsetp, -tree *offsetp, bool openmp) +extract_base_bit_offset (tree base, poly_int64 *bitposp, +poly_offset_int *poffsetp, tree *offsetp) { tree offset; poly_int64 bitsize, bitpos; @@ -8651,38 +8650,12 @@ extract_base_bit_offset (tree base, tree *base_ind, tree *base_ref, int unsignedp, reversep, volatilep = 0; poly_offset_int poffset; - if (base_ind) -*base_ind = NULL_TREE; - - if (base_ref) -*base_ref = NULL_TREE; + STRIP_NOPS (base); base = get_inner_reference (base, &bitsize, &bitpos, &offset, &mode, &unsignedp, &reversep, &volatilep); - if (!openmp - && (TREE_CODE (base) == INDIRECT_REF - || (TREE_CODE (base) == MEM_REF - && integer_zerop (TREE_OPERAND (base, 1 - && TREE_CODE (TREE_TYPE (TREE_OPERAND (base, 0))) == POINTER_TYPE) -{ - if (base_ind) - *base_ind = base; - base = TREE_OPERAND (base, 0); -} - if ((TREE_CODE (base) == INDIRECT_REF - || (TREE_CODE (base) == MEM_REF - && integer_zerop (TREE_OPERAND (base, 1 - && DECL_P (TREE_OPERAND (base, 0)) - && TREE_CODE (TREE_TYPE (TREE_OPERAND (base, 0))) == REFERENCE_TYPE) -{ - if (base_ref) - *base_ref = base; - base = TREE_OPERAND (base, 0); -} - - if (!openmp) -STRIP_NOPS (base); + STRIP_NOPS (base); if (offset && poly_int_tree_p (offset)) { @@ -8739,6 +8712,17 @@ strip_components_and_deref (tree expr) return expr; } +static tree +strip_indirections (tree expr) +{ + while (TREE_CODE (expr) == INDIRECT_REF +|| (TREE_CODE (expr) == MEM_REF +&& integer_zerop (TREE_OPERAND (expr, 1 +expr = TREE_OPERAND (expr, 0); + + return expr; +} + /* Return TRUE if EXPR is something we will use as the base of an aggregate access, either: @@ -9232,7 +9216,7 @@ build_struct_group (struct gimplify_omp_ctx *ctx, { poly_offset_int coffset; poly_int64 cbitpos; - tree base_ind, base_ref, tree_coffset; + tree tree_coffset; tree ocd = OMP_CLAUSE_DECL (c); bool openmp = !(region_type & ORT_ACC); @@ -9242,10 
+9226,25 @@ build_struct_group (struct gimplify_omp_ctx *ctx, if (TREE_CODE (ocd) == INDIRECT_REF) ocd = TREE_OPERAND (ocd, 0); - tree base = extract_base_bit_offset (ocd, &base_ind, &base_ref, &cbitpos, - &coffset, &tree_coffset, openmp); + tree base = extract_base_bit_offset (ocd, &cbitpos, &coffset, &tree_coffset); + tree sbase; - bool do_map_struct = (base == decl && !tree_coffset); + if (openmp) +{ + if (TREE_CODE (base) == INDIRECT_REF + && TREE_CODE (TREE_TYPE (TREE_OPERAND (base, 0))) == REFERENCE_TYPE) + sbase = strip_indirections (base); + else + sbase = base; +} + else +{ + sbase = strip_indirections (base); + + STRIP_NOPS (sbase); +} + + bool do_map_struct = (sbase == decl && !tree_coffset); /* Here, DECL is usually a DECL_P, unless we have chained indirect member accesses, e.g. mystruct->a->b. In that case it'll be the "mystruct->a" @@ -9305,19 +9304,12 @@ build_struct_group (struct gimplify_omp_ctx *ctx, OMP_CLAUSE_SET_MAP_KIND (l, k); - if (!openmp && base_ind) - OMP_CLAUSE_DECL (l) = unshare_expr (base_ind); - else if (base_ref) - OMP_CLAUSE_DECL (l) = unshare_expr (base_ref); - else - { - OMP_CLAUSE_DECL (l) = unshare_expr (decl); - if (openmp - && !DECL_P (OMP_CLAUSE_DECL (l)) - && (gimplify_expr (&OMP_CLAUSE_DECL (l), pre_p, NULL, -is_gimple_lvalue,
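As a reading aid for the new strip_indirections helper in the diff above, here is a toy version over a made-up expression node. It is a sketch only: Expr and Code are hypothetical stand-ins for GCC's tree and tree codes, and the offset field plays the role of the integer_zerop test on a MEM_REF's second operand.

```cpp
#include <cassert>

// Hypothetical stand-in for a handful of GCC tree codes.
enum class Code { Decl, IndirectRef, MemRef };

// Hypothetical stand-in for a GCC tree node.
struct Expr
{
  Code code;
  const Expr *op0;  // operand 0; null for a Decl
  long offset;      // only meaningful for MemRef
};

// Peel off INDIRECT_REFs and zero-offset MEM_REFs, returning the first
// expression underneath that is neither.
const Expr *
strip_indirections (const Expr *expr)
{
  assert (expr != nullptr);
  while (expr->code == Code::IndirectRef
         || (expr->code == Code::MemRef && expr->offset == 0))
    expr = expr->op0;
  return expr;
}

// p, *p, MEM[*p + 0] and MEM[p + 8], built bottom-up.
static const Expr p_decl = {Code::Decl, nullptr, 0};
static const Expr deref = {Code::IndirectRef, &p_decl, 0};
static const Expr mem0 = {Code::MemRef, &deref, 0};
static const Expr mem8 = {Code::MemRef, &p_decl, 8};
```

Here strip_indirections (&mem0) walks down to &p_decl, while strip_indirections (&mem8) returns &mem8 unchanged, because a MEM_REF with a non-zero offset is not treated as a plain indirection.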
[PATCH 03/11] OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer)
This patch reimplements the omp_target_reorder_clauses function in
anticipation of supporting "deeper" struct mappings (that is, with several
structure dereference operators, or similar).

The idea is that in place of the (possibly quadratic) algorithm in
omp_target_reorder_clauses that greedily moves clauses containing addresses
that are subexpressions of other addresses before those other addresses, we
employ a topological sort algorithm to calculate a proper order for map
clauses. This should run in linear time, and hopefully handles degenerate
cases where multiple "levels" of indirect accesses are present on a given
directive.

The new method also takes care to keep clause groups together, addressing the
concerns raised in:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570501.html

To figure out if some given clause depends on a base pointer in another
clause, we strip off the outer layers of the address expression, and check
(via a tree_operand_hash hash table we have built) if the result is a "base
pointer" as defined in OpenMP 5.0 (1.2.6 Data Terminology). There are some
subtleties involved, however:

 - We must treat MEM_REF with zero offset the same as INDIRECT_REF. This
   should probably be fixed in the front ends instead so we always use a
   canonical form (probably INDIRECT_REF). The following patch shows one
   instance of the problem, but there may be others:

     https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571382.html

 - Mapping a whole struct implies mapping each of that struct's elements,
   which may be base pointers. Because those base pointers aren't necessarily
   explicitly referenced in the directive in question, we treat the
   whole-struct mapping as a dependency instead.

This version of the patch is significantly improved over the version posted
previously in order to support the subsequent patches in this series.

OK for mainline?

Thanks,

Julian

2021-09-29  Julian Brown

gcc/
        * gimplify.c (is_or_contains_p, omp_target_reorder_clauses): Delete
        functions.
(omp_tsort_mark): Add enum. (omp_mapping_group): Add struct. (omp_get_base_pointer, omp_get_attachment, omp_group_last, omp_gather_mapping_groups, omp_group_base, omp_index_mapping_groups, omp_containing_struct, omp_tsort_mapping_groups_1, omp_tsort_mapping_groups, omp_segregate_mapping_groups, omp_reorder_mapping_groups): New functions. (gimplify_scan_omp_clauses): Call above functions instead of omp_target_reorder_clauses, unless we've seen an error. * omp-low.c (scan_sharing_clauses): Avoid strict test if we haven't sorted mapping groups. gcc/testsuite/ * g++.dg/gomp/target-lambda-1.C: Adjust expected output. * g++.dg/gomp/target-this-3.C: Likewise. * g++.dg/gomp/target-this-4.C: Likewise. --- gcc/gimplify.c | 804 +++- gcc/omp-low.c | 7 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C | 6 +- gcc/testsuite/g++.dg/gomp/target-this-3.C | 4 +- gcc/testsuite/g++.dg/gomp/target-this-4.C | 4 +- 5 files changed, 788 insertions(+), 37 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index ece22b7a4ae..d3346fc8d35 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -8675,29 +8675,6 @@ extract_base_bit_offset (tree base, poly_int64 *bitposp, return base; } -/* Returns true if EXPR is or contains (as a sub-component) BASE_PTR. */ - -static bool -is_or_contains_p (tree expr, tree base_ptr) -{ - if ((TREE_CODE (expr) == INDIRECT_REF && TREE_CODE (base_ptr) == MEM_REF) - || (TREE_CODE (expr) == MEM_REF && TREE_CODE (base_ptr) == INDIRECT_REF)) -return operand_equal_p (TREE_OPERAND (expr, 0), - TREE_OPERAND (base_ptr, 0)); - while (!operand_equal_p (expr, base_ptr)) -{ - if (TREE_CODE (base_ptr) == COMPOUND_EXPR) - base_ptr = TREE_OPERAND (base_ptr, 1); - if (TREE_CODE (base_ptr) == COMPONENT_REF - || TREE_CODE (base_ptr) == POINTER_PLUS_EXPR - || TREE_CODE (base_ptr) == SAVE_EXPR) - base_ptr = TREE_OPERAND (base_ptr, 0); - else - break; -} - return operand_equal_p (expr, base_ptr); -} - /* Remove COMPONENT_REFS and indirections from EXPR. 
*/ static tree @@ -8751,6 +8728,7 @@ aggregate_base_p (tree expr) return false; } +#if 0 /* Implement OpenMP 5.x map ordering rules for target directives. There are several rules, and with some level of ambiguity, hopefully we can at least collect the complexity here in one place. */ @@ -8930,6 +8908,758 @@ omp_target_reorder_clauses (tree *list_p) } } } +#endif + + +enum omp_tsort_mark { + UNVISITED, + TEMPORARY, + PERMANENT +}; + +struct omp_mapping_group { + tree *grp_start; + tree grp_end; + omp_tsort_mark mark; + struct omp_mapping_group *sibling; + struct omp_mapping_group *next; +}; + +__attribute
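The three-state marking that the omp_tsort_mark enum above sets up is the usual depth-first topological sort. A minimal sketch of that technique follows; it is not the code from this patch. Group, visit, tsort and the two demo functions are hypothetical, and dependencies are given as explicit index lists, whereas the patch finds them by looking up base pointers in a hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum Mark { UNVISITED, TEMPORARY, PERMANENT };

// Hypothetical stand-in for a mapping group.
struct Group
{
  std::vector<std::size_t> deps;  // groups whose base pointers must come first
  Mark mark = UNVISITED;
};

// Depth-first visit of group I.  Each group is appended to ORDER only after
// everything it depends on.  Returns false if a dependency cycle is found.
bool
visit (std::vector<Group> &groups, std::size_t i,
       std::vector<std::size_t> &order)
{
  assert (i < groups.size ());
  Group &g = groups[i];
  if (g.mark == PERMANENT)
    return true;
  if (g.mark == TEMPORARY)
    return false;  // reached again while still being visited: a cycle
  g.mark = TEMPORARY;
  for (std::size_t dep : g.deps)
    if (!visit (groups, dep, order))
      return false;
  g.mark = PERMANENT;
  order.push_back (i);
  return true;
}

// Every group and every dependency edge is handled once, so the running time
// is linear in the number of groups plus the number of dependencies.
bool
tsort (std::vector<Group> &groups, std::vector<std::size_t> &order)
{
  order.clear ();
  for (std::size_t i = 0; i < groups.size (); i++)
    if (!visit (groups, i, order))
      return false;
  return true;
}

// Group 0 ~ "a->b->c[lo:hi]", group 1 ~ "a->b", group 2 ~ "a".
std::vector<std::size_t>
sort_chain ()
{
  std::vector<Group> groups (3);
  groups[0].deps.push_back (1);
  groups[1].deps.push_back (2);
  std::vector<std::size_t> order;
  tsort (groups, order);
  return order;
}

// Two groups that depend on each other cannot be ordered.
bool
cyclic_input_is_rejected ()
{
  std::vector<Group> groups (2);
  groups[0].deps.push_back (1);
  groups[1].deps.push_back (0);
  std::vector<std::size_t> order;
  return !tsort (groups, order);
}
```

For the chain, sort_chain () yields the order 2, 1, 0: the mapping of "a" comes first, then "a->b", then the array section that hangs off it.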
[PATCH 04/11] Remove omp_target_reorder_clauses
This patch has been split out from the previous one to avoid a confusingly-interleaved diff. The two patches should probably be committed squashed together. 2021-10-01 Julian Brown gcc/ * gimplify.c (omp_target_reorder_clauses): Delete. --- gcc/gimplify.c | 183 - 1 file changed, 183 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index d3346fc8d35..c10a3e8842a 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -8728,189 +8728,6 @@ aggregate_base_p (tree expr) return false; } -#if 0 -/* Implement OpenMP 5.x map ordering rules for target directives. There are - several rules, and with some level of ambiguity, hopefully we can at least - collect the complexity here in one place. */ - -static void -omp_target_reorder_clauses (tree *list_p) -{ - /* Collect refs to alloc/release/delete maps. */ - auto_vec ard; - tree *cp = list_p; - while (*cp != NULL_TREE) -if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALLOC - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_RELEASE - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_DELETE)) - { - /* Unlink cp and push to ard. */ - tree c = *cp; - tree nc = OMP_CLAUSE_CHAIN (c); - *cp = nc; - ard.safe_push (c); - - /* Any associated pointer type maps should also move along. */ - while (*cp != NULL_TREE - && OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_REFERENCE - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ATTACH_DETACH - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALWAYS_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_TO_PSET)) - { - c = *cp; - nc = OMP_CLAUSE_CHAIN (c); - *cp = nc; - ard.safe_push (c); - } - } -else - cp = &OMP_CLAUSE_CHAIN (*cp); - - /* Link alloc/release/delete maps to the end of list. 
*/ - for (unsigned int i = 0; i < ard.length (); i++) -{ - *cp = ard[i]; - cp = &OMP_CLAUSE_CHAIN (ard[i]); -} - *cp = NULL_TREE; - - /* OpenMP 5.0 requires that pointer variables are mapped before - its use as a base-pointer. */ - auto_vec atf; - for (tree *cp = list_p; *cp; cp = &OMP_CLAUSE_CHAIN (*cp)) -if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP) - { - /* Collect alloc, to, from, to/from clause tree pointers. */ - gomp_map_kind k = OMP_CLAUSE_MAP_KIND (*cp); - if (k == GOMP_MAP_ALLOC - || k == GOMP_MAP_TO - || k == GOMP_MAP_FROM - || k == GOMP_MAP_TOFROM - || k == GOMP_MAP_ALWAYS_TO - || k == GOMP_MAP_ALWAYS_FROM - || k == GOMP_MAP_ALWAYS_TOFROM) - atf.safe_push (cp); - } - - for (unsigned int i = 0; i < atf.length (); i++) -if (atf[i]) - { - tree *cp = atf[i]; - tree decl = OMP_CLAUSE_DECL (*cp); - if (TREE_CODE (decl) == INDIRECT_REF || TREE_CODE (decl) == MEM_REF) - { - tree base_ptr = TREE_OPERAND (decl, 0); - STRIP_TYPE_NOPS (base_ptr); - for (unsigned int j = i + 1; j < atf.length (); j++) - if (atf[j]) - { - tree *cp2 = atf[j]; - tree decl2 = OMP_CLAUSE_DECL (*cp2); - - decl2 = OMP_CLAUSE_DECL (*cp2); - if (is_or_contains_p (decl2, base_ptr)) - { - /* Move *cp2 to before *cp. */ - tree c = *cp2; - *cp2 = OMP_CLAUSE_CHAIN (c); - OMP_CLAUSE_CHAIN (c) = *cp; - *cp = c; - - if (*cp2 != NULL_TREE - && OMP_CLAUSE_CODE (*cp2) == OMP_CLAUSE_MAP - && OMP_CLAUSE_MAP_KIND (*cp2) == GOMP_MAP_ALWAYS_POINTER) - { - tree c2 = *cp2; - *cp2 = OMP_CLAUSE_CHAIN (c2); - OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c); - OMP_CLAUSE_CHAIN (c) = c2; - } - - atf[j] = NULL; - } - } - } - } - - /* For attach_detach map clauses, if there is another map that maps the - attached/detached pointer, make sure that map is ordered before the - attach_detach. */ - atf.truncate (0); - for (tree *cp = list_p; *cp; cp = &OMP_CLAUSE_CHAIN (*cp)) -if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP) - { - /* Collect alloc, to, from, to/from clauses, and - always_pointer/attach_detach clauses. 
*/ - gomp_map_kind k = OMP_CLAUSE_MAP_KIND (*cp); - if (k == GOMP_MAP_ALLOC -
[PATCH 05/11] OpenMP/OpenACC: Hoist struct sibling list handling in gimplification
This patch lifts struct sibling-list handling out of the main loop in
gimplify_scan_omp_clauses. The reasons for this are several:

First, it means that we can subject created sibling list groups to topological
sorting (see previous patch) so base-pointer data dependencies are handled
correctly.

Secondly, it means that in the first pass gathering up sibling lists from
parsed OpenMP/OpenACC clauses, we don't need to worry about gimplifying: that
means we can see struct bases & components we need to sort sibling lists
properly, even when we're using a non-DECL_P struct base. Gimplification
proper still happens later.

Thirdly, because we use more than one pass through the clause list and gather
appropriate data, we can tell if we're mapping a whole struct in a different
node, and avoid building struct sibling lists for that struct appropriately.

Fourthly, we can re-use the node grouping functions from the previous patch,
and thus mostly avoid the "prev_list_p" handling in gimplify_scan_omp_clauses
that tracks the first node in such groups at present.

Some redundant code has been removed and code paths for OpenACC/OpenMP are now
shared where appropriate, though OpenACC doesn't do the topological sorting of
nodes (yet?).

OK for mainline?

Thanks,

Julian

2021-09-29  Julian Brown

gcc/
        * gimplify.c (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS.
        (extract_base_bit_offset): Remove OFFSETP parameter.
        (strip_components_and_deref): Extend with POINTER_PLUS_EXPR and
        COMPOUND_EXPR handling.
        (aggregate_base_p): Remove.
        (omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling.
        (build_struct_group): Remove CTX, DECL, PD, COMPONENT_REF_P, FLAGS,
        STRUCT_SEEN_CLAUSE, PRE_P, CONT parameters. Replace PREV_LIST_P and C
        parameters with GRP_START_P and GRP_END. Add INNER. Update calls to
        extract_base_bit_offset. Remove gimplification of clauses for OpenMP.
        Rework inner struct handling for OpenACC. Don't use context's
        variables splay tree.
(omp_build_struct_sibling_lists): New function, extracted from gimplify_scan_omp_clauses and refactored. (gimplify_scan_omp_clauses): Call above function to handle struct sibling lists. Remove STRUCT_MAP_TO_CLAUSE, STRUCT_SEEN_CLAUSE, STRUCT_DEREF_SET. Rework flag handling, adding decl for struct variables. (gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS handling, unused now. gcc/testsuite/ * g++.dg/goacc/member-array-acc.C: Update expected output. * g++.dg/gomp/target-3.C: Likewise. * g++.dg/gomp/target-lambda-1.C: Likewise. * g++.dg/gomp/target-this-2.C: Likewise. * g++.dg/gomp/target-this-4.C: Likewise. --- gcc/gimplify.c| 943 -- gcc/testsuite/g++.dg/goacc/member-array-acc.C | 2 +- gcc/testsuite/g++.dg/gomp/target-3.C | 4 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C | 2 +- gcc/testsuite/g++.dg/gomp/target-this-2.C | 2 +- gcc/testsuite/g++.dg/gomp/target-this-4.C | 4 +- 6 files changed, 410 insertions(+), 547 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index c10a3e8842a..31e2e4d9fe7 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -125,10 +125,6 @@ enum gimplify_omp_var_data /* Flag for GOVD_REDUCTION: inscan seen in {in,ex}clusive clause. */ GOVD_REDUCTION_INSCAN = 0x200, - /* Flag for GOVD_MAP: (struct) vars that have pointer attachments for - fields. */ - GOVD_MAP_HAS_ATTACHMENTS = 0x400, - /* Flag for GOVD_FIRSTPRIVATE: OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT. 
*/ GOVD_FIRSTPRIVATE_IMPLICIT = 0x800, @@ -8642,7 +8638,7 @@ build_struct_comp_nodes (enum tree_code code, tree grp_start, tree grp_end, static tree extract_base_bit_offset (tree base, poly_int64 *bitposp, -poly_offset_int *poffsetp, tree *offsetp) +poly_offset_int *poffsetp) { tree offset; poly_int64 bitsize, bitpos; @@ -8670,7 +8666,6 @@ extract_base_bit_offset (tree base, poly_int64 *bitposp, *bitposp = bitpos; *poffsetp = poffset; - *offsetp = offset; return base; } @@ -8683,8 +8678,15 @@ strip_components_and_deref (tree expr) while (TREE_CODE (expr) == COMPONENT_REF || TREE_CODE (expr) == INDIRECT_REF || (TREE_CODE (expr) == MEM_REF -&& integer_zerop (TREE_OPERAND (expr, 1 -expr = TREE_OPERAND (expr, 0); +&& integer_zerop (TREE_OPERAND (expr, 1))) +|| TREE_CODE (expr) == POINTER_PLUS_EXPR +|| TREE_CODE (expr) == COMPOUND_EXPR) + if (TREE_CODE (expr) == COMPOUND_EXPR) + expr = TREE_OPERAND (expr, 1); + else + expr = TREE_OPERAND (expr, 0); + + STRIP_NOPS (expr); return expr; } @@ -8700,34 +8702,6 @@ strip_indirections (tree expr) return expr; } -/* Return TRUE if EXPR is something we
[PATCH 06/11] OpenMP: Allow array ref components for C & C++
This patch fixes parsing for struct components that are array references in
OMP clauses in both the C and C++ front ends.

OK for mainline?

Thanks,

Julian

2021-09-29  Julian Brown

gcc/c/
        * c-typeck.c (c_finish_omp_clauses): Allow ARRAY_REF components.

gcc/cp/
        * semantics.c (finish_omp_clauses): Allow ARRAY_REF components.
---
 gcc/c/c-typeck.c   | 3 ++-
 gcc/cp/semantics.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index e10e6aa8439..d0494cadf05 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -14815,7 +14815,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
               {
                 t = TREE_OPERAND (t, 0);
                 if (TREE_CODE (t) == MEM_REF
-                    || TREE_CODE (t) == INDIRECT_REF)
+                    || TREE_CODE (t) == INDIRECT_REF
+                    || TREE_CODE (t) == ARRAY_REF)
                   {
                     t = TREE_OPERAND (t, 0);
                     STRIP_NOPS (t);
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 6e954ca06a6..53bd8d236bb 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7849,7 +7849,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
                 if (REFERENCE_REF_P (t))
                   t = TREE_OPERAND (t, 0);
                 if (TREE_CODE (t) == MEM_REF
-                    || TREE_CODE (t) == INDIRECT_REF)
+                    || TREE_CODE (t) == INDIRECT_REF
+                    || TREE_CODE (t) == ARRAY_REF)
                   {
                     t = TREE_OPERAND (t, 0);
                     STRIP_NOPS (t);
--
2.29.2
[PATCH 08/11] Not for committing: noisy topological sorting output
As a possible aid to review, this is my "printf-style" debugging cruft for the topological sorting implementation. We might want to rework this into something that emits scannable output into the gimple dump in order to write tests to make sure base pointer dependencies are being found properly, but that hasn't been done yet. This is not for committing. --- gcc/gimplify.c | 169 ++--- 1 file changed, 161 insertions(+), 8 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 31e2e4d9fe7..2ec83bf273b 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -70,6 +70,8 @@ along with GCC; see the file COPYING3. If not see #include "context.h" #include "tree-nested.h" +//#define NOISY_TOPOSORT + /* Hash set of poisoned variables in a bind expr. */ static hash_set *asan_poisoned_variables = NULL; @@ -8957,6 +8959,10 @@ omp_gather_mapping_groups (tree *list_p) { vec *groups = new vec (); +#ifdef NOISY_TOPOSORT + fprintf (stderr, "GATHER MAPPING GROUPS\n"); +#endif + for (tree *cp = list_p; *cp; cp = &OMP_CLAUSE_CHAIN (*cp)) { if (OMP_CLAUSE_CODE (*cp) != OMP_CLAUSE_MAP) @@ -8965,6 +8971,25 @@ omp_gather_mapping_groups (tree *list_p) tree *grp_last_p = omp_group_last (cp); omp_mapping_group grp; +#ifdef NOISY_TOPOSORT + if (cp == grp_last_p) + { + tree tmp = OMP_CLAUSE_CHAIN (*cp); + OMP_CLAUSE_CHAIN (*cp) = NULL_TREE; + fprintf (stderr, "found singleton clause:\n"); + debug_generic_expr (*cp); + OMP_CLAUSE_CHAIN (*cp) = tmp; + } + else + { + tree tmp = OMP_CLAUSE_CHAIN (*grp_last_p); + OMP_CLAUSE_CHAIN (*grp_last_p) = NULL_TREE; + fprintf (stderr, "found group:\n"); + debug_generic_expr (*cp); + OMP_CLAUSE_CHAIN (*grp_last_p) = tmp; + } +#endif + grp.grp_start = cp; grp.grp_end = *grp_last_p; grp.mark = UNVISITED; @@ -9129,14 +9154,44 @@ omp_index_mapping_groups (vec *groups) omp_mapping_group *grp; unsigned int i; +#ifdef NOISY_TOPOSORT + fprintf (stderr, "INDEX MAPPING GROUPS\n"); +#endif + FOR_EACH_VEC_ELT (*groups, i, grp) { +#ifdef NOISY_TOPOSORT + 
debug_mapping_group (grp); +#endif + tree fpp; unsigned int chained; tree node = omp_group_base (grp, &chained, &fpp); if (node == error_mark_node || (!node && !fpp)) - continue; + { +#ifdef NOISY_TOPOSORT + fprintf (stderr, " -- NULL base, not indexing.\n"); +#endif + continue; + } + +#ifdef NOISY_TOPOSORT + if (node) + { + fprintf (stderr, "base%s: ", chained > 1 ? " list" : ""); + + tree walk = node; + for (unsigned j = 0; j < chained; walk = OMP_CLAUSE_CHAIN (walk), j++) + debug_generic_expr (OMP_CLAUSE_DECL (walk)); + } + + if (fpp) + { + fprintf (stderr, "firstprivate pointer/reference: "); + debug_generic_expr (fpp); + } +#endif for (unsigned j = 0; node && j < chained; @@ -9156,7 +9211,11 @@ omp_index_mapping_groups (vec *groups) omp_mapping_group **prev = grpmap->get (decl); if (prev && *prev == grp) - /* Empty. */; + { +#ifdef NOISY_TOPOSORT + fprintf (stderr, " -- same node\n"); +#endif + } else if (prev) { /* Mapping the same thing twice is normally diagnosed as an error, @@ -9171,9 +9230,17 @@ omp_index_mapping_groups (vec *groups) grp->sibling = (*prev)->sibling; (*prev)->sibling = grp; +#ifdef NOISY_TOPOSORT + fprintf (stderr, " -- index as sibling\n"); +#endif } else - grpmap->put (decl, grp); + { +#ifdef NOISY_TOPOSORT + fprintf (stderr, " -- index as new decl\n"); +#endif + grpmap->put (decl, grp); + } } if (!fpp) @@ -9184,9 +9251,17 @@ omp_index_mapping_groups (vec *groups) { grp->sibling = (*prev)->sibling; (*prev)->sibling = grp; +#ifdef NOISY_TOPOSORT + fprintf (stderr, " -- index fpp as sibling\n"); +#endif } else - grpmap->put (fpp, grp); + { +#ifdef NOISY_TOPOSORT + fprintf (stderr, " -- index fpp as new decl\n"); +#endif + grpmap->put (fpp, grp); + } } return grpmap; } @@ -9233,6 +9308,11 @@ omp_tsort_mapping_groups_1 (omp_mapping_group ***outlist, *grpmap, omp_mapping_group *grp) { +#ifdef NOISY_TOPOSORT + fprintf (stderr, "processing node/group:\n"); + debug_mapping_group (grp); +#endif + if (grp->mark == PERMANENT) return true; if 
(grp->mark == TEMPORARY) @@ -9253,11 +9333,26 @@ omp_tsort_mapping_groups_1 (omp_mapping_group ***outlist, if (basep) { gcc_assert (*basep != grp); +#ifdef NOISY_TOPOSOR
[PATCH 07/11] OpenMP: Fix non-zero attach/detach bias for struct dereferences
This patch fixes attach/detach operations for OpenMP that have a non-zero
bias: these can occur if we have a mapping such as:

  #pragma omp target map(mystruct->a.b[idx].c[:arrsz])

i.e. where there is an offset between the attachment point ("mystruct" here)
and the pointed-to data. (The "b" and "c" members would be array types here,
not pointers themselves). In this example the difference (thus bias encoded
in the attach/detach node) will be something like:

  (uintptr_t) &mystruct->a.b[idx].c[0] - (uintptr_t) &mystruct->a

OK for mainline?

Thanks,

Julian

2021-09-29  Julian Brown

gcc/c-family/
        * c-common.h (c_omp_decompose_attachable_address): Add prototype.
        * c-omp.c (c_omp_decompose_attachable_address): New function.

gcc/c/
        * c-typeck.c (handle_omp_array_sections): Handle attach/detach for
        struct dereferences with non-zero bias.

gcc/cp/
        * semantics.c (handle_omp_array_section): Handle attach/detach for
        struct dereferences with non-zero bias.

libgomp/
        * testsuite/libgomp.c++/baseptrs-3.C: Add test (XFAILed for now).
        * testsuite/libgomp.c-c++-common/baseptrs-1.c: Add test.
        * testsuite/libgomp.c-c++-common/baseptrs-2.c: Add test.
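The bias expression quoted above can be evaluated on the host for a concrete layout. The sketch below does only that arithmetic and involves no OpenMP at all; the struct definitions (Inner, A, S) and the helper names are hypothetical, chosen so that "b" and "c" are array members as the description requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout in which "b" and "c" are arrays, not pointers.
struct Inner
{
  int pad;
  int c[8];
};

struct A
{
  int x;
  Inner b[4];
};

struct S
{
  int tag;
  A a;
};

// The difference from the description:
//   (uintptr_t) &mystruct->a.b[idx].c[0] - (uintptr_t) &mystruct->a
std::uintptr_t
attach_bias (const S *mystruct, std::size_t idx)
{
  assert (idx < 4);
  return reinterpret_cast<std::uintptr_t> (&mystruct->a.b[idx].c[0])
         - reinterpret_cast<std::uintptr_t> (&mystruct->a);
}

// The same distance computed from the layout alone.
std::size_t
layout_bias (std::size_t idx)
{
  return offsetof (A, b) + idx * sizeof (Inner) + offsetof (Inner, c);
}

// Returns true if both computations agree and the bias is non-zero.
bool
bias_matches_layout (std::size_t idx)
{
  S object = {};
  std::uintptr_t bias = attach_bias (&object, idx);
  return bias == layout_bias (idx) && bias != 0;
}
```

With this layout the bias is non-zero even for idx == 0, because "b" and "c" are each preceded by another member. That non-zero distance is the case the attach/detach node has to encode.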
--- gcc/c-family/c-common.h | 1 + gcc/c-family/c-omp.c | 42 gcc/c/c-typeck.c | 12 +- gcc/cp/semantics.c| 14 +- libgomp/testsuite/libgomp.c++/baseptrs-3.C| 182 ++ .../libgomp.c-c++-common/baseptrs-1.c | 50 + .../libgomp.c-c++-common/baseptrs-2.c | 70 +++ 7 files changed, 364 insertions(+), 7 deletions(-) create mode 100644 libgomp/testsuite/libgomp.c++/baseptrs-3.C create mode 100644 libgomp/testsuite/libgomp.c-c++-common/baseptrs-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/baseptrs-2.c diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index 849cefab882..dab2dd33573 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -1249,6 +1249,7 @@ extern tree c_omp_check_context_selector (location_t, tree); extern void c_omp_mark_declare_variant (location_t, tree, tree); extern const char *c_omp_map_clause_name (tree, bool); extern void c_omp_adjust_map_clauses (tree, bool); +extern tree c_omp_decompose_attachable_address (tree t, tree *virtbase); enum c_omp_directive_kind { C_OMP_DIR_STANDALONE, diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c index 1f07a0a454b..fc50f57e768 100644 --- a/gcc/c-family/c-omp.c +++ b/gcc/c-family/c-omp.c @@ -3119,6 +3119,48 @@ c_omp_adjust_map_clauses (tree clauses, bool is_target) } } +tree +c_omp_decompose_attachable_address (tree t, tree *virtbase) +{ + *virtbase = t; + + /* It's already a pointer. Just use that. */ + if (POINTER_TYPE_P (TREE_TYPE (t))) +return NULL_TREE; + + /* Otherwise, look for a base pointer deeper within the expression. 
*/ + + while (TREE_CODE (t) == COMPONENT_REF +&& (TREE_CODE (TREE_OPERAND (t, 0)) == COMPONENT_REF +|| TREE_CODE (TREE_OPERAND (t, 0)) == ARRAY_REF)) +{ + t = TREE_OPERAND (t, 0); + while (TREE_CODE (t) == ARRAY_REF) + t = TREE_OPERAND (t, 0); +} + + + *virtbase = t; + + if (TREE_CODE (t) != COMPONENT_REF) +return NULL_TREE; + + t = TREE_OPERAND (t, 0); + + tree attach_pt = NULL_TREE; + + if ((TREE_CODE (t) == INDIRECT_REF + || TREE_CODE (t) == MEM_REF) + && TREE_CODE (TREE_TYPE (TREE_OPERAND (t, 0))) == POINTER_TYPE) +{ + attach_pt = TREE_OPERAND (t, 0); + if (TREE_CODE (attach_pt) == POINTER_PLUS_EXPR) + attach_pt = TREE_OPERAND (attach_pt, 0); +} + + return attach_pt; +} + static const struct c_omp_directive omp_directives[] = { /* Keep this alphabetically sorted by the first word. Non-null second/third if any should precede null ones. */ diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index d0494cadf05..d1fd8be8e57 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -13696,9 +13696,15 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort) if (size) size = c_fully_fold (size, false, NULL); OMP_CLAUSE_SIZE (c) = size; + tree virtbase = t; + tree attach_pt + = ((ort != C_ORT_ACC) + ? c_omp_decompose_attachable_address (t, &virtbase) + : NULL_TREE); if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP || (TREE_CODE (t) == COMPONENT_REF - && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)) + && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE + && !attach_pt)) return false; gcc_assert (OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_FORCE_DEVICEPTR); switch (OMP_CLAUSE_MAP_KIND (c)) @@ -13731,10 +13737,10 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort) if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER && !c_mark_addressable (t))
[PATCH 09/11] Not for committing: noisy sibling-list handling output
As a possible aid to review, this is my "printf-style" debugging cruft for the sibling list handling hoist/rework. It's not meant for committing. --- gcc/gimplify.c | 131 + 1 file changed, 131 insertions(+) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 2ec83bf273b..ffb6eda5490 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -71,6 +71,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-nested.h" //#define NOISY_TOPOSORT +//#define NOISY_SIBLING_LISTS /* Hash set of poisoned variables in a bind expr. */ static hash_set *asan_poisoned_variables = NULL; @@ -9895,6 +9896,11 @@ build_struct_group (enum omp_region_type region_type, enum tree_code code, bool openmp = !(region_type & ORT_ACC); tree *continue_at = NULL; +#ifdef NOISY_SIBLING_LISTS + fprintf (stderr, "DECL starts out as:\n"); + debug_generic_expr (ocd); +#endif + while (TREE_CODE (ocd) == ARRAY_REF) ocd = TREE_OPERAND (ocd, 0); @@ -9903,6 +9909,11 @@ build_struct_group (enum omp_region_type region_type, enum tree_code code, tree base = extract_base_bit_offset (ocd, &cbitpos, &coffset); +#ifdef NOISY_SIBLING_LISTS + fprintf (stderr, "BASE after extraction is (%p):\n", (void *) base); + debug_generic_expr (base); +#endif + bool ptr = (OMP_CLAUSE_MAP_KIND (grp_end) == GOMP_MAP_ALWAYS_POINTER); bool attach_detach = ((OMP_CLAUSE_MAP_KIND (grp_end) == GOMP_MAP_ATTACH_DETACH) @@ -9917,6 +9928,25 @@ build_struct_group (enum omp_region_type region_type, enum tree_code code, if (openmp && attach_detach) return NULL; +#ifdef NOISY_SIBLING_LISTS + if (struct_map_to_clause) +{ + fprintf (stderr, "s_m_t_c->get (base) = "); + debug_generic_expr (base); + tree *r = struct_map_to_clause->get (base); + fprintf (stderr, "returns: "); + if (r) + { + tree tmp = OMP_CLAUSE_CHAIN (*r); + OMP_CLAUSE_CHAIN (*r) = NULL_TREE; + debug_generic_expr (*r); + OMP_CLAUSE_CHAIN (*r) = tmp; + } + else + fprintf (stderr, "(nothing)\n"); +} +#endif + if (!struct_map_to_clause || struct_map_to_clause->get (base) == 
NULL) { tree l = build_omp_clause (OMP_CLAUSE_LOCATION (grp_end), OMP_CLAUSE_MAP); @@ -10026,6 +10056,11 @@ build_struct_group (enum omp_region_type region_type, enum tree_code code, { tree *osc = struct_map_to_clause->get (base); tree *sc = NULL, *scp = NULL; +#ifdef NOISY_SIBLING_LISTS + fprintf (stderr, "looked up osc %p for decl (%p)\n", (void *) osc, + (void *) base); + debug_generic_expr (base); +#endif sc = &OMP_CLAUSE_CHAIN (*osc); /* The struct mapping might be immediately followed by a FIRSTPRIVATE_POINTER and/or FIRSTPRIVATE_REFERENCE -- if it's an @@ -10098,6 +10133,17 @@ build_struct_group (enum omp_region_type region_type, enum tree_code code, return NULL; } } +#ifdef NOISY_SIBLING_LISTS + if (known_eq (coffset, offset) && known_eq (cbitpos, bitpos)) + { + fprintf (stderr, "duplicate offset!\n"); + tree o1 = OMP_CLAUSE_DECL (*sc); + tree o2 = OMP_CLAUSE_DECL (grp_end); + debug_generic_expr (o1); + debug_generic_expr (o2); + } + else +#endif if (maybe_lt (coffset, offset) || (known_eq (coffset, offset) && maybe_lt (cbitpos, bitpos))) @@ -10174,6 +10220,13 @@ build_struct_group (enum omp_region_type region_type, enum tree_code code, = cl ? move_concat_nodes_after (cl, tail_chain, grp_start_p, grp_end, sc) : move_nodes_after (grp_start_p, grp_end, sc); +#ifdef NOISY_SIBLING_LISTS + if (continue_at) + { + fprintf (stderr, "continue at (1):\n"); + debug_generic_expr (*continue_at); + } +#endif } else if (*sc != grp_end) { @@ -10187,6 +10240,10 @@ build_struct_group (enum omp_region_type region_type, enum tree_code code, the correct position in the struct component list, which in this case is just SC. 
*/ move_node_after (grp_end, grp_start_p, sc); +#ifdef NOISY_SIBLING_LISTS + fprintf (stderr, "continue at (2):\n"); + debug_generic_expr (*continue_at); +#endif } } return continue_at; @@ -10218,6 +10275,16 @@ omp_build_struct_sibling_lists (enum tree_code code, new_next = NULL; +#ifdef NOISY_SIBLING_LISTS + { + tree *tmp = grp->grp_start; + grp->grp_start = grp_start_p; + fprintf (stderr, "processing group %u:\n", i); + debug_mapping_group (grp); + grp->grp_start = tmp; + } +#endif + if (DECL_P (decl)) continue; @@ -10252,6 +10319,11 @@ omp_build_stru
[PATCH 10/11] Not for committing: noisy mapping-group taxonomy
As a possible aid to review, this is code that can be used to enumerate all the mapping group forms currently in use across the GCC/libgomp testsuites for OpenMP/OpenACC. These groups have been added somewhat organically, so there might be a couple of surprises: see e.g. the patch following this one. It's not meant for committing. --- gcc/gimplify.c | 327 + 1 file changed, 327 insertions(+) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index ffb6eda5490..d9fda21413d 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -72,6 +72,7 @@ along with GCC; see the file COPYING3. If not see //#define NOISY_TOPOSORT //#define NOISY_SIBLING_LISTS +//#define NOISY_TAXONOMY /* Hash set of poisoned variables in a bind expr. */ static hash_set *asan_poisoned_variables = NULL; @@ -9010,6 +9011,326 @@ omp_gather_mapping_groups (tree *list_p) } } +#ifdef NOISY_TAXONOMY + +static void +omp_mapping_group_taxonomy (vec *groups) +{ + int num = 0; + + for (auto &it : *groups) +{ + tree node, grp_start = *it.grp_start, grp_end = it.grp_end; + gomp_map_kind kind0 = OMP_CLAUSE_MAP_KIND (grp_start), kind1, kind2, + kind3; + int count = 1; + node = grp_start; + if (node != grp_end) + { + node = OMP_CLAUSE_CHAIN (node); + kind1 = OMP_CLAUSE_MAP_KIND (node); + count++; + if (node != grp_end) + { + node = OMP_CLAUSE_CHAIN (node); + kind2 = OMP_CLAUSE_MAP_KIND (node); + count++; + if (node != grp_end) + { + node = OMP_CLAUSE_CHAIN (node); + kind3 = OMP_CLAUSE_MAP_KIND (node); + count++; + gcc_assert (node == grp_end); + } + } + } + + fprintf (stderr, "group %d: ", num); + + switch (count) + { + case 1: + if (kind0 == GOMP_MAP_TO + || kind0 == GOMP_MAP_FROM + || kind0 == GOMP_MAP_TOFROM) + fprintf (stderr, "scalar to/from\n"); + else if (kind0 == GOMP_MAP_ALLOC) + fprintf (stderr, "alloc\n"); + else if (kind0 == GOMP_MAP_POINTER) + fprintf (stderr, "pointer (by itself)\n"); + else if (kind0 == GOMP_MAP_TO_PSET) + fprintf (stderr, "map-to-pset (by itself)\n"); + else if (kind0 == 
GOMP_MAP_FORCE_PRESENT) + fprintf (stderr, "force present\n"); + else if (kind0 == GOMP_MAP_DELETE) + fprintf (stderr, "delete\n"); + else if (kind0 == GOMP_MAP_FORCE_DEVICEPTR) + fprintf (stderr, "force deviceptr\n"); + else if (kind0 == GOMP_MAP_DEVICE_RESIDENT) + fprintf (stderr, "device resident\n"); + else if (kind0 == GOMP_MAP_LINK) + fprintf (stderr, "link\n"); + else if (kind0 == GOMP_MAP_IF_PRESENT) + fprintf (stderr, "if present\n"); + else if (kind0 == GOMP_MAP_FIRSTPRIVATE) + fprintf (stderr, "firstprivate (by itself)\n"); + else if (kind0 == GOMP_MAP_FIRSTPRIVATE_INT) + fprintf (stderr, "firstprivate_int (by itself)\n"); + else if (kind0 == GOMP_MAP_USE_DEVICE_PTR) + fprintf (stderr, "use device ptr\n"); + else if (kind0 == GOMP_MAP_ZERO_LEN_ARRAY_SECTION) + fprintf (stderr, "zero-length array section (by itself)\n"); + else if (kind0 == GOMP_MAP_FORCE_ALLOC) + fprintf (stderr, "force alloc\n"); + else if (kind0 == GOMP_MAP_FORCE_TO + || kind0 == GOMP_MAP_FORCE_FROM + || kind0 == GOMP_MAP_FORCE_TOFROM) + fprintf (stderr, "force to/from (scalar)\n"); + else if (kind0 == GOMP_MAP_USE_DEVICE_PTR_IF_PRESENT) + fprintf (stderr, "use device ptr if present\n"); + else if (kind0 == GOMP_MAP_ALWAYS_TO + || kind0 == GOMP_MAP_ALWAYS_FROM + || kind0 == GOMP_MAP_ALWAYS_TOFROM) + fprintf (stderr, "always to/from (scalar)\n"); + else if (kind0 == GOMP_MAP_STRUCT) + fprintf (stderr, "struct\n"); + else if (kind0 == GOMP_MAP_ALWAYS_POINTER) + fprintf (stderr, "always pointer (by itself)\n"); + else if (kind0 == GOMP_MAP_POINTER_TO_ZERO_LENGTH_ARRAY_SECTION) + fprintf (stderr, "ptr to 0-length array section (by itself)\n"); + else if (kind0 == GOMP_MAP_RELEASE) + fprintf (stderr, "release\n"); + else if (kind0 == GOMP_MAP_ATTACH) + fprintf (stderr, "attach\n"); + else if (kind0 == GOMP_MAP_DETACH) + fprintf (stderr, "detach\n"); + else if (kind0 == GOMP_MAP_FORCE_DETACH) + fprintf (stderr, "force detach\n"); + else if (kind0 == GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION) 
+ fprintf (stderr, "attach 0-length array section\n"); + else if (kind0 == GOMP_MAP_FIRSTPRIVATE_POINTER)
[PATCH 11/11] OpenMP/OpenACC: [WIP] Add gcc_unreachable to apparently-dead path in build_struct_comp_nodes
The previous "not for committing" taxonomy patch shows that the path handling "extra nodes" in build_struct_comp_nodes is probably now dead, at least across the current testsuite. This patch adds gcc_unreachable on that path: this passes testing, which suggests that the extra node handling can probably be removed completely. (Otherwise we need test coverage for that path, ideally!) This is mostly posted as an FYI: a real patch would probably just remove the unused code path, if it really isn't needed any more. Thanks, Julian 2021-09-29 Julian Brown gcc/ * gimplify.c (build_struct_comp_nodes): Add gcc_unreachable on code path that appears to now be unused. --- gcc/gimplify.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index d9fda21413d..3d444d1836f 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -8625,6 +8625,9 @@ build_struct_comp_nodes (enum tree_code code, tree grp_start, tree grp_end, OMP_CLAUSE_SIZE (c3) = TYPE_SIZE_UNIT (ptr_type_node); OMP_CLAUSE_CHAIN (c3) = NULL_TREE; + /* Apparently? */ + gcc_unreachable (); + *extra_node = c3; } else -- 2.29.2
Re: [PATCH 05/11] OpenMP/OpenACC: Hoist struct sibling list handling in gimplification
Oops, editing: On Fri, 1 Oct 2021 10:09:03 -0700 Julian Brown wrote: > Secondly, it means that in the first pass gathering up sibling lists > from parsed OpenMP/OpenACC clauses, we don't need to worry about > gimplifying: that means we can see struct bases & components we need > to sort sibling lists properly, even when we're using a non-DECL_P > struct base. Gimplification proper still happens ...in the main loop in gimplify_scan_omp_clauses.
Re: [PATCH] libiberty: testsuite: add missing format on d-demangle-expected
On 9/29/2021 6:50 PM, Luís Ferreira wrote: This patch adds a missing format parameter that prevents d-demangle-expected test collection from running successfully. Signed-off-by: Luís Ferreira Thanks. Pushed to the trunk. Can you please start including ChangeLog entries so that I don't have to write them for you? Thanks, Jeff
Re: [PATCH] c++: unifying equal NONTYPE_ARGUMENT_PACKs [PR102547]
On Fri, 1 Oct 2021, Jason Merrill wrote: > On 10/1/21 10:26, Patrick Palka wrote: > > On Fri, 1 Oct 2021, Jason Merrill wrote: > > > > > On 10/1/21 09:46, Patrick Palka wrote: > > > > Here during partial ordering of the two partial specializations we end > > > > up in unify with parm=arg=NONTYPE_ARGUMENT_PACK, and crash > > > > shortly > > > > thereafter because uses_template_parms calls > > > > potential_constant_expression > > > > which doesn't handle NONTYPE_ARGUMENT_PACK. > > > > > > > > This patch fixes this by checking dependent_template_arg_p instead of > > > > uses_template_parms when parm==arg, which does handle > > > > NONTYPE_ARGUMENT_PACK. > > > > We could also perhaps fix uses_template_parms / inst_dep_expr_p to > > > > better > > > > handle NONTYPE_ARGUMENT_PACK, > > > > > > Please. > > > > Sounds good, like the following then? Passes light testing, bootstrap > > and regtest on progress. > > > > -- >8 -- > > > > PR c++/102547 > > > > gcc/cp/ChangeLog: > > > > * pt.c (instantiation_dependent_expression_p): Sidestep checking > > potential_constant_expression on NONTYPE_ARGUMENT_PACK. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/cpp0x/variadic-partial2.C: New test. > > * g++.dg/cpp0x/variadic-partial2a.C: New test. 
> > --- > > gcc/cp/pt.c | 4 +++- > > .../g++.dg/cpp0x/variadic-partial2.C | 16 ++ > > .../g++.dg/cpp0x/variadic-partial2a.C | 22 +++ > > 3 files changed, 41 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C > > create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C > > > > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c > > index 1dcdffe322a..643204103c5 100644 > > --- a/gcc/cp/pt.c > > +++ b/gcc/cp/pt.c > > @@ -27705,7 +27705,9 @@ instantiation_dependent_expression_p (tree > > expression) > > { > > return (instantiation_dependent_uneval_expression_p (expression) > > || (processing_template_decl > > - && potential_constant_expression (expression) > > + && expression != NULL_TREE > > + && (TREE_CODE (expression) == NONTYPE_ARGUMENT_PACK > > + || potential_constant_expression (expression)) > > I'd prefer to loop over the elements of the pack, either here or (probably > better) in potential_constant_expression. Ah, makes sense. Like so? Bootstrapped and regtested on x86_64-pc-linux-gnu. -- >8 -- Subject: [PATCH] c++: unifying equal NONTYPE_ARGUMENT_PACKs [PR102547] Here during partial ordering of the two partial specializations we end up in unify with parm=arg=NONTYPE_ARGUMENT_PACK, and crash shortly thereafter because uses_template_parms(parms) calls potential_const_expr which doesn't handle NONTYPE_ARGUMENT_PACK. This patch fixes this by extending potential_constant_expression to handle NONTYPE_ARGUMENT_PACK appropriately. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk/11? PR c++/102547 gcc/cp/ChangeLog: * constexpr.c (potential_constant_expression_1): Handle NONTYPE_ARGUMENT_PACK. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/variadic-partial2.C: New test. * g++.dg/cpp0x/variadic-partial2a.C: New test. 
--- gcc/cp/constexpr.c| 10 + .../g++.dg/cpp0x/variadic-partial2.C | 16 ++ .../g++.dg/cpp0x/variadic-partial2a.C | 22 +++ 3 files changed, 48 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c index 18d9d117a48..e95ff00774f 100644 --- a/gcc/cp/constexpr.c +++ b/gcc/cp/constexpr.c @@ -9043,6 +9043,16 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now, case CO_RETURN_EXPR: return false; +case NONTYPE_ARGUMENT_PACK: + { + tree args = ARGUMENT_PACK_ARGS (t); + int len = TREE_VEC_LENGTH (args); + for (int i = 0; i < len; ++i) + if (!RECUR (TREE_VEC_ELT (args, i), any)) + return false; + return true; + } + default: if (objc_non_constant_expr_p (t)) return false; diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C new file mode 100644 index 000..df61f26a3c1 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2.C @@ -0,0 +1,16 @@ +// PR c++/102547 +// { dg-do compile { target c++11 } } + +template +struct vals { }; + +template +struct vals_client { }; + +template +struct vals_client, T> { }; + +template +struct vals_client, void> { }; + +template struct vals_client, void>; //- "sorry, unimplemented..., ICE" diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C b/gcc/testsuite/g++.dg/cpp0x/variadic-partial2a.C new file mode 100644 index 000..e98bdb
Re: [PATCH] libiberty: testsuite: add missing format on d-demangle-expected
Noted. On Fri, 2021-10-01 at 11:28 -0600, Jeff Law wrote: > > > On 9/29/2021 6:50 PM, Luís Ferreira wrote: > > > This patch adds a missing format parameter that prevents d- > > demangle-expected > > test collection from running successfully. > > > > Signed-off-by: Luís Ferreira > THanks. Pushed to the trunk. > > Can you please start including ChangeLog entries so that I don't > have > to write them for you. > > Thanks, > Jeff > > -- Sincerely, Luís Ferreira @ lsferreira.net
[committed] hppa: Default to dwarf version 4 on hppa64-hpux
DWARF5 is not supported by gdb on hpux, so we need to limit the DWARF version to 4.

Tested on hppa64-hp-hpux11.11. Committed to trunk and gcc-11.

Dave
---
Default to dwarf version 4 on hppa64-hpux

2021-10-01 John David Anglin

gcc/ChangeLog:

	PR debug/102373
	* config/pa/pa.c (pa_option_override): Default to dwarf version 4
	on hppa64-hpux.

diff --git a/gcc/config/pa/pa.c b/gcc/config/pa/pa.c
index 06143023b46..5b3ffd48f4e 100644
--- a/gcc/config/pa/pa.c
+++ b/gcc/config/pa/pa.c
@@ -541,6 +541,16 @@ pa_option_override (void)
       write_symbols = NO_DEBUG;
     }
 
+  if (TARGET_64BIT && TARGET_HPUX)
+    {
+      /* DWARF5 is not supported by gdb. Don't emit DWARF5 unless
+         specifically selected. */
+      if (!global_options_set.x_dwarf_strict)
+        dwarf_strict = 1;
+      if (!global_options_set.x_dwarf_version)
+        dwarf_version = 4;
+    }
+
   /* We only support the "big PIC" model now. And we always generate PIC
      code when in 64bit mode. */
   if (flag_pic == 1 || TARGET_64BIT)
Re: PING #2 [PATCH] warn for more impossible null pointer tests [PR102103]
On 9/30/21 1:35 PM, Joseph Myers wrote:
> On Thu, 30 Sep 2021, Martin Sebor via Gcc-patches wrote:
>
>> Jason, since you approved the C++ changes, would you mind looking
>> over the C bits and if they look good to you giving me the green
>> light to commit the patch?
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579693.html
>
> The C changes are OK, with two instances of "for the address %qE will
> never be NULL" fixed to refer to the address *of* %qE as elsewhere (those
> are for IMAGPART_EXPR and REALPART_EXPR; C++ also has one "the address %qE
> will never be NULL"), and the "pr??" in the tests filled in with an
> actual PR number for the XFAILed cases.

Thanks for the careful review and the approval!

I remember having a reason for dropping the "of" in the two instances in the C FE but after double-checking the output I see you're right that it should be there. Good catch!

I believe the C++ instance is correct. It's issued for the address of a member, as in the former of the two below:

  struct S { int i; };

  bool f ()
  {
    return &S::i == 0;   // the address ‘&S::i’
  }

  bool g (S *p)
  {
    return &p->i == 0;   // the address of ‘S::i’
  }

  z.c: In function ‘bool f()’:
  z.c:5:16: warning: the address ‘&S::i’ will never be NULL [-Waddress]
      5 |   return &S::i == 0;
        |          ~~^~~~
  z.c: In function ‘bool g(S*)’:
  z.c:10:16: warning: the address of ‘S::i’ will never be NULL [-Waddress]
     10 |   return &p->i == 0;
        |          ~~^~~~
  z.c:1:16: note: ‘S::i’ declared here
      1 | struct S { int i; };
        |                ^

I've beefed up the tests to verify the expected wording.

Thanks also for prompting me to open bugs for the xfailed tests, something I tend to forget to do when it depends on the patch I'm developing. I've raised pr102555 for the C FE folding defeating the warning. The C++ bug that tracks the xfails in the C++ tests is pr102378.

I've pushed the updated patch in r12-4059 after retesting the whole thing on x86_64-linux.

Martin
Re: [PING^2][PATCH] libgcc, emutls: Allow building weak definitions of the emutls functions.
Hi, So let’s ignore the questions for now - OK for the non-Darwin parts of the patch ? > On 24 Sep 2021, at 17:57, Iain Sandoe wrote: > > as noted below the non-Darwin parts of this are trivial (and a no-OP). > I’d like to apply this to start work towards solving Darwin’s libgcc issues, >> On 20 Sep 2021, at 09:25, Iain Sandoe wrote: >> >> The non-Darwin part of this patch is trivial but raises a couple of questions >> >> A/ >> We define builtins to support emulated TLS. >> These are defined with void * pointers >> The implementation (in libgcc) uses the correct type (struct __emutls_object >> *) >> in both a forward declaration of the functions and in thier eventual >> implementation. >> >> This leads to a (long-standing, nothing new) complaint at build-time about >> the mismatch in the builtin/implementation decls. >> >> AFAICT, there’s no way to fix that unless we introduce struct >> __emutls_object * >> as a built-in type? >> >> B/ >> It seems that a consequence of the mismatch in decls means that if I apply >> attributes to the decl (in the implementation file), they are ignored and I >> have >> to apply them to the definition in order for this to work. >> >> This (B) is what the patch below does. >> >> tested on powerpc,i686,x86_64-darwin, x86_64-linux >> OK for master? >> thanks, >> Iain >> >> If the current situation is that A or B indicates “there’s a bug”, please >> could that >> be considered as distinct from the current patch (which doesn’t alter this >> in any >> way) so that we can make progress on fixing Darwin libgcc issues. >> >> = commit log >> >> In order to better support use of the emulated TLS between objects with >> DSO dependencies and static-linked libgcc, allow a target to make weak >> definitions. >> >> Signed-off-by: Iain Sandoe >> >> libgcc/ChangeLog: >> >> * config.host: Add weak-defined emutls crt. >> * config/t-darwin: Build weak-defined emutls objects. >> * emutls.c (__emutls_get_address): Add optional attributes. 
>> (__emutls_register_common): Likewise. >> (EMUTLS_ATTR): New. >> --- >> libgcc/config.host | 2 +- >> libgcc/config/t-darwin | 13 + >> libgcc/emutls.c| 17 +++-- >> 3 files changed, 29 insertions(+), 3 deletions(-) >> >> diff --git a/libgcc/config.host b/libgcc/config.host >> index 6c34b13d611..a447ac7ae30 100644 >> --- a/libgcc/config.host >> +++ b/libgcc/config.host >> @@ -215,7 +215,7 @@ case ${host} in >> *-*-darwin*) >> asm_hidden_op=.private_extern >> tmake_file="$tmake_file t-darwin ${cpu_type}/t-darwin t-libgcc-pic >> t-slibgcc-darwin" >> - extra_parts="crt3.o libd10-uwfef.a crttms.o crttme.o" >> + extra_parts="crt3.o libd10-uwfef.a crttms.o crttme.o libemutls_w.a" >> ;; >> *-*-dragonfly*) >> tmake_file="$tmake_file t-crtstuff-pic t-libgcc-pic t-eh-dw2-dip" >> diff --git a/libgcc/config/t-darwin b/libgcc/config/t-darwin >> index 14ae6b35a4e..d6f688d66d5 100644 >> --- a/libgcc/config/t-darwin >> +++ b/libgcc/config/t-darwin >> @@ -15,6 +15,19 @@ crttme.o: $(srcdir)/config/darwin-crt-tm.c >> LIB2ADDEH = $(srcdir)/unwind-dw2.c $(srcdir)/config/unwind-dw2-fde-darwin.c \ >> $(srcdir)/unwind-sjlj.c $(srcdir)/unwind-c.c >> >> +# Make emutls weak so that we can deal with -static-libgcc, override the >> +# hidden visibility when this is present in libgcc_eh. >> +emutls.o: HOST_LIBGCC2_CFLAGS += \ >> + -DEMUTLS_ATTR='__attribute__((__weak__,__visibility__("default")))' >> +emutls_s.o: HOST_LIBGCC2_CFLAGS += \ >> + -DEMUTLS_ATTR='__attribute__((__weak__,__visibility__("default")))' >> + >> +# Make the emutls crt as a convenience lib so that it can be linked >> +# optionally, use the shared version so that we can link with DSO. >> +libemutls_w.a: emutls_s.o >> +$(AR_CREATE_FOR_TARGET) $@ $< >> +$(RANLIB_FOR_TARGET) $@ >> + >> # Patch to __Unwind_Find_Enclosing_Function for Darwin10. 
>> d10-uwfef.o: $(srcdir)/config/darwin10-unwind-find-enc-func.c >> $(crt_compile) -mmacosx-version-min=10.6 -c $< >> diff --git a/libgcc/emutls.c b/libgcc/emutls.c >> index ed2658170f5..d553a74728f 100644 >> --- a/libgcc/emutls.c >> +++ b/libgcc/emutls.c >> @@ -50,7 +50,16 @@ struct __emutls_array >> void **data[]; >> }; >> >> +/* EMUTLS_ATTR is provided to allow targets to build the emulated tls >> + routines as weak definitions, for example. >> + If there is no definition, fall back to the default. */ >> +#ifndef EMUTLS_ATTR >> +# define EMUTLS_ATTR >> +#endif >> + >> +EMUTLS_ATTR >> void *__emutls_get_address (struct __emutls_object *); >> +EMUTLS_ATTR >> void __emutls_register_common (struct __emutls_object *, word, word, void *); >> >> #ifdef __GTHREADS >> @@ -123,7 +132,11 @@ emutls_alloc (struct __emutls_object *obj) >> return ret; >> } >> >> -void * >> +/* Despite applying the attribute to the declaration, in this case the mis- >> + match between the builtin's declaration [void * (*)(void *)] and the >> + implementation here, causes the decl. attri
Re: [Patch] Fortran: Avoid var initialization in interfaces [PR54753]
[Resending as I did not see it show up in the MLs] Hi Tobias, On 29.09.21 at 10:53, Tobias Burnus wrote: Found when looking at F2018:C839 / PR54753. For INTENT(OUT) the dummy variable (might) also be default initialized or deallocated. However, with assumed rank, that causes issues, which C839 prevents. In the current GCC implementation, missing C839 constraint diagnostic, but also rejects-valid/ice-on-valid appears. There are three issues, this patch solves the first: * reject-valid issue due to adding the initializer also to a dummy argument which is in an INTERFACE block. Having initializers in INTERFACE blocks is pointless and causes for the attached testcase the bogus error: "Assumed-rank variable y at (1) may only be used as actual argument" ACK. (Except for wasting resources and this error, they should be ignored in trans*.c and usually do not cause any further harm.) I think Sandra has a nearly ready patch to do the C839 constraint diagnostic, which needs the attached patch to do the checks. The third issue is that GCC currently gives either an ICE or the above error message when declaring a procedure with a valid assumed-rank intent(out) dummy. This has still to be solved as well. But first I wanted to unblock Sandra's C839 work with this patch :-) Regarding the patch, '!= IFSRC_IFBODY' has to be used; "== IFSRC_DECL" won't work as the generated ENTRY master function has IFSRC_UNKNOWN. I have to admit that the code touched is hard to understand for me. The conditions involved are unfortunately already dense, long lists. (There are PRs related to *missing* default initializations (e.g. PR100440), which I looked at before in that area until I got lost... Of course this needs to be addressed elsewhere.) OK for mainline? No objections here; the patch seems to work. The commit message has a non-ASCII character (hyphen). Not sure whether that was intended. You may nevertheless want a second opinion. Thanks for the patch! 
Harald Tobias PS: Some patch reviews are that fast that it is impossible to send the OK; at least, I did not manage to do for Harald's last two - for the last one I was at least 4min too late. ;-)
[committed] libstdc++: Reduce header dependencies for C++20 std::erase [PR92546]
This reduces the preprocessed size of <deque>, <string> and <vector> by not including <bits/stl_algo.h> for std::remove and std::remove_if. Also unwrap iterators using __niter_base, to avoid redundant debug mode checks. PR libstdc++/92546 * include/bits/erase_if.h (__erase_nodes_if): Use __niter_base to unwrap debug iterators. * include/bits/refwrap.h: Do not error if included in C++03. * include/bits/stl_algo.h (__remove_if): Move to ... * include/bits/stl_algobase.h (__remove_if): ... here. * include/std/deque (erase, erase_if): Use __remove_if instead of remove and remove_if. * include/std/string (erase, erase_if): Likewise. * include/std/vector (erase, erase_if): Likewise. Tested powerpc64le-linux. Committed to trunk. ff7793bea46 34e9407b3b4 b7e8fb5e482 6ccffeb56b9 e79bde6ada4 e5c093e515c 20751fad19e 9b790acc220 e3869a48fc2 44967af830a dc1b29508d7 59ffa3e3dba d71476c9df9 a09bb4a852f cfb582f6279 c46ecb0112e fb4d55ef61c 10b6d89badd ce709ad3dc0 d335d73889d 681707ec28d 741c7350c08 commit acf3a21cbc26b39b73c0006300f35ff017ddd6cb Author: Jonathan Wakely Date: Fri Oct 1 20:37:02 2021 libstdc++: Reduce header dependencies for C++20 std::erase [PR92546] This reduces the preprocessed size of <deque>, <string> and <vector> by not including <bits/stl_algo.h> for std::remove and std::remove_if. Also unwrap iterators using __niter_base, to avoid redundant debug mode checks. PR libstdc++/92546 * include/bits/erase_if.h (__erase_nodes_if): Use __niter_base to unwrap debug iterators. * include/bits/refwrap.h: Do not error if included in C++03. * include/bits/stl_algo.h (__remove_if): Move to ... * include/bits/stl_algobase.h (__remove_if): ... here. * include/std/deque (erase, erase_if): Use __remove_if instead of remove and remove_if. * include/std/string (erase, erase_if): Likewise. * include/std/vector (erase, erase_if): Likewise. 
diff --git a/libstdc++-v3/include/bits/erase_if.h b/libstdc++-v3/include/bits/erase_if.h index 8d1d23168fa..7716e1a953c 100644 --- a/libstdc++-v3/include/bits/erase_if.h +++ b/libstdc++-v3/include/bits/erase_if.h @@ -51,7 +51,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __erase_nodes_if(_Container& __cont, _Predicate __pred) { typename _Container::size_type __num = 0; - for (auto __iter = __cont.begin(), __last = __cont.end(); + for (auto __iter = std::__niter_base(__cont.begin()), +__last = std::__niter_base(__cont.end()); __iter != __last;) { if (__pred(*__iter)) diff --git a/libstdc++-v3/include/bits/refwrap.h b/libstdc++-v3/include/bits/refwrap.h index adfbe214693..a549efbce9a 100644 --- a/libstdc++-v3/include/bits/refwrap.h +++ b/libstdc++-v3/include/bits/refwrap.h @@ -32,9 +32,7 @@ #pragma GCC system_header -#if __cplusplus < 201103L -# include -#else +#if __cplusplus >= 201103L #include #include diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h index 90f3162ff90..bc611a95ef4 100644 --- a/libstdc++-v3/include/bits/stl_algo.h +++ b/libstdc++-v3/include/bits/stl_algo.h @@ -810,26 +810,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } #endif // C++11 - template -_GLIBCXX20_CONSTEXPR -_ForwardIterator -__remove_if(_ForwardIterator __first, _ForwardIterator __last, - _Predicate __pred) -{ - __first = std::__find_if(__first, __last, __pred); - if (__first == __last) - return __first; - _ForwardIterator __result = __first; - ++__first; - for (; __first != __last; ++__first) - if (!__pred(__first)) - { - *__result = _GLIBCXX_MOVE(*__first); - ++__result; - } - return __result; -} - /** * @brief Remove elements from a sequence. 
* @ingroup mutating_algorithms diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h index 8627d59b589..0e0586836a6 100644 --- a/libstdc++-v3/include/bits/stl_algobase.h +++ b/libstdc++-v3/include/bits/stl_algobase.h @@ -2125,6 +2125,26 @@ _GLIBCXX_END_NAMESPACE_ALGO return __n; } + template +_GLIBCXX20_CONSTEXPR +_ForwardIterator +__remove_if(_ForwardIterator __first, _ForwardIterator __last, + _Predicate __pred) +{ + __first = std::__find_if(__first, __last, __pred); + if (__first == __last) + return __first; + _ForwardIterator __result = __first; + ++__first; + for (; __first != __last; ++__first) + if (!__pred(__first)) + { + *__result = _GLIBCXX_MOVE(*__first); + ++__result; + } + return __result; +} + #if __cplusplus >= 201103L template diff --git a/libstdc++-v3/include/std/deque b/libstdc++-v3/include/std/deque index c9a82110ad7..b2a7cee483a 100644 --- a/libstdc++-v3/include/std/deq
[committed] libstdc++: Implement std::clamp with std::min and std::max [PR 96733]
The compiler doesn't know about the precondition of std::clamp that (hi < lo) is false, and so can't optimize as well as we'd like. By using std::min and std::max we help the compiler. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/96733 * include/bits/stl_algo.h (clamp): Use std::min and std::max. Tested powerpc64le-linux. Committed to trunk. commit 741c7350c08b0884689466867b6c9e711c7b109e Author: Jonathan Wakely Date: Sat Apr 17 22:34:09 2021 libstdc++: Implement std::clamp with std::min and std::max [PR 96733] The compiler doesn't know about the precondition of std::clamp that (hi < lo) is false, and so can't optimize as well as we'd like. By using std::min and std::max we help the compiler. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/96733 * include/bits/stl_algo.h (clamp): Use std::min and std::max. diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h index 5d12972ce2c..90f3162ff90 100644 --- a/libstdc++-v3/include/bits/stl_algo.h +++ b/libstdc++-v3/include/bits/stl_algo.h @@ -3621,7 +3621,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __gnu_cxx::__ops::__iter_comp_iter(__pred)); } -#if __cplusplus > 201402L +#if __cplusplus >= 201703L #define __cpp_lib_clamp 201603 @@ -3631,14 +3631,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __val A value of arbitrary type. * @param __lo A lower limit of arbitrary type. * @param __hi An upper limit of arbitrary type. - * @return max(__val, __lo) if __val < __hi or min(__val, __hi) otherwise. + * @retval `__lo` if `__val < __lo` + * @retval `__hi` if `__hi < __val` + * @retval `__val` otherwise. + * @pre `_Tp` is LessThanComparable and `(__hi < __lo)` is false. */ template constexpr const _Tp& clamp(const _Tp& __val, const _Tp& __lo, const _Tp& __hi) { __glibcxx_assert(!(__hi < __lo)); - return (__val < __lo) ? __lo : (__hi < __val) ? 
__hi : __val; + return std::min(std::max(__val, __lo), __hi); } /** @@ -3648,15 +3651,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @param __loA lower limit of arbitrary type. * @param __hiAn upper limit of arbitrary type. * @param __comp A comparison functor. - * @return max(__val, __lo, __comp) if __comp(__val, __hi) - * or min(__val, __hi, __comp) otherwise. + * @retval `__lo` if `__comp(__val, __lo)` + * @retval `__hi` if `__comp(__hi, __val)` + * @retval `__val` otherwise. + * @pre `__comp(__hi, __lo)` is false. */ template constexpr const _Tp& clamp(const _Tp& __val, const _Tp& __lo, const _Tp& __hi, _Compare __comp) { __glibcxx_assert(!__comp(__hi, __lo)); - return __comp(__val, __lo) ? __lo : __comp(__hi, __val) ? __hi : __val; + return std::min(std::max(__val, __lo, __comp), __hi, __comp); } #endif // C++17 #endif // C++14
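For readers who want to compare the two formulations from the std::clamp patch outside the library, here is a minimal standalone sketch. The names clamp_ternary and clamp_minmax are made up for illustration and are not libstdc++ functions; plain assert stands in for __glibcxx_assert. It only shows that both forms agree when the precondition !(hi < lo) holds, and says nothing about the code the compiler generates for either.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper names; neither is part of libstdc++.

// Ternary form, as in the implementation before the patch.
template<typename T>
constexpr const T&
clamp_ternary(const T& val, const T& lo, const T& hi)
{
  assert(!(hi < lo));  // precondition check, standing in for __glibcxx_assert
  return (val < lo) ? lo : (hi < val) ? hi : val;
}

// min/max form, as in the patch.
template<typename T>
constexpr const T&
clamp_minmax(const T& val, const T& lo, const T& hi)
{
  assert(!(hi < lo));
  return std::min(std::max(val, lo), hi);
}

// Compile-time checks: below, inside and above the range [0, 3].
static_assert(clamp_ternary(-1, 0, 3) == 0 && clamp_minmax(-1, 0, 3) == 0,
              "value below the range clamps to lo");
static_assert(clamp_ternary(2, 0, 3) == 2 && clamp_minmax(2, 0, 3) == 2,
              "value inside the range is returned unchanged");
static_assert(clamp_ternary(5, 0, 3) == 3 && clamp_minmax(5, 0, 3) == 3,
              "value above the range clamps to hi");
```

Both helpers return a reference to one of their arguments, like std::clamp, so the result should be used before any temporary argument goes away.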
[committed] libstdc++: Do not allocate a zero-size vector [PR 100153]
The vector::shrink_to_fit() implementation will allocate new storage even if the vector is empty. That then leads to the end-of-storage pointer being non-null and equal to the _M_start._M_p pointer, which means that _M_end_addr() has undefined behaviour. The fix is to stop doing a useless zero-sized allocation in shrink_to_fit(), so that _M_start._M_p and _M_end_of_storage are both null after an empty vector shrinks. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/100153 * include/bits/vector.tcc (vector::_M_shrink_to_fit()): When size() is zero just deallocate and reset. Tested powerpc64le-linux. Committed to trunk. commit 681707ec28d56494fa61a80c62500724d55f8586 Author: Jonathan Wakely Date: Tue Apr 20 16:16:13 2021 libstdc++: Do not allocate a zero-size vector [PR 100153] The vector::shrink_to_fit() implementation will allocate new storage even if the vector is empty. That then leads to the end-of-storage pointer being non-null and equal to the _M_start._M_p pointer, which means that _M_end_addr() has undefined behaviour. The fix is to stop doing a useless zero-sized allocation in shrink_to_fit(), so that _M_start._M_p and _M_end_of_storage are both null after an empty vector shrinks. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/100153 * include/bits/vector.tcc (vector::_M_shrink_to_fit()): When size() is zero just deallocate and reset. diff --git a/libstdc++-v3/include/bits/vector.tcc b/libstdc++-v3/include/bits/vector.tcc index caee5cbfc2f..16366e03c86 100644 --- a/libstdc++-v3/include/bits/vector.tcc +++ b/libstdc++-v3/include/bits/vector.tcc @@ -944,7 +944,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER return false; __try { - _M_reallocate(size()); + if (size_type __n = size()) + _M_reallocate(__n); + else + { + this->_M_deallocate(); + this->_M_impl._M_reset(); + } return true; } __catch(...)
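The idea behind the shrink_to_fit fix can be sketched with a toy container. Everything below (tiny_buffer, its members, the growth policy) is hypothetical and much simpler than std::vector; it is not libstdc++'s layout or code. It only illustrates the branch the patch adds: when the size is zero, release the storage and reset the pointers to null rather than requesting a zero-sized allocation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical, simplified buffer of ints; not std::vector's real layout.
class tiny_buffer
{
  int* start_ = nullptr;
  int* finish_ = nullptr;
  int* end_of_storage_ = nullptr;
  std::allocator<int> alloc_;

public:
  tiny_buffer() = default;
  tiny_buffer(const tiny_buffer&) = delete;
  tiny_buffer& operator=(const tiny_buffer&) = delete;
  ~tiny_buffer() { release(); }

  std::size_t size() const
  { return static_cast<std::size_t>(finish_ - start_); }

  std::size_t capacity() const
  { return static_cast<std::size_t>(end_of_storage_ - start_); }

  bool has_storage() const { return start_ != nullptr; }

  void reserve(std::size_t n)
  {
    if (n > capacity())
      reallocate(n);
  }

  void push_back(int x)
  {
    if (finish_ == end_of_storage_)
      reallocate(capacity() ? 2 * capacity() : 4);
    *finish_++ = x;
  }

  void clear() { finish_ = start_; }

  void shrink_to_fit()
  {
    if (capacity() == size())
      return;              // nothing to give back
    if (std::size_t n = size())
      reallocate(n);       // non-empty: move into an exact-fit block
    else
      release();           // empty: no zero-size allocation, pointers become null
  }

private:
  // Move the elements into a fresh block of exactly n ints.
  void reallocate(std::size_t n)
  {
    assert(n > 0 && n >= size());
    int* p = alloc_.allocate(n);
    const std::size_t sz = size();
    std::copy(start_, finish_, p);
    release();
    start_ = p;
    finish_ = p + sz;
    end_of_storage_ = p + n;
  }

  // Free the current block, if any, and reset all three pointers.
  void release()
  {
    if (start_)
      alloc_.deallocate(start_, capacity());
    start_ = finish_ = end_of_storage_ = nullptr;
  }
};
```

After push_back, clear and shrink_to_fit on such a buffer, has_storage() is false and capacity() is zero, which mirrors the state the commit message describes for an empty vector after shrinking.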
[committed] libstdc++: Use conditional noexcept in std::reverse_iterator [PR 94418]
This adds a noexcept-specifier to each constructor and assignment operator of std::reverse_iterator so that they are noexcept when the corresponding operation on the underlying iterator is noexcept. The std::reverse_iterator class template already requires that the operations on the underlying type are valid, so we don't need to use the std::is_nothrow_xxx traits to protect against errors when the expression isn't even valid. We can just use a noexcept operator to test if the expression can throw, without the overhead of redundantly checking if the initialization/assignment would be valid. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/94418 * include/bits/stl_iterator.h (reverse_iterator): Use conditional noexcept on constructors and assignment operators. * testsuite/24_iterators/reverse_iterator/noexcept.cc: New test. Tested powerpc64le-linux. Committed to trunk. commit d335d73889d897d073b987b4323db05317fccad3 Author: Jonathan Wakely Date: Wed Apr 28 11:40:47 2021 libstdc++: Use conditional noexcept in std::reverse_iterator [PR 94418] This adds a noexcept-specifier to each constructor and assignment operator of std::reverse_iterator so that they are noexcept when the corresponding operation on the underlying iterator is noexcept. The std::reverse_iterator class template already requires that the operations on the underlying type are valid, so we don't need to use the std::is_nothrow_xxx traits to protect against errors when the expression isn't even valid. We can just use a noexcept operator to test if the expression can throw, without the overhead of redundantly checking if the initialization/assignment would be valid. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/94418 * include/bits/stl_iterator.h (reverse_iterator): Use conditional noexcept on constructors and assignment operators. * testsuite/24_iterators/reverse_iterator/noexcept.cc: New test. 
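The idea of using the noexcept operator directly, rather than an is_nothrow trait, can be sketched outside the library. This is an illustration under stated assumptions, not libstdc++ code: wrapper, nothrow_iter, throwing_iter and wraps_without_throwing are hypothetical names, and the two toy iterator types differ only in whether their copy constructor is noexcept.

```cpp
#include <utility>

// Hypothetical wrapper standing in for reverse_iterator.
template<typename Iter>
struct wrapper
{
  Iter current;

  // noexcept exactly when copying the wrapped iterator cannot throw.
  // The class already requires Iter to be copy constructible, so the
  // noexcept operator is sufficient and no is_nothrow_* trait is needed.
  explicit wrapper(Iter x) noexcept(noexcept(Iter(x)))
  : current(x)
  { }
};

// Two toy "iterators" that differ only in their copy constructor.
struct nothrow_iter
{
  nothrow_iter() noexcept { }
  nothrow_iter(const nothrow_iter&) noexcept { }
};

struct throwing_iter
{
  throwing_iter() { }
  throwing_iter(const throwing_iter&) { } // potentially throwing
};

// True if constructing wrapper<Iter> from an lvalue Iter cannot throw.
template<typename Iter>
constexpr bool wraps_without_throwing()
{ return noexcept(wrapper<Iter>(std::declval<const Iter&>())); }

static_assert(wraps_without_throwing<nothrow_iter>(),
              "noexcept propagates from the wrapped iterator");
static_assert(!wraps_without_throwing<throwing_iter>(),
              "a throwing copy makes the wrapper constructor throwing");
```

Both checks are compile-time only, which is also how the new test in the patch is marked (dg-do compile).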
diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h index 004d767224d..4973f792b56 100644 --- a/libstdc++-v3/include/bits/stl_iterator.h +++ b/libstdc++-v3/include/bits/stl_iterator.h @@ -174,20 +174,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // 235 No specification of default ctor for reverse_iterator // 1012. reverse_iterator default ctor should value initialize _GLIBCXX17_CONSTEXPR - reverse_iterator() : current() { } + reverse_iterator() + _GLIBCXX_NOEXCEPT_IF(noexcept(_Iterator())) + : current() + { } /** * This %iterator will move in the opposite direction that @p x does. */ explicit _GLIBCXX17_CONSTEXPR - reverse_iterator(iterator_type __x) : current(__x) { } + reverse_iterator(iterator_type __x) + _GLIBCXX_NOEXCEPT_IF(noexcept(_Iterator(__x))) + : current(__x) + { } /** * The copy constructor is normal. */ _GLIBCXX17_CONSTEXPR reverse_iterator(const reverse_iterator& __x) - : current(__x.current) { } + _GLIBCXX_NOEXCEPT_IF(noexcept(_Iterator(__x.current))) + : current(__x.current) + { } #if __cplusplus >= 201103L reverse_iterator& operator=(const reverse_iterator&) = default; @@ -203,7 +211,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #endif _GLIBCXX17_CONSTEXPR reverse_iterator(const reverse_iterator<_Iter>& __x) - : current(__x.current) { } + _GLIBCXX_NOEXCEPT_IF(noexcept(_Iterator(__x.current))) + : current(__x.current) + { } #if __cplusplus >= 201103L template<typename _Iter> @@ -214,6 +224,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX17_CONSTEXPR reverse_iterator& operator=(const reverse_iterator<_Iter>& __x) + _GLIBCXX_NOEXCEPT_IF(noexcept(current = __x.current)) { current = __x.current; return *this; @@ -226,6 +237,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_NODISCARD _GLIBCXX17_CONSTEXPR iterator_type base() const + _GLIBCXX_NOEXCEPT_IF(noexcept(_Iterator(current))) { return current; } /** diff --git a/libstdc++-v3/testsuite/24_iterators/reverse_iterator/noexcept.cc 
b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/noexcept.cc new file mode 100644 index 00000000000..df4b1b0763d --- /dev/null +++ b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/noexcept.cc @@ -0,0 +1,92 @@ +// { dg-do compile { target c++11 } } + +#include <iterator> + +template<typename T, bool Nothrow> +struct bidi +{ + using value_type = T; + using pointer = T*; + using reference = T&; + using difference_type = std::ptrdiff_t; + using iterator_category = std::bidirectional_iterator_tag; + + T* ptr; + + bidi(T* ptr = nullptr) noexcept(Nothrow) : ptr(ptr) { } + + bidi(const bidi& iter) noexcept(Nothrow) : ptr(iter.ptr) { } + + template +bidi(
[committed] libstdc++: Add noexcept to common_iterator proxy operators
Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (common_iterator::__arrow_proxy) (common_iterator::__postfix_proxy): Add noexcept. Tested powerpc64le-linux. Committed to trunk. commit ce709ad3dc0ed5d7ea48a116311d4441225446f0 Author: Jonathan Wakely Date: Fri Apr 30 14:43:54 2021 libstdc++: Add noexcept to common_iterator proxy operators Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (common_iterator::__arrow_proxy) (common_iterator::__postfix_proxy): Add noexcept. diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h index 4973f792b56..8517652a173 100644 --- a/libstdc++-v3/include/bits/stl_iterator.h +++ b/libstdc++-v3/include/bits/stl_iterator.h @@ -1813,7 +1813,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION public: const iter_value_t<_It>* - operator->() const + operator->() const noexcept { return std::__addressof(_M_keep); } }; @@ -1828,7 +1828,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION public: const iter_value_t<_It>& - operator*() const + operator*() const noexcept { return _M_keep; } };
[committed] libstdc++: Make move ctor noexcept for fully-dynamic string
The move constructor for the "fully-dynamic" COW string is not noexcept, because it allocates a new empty string rep for the moved-from string. However, there is no need to do that, because the moved-from string does not have to be left empty. Instead, implement move construction for the fully-dynamic case as a reference count increment, so the string is shared. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: * include/bits/cow_string.h [_GLIBCXX_FULLY_DYNAMIC_STRING] (basic_string(basic_string&&)): Add noexcept and avoid allocation, by sharing rep with the rvalue string. Tested powerpc64le-linux. Committed to trunk. commit 10b6d89baddd86139480ba902f491903fcb464a6 Author: Jonathan Wakely Date: Fri Apr 30 15:04:34 2021 libstdc++: Make move ctor noexcept for fully-dynamic string The move constructor for the "fully-dynamic" COW string is not noexcept, because it allocates a new empty string rep for the moved-from string. However, there is no need to do that, because the moved-from string does not have to be left empty. Instead, implement move construction for the fully-dynamic case as a reference count increment, so the string is shared. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: * include/bits/cow_string.h [_GLIBCXX_FULLY_DYNAMIC_STRING] (basic_string(basic_string&&)): Add noexcept and avoid allocation, by sharing rep with the rvalue string. diff --git a/libstdc++-v3/include/bits/cow_string.h b/libstdc++-v3/include/bits/cow_string.h index 61edaa85484..ba4a8cc2e98 100644 --- a/libstdc++-v3/include/bits/cow_string.h +++ b/libstdc++-v3/include/bits/cow_string.h @@ -620,18 +620,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * The newly-created string contains the exact contents of @a __str. * @a __str is a valid, but unspecified string. */ - basic_string(basic_string&& __str) + basic_string(basic_string&& __str) noexcept #if _GLIBCXX_FULLY_DYNAMIC_STRING == 0 - noexcept // FIXME C++11: should always be noexcept. 
-#endif : _M_dataplus(std::move(__str._M_dataplus)) { -#if _GLIBCXX_FULLY_DYNAMIC_STRING == 0 __str._M_data(_S_empty_rep()._M_refdata()); -#else - __str._M_data(_S_construct(size_type(), _CharT(), get_allocator())); -#endif } +#else + : _M_dataplus(__str._M_rep()) + { + // Rather than allocate an empty string for the rvalue string, + // just share ownership with it by incrementing the reference count. + // If the rvalue string was "leaked" then it was the unique owner, + // so need an extra increment to indicate shared ownership. + if (_M_rep()->_M_is_leaked()) + __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 2); + else + __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 1); + } +#endif /** * @brief Construct string from an initializer %list.
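The "move as a reference count increment" idea can be shown with a toy type. This is only a sketch under simplifying assumptions: shared_str and owners_after_move are hypothetical names, the count is not atomic, and the "leaked" state that the real COW string handles with an extra increment is not modelled at all.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Toy reference-counted string (hypothetical; not thread-safe and far
// simpler than the COW basic_string in libstdc++).
class shared_str
{
  struct rep { long refs; std::string text; };
  rep* rep_;

public:
  explicit shared_str(const char* s) : rep_(new rep{1, s}) { }

  shared_str(const shared_str& other) noexcept : rep_(other.rep_)
  { ++rep_->refs; }

  // Move construction shares the representation instead of allocating
  // an empty one for the source, so it can be noexcept. The moved-from
  // string keeps its contents, which is a valid (if unspecified) state.
  shared_str(shared_str&& other) noexcept : rep_(other.rep_)
  { ++rep_->refs; }

  shared_str& operator=(const shared_str&) = delete;

  ~shared_str()
  { if (--rep_->refs == 0) delete rep_; }

  const std::string& str() const noexcept { return rep_->text; }
  long use_count() const noexcept { return rep_->refs; }
};

static_assert(std::is_nothrow_move_constructible<shared_str>::value,
              "moving never allocates, so it never throws");

// Moves one string into another and returns the number of owners.
long owners_after_move()
{
  shared_str a("hello");
  shared_str b(std::move(a));
  assert(a.str() == b.str()); // source and target share one rep
  return b.use_count();
}
```

The trade-off is the one the message describes: the source is not left empty, but no allocation is needed and the constructor can be unconditionally noexcept.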
[committed] libstdc++: Simplify __normal_iterator converting constructor
This uses C++11 features to simplify the definition of the __normal_iterator constructor that allows converting from iterator to const_iterator. The previous definition relied on _Container::pointer which is present in std::vector and std::basic_string, but is not actually part of the container requirements. Removing the use of _Container::pointer and defining it in terms of is_convertible allows __normal_iterator to be used with new container types which do not define a pointer member. Specifically, this will allow it to be used in std::basic_stacktrace. In theory this will enable some conversions which were not previously permitted, for example __normal_iterator> can now be converted to __normal_iterator>. In practice this doesn't matter because the library never uses such types. In any case, allowing those conversions is consistent with the corresponding constructors of std::reverse_iterator and std::move_iterator. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (__normal_iterator): Simplify converting constructor and do not require _Container::pointer. Tested powerpc64le-linux. Committed to trunk. commit fb4d55ef61ca3191ec946d4d41e0e715f4cc4197 Author: Jonathan Wakely Date: Thu May 6 13:44:36 2021 libstdc++: Simplify __normal_iterator converting constructor This uses C++11 features to simplify the definition of the __normal_iterator constructor that allows converting from iterator to const_iterator. The previous definition relied on _Container::pointer which is present in std::vector and std::basic_string, but is not actually part of the container requirements. Removing the use of _Container::pointer and defining it in terms of is_convertible allows __normal_iterator to be used with new container types which do not define a pointer member. Specifically, this will allow it to be used in std::basic_stacktrace. 
In theory this will enable some conversions which were not previously permitted, for example __normal_iterator> can now be converted to __normal_iterator>. In practice this doesn't matter because the library never uses such types. In any case, allowing those conversions is consistent with the corresponding constructors of std::reverse_iterator and std::move_iterator. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (__normal_iterator): Simplify converting constructor and do not require _Container::pointer. diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h index 8517652a173..df774eeb63f 100644 --- a/libstdc++-v3/include/bits/stl_iterator.h +++ b/libstdc++-v3/include/bits/stl_iterator.h @@ -1022,6 +1022,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION typedef std::iterator_traits<_Iterator> __traits_type; +#if __cplusplus >= 201103L + template<typename _Iter> + using __convertible_from + = std::__enable_if_t<std::is_convertible<_Iter, _Iterator>::value>; +#endif + public: typedef _Iterator iterator_type; typedef typename __traits_type::iterator_category iterator_category; @@ -1042,12 +1048,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION : _M_current(__i) { } // Allow iterator to const_iterator conversion +#if __cplusplus >= 201103L + template<typename _Iter, typename = __convertible_from<_Iter>> + _GLIBCXX20_CONSTEXPR + __normal_iterator(const __normal_iterator<_Iter, _Container>& __i) + noexcept +#else + // N.B. _Container::pointer is not actually in container requirements, + // but is present in std::vector and std::basic_string. template<typename _Iter> -_GLIBCXX20_CONSTEXPR __normal_iterator(const __normal_iterator<_Iter, typename __enable_if< - (std::__are_same<_Iter, typename _Container::pointer>::__value), - _Container>::__type>& __i) _GLIBCXX_NOEXCEPT + (std::__are_same<_Iter, typename _Container::pointer>::__value), + _Container>::__type>& __i) +#endif : _M_current(__i.base()) { } // Forward iterator requirements
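A converting constructor constrained by is_convertible, as described in the message above, might look like the following outside the library. This is a sketch only: ptr_iter, convertible_from and to_const are hypothetical names, and the wrapper has a single template parameter rather than the iterator and container pair used by __normal_iterator.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical pointer wrapper standing in for __normal_iterator.
template<typename Ptr>
class ptr_iter
{
  Ptr cur_;

  // Well-formed only if a From converts implicitly to Ptr.
  template<typename From>
    using convertible_from
      = typename std::enable_if<std::is_convertible<From, Ptr>::value>::type;

public:
  explicit ptr_iter(Ptr p) noexcept : cur_(p) { }

  // Converting constructor, e.g. ptr_iter<int*> to ptr_iter<const int*>.
  // It needs nothing from a container type, only a convertible pointer.
  template<typename From, typename = convertible_from<From>>
    ptr_iter(const ptr_iter<From>& other) noexcept
    : cur_(other.base())
    { }

  Ptr base() const noexcept { return cur_; }
};

// Adding const is allowed, removing it is not.
static_assert(std::is_convertible<ptr_iter<int*>,
                                  ptr_iter<const int*>>::value, "");
static_assert(!std::is_convertible<ptr_iter<const int*>,
                                   ptr_iter<int*>>::value, "");

// Converts a mutable "iterator" to a const one and returns its pointer.
const int* to_const(int* p)
{
  ptr_iter<int*> it(p);
  ptr_iter<const int*> cit = it; // uses the converting constructor
  assert(cit.base() == it.base());
  return cit.base();
}
```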
[committed] libstdc++: Allow visiting inherited variants [PR 90943]
Implement the changes from P2162R2 (as a DR for C++17). Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/90943 * include/std/variant (__cpp_lib_variant): Update value. (__detail::__variant::__as): New helpers implementing the as-variant exposition-only function templates. (visit, visit<R>): Use __as to upcast the variant parameters. * include/std/version (__cpp_lib_variant): Update value. * testsuite/20_util/variant/visit_inherited.cc: New test. Tested powerpc64le-linux. Committed to trunk. commit c46ecb0112e91c80ee111439e79a58a953e4479d Author: Jonathan Wakely Date: Mon Apr 19 14:49:12 2021 libstdc++: Allow visiting inherited variants [PR 90943] Implement the changes from P2162R2 (as a DR for C++17). Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/90943 * include/std/variant (__cpp_lib_variant): Update value. (__detail::__variant::__as): New helpers implementing the as-variant exposition-only function templates. (visit, visit<R>): Use __as to upcast the variant parameters. * include/std/version (__cpp_lib_variant): Update value. * testsuite/20_util/variant/visit_inherited.cc: New test. 
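For readers unfamiliar with P2162R2, the kind of user code it makes well-formed looks roughly like this. It is a sketch, not taken from the new test: value, describe and describe_value are hypothetical names, and the code assumes a C++17 library that implements the paper (such as libstdc++ with this patch applied).

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical user type that inherits from a std::variant specialization.
struct value : std::variant<int, std::string>
{
  using variant::variant; // inherit the converting constructors
};

// Visitor returning a short description of the active alternative.
struct describe
{
  std::string operator()(int i) const
  { return "int:" + std::to_string(i); }

  std::string operator()(const std::string& s) const
  { return "string:" + s; }
};

std::string describe_value(const value& v)
{
  assert(!v.valueless_by_exception());
  // Per P2162R2, std::visit treats v as its std::variant base class.
  return std::visit(describe{}, v);
}
```

A call such as describe_value(value(42)) would select the int overload of the visitor.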
diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant index 6383cf4e502..c651326ead9 100644 --- a/libstdc++-v3/include/std/variant +++ b/libstdc++-v3/include/std/variant @@ -71,7 +71,7 @@ namespace __variant } // namespace __variant } // namespace __detail -#define __cpp_lib_variant 201606L +#define __cpp_lib_variant 202102L template<typename... _Types> class tuple; template<typename... _Types> class variant; @@ -202,6 +202,28 @@ namespace __variant std::forward<_Variants>(__variants)...); } + // The __as function templates implement the exposition-only "as-variant" + + template<typename... _Types> +constexpr std::variant<_Types...>& +__as(std::variant<_Types...>& __v) +{ return __v; } + + template<typename... _Types> +constexpr const std::variant<_Types...>& +__as(const std::variant<_Types...>& __v) noexcept +{ return __v; } + + template<typename... _Types> +constexpr std::variant<_Types...>&& +__as(std::variant<_Types...>&& __v) noexcept +{ return std::move(__v); } + + template<typename... _Types> +constexpr const std::variant<_Types...>&& +__as(const std::variant<_Types...>&& __v) noexcept +{ return std::move(__v); } + // _Uninitialized<T> is guaranteed to be a trivially destructible type, // even if T is not. template<typename _Type, bool = std::is_trivially_destructible_v<_Type>> @@ -1063,8 +1085,12 @@ namespace __variant std::index_sequence<__indices...>> : _Base_dedup<__indices, __poison_hash<remove_const_t<_Types>>>... { }; - template<size_t _Np, typename _Variant> -using __get_t = decltype(std::get<_Np>(std::declval<_Variant>())); + // Equivalent to decltype(get<_Np>(as-variant(declval<_Variant>()))) + template<size_t _Np, typename _Variant, + typename _AsV = decltype(__variant::__as(std::declval<_Variant>())), + typename _Tp = variant_alternative_t<_Np, remove_reference_t<_AsV>>> +using __get_t + = conditional_t<is_lvalue_reference_v<_Variant>, _Tp&, _Tp&&>; // Return type of std::visit. template<typename _Visitor, typename... _Variants> @@ -1741,7 +1767,9 @@ namespace __variant constexpr __detail::__variant::__visit_result_t<_Visitor, _Variants...> visit(_Visitor&& __visitor, _Variants&&... 
__variants) { - if ((__variants.valueless_by_exception() || ...)) + namespace __variant = std::__detail::__variant; + + if ((__variant::__as(__variants).valueless_by_exception() || ...)) __throw_bad_variant_access("std::visit: variant is valueless"); using _Result_type @@ -1751,10 +1779,11 @@ namespace __variant if constexpr (sizeof...(_Variants) == 1) { + using _Vp = decltype(__variant::__as(std::declval<_Variants>()...)); + constexpr bool __visit_rettypes_match = __detail::__variant:: - __check_visitor_results<_Visitor, _Variants...>( - std::make_index_sequence< - std::variant_size<remove_reference_t<_Variants>...>::value>()); + __check_visitor_results<_Visitor, _Vp>( + make_index_sequence<variant_size_v<remove_reference_t<_Vp>>>()); if constexpr (!__visit_rettypes_match) { static_assert(__visit_rettypes_match, @@ -1765,12 +1794,12 @@ namespace __variant else return std::__do_visit<_Tag>( std::forward<_Visitor>(__visitor), - std::forward<_Variants>(__variants)...); + static_cast<_Vp>(__variants)...); } else return std::__do_visit<_Tag>( std::forward<_Visitor>(__visitor), - std::forward<_Variants>(__variants)...); + __variant::__as(std::forward<_Variants>(__variants))...); } #if __cplusplus > 201703L @@ -1778,11 +1807,13 @@ namespace __variant constexpr _Res visit(_Visitor&& __visitor, _Variants&&... __variants) { - if ((__variants.valueless_by_exception() || ...)) + namespace __variant = std::__detail::__variant; + + if ((__variant::__as(__variants).valueless_by_exception() || ...)
[committed] libstdc++: Optimize std::visit for the common case [PR 78113]
GCC does not do a good job of optimizing the table of function pointers used for variant visitation. This avoids using the table for the common case of visiting a single variant with a small number of alternative types. Instead we use: switch(v.index()) { case 0: return visitor(get<0>(v)); case 1: return visitor(get<1>(v)); ... } It's not quite that simple, because get<1>(v) is ill-formed if the variant only has one alternative, and similarly for each get<N>. We need to ensure each case only applies the visitor if the index is in range for the actual type we're dealing with, and tell the compiler that the case is unreachable otherwise. We also need to invoke the visitor via the __gen_vtable_impl::__visit_invoke function, to handle the raw visitation cases used to implement std::variant assignments and comparisons. Because that gets quite verbose and repetitive, a macro is used to stamp out the cases. We also need to handle the valueless_by_exception case, but only for raw visitation, because std::visit already checks for it before calling __do_visit. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/78113 * include/std/variant (__do_visit): Use a switch when we have a single variant with a small number of alternatives. Tested powerpc64le-linux. Committed to trunk. commit cfb582f62791dfadc243d97d37f0b83ef77cf480 Author: Jonathan Wakely Date: Tue May 4 23:31:48 2021 libstdc++: Optimize std::visit for the common case [PR 78113] GCC does not do a good job of optimizing the table of function pointers used for variant visitation. This avoids using the table for the common case of visiting a single variant with a small number of alternative types. Instead we use: switch(v.index()) { case 0: return visitor(get<0>(v)); case 1: return visitor(get<1>(v)); ... } It's not quite that simple, because get<1>(v) is ill-formed if the variant only has one alternative, and similarly for each get<N>. 
We need to ensure each case only applies the visitor if the index is in range for the actual type we're dealing with, and tell the compiler that the case is unreachable otherwise. We also need to invoke the visitor via the __gen_vtable_impl::__visit_invoke function, to handle the raw visitation cases used to implement std::variant assignments and comparisons. Because that gets quite verbose and repetitive, a macro is used to stamp out the cases. We also need to handle the valueless_by_exception case, but only for raw visitation, because std::visit already checks for it before calling __do_visit. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: PR libstdc++/78113 * include/std/variant (__do_visit): Use a switch when we have a single variant with a small number of alternatives. diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant index c651326ead9..19b2158690a 100644 --- a/libstdc++-v3/include/std/variant +++ b/libstdc++-v3/include/std/variant @@ -485,6 +485,12 @@ namespace __variant { if constexpr (__variant::__never_valueless<_Types...>()) return true; + // It would be nice if we could just return true for -fno-exceptions. + // It's possible (but inadvisable) that a std::variant could become + // valueless in a translation unit compiled with -fexceptions and then + // be passed to functions compiled with -fno-exceptions. We would need + // some #ifdef _GLIBCXX_NO_EXCEPTIONS_GLOBALLY property to elide all + // checks for valueless_by_exception(). return this->_M_index != static_cast<__index_type>(variant_npos); } @@ -1754,12 +1760,89 @@ namespace __variant constexpr decltype(auto) __do_visit(_Visitor&& __visitor, _Variants&&... __variants) { - constexpr auto& __vtable = __detail::__variant::__gen_vtable< - _Result_type, _Visitor&&, _Variants&&...>::_S_vtable; + // Get the silly case of visiting no variants out of the way first. 
+ if constexpr (sizeof...(_Variants) == 0) + return std::forward<_Visitor>(__visitor)(); + else + { + constexpr size_t __max = 11; // "These go to eleven." - auto __func_ptr = __vtable._M_access(__variants.index()...); - return (*__func_ptr)(std::forward<_Visitor>(__visitor), - std::forward<_Variants>(__variants)...); + // The type of the first variant in the pack. + using _V0 + = typename __detail::__variant::_Nth_type<0, _Variants...>::type; + // The number of alternatives in that first variant. + constexpr auto __n = variant_size_v<remove_reference_t<_V0>>; + + if constexpr (sizeof...(_Variants) > 1 || __n > __max) + { + // Use a jump table for the general case. + constexpr auto& __vtabl
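The switch-based dispatch described in the commit message can be sketched in portable C++17 as follows. This is not the libstdc++ implementation: small_visit, visit_case, visit_result_t, unreachable_case, type_name and name_of are hypothetical names, the sketch handles at most four alternatives and a single variant, the cases are written out by hand rather than stamped out by a macro, and std::abort() stands in for telling the compiler that a case is unreachable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Stand-in for an "unreachable" marker; only hit if the sketch is wrong.
[[noreturn]] inline void unreachable_case() { std::abort(); }

// Result of calling the visitor with the first alternative.
template<typename Visitor, typename Variant>
using visit_result_t
  = std::invoke_result_t<Visitor,
                         decltype(std::get<0>(std::declval<Variant>()))>;

// One case of the switch. get<I> is only named when I is a valid index
// for this variant type, so out-of-range cases still compile.
template<std::size_t I, typename Visitor, typename Variant>
visit_result_t<Visitor, Variant>
visit_case(Visitor&& vis, Variant&& var)
{
  if constexpr (I < std::variant_size_v<std::decay_t<Variant>>)
    {
      assert(var.index() == I);
      return std::forward<Visitor>(vis)(
          std::get<I>(std::forward<Variant>(var)));
    }
  else
    unreachable_case();
}

// Visit a single variant with a switch instead of a function-pointer table.
template<typename Visitor, typename Variant>
visit_result_t<Visitor, Variant>
small_visit(Visitor&& vis, Variant&& var)
{
  static_assert(std::variant_size_v<std::decay_t<Variant>> <= 4,
                "this sketch only writes out four cases");
  if (var.valueless_by_exception())
    throw std::bad_variant_access();

  switch (var.index())
    {
    case 0:
      return visit_case<0>(std::forward<Visitor>(vis),
                           std::forward<Variant>(var));
    case 1:
      return visit_case<1>(std::forward<Visitor>(vis),
                           std::forward<Variant>(var));
    case 2:
      return visit_case<2>(std::forward<Visitor>(vis),
                           std::forward<Variant>(var));
    case 3:
      return visit_case<3>(std::forward<Visitor>(vis),
                           std::forward<Variant>(var));
    default:
      unreachable_case();
    }
}

// Example visitor: names the active alternative.
struct type_name
{
  const char* operator()(int) const { return "int"; }
  const char* operator()(double) const { return "double"; }
  const char* operator()(const std::string&) const { return "string"; }
};

std::string name_of(const std::variant<int, double, std::string>& v)
{ return small_visit(type_name{}, v); }
```

For the three-alternative variant used by name_of, the fourth case compiles to the unreachable branch, which is the situation the message describes for get<N> with N out of range.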
[committed] libstdc++: Add utility for creating std::error_code from OS errors
This adds a helper function to encapsulate obtaining an error code for errors from OS calls. For Windows we want to use GetLastError() and the system error category, but otherwise just use errno and the generic error category. This should not be used to replace existing uses of ec.assign(errno, generic_category()) because in those cases we really do want to get the value of errno, not a system-specific error. Only the cases that currently use GetLastError() are replaced by this new function. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: * src/filesystem/ops-common.h (last_error): New helper function. (filesystem::do_space): Use last_error(). * src/c++17/fs_ops.cc (fs::absolute, fs::create_hard_link) (fs::equivalent, fs::remove, fs::temp_directory_path): Use last_error(). * src/filesystem/ops.cc (fs::create_hard_link) (fs::remove, fs::temp_directory_path): Likewise. Tested powerpc64le-linux. Committed to trunk. commit d71476c9df931f3ca674941f1942b03eabea010d Author: Jonathan Wakely Date: Wed Feb 10 18:00:00 2021 libstdc++: Add utility for creating std::error_code from OS errors This adds a helper function to encapsulate obtaining an error code for errors from OS calls. For Windows we want to use GetLastError() and the system error category, but otherwise just use errno and the generic error category. This should not be used to replace existing uses of ec.assign(errno, generic_category()) because in those cases we really do want to get the value of errno, not a system-specific error. Only the cases that currently use GetLastError() are replaced by this new function. Signed-off-by: Jonathan Wakely libstdc++-v3/ChangeLog: * src/filesystem/ops-common.h (last_error): New helper function. (filesystem::do_space): Use last_error(). * src/c++17/fs_ops.cc (fs::absolute, fs::create_hard_link) (fs::equivalent, fs::remove, fs::temp_directory_path): Use last_error(). * src/filesystem/ops.cc (fs::create_hard_link) (fs::remove, fs::temp_directory_path): Likewise. 
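A user-level analogue of such a helper could look like the sketch below. It is an illustration only: last_os_error and simulated_failure are hypothetical names, the portable _WIN32 macro is assumed in place of the library's internal configuration macro, and the demonstration function exercises only the POSIX branch.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#ifdef _WIN32
# include <windows.h>
#endif

// Hypothetical stand-in for the helper described above.
inline std::error_code last_os_error() noexcept
{
#ifdef _WIN32
  // Windows API failures are reported through GetLastError(), whose
  // values belong to the system category.
  return std::error_code(static_cast<int>(::GetLastError()),
                         std::system_category());
#else
  // On POSIX the last OS error is simply errno, in the generic category.
  return std::error_code(errno, std::generic_category());
#endif
}

// Demonstration for the POSIX branch: pretend an OS call has just
// failed with ENOENT and capture the resulting error_code.
inline std::error_code simulated_failure()
{
  errno = ENOENT;
  std::error_code ec = last_os_error();
#ifndef _WIN32
  assert(ec.category() == std::generic_category());
#endif
  return ec;
}
```

As the message notes, a helper like this is only appropriate after calls that report failure through GetLastError() on Windows; code that specifically wants errno should keep reading errno directly.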
diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc index 2eac9977785..4f3715bbbec 100644 --- a/libstdc++-v3/src/c++17/fs_ops.cc +++ b/libstdc++-v3/src/c++17/fs_ops.cc @@ -113,7 +113,7 @@ fs::absolute(const path& p, error_code& ec) while (len > buf.size()); if (len == 0) -ec.assign((int)GetLastError(), std::system_category()); +ec = __last_system_error(); else { buf.resize(len); @@ -682,7 +682,7 @@ fs::create_hard_link(const path& to, const path& new_hard_link, if (CreateHardLinkW(new_hard_link.c_str(), to.c_str(), NULL)) ec.clear(); else -ec.assign((int)GetLastError(), system_category()); +ec = __last_system_error(); #else ec = std::make_error_code(std::errc::not_supported); #endif @@ -874,12 +874,12 @@ fs::equivalent(const path& p1, const path& p2, error_code& ec) noexcept if (!h1 || !h2) { if (!h1 && !h2) - ec.assign((int)GetLastError(), system_category()); + ec = __last_system_error(); return false; } if (!h1.get_info() || !h2.get_info()) { - ec.assign((int)GetLastError(), system_category()); + ec = __last_system_error(); return false; } return h1.info.dwVolumeSerialNumber == h2.info.dwVolumeSerialNumber @@ -1255,7 +1255,7 @@ fs::remove(const path& p, error_code& ec) noexcept return true; } else if (!ec) - ec.assign((int)GetLastError(), system_category()); + ec = __last_system_error(); } else if (status_known(st)) ec.clear(); diff --git a/libstdc++-v3/src/filesystem/ops-common.h b/libstdc++-v3/src/filesystem/ops-common.h index bf26c06b7b5..e999e11b422 100644 --- a/libstdc++-v3/src/filesystem/ops-common.h +++ b/libstdc++-v3/src/filesystem/ops-common.h @@ -57,6 +57,18 @@ namespace std _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_VERSION + + // Get the last OS error (for POSIX this is just errno). 
+ inline error_code + __last_system_error() noexcept + { +#ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS +return {::GetLastError(), std::system_category()}; +#else +return {errno, std::generic_category()}; +#endif + } + namespace filesystem { namespace __gnu_posix @@ -558,7 +570,7 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM ec.clear(); } else - ec.assign((int)GetLastError(), std::system_category()); + ec = std::last_system_error(); #else ec = std::make_error_code(std::errc::not_supported); #endif @@ -583,7 +595,7 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM } while (len > buf.size()); if (len == 0) - ec.assign((int)GetLastError(), std::system_category()); + ec = __last_system_error(); else ec.clear(); diff