Distinguish global and unkonwn memory accesses in ipa-modref
Hi, As discussed in PR103585, fatigue2 is now only benchmark from my usual testing set (SPEC2k6, SPEC2k17, CPP benchmarks, polyhedron, Firefox, clang) which sees important regression when inlining functions called once is limited. This prevents us from solving runtime issues in roms benchmarks and elsewhere. The problem is that there is perdida function that takes many arguments and some of them are array descriptors. We constant propagate most of their fields but still keep their initialization. Because perdida is quite fast, the call overhead dominates, since we need over 100 memory stores consuing about 35% of the overall benchmark runtime. The memory stores would be eliminated if perdida did not call fortran I/O which makes modref to thin that the array descriptors could be accessed. We are quite close discovering that they can't becuase they are non-escaping from function. This patch makes modref to distingush between global memory access (only things that escapes) and unkonwn accesss (that may access also nonescaping things reaching the function). This makes disambiguation for functions containing error handling better. Unfortunately the patch hits two semi-latent issues in Fortran frontned. First is wrong code in gfortran.dg/unlimited_polymorphic_3.f03. This can be turned into wrong code testcase on both mainline and gcc11 if the runtime call is removed, so I filled PR 103662 for it. There is TBAA mismatch for structure produced in FE. Second is issue with GOMP where Fortran marks certain parameters as non-escaping and then makes them escape via GOMP_parallel. For this I disabled the use of escape info in verify_arg which also disables the useful transform on perdida (since we now punt on any call that takes non-constant parameters) but still does useful work for e.g. GCC error handling where we call my_friendly_abort. I will work on this incrementally. Bootstrapped/regtested x86_64-linux, lto-bootstrapped and also tested with clang build. I plan to commit this tomorrow if there are no complains (the patch is not completely short but conceptualy simple and handles a lot of common cases). gcc/ChangeLog: 2021-12-12 Jan Hubicka PR ipa/103585 * ipa-modref-tree.c (modref_access_node::range_info_useful_p): Handle MODREF_GLOBAL_MEMORY_PARM. (modref_access_node::dump): Likewise. (modref_access_node::get_call_arg): Likewise. * ipa-modref-tree.h (enum modref_special_parms): Add MODREF_GLOBAL_MEMORY_PARM. (modref_access_node::useful_for_kill): Handle MODREF_GLOBAL_MEMORY_PARM. (modref:tree::merge): Add promote_unknown_to_global. * ipa-modref.c (verify_arg):New function. (may_access_nonescaping_parm_p): New function. (modref_access_analysis::record_global_memory_load): New member function. (modref_access_analysis::record_global_memory_store): Likewise. (modref_access_analysis::process_fnspec): Distingush global and local memory. (modref_access_analysis::analyze_call): Likewise. * tree-ssa-alias.c (ref_may_access_global_memory_p): New function. (modref_may_conflict): Use it. gcc/testsuite/ChangeLog: 2021-12-12 Jan Hubicka * gcc.dg/analyzer/data-model-1.c: Disable ipa-modref. * gcc.dg/uninit-38.c: Likewise. * gcc.dg/uninit-pr98578.c: Liewise. diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c index a868e3e3983..64ef0772343 100644 --- a/gcc/ipa-modref-tree.c +++ b/gcc/ipa-modref-tree.c @@ -36,7 +36,8 @@ modref_access_node::operator == (modref_access_node &a) const { if (parm_index != a.parm_index) return false; - if (parm_index != MODREF_UNKNOWN_PARM) + if (parm_index != MODREF_UNKNOWN_PARM + && parm_index != MODREF_GLOBAL_MEMORY_PARM) { if (parm_offset_known != a.parm_offset_known) return false; @@ -613,7 +614,9 @@ modref_access_node::insert (vec *&accesses, bool modref_access_node::range_info_useful_p () const { - return parm_index != MODREF_UNKNOWN_PARM && parm_offset_known + return parm_index != MODREF_UNKNOWN_PARM +&& parm_index != MODREF_GLOBAL_MEMORY_PARM +&& parm_offset_known && (known_size_p (size) || known_size_p (max_size) || known_ge (offset, 0)); @@ -625,7 +628,9 @@ modref_access_node::dump (FILE *out) { if (parm_index != MODREF_UNKNOWN_PARM) { - if (parm_index >= 0) + if (parm_index == MODREF_GLOBAL_MEMORY_PARM) + fprintf (out, " Base in global memory"); + else if (parm_index >= 0) fprintf (out, " Parm %i", parm_index); else if (parm_index == MODREF_STATIC_CHAIN_PARM) fprintf (out, " Static chain"); @@ -655,7 +660,8 @@ modref_access_node::dump (FILE *out) tree modref_access_node::get_call_arg (const gcall *stmt) const { - if (parm_index == MODREF_UNKNOWN_PARM) + if (parm_index == MODREF_UNKNOWN_PARM + || parm_index == MODREF_GLOBAL_
[PATCH] gcc/diagnostic.c: make -Werror message more helpful
Hello. I propose to make that message more verbose. It sure would have helped me once. You don't always have a Web search available :) Andrea Monaco diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 4ded1760705..8b67662390e 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -156,7 +156,7 @@ default_diagnostic_final_cb (diagnostic_context *context) /* -Werror was given. */ if (context->warning_as_error_requested) pp_verbatim (context->printer, -_("%s: all warnings being treated as errors"), +_("%s: all warnings being treated as errors (-Werror; disable with -Wno-error)"), progname); /* At least one -Werror= was given. */ else
Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref
Hi, As discussed in the PR, we miss some optimization becuase gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds __builtin_trap after them. This is seen as a side-effect by IPA analysis and additionally the (fully unreachable) builtin_trap is believed to load all global memory. I think we should think of less intrusive gimple representation of this, but it is also easy enough to special case that in IPA analysers as done in this patch. This is a win even if we improve the representation since gimple-ssa-isolate-paths is run late and this way we improve optimization early. This affects 1623 functions during cc1plus link. Current cc1plus disambiguation stats are: Alias oracle query stats: refs_may_alias_p: 76712244 disambiguations, 90400285 queries ref_maybe_used_by_call_p: 581770 disambiguations, 77716066 queries call_may_clobber_ref_p: 322533 disambiguations, 325280 queries stmt_kills_ref_p: 106434 kills, 5728738 queries nonoverlapping_component_refs_p: 0 disambiguations, 8849 queries nonoverlapping_refs_since_match_p: 29947 disambiguations, 66277 must overlaps, 97248 queries aliasing_component_refs_p: 55737 disambiguations, 3055719 queries TBAA oracle: 27336487 disambiguations 67140201 queries 16214932 are in alias set 0 9713534 queries asked about the same object 92 queries asked about the same alias set 0 access volatile 12049850 are dependent in the DAG 1825306 are aritificially in conflict with void * Modref stats: modref kill: 70 kills, 8279 queries modref use: 29557 disambiguations, 701616 queries modref clobber: 1612655 disambiguations, 21688020 queries 4925889 tbaa queries (0.227125 per modref query) 864389 base compares (0.039856 per modref query) PTA query stats: pt_solution_includes: 13512509 disambiguations, 27571176 queries pt_solutions_intersect: 1594296 disambiguations, 15943975 queries Bootstrapped/regtested x86_64-linux, comitted. gcc/ChangeLog: 2021-12-12 Jan Hubicka PR ipa/103665 * ipa-modref.c (modref_access_analysis::analyze): Terminate BB analysis on NULL memory access. * ipa-pure-const.c (analyze_function): Likewise. diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index 55fa873e1f0..2c89c63baf6 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -1750,6 +1750,19 @@ modref_access_analysis::analyze () for (si = gsi_start_nondebug_after_labels_bb (bb); !gsi_end_p (si); gsi_next_nondebug (&si)) { + /* NULL memory accesses terminates BB. These accesses are known +to trip undefined behaviour. gimple-ssa-isolate-paths turns them +to volatile accesses and adds builtin_trap call which would +confuse us otherwise. */ + if (infer_nonnull_range_by_dereference (gsi_stmt (si), + null_pointer_node)) + { + if (dump_file) + fprintf (dump_file, " - NULL memory access; terminating BB\n"); + if (flag_non_call_exceptions) + set_side_effects (); + break; + } analyze_stmt (gsi_stmt (si), always_executed); /* Avoid doing useles work. */ diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c index fea8b08c4eb..25503f360e6 100644 --- a/gcc/ipa-pure-const.c +++ b/gcc/ipa-pure-const.c @@ -1097,6 +1097,28 @@ analyze_function (struct cgraph_node *fn, bool ipa) !gsi_end_p (gsi); gsi_next (&gsi)) { + /* NULL memory accesses terminates BB. These accesses are known +to trip undefined behaviour. gimple-ssa-isolate-paths turns them +to volatile accesses and adds builtin_trap call which would +confuse us otherwise. */ + if (infer_nonnull_range_by_dereference (gsi_stmt (gsi), + null_pointer_node)) + { + if (dump_file) + fprintf (dump_file, " NULL memory access; terminating BB%s\n", +fla
[pach, power-ieee128, committed] Generate config.h macros for ieee128 functions
Pushed to the branch. Generate config.h macros for IEEE128 math functions. libgfortran/ChangeLog: * acinclude.m4 (LIBGFOR_CHECK_MATH_IEEE128): New macro. * configure.ac: Use it. * config.h.in: Regenerate. * configure: Regenerate. diff --git a/libgfortran/Makefile.in b/libgfortran/Makefile.in index 184eb7fd926..a306fcbebd3 100644 --- a/libgfortran/Makefile.in +++ b/libgfortran/Makefile.in @@ -719,6 +719,7 @@ pdfdir = @pdfdir@ prefix = @prefix@ program_transform_name = @program_transform_name@ psdir = @psdir@ +runstatedir = @runstatedir@ sbindir = @sbindir@ sharedstatedir = @sharedstatedir@ srcdir = @srcdir@ diff --git a/libgfortran/acinclude.m4 b/libgfortran/acinclude.m4 index 5b0c094e716..bf4cc782ec2 100644 --- a/libgfortran/acinclude.m4 +++ b/libgfortran/acinclude.m4 @@ -510,3 +510,26 @@ AC_DEFUN([LIBGFOR_CHECK_AVX128], [ [AM_CONDITIONAL([HAVE_AVX128],false)]) CFLAGS="$ac_save_CFLAGS" ]) + +AC_DEFUN([LIBGFOR_CHECK_MATH_IEEE128], +[ + AC_REQUIRE([GCC_CHECK_LIBM]) + AC_REQUIRE([GCC_CHECK_MATH_HEADERS]) + AC_CACHE_CHECK([for $1], [gcc_cv_math_func_$1], + [AC_LINK_IFELSE([AC_LANG_SOURCE([ +__float128 $1 (__float128); +__float128 (*ptr)(__float128) = $1; + +int +main () +{ + return 0; +} +])], +[gcc_cv_math_func_$1=yes], +[gcc_cv_math_func_$1=no])]) + if test $gcc_cv_math_func_$1 = yes; then +AC_DEFINE_UNQUOTED(AS_TR_CPP(HAVE_$1),[1], + [Define to 1 if you have the `$1' function.]) + fi +]) diff --git a/libgfortran/config.h.in b/libgfortran/config.h.in index 2d58188e50c..02dcf49c49a 100644 --- a/libgfortran/config.h.in +++ b/libgfortran/config.h.in @@ -888,9 +888,75 @@ /* Define to 1 if you have the `ynl' function. */ #undef HAVE_YNL +/* Define to 1 if you have the `__acoshieee128' function. */ +#undef HAVE___ACOSHIEEE128 + +/* Define to 1 if you have the `__acosieee128' function. */ +#undef HAVE___ACOSIEEE128 + +/* Define to 1 if you have the `__asinhieee128' function. */ +#undef HAVE___ASINHIEEE128 + +/* Define to 1 if you have the `__asinieee128' function. */ +#undef HAVE___ASINIEEE128 + +/* Define to 1 if you have the `__atan2ieee128' function. */ +#undef HAVE___ATAN2IEEE128 + +/* Define to 1 if you have the `__atanhieee128' function. */ +#undef HAVE___ATANHIEEE128 + +/* Define to 1 if you have the `__atanieee128' function. */ +#undef HAVE___ATANIEEE128 + +/* Define to 1 if you have the `__coshieee128' function. */ +#undef HAVE___COSHIEEE128 + +/* Define to 1 if you have the `__cosieee128' function. */ +#undef HAVE___COSIEEE128 + +/* Define to 1 if you have the `__erfieee128' function. */ +#undef HAVE___ERFIEEE128 + +/* Define to 1 if you have the `__expieee128' function. */ +#undef HAVE___EXPIEEE128 + +/* Define to 1 if you have the `__fabsieee128' function. */ +#undef HAVE___FABSIEEE128 + +/* Define to 1 if you have the `__jnieee128' function. */ +#undef HAVE___JNIEEE128 + +/* Define to 1 if you have the `__log10ieee128' function. */ +#undef HAVE___LOG10IEEE128 + +/* Define to 1 if you have the `__logieee128' function. */ +#undef HAVE___LOGIEEE128 + +/* Define to 1 if you have the `__powieee128' function. */ +#undef HAVE___POWIEEE128 + /* Define to 1 if you have the `__secure_getenv' function. */ #undef HAVE___SECURE_GETENV +/* Define to 1 if you have the `__sinhieee128' function. */ +#undef HAVE___SINHIEEE128 + +/* Define to 1 if you have the `__sinieee128' function. */ +#undef HAVE___SINIEEE128 + +/* Define to 1 if you have the `__sqrtieee128' function. */ +#undef HAVE___SQRTIEEE128 + +/* Define to 1 if you have the `__tanhieee128' function. */ +#undef HAVE___TANHIEEE128 + +/* Define to 1 if you have the `__tanieee128' function. */ +#undef HAVE___TANIEEE128 + +/* Define to 1 if you have the `__ynieee128' function. */ +#undef HAVE___YNIEEE128 + /* Define to the sub-directory in which libtool stores uninstalled libraries. */ #undef LT_OBJDIR diff --git a/libgfortran/configure b/libgfortran/configure index 090c8dd3d76..845d4fec5a1 100755 --- a/libgfortran/configure +++ b/libgfortran/configure @@ -6028,7 +6028,6 @@ if ac_fn_c_try_cpp "$LINENO"; then : AM_FCFLAGS="$AM_FCFLAGS -mabi=ibmlongdouble"; AM_CFLAGS="$AM_CFLAGS -mabi=ibmlongdouble"; CFLAGS="$CFLAGS -mabi=ibmlongdouble"; - echo "Juhu!"; have_real_17=yes fi rm -f conftest.err conftest.i conftest.$ac_ext @@ -6045,6 +6044,7 @@ else HAVE_REAL_17_FALSE= fi + # Add CET specific flags if CET is enabled # Check whether --enable-cet was given. if test "${enable_cet+set}" = set; then : @@ -25846,6 +25846,935 @@ _ACEOF fi +# For POWER, check the ieee128 math functions + +if test "x$have_real_17" = "xyes"; then + + + + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __acoshieee128" >&5 +$as_echo_n "checking for __acoshieee128... " >&6; } +if ${gcc_cv_math_func___acoshieee128+:} false; then : + $as_echo_n "(cached) " >&6 +else + if test x$gcc_no_link = xyes; then + as_fn_error $? "Link tests are not allowed after GCC_NO_EXECUTA
Add -fipa-strict-aliasing
Hi, ipa-modref is using TBAA to disambiguate memory accesses inter-procedurally. This sometimes breaks programs with TBAA violations including clang with LTO. To workaround that one can use -fno-strict-aliasing or -fno-ipa-modref which are both quite big hammers. So I added -fipa-strict-aliasing patch that controls only the TBAA based analysis in ipa-modref while keeping all other optimizations. Bootstrapped/regtested x86_64-linux, will commit it shortly. gcc/ChangeLog: 2021-12-12 Jan Hubicka * common.opt: Add -fipa-strict-aliasing. * doc/invoke.texi: Document -fipa-strict-aliasing. * ipa-modref.c (modref_access_analysis::record_access): Honor -fipa-strict-aliasing. (modref_access_analysis::record_access_lto): Likewise. diff --git a/gcc/common.opt b/gcc/common.opt index 445a53a265c..673813f34ab 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1945,6 +1945,10 @@ fira-algorithm= Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) Init(IRA_ALGORITHM_CB) Optimization -fira-algorithm=[CB|priority] Set the used IRA algorithm. +fipa-strict-aliasing +Common Var(flag_ipa_strict_aliasing) Init(1) Optimization +Assume strict aliasing rules apply across function boundaries. + Enum Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA algorithm %qs) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 9b4371b9213..afd85afe476 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -570,7 +570,7 @@ Objective-C and Objective-C++ Dialects}. -fsingle-precision-constant -fsplit-ivs-in-unroller -fsplit-loops@gol -fsplit-paths @gol -fsplit-wide-types -fsplit-wide-types-early -fssa-backprop -fssa-phiopt @gol --fstdarg-opt -fstore-merging -fstrict-aliasing @gol +-fstdarg-opt -fstore-merging -fstrict-aliasing -fipa-strict-aliasing @gol -fthread-jumps -ftracer -ftree-bit-ccp @gol -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol -ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol @@ -12423,6 +12423,15 @@ int f() @{ The @option{-fstrict-aliasing} option is enabled at levels @option{-O2}, @option{-O3}, @option{-Os}. +@item -fipa-strict-aliasing +@opindex fipa-strict-aliasing +Constrols whether rules of @option{-fstrict-aliasing} can be applied across +function boundaries. This option is effective only in combination with +@option{-fstrict-aliasing} and does not prevent optimization across function +boundry in case function has been inlined. + +The @option{-fipa-strict-aliasing} option is enabled by default. + @item -falign-functions @itemx -falign-functions=@var{n} @itemx -falign-functions=@var{n}:@var{m} diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index 2c89c63baf6..d6bd9d33278 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -999,9 +999,11 @@ modref_access_analysis::record_access (modref_records *tt, ao_ref *ref, modref_access_node &a) { - alias_set_type base_set = !flag_strict_aliasing ? 0 + alias_set_type base_set = !flag_strict_aliasing + || !flag_ipa_strict_aliasing ? 0 : ao_ref_base_alias_set (ref); - alias_set_type ref_set = !flag_strict_aliasing ? 0 + alias_set_type ref_set = !flag_strict_aliasing + || !flag_ipa_strict_aliasing ? 0 : (ao_ref_alias_set (ref)); if (dump_file) { @@ -1021,7 +1023,7 @@ modref_access_analysis::record_access_lto (modref_records_lto *tt, ao_ref *ref, /* get_alias_set sometimes use different type to compute the alias set than TREE_TYPE (base). Do same adjustments. */ tree base_type = NULL_TREE, ref_type = NULL_TREE; - if (flag_strict_aliasing) + if (flag_strict_aliasing && flag_ipa_strict_aliasing) { tree base;
Re: [PATCH] nvptx: Add (experimental) support for HFmode with -misa=sm_53
On 9/17/21 10:24 AM, Tobias Burnus wrote: > Hi Roger, > > some more generic remarks not specific to using new ISA features. > > On 17.09.21 00:53, Roger Sayle wrote: > >> Whilst there I also added -misa=sm_75 and -misa=sm_80 which are points >> where other useful instructions were added to the ISA. > > First, my impression was that already sm_70 added lots of useful stuff, > but granted sm_75 adds some more. In any case, the question is whether > it makes sense to add those flags if no use is made of those flags. > > In particular, sm_80 is according to the following webpage only > supported with PTX ISA 7.0 of CUDA 11.0. But GCC currently only supports > -mptx=3.6 (default) and -mptx=6.3 (= CUDA 10). > https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes > I've dropped the sm_75 and sm_80 parts of the patch. > > Note that you missed to update gcc/config/nvptx/t-omp-device for the new > sm_* and likewise the "-misa=@var{ISA-string}" section in > gcc/gcc/doc/invoke.texi. > The support is currently not in a state that it can be advertised or used at runtime, so I haven't included these changes. Currently gcc doesn't build if the default is switched to sm_53 ( and mptx=6.3 as well), it runs into some internal errors in libgcc. I've also tried to just run a float16 tests with target board nvptx-none-run/-misa=sm_53/-mptx=6.3, but the dg-require-effective-target float16 test doesn't succeed, again gcc runs into internal errors. But these are things that can be addressed later. Committed. Thanks, - Tom > Additionally, I wonder whether the preprocessor macros __nvptx__, > __nvptx_softstack__, __nvptx_unisimt__ and __PTX_SM__ should be > documented somewhere as well. As all but one are related to command-line > options, I wonder whether the respective section in invoke.texi would be > a good place for them. > > Tobias > > - > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, > 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: > Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; > Registergericht München, HRB 106955
Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref
On December 12, 2021 11:49:12 AM GMT+01:00, Jan Hubicka via Gcc-patches wrote: >Hi, >As discussed in the PR, we miss some optimization becuase >gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds >__builtin_trap after them. This is seen as a side-effect by IPA analysis >and additionally the (fully unreachable) builtin_trap is believed to load >all global memory. > >I think we should think of less intrusive gimple representation of this, but >it is also easy enough to special case that in IPA analysers as done in >this patch. This is a win even if we improve the representation since >gimple-ssa-isolate-paths is run late and this way we improve optimization >early. > >This affects 1623 functions during cc1plus link. >Current cc1plus disambiguation stats are: > >Alias oracle query stats: > > refs_may_alias_p: 76712244 disambiguations, 90400285 queries > > ref_maybe_used_by_call_p: 581770 disambiguations, 77716066 queries > > call_may_clobber_ref_p: 322533 disambiguations, 325280 queries > > stmt_kills_ref_p: 106434 kills, 5728738 queries > > nonoverlapping_component_refs_p: 0 disambiguations, 8849 queries > > nonoverlapping_refs_since_match_p: 29947 disambiguations, 66277 must > overlaps, 97248 queries > aliasing_component_refs_p: 55737 disambiguations, 3055719 queries > > TBAA oracle: 27336487 disambiguations 67140201 queries > > 16214932 are in alias set 0 > > 9713534 queries asked about the same object > > 92 queries asked about the same alias set > > 0 access volatile > > 12049850 are dependent in the DAG > > 1825306 are aritificially in conflict with void * > > > >Modref stats: > > modref kill: 70 kills, 8279 queries > > modref use: 29557 disambiguations, 701616 queries > > modref clobber: 1612655 disambiguations, 21688020 queries > > 4925889 tbaa queries (0.227125 per modref query) > > 864389 base compares (0.039856 per modref query) > > > >PTA query stats: > > pt_solution_includes: 13512509 disambiguations, 27571176 queries > > pt_solutions_intersect: 1594296 disambiguations, 15943975 queries > > > >Bootstrapped/regtested x86_64-linux, comitted. > >gcc/ChangeLog: > >2021-12-12 Jan Hubicka > > PR ipa/103665 > * ipa-modref.c (modref_access_analysis::analyze): Terminate BB > analysis on NULL memory access. > * ipa-pure-const.c (analyze_function): Likewise. > >diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c >index 55fa873e1f0..2c89c63baf6 100644 >--- a/gcc/ipa-modref.c >+++ b/gcc/ipa-modref.c >@@ -1750,6 +1750,19 @@ modref_access_analysis::analyze () > for (si = gsi_start_nondebug_after_labels_bb (bb); > !gsi_end_p (si); gsi_next_nondebug (&si)) > { >+/* NULL memory accesses terminates BB. These accesses are known >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them >+ to volatile accesses and adds builtin_trap call which would >+ confuse us otherwise. */ >+if (infer_nonnull_range_by_dereference (gsi_stmt (si), >+null_pointer_node)) Does that correctly handle flag_delete_null_pointer_checks and address spaces with null? >+ { >+if (dump_file) >+ fprintf (dump_file, " - NULL memory access; terminating BB\n"); >+if (flag_non_call_exceptions) >+ set_side_effects (); >+break; >+ } > analyze_stmt (gsi_stmt (si), always_executed); > > /* Avoid doing useles work. */ >diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c >index fea8b08c4eb..25503f360e6 100644 >--- a/gcc/ipa-pure-const.c >+++ b/gcc/ipa-pure-const.c >@@ -1097,6 +1097,28 @@ analyze_function (struct cgraph_node *fn, bool ipa) > !gsi_end_p (gsi); > gsi_next (&gsi)) > { >+/* NULL memory accesses terminates BB. These accesses are known >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them >+ to volatile accesses and adds builtin_trap call which would >+ co
Re: Add -fipa-strict-aliasing
On December 12, 2021 1:22:09 PM GMT+01:00, Jan Hubicka via Gcc-patches wrote: >Hi, >ipa-modref is using TBAA to disambiguate memory accesses inter-procedurally. >This sometimes breaks programs with TBAA violations including clang with LTO. >To workaround that one can use -fno-strict-aliasing or -fno-ipa-modref which >are both quite big hammers. So I added -fipa-strict-aliasing patch that >controls only the TBAA based analysis in ipa-modref while keeping all other >optimizations. > >Bootstrapped/regtested x86_64-linux, will commit it shortly. I think at the documentation level we need to clarify that TBAA exposed by inlining is not considered IPA. The current wording is confusing in this regard. Or we need to somehow preserve a TBAA barrier at inline instances (like drop all inlined refs to alias set zero) >gcc/ChangeLog: > >2021-12-12 Jan Hubicka > > * common.opt: Add -fipa-strict-aliasing. > * doc/invoke.texi: Document -fipa-strict-aliasing. > * ipa-modref.c (modref_access_analysis::record_access): Honor > -fipa-strict-aliasing. > (modref_access_analysis::record_access_lto): Likewise. > >diff --git a/gcc/common.opt b/gcc/common.opt >index 445a53a265c..673813f34ab 100644 >--- a/gcc/common.opt >+++ b/gcc/common.opt >@@ -1945,6 +1945,10 @@ fira-algorithm= > Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) > Init(IRA_ALGORITHM_CB) Optimization > -fira-algorithm=[CB|priority] Set the used IRA algorithm. > >+fipa-strict-aliasing >+Common Var(flag_ipa_strict_aliasing) Init(1) Optimization >+Assume strict aliasing rules apply across function boundaries. >+ > Enum > Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA > algorithm %qs) > >diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >index 9b4371b9213..afd85afe476 100644 >--- a/gcc/doc/invoke.texi >+++ b/gcc/doc/invoke.texi >@@ -570,7 +570,7 @@ Objective-C and Objective-C++ Dialects}. > -fsingle-precision-constant -fsplit-ivs-in-unroller -fsplit-loops@gol > -fsplit-paths @gol > -fsplit-wide-types -fsplit-wide-types-early -fssa-backprop -fssa-phiopt > @gol >--fstdarg-opt -fstore-merging -fstrict-aliasing @gol >+-fstdarg-opt -fstore-merging -fstrict-aliasing -fipa-strict-aliasing @gol > -fthread-jumps -ftracer -ftree-bit-ccp @gol > -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol > -ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol >@@ -12423,6 +12423,15 @@ int f() @{ > The @option{-fstrict-aliasing} option is enabled at levels > @option{-O2}, @option{-O3}, @option{-Os}. > >+@item -fipa-strict-aliasing >+@opindex fipa-strict-aliasing >+Constrols whether rules of @option{-fstrict-aliasing} can be applied across >+function boundaries. This option is effective only in combination with >+@option{-fstrict-aliasing} and does not prevent optimization across function >+boundry in case function has been inlined. >+ >+The @option{-fipa-strict-aliasing} option is enabled by default. >+ > @item -falign-functions > @itemx -falign-functions=@var{n} > @itemx -falign-functions=@var{n}:@var{m} >diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c >index 2c89c63baf6..d6bd9d33278 100644 >--- a/gcc/ipa-modref.c >+++ b/gcc/ipa-modref.c >@@ -999,9 +999,11 @@ modref_access_analysis::record_access (modref_records *tt, > ao_ref *ref, > modref_access_node &a) > { >- alias_set_type base_set = !flag_strict_aliasing ? 0 >+ alias_set_type base_set = !flag_strict_aliasing >+ || !flag_ipa_strict_aliasing ? 0 > : ao_ref_base_alias_set (ref); >- alias_set_type ref_set = !flag_strict_aliasing ? 0 >+ alias_set_type ref_set = !flag_strict_aliasing >+ || !flag_ipa_strict_aliasing ? 0 > : (ao_ref_alias_set (ref)); > if (dump_file) > { >@@ -1021,7 +1023,7 @@ modref_access_analysis::record_access_lto >(modref_records_lto *tt, ao_ref *ref, > /* get_alias_set sometimes use different type to compute the alias set > than TREE_TYPE (base). Do same adjustments. */ > tree base_type = NULL_TREE, ref_type = NULL_TREE; >- if (flag_strict_aliasing) >+ if (flag_strict_aliasing && flag_ipa_strict_aliasing) > { > tree base; >
Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref
On Sun, Dec 12, 2021 at 4:34 AM Richard Biener via Gcc-patches wrote: > > On December 12, 2021 11:49:12 AM GMT+01:00, Jan Hubicka via Gcc-patches > wrote: > >Hi, > >As discussed in the PR, we miss some optimization becuase > >gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds > >__builtin_trap after them. This is seen as a side-effect by IPA analysis > >and additionally the (fully unreachable) builtin_trap is believed to load > >all global memory. > > > >I think we should think of less intrusive gimple representation of this, but > >it is also easy enough to special case that in IPA analysers as done in > >this patch. This is a win even if we improve the representation since > >gimple-ssa-isolate-paths is run late and this way we improve optimization > >early. > > > >This affects 1623 functions during cc1plus link. > >Current cc1plus disambiguation stats are: > > > >Alias oracle query stats: > > refs_may_alias_p: 76712244 disambiguations, 90400285 queries > > ref_maybe_used_by_call_p: 581770 disambiguations, 77716066 queries > > call_may_clobber_ref_p: 322533 disambiguations, 325280 queries > > stmt_kills_ref_p: 106434 kills, 5728738 queries > > nonoverlapping_component_refs_p: 0 disambiguations, 8849 queries > > nonoverlapping_refs_since_match_p: 29947 disambiguations, 66277 must > > overlaps, 97248 queries > > aliasing_component_refs_p: 55737 disambiguations, 3055719 queries > > TBAA oracle: 27336487 disambiguations 67140201 queries > > 16214932 are in alias set 0 > > 9713534 queries asked about the same object > > 92 queries asked about the same alias set > > 0 access volatile > > 12049850 are dependent in the DAG > > 1825306 are aritificially in conflict with void * > > > >Modref stats: > > modref kill: 70 kills, 8279 queries > > modref use: 29557 disambiguations, 701616 queries > > modref clobber: 1612655 disambiguations, 21688020 queries > > 4925889 tbaa queries (0.227125 per modref query) > > 864389 base compares (0.039856 per modref query) > > > >PTA query stats: > > pt_solution_includes: 13512509 disambiguations, 27571176 queries > > pt_solutions_intersect: 1594296 disambiguations, 15943975 queries > > > > > >Bootstrapped/regtested x86_64-linux, comitted. > > > >gcc/ChangeLog: > > > >2021-12-12 Jan Hubicka > > > > PR ipa/103665 > > * ipa-modref.c (modref_access_analysis::analyze): Terminate BB > > analysis on NULL memory access. > > * ipa-pure-const.c (analyze_function): Likewise. > > > >diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c > >index 55fa873e1f0..2c89c63baf6 100644 > >--- a/gcc/ipa-modref.c > >+++ b/gcc/ipa-modref.c > >@@ -1750,6 +1750,19 @@ modref_access_analysis::analyze () > > for (si = gsi_start_nondebug_after_labels_bb (bb); > > !gsi_end_p (si); gsi_next_nondebug (&si)) > > { > >+/* NULL memory accesses terminates BB. These accesses are known > >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them > >+ to volatile accesses and adds builtin_trap call which would > >+ confuse us otherwise. */ > >+if (infer_nonnull_range_by_dereference (gsi_stmt (si), > >+null_pointer_node)) > > Does that correctly handle flag_delete_null_pointer_checks and address spaces > with null? Yes. infer_nonnull_range_by_dereference has the following check: /* We can only assume that a pointer dereference will yield non-NULL if -fdelete-null-pointer-checks is enabled. */ if (!flag_delete_null_pointer_checks || !POINTER_TYPE_P (TREE_TYPE (op)) || gimple_code (stmt) == GIMPLE_ASM || gimple_clobber_p (stmt)) return false; And then calls walk_stmt_load_store_ops with check_loadstore as the callback function which does: /* Some address spaces may legitimately dereference zero. */ addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (op)); if (targetm.addr_space.zero_address_valid (as)) return false; return operand_equal_p (TREE_OPERAND (op, 0), (tree)data, 0); Thanks, Andrew Pinski > > >+ { > >+if (dump_file) > >+ fprintf (dump_file, " - NULL memory access; terminating > >BB\n"); > >+if (flag_non_call_exceptions) > >+ set_side_effects (); > >+break; > >+ } > > analyze_stmt (gsi_stmt (si), always_executed); > > > > /* Avoid doing useles work. */ > >diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c > >index fea8b08c4eb..25503f360e6 100644 > >--- a/gcc/ipa-pure-const.c > >+++ b/gcc/ipa-pure-const.c > >@@ -1097,6 +1097,28 @@ analyze_function (struct cgraph_node *fn, bool ipa) > > !gsi_end_p (gsi); > > gsi_next (&gsi)) > > { > >+/* NULL memory accesses terminates BB. These accesses are known > >+ to trip undefined behaviour. gimpl
Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref
> >+ /* NULL memory accesses terminates BB. These accesses are known > >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them > >+ to volatile accesses and adds builtin_trap call which would > >+ confuse us otherwise. */ > >+ if (infer_nonnull_range_by_dereference (gsi_stmt (si), > >+ null_pointer_node)) > > Does that correctly handle flag_delete_null_pointer_checks and address spaces > with null? Yes it does both - it is also same check that gimple-ssa-isolate-paths eventually uses to add the trap so we would probably see PRs if it did not. Honza
Re: Add -fipa-strict-aliasing
> On December 12, 2021 1:22:09 PM GMT+01:00, Jan Hubicka via Gcc-patches > wrote: > >Hi, > >ipa-modref is using TBAA to disambiguate memory accesses inter-procedurally. > >This sometimes breaks programs with TBAA violations including clang with LTO. > >To workaround that one can use -fno-strict-aliasing or -fno-ipa-modref which > >are both quite big hammers. So I added -fipa-strict-aliasing patch that > >controls only the TBAA based analysis in ipa-modref while keeping all other > >optimizations. > > > >Bootstrapped/regtested x86_64-linux, will commit it shortly. > > I think at the documentation level we need to clarify that TBAA exposed by > inlining is not considered IPA. The current wording is confusing in this > regard. Or we need to somehow preserve a TBAA barrier at inline instances > (like drop all inlined refs to alias set zero) It is what I meant by: "... and does not prevent optimization across function boundary in case function has been inlined." Perhaps I should not have packed it into single sentence "This option does not affect strict aliasing rules when optimizing memory locations originating from different functions that has been inlined together" I do not think it is practical to extend representation to handle function boundaries, but I still find this flag useful (I have it in a tree for a while) Honza
[Patch] invoke.texi: Add sm_35 to Nvidia PTX's -misa=
Hi Tom, this documents the new sm_35 value to -misa=, which Roger added (and uses for _Float16). OK? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 invoke.texi: Add sm_35 to Nvidia PTX's -misa= gcc/ChangeLog: * doc/invoke.texi (Nvidia PTX): Add sm_35 to -misa. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 9b4371b9213..b68905436e5 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -27035,8 +27035,8 @@ supported. @item -misa=@var{ISA-string} @opindex march Generate code for given the specified PTX ISA (e.g.@: @samp{sm_35}). ISA -strings must be lower-case. Valid ISA strings include @samp{sm_30} and -@samp{sm_35}. The default ISA is sm_35. +strings must be lower-case. Valid ISA strings include @samp{sm_30}, +@samp{sm_35} and @samp{sm_53}. The default ISA is sm_35. @item -mptx=@var{version-string} @opindex mptx
Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref
On 12/12/2021 3:49 AM, Jan Hubicka via Gcc-patches wrote: Hi, As discussed in the PR, we miss some optimization becuase gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds __builtin_trap after them. This is seen as a side-effect by IPA analysis and additionally the (fully unreachable) builtin_trap is believed to load all global memory. I think we should think of less intrusive gimple representation of this, but it is also easy enough to special case that in IPA analysers as done in this patch. This is a win even if we improve the representation since gimple-ssa-isolate-paths is run late and this way we improve optimization early. So what's important about the IR representation is that we keep some kind of memory access around (so that it will fault), that the memory access reference a minimal amount of other stuff (we don't want the address computation to keep anything live for example) and that we have a subsequent trap so that the CFG routines know it traps. Otherwise we're free to do whatever we want. jeff
Re: [Patch] invoke.texi: Add sm_35 to Nvidia PTX's -misa=
I want to WITHDRAW that patch. I should read _emails_ before acting on _commit_ logs ... Reason is given by Tom at: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586649.html However, we still shouldn't forget to update *.texi and t-omp-device eventually. Tobias On 12.12.21 17:01, Tobias Burnus wrote: Hi Tom, this documents the new sm_35 value to -misa=, which Roger added (and uses for _Float16). OK? Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[wwwdocs][committed] gcc-12/changes.html: Fix OpenMP typo 'allocator'
The fix obvious: while the clause is 'allocate', the 'allocate-modifiers' are 'allocator' (not 'allocate') and 'align'. Cf. https://www.openmp.org/spec-html/5.1/openmpsu63.html#x90-102.13.4 Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 commit 54f793640e5a94dca4ab51249d603b909e43f2ed Author: Tobias Burnus Date: Sun Dec 12 17:10:03 2021 +0100 gcc-12/changes.html: Fix OpenMP typo 'allocator' diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html index fd49e208..b1c88670 100644 --- a/htdocs/gcc-12/changes.html +++ b/htdocs/gcc-12/changes.html @@ -116,7 +116,7 @@ a work-in-progress. proc_bind clause and OMP_PROC_BIND environment variable, the reproducible and unconstrained modifiers to the order clause, and, for C/C++ only, the - align and allocate modifiers to the + align and allocator modifiers to the allocate clause and the atomic extensions are now available. The OMP_PLACE environment variable supports the OpenMP 5.1 features. In addition the OMP_NUM_TEAMS and
Re: [Patch] invoke.texi: Add sm_35 to Nvidia PTX's -misa=
On 12/12/21 5:08 PM, Tobias Burnus wrote: > I want to WITHDRAW that patch. > > I should read _emails_ before acting on _commit_ logs ... > heh :) > Reason is given by Tom at: > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586649.html > > However, we still shouldn't forget to update *.texi and t-omp-device > eventually. > Agreed. Thanks, - Tom > Tobias > > On 12.12.21 17:01, Tobias Burnus wrote: >> Hi Tom, >> >> this documents the new sm_35 value to -misa=, which Roger added >> (and uses for _Float16). >> >> OK? >> >> Tobias > - > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, > 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: > Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; > Registergericht München, HRB 106955
Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref
> > > On 12/12/2021 3:49 AM, Jan Hubicka via Gcc-patches wrote: > > Hi, > > As discussed in the PR, we miss some optimization becuase > > gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds > > __builtin_trap after them. This is seen as a side-effect by IPA analysis > > and additionally the (fully unreachable) builtin_trap is believed to load > > all global memory. > > > > I think we should think of less intrusive gimple representation of this, but > > it is also easy enough to special case that in IPA analysers as done in > > this patch. This is a win even if we improve the representation since > > gimple-ssa-isolate-paths is run late and this way we improve optimization > > early. > So what's important about the IR representation is that we keep some kind of > memory access around (so that it will fault), that the memory access > reference a minimal amount of other stuff (we don't want the address > computation to keep anything live for example) and that we have a subsequent > trap so that the CFG routines know it traps. > > Otherwise we're free to do whatever we want. I was thinking about __builtin_trap_load (ptr) and __builtin_trap_store (ptr) which we could annotate correctly for PTA and avoid need for two statements, but I am not sure if this is good idea. coioncidentally I just ran across another artifact of this. One of hottest functions in clang in getTerminator. Clang builds it as: │0122f4b0 : │llvm::BasicBlock::getTerminator() const: 2.16 │ mov0x28(%rdi),%rax 28.85 │ add$0x28,%rdi 0.20 │ cmp%rax,%rdi 3.55 │↓ je 29 │ lea-0x18(%rax),%rcx 0.79 │ test %rax,%rax │ cmove %rax,%rcx 4.15 │ movzbl 0x10(%rcx),%edx 48.46 │ add$0xffe5,%edx 3.95 │ xor%eax,%eax 0.20 │ cmp$0xb,%edx 3.35 │ cmovb %rcx,%rax 4.33 │← ret │29: xor%eax,%eax │← ret While we do: │ │01471840 : │llvm::BasicBlock::getTerminator() const: 3.24 │ mov0x28(%rdi),%rax 31.31 │ add$0x28,%rdi 0.59 │ cmp%rdi,%rax 5.31 │↓ je 30 │ test %rax,%rax 2.36 │→ je 9366f4 │ movzbl -0x8(%rax),%edx 45.73 │ sub$0x18,%rax │ sub$0x1b,%edx 4.70 │ cmp$0xb,%edx 3.53 │ mov$0x0,%edx │ cmovae %rdx,%rax 3.23 │← ret │ xchg %ax,%ax │30: xor%eax,%eax │← ret │ │ 009366f4 : │ llvm::BasicBlock::getTerminator() const [clone .cold]: │ movzbl 0x10,%eax │ ud2 The clang generated code obviously conditionally loads NULL pointer, but replacing this cmov by conditional jump to cold section seems overkill. Loading 10 into eax seems useless... I think this is common pattern in C++ code originating from cast with multiple inheritance. I would vote towards optimizing out the conditial move in this case and I think it is correct. Honza
Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref
> > I think this is common pattern in C++ code originating from cast with > multiple inheritance. I would vote towards optimizing out the conditial > move in this case and I think it is correct. I crafted a testcse and filled in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103674 Honza > > Honza
Do not ICE when computing value range of ternary expression
Hi, In evaluate_conditions_for_known_args we use range_fold_unary_expr and range_fold_binary_expr to produce value ranges of the expression. However the expression also may contain ternary COND_EXPR on which we ICE. I did not find interface to do similar folding easily on ternary exprs and since it is so rare case, i guess we can just punt and give up on producing it. Bootstrapped/regtsted x86_64-linux, OK? gcc/ChangeLog: 2021-12-12 Jan Hubicka * ipa-fnsummary.c (evaluate_conditions_for_known_args): Do not ICE on ternary expression. gcc/testsuite/ChangeLog: 2021-12-12 Jan Hubicka * gcc.c-torture/compile/pr103513.c: New test. diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 6c1cdf17e47..cb3c198ec0c 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -513,7 +513,7 @@ evaluate_conditions_for_known_args (struct cgraph_node *node, op->index ? &vr : &op0); } else - gcc_unreachable (); + res.set_varying (op->type); type = op->type; vr = res; } diff --git a/gcc/testsuite/gcc.c-torture/compile/pr103513.c b/gcc/testsuite/gcc.c-torture/compile/pr103513.c new file mode 100644 index 000..ca876a9816c --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/compile/pr103513.c @@ -0,0 +1,8 @@ +int a; +void b(int c) { + int d = 3; + d ^= c < 2; + if (d < 3 && a) +while (1) + b(!a); +}
Re: [PATCH] c++: processing_template_decl vs template depth [PR103408]
On 12/10/21 14:12, Patrick Palka wrote: We use processing_template_decl in two slightly different ways: as a flag to signal that we're dealing with templated trees, and its magnitude is also used as a proxy for the current syntactic template depth. This overloaded meaning of p_t_d is conceptually confusing and leads to bugs that we end up working around in an ad-hoc fashion. This patch replaces all uses of processing_template_decl that care about its magnitude to instead look at the depth of current_template_parms via a new current_template_depth macro. This allows us to eliminate 3 workarounds in the concepts code: two about non-templated requires-expressions (in constraint.cc) and one about lambdas inside constraints (in cp_parser_requires_clause_expression etc). The replacement was mostly mechanical. There were two gotchas: * In synthesize_implicit_template_parm, when introducing a new template parameter list for an abbreviated function template, we need to add a new level of current_template_parms before calling process_template_parm since this function now looks at current_template_depth to determine the level of the new parameter. * In instantiate_class_template_1 when instantiating a template friend, we need to set current_template_parms instead of processing_template_decl so that the friend_depth computation in make_friend_class remains correct. Hmm, it looks like both before and after your patch we end up with a wrong value of friend_depth in make_friend_class for member templates of class templates, but it doesn't actually seem to matter. Instead of building a fake tparm level, we might add a flag parameter to make_friend_class to force treatment as a friend template? Bootstrpped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? Also tested on cmcstl2 and rangev3. PR c++/103408 gcc/cp/ChangeLog: * constraint.cc (type_deducible_p): Remove workaround for non-templated requires-expressions. (normalize_placeholder_type_constraints): Likewise. * cp-tree.h (current_template_depth): Define. * decl.c (start_decl): Inspect current_template_depth instead of the magnitude of processing_template_decl. (grokfndecl): Likewise. (grokvardecl): Likewise. (grokdeclarator): Likewise. * friend.c (make_friend_class): Likewise. (do_friend): Likewise. * parser.c (cp_parser_requires_clause_expression): Remove workaround for lambdas inside constraints. (cp_parser_constraint_expression): Likewise. (cp_parser_requires_expression): Likewise. (synthesize_implicit_template_parm): Add to current_template_parms before calling process_template_parm. * pt.c (inline_needs_template_parms): Inspect current_template_depth instead of the magnitude of processing_template_decl. (push_inline_template_parms_recursive): Likewise. (maybe_begin_member_template_processing): Likewise. (begin_template_parm_list): Likewise. (process_template_parm): Likewise. (end_template_parm_list): Likewise. (add_inherited_template_parms): Likewise. (instantiate_class_template_1): Rename adjust_processing_template_decl to adjust_template_depth. Set current_template_parms instead of processing_template_decl when adjust_template_depth. (make_auto_1): Inspect current_template_depth instead of the magnitude of processing_template_decl. (splice_late_return_type): Likewise. * semantics.c (fixup_template_type): Likewise. gcc/testsuite/ChangeLog: * g++.dg/concepts/diagnostic18.C: Expect a "constraints on a non-templated function" error. * g++.dg/cpp23/auto-fncast10.C: New test. --- gcc/cp/constraint.cc | 16 -- gcc/cp/cp-tree.h | 2 + gcc/cp/decl.c| 10 ++-- gcc/cp/friend.c | 4 +- gcc/cp/parser.c | 28 -- gcc/cp/pt.c | 54 +++- gcc/cp/semantics.c | 2 +- gcc/testsuite/g++.dg/concepts/diagnostic18.C | 2 +- gcc/testsuite/g++.dg/cpp23/auto-fncast10.C | 19 +++ 9 files changed, 70 insertions(+), 67 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast10.C diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc index 2896efdd7f2..566f4e38fac 100644 --- a/gcc/cp/constraint.cc +++ b/gcc/cp/constraint.cc @@ -2016,14 +2016,6 @@ type_deducible_p (tree expr, tree type, tree placeholder, tree args, references are preserved in the result. */ expr = force_paren_expr_uneval (expr); - /* When args is NULL, we're evaluating a non-templated requires expression, - but even those are parsed under processing_template_d
Re: std::basic_string<_Tp> constructor point of instantiation woes?
On Fri, 3 Dec 2021 at 22:38, Jonathan Wakely wrote: > > On Mon, 22 Nov 2021 at 16:31, Stephan Bergmann via Libstdc++ > wrote: > > > > When using recent libstc++ trunk with Clang in C++20 mode, > > std::u16string literals as in > > > > > #include > > > int main() { > > > using namespace std::literals; > > > u""s; > > > } > > > > started to cause linker failures due to undefined > > > > > _ZNSt7__cxx1112basic_stringIDsSt11char_traitsIDsESaIDsEE12_M_constructIPKDsEEvT_S8_St20forward_iterator_tag > > > > After some head scratching, I found the more insightful > > > > > $ cat test.cc > > > #include > > > constexpr std::string s("", 0); > > > > > $ clang++ -std=c++20 -fsyntax-only test.cc > > > test.cc:2:23: error: constexpr variable 's' must be initialized by a > > > constant expression > > > constexpr std::string s("", 0); > > > ^~~~ > > > ~/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/12.0.0/../../../../include/c++/12.0.0/bits/basic_string.h:620:2: > > > note: undefined function '_M_construct' cannot be used in > > > a constant expression > > > _M_construct(__s, __s + __n, std::forward_iterator_tag()); > > > ^ > > > test.cc:2:23: note: in call to 'basic_string(&""[0], 0, > > > std::allocator())' > > > constexpr std::string s("", 0); > > > ^ > > > ~/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/12.0.0/../../../../include/c++/12.0.0/bits/basic_string.h:331:9: > > > note: declared here > > > _M_construct(_FwdIterator __beg, _FwdIterator __end, > > > ^ > > > 1 error generated. > > > > and after some more head scratching found Clang to complain about the > > reduced > > > > > template struct S { > > > constexpr void f(); > > > constexpr S() { f(); }; > > > }; > > > S s1; > > > template constexpr void S::f() {} > > > constexpr S s2; > > > > (about which GCC does not complain). Not entirely sure who is right, > > but what would help Clang is to move the definitions of the literal > > operators in basic_string.h (which implicitly instantiate the > > corresponding std::basic_string<_Tp> constructor) past the definition of > > _M_construct (which is called from the constructor) in basic_string.tcc; > > something like > > The .tcc files are something of an anachronism, as I think they were > supposed to have the non-inline function definitions which might be > subject to 'export' for separate compilation. Except that feature was > removed from C++11, and so now it's just a fairly pointless separation > between inline and non-inline functions ... except where we're muddied > the waters by changing some to 'inline' without moving them to the > other file (because why bother). > > That said, all the one- or two-line inline functions like the literal > operators and to_string are all in basic_string.h and having to move > some arbitrary subset of them into the other file, after the > non-inline definitions, is a bit annoying. > > I think this is https://bugs.llvm.org/show_bug.cgi?id=24128 Which is now https://github.com/llvm/llvm-project/issues/24502
Re: SV: [PATCH v2] jit: Add support for global rvalue initialization and ctors
Yes, this patch works for rustc_codegen_gcc perfectly. It even fixes one issue that was in my patch, so that's nice! Le samedi 11 décembre 2021 à 15:35 +, Petter Tomner a écrit : > Hi! > > > s/an union/a union/ > > s/a rvalue/an rvalue/ > > Heh no way ... and I though I knew English grammar :) > > Had to look that up to see what the deal was but it makes sense. > > yunion, arevalue. > > > s/wrong-field-name/wrong-field-obj/ > > > > to match the struct example (given that the issue being tested for > > is > > that it's the wrong object, rather than the wrong name). > > Initially, before submitting to the list, I made the code such that > the field > objects did not have to be the ones that were used when creating the > struct or union, and forgot changing the test names. > > I figured it required too much string compares for the field names > and > pointer compares for the field object were more appropriate. To > create > dummy field objects were also kinda heavy. > > I'll address the points. > > Regards, Petter > > > Från: David Malcolm > Skickat: den 9 december 2021 20:39 > Till: Petter Tomner; gcc-patches@gcc.gnu.org; j...@gcc.gnu.org; Antoni > Boucher > Ämne: Re: [PATCH v2] jit: Add support for global rvalue > initialization and ctors > > On Mon, 2021-12-06 at 10:47 +, Petter Tomner via Gcc-patches > wrote: > > Hi! > > > > Attached is a patch with changes in line with the review of the > > prior > > patch. > > The patch adds support for initialization of global variables with > > rvalues as well > > as rvalue constructors for structs, arrays and unions. > > Thanks for the updated patch. > > Antoni: does this patch work for you for your rustc plugin? > > > > > Review: https://gcc.gnu.org/pipermail/jit/2021q4/001400.html > > > > The points have been addressed, except: > > > > > Can the type be made const? > > > > I started to make types_kinda_same_internal () taking const args, > > but I > > felt the > > patch was ballooning because some spread out methods needed a const > > signature too. I could submit that in a separate patch. > > Fair enough; fixing that isn't a blocker; it's already a big patch. > > > > > I also addressed a problem Antoni found: > > https://gcc.gnu.org/pipermail/jit/2021q4/001399.html > > > > , where you could not initialize global pointer variables to point > > to > > uninitialized variables. I did that by > > removing a redundant check with validate_var_has_init (), since > > that > > walking function would > > have to be quite complex to allow pointers to uninitialized > > variables. > > > > Any: > > const int foo; > > int bar = foo; > > > > will instead be reported as "not compile time constant" instead of > > a > > nice error message with names. > > > > make check-jit runs fine on gnu-linux-x64 Debian. > > Various review comments inline below, which are mostly just nits: > > > From a4fef1308eaa72ce4ec51dbe5efcfbbf032e9870 Mon Sep 17 00:00:00 > > 2001 > > From: Petter Tomner > > Date: Mon, 29 Nov 2021 20:44:07 +0100 > > Subject: [PATCH] Add support for global rvalue initialization and > > constructors > > > > This patch adds support for initialization of global variables > > with rvalues and creating constructors for array, struct and > > union types which can be used as rvalues. > > > > Signed-off-by: > > 2021-12-06 Petter Tomner > > [...snip...] > > > diff --git a/gcc/jit/docs/topics/expressions.rst > > b/gcc/jit/docs/topics/expressions.rst > > index 396259ef07e..5f64ca68595 100644 > > --- a/gcc/jit/docs/topics/expressions.rst > > +++ b/gcc/jit/docs/topics/expressions.rst > > @@ -126,6 +126,147 @@ Simple expressions > > underlying string, so it is valid to pass in a pointer to an > > on-stack > > buffer. > > > > +Constructor expressions > > +*** > > + > > + The following functions make constructors for array, struct and > > union > > + types. > > + > > + The constructor rvalue can be used for assignment to locals. > > + It can be used to initialize global variables with > > + :func:`gcc_jit_global_set_initializer_rvalue`. It can also be > > used as a > > + temporary value for function calls and return values, but its > > address > > + can't be taken. > > + > > + Note that arrays in libgccjit does not collapse to pointers > > like in > > s/does not/do not/ > > > + C. I.e. if an array constructor is used as e.g. a return value, > > the whole > > + array would be returned by value - array constructors can be > > assigned to > > + array variables. > > + > > + The constructor can contain nested constructors. > > + > > + Note that a string literal rvalue can't be used to construct a > > char array. > > + It need one rvalue for each char. > > s/char array. It need one rvalue/char array; the latter needs one > rvalue/ > > > + > > + These entrypoints were added in :ref:`LIBGCCJIT_ABI_16`; you > > can test for its > > s/16/17/ I believe. > > s/its/their/ >
Re: SV: [PATCH v2] jit: Add support for global rvalue initialization and ctors
One thing I'm not sure if it is a code style issue, but worth mentionning: > @@ -1405,8 +1436,10 @@ private: > > private: >enum gcc_jit_global_kind m_kind; > + enum global_var_flags flags = GLOBAL_VAR_FLAGS_NONE; ^ Should it be named m_flags instead of flags? >string *m_name; >void *m_initializer; > + rvalue *m_rvalue_init = nullptr; /* Only needed for write_dump. */ >size_t m_initializer_num_bytes; > }; Le samedi 11 décembre 2021 à 15:35 +, Petter Tomner a écrit : > Hi! > > > s/an union/a union/ > > s/a rvalue/an rvalue/ > > Heh no way ... and I though I knew English grammar :) > > Had to look that up to see what the deal was but it makes sense. > > yunion, arevalue. > > > s/wrong-field-name/wrong-field-obj/ > > > > to match the struct example (given that the issue being tested for > > is > > that it's the wrong object, rather than the wrong name). > > Initially, before submitting to the list, I made the code such that > the field > objects did not have to be the ones that were used when creating the > struct or union, and forgot changing the test names. > > I figured it required too much string compares for the field names > and > pointer compares for the field object were more appropriate. To > create > dummy field objects were also kinda heavy. > > I'll address the points. > > Regards, Petter > > > Från: David Malcolm > Skickat: den 9 december 2021 20:39 > Till: Petter Tomner; gcc-patches@gcc.gnu.org; j...@gcc.gnu.org; Antoni > Boucher > Ämne: Re: [PATCH v2] jit: Add support for global rvalue > initialization and ctors > > On Mon, 2021-12-06 at 10:47 +, Petter Tomner via Gcc-patches > wrote: > > Hi! > > > > Attached is a patch with changes in line with the review of the > > prior > > patch. > > The patch adds support for initialization of global variables with > > rvalues as well > > as rvalue constructors for structs, arrays and unions. > > Thanks for the updated patch. > > Antoni: does this patch work for you for your rustc plugin? > > > > > Review: https://gcc.gnu.org/pipermail/jit/2021q4/001400.html > > > > The points have been addressed, except: > > > > > Can the type be made const? > > > > I started to make types_kinda_same_internal () taking const args, > > but I > > felt the > > patch was ballooning because some spread out methods needed a const > > signature too. I could submit that in a separate patch. > > Fair enough; fixing that isn't a blocker; it's already a big patch. > > > > > I also addressed a problem Antoni found: > > https://gcc.gnu.org/pipermail/jit/2021q4/001399.html > > > > , where you could not initialize global pointer variables to point > > to > > uninitialized variables. I did that by > > removing a redundant check with validate_var_has_init (), since > > that > > walking function would > > have to be quite complex to allow pointers to uninitialized > > variables. > > > > Any: > > const int foo; > > int bar = foo; > > > > will instead be reported as "not compile time constant" instead of > > a > > nice error message with names. > > > > make check-jit runs fine on gnu-linux-x64 Debian. > > Various review comments inline below, which are mostly just nits: > > > From a4fef1308eaa72ce4ec51dbe5efcfbbf032e9870 Mon Sep 17 00:00:00 > > 2001 > > From: Petter Tomner > > Date: Mon, 29 Nov 2021 20:44:07 +0100 > > Subject: [PATCH] Add support for global rvalue initialization and > > constructors > > > > This patch adds support for initialization of global variables > > with rvalues and creating constructors for array, struct and > > union types which can be used as rvalues. > > > > Signed-off-by: > > 2021-12-06 Petter Tomner > > [...snip...] > > > diff --git a/gcc/jit/docs/topics/expressions.rst > > b/gcc/jit/docs/topics/expressions.rst > > index 396259ef07e..5f64ca68595 100644 > > --- a/gcc/jit/docs/topics/expressions.rst > > +++ b/gcc/jit/docs/topics/expressions.rst > > @@ -126,6 +126,147 @@ Simple expressions > > underlying string, so it is valid to pass in a pointer to an > > on-stack > > buffer. > > > > +Constructor expressions > > +*** > > + > > + The following functions make constructors for array, struct and > > union > > + types. > > + > > + The constructor rvalue can be used for assignment to locals. > > + It can be used to initialize global variables with > > + :func:`gcc_jit_global_set_initializer_rvalue`. It can also be > > used as a > > + temporary value for function calls and return values, but its > > address > > + can't be taken. > > + > > + Note that arrays in libgccjit does not collapse to pointers > > like in > > s/does not/do not/ > > > + C. I.e. if an array constructor is used as e.g. a return value, > > the whole > > + array would be returned by value - array constructors can be > > assigned to > > + array variables. > > + > > + The constructor can contain nested constructors. > > + > > +
Re: SV: [PATCH v2] jit: Add support for global rvalue initialization and ctors
Also, the LIBGCCJIT_ABI_17 will need to be updated as I merged some of my patches. Le samedi 11 décembre 2021 à 15:35 +, Petter Tomner a écrit : > Hi! > > > s/an union/a union/ > > s/a rvalue/an rvalue/ > > Heh no way ... and I though I knew English grammar :) > > Had to look that up to see what the deal was but it makes sense. > > yunion, arevalue. > > > s/wrong-field-name/wrong-field-obj/ > > > > to match the struct example (given that the issue being tested for > > is > > that it's the wrong object, rather than the wrong name). > > Initially, before submitting to the list, I made the code such that > the field > objects did not have to be the ones that were used when creating the > struct or union, and forgot changing the test names. > > I figured it required too much string compares for the field names > and > pointer compares for the field object were more appropriate. To > create > dummy field objects were also kinda heavy. > > I'll address the points. > > Regards, Petter > > > Från: David Malcolm > Skickat: den 9 december 2021 20:39 > Till: Petter Tomner; gcc-patches@gcc.gnu.org; j...@gcc.gnu.org; Antoni > Boucher > Ämne: Re: [PATCH v2] jit: Add support for global rvalue > initialization and ctors > > On Mon, 2021-12-06 at 10:47 +, Petter Tomner via Gcc-patches > wrote: > > Hi! > > > > Attached is a patch with changes in line with the review of the > > prior > > patch. > > The patch adds support for initialization of global variables with > > rvalues as well > > as rvalue constructors for structs, arrays and unions. > > Thanks for the updated patch. > > Antoni: does this patch work for you for your rustc plugin? > > > > > Review: https://gcc.gnu.org/pipermail/jit/2021q4/001400.html > > > > The points have been addressed, except: > > > > > Can the type be made const? > > > > I started to make types_kinda_same_internal () taking const args, > > but I > > felt the > > patch was ballooning because some spread out methods needed a const > > signature too. I could submit that in a separate patch. > > Fair enough; fixing that isn't a blocker; it's already a big patch. > > > > > I also addressed a problem Antoni found: > > https://gcc.gnu.org/pipermail/jit/2021q4/001399.html > > > > , where you could not initialize global pointer variables to point > > to > > uninitialized variables. I did that by > > removing a redundant check with validate_var_has_init (), since > > that > > walking function would > > have to be quite complex to allow pointers to uninitialized > > variables. > > > > Any: > > const int foo; > > int bar = foo; > > > > will instead be reported as "not compile time constant" instead of > > a > > nice error message with names. > > > > make check-jit runs fine on gnu-linux-x64 Debian. > > Various review comments inline below, which are mostly just nits: > > > From a4fef1308eaa72ce4ec51dbe5efcfbbf032e9870 Mon Sep 17 00:00:00 > > 2001 > > From: Petter Tomner > > Date: Mon, 29 Nov 2021 20:44:07 +0100 > > Subject: [PATCH] Add support for global rvalue initialization and > > constructors > > > > This patch adds support for initialization of global variables > > with rvalues and creating constructors for array, struct and > > union types which can be used as rvalues. > > > > Signed-off-by: > > 2021-12-06 Petter Tomner > > [...snip...] > > > diff --git a/gcc/jit/docs/topics/expressions.rst > > b/gcc/jit/docs/topics/expressions.rst > > index 396259ef07e..5f64ca68595 100644 > > --- a/gcc/jit/docs/topics/expressions.rst > > +++ b/gcc/jit/docs/topics/expressions.rst > > @@ -126,6 +126,147 @@ Simple expressions > > underlying string, so it is valid to pass in a pointer to an > > on-stack > > buffer. > > > > +Constructor expressions > > +*** > > + > > + The following functions make constructors for array, struct and > > union > > + types. > > + > > + The constructor rvalue can be used for assignment to locals. > > + It can be used to initialize global variables with > > + :func:`gcc_jit_global_set_initializer_rvalue`. It can also be > > used as a > > + temporary value for function calls and return values, but its > > address > > + can't be taken. > > + > > + Note that arrays in libgccjit does not collapse to pointers > > like in > > s/does not/do not/ > > > + C. I.e. if an array constructor is used as e.g. a return value, > > the whole > > + array would be returned by value - array constructors can be > > assigned to > > + array variables. > > + > > + The constructor can contain nested constructors. > > + > > + Note that a string literal rvalue can't be used to construct a > > char array. > > + It need one rvalue for each char. > > s/char array. It need one rvalue/char array; the latter needs one > rvalue/ > > > + > > + These entrypoints were added in :ref:`LIBGCCJIT_ABI_16`; you > > can test for its > > s/16/17/ I believe. > > s/its/their/ > > > > + presense using: > > s
[PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
Hi, This patch defines a new split pattern for TI to V1TI move. The pattern concatenates two subreg:DI of a TI to a V2DI, then move the V2DI to V1TI. With the pattern, the subreg pass can do register split for TI when there is a TI to V1TI move. The patch optimizes one unnecessary "mr" out on P9. The new test case illustrates it. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. ChangeLog 2021-12-13 Haochen Gui gcc/ * config/rs6000/vsx.md (split pattern for TI to V1TI move): Defined. gcc/testsuite/ * gcc.target/powerpc/pr103124.c: New testcase. patch.diff diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index bf033e31c1c..7bca7780735 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -6589,3 +6589,19 @@ (define_insn "xxeval" [(set_attr "type" "vecperm") (set_attr "prefixed" "yes")]) +;; split TI to V1TI move +(define_split + [(set (match_operand:V1TI 0 "vsx_register_operand") + (subreg:V1TI + (match_operand:TI 1 "int_reg_operand") 0 ))] + "TARGET_P9_VECTOR && !reload_completed" + [(const_int 0)] +{ + rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0); + rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8); + rtx tmp3 = gen_reg_rtx (V2DImode); + emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2)); + rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0); + emit_move_insn (operands[0], tmp4); + DONE; +}) diff --git a/gcc/testsuite/gcc.target/powerpc/pr103124.c b/gcc/testsuite/gcc.target/powerpc/pr103124.c new file mode 100644 index 000..724492dbcd2 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c @@ -0,0 +1,11 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */ +/* { dg-final { scan-assembler-not "\mmr\M" } } */ + +vector __int128 add (long long a) +{ + vector __int128 b; + b = (vector __int128) {a}; + return b; +}
[PATCH] rs6000: powerpc suboptimal boolean test of contiguous bits [PR102239]
Add specialized version to combine two instructions from 9: {r123:CC=cmp(r124:DI&0x6,0);clobber scratch;} REG_DEAD r124:DI 10: pc={(r123:CC==0)?L15:pc} REG_DEAD r123:CC to: 10: {pc={(r123:DI&0x6==0)?L15:pc};clobber scratch;clobber %0:CC;} then split2 will split it to one rotate dot instruction (to save one rotate back instruction) as shifted result doesn't matter when comparing to 0 in CCEQmode. Bootstrapped and regression tested pass on Power 8/9/10, OK for master? gcc/ChangeLog: PR target/102239 * config/rs6000/rs6000.md (*anddi3_insn_dot): New. gcc/testsuite/ChangeLog: PR target/102239 * gcc.target/powerpc/pr102239.c: New test. --- gcc/config/rs6000/rs6000-protos.h | 1 + gcc/config/rs6000/rs6000.c | 7 gcc/config/rs6000/rs6000.md | 38 + gcc/testsuite/gcc.target/powerpc/pr102239.c | 13 +++ 4 files changed, 59 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102239.c diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index 14f6b313105..3644c524376 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -73,6 +73,7 @@ extern int expand_block_move (rtx[], bool); extern bool expand_block_compare (rtx[]); extern bool expand_strn_compare (rtx[], int); extern bool rs6000_is_valid_mask (rtx, int *, int *, machine_mode); +extern bool rs6000_is_valid_rotate_dot_mask (rtx mask, machine_mode mode); extern bool rs6000_is_valid_and_mask (rtx, machine_mode); extern bool rs6000_is_valid_shift_mask (rtx, rtx, machine_mode); extern bool rs6000_is_valid_insert_mask (rtx, rtx, machine_mode); diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 5e129986516..57a38cf954a 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -11606,6 +11606,13 @@ rs6000_is_valid_mask (rtx mask, int *b, int *e, machine_mode mode) return true; } +bool +rs6000_is_valid_rotate_dot_mask (rtx mask, machine_mode mode) +{ + int nb, ne; + return rs6000_is_valid_mask (mask, &nb, &ne, mode) && nb >= ne && ne > 0; +} + /* Return whether MASK (a CONST_INT) is a valid mask for any rlwinm, rldicl, or rldicr instruction, to implement an AND with it in mode MODE. */ diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 6bec2bddbde..014dc9612ea 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -3762,6 +3762,44 @@ (define_insn_and_split "*and3_2insn_dot2" (set_attr "dot" "yes") (set_attr "length" "8,12")]) +(define_insn_and_split "*anddi3_insn_dot" + [(set (pc) +(if_then_else (eq (and:DI (match_operand:DI 1 "gpc_reg_operand" "%r,r") + (match_operand:DI 2 "const_int_operand" "n,n")) + (const_int 0)) + (label_ref (match_operand 3 "")) + (pc))) + (clobber (match_scratch:DI 0 "=r,r")) + (clobber (reg:CC CR0_REGNO))] + "rs6000_is_valid_rotate_dot_mask (operands[2], DImode) + && TARGET_POWERPC64" + "#" + "&& reload_completed" + [(pc)] +{ + int nb, ne; + if (rs6000_is_valid_mask (operands[2], &nb, &ne, DImode) + && nb >= ne + && ne > 0) + { + unsigned HOST_WIDE_INT val = INTVAL (operands[2]); + int shift = 63 - nb; + rtx tmp = gen_rtx_ASHIFT (DImode, operands[1], GEN_INT (shift)); + tmp = gen_rtx_AND (DImode, tmp, GEN_INT (val << shift)); + rtx cr0 = gen_rtx_REG (CCmode, CR0_REGNO); + rs6000_emit_dot_insn (operands[0], tmp, 1, cr0); + rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[3]); + rtx cond = gen_rtx_EQ (CCEQmode, cr0, const0_rtx); + rtx ite = gen_rtx_IF_THEN_ELSE (VOIDmode, cond, loc_ref, pc_rtx); + emit_jump_insn (gen_rtx_SET (pc_rtx, ite)); + DONE; + } + else + FAIL; +} + [(set_attr "type" "shift") + (set_attr "dot" "yes") + (set_attr "length" "8,12")]) (define_expand "3" [(set (match_operand:SDI 0 "gpc_reg_operand") diff --git a/gcc/testsuite/gcc.target/powerpc/pr102239.c b/gcc/testsuite/gcc.target/powerpc/pr102239.c new file mode 100644 index 000..1bafc9fe18e --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr102239.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target lp64 } */ +/* { dg-options "-O2" } */ + +void foo(long arg) +{ + if (arg & ((1UL << 33) | (1UL << 34))) +asm volatile("# if"); + else +asm volatile("# else"); +} + +/* { dg-final { scan-assembler-times "rldicr." 1 } } */ -- 2.25.1
Re: [PATCH] pragma: Update target option node when optimization changes [PR103515]
on 2021/12/9 上午9:43, Jeff Law wrote: > > > On 12/6/2021 7:15 PM, Kewen.Lin via Gcc-patches wrote: >> Hi, >> >> For a function with optimize pragma, it's possible that the target >> options change as optimization options change. Now we create one >> optimization option node when parsing pragma optimize, but don't >> create target option node for possible target option changes. It >> makes later processing not detect the target options have actually >> changed and doesn't update the target options accordingly. >> >> This patch is to check whether target options have changed when >> creating one optimization option node for pragma optimize, and >> make one target option node if needed. The associated test case >> shows the difference. Without this patch, the function foo1 will >> perform unrolling which is unexpected. The reason is that flag >> unroll_only_small_loops isn't correctly set for it. The value >> is updated after parsing function foo2, but doesn't get restored >> later since both decls don't have DECL_FUNCTION_SPECIFIC_TARGET >> set and the hook think we don't need to switch. With this patch, >> there is no unrolling for foo1, which is also consistent with the >> behavior by replacing pragma by attribute whether w/ and w/o this >> patch. >> >> Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu >> and powerpc64{,le}-linux-gnu. >> >> Is it ok for trunk? >> >> BR, >> Kewen >> --- >> gcc/ChangeLog: >> >> PR target/103515 >> * attribs.c (decl_attributes): Check if target options change and >> create one node if so. >> >> gcc/testsuite/ChangeLog: >> >> PR target/103515 >> * gcc.target/powerpc/pr103515.c: New test. > OK > jeff > Thanks for the review, Jeff! Committed as r12-5920. BR, Kewen
Re: Do not ICE when computing value range of ternary expression
On December 12, 2021 9:35:00 PM GMT+01:00, Jan Hubicka via Gcc-patches wrote: >Hi, >In evaluate_conditions_for_known_args we use range_fold_unary_expr and >range_fold_binary_expr to produce value ranges of the expression. >However the expression also may contain ternary COND_EXPR on which we >ICE. I did not find interface to do similar folding easily on ternary >exprs and since it is so rare case, i guess we can just punt and give up >on producing it. > >Bootstrapped/regtsted x86_64-linux, OK? Ok. Richard. >gcc/ChangeLog: > >2021-12-12 Jan Hubicka > > * ipa-fnsummary.c (evaluate_conditions_for_known_args): Do not ICE > on ternary expression. > >gcc/testsuite/ChangeLog: > >2021-12-12 Jan Hubicka > > * gcc.c-torture/compile/pr103513.c: New test. > >diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c >index 6c1cdf17e47..cb3c198ec0c 100644 >--- a/gcc/ipa-fnsummary.c >+++ b/gcc/ipa-fnsummary.c >@@ -513,7 +513,7 @@ evaluate_conditions_for_known_args (struct cgraph_node >*node, > op->index ? &vr : &op0); > } > else >- gcc_unreachable (); >+ res.set_varying (op->type); > type = op->type; > vr = res; > } >diff --git a/gcc/testsuite/gcc.c-torture/compile/pr103513.c >b/gcc/testsuite/gcc.c-torture/compile/pr103513.c >new file mode 100644 >index 000..ca876a9816c >--- /dev/null >+++ b/gcc/testsuite/gcc.c-torture/compile/pr103513.c >@@ -0,0 +1,8 @@ >+int a; >+void b(int c) { >+ int d = 3; >+ d ^= c < 2; >+ if (d < 3 && a) >+while (1) >+ b(!a); >+}
Re: [PATCH] pragma: Update target option node when optimization changes [PR103515]
on 2021/12/9 下午11:21, Martin Liška wrote: > On 12/7/21 03:15, Kewen.Lin wrote: >> Hi, >> >> For a function with optimize pragma, it's possible that the target >> options change as optimization options change. Now we create one >> optimization option node when parsing pragma optimize, but don't >> create target option node for possible target option changes. It >> makes later processing not detect the target options have actually >> changed and doesn't update the target options accordingly. >> >> This patch is to check whether target options have changed when >> creating one optimization option node for pragma optimize, and >> make one target option node if needed. The associated test case >> shows the difference. Without this patch, the function foo1 will >> perform unrolling which is unexpected. The reason is that flag >> unroll_only_small_loops isn't correctly set for it. The value >> is updated after parsing function foo2, but doesn't get restored >> later since both decls don't have DECL_FUNCTION_SPECIFIC_TARGET >> set and the hook think we don't need to switch. With this patch, >> there is no unrolling for foo1, which is also consistent with the >> behavior by replacing pragma by attribute whether w/ and w/o this >> patch. >> >> Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu >> and powerpc64{,le}-linux-gnu. >> >> Is it ok for trunk? >> >> BR, >> Kewen >> --- >> gcc/ChangeLog: >> >> PR target/103515 >> * attribs.c (decl_attributes): Check if target options change and >> create one node if so. >> >> gcc/testsuite/ChangeLog: >> >> PR target/103515 >> * gcc.target/powerpc/pr103515.c: New test. >> >> - >> > > Hello. > > I do support the patch as it does pretty similar thing to what I did in > g:ebd5e86c0f41dc1d692f9b2b68a510b1f6835a3e. > The revision was about pragmas. > > I can confirm the patch can bootstrap on x86_64-linux-gnu and survives > regression tests. > > Martin Hi Martin, Thanks for the feedback and the pointer to your previous commit for the similar issue, I've added it in the commit log as one reference. BR, Kewen