Re: Has GCC completed C++ 20 module support?
Please don't cross-post to the gcc and gcc-help lists. Either you are asking about GCC development or asking about using it, not both. Pick one list.

On Tue, 2 Nov 2021, 04:22 sotrdg sotrdg via Gcc-help, wrote:
> It looks like It is still in early phase. the fmodule-ts still emits dead
> code for example.
>
> any progress here?

I wouldn't say early, but it is incomplete, as documented in the GCC 11 release notes and at https://gcc.gnu.org/projects/cxx-status.html#cxx20
Question on cgraph_node::force_output
Hi,

I am reading tree-ssa-structalias.c to see what makes a function nonlocal during IPA-PTA. I am having some trouble understanding force_output and when it is set or unset.

1. What is the meaning of force_output? cgraph.h gives an example saying that force_output means the symbol might be used in an invisible way. I believe this means some sort of "unanalyzable" way. However, in the few tests I've made, all functions have the field force_output set to true.

2. Does this value depend on some other pass? At the moment, I am looking at this field within my own passes (IPA_PASS and SIMPLE_IPA_PASS), but I would like to inspect the dump_file(s) which show information about force_output to make sure that it doesn't depend on pass order or even on my own flags.

3. What flags should I use to inspect force_output?

Thanks!
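As a small illustration of the "used in an invisible way" wording in the question: a static function normally only has to be emitted if something visible in the translation unit refers to it, while the `used` attribute asks GCC to keep it anyway, for instance because it is referenced only from assembly or a linker script. The sketch below is hypothetical (the function names are made up), and whether `used` is what sets force_output in a particular GCC version is an assumption here, not something stated in this thread.

```c
/* Hypothetical example; the names are made up for illustration. */

/* Ordinary static function: the compiler can see every reference to it,
   so it only needs to be emitted if something actually uses it. */
static int plain_helper(int x)
{
    return x + 1;
}

/* The 'used' attribute tells GCC the function must be emitted even if no
   reference is visible, e.g. when it is only reached from assembly.
   ASSUMPTION: this is one of the cases that force_output models. */
static int __attribute__((used)) hidden_entry(int x)
{
    return x * 2;
}
```

One way to compare the two is to compile the file with optimization enabled and look at which symbols remain in the object file.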
Re: [PATCH] Add -fopt-builtin optimization option
On Sun, Oct 31, 2021 at 11:13 AM Keith Packard via Gcc-patches wrote:
>
> This option (enabled by default) controls optimizations which convert
> a sequence of operations into an equivalent sequence that includes
> calls to builtin functions. Typical cases here are code which matches
> memcpy, calloc, sincos.
>
> The -ftree-loop-distribute-patterns flag only covers converting loops
> into builtin calls, not numerous other places where knowledge of
> builtin function semantics changes the generated code.
>
> The goal is to allow built-in functions to be declared by the compiler
> and used directly by the application, but to disable optimizations
> which create new calls to them, and to allow this optimization
> behavior to be changed for individual functions by decorating the
> function definition like this:
>
>     void
>     __attribute__((optimize("no-opt-builtin")))
>     sincos(double x, double *s, double *c)
>     {
>         *s = sin(x);
>         *c = cos(x);
>     }
>
> This also avoids converting loops into library calls like this:
>
>     void *
>     __attribute__((optimize("no-opt-builtin")))
>     memcpy(void *__restrict__ dst, const void *__restrict__ src, size_t n)
>     {
>         char *d = dst;
>         const char *s = src;
>
>         while (n--)
>             *d++ = *s++;
>         return dst;
>     }
>
> As well as disabling analysis of memory lifetimes around free as in
> this example:
>
>     void
>     __attribute__((optimize("no-opt-builtin")))
>     erase_and_free(void *ptr)
>     {
>         memset(ptr, '\0', malloc_usable_size(ptr));
>         free(ptr);
>     }
>
> Clang has a more sophisticated version of this mechanism which
> can disable all builtins, or disable a specific builtin:
>
>     double
>     __attribute__((no_builtin("exp2")))
>     exp2(double x)
>     {
>         return pow (2.0, x);
>     }

I don't think it reliably works the way you implement it. It's also having
more side-effects than what you document, in particular

    pow (2.0, x);

will now clobber and use global memory (besides errno).

I think you may want to instead change builtin_decl_implicit
to avoid code-generating a specific builtin.
Generally we'd also want sth like the clang attribute and _not_
use optimize("") for this or a global flag_*, so the behavior can
be more readily encoded in the IL. In fact a flag on the call
statement could be added to denote the desired effect on it.

I also don't see the advantage compared to -fno-builtin[-foo].
Declaring the function should be something that's already done.

Richard.

> Signed-off-by: Keith Packard
> ---
>  gcc/builtins.c               | 6 ++
>  gcc/common.opt               | 4
>  gcc/gimple.c                 | 3 +++
>  gcc/tree-loop-distribution.c | 2 ++
>  4 files changed, 15 insertions(+)
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 7d0f61fc98b..7aae57deab5 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -1922,6 +1922,9 @@ mathfn_built_in_2 (tree type, combined_fn fn)
>    built_in_function fcodef64x = END_BUILTINS;
>    built_in_function fcodef128x = END_BUILTINS;
>
> +  if (flag_no_opt_builtin)
> +    return END_BUILTINS;
> +
>    switch (fn)
>      {
>  #define SEQ_OF_CASE_MATHFN \
> @@ -2125,6 +2128,9 @@ mathfn_built_in_type (combined_fn fn)
>      case CFN_BUILT_IN_##MATHFN##L_R: \
>        return long_double_type_node;
>
> +  if (flag_no_opt_builtin)
> +    return NULL_TREE;
> +
>    switch (fn)
>      {
>      SEQ_OF_CASE_MATHFN
> diff --git a/gcc/common.opt b/gcc/common.opt
> index eeba1a727f2..d6111cc776a 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2142,6 +2142,10 @@ fomit-frame-pointer
>  Common Var(flag_omit_frame_pointer) Optimization
>  When possible do not generate stack frames.
>
> +fopt-builtin
> +Common Var(flag_no_opt_builtin, 0) Optimization
> +Match code sequences equivalent to builtin functions
> +
>  fopt-info
>  Common Var(flag_opt_info) Optimization
>  Enable all optimization info dumps on stderr.
> diff --git a/gcc/gimple.c b/gcc/gimple.c
> index 22dd6417d19..5b82b9409c0 100644
> --- a/gcc/gimple.c
> +++ b/gcc/gimple.c
> @@ -2790,6 +2790,9 @@ gimple_builtin_call_types_compatible_p (const gimple *stmt, tree fndecl)
>  {
>    gcc_checking_assert (DECL_BUILT_IN_CLASS (fndecl) != NOT_BUILT_IN);
>
> +  if (flag_no_opt_builtin)
> +    return false;
> +
>    tree ret = gimple_call_lhs (stmt);
>    if (ret
>        && !useless_type_conversion_p (TREE_TYPE (ret),
> diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
> index 583c01a42d8..43f22a3c7ce 100644
> --- a/gcc/tree-loop-distribution.c
> +++ b/gcc/tree-loop-distribution.c
> @@ -1859,6 +1859,7 @@ loop_distribution::classify_partition (loop_p loop,
>
>    /* Perform general partition disqualification for builtins.  */
>    if
Re: -Wuninitialized false positives and threading knobs
On Mon, Nov 1, 2021 at 4:18 PM Jeff Law via Gcc wrote:
>
> On 10/31/2021 6:12 AM, Aldy Hernandez wrote:
> > After Jeff's explanation of the symbiosis between jump threading and
> > the uninit pass, I'm beginning to see that (almost) every
> > Wuninitialized warning is cause for reflection. It usually hides a
> > missing jump thread. I investigated one such false positive
> > (uninit-pred-7_a.c) and indeed, there's a missing thread. The
> > question is what to do about it.
> >
> > This seemingly simple test is now regressing as can be seen by the
> > xfail I added.
> This looks amazingly familiar. You might want to look at this old thread:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2017-May/474229.html
>
> What happened was that threading did a better job, but in the process
> the shape of the CFG changed in ways that made it harder for the
> predicate analysis pass to prune paths. Richi & I never reached any
> kind of conclusion on that patch, so it's never been applied.

Now there's also ranger's relation oracle (not sure if that's even moderately
powerful enough to cobble up predicates of two points in the CFG and
relate them) and Martin(?) has split out the predicate analysis bits from
uninit analysis. My stance is still that the machinery needs generalization.

> Remember that the whole point behind the predicate analysis pass is to
> deal with infeasible paths that may be in the CFG, including cases
> where the threaders may have found a jump thread, but not optimized it
> due to code size considerations.
>
> So one of the first things I'd do is look at the dumps prior to your
> changes and see if the uninitialized use was still in the IL in
> the .uninit dump, but was analyzed as properly guarded by predicate analysis.
>
> >
> > What happens is that we now thread far more than before, causing the
> > distance from definition to use to expand. The threading candidate
> > that would make the Wuninitialized go away is there, and the backward
> > threader can see it, but it refuses to thread it because the number of
> > statements would be too large.
> Right.
>
> >
> > This is interesting because it means threading is causing larger IL
> > that in turn keeps us from threading some unreachable paths later on
> > because the paths are too large.
> Yes. This is not unexpected. Jump threading reduces dynamic
> conditional jumps and statements executed, but often at the expense of
> increasing code size, much like PRE. Jump threading also can create
> scenarios that can't be handled by the predicate analysis pass.
>
> The other thing to review is whether or not you're accounting for
> statements that are going to be removed as a result of jump threading.
> I had Alex implement that a few years back for the forward threader.
> Essentially statements which exist merely to compute the conditional we
> thread are going to be removed and we need not worry about the cost of
> copying them, which allowed us to thread many cases we had missed before
> without increasing codesize.

Yeah, and code size is important so simply upping the limit isn't the way
to go since there's usually zero chance of a reverse transform later.

> Anyway, those are the research areas to look at first, then we'll figure
> out what the next steps are.
>
> Jeff
>
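The shape of code this discussion is about can be sketched as follows. This is not uninit-pred-7_a.c itself, only a minimal, hypothetical function in the same spirit: the definition and the use of `x` are guarded by the same predicate, so the path that skips the definition and still reaches the use is infeasible. Either jump threading removes that path from the CFG, or predicate analysis has to prove that the use is guarded. Whether a given GCC version warns on this exact function is not claimed here.

```c
/* Hypothetical example of a use guarded by the same predicate as its
   definition; the function name is made up for illustration. */
int guarded_use(int flag, int v)
{
    int x;              /* deliberately not initialized on every path */

    if (flag)
        x = v * 2;      /* definition, only when flag is nonzero */

    /* Unrelated statements here increase the distance between the
       definition and the use, which is what makes threading costly. */

    if (flag)
        return x;       /* use, guarded by the same predicate */

    return 0;
}
```

In the CFG, the edge from the false arm of the first test to the true arm of the second one can never be taken. The more statements sit between the two tests, the more code a threader has to copy to remove that edge, which is the code size trade-off described in the thread.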
Re: libgfortran.so SONAME and powerpc64le-linux ABI changes (work in progress patches)
On Mon, Nov 01, 2021 at 10:56:33AM -0500, Bill Schmidt wrote:
> Would starting from Advance Toolchain 15 with the most recent glibc make
> things easier for Thomas to test?

The problem is gcc135 runs CentOS 7.x which is not compatible with AT 13-15.

--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
Re: [PATCH] Add -fopt-builtin optimization option
Richard Biener writes:

> I don't think it reliably works the way you implement it. It's also having
> more side-effects than what you document, in particular

Yeah, I made a 'minimal' patch that had the effect I needed, but it's
clearly in the wrong place as it disables the matching of builtins
against the incoming source code instead of the generation of new
builtin references from the tree.

> I think you may want to instead change builtin_decl_implicit
> to avoid code-generating a specific builtin.

Yup, I looked at that and there are numerous places which assume that
will work, so it will be a more complicated patch.

> Generally we'd also want sth like the clang attribute and _not_
> use optimize("") for this or a global flag_*, so the behavior can
> be more readily encoded in the IL. In fact a flag on the call
> statement could be added to denote the desired effect on it.

Agreed, using the existing optimize attribute was a short-cut to
leverage the existing code handling that case. If we think providing
something that matches the clang attribute would be useful, it makes
sense to provide it using the same syntax.

> I also don't see the advantage compared to -fno-builtin[-foo].
> Declaring the function should be something that's already done.

The semantics of the clang option are not to completely disable access to
the given builtin function, but rather to stop the optimizer from
creating new builtin function references (either to a specific builtin,
or to all builtins).

If I could use "no-builtin" in a function attribute, I probably wouldn't
have bothered looking to implement the clang semantics, but -fno-builtin
isn't supported in this way. But, now that I think I understand the
behavior of __attribute__((no_builtin)) in clang, I think it has value
beyond what -fno-builtin performs as you can still gain access to
builtin functions when they are directly named.

I'll go implement changes in builtin_decl_implicit and all of the
affected call sites and see what that looks like.

Thanks much for your review!

--
-keith
[PATCH] Add 'no_builtin' function attribute
This attribute controls optimizations which make assumptions about the
semantics of builtin functions. Typical cases here are code which
matches memcpy, calloc, sincos, or which calls builtins like free.

This extends things like the -ftree-loop-distribute-patterns
flag. That flag only covers converting loops into builtin calls, not
the numerous other places where knowledge of builtin function semantics
changes the generated code.

The goal is to allow built-in functions to be declared by the compiler
and used directly by the application, but to disable optimizations
which take advantage of compiler knowledge about their semantics, and
to allow this optimization behavior to be changed for individual
functions.

One place where this behavior is especially useful is when compiling
the builtin functions that GCC knows about, as in the C library.
Currently, C library source code and build systems have various
kludges to work around the compiler's operations in these areas, using
a combination of -fno-tree-loop-distribute-patterns, -fno-builtin and
even symbol aliases to keep GCC from generating infinite recursions.

This can be applied globally to a file using the
-fno-optimize-builtin flag.
This disables optimizations which translate a sequence of builtin calls
into an equivalent sequence:

    void
    __attribute__((no_builtin))
    sincos(double x, double *s, double *c)
    {
        *s = sin(x);
        *c = cos(x);
    }

This also avoids converting loops into builtin calls like this:

    void *
    __attribute__((no_builtin))
    memcpy(void *__restrict__ dst, const void *__restrict__ src, size_t n)
    {
        char *d = dst;
        const char *s = src;

        while (n--)
            *d++ = *s++;
        return dst;
    }

As well as disabling analysis of memory lifetimes around free as in
this example:

    void
    __attribute__((no_builtin))
    erase_and_free(void *ptr)
    {
        memset(ptr, '\0', malloc_usable_size(ptr));
        free(ptr);
    }

It also prevents converting builtin calls into inline code:

    void
    __attribute__((no_builtin))
    copy_fixed(char *dest)
    {
        strcpy(dest, "hello world");
    }

Clang has a more sophisticated version of this mechanism which can
disable specific builtins:

    double
    __attribute__((no_builtin("exp2")))
    exp2(double x)
    {
        return pow (2.0, x);
    }

The general approach in this change is to introduce checks in some
places where builtin functions are used to see if the specific
function is 'allowed' to be used for optimization, skipping the
optimization when the desired function has been disabled.

Three new functions, builtin_decl_implicit_opt_p,
builtin_decl_explicit_opt and builtin_decl_implicit_opt are introduced
which add checks for whether the compiler can assume standard
semantics for the specified function for purposes of optimization.
These are used throughout the compiler wherever appropriate. Code
which must use builtins for correct operation (e.g. struct assignment)
is not affected.

The machinery proposed here could be extended to support the
additional clang feature by extending the attribute parsing function
and creating a list of disabled builtins checked by the builtin_decl
functions described above.
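For comparison, the kind of existing workaround mentioned above can be sketched with the optimize attribute that GCC already supports. This is only an illustration under stated assumptions: `byte_fill` and the `NO_LOOP_PATTERNS` macro are hypothetical names standing in for a C library's own memset and its internal macro, and the attribute is guarded because other compilers may not accept it. It does not use the no_builtin attribute proposed in this patch.

```c
#include <stddef.h>

/* GCC accepts -f option strings in the optimize attribute; other
   compilers may not, so the attribute is only applied for GCC.
   NO_LOOP_PATTERNS is a hypothetical macro name. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_PATTERNS \
    __attribute__((__optimize__("-fno-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_PATTERNS
#endif

/* Hypothetical stand-in for a library memset. Without the attribute,
   the loop could be recognized as a memset idiom and replaced by a
   call to memset, which recurses forever if this *is* memset. */
NO_LOOP_PATTERNS
void *byte_fill(void *dst, int c, size_t n)
{
    unsigned char *d = dst;

    while (n--)
        *d++ = (unsigned char)c;
    return dst;
}
```

This only covers the loop-to-call conversion. As the description says, it does nothing for the other places where knowledge of builtin semantics changes the generated code, which is the gap the proposed attribute is meant to close.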
Signed-off-by: Keith Packard
---
 gcc/builtins.c               | 12 +++---
 gcc/c-family/c-attribs.c     | 68 ++
 gcc/common.opt               |  4 ++
 gcc/gimple-fold.c            | 72 ++--
 gcc/gimple-match-head.c      |  2 +-
 gcc/tree-loop-distribution.c |  7
 gcc/tree-ssa-alias.c         |  3 +-
 gcc/tree-ssa-strlen.c        | 48 ++--
 gcc/tree-ssa-structalias.c   |  3 +-
 gcc/tree.h                   | 39 +++
 10 files changed, 194 insertions(+), 64 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 7d0f61fc98b..d665ee716e8 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2061,7 +2061,7 @@ mathfn_built_in_1 (tree type, combined_fn fn, bool implicit_p)
   if (fcode2 == END_BUILTINS)
     return NULL_TREE;

-  if (implicit_p && !builtin_decl_implicit_p (fcode2))
+  if (implicit_p && !builtin_decl_implicit_opt_p (fcode2))
     return NULL_TREE;

   return builtin_decl_explicit (fcode2);
@@ -3481,9 +3481,9 @@ expand_builtin_stpcpy_1 (tree exp, rtx target, machine_mode mode)
   src = CALL_EXPR_ARG (exp, 1);

   /* If return value is ignored, transform stpcpy into strcpy.  */
-  if (target == const0_rtx && builtin_decl_implicit (BUILT_IN_STRCPY))
+  if (target == const0_rtx && builtin_decl_implicit_opt (BUILT_IN_STRCPY))
     {
-      tree fn = builtin_decl_implicit (BUILT_IN_STRCPY);
+      tree fn = builtin_decl_implicit_opt (BUILT_IN_STRCPY);
       tree result = build_call_nofold_loc (loc, fn, 2, dst