Re: Detect EAF flags in ipa-modref
On Tue, 10 Nov 2020, Jan Hubicka wrote: > > > + tree callee = gimple_call_fndecl (stmt); > > > + if (callee) > > > +{ > > > + cgraph_node *node = cgraph_node::get (callee); > > > + modref_summary *summary = node ? get_modref_function_summary (node) > > > + : NULL; > > > + > > > + if (summary && summary->arg_flags.length () > arg) > > > > So could we make modref "transform" push this as fnspec attribute or > > would that not really be an optimization? > > It was my original plan to synthetize fnspecs, but I think it is not > very good idea: we have the summary readily available and we can > represent information that fnspecs can't > (do not have artificial limits on number of parameters or counts) > > I would preffer fnspecs to be used only for in-compiler declarations. Fine, I was just curious... > > > + > > > +/* Analyze EAF flags for SSA name NAME. > > > + KNOWN_FLAGS is a cache for flags we already determined. > > > + DEPTH is a recursion depth used to make debug output prettier. */ > > > + > > > +static int > > > +analyze_ssa_name_flags (tree name, vec *known_flags, int depth) > > > > C++ has references which makes the access to known_flags nicer ;) > > Yay, will chang that :) > > > > > +{ > > > + imm_use_iterator ui; > > > + gimple *use_stmt; > > > + int flags = EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE | EAF_UNUSED; > > > + > > > + /* See if value is already computed. */ > > > + if ((*known_flags)[SSA_NAME_VERSION (name)]) > > > +{ > > > + /* Punt on cycles for now, so we do not need dataflow. */ > > > + if ((*known_flags)[SSA_NAME_VERSION (name)] == 1) > > > + { > > > + if (dump_file) > > > + fprintf (dump_file, > > > + "%*sGiving up on a cycle in SSA graph\n", depth * 4, ""); > > > + return 0; > > > + } > > > + return (*known_flags)[SSA_NAME_VERSION (name)] - 2; > > > +} > > > + /* Recursion guard. */ > > > + (*known_flags)[SSA_NAME_VERSION (name)] = 1; > > > > This also guards against multiple evaluations of the same stmts > > but only in some cases? Consider > > > > _1 = ..; > > _2 = _1 + _3; > > _4 = _1 + _5; > > _6 = _2 + _4; > > > > where we visit _2 = and _4 = from _1 but from both are going > > to visit _6. > > Here we first push _6, then we go for _2 then for _1 evaluate _1, > evalueate _2, go for _4 and evaluate _4, and evaluate _6. > It is DFS and you need backward edge in DFS (comming from a PHI). Hmm, but then we eventually evaluate _6 twice? > > Cycles seems to somewhat matter for GCC: we do have a lot of functions > that walk linked lists that we could track otherwise. > > > > Maybe I'm blind but you're not limiting depth? Guess that asks > > for problems, esp. as you are recursing rather than using a > > worklist or so? > > > > I see you try to "optimize" the walk by only visiting def->use > > links from parameters but then a RPO walk over all stmts would > > be simpler iteration-wise ... > We usually evaluate just small part of bigger functions (since we lose > track quite easily after hitting first memory store). My plan was to > change this to actual dataflow once we have it well defined > (this means after discussing EAF flags with you and adding the logic to > track callsites for true IPA pass that midly complicated things - for > every ssa name I track callsite/arg pair where it is passed to > either directly or indirectly. Then this is translaed into call summary > and used by IPA pass to compute final flags) > > I guess I can add --param ipa-modref-walk-depth for now and handle > dataflow incremntally? Works for me. > In particular I am not sure if I should just write iterated RPO myself > or use tree-ssa-propagate.h (the second may be overkill). tree-ssa-propagate.h is not to be used, it should DIE ;) I guess you do want to iterate SSA cycles rather than BB cycles so I suggest to re-surrect the SSA SCC discovery from the SCC value-numbering (see tree-ssa-sccvn.c:DFS () on the gcc-8-branch) which is non-recursive and micro-optimized. Could put it somewhere useful (tree-ssa.c?). > > > > > + if (dump_file) > > > +{ > > > + fprintf (dump_file, > > > +"%*sAnalyzing flags of ssa name: ", depth * 4, ""); > > > + print_generic_expr (dump_file, name); > > > + fprintf (dump_file, "\n"); > > > +} > > > + > > > + FOR_EACH_IMM_USE_STMT (use_stmt, ui, name) > > > +{ > > > + if (flags == 0) > > > + { > > > + BREAK_FROM_IMM_USE_STMT (ui); > > > + } > > > + if (is_gimple_debug (use_stmt)) > > > + continue; > > > + if (dump_file) > > > + { > > > + fprintf (dump_file, "%*s Analyzing stmt:", depth * 4, ""); > > > + print_gimple_stmt (dump_file, use_stmt, 0); > > > + } > > > + > > > + /* Gimple return may load the return value. */ > > > + if (gimple_code (use_stmt) == GIMPLE_RETURN) > > > > if (greturn *ret = dyn_cast (use_stmt)) > > > > makes the as_a below not needed, similar for
Re: Dead Field Elimination and Field Reordering
Hello, I will be posting today some changes on the patches mailing list. I believe that due to some changes in the way clones are materialized, the transformation now supports IPA-SRA by default. I still need to test this more thoroughly, but initial tests show that this is no longer an issue. After this, I will be removing std data structures and using specific ones. If anyone have any comments about the transformation, please let me know. I am happy to answer questions. -Erick On 06.11.20 05:51, Erick Ochoa wrote: Hi Richard, just some top-level comments before I write about anything specific: * I have removed all non-essential flags I introduced * I have placed the standard headers before config * I have squashed some changes that I sent to the patches mailing list and make sure that the transformation bootstraps on every commit * I will be sending another set of patches to the patch mailing list soon that have the changes mentioned above. Things I will be working on the next couple of days: * Removing std datastructures and using GCC specific ones. * Adding support for ipa-sra. * Flattening the visitors. On 05/11/2020 14:10, Richard Biener wrote: On Tue, Nov 3, 2020 at 5:21 PM Erick Ochoa wrote: Thanks for the review Richard I'll address what I can. I also provide maybe some hindsight into some of the design decisions here. I'm not trying to be defensive just hoping to illuminate why some decisions were made and how some criticisms may fail to really address the reason why these designs were made. On 03/11/2020 15:58, Richard Biener wrote: On Fri, Oct 30, 2020 at 6:44 PM Erick Ochoa wrote: Hello again, I've been working on several implementations of data layout optimizations for GCC, and I am again kindly requesting for a review of the type escape based dead field elimination and field reorg. Thanks to everyone that has helped me. The main differences between the previous commits have been fixing the style, adding comments explaining classes and families of functions, exit gracefully if we handle unknown gimple syntax, and added a heuristic to handle void* casts. This patchset is organized in the following way: * Adds a link-time warning if dead fields are detected * Allows for the dead-field elimination transformation to be applied * Reorganizes fields in structures. * Adds some documentation * Gracefully does not apply transformation if unknown syntax is detected. * Adds a heuristic to handle void* casts I have tested this transformations as extensively as I can. The way to trigger these transformations are: -fipa-field-reorder and -fipa-type-escape-analysis Having said that, I welcome all criticisms and will try to address those criticisms which I can. Please let me know if you have any questions or comments, I will try to answer in a timely manner. The code is in: refs/vendors/ARM/heads/arm-struct-reorg-wip Future work includes extending the current heuristic with ipa-modref extending the analysis to use IPA-PTA as discussed previously. Few notes: * Currently it is not safe to use -fipa-sra. * I added some tests which are now failing by default. This is because there is no way to safely determine within the test case that a layout has been transformed. I used to determine that a field was eliminated doing pointer arithmetic on the fields. And since that is not safe, the analysis decides not to apply the transformation. There is a way to deal with this (add a flag to allow the address of a field to be taken) but I wanted to hear other possibilities to see if there is a better option. * At this point we’d like to thank the again GCC community for their patient help so far on the mailing list and in other channels. And we ask for your support in terms of feedback, comments and testing. I've only had a brief look at the branch - if you want to even have a remote chance of making this stage1 you should break the branch up into a proper patch series and post it with appropriate ChangeLogs and descriptions. First, standard includes may _not_ be included after including system.h, in fact, they _need_ to be included from system.h - that includes things like or . There are "convenient" defines you can use like #define INCLUDE_SET #include "system.h" and system.h will do what you want. Not to say that you should use GCCs containers and not the standard library ones. Thanks I didn't know this! You expose way too many user-visible command-line options. Yes, I can certainly remove some debugging flags. All the stmt / tree walking "meta" code should be avoided - it would need to be touched each time we change GIMPLE or GENERIC. Instead use available walkers if you really need it in such DFS-ish way. There are two points being made here: 1. Use the available walkers 2. Changing gimple would imply changes to your code First: I did tried using the available walkers in the past, and the walk_tree function does not contain a post-o
Re: typeof and operands in named address spaces
On Tue, Nov 10, 2020 at 09:11:08PM +0100, Peter Zijlstra wrote: > On Tue, Nov 10, 2020 at 10:42:58AM -0800, Nick Desaulniers wrote: > > When I think of qualifiers, I think of const and volatile. I'm not > > sure why the first post I'm cc'ed on talks about "segment" qualifiers. > > Maybe it's in reference to a variable attribute that the kernel > > defines? Looking at Clang's Qualifier class, I see const, volatile, > > restrict (ah, right), some Objective-C stuff, and address space > > (TR18037 is referenced, I haven't looked up what that is) though maybe > > "segment" pseudo qualifiers the kernel defines expand to address space > > variable attributes? > > Right, x86 Named Address Space: > > > https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Named-Address-Spaces.html#Named-Address-Spaces > > Also, Google found me this: > > https://reviews.llvm.org/D64676 > > The basic problem seems to be they act exactly like qualifiers in that > typeof() preserves them, so if you have: GCC has the four standard type qualifiers (const, volatile, restrict, and _Atomic), but also the address space things yes. > > Maybe stripping all qualifiers is fine since you can add them back in > > if necessary? > > So far that seems sufficient. Although the Devil's advocate in me is > trying to construct a case where we need to preserve const but strip > volatile and that's then means we need to detect if the original has > const or not, because unconditionally adding it will be wrong. If you want to drop all qualifiers, you only need a way to convert something to an rvalue (which always has an unqualified type). So maybe make syntax for just *that*? __builtin_unqualified() perhaps? Which could be useful in more places than just doing an unqualified_typeof. Segher
Re: typeof and operands in named address spaces
On Tue, Nov 10, 2020 at 08:57:42AM +0100, Peter Zijlstra wrote: > On Mon, Nov 09, 2020 at 11:50:15AM -0800, Nick Desaulniers wrote: > > On Mon, Nov 9, 2020 at 11:46 AM Segher Boessenkool > > wrote: > > > On Mon, Nov 09, 2020 at 01:47:13PM +0100, Peter Zijlstra wrote: > > > > C in general does not provide means to strip qualifiers. > > > > > > Most ways you can try to use the result are undefined behaviour, even. > > > > Yes, removing `const` from a `const` declared variable (via cast) then > > expecting to use the result is a great way to have clang omit the use > > from the final program. This has bitten us in the past getting MIPS > > support up and running, and one of the MTK gfx drivers. > > Stripping const to delcare another variable is useful though. Sure C has > sharp edges, esp. if you cast stuff, but since when did that stop anyone > ;-) My point is that removing most qualifiers usually is a problem, so before doing this, we should think if it is such a good plan, whether there is a safer / saner solution, etc. Segher
Using IFUNC with template functions.
Hi All, I want to have a template function as an ifunc. Find below my test program. // Test Program #include int glob_t = 2; typedef void (*VOID_FUNC_P)(void); template T add_isa1(T x, T y) { std::cout << "ISA1 implementation" << std::endl; return x + y; } template T add_isa2(T x, T y) { std::cout << "ISA2 implementation" << std::endl; return x + y; } template static VOID_FUNC_P add_resolver(void) { T (*T_p)(T,T) = NULL; switch (glob_t) { case 1: T_p = add_isa1; break; case 2: T_p = add_isa2; break; } return reinterpret_cast(T_p); } template VOID_FUNC_P add_resolver(void); template VOID_FUNC_P add_resolver(void); // explicit/manual instantiation of possibilities int add(int, int) __attribute((ifunc("_Z12add_resolverIiEPFvvEv"))); double add(double, double) __attribute((ifunc("_Z12add_resolverIdEPFvvEv"))); int main() { //int res = add(1,68); double res = add(1.68,68.01); std::cout << "Res = " << res << std::endl; return 0; } I have to explicitly declare the ifunc resolver function for all possible combinations of template arguments. In reality, I think it would be impractical for programmers to do this kind of manual/explicit instantiation. I am interested to know if there any other better way to use ifuncs with template functions. If there is none, is it worth suggesting to the C++ standards? Thanks and Regards, Amrita H S