On Mon, 1 Oct 2018, Richard Biener wrote: > On Mon, 1 Oct 2018, Jan Hubicka wrote: > > > > > > > The ODR savings really look good but as you say the implementation is > > > somewhat "tricky". > > > > > > I'd like to see the type-variant done separately (of course) and > > > also differently. This is because when enabling > > > free-lang-data by default we should be able to get benefits for > > > non-LTO compilations as well. Specifically I'd like us to > > > (in free-lang-data) "canonicalize" existing variants applying the > > > rules you derived for the lazier compare. At the point you then > > > drop non-fld-walked variants we could weed out the resulting > > > duplicates, keeping only a prevailing one in the list and (ab-)using > > > GC to fixup references. Like with ggc_register_fixup (void *from, void > > > *to) which would when following edges at mark time, adjust them > > > according to this map (I'm not sure if we'd want to do it manually > > > and record a vector of possibly affected TREE_TYPE uses during the fld > > > walk). We could also (easily?) optimize the streaming itself by > > > streaming only differences to the main variant of a type (but then > > > we could change our tree data structure to make that trivial). > > > > I had patch for streaming the differences only some time ago. My > > recollection > > is that we got it into tree and then reverted as there was some issues with > > cycles. I can look into this again. > > > > I would really like to avoid using ggc for IL rewriting. One reason is > > that eventually we would like to see GGC to go and thus we should not > > wire it more into the essential parts of the compiler and other is that > > GGC is often not run. > > Heh. > > > Saving type variants seems to have relatively minor effect on the IL size, > > so > > we need to have solution that is not too expensive to be justified. I > > suppose > > free lang data is sort of visiting all the relevant datastructures for > > middle-end so we could do rewriting there. It would also make sense to > > canonicalize types more based on knowledge whether they are used for memory > > access (i.e. for non-accesses go to main variant and perhaps invent > > something > > like main variant WRT useless conversions). > > I guess for that it would be better to put things like memory access > alignment into the memory-references rather than using TYPE_ALIGN. > Likewise for other things affecting semantics (TYPE_REF_CAN_ALIAS_ALL). > > Being able to simply drop to TYPE_MAIN_VARIANT would be very appealing... > (and simplify things). The hard thing is to figure out where we look > into those variant differences during late compilation... > > Type variants was always my first "easy" middle-end type related cleanup. > And for non-LTO ODR merging probably gets us nothing (the FE should do > the merging). > > > We could ggc_free the old variant and then ggc will ICE on dangling > > pointers. > > I can give it a try with my patch to prune the variant tree. > > Yes, definitely do that. I also _really_ like us to do FLD > unconditionally - maybe we can start by guarding individual pieces with > a for_lto flag we pass down. But esp. disabling langhooks and stuff like > the variant type purging would be good to get enabled for non-LTO > to not have that modes diverge more and more... > > > > > > > I wonder how much "ODR" merging we'd get for free when we canonicalize > > > types some more - that is, how do ODR equal but tree unequal types > > > usually differ? I guess it might be most of the time fields with > > > pointer to incomplete vs. complete type? > > > > Most of the time it is complete vs incomplete pointer type. > > I had statistics of type duplicates before, but they did not look as bad > > as reality. The reason is that smaller types tends to be merged while > > bigger types tends to be duplicated many times. In GCC this is typically > > RTL, gimple and derived types which are really many. > > I see. So one possible canonicalization is to make _all_ > pointer-typed FIELD_DECLs point to incomplete variants since the memory > accesses should already have the "proper" access types. Can you > get statistics on that? Not sure how to get an "incomplete" type > though (iff we can simply copy the type and NULL TYPE_FIELDs and > TYPE_SIZE and friends) - again I'd do that at FLD time.
So sth like tp = build_distinct_type_copy (t); TYPE_FIELDS (tp) = NULL_TREE; TYPE_SIZE (tp) = NULL_TREE; TYPE_SIZE_UNIT (tp) = NULL_TREE; tp = type_hash_canon (tp); of course we "leak" the original type in used COMPONENT_REFs (may also cause some verifier ICEs here if the types mismatch that of the FIELD_DECLs) and in aggregate copies, etc. But I wonder how much "unused" unnecessary types we have. That is, I'd paper over the ICEs this causes and not fixup the IL stream at first for example. Richard. > Richard. > > > Honza > > > > > > Thanks, > > > Richard. > > > > > > > > > > Honza > > > > > > > > Index: ipa-devirt.c > > > > =================================================================== > > > > --- ipa-devirt.c (revision 264689) > > > > +++ ipa-devirt.c (working copy) > > > > @@ -2111,6 +2111,29 @@ get_odr_type (tree type, bool insert) > > > > return val; > > > > } > > > > > > > > +/* Return the main variant of the odr type. This is used for straming > > > > out > > > > + to reduce number of type duplicates hitting the WPA->LTRANS streams. > > > > + Do not do so when ODR violation was detected since the type may be > > > > + structurally different then. */ > > > > + > > > > +tree > > > > +prevailing_odr_type (tree t) > > > > +{ > > > > + t = TYPE_MAIN_VARIANT (t); > > > > + /* In need_assembler_name_p we also mangle assembler names of > > > > INTEGER_TYPE. > > > > + We can not merge these because this does not honnor precision and > > > > + signedness. */ > > > > + if (!type_with_linkage_p (t) > > > > + || type_in_anonymous_namespace_p (t) > > > > + || TREE_CODE (t) == INTEGER_TYPE > > > > + || !COMPLETE_TYPE_P (t)) > > > > + return t; > > > > + odr_type ot = get_odr_type (t, true); > > > > + if (!ot || !ot->odr_violated) > > > > + return ot->type; > > > > + return t; > > > > +} > > > > + > > > > /* Add TYPE od ODR type hash. */ > > > > > > > > void > > > > Index: ipa-utils.h > > > > =================================================================== > > > > --- ipa-utils.h (revision 264689) > > > > +++ ipa-utils.h (working copy) > > > > @@ -90,6 +90,7 @@ void warn_types_mismatch (tree t1, tree > > > > location_t loc2 = UNKNOWN_LOCATION); > > > > bool odr_or_derived_type_p (const_tree t); > > > > bool odr_types_equivalent_p (tree type1, tree type2); > > > > +tree prevailing_odr_type (tree t); > > > > > > > > /* Return vector containing possible targets of polymorphic call E. > > > > If COMPLETEP is non-NULL, store true if the list is complete. > > > > Index: lto/lto.c > > > > =================================================================== > > > > --- lto/lto.c (revision 264689) > > > > +++ lto/lto.c (working copy) > > > > @@ -485,6 +485,8 @@ gimple_register_canonical_type_1 (tree t > > > > static void > > > > gimple_register_canonical_type (tree t) > > > > { > > > > + if (flag_checking) > > > > + verify_type (t); > > > > if (TYPE_CANONICAL (t) || !type_with_alias_set_p (t) > > > > || !canonical_type_used_p (t)) > > > > return; > > > > Index: lto-streamer-out.c > > > > =================================================================== > > > > --- lto-streamer-out.c (revision 264689) > > > > +++ lto-streamer-out.c (working copy) > > > > @@ -43,6 +43,8 @@ along with GCC; see the file COPYING3. > > > > #include "debug.h" > > > > #include "omp-offload.h" > > > > #include "print-tree.h" > > > > +#include "attribs.h" > > > > +#include "ipa-utils.h" > > > > > > > > > > > > static void lto_write_tree (struct output_block*, tree, bool); > > > > @@ -109,14 +111,188 @@ destroy_output_block (struct output_bloc > > > > free (ob); > > > > } > > > > > > > > +/* Do same comparsion as check_qualified_type skipping lang part of > > > > type > > > > + and be more permissive about type names: we only care that names are > > > > + same (for diagnostics) and that ODR names are the same. */ > > > > > > > > -/* Look up NODE in the type table and write the index for it to OB. */ > > > > +static bool > > > > +types_equal_p (tree t, tree v) > > > > +{ > > > > + if (TYPE_QUALS (t) != TYPE_QUALS (v)) > > > > + return false; > > > > + > > > > + if (TYPE_NAME (t) != TYPE_NAME (v) > > > > + && (!TYPE_NAME (t) || !TYPE_NAME (v) > > > > + || TREE_CODE (TYPE_NAME (t)) != TYPE_DECL > > > > + || TREE_CODE (TYPE_NAME (v)) != TYPE_DECL > > > > + || DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (t)) > > > > + != DECL_ASSEMBLER_NAME_RAW (TYPE_NAME (v)) > > > > + || DECL_NAME (TYPE_NAME (t)) != DECL_NAME (TYPE_NAME (v)))) > > > > + return false; > > > > + > > > > + if (TYPE_ALIGN (t) != TYPE_ALIGN (v)) > > > > + return false; > > > > + > > > > + if (!attribute_list_equal (TYPE_ATTRIBUTES (t), > > > > + TYPE_ATTRIBUTES (v))) > > > > + return false; > > > > + > > > > + /* Do not replace complete type by incomplete. */ > > > > + if (COMPLETE_TYPE_P (t) != COMPLETE_TYPE_P (v)) > > > > + return false; > > > > + > > > > + if (TYPE_ADDR_SPACE (t) != TYPE_ADDR_SPACE (v)) > > > > + return false; > > > > + > > > > + gcc_assert (TREE_CODE (t) == TREE_CODE (v)); > > > > + > > > > + /* For pointer types and array types we also care about the type they > > > > + reffer to. */ > > > > + if (TREE_TYPE (t)) > > > > + return types_equal_p (TREE_TYPE (t), TREE_TYPE (v)); > > > > + > > > > + return true; > > > > +} > > > > + > > > > +/* Search list of type variants and look up first one that looks same. > > > > + At compile time, this removes duplicates of types created by > > > > front-ends. > > > > + At WPA time we also merge duplicated ODR types. */ > > > > + > > > > +static tree > > > > +first_compatible_type_variant (tree t) > > > > +{ > > > > + tree first = NULL; > > > > + if (POINTER_TYPE_P (t)) > > > > + { > > > > + tree t2 = first_compatible_type_variant (TREE_TYPE (t)); > > > > + if (t2 != TREE_TYPE (t)) > > > > + { > > > > + if (TREE_CODE (t) == POINTER_TYPE) > > > > + first = build_pointer_type_for_mode (t2, TYPE_MODE (t), > > > > + TYPE_REF_CAN_ALIAS_ALL > > > > (t)); > > > > + else > > > > + first = build_reference_type_for_mode (t2, TYPE_MODE (t), > > > > + TYPE_REF_CAN_ALIAS_ALL > > > > (t)); > > > > + } > > > > + else > > > > + first = TYPE_MAIN_VARIANT (t); > > > > + gcc_assert (TYPE_MAIN_VARIANT (first) == first); > > > > + } > > > > + else > > > > + first = prevailing_odr_type (t); > > > > + > > > > + gcc_checking_assert (gimple_canonical_types_compatible_p (t, first, > > > > false)); > > > > + > > > > + for (tree v = first; v; v = TYPE_NEXT_VARIANT (v)) > > > > + if (types_equal_p (t, v)) > > > > + { > > > > + gcc_checking_assert (gimple_canonical_types_compatible_p (t, v, > > > > false)); > > > > + gcc_checking_assert (!in_lto_p || TYPE_CANONICAL (t) == > > > > TYPE_CANONICAL (v)); > > > > + gcc_checking_assert (get_alias_set (t) == get_alias_set (v)); > > > > + return v; > > > > + } > > > > + > > > > + /* Most of the time we will find v itself in the list. With ODR > > > > type merging > > > > + we however merge across TYPE_MAIN_VARIANTs and in that case we > > > > may not > > > > + have corresponding type in the list. */ > > > > + tree v = build_variant_type_copy (first); > > > > + TYPE_READONLY (v) = TYPE_READONLY (t); > > > > + TYPE_VOLATILE (v) = TYPE_VOLATILE (t); > > > > + TYPE_ATOMIC (v) = TYPE_ATOMIC (t); > > > > + TYPE_RESTRICT (v) = TYPE_RESTRICT (t); > > > > + TYPE_ADDR_SPACE (v) = TYPE_ADDR_SPACE (t); > > > > + SET_TYPE_ALIGN (v, TYPE_ALIGN (t)); > > > > + TYPE_NAME (v) = TYPE_NAME (t); > > > > + TYPE_ATTRIBUTES (v) = TYPE_ATTRIBUTES (t); > > > > + TREE_TYPE (v) = TREE_TYPE (t); > > > > + gcc_checking_assert (types_equal_p (t, v)); > > > > + gcc_checking_assert (gimple_canonical_types_compatible_p (t, v, > > > > false)); > > > > + gcc_checking_assert (!in_lto_p || TYPE_CANONICAL (t) == > > > > TYPE_CANONICAL (v)); > > > > + gcc_checking_assert (get_alias_set (t) == get_alias_set (v)); > > > > + return v; > > > > +} > > > > + > > > > + > > > > +/* Look up NODE in the type table and write the index for it to OB. > > > > + Eliminate unnecesary type variants. */ > > > > > > > > static void > > > > output_type_ref (struct output_block *ob, tree node) > > > > { > > > > + bool existed_p; > > > > + unsigned int &index > > > > + = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].tree_hash_table-> > > > > + get_or_insert (node, &existed_p); > > > > streamer_write_record_start (ob, LTO_type_ref); > > > > - lto_output_type_ref_index (ob->decl_state, ob->main_stream, node); > > > > + > > > > + if (existed_p) > > > > + { > > > > + streamer_write_uhwi_stream (ob->main_stream, index); > > > > + return; > > > > + } > > > > + > > > > + tree node2 = first_compatible_type_variant (node); > > > > + > > > > + /* If NODE is first variant, just add it into encoder. */ > > > > + if (node2 == node) > > > > + { > > > > + index = > > > > ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.length (); > > > > + if (streamer_dump_file) > > > > + { > > > > + print_node_brief (streamer_dump_file, > > > > + " Encoding indexable ", node, 4); > > > > + fprintf (streamer_dump_file, " as %i\n", index); > > > > + } > > > > + ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.safe_push > > > > (node); > > > > + streamer_write_uhwi_stream (ob->main_stream, index); > > > > + return; > > > > + } > > > > + > > > > + index = 0xdead; > > > > + > > > > + if (streamer_dump_file) > > > > + { > > > > + print_node_brief (streamer_dump_file, " Type ", node, 4); > > > > + fprintf (streamer_dump_file, " prevailed by "); > > > > + print_node_brief (streamer_dump_file, "", node2, 4); > > > > + fprintf (streamer_dump_file, "\n"); > > > > + } > > > > + > > > > + /* See if we already assgined one to the first variant. If that is > > > > the case > > > > + only duplicate the entry in the hashtable so we do not need to > > > > call > > > > + first_compatible_type_variant redundantly. */ > > > > + unsigned int &index2 > > > > + = ob->decl_state->streams[LTO_DECL_STREAM_TYPE].tree_hash_table-> > > > > + get_or_insert (node2, &existed_p); > > > > + /* The reference may be changed by hash table expansion. */ > > > > + unsigned int i = index2; > > > > + > > > > + if (existed_p) > > > > + { > > > > + ob->decl_state->streams[LTO_DECL_STREAM_TYPE]. > > > > + tree_hash_table->get_or_insert (node, &existed_p) = i; > > > > + streamer_write_uhwi_stream (ob->main_stream, i); > > > > + return; > > > > + } > > > > + else > > > > + { > > > > + index2 = > > > > ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.length (); > > > > + i = index2; > > > > + if (streamer_dump_file) > > > > + { > > > > + print_node_brief (streamer_dump_file, > > > > + " Encoding indexable ", node2, 4); > > > > + fprintf (streamer_dump_file, " as %i \n", i); > > > > + } > > > > + ob->decl_state->streams[LTO_DECL_STREAM_TYPE].trees.safe_push > > > > (node2); > > > > + /* We need to search again to watch hash table resize. */ > > > > + ob->decl_state->streams[LTO_DECL_STREAM_TYPE]. > > > > + tree_hash_table->get_or_insert (node, &existed_p) = i; > > > > + gcc_assert (existed_p); > > > > + streamer_write_uhwi_stream (ob->main_stream, i); > > > > + } > > > > + return; > > > > } > > > > > > > > > > > > @@ -978,12 +1171,15 @@ hash_tree (struct streamer_tree_cache_d > > > > #define visit(SIBLING) \ > > > > do { \ > > > > unsigned ix; \ > > > > - if (!SIBLING) \ > > > > + tree t2 = SIBLING; \ > > > > + if (t2 && TYPE_P (t2)) \ > > > > + t2 = first_compatible_type_variant (t2); \ > > > > + if (!t2) \ > > > > hstate.add_int (0); \ > > > > - else if (streamer_tree_cache_lookup (cache, SIBLING, &ix)) \ > > > > + else if (streamer_tree_cache_lookup (cache, t2, &ix)) \ > > > > hstate.add_int (streamer_tree_cache_get_hash (cache, ix)); \ > > > > else if (map) \ > > > > - hstate.add_int (*map->get (SIBLING)); \ > > > > + hstate.add_int (*map->get (t2)); \ > > > > else \ > > > > hstate.add_int (1); \ > > > > } while (0) > > > > @@ -1554,6 +1750,10 @@ DFS::DFS_write_tree (struct output_block > > > > if (this_ref_p && tree_is_indexable (expr)) > > > > return; > > > > > > > > + /* Replace type by its prevailing variant. */ > > > > + if (TYPE_P (expr)) > > > > + expr = first_compatible_type_variant (expr); > > > > + > > > > /* Check if we already streamed EXPR. */ > > > > if (streamer_tree_cache_lookup (ob->writer_cache, expr, NULL)) > > > > return; > > > > @@ -1591,6 +1791,11 @@ lto_output_tree (struct output_block *ob > > > > return; > > > > } > > > > > > > > + if (TYPE_P (expr)) > > > > + expr = first_compatible_type_variant (expr); > > > > + > > > > + gcc_checking_assert (!this_ref_p || !tree_is_indexable (expr)); > > > > + > > > > existed_p = streamer_tree_cache_lookup (ob->writer_cache, expr, &ix); > > > > if (existed_p) > > > > { > > > > @@ -2334,7 +2539,18 @@ copy_function_or_variable (struct symtab > > > > gcc_assert (lto_tree_ref_encoder_size (encoder) == 0); > > > > encoder->trees.reserve_exact (n); > > > > for (j = 0; j < n; j++) > > > > - encoder->trees.safe_push ((*trees)[j]); > > > > + { > > > > + tree t = (*trees)[j]; > > > > + if (TYPE_P (t)) > > > > + t = first_compatible_type_variant (t); > > > > + if (streamer_dump_file) > > > > + { > > > > + print_node_brief (streamer_dump_file, > > > > + " Adding reference to ", t, 4); > > > > + fprintf (streamer_dump_file, " as %i \n", (int)j); > > > > + } > > > > + encoder->trees.safe_push (t); > > > > + } > > > > } > > > > > > > > lto_free_raw_section_data (file_data, LTO_section_function_body, > > > > name, > > > > @@ -2507,6 +2723,8 @@ write_global_stream (struct output_block > > > > print_node_brief (streamer_dump_file, "", t, 4); > > > > fprintf (streamer_dump_file, "\n"); > > > > } > > > > + gcc_checking_assert (!TYPE_P (t) > > > > + || t == first_compatible_type_variant (t)); > > > > if (!streamer_tree_cache_lookup (ob->writer_cache, t, NULL)) > > > > stream_write_tree (ob, t, false); > > > > } > > > > @@ -2537,6 +2755,8 @@ write_global_references (struct output_b > > > > t = lto_tree_ref_encoder_get_tree (encoder, index); > > > > streamer_tree_cache_lookup (ob->writer_cache, t, &slot_num); > > > > gcc_assert (slot_num != (unsigned)-1); > > > > + gcc_checking_assert (!TYPE_P (t) > > > > + || t == first_compatible_type_variant (t)); > > > > data[index + 1] = slot_num; > > > > } > > > > > > > > Index: tree-ssa-alias.c > > > > =================================================================== > > > > --- tree-ssa-alias.c (revision 264689) > > > > +++ tree-ssa-alias.c (working copy) > > > > @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3. > > > > #include "tree-dfa.h" > > > > #include "ipa-reference.h" > > > > #include "varasm.h" > > > > +#include "ipa-utils.h" > > > > > > > > /* Broad overview of how alias analysis on gimple works: > > > > > > > > @@ -955,6 +956,12 @@ nonoverlapping_component_refs_of_decl_p > > > > ??? Bitfields can overlap at RTL level so punt on them. */ > > > > if (DECL_BIT_FIELD (field1) && DECL_BIT_FIELD (field2)) > > > > return false; > > > > + /* In LTO we may end up merging different variants of ODR > > > > types > > > > + but keep corresponding field decls unmerged. */ > > > > + if (in_lto_p > > > > + && type_with_linkage_p (type1) > > > > + && DECL_NAME (field1) == DECL_NAME (field2)) > > > > + return false; > > > > return true; > > > > } > > > > } > > > > @@ -1049,7 +1056,9 @@ nonoverlapping_component_refs_p (const_t > > > > { > > > > /* We're left with accessing different fields of a structure, > > > > no possible overlap. */ > > > > - if (fieldx != fieldy) > > > > + if (fieldx != fieldy > > > > + && (in_lto_p || !type_with_linkage_p (typex) > > > > + || DECL_NAME (fieldx) == DECL_NAME (fieldy))) > > > > { > > > > /* A field and its representative need to be considered > > > > the > > > > same. */ > > > > Index: tree.c > > > > =================================================================== > > > > --- tree.c (revision 264689) > > > > +++ tree.c (working copy) > > > > @@ -5845,7 +5861,12 @@ free_lang_data_in_cgraph (void) > > > > > > > > /* Traverse every type found freeing its language data. */ > > > > FOR_EACH_VEC_ELT (fld.types, i, t) > > > > - free_lang_data_in_type (t); > > > > + { > > > > + free_lang_data_in_type (t); > > > > + while (TYPE_NEXT_VARIANT (t) > > > > + && !fld.pset.contains (TYPE_NEXT_VARIANT (t))) > > > > + TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (TYPE_NEXT_VARIANT > > > > (t)); > > > > + } > > > > if (flag_checking) > > > > { > > > > FOR_EACH_VEC_ELT (fld.types, i, t) > > > > > > > > > > > > > > -- > > > Richard Biener <rguent...@suse.de> > > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, > > > HRB 21284 (AG Nuernberg) > > > > > > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)