> On Tue, 24 Nov 2015, Jan Hubicka wrote: > > > > > > > > > We do already wrap all bases into MEM_REFs at streaming time, it would > > > > be easy to adjust it to make it effectively alias-set zero. But of > > > > course the overhead and the downstream effects of having more MEM_REFs > > > > (we strip the unneeded ones at stream-in) are unknown (compared to > > > > the effect of disabling inlining). > > > > > > Hmm, I can test in on Firefox (once I get it back to working condition). > > > > One way would be to keep current MEM_REFS stripping and conditoinal in > > get_alias_set on strict aliasing, but extend inliner to introduce them at a > > point -fno-strict-aliasing is inlined to -fstrict-aliasing. That way we > > could > > drop the code in lto-streamer-out that forcingly set alias set to 0 when > > get_alias_set == 0 and hopefully get all code transitions right. > > Yeah, that could also work. We can also rewrite overflow stuff > this way to do overflow related inlining (in one direction only?). > That is, when inlining !strict-overflow into strict-overflow code > re-write arithmetic to unsigned during inlining. > > Sth for next stage1. > > Maybe you can open an enhancement PR for these cases.
I will certainly do. sadly more I understand the implementation the easier I can consturct wrong code examples (see testcases bellow). I think the whole idea of storing TYPE_ALIAS_SET at streaming out time is not working well. First of all it does not solve optimization attribute and second we can randomly lose the info (on prestreamed type or types where canonical type merging prevails with non-0 alias set type) or push random type to alias set 0 (where canonical type merging prevail the oposite direction). I do not see how to easily fix it: canonical type merging can not make difference between alias set 0 types and others unless we make it clear that the derived types can not alias (which I think they can). I suppose only way here would be to force all alias set 0 types to be variant and revisit all the code to check it before going to main vairant&canonical type. Compared to that I like the solution with flag in MEM_REF better, but that of course is an invasive change where we will need to revisit all MEM_REF construction to set the flag correctly. I wonder what would you think of the following patch. It basically makes type representation to be completely agnostic of -fstrict-aliasing (it should be because -fstrict-aliasing is function local property, while types are not) and makes -fstrict-aliasing to be purely evaulated at a time we ask TBAA oracle. I disabled the TYPE_ALIAS_SET streaming and instead I assert that LTO's implementation will be compatible (which caught some surprises where we tamper with alias set in rtti.c and free_lang_data) I modifed inliner to make the -fstrict-aliasing infectious. That is instead of forbidding the inlinng it simply drops the flags in caller. Moreover I play the game with COMDATs and the fact that whenever you inline comdat w/o explicit optimization attribute to caller w/o explicit optimization attribute you may assume that the function body is valid under caller flags. We already play this game in can_inline_p. This means that no inlines are blocked in firefox. Of course it is always dodgy to change optimization flags after ealry optimizations that may have made code previously valid wrt -fstrict-aliasing invalid. I reviewed the 26 uses of -fstrict-aliasing in the compiler and it seems that only ipa-icf and fold-const may result in such transform. So I disabled them pre-inlining (which I think is good idea for time being until we make -fstrict-aliasing part of MEM_REF: both transforms are far less important than inlining). Once we update to MEM_REF we can easilu drop this. The patch bootstraps/regtests x86_64-linux and seems to do decent job on Firefox (actually increasing effectivity of TBAA, only 26 functions are demoted to -fno-strict-aliasing because of the new code in ipa-inline-transform). I plan to do more testing tomorrow (I still can't build the firefox binary to do some benchamrks). Honza * ipa-inline-transform.c (inline_call): Merge -fno-strict-aliasing if needed. * ipa-icf-gimple.c (func_checker::compatible_types_p): Pass true to get_alias_set. * alias.c (get_alias_set): Add new strict flag. (new_alias_set): Always produce new set. (record_component_aliases): Pass true to get_alias_set. * alias.h (get_alias_set): New parameter STRICT which is false by default. * fold-const.c (operand_equal_p): Before inlining don not permit any transformations that would be invalid if code became strict-aliasing * tree-streamer-out.c (pack_ts_type_common_value_fields): Do not stream TYPE_ALIAS_SET; sanity check that no alias set 0 info is lost. * tree-streamer-in.c (unpack_ts_type_common_value_fields): Do not stream in TYPE_ALIAS_SET. * tree.c (free_lang_data): Pass true to get_alias_set. * lto-symtab.c (warn_type_compatibility_p): Pass true to get_alias_set * c-common.c (parse_optimize_options): Remove ugly hack that makes strict-aliasing changes to be silently ignored. * gcc.dg/lto/alias-1_0.c: New testcase. * gcc.dg/lto/alias-1_1.c: New testcase. * gcc.c-torture/execute/alias-1.c: New testcase. Index: ipa-inline-transform.c =================================================================== --- ipa-inline-transform.c (revision 230924) +++ ipa-inline-transform.c (working copy) @@ -322,6 +322,23 @@ inline_call (struct cgraph_edge *e, bool if (DECL_FUNCTION_PERSONALITY (callee->decl)) DECL_FUNCTION_PERSONALITY (to->decl) = DECL_FUNCTION_PERSONALITY (callee->decl); + if (!opt_for_fn (callee->decl, flag_strict_aliasing) + && opt_for_fn (to->decl, flag_strict_aliasing) + && (!callee->merged + || lookup_attribute ("optimization", DECL_ATTRIBUTES (e->caller->decl)) + || lookup_attribute ("optimization", DECL_ATTRIBUTES (callee->decl)))) + { + struct gcc_options opts = global_options; + cl_optimization_restore (&opts, + TREE_OPTIMIZATION (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl))); + opts.x_flag_strict_aliasing = false; + if (dump_file) + fprintf (dump_file, "Dropping flag_strict_aliasing on %s:%i\n", + to->name (), to->order); + build_optimization_node (&opts); + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl) + = build_optimization_node (&opts); + } /* If aliases are involved, redirect edge to the actual destination and possibly remove the aliases. */ Index: ipa-icf-gimple.c =================================================================== --- ipa-icf-gimple.c (revision 230924) +++ ipa-icf-gimple.c (working copy) @@ -241,7 +241,7 @@ func_checker::compatible_types_p (tree t For time being just avoid calling get_alias_set on types that are not having alias sets defined at all. */ if (type_with_alias_set_p (t1) && type_with_alias_set_p (t2) - && get_alias_set (t1) != get_alias_set (t2)) + && get_alias_set (t1, true) != get_alias_set (t2, true)) return return_false_with_msg ("alias sets are different"); return true; Index: alias.c =================================================================== --- alias.c (revision 230924) +++ alias.c (working copy) @@ -809,17 +809,21 @@ init_alias_set_entry (alias_set_type set } /* Return the alias set for T, which may be either a type or an - expression. Call language-specific routine for help, if needed. */ + expression. Call language-specific routine for help, if needed. + If STRICT is true, ignore value of flag_strict_aliasing. This is needed + in cases we are in -fno-strict-aliasing region but still need to compute + alias sets for some reason (this is used, for example, by rtti code to copy + alias set from type to type). */ alias_set_type -get_alias_set (tree t) +get_alias_set (tree t, bool strict) { alias_set_type set; /* If we're not doing any alias analysis, just assume everything aliases everything else. Also return 0 if this or its type is an error. */ - if (! flag_strict_aliasing || t == error_mark_node + if ((! flag_strict_aliasing && !strict)|| t == error_mark_node || (! TYPE_P (t) && (TREE_TYPE (t) == 0 || TREE_TYPE (t) == error_mark_node))) return 0; @@ -898,7 +902,7 @@ get_alias_set (tree t) /* For arrays with unknown size the conservative answer is the alias set of the element type. */ if (TREE_CODE (t) == ARRAY_TYPE) - return get_alias_set (TREE_TYPE (t)); + return get_alias_set (TREE_TYPE (t), true); /* But return zero as a conservative answer for incomplete types. */ return 0; @@ -920,7 +924,7 @@ get_alias_set (tree t) normal usage. And indeed lets vectors be treated more like an array slice. */ else if (TREE_CODE (t) == VECTOR_TYPE) - set = get_alias_set (TREE_TYPE (t)); + set = get_alias_set (TREE_TYPE (t), true); /* Unless the language specifies otherwise, treat array types the same as their components. This avoids the asymmetry we get @@ -933,7 +937,7 @@ get_alias_set (tree t) else if (TREE_CODE (t) == ARRAY_TYPE && (!TYPE_NONALIASED_COMPONENT (t) || TYPE_STRUCTURAL_EQUALITY_P (t))) - set = get_alias_set (TREE_TYPE (t)); + set = get_alias_set (TREE_TYPE (t), true); /* From the former common C and C++ langhook implementation: @@ -997,7 +1001,7 @@ get_alias_set (tree t) (see record_component_aliases) and thus it is safe it to use it for pointers to types with TYPE_STRUCTURAL_EQUALITY_P. */ if (TREE_CODE (p) == VOID_TYPE || TYPE_STRUCTURAL_EQUALITY_P (p)) - set = get_alias_set (ptr_type_node); + set = get_alias_set (ptr_type_node, true); else { /* Rebuild pointer type starting from canonical types using @@ -1085,15 +1089,10 @@ get_alias_set (tree t) alias_set_type new_alias_set (void) { - if (flag_strict_aliasing) - { - if (alias_sets == 0) - vec_safe_push (alias_sets, (alias_set_entry *) NULL); - vec_safe_push (alias_sets, (alias_set_entry *) NULL); - return alias_sets->length () - 1; - } - else - return 0; + if (alias_sets == 0) + vec_safe_push (alias_sets, (alias_set_entry *) NULL); + vec_safe_push (alias_sets, (alias_set_entry *) NULL); + return alias_sets->length () - 1; } /* Indicate that things in SUBSET can alias things in SUPERSET, but that @@ -1169,7 +1168,7 @@ record_alias_subset (alias_set_type supe void record_component_aliases (tree type) { - alias_set_type superset = get_alias_set (type); + alias_set_type superset = get_alias_set (type, true); tree field; if (superset == 0) @@ -1215,16 +1214,17 @@ record_component_aliases (tree type) if (POINTER_TYPE_P (t)) t = ptr_type_node; else if (flag_checking) - gcc_checking_assert (get_alias_set (t) - == get_alias_set (TREE_TYPE (field))); + gcc_checking_assert (get_alias_set (t, true) + == get_alias_set (TREE_TYPE (field), + true)); } - record_alias_subset (superset, get_alias_set (t)); + record_alias_subset (superset, get_alias_set (t, true)); } break; case COMPLEX_TYPE: - record_alias_subset (superset, get_alias_set (TREE_TYPE (type))); + record_alias_subset (superset, get_alias_set (TREE_TYPE (type), true)); break; /* VECTOR_TYPE and ARRAY_TYPE share the alias set with their Index: alias.h =================================================================== --- alias.h (revision 230924) +++ alias.h (working copy) @@ -21,7 +21,7 @@ along with GCC; see the file COPYING3. #define GCC_ALIAS_H extern alias_set_type new_alias_set (void); -extern alias_set_type get_alias_set (tree); +extern alias_set_type get_alias_set (tree, bool strict = false); extern alias_set_type get_deref_alias_set (tree); extern alias_set_type get_varargs_alias_set (void); extern alias_set_type get_frame_alias_set (void); Index: cp/rtti.c =================================================================== --- cp/rtti.c (revision 230924) +++ cp/rtti.c (working copy) @@ -300,10 +300,10 @@ typeid_ok_p (void) /* Make sure abi::__type_info_pseudo has the same alias set as std::type_info. */ if (! TYPE_ALIAS_SET_KNOWN_P (pseudo_type_info)) - TYPE_ALIAS_SET (pseudo_type_info) = get_alias_set (type_info_type); + TYPE_ALIAS_SET (pseudo_type_info) = get_alias_set (type_info_type, true); else gcc_assert (TYPE_ALIAS_SET (pseudo_type_info) - == get_alias_set (type_info_type)); + == get_alias_set (type_info_type, true)); return true; } Index: fold-const.c =================================================================== --- fold-const.c (revision 230924) +++ fold-const.c (working copy) @@ -2987,7 +2987,7 @@ operand_equal_p (const_tree arg0, const_ flags))) return 0; /* Verify that accesses are TBAA compatible. */ - if (flag_strict_aliasing + if ((flag_strict_aliasing || !cfun->after_inlining) && (!alias_ptr_types_compatible_p (TREE_TYPE (TREE_OPERAND (arg0, 1)), TREE_TYPE (TREE_OPERAND (arg1, 1))) Index: lto/lto-symtab.c =================================================================== --- lto/lto-symtab.c (revision 230924) +++ lto/lto-symtab.c (working copy) @@ -276,8 +276,8 @@ warn_type_compatibility_p (tree prevaili we make ptr_type_node to TBAA compatible with every other type. */ if (type_with_alias_set_p (type) && type_with_alias_set_p (prevailing_type)) { - alias_set_type set1 = get_alias_set (type); - alias_set_type set2 = get_alias_set (prevailing_type); + alias_set_type set1 = get_alias_set (type, true); + alias_set_type set2 = get_alias_set (prevailing_type, true); if (set1 && set2 && set1 != set2 && (!POINTER_TYPE_P (type) || !POINTER_TYPE_P (prevailing_type) Index: tree-streamer-out.c =================================================================== --- tree-streamer-out.c (revision 230924) +++ tree-streamer-out.c (working copy) @@ -317,13 +319,18 @@ pack_ts_type_common_value_fields (struct bp_pack_value (bp, TYPE_RESTRICT (expr), 1); bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1); bp_pack_value (bp, TYPE_READONLY (expr), 1); - /* Make sure to preserve the fact whether the frontend would assign - alias-set zero to this type. Do that only for main variants, because - type variants alias sets are never computed. - FIXME: This does not work for pre-streamed builtin types. */ - bp_pack_value (bp, (TYPE_ALIAS_SET (expr) == 0 - || (!in_lto_p && TYPE_MAIN_VARIANT (expr) == expr - && get_alias_set (expr) == 0)), 1); + /* We used to stream TYPE_ALIAS_SET == 0 information to let frontends mark + types that are opaque for TBAA. This however did not work as intended, + becuase TYPE_ALIAS_SET == 0 was regularly lost in type merging. + + Instead now double check that all aliaset set 0 types will be alias set + 0 in LTO world, too. */ + gcc_checking_assert (!type_with_alias_set_p (expr) + || !canonical_type_used_p (expr) + || TYPE_ALIAS_SET (expr) != 0 + || expr == char_type_node + || expr == signed_char_type_node + || expr == unsigned_char_type_node); if (RECORD_OR_UNION_TYPE_P (expr)) { bp_pack_value (bp, TYPE_TRANSPARENT_AGGR (expr), 1); Index: c-family/c-common.c =================================================================== --- c-family/c-common.c (revision 230924) +++ c-family/c-common.c (working copy) @@ -9987,7 +9988,6 @@ parse_optimize_options (tree args, bool bool ret = true; unsigned opt_argc; unsigned i; - int saved_flag_strict_aliasing; const char **opt_argv; struct cl_decoded_option *decoded_options; unsigned int decoded_options_count; @@ -10080,8 +10080,6 @@ parse_optimize_options (tree args, bool for (i = 1; i < opt_argc; i++) opt_argv[i] = (*optimize_args)[i]; - saved_flag_strict_aliasing = flag_strict_aliasing; - /* Now parse the options. */ decode_cmdline_options_to_array_default_mask (opt_argc, opt_argv, &decoded_options, @@ -10092,9 +10090,6 @@ parse_optimize_options (tree args, bool targetm.override_options_after_change(); - /* Don't allow changing -fstrict-aliasing. */ - flag_strict_aliasing = saved_flag_strict_aliasing; - optimize_args->truncate (0); return ret; } Index: tree-streamer-in.c =================================================================== --- tree-streamer-in.c (revision 230924) +++ tree-streamer-in.c (working copy) @@ -366,7 +366,6 @@ unpack_ts_type_common_value_fields (stru TYPE_RESTRICT (expr) = (unsigned) bp_unpack_value (bp, 1); TYPE_USER_ALIGN (expr) = (unsigned) bp_unpack_value (bp, 1); TYPE_READONLY (expr) = (unsigned) bp_unpack_value (bp, 1); - TYPE_ALIAS_SET (expr) = bp_unpack_value (bp, 1) ? 0 : -1; if (RECORD_OR_UNION_TYPE_P (expr)) { TYPE_TRANSPARENT_AGGR (expr) = (unsigned) bp_unpack_value (bp, 1); Index: testsuite/gcc.dg/lto/alias-1_0.c =================================================================== --- testsuite/gcc.dg/lto/alias-1_0.c (revision 0) +++ testsuite/gcc.dg/lto/alias-1_0.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-lto-do run } */ +/* { dg-lto-options { { -O2 -flto } } } */ +int val; + +int *ptr = &val; +float *ptr2 = &val; + +extern void typefun(void); + + +main() +{ + *ptr=1; + typefun (); + if (*ptr) + link_error (); + return 0; +} + Index: testsuite/gcc.dg/lto/alias-1_1.c =================================================================== --- testsuite/gcc.dg/lto/alias-1_1.c (revision 0) +++ testsuite/gcc.dg/lto/alias-1_1.c (revision 0) @@ -0,0 +1,7 @@ +/* { dg-options "-fno-strict-aliasing" } */ +extern float *ptr2; +__attribute__((optimize ("-fno-strict-aliasing"))) +typefun () +{ + *ptr2=0; +} Index: testsuite/gcc.c-torture/execute/alias-1.c =================================================================== --- testsuite/gcc.c-torture/execute/alias-1.c (revision 0) +++ testsuite/gcc.c-torture/execute/alias-1.c (revision 0) @@ -0,0 +1,19 @@ +int val; + +int *ptr = &val; +float *ptr2 = &val; + +__attribute__((optimize ("-fno-strict-aliasing"))) +typepun () +{ + *ptr2=0; +} + +main() +{ + *ptr=1; + typepun (); + if (*ptr) + __builtin_abort (); +} + Index: tree.c =================================================================== --- tree.c (revision 230924) +++ tree.c (working copy) @@ -5971,7 +5971,8 @@ free_lang_data (void) while the slots are still in the way the frontends generated them. */ for (i = 0; i < itk_none; ++i) if (integer_types[i]) - TYPE_ALIAS_SET (integer_types[i]) = get_alias_set (integer_types[i]); + TYPE_ALIAS_SET (integer_types[i]) = get_alias_set (integer_types[i], + true); /* Traverse the IL resetting language specific information for operands, expressions, etc. */