> Historically the pragma Inline_Always of GNAT had been implemented in the FE 
> because the RTL inliner and then the Tree inliner were either not invoked at 
> -O0 or not powerful enough to inline some constructs.  But this approach had 
> drawbacks, especially wrt debug info.  These restrictions were gradually 
> lifted and now the pragma is entirely piggybacked on the Tree inliner.
> 
> This went mostly OK, except for a few cases where intrinsic operations that 
> used to be reasonably handled at -O0 now generate awful code, the typical 
> example being a modulus or division intrinsic by a power of 2 generating a 
> fully-fledged modulus or division instruction instead of a simple shift.
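
To make the regression concrete, here is a minimal C analogue of that pattern
(illustrative only; the names are made up and not taken from the GNAT runtime):

    /* An always_inline wrapper whose divisor is a power-of-2 constant at the
       call site.  Without constant propagation during inlining, -O0 keeps the
       constant in a temporary, so the modulus is not folded and a real
       div/idiv is emitted instead of a simple AND.  */
    static inline __attribute__((always_inline))
    unsigned mod_pow2 (unsigned x, unsigned n)
    {
      return x % n;
    }

    unsigned
    f (unsigned x)
    {
      return mod_pow2 (x, 8);   /* would ideally become x & 7 even at -O0 */
    }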
> 
> Therefore the attached patch implements anonymous constant propagation in the 
> inliner to fix the code quality regression.
> 
> Tested on x86_64-suse-linux, OK for the mainline?
> 
>    if (TREE_CODE (*tp) == SSA_NAME)
>      {
> -      *tp = remap_ssa_name (*tp, id);
> +      tree t = remap_ssa_name (*tp, id);
> +      /* Perform anonymous constant propagation, this makes it possible to
> +         generate reasonable code even at -O0 for operators implemented as
> +         inline functions.  */
> +      if (TREE_CODE (t) == SSA_NAME
> +       && SSA_NAME_DEF_STMT (t)
> +       && (!SSA_NAME_VAR (t) || DECL_IGNORED_P (SSA_NAME_VAR (t)))
> +       && gimple_assign_copy_p (SSA_NAME_DEF_STMT (t))
> +       && is_gimple_min_invariant
> +          (gimple_assign_rhs1 (SSA_NAME_DEF_STMT (t))))
> +     *tp = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (t));
> +      else
> +     *tp = t;

This looks like a good idea to me (though I can't approve it).  We may want to
lift the (!SSA_NAME_VAR (t) || DECL_IGNORED_P (SSA_NAME_VAR (t))) check when
optimize is set - the amount of garbage the inliner produces is large and
killing it early is better than killing it later.  This has a chance to help
early opts, where the ordering between ccp and einline is quite hard.
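If I read that right, the relaxed condition would look roughly like this
(untested sketch, reusing the names from the hunk above):

      /* Sketch: also propagate into user-visible SSA names once we are
         optimizing, per the suggestion above.  */
      if (TREE_CODE (t) == SSA_NAME
          && SSA_NAME_DEF_STMT (t)
          && (optimize
              || !SSA_NAME_VAR (t)
              || DECL_IGNORED_P (SSA_NAME_VAR (t)))
          && gimple_assign_copy_p (SSA_NAME_DEF_STMT (t))
          && is_gimple_min_invariant
             (gimple_assign_rhs1 (SSA_NAME_DEF_STMT (t))))
        *tp = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (t));
      else
        *tp = t;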
>        *walk_subtrees = 0;
>        return NULL;
>      }
> @@ -1965,7 +1977,7 @@ copy_bb (copy_body_data *id, basic_block
>  
>         /* Statements produced by inlining can be unfolded, especially
>            when we constant propagated some operands.  We can't fold
> -          them right now for two reasons:
> +          them right now in the general case for two reasons:
>            1) folding require SSA_NAME_DEF_STMTs to be correct
>            2) we can't change function calls to builtins.
>            So we just mark statement for later folding.  We mark
> @@ -1974,7 +1986,10 @@ copy_bb (copy_body_data *id, basic_block
>            foldable indirectly are updated.  If this turns out to be
>            expensive, copy_body can be told to watch for nontrivial
>            changes.  */
> -       if (id->statements_to_fold)
> +       if (gimple_assign_cast_p (stmt)
> +           && is_gimple_min_invariant (gimple_assign_rhs1 (stmt)))
> +         fold_stmt (&copy_gsi);
> +       else if (id->statements_to_fold)

Why is this needed?  Is it just an optimization because we know that folding of
casts will not do crazy stuff like SSA graph walking (that was the original
reason for delaying the folding), or just an optimization to reduce the size of
the list?

Honza
>           id->statements_to_fold->add (stmt);
>  
>         /* We're duplicating a CALL_EXPR.  Find any corresponding
