> Hi,
> 
> this patch uses the aggregate jump functions created by the previous
> patch in the series to determine benefits of inlining a particular
> call graph edge.  It has not changed much since the last time I posted
> it, except for the presence of by_ref flags and removal of checks
> required by TBAA which we now do not use.
> 
> The patch works in fairly straightforward way.  It ads two flags to
> struct condition to specify it actually refers to an aggregate passed
> by value or something passed by reference, in both cases at a
> particular offset, also newly stored in the structures.  Functions
> which build the predicates specifying under which conditions CFG edges
> will be taken or individual statements are actually executed then
> simply also look whether a value comes from an aggregate passed to us
> in a parameter (either by value or reference) and if so, create
> appropriate conditions.  Later on, predicates are evaluated as before,
> we only also look at aggregate contents of the jump functions of the
> edge we are considering to inline when evaluating the predicates, and
> also remap the offsets of the jump functions when remapping over an
> ancestor jump function.
> 
> This patch alone makes us inline the function bar in testcase of PR
> 48636 in comment #4.  It also passes bootstrap and testing on
> x86_64-linux.  I successfully LTO-built Firefox with it too.
> 
> Thanks for all comments and suggestions,
> 
> Martin
> 
> 
> 2012-07-31  Martin Jambor  <mjam...@suse.cz>
> 
>       PR fortran/48636
>       * ipa-inline.h (condition): New fields offset, agg_contents and by_ref.
>       * ipa-inline-analysis.c (agg_position_info): New type.
>       (add_condition): New parameter aggpos, also store agg_contents, by_ref
>       and offset.
>       (dump_condition): Also dump aggregate conditions.
>       (evaluate_conditions_for_known_args): Also handle aggregate
>       conditions.  New parameter known_aggs.
>       (evaluate_properties_for_edge): Gather known aggregate contents.
>       (inline_node_duplication_hook): Pass NULL known_aggs to
>       evaluate_conditions_for_known_args.
>       (unmodified_parm): Split into unmodified_parm and unmodified_parm_1.
>       (unmodified_parm_or_parm_agg_item): New function.
>       (set_cond_stmt_execution_predicate): Handle values passed in
>       aggregates.
>       (set_switch_stmt_execution_predicate): Likewise.
>       (will_be_nonconstant_predicate): Likewise.
>       (estimate_edge_devirt_benefit): Pass new parameter known_aggs to
>       ipa_get_indirect_edge_target.
>       (estimate_calls_size_and_time): New parameter known_aggs, pass it
>       recrsively to itself and to estimate_edge_devirt_benefit.
>       (estimate_node_size_and_time): New vector known_aggs, pass it o
>       functions which need it.
>       (remap_predicate): New parameter offset_map, use it to remap aggregate
>       conditions.
>       (remap_edge_summaries): New parameter offset_map, pass it recursively
>       to itself and to remap_predicate.
>       (inline_merge_summary): Also create and populate vector offset_map.
>       (do_estimate_edge_time): New vector of known aggregate contents,
>       passed to functions which need it.
>       (inline_read_section): Stream new fields of condition.
>       (inline_write_summary): Likewise.
>       * ipa-cp.c (ipa_get_indirect_edge_target): Also examine the aggregate
>       contents.  Let all local callers pass NULL for known_aggs.
> 
>       * testsuite/gfortran.dg/pr48636.f90: New test.

OK with the following changes.

I plan to push out my inline hints code, so it would be nice if you commited 
soon 
so I cn resolve conflicts on my side.
> Index: src/gcc/ipa-inline.h
> ===================================================================
> *** src.orig/gcc/ipa-inline.h
> --- src/gcc/ipa-inline.h
> *************** along with GCC; see the file COPYING3.
> *** 28,36 ****
> --- 28,45 ----
>   
>   typedef struct GTY(()) condition
>     {
> +     /* If agg_contents is set, this is the offset from which the used data 
> was
> +        loaded.  */
> +     HOST_WIDE_INT offset;
>       tree val;
>       int operand_num;
>       enum tree_code code;
> +     /* Set if the used data were loaded from an aggregate parameter or from
> +        data received by reference.  */
> +     unsigned agg_contents : 1;
> +     /* If agg_contents is set, this differentiates between loads from data
> +        passed by reference and by value.  */
> +     unsigned by_ref : 1;

Do you have any data on memory usage?  I was originally concerned about memory 
use of the
whole predicate thingy on WPA level.  Eventually we could add simple 
inheritance on
conditions and sort them into mutiple vectors if needed. But I assume it is OK 
or we
will work out on Mozilla builds soonish.

One obvious thing is to patch CODE and the bitfields so we fit in 3 64bit words.
> *************** dump_condition (FILE *f, conditions cond
> *** 519,524 ****
> --- 554,561 ----
>         c = VEC_index (condition, conditions,
>                    cond - predicate_first_dynamic_condition);
>         fprintf (f, "op%i", c->operand_num);
> +       if (c->agg_contents)
> +     fprintf (f, "[offset: " HOST_WIDE_INT_PRINT_DEC "]", c->offset);
Dumping of by_ref?
>         if (c->code == IS_NOT_CONSTANT)
>       {
>         fprintf (f, " not constant");
> --- 718,761 ----
>         tree val;
>         tree res;
>   
> !       /* We allow call stmt to have fewer arguments than the callee function
> !      (especially for K&R style programs).  So bound check here (we assume
> !      known_aggs vector, if non-NULL, has the same length as
> !      known_vals).  */
> !       gcc_assert (!known_aggs
Perhaps checking_assert. This tends to be hot path of WPA inliner.
> --- 833,857 ----
>                                 es->param,
>                                 i)->change_prob)
>           VEC_replace (tree, known_vals, i, error_mark_node);
> +       /* TODO: When IPA-CP starts merging aggregate jump functions, use its
> +          knowledge of the caller too, just like the scalar case above.  */
> +       VEC_replace (ipa_agg_jump_function_p, known_aggs, i, &jf->agg);

By merging of arggregate functions you meen propagating from caller to callee?
> ! /* If OP refers to a value of a function parameter or value loaded from an
> !    aggregate passed to a parameter (either by value or reference), return 
> TRUE
> !    and store the number of the parameter to *INDEX_P and information whether
> !    and how it has been loaded from an aggregate into *AGGPOS.  INFO 
> describes
> !    the function parameters, STMT is the statement in which OP is used or
> !    loaded.  */
> ! 
> ! static bool
> ! unmodified_parm_or_parm_agg_item (struct ipa_node_params *info,
> !                               gimple stmt, tree op, int *index_p,
> !                               struct agg_position_info *aggpos)

I assume this follows the same startegy as the ipa-cp code, right?
> *************** set_cond_stmt_execution_predicate (struc
> *** 1480,1485 ****
> --- 1609,1615 ----
>         || gimple_call_num_args (set_stmt) != 1)
>       return;
>     op2 = gimple_call_arg (set_stmt, 0);
> +   /* TODO: Use unmodified_parm_or_parm_agg_item also here.  */

Why it is TODO?
>     base = get_base_address (op2);
>     parm = unmodified_parm (set_stmt, base ? base : op2);
>     if (!parm)
> --- 1800,1817 ----
>       return p;
>   
>     is_load = gimple_vuse (stmt) != NULL;
>     /* Loads can be optimized when the value is known.  */
>     if (is_load)
>       {
> !       tree op;
>         gcc_assert (gimple_assign_single_p (stmt));
> !       op = gimple_assign_rhs1 (stmt);
> !       if (!unmodified_parm_or_parm_agg_item (info, stmt, op, &base_index,
> !                                          &aggpos))
>       return p;
>       }
> +   else
> +     base_index = -1;

Don't we miss the actual check that the load is load of known aggregate value 
on this path?
>   /* Translate all conditions from callee representation into caller
>      representation and symbolically evaluate predicate P into new predicate.
>   
> !    INFO is inline_summary of function we are adding predicate into, 
> CALLEE_INFO
> !    is summary of function predicate P is from. OPERAND_MAP is array giving
> !    callee formal IDs the caller formal IDs. POSSSIBLE_TRUTHS is clausule of 
> all
> !    callee conditions that may be true in caller context.  TOPLEV_PREDICATE 
> is
> !    predicate under which callee is executed.  OFFSET_MAP is an array of of
> !    offsets that need to be added to conditions, negative offset means that
> !    conditions relying on values passed by reference have to be discarded
> !    because they might not be preserved (and should be considered offset zero
> !    for other purposes).  */

Do you handle cases like arg passed by value turning into arg passed by 
reference?
If not, just add an TODO.
Where do we verify that by_reference flags match?

Thanks,
Honza

Reply via email to