On Thu, May 26, 2011 at 9:53 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Fri, Apr 22, 2011 at 5:35 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> Hi,
>> this patch implements infrastructure to summarize function body size&time in 
>> a
>> way that is sensitive on function context.  At the moment context means
>>
>>  1) if function is inline or offline
>>  2) if some of parameters are known compile time constants.
>>
>> but we should handle more later.
>>
>> The analysis is implemented by introducing notion of predicates, that are
>> simple logical formulas in conjunctive-disjunctive form on conditions.
>> Conditions are simple tests like "function is not inlined" "op0 is not
>> constant" "op0 > 6". That is one can express things like "this statement will
>> execute if op1 >6 or op0 is not constant".
>>
>> The patch implements simple infrastructure to represent the predicates.  
>> There
>> are hard limits on everything - i.e. there are no more than 32 different
>> conditions and no more than 8 clauses.  This makes it possible to test 
>> clauses
>> by simple logicaloperations on integers and represent predicates at array of 
>> 8
>> integers thatis very cheap.  The implementation details are quite contained 
>> so
>> we might relax the limits, but I don't really see a need for that.
>>
>> The main point of designing this was to allow effective way of evaulating 
>> those
>> predicates for given context, since this happens many times during inlining,
>> and to not make WPA memory usage grow too much.  At the same time I wanted 
>> the
>> infrastructure to be flexible enough to allow adding more tricks in future.
>> Like we might consider extra inlining points if callee uses the argument to
>> drive number of iterations of loop or when caller pass a pointer to a static
>> variable that might be SRAed after inlining etc. etc.
>>
>> At the moment only consumer of predicates is size/time vector that is vector
>> of simple entries consiting of size, time and predicate.  Function size/time 
>> is then
>> computed as sum of all entries whose predicate might be true in given 
>> context +
>> size/time of all call edges (this is because call edges can disappear at 
>> different
>> conditions or be turned into constant).
>>
>> I plan to add more uses of predicates in near future - i.e. attaching
>> predicates to edges so we know what calls will be optimized out at WPA time.
>> Also I plan to use the analysis to drive function clonning (i.e. partial
>> specialization): when function is called from several places with the same
>> context and the context makes a difference at expected runtime, clone the
>> function.
>>
>> The analysis part deciding on predicates is currently very simple, kind of 
>> proof
>> of concept:
>>
>>  1) Every BB gets assigned predicate when it is reachable. At the moment it 
>> happens
>>    only if the all predecestors of BB are conditionals that tests function
>>    parameter.  Obviously we will need to propagate this info for sane 
>> results.
>>
>>  2) Every statement gets assigned predicate when it will become constant. 
>> Again
>>    it is very simple, only statements using only function arguments are 
>> considered.
>>    Simple propagation across SSA graph will do better.
>>
>>  3) Finally the statement is accounted at a predicate that is conjunction of 
>> both
>>    above.
>>  All call statements are accounted unconditoinally because we don't 
>> predicate edges, yet.
>>
>> While computing function sizes is fast, it is not as speedy as original
>> "time-benefit".  Small function inliner is quite insane about querying the
>> sizes/times again and again while it keeps up to date its queue. For this
>> purpose I added cache same way as we already cache function growths.  Not 
>> that
>> I would not plan to make inliner&badness more sensible here soon.
>> So far I did not want to touch the actual heuristics part of inliner and 
>> hope to do
>> it after getting the infrastructure to the point I want it to be for 4.7.
>>
>> The patch bootstraps&regtests.  I tested that compile time implication on
>> tramp3d is positive (because of caching, without it inliner grows from 1.1% 
>> to
>> 4% of compile time) I also tested SPECs and there are not great changes, that
>> is not bad result given stupidity of the analysis ;).
>>
>> I will look into Mozilla even though I plan to look into solving scability
>> problems of the inliner as followup instead of snowballing this.
>>
>> I plan to work on the patch little further during weekend, in particular make
>> dumps more readable since they got bit convoluted by random formatting. But i
>> am sending the patch for comments and I plan to get it finished till next 
>> week.
>>
>> Honza
>>
>>        * gengtype.c (open_base_files): Add ipa-inline.h include.
>>        * ipa-cp.c (ipcp_get_lattice, ipcp_lattice_from_jfunc): Move to 
>> ipa-prop.c
>>        update all uses.
>>        * ipa-prop.c: (ipa_get_lattice, ipa_lattice_from_jfunc): ... here.
>>        * ipa-inline-transform.c (inline_call): Use inline_merge_summary to 
>> merge
>>        summary of inlined function into former caller.
>>        * ipa-inline.c (max_benefit): Remove.
>>        (edge_badness): Compensate for removal of benefits.
>>        (update_caller_keys): Use 
>> reset_node_growth_cache/reset_edge_growth_cache.
>>        (update_callee_keys): Likewise.
>>        (update_all_callee_keys): Likewise.
>>        (inline_small_functions): Do not collect max_benefit; do not
>>        reset stimated_growth; call free_growth_caches and 
>> initialize_growth_caches.
>>        * ipa-inline.h (struct condition, type clause_t, struct predicate, 
>> struct
>>        size_time_entry): New structures.
>>        (INLINE_SIZE_SCALE, INLINE_TIME_SCALE, MAX_CLAUSES): New constants.
>>        (inline_summary): Remove size_inlining_benefit, time_inlining_benefit 
>> and
>>        estimated_growth.
>>        (edge_growth_cache_entry): New structure.
>>        (node_growth_cache, edge_growth_cache): New global vars.
>>        (estimate_growth): Turn into inline.
>>        (inline_merge_summary, do_estimate_edge_growth, do_estimate_edge_time,
>>        initialize_growth_caches, free_growth_caches): Declare.
>>        (estimate_edge_growth): Rewrite.
>>        (estimate_edge_time): Implement as inline cache lookup.
>>        (reset_node_growth_cache, reset_edge_growth_cache): New inline 
>> functions.
>>        (MAX_TIME): Reduce to allow multiplicatoin by INLINE_SIZE_SCALE.
>>        (NUM_CONDITIONS): New constant.
>>        (predicate_conditions): New enum.
>>        (IS_NOT_CONSTANT): New constant.
>>        (edge_removal_hook_holder): New var.
>>        (node_growth_cache, edge_growth_cache): New global vars.
>>        (true_predicate, single_cond_predicate, false_predicate, 
>> not_inlined_predicate,
>>        add_condition, add_clause, and_predicates, or_predicates, 
>> predicates_equal_p,
>>        evaulate_predicate, dump_condition, dump_clause, dump_predicate, 
>> account_size_time,
>>        evaulate_conditions_for_edge): New functions.
>>        (inline_summary_alloc): Move to heap.
>>        (inline_node_removal_hook): Clear condition and entry vectors.
>>        (inline_edge_removal_hook): New function.
>>        (initialize_growth_caches, free_growth_caches): New function.
>>        (dump_inline_summary): Update.
>>        (edge_execution_predicate): New function.
>>        (will_be_nonconstant_predicate): New function.
>>        (estimate_function_body_sizes): Compute BB and constantness 
>> predicates.
>>        (compute_inline_parameters): Do not clear estimated_growth.
>>        (estimate_edge_size_and_time): New function.
>>        (estimate_calls_size_and_time): New function.
>>        (estimate_callee_size_and_time): New function.
>>        (remap_predicate): New function.
>>        (inline_merge_summary): New function.
>>        (do_estimate_edge_time): New function based on...
>>        (estimate_edge_time): ... this one.
>>        (do_estimate_edge_growth): New function.
>>        (do_estimate_growth): New function based on....
>>        (estimate_growth): ... this one.
>>        (inline_analyze_function): Analyze after deciding on jump functions.
>>        (inline_read_section): New function.
>>        (inline_read_summary): Use it.
>>        (inline_write_summary): Write all the new data.
>>        * ipa-prop.c (ipa_get_param_decl_index): Export.
>>        (ipa_lattice_from_jfunc): Move here from ipa-cp.c
>>        * ipa-prop.h (ipa_get_param_decl_index, ipa_lattice_from_jfunc): 
>> Declare.
>>        (ipa_get_lattice): Move hre from ipa-cp.c
>>        * Makefile.in (GTFILES): Add ipa-inline.h and ipa-inline-analysis.c
>>        * params.def (PARAM_EARLY_INLINING_INSNS): Set to 11.
>>
>
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49179
>
>

This also caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49091


-- 
H.J.

Reply via email to