On 06/21/2017 10:26 AM, Jan Hubicka wrote: >> >> Ok, so I fixed that in the described way. There's one remaining fallout of: >> gcc/testsuite/gcc.dg/tree-ssa/ipa-split-5.c >> >> Where a fnsplit is properly done, but then it's again inlined: >> >> Considering split_me.part.0/5 with 23 size >> to be inlined into test/2 in unknown:0 >> Estimated badness is -0.000001, frequency 0.33. >> Inlined split_me.part.0 into test which now has time 50.300000 and size 44, >> net change of +17. >> >> Considering split_me.part.0/5 with 23 size >> to be inlined into test/2 in unknown:0 >> Estimated badness is -0.000001, frequency 0.33. >> Inlined split_me.part.0 into test which now has time 70.760000 and size 61, >> net change of +17. >> >> Considering split_me.part.0/5 with 23 size >> to be inlined into test/2 in unknown:0 >> Estimated badness is -0.000001, frequency 0.33. >> Inlined split_me.part.0 into test which now has time 91.220000 and size 78, >> net change of +17. >> >> Considering split_me.part.0/5 with 23 size >> to be inlined into test/2 in unknown:0 >> Estimated badness is -0.000001, frequency 0.33. >> Inlined split_me.part.0 into test which now has time 111.680000 and size >> 95, net change of +17. >> Unit growth for small function inlining: 61->129 (111%) >> >> ... >> >> Any hint how to block the IPA inlining? > > I guess you only want to make cold part of split_me bigger or > just use --param to reduce growth for auto inlining. > > How the patch reduces split_me_part considerably? > Honza
Well, I probably overlooked test results, test works fine. I'm going to install the patch. Martin >> >> Sending new version of patch. >> Martin >> >>> >>> I would just move pass_strip_predict_hints pre-IPA and not worry about >>> them chaining. >>> >>> There is problem that after inlining the prediction may expand its scope >>> and predict branch that it outside of the original function body, >>> but I do not see very easy solution for that besides not re-doing >>> prediction (we could also copy probabilities from the inlined function >>> when they exists and honnor them in the outer function. I am not sure >>> that is going to improve prediction quality though - extra context >>> is probably useful) >>> >>> Thanks, >>> Honza >>>> >>>> Thanks, >>>> Martin >>>> >>>>> >>>>> Where did you found this case? >>>>> Honza >>>>>> >>>>>> /* Create a new deep copy of the statement. */ >>>>>> copy = gimple_copy (stmt); >>>>>> -- >>>>>> 2.13.0 >>>>>> >> > >> >From 84625a782add6ae2ed29630815b61b34a052770a Mon Sep 17 00:00:00 2001 >> From: marxin <mli...@suse.cz> >> Date: Tue, 6 Jun 2017 10:55:18 +0200 >> Subject: [PATCH 1/2] Make early return predictor more precise. >> >> gcc/ChangeLog: >> >> 2017-05-26 Martin Liska <mli...@suse.cz> >> >> PR tree-optimization/79489 >> * gimplify.c (maybe_add_early_return_predict_stmt): New >> function. >> (gimplify_return_expr): Call the function. >> * predict.c (tree_estimate_probability_bb): Remove handling >> of early return. >> * predict.def: Update comment about early return predictor. >> * gimple-predict.h (is_gimple_predict): New function. >> * predict.def: Change default value of early return to 66. >> * tree-tailcall.c (find_tail_calls): Skip GIMPLE_PREDICT >> statements. >> * passes.def: Put pass_strip_predict_hints to the beginning of >> IPA passes. >> --- >> gcc/gimple-low.c | 2 ++ >> gcc/gimple-predict.h | 8 ++++++++ >> gcc/gimplify.c | 16 ++++++++++++++++ >> gcc/passes.def | 1 + >> gcc/predict.c | 41 ----------------------------------------- >> gcc/predict.def | 15 +++------------ >> gcc/tree-tailcall.c | 2 ++ >> 7 files changed, 32 insertions(+), 53 deletions(-) >> >> diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c >> index 619b9d7bfb1..4ea6c3532f3 100644 >> --- a/gcc/gimple-low.c >> +++ b/gcc/gimple-low.c >> @@ -30,6 +30,8 @@ along with GCC; see the file COPYING3. If not see >> #include "calls.h" >> #include "gimple-iterator.h" >> #include "gimple-low.h" >> +#include "predict.h" >> +#include "gimple-predict.h" >> >> /* The differences between High GIMPLE and Low GIMPLE are the >> following: >> diff --git a/gcc/gimple-predict.h b/gcc/gimple-predict.h >> index ba58e12e9e9..0e6c2e1ea01 100644 >> --- a/gcc/gimple-predict.h >> +++ b/gcc/gimple-predict.h >> @@ -80,4 +80,12 @@ gimple_build_predict (enum br_predictor predictor, enum >> prediction outcome) >> return p; >> } >> >> +/* Return true if GS is a GIMPLE_PREDICT statement. */ >> + >> +static inline bool >> +is_gimple_predict (const gimple *gs) >> +{ >> + return gimple_code (gs) == GIMPLE_PREDICT; >> +} >> + >> #endif /* GCC_GIMPLE_PREDICT_H */ >> diff --git a/gcc/gimplify.c b/gcc/gimplify.c >> index 9af95a28704..1c6e1591953 100644 >> --- a/gcc/gimplify.c >> +++ b/gcc/gimplify.c >> @@ -1428,6 +1428,20 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p) >> return GS_ALL_DONE; >> } >> >> +/* Maybe add early return predict statement to PRE_P sequence. */ >> + >> +static void >> +maybe_add_early_return_predict_stmt (gimple_seq *pre_p) >> +{ >> + /* If we are not in a conditional context, add PREDICT statement. */ >> + if (gimple_conditional_context ()) >> + { >> + gimple *predict = gimple_build_predict (PRED_TREE_EARLY_RETURN, >> + NOT_TAKEN); >> + gimplify_seq_add_stmt (pre_p, predict); >> + } >> +} >> + >> /* Gimplify a RETURN_EXPR. If the expression to be returned is not a >> GIMPLE value, it is assigned to a new temporary and the statement is >> re-written to return the temporary. >> @@ -1458,6 +1472,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p) >> || TREE_CODE (ret_expr) == RESULT_DECL >> || ret_expr == error_mark_node) >> { >> + maybe_add_early_return_predict_stmt (pre_p); >> greturn *ret = gimple_build_return (ret_expr); >> gimple_set_no_warning (ret, TREE_NO_WARNING (stmt)); >> gimplify_seq_add_stmt (pre_p, ret); >> @@ -1525,6 +1540,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p) >> >> gimplify_and_add (TREE_OPERAND (stmt, 0), pre_p); >> >> + maybe_add_early_return_predict_stmt (pre_p); >> ret = gimple_build_return (result); >> gimple_set_no_warning (ret, TREE_NO_WARNING (stmt)); >> gimplify_seq_add_stmt (pre_p, ret); >> diff --git a/gcc/passes.def b/gcc/passes.def >> index c14f6b9f63c..316e19d12e3 100644 >> --- a/gcc/passes.def >> +++ b/gcc/passes.def >> @@ -107,6 +107,7 @@ along with GCC; see the file COPYING3. If not see >> early optimizations again. It is thus good idea to do this >> late. */ >> NEXT_PASS (pass_split_functions); >> + NEXT_PASS (pass_strip_predict_hints); >> POP_INSERT_PASSES () >> NEXT_PASS (pass_release_ssa_names); >> NEXT_PASS (pass_rebuild_cgraph_edges); >> diff --git a/gcc/predict.c b/gcc/predict.c >> index 60d1a094d10..790be9fbf16 100644 >> --- a/gcc/predict.c >> +++ b/gcc/predict.c >> @@ -2739,7 +2739,6 @@ tree_estimate_probability_bb (basic_block bb, bool >> local_only) >> { >> edge e; >> edge_iterator ei; >> - gimple *last; >> >> FOR_EACH_EDGE (e, ei, bb->succs) >> { >> @@ -2766,46 +2765,6 @@ tree_estimate_probability_bb (basic_block bb, bool >> local_only) >> } >> } >> >> - /* Predict early returns to be probable, as we've already taken >> - care for error returns and other cases are often used for >> - fast paths through function. >> - >> - Since we've already removed the return statements, we are >> - looking for CFG like: >> - >> - if (conditional) >> - { >> - .. >> - goto return_block >> - } >> - some other blocks >> - return_block: >> - return_stmt. */ >> - if (e->dest != bb->next_bb >> - && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun) >> - && single_succ_p (e->dest) >> - && single_succ_edge (e->dest)->dest == EXIT_BLOCK_PTR_FOR_FN (cfun) >> - && (last = last_stmt (e->dest)) != NULL >> - && gimple_code (last) == GIMPLE_RETURN) >> - { >> - edge e1; >> - edge_iterator ei1; >> - >> - if (single_succ_p (bb)) >> - { >> - FOR_EACH_EDGE (e1, ei1, bb->preds) >> - if (!predicted_by_p (e1->src, PRED_NULL_RETURN) >> - && !predicted_by_p (e1->src, PRED_CONST_RETURN) >> - && !predicted_by_p (e1->src, PRED_NEGATIVE_RETURN)) >> - predict_edge_def (e1, PRED_TREE_EARLY_RETURN, NOT_TAKEN); >> - } >> - else >> - if (!predicted_by_p (e->src, PRED_NULL_RETURN) >> - && !predicted_by_p (e->src, PRED_CONST_RETURN) >> - && !predicted_by_p (e->src, PRED_NEGATIVE_RETURN)) >> - predict_edge_def (e, PRED_TREE_EARLY_RETURN, NOT_TAKEN); >> - } >> - >> /* Look for block we are guarding (ie we dominate it, >> but it doesn't postdominate us). */ >> if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun) && e->dest != bb >> diff --git a/gcc/predict.def b/gcc/predict.def >> index fcda6c48f11..f7b2bf7738c 100644 >> --- a/gcc/predict.def >> +++ b/gcc/predict.def >> @@ -128,18 +128,9 @@ DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic >> call", HITRATE (59), 0) >> indefinitely. */ >> DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0) >> >> -/* Branch causing function to terminate is probably not taken. >> - FIXME: early return currently predicts code: >> - int foo (int a) >> - { >> - if (a) >> - bar(); >> - else >> - bar2(); >> - } >> - even though there is no return statement involved. We probably want to >> track >> - this from FE or retire the predictor. */ >> -DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE >> (54), 0) >> +/* Branch causing function to terminate is probably not taken. */ >> +DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE >> (66), >> + 0) >> >> /* Branch containing goto is probably not taken. >> FIXME: Currently not used. */ >> diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c >> index b7053387e91..6aa9a56462e 100644 >> --- a/gcc/tree-tailcall.c >> +++ b/gcc/tree-tailcall.c >> @@ -421,6 +421,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret) >> if (gimple_code (stmt) == GIMPLE_LABEL >> || gimple_code (stmt) == GIMPLE_RETURN >> || gimple_code (stmt) == GIMPLE_NOP >> + || gimple_code (stmt) == GIMPLE_PREDICT >> || gimple_clobber_p (stmt) >> || is_gimple_debug (stmt)) >> continue; >> @@ -555,6 +556,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret) >> >> if (gimple_code (stmt) == GIMPLE_LABEL >> || gimple_code (stmt) == GIMPLE_NOP >> + || gimple_code (stmt) == GIMPLE_PREDICT >> || gimple_clobber_p (stmt) >> || is_gimple_debug (stmt)) >> continue; >> -- >> 2.13.1 >> >