On Thu, May 11, 2017 at 11:39 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 12:53 PM, Bin Cheng <bin.ch...@arm.com> wrote:
>> Hi,
>> Currently IVOPTs shares the same register pressure computation with the
>> RTL loop invariant pass, which doesn't work very well.  This patch
>> introduces a specific interface for IVOPTs.
>> The general idea is described in the cover message as below:
>>   C) Current implementation shares the same register pressure computation
>>      with the RTL loop inv pass.  It has difficulty in handling (especially
>>      large) loop nests, and quite often generates too many candidates
>>      (especially for outer loops).  This change introduces a new register
>>      pressure computation.  The brief idea is to differentiate the (hot)
>>      innermost loop from outer loops.  For a (possibly hot) innermost
>>      loop, more registers are allowed as long as the register pressure is
>>      within the number of target available registers.
>> It can also help to restrict the number of candidates for outer loops.
>> Is it OK?
>
> +/* Determine if current loop is the innermost loop and maybe hot.  */
> +
> +static void
> +determine_hot_innermost_loop (struct ivopts_data *data)
> +{
> +  data->hot_innermost_loop_p = true;
> +  if (!data->speed)
> +    return;
>
> err, so when not optimizing for speed we assume all loops (even not innermost)
> are hot and innermost?!
>
> +  HOST_WIDE_INT niter = avg_loop_niter (loop);
> +  if (niter < PARAM_VALUE (PARAM_AVG_LOOP_NITER)
> +      || loop_constraint_set_p (loop, LOOP_C_PROLOG)
> +      || loop_constraint_set_p (loop, LOOP_C_EPILOG)
> +      || loop_constraint_set_p (loop, LOOP_C_VERSION))
> +    data->hot_innermost_loop_p = false;
>
> this needs adjustment for the constraint patch removal.  Also looking at niter
> of the loop in question isn't a good metric for hotness.  data->speed is the
> best guess you get I think (optimize_loop_for_speed_p).
>
>    data->speed = optimize_loop_for_speed_p (loop);
> +  determine_hot_innermost_loop (data);
>
>   data->hot_innermost_loop_p = determine_hot_innermost_loop (data);
>
> would be more consistent here.

Hi,
I removed the hot innermost part and here is the updated version.  Is it OK?

Thanks,
bin

2017-05-11  Bin Cheng  <bin.ch...@arm.com>

        * tree-ssa-loop-ivopts.c (ivopts_estimate_reg_pressure): New
        reg_pressure model function.
        (ivopts_global_cost_for_size): Delete.
        (determine_set_costs, iv_ca_recount_cost): Call new model function
        ivopts_estimate_reg_pressure.

>
> Thanks,
> Richard.
>
>> Thanks,
>> bin
>> 2017-04-11  Bin Cheng  <bin.ch...@arm.com>
>>
>>         * tree-ssa-loop-ivopts.c (struct ivopts_data): New field.
>>         (ivopts_estimate_reg_pressure): New reg_pressure model function.
>>         (ivopts_global_cost_for_size): Delete.
>>         (determine_set_costs, iv_ca_recount_cost): Call new model function
>>         ivopts_estimate_reg_pressure.
>>         (determine_hot_innermost_loop): New.
>>         (tree_ssa_iv_optimize_loop): Call above function.
From 3ca5cb6bafb516c68cad9d6fd3adbbe73bec4d19 Mon Sep 17 00:00:00 2001
From: Bin Cheng <binch...@e108451-lin.cambridge.arm.com>
Date: Fri, 10 Mar 2017 11:03:16 +0000
Subject: [PATCH 1/9] ivopt-reg_pressure-model-20170225.txt

---
 gcc/tree-ssa-loop-ivopts.c | 49 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 8b228ca..7caed10 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -5531,17 +5531,46 @@ determine_iv_costs (struct ivopts_data *data)
     fprintf (dump_file, "\n");
 }
 
-/* Calculates cost for having N_REGS registers.  This number includes
-   induction variables, invariant variables and invariant expressions.  */
+/* Estimate register pressure for loop having N_INVS invariants and N_CANDS
+   induction variables.  Note N_INVS includes both invariant variables and
+   invariant expressions.  */
 
 static unsigned
-ivopts_global_cost_for_size (struct ivopts_data *data, unsigned n_regs)
+ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs,
+                              unsigned n_cands)
 {
-  unsigned cost = estimate_reg_pressure_cost (n_regs,
-                                              data->regs_used, data->speed,
-                                              data->body_includes_call);
-  /* Add n_regs to the cost, so that we prefer eliminating ivs if possible.  */
-  return n_regs + cost;
+  unsigned cost;
+  unsigned n_old = data->regs_used, n_new = n_invs + n_cands;
+  unsigned regs_needed = n_new + n_old, available_regs = target_avail_regs;
+  bool speed = data->speed;
+
+  /* If there is a call in the loop body, the call-clobbered registers
+     are not available for loop invariants.  */
+  if (data->body_includes_call)
+    available_regs = available_regs - target_clobbered_regs;
+
+  /* If we have enough registers.  */
+  if (regs_needed + target_res_regs < available_regs)
+    cost = n_new;
+  /* If close to running out of registers, try to preserve them.  */
+  else if (regs_needed <= available_regs)
+    cost = target_reg_cost [speed] * regs_needed;
+  /* If we run out of available registers but the number of candidates
+     does not, we penalize extra registers using target_spill_cost.  */
+  else if (n_cands <= available_regs)
+    cost = target_reg_cost [speed] * available_regs
+           + target_spill_cost [speed] * (regs_needed - available_regs);
+  /* If the number of candidates runs out available registers, we penalize
+     extra candidate registers using target_spill_cost * 2.  Because it is
+     more expensive to spill induction variable than invariant.  */
+  else
+    cost = target_reg_cost [speed] * available_regs
+           + target_spill_cost [speed] * (n_cands - available_regs) * 2
+           + target_spill_cost [speed] * (regs_needed - n_cands);
+
+  /* Finally, add the number of candidates, so that we prefer eliminating
+     induction variables if possible.  */
+  return cost + n_cands;
 }
 
 /* For each size of the induction variable set determine the penalty.  */
@@ -5602,7 +5631,7 @@ determine_set_costs (struct ivopts_data *data)
       fprintf (dump_file, "  ivs\tcost\n");
       for (j = 0; j <= 2 * target_avail_regs; j++)
         fprintf (dump_file, "  %d\t%d\n", j,
-                 ivopts_global_cost_for_size (data, j));
+                 ivopts_estimate_reg_pressure (data, 0, j));
       fprintf (dump_file, "\n");
     }
 }
@@ -5661,7 +5690,7 @@ iv_ca_recount_cost (struct ivopts_data *data, struct iv_ca *ivs)
   comp_cost cost = ivs->cand_use_cost;
 
   cost += ivs->cand_cost;
-  cost += ivopts_global_cost_for_size (data, ivs->n_invs + ivs->n_cands);
+  cost += ivopts_estimate_reg_pressure (data, ivs->n_invs, ivs->n_cands);
 
   ivs->cost = cost;
 }
-- 
1.9.1
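
To make the four cost tiers of ivopts_estimate_reg_pressure easier to experiment with outside GCC, here is a minimal standalone sketch of the same arithmetic. It is not part of the patch. The struct target_params, the demo_target values and the function name estimate_reg_pressure are hypothetical stand-ins for GCC's target_avail_regs, target_clobbered_regs, target_res_regs, target_reg_cost[speed] and target_spill_cost[speed]; the numbers are made up for illustration and do not describe any real target.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-target values GCC computes itself.  */
struct target_params
{
  unsigned avail_regs;      /* plays the role of target_avail_regs */
  unsigned clobbered_regs;  /* plays the role of target_clobbered_regs */
  unsigned res_regs;        /* plays the role of target_res_regs */
  unsigned reg_cost;        /* plays the role of target_reg_cost[speed] */
  unsigned spill_cost;      /* plays the role of target_spill_cost[speed] */
};

/* Made-up numbers, chosen only so that every tier is easy to reach.  */
static const struct target_params demo_target = { 16, 6, 3, 2, 8 };

/* Same tiering as the patch, with the ivopts_data fields passed explicitly.  */
static unsigned
estimate_reg_pressure (const struct target_params *t, unsigned regs_used,
                       bool body_includes_call, unsigned n_invs,
                       unsigned n_cands)
{
  unsigned n_new = n_invs + n_cands;
  unsigned regs_needed = n_new + regs_used;
  unsigned available_regs = t->avail_regs;
  unsigned cost;

  /* Guard the unsigned subtraction below against wrap-around.  */
  assert (!body_includes_call || t->clobbered_regs <= t->avail_regs);

  /* Call-clobbered registers cannot hold values live across a call.  */
  if (body_includes_call)
    available_regs -= t->clobbered_regs;

  if (regs_needed + t->res_regs < available_regs)
    /* Tier 1: plenty of registers, each new one costs 1.  */
    cost = n_new;
  else if (regs_needed <= available_regs)
    /* Tier 2: close to the limit, charge reg_cost per needed register.  */
    cost = t->reg_cost * regs_needed;
  else if (n_cands <= available_regs)
    /* Tier 3: over the limit, the excess is charged at spill_cost.  */
    cost = t->reg_cost * available_regs
           + t->spill_cost * (regs_needed - available_regs);
  else
    /* Tier 4: candidates alone exceed the limit; excess candidates are
       charged at twice spill_cost, the remaining registers at spill_cost.  */
    cost = t->reg_cost * available_regs
           + t->spill_cost * (n_cands - available_regs) * 2
           + t->spill_cost * (regs_needed - n_cands);

  /* Bias towards fewer induction variable candidates.  */
  return cost + n_cands;
}
```

With the made-up values above and regs_used = 4, the call estimate_reg_pressure (&demo_target, 4, false, 2, 3) falls into the first tier and returns 8, while passing true for body_includes_call shrinks the budget from 16 to 10 registers and moves the same inputs into the second tier.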