On Mon, Jun 12, 2017 at 9:19 AM, Richard Sandiford <richard.sandif...@linaro.org> wrote: > "Bin.Cheng" <amker.ch...@gmail.com> writes: >> On Sat, Jun 10, 2017 at 10:40 AM, Richard Sandiford >> <richard.sandif...@linaro.org> wrote: >>> Sorry to return this old patch, but: >>> >>> Bin Cheng <bin.ch...@arm.com> writes: >>>> -/* Calculate the number of iterations under which scalar loop will be >>>> - preferred than vectorized loop. NITERS_PROLOG is the number of >>>> - iterations of prolog loop. If it's integer const, the integer >>>> - number is also passed by INT_NITERS_PROLOG. VF is vector factor; >>>> - TH is the threshold for vectorized loop if CHECK_PROFITABILITY is >>>> - true. This function also store upper bound of the result in BOUND. */ >>>> +/* Calculate the number of iterations above which vectorized loop will be >>>> + preferred than scalar loop. NITERS_PROLOG is the number of iterations >>>> + of prolog loop. If it's integer const, the integer number is also >>>> passed >>>> + in INT_NITERS_PROLOG. BOUND_PROLOG is the upper bound (included) of >>>> + number of iterations of prolog loop. VFM1 is vector factor minus one. >>>> + If CHECK_PROFITABILITY is true, TH is the threshold below which scalar >>>> + (rather than vectorized) loop will be executed. This function stores >>>> + upper bound (included) of the result in BOUND_SCALAR. */ >>>> >>>> static tree >>>> vect_gen_scalar_loop_niters (tree niters_prolog, int int_niters_prolog, >>>> - int bound_prolog, int vf, int th, int *bound, >>>> - bool check_profitability) >>>> + int bound_prolog, int vfm1, int th, >>>> + int *bound_scalar, bool check_profitability) >>>> { >>>> tree type = TREE_TYPE (niters_prolog); >>>> tree niters = fold_build2 (PLUS_EXPR, type, niters_prolog, >>>> - build_int_cst (type, vf)); >>>> + build_int_cst (type, vfm1)); >>>> >>>> - *bound = vf + bound_prolog; >>>> + *bound_scalar = vfm1 + bound_prolog; >>>> if (check_profitability) >>>> { >>>> - th++; >>>> + /* TH indicates the minimum niters of vectorized loop, while we >>>> + compute the maximum niters of scalar loop. */ >>>> + th--; >>> >>> Are you sure about this last change? It looks like it should be dropping >> Hi Richard, >> Thanks for spotting this. I vaguely remember I got this from the way >> how niter/th was checked in previous peeling code, but did't double >> check it now. I tend to believe there is inconsistence about th, >> especially with comment like: >> >> /* Threshold of number of iterations below which vectorzation will not be >> performed. It is calculated from MIN_PROFITABLE_ITERS and >> PARAM_MIN_VECT_LOOP_BOUND. */ >> unsigned int th; >> >> I also tend to believe the inconsistence was introduced partly because >> niters in vectorizer stands for latch_niters + 1, while latch_niters >> in rest of the compiler. >> >> and..., >> >>> the increment rather than replacing it with a decrement. >>> >>> It looks like the threshold is already the maximum niters for the scalar >>> loop. It's set by: >>> >>> min_scalar_loop_bound = ((PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND) >>> * vectorization_factor) - 1); >>> >>> /* Use the cost model only if it is more conservative than user specified >>> threshold. */ >>> th = (unsigned) min_scalar_loop_bound; >>> if (min_profitable_iters >>> && (!min_scalar_loop_bound >>> || min_profitable_iters > min_scalar_loop_bound)) >>> th = (unsigned) min_profitable_iters; >>> >>> LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) = th; >>> >>> (Note the "- 1" in the min_scalar_loop_bound. The multiplication result >>> is the minimum niters for the vector loop.) >> To be honest, min_scalar_loop_bound is more likely for something's >> lower bound which is the niters for the vector loop. If it refers to >> the niters scalar loop, it is in actuality the "max" value we should >> use. I am not quite sure here, partly because I am not a native >> speaker. >> >>> >>> min_profitable_iters sounds like it _ought_ to be the minimum niters for >>> which the vector loop is used, but vect_estimate_min_profitable_iters >>> instead returns the largest niters for which the scalar loop should be >>> preferred: >>> >>> /* Cost model disabled. */ >>> if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo))) >>> { >>> dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n"); >>> *ret_min_profitable_niters = 0; >>> *ret_min_profitable_estimate = 0; >>> return; >>> } >>> [...] >>> /* Because the condition we create is: >>> if (niters <= min_profitable_iters) >>> then skip the vectorized loop. */ >>> min_profitable_iters--; >>> [...] >>> min_profitable_estimate --; >>> >>> Also, when deciding whether the threshold needs to be used, we have: >>> >>> th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo); >>> if (th >= LOOP_VINFO_VECT_FACTOR (loop_vinfo) - 1 >>> && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)) >>> { >>> if (dump_enabled_p ()) >>> dump_printf_loc (MSG_NOTE, vect_location, >>> "Profitability threshold is %d loop iterations.\n", >>> th); >>> check_profitability = true; >>> } >> Yes, this indeed indicates that th refers to the maximum niters of the >> scalar loop. Anyway, we should resolve the inconsistence and make it >> more clear in variable names etc.. > > Yeah, agree the variable names are really confusing. Given the choice > between changing the names to match the current meaning and changing > the meaning to match the current names, the latter looks like it would > give cleaner code. I'm happy to do that it that sounds like the way > to go. I am all for this direction, but Richard B may have some to say here. BTW, will you also fix the other inconsistence you found?
Thanks, bin > > FWIW, I came across this while trying make sense of the "+ 1" in: > > /* Decide whether we need to create an epilogue loop to handle > remaining scalar iterations. */ > th = ((LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo) + 1) > / LOOP_VINFO_VECT_FACTOR (loop_vinfo)) > * LOOP_VINFO_VECT_FACTOR (loop_vinfo); > > Thanks, > Richard