On Fri, Dec 5, 2014 at 1:15 PM, Bin Cheng <bin.ch...@arm.com> wrote: > Hi, > Though PR62178 is hidden by recent cost change in aarch64 backend, the ivopt > issue still exists. > > Current candidate selecting algorithm tends to select fewer candidates given > below reasons: > 1) to better handle loops with many induction uses but the best choice is > one generic basic induction variable; > 2) to keep compilation time low. > > One fundamental weakness of the strategy is the opposite situation can't be > handled properly sometimes. For these cases the best choice is each > induction variable has its own candidate. > This patch fixes the problem by shuffling candidate set after fix-point is > reached by current implementation. The reason why this strategy works is it > replaces candidate set by selecting local optimal candidate for some > induction uses, and the new candidate set (has lower cost) is exact what we > want in the mentioned case. Instrumentation data shows this can find better > candidates set for ~6% loops in spec2006 on x86_64, and ~4% on aarch64. > > This patch actually is extension to the first version patch posted at > https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02620.html, that only adds > another selecting pass with special seed set (more or less like the shuffled > set in this patch). Data also confirms this patch can find optimal sets for > most loops found by the first one, as well as optimal sets for many new > loops. > > Bootstrap and test on x86_64, no regression on benchmarks. Bootstrap and > test on aarch64. > Since this patch only selects candidate set with lower cost, any regressions > revealed are latent bugs of other components in GCC. > I also collected GCC bootstrap time on x86_64, no regression either. > Is this OK?
The algorithm seems to be quadratic in the number of IV candidates (at least): + for (i = 0; i < n_iv_cands (data); i++) + { ... + iv_ca_replace (data, ivs, cand, act_delta, &tmp_delta); ... and +static void +iv_ca_replace (struct ivopts_data *data, struct iv_ca *ivs, + struct iv_cand *cand, struct iv_ca_delta *act_delta, + struct iv_ca_delta **delta) +{ ... + for (i = 0; i < ivs->upto; i++) + { ... + if (data->consider_all_candidates) + { + for (j = 0; j < n_iv_cands (data); j++) + { possibly cubic if ivs->upto is of similar value. I wonder if it is possible to restrict this to the single IV with the largest delta? After all we are iterating try_improve_iv_set. Alternatively move the handling out of iteration completey, thus into the caller of try_improve_iv_set? Note that compile-time issues always arise in auto-generated code, not during GCC bootstrap. Richard. > 2014-12-03 Bin Cheng bin.ch...@arm.com > > PR tree-optimization/62178 > * tree-ssa-loop-ivopts.c (iv_ca_replace): New function. > (try_improve_iv_set): Shuffle candidates set in order to handle > case in which candidate wrto each iv use should be selected. > > gcc/testsuite/ChangeLog > 2014-12-03 Bin Cheng bin.ch...@arm.com > > PR tree-optimization/62178 > * gcc.target/aarch64/pr62178.c: New test.