On Fri, Dec 5, 2014 at 1:15 PM, Bin Cheng <bin.ch...@arm.com> wrote:
> Hi,
> Though PR62178 is hidden by recent cost change in aarch64 backend, the ivopt
> issue still exists.
>
> Current candidate selecting algorithm tends to select fewer candidates given
> below reasons:
>   1) to better handle loops with many induction uses but the best choice is
> one generic basic induction variable;
>   2) to keep compilation time low.
>
> One fundamental weakness of the strategy is the opposite situation can't be
> handled properly sometimes.  For these cases the best choice is each
> induction variable has its own candidate.
> This patch fixes the problem by shuffling candidate set after fix-point is
> reached by current implementation.  The reason why this strategy works is it
> replaces candidate set by selecting local optimal candidate for some
> induction uses, and the new candidate set (has lower cost) is exact what we
> want in the mentioned case.  Instrumentation data shows this can find better
> candidates set for ~6% loops in spec2006 on x86_64, and ~4% on aarch64.
>
> This patch actually is extension to the first version patch posted at
> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02620.html, that only adds
> another selecting pass with special seed set (more or less like the shuffled
> set in this patch).  Data also confirms this patch can find optimal sets for
> most loops found by the first one, as well as optimal sets for many new
> loops.
>
> Bootstrap and test on x86_64, no regression on benchmarks.  Bootstrap and
> test on aarch64.
> Since this patch only selects candidate set with lower cost, any regressions
> revealed are latent bugs of other components in GCC.
> I also collected GCC bootstrap time on x86_64, no regression either.
> Is this OK?

The algorithm seems to be quadratic in the number of IV candidates
(at least):

+         for (i = 0; i < n_iv_cands (data); i++)
+           {
...
+             iv_ca_replace (data, ivs, cand, act_delta, &tmp_delta);
...

and

+static void
+iv_ca_replace (struct ivopts_data *data, struct iv_ca *ivs,
+              struct iv_cand *cand, struct iv_ca_delta *act_delta,
+              struct iv_ca_delta **delta)
+{
...
+  for (i = 0; i < ivs->upto; i++)
+    {
...
+      if (data->consider_all_candidates)
+       {
+         for (j = 0; j < n_iv_cands (data); j++)
+           {

possibly cubic if ivs->upto is of similar value.

I wonder if it is possible to restrict this to the single IV with
the largest delta?  After all we are iterating try_improve_iv_set.
Alternatively move the handling out of iteration completey,
thus into the caller of try_improve_iv_set?

Note that compile-time issues always arise in auto-generated code,
not during GCC bootstrap.

Richard.


> 2014-12-03  Bin Cheng  bin.ch...@arm.com
>
>   PR tree-optimization/62178
>   * tree-ssa-loop-ivopts.c (iv_ca_replace): New function.
>   (try_improve_iv_set): Shuffle candidates set in order to handle
>   case in which candidate wrto each iv use should be selected.
>
> gcc/testsuite/ChangeLog
> 2014-12-03  Bin Cheng  bin.ch...@arm.com
>
>   PR tree-optimization/62178
>   * gcc.target/aarch64/pr62178.c: New test.

Reply via email to