On Sun, Oct 30, 2011 at 8:49 AM, David Miller <da...@davemloft.net> wrote:
>
> My check-gcc runs are completely dominated by the time it takes to
> test compile.exp=20001226-1.c
>
> On sparc when compiling this function at -O1 the lowest hanging fruit
> is some sillyness in reorg.  It is trying to test a very specific
> condition within a well contained code block, but the label scanning
> is not constrained properly so things go haywire for large functions.
>
> Specifically it is looking at:
>
>        INSN SEQ (delay_insn
>                  delay_slot_1
>                  ...)
>        ...
>        NEXT
> LABEL:
>        next_active_insn (NEXT)
>
> and wants to scan backwards to find if LABEL exists, starting at
> next_active_insn (NEXT), and if so are NEXT and delay_slot_1 equal
> instructions (equal meaning rtx_equal_p()).
>
> But it just does a blind prev_label() call which will happily scan all
> the way to the beginning of the function.  Compilation of this
> testcase at -O1 spends 1/3 of it's time in prev_insn ().
>
> In this specific reorg check, there is no reason to scan further back
> than INSN.
>
> So I've added a limited scanning helper function to reorg.c for this
> purpose and got rid of prev_insn entirely since there are no longer
> any users.
>
> Unfortunately, at -O2 this testcase causes exponential cost problems
> which are beyond my skills to fix.  Specifically,
> tail_merge_optimize() applies it's clusters and
> iterate_fix_dominators() hits the worst case O(n) complexity of
> et_splay for a CFG structured like this. :-/  %38 of the compilation
> time is spent in et_splay().

Yeah, iteratively re-computing dominators can be expensive.  Tom,
can't you avoid that (thus, do you know how the dominance relation
changes?)

Richard.

> Committed to trunk.
>
> gcc/
>
>        * reorg.c (label_before_next_insn): New function.
>        (relax_delay_slots): Use it instead of prev_label.
>        * rtl.h (prev_label): Delete declaration.
>        * emit-rtl.c (prev_label): Remove.
>
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180674 
> 138bc75d-0d04-0410-961f-82ee72b054a4
> ---
>  gcc/ChangeLog  |    7 +++++++
>  gcc/emit-rtl.c |   15 ---------------
>  gcc/reorg.c    |   17 ++++++++++++++++-
>  gcc/rtl.h      |    1 -
>  4 files changed, 23 insertions(+), 17 deletions(-)
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 0eb34e5..e9bda2b 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2011-10-30  David S. Miller  <da...@davemloft.net>
> +
> +       * reorg.c (label_before_next_insn): New function.
> +       (relax_delay_slots): Use it instead of prev_label.
> +       * rtl.h (prev_label): Delete declaration.
> +       * emit-rtl.c (prev_label): Remove.
> +
>  2011-10-30  Revital Eres  <revital.e...@linaro.org>
>
>        * modulo-sched.c (generate_prolog_epilog): Mark prolog and epilog
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 8465237..c2bc56b 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -3330,21 +3330,6 @@ next_label (rtx insn)
>   return insn;
>  }
>
> -/* Return the last CODE_LABEL before the insn INSN, or 0 if there is none.  
> */
> -
> -rtx
> -prev_label (rtx insn)
> -{
> -  while (insn)
> -    {
> -      insn = PREV_INSN (insn);
> -      if (insn == 0 || LABEL_P (insn))
> -       break;
> -    }
> -
> -  return insn;
> -}
> -
>  /* Return the last label to mark the same position as LABEL.  Return LABEL
>    itself if it is null or any return rtx.  */
>
> diff --git a/gcc/reorg.c b/gcc/reorg.c
> index f77a3a0..40d73a7 100644
> --- a/gcc/reorg.c
> +++ b/gcc/reorg.c
> @@ -3349,6 +3349,21 @@ delete_jump (rtx insn)
>     delete_computation (insn);
>  }
>
> +static rtx
> +label_before_next_insn (rtx x, rtx scan_limit)
> +{
> +  rtx insn = next_active_insn (x);
> +  while (insn)
> +    {
> +      insn = PREV_INSN (insn);
> +      if (insn == scan_limit || insn == NULL_RTX)
> +       return NULL_RTX;
> +      if (LABEL_P (insn))
> +       break;
> +    }
> +  return insn;
> +}
> +
>
>  /* Once we have tried two ways to fill a delay slot, make a pass over the
>    code to try to improve the results and to do such things as more jump
> @@ -3634,7 +3649,7 @@ relax_delay_slots (rtx first)
>         identical to the one in its delay slot.  In this case, we can just
>         delete the branch and the insn in its delay slot.  */
>       if (next && NONJUMP_INSN_P (next)
> -         && prev_label (next_active_insn (next)) == target_label
> +         && label_before_next_insn (next, insn) == target_label
>          && simplejump_p (insn)
>          && XVECLEN (pat, 0) == 2
>          && rtx_equal_p (PATTERN (next), PATTERN (XVECEXP (pat, 0, 1))))
> diff --git a/gcc/rtl.h b/gcc/rtl.h
> index 81958a5..41536be 100644
> --- a/gcc/rtl.h
> +++ b/gcc/rtl.h
> @@ -1812,7 +1812,6 @@ extern rtx next_real_insn (rtx);
>  extern rtx prev_active_insn (rtx);
>  extern rtx next_active_insn (rtx);
>  extern int active_insn_p (const_rtx);
> -extern rtx prev_label (rtx);
>  extern rtx next_label (rtx);
>  extern rtx skip_consecutive_labels (rtx);
>  extern rtx next_cc0_user (rtx);
> --
> 1.7.6.401.g6a319
>
>

Reply via email to