Re: [Patch, cfg] Improve jump to return optimization for complex return

Segher Boessenkool Tue, 14 Jun 2016 13:31:40 -0700

On Tue, Jun 14, 2016 at 03:53:59PM +0100, Jiong Wang wrote:
> "bl to pop" into "pop" which is "jump to return" into "return",  so a better
> place to fix this issue is at try_optimize_cfg where we are doing these
> jump/return optimization already:
> 
>   /* Try to change a branch to a return to just that return.  */
>   rtx_insn *ret, *use;
>   ...
> 
> Currently we are using ANY_RETURN_P to check whether an rtx is return
> while ANY_RETURN_P only covered RETURN, SIMPLE_RETURN without side-effect:
> 
> #define ANY_RETURN_P(X) \
>   (GET_CODE (X) == RETURN || GET_CODE (X) == SIMPLE_RETURN)


On PowerPC we have a similar problem for jumps to trivial epilogues,
which are a parallel of a return with a (use reg:LR_REGNO).  But that
one shows up for conditional jumps.

> This patch:
>   * rename existed ANY_RETURN_P into ANY_PURE_RETURN_P.
>   * Re-implement ANY_RETURN_P to consider complex JUMP_INSN with
>     PARALLEL in the pattern.

ANY_RETURN_P is used in many other places, have you checked them all?

>   * Removed the side_effect_p check on the last insn in both bb1
>     and bb2.  This is because suppose we have bb1 and bb2 contains
>     the following single jump_insn and both fall through to EXIT_BB:

<snip>

>     cross jump optimization will try to change the jump_insn in bb1 into
>     a unconditional jump to bb2, then we will enter the next iteration
>     of try_optimize_cfg, and it will be a new "jump to return", then
>     bb1 will be optimized back into above patterns, and thus another round
>     of cross jump optimization, we will end up with infinite loop here.

Why is it save to remove the side_effects_p check?  Why was it there
in the first place?

> --- a/gcc/cfgcleanup.c
> +++ b/gcc/cfgcleanup.c
> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tm_p.h"
>  #include "insn-config.h"
>  #include "emit-rtl.h"
> +#include "output.h"

What is this include used for?  Ah, get_attr_min_length?

> @@ -2825,7 +2826,11 @@ try_optimize_cfg (int mode)
>             rtx_insn *ret, *use;
>             if (single_succ_p (b)
>                 && onlyjump_p (BB_END (b))
> -               && bb_is_just_return (single_succ (b), &ret, &use))
> +               && bb_is_just_return (single_succ (b), &ret, &use)
> +               && ((get_attr_min_length (ret)
> +                    <= get_attr_min_length (BB_END (b)))
> +                   || optimize_bb_for_speed_p (b)))

This is for breaking the cycle, right?  It seems fragile.

> --- a/gcc/jump.c
> +++ b/gcc/jump.c
> @@ -1437,7 +1437,35 @@ delete_for_peephole (rtx_insn *from, rtx_insn *to)
>       since the peephole that replaces this sequence
>       is also an unconditional jump in that case.  */
>  }
> -
> +

Unrelated change.

> +/* If X is RETURN or SIMPLE_RETURN then return itself.  If X is PARALLEL, 
> return
> +   the contained RETURN or SIMPLE_RETURN if it's not a CALL_INSN, otherwise
> +   return NULL_RTX.  */
> +
> +rtx
> +return_in_jump (rtx x)

Strange semantics, and the name does not capture the "call" semantics
at all.  Maybe split out that part?  What is that part for, anyway?


Segher

Re: [Patch, cfg] Improve jump to return optimization for complex return

Reply via email to