Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

Segher Boessenkool Thu, 03 Nov 2016 08:47:48 -0700

On Thu, Nov 03, 2016 at 09:29:07AM +0100, Richard Biener wrote:
> On Wed, Nov 2, 2016 at 4:27 PM, Segher Boessenkool
> <[email protected]> wrote:
> > On Wed, Nov 02, 2016 at 02:51:41PM +0100, Richard Biener wrote:
> >> >> I don't believe it needs a cleanup on every iteration. One cleanup at
> >> >> the end should work fine.
> >> >
> >> > But as the comment there says:
> >> >
> >> >       /* Merge the duplicated blocks into predecessors, when possible.  
> >> > */
> >> >       cleanup_cfg (0);
> >> >
> >> > (this is not a new comment), and without merging blocks this whole
> >> > patch does zilch.
> >> >
> >> > There is no need to create new basic blocks at all (can replace the
> >> > final branch in a block with a copy of the whole block it jumps to,
> >> > instead); and then it is painfully obvious that switching to layout
> >> > mode here is pointless, too.
> >> >
> >> > Do you want me to do that?
> >> >
> >> > Btw, this isn't quadratic anyway; it is a constant number (the maximal
> >> > length allowed of the block to copy) linear.  Unless there are 
> >> > unboundedly
> >> > many forwarder blocks, which there shouldn't be, but I cannot prove that.
> >>
> >> Well, you iterate calling functions (find candidates, merge, cleanup-cfg) 
> >> that
> >> walk the whole function.  So unless the number of iterations is bound
> >> with a constant I call this quadratic (in function size).
> >
> > It is bounded (with that caveat above): new candidates will be bigger than
> > the block merged into it, so there won't be more than
> > PARAM_MAX_GOTO_DUPLICATION_INSNS passes.
> >
> > But I can make it all work simpler, in non-layout mode, if you prefer.
> 
> Iteration is fine but we should restrict where we look for new
> candidates.  Likewise
> block merging opportunities should be easier to exploit than by
> running cfg-cleanup...


Yes, but that was what the original code does already, so I didn't look
deeper.  "It was just a simple bugfix".

> I'm just thinking of those gigantic machine-generated state machines where we
> ATM do quite well (at least at -O1 ...) with respect to compile-time.

Like I said, I tested it on a 2000 node VM, very artificial so almost
no code except the threading itself, and the compgotos pass takes less
than 1% time both before and after the patch.  I ony tested at -O2 though.

I'll get back to it, but first I have some things that need handling during
stage 1.


Segher

Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

Reply via email to