On 10/19/21 10:40 AM, Richard Biener wrote:
On Tue, Oct 19, 2021 at 9:33 AM Aldy Hernandez <al...@redhat.com> wrote:

On Tue, Oct 19, 2021 at 8:52 AM Richard Biener
<richard.guent...@gmail.com> wrote:

On Mon, Oct 18, 2021 at 4:03 PM Aldy Hernandez <al...@redhat.com> wrote:



On 10/18/21 3:41 PM, Aldy Hernandez wrote:

I've been experimenting with reducing the total number of threading
passes, and I'd like to see if there's consensus/stomach for altering
the pipeline.  Note that the goal is to remove forward threader clients,
not the other way around.  So, we should prefer to remove a VRP threader
instance over a *.thread one immediately before VRP.

After some playing, it looks like if we enable fully-resolving mode in
the *.thread passes immediately preceding VRP, we can remove the VRP
threading passes altogether, thus removing 2 threading passes (and
forward threading passes at that!).

It occurs to me that we could also remove the threading before VRP
passes, and enable a fully-resolving backward threader after VRP.  I
haven't played with this scenario, but it should be just as good.  That
being said, I don't know the intricacies of why we had both pre and post
VRP threading passes, and if one is ideally better than the other.

It was done because they were different threaders.  Since the new threader
uses built-in VRP it shouldn't really matter whether it's before or after
VRP _for the threading_, but it might be that if threading runs before VRP
then VRP itself can do a better job on cleaning up the IL.

Good point.

FWIW, earlier this season I played with replacing the VRPs with evrp
instances (which fold far more conditionals) and I found that the
threaders can actually find FEWER opportunities after *vrp folds away
things.  I don't know if this is a good or a bad thing.

Probably a sign that either threading threads stuff that's pointless
(does not consider conditions on the path that always evaluate false?)

At least in the backward threader, we don't keep looking back if we can resolve the conditional at the end of an in-progress path, so it's certainly possible we thread paths that are unreachable. I'm pretty sure that's also possible in the forward threader.

For example, if we have a candidate path that ends in x > 1234 and we know on entry to the path that x is [2000,3000], there's no need to chase further back to see if the path itself is reachable.
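
To make the x > 1234 case concrete, here is a minimal sketch of resolving the final conditional from the range known on entry to the path. The Range struct and resolve_greater_than are hypothetical names for illustration only; they are not GCC's range classes or the threader's actual code.

```cpp
#include <cassert>
#include <optional>

// Hypothetical closed integer range [lo, hi]; NOT GCC's range representation.
struct Range {
  long lo;
  long hi;
};

// Try to decide `x > bound` using only the range known for x on path entry.
// Returns true or false when the range settles the question, and
// std::nullopt when the range straddles the bound and we cannot tell.
std::optional<bool> resolve_greater_than(const Range &x, long bound) {
  if (x.lo > bound)
    return true;   // every value in the range is above the bound
  if (x.hi <= bound)
    return false;  // no value in the range is above the bound
  return std::nullopt;
}
```

With x in [2000,3000] the query against 1234 is settled immediately, so a threader working this way has no reason to look further back, and would never notice if an earlier condition made the path unreachable.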

or that after VRP and removing redundant conditions blocks are now
bigger and we run into some --param limit (that would suggest the
limits behave oddly if it would thread when we'd artificially split blocks).

Examples might be interesting to look at to understand what's going "wrong".

Perhaps we
should benchmark three alternatives:

1. Mainline
2. Fully resolving threader -> VRP -> No threading.
3. No threading -> VRP -> Fully resolving threader.

...and see what the actual effect is, regardless of number of threaded paths.

As said, only 2. makes "sense" to me unless examples show why we really
have the usual pass ordering issue.  As said, I think threading exposes new
VRP (esp. constant/copy prop) opportunities but VRP shouldn't expose new
threading opportunities.
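
A hand-written illustration of why threading can expose constant/copy propagation opportunities, assuming a simple two-conditional function. The names before_threading and after_threading are made up, and the second function is what a threaded and cleaned-up version could look like, not actual GCC output.

```cpp
#include <cassert>

// Original shape: on the a > 0 path, x is the constant 1 when the second
// conditional is reached, but the join block hides that fact.
int before_threading(int a, int b) {
  int x;
  if (a > 0)
    x = 1;
  else
    x = b;
  if (x == 1)
    return 10;
  return 20;
}

// After threading the a > 0 path past the second conditional, x == 1 is
// known true on that path and folds away; on the other path x is just a
// copy of b, so the test becomes b == 1.
int after_threading(int a, int b) {
  if (a > 0)
    return 10;
  if (b == 1)
    return 10;
  return 20;
}
```

Both functions return the same value for every input; the second simply has no join block left, which is the kind of IL a later VRP pass can clean up further.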

Excellent!  This saves me time, as I've mostly played with option #2 ;-).

Thanks.
Aldy
