On 10/18/2021 7:41 AM, Aldy Hernandez wrote:
After some playing, it looks like if we enable fully-resolving mode in
the *.thread passes immediately preceding VRP, we can remove the VRP
threading passes altogether, thus removing 2 threading passes (and
forward threading passes at that!).
Whoo whoo!
The numbers look really good. We get 6874 more jump threads
over my bootstrap .ii files, for a total increase of 3.74%. And we get that
while running marginally faster (0.19% faster, so noise).
The details are:
*** Mainline (with the loop rotation patch):
ethread:64722
dom:31246
thread:73709
vrp-thread:14357
total: 184034
*** Removing all the VRP threaders.
ethread:64722
thread-full:76493
dom:33648
thread:16045
total: 190908
Notice that not only do we get a lot more threads in thread-full
(resolving mode), but even DOM can get more jump threads.
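As a quick sanity check on the figures above, here is a minimal sketch that re-derives the totals and the percentage increase from the per-pass counts. The counts are copied from the two listings; the dictionary and function names are made up for illustration and are not part of any GCC tooling.

```python
# Per-pass jump-thread counts, copied from the two listings above.
mainline = {"ethread": 64722, "dom": 31246, "thread": 73709, "vrp-thread": 14357}
no_vrp = {"ethread": 64722, "thread-full": 76493, "dom": 33648, "thread": 16045}


def total(counts):
    """Sum the jump threads reported by every pass."""
    return sum(counts.values())


def percent_increase(before, after):
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before


base_total = total(mainline)
new_total = total(no_vrp)
extra = new_total - base_total
gain = percent_increase(base_total, new_total)
dom_extra = no_vrp["dom"] - mainline["dom"]

print(f"mainline total:        {base_total}")
print(f"without VRP threaders: {new_total}")
print(f"extra jump threads:    {extra} ({gain:.2f}% increase)")
print(f"extra threads in DOM:  {dom_extra}")
```

The totals come out as 184034 and 190908, matching the listings, and the difference is the 6874 extra threads quoted earlier.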
This doesn't come without risks though. The main issue is that we would
be replacing one engine (the forward threader) with another one (the
backward threader). But the good news is that (a) we've been using the new
backward threader for a while now, and (b) even the VRP threader in
mainline is using the backward threader solver. So all that would
really be changing is that the path discovery bits and custom copier
in the forward threader get replaced by the backward threader bits and
the generic copier.
I personally don't think this is a big risk, because we've done all
the hard work already and it's all being stressed in one way or another.
I don't see the risk as significantly different than any other big chunk
of development work. Furthermore, this is a major step on the path
we've been discussing the last couple years. There'll be some testing
fallout, but I think that's manageable and ultimately worth the pain to
work through. I can't express how happy I am that we're at the point of
zapping the two VRP threading passes.
The untested patch below is all that would need to happen, albeit with
copious changes to tests.
I'd like to see where we all stand on this before I start chugging away
at testing and other time consuming tasks.
Note that all the relevant bits will still be tested in this release,
so I'm not gonna cry one way or another. But it'd be nice to start
reducing passes, especially if we get a 3.74% increase in jump threads
for no time penalty.
Finally, even if we all agree, I think we should give ourselves a week after the
loop rotation restrictions go in, because threading changes always cause
a party of unexpected things to happen.
Sure. And FWIW, the loop rotation changes are fine IMHO. So commit
those when you're ready, then drop this in a week later. All in time
for stage1 close :-)
Jeff