On 10/5/2021 7:33 AM, Aldy Hernandez wrote:
OK?
BTW Jeff, this fixes all the regressions you mention except:
1. pr68911.c
The path not being threaded here is 7->10->11->12. It crosses loops
multiple times, so I believe the restriction code is correct.
7, 10, 12 are in loop1.
11 is in loop 2.
So we have a path going from loop1 -> loop2 -> loop1. I can't
conceive any scenario where this is ok, but I can attach the entire IL
if I'm missing something.
Wow. I'm having trouble seeing how we were threading that, but clearly
not OK.
2. 961125-1.c
This one is a bit trickier. Here we're trying to thread the following
conditional, which I mentioned when I contributed this work, we don't
handle (and happens infrequently enough not to matter):
Sorry, I'd forgotten about this case.
+ // Loop 4
+ <bb 6> [local count: 114863531]:
+ # ptr_8 = PHI <ptr_2(4), ptr_2(5)>
+ if (ptr_8 < &MEM <char[4]> [(void *)":ab" + 3B])
+ goto <bb 7>; [50.00%]
+ else
+ goto <bb 17>; [50.00%]
The hybrid threader doesn't handle &MEM in the final conditional. As
I mentioned earlier, if this becomes an issue, we can adapt class
pointer_equiv_analyzer like we did for evrp. I have a gimple FE test
I will contribute as an XFAIL with an associated PR to keep us honest.
That being said... in this particular test, this is all irrelevant
because the path will be disallowed for two reasons:
a) The path crosses loops, and the reason we didn't realize it in the
old code was because the ASSERT_EXPR had pulled the SSA outside the
loop, so it looks like the entire path is l in the same loop. If you
look at the original IL, it's not.
b) Now the path actually fits the pattern being discussed in this
patch, where there's an early exit out of a loop, so it looks like we
should handle it. But...in this case, we would fill a presently empty
latch. Interestingly, the old code didn't catch it, because....you
guessed it...there was an ASSERT_EXPR in the latch.
So I argue that even in the absence of MEM_REF handling, we shouldn't
thread this path.
I can live with that too. I'll set new baselines for visium-elf so that
these don't show up again.
0001-Loosen-loop-crossing-restriction-in-threader.patch
From 5abe65668f602d53b8f3dc515508dc4616beb048 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez <al...@redhat.com>
Date: Tue, 5 Oct 2021 15:03:34 +0200
Subject: [PATCH] Loosen loop crossing restriction in threader.
Crossing loops is generally discouraged from the threader, but we can
make an exception when we don't cross the latch or enter another loop,
since this is just an early exit out of the loop.
In fact, the whole threaded path is logically outside the loop. This
has nice secondary effects. For example, objects on the threaded path
will no longer necessarily be live throughout the loop, so we can get
register allocation improvements. The threaded path can physically
move outside the loop resulting in better icache efficiency, etc.
Tested on x86-64 Linux, and on a visium-elf cross making sure that the
following tests do not have an abort in the final assembly:
gcc.c-torture/execute/960218-1.c
gcc.c-torture/execute/visium-pending-4.c
gcc.c-torture/execute/pr58209.c
gcc/ChangeLog:
* tree-ssa-threadupdate.c (jt_path_registry::cancel_invalid_paths):
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/ssa-thread-valid.c: New test.
OK. I think there may still be some concern if the latch is marked as a
joiner. I think we should always reject those before the loop
optimizers run as the joiner's clone would introduce a second latch. I
think that can be handled in a follow-up refinement.
Co-authored-by: Jeff Law <jeffrea...@gmail.com>
I'm not sure I deserve co-authorship :-) You're doing all the work, I'm
just reporting the regressions.
jeff