Hello, On Mon, 4 Oct 2021, Jeff Law wrote:
> And just in case it got lost. Here's the analysis on 960218-1 on visium: > > We've got this going into ethread: > > int f (int x) > { > int D.1420; > int a; > > ;; basic block 2, loop depth 0, maybe hot > ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) > ;; pred: ENTRY (FALLTHRU,EXECUTABLE) > a_4 = ~x_3(D); > goto <bb 4>; [INV] > ;; succ: 4 (FALLTHRU,EXECUTABLE) > > ;; basic block 3, loop depth 1, maybe hot > ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) > ;; pred: 4 (TRUE_VALUE,EXECUTABLE) > gl = a_1; > ;; succ: 4 (FALLTHRU,DFS_BACK,EXECUTABLE) > > ;; basic block 4, loop depth 1, maybe hot > ;; prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED) > ;; pred: 2 (FALLTHRU,EXECUTABLE) > ;; 3 (FALLTHRU,DFS_BACK,EXECUTABLE) > # a_1 = PHI <a_4(2), 0(3)> > if (a_1 != 0) > goto <bb 3>; [INV] > else > goto <bb 5>; [INV] > ;; succ: 3 (TRUE_VALUE,EXECUTABLE) > ;; 5 (FALSE_VALUE,EXECUTABLE) > > ;; basic block 5, loop depth 0, maybe hot > ;; prev block 4, next block 1, flags: (NEW, REACHABLE, VISITED) > ;; pred: 4 (FALSE_VALUE,EXECUTABLE) > return; > ;; succ: EXIT > > } First notice that this doesn't have an empty latch block to start with (in fact, there is no separate latch block at all), so the loop optimizers haven't been initialized for simple latches at this point. Still, I would say that even then we want to be careful with the latch edge, as putting code onto it will most probably create a problem downstream once we _do_ want to intialize the loop optimizers for simple latches. So, that we are careful here is okay, we are just too careful in this specific case. > There's a pretty obvious jump thread in there. 3->4->5. > > We used to allow this, but it's currently being rejected and I'm not > sure it should. > > In simplest terms we're threading from inside the loop, through the > latch to a point outside the loop. At first glance, that seems safe. Is at least the unrestricted jump threader (the one after loop optimizers) picking this up? Independend of that, I agree, that this specific instance of threading through the latch block, even though followed by to-be-copied instructions, is okay. We need to determine precisely when that is okay, though. In this case there are several reasons I could see why this is so: (a) while the thread is through the latch block, it's _not_ through the latch edge (which is 4->3). That would create the problem, so here we're fine. (b) even if it were through the latch edge, and would be followed by to-be-copied instructions (and hence fill the latch edge) the ultimate end of the path is outside the loop, so all the edges and blocks that would now carry instructions would also be outside the loop, and hence be no problem Now, capture this precisely ;-) I think something like so: we have a candidate path S -> L -> B1 (-> B2 ... Bn) (Start, Latch, Blocks 1 to n following the latch). (I think in our notation that means that the jump in L is redirected to Bn, and all code from B1..Bn be copied, right? Do we even support multiple blocks after the to-be-redirected jump?) Now if L is a latch block, but L->B1 is no latch/back edge : no restrictions apply. Otherwise, L->B1 is a latch/back edge (that means B1 is a loop header): if all of B2..Bn-1 don't contain jumps back into the loop, and Bn is outside the loop, then the thread is okay as well. (B2..Bn-1 can be inside the loop, as long as they don't contain jumps back into the loop, after copying by the threader, they don't create problems: their copies will be placed outside the loop and won't generate side entries back into the loop; the copied latch edge will not be a latch edge anymore, but a loop exit edge). It's quite possible that the full generality above is not necessary. It's probably enough to just not deal with the (b) case above (and continue to reject it), and simply only accept the candidate path if none of the involved edges is a latch/back edge (despite one block being the latch block). Especially so because I'm not convinced that I handled the idea of case (b) correctly above ;-) Ciao, Michael.