https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78496

Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
           Assignee|unassigned at gcc dot gnu.org      |law at redhat dot com

--- Comment #2 from Jeffrey A. Law <law at redhat dot com> ---
So first you need to be aware that the backwards threader does not do
simplifications nor does it know how to analyze logical operations.  Adding
those capabilities to the backward threader is on the TODO list for gcc-8.

The forward walking threader does have some code to optimize stuff like this:

 _549 = w1_416 <= 899;
  _548 = _407 > 839;
  _541 = _548 & _549;
  if (_541 != 0)

Essentially it'll want to try and prove that _548 or _549 is zero, which in
turn would allow it to know that _541 is zero.

The problem is almost certainly the fact that we don't iterate the forward jump
threader.

In particular if we look at block 105 after forward jump threading:

;;   basic block 105, loop depth 1, count 0, freq 3060, maybe hot
;;    prev block 104, next block 106, flags: (NEW, REACHABLE, VISITED)
;;    pred:       11 (FALSE_VALUE,EXECUTABLE)
;;                99 [100.0%]  (FALLTHRU,EXECUTABLE)
  # aa1_216 = PHI <aa1_258(11), aa1_502(99)>
  # d1_206 = PHI <d1_432(11), d1_432(99)>
  # v1_196 = PHI <0(11), 0(99)>
  # oo1_194 = PHI <oo1_565(11), r1_254(99)>
  # ss1_192 = PHI <ss1_574(11), ss1_257(99)>
  # r1_190 = PHI <r1_583(11), 0(99)>
  # _188 = PHI <1(11), _500(99)>
  _184 = v1_196 * 3600;
  _182 = (unsigned int) _184;
  _180 = _182 / 60;
  w1_178 = (int) _180;
  _177 = w1_178 <= 449;
  _176 = _180 > 389;
  _169 = _176 & _177;
  goto <bb 16>; [100.00%]

We would need to propagate v1_196 = 0 into the uses, which in turn result in
_180 having the value 0.  That value in turn would feed into the PHI at the
start of bb16:

;;   basic block 16, loop depth 1, count 0, freq 8500, maybe hot
;;   Invalid sum of incoming frequencies 6375, should be 8500
;;    prev block 15, next block 17, flags: (NEW, REACHABLE, VISITED)
;;    pred:       15 [100.0%]  (FALLTHRU,EXECUTABLE)
;;                14 [21.9%]  (FALSE_VALUE,EXECUTABLE)
;;                105 [100.0%]  (FALLTHRU)
  # x1_197 = PHI <x1_261(15), x1_435(14), x1_435(105)>
  # _407 = PHI <_16(15), _16(14), _180(105)>
  # aa1_410 = PHI <aa1_185(15), aa1_185(14), aa1_216(105)>
  # d1_413 = PHI <d1_191(15), d1_191(14), d1_206(105)>
  # w1_416 = PHI <w1_260(15), w1_260(14), w1_178(105)>
  # v1_377 = PHI <v1_558(15), v1_558(14), v1_196(105)>
  # oo1_371 = PHI <oo1_567(15), oo1_567(14), oo1_194(105)>
  # ss1_376 = PHI <ss1_576(15), ss1_576(14), ss1_192(105)>
  # r1_609 = PHI <r1_585(15), r1_585(14), r1_190(105)>
  # _612 = PHI <_596(15), _596(14), _188(105)>
  _549 = w1_416 <= 899;
  _548 = _407 > 839;
  _541 = _548 & _549;
  if (_541 != 0)
    goto <bb 17>; [50.00%]
  else
    goto <bb 18>; [50.00%]


Once the value of _180 is propagated into the RHS of the assignment to _407
we'd be able to thread this block.  But there's not another forward jump
threading pass that runs prior to PRE.  

I looked at ways to introduce iteration without the full DOM pass years ago. 
It was pretty obvious that the most interesting things happened as a result of
exposing degenerate PHIs.  But I wasn't ever able to make a leap from that to a
low overhead iterating jump threader.  What did come out of it was the phi-only
cprop pass which propagates away the degenerate PHIs.

Through the years I've become increasingly convinced that forward, equivalence
table based jump threading is the wrong model.  THe right model (IMHO) is an
on-demand backwards substitution based threader.

Sebastian started that work and is what you would find in
tree-ssa-threadbackwards.c.  I continue to extend that towards the long term
vision.

Essentially we want to raise the query _541 != 0 an back substitute/fold
(_548 & _549) != 0
((_407 > 839) & _549) != 0
((_180 > 839) & _540) != 0
(((_182 / 60) > 839) & _540) != 0
((((unsigned int _184) / 60) > 839) & _540) != 0
(((((unsigned int) (v1_196 * 3600))/ 60) > 839) & _540) != 0
(((((unsigned int) (0) * 3600)) / 60) > 839) & _540) != 0

At which point it simplifies to 0 != 0

We're not to that point yet, but that's the direction its going.

Reply via email to