https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78496

--- Comment #5 from Jeffrey A. Law <law at redhat dot com> ---
More comments.  As has been noted, this looks like a case where we need
iteration to fully optimize.  However, there are things we can do to improve
VRP's jump threading which should have a direct positive impact on this test.


VRP has code which will forward propagate something like

x = (typecast) y;
if (x == 42)

into

if (y == 42)

That's fine and good, except it really makes it hard for jump threading to
thread through such conditionals since we may have had ASSERT_EXPRs to give us
a nice range for X, but not for Y.
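
As a toy illustration of why losing the range matters (this is a sketch, not
GCC's implementation): assume an ASSERT_EXPR gave x the range [0, 10] on some
incoming edge, while nothing is recorded for y.  The names value_range and
fold_eq are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of trying to decide a condition from range information alone. */
enum fold { FOLD_FALSE, FOLD_TRUE, FOLD_UNKNOWN };

/* Hypothetical, simplified value range: either "varying" (nothing known)
   or a closed interval [lo, hi].  */
struct value_range {
    bool varying;
    long lo, hi;
};

/* Try to decide "value == c" using only the range of value.  */
enum fold fold_eq(struct value_range r, long c)
{
    if (r.varying)
        return FOLD_UNKNOWN;        /* no range, nothing to thread on */
    assert(r.lo <= r.hi);
    if (c < r.lo || c > r.hi)
        return FOLD_FALSE;          /* c lies outside the range */
    if (r.lo == r.hi)
        return FOLD_TRUE;           /* singleton range, and it equals c */
    return FOLD_UNKNOWN;
}
```

With x in [0, 10], fold_eq decides "x == 42" as FOLD_FALSE, so the jump could
be threaded.  After the test is rewritten to use y, which is varying in this
model, the same query returns FOLD_UNKNOWN.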

From my experiments, deferring that transformation until after jump threading
would definitely help this testcase.  Doing so allows VRP1 to jump thread a
number of range tests of w1.  The total number of VRP and DOM jump threads is
currently 15.  Deferring the transformation increases that to 57.

We also have cases where we've got something like this:

;;   basic block 13, loop depth 1, count 0, freq 8500, maybe hot
;;    prev block 12, next block 129, flags: (NEW, REACHABLE, VISITED)
;;    pred:       11 [50.0%]  (FALSE_VALUE,EXECUTABLE)
;;                12 [100.0%]  (FALLTHRU,EXECUTABLE)
  out_ind.71_313 = out_ind;
  OUT[out_ind.71_313] = ss1_233;
  _314 = out_ind.71_313 + 1;
  out_ind = _314;
  aa1_258 = ye1_171 + aa1_186;
  if (v1_179 > 0)
    goto <bb 14>; [64.00%]
  else
    goto <bb 129>; [36.00%]
;;    succ:       14 [64.0%]  (TRUE_VALUE,EXECUTABLE)
;;                129 [36.0%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 129, loop depth 1, count 0, freq 3060, maybe hot
;;    prev block 13, next block 14, flags: (NEW)
;;    pred:       13 [36.0%]  (FALSE_VALUE,EXECUTABLE)
  v1_378 = ASSERT_EXPR <v1_179, v1_179 <= 0>;
  goto <bb 16>; [100.00%]
;;    succ:       16 [100.0%]  (FALLTHRU)

Where we have this range computed for v1_378:
v1_378: [0, 0]  EQUIVALENCES: { v1_179 } (1 elements)


Presumably we don't simplify the ASSERT_EXPR because it was thought that
ASSERT_EXPRs are just going to be dropped anyway.  But by simplifying the
ASSERT_EXPR into an equality test, we can then propagate a value of 0 for
v1_179 into BB16 when it's reached via BB129.  That in turn allows us to
thread the jump earlier (in VRP1 rather than in DOM2); threading earlier is
usually helpful.
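
A minimal sketch of the range arithmetic behind that simplification, under
the assumption (not stated in the dump) that v1_179 was already known to be
in [0, INT_MAX] before the assertion.  The names range, assert_le and
is_singleton are hypothetical and are not GCC functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical closed interval [lo, hi]. */
struct range {
    long lo, hi;
};

/* Narrow r by the predicate "value <= bound", as an ASSERT_EXPR would.  */
struct range assert_le(struct range r, long bound)
{
    assert(r.lo <= r.hi);
    assert(r.lo <= bound);          /* the assertion must be satisfiable */
    if (bound < r.hi)
        r.hi = bound;
    return r;
}

/* If r holds exactly one value, store it in *value and return true.  */
bool is_singleton(struct range r, long *value)
{
    if (r.lo != r.hi)
        return false;
    *value = r.lo;
    return true;
}
```

Narrowing [0, INT_MAX] by "<= 0" leaves [0, 0], which matches the range shown
for v1_378.  A singleton range is equivalent to an equality (here v1_179 == 0),
which is what makes the constant available for propagation.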


The one I still haven't cracked looks like this:

;;   basic block 18, loop depth 1, count 0, freq 2125, maybe hot
;;    prev block 131, next block 19, flags: (NEW, REACHABLE, VISITED)
;;    pred:       17 [50.0%]  (TRUE_VALUE,EXECUTABLE)
  w1_390 = ASSERT_EXPR <w1_389, w1_389 <= 449>;
  _17 = 450 - w1_390;
  _18 = (unsigned int) _17;
  _19 = _18 * oo1_227;
  _20 = _19 / 3600;
  _21 = _20 + _310;
  x1_261 = (int) _21;
;;    succ:       19 [100.0%]  (FALLTHRU,EXECUTABLE)

;;   basic block 19, loop depth 1, count 0, freq 8500, maybe hot
;;    prev block 18, next block 132, flags: (NEW, REACHABLE, VISITED)
;;    pred:       130 [100.0%]  (FALLTHRU)
;;                131 [100.0%]  (FALLTHRU)
;;                18 [100.0%]  (FALLTHRU,EXECUTABLE)
  # x1_197 = PHI <x1_206(130), x1_206(131), x1_261(18)>
  if (w1_260 > 839)
    goto <bb 20>; [50.00%]
  else
    goto <bb 132>; [50.00%]
;;    succ:       20 [50.0%]  (TRUE_VALUE,EXECUTABLE)
;;                132 [50.0%]  (FALSE_VALUE,EXECUTABLE)

We've got an obviously threadable jump when BB19 is reached via BB18.

w1_260 is not used by the ASSERT_EXPR in BB18, so we don't find that
ASSERT_EXPR, which we need to be able to thread the jump.  I can see expensive
ways to get to the ASSERT_EXPR, but nothing I'd consider clean enough yet.  I
suspect this happens often enough that, if we were to catch it in VRP1, we'd
probably be reasonably close to catching the jump threads early enough to
essentially resolve this BZ.
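
For readers less familiar with jump threading, the kind of thread being
missed can be sketched at source level.  This is a hand-written analogy, not
the testcase: the functions unthreaded, threaded and check_equivalent are
hypothetical, and the sketch assumes the "<= 449" assertion applies to the
same value that BB19 compares against 839.

```c
#include <assert.h>

/* Before threading: the second test runs on every path, including the one
   where w <= 449 has already been established.  */
int unthreaded(int w, int x0)
{
    int x = x0;
    if (w <= 449)
        x = 450 - w;                /* path corresponding to BB18 */
    if (w > 839)                    /* test corresponding to BB19 */
        return x + 1000;
    return x;
}

/* After threading by hand: on the w <= 449 path, "w > 839" is known to be
   false, so control jumps straight past the second test.  */
int threaded(int w, int x0)
{
    if (w <= 449)
        return 450 - w;
    if (w > 839)
        return x0 + 1000;
    return x0;
}

/* Check that both versions agree for every w in [w_lo, w_hi].  */
void check_equivalent(int w_lo, int w_hi, int x0)
{
    assert(w_lo <= w_hi);
    for (int w = w_lo; w <= w_hi; w++)
        assert(unthreaded(w, x0) == threaded(w, x0));
}
```

Calling check_equivalent(-1000, 2000, 7) exercises all three paths.  The range
is kept small so that 450 - w cannot overflow.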
