On 11/20/2017 03:25 AM, Richard Biener wrote:
> On Sun, Nov 19, 2017 at 9:16 PM, Jeff Law <l...@redhat.com> wrote:
>> On my local branch gcc.dg/torture/pr56349.c fails by sending GCC into an
>> infinite loop trying to simplify a self-referring statement. ie
>> something like
>>
>> x_1 = x_1 + 10;
>>
>> That, of course, shouldn't be happening in SSA form.  After some digging
>> I've found the culprit.
>>
>> Let's say we've got a PHI.
>>
>> a_1 = PHI (a_0, a_2)
>>
>> If DOM decides that the edge associated with a_2 is not executable, then
>> DOM will consider the PHI a degenerate and enter a_1 = a_0 into its
>> equivalence table.
>>
>> That in turn will result in propagation of a_0 into uses of a_1.
>>
>> That, of course, isn't right.  There's nothing that guarantees that the
>> definition of a_0 dominates the uses of a_1.  In the testcase that bogus
>> propagation cascades and eventually results in a self-referring node
>> like I showed above.
>>
>> The solution here is to note whether or not we ignored any PHI
>> arguments.  If we do and the equivalence we want to enter is SSA_NAME =
>> SSA_NAME, then we must reject the equivalence.    Obviously if we wanted
>> to enter SSA_NAME = CONST, then we can still do so.
>>
>> Bootstrapped and regression tested on x86_64.  Installing on the trunk.
> 
> Hmm, but if the edge isn't executable then it will be removed and thus the
> definition _will_ dominate.  So I think the error happens elsewhere.  With
> the change you remove one of the advantages of tracking unexecutable
> edges, namely that we can treat those merges optimistically resulting in
> more CSE.
> 
> You didn't add a testcase so I can't have a quick look myself.
> 
> Short: I think you're papering over an issue elsehwere.
Depends on your point of view :-)

It's not something I think you can trigger on the trunk right now.  But
the testcase is pr56349.  You need the embedded vrp bits installed into
DOM and for DOM to also use those bits to detect branches that have a
static destination.

So I'll just walk you through it...

At the start of the second DOM pass we have this:

f ()
{
  int k__lsm.8;
  int * k;
  int a;
  int _2;
  int b.0_3;
  int _4;
  unsigned int ivtmp_5;
  int _6;
  short int iftmp.3_14;
  int _15;
  int b.4_25;
  short int iftmp.3_27;
  short int c.2_33;
  unsigned int ivtmp_38;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  a_9 = 1;
  ivtmp_38 = 1;
  a_31 = a_9 + 1;
  ivtmp_5 = ivtmp_38 + 4294967295;
  _2 = 1;
  b.0_3 = b;
  _4 = b.0_3 | _2;
  b = _4;
  if (_4 == 0)
    goto <bb 12>; [66.00%]
  else
    goto <bb 10>; [34.00%]
;;    succ:       12
;;                10

;;   basic block 3, loop depth 1
;;    pred:       5
;;                3
  goto <bb 3>; [100.00%]
;;    succ:       3

;;   basic block 4, loop depth 0
;;    pred:       11
;;                12
lbl1:
  c.2_33 = c;
;;    succ:       5

;;   basic block 5, loop depth 1
;;    pred:       4
;;                5
  if (c.2_33 != 0)
    goto <bb 3>; [85.00%]
  else
    goto <bb 5>; [15.00%]
;;    succ:       3
;;                5

;;   basic block 6, loop depth 0
;;    pred:       11
  b.4_25 = b;
  if (b.4_25 != 0)
    goto <bb 7>; [50.00%]
  else
    goto <bb 8>; [50.00%]
;;    succ:       7
;;                8

;;   basic block 7, loop depth 0
;;    pred:       6
  iftmp.3_27 = (short int) b.4_25;
  goto <bb 9>; [100.00%]
;;    succ:       9

;;   basic block 8, loop depth 0
;;    pred:       6
;;                12
  # k_37 = PHI <k_30(6), 0B(12)>
;;    succ:       9

;;   basic block 9, loop depth 0
;;    pred:       7
;;                8
  # iftmp.3_14 = PHI <iftmp.3_27(7), 2(8)>
  # k_44 = PHI <k_30(7), k_37(8)>
  c = iftmp.3_14;
  if (iftmp.3_14 != 0)
    goto <bb 10>; [50.00%]
  else
    goto <bb 11>; [50.00%]
;;    succ:       10
;;                11

;;   basic block 10, loop depth 0
;;    pred:       2
;;                9
  # k_10 = PHI <0B(2), k_44(9)>
lbl2:
  b = 0;
;;    succ:       11

;;   basic block 11, loop depth 0
;;    pred:       10
;;                9
  # k_11 = PHI <k_10(10), k_44(9)>
  k_30 = k_11 + 4;
  _6 = MEM[(int *)k_11 + 4B];
  if (_6 != 0)
    goto <bb 6>; [66.00%]
  else
    goto <bb 4>; [34.00%]
;;    succ:       6
;;                4

;;   basic block 12, loop depth 0
;;    pred:       2
  _15 = MEM[(int *)0B];
  if (_15 != 0)
    goto <bb 8>; [66.00%]
  else
    goto <bb 4>; [34.00%]
;;    succ:       8
;;                4

}

The first tidbit of interest is we can statically determine that bb2
will always transfer control to bb10.  The edge 2->12 is marked as not
executable.  That also means that 12->4 and 12->8 are not executable as
well.

The PHI in BB8 is critical:

  # k_37 = PHI <k_30(6), 0B(12)>

With the edge 8->12 being unexecutable the PHI is essentially k_37 =
k_30 and we'll try to propagate k_30 into the uses of k_37.

The first use of interest is BB9.


  # k_44 = PHI <k_30(7), k_37(8)>

Which turns into
  # k_44 = PHI <k_30(7), k_30(8)>

And we've lost.  The assignment to k_30 does not currently dominate k_44
and won't until after we've done cfg cleanups and updated the dominator
information.

How does this matter?  We have to continue to follow things through DOM.


We can then propagate k_30 into uses of k_44, which we do for the PHIs
in BB10 and BB11 which turn into:

<bb 10> [local count: 0]:
# k_10 = PHI <0B(2), k_30(9)>


<bb 11> [local count: 1]:
# k_11 = PHI <k_10(10), k_30(9)>
k_30 = k_11 + 4;
_6 = MEM[(int *)k_11 + 4B];
if (_6 != 0)
  goto <bb 6>; [66.00%]
else
  goto <bb 4>; [34.00%]


Now the fun part.

After we have finished optimizing bb9, we try to thread its outgoing
edges.  Of particular interest is the edge 9->11.

We do temporary propagation as part of the jump threading process.  ie,
when we traverse 9->11 we know that k_11 will have the value k_30.  So
we transform the assignment

k_30 = k_11 + 4

into

k_30 = k_30 + 4

Then we try to simplify that statement and infinitely recurse.


I considered trying to key behavior on EDGE_DFS_BACK (6->8).  It'd be
something like don't record equivalences from a degenerate PHI where the
remaining edge(s) are EDGE_DFS_BACK.  But I just couldn't convince
myself that was actually reasonable or sufficient in general.

Jeff

Reply via email to