https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68373

            Bug ID: 68373
           Summary: autopar fails on loop exit phi with argument defined
                    outside loop
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

First consider a parloops testcase test.c, with a use of the final value of the
iteration variable (return i):
...
unsigned int
foo (int *a, int n)
{
  int i;
  for (i = 0; i < n; ++i)
    a[i] = 1;

  return i;
}
...

Say we compile the testcase like this:
...
$ gcc -S -O2 test.c -ftree-parallelize-loops=2 -fdump-tree-all-details
...

We find in the parloops dump-file:
...
  SUCCESS: may be parallelized
...

The autoparallelization is possible because -ftree-scev-cprop substitutes the
final iteration variable value, which eliminates the only loop exit phi:
...
final value replacement:
  i_1 = PHI <i_10(4)>
  with
  i_1 = n_4(D);
...
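
To make the effect of the replacement concrete, here is a source-level sketch
of what it amounts to for test.c. The name foo_replaced and the explicit
"n > 0" guard are assumptions added for illustration: the dump only shows
i_1 = n_4(D) on the path where the loop has run, and the guard models the
zero-iteration case. This is not output of the compiler.

```c
/* Original function from test.c: the return value is read from the
   iteration variable after the loop.  */
unsigned int
foo (int *a, int n)
{
  int i;
  for (i = 0; i < n; ++i)
    a[i] = 1;

  return i;
}

/* Hypothetical hand-written equivalent (illustration only): the returned
   value is computed from n alone, so nothing defined inside the loop is
   used after it.  */
unsigned int
foo_replaced (int *a, int n)
{
  int i;
  for (i = 0; i < n; ++i)
    a[i] = 1;

  /* Assumed guard: if the loop body never runs, i stays 0.  */
  return n > 0 ? n : 0;
}
```

Both functions should return the same value for any n that fits the array,
which is why no value defined in the loop needs to survive the loop exit.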


Now consider a similar testcase test-2.c, but with loop counter and bound
unsigned:
...
unsigned int
foo (int *a, unsigned int n)
{
  unsigned int i;
  for (i = 0; i < n; ++i)
    a[i] = 1;

  return i;
}
...

Here too, -ftree-scev-cprop eliminates the only loop exit phi:
...
  i_2 = PHI <i_10(3)>
  with
  i_2 = n_4(D);
...

But a subsequent pass_copy_prop manages to propagate i_2 (which did not happen
in test.c because of signedness differences) and eliminates the empty bb,
which leaves n_4(D) as the argument on the loop exit edge of the phi in bb 6:
...
  <bb 4>:
  # i_12 = PHI <0(3), i_10(5)>
  _5 = (long unsigned int) i_12;
  _6 = _5 * 4;
  _8 = a_7(D) + _6;
  *_8 = 1;
  i_10 = i_12 + 1;
  if (n_4(D) > i_10)
    goto <bb 5>;
  else
    goto <bb 6>;

  <bb 5>:
  goto <bb 4>;

  <bb 6>:
  # i_14 = PHI <n_4(D)(4), 0(2)>
...

And in parloops during
gather_scalar_reductions/vect_analyze_loop_form/vect_analyze_loop_form_1 we
split the loop exit edge, which results in this exit phi:
...
  <bb 4>:
  # i_12 = PHI <0(3), i_10(5)>
  _5 = (long unsigned int) i_12;
  _6 = _5 * 4;
  _8 = a_7(D) + _6;
  *_8 = 1;
  i_10 = i_12 + 1;
  if (n_4(D) > i_10)
    goto <bb 5>;
  else
    goto <bb 7>;

  <bb 5>:
  goto <bb 4>;

  <bb 7>:
  # n_2 = PHI <n_4(D)(4)>
...

And then parloops stumbles over that phi:
...
phi is n_2 = PHI <n_4(D)(4)>
arg of phi to exit:   value n_4(D) used outside loop
  checking if it a part of reduction pattern:
  FAILED: it is not a part of reduction.
...

In split_loop_exit_edge we preserve loop-closed ssa, which is why it introduces
phis. But we should be able to optimize
  n_2 = PHI <n_4(D)(4)>
into
  n_2 = n_4(D)
without breaking loop-closed ssa form.

Alternatively, we could improve parloops to allow phis like this and handle
them.
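
As a rough sketch of the first option, the check could look like the toy model
below. All names here (toy_value, toy_phi, toy_exit_phi_copy_source) are
hypothetical and are not GCC data structures or functions. The assumption being
modeled is that loop-closed ssa only requires exit phis for values defined
inside the loop, so a single-argument exit phi whose argument is defined
outside the loop can become a plain copy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of an ssa name; not a GCC type.  */
struct toy_value
{
  const char *name;
  bool defined_in_loop;   /* true if the definition is inside the loop */
};

/* Hypothetical toy model of a phi node with up to four arguments.  */
struct toy_phi
{
  const char *result;
  size_t num_args;
  const struct toy_value *args[4];
};

/* Return the value that PHI could be replaced with as a plain copy,
   or NULL if the phi has to stay.  Only single-argument phis whose
   argument is defined outside the loop qualify.  */
static const struct toy_value *
toy_exit_phi_copy_source (const struct toy_phi *phi)
{
  if (phi->num_args != 1)
    return NULL;
  if (phi->args[0]->defined_in_loop)
    return NULL;
  return phi->args[0];
}
```

In this model, n_2 = PHI <n_4(D)(4)> qualifies because n_4(D) is a parameter
defined outside the loop, while a phi such as i_1 = PHI <i_10(4)> does not,
because i_10 is defined in the loop body.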
