On Tue, Jul 25, 2017 at 12:48 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Mon, Jul 10, 2017 at 10:24 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> On Tue, Jun 27, 2017 at 11:49 AM, Bin Cheng <bin.ch...@arm.com> wrote:
>>> Hi,
>>> This is a follow-up patch that better handles the case below:
>>>      for (i = 0; i < n; i++)
>>>        {
>>>          a[i] = 1;
>>>          a[i+2] = 2;
>>>        }
>>> Instead of generating root variables by loading from memory and
>>> propagating with PHI nodes, like:
>>>      t0 = a[0];
>>>      t1 = a[1];
>>>      for (i = 0; i < n; i++)
>>>        {
>>>          a[i] = 1;
>>>          t2 = 2;
>>>          t0 = t1;
>>>          t1 = t2;
>>>        }
>>>      a[n] = t0;
>>>      a[n+1] = t1;
>>> We can simply store the loop-invariant values after the loop body if we
>>> know the loop iterates more than chain->length times, like:
>>>      for (i = 0; i < n; i++)
>>>        {
>>>          a[i] = 1;
>>>        }
>>>      a[n] = 2;
>>>      a[n+1] = 2;
>>>
>>> Bootstrapped (O2/O3) in the patch series on x86_64 and AArch64.  Is it OK?
>> Updated patch w.r.t. changes in the previous patch.
>> Bootstrapped and tested on x86_64 and AArch64.  Is it OK?
>
> +      if (TREE_CODE (val) == INTEGER_CST || TREE_CODE (val) == REAL_CST)
> +       continue;
>
> Please use CONSTANT_CLASS_P (val) instead.  I suppose VECTOR_CST or
> FIXED_CST would be ok as well for example.
>
> Ok with that change.  Did we eventually optimize this in followup
> passes previously?
Probably not.  Given the test below:

int a[10000], b[10000], c[10000];
int f(void)
{
  int i, n = 100;
  int t0 = a[0];
  int t1 = a[1];
  for (i = 0; i < n; i++)
    {
      a[i] = 1;
      int t2 = 2;
      t0 = t1;
      t1 = t2;
    }
  a[n] = t0;
  a[n+1] = t1;
  return 0;
}
The optimized dump is:

  <bb 2> [1.00%] [count: INV]:
  t1_8 = a[1];
  ivtmp.9_17 = (unsigned long) &a;
  _16 = ivtmp.9_17 + 400;

  <bb 3> [99.00%] [count: INV]:
  # t1_20 = PHI <2(3), t1_8(2)>
  # ivtmp.9_2 = PHI <ivtmp.9_1(3), ivtmp.9_17(2)>
  _15 = (void *) ivtmp.9_2;
  MEM[base: _15, offset: 0B] = 1;
  ivtmp.9_1 = ivtmp.9_2 + 4;
  if (ivtmp.9_1 != _16)
    goto <bb 3>; [98.99%] [count: INV]
  else
    goto <bb 4>; [1.01%] [count: INV]

  <bb 4> [1.00%] [count: INV]:
  a[100] = t1_20;
  a[101] = 2;
  return 0;

We now eliminate one PHI and leave the other behind.  The eliminated
PHI goes away in vrp1/dce2.
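
For reference, the equivalence the patch relies on (the original loop
versus the store-after-loop form quoted above) can be checked with a
small standalone sketch.  This is only an illustration, not code from
the patch: store_in_loop, store_after_loop, forms_agree and LEN are
hypothetical names made up for this example.

```c
#include <assert.h>
#include <string.h>

#define LEN 128   /* hypothetical scratch size; must be at least n + 2 */

/* Original form: two stores per iteration, two elements apart.  */
static void store_in_loop(int *a, int n)
{
  for (int i = 0; i < n; i++)
    {
      a[i] = 1;
      a[i + 2] = 2;
    }
}

/* Transformed form: the invariant stores are emitted after the loop.  */
static void store_after_loop(int *a, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = 1;
  a[n] = 2;
  a[n + 1] = 2;
}

/* Return 1 if both forms leave identical array contents for N.  */
static int forms_agree(int n)
{
  int x[LEN], y[LEN];

  assert(n >= 0 && n + 2 <= LEN);
  /* Fill with a sentinel so untouched elements are comparable.  */
  for (int i = 0; i < LEN; i++)
    x[i] = y[i] = -1;
  store_in_loop(x, n);
  store_after_loop(y, n);
  return memcmp(x, y, sizeof x) == 0;
}
```

With a chain length of 2 here, forms_agree (n) returns 1 for n = 3
through 100, and 0 for n = 1: in that case the store to a[n] after the
loop overwrites a[1], which the original loop never touches.  That is
why the iteration-count check is needed.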

Thanks,
bin
