On Fri, Apr 16, 2010 at 7:40 PM, Lu, John <john...@verisilicon.com> wrote:
> Hi,
>
> I've encountered a performance issue in a port of GCC I'm working on,
> where the behavior of LIM is affected by the ordering of fields in a
> structure.  I've been able to reproduce it with GCC 4.3.1 on Linux x86.
> When I compile:
>
>    struct foo {
>      int *p;
>      int  t;
>    } T;
>
>    void o() {
>      unsigned int  i;
>
>      for (i = 0; i < 256; i++) {
>        T.p[i]=0;
>      }
>    }
>
> with the command:
>
>    gcc -S -O2 -fdump-tree-all good.c
>
> the file good.c.095t.lim shows T.p being moved outside the loop:
>
>    <bb 2>:
>      pretmp.10_8 = T.p;
>
>    <bb 3>:
>      # i_14 = PHI <i_7(4), 0(2)>
>      D.1556_4 = (long unsigned int) i_14;
>      D.1557_5 = D.1556_4 * 4;
>      D.1558_6 = pretmp.10_8 + D.1557_5;
>      *D.1558_6 = 0;
>      i_7 = i_14 + 1;
>      if (i_7 <= 255)
>        goto <bb 4>;
>      else
>        goto <bb 5>;
>
>    <bb 4>:
>      goto <bb 3>;
>
> If the fields of the structure are reversed:
>
>    struct foo {
>      int  t;
>      int *p;
>    } T;
>
> T.p is kept inside the loop:

This is because T.p[i] may point to T.t, which confuses GCC.
Look at the -crited-vops dump.

Richard.

>    <bb 3>:
>      # i_21 = PHI <i_7(4), 0(2)>
>      D.1555_3 = T.p;
>      D.1556_4 = (long unsigned int) i_21;
>      D.1557_5 = D.1556_4 * 4;
>      D.1558_6 = D.1555_3 + D.1557_5;
>      *D.1558_6 = 0;
>      i_7 = i_21 + 1;
>      if (i_7 <= 255)
>        goto <bb 4>;
>      else
>        goto <bb 5>;
>
>    <bb 4>:
>      goto <bb 3>;
>
> On my port, this causes a large performance degradation, and I suspect the
> cause is ultimately in the alias analysis pass.  I was wondering if there
> is a way to configure GCC to avoid this issue.
>
> Thanks,
> John Lu
>
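
Independent of any compiler option, one source-level workaround is to
hoist the load by hand.  The following is only a minimal sketch, not a
fix for the alias analysis itself.  It assumes that the stores in the
loop never overwrite T.p (otherwise it would not behave like the
original loop), and the local name p is introduced here purely for
illustration:

```c
struct foo {
  int  t;
  int *p;
} T;

/* Same loop as in the report, using the field order that defeats LIM.
   T.p is copied into a local once, before the loop.  The local's
   address is never taken, so no store through it can modify it and
   the compiler has no reason to reload it on each iteration.
   Assumption: the stores do not overwrite T.p itself. */
void o(void) {
  int *p = T.p;          /* hypothetical local; loaded once */
  unsigned int i;

  for (i = 0; i < 256; i++) {
    p[i] = 0;
  }
}
```

Compiling this with the same command (gcc -S -O2 -fdump-tree-all) and
comparing the .lim dump against the original is a quick way to check
whether the load now stays outside the loop on a given port.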
