On Fri, Apr 16, 2010 at 7:40 PM, Lu, John <john...@verisilicon.com> wrote:
> Hi,
>
> I've encountered a performance issue in a port of GCC I'm working on,
> where the behavior of LIM is affected by the ordering of fields in a
> structure. I've been able to re-create it with a GCC 4.3.1
> Linux X86 compiler. With GCC 4.3.1 Linux X86, when I compile:
>
> struct foo {
>   int *p;
>   int t;
> } T;
>
> void o() {
>   unsigned int i;
>
>   for (i = 0; i < 256; i++) {
>     T.p[i]=0;
>   }
> }
>
> with the command:
>
>   gcc -S -O2 -fdump-tree-all good.c
>
> the file good.c.095t.lim shows T.p being moved outside the loop:
>
> <bb 2>:
>   pretmp.10_8 = T.p;
>
> <bb 3>:
>   # i_14 = PHI <i_7(4), 0(2)>
>   D.1556_4 = (long unsigned int) i_14;
>   D.1557_5 = D.1556_4 * 4;
>   D.1558_6 = pretmp.10_8 + D.1557_5;
>   *D.1558_6 = 0;
>   i_7 = i_14 + 1;
>   if (i_7 <= 255)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
>
> <bb 4>:
>   goto <bb 3>;
>
> If the fields of the structure are reversed:
>
> struct foo {
>   int t;
>   int *p;
> } T;
>
> T.p is kept inside the loop:

This is because T.p[i] may point to T.t and gcc is confused then.
Look at the -crited-vops dump.

Richard.

> <bb 3>:
>   # i_21 = PHI <i_7(4), 0(2)>
>   D.1555_3 = T.p;
>   D.1556_4 = (long unsigned int) i_21;
>   D.1557_5 = D.1556_4 * 4;
>   D.1558_6 = D.1555_3 + D.1557_5;
>   *D.1558_6 = 0;
>   i_7 = i_21 + 1;
>   if (i_7 <= 255)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
>
> <bb 4>:
>   goto <bb 3>;
>
> On my port, this causes a large performance degradation, and I suspect
> the cause is ultimately in the alias analysis pass. I was wondering if
> there is a way to configure GCC to avoid this issue.
>
> Thanks,
> John Lu
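
A possible source-level workaround, not proposed in the thread itself: since the compiler cannot prove that the store through T.p[i] leaves T untouched, the load of T.p can be hoisted by hand into a local variable. The sketch below assumes the loop is never meant to change T.p (an int store cannot validly modify an int * object under C's aliasing rules). The function name o_hoisted is made up for illustration.

```c
struct foo {
    int t;
    int *p;
} T;

/* Hypothetical variant of o(): read T.p once, before the loop. */
void o_hoisted(void)
{
    /* The address of this local is never taken, so the stores below
       cannot alias it and no reload is needed inside the loop. */
    int *p = T.p;
    unsigned int i;

    for (i = 0; i < 256; i++) {
        p[i] = 0;
    }
}
```

With the pointer held in a local, the loop body no longer depends on how alias analysis treats the fields of T, so the field order should stop mattering for this loop. This sidesteps the issue rather than configuring GCC around it.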