http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48316
           Summary: missed CSE / reassoc with array offsets
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: rgue...@gcc.gnu.org


int foo (int *p, int i)
{
  int i1 = i + 1;
  int i2 = i + 2;
  return p[i1] + p[i2];
}

int bar (int *p, unsigned long i)
{
  unsigned long i1 = i + 1;
  unsigned long i2 = i + 2;
  return p[i1] + p[i2];
}

For both testcases (the latter being the more "optimal" input due to
pointer-plus-expr constraints) we fail to CSE the multiplication of i by 4,
which makes the memory references not trivially independent (they are based
on the same pointer, offset by different constants).

Such a case causes alias checks to be inserted during vectorization of
gfortran.dg/reassoc_4.f with --param max-completely-peeled-insns=4000.

The IL on x86_64 is

<bb 2>:
  i1_2 = i_1(D) + 1;
  i2_3 = i_1(D) + 2;
  D.2702_4 = (long unsigned int) i1_2;
  D.2703_5 = D.2702_4 * 4;
  D.2704_7 = p_6(D) + D.2703_5;
  D.2705_8 = MEM[(int *)D.2704_7];
  D.2706_9 = (long unsigned int) i2_3;
  D.2707_10 = D.2706_9 * 4;
  D.2708_11 = p_6(D) + D.2707_10;
  D.2709_12 = MEM[(int *)D.2708_11];
  D.2701_13 = D.2705_8 + D.2709_12;
  return D.2701_13;

vs.

<bb 2>:
  i1_2 = i_1(D) + 1;
  i2_3 = i_1(D) + 2;
  D.2694_4 = i1_2 * 4;
  D.2695_6 = p_5(D) + D.2694_4;
  D.2696_7 = MEM[(int *)D.2695_6];
  D.2697_8 = i2_3 * 4;
  D.2698_9 = p_5(D) + D.2697_8;
  D.2699_10 = MEM[(int *)D.2698_9];
  D.2693_11 = D.2696_7 + D.2699_10;
  return D.2693_11;

For the reassoc_4.f testcase the question is whether either SCEV or
data-dependence analysis can be enhanced to handle these cases (the
multiplications are in BB2, outside of any loop).
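
For illustration only, here is a hand-written source-level sketch of the shape the two testcases would take if the scaling of i were shared: i is scaled once into a common base pointer, and the two loads become constant offsets (4 and 8 bytes with 4-byte int) from that base. The names foo_reassoc, bar_reassoc and check_same_results are hypothetical and not part of the report. The rewrite is assumed valid only while i + 1 and i + 2 do not wrap and all accesses stay inside the array; it is not a claim about how GCC would or should implement the transformation.

```c
#include <assert.h>

/* Original forms from the report: each index is built separately, so the
   scaling by sizeof (int) is applied to i + 1 and to i + 2 independently. */
int foo (int *p, int i)
{
  int i1 = i + 1;
  int i2 = i + 2;
  return p[i1] + p[i2];
}

int bar (int *p, unsigned long i)
{
  unsigned long i1 = i + 1;
  unsigned long i2 = i + 2;
  return p[i1] + p[i2];
}

/* Hand-reassociated forms (hypothetical names): i is scaled once into a
   shared base, and both loads are constant offsets from that base, which
   makes them trivially independent of each other. */
int foo_reassoc (int *p, int i)
{
  int *base = p + i;
  return base[1] + base[2];
}

int bar_reassoc (int *p, unsigned long i)
{
  int *base = p + i;
  return base[1] + base[2];
}

/* Hypothetical self-check: both forms agree for every in-bounds index. */
void check_same_results (void)
{
  int a[8] = { 10, 20, 30, 40, 50, 60, 70, 80 };
  for (int i = 0; i + 2 < 8; i++)
    {
      assert (foo (a, i) == foo_reassoc (a, i));
      assert (bar (a, (unsigned long) i) == bar_reassoc (a, (unsigned long) i));
    }
}
```

Comparing the optimized GIMPLE of foo and foo_reassoc (for example with -fdump-tree-optimized) is one way to see whether the multiplication by 4 ends up shared.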