http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56764
Bug #: 56764
Summary: vect_prune_runtime_alias_test_list not smart enough
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ja...@gcc.gnu.org

__attribute__((noinline, noclone)) void
foo (float x[3][32], float y1, float y2, float y3,
     float *z1, float *z2, float *z3)
{
  int i;
  for (i = 0; i < 32; i++)
    {
      z1[i] = -y1 * x[0][i];
      z2[i] = -y2 * x[1][i];
      z3[i] = -y3 * x[2][i];
    }
}

float x[6][32] __attribute__((aligned (32)));

int
main ()
{
  int i;
  for (i = 0; i < 32; i++)
    {
      x[0][i] = i;
      x[1][i] = 7 * i;
      x[2][i] = -5.5 * i;
    }
  for (i = 0; i < 100000000; i++)
    foo (&x[0], 12.5, 0.5, -1.5, &x[3][0], &x[4][0], &x[5][0]);
  return 0;
}

This testcase isn't vectorized on x86_64-linux with -O3 -mavx, because
there are too many versioning checks for alias. We vectorize it only with
--param vect-max-version-for-alias-checks=12.

But I don't see why we'd need to emit that many checks for versioning.
Instead of the 12 checks for aliasing we emit, we could emit just 6: keep
the 3 overlap checks between z1, z2 and z3, and, for each zN, merge the
zN vs. &x[0][0], zN vs. &x[1][0] and zN vs. &x[2][0] tests into one test
that compares the zN[0] through zN[31] range with the &x[0][0] through
&x[2][31] range.

Similarly, if we wanted to do a runtime check for alignment (not the case
on x86_64 apparently), we could test only the alignment of &x[0][0],
because it is provably the same as the alignment of &x[1][0] and &x[2][0].
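
To make the 12 vs. 6 count concrete, here is a minimal sketch in C of the two sets of runtime overlap tests described above. It is only an illustration, not the code GCC emits: ranges_overlap, may_alias_12 and may_alias_6 are hypothetical helper names, and the comparison goes through uintptr_t on the assumption of a flat address space (as on x86_64-linux).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { N = 32 };  /* row length used by foo () in the testcase */

/* Hypothetical helper: true if the byte ranges [a, a + a_len) and
   [b, b + b_len) overlap.  Assumes a flat address space.  */
static bool
ranges_overlap (const void *a, size_t a_len, const void *b, size_t b_len)
{
  uintptr_t pa = (uintptr_t) a, pb = (uintptr_t) b;
  return pa < pb + b_len && pb < pa + a_len;
}

/* Unmerged scheme: each zN against each of the three x rows (9 tests),
   plus the three zN vs. zM pairs, 12 tests in total.  */
static bool
may_alias_12 (float x[3][N], float *z1, float *z2, float *z3, int *checks)
{
  float *z[3] = { z1, z2, z3 };
  const size_t row = N * sizeof (float);
  bool alias = false;
  *checks = 0;
  for (int i = 0; i < 3; i++)
    for (int j = 0; j < 3; j++)
      {
        alias |= ranges_overlap (z[i], row, x[j], row);
        ++*checks;
      }
  for (int i = 0; i < 3; i++)
    for (int j = i + 1; j < 3; j++)
      {
        alias |= ranges_overlap (z[i], row, z[j], row);
        ++*checks;
      }
  return alias;
}

/* Merged scheme: each zN against the single contiguous range
   &x[0][0] .. &x[2][N-1] (3 tests), plus the three zN vs. zM pairs,
   6 tests in total.  */
static bool
may_alias_6 (float x[3][N], float *z1, float *z2, float *z3, int *checks)
{
  float *z[3] = { z1, z2, z3 };
  const size_t row = N * sizeof (float);
  bool alias = false;
  *checks = 0;
  for (int i = 0; i < 3; i++)
    {
      alias |= ranges_overlap (z[i], row, x, 3 * row);
      ++*checks;
    }
  for (int i = 0; i < 3; i++)
    for (int j = i + 1; j < 3; j++)
      {
        alias |= ranges_overlap (z[i], row, z[j], row);
        ++*checks;
      }
  return alias;
}
```

Because the three rows of x are adjacent in memory, the merged range is exactly the union of the three per-row ranges, so in this sketch both functions return the same answer for any z1, z2 and z3; only the number of tests differs.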