http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51074

             Bug #: 51074
           Summary: No constant folding performed for VEC_PERM_EXPR,
                    VEC_INTERLEAVE*EXPR, VEC_EXTRACT*EXPR
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ja...@gcc.gnu.org
                CC: i...@gcc.gnu.org, r...@gcc.gnu.org

We don't constant fold what could be constant folded, namely the above
mentioned permutation trees if all of their operands are VECTOR_CSTs.

With -O2:

#define vector(type, count) type __attribute__((vector_size (sizeof (type) * count)))

vector (short, 8) d;

void
foo ()
{
  vector (short, 8) a = { 0, 1, 2, 3, 4, 5, 6, 7 };
  vector (short, 8) b = { 8, 9, 10, 11, 12, 13, 14, 15 };
  vector (short, 8) c = { 0, 8, 1, 9, 2, 10, 3, 11 };
  d = __builtin_shuffle (a, b, c);
}

void
bar ()
{
  vector (short, 8) a = { 0, 1, 2, 3, 4, 5, 6, 7 };
  vector (short, 8) b = { 8, 9, 10, 11, 12, 13, 14, 15 };
  vector (short, 8) c = { 4, 12, 5, 13, 6, 14, 7, 15 };
  d = __builtin_shuffle (a, b, c);
}

or, with -O3 -fno-vect-cost-model -mavx:

char *a[1024];
extern char b[];

void
foo ()
{
  int i;

  for (i = 0; i < 1024; i += 16)
    {
      a[i] = b + 1;
      a[i + 15] = b + 2;
      a[i + 1] = b + 3;
      a[i + 14] = b + 4;
      a[i + 2] = b + 5;
      a[i + 13] = b + 6;
      a[i + 3] = b + 7;
      a[i + 12] = b + 8;
      a[i + 4] = b + 9;
      a[i + 11] = b + 10;
      a[i + 5] = b + 11;
      a[i + 10] = b + 12;
      a[i + 6] = b + 13;
      a[i + 9] = b + 14;
      a[i + 7] = b + 15;
      a[i + 8] = b + 16;
    }
}

I wonder if e.g. expand_vector_operations couldn't handle those (if all the
arguments are either VECTOR_CSTs or SSA_NAMEs initialized to VECTOR_CSTs).
There is of course a risk that if we create many different VECTOR_CSTs from
very few VECTOR_CSTs in a loop, it increases register pressure, so perhaps
we'd want to count how many VECTOR_CSTs we've created vs. how many we've got
rid of, and allow the number to grow only by some small constant or something
similar.

Plus, there is the question of whether the vectorizer shouldn't be aware of
that too (e.g. in the second testcase the vectorizer could take it into
account when computing costs, and for interleaved constant stores couldn't it
just do the folding right away).
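
For reference, here is a minimal scalar sketch of what folding the first testcase would have to compute. It is not GCC code: fold_shuffle is a hypothetical helper name, and it only models the documented semantics of the two-operand __builtin_shuffle (mask elements are taken modulo 2*N, with indices 0..N-1 selecting from the first operand and N..2*N-1 from the second) on plain arrays of short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: scalar model of the two-operand
   __builtin_shuffle (a, b, mask) for vectors of N shorts.  Each mask
   element is reduced modulo 2*N; indices 0..N-1 pick from A, indices
   N..2*N-1 pick from B.  */
void
fold_shuffle (const short *a, const short *b, const short *mask,
              short *out, size_t n)
{
  /* Vector lengths are powers of two, so modulo 2*N is a bit mask.  */
  assert (n > 0 && (n & (n - 1)) == 0);

  for (size_t i = 0; i < n; i++)
    {
      size_t idx = (size_t) mask[i] & (2 * n - 1);
      out[i] = idx < n ? a[idx] : b[idx - n];
    }
}
```

With the operands from the first testcase, where a holds 0..7 and b holds 8..15, every selected element equals its own index, so the folded constant is the mask itself: { 0, 8, 1, 9, 2, 10, 3, 11 } for foo and { 4, 12, 5, 13, 6, 14, 7, 15 } for bar.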