http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
               What|Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-03-20
     Ever Confirmed|0                           |1
--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-03-20 09:17:41 UTC ---
I am testing the 3 patches now (AVX2 improvements, expand_vselect, and #c8 with
further comments). For 3/4 insn sequences, I agree with the proposal to try
handling d->op0 == d->op1 cross-lane shuffles as two-operand in-lane shuffles
after a vperm2f128 that swaps the lanes. The two-insn expanders could be grouped
into expand_vec_perm_2 and the three-insn expanders into expand_vec_perm_3.
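To make the lane-swap idea concrete, here is a minimal scalar sketch in plain C. It is not the i386 backend code: swap_lanes, in_lane_shuffle2 and perm_via_lane_swap are hypothetical names used only for illustration, and arrays of 4 doubles stand in for V4DF registers. It assumes the __builtin_shuffle selector convention (0..3 pick from the first operand, 4..7 from the second) and shows that any one-operand permutation can be rewritten as a lane swap followed by a two-operand shuffle that never crosses a 128-bit lane.

```c
#include <assert.h>

/* Scalar stand-in for vperm2f128 with both inputs equal: swap the two
   128-bit lanes of a 4 x double vector.  (Hypothetical helper.)  */
static void
swap_lanes (const double in[4], double out[4])
{
  out[0] = in[2];
  out[1] = in[3];
  out[2] = in[0];
  out[3] = in[1];
}

/* Two-operand in-lane shuffle.  sel[i] in 0..3 picks from a, 4..7 picks
   from b.  Returns 0 without finishing if any selector would cross a
   lane, 1 otherwise.  (Hypothetical helper.)  */
static int
in_lane_shuffle2 (const double a[4], const double b[4],
                  const int sel[4], double out[4])
{
  for (int i = 0; i < 4; i++)
    {
      int idx = sel[i] & 3;
      if (idx / 2 != i / 2)
        return 0;
      out[i] = (sel[i] & 4) ? b[idx] : a[idx];
    }
  return 1;
}

/* One-operand permutation (perm[i] in 0..3) done as a lane swap plus a
   two-operand in-lane shuffle of x and its lane-swapped copy.  */
static void
perm_via_lane_swap (const double x[4], const int perm[4], double out[4])
{
  double swapped[4];
  int sel[4];

  swap_lanes (x, swapped);
  for (int i = 0; i < 4; i++)
    {
      if (perm[i] / 2 == i / 2)
        sel[i] = perm[i];             /* Already in-lane: take it from x.  */
      else
        sel[i] = 4 + (perm[i] ^ 2);   /* Cross-lane: the same element sits
                                         in the other lane of swapped.  */
    }
  /* By construction every selector is now in-lane.  */
  int ok = in_lane_shuffle2 (x, swapped, sel, out);
  assert (ok);
  (void) ok;
}
```

For example, calling perm_via_lane_swap with perm = { 3, 0, 1, 2 } builds the selector { 5, 0, 7, 2 }: elements 0 and 2 come from the swapped copy, elements 1 and 3 from x itself.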
We still need to write some further 2- and 3-insn in-lane expanders, though, as
shown by this testcase:
typedef double V4DF __attribute__((vector_size (4 * sizeof (double))));
typedef long V4DI __attribute__((vector_size (4 * sizeof (long))));
#define A(a, b, c, d) \
__attribute__((noinline, noclone)) V4DF \
f##a##b##c##d (V4DF x, V4DF y) \
{ \
  V4DI m = { a, b, c, d }; \
  return __builtin_shuffle (x, y, m); \
}
#define B(b, c, d) A(0, b, c, d) A(1, b, c, d) A(4, b, c, d) A(5, b, c, d)
#define C(c, d) B(0, c, d) B(1, c, d) B(4, c, d) B(5, c, d)
#define D(d) C(2, d) C(3, d) C(6, d) C(7, d)
#define E D(2) D(3) D(6) D(7)
E
int
main ()
{
  V4DF x = { 0.5, 1.5, 2.5, 3.5 }, y = { 4.5, 5.5, 6.5, 7.5 }, z;
#undef A
#define A(a, b, c, d) \
  z = f##a##b##c##d (x, y); \
  if (z[0] != a + .5 || z[1] != b + .5 || z[2] != c + .5 || z[3] != d + .5) \
    __builtin_abort ();
  E
  return 0;
}
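
As a rough cross-check of what the macros above expand to, the following plain C sketch enumerates the same selectors with loops instead of token pasting. The names low_lane, high_lane, is_in_lane and count_test_masks are made up for illustration and are not part of the testcase. Assuming the usual __builtin_shuffle convention (0..3 select from x, 4..7 from y), it confirms that elements 0 and 1 always come from the low lane of x or y and elements 2 and 3 from the high lane, so every generated mask is a two-operand in-lane shuffle.

```c
#include <assert.h>

/* Index sets used by the B/C macros (elements 0 and 1) and by the D/E
   macros (elements 2 and 3) in the testcase.  */
static const int low_lane[4] = { 0, 1, 4, 5 };
static const int high_lane[4] = { 2, 3, 6, 7 };

/* Nonzero if selector SEL, used for result element POS, reads from the
   same 128-bit lane that POS lives in.  Bit 2 of SEL only chooses the
   operand, so it is masked off first.  */
static int
is_in_lane (int pos, int sel)
{
  return (sel & 3) / 2 == pos / 2;
}

/* Walk every mask the E macro generates, check that it is in-lane, and
   return how many masks there are.  */
static int
count_test_masks (void)
{
  int count = 0;

  for (int a = 0; a < 4; a++)
    for (int b = 0; b < 4; b++)
      for (int c = 0; c < 4; c++)
        for (int d = 0; d < 4; d++)
          {
            int m[4] = { low_lane[a], low_lane[b],
                         high_lane[c], high_lane[d] };
            for (int i = 0; i < 4; i++)
              assert (is_in_lane (i, m[i]));
            count++;
          }
  return count;
}
```

count_test_masks returns 4 * 4 * 4 * 4 = 256, matching the number of f##a##b##c##d functions the testcase defines and calls.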