On Mon, Oct 3, 2011 at 6:12 PM, Richard Henderson <r...@redhat.com> wrote:
> On 10/03/2011 09:43 AM, Artem Shinkarov wrote:
>> Hi, Richard
>>
>> There is a problem with the testcases of the patch you have committed
>> for me. The code in every test-case is doubled. Could you please,
>> apply the following patch, otherwise it would fail all the tests from
>> the vector-shuffle-patch would fail.
>
> Huh.  Dunno what happened there.  Fixed.
>
>> Also, if it is possible, could you change my name from in the
>> ChangeLog from "Artem Shinkarov" to "Artjoms Sinkarovs". The last
>> version is the way I am spelled in the passport, and the name I use in
>> the ChangeLog.
>
> Fixed.
>
>
> r~
>

Richard, there was a problem causing a segfault in ix86_expand_vshuffle,
which I have fixed with the attached patch.

Another thing I cannot figure out is the following case:

#define vector(elcount, type)  \
__attribute__((vector_size((elcount)*sizeof(type)))) type

vector (8, short) __attribute__ ((noinline))
f (vector (8, short) x, vector (8, short) y, vector (8, short) mask) {
    return __builtin_shuffle (x, y, mask);
}

int main (int argc, char *argv[]) {
    vector (8, short) v0 = {argc, 1,2,3,4,5,6,7};
    vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7};
    vector (8, short) mask0 = {0,2,3,1,4,5,6,7};
    vector (8, short) v2;
    int i;

    v2 = f (v0, v1, mask0);
    /* v2 = __builtin_shuffle (v0, v1, mask0); */
    for (i = 0; i < 8; i ++)
      __builtin_printf ("%i, ", v2[i]);

    return 0;
}

I am compiling with SSSE3 support; in my case the command is:

./xgcc -B. b.c -O3 -mtune=core2 -march=core2

and I get

1, 1, 1, 3, 4, 5, 1, 7,

on the output, which is wrong. But if I call __builtin_shuffle directly
(the commented-out line in main), the answer is correct.

Any ideas?


Thanks,
Artem.

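For reference, here is a minimal scalar sketch of what the two-operand
__builtin_shuffle is documented to compute, which may help when checking
the output above by hand. It assumes the semantics described in the GCC
manual (elements of the first vector are numbered 0..N-1, elements of the
second N..2N-1, and mask values are taken modulo 2*N). The name
shuffle2_ref is hypothetical and is not part of GCC.

```c
#include <stddef.h>

/* Scalar model of the two-operand __builtin_shuffle for n-element
   vectors of short.  x holds elements 0..n-1, y holds elements
   n..2n-1, and each mask value is reduced modulo 2*n.
   shuffle2_ref is a hypothetical helper name, not a GCC function.  */
static void
shuffle2_ref (const short *x, const short *y, const short *mask,
              short *out, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      /* Assumption: n is a power of two, as for GCC vector types, so
         masking with 2*n-1 is the same as reducing modulo 2*n.  */
      size_t idx = (size_t) mask[i] & (2 * n - 1);

      out[i] = idx < n ? x[idx] : y[idx - n];
    }
}
```

With the vectors from the test case and argc == 1, every mask value is
below 8, so only v0 is selected from and this model gives
1, 2, 3, 1, 4, 5, 6, 7. The reported output 1, 1, 1, 3, 4, 5, 1, 7
matches v1 element for element.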
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 179464)
+++ gcc/config/i386/i386.c	(working copy)
@@ -19312,14 +19312,17 @@ ix86_expand_vshuffle (rtx operands[])
       xops[1] = operands[1];
       xops[2] = operands[2];
       xops[3] = gen_rtx_EQ (mode, mask, w_vector);
-      xops[4] = t1;
-      xops[5] = t2;
+      xops[4] = t2;
+      xops[5] = t1;

       return ix86_expand_int_vcond (xops);
     }

-  /* mask = mask * {w, w, ...}  */
-  new_mask = expand_simple_binop (maskmode, MULT, new_mask, w_vector,
+  /* mask = mask * {16/w, 16/w, ...}  */
+  for (i = 0; i < w; i++)
+    vec[i] = GEN_INT (16/w);
+  vt = gen_rtx_CONST_VECTOR (maskmode, gen_rtvec_v (w, vec));
+  new_mask = expand_simple_binop (maskmode, MULT, new_mask, vt,
                                   NULL_RTX, 0, OPTAB_DIRECT);

   /* Convert mask to vector of chars.  */
@@ -19332,7 +19335,7 @@ ix86_expand_vshuffle (rtx operands[])
      ...  */
   for (i = 0; i < w; i++)
     for (j = 0; j < 16/w; j++)
-      vec[i*w+j] = GEN_INT (i*16/w);
+      vec[i*(16/w)+j] = GEN_INT (i*16/w);

   vt = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, vec));
   vt = force_reg (V16QImode, vt);
@@ -19344,7 +19347,7 @@ ix86_expand_vshuffle (rtx operands[])
      new_mask = new_mask + {0,1,..,16/w, 0,1,..,16/w, ...}  */
   for (i = 0; i < w; i++)
     for (j = 0; j < 16/w; j++)
-      vec[i*w+j] = GEN_INT (j);
+      vec[i*(16/w)+j] = GEN_INT (j);

   vt = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, vec));
   new_mask = expand_simple_binop (V16QImode, PLUS, new_mask, vt,
@@ -19386,8 +19389,8 @@ ix86_expand_vshuffle (rtx operands[])
       xops[1] = operands[1];
       xops[2] = operands[2];
       xops[3] = gen_rtx_EQ (mode, mask, w_vector);
-      xops[4] = t1;
-      xops[5] = t2;
+      xops[4] = t2;
+      xops[5] = t1;

       return ix86_expand_int_vcond (xops);
     }
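
To make the index arithmetic in the patch easier to follow, here is a
rough scalar sketch of the byte-level selector that the one-operand
pshufb path appears to build: each element index is scaled by the element
size in bytes (16/w), and the byte offset within the element (0..16/w-1)
is added. This is only an illustration of the arithmetic, not the RTL
expansion itself; build_pshufb_control is a hypothetical name, and
reducing the mask modulo w inside the helper is an assumption made to
keep the sketch self-contained.

```c
#include <stddef.h>

/* Build a 16-byte selector from an element-level mask.  w is the
   number of elements in a 16-byte vector, so each element occupies
   16/w bytes.  Hypothetical helper; assumes w is a power of two that
   divides 16.  */
static void
build_pshufb_control (const unsigned char *mask, size_t w,
                      unsigned char ctrl[16])
{
  size_t bytes = 16 / w;   /* element size in bytes */
  size_t i, j;

  for (i = 0; i < w; i++)
    for (j = 0; j < bytes; j++)
      /* Byte j of output element i comes from byte j of the input
         element selected by mask[i].  The index i*bytes+j stays
         within 0..15.  */
      ctrl[i * bytes + j]
        = (unsigned char) ((mask[i] & (w - 1)) * bytes + j);
}
```

For w = 8 the index i*(16/w)+j never exceeds 15, whereas the old i*w+j
reaches 7*8+1 = 57, which would overrun a 16-entry array such as the one
passed to gen_rtvec_v (16, vec).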