On 10/19/2011 12:25 PM, Jakub Jelinek wrote:
> 2011-10-19  Jakub Jelinek  <ja...@redhat.com>
>
> 	* config/i386/i386.c (expand_vec_perm_vpshufb2_vpermq_even_odd): Use
> 	d->op1 instead of d->op0 for the second vpshufb.
> 	(expand_vec_perm_even_odd_1): For V8SImode fix vpshufd immediates.
> 	(ix86_expand_vec_perm_const): If mask indicates two operands are
> 	needed, but both are the same and expanding them as d.op0 == d.op1
> 	failed, retry with d.op0 != d.op1.
> 	(ix86_expand_vec_perm_builtin): Likewise.  Handle sorry printing
> 	also for d.nelt == 32.
>
> 	* gcc.dg/torture/vshuf-32.inc: Add interleave permutations.
> 	* gcc.dg/torture/vshuf-16.inc: Likewise.
> 	* gcc.dg/torture/vshuf-8.inc: Likewise.
> 	* gcc.dg/torture/vshuf-4.inc: Likewise.

Ok.  Although I think a good followup would be to fix

> +  if (which == 3 && d.op0 == d.op1)
> +    {
> +      rtx seq;
> +      bool ok;
> +
> +      memcpy (d.perm, perm, sizeof (perm));
> +      d.op1 = gen_reg_rtx (d.vmode);
> +      start_sequence ();
> +      ok = ix86_expand_vec_perm_builtin_1 (&d);
> +      seq = get_insns ();
> +      end_sequence ();
> +      if (ok)
> +        {
> +          emit_move_insn (d.op1, d.op0);
> +          emit_insn (seq);

this so that we don't need a copy to another register.  That could be
done by adding a d.one_operand field, and using that test instead of
explicit equality everywhere.

r~
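
For illustration only, here is a rough, standalone sketch of the d.one_operand idea. Nothing in it is actual GCC code: struct perm_desc, perm_which, perm_desc_init and perm_desc_retry_two_operand are hypothetical names, the operands are plain ints standing in for rtx registers, and nelt is assumed to be a power of two with every index below 2 * nelt. The point is that the retry flips a flag rather than allocating a second register.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NELT 32

/* Hypothetical, simplified stand-in for the permutation descriptor.
   op0/op1 are plain ints here (think "register numbers"), not rtx.  */
struct perm_desc
{
  int op0, op1;
  unsigned char perm[MAX_NELT];
  unsigned nelt;
  bool one_operand;   /* the proposed field */
};

/* Bit 0 is set if some index selects from the first input,
   bit 1 if some index selects from the second input.  */
static unsigned
perm_which (const unsigned char *perm, unsigned nelt)
{
  unsigned which = 0;
  for (unsigned i = 0; i < nelt; i++)
    which |= (perm[i] < nelt) ? 1 : 2;
  return which;
}

/* Fill in D.  When only one input is really needed (or both inputs
   are the same), fold the indices into [0, nelt) and record that in
   one_operand instead of relying on op0 == op1 later on.  */
static void
perm_desc_init (struct perm_desc *d, int op0, int op1,
                const unsigned char *perm, unsigned nelt)
{
  assert (nelt > 0 && nelt <= MAX_NELT && (nelt & (nelt - 1)) == 0);

  d->op0 = op0;
  d->op1 = op1;
  d->nelt = nelt;
  memcpy (d->perm, perm, nelt);

  unsigned which = perm_which (perm, nelt);
  d->one_operand = (which != 3) || (op0 == op1);
  if (which == 2)
    d->op0 = op1;     /* only the second input is referenced */
  if (d->one_operand)
    for (unsigned i = 0; i < nelt; i++)
      d->perm[i] &= nelt - 1;
}

/* Retry as a two-operand permutation after the one-operand attempt
   failed.  Restores the unfolded indices and clears the flag; op1 is
   left alone, so no copy to a fresh register is needed.  */
static void
perm_desc_retry_two_operand (struct perm_desc *d, const unsigned char *perm)
{
  memcpy (d->perm, perm, d->nelt);
  d->one_operand = false;
}
```

The expanders would then test d->one_operand wherever they currently compare d->op0 against d->op1, so a two-operand expansion can still be tried when both inputs happen to live in the same register.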