On 06/09/2014 03:13 AM, Evgeny Stupachenko wrote:
> +  /* First we apply one operand permutation to the part where
> +     elements stay not in their respective lanes.  */
> +  dcopy = *d;
> +  if (which == 2)
> +    dcopy.op0 = dcopy.op1 = d->op1;
> +  else
> +    dcopy.op0 = dcopy.op1 = d->op0;
> +  dcopy.one_operand_p = true;
> +
> +  for (i = 0; i < nelt; ++i)
> +    {
> +      unsigned e = d->perm[i];
> +      if (which == 2)
> +       dcopy.perm[i] = ((e >= nelt) ? (e - nelt) : e);

This is wrong for which == 1.  Since nelt is a power of two and
e < 2 * nelt, both cases simplify to

  dcopy.perm[i] = e & (nelt - 1);
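
As a quick sanity check of that identity, here is a minimal standalone
sketch.  The helper names (fold_by_subtract, fold_by_mask, folds_agree)
are hypothetical and not part of the patch; it assumes nelt is a power
of two and every selector index satisfies e < 2 * nelt.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; none of these names
   appear in the patch under review.  */

/* Fold a two-operand selector index into a single operand by
   conditional subtraction, as the quoted which == 2 branch does.  */
static unsigned
fold_by_subtract (unsigned e, unsigned nelt)
{
  return e >= nelt ? e - nelt : e;
}

/* The same folding written as a mask.  Assumes NELT is a power of
   two and E < 2 * NELT.  */
static unsigned
fold_by_mask (unsigned e, unsigned nelt)
{
  return e & (nelt - 1);
}

/* Return 1 if the two forms agree for every index in [0, 2 * NELT),
   0 otherwise.  */
static int
folds_agree (unsigned nelt)
{
  unsigned e;

  for (e = 0; e < 2 * nelt; ++e)
    if (fold_by_subtract (e, nelt) != fold_by_mask (e, nelt))
      return 0;
  return 1;
}
```

The two forms agree for nelt of 4, 8, 16 and so on; for a count that is
not a power of two (3, say) they diverge, so the mask relies on that
assumption.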

> +
> +  for (i = 0; i < nelt; ++i)
> +    {
> +      unsigned e = d->perm[i];
> +      if (which == 2)
> +       dcopy1.perm[i] = ((e >= nelt) ? (nelt + i) : e);
> +      else
> +       dcopy1.perm[i] = ((e < nelt) ? i : e);
> +    }

This is known to be a blend, so you know the value of E: it is
either i or nelt + i.  Both branches simplify to

  dcopy1.perm[i] = (e >= nelt ? nelt + i : i);
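
To see why the two branches collapse, a small sketch under the blend
assumption (E at position i is either i or nelt + i).  The function
names are hypothetical, and treating "which" as either 2 or 1 is an
assumption taken from the quoted hunk, not from the full patch.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; none of these names
   appear in the patch under review.  */

/* Selector element as computed by the quoted hunk.  */
static unsigned
blend_sel_original (unsigned e, unsigned i, unsigned nelt, int which)
{
  if (which == 2)
    return e >= nelt ? nelt + i : e;
  else
    return e < nelt ? i : e;
}

/* Selector element as computed by the suggested simplification.  */
static unsigned
blend_sel_simplified (unsigned e, unsigned i, unsigned nelt)
{
  return e >= nelt ? nelt + i : i;
}

/* Return 1 if both forms agree for every blend selector of NELT
   elements, i.e. for E in { i, NELT + i } at each position i, and
   for WHICH in { 1, 2 }; return 0 otherwise.  */
static int
blend_forms_agree (unsigned nelt)
{
  unsigned i;
  int which;

  for (which = 1; which <= 2; ++which)
    for (i = 0; i < nelt; ++i)
      {
        unsigned lo = i;          /* element taken from op0 */
        unsigned hi = nelt + i;   /* element taken from op1 */

        if (blend_sel_original (lo, i, nelt, which)
            != blend_sel_simplified (lo, i, nelt))
          return 0;
        if (blend_sel_original (hi, i, nelt, which)
            != blend_sel_simplified (hi, i, nelt))
          return 0;
      }
  return 1;
}
```

Outside the blend assumption the forms differ (e.g. which == 2 with
e == 0 at i == 1), so the simplification is only valid because the
selector is already known to be a blend.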


r~
