https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94509

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
<ja...@gcc.gnu.org>:

https://gcc.gnu.org/g:14192f1ed48cb3982b1b3c794e0f313835d0cdcd

commit r9-8482-g14192f1ed48cb3982b1b3c794e0f313835d0cdcd
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Tue Apr 7 14:39:24 2020 +0200

    i386: Fix V{64QI,32HI}mode constant permutations [PR94509]

    The following testcases are miscompiled, because expand_vec_perm_pshufb
    incorrectly thinks it can use the vpshufb instruction for the
    permutations when it can't.
    The
              if (vmode == V32QImode)
                {
                  /* vpshufb only works intra lanes, it is not
                     possible to shuffle bytes in between the lanes.  */
                  for (i = 0; i < nelt; ++i)
                    if ((d->perm[i] ^ i) & (nelt / 2))
                      return false;
                }
    intra-lane check, which is correct, has been copied and adjusted for
    64-byte modes into:
              if (vmode == V64QImode)
                {
                  /* vpshufb only works intra lanes, it is not
                     possible to shuffle bytes in between the lanes.  */
                  for (i = 0; i < nelt; ++i)
                    if ((d->perm[i] ^ i) & (nelt / 4))
                      return false;
                }
    which is not correct, because 64-byte modes have 4 lanes rather than just
    two, and the above only tests that the permutation grabs even lane elts
    from even lanes and odd lane elts from odd lanes, but not that they are
    from the same 256-bit half.

    The following patch fixes it by using 3 * nelt / 4 instead of nelt / 4,
    so we actually check the most significant 2 bits rather than just one.

    2020-04-07  Jakub Jelinek  <ja...@redhat.com>

            PR target/94509
            * config/i386/i386-expand.c (expand_vec_perm_pshufb): Fix the check
            for inter-lane permutation for 64-byte modes.

            * gcc.target/i386/avx512bw-pr94509-1.c: New test.
            * gcc.target/i386/avx512bw-pr94509-2.c: New test.
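
The mask arithmetic described in the commit message can be illustrated with
a small standalone sketch. It is not GCC code: the function names
passes_lane_check, half_swap_passes and demo are hypothetical, and the
permutation is a plain byte array rather than GCC's expand_vec_perm_d
structure. It assumes 64 byte elements in four 16-byte lanes, so the lane
index sits in bits 4 and 5 of an element index. A permutation that swaps the
two 256-bit halves (perm[i] = i ^ 32) passes the nelt / 4 mask, which only
looks at bit 4, but is rejected by the 3 * nelt / 4 mask, which looks at
both bits.

```c
#include <assert.h>
#include <stdbool.h>

#define NELT 64 /* byte elements in a 64-byte vector (assumption for this sketch) */

/* Return true if every destination index i and its source index perm[i]
   agree on all bits selected by MASK, mirroring the shape of the loop
   quoted in the commit message.  */
bool passes_lane_check(const unsigned char *perm, unsigned nelt, unsigned mask)
{
    for (unsigned i = 0; i < nelt; ++i)
        if ((perm[i] ^ i) & mask)
            return false;
    return true;
}

/* Build a permutation that swaps the two 256-bit halves (flips bit 5 of
   each index) and report whether it passes the check with MASK.  */
bool half_swap_passes(unsigned mask)
{
    unsigned char perm[NELT];
    for (unsigned i = 0; i < NELT; ++i)
        perm[i] = (unsigned char)(i ^ 32);
    return passes_lane_check(perm, NELT, mask);
}

/* Hypothetical self-check of the two masks discussed above.  */
void demo(void)
{
    /* nelt / 4 == 16 tests only bit 4, so the cross-half swap slips through.  */
    assert(half_swap_passes(NELT / 4));
    /* 3 * nelt / 4 == 48 tests bits 4 and 5, so the swap is rejected.  */
    assert(!half_swap_passes(3 * NELT / 4));
}
```

Calling demo() from a test harness exercises both masks; the element count
and the choice of a half-swap permutation are only for illustration.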
