On Thu, Feb 13, 2014 at 11:44 AM, Kirill Yukhin <kirill.yuk...@gmail.com> wrote:

> I've noticed that the _mm512_permutexvar_epi[64|32] intrinsics
> have the wrong argument order. As per [1], the first argument is the index.
> For vpermps/vpermpd the intrinsics are fine, but I've changed the tests
> to call CALC with the same argument order as the intrinsic. There is the same
> problem (wrong argument order) with vrcp14s[d|s].
> Also, the avx512er-vrcp28ss-2.c test called the wrong intrinsic.
>
> [1]  http://software.intel.com/sites/landingpage/IntrinsicsGuide/
>
> gcc/
>         * config/i386/avx512fintrin.h (_mm512_maskz_permutexvar_epi64): Swap
>         arguments order in builtin.
>         (_mm512_permutexvar_epi64): Ditto.
>         (_mm512_mask_permutexvar_epi64): Ditto.
>         (_mm512_maskz_permutexvar_epi32): Ditto.
>         (_mm512_permutexvar_epi32): Ditto.
>         (_mm512_mask_permutexvar_epi32): Ditto.
>         * config/i386/sse.md (srcp14<mode>): Swap operands.
>
> gcc/testsuite/
>         * gcc.target/i386/avx512er-vrcp28ss-2.c: Call right intrinsic.
>         * gcc.target/i386/avx512f-vpermd-2.c: Fix reference calculations.
>         * gcc.target/i386/avx512f-vpermpd-2.c: Ditto.
>         * gcc.target/i386/avx512f-vpermps-2.c: Ditto.
>         * gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.
>         * gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
>         * gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index a04b289..d3b2dc5 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -1456,12 +1456,12 @@
>    [(set (match_operand:VF_128 0 "register_operand" "=v")
>         (vec_merge:VF_128
>           (unspec:VF_128
> -           [(match_operand:VF_128 1 "nonimmediate_operand" "vm")]
> +           [(match_operand:VF_128 2 "nonimmediate_operand" "vm")]
>             UNSPEC_RCP14)
> -         (match_operand:VF_128 2 "register_operand" "v")
> +         (match_operand:VF_128 1 "register_operand" "v")
>           (const_int 1)))]
>    "TARGET_AVX512F"
> -  "vrcp14<ssescalarmodesuffix>\t{%1, %2, %0|%0, %2, %1}"
> +  "vrcp14<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"

Please don't change the srcp pattern; it should be defined similarly to
vrcpss (aka sse_vmrcpv4sf). You need to switch the operand order
elsewhere.

Other than that, the patch is OK.
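
For reference, the argument order described for
_mm512_permutexvar_epi32 (index first, data second) can be modelled in
plain C roughly as follows. This is only a sketch: model_permutexvar_epi32
is a hypothetical helper name, not something from the patch or the
testsuite, and plain arrays stand in for __m512i.

```c
#include <stdint.h>

/* Scalar model of _mm512_permutexvar_epi32 (idx, a), for illustration
   only.  The index vector is the FIRST argument; each result element i
   is the element of a selected by the low 4 bits of idx[i].  */
static void
model_permutexvar_epi32 (const uint32_t idx[16], const uint32_t a[16],
                         uint32_t dst[16])
{
  for (int i = 0; i < 16; i++)
    dst[i] = a[idx[i] & 15];   /* Only bits 3:0 of the index are used.  */
}
```

With idx[i] = 15 - i the model reverses the data vector; swapping the two
arguments instead permutes the index values themselves, which is the kind
of mismatch a reference calculation with the wrong order would hide.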

Uros.
