On Fri, Jun 17, 2011 at 3:01 PM, Jakub Jelinek <ja...@redhat.com> wrote:

>
> > Not here, those are handled by  ix86_expand_args_builtin
> > instead of ix86_expand_multi_arg_builtin.  Furthermore, only
> > CODE_FOR_vcvtps2ph and CODE_FOR_vcvtps2ph256 have CONST_INT argument.
> > And I believe ix86_expand_args_builtin handles it fine, what's wrong
> > is the actual predicates those insns use.
>
> Ok, had a deeper look into this and it seems there are other issues,
> some of them even without test coverage regressed since 4.6.
> Some problems result in ICEs, other fail to assemble.  Had to revert
> the blendbits removal patch, because that removal results in out of
> range immediates not to be reported as predicate failures, but instead
> as ICEs.
>
> So here is an updated patch that adds test coverage.  Regtested
> on x86_64-linux {-m32,-m64}, ok for trunk (and backport for 4.6)?
>
> There are still a couple of things I'm unsure about (not tested
> by the testcases, compile fine):
> #include <x86intrin.h>
> __m128i i1, i2, i3, i4;
> __m128 a1, a2, a3, a4;
> __m128d d1, d2, d3, d4;
> __m256i l1, l2, l3, l4;
> __m256 b1, b2, b3, b4;
> __m256d e1, e2, e3, e4;
> __m64 m1, m2, m3, m4;
> int k1, k2, k3, k4;
> float f1, f2, f3, f4;
> void
> foo (void)
> {
>  /* 8 bit imm only?  This compiles fine, but one ends up with
>     number modulo 256 in the insn.  To make it error out
>     const_0_to_255_operand would need to be used.  */
>  e1 = _mm256_shuffle_pd (e2, e3, 256);
>  b1 = _mm256_shuffle_ps (b2, b3, 256);
>  i1 = _mm_shuffle_epi32 (i2, 256);
>  i1 = _mm_shufflehi_epi16 (i2, 256);
>  i1 = _mm_shufflelo_epi16 (i2, 256);
>  d1 = _mm_shuffle_pd (d2, d3, 256);
>  m1 = _mm_shuffle_pi16 (m2, 256);
>  a1 = _mm_shuffle_ps (a2, a3, 256);

These actually take a macro function for the shuffle. But I think that we
should use const_0_to_255 here, since this is the range that the assembler
recognizes.
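
To make the difference concrete, here is a small portable sketch (plain C,
not GCC internals). The helper names imm8_in_range, imm8_encoded and
SHUFFLE_SELECTOR are made up for illustration; the macro copies the bit
layout of _MM_SHUFFLE from xmmintrin.h, on the assumption that this is the
kind of macro meant above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for what a 0..255 predicate would enforce:
   accept a constant only if it fits an unsigned 8-bit immediate.  */
static inline bool
imm8_in_range (long value)
{
  return value >= 0 && value <= 255;
}

/* What ends up in the insn when nothing rejects the value:
   only the low 8 bits survive, i.e. the number modulo 256.  */
static inline uint8_t
imm8_encoded (long value)
{
  return (uint8_t) (value & 0xff);
}

/* Same bit layout as _MM_SHUFFLE: four 2-bit element selectors,
   so the result can never exceed 255 for selectors in 0..3.  */
#define SHUFFLE_SELECTOR(z, y, x, w) \
  (((z) << 6) | ((y) << 4) | ((x) << 2) | (w))
```

With these helpers, 256 fails the range check while silently encoding as 0,
and any selector built from values 0..3 stays within the accepted range.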

>  /* What about these?  Similarly to the above, they result
>     in imm modulo 16 resp. imm modulo 4.  */
>  e1 = _mm256_permute_pd (e2, 16);
>  d1 = _mm_permute_pd (d2, 4);
> }
>

Also const_0_to_255 here; the immediate is specified as 8 bits wide
at [1].

[1] http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/cpp/lin/intref_cls/common/intref_avx_permute_pd.htm
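
A rough sketch of the "imm modulo 16 resp. imm modulo 4" behaviour noted in
the quoted testcase, in plain C. The function permute_pd_effective and its
control_bits parameter are hypothetical names used only for illustration;
the widths (4 control bits for the 256-bit form, 2 for the 128-bit form)
are taken from the moduli mentioned above.

```c
/* Hypothetical model: only the low CONTROL_BITS bits of the immediate
   influence the permute, so larger values wrap around.  */
static inline unsigned
permute_pd_effective (long imm, unsigned control_bits)
{
  return (unsigned) (imm & ((1L << control_bits) - 1));
}
```

Under this model both calls from the testcase, 16 with 4 control bits and
4 with 2 control bits, behave like an immediate of 0, even though both
values are valid 8-bit immediates.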

>
> 2011-06-17  Jakub Jelinek  <ja...@redhat.com>
>
>        PR target/49411
>        * config/i386/i386.c (ix86_expand_multi_arg_builtins): If
>        last_arg_constant and last argument doesn't match its predicate,
>        for xop_vpermil2<mode>3 error out and for xop_rotl<mode>3
>        if it is CONST_INT, mask it, otherwise expand using rotl<mode>3.
>        (ix86_expand_sse_pcmpestr, ix86_expand_sse_pcmpistr): Fix
>        spelling of error message.
>        * config/i386/sse.md (sse4a_extrqi, sse4a_insertqi,
>        vcvtps2ph, *vcvtps2ph, *vcvtps2ph_store, vcvtps2ph256): Use
>        const_0_to_255_operand instead of const_int_operand.
>
>        Revert:
>        2011-05-09  Uros Bizjak  <ubiz...@gmail.com>
>
>        * config/i386/sse.md (blendbits): Remove mode attribute.
>        (<sse4_1>_blend<ssemodesuffix><avxsizesuffix>): Use const_int_operand
>        instead of const_0_to_<blendbits>_operand for operand 3 predicate.
>        Check integer value of operand 3 in insn constraint.
>
>        * gcc.target/i386/testimm-1.c: New test.
>        * gcc.target/i386/testimm-2.c: New test.
>        * gcc.target/i386/testimm-3.c: New test.
>        * gcc.target/i386/testimm-4.c: New test.
>        * gcc.target/i386/testimm-5.c: New test.
>        * gcc.target/i386/testimm-6.c: New test.
>        * gcc.target/i386/testimm-7.c: New test.
>        * gcc.target/i386/testimm-8.c: New test.
>        * gcc.target/i386/xop-vpermil2px-2.c: New test.
>        * gcc.target/i386/xop-rotate1-int.c: New test.
>        * gcc.target/i386/xop-rotate2-int.c: New test.
>

This is OK for 4.6 and mainline.

Thanks,
Uros.
