On Fri, Jun 17, 2011 at 3:01 PM, Jakub Jelinek <ja...@redhat.com> wrote:
>
> > Not here, those are handled by ix86_expand_args_builtin
> > instead of ix86_expand_multi_arg_builtin. Furthermore, only
> > CODE_FOR_vcvtps2ph and CODE_FOR_vcvtps2ph256 have CONST_INT argument.
> > And I believe ix86_expand_args_builtin handles it fine, what's wrong
> > is the actual predicates those insns use.
>
> Ok, had a deeper look into this and it seems there are other issues,
> some of them even without test coverage regressed since 4.6.
> Some problems result in ICEs, other fail to assemble. Had to revert
> the blendbits removal patch, because that removal results in out of
> range immediates not to be reported as predicate failures, but instead
> as ICEs.
>
> So here is an updated patch that adds test coverage. Regtested
> on x86_64-linux {-m32,-m64}, ok for trunk (and backport for 4.6)?
>
> There are still a couple of things I'm unsure about (not tested
> by the testcases, compile fine):
> #include <x86intrin.h>
> __m128i i1, i2, i3, i4;
> __m128 a1, a2, a3, a4;
> __m128d d1, d2, d3, d4;
> __m256i l1, l2, l3, l4;
> __m256 b1, b2, b3, b4;
> __m256d e1, e2, e3, e4;
> __m64 m1, m2, m3, m4;
> int k1, k2, k3, k4;
> float f1, f2, f3, f4;
> void
> foo (void)
> {
>   /* 8 bit imm only? This compiles fine, but one ends up with
>      number modulo 256 in the insn. To make it error out
>      const_0_to_255_operand would need to be used. */
>   e1 = _mm256_shuffle_pd (e2, e3, 256);
>   b1 = _mm256_shuffle_ps (b2, b3, 256);
>   i1 = _mm_shuffle_epi32 (i2, 256);
>   i1 = _mm_shufflehi_epi16 (i2, 256);
>   i1 = _mm_shufflelo_epi16 (i2, 256);
>   d1 = _mm_shuffle_pd (d2, d3, 256);
>   m1 = _mm_shuffle_pi16 (m2, 256);
>   a1 = _mm_shuffle_ps (a2, a3, 256);

These actually take macro function for shuffle. But I think that we
should use const_0_to_255 here, since this is the range that assembler
recognizes.

>   /* What about these? Similarly to the above, they result
>      in imm modulo 16 resp. imm modulo 4. */
>   e1 = _mm256_permute_pd (e2, 16);
>   d1 = _mm_permute_pd (d2, 4);
> }
>

Also const_0_to_255 here, the width of the immediate is specified as
8-bit immediate at [1].

[1] http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/cpp/lin/intref_cls/common/intref_avx_permute_pd.htm

>
> 2011-06-17 Jakub Jelinek <ja...@redhat.com>
>
>        PR target/49411
>        * config/i386/i386.c (ix86_expand_multi_arg_builtins): If
>        last_arg_constant and last argument doesn't match its predicate,
>        for xop_vpermil2<mode>3 error out and for xop_rotl<mode>3
>        if it is CONST_INT, mask it, otherwise expand using rotl<mode>3.
>        (ix86_expand_sse_pcmpestr, ix86_expand_sse_pcmpistr): Fix
>        spelling of error message.
>        * config/i386/sse.md (sse4a_extrqi, sse4a_insertqi,
>        vcvtps2ph, *vcvtps2ph, *vcvtps2ph_store, vcvtps2ph256): Use
>        const_0_to_255_operand instead of const_int_operand.
>
>        Revert:
>        2011-05-09 Uros Bizjak <ubiz...@gmail.com>
>
>        * config/i386/sse.md (blendbits): Remove mode attribute.
>        (<sse4_1>_blend<ssemodesuffix><avxsizesuffix>): Use const_int_operand
>        instead of const_0_to_<blendbits>_operand for operand 3 predicate.
>        Check integer value of operand 3 in insn constraint.
>
>        * gcc.target/i386/testimm-1.c: New test.
>        * gcc.target/i386/testimm-2.c: New test.
>        * gcc.target/i386/testimm-3.c: New test.
>        * gcc.target/i386/testimm-4.c: New test.
>        * gcc.target/i386/testimm-5.c: New test.
>        * gcc.target/i386/testimm-6.c: New test.
>        * gcc.target/i386/testimm-7.c: New test.
>        * gcc.target/i386/testimm-8.c: New test.
>        * gcc.target/i386/xop-vpermil2px-2.c: New test.
>        * gcc.target/i386/xop-rotate1-int.c: New test.
>        * gcc.target/i386/xop-rotate2-int.c: New test.
>

This is OK for 4.6 and mainline.

Thanks,
Uros.
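
For illustration only, the difference between rejecting an out-of-range
immediate and silently encoding it modulo 256 can be sketched in plain C.
The helper names below (imm8_in_range, imm8_encoded, rotl32_masked) are
hypothetical and are not part of GCC; the 32-bit element width in the rotate
helper is likewise an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: true when VALUE fits an unsigned
   8-bit immediate, i.e. the range that a predicate in the style of
   const_0_to_255_operand is meant to accept.  */
bool
imm8_in_range (long value)
{
  return value >= 0 && value <= 255;
}

/* What is left of VALUE when nothing rejects it first: only the low
   8 bits survive, so the encoded immediate is VALUE modulo 256.  */
uint8_t
imm8_encoded (long value)
{
  return (uint8_t) (value & 0xff);
}

/* Rotate left with the count masked to the element width.  This mirrors
   the "if it is CONST_INT, mask it" idea from the ChangeLog; the 32-bit
   width is an assumption for this sketch.  */
uint32_t
rotl32_masked (uint32_t x, unsigned count)
{
  count &= 31;
  return count ? (x << count) | (x >> (32 - count)) : x;
}
```

With these helpers, imm8_in_range (256) is false while imm8_encoded (256)
is 0, which is the silent wraparound described for the shuffle intrinsics
above.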