On Fri, Feb 7, 2014 at 11:13 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
>> This (should be) the last patch for AVX-512 support in v4.9.
>>
>> It improves correspondence between ICC, SDM [1], and official
>> intrinsics guide [2].
>>
>> What was done:
>>   - Fixed shifts such as VPSLLD and friends. Actual instruction
>>     loads 128 bits of count and uses only 64 bits (see [1]). So,
>>     we're changing V4S* -> V2D* in built-ins.
>>
>>   - Rename (back) _mm512_[load|store]u_epi32 to
>>     _mm512_[load|store]u_si512 according to [2].
>>
>>   - Remove floor and ceil with zero-masking and/or rounding.
>>
>>   - Remove _mm512_expand_p[s|d] as it is absent in [2].
>>
>>   - Make scatter prefetch take 1,2,5,6 immediates. 1 and 5 mean
>>     L1, 2 and 6 mean L2.
>>
>>   - Make gather prefetch take 1 and 2 meaning corresponding cache
>>     levels.
>>
>>   - Fix all tests accordingly.
>>
>> gcc/
>>         * config/i386/avx512fintrin.h (_mm512_storeu_epi64): Removed.
>>         (_mm512_loadu_epi32): Renamed into...
>>         (_mm512_loadu_si512): This.
>>         (_mm512_storeu_epi32): Renamed into...
>>         (_mm512_storeu_si512): This.
>>         (_mm512_maskz_ceil_ps): Removed.
>>         (_mm512_maskz_ceil_pd): Ditto.
>>         (_mm512_maskz_floor_ps): Ditto.
>>         (_mm512_maskz_floor_pd): Ditto.
>>         (_mm512_floor_round_ps): Ditto.
>>         (_mm512_floor_round_pd): Ditto.
>>         (_mm512_ceil_round_ps): Ditto.
>>         (_mm512_ceil_round_pd): Ditto.
>>         (_mm512_mask_floor_round_ps): Ditto.
>>         (_mm512_mask_floor_round_pd): Ditto.
>>         (_mm512_mask_ceil_round_ps): Ditto.
>>         (_mm512_mask_ceil_round_pd): Ditto.
>>         (_mm512_maskz_floor_round_ps): Ditto.
>>         (_mm512_maskz_floor_round_pd): Ditto.
>>         (_mm512_maskz_ceil_round_ps): Ditto.
>>         (_mm512_maskz_ceil_round_pd): Ditto.
>>         (_mm512_expand_pd): Ditto.
>>         (_mm512_expand_ps): Ditto.
>>         (_mm512_sll_epi32): Updated parameter type.
>>         (_mm512_mask_sll_epi32): Ditto.
>>         (_mm512_maskz_sll_epi32): Ditto.
>>         (_mm512_srl_epi32): Ditto.
>>         (_mm512_mask_srl_epi32): Ditto.
>>         (_mm512_maskz_srl_epi32): Ditto.
>>         (_mm512_sra_epi32): Ditto.
>>         (_mm512_mask_sra_epi32): Ditto.
>>         (_mm512_maskz_sra_epi32): Ditto.
>>         * config/i386/i386-builtin-type.def
>>         (V16SI_FTYPE_V16SI_V4SI_V16SI_HI): Change into...
>>         (V16SI_FTYPE_V16SI_V2DI_V16SI_HI): This.
>>         * config/i386/i386.c (ix86_builtins): Remove
>>         IX86_BUILTIN_EXPANDPD512_NOMASK, IX86_BUILTIN_EXPANDPS512_NOMASK.
>>         (bdesc_args): Ditto.
>>         * config/i386/predicates.md (const1256_operand): New.
>>         (const_1_to_2_operand): Ditto.
>>         * config/i386/sse.md (avx512pf_gatherpf<mode>sf): Change hint value.
>>         (*avx512pf_gatherpf<mode>sf_mask): Ditto.
>>         (*avx512pf_gatherpf<mode>sf): Ditto.
>>         (avx512pf_gatherpf<mode>df): Ditto.
>>         (*avx512pf_gatherpf<mode>df_mask): Ditto.
>>         (*avx512pf_gatherpf<mode>df): Ditto.
>>         (avx512pf_scatterpf<mode>sf): Ditto.
>>         (*avx512pf_scatterpf<mode>sf_mask): Ditto.
>>         (*avx512pf_scatterpf<mode>sf): Ditto.
>>         (avx512pf_scatterpf<mode>df): Ditto.
>>         (*avx512pf_scatterpf<mode>df_mask): Ditto.
>>         (*avx512pf_scatterpf<mode>df): Ditto.
>>         (avx512f_expand<mode>): Removed.
>>         (<shift_insn><mode>3<mask_name>): Change parameter type.
>>
>> gcc/testsuite/
>>         * gcc.target/i386/avx512f-vexpandpd-1.c: Update intrinsics.
>>         * gcc.target/i386/avx512f-vexpandps-1.c: Ditto.
>>         * gcc.target/i386/avx512f-vexpandpd-2.c: Ditto.
>>         * gcc.target/i386/avx512f-vexpandps-2.c: Ditto.
>>         * gcc.target/i386/avx512f-vmovdqu32-1: Ditto.
>>         * gcc.target/i386/avx512f-vmovdqu32-2: Ditto.
>>         * gcc.target/i386/avx512f-vmovdqu64-1: Ditto.
>>         * gcc.target/i386/avx512f-vmovdqu64-2: Ditto.
>>         * gcc.target/i386/avx512f-vpcmpd-2.c: Ditto.
>>         * gcc.target/i386/avx512f-vpcmpq-2.c: Ditto.
>>         * gcc.target/i386/avx512f-vpcmupd-2.c: Ditto.
>>         * gcc.target/i386/avx512f-vpcmupq-2.c: Ditto.
>>         * gcc.target/i386/avx512f-vrndscalepd-1.c: Ditto.
>>         * gcc.target/i386/avx512f-vrndscaleps-1.c: Ditto.
>>         * gcc.target/i386/avx512f-vrndscalepd-2.c: Ditto.
>>         * gcc.target/i386/avx512f-vrndscaleps-2.c: Ditto.
>>         * gcc.target/i386/avx512pf-vgatherpf0dpd-1.c: Update parameters.
>>         * gcc.target/i386/avx512pf-vgatherpf0dps-1.c: Ditto.
>>         * gcc.target/i386/avx512pf-vgatherpf0qpd-1.c: Ditto.
>>         * gcc.target/i386/avx512pf-vgatherpf0qps-1.c: Ditto.
>>         * gcc.target/i386/avx512pf-vgatherpf1dpd-1.c: Ditto.
>>         * gcc.target/i386/avx512pf-vgatherpf1dps-1.c: Ditto.
>>         * gcc.target/i386/avx512pf-vgatherpf1qpd-1.c: Ditto.
>>         * gcc.target/i386/avx512pf-vgatherpf1qps-1.c: Ditto.
>>         * gcc.target/i386/avx512f-vpsrad-2.c: Ditto.
>>         * gcc.target/i386/avx512f-vpslld-2.c: Ditto.
>>         * gcc.target/i386/avx512f-vpsrld-2.c: Ditto.
>>
>> Is it ok for trunk?
>
> OK. The changes look trivial enough even at this stage.

Actually, please leave out the shift changes. AVX2 shifts are also
marked as V4SI/SI, so we should rethink whether the change is really
needed.

>
>> @@ -8220,7 +8219,7 @@
>>    [(set (match_operand:VI48_512 0 "register_operand" "=v,v")
>>     (any_lshift:VI48_512
>>       (match_operand:VI48_512 1 "register_operand" "v,m")
>
> Please change the above operand to nonimmediate_operand.

The change above should be committed anyway.

Uros.
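
For reference while rethinking the count type, here is a minimal scalar sketch of the behaviour the patch description cites from the SDM: the count operand is 128 bits wide, but only its low 64 bits are consulted, and a count above 31 clears a 32-bit lane. This is an illustrative model in portable C, not code from the patch; the names count128, model_sll_lane32 and model_sll_self_check are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 128-bit count operand (xmm/m128). */
typedef struct {
    uint64_t lo;  /* bits 63:0, the only part the shift consults */
    uint64_t hi;  /* bits 127:64, loaded but ignored */
} count128;

/* Scalar model of one 32-bit lane of a logical left shift by a
   vector count.  A count above 31 clears the lane. */
static uint32_t model_sll_lane32(uint32_t x, count128 count)
{
    return count.lo > 31 ? 0u : x << count.lo;
}

/* Small self-check of the model (not part of the patch). */
static void model_sll_self_check(void)
{
    count128 plain   = { 4, 0 };
    count128 junk_hi = { 4, UINT64_MAX };           /* high half differs */
    count128 v4si_11 = { (1ULL << 32) | 1ULL, 0 };  /* V4SI view {1,1,0,0} */

    assert(model_sll_lane32(1u, plain) == 16u);
    assert(model_sll_lane32(1u, junk_hi) == 16u);
    assert(model_sll_lane32(1u, v4si_11) == 0u);
}
```

The last case is the one relevant to the V4SI versus V2DI question: a count built as four 32-bit elements {1,1,0,0} is read as a single 64-bit value larger than 31, so the lane is cleared rather than shifted by one.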