On Fri, Feb 7, 2014 at 10:49 AM, Kirill Yukhin <kirill.yuk...@gmail.com> wrote:
> This (should be) the last patch for AVX-512 support in v4.9.
>
> It improves correspondence between ICC, SDM [1], and official
> intrinsics guide [2].
>
> What was done:
>   - Fixed shifts such as VPSLLD and friends. Actual instruction
>     loads 128-bit of count and uses only 64-bit (see [1]). So,
>     we're changing V4S* -> V2D* in built-ins.
>
>   - Rename (back) _mm512_[load|store]u_epi32 to
>     _mm512_[load|store]u_si512 according to [2].
>
>   - Remove floor and ceil with zero-masking and/or rounding.
>
>   - Remove _mm512_expand_p[s|d] as it is absent in [2].
>
>   - Make scatter prefetch take 1,2,5,6 immediates. 1 and 5 means
>     L1, 2 and 6 means L2.
>
>   - Make gather prefetch take 1 and 2 meaning corresponding cache
>     levels.
>
>   - Fix all tests accordingly.
>
> gcc/
>     * config/i386/avx512fintrin.h (_mm512_storeu_epi64): Removed.
>     (_mm512_loadu_epi32): Renamed into...
>     (_mm512_loadu_si512): This.
>     (_mm512_storeu_epi32): Renamed into...
>     (_mm512_storeu_si512): This.
>     (_mm512_maskz_ceil_ps): Removed.
>     (_mm512_maskz_ceil_pd): Ditto.
>     (_mm512_maskz_floor_ps): Ditto.
>     (_mm512_maskz_floor_pd): Ditto.
>     (_mm512_floor_round_ps): Ditto.
>     (_mm512_floor_round_pd): Ditto.
>     (_mm512_ceil_round_ps): Ditto.
>     (_mm512_ceil_round_pd): Ditto.
>     (_mm512_mask_floor_round_ps): Ditto.
>     (_mm512_mask_floor_round_pd): Ditto.
>     (_mm512_mask_ceil_round_ps): Ditto.
>     (_mm512_mask_ceil_round_pd): Ditto.
>     (_mm512_maskz_floor_round_ps): Ditto.
>     (_mm512_maskz_floor_round_pd): Ditto.
>     (_mm512_maskz_ceil_round_ps): Ditto.
>     (_mm512_maskz_ceil_round_pd): Ditto.
>     (_mm512_expand_pd): Ditto.
>     (_mm512_expand_ps): Ditto.
>     (_mm512_sll_epi32): Updated parameter type.
>     (_mm512_mask_sll_epi32): Ditto.
>     (_mm512_maskz_sll_epi32): Ditto.
>     (_mm512_srl_epi32): Ditto.
>     (_mm512_mask_srl_epi32): Ditto.
>     (_mm512_maskz_srl_epi32): Ditto.
>     (_mm512_sra_epi32): Ditto.
>     (_mm512_mask_sra_epi32): Ditto.
>     (_mm512_maskz_sra_epi32): Ditto.
>     * config/i386/i386-builtin-type.def (V16SI_FTYPE_V16SI_V4SI_V16SI_HI):
>     Change into...
>     (V16SI_FTYPE_V16SI_V2DI_V16SI_HI): This.
>     * config/i386/i386.c (ix86_builtins): Remove
>     IX86_BUILTIN_EXPANDPD512_NOMASK, IX86_BUILTIN_EXPANDPS512_NOMASK.
>     (bdesc_args): Ditto.
>     * config/i386/predicates.md (const1256_operand): New.
>     (const_1_to_2_operand): Ditto.
>     * config/i386/sse.md (avx512pf_gatherpf<mode>sf): Change hint value.
>     (*avx512pf_gatherpf<mode>sf_mask): Ditto.
>     (*avx512pf_gatherpf<mode>sf): Ditto.
>     (avx512pf_gatherpf<mode>df): Ditto.
>     (*avx512pf_gatherpf<mode>df_mask): Ditto.
>     (*avx512pf_gatherpf<mode>df): Ditto.
>     (avx512pf_scatterpf<mode>sf): Ditto.
>     (*avx512pf_scatterpf<mode>sf_mask): Ditto.
>     (*avx512pf_scatterpf<mode>sf): Ditto.
>     (avx512pf_scatterpf<mode>df): Ditto.
>     (*avx512pf_scatterpf<mode>df_mask): Ditto.
>     (*avx512pf_scatterpf<mode>df): Ditto.
>     (avx512f_expand<mode>): Removed.
>     (<shift_insn><mode>3<mask_name>): Change parameter type.
>
> gcc/testsuite/
>     * gcc.target/i386/avx512f-vexpandpd-1.c: Update intrinsics.
>     * gcc.target/i386/avx512f-vexpandps-1.c: Ditto.
>     * gcc.target/i386/avx512f-vexpandpd-2.c: Ditto.
>     * gcc.target/i386/avx512f-vexpandps-2.c: Ditto.
>     * gcc.target/i386/avx512f-vmovdqu32-1: Ditto.
>     * gcc.target/i386/avx512f-vmovdqu32-2: Ditto.
>     * gcc.target/i386/avx512f-vmovdqu64-1: Ditto.
>     * gcc.target/i386/avx512f-vmovdqu64-2: Ditto.
>     * gcc.target/i386/avx512f-vpcmpd-2.c: Ditto.
>     * gcc.target/i386/avx512f-vpcmpq-2.c: Ditto.
>     * gcc.target/i386/avx512f-vpcmupd-2.c: Ditto.
>     * gcc.target/i386/avx512f-vpcmupq-2.c: Ditto.
>     * gcc.target/i386/avx512f-vrndscalepd-1.c: Ditto.
>     * gcc.target/i386/avx512f-vrndscaleps-1.c: Ditto.
>     * gcc.target/i386/avx512f-vrndscalepd-2.c: Ditto.
>     * gcc.target/i386/avx512f-vrndscaleps-2.c: Ditto.
>     * gcc.target/i386/avx512pf-vgatherpf0dpd-1.c: Update parameters.
>     * gcc.target/i386/avx512pf-vgatherpf0dps-1.c: Ditto.
>     * gcc.target/i386/avx512pf-vgatherpf0qpd-1.c: Ditto.
>     * gcc.target/i386/avx512pf-vgatherpf0qps-1.c: Ditto.
>     * gcc.target/i386/avx512pf-vgatherpf1dpd-1.c: Ditto.
>     * gcc.target/i386/avx512pf-vgatherpf1dps-1.c: Ditto.
>     * gcc.target/i386/avx512pf-vgatherpf1qpd-1.c: Ditto.
>     * gcc.target/i386/avx512pf-vgatherpf1qps-1.c: Ditto.
>     * gcc.target/i386/avx512f-vpsrad-2.c: Ditto.
>     * gcc.target/i386/avx512f-vpslld-2.c: Ditto.
>     * gcc.target/i386/avx512f-vpsrld-2.c: Ditto.
>
> Is it ok for trunk?

OK. The changes look trivial enough even at this stage.

> @@ -8220,7 +8219,7 @@
>    [(set (match_operand:VI48_512 0 "register_operand" "=v,v")
>         (any_lshift:VI48_512
>           (match_operand:VI48_512 1 "register_operand" "v,m")

Please change the above operand (operand 1) to nonimmediate_operand,
since its second alternative accepts memory ("m").

> -         (match_operand:SI 2 "nonmemory_operand" "vN,N")))]
> +         (match_operand:DI 2 "nonmemory_operand" "vN,N")))]
>    "TARGET_AVX512F && <mask_mode512bit_condition>"
>    "vp<vshift><ssemodesuffix>\t{%2, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2}"
>    [(set_attr "isa" "avx512f")

Uros.
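
For readers following the V4S* -> V2D* change: the scalar model below is a rough sketch of why the count operand is naturally described as two 64-bit lanes. It is not GCC or intrinsic code. The names count128 and model_sll_epi32 are hypothetical, made up for illustration, and the model assumes the usual VPSLLD behaviour that a count above 31 zeroes every element.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 128-bit count operand: two 64-bit
   lanes, mirroring the V2DI built-in type.  Only lane 0 is used.  */
typedef struct { uint64_t lane[2]; } count128;

/* Scalar model of a 32-bit logical left shift by a vector count
   (VPSLLD-style).  Every element is shifted by the same amount,
   taken from the low 64 bits of the count; a count above 31 is
   assumed to yield zero in every element.  */
static void
model_sll_epi32 (uint32_t *dst, const uint32_t *src, int n, count128 count)
{
  uint64_t c = count.lane[0];	/* The upper 64 bits are ignored.  */

  for (int i = 0; i < n; i++)
    dst[i] = (c > 31) ? 0u : (src[i] << c);
}
```

Calling model_sll_epi32 with count.lane[0] == 4 shifts every element left by four bits whatever count.lane[1] holds, which is the property the built-in type change reflects.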