On Mon, May 25, 2020 at 8:41 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > On Mon, May 25, 2020 at 2:21 PM Hongtao Liu <crazy...@gmail.com> wrote: > > > > According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit > > memory_operand instead of 128-bit one which exists in current > > implementation. Also for other vpmov instructions which have > > memory_operand narrower than 128bits. > > > > Bootstrap is ok, regression test for i386/x86-64 backend is ok. > > > + [(set (match_operand:HI 0 "memory_operand" "=m") > + (subreg:HI (any_truncate:V2QI > + (match_operand:V2DI 1 "register_operand" "v")) 0))] > > This should store in V2QImode, subregs are not allowed in insn patterns. > > You need a pre-reload splitter to split from register_operand to a > memory_operand, Jakub fixed a bunch of pmov patterns a while ago, so > perhaps he can give some additional advice here. >
Like this? --- (define_insn "*avx512vl_<code>v2div2qi2_store" [(set (match_operand:V2QI 0 "memory_operand" "=m") (any_truncate:V2QI (match_operand:V2DI 1 "register_operand" "v")))] "TARGET_AVX512VL" "vpmov<trunsuffix>qb\t{%1, %0|%0, %1}" [(set_attr "type" "ssemov") (set_attr "memory" "store") (set_attr "prefix" "evex") (set_attr "mode" "TI")]) (define_insn_and_split "*avx512vl_<code>v2div2qi2_store" [(set (match_operand:HI 0 "memory_operand") (subreg:HI (any_truncate:V2QI (match_operand:V2DI 1 "register_operand")) 0))] "TARGET_AVX512VL && ix86_pre_reload_split ()" "#" "&& 1" [(set (match_dup 0) (any_truncate:V2QI (match_dup 1)))] "operands[0] = adjust_address_nv (operands[0], V2QImode, 0);") --- > Uros. > > > > gcc/ChangeLog > > > > * config/i386/sse.md (*avx512vl_<code>v2div2qi2_store): Refine > > size of memory_operand according to Intel SDM. > > (avx512vl_<code>v2div2qi2_mask_store): Ditto. > > (*avx512vl_<code><mode>v4qi2_store): Ditto. > > (avx512vl_<code><mode>v4qi2_mask_store): Ditto. > > (*avx512vl_<code><mode>v8qi2_store): Ditto. > > (avx512vl_<code><mode>v8qi2_mask_store): Ditto. > > (*avx512vl_<code><mode>v4hi2_store): Ditto. > > (avx512vl_<code><mode>v4hi2_mask_store): Ditto. > > (*avx512vl_<code>v2div2hi2_store): Ditto. > > (avx512vl_<code>v2div2hi2_mask_store): Ditto. > > (*avx512vl_<code>v2div2si2_store): Ditto. > > (avx512vl_<code>v2div2si2_mask_store): Ditto. > > (*avx512f_<code>v8div16qi2_store): Ditto. > > (avx512f_<code>v8div16qi2_mask_store): Ditto. > > * config/i386/i386-builtin-types.def: Adjust builtin type. > > * config/i386/i386-expand.c: Ditto. > > * config/i386/i386-builtin.def: Adjust builtin. > > * config/i386/avx512fintrin.h: Ditto. > > * config/i386/avx512vlbwintrin.h: Ditto. > > * config/i386/avx512vlintrin.h: Ditto. > > > > I think the code i changed is already covered by existed intrinsics > > tests, so i didn't add any new tests. > > -- > > BR, > > Hongtao -- BR, Hongtao