On Tue, Apr 17, 2018 at 6:53 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> Hi!
>
> With the PR85430 fix, I used some scripting to look for insns with
> n_operands 2 and types like alu, sselog etc. (those that have 1 suffixed
> variants).
>
> The vec_extract_* patterns are one big category that have been detected by
> this; some of the patterns used sselog1 type, but others used sselog, which
> is IMHO incorrect for n_operands 2 insns.  Furthermore, the
> define_insn_and_split I've changed recently that originally had "#"
> unconditionally lacked any attrs, which are IMHO needed now that they can
> emit in some cases insn directly rather than being split first.
>
> The vec_extract_hi_* patterns usually allow memory in the output operand and
> never allow it in the input operand; for sselog1 the default memory attr
> implementation returns:
>          (and (eq_attr "type" "alu1,negnot,ishift1,rotate1,sselog1,sseshuf1")
>               (match_operand 1 "memory_operand"))
>            (const_string "both")
>          (and (match_operand 0 "memory_operand")
>               (match_operand 1 "memory_operand"))
>            (const_string "both")
>          (match_operand 0 "memory_operand")
>            (const_string "store")
>       (const_string "none")
> The 2 MEMs never happen for any vec_extract_*_, and for vec_extract_hi_*
> op1 is never mem either, so it is always none or store, exactly what we
> want.  So, we can even simplify some patterns that used separate
> alternatives for the explicit memory attribute; the implicit one can handle
> it fine.
> On the other side, vec_extract_lo_* patterns allow (=v,v), (=v,m) and (=m,v)
> alternatives in some cases, and the memory attr used to be wrong for the
> (=v,m) cases, for which IMHO we want load rather than both.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2018-04-17  Jakub Jelinek  <ja...@redhat.com>
>
>         * config/i386/sse.md (vec_extract_lo_<mode><mask_name>): Add
>         (=v, v) alternative and explicit "memory" attribute.
>         (vec_extract_lo_<mode><mask_name>): Likewise.  Also add
>         "type", "prefix", "prefix_extra", "length_immediate" and "mode"
>         attributes.
>         (vec_extract_lo_<mode><mask_name>): Add (=v, v) alternative and use
>         "sselog1" type instead of "sselog".
>         (vec_extract_hi_<mode><mask_name>): Use "sselog1" type instead of
>         "sselog".  Remove explicit "memory" attribute.
>         (vec_extract_lo_v32hi): Add (=v, v) alternative and explicit "memory",
>         "type", "prefix", "prefix_extra", "length_immediate" and "mode"
>         attributes.
>         (vec_extract_hi_v32hi): Merge all alternatives into one, use
>         "sselog1" type instead of "sselog".  Remove explicit "memory"
>         attribute.
>         (vec_extract_hi_v16hi): Merge each pair of alternatives into one,
>         use "sselog1" type instead of "sselog".  Remove explicit "memory"
>         attribute.
>         (vec_extract_lo_v64qi): Add (=v, v) alternative and explicit "memory",
>         "type", "prefix", "prefix_extra", "length_immediate" and "mode"
>         attributes.
>         (vec_extract_hi_v64qi): Merge all alternatives into one, use
>         "sselog1" type instead of "sselog".  Remove explicit "memory"
>         attribute.
>         (vec_extract_hi_v32qi): Merge each pair of alternatives into one,
>         use "sselog1" type instead of "sselog".  Remove explicit "memory"
>         attribute.

OK for mainline and backports.

Thanks,
Uros.

> --- gcc/config/i386/sse.md.jj   2018-04-17 09:07:32.974146472 +0200
> +++ gcc/config/i386/sse.md      2018-04-17 14:40:15.571698749 +0200
> @@ -7495,9 +7495,9 @@ (define_insn "vec_extract_lo_<mode>_mask
>     (set_attr "mode" "<sseinsnmode>")])
>
>  (define_insn "vec_extract_lo_<mode><mask_name>"
> -  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>" 
> "=<store_mask_constraint>,v")
> +  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>" 
> "=v,<store_mask_constraint>,v")
>         (vec_select:<ssehalfvecmode>
> -         (match_operand:V8FI 1 "<store_mask_predicate>" 
> "v,<store_mask_constraint>")
> +         (match_operand:V8FI 1 "<store_mask_predicate>" 
> "v,v,<store_mask_constraint>")
>           (parallel [(const_int 0) (const_int 1)
>              (const_int 2) (const_int 3)])))]
>    "TARGET_AVX512F
> @@ -7511,6 +7511,7 @@ (define_insn "vec_extract_lo_<mode><mask
>    [(set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
>     (set_attr "length_immediate" "1")
> +   (set_attr "memory" "none,store,load")
>     (set_attr "prefix" "evex")
>     (set_attr "mode" "<sseinsnmode>")])
>
> @@ -7651,10 +7652,10 @@ (define_expand "avx_vextractf128<mode>"
>  })
>
>  (define_insn "vec_extract_lo_<mode><mask_name>"
> -  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,m")
> +  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=v,v,m")
>         (vec_select:<ssehalfvecmode>
>           (match_operand:V16FI 1 "<store_mask_predicate>"
> -                                "<store_mask_constraint>,v")
> +                                "v,<store_mask_constraint>,v")
>           (parallel [(const_int 0) (const_int 1)
>                       (const_int 2) (const_int 3)
>                       (const_int 4) (const_int 5)
> @@ -7670,7 +7671,13 @@ (define_insn "vec_extract_lo_<mode><mask
>      return "vextract<shuffletype>32x8\t{$0x0, %1, 
> %0<mask_operand2>|%0<mask_operand2>, %1, 0x0}";
>    else
>      return "#";
> -})
> +}
> +  [(set_attr "type" "sselog1")
> +   (set_attr "prefix_extra" "1")
> +   (set_attr "length_immediate" "1")
> +   (set_attr "memory" "none,load,store")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "<sseinsnmode>")])
>
>  (define_split
>    [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand")
> @@ -7697,10 +7704,10 @@ (define_split
>  })
>
>  (define_insn "vec_extract_lo_<mode><mask_name>"
> -  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>" "=v,m")
> +  [(set (match_operand:<ssehalfvecmode> 0 "<store_mask_predicate>" "=v,v,m")
>         (vec_select:<ssehalfvecmode>
>           (match_operand:VI8F_256 1 "<store_mask_predicate>"
> -                                   "<store_mask_constraint>,v")
> +                                   "v,<store_mask_constraint>,v")
>           (parallel [(const_int 0) (const_int 1)])))]
>    "TARGET_AVX
>     && <mask_avx512vl_condition> && <mask_avx512dq_condition>
> @@ -7711,10 +7718,10 @@ (define_insn "vec_extract_lo_<mode><mask
>    else
>      return "#";
>  }
> -   [(set_attr "type" "sselog")
> +   [(set_attr "type" "sselog1")
>      (set_attr "prefix_extra" "1")
>      (set_attr "length_immediate" "1")
> -    (set_attr "memory" "none,store")
> +    (set_attr "memory" "none,load,store")
>      (set_attr "prefix" "evex")
>      (set_attr "mode" "XI")])
>
> @@ -7745,10 +7752,9 @@ (define_insn "vec_extract_hi_<mode><mask
>    else
>      return "vextract<i128>\t{$0x1, %1, %0|%0, %1, 0x1}";
>  }
> -  [(set_attr "type" "sselog")
> +  [(set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
>     (set_attr "length_immediate" "1")
> -   (set_attr "memory" "none,store")
>     (set_attr "prefix" "vex")
>     (set_attr "mode" "<sseinsnmode>")])
>
> @@ -7854,9 +7860,9 @@ (define_insn "vec_extract_hi_<mode>"
>     (set_attr "mode" "<sseinsnmode>")])
>
>  (define_insn_and_split "vec_extract_lo_v32hi"
> -  [(set (match_operand:V16HI 0 "nonimmediate_operand" "=v,m")
> +  [(set (match_operand:V16HI 0 "nonimmediate_operand" "=v,v,m")
>         (vec_select:V16HI
> -         (match_operand:V32HI 1 "nonimmediate_operand" "vm,v")
> +         (match_operand:V32HI 1 "nonimmediate_operand" "v,m,v")
>           (parallel [(const_int 0) (const_int 1)
>                      (const_int 2) (const_int 3)
>                      (const_int 4) (const_int 5)
> @@ -7886,12 +7892,18 @@ (define_insn_and_split "vec_extract_lo_v
>      operands[0] = lowpart_subreg (V32HImode, operands[0], V16HImode);
>    else
>      operands[1] = gen_lowpart (V16HImode, operands[1]);
> -})
> +}
> +  [(set_attr "type" "sselog1")
> +   (set_attr "prefix_extra" "1")
> +   (set_attr "length_immediate" "1")
> +   (set_attr "memory" "none,load,store")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "XI")])
>
>  (define_insn "vec_extract_hi_v32hi"
> -  [(set (match_operand:V16HI 0 "nonimmediate_operand" "=v,m")
> +  [(set (match_operand:V16HI 0 "nonimmediate_operand" "=vm")
>         (vec_select:V16HI
> -         (match_operand:V32HI 1 "register_operand" "v,v")
> +         (match_operand:V32HI 1 "register_operand" "v")
>           (parallel [(const_int 16) (const_int 17)
>                      (const_int 18) (const_int 19)
>                      (const_int 20) (const_int 21)
> @@ -7902,10 +7914,9 @@ (define_insn "vec_extract_hi_v32hi"
>                      (const_int 30) (const_int 31)])))]
>    "TARGET_AVX512F"
>    "vextracti64x4\t{$0x1, %1, %0|%0, %1, 0x1}"
> -  [(set_attr "type" "sselog")
> +  [(set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
>     (set_attr "length_immediate" "1")
> -   (set_attr "memory" "none,store")
>     (set_attr "prefix" "evex")
>     (set_attr "mode" "XI")])
>
> @@ -7924,9 +7935,9 @@ (define_insn_and_split "vec_extract_lo_v
>    "operands[1] = gen_lowpart (V8HImode, operands[1]);")
>
>  (define_insn "vec_extract_hi_v16hi"
> -  [(set (match_operand:V8HI 0 "nonimmediate_operand" "=x,m,v,m,v,m")
> +  [(set (match_operand:V8HI 0 "nonimmediate_operand" "=xm,vm,vm")
>         (vec_select:V8HI
> -         (match_operand:V16HI 1 "register_operand" "x,x,v,v,v,v")
> +         (match_operand:V16HI 1 "register_operand" "x,v,v")
>           (parallel [(const_int 8) (const_int 9)
>                      (const_int 10) (const_int 11)
>                      (const_int 12) (const_int 13)
> @@ -7934,23 +7945,19 @@ (define_insn "vec_extract_hi_v16hi"
>    "TARGET_AVX"
>    "@
>     vextract%~128\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextract%~128\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}
>     vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextracti32x4\t{$0x1, %g1, %0|%0, %g1, 0x1}
>     vextracti32x4\t{$0x1, %g1, %0|%0, %g1, 0x1}"
> -  [(set_attr "type" "sselog")
> +  [(set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
>     (set_attr "length_immediate" "1")
> -   (set_attr "isa" "*,*,avx512dq,avx512dq,avx512f,avx512f")
> -   (set_attr "memory" "none,store,none,store,none,store")
> -   (set_attr "prefix" "vex,vex,evex,evex,evex,evex")
> +   (set_attr "isa" "*,avx512dq,avx512f")
> +   (set_attr "prefix" "vex,evex,evex")
>     (set_attr "mode" "OI")])
>
>  (define_insn_and_split "vec_extract_lo_v64qi"
> -  [(set (match_operand:V32QI 0 "nonimmediate_operand" "=v,m")
> +  [(set (match_operand:V32QI 0 "nonimmediate_operand" "=v,v,m")
>         (vec_select:V32QI
> -         (match_operand:V64QI 1 "nonimmediate_operand" "vm,v")
> +         (match_operand:V64QI 1 "nonimmediate_operand" "v,m,v")
>           (parallel [(const_int 0) (const_int 1)
>                      (const_int 2) (const_int 3)
>                      (const_int 4) (const_int 5)
> @@ -7988,12 +7995,18 @@ (define_insn_and_split "vec_extract_lo_v
>      operands[0] = lowpart_subreg (V64QImode, operands[0], V32QImode);
>    else
>      operands[1] = gen_lowpart (V32QImode, operands[1]);
> -})
> +}
> +  [(set_attr "type" "sselog1")
> +   (set_attr "prefix_extra" "1")
> +   (set_attr "length_immediate" "1")
> +   (set_attr "memory" "none,load,store")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "XI")])
>
>  (define_insn "vec_extract_hi_v64qi"
> -  [(set (match_operand:V32QI 0 "nonimmediate_operand" "=v,m")
> +  [(set (match_operand:V32QI 0 "nonimmediate_operand" "=vm")
>         (vec_select:V32QI
> -         (match_operand:V64QI 1 "register_operand" "v,v")
> +         (match_operand:V64QI 1 "register_operand" "v")
>           (parallel [(const_int 32) (const_int 33)
>                      (const_int 34) (const_int 35)
>                      (const_int 36) (const_int 37)
> @@ -8012,10 +8025,9 @@ (define_insn "vec_extract_hi_v64qi"
>                      (const_int 62) (const_int 63)])))]
>    "TARGET_AVX512F"
>    "vextracti64x4\t{$0x1, %1, %0|%0, %1, 0x1}"
> -  [(set_attr "type" "sselog")
> +  [(set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
>     (set_attr "length_immediate" "1")
> -   (set_attr "memory" "none,store")
>     (set_attr "prefix" "evex")
>     (set_attr "mode" "XI")])
>
> @@ -8038,9 +8050,9 @@ (define_insn_and_split "vec_extract_lo_v
>    "operands[1] = gen_lowpart (V16QImode, operands[1]);")
>
>  (define_insn "vec_extract_hi_v32qi"
> -  [(set (match_operand:V16QI 0 "nonimmediate_operand" "=x,m,v,m,v,m")
> +  [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xm,vm,vm")
>         (vec_select:V16QI
> -         (match_operand:V32QI 1 "register_operand" "x,x,v,v,v,v")
> +         (match_operand:V32QI 1 "register_operand" "x,v,v")
>           (parallel [(const_int 16) (const_int 17)
>                      (const_int 18) (const_int 19)
>                      (const_int 20) (const_int 21)
> @@ -8052,17 +8064,13 @@ (define_insn "vec_extract_hi_v32qi"
>    "TARGET_AVX"
>    "@
>     vextract%~128\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextract%~128\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}
>     vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextracti32x4\t{$0x1, %g1, %0|%0, %g1, 0x1}
>     vextracti32x4\t{$0x1, %g1, %0|%0, %g1, 0x1}"
> -  [(set_attr "type" "sselog")
> +  [(set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
>     (set_attr "length_immediate" "1")
> -   (set_attr "isa" "*,*,avx512dq,avx512dq,avx512f,avx512f")
> -   (set_attr "memory" "none,store,none,store,none,store")
> -   (set_attr "prefix" "vex,vex,evex,evex,evex,evex")
> +   (set_attr "isa" "*,avx512dq,avx512f")
> +   (set_attr "prefix" "vex,evex,evex")
>     (set_attr "mode" "OI")])
>
>  ;; Modes handled by vec_extract patterns.
>
>         Jakub

Reply via email to