Hi Carl,
on 2023/5/3 23:30, Carl Love via Gcc-patches wrote:
> GCC maintainers:
>
> The following patch cleans up the definition for the
> __builtin_altivec_vcmpnet. The current implementation implies that the
> builtin is only supported on Power 9 since it is defined under the
> Power 9 stanza. However the builtin has no ISA restrictions as stated
> in the Power Vector Intrinsic Programming Reference document. The
> current built-in works because the builtin gets replaced during GIMPLE
> folding by a simple not-equal operator so it doesn't get expanded and
> checked for Power 9 code generation.
>
> This patch moves the definition to the Altivec stanza in the builtin
> definition file. The builtin then generates code for Power 8 and
> earlier processors or Power 9 and later processors.
>
> The patch has been tested on Power 8 and Power 9 with no regressions.
>
> Please let me know if the patch is acceptable for mainline. Thanks.
>
> Carl
>
> -----------------------------------------
> rs6000: vec_cmpne confusing implementation
A better subject might be "Make __builtin_altivec_vcmpne{b,h,w,t}
supported under altivec", to better match what this patch actually does?
>
> __builtin_altivec_vcmpnet does not have any ISA restrictions. The current
The Power Vector Intrinsic Programming Reference doesn't define the
functions __builtin_altivec_vcmpne*, only vec_cmpne, which has no ISA
restriction.
So I think what you propose is actually to extend the current built-in
functions __builtin_altivec_vcmpne{b,h,w,t} so that they are no longer
restricted to Power9 and later.
This patch needs some test cases for the corresponding changes, but I'm
curious why people would want to use these built-ins instead of just using
vec_cmpne?  The latter looks more general and better.
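For the test coverage, just as an illustration (the dg- directives and
options below are only my sketch, not something taken from this patch), a
compile test along these lines could exercise both the built-in and the
generic intrinsic:

/* Sketch only, not part of the posted patch.  */
/* { dg-do compile } */
/* { dg-require-effective-target powerpc_altivec_ok } */
/* { dg-options "-maltivec -O2" } */

#include <altivec.h>

vector signed char
test_builtin_vcmpneb (vector signed char a, vector signed char b)
{
  /* The built-in under discussion; with the patch it should also be
     accepted when only Altivec is enabled.  */
  return __builtin_altivec_vcmpneb (a, b);
}

vector bool char
test_vec_cmpne (vector signed char a, vector signed char b)
{
  /* The generic PVIPR intrinsic, folded to a != comparison during
     GIMPLE folding.  */
  return vec_cmpne (a, b);
}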
> built-in definitions for vcmpneb, vcmpneh, vcmpnew, vcmpnet are defined
> under the Power 9 section. This implies they are only supported on Power
> 9 and above when in fact they are defined and work on Power 8.
>
> This patch moves the definitions to the Altivec stanza and maps the
> builtin dispatches to different code generation for Power 8 and earlier
> or for Power 9 and later.
>
> gcc/ChangeLog:
>
> * config/rs6000/rs6000-builtins.def (vcmpneb, vcmpneh, vcmpnew.
> vcmpnet): Move definitions to Altivec stanza.
> * config/rs6000/vsx.md (cmpneb, vcmpneh, vcmpnew, vcmpnet): New
> define_expand.
> (cmpneb, vcmpneh, vcmpnew, vcmpnet): Rename define_insn.
>
> Patch has been tested on Power 8 and Power 9 with no regressions.
> ---
> gcc/config/rs6000/rs6000-builtins.def | 24 ++++++------
> gcc/config/rs6000/vsx.md | 54 +++++++++++++++++++++++++--
> 2 files changed, 63 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-builtins.def
> b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..adb4122be29 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -641,6 +641,18 @@
> const int __builtin_altivec_vcmpgtuw_p (int, vsi, vsi);
> VCMPGTUW_P vector_gtu_v4si_p {pred}
>
> + const vsc __builtin_altivec_vcmpneb (vsc, vsc);
> + VCMPNEB vcmpneb {}
> +
> + const vss __builtin_altivec_vcmpneh (vss, vss);
> + VCMPNEH vcmpneh {}
> +
> + const vsi __builtin_altivec_vcmpnew (vsi, vsi);
> + VCMPNEW vcmpnew {}
> +
The expanders for these three can just expand to separate eq and not
operations; later, when the vcmpne{b,h,w} instructions are supported
(Power9 and up), the two can be combined further (see the sketch further
below).
> + const vbq __builtin_altivec_vcmpnet (vsq, vsq);
> + VCMPNET vcmpnet {}
The current define_expand for vcmpnet is guarded by TARGET_POWER10; once
you have test cases covering these changes, you will find that the current
expander isn't enough for what you want.
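Just to illustrate (again only a sketch from me, not from the patch): with
the built-in moved to the Altivec stanza, something like the below built at
-mdejagnu-cpu=power9 would trip over that, since the built-in would now be
accepted but its expander still requires TARGET_POWER10:

/* Sketch only, not part of the posted patch.  */
/* { dg-do compile } */
/* { dg-options "-mdejagnu-cpu=power9 -O2" } */

vector bool __int128
test_builtin_vcmpnet (vector signed __int128 a, vector signed __int128 b)
{
  return __builtin_altivec_vcmpnet (a, b);
}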
> +
> const vsi __builtin_altivec_vctsxs (vf, const int<5>);
> VCTSXS altivec_vctsxs {}
>
> @@ -2599,9 +2611,6 @@
> const signed int __builtin_altivec_vcmpaew_p (vsi, vsi);
> VCMPAEW_P vector_ae_v4si_p {}
>
> - const vsc __builtin_altivec_vcmpneb (vsc, vsc);
> - VCMPNEB vcmpneb {}
> -
> const signed int __builtin_altivec_vcmpneb_p (vsc, vsc);
> VCMPNEB_P vector_ne_v16qi_p {}
>
> @@ -2614,15 +2623,9 @@
> const signed int __builtin_altivec_vcmpnefp_p (vf, vf);
> VCMPNEFP_P vector_ne_v4sf_p {}
>
> - const vss __builtin_altivec_vcmpneh (vss, vss);
> - VCMPNEH vcmpneh {}
> -
> const signed int __builtin_altivec_vcmpneh_p (vss, vss);
> VCMPNEH_P vector_ne_v8hi_p {}
>
> - const vsi __builtin_altivec_vcmpnew (vsi, vsi);
> - VCMPNEW vcmpnew {}
> -
> const signed int __builtin_altivec_vcmpnew_p (vsi, vsi);
> VCMPNEW_P vector_ne_v4si_p {}
>
> @@ -3203,9 +3206,6 @@
> const signed int __builtin_altivec_vcmpgtut_p (signed int, vuq, vuq);
> VCMPGTUT_P vector_gtu_v1ti_p {pred}
>
> - const vbq __builtin_altivec_vcmpnet (vsq, vsq);
> - VCMPNET vcmpnet {}
> -
> const signed int __builtin_altivec_vcmpnet_p (vsq, vsq);
> VCMPNET_P vector_ne_v1ti_p {}
>
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 7d845df5c2d..3f05e3e6d00 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -5652,8 +5652,56 @@
> DONE;
> })
>
> +;; Expand for builtin vcmpneb
> +(define_expand "vcmpneb"
> + [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
> + (not:V16QI
> + (eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
> + (match_operand:V16QI 2 "altivec_register_operand" "v"))))]
> + ""
> + {
> + if (TARGET_P9_VECTOR)
> + emit_insn (gen_vcmpneb_p9 (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vcmpequb_p (operands[0], operands[1],
> + operands[2]));
> + DONE;
> + })
> +
It's better to have an explicit TARGET_ALTIVEC condition to match the
stanza.
As mentioned above, you can just expand it with eq and not, with something
like:

(define_expand "vcmpneb"
  [(set (match_dup 3)
        (eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
                  (match_operand:V16QI 2 "altivec_register_operand" "v")))
   (set (match_operand:V16QI 0 "altivec_register_operand" "=v")
        (not:V16QI (match_dup 3)))]
  "TARGET_ALTIVEC"
{
  operands[3] = gen_reg_rtx (V16QImode);
})

And move these expanders to altivec.md since they are not VSX specific.
> +;; Expand for builtin vcmpneh
> +(define_expand "vcmpneh"
> + [(set (match_operand:V8HI 0 "altivec_register_operand" "=v")
> + (not:V8HI
> + (eq:V8HI (match_operand:V8HI 1 "altivec_register_operand" "v")
> + (match_operand:V8HI 2 "altivec_register_operand" "v"))))]
> + ""
> + {
> + if (TARGET_P9_VECTOR)
> + emit_insn (gen_vcmpneh_p9 (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vcmpequh_p (operands[0], operands[1],
> + operands[2]));
> + DONE;
> + })
> +
Ditto.
> +;; Expand for builtin vcmpnew
> +(define_expand "vcmpnew"
> + [(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
> + (not:V4SI
> + (eq:V4SI (match_operand:V4SI 1 "altivec_register_operand" "v")
> + (match_operand:V4SI 2 "altivec_register_operand" "v"))))]
> + ""
> + {
> + if (TARGET_P9_VECTOR)
> + emit_insn (gen_vcmpnew_p9 (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vcmpequw_p (operands[0], operands[1],
> + operands[2]));
> + DONE;
> + })
Ditto.  Furthermore, these three expanders can be merged into one with a
mode iterator.
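Something like the below (an untested sketch from me, reusing the existing
VI mode iterator and VI_char mode attribute in altivec.md) should cover all
three at once:

(define_expand "vcmpne<VI_char>"
  [(set (match_dup 3)
        (eq:VI (match_operand:VI 1 "altivec_register_operand")
               (match_operand:VI 2 "altivec_register_operand")))
   (set (match_operand:VI 0 "altivec_register_operand")
        (not:VI (match_dup 3)))]
  "TARGET_ALTIVEC"
{
  operands[3] = gen_reg_rtx (<MODE>mode);
})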
> +
> ;; Vector Compare Not Equal Byte (specified/not+eq:)
> -(define_insn "vcmpneb"
> +(define_insn "vcmpneb_p9"
> [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
> (not:V16QI
> (eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
> @@ -5703,7 +5751,7 @@
> [(set_attr "type" "vecsimple")])
>
> ;; Vector Compare Not Equal Half Word (specified/not+eq:)
> -(define_insn "vcmpneh"
> +(define_insn "vcmpneh_p9"
> [(set (match_operand:V8HI 0 "altivec_register_operand" "=v")
> (not:V8HI
> (eq:V8HI (match_operand:V8HI 1 "altivec_register_operand" "v")
> @@ -5723,7 +5771,7 @@
> [(set_attr "type" "vecsimple")])
>
> ;; Vector Compare Not Equal Word (specified/not+eq:)
> -(define_insn "vcmpnew"
> +(define_insn "vcmpnew_p9"
> [(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
> (not:V4SI
> (eq:V4SI (match_operand:V4SI 1 "altivec_register_operand" "v")
These are not needed with the above changes.
BR,
Kewen