Re: [PATCH ver 3] rs6000, Add new overloaded vector shift builtin int128, variants

Carl Love Mon, 05 Aug 2024 13:35:55 -0700

Kewen:

On 8/4/24 11:13 PM, Kewen.Lin wrote:

Hi Carl,


on 2024/8/2 03:35, Carl Love wrote:

GCC developers:

Version 3, updated the testcase dg-do link to dg-do compile.  Moved the new 
documentation again.  Retested on Power 10 LE and BE to verify the dg arguments 
disable the test on Power10BE but enable the test for Power10LE.  Reran the 
full regression testsuite.   There were no new regressions for the testsuite.

Version 2, updated rs6000-overload.def to remove adding additonal internal 
names and to change XXSLDWI_Q to XXSLDWI_1TI per comments from Kewen.  Move new 
documentation statement for the PIVPR built-ins per comments from Kewen.  
Updated dg-do-run directive and added comment about the save-temps  in testcase 
per feedback from Segher.  Retested the patch on Power 10 with no regressions.

The following patch adds the int128 varients to the existing overloaded 
built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo, vec_srdb, vec_srl, 
vec_sro.  These varients were requested by Steve Munroe.

The patch has been tested on a Power 10 system with no regressions.

OK with the below nits tweaked and tested well on both BE and LE, thanks!


Fixed the various nits, see specifics below.

Rebased on current mainline, rebuilt and retested on LE and BE with noregression failures.


Will commit in a day or so assuming I don't receive any additional comments.

Please let me know if the patch is acceptable for mainline.

                                    Carl

------------------------------------------------------------------------------
rs6000, Add new overloaded vector shift builtin int128 variants

Add the signed __int128 and unsigned __int128 argument types for the
overloaded built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
vec_srdb, vec_srl, vec_sro.  For each of the new argument types add a
testcase and update the documentation for the built-in.

gcc/ChangeLog:
     * config/rs6000/altivec.md (vs<SLDB_lr>db_<mode>): Change
     define_insn iterator to VEC_IC.
     * config/rs6000/rs6000-builtins.def (__builtin_altivec_vsldoi_v1ti,
     __builtin_vsx_xxsldwi_v1ti, __builtin_altivec_vsldb_v1ti,
     __builtin_altivec_vsrdb_v1ti): New builtin definitions.
     * config/rs6000/rs6000-overload.def (vec_sld, vec_sldb, vec_sldw,
     vec_sll, vec_slo, vec_srdb, vec_srl, vec_sro): New overloaded
     definitions.
     * doc/extend.texi (vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
     vec_srdb, vec_srl, vec_sro): Add documentation for new overloaded
     built-ins.

gcc/testsuite/ChangeLog:
     * gcc.target/powerpc/vec-shift-double-runnable-int128.c: New test file.
---
  gcc/config/rs6000/altivec.md                  |   6 +-
  gcc/config/rs6000/rs6000-builtins.def         |  12 +
  gcc/config/rs6000/rs6000-overload.def         |  40 ++
  gcc/doc/extend.texi                           |  43 ++
  .../vec-shift-double-runnable-int128.c        | 419 ++++++++++++++++++
  5 files changed, 517 insertions(+), 3 deletions(-)
  create mode 100644 
gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 5af9bf920a2..2a18ee44526 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -878,9 +878,9 @@ (define_int_attr SLDB_lr [(UNSPEC_SLDB "l")
  (define_int_iterator VSHIFT_DBL_LR [UNSPEC_SLDB UNSPEC_SRDB])

  (define_insn "vs<SLDB_lr>db_<mode>"
- [(set (match_operand:VI2 0 "register_operand" "=v")
-  (unspec:VI2 [(match_operand:VI2 1 "register_operand" "v")
-           (match_operand:VI2 2 "register_operand" "v")
+ [(set (match_operand:VEC_IC 0 "register_operand" "=v")
+  (unspec:VEC_IC [(match_operand:VEC_IC 1 "register_operand" "v")
+           (match_operand:VEC_IC 2 "register_operand" "v")
             (match_operand:QI 3 "const_0_to_12_operand" "n")]
            VSHIFT_DBL_LR))]
    "TARGET_POWER10"
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 77eb0f7e406..a2b2b729270 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -964,6 +964,9 @@
    const vss __builtin_altivec_vsldoi_8hi (vss, vss, const int<4>);
      VSLDOI_8HI altivec_vsldoi_v8hi {}

+  const vsq __builtin_altivec_vsldoi_v1ti (vsq, vsq, const int<4>);
+    VSLDOI_V1TI altivec_vsldoi_v1ti {}

Nit: s/VSLDOI_V1TI/VSLDOI_1TI/ to align with the other VSLDOI_* (no 'V' in *).


Changed to:

  const vsq __builtin_altivec_vsldoi_v1ti (vsq, vsq, const int<4>);
    VSLDOI_1TI altivec_vsldoi_v1ti {}

+
    const vss __builtin_altivec_vslh (vss, vus);
      VSLH vashlv8hi3 {}

@@ -1831,6 +1834,9 @@
    const vsll __builtin_vsx_xxsldwi_2di (vsll, vsll, const int<2>);
      XXSLDWI_2DI vsx_xxsldwi_v2di {}

+  const vsq __builtin_vsx_xxsldwi_v1ti (vsq, vsq, const int<2>);
+    XXSLDWI_1TI vsx_xxsldwi_v1ti {}
+
    const vf __builtin_vsx_xxsldwi_4sf (vf, vf, const int<2>);
      XXSLDWI_4SF vsx_xxsldwi_v4sf {}

@@ -3299,6 +3305,9 @@
    const vss __builtin_altivec_vsldb_v8hi (vss, vss, const int<3>);
      VSLDB_V8HI vsldb_v8hi {}

+  const vsq __builtin_altivec_vsldb_v1ti (vsq, vsq, const int<3>);
+    VSLDB_V1TI vsldb_v1ti {}
+
    const vsq __builtin_altivec_vslq (vsq, vuq);
      VSLQ vashlv1ti3 {}

@@ -3317,6 +3326,9 @@
    const vss __builtin_altivec_vsrdb_v8hi (vss, vss, const int<3>);
      VSRDB_V8HI vsrdb_v8hi {}

+  const vsq __builtin_altivec_vsrdb_v1ti (vsq, vsq, const int<3>);
+    VSRDB_V1TI vsrdb_v1ti {}
+
    const vsq __builtin_altivec_vsrq (vsq, vuq);
      VSRQ vlshrv1ti3 {}

diff --git a/gcc/config/rs6000/rs6000-overload.def 
b/gcc/config/rs6000/rs6000-overload.def
index b5fd9d0e38f..b1e038488e2 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3399,6 +3399,10 @@
      VSLDOI_4SF
    vd __builtin_vec_sld (vd, vd, const int);
      VSLDOI_2DF
+  vsq __builtin_vec_sld (vsq, vsq, const int);
+    VSLDOI_V1TI  VSLDOI_VSQ
+  vuq __builtin_vec_sld (vuq, vuq, const int);
+    VSLDOI_V1TI  VSLDOI_VUQ

Nit: s/VSLDOI_V1TI/VSLDOI_1TI/ as the above change.


Changed the above two entries to:

  vsq __builtin_vec_sld (vsq, vsq, const int);
    VSLDOI_1TI  VSLDOI_VSQ
  vuq __builtin_vec_sld (vuq, vuq, const int);
    VSLDOI_1TI  VSLDOI_VUQ

  [VEC_SLDB, vec_sldb, __builtin_vec_sldb]
    vsc __builtin_vec_sldb (vsc, vsc, const int);
@@ -3417,6 +3421,10 @@
      VSLDB_V2DI  VSLDB_VSLL
    vull __builtin_vec_sldb (vull, vull, const int);
      VSLDB_V2DI  VSLDB_VULL
+  vsq __builtin_vec_sldb (vsq, vsq, const int);
+    VSLDB_V1TI  VSLDB_VSQ
+  vuq __builtin_vec_sldb (vuq, vuq, const int);
+    VSLDB_V1TI  VSLDB_VUQ

  [VEC_SLDW, vec_sldw, __builtin_vec_sldw]
    vsc __builtin_vec_sldw (vsc, vsc, const int);
@@ -3439,6 +3447,10 @@
      XXSLDWI_4SF  XXSLDWI_VF
    vd __builtin_vec_sldw (vd, vd, const int);
      XXSLDWI_2DF  XXSLDWI_VD
+  vsq __builtin_vec_sldw (vsq, vsq, const int);
+    XXSLDWI_1TI  XXSLDWI_VSQ
+  vuq __builtin_vec_sldw (vuq, vuq, const int);
+    XXSLDWI_1TI  XXSLDWI_VUQ

  [VEC_SLL, vec_sll, __builtin_vec_sll]
    vsc __builtin_vec_sll (vsc, vuc);
@@ -3459,6 +3471,10 @@
      VSL  VSL_VSLL
    vull __builtin_vec_sll (vull, vuc);
      VSL  VSL_VULL
+  vsq __builtin_vec_sll (vsq, vuc);
+    VSL  VSL_VSQ
+  vuq __builtin_vec_sll (vuq, vuc);
+    VSL  VSL_VUQ
  ; The following variants are deprecated.
    vsc __builtin_vec_sll (vsc, vus);
      VSL  VSL_VSC_VUS
@@ -3554,6 +3570,14 @@
      VSLO  VSLO_VFS
    vf __builtin_vec_slo (vf, vuc);
      VSLO  VSLO_VFU
+  vsq __builtin_vec_slo (vsq, vsc);
+    VSLO  VSLDO_VSQS
+  vsq __builtin_vec_slo (vsq, vuc);
+    VSLO  VSLDO_VSQU
+  vuq __builtin_vec_slo (vuq, vsc);
+    VSLO  VSLDO_VUQS
+  vuq __builtin_vec_slo (vuq, vuc);
+    VSLO  VSLDO_VUQU

  [VEC_SLV, vec_slv, __builtin_vec_vslv]
    vuc __builtin_vec_vslv (vuc, vuc);
@@ -3699,6 +3723,10 @@
      VSRDB_V2DI  VSRDB_VSLL
    vull __builtin_vec_srdb (vull, vull, const int);
      VSRDB_V2DI  VSRDB_VULL
+  vsq __builtin_vec_srdb (vsq, vsq, const int);
+    VSRDB_V1TI  VSRDB_VSQ
+  vuq __builtin_vec_srdb (vuq, vuq, const int);
+    VSRDB_V1TI  VSRDB_VUQ

  [VEC_SRL, vec_srl, __builtin_vec_srl]
    vsc __builtin_vec_srl (vsc, vuc);
@@ -3719,6 +3747,10 @@
      VSR  VSR_VSLL
    vull __builtin_vec_srl (vull, vuc);
      VSR  VSR_VULL
+  vsq __builtin_vec_srl (vsq, vuc);
+    VSR  VSR_VSQ
+  vuq __builtin_vec_srl (vuq, vuc);
+    VSR  VSR_VUQ
  ; The following variants are deprecated.
    vsc __builtin_vec_srl (vsc, vus);
      VSR  VSR_VSC_VUS
@@ -3808,6 +3840,14 @@
      VSRO  VSRO_VFS
    vf __builtin_vec_sro (vf, vuc);
      VSRO  VSRO_VFU
+  vsq __builtin_vec_sro (vsq, vsc);
+    VSRO  VSRDO_VSQS
+  vsq __builtin_vec_sro (vsq, vuc);
+    VSRO  VSRDO_VSQU
+  vuq __builtin_vec_sro (vuq, vsc);
+    VSRO  VSRDO_VUQS
+  vuq __builtin_vec_sro (vuq, vuc);
+    VSRO  VSRDO_VUQU

  [VEC_SRV, vec_srv, __builtin_vec_vsrv]
    vuc __builtin_vec_vsrv (vuc, vuc);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 00b93f954e3..4a9cf366f56 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -23519,6 +23519,10 @@ const unsigned int);
  vector signed long long, const unsigned int);
  @exdent vector unsigned long long vec_sldb (vector unsigned long long,
  vector unsigned long long, const unsigned int);
+@exdent vector signed __int128 vec_sldb (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_sldb (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
  @end smallexample

  Shift the combined input vectors left by the amount specified by the low-order
@@ -23546,6 +23550,10 @@ const unsigned int);
  vector signed long long, const unsigned int);
  @exdent vector unsigned long long vec_srdb (vector unsigned long long,
  vector unsigned long long, const unsigned int);
+@exdent vector signed __int128 vec_srdb (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_srdb (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
  @end smallexample

  Shift the combined input vectors right by the amount specified by the 
low-order
@@ -24041,6 +24049,41 @@ int vec_any_le (vector unsigned __int128, vector 
unsigned __int128);
  @end smallexample

Nit:

The following instances are extension of the existing overloaded built-ins
@code{vec_sld}, @code{vec_sldw}, @code{vec_slo}, @code{vec_sro}, @code{vec_srl}
that are documented in the PVIPR.


Changed to:

...
int vec_any_le (vector signed __int128, vector signed __int128);
int vec_any_le (vector unsigned __int128, vector unsigned __int128);
@end smallexample


The following instances are extension of the existing overloaded built-ins

@code{vec_sld}, @code{vec_sldw}, @code{vec_slo}, @code{vec_sro},@code{vec_srl}

that are documented in the PVIPR.

@smallexample
@exdent vector signed __int128 vec_sld (vector signed __int128,
vector signed __int128, const unsigned int);
@exdent vector unsigned __int128 vec_sld (vector unsigned __int128,
vector unsigned __int128, const unsigned int);
@exdent vector signed __int128 vec_sldw (vector signed __int128,
vector signed __int128, const unsigned int);
...

+@smallexample
+@exdent vector signed __int128 vec_sld (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_sld (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
+@exdent vector signed __int128 vec_sldw (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_sldw (vector unsigned __int,
+vector unsigned __int128, const unsigned int);
+@exdent vector signed __int128 vec_slo (vector signed __int128,
+vector signed char);
+@exdent vector signed __int128 vec_slo (vector signed __int128,
+vector unsigned char);
+@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
+vector signed char);
+@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
+vector unsigned char);
+@exdent vector signed __int128 vec_sro (vector signed __int128,
+vector signed char);
+@exdent vector signed __int128 vec_sro (vector signed __int128,
+vector unsigned char);
+@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
+vector signed char);
+@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
+vector unsigned char);
+@exdent vector signed __int128 vec_srl (vector signed __int128,
+vector unsigned char);
+@exdent vector unsigned __int128 vec_srl (vector unsigned __int128,
+vector unsigned char);
+@end smallexample
+
+The above instances are extension of the existing overloaded built-ins
+@code{vec_sld}, @code{vec_sldw}, @code{vec_slo}, @code{vec_sro}, @code{vec_srl}
+that are documented in the PVIPR.

... this praragraph is moved above to break two "smallexample"s.


Moved, see above.

                                                 Carl

Re: [PATCH ver 3] rs6000, Add new overloaded vector shift builtin int128, variants

Reply via email to