[PATCH] [arm][testsuite]: Fix ACLE data-intrinsics testcases

2023-05-30 Thread Christophe Lyon via Gcc-patches
data-intrinsics-assembly.c forces -march=armv6 using dg-add-options
arm_arch_v6, which implicitly adds -mfloat-abi=softfp.

However, for a toolchain configured for arm-linux-gnueabihf and
--with-arch=armv7-a, the testcase will fail when including arm_acle.h
(which includes stdint.h, which will fail to include the non-existing
gnu/stubs-soft.h).

Other effective-targets related to arm_acle.h would also pass because
they first try without -mfloat-abi=softfp, so it seems the
simplest/safest is to add { dg-require-effective-target arm_softfp_ok }
to make sure arm_arch_v6_ok's assumption is valid.

The patch also fixes what seems to be an oversight in
data-intrinsics-armv6.c: it requires arm_arch_v6_ok, but uses
arm_arch_v6t2: the patch makes it require arm_arch_v6t2_ok.

2023-05-30  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/acle/data-intrinsics-armv6.c: Fix typo.
* gcc.target/arm/acle/data-intrinsics-assembly.c: Require
arm_softfp_ok.
---
 gcc/testsuite/gcc.target/arm/acle/data-intrinsics-armv6.c| 2 +-
 gcc/testsuite/gcc.target/arm/acle/data-intrinsics-assembly.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-armv6.c b/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-armv6.c
index aafdff35cee..988ecac3787 100644
--- a/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-armv6.c
+++ b/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-armv6.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-require-effective-target arm_arch_v6_ok } */
+/* { dg-require-effective-target arm_arch_v6t2_ok } */
 /* { dg-add-options arm_arch_v6t2 } */
 
 #include "arm_acle.h"
diff --git a/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-assembly.c b/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-assembly.c
index 3e066877a70..478cbde1600 100644
--- a/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-assembly.c
+++ b/gcc/testsuite/gcc.target/arm/acle/data-intrinsics-assembly.c
@@ -1,5 +1,6 @@
 /* Test the ACLE data intrinsics get expanded to the correct instructions on a specific architecture  */
 /* { dg-do assemble } */
+/* { dg-require-effective-target arm_softfp_ok } */
 /* { dg-require-effective-target arm_arch_v6_ok } */
 /* { dg-additional-options "--save-temps -O1" } */
 /* { dg-add-options arm_arch_v6 } */
-- 
2.34.1
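
The stdint.h failure described above can be pictured with a small C sketch. It assumes that glibc's gnu/stubs.h on Arm selects gnu/stubs-hard.h when the compiler predefines __ARM_PCS_VFP (that is, with -mfloat-abi=hard) and gnu/stubs-soft.h otherwise; the enum and function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the float ABI seen by the preprocessor. */
enum float_abi { ABI_NOT_ARM, ABI_SOFT_OR_SOFTFP, ABI_HARD };

/* Classify the current compilation using predefined macros only.
   __ARM_PCS_VFP is predefined for -mfloat-abi=hard; soft and softfp
   both leave it undefined.  */
enum float_abi current_float_abi(void)
{
#if !defined(__arm__)
    return ABI_NOT_ARM;
#elif defined(__ARM_PCS_VFP)
    return ABI_HARD;
#else
    return ABI_SOFT_OR_SOFTFP;
#endif
}

/* Assumed mapping from float ABI to the glibc stubs header that
   gnu/stubs.h would try to include; NULL when not targeting Arm.  */
const char *stubs_header_for(enum float_abi abi)
{
    assert(abi >= ABI_NOT_ARM && abi <= ABI_HARD);
    switch (abi) {
    case ABI_HARD:
        return "gnu/stubs-hard.h";
    case ABI_SOFT_OR_SOFTFP:
        return "gnu/stubs-soft.h";
    default:
        return NULL;
    }
}
```

Under that assumption, a hard-float-only sysroot ships just gnu/stubs-hard.h, so forcing -mfloat-abi=softfp asks for a header that is not installed, which is what the arm_softfp_ok requirement guards against.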



Re: [PATCH] [arm] testsuite: make mve_intrinsic_type_overloads-int.c libc-agnostic

2023-05-30 Thread Christophe Lyon via Gcc-patches
Ping?


On Tue, 23 May 2023 at 16:59, Stamatis Markianos-Wright <
stam.markianos-wri...@arm.com> wrote:

>
> On 23/05/2023 15:41, Christophe Lyon wrote:
> > Glibc defines int32_t as 'int' while newlib defines it as 'long int'.
> >
> > Although these correspond to the same size, g++ complains when using the
> > 'wrong' version:
> >   invalid conversion from 'long int*' to 'int32_t*' {aka 'int*'} [-fpermissive]
> > or
> >   invalid conversion from 'int*' to 'int32_t*' {aka 'long int*'} [-fpermissive]
> >
> > when calling vst1q(int32*, int32x4_t) with a first parameter of type
> > 'long int *' (resp. 'int *')
> >
> > To make this test pass with any type of toolchain, this patch defines
> > 'word_type' according to which libc is in use.
>
> Thank you for spotting this! I think this fix is needed on all of
> GCC12,13,trunk btw (it should apply cleanly)
>
>
> >
> > 2023-05-23  Christophe Lyon  
> >
> >   gcc/testsuite/
> >   * gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c:
> >   Support both definitions of int32_t.
> > ---
> >   .../mve_intrinsic_type_overloads-int.c| 28 ++-
> >   1 file changed, 15 insertions(+), 13 deletions(-)
> >
> > diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
> > index 7947dc024bc..ab51cc8b323 100644
> > ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
> > +++
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
> > @@ -47,14 +47,22 @@ foo2 (short * addr, int16x8_t value)
> > vst1q (addr, value);
> >   }
> >
> > -void
> > -foo3 (int * addr, int32x4_t value)
> > -{
> > -  vst1q (addr, value); /* { dg-warning "invalid conversion" "" { target
> c++ } } */
> > -}
> > +/* Glibc defines int32_t as 'int' while newlib defines it as 'long int'.
> > +
> > +   Although these correspond to the same size, g++ complains when using
> the
> > +   'wrong' version:
> > +  invalid conversion from 'long int*' to 'int32_t*' {aka 'int*'}
> [-fpermissive]
> > +
> > +  The trick below is to make this test pass whether using glibc-based or
> > +  newlib-based toolchains.  */
> >
> > +#if defined(__GLIBC__)
> > +#define word_type int
> > +#else
> > +#define word_type long int
> > +#endif
> >   void
> > -foo4 (long * addr, int32x4_t value)
> > +foo3 (word_type * addr, int32x4_t value)
> >   {
> > vst1q (addr, value);
> >   }
> > @@ -78,13 +86,7 @@ foo7 (unsigned short * addr, uint16x8_t value)
> >   }
> >
> >   void
> > -foo8 (unsigned int * addr, uint32x4_t value)
> > -{
> > -  vst1q (addr, value); /* { dg-warning "invalid conversion" "" { target
> c++ } } */
> > -}
> > -
> > -void
> > -foo9 (unsigned long * addr, uint32x4_t value)
> > +foo8 (unsigned word_type * addr, uint32x4_t value)
> >   {
> > vst1q (addr, value);
> >   }
>
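
To see which definition of int32_t a given toolchain uses, a C11 _Generic selection can report the underlying type. This is only a sketch; the function name int32_underlying_type is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Both libc choices are 32 bits wide; only the type name differs. */
static_assert(sizeof(int32_t) * CHAR_BIT == 32, "int32_t must be 32 bits wide");

/* Report which basic type the C library picked for int32_t.
   'int' and 'long' are distinct types even when they have the same size,
   which is why g++ rejects passing a pointer to one as a pointer to the
   other.  */
const char *int32_underlying_type(void)
{
    return _Generic((int32_t) 0,
                    int: "int",
                    long: "long int",
                    default: "other");
}
```

Note that the quoted patch keys on __GLIBC__ rather than on the type itself, and that because word_type is a macro, 'unsigned word_type' expands to 'unsigned int' or 'unsigned long int' as needed.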


[PATCH] aarch64: Add pattern for bswap + rotate [PR 110039]

2023-05-31 Thread Christophe Lyon via Gcc-patches
After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a
pattern to match the new GIMPLE form.

With this patch, gcc.target/aarch64/rev16_2.c passes again.

2023-05-31  Christophe Lyon  

PR target/110039
gcc/
* config/aarch64/aarch64.md (aarch64_rev162_alt3): New
pattern.
---
 gcc/config/aarch64/aarch64.md | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 8b8951d7b14..663353791fd 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -6267,6 +6267,16 @@
   [(set_attr "type" "rev")]
 )
 
+;; Similar pattern to match (rotate (bswap) 16)
+(define_insn "aarch64_rev162_alt3"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+(rotate:GPI (bswap:GPI (match_operand:GPI 1 "register_operand" "r"))
+(const_int 16)))]
+  ""
+  "rev16\\t%0, %1"
+  [(set_attr "type" "rev")]
+)
+
 ;; zero_extend version of above
 (define_insn "*bswapsi2_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
-- 
2.34.1



Re: [PATCH] aarch64: Add pattern for bswap + rotate [PR 110039]

2023-05-31 Thread Christophe Lyon via Gcc-patches
On Wed, 31 May 2023 at 11:49, Richard Sandiford 
wrote:

> Christophe Lyon  writes:
> > After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a
> > pattern to match the new GIMPLE form.
> >
> > With this patch, gcc.target/aarch64/rev16_2.c passes again.
> >
> > 2023-05-31  Christophe Lyon  
> >
> >   PR target/110039
> >   gcc/
> >   * config/aarch64/aarch64.md (aarch64_rev162_alt3): New
> >   pattern.
> > ---
> >  gcc/config/aarch64/aarch64.md | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/gcc/config/aarch64/aarch64.md
> b/gcc/config/aarch64/aarch64.md
> > index 8b8951d7b14..663353791fd 100644
> > --- a/gcc/config/aarch64/aarch64.md
> > +++ b/gcc/config/aarch64/aarch64.md
> > @@ -6267,6 +6267,16 @@
> >[(set_attr "type" "rev")]
> >  )
> >
> > +;; Similar pattern to match (rotate (bswap) 16)
> > +(define_insn "aarch64_rev162_alt3"
> > +  [(set (match_operand:GPI 0 "register_operand" "=r")
> > +(rotate:GPI (bswap:GPI (match_operand:GPI 1 "register_operand"
> "r"))
> > +(const_int 16)))]
> > +  ""
> > +  "rev16\\t%0, %1"
> > +  [(set_attr "type" "rev")]
> > +)
> > +
>
> Doesn't this have to be :SI only?  The rtl expression and the
> instruction are different for :DI.
>

Do you mean the other two examples in the testcase
(__rev16_64_alt, __rev16_64)?
They currently use aarch64_rev16di2_alt1 and aarch64_rev16di2_alt2
respectively.

So yeah, for this case it seems :SI would be right.

Thanks,

Christophe

> Thanks,
> Richard
>
> >  ;; zero_extend version of above
> >  (define_insn "*bswapsi2_uxtw"
> >[(set (match_operand:DI 0 "register_operand" "=r")
>
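
The point about :SI versus :DI can be checked with plain C arithmetic. The sketch below uses hypothetical helper names and portable shifts rather than compiler builtins. It compares a byte swap within each 16-bit halfword (what rev16 computes) against a full byte swap followed by a left rotate by 16, for 32-bit and 64-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes inside each 16-bit halfword of a 32-bit value. */
uint32_t rev16_32(uint32_t x)
{
    return ((x & 0xff00ff00u) >> 8) | ((x & 0x00ff00ffu) << 8);
}

/* Full 32-bit byte swap followed by a rotate by 16. */
uint32_t bswap_rotate_32(uint32_t x)
{
    uint32_t swapped = (x >> 24) | ((x >> 8) & 0x0000ff00u)
                     | ((x << 8) & 0x00ff0000u) | (x << 24);
    return (swapped << 16) | (swapped >> 16);
}

/* Swap the two bytes inside each 16-bit halfword of a 64-bit value. */
uint64_t rev16_64(uint64_t x)
{
    return ((x & 0xff00ff00ff00ff00ull) >> 8)
         | ((x & 0x00ff00ff00ff00ffull) << 8);
}

/* Full 64-bit byte swap followed by a left rotate by 16. */
uint64_t bswap_rotate_64(uint64_t x)
{
    uint64_t swapped = 0;
    for (int i = 0; i < 8; i++)
        swapped |= ((x >> (8 * i)) & 0xffu) << (8 * (7 - i));
    return (swapped << 16) | (swapped >> 48);
}

/* The two forms agree for 32 bits but not for 64 bits. */
void check_rev16_forms(void)
{
    assert(rev16_32(0x11223344u) == bswap_rotate_32(0x11223344u));
    assert(rev16_64(0x1122334455667788ull)
           != bswap_rotate_64(0x1122334455667788ull));
}
```

For 0x11223344 both 32-bit forms give 0x22114433, whereas for 64 bits a rotate by 16 after the byte swap leaves the halfwords in the wrong order, which is consistent with restricting the new pattern to :SI.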


[PATCH v2] aarch64: Add pattern for bswap + rotate [PR 110039]

2023-05-31 Thread Christophe Lyon via Gcc-patches
After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a
pattern to match the new GIMPLE form.

With this patch, gcc.target/aarch64/rev16_2.c passes again.

2023-05-31  Christophe Lyon  

PR target/110039
gcc/
* config/aarch64/aarch64.md (aarch64_rev16si2_alt3): New
pattern.
---
 gcc/config/aarch64/aarch64.md | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 8b8951d7b14..9af7024da43 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -6267,6 +6267,16 @@
   [(set_attr "type" "rev")]
 )
 
+;; Similar pattern to match (rotate (bswap) 16)
+(define_insn "aarch64_rev16si2_alt3"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(rotate:SI (bswap:SI (match_operand:SI 1 "register_operand" "r"))
+   (const_int 16)))]
+  ""
+  "rev16\\t%w0, %w1"
+  [(set_attr "type" "rev")]
+)
+
 ;; zero_extend version of above
 (define_insn "*bswapsi2_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
-- 
2.34.1



Re: [r14-1452 Regression] FAIL: g++.dg/pr104547.C -std=gnu++17 scan-tree-dump-not vrp2 "_M_default_append" on Linux/x86_64

2023-06-01 Thread Christophe Lyon via Gcc-patches
Hi!

We have noticed the same problem on aarch64, if that's easier to reproduce.

Thanks,
Christophe


On Thu, 1 Jun 2023 at 06:20, haochen.jiang via Gcc-regression <
gcc-regress...@gcc.gnu.org> wrote:

> On Linux/x86_64,
>
> fb409a15d9babc78fe1d9957afcbaf1102cce58f is the first bad commit
> commit fb409a15d9babc78fe1d9957afcbaf1102cce58f
> Author: Jonathan Wakely 
> Date:   Thu May 25 09:57:46 2023 +0100
>
> libstdc++: Express std::vector's size() <= capacity() invariant in code
>
> caused
>
> FAIL: g++.dg/pr104547.C  -std=gnu++14  scan-tree-dump-not vrp2
> "_M_default_append"
> FAIL: g++.dg/pr104547.C  -std=gnu++17  scan-tree-dump-not vrp2
> "_M_default_append"
>
> with GCC configured with
>
> ../../gcc/configure
> --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-1452/usr
> --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
> --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet
> --without-isl --enable-libmpx x86_64-linux --disable-bootstrap
>
> To reproduce:
>
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/pr104547.C
> --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/pr104547.C
> --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/pr104547.C
> --target_board='unix{-m64}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/pr104547.C
> --target_board='unix{-m64\ -march=cascadelake}'"
>
> (Please do not reply to this email, for question about this report,
> contact me at haochen dot jiang at intel.com)
>


Re: [committed] libstdc++: Fix preprocessor conditions for std::from_chars [PR109921]

2023-06-01 Thread Christophe Lyon via Gcc-patches
Hi,


On Wed, 31 May 2023 at 14:25, Jonathan Wakely via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> Tested powerpc64le-linux. Pushed to trunk.
>
> -- >8 --
>
> We use the from_chars_strtod function with __strtof128 to read a
> _Float128 value, but from_chars_strtod is not defined unless uselocale
> is available. This can lead to compilation failures for some targets,
> because we try to define the _Float128 overload in terms of a
> non-existing from_chars_strtod function.
>
> Only try to use __strtof128 if uselocale is available, otherwise
> fall back to the long double overload of std::from_chars (which might
> fall back to the double overload, which should use fast_float).
>
> This ensures we always define the full set of overloads, even if they
> are not always accurate for all values of the wider types.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/109921
> * src/c++17/floating_from_chars.cc (USE_STRTOF128_FOR_FROM_CHARS):
> Only define when USE_STRTOD_FOR_FROM_CHARS is also defined.
> (USE_STRTOD_FOR_FROM_CHARS): Do not undefine when long double is
> binary64.
> (from_chars(const char*, const char*, double&, chars_format)):
> Check __LDBL_MANT_DIG__ == __DBL_MANT_DIG__ here.
> (from_chars(const char*, const char*, _Float128&, chars_format)):
> Only use from_chars_strtod when USE_STRTOD_FOR_FROM_CHARS is
> defined, otherwise parse a long double and convert to _Float128.
>
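
As a rough C analogue of the fallback chain described in the commit message (not the libstdc++ code itself), the sketch below picks the narrower parser when long double has the same precision as double or when a wide parser is declared unusable. HAVE_WIDE_PARSER and parse_long_double are hypothetical names standing in for the USE_STRTOD_FOR_FROM_CHARS condition.

```c
#include <assert.h>
#include <float.h>
#include <stdlib.h>

/* Hypothetical switch: 1 when a strtold-style parser may be used. */
#ifndef HAVE_WIDE_PARSER
#define HAVE_WIDE_PARSER 1
#endif

/* Parse a long double, falling back to the double parser when long double
   is the same format as double or when the wide parser is unavailable.
   In the fallback case, values outside the range of double may be
   rejected even if they would fit in long double.  */
long double parse_long_double(const char *text, char **end)
{
    assert(text != NULL);
#if LDBL_MANT_DIG == DBL_MANT_DIG || !HAVE_WIDE_PARSER
    return (long double) strtod(text, end);
#else
    return strtold(text, end);
#endif
}
```

The design point mirrored here is that every overload stays defined on every target; only the accuracy for the wider type varies with what the C library offers.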


This is causing a regression on aarch64:
 FAIL: libstdc++-abi/abi_check

The log says:

3 added symbols
0
_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE11_S_allocateERS3_m
std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>,
std::allocator<wchar_t> >::_S_allocate(std::allocator<wchar_t>&, unsigned
long)
version status: compatible
GLIBCXX_3.4.32
type: function
status: added

1
_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE11_S_allocateERS3_m
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long)
version status: compatible
GLIBCXX_3.4.32
type: function
status: added

2
_ZSt10from_charsPKcS0_RDF128_St12chars_format
std::from_chars(char const*, char const*, _Float128&, std::chars_format)
version status: incompatible
GLIBCXX_3.4.31
type: function
status: added


2 undesignated symbols
0
_ZSt11__once_call
std::__once_call
version status: compatible
GLIBCXX_3.4.11
type: tls
type size: 8
status: undesignated

1
_ZSt15__once_callable
std::__once_callable
version status: compatible
GLIBCXX_3.4.11
type: tls
type size: 8
status: undesignated


1 incompatible symbols
0
_ZSt10from_charsPKcS0_RDF128_St12chars_format
std::from_chars(char const*, char const*, _Float128&, std::chars_format)
version status: incompatible
GLIBCXX_3.4.31
type: function
status: added



 libstdc++-v3 check-abi Summary 

# of added symbols:          3
# of missing symbols:        0
# of undesignated symbols:   2
# of incompatible symbols:   1


Can you have a look?

Thanks,
Christophe

> ---
>  libstdc++-v3/src/c++17/floating_from_chars.cc | 20 ---
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/libstdc++-v3/src/c++17/floating_from_chars.cc
> b/libstdc++-v3/src/c++17/floating_from_chars.cc
> index ebd428d5be3..eea878072b0 100644
> --- a/libstdc++-v3/src/c++17/floating_from_chars.cc
> +++ b/libstdc++-v3/src/c++17/floating_from_chars.cc
> @@ -64,7 +64,7 @@
>  // strtold for __ieee128
>  extern "C" __ieee128 __strtoieee128(const char*, char**);
>  #elif __FLT128_MANT_DIG__ == 113 && __LDBL_MANT_DIG__ != 113 \
> -  && defined(__GLIBC_PREREQ)
> +  && defined(__GLIBC_PREREQ) && defined(USE_STRTOD_FOR_FROM_CHARS)
>  #define USE_STRTOF128_FOR_FROM_CHARS 1
>  extern "C" _Float128 __strtof128(const char*, char**)
>__asm ("strtof128")
> @@ -77,10 +77,6 @@ extern "C" _Float128 __strtof128(const char*, char**)
>  #if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 \
>  && __SIZE_WIDTH__ >= 32
>  # define USE_LIB_FAST_FLOAT 1
> -# if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__
> -// No need to use strtold.
> -#  undef USE_STRTOD_FOR_FROM_CHARS
> -# endif
>  #endif
>
>  #if USE_LIB_FAST_FLOAT
> @@ -1261,7 +1257,7 @@ from_chars_result
>  from_chars(const char* first, const char* last, long double& value,
>chars_format fmt) noexcept
>  {
> -#if ! USE_STRTOD_FOR_FROM_CHARS
> +#if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__ || !defined
> USE_STRTOD_FOR_FROM_CHARS
>// Either long double is the same as double, or we can't use strtold.
>// In the latter case, this might give an incorrect result (e.g. values
>// out of range of double give an error, even if they fit in long
> double).
> @@ -1329,13 +1325,23 @@
> _ZSt10from_charsPKcS0_RDF128_St12chars_format(const char* first,
>   __ieee128& value,
>   chars_format fmt) noexcept
>  __attribute__((alias
> ("_ZSt10from_charsPKcS0_Ru9__ieee128St12chars_form

Re: [PATCH] ci: Add a linux CI

2023-07-12 Thread Christophe Lyon via Gcc-patches
Hi,


On Sun, 9 Jul 2023 at 19:13, Tal Regev via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> Description: adding a CI in a GitHub repo. Every time a user opens a PR
> against the master branch or release branches, it will trigger the CI on
> their repo. For example: https://github.com/talregev/gcc/pull/1. This can
> help users to verify their own changes before submitting a patch.
>
> ChangeLog: Add a linux CI
>
> Bootstrapping and testing: I tested it on linux with
> host: x86_64-linux-gnu
> target: x86_64-linux-gnu
> some tests are failing. You can see the results in my CI yourself.
>
>
Thanks for sharing your patch & idea.
I think GCC validation is and has been a problem for a long time ;-)

I am not a maintainer, so take my comments with a grain of salt ;-)

- I don't know if the GCC project would want to accept such patches,
pointing to github etc...
- github is not the main GCC repository, it hosts several mirrors AFAIK
- these mirrors are updated by individuals, I think, I don't know at which
frequency etc... correct me if I'm wrong
- would this mean that each time each such mirror/fork is updated, this
triggers builds on github servers? Would that handle the load? I don't
think so (also: how many "free" minutes of CPU time can be used?)
- as you have noticed, the GCC testsuite is not 100% clean (i.e. there are
failures, so 'make check' always exits with an error code), making such a
step useless. What we need is to compare to a baseline (e.g. results of the
previous build) and report any regressions detected. Several companies have
CI systems doing this (either internally, or on publicly accessible servers)

In particular, at Linaro we monitor regressions for several arm and aarch64
flavors, and we are also experimenting with "pre-commit CI", based on
patchwork.
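
The baseline comparison mentioned above can be sketched as follows. This is an illustration only: the struct layout and function name are hypothetical and do not describe how any existing CI system is implemented. A test counts as a regression when it passed in the baseline run and fails in the current one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum status { PASS, FAIL };

/* One testsuite result: test name plus outcome. */
struct result {
    const char *test;
    enum status status;
};

/* Count tests that PASS in the baseline but FAIL in the current run.
   Tests that already failed in the baseline, or that are new, are not
   counted, so a testsuite that is never 100% clean can still be gated.  */
size_t count_regressions(const struct result *baseline, size_t nbase,
                         const struct result *current, size_t ncur)
{
    size_t regressions = 0;

    assert(baseline != NULL && current != NULL);
    for (size_t i = 0; i < ncur; i++) {
        if (current[i].status != FAIL)
            continue;
        for (size_t j = 0; j < nbase; j++) {
            if (strcmp(current[i].test, baseline[j].test) == 0) {
                if (baseline[j].status == PASS)
                    regressions++;
                break;
            }
        }
    }
    return regressions;
}
```

With this kind of check, the exit status of 'make check' no longer matters; only the difference against the previous results does.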

Thanks anyway for sharing, it's good to see such initiatives ;-)

Christophe



> Patch: attach to this email.
>


[PATCH 1/2] [testsuite,arm]: Make nomve_fp_1.c require arm_fp

2023-07-13 Thread Christophe Lyon via Gcc-patches
If GCC is configured with the default (soft) -mfloat-abi, and we don't
override the target_board test flags appropriately,
gcc.target/arm/mve/general-c/nomve_fp_1.c fails for lack of
-mfloat-abi=softfp or -mfloat-abi=hard, because it doesn't use
dg-add-options arm_v8_1m_mve (on purpose, see comment in the test).

Require and use the options needed for arm_fp to fix this problem.

2023-06-28  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/mve/general-c/nomve_fp_1.c: Require arm_fp.
---
 gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c b/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
index 21c2af16a61..c9d279ead68 100644
--- a/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
@@ -1,9 +1,11 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* Do not use dg-add-options arm_v8_1m_mve, because this might expand to "",
which could imply mve+fp depending on the user settings. We want to make
sure the '+fp' extension is not enabled.  */
 /* { dg-options "-mfpu=auto -march=armv8.1-m.main+mve" } */
+/* { dg-add-options arm_fp } */
 
 #include 
 
-- 
2.34.1



[PATCH 2/2] [testsuite, arm]: Make mve_fp_fpu[12].c accept single or double precision FPU

2023-07-13 Thread Christophe Lyon via Gcc-patches
These tests currently expect a directive containing .fpu fpv5-sp-d16
and thus may fail if the test is executed for instance with
-march=armv8.1-m.main+mve.fp+fp.dp.

This patch accepts either fpv5-sp-d16 or fpv5-d16 to avoid the failure.

2023-06-28  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: Fix .fpu
scan-assembler.
* gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
index e375327fb97..8358a616bb5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
@@ -12,4 +12,4 @@ foo1 (int8x16_t value)
   return b;
 }
 
-/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
+/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
index 1fca1100cf0..5dd2feefc35 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
@@ -12,4 +12,4 @@ foo1 (int8x16_t value)
   return b;
 }
 
-/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
+/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */
-- 
2.34.1
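
The effect of the relaxed pattern can be tried outside DejaGnu with POSIX regular expressions. This is a sketch under assumptions: the testsuite uses Tcl regexps, where the empty alternative in "(-sp|)" is accepted, while the code below writes the same idea as "(-sp)?" because an empty alternative is not portable in POSIX extended syntax. The function name is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE contains a .fpu directive for either the single
   precision (fpv5-sp-d16) or double precision (fpv5-d16) FPU, 0 if not,
   and -1 if the pattern fails to compile.  */
int fpu_directive_matches(const char *line)
{
    regex_t re;
    int rc;

    assert(line != NULL);
    if (regcomp(&re, "\\.fpu fpv5(-sp)?-d16", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Both ".fpu fpv5-sp-d16" and ".fpu fpv5-d16" match, while an unrelated directive such as ".fpu neon" does not.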



[PATCH 6/6] arm: [MVE intrinsics] rework vcmlaq

2023-07-13 Thread Christophe Lyon via Gcc-patches
Implement vcmlaq using the new MVE builtins framework.

2023-07-13  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vcmlaq, vcmlaq_rot90)
(vcmlaq_rot180, vcmlaq_rot270): New.
* config/arm/arm-mve-builtins-base.def (vcmlaq, vcmlaq_rot90)
(vcmlaq_rot180, vcmlaq_rot270): New.
* config/arm/arm-mve-builtins-base.h (vcmlaq, vcmlaq_rot90)
(vcmlaq_rot180, vcmlaq_rot270): New.
* config/arm/arm-mve-builtins.cc
(function_instance::has_inactive_argument): Handle vcmlaq,
vcmlaq_rot90, vcmlaq_rot180, vcmlaq_rot270.
* config/arm/arm_mve.h (vcmlaq): Delete.
(vcmlaq_rot180): Delete.
(vcmlaq_rot270): Delete.
(vcmlaq_rot90): Delete.
(vcmlaq_m): Delete.
(vcmlaq_rot180_m): Delete.
(vcmlaq_rot270_m): Delete.
(vcmlaq_rot90_m): Delete.
(vcmlaq_f16): Delete.
(vcmlaq_rot180_f16): Delete.
(vcmlaq_rot270_f16): Delete.
(vcmlaq_rot90_f16): Delete.
(vcmlaq_f32): Delete.
(vcmlaq_rot180_f32): Delete.
(vcmlaq_rot270_f32): Delete.
(vcmlaq_rot90_f32): Delete.
(vcmlaq_m_f32): Delete.
(vcmlaq_m_f16): Delete.
(vcmlaq_rot180_m_f32): Delete.
(vcmlaq_rot180_m_f16): Delete.
(vcmlaq_rot270_m_f32): Delete.
(vcmlaq_rot270_m_f16): Delete.
(vcmlaq_rot90_m_f32): Delete.
(vcmlaq_rot90_m_f16): Delete.
(__arm_vcmlaq_f16): Delete.
(__arm_vcmlaq_rot180_f16): Delete.
(__arm_vcmlaq_rot270_f16): Delete.
(__arm_vcmlaq_rot90_f16): Delete.
(__arm_vcmlaq_f32): Delete.
(__arm_vcmlaq_rot180_f32): Delete.
(__arm_vcmlaq_rot270_f32): Delete.
(__arm_vcmlaq_rot90_f32): Delete.
(__arm_vcmlaq_m_f32): Delete.
(__arm_vcmlaq_m_f16): Delete.
(__arm_vcmlaq_rot180_m_f32): Delete.
(__arm_vcmlaq_rot180_m_f16): Delete.
(__arm_vcmlaq_rot270_m_f32): Delete.
(__arm_vcmlaq_rot270_m_f16): Delete.
(__arm_vcmlaq_rot90_m_f32): Delete.
(__arm_vcmlaq_rot90_m_f16): Delete.
(__arm_vcmlaq): Delete.
(__arm_vcmlaq_rot180): Delete.
(__arm_vcmlaq_rot270): Delete.
(__arm_vcmlaq_rot90): Delete.
(__arm_vcmlaq_m): Delete.
(__arm_vcmlaq_rot180_m): Delete.
(__arm_vcmlaq_rot270_m): Delete.
(__arm_vcmlaq_rot90_m): Delete.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   4 +
 gcc/config/arm/arm-mve-builtins-base.def |   4 +
 gcc/config/arm/arm-mve-builtins-base.h   |  16 +-
 gcc/config/arm/arm-mve-builtins.cc   |   4 +
 gcc/config/arm/arm_mve.h | 304 ---
 5 files changed, 22 insertions(+), 310 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index 3ad8df304e8..e31095ae112 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -262,6 +262,10 @@ FUNCTION_WITH_RTX_M (vandq, AND, VANDQ)
 FUNCTION_ONLY_N (vbrsrq, VBRSRQ)
 FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M_S, VCADDQ_ROT90_M_U, VCADDQ_ROT90_M_F))
 FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M_S, VCADDQ_ROT270_M_U, VCADDQ_ROT270_M_F))
+FUNCTION (vcmlaq, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA, -1, -1, VCMLAQ_M_F))
+FUNCTION (vcmlaq_rot90, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA90, -1, -1, VCMLAQ_ROT90_M_F))
+FUNCTION (vcmlaq_rot180, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA180, -1, -1, VCMLAQ_ROT180_M_F))
+FUNCTION (vcmlaq_rot270, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA270, -1, -1, VCMLAQ_ROT270_M_F))
 FUNCTION (vcmulq, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL, -1, -1, VCMULQ_M_F))
 FUNCTION (vcmulq_rot90, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL90, -1, -1, VCMULQ_ROT90_M_F))
 FUNCTION (vcmulq_rot180, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL180, -1, -1, VCMULQ_ROT180_M_F))
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index cbcf0d296cd..e7d466f2efd 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -158,6 +158,10 @@ DEF_MVE_FUNCTION (vandq, binary, all_float, mx_or_none)
 DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_float, mx_or_none)
 DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_float, mx_or_none)
 DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_float, mx_or_none)
+DEF_MVE_FUNCTION (vcmlaq, ternary, all_float, m_or_none)
+DEF_MVE_FUNCTION (vcmlaq_rot90, ternary, all_float, m_or_none)
+DEF_MVE_FUNCTION (vcmlaq_rot180, ternary, all_float, m_or_none)
+DEF_MVE_FUNCTION (vcmlaq_rot270, ternary, all_float, m_or_none)
 DEF_MVE_FUNCTION (vcmulq, binary, all_fl

[PATCH 1/6] arm: [MVE intrinsics] Factorize vcaddq vhcaddq

2023-07-13 Thread Christophe Lyon via Gcc-patches
Factorize vcaddq, vhcaddq so that they use the same parameterized
names.

To be able to use the same patterns, we add a suffix to vcaddq.

Note that vcadd uses UNSPEC_VCADDxx for builtins without predication,
and VCADDQ_ROTxx_M_x (that is, names not starting with "UNSPEC_") for
predicated ones.  The UNSPEC_* names are also used by neon.md.

2023-07-13  Christophe Lyon  

gcc/
* config/arm/arm_mve_builtins.def (vcaddq_rot90_, vcaddq_rot270_)
(vcaddq_rot90_f, vcaddq_rot270_f): Add "_" or "_f" suffix.
* config/arm/iterators.md (mve_insn): Add vcadd, vhcadd.
(isu): Add UNSPEC_VCADD90, UNSPEC_VCADD270, VCADDQ_ROT270_M_U,
VCADDQ_ROT270_M_S, VCADDQ_ROT90_M_U, VCADDQ_ROT90_M_S,
VHCADDQ_ROT90_M_S, VHCADDQ_ROT270_M_S, VHCADDQ_ROT90_S,
VHCADDQ_ROT270_S.
(rot): Add VCADDQ_ROT90_M_F, VCADDQ_ROT90_M_S, VCADDQ_ROT90_M_U,
VCADDQ_ROT270_M_F, VCADDQ_ROT270_M_S, VCADDQ_ROT270_M_U,
VHCADDQ_ROT90_S, VHCADDQ_ROT270_S, VHCADDQ_ROT90_M_S,
VHCADDQ_ROT270_M_S.
(mve_rot): Add VCADDQ_ROT90_M_F, VCADDQ_ROT90_M_S,
VCADDQ_ROT90_M_U, VCADDQ_ROT270_M_F, VCADDQ_ROT270_M_S,
VCADDQ_ROT270_M_U, VHCADDQ_ROT90_S, VHCADDQ_ROT270_S,
VHCADDQ_ROT90_M_S, VHCADDQ_ROT270_M_S.
(supf): Add VHCADDQ_ROT90_M_S, VHCADDQ_ROT270_M_S,
VHCADDQ_ROT90_S, VHCADDQ_ROT270_S, UNSPEC_VCADD90,
UNSPEC_VCADD270.
(VCADDQ_ROT270_M): Delete.
(VCADDQ_M_F VxCADDQ VxCADDQ_M): New.
(VCADDQ_ROT90_M): Delete.
* config/arm/mve.md (mve_vcaddq)
(mve_vhcaddq_rot270_s, mve_vhcaddq_rot90_s): Merge
into ...
(@mve_q_): ... this.
(mve_vcaddq): Rename into ...
(@mve_q_f): ... this
(mve_vcaddq_rot270_m_)
(mve_vcaddq_rot90_m_, mve_vhcaddq_rot270_m_s)
(mve_vhcaddq_rot90_m_s): Merge into ...
(@mve_q_m_): ... this.
(mve_vcaddq_rot270_m_f, mve_vcaddq_rot90_m_f): Merge
into ...
(@mve_q_m_f): ... this.
---
 gcc/config/arm/arm_mve_builtins.def |   6 +-
 gcc/config/arm/iterators.md |  38 +++-
 gcc/config/arm/mve.md   | 135 +---
 3 files changed, 62 insertions(+), 117 deletions(-)

diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
index 8de765de3b0..63ad1845593 100644
--- a/gcc/config/arm/arm_mve_builtins.def
+++ b/gcc/config/arm/arm_mve_builtins.def
@@ -187,6 +187,10 @@ VAR3 (BINOP_NONE_NONE_NONE, vmaxvq_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vmaxq_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhsubq_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhsubq_n_s, v16qi, v8hi, v4si)
+VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot90_, v16qi, v8hi, v4si)
+VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot270_, v16qi, v8hi, v4si)
+VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot90_f, v8hf, v4sf)
+VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot270_f, v8hf, v4sf)
 VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot90_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot270_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhaddq_s, v16qi, v8hi, v4si)
@@ -870,8 +874,6 @@ VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_vec_u, v16qi, v8hi, v4si)
 VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_carry_u, v16qi, v8hi, v4si)
 
 /* optabs without any suffixes.  */
-VAR5 (BINOP_NONE_NONE_NONE, vcaddq_rot90, v16qi, v8hi, v4si, v8hf, v4sf)
-VAR5 (BINOP_NONE_NONE_NONE, vcaddq_rot270, v16qi, v8hi, v4si, v8hf, v4sf)
 VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot90, v8hf, v4sf)
 VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot270, v8hf, v4sf)
 VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot180, v8hf, v4sf)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 9e77af55d60..da1ead34e58 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -902,6 +902,7 @@
 ])
 
 (define_int_attr mve_insn [
+(UNSPEC_VCADD90 "vcadd") (UNSPEC_VCADD270 "vcadd")
 (VABAVQ_P_S "vabav") (VABAVQ_P_U "vabav")
 (VABAVQ_S "vabav") (VABAVQ_U "vabav")
 (VABDQ_M_S "vabd") (VABDQ_M_U "vabd") (VABDQ_M_F "vabd")
@@ -925,6 +926,8 @@
 (VBICQ_N_S "vbic") (VBICQ_N_U "vbic")
 (VBRSRQ_M_N_S "vbrsr") (VBRSRQ_M_N_U "vbrsr") (VBRSRQ_M_N_F "vbrsr")
 (VBRSRQ_N_S "vbrsr") (VBRSRQ_N_U "vbrsr") (VBRSRQ_N_F "vbrsr")
+(VCADDQ_ROT270_M_U "vcadd") (VCADDQ_ROT270_M_S "vcadd") (VCADDQ_ROT270_M_F "vcadd")
+(VCADDQ_ROT90_M_U "vcadd") (VCADDQ_ROT90_M_S "vcadd") (VCADDQ_ROT90_M_F "vcadd")
 (VCLSQ_M_S "vcls")
 (VCLSQ_S "vcls")
 (VCLZQ_M_S "vclz") (VCLZQ_M_U "vclz")
@@ -944,6 +947,8 @@
 (VHADDQ_M_S "vhadd") (VHADDQ_M_U "vhadd")
 (VHADDQ_N_S "vhadd") (VHADDQ_N_U "vhadd")
 (VHADDQ_S "vhadd") (VHADDQ_U "vhadd")
+(VHCADDQ_ROT90_M_S "vhcadd") (VHCADDQ_ROT270_M_S "vhcadd")
+

[PATCH 3/6] arm: [MVE intrinsics] factorize vcmulq

2023-07-13 Thread Christophe Lyon via Gcc-patches
Factorize vcmulq builtins so that they use parameterized names.

We can merge them with vcadd.

2023-07-13  Christophe Lyon  

gcc/
* config/arm/arm_mve_builtins.def (vcmulq_rot90_f)
(vcmulq_rot270_f, vcmulq_rot180_f, vcmulq_f): Add "_f" suffix.
* config/arm/iterators.md (MVE_VCADDQ_VCMULQ)
(MVE_VCADDQ_VCMULQ_M): New.
(mve_insn): Add vcmul.
(rot): Add VCMULQ_M_F, VCMULQ_ROT90_M_F, VCMULQ_ROT180_M_F,
VCMULQ_ROT270_M_F.
(VCMUL): Delete.
(mve_rot): Add VCMULQ_M_F, VCMULQ_ROT90_M_F, VCMULQ_ROT180_M_F,
VCMULQ_ROT270_M_F.
* config/arm/mve.md (mve_vcmulq): Merge into
@mve_q_f.
(mve_vcmulq_m_f, mve_vcmulq_rot180_m_f)
(mve_vcmulq_rot270_m_f, mve_vcmulq_rot90_m_f): Merge
into @mve_q_m_f.
---
 gcc/config/arm/arm_mve_builtins.def |  8 +--
 gcc/config/arm/iterators.md | 27 +++--
 gcc/config/arm/mve.md   | 92 +++--
 3 files changed, 33 insertions(+), 94 deletions(-)

diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
index 63ad1845593..56358c0bd02 100644
--- a/gcc/config/arm/arm_mve_builtins.def
+++ b/gcc/config/arm/arm_mve_builtins.def
@@ -191,6 +191,10 @@ VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot90_, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot270_, v16qi, v8hi, v4si)
 VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot90_f, v8hf, v4sf)
 VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot270_f, v8hf, v4sf)
+VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot90_f, v8hf, v4sf)
+VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot270_f, v8hf, v4sf)
+VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot180_f, v8hf, v4sf)
+VAR2 (BINOP_NONE_NONE_NONE, vcmulq_f, v8hf, v4sf)
 VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot90_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot270_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhaddq_s, v16qi, v8hi, v4si)
@@ -874,10 +878,6 @@ VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_vec_u, v16qi, v8hi, v4si)
 VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_carry_u, v16qi, v8hi, v4si)
 
 /* optabs without any suffixes.  */
-VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot90, v8hf, v4sf)
-VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot270, v8hf, v4sf)
-VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot180, v8hf, v4sf)
-VAR2 (BINOP_NONE_NONE_NONE, vcmulq, v8hf, v4sf)
 VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot90, v8hf, v4sf)
 VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot270, v8hf, v4sf)
 VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot180, v8hf, v4sf)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index da1ead34e58..9f71404e26c 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -901,8 +901,19 @@
 VPSELQ_F
 ])
 
+(define_int_iterator MVE_VCADDQ_VCMULQ [
+UNSPEC_VCADD90 UNSPEC_VCADD270
+UNSPEC_VCMUL UNSPEC_VCMUL90 UNSPEC_VCMUL180 UNSPEC_VCMUL270
+])
+
+(define_int_iterator MVE_VCADDQ_VCMULQ_M [
+VCADDQ_ROT90_M_F VCADDQ_ROT270_M_F
+VCMULQ_M_F VCMULQ_ROT90_M_F VCMULQ_ROT180_M_F VCMULQ_ROT270_M_F
+])
+
 (define_int_attr mve_insn [
 (UNSPEC_VCADD90 "vcadd") (UNSPEC_VCADD270 "vcadd")
+(UNSPEC_VCMUL "vcmul") (UNSPEC_VCMUL90 "vcmul") (UNSPEC_VCMUL180 "vcmul") (UNSPEC_VCMUL270 "vcmul")
 (VABAVQ_P_S "vabav") (VABAVQ_P_U "vabav")
 (VABAVQ_S "vabav") (VABAVQ_U "vabav")
 (VABDQ_M_S "vabd") (VABDQ_M_U "vabd") (VABDQ_M_F "vabd")
@@ -931,6 +942,7 @@
 (VCLSQ_M_S "vcls")
 (VCLSQ_S "vcls")
 (VCLZQ_M_S "vclz") (VCLZQ_M_U "vclz")
+(VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul")
 (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate")
 (VDUPQ_M_N_S "vdup") (VDUPQ_M_N_U "vdup") (VDUPQ_M_N_F "vdup")
 (VDUPQ_N_S "vdup") (VDUPQ_N_U "vdup") (VDUPQ_N_F "vdup")
@@ -2182,7 +2194,11 @@
  (UNSPEC_VCMLA "0")
  (UNSPEC_VCMLA90 "90")
  (UNSPEC_VCMLA180 "180")
- (UNSPEC_VCMLA270 "270")])
+ (UNSPEC_VCMLA270 "270")
+ (VCMULQ_M_F "0")
+ (VCMULQ_ROT90_M_F "90")
+ (VCMULQ_ROT180_M_F "180")
+ (VCMULQ_ROT270_M_F "270")])
 
 ;; The complex operations when performed on a real complex number require two
 ;; instructions to perform the operation. e.g. complex multiplication requires
@@ -2230,10 +2246,11 @@
  (UNSPEC_VCMUL "")
  (UNSPEC_VCMUL90 "_rot90")
  (UNSPEC_VCMUL180 "_rot180")
- (UNSPEC_VCMUL270 "_rot270")])
-
-(define_int_iterator VCMUL [UNSPEC_VCMUL UNSPEC_V

[PATCH 5/6] arm: [MVE intrinsics] factorize vcmlaq

2023-07-13 Thread Christophe Lyon via Gcc-patches
Factorize vcmlaq builtins so that they use parameterized names.

2023-07-13  Christophe Lyon  

gcc/
* config/arm/arm_mve_builtins.def (vcmlaq_rot90_f)
(vcmlaq_rot270_f, vcmlaq_rot180_f, vcmlaq_f): Add "_f" suffix.
* config/arm/iterators.md (MVE_VCMLAQ_M): New.
(mve_insn): Add vcmla.
(rot): Add VCMLAQ_M_F, VCMLAQ_ROT90_M_F, VCMLAQ_ROT180_M_F,
VCMLAQ_ROT270_M_F.
(mve_rot): Add VCMLAQ_M_F, VCMLAQ_ROT90_M_F, VCMLAQ_ROT180_M_F,
VCMLAQ_ROT270_M_F.
* config/arm/mve.md (mve_vcmlaq): Rename into ...
(@mve_q_f): ... this.
(mve_vcmlaq_m_f, mve_vcmlaq_rot180_m_f)
(mve_vcmlaq_rot270_m_f, mve_vcmlaq_rot90_m_f): Merge
into ...
(@mve_q_m_f): ... this.
---
 gcc/config/arm/arm_mve_builtins.def | 10 ++---
 gcc/config/arm/iterators.md | 19 -
 gcc/config/arm/mve.md   | 64 -
 3 files changed, 29 insertions(+), 64 deletions(-)

diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
index 56358c0bd02..43dacc3dda1 100644
--- a/gcc/config/arm/arm_mve_builtins.def
+++ b/gcc/config/arm/arm_mve_builtins.def
@@ -378,6 +378,10 @@ VAR3 (TERNOP_NONE_NONE_NONE_NONE, vmlasq_n_s, v16qi, v8hi, v4si)
 VAR3 (TERNOP_NONE_NONE_NONE_NONE, vmlaq_n_s, v16qi, v8hi, v4si)
 VAR3 (TERNOP_NONE_NONE_NONE_NONE, vmladavaxq_s, v16qi, v8hi, v4si)
 VAR3 (TERNOP_NONE_NONE_NONE_NONE, vmladavaq_s, v16qi, v8hi, v4si)
+VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot90_f, v8hf, v4sf)
+VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot270_f, v8hf, v4sf)
+VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot180_f, v8hf, v4sf)
+VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_f, v8hf, v4sf)
 VAR3 (TERNOP_NONE_NONE_NONE_IMM, vsriq_n_s, v16qi, v8hi, v4si)
 VAR3 (TERNOP_NONE_NONE_NONE_IMM, vsliq_n_s, v16qi, v8hi, v4si)
 VAR2 (TERNOP_UNONE_UNONE_UNONE_PRED, vrev32q_m_u, v16qi, v8hi)
@@ -876,9 +880,3 @@ VAR3 (QUADOP_NONE_NONE_UNONE_IMM_PRED, vshlcq_m_vec_s, v16qi, v8hi, v4si)
 VAR3 (QUADOP_NONE_NONE_UNONE_IMM_PRED, vshlcq_m_carry_s, v16qi, v8hi, v4si)
 VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_vec_u, v16qi, v8hi, v4si)
 VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_carry_u, v16qi, v8hi, v4si)
-
-/* optabs without any suffixes.  */
-VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot90, v8hf, v4sf)
-VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot270, v8hf, v4sf)
-VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot180, v8hf, v4sf)
-VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq, v8hf, v4sf)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 9f71404e26c..b13ff53d36f 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -911,6 +911,10 @@
 VCMULQ_M_F VCMULQ_ROT90_M_F VCMULQ_ROT180_M_F VCMULQ_ROT270_M_F
 ])
 
+(define_int_iterator MVE_VCMLAQ_M [
+VCMLAQ_M_F VCMLAQ_ROT90_M_F VCMLAQ_ROT180_M_F VCMLAQ_ROT270_M_F
+])
+
 (define_int_attr mve_insn [
 (UNSPEC_VCADD90 "vcadd") (UNSPEC_VCADD270 "vcadd")
 (UNSPEC_VCMUL "vcmul") (UNSPEC_VCMUL90 "vcmul") (UNSPEC_VCMUL180 "vcmul") (UNSPEC_VCMUL270 "vcmul")
@@ -942,6 +946,7 @@
 (VCLSQ_M_S "vcls")
 (VCLSQ_S "vcls")
 (VCLZQ_M_S "vclz") (VCLZQ_M_U "vclz")
+(VCMLAQ_M_F "vcmla") (VCMLAQ_ROT90_M_F "vcmla") (VCMLAQ_ROT180_M_F "vcmla") (VCMLAQ_ROT270_M_F "vcmla")
 (VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul")
 (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate")
 (VDUPQ_M_N_S "vdup") (VDUPQ_M_N_U "vdup") (VDUPQ_M_N_F "vdup")
@@ -1204,6 +1209,7 @@
 (VSUBQ_M_N_S "vsub") (VSUBQ_M_N_U "vsub") (VSUBQ_M_N_F "vsub")
 (VSUBQ_M_S "vsub") (VSUBQ_M_U "vsub") (VSUBQ_M_F "vsub")
 (VSUBQ_N_S "vsub") (VSUBQ_N_U "vsub") (VSUBQ_N_F "vsub")
+(UNSPEC_VCMLA "vcmla") (UNSPEC_VCMLA90 "vcmla") (UNSPEC_VCMLA180 "vcmla") (UNSPEC_VCMLA270 "vcmla")
 ])
 
 (define_int_attr isu[
@@ -2198,7 +2204,12 @@
  (VCMULQ_M_F "0")
  (VCMULQ_ROT90_M_F "90")
  (VCMULQ_ROT180_M_F "180")
- (VCMULQ_ROT270_M_F "270")])
+ (VCMULQ_ROT270_M_F "270")
+ (VCMLAQ_M_F "0")
+ (VCMLAQ_ROT90_M_F "90")
+ (VCMLAQ_ROT180_M_F "180")
+ (VCMLAQ_ROT270_M_F "270")
+ ])
 
 ;; The complex operations when performed on a real complex number require two
 ;; instructions to perform the operation. e.g. complex multiplication requires
@@ -2250,7 +2261,11 @@
  (VCMULQ_M_F "")
  (VCMULQ_ROT90_M_F "_rot90")
  (VCMULQ_ROT180_M_F "_rot180")
- 

[PATCH 4/6] arm: [MVE intrinsics] rework vcmulq

2023-07-13 Thread Christophe Lyon via Gcc-patches
Implement vcmulq using the new MVE builtins framework.

2023-07-13 Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vcmulq, vcmulq_rot90)
(vcmulq_rot180, vcmulq_rot270): New.
* config/arm/arm-mve-builtins-base.def (vcmulq, vcmulq_rot90)
(vcmulq_rot180, vcmulq_rot270): New.
* config/arm/arm-mve-builtins-base.h (vcmulq, vcmulq_rot90)
(vcmulq_rot180, vcmulq_rot270): New.
* config/arm/arm_mve.h (vcmulq_rot90): Delete.
(vcmulq_rot270): Delete.
(vcmulq_rot180): Delete.
(vcmulq): Delete.
(vcmulq_m): Delete.
(vcmulq_rot180_m): Delete.
(vcmulq_rot270_m): Delete.
(vcmulq_rot90_m): Delete.
(vcmulq_x): Delete.
(vcmulq_rot90_x): Delete.
(vcmulq_rot180_x): Delete.
(vcmulq_rot270_x): Delete.
(vcmulq_rot90_f16): Delete.
(vcmulq_rot270_f16): Delete.
(vcmulq_rot180_f16): Delete.
(vcmulq_f16): Delete.
(vcmulq_rot90_f32): Delete.
(vcmulq_rot270_f32): Delete.
(vcmulq_rot180_f32): Delete.
(vcmulq_f32): Delete.
(vcmulq_m_f32): Delete.
(vcmulq_m_f16): Delete.
(vcmulq_rot180_m_f32): Delete.
(vcmulq_rot180_m_f16): Delete.
(vcmulq_rot270_m_f32): Delete.
(vcmulq_rot270_m_f16): Delete.
(vcmulq_rot90_m_f32): Delete.
(vcmulq_rot90_m_f16): Delete.
(vcmulq_x_f16): Delete.
(vcmulq_x_f32): Delete.
(vcmulq_rot90_x_f16): Delete.
(vcmulq_rot90_x_f32): Delete.
(vcmulq_rot180_x_f16): Delete.
(vcmulq_rot180_x_f32): Delete.
(vcmulq_rot270_x_f16): Delete.
(vcmulq_rot270_x_f32): Delete.
(__arm_vcmulq_rot90_f16): Delete.
(__arm_vcmulq_rot270_f16): Delete.
(__arm_vcmulq_rot180_f16): Delete.
(__arm_vcmulq_f16): Delete.
(__arm_vcmulq_rot90_f32): Delete.
(__arm_vcmulq_rot270_f32): Delete.
(__arm_vcmulq_rot180_f32): Delete.
(__arm_vcmulq_f32): Delete.
(__arm_vcmulq_m_f32): Delete.
(__arm_vcmulq_m_f16): Delete.
(__arm_vcmulq_rot180_m_f32): Delete.
(__arm_vcmulq_rot180_m_f16): Delete.
(__arm_vcmulq_rot270_m_f32): Delete.
(__arm_vcmulq_rot270_m_f16): Delete.
(__arm_vcmulq_rot90_m_f32): Delete.
(__arm_vcmulq_rot90_m_f16): Delete.
(__arm_vcmulq_x_f16): Delete.
(__arm_vcmulq_x_f32): Delete.
(__arm_vcmulq_rot90_x_f16): Delete.
(__arm_vcmulq_rot90_x_f32): Delete.
(__arm_vcmulq_rot180_x_f16): Delete.
(__arm_vcmulq_rot180_x_f32): Delete.
(__arm_vcmulq_rot270_x_f16): Delete.
(__arm_vcmulq_rot270_x_f32): Delete.
(__arm_vcmulq_rot90): Delete.
(__arm_vcmulq_rot270): Delete.
(__arm_vcmulq_rot180): Delete.
(__arm_vcmulq): Delete.
(__arm_vcmulq_m): Delete.
(__arm_vcmulq_rot180_m): Delete.
(__arm_vcmulq_rot270_m): Delete.
(__arm_vcmulq_rot90_m): Delete.
(__arm_vcmulq_x): Delete.
(__arm_vcmulq_rot90_x): Delete.
(__arm_vcmulq_rot180_x): Delete.
(__arm_vcmulq_rot270_x): Delete.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   4 +
 gcc/config/arm/arm-mve-builtins-base.def |   4 +
 gcc/config/arm/arm-mve-builtins-base.h   |   4 +
 gcc/config/arm/arm_mve.h | 448 ---
 4 files changed, 12 insertions(+), 448 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index f15bb926147..3ad8df304e8 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -262,6 +262,10 @@ FUNCTION_WITH_RTX_M (vandq, AND, VANDQ)
 FUNCTION_ONLY_N (vbrsrq, VBRSRQ)
 FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M_S, VCADDQ_ROT90_M_U, VCADDQ_ROT90_M_F))
 FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M_S, VCADDQ_ROT270_M_U, VCADDQ_ROT270_M_F))
+FUNCTION (vcmulq, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL, -1, -1, VCMULQ_M_F))
+FUNCTION (vcmulq_rot90, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL90, -1, -1, VCMULQ_ROT90_M_F))
+FUNCTION (vcmulq_rot180, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL180, -1, -1, VCMULQ_ROT180_M_F))
+FUNCTION (vcmulq_rot270, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL270, -1, -1, VCMULQ_ROT270_M_F))
 FUNCTION (vhcaddq_rot90, unspec_mve_function_exact_insn_rot, (VHCADDQ_ROT90_S, -1, -1, VHCADDQ_ROT90_M_S, -1, -1))
 FUNCTION (vhcaddq_rot270, unspec_mve_function_exact_insn_rot, (VHCADDQ_ROT270_S, -1, -1, VHCADDQ_ROT270_M_S, -1, -1))
 FUNCTION_WITHOUT_N_NO_U_F (vclsq, VCLSQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index 9a793147960..cbcf0d296cd 

[PATCH 2/6] arm: [MVE intrinsics] rework vcaddq vhcaddq

2023-07-13 Thread Christophe Lyon via Gcc-patches
Implement vcaddq, vhcaddq using the new MVE builtins framework.

2023-07-13  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vcaddq_rot90)
(vcaddq_rot270, vhcaddq_rot90, vhcaddq_rot270): New.
* config/arm/arm-mve-builtins-base.def (vcaddq_rot90)
(vcaddq_rot270, vhcaddq_rot90, vhcaddq_rot270): New.
* config/arm/arm-mve-builtins-base.h (vcaddq_rot90)
(vcaddq_rot270, vhcaddq_rot90, vhcaddq_rot270): New.
* config/arm/arm-mve-builtins-functions.h (class
unspec_mve_function_exact_insn_rot): New.
* config/arm/arm_mve.h (vcaddq_rot90): Delete.
(vcaddq_rot270): Delete.
(vhcaddq_rot90): Delete.
(vhcaddq_rot270): Delete.
(vcaddq_rot270_m): Delete.
(vcaddq_rot90_m): Delete.
(vhcaddq_rot270_m): Delete.
(vhcaddq_rot90_m): Delete.
(vcaddq_rot90_x): Delete.
(vcaddq_rot270_x): Delete.
(vhcaddq_rot90_x): Delete.
(vhcaddq_rot270_x): Delete.
(vcaddq_rot90_u8): Delete.
(vcaddq_rot270_u8): Delete.
(vhcaddq_rot90_s8): Delete.
(vhcaddq_rot270_s8): Delete.
(vcaddq_rot90_s8): Delete.
(vcaddq_rot270_s8): Delete.
(vcaddq_rot90_u16): Delete.
(vcaddq_rot270_u16): Delete.
(vhcaddq_rot90_s16): Delete.
(vhcaddq_rot270_s16): Delete.
(vcaddq_rot90_s16): Delete.
(vcaddq_rot270_s16): Delete.
(vcaddq_rot90_u32): Delete.
(vcaddq_rot270_u32): Delete.
(vhcaddq_rot90_s32): Delete.
(vhcaddq_rot270_s32): Delete.
(vcaddq_rot90_s32): Delete.
(vcaddq_rot270_s32): Delete.
(vcaddq_rot90_f16): Delete.
(vcaddq_rot270_f16): Delete.
(vcaddq_rot90_f32): Delete.
(vcaddq_rot270_f32): Delete.
(vcaddq_rot270_m_s8): Delete.
(vcaddq_rot270_m_s32): Delete.
(vcaddq_rot270_m_s16): Delete.
(vcaddq_rot270_m_u8): Delete.
(vcaddq_rot270_m_u32): Delete.
(vcaddq_rot270_m_u16): Delete.
(vcaddq_rot90_m_s8): Delete.
(vcaddq_rot90_m_s32): Delete.
(vcaddq_rot90_m_s16): Delete.
(vcaddq_rot90_m_u8): Delete.
(vcaddq_rot90_m_u32): Delete.
(vcaddq_rot90_m_u16): Delete.
(vhcaddq_rot270_m_s8): Delete.
(vhcaddq_rot270_m_s32): Delete.
(vhcaddq_rot270_m_s16): Delete.
(vhcaddq_rot90_m_s8): Delete.
(vhcaddq_rot90_m_s32): Delete.
(vhcaddq_rot90_m_s16): Delete.
(vcaddq_rot270_m_f32): Delete.
(vcaddq_rot270_m_f16): Delete.
(vcaddq_rot90_m_f32): Delete.
(vcaddq_rot90_m_f16): Delete.
(vcaddq_rot90_x_s8): Delete.
(vcaddq_rot90_x_s16): Delete.
(vcaddq_rot90_x_s32): Delete.
(vcaddq_rot90_x_u8): Delete.
(vcaddq_rot90_x_u16): Delete.
(vcaddq_rot90_x_u32): Delete.
(vcaddq_rot270_x_s8): Delete.
(vcaddq_rot270_x_s16): Delete.
(vcaddq_rot270_x_s32): Delete.
(vcaddq_rot270_x_u8): Delete.
(vcaddq_rot270_x_u16): Delete.
(vcaddq_rot270_x_u32): Delete.
(vhcaddq_rot90_x_s8): Delete.
(vhcaddq_rot90_x_s16): Delete.
(vhcaddq_rot90_x_s32): Delete.
(vhcaddq_rot270_x_s8): Delete.
(vhcaddq_rot270_x_s16): Delete.
(vhcaddq_rot270_x_s32): Delete.
(vcaddq_rot90_x_f16): Delete.
(vcaddq_rot90_x_f32): Delete.
(vcaddq_rot270_x_f16): Delete.
(vcaddq_rot270_x_f32): Delete.
(__arm_vcaddq_rot90_u8): Delete.
(__arm_vcaddq_rot270_u8): Delete.
(__arm_vhcaddq_rot90_s8): Delete.
(__arm_vhcaddq_rot270_s8): Delete.
(__arm_vcaddq_rot90_s8): Delete.
(__arm_vcaddq_rot270_s8): Delete.
(__arm_vcaddq_rot90_u16): Delete.
(__arm_vcaddq_rot270_u16): Delete.
(__arm_vhcaddq_rot90_s16): Delete.
(__arm_vhcaddq_rot270_s16): Delete.
(__arm_vcaddq_rot90_s16): Delete.
(__arm_vcaddq_rot270_s16): Delete.
(__arm_vcaddq_rot90_u32): Delete.
(__arm_vcaddq_rot270_u32): Delete.
(__arm_vhcaddq_rot90_s32): Delete.
(__arm_vhcaddq_rot270_s32): Delete.
(__arm_vcaddq_rot90_s32): Delete.
(__arm_vcaddq_rot270_s32): Delete.
(__arm_vcaddq_rot270_m_s8): Delete.
(__arm_vcaddq_rot270_m_s32): Delete.
(__arm_vcaddq_rot270_m_s16): Delete.
(__arm_vcaddq_rot270_m_u8): Delete.
(__arm_vcaddq_rot270_m_u32): Delete.
(__arm_vcaddq_rot270_m_u16): Delete.
(__arm_vcaddq_rot90_m_s8): Delete.
(__arm_vcaddq_rot90_m_s32): Delete.
(__arm_vcaddq_rot90_m_s16): Delete.
(__arm_vcaddq_rot90_m_u8): Delete.
(__arm_vcaddq_rot90_m_u32): Delete.
(__arm_vcaddq_rot90_m_u16): Delete.
(__arm_vhcaddq_rot270_m_s8): Delete.
(__arm_vhcaddq_rot270_m_s32): Delete.
(__arm_vhcaddq_rot270_m_s16): Delete.
(__arm_vhcaddq_rot90_m_s8): Delete.

[PATCH][COMMITTED] doc: Fix spelling in arm_v8_1m_main_cde_mve_fp

2023-08-01 Thread Christophe Lyon via Gcc-patches
Fix spelling mistakes introduced by my previous patch in this area.

Committed as obvious.

2023-08-01  Christophe Lyon  

gcc/
* doc/sourcebuild.texi (arm_v8_1m_main_cde_mve_fp): Fix spelling.
---
 gcc/doc/sourcebuild.texi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index e5d15d67253..1a78b3c1abb 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2186,12 +2186,12 @@ the Custom Datapath Extension (CDE) and floating-point (VFP).
 Some multilibs may be incompatible with these options.
 
 @item arm_v8_1m_main_cde_mve
-Arm target supports options to generate instructions from Arm.1-M with
+Arm target supports options to generate instructions from Armv8.1-M with
 the Custom Datapath Extension (CDE) and M-Profile Vector Extension (MVE).
 Some multilibs may be incompatible with these options.
 
 @item arm_v8_1m_main_cde_mve_fp
-ARM target supports options to generate instructions from ARMv8.1-M
+Arm target supports options to generate instructions from Armv8.1-M
 with the Custom Datapath Extension (CDE) and M-Profile Vector
 Extension (MVE) with floating-point support.  Some multilibs may be
 incompatible with these options.
-- 
2.34.1



Re: arm: Remove unsigned variant of vcaddq_m

2023-08-01 Thread Christophe Lyon via Gcc-patches
Hi Stam,


On Tue, 1 Aug 2023 at 19:22, Stamatis Markianos-Wright via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> Hi all,
>
> The unsigned variants of the vcaddq_m operation are not needed within the
> compiler, as the assembly output of the signed and unsigned versions of the
> ops is identical: with a `.i` suffix (as opposed to separate `.s` and `.u`
> suffixes).
>
> Tested with baremetal arm-none-eabi on Arm's fastmodels.
>
> Ok for trunk?
>

LGTM, with the very minor nit that you forgot to mention the typo fix in
mve.md in the ChangeLog part ;-)

 I think similar changes can be performed for all the other builtins that
use .i for both signed and unsigned versions, but we can do that later.

Thanks,

Christophe


> Thanks,
> Stamatis Markianos-Wright
>
> gcc/ChangeLog:
>
>  * config/arm/arm-mve-builtins-base.cc (vcaddq_rot90, vcaddq_rot270):
>Use common insn for signed and unsigned front-end definitions.
>  * config/arm/arm_mve_builtins.def
>(vcaddq_rot90_m_u, vcaddq_rot270_m_u): Make common.
>(vcaddq_rot90_m_s, vcaddq_rot270_m_s): Remove.
>  * config/arm/iterators.md (mve_insn): Merge signed and unsigned defs.
>(isu): Likewise.
>(rot): Likewise.
>(mve_rot): Likewise.
>(supf): Likewise.
>(VxCADDQ_M): Likewise.
>  * config/arm/unspecs.md (unspec): Likewise.
> ---
>   gcc/config/arm/arm-mve-builtins-base.cc |  4 ++--
>   gcc/config/arm/arm_mve_builtins.def |  6 ++---
>   gcc/config/arm/iterators.md | 30 +++--
>   gcc/config/arm/mve.md   |  4 ++--
>   gcc/config/arm/unspecs.md   |  6 ++---
>   5 files changed, 21 insertions(+), 29 deletions(-)
>
> diff --git a/gcc/config/arm/arm-mve-builtins-base.cc
> b/gcc/config/arm/arm-mve-builtins-base.cc
> index e31095ae112..426a87e9852 100644
> --- a/gcc/config/arm/arm-mve-builtins-base.cc
> +++ b/gcc/config/arm/arm-mve-builtins-base.cc
> @@ -260,8 +260,8 @@ FUNCTION_PRED_P_S_U (vaddvq, VADDVQ)
>   FUNCTION_PRED_P_S_U (vaddvaq, VADDVAQ)
>   FUNCTION_WITH_RTX_M (vandq, AND, VANDQ)
>   FUNCTION_ONLY_N (vbrsrq, VBRSRQ)
> -FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot,
> (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M_S,
> VCADDQ_ROT90_M_U, VCADDQ_ROT90_M_F))
> -FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot,
> (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M_S,
> VCADDQ_ROT270_M_U, VCADDQ_ROT270_M_F))
> +FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot,
> (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M,
> VCADDQ_ROT90_M, VCADDQ_ROT90_M_F))
> +FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot,
> (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M,
> VCADDQ_ROT270_M, VCADDQ_ROT270_M_F))
>   FUNCTION (vcmlaq, unspec_mve_function_exact_insn_rot, (-1, -1,
> UNSPEC_VCMLA, -1, -1, VCMLAQ_M_F))
>   FUNCTION (vcmlaq_rot90, unspec_mve_function_exact_insn_rot, (-1, -1,
> UNSPEC_VCMLA90, -1, -1, VCMLAQ_ROT90_M_F))
>   FUNCTION (vcmlaq_rot180, unspec_mve_function_exact_insn_rot, (-1, -1,
> UNSPEC_VCMLA180, -1, -1, VCMLAQ_ROT180_M_F))
> diff --git a/gcc/config/arm/arm_mve_builtins.def
> b/gcc/config/arm/arm_mve_builtins.def
> index 43dacc3dda1..6ac1812c697 100644
> --- a/gcc/config/arm/arm_mve_builtins.def
> +++ b/gcc/config/arm/arm_mve_builtins.def
> @@ -523,8 +523,8 @@ VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED,
> vhsubq_m_n_u, v16qi, v8hi, v4si)
>   VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vhaddq_m_u, v16qi, v8hi, v4si)
>   VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vhaddq_m_n_u, v16qi, v8hi,
> v4si)
>   VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, veorq_m_u, v16qi, v8hi, v4si)
> -VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vcaddq_rot90_m_u, v16qi,
> v8hi, v4si)
> -VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vcaddq_rot270_m_u, v16qi,
> v8hi, v4si)
> +VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vcaddq_rot90_m_, v16qi,
> v8hi, v4si)
> +VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vcaddq_rot270_m_, v16qi,
> v8hi, v4si)
>   VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vbicq_m_u, v16qi, v8hi, v4si)
>   VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vandq_m_u, v16qi, v8hi, v4si)
>   VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vaddq_m_u, v16qi, v8hi, v4si)
> @@ -587,8 +587,6 @@ VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED,
> vhcaddq_rot270_m_s, v16qi, v8hi, v4si)
>   VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vhaddq_m_s, v16qi, v8hi, v4si)
>   VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vhaddq_m_n_s, v16qi, v8hi, v4si)
>   VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, veorq_m_s, v16qi, v8hi, v4si)
> -VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vcaddq_rot90_m_s, v16qi, v8hi,
> v4si)
> -VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vcaddq_rot270_m_s, v16qi, v8hi,
> v4si)
>   VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vbrsrq_m_n_s, v16qi, v8hi, v4si)
>   VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vbicq_m_s, v16qi, v8hi, v4si)
>   VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vandq_m_s, v16qi, v8

[PATCH] testsuite: Fix gcc.dg/analyzer/allocation-size-multiline-[123].c [PR 110426]

2023-08-08 Thread Christophe Lyon via Gcc-patches
For 32-bit newlib targets (e.g. arm-eabi), int32_t is "long int".

Like previous patches in these tests, update the matching regexps to
match "aka (long )?int".

Tested on arm-eabi and aarch64-linux-gnu.

2023-08-08  Christophe Lyon  

gcc/testsuite/
PR analyzer/110426
* gcc.dg/analyzer/allocation-size-multiline-1.c: Handle
int32_t being "long int".
* gcc.dg/analyzer/allocation-size-multiline-2.c: Likewise.
* gcc.dg/analyzer/allocation-size-multiline-3.c: Likewise.
---
 gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c | 6 +++---
 gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c | 6 +++---
 gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c
index 9938ba237a0..b56e4b4e8e1 100644
--- a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c
@@ -16,7 +16,7 @@ void test_constant_1 (void)
 |   int32_t *ptr = __builtin_malloc (1);
 |  ^~~~
 |  |
-|  (1) allocated 1 bytes and assigned to 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
+|  (1) allocated 1 bytes and assigned to 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka {re:long :re?}int})' is '4'
 |
{ dg-end-multiline-output "" } */
 
@@ -34,7 +34,7 @@ void test_constant_2 (void)
 |   int32_t *ptr = __builtin_malloc (2);
 |  ^~~~
 |  |
-|  (1) allocated 2 bytes and assigned to 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
+|  (1) allocated 2 bytes and assigned to 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka {re:long :re?}int})' is '4'
 |
{ dg-end-multiline-output "" } */
 
@@ -52,6 +52,6 @@ void test_symbolic (int n)
 |   int32_t *ptr = __builtin_malloc (n * 2);
 |  ^~~~
 |  |
-|  (1) allocated 'n * 2' bytes and assigned to 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
+|  (1) allocated 'n * 2' bytes and assigned to 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka {re:long :re?}int})' is '4'
 |
{ dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c
index 9e1269cbb7a..8912913a78c 100644
--- a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c
+++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c
@@ -16,7 +16,7 @@ void test_constant_1 (void)
 |   int32_t *ptr = __builtin_alloca (1);
 |  ^~~~
 |  |
-|  (1) allocated 1 bytes and assigned to 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
+|  (1) allocated 1 bytes and assigned to 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka {re:long :re?}int})' is '4'
 |
{ dg-end-multiline-output "" } */
 
@@ -33,7 +33,7 @@ void test_constant_2 (void)
 |   int32_t *ptr = __builtin_alloca (2);
 |  ^~~~
 |  |
-|  (1) allocated 2 bytes and assigned to 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
+|  (1) allocated 2 bytes and assigned to 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka {re:long :re?}int})' is '4'
 |
{ dg-end-multiline-output "" } */
 
@@ -50,7 +50,7 @@ void test_symbolic (int n)
 |   int32_t *ptr = __builtin_alloca (n * 2);
 |  ^~~~
 |  |
-|  (1) allocated 'n * 2' bytes and assigned to 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
+|  (1) allocated 'n * 2' bytes and assigned to 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka {re:long :re?}int})' is '4'
 |
{ dg-end-multiline-output "" } */
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c
index 71790d91753..88fc8edee7f 100644
--- a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c
+++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c
@@ -20,7 +20,7 @@ void test_constant_99 (void)
 |   int32_t *ptr = alloca (99);
 |  ^~
 |  |
-|  (1) allocated 99 bytes and assigned to 'int32_t *' {aka 
'int *'} here; 'sizeof (int32_t {a

Re: [PATCH] testsuite: Fix gcc.dg/analyzer/allocation-size-multiline-[123].c [PR 110426]

2023-08-10 Thread Christophe Lyon via Gcc-patches
Hi!

On Wed, 9 Aug 2023 at 22:30, David Malcolm  wrote:

> On Tue, 2023-08-08 at 15:01 +, Christophe Lyon wrote:
> > For 32-bit newlib targets (e.g. arm-eabi)  int32_t is "long int".
> >
> > Like previous patches in these tests, update the matching regexps to
> > match "aka (long )?int".
> >
> > Tested on arm-eabi and aarch64-linux-gnu.
>
> Sorry about this breakage.
>
> These tests used to emit the information as multiple messages, but were
> consolidated as a side-effect of r14-3001-g021077b94741c9.
>
> I've just committed r14-3114-g73da34a538ddc2, a cleanup of the analyzer
> code, which has a side-effect of splitting the messages back up.  I
> believe that r14-3114 restores these tests to their pre-r14-3001 state,
> but I might have messed up.
>
> Does r14-3114-g73da34a538ddc2 fix the issues for you, or is some
> patching still needed?
>
>
Thanks, indeed the tests pass again (both aarch64 and arm targets)

Christophe


> Dave
>
>
> >
> > 2023-08-08  Christophe Lyon  
> >
> > gcc/testsuite/
> > PR analyzer/110426
> > * gcc.dg/analyzer/allocation-size-multiline-1.c: Handle
> > int32_t being "long int".
> > * gcc.dg/analyzer/allocation-size-multiline-2.c: Likewise.
> > * gcc.dg/analyzer/allocation-size-multiline-3.c: Likewise.
> > ---
> >  gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c | 6 +++-
> > --
> >  gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c | 6 +++-
> > --
> >  gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c | 4 ++--
> >  3 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-
> > 1.c b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c
> > index 9938ba237a0..b56e4b4e8e1 100644
> > --- a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c
> > +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-1.c
> > @@ -16,7 +16,7 @@ void test_constant_1 (void)
> >  |   int32_t *ptr = __builtin_malloc (1);
> >  |  ^~~~
> >  |  |
> > -|  (1) allocated 1 bytes and assigned to
> > 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> > +|  (1) allocated 1 bytes and assigned to
> > 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> > {re:long :re?}int})' is '4'
> >  |
> > { dg-end-multiline-output "" } */
> >
> > @@ -34,7 +34,7 @@ void test_constant_2 (void)
> >  |   int32_t *ptr = __builtin_malloc (2);
> >  |  ^~~~
> >  |  |
> > -|  (1) allocated 2 bytes and assigned to
> > 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> > +|  (1) allocated 2 bytes and assigned to
> > 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> > {re:long :re?}int})' is '4'
> >  |
> > { dg-end-multiline-output "" } */
> >
> > @@ -52,6 +52,6 @@ void test_symbolic (int n)
> >  |   int32_t *ptr = __builtin_malloc (n * 2);
> >  |  ^~~~
> >  |  |
> > -|  (1) allocated 'n * 2' bytes and assigned to
> > 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> > +|  (1) allocated 'n * 2' bytes and assigned to
> > 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> > {re:long :re?}int})' is '4'
> >  |
> > { dg-end-multiline-output "" } */
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-
> > 2.c b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c
> > index 9e1269cbb7a..8912913a78c 100644
> > --- a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c
> > +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-2.c
> > @@ -16,7 +16,7 @@ void test_constant_1 (void)
> >  |   int32_t *ptr = __builtin_alloca (1);
> >  |  ^~~~
> >  |  |
> > -|  (1) allocated 1 bytes and assigned to
> > 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> > +|  (1) allocated 1 bytes and assigned to
> > 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> > {re:long :re?}int})' is '4'
> >  |
> > { dg-end-multiline-output "" } */
> >
> > @@ -33,7 +33,7 @@ void test_constant_2 (void)
> >  |   int32_t *ptr = __builtin_alloca (2);
> >  |  ^~~~
> >  |  |
> > -|  (1) allocated 2 bytes and assigned to
> > 'int32_t *' {aka 'int *'} here; 'sizeof (int32_t {aka int})' is '4'
> > +|  (1) allocated 2 bytes and assigned to
> > 'int32_t *' {aka '{re:long :re?}int *'} here; 'sizeof (int32_t {aka
> > {re:long :re?}int})' is '4'
> >  |
> > { dg-end-multiline-output "" } */
> >
> > @@ -50,7

Re: [PATCH] MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

2023-08-10 Thread Christophe Lyon via Gcc-patches
Hi Andrew,


On Wed, 9 Aug 2023 at 21:20, Andrew Pinski via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> This adds a simple match pattern for this case.
> I noticed it in a couple of different places.
> One while I was looking at code generation of a parser and
> also while I was looking at locations where bitwise_inverted_equal_p
> should be used more.
>
> Committed as approved after bootstrapped and tested on x86_64-linux-gnu
> with no regressions.
>
> PR tree-optimization/110937
> PR tree-optimization/100798
>
> gcc/ChangeLog:
>
> * match.pd (`a ? ~b : b`): Handle this
> case.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/bool-14.c: New test.
> * gcc.dg/tree-ssa/bool-15.c: New test.
> * gcc.dg/tree-ssa/phi-opt-33.c: New test.
> * gcc.dg/tree-ssa/20030709-2.c: Update testcase
> so `a ? -1 : 0` is not used to hit the match
> pattern.
>

Our CI noticed that your patch introduced regressions as follows on aarch64:

 Running gcc:gcc.target/aarch64/aarch64.exp ...
FAIL: gcc.target/aarch64/cond_op_imm_1.c scan-assembler csinv\tw[0-9]*.*
FAIL: gcc.target/aarch64/cond_op_imm_1.c scan-assembler csinv\tx[0-9]*.*

Running gcc:gcc.target/aarch64/sve/aarch64-sve.exp ...
FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-not \\tmov\\tz
FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
\\tneg\\tz[0-9]+\\.b, p[0-7]/m, 3
FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
\\tneg\\tz[0-9]+\\.h, p[0-7]/m, 2
FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
\\tneg\\tz[0-9]+\\.s, p[0-7]/m, 1
FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
\\tnot\\tz[0-9]+\\.b, p[0-7]/m, 3
FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
\\tnot\\tz[0-9]+\\.h, p[0-7]/m, 2
FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
\\tnot\\tz[0-9]+\\.s, p[0-7]/m, 1

Hopefully you'll just need to update the testcases (I didn't check
manually, I think you can easily reproduce this on aarch64?)

Thanks,

Christophe




> ---
>  gcc/match.pd   | 14 ++
>  gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c |  5 +++--
>  gcc/testsuite/gcc.dg/tree-ssa/bool-14.c| 15 +++
>  gcc/testsuite/gcc.dg/tree-ssa/bool-15.c| 18 ++
>  gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c | 13 +
>  5 files changed, 63 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 9b4819e5be7..fc630b63563 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -6460,6 +6460,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(if (cmp == NE_EXPR)
> { constant_boolean_node (true, type); })))
>
> +#if GIMPLE
> +/* a?~t:t -> (-(a))^t */
> +(simplify
> + (cond @0 @1 @2)
> + (if (INTEGRAL_TYPE_P (type)
> +  && bitwise_inverted_equal_p (@1, @2))
> +  (with {
> +auto prec = TYPE_PRECISION (type);
> +auto unsign = TYPE_UNSIGNED (type);
> +tree inttype = build_nonstandard_integer_type (prec, unsign);
> +   }
> +   (convert (bit_xor (negate (convert:inttype @0)) (convert:inttype
> @2))
> +#endif
> +
>  /* Simplify pointer equality compares using PTA.  */
>  (for neeq (ne eq)
>   (simplify
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> index 5009cd69cfe..78938f919d4 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c
> @@ -29,15 +29,16 @@ union tree_node
>  };
>  int make_decl_rtl (tree, int);
>  void *
> -get_alias_set (t)
> +get_alias_set (t, t1)
>   tree t;
> + void *t1;
>  {
>long set;
>if (t->decl.rtl)
>  return (t->decl.rtl->fld[1].rtmem
> ? 0
> : (((t->decl.rtl ? t->decl.rtl: (make_decl_rtl (t, 0),
> t->decl.rtl)))->fld[1]).rtmem);
> -  return (void*)-1;
> +  return t1;
>  }
>
>  /* There should be precisely one load of ->decl.rtl.  If there is
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
> b/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
> new file mode 100644
> index 000..0149380a63b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
> +/* PR tree-optimization/110937 */
> +
> +_Bool f2(_Bool a, _Bool b)
> +{
> +if (a)
> +  return !b;
> +return b;
> +}
> +
> +/* We should be able to remove the conditional and convert it to an xor.
> */
> +/* { dg-final { scan-tree-dump-not "gimple_cond " "optimized" } } */
> +/* { dg-final { scan-tree-dump-not "gimple_phi " "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "bit_xor_expr, " 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-
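
As a quick sanity check of the identity behind this fold (a ? ~b : b becomes
b ^ -(a)), here is a minimal C++ sketch. It is not part of the patch: the
function names select_not and xor_neg are made up for illustration, and it
only exercises a 32-bit signed type, whereas the match.pd pattern applies to
any integral type.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Branchy form, as written in the source being optimized.
static std::int32_t select_not(bool a, std::int32_t b) {
  return a ? ~b : b;
}

// Branchless form: -(int)a is 0 when a is false and all-ones when a is true,
// so the xor either leaves b unchanged or flips every bit of it.
static std::int32_t xor_neg(bool a, std::int32_t b) {
  return b ^ -static_cast<std::int32_t>(a);
}

// Compare both forms on a handful of values, including the extremes.
static bool forms_agree() {
  const std::int32_t samples[] = {0, 1, -1, 42, INT32_MIN, INT32_MAX};
  for (std::int32_t b : samples)
    for (bool a : {false, true})
      if (select_not(a, b) != xor_neg(a, b))
        return false;
  return true;
}
```

The aarch64 failures reported in this thread are about the code generated
after the fold (csinv and predicated SVE neg/not are no longer matched), not
about the arithmetic identity itself.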

Re: [PATCH] MATCH: [PR110937/PR100798] (a ? ~b : b) should be optimized to b ^ -(a)

2023-08-11 Thread Christophe Lyon via Gcc-patches
On Thu, 10 Aug 2023 at 20:52, Andrew Pinski  wrote:

> On Thu, Aug 10, 2023 at 6:39 AM Christophe Lyon via Gcc-patches
>  wrote:
> >
> > Hi Andrew,
> >
> >
> > On Wed, 9 Aug 2023 at 21:20, Andrew Pinski via Gcc-patches <
> > gcc-patches@gcc.gnu.org> wrote:
> >
> > > This adds a simple match pattern for this case.
> > > I noticed it in a couple of different places.
> > > One while I was looking at code generation of a parser and
> > > also while I was looking at locations where bitwise_inverted_equal_p
> > > should be used more.
> > >
> > > Committed as approved after bootstrapped and tested on x86_64-linux-gnu
> > > with no regressions.
> > >
> > > PR tree-optimization/110937
> > > PR tree-optimization/100798
> > >
> > > gcc/ChangeLog:
> > >
> > > * match.pd (`a ? ~b : b`): Handle this
> > > case.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/tree-ssa/bool-14.c: New test.
> > > * gcc.dg/tree-ssa/bool-15.c: New test.
> > > * gcc.dg/tree-ssa/phi-opt-33.c: New test.
> > > * gcc.dg/tree-ssa/20030709-2.c: Update testcase
> > > so `a ? -1 : 0` is not used to hit the match
> > > pattern.
> > >
> >
> > Our CI noticed that your patch introduced regressions as follows on
> aarch64:
> >
> >  Running gcc:gcc.target/aarch64/aarch64.exp ...
> > FAIL: gcc.target/aarch64/cond_op_imm_1.c scan-assembler csinv\tw[0-9]*.*
> > FAIL: gcc.target/aarch64/cond_op_imm_1.c scan-assembler csinv\tx[0-9]*.*
> >
> > Running gcc:gcc.target/aarch64/sve/aarch64-sve.exp ...
> > FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-not \\tmov\\tz
> > FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> > \\tneg\\tz[0-9]+\\.b, p[0-7]/m, 3
> > FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> > \\tneg\\tz[0-9]+\\.h, p[0-7]/m, 2
> > FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> > \\tneg\\tz[0-9]+\\.s, p[0-7]/m, 1
> > FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> > \\tnot\\tz[0-9]+\\.b, p[0-7]/m, 3
> > FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> > \\tnot\\tz[0-9]+\\.h, p[0-7]/m, 2
> > FAIL: gcc.target/aarch64/sve/cond_unary_5.c scan-assembler-times
> > \\tnot\\tz[0-9]+\\.s, p[0-7]/m, 1
> >
> > Hopefully you'll just need to update the testcases (I didn't check
> > manually, I think you can easily reproduce this on aarch64?)
>
> I have a few ideas of how to fix this properly inside isel without
> changing the testcases. I will start working on that starting
> tomorrow.
> In the meantime can you file a bug report? So we don't lose track of
> the regression?
>
Hi Andrew,

Sure, I've just filed:  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110986

Thanks,

Christophe

> Thanks,
> Andrew
>
> >
> > Thanks,
> >
> > Christophe
> >
> >
> >
> >
> > > ---
> > >  gcc/match.pd   | 14 ++
> > >  gcc/testsuite/gcc.dg/tree-ssa/20030709-2.c |  5 +++--
> > >  gcc/testsuite/gcc.dg/tree-ssa/bool-14.c| 15 +++
> > >  gcc/testsuite/gcc.dg/tree-ssa/bool-15.c| 18 ++
> > >  gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c | 13 +
> > >  5 files changed, 63 insertions(+), 2 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-14.c
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-15.c
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-33.c
> > >
> > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > index 9b4819e5be7..fc630b63563 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -6460,6 +6460,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >(if (cmp == NE_EXPR)
> > > { constant_boolean_node (true, type); })))
> > >
> > > +#if GIMPLE
> > > +/* a?~t:t -> (-(a))^t */
> > > +(simplify
> > > + (cond @0 @1 @2)
> > > + (if (INTEGRAL_TYPE_P (type)
> > > +  && bitwise_inverted_equal_p (@1, @2))
> > > +  (with {
> > > +auto prec = TYPE_PRECISION (type);
> > > +auto unsign = TYPE_UNSIGNED (type);
> > > +tree inttype = build_nonstandard_integer_type (prec, unsign);
> > > +   }
> > > +   (convert (bit_xor 

Re: [PATCH] ipa-sra: Don't consider CLOBBERS as writes preventing splitting

2023-08-11 Thread Christophe Lyon via Gcc-patches
Hi Martin,


On Fri, 4 Aug 2023 at 18:26, Martin Jambor  wrote:

> Hello,
>
> On Wed, Aug 02 2023, Richard Biener wrote:
> > On Mon, Jul 31, 2023 at 7:05 PM Martin Jambor  wrote:
> >>
> >> Hi,
> >>
> >> when IPA-SRA detects whether a parameter passed by reference is
> >> written to, it does not special case CLOBBERs which means it often
> >> bails out unnecessarily, especially when dealing with C++ destructors.
> >> Fixed by the obvious continue in the two relevant loops.
> >>
> >> The (slightly) more complex testcases in the PR need surprisingly more
> >> effort but the simple one can be fixed now easily by this patch and I'll
> >> work on the others incrementally.
> >>
> >> Bootstrapped and currently undergoing testsuite run on x86_64-linux.  OK
> >> if it passes too?
> >
> > LGTM, btw - how are the clobbers handled during transform?
>
> it turns out your question is spot on.  I assumed that the mini-DCE that
> I implemented in the IPA-SRA transform would delete them, but I had a
> closer look and it is not invoked on split parameters, only on removed ones.
> What was actually happening is that the parameter got remapped to a
> default definition of a replacement VAR_DECL and we were thus
> gimple-clobbering a pointer pointing to nowhere.  The clobber then got
> DSEd and so I originally did not notice looking at the optimized dump.
>
> Still that is of course not ideal and so I added a simple function
> removing clobbers when splitting.  I was considering adding that
> functionality to ipa_param_body_adjustments::mark_dead_statements but
> that would make the function harder to read without much gain.
>
> So thanks again for the remark.  The following passes bootstrap and
> testing on x86_64-linux.  I am running LTO bootstrap now.  OK if it
> passes?
>
> Martin
>
>
>
> When IPA-SRA detects whether a parameter passed by reference is
> written to, it does not special case CLOBBERs which means it often
> bails out unnecessarily, especially when dealing with C++ destructors.
> Fixed by the obvious continue in the two relevant loops and by adding
> a simple function that marks the clobbers in the transformation code
> as statements to be removed.
>
>
Not sure if you noticed: I updated bugzilla because the new test fails on
arm, and I attached  pr110378-1.C.083i.sra there, to help you debug.

Thanks,

Christophe

> gcc/ChangeLog:
>
> 2023-08-04  Martin Jambor  
>
> PR ipa/110378
> * ipa-param-manipulation.h (class ipa_param_body_adjustments): New
> members get_ddef_if_exists_and_is_used and mark_clobbers_dead.
> * ipa-sra.cc (isra_track_scalar_value_uses): Ignore clobbers.
> (ptr_parm_has_nonarg_uses): Likewise.
> * ipa-param-manipulation.cc
> (ipa_param_body_adjustments::get_ddef_if_exists_and_is_used): New.
> (ipa_param_body_adjustments::mark_dead_statements): Move initial
> checks to get_ddef_if_exists_and_is_used.
> (ipa_param_body_adjustments::mark_clobbers_dead): New.
> (ipa_param_body_adjustments::common_initialization): Call
> mark_clobbers_dead when splitting.
>
> gcc/testsuite/ChangeLog:
>
> 2023-07-31  Martin Jambor  
>
> PR ipa/110378
> * g++.dg/ipa/pr110378-1.C: New test.
> ---
>  gcc/ipa-param-manipulation.cc | 44 +---
>  gcc/ipa-param-manipulation.h  |  2 ++
>  gcc/ipa-sra.cc|  6 ++--
>  gcc/testsuite/g++.dg/ipa/pr110378-1.C | 48 +++
>  4 files changed, 94 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr110378-1.C
>
> diff --git a/gcc/ipa-param-manipulation.cc b/gcc/ipa-param-manipulation.cc
> index a286af7f5d9..4a185ddbdf4 100644
> --- a/gcc/ipa-param-manipulation.cc
> +++ b/gcc/ipa-param-manipulation.cc
> @@ -1072,6 +1072,20 @@ ipa_param_body_adjustments::carry_over_param (tree
> t)
>return new_parm;
>  }
>
> +/* If DECL is a gimple register that has a default definition SSA name
> and that
> +   has some uses, return the default definition, otherwise return
> NULL_TREE.  */
> +
> +tree
> +ipa_param_body_adjustments::get_ddef_if_exists_and_is_used (tree decl)
> +{
> + if (!is_gimple_reg (decl))
> +return NULL_TREE;
> +  tree ddef = ssa_default_def (m_id->src_cfun, decl);
> +  if (!ddef || has_zero_uses (ddef))
> +return NULL_TREE;
> +  return ddef;
> +}
> +
>  /* Populate m_dead_stmts given that DEAD_PARAM is going to be removed
> without
> any replacement or splitting.  REPL is the replacement VAR_SECL to
> base any
> remaining uses of a removed parameter on.  Push all removed SSA names
> that
> @@ -1084,10 +1098,8 @@ ipa_param_body_adjustments::mark_dead_statements
> (tree dead_param,
>/* Current IPA analyses which remove unused parameters never remove a
>   non-gimple register ones which have any use except as parameters in
> other
>   calls, so we can safely leve them as they are.  */
> -  if (!is_gimple_reg (dead_param))
> -return;
> -  tree 

Re: [PATCH] ipa-sra: Don't consider CLOBBERS as writes preventing splitting

2023-08-11 Thread Christophe Lyon via Gcc-patches
On Fri, 11 Aug 2023 at 15:50, Martin Jambor  wrote:

> Hello,
>
> On Fri, Aug 11 2023, Christophe Lyon wrote:
> > Hi Martin,
> >
> >
> > On Fri, 4 Aug 2023 at 18:26, Martin Jambor  wrote:
> >
> >> Hello,
> >>
> >> On Wed, Aug 02 2023, Richard Biener wrote:
> >> > On Mon, Jul 31, 2023 at 7:05 PM Martin Jambor 
> wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> when IPA-SRA detects whether a parameter passed by reference is
> >> >> written to, it does not special case CLOBBERs which means it often
> >> >> bails out unnecessarily, especially when dealing with C++
> destructors.
> >> >> Fixed by the obvious continue in the two relevant loops.
> >> >>
> >> >> The (slightly) more complex testcases in the PR need surprisingly
> more
> >> >> effort but the simple one can be fixed now easily by this patch and
> I'll
> >> >> work on the others incrementally.
> >> >>
> >> >> Bootstrapped and currently undergoing testsuite run on
> x86_64-linux.  OK
> >> >> if it passes too?
> >> >
> >> > LGTM, btw - how are the clobbers handled during transform?
> >>
> >> it turns out your question is spot on.  I assumed that the mini-DCE that
> >> I implemented in the IPA-SRA transform would delete them, but I had a
> >> closer look and it is not invoked on split parameters, only on removed ones.
> >> What was actually happening is that the parameter got remapped to a
> >> default definition of a replacement VAR_DECL and we were thus
> >> gimple-clobbering a pointer pointing to nowhere.  The clobber then got
> >> DSEd and so I originally did not notice looking at the optimized dump.
> >>
> >> Still that is of course not ideal and so I added a simple function
> >> removing clobbers when splitting.  I was considering adding that
> >> functionality to ipa_param_body_adjustments::mark_dead_statements but
> >> that would make the function harder to read without much gain.
> >>
> >> So thanks again for the remark.  The following passes bootstrap and
> >> testing on x86_64-linux.  I am running LTO bootstrap now.  OK if it
> >> passes?
> >>
> >> Martin
> >>
> >>
> >>
> >> When IPA-SRA detects whether a parameter passed by reference is
> >> written to, it does not special case CLOBBERs which means it often
> >> bails out unnecessarily, especially when dealing with C++ destructors.
> >> Fixed by the obvious continue in the two relevant loops and by adding
> >> a simple function that marks the clobbers in the transformation code
> >> as statements to be removed.
> >>
> >>
> > Not sure if you noticed: I updated bugzilla because the new test fails on
> > arm, and I attached  pr110378-1.C.083i.sra there, to help you debug.
> >
>
> I am aware and have actually started looking at the issue a while ago.
> Sorry, I'm only slowly making my way through my TODO list.
>
No worries, thanks for confirming you are aware of the problem ;-)


>
> The difference on 32-bit ARM is that the destructor returns the this
> pointer, which means that IPA-SRA cannot just split out the loaded bit
> without a follow-up IPA analysis showing that the return value is unused,
> and it does not currently take that into account.  But now that we remove
> useless returns before splitting, it should be doable.
>
> Meanwhile, is there a dejagnu target macro for architectures with
> destructors returning a value so that we could xfail the test there?
>
I'm not aware of any at a quick glance.


>
> Thanks for bringing my attention to this.
>
> Martin
>
>
Thanks,

Christophe


>
>
> > Thanks,
> >
> > Christophe
> >
> >> gcc/ChangeLog:
> >>
> >> 2023-08-04  Martin Jambor  
> >>
> >> PR ipa/110378
> >> * ipa-param-manipulation.h (class ipa_param_body_adjustments):
> New
> >> members get_ddef_if_exists_and_is_used and mark_clobbers_dead.
> >> * ipa-sra.cc (isra_track_scalar_value_uses): Ignore clobbers.
> >> (ptr_parm_has_nonarg_uses): Likewise.
> >> * ipa-param-manipulation.cc
> >> (ipa_param_body_adjustments::get_ddef_if_exists_and_is_used):
> New.
> >> (ipa_param_body_adjustments::mark_dead_statements): Move initial
> >> checks to get_ddef_if_exists_and_is_used.
> >> (ipa_param_body_adjustments::mark_clobbers_dead): New.
> >> (ipa_param_body_adjustments::common_initialization): Call
> >> mark_clobbers_dead when splitting.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> 2023-07-31  Martin Jambor  
> >>
> >> PR ipa/110378
> >> * g++.dg/ipa/pr110378-1.C: New test.
> >> ---
> >>  gcc/ipa-param-manipulation.cc | 44 +---
> >>  gcc/ipa-param-manipulation.h  |  2 ++
> >>  gcc/ipa-sra.cc|  6 ++--
> >>  gcc/testsuite/g++.dg/ipa/pr110378-1.C | 48 +++
> >>  4 files changed, 94 insertions(+), 6 deletions(-)
> >>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr110378-1.C
> >>
> >> diff --git a/gcc/ipa-param-manipulation.cc
> b/gcc/ipa-param-manipulation.cc
> >> index a286af7f5d9..4a185ddbdf4 100644
> >> --- a/gcc/ipa-param-manipulation.cc
> >> +++ b

[PATCH] arm: [MVE intrinsics] fix binary_acca_int32 and binary_acca_int64 shapes

2023-08-14 Thread Christophe Lyon via Gcc-patches
Fix these two shapes, where we were failing to check the last
non-predicate parameter.

2023-08-14  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_acca_int32): Fix loop 
bound.
(binary_acca_int64): Likewise.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index 6d477a84330..1633084608e 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -455,7 +455,7 @@ struct binary_acca_int32_def : public overloaded_base<0>
|| (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES)
   return error_mark_node;
 
-unsigned int last_arg = i;
+unsigned int last_arg = i + 1;
 for (i = 1; i < last_arg; i++)
   if (!r.require_matching_vector_type (i, type))
return error_mark_node;
@@ -492,7 +492,7 @@ struct binary_acca_int64_def : public overloaded_base<0>
|| (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES)
   return error_mark_node;
 
-unsigned int last_arg = i;
+unsigned int last_arg = i + 1;
 for (i = 1; i < last_arg; i++)
   if (!r.require_matching_vector_type (i, type))
return error_mark_node;
-- 
2.34.1
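
To illustrate the off-by-one that this patch fixes, here is a minimal C++
sketch. It assumes, as the commit message suggests, that i holds the index of
the last non-predicate argument when last_arg is computed. The args_match
helper and the integer type tags are hypothetical stand-ins for
r.require_matching_vector_type and the real type suffixes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the resolver loop: returns true when every
// argument with index in [1, last_arg) carries the expected type tag.
static bool args_match(const std::vector<int> &arg_types, int expected,
                       std::size_t last_arg) {
  for (std::size_t i = 1; i < last_arg; i++)
    if (arg_types[i] != expected)
      return false;
  return true;
}

// Arguments: index 0 is the accumulator, indices 1 and 2 are vectors.
// Index 2 deliberately has the wrong type tag (9 instead of 7).
static bool old_bound_accepts_bad_call() {
  const std::vector<int> args = {0, 7, 9};
  const std::size_t i = 2;             // index of the last non-predicate arg
  return args_match(args, 7, i);       // old bound: index 2 is never checked
}

static bool new_bound_accepts_bad_call() {
  const std::vector<int> args = {0, 7, 9};
  const std::size_t i = 2;
  return args_match(args, 7, i + 1);   // new bound: index 2 is checked
}
```

With the old bound the mismatching last argument slips through; with i + 1 it
is rejected.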



[PATCH] arm: [MVE intrinsics] Remove dead check for float type in parse_element_type

2023-08-14 Thread Christophe Lyon via Gcc-patches
Fix a likely copy/paste error, where we check if ch == 'f' after we
checked it's either 's' or 'u'.

2023-08-14  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (parse_element_type):
Remove dead check.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index 1633084608e..23eb9d0e69b 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -80,8 +80,7 @@ parse_element_type (const function_instance &instance, const 
char *&format)
 
   if (ch == 's' || ch == 'u')
 {
-  type_class_index tclass = (ch == 'f' ? TYPE_float
-: ch == 's' ? TYPE_signed
+  type_class_index tclass = (ch == 's' ? TYPE_signed
 : TYPE_unsigned);
   char *end;
   unsigned int bits = strtol (format, &end, 10);
-- 
2.34.1



[PATCH 1/9] arm: [MVE intrinsics] factorize vmullbq vmulltq

2023-08-14 Thread Christophe Lyon via Gcc-patches
Factorize vmullbq, vmulltq so that they use the same parameterized
names.

2023-08-14  Christophe Lyon  

gcc/
* config/arm/iterators.md (mve_insn): Add vmullb, vmullt.
(isu): Add VMULLBQ_INT_S, VMULLBQ_INT_U, VMULLTQ_INT_S,
VMULLTQ_INT_U.
(supf): Add VMULLBQ_POLY_P, VMULLTQ_POLY_P, VMULLBQ_POLY_M_P,
VMULLTQ_POLY_M_P.
(VMULLBQ_INT, VMULLTQ_INT, VMULLBQ_INT_M, VMULLTQ_INT_M): Delete.
(VMULLxQ_INT, VMULLxQ_POLY, VMULLxQ_INT_M, VMULLxQ_POLY_M): New.
* config/arm/mve.md (mve_vmullbq_int_)
(mve_vmulltq_int_): Merge into ...
(@mve_q_int_) ... this.
(mve_vmulltq_poly_p, mve_vmullbq_poly_p): Merge into ...
(@mve_q_poly_): ... this.
(mve_vmullbq_int_m_, mve_vmulltq_int_m_): Merge 
into ...
(@mve_q_int_m_): ... this.
(mve_vmullbq_poly_m_p, mve_vmulltq_poly_m_p): Merge into ...
(@mve_q_poly_m_): ... this.
---
 gcc/config/arm/iterators.md |  23 +++--
 gcc/config/arm/mve.md   | 100 
 2 files changed, 38 insertions(+), 85 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index b13ff53d36f..fb003bcd67b 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -917,6 +917,7 @@
 
 (define_int_attr mve_insn [
 (UNSPEC_VCADD90 "vcadd") (UNSPEC_VCADD270 "vcadd")
+(UNSPEC_VCMLA "vcmla") (UNSPEC_VCMLA90 "vcmla") 
(UNSPEC_VCMLA180 "vcmla") (UNSPEC_VCMLA270 "vcmla")
 (UNSPEC_VCMUL "vcmul") (UNSPEC_VCMUL90 "vcmul") 
(UNSPEC_VCMUL180 "vcmul") (UNSPEC_VCMUL270 "vcmul")
 (VABAVQ_P_S "vabav") (VABAVQ_P_U "vabav")
 (VABAVQ_S "vabav") (VABAVQ_U "vabav")
@@ -1044,6 +1045,13 @@
 (VMOVNTQ_S "vmovnt") (VMOVNTQ_U "vmovnt")
 (VMULHQ_M_S "vmulh") (VMULHQ_M_U "vmulh")
 (VMULHQ_S "vmulh") (VMULHQ_U "vmulh")
+(VMULLBQ_INT_M_S "vmullb") (VMULLBQ_INT_M_U "vmullb")
+(VMULLBQ_INT_S "vmullb") (VMULLBQ_INT_U "vmullb")
+(VMULLBQ_POLY_M_P "vmullb") (VMULLTQ_POLY_M_P "vmullt")
+(VMULLBQ_POLY_P "vmullb")
+(VMULLTQ_INT_M_S "vmullt") (VMULLTQ_INT_M_U "vmullt")
+(VMULLTQ_INT_S "vmullt") (VMULLTQ_INT_U "vmullt")
+(VMULLTQ_POLY_P "vmullt")
 (VMULQ_M_N_S "vmul") (VMULQ_M_N_U "vmul") (VMULQ_M_N_F "vmul")
 (VMULQ_M_S "vmul") (VMULQ_M_U "vmul") (VMULQ_M_F "vmul")
 (VMULQ_N_S "vmul") (VMULQ_N_U "vmul") (VMULQ_N_F "vmul")
@@ -1209,7 +1217,6 @@
 (VSUBQ_M_N_S "vsub") (VSUBQ_M_N_U "vsub") (VSUBQ_M_N_F "vsub")
 (VSUBQ_M_S "vsub") (VSUBQ_M_U "vsub") (VSUBQ_M_F "vsub")
 (VSUBQ_N_S "vsub") (VSUBQ_N_U "vsub") (VSUBQ_N_F "vsub")
-(UNSPEC_VCMLA "vcmla") (UNSPEC_VCMLA90 "vcmla") 
(UNSPEC_VCMLA180 "vcmla") (UNSPEC_VCMLA270 "vcmla")
 ])
 
 (define_int_attr isu[
@@ -1246,6 +1253,8 @@
 (VMOVNBQ_S "i") (VMOVNBQ_U "i")
 (VMOVNTQ_M_S "i") (VMOVNTQ_M_U "i")
 (VMOVNTQ_S "i") (VMOVNTQ_U "i")
+(VMULLBQ_INT_S "s") (VMULLBQ_INT_U "u")
+(VMULLTQ_INT_S "s") (VMULLTQ_INT_U "u")
 (VNEGQ_M_S "s")
 (VQABSQ_M_S "s")
 (VQMOVNBQ_M_S "s") (VQMOVNBQ_M_U "u")
@@ -2330,6 +2339,10 @@
   (VMLADAVQ_U "u") (VMULHQ_S "s") (VMULHQ_U "u")
   (VMULLBQ_INT_S "s") (VMULLBQ_INT_U "u") (VQADDQ_S "s")
   (VMULLTQ_INT_S "s") (VMULLTQ_INT_U "u") (VQADDQ_U "u")
+  (VMULLBQ_POLY_P "p")
+  (VMULLTQ_POLY_P "p")
+  (VMULLBQ_POLY_M_P "p")
+  (VMULLTQ_POLY_M_P "p")
   (VMULQ_N_S "s") (VMULQ_N_U "u") (VMULQ_S "s")
   (VMULQ_U "u")
   (VQADDQ_N_S "s") (VQADDQ_N_U "u")
@@ -2713,8 +2726,8 @@
 (define_int_iterator VMINVQ [VMINVQ_U VMINVQ_S])
 (define_int_iterator VMLADAVQ [VMLADAVQ_U VMLADAVQ_S])
 (define_int_iterator VMULHQ [VMULHQ_S VMULHQ_U])
-(define_int_iterator VMULLBQ_INT [VMULLBQ_INT_U VMULLBQ_INT_S])
-(define_int_iterator VMULLTQ_INT [VMULLTQ_INT_U VMULLTQ_INT_S])
+(define_int_iterator VMULLxQ_INT [VMULLBQ_INT_U VMULLBQ_INT_S VMULLTQ_INT_U 
VMULLTQ_INT_S])
+(define_int_iterator VMULLxQ_POLY [VMULLBQ_POLY_P VMULLTQ_POLY_P])
 (define_int_iterator VMULQ [VMULQ_U VMULQ_S])
 (define_int_iterator VMULQ_N [VMULQ_N_U VMULQ_N_S])
 (define_int_iterator VQADDQ [VQADDQ_U VQADDQ_S])
@@ -2815,7 +2828,8 @@
 (define_int_iterator VSLIQ_M_N [VSLIQ_M_N_U VSLIQ_M_N_S])
 (define_int_iterator VRSHLQ_M [VRSHLQ_M_S VRSHLQ_M_U])
 (define_int_iterator VMINQ_M [VMINQ_M_S VMINQ_M_U])
-(define_int_iterator VMULLBQ_INT_M [VMULLBQ_INT_M_U VMULLBQ_INT_M_S])
+(define_int_iterator VMULLxQ_INT_M [VMULLBQ_INT_M_U VMULLBQ_INT_M_S 
VMULLTQ_I

[PATCH 2/9] arm: [MVE intrinsics] add unspec_mve_function_exact_insn_vmull

2023-08-14 Thread Christophe Lyon via Gcc-patches
Introduce a function that will be used to build vmull intrinsics with
the _int variant.

2023-08-14  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-functions.h (class
unspec_mve_function_exact_insn_vmull): New.
---
 gcc/config/arm/arm-mve-builtins-functions.h | 74 +
 1 file changed, 74 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-functions.h 
b/gcc/config/arm/arm-mve-builtins-functions.h
index a6573844319..c0fc450f886 100644
--- a/gcc/config/arm/arm-mve-builtins-functions.h
+++ b/gcc/config/arm/arm-mve-builtins-functions.h
@@ -838,6 +838,80 @@ public:
   }
 };
 
+
+/* Map the vmull-related function directly to CODE (UNSPEC, UNSPEC, M)
+   where M is the vector mode associated with type suffix 0.  We need
+   this special case because the builtins have _int in their
+   names.  */
+class unspec_mve_function_exact_insn_vmull : public function_base
+{
+public:
+  CONSTEXPR unspec_mve_function_exact_insn_vmull (int unspec_for_sint,
+ int unspec_for_uint,
+ int unspec_for_m_sint,
+ int unspec_for_m_uint)
+: m_unspec_for_sint (unspec_for_sint),
+  m_unspec_for_uint (unspec_for_uint),
+  m_unspec_for_m_sint (unspec_for_m_sint),
+  m_unspec_for_m_uint (unspec_for_m_uint)
+  {}
+
+  /* The unspec code associated with signed-integer and
+ unsigned-integer operations respectively.  It covers the cases
+ with and without the _m predicate.  */
+  int m_unspec_for_sint;
+  int m_unspec_for_uint;
+  int m_unspec_for_m_sint;
+  int m_unspec_for_m_uint;
+
+  rtx
+  expand (function_expander &e) const override
+  {
+insn_code code;
+
+if (! e.type_suffix (0).integer_p)
+  gcc_unreachable ();
+
+if (e.mode_suffix_id != MODE_none)
+  gcc_unreachable ();
+
+switch (e.pred)
+  {
+  case PRED_none:
+   /* No predicate, no suffix.  */
+   if (e.type_suffix (0).unsigned_p)
+ code = code_for_mve_q_int (m_unspec_for_uint, m_unspec_for_uint, 
e.vector_mode (0));
+   else
+ code = code_for_mve_q_int (m_unspec_for_sint, m_unspec_for_sint, 
e.vector_mode (0));
+
+   return e.use_exact_insn (code);
+
+  case PRED_m:
+   /* No suffix, "m" predicate.  */
+   if (e.type_suffix (0).unsigned_p)
+ code = code_for_mve_q_int_m (m_unspec_for_m_uint, 
m_unspec_for_m_uint, e.vector_mode (0));
+   else
+ code = code_for_mve_q_int_m (m_unspec_for_m_sint, 
m_unspec_for_m_sint, e.vector_mode (0));
+
+   return e.use_cond_insn (code, 0);
+
+  case PRED_x:
+   /* No suffix, "x" predicate.  */
+   if (e.type_suffix (0).unsigned_p)
+ code = code_for_mve_q_int_m (m_unspec_for_m_uint, 
m_unspec_for_m_uint, e.vector_mode (0));
+   else
+ code = code_for_mve_q_int_m (m_unspec_for_m_sint, 
m_unspec_for_m_sint, e.vector_mode (0));
+
+   return e.use_pred_x_insn (code);
+
+  default:
+   gcc_unreachable ();
+  }
+
+gcc_unreachable ();
+  }
+};
+
 } /* end namespace arm_mve */
 
 /* Declare the global function base NAME, creating it from an instance
-- 
2.34.1
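
The selection logic in expand above can be summarized with a small C++ sketch.
This is only an illustration under simplifying assumptions: sketch_pred,
vmull_unspecs and pick_unspec are hypothetical names, and plain integers stand
in for the real unspec codes.

```cpp
#include <cassert>

// Hypothetical mirror of the three predication kinds handled by expand.
enum sketch_pred { SK_PRED_none, SK_PRED_m, SK_PRED_x };

// Hypothetical bundle of the four unspec codes passed to the constructor.
struct vmull_unspecs {
  int for_sint;
  int for_uint;
  int for_m_sint;
  int for_m_uint;
};

// Pick the unspec the same way the patch does: the unpredicated form uses the
// plain codes, while both _m and _x use the predicated (_m) codes.
static int pick_unspec(const vmull_unspecs &u, sketch_pred pred,
                       bool is_unsigned) {
  switch (pred) {
  case SK_PRED_none:
    return is_unsigned ? u.for_uint : u.for_sint;
  case SK_PRED_m:
  case SK_PRED_x:
    return is_unsigned ? u.for_m_uint : u.for_m_sint;
  }
  return -1;  // not reached for valid predication kinds
}
```

In the patch, _m and _x share the same insn code and differ only in how it is
used afterwards (use_cond_insn versus use_pred_x_insn).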



[PATCH 5/9] arm: [MVE intrinsics] add support for p8 and p16 polynomial types

2023-08-14 Thread Christophe Lyon via Gcc-patches
Although they look like aliases for u8 and u16, we need to define them
so that we can handle p8 and p16 suffixes with the general framework.

They will be used by vmull[bt]q_poly intrinsics.

2023-08-14  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins.cc (type_suffixes): Handle poly_p
field.
(TYPES_poly_8_16): New.
(poly_8_16): New.
* config/arm/arm-mve-builtins.def (p8): New type suffix.
(p16): Likewise.
* config/arm/arm-mve-builtins.h (enum type_class_index): Add
TYPE_poly.
(struct type_suffix_info): Add poly_p field.
---
 gcc/config/arm/arm-mve-builtins.cc  | 6 ++
 gcc/config/arm/arm-mve-builtins.def | 2 ++
 gcc/config/arm/arm-mve-builtins.h   | 5 -
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm-mve-builtins.cc 
b/gcc/config/arm/arm-mve-builtins.cc
index 7eec9d2861c..fa8b0ad36b3 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -128,6 +128,7 @@ CONSTEXPR const type_suffix_info 
type_suffixes[NUM_TYPE_SUFFIXES + 1] = {
 TYPE_##CLASS == TYPE_signed || TYPE_##CLASS == TYPE_unsigned, \
 TYPE_##CLASS == TYPE_unsigned, \
 TYPE_##CLASS == TYPE_float, \
+TYPE_##CLASS == TYPE_poly, \
 0, \
 MODE },
 #include "arm-mve-builtins.def"
@@ -177,6 +178,10 @@ CONSTEXPR const type_suffix_info 
type_suffixes[NUM_TYPE_SUFFIXES + 1] = {
 #define TYPES_all_signed(S, D) \
   S (s8), S (s16), S (s32)
 
+/* _p8 _p16.  */
+#define TYPES_poly_8_16(S, D) \
+  S (p8), S (p16)
+
 /* _u8 _u16 _u32.  */
 #define TYPES_all_unsigned(S, D) \
   S (u8), S (u16), S (u32)
@@ -275,6 +280,7 @@ DEF_MVE_TYPES_ARRAY (integer_8);
 DEF_MVE_TYPES_ARRAY (integer_8_16);
 DEF_MVE_TYPES_ARRAY (integer_16_32);
 DEF_MVE_TYPES_ARRAY (integer_32);
+DEF_MVE_TYPES_ARRAY (poly_8_16);
 DEF_MVE_TYPES_ARRAY (signed_16_32);
 DEF_MVE_TYPES_ARRAY (signed_32);
 DEF_MVE_TYPES_ARRAY (reinterpret_integer);
diff --git a/gcc/config/arm/arm-mve-builtins.def 
b/gcc/config/arm/arm-mve-builtins.def
index e3f37876210..e2cf1baf370 100644
--- a/gcc/config/arm/arm-mve-builtins.def
+++ b/gcc/config/arm/arm-mve-builtins.def
@@ -63,6 +63,8 @@ DEF_MVE_TYPE_SUFFIX (u8, uint8x16_t, unsigned, 8, V16QImode)
 DEF_MVE_TYPE_SUFFIX (u16, uint16x8_t, unsigned, 16, V8HImode)
 DEF_MVE_TYPE_SUFFIX (u32, uint32x4_t, unsigned, 32, V4SImode)
 DEF_MVE_TYPE_SUFFIX (u64, uint64x2_t, unsigned, 64, V2DImode)
+DEF_MVE_TYPE_SUFFIX (p8, uint8x16_t, poly, 8, V16QImode)
+DEF_MVE_TYPE_SUFFIX (p16, uint16x8_t, poly, 16, V8HImode)
 #undef REQUIRES_FLOAT
 
 #define REQUIRES_FLOAT true
diff --git a/gcc/config/arm/arm-mve-builtins.h 
b/gcc/config/arm/arm-mve-builtins.h
index c9b51a0c77b..37b8223dfb2 100644
--- a/gcc/config/arm/arm-mve-builtins.h
+++ b/gcc/config/arm/arm-mve-builtins.h
@@ -146,6 +146,7 @@ enum type_class_index
   TYPE_float,
   TYPE_signed,
   TYPE_unsigned,
+  TYPE_poly,
   NUM_TYPE_CLASSES
 };
 
@@ -221,7 +222,9 @@ struct type_suffix_info
   unsigned int unsigned_p : 1;
   /* True if the suffix is for a floating-point type.  */
   unsigned int float_p : 1;
-  unsigned int spare : 13;
+  /* True if the suffix is for a polynomial type.  */
+  unsigned int poly_p : 1;
+  unsigned int spare : 12;
 
   /* The associated vector or predicate mode.  */
   machine_mode vector_mode : 16;
-- 
2.34.1
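
The bit-field change at the end of this patch takes the new poly_p flag out of
the spare bits, so the flag bits still add up to the same total. A minimal C++
sketch of that bookkeeping follows. The two structs are reduced, hypothetical
copies that keep only the fields visible in the hunk (plus integer_p, which is
assumed from the initializer macro), and the size check relies on the usual
bit-field layout of mainstream ABIs.

```cpp
#include <cassert>

// Reduced copy of the flag bits before the patch (other fields omitted).
struct flags_before {
  unsigned int integer_p : 1;
  unsigned int unsigned_p : 1;
  unsigned int float_p : 1;
  unsigned int spare : 13;
};

// Reduced copy after the patch: poly_p is carved out of the spare bits.
struct flags_after {
  unsigned int integer_p : 1;
  unsigned int unsigned_p : 1;
  unsigned int float_p : 1;
  unsigned int poly_p : 1;
  unsigned int spare : 12;
};

// Both variants declare 16 flag bits in total.
constexpr unsigned bits_before = 1 + 1 + 1 + 13;
constexpr unsigned bits_after = 1 + 1 + 1 + 1 + 12;
static_assert(bits_before == bits_after, "flag bits must still total 16");
static_assert(sizeof(flags_before) == sizeof(flags_after),
              "adding poly_p should not grow the structure");

// With the initializer from the patch, a poly suffix sets only poly_p.
static bool is_poly_only(const flags_after &f) {
  return f.poly_p && !f.integer_p && !f.unsigned_p && !f.float_p;
}
```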



[PATCH 6/9] arm: [MVE intrinsics] add support for U and p formats in parse_element_type

2023-08-14 Thread Christophe Lyon via Gcc-patches
Introduce these two format specifiers to define the shape of
vmull[bt]q_poly intrinsics.

'U' is used to define a double-width unsigned type.
'p' is used to define an element of 'poly' type.

2023-08-14  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (parse_element_type): Add
support for 'U' and 'p' format specifiers.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 16 
 1 file changed, 16 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index c8eb3351ef2..761da4d8ece 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -61,10 +61,12 @@ apply_predication (const function_instance &instance, tree 
return_type,
 
    [01]    - the element type in type suffix 0 or 1 of INSTANCE.
    h<elt>  - a half-sized version of <elt>
+   p<elt>  - a poly type with the same width as <elt>
    s<bits> - a signed type with the given number of bits
    s[01]   - a signed type with the same width as type suffix 0 or 1
    u<bits> - an unsigned type with the given number of bits
    u[01]   - an unsigned type with the same width as type suffix 0 or 1
+   U<elt>  - an unsigned type with the double width as <elt>
    w<elt>  - a double-sized version of <elt>
    x<bits> - a type with the given number of bits and same signedness
  as the next argument.
@@ -102,6 +104,20 @@ parse_element_type (const function_instance &instance, 
const char *&format)
   type_suffixes[suffix].element_bits * 2);
 }
 
+   if (ch == 'U')
+{
+  type_suffix_index suffix = parse_element_type (instance, format);
+  return find_type_suffix (TYPE_unsigned,
+  type_suffixes[suffix].element_bits * 2);
+}
+
+   if (ch == 'p')
+{
+  type_suffix_index suffix = parse_element_type (instance, format);
+  return find_type_suffix (TYPE_poly,
+  type_suffixes[suffix].element_bits);
+}
+
   if (ch == 'x')
 {
   const char *next = format;
-- 
2.34.1



[PATCH 3/9] arm: [MVE intrinsics] add binary_widen shape

2023-08-14 Thread Christophe Lyon via Gcc-patches
This patch adds the binary_widen shape description.
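
As a rough illustration of the checks this shape performs when
resolving an overloaded call, the sketch below models them on plain
(class, width) pairs.  It is a simplification under stated assumptions:
elem_type, widen, resolves and resolves_m are hypothetical names, and
the real resolver works on trees and reports diagnostics.

```cpp
// Toy model of the binary_widen overload checks; not GCC's actual code.
enum type_class { TYPE_signed, TYPE_unsigned };

struct elem_type
{
  type_class tclass;
  unsigned bits;
};

constexpr bool
same_type (elem_type a, elem_type b)
{
  return a.tclass == b.tclass && a.bits == b.bits;
}

// The result element keeps the class and doubles the width.
constexpr elem_type
widen (elem_type t)
{
  return elem_type { t.tclass, t.bits * 2 };
}

// Unpredicated and _x forms: both vector operands must match.
constexpr bool
resolves (elem_type a, elem_type b)
{
  return same_type (a, b);
}

// _m form: additionally, 'inactive' must have the wide type.
constexpr bool
resolves_m (elem_type inactive, elem_type a, elem_type b)
{
  return same_type (a, b) && same_type (inactive, widen (a));
}

// vmullbq_int_m (int32x4_t, int16x8_t, int16x8_t, p) is accepted...
static_assert (resolves_m (elem_type { TYPE_signed, 32 },
			   elem_type { TYPE_signed, 16 },
			   elem_type { TYPE_signed, 16 }), "wide inactive");
// ...but a narrow 'inactive' argument is rejected.
static_assert (!resolves_m (elem_type { TYPE_signed, 16 },
			    elem_type { TYPE_signed, 16 },
			    elem_type { TYPE_signed, 16 }), "narrow inactive");
```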

2023-08-14  Christophe Lyon  

gcc/:

* config/arm/arm-mve-builtins-shapes.cc (binary_widen): New.
* config/arm/arm-mve-builtins-shapes.h (binary_widen): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 42 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  5 +--
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index 1f22201ac95..c8eb3351ef2 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1129,6 +1129,48 @@ struct binary_rshift_narrow_unsigned_def : public 
overloaded_base<0>
 };
 SHAPE (binary_rshift_narrow_unsigned)
 
+/* _t vfoo[_t0](_t, _t)
+
+   Example: vmullbq.
+   int32x4_t [__arm_]vmullbq_int[_s16](int16x8_t a, int16x8_t b)
+   int32x4_t [__arm_]vmullbq_int_m[_s16](int32x4_t inactive, int16x8_t a, 
int16x8_t b, mve_pred16_t p)
+   int32x4_t [__arm_]vmullbq_int_x[_s16](int16x8_t a, int16x8_t b, 
mve_pred16_t p)  */
+struct binary_widen_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "vw0,v0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+unsigned int i, nargs;
+type_suffix_index type;
+if (!r.check_gp_argument (2, i, nargs)
+   || (type = r.infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES)
+  return error_mark_node;
+
+type_suffix_index wide_suffix
+  = find_type_suffix (type_suffixes[type].tclass,
+ type_suffixes[type].element_bits * 2);
+
+if (!r.require_matching_vector_type (i, type))
+  return error_mark_node;
+
+/* Check the inactive argument has the wide type.  */
+if ((r.pred == PRED_m)
+   && (r.infer_vector_type (0) != wide_suffix))
+  return r.report_no_such_form (type);
+
+return r.resolve_to (r.mode_suffix_id, type);
+  }
+};
+SHAPE (binary_widen)
+
 /* _t vfoo[_n_t0](_t, const int)
 
Check that 'imm' is in the [1..#bits] range.
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h 
b/gcc/config/arm/arm-mve-builtins-shapes.h
index a1842f5845c..fa6ec4fc002 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -35,13 +35,13 @@ namespace arm_mve
   {
 
 extern const function_shape *const binary;
-extern const function_shape *const binary_lshift;
-extern const function_shape *const binary_lshift_r;
 extern const function_shape *const binary_acc_int32;
 extern const function_shape *const binary_acc_int64;
 extern const function_shape *const binary_acca_int32;
 extern const function_shape *const binary_acca_int64;
 extern const function_shape *const binary_imm32;
+extern const function_shape *const binary_lshift;
+extern const function_shape *const binary_lshift_r;
 extern const function_shape *const binary_lshift_unsigned;
 extern const function_shape *const binary_maxamina;
 extern const function_shape *const binary_maxavminav;
@@ -54,6 +54,7 @@ namespace arm_mve
 extern const function_shape *const binary_rshift;
 extern const function_shape *const binary_rshift_narrow;
 extern const function_shape *const binary_rshift_narrow_unsigned;
+extern const function_shape *const binary_widen;
 extern const function_shape *const binary_widen_n;
 extern const function_shape *const binary_widen_opt_n;
 extern const function_shape *const cmp;
-- 
2.34.1



[PATCH 7/9] arm: [MVE intrinsics] add binary_widen_poly shape

2023-08-14 Thread Christophe Lyon via Gcc-patches
This patch adds the binary_widen_poly shape description.

2023-08-14  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_widen_poly): New.
* config/arm/arm-mve-builtins-shapes.h (binary_widen_poly): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 49 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 50 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index 761da4d8ece..23eb9d0e69b 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1187,6 +1187,55 @@ struct binary_widen_def : public overloaded_base<0>
 };
 SHAPE (binary_widen)
 
+/* _t vfoo[_t0](_t, _t)
+
+   Example: vmullbq_poly.
+   uint32x4_t [__arm_]vmullbq_poly[_p16](uint16x8_t a, uint16x8_t b)
+   uint32x4_t [__arm_]vmullbq_poly_m[_p16](uint32x4_t inactive, uint16x8_t a, 
uint16x8_t b, mve_pred16_t p)
+   uint32x4_t [__arm_]vmullbq_poly_x[_p16](uint16x8_t a, uint16x8_t b, 
mve_pred16_t p)  */
+struct binary_widen_poly_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "vU0,vp0,vp0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+unsigned int i, nargs;
+type_suffix_index type;
+if (!r.check_gp_argument (2, i, nargs)
+   || (type = r.infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES)
+  return error_mark_node;
+
+/* infer_vector_type found the 'unsigned' version of the 'poly'
+   type we are looking for, so find the 'poly' type with the same
+   width.  */
+type = find_type_suffix (TYPE_poly, type_suffixes[type].element_bits);
+
+type_suffix_index wide_suffix
+  = find_type_suffix (TYPE_unsigned,
+ type_suffixes[type].element_bits * 2);
+
+/* Require the 'poly' type, require_matching_vector_type would try
+   and fail with the 'unsigned' one.  */
+if (!r.require_vector_type (i, type_suffixes[type].vector_type))
+  return error_mark_node;
+
+/* Check the inactive argument has the wide type.  */
+if ((r.pred == PRED_m)
+   && (r.infer_vector_type (0) != wide_suffix))
+  return r.report_no_such_form (type);
+
+return r.resolve_to (r.mode_suffix_id, type);
+  }
+};
+SHAPE (binary_widen_poly)
+
 /* _t vfoo[_n_t0](_t, const int)
 
Check that 'imm' is in the [1..#bits] range.
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h 
b/gcc/config/arm/arm-mve-builtins-shapes.h
index fa6ec4fc002..a93245321c9 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -57,6 +57,7 @@ namespace arm_mve
 extern const function_shape *const binary_widen;
 extern const function_shape *const binary_widen_n;
 extern const function_shape *const binary_widen_opt_n;
+extern const function_shape *const binary_widen_poly;
 extern const function_shape *const cmp;
 extern const function_shape *const create;
 extern const function_shape *const inherent;
-- 
2.34.1



[PATCH 8/9] arm: [MVE intrinsics] add unspec_mve_function_exact_insn_vmull_poly

2023-08-14 Thread Christophe Lyon via Gcc-patches
Introduce a function that will be used to build vmull[bt]q_poly
intrinsics that use poly types.

2023-08-14  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-functions.h (class
unspec_mve_function_exact_insn_vmull_poly): New.
---
 gcc/config/arm/arm-mve-builtins-functions.h | 56 -
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm-mve-builtins-functions.h 
b/gcc/config/arm/arm-mve-builtins-functions.h
index c0fc450f886..eba1f071af0 100644
--- a/gcc/config/arm/arm-mve-builtins-functions.h
+++ b/gcc/config/arm/arm-mve-builtins-functions.h
@@ -838,7 +838,6 @@ public:
   }
 };
 
-
 /* Map the vmull-related function directly to CODE (UNSPEC, UNSPEC, M)
where M is the vector mode associated with type suffix 0.  We need
this special case because the builtins have _int in their
@@ -912,6 +911,61 @@ public:
   }
 };
 
+/* Map the vmull_poly-related function directly to CODE (UNSPEC,
+   UNSPEC, M) where M is the vector mode associated with type suffix
+   0.  We need this special case because the builtins have _poly in
+   their names, and use the special poly type.  */
+class unspec_mve_function_exact_insn_vmull_poly : public function_base
+{
+public:
+  CONSTEXPR unspec_mve_function_exact_insn_vmull_poly (int unspec_for_poly,
+  int unspec_for_m_poly)
+: m_unspec_for_poly (unspec_for_poly),
+  m_unspec_for_m_poly (unspec_for_m_poly)
+  {}
+
+  /* The unspec codes associated with poly operations: m_unspec_for_poly
+ covers the unpredicated case and m_unspec_for_m_poly the _m and _x
+ predicated cases.  */
+  int m_unspec_for_poly;
+  int m_unspec_for_m_poly;
+
+  rtx
+  expand (function_expander &e) const override
+  {
+insn_code code;
+
+if (e.mode_suffix_id != MODE_none)
+  gcc_unreachable ();
+
+if (! e.type_suffix (0).poly_p)
+  gcc_unreachable ();
+
+switch (e.pred)
+  {
+  case PRED_none:
+   /* No predicate, no suffix.  */
+   code = code_for_mve_q_poly (m_unspec_for_poly, m_unspec_for_poly, 
e.vector_mode (0));
+   return e.use_exact_insn (code);
+
+  case PRED_m:
+   /* No suffix, "m" predicate.  */
+   code = code_for_mve_q_poly_m (m_unspec_for_m_poly, m_unspec_for_m_poly, 
e.vector_mode (0));
+   return e.use_cond_insn (code, 0);
+
+  case PRED_x:
+   /* No suffix, "x" predicate.  */
+   code = code_for_mve_q_poly_m (m_unspec_for_m_poly, m_unspec_for_m_poly, 
e.vector_mode (0));
+   return e.use_pred_x_insn (code);
+
+  default:
+   gcc_unreachable ();
+  }
+
+gcc_unreachable ();
+  }
+};
+
 } /* end namespace arm_mve */
 
 /* Declare the global function base NAME, creating it from an instance
-- 
2.34.1



[PATCH 4/9] arm: [MVE intrinsics] rework vmullbq_int vmulltq_int

2023-08-14 Thread Christophe Lyon via Gcc-patches
Implement vmullbq_int, vmulltq_int using the new MVE builtins
framework.
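
For readers unfamiliar with these intrinsics, here is a scalar sketch
of what vmullbq_int_s16 computes, assuming the usual MVE meaning of
"bottom" as the even-numbered lanes (vmulltq_int would use the odd
ones).  The vector structs and the function name are hypothetical and
only model the arithmetic, not the builtin itself.

```cpp
#include <cstdint>

// Plain-array stand-ins for int16x8_t and int32x4_t.
struct vec_s16x8 { std::int16_t lane[8]; };
struct vec_s32x4 { std::int32_t lane[4]; };

// Multiply the even-numbered 16-bit lanes, producing 32-bit results.
constexpr vec_s32x4
model_vmullbq_int_s16 (vec_s16x8 a, vec_s16x8 b)
{
  vec_s32x4 r = {};
  for (int i = 0; i < 4; ++i)
    r.lane[i] = std::int32_t (a.lane[2 * i]) * std::int32_t (b.lane[2 * i]);
  return r;
}

constexpr vec_s16x8 demo_a = {{ 1000, 9, 300, 9, 30000, 9, -32768, 9 }};
constexpr vec_s16x8 demo_b = {{ -1000, 0, 300, 0, 30000, 0, -32768, 0 }};

// The products do not fit in 16 bits, hence the widened result type.
static_assert (model_vmullbq_int_s16 (demo_a, demo_b).lane[0] == -1000000,
	       "lane 0");
static_assert (model_vmullbq_int_s16 (demo_a, demo_b).lane[3] == 1073741824,
	       "lane 3");
```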

2023-08-14  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vmullbq_int, vmulltq_int):
New.
* config/arm/arm-mve-builtins-base.def (vmullbq_int, vmulltq_int):
New.
* config/arm/arm-mve-builtins-base.h (vmullbq_int, vmulltq_int):
New.
* config/arm/arm_mve.h (vmulltq_int): Remove.
(vmullbq_int): Remove.
(vmullbq_int_m): Remove.
(vmulltq_int_m): Remove.
(vmullbq_int_x): Remove.
(vmulltq_int_x): Remove.
(vmulltq_int_u8): Remove.
(vmullbq_int_u8): Remove.
(vmulltq_int_s8): Remove.
(vmullbq_int_s8): Remove.
(vmulltq_int_u16): Remove.
(vmullbq_int_u16): Remove.
(vmulltq_int_s16): Remove.
(vmullbq_int_s16): Remove.
(vmulltq_int_u32): Remove.
(vmullbq_int_u32): Remove.
(vmulltq_int_s32): Remove.
(vmullbq_int_s32): Remove.
(vmullbq_int_m_s8): Remove.
(vmullbq_int_m_s32): Remove.
(vmullbq_int_m_s16): Remove.
(vmullbq_int_m_u8): Remove.
(vmullbq_int_m_u32): Remove.
(vmullbq_int_m_u16): Remove.
(vmulltq_int_m_s8): Remove.
(vmulltq_int_m_s32): Remove.
(vmulltq_int_m_s16): Remove.
(vmulltq_int_m_u8): Remove.
(vmulltq_int_m_u32): Remove.
(vmulltq_int_m_u16): Remove.
(vmullbq_int_x_s8): Remove.
(vmullbq_int_x_s16): Remove.
(vmullbq_int_x_s32): Remove.
(vmullbq_int_x_u8): Remove.
(vmullbq_int_x_u16): Remove.
(vmullbq_int_x_u32): Remove.
(vmulltq_int_x_s8): Remove.
(vmulltq_int_x_s16): Remove.
(vmulltq_int_x_s32): Remove.
(vmulltq_int_x_u8): Remove.
(vmulltq_int_x_u16): Remove.
(vmulltq_int_x_u32): Remove.
(__arm_vmulltq_int_u8): Remove.
(__arm_vmullbq_int_u8): Remove.
(__arm_vmulltq_int_s8): Remove.
(__arm_vmullbq_int_s8): Remove.
(__arm_vmulltq_int_u16): Remove.
(__arm_vmullbq_int_u16): Remove.
(__arm_vmulltq_int_s16): Remove.
(__arm_vmullbq_int_s16): Remove.
(__arm_vmulltq_int_u32): Remove.
(__arm_vmullbq_int_u32): Remove.
(__arm_vmulltq_int_s32): Remove.
(__arm_vmullbq_int_s32): Remove.
(__arm_vmullbq_int_m_s8): Remove.
(__arm_vmullbq_int_m_s32): Remove.
(__arm_vmullbq_int_m_s16): Remove.
(__arm_vmullbq_int_m_u8): Remove.
(__arm_vmullbq_int_m_u32): Remove.
(__arm_vmullbq_int_m_u16): Remove.
(__arm_vmulltq_int_m_s8): Remove.
(__arm_vmulltq_int_m_s32): Remove.
(__arm_vmulltq_int_m_s16): Remove.
(__arm_vmulltq_int_m_u8): Remove.
(__arm_vmulltq_int_m_u32): Remove.
(__arm_vmulltq_int_m_u16): Remove.
(__arm_vmullbq_int_x_s8): Remove.
(__arm_vmullbq_int_x_s16): Remove.
(__arm_vmullbq_int_x_s32): Remove.
(__arm_vmullbq_int_x_u8): Remove.
(__arm_vmullbq_int_x_u16): Remove.
(__arm_vmullbq_int_x_u32): Remove.
(__arm_vmulltq_int_x_s8): Remove.
(__arm_vmulltq_int_x_s16): Remove.
(__arm_vmulltq_int_x_s32): Remove.
(__arm_vmulltq_int_x_u8): Remove.
(__arm_vmulltq_int_x_u16): Remove.
(__arm_vmulltq_int_x_u32): Remove.
(__arm_vmulltq_int): Remove.
(__arm_vmullbq_int): Remove.
(__arm_vmullbq_int_m): Remove.
(__arm_vmulltq_int_m): Remove.
(__arm_vmullbq_int_x): Remove.
(__arm_vmulltq_int_x): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   2 +
 gcc/config/arm/arm-mve-builtins-base.def |   2 +
 gcc/config/arm/arm-mve-builtins-base.h   |   2 +
 gcc/config/arm/arm_mve.h | 648 ---
 4 files changed, 6 insertions(+), 648 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
b/gcc/config/arm/arm-mve-builtins-base.cc
index e31095ae112..3620c56865d 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -329,6 +329,8 @@ FUNCTION_WITHOUT_N_NO_F (vmovltq, VMOVLTQ)
 FUNCTION_WITHOUT_N_NO_F (vmovnbq, VMOVNBQ)
 FUNCTION_WITHOUT_N_NO_F (vmovntq, VMOVNTQ)
 FUNCTION_WITHOUT_N_NO_F (vmulhq, VMULHQ)
+FUNCTION (vmullbq_int, unspec_mve_function_exact_insn_vmull, (VMULLBQ_INT_S, 
VMULLBQ_INT_U, VMULLBQ_INT_M_S, VMULLBQ_INT_M_U))
+FUNCTION (vmulltq_int, unspec_mve_function_exact_insn_vmull, (VMULLTQ_INT_S, 
VMULLTQ_INT_U, VMULLTQ_INT_M_S, VMULLTQ_INT_M_U))
 FUNCTION_WITH_RTX_M_N (vmulq, MULT, VMULQ)
 FUNCTION_WITH_RTX_M_N_NO_F (vmvnq, NOT, VMVNQ)
 FUNCTION (vnegq, unspec_based_mve_function_exact_insn, (NEG, NEG, NEG, -1, -1, 
-1, VNEGQ_M_S, -1, VNEGQ_M_F, -1, -1, -1))
diff --git a/gcc/config/arm/arm-mve-builtins-base.def 
b/gcc/config/arm/arm-mve-builtins-base.def
index e7d466f2efd..db811bec479 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/

[PATCH 9/9] arm: [MVE intrinsics] rework vmullbq_poly vmulltq_poly

2023-08-14 Thread Christophe Lyon via Gcc-patches
Implement vmull[bt]q_poly using the new MVE builtins framework.
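
As background, the sketch below models the arithmetic of
vmullbq_poly_p8 in scalar code: a carry-less (polynomial over GF(2))
multiply of the even-numbered 8-bit lanes, giving 16-bit results.  The
lane selection follows the usual MVE bottom/top convention, and clmul8,
the vector structs and the model function are hypothetical names, not
part of the patch.

```cpp
#include <cstdint>

// Carry-less multiply of two 8-bit polynomials over GF(2).
constexpr std::uint16_t
clmul8 (std::uint8_t a, std::uint8_t b)
{
  std::uint16_t r = 0;
  for (int bit = 0; bit < 8; ++bit)
    if ((b >> bit) & 1)
      r ^= std::uint16_t (std::uint16_t (a) << bit);  // XOR instead of add
  return r;
}

// Plain-array stand-ins for uint8x16_t and uint16x8_t.
struct vec_u8x16 { std::uint8_t lane[16]; };
struct vec_u16x8 { std::uint16_t lane[8]; };

// Multiply the even-numbered lanes, producing double-width results.
constexpr vec_u16x8
model_vmullbq_poly_p8 (vec_u8x16 a, vec_u8x16 b)
{
  vec_u16x8 r = {};
  for (int i = 0; i < 8; ++i)
    r.lane[i] = clmul8 (a.lane[2 * i], b.lane[2 * i]);
  return r;
}

// (x + 1)^2 = x^2 + 1 over GF(2), so 3 * 3 is 5 rather than 9.
static_assert (clmul8 (3, 3) == 5, "carry-less square");
```

The result element is unsigned and twice as wide as the poly input,
which is what the "vU0,vp0,vp0" signature of binary_widen_poly
expresses.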

2023-08-14  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vmullbq_poly)
(vmulltq_poly): New.
* config/arm/arm-mve-builtins-base.def (vmullbq_poly)
(vmulltq_poly): New.
* config/arm/arm-mve-builtins-base.h (vmullbq_poly)
(vmulltq_poly): New.
* config/arm/arm_mve.h (vmulltq_poly): Remove.
(vmullbq_poly): Remove.
(vmullbq_poly_m): Remove.
(vmulltq_poly_m): Remove.
(vmullbq_poly_x): Remove.
(vmulltq_poly_x): Remove.
(vmulltq_poly_p8): Remove.
(vmullbq_poly_p8): Remove.
(vmulltq_poly_p16): Remove.
(vmullbq_poly_p16): Remove.
(vmullbq_poly_m_p8): Remove.
(vmullbq_poly_m_p16): Remove.
(vmulltq_poly_m_p8): Remove.
(vmulltq_poly_m_p16): Remove.
(vmullbq_poly_x_p8): Remove.
(vmullbq_poly_x_p16): Remove.
(vmulltq_poly_x_p8): Remove.
(vmulltq_poly_x_p16): Remove.
(__arm_vmulltq_poly_p8): Remove.
(__arm_vmullbq_poly_p8): Remove.
(__arm_vmulltq_poly_p16): Remove.
(__arm_vmullbq_poly_p16): Remove.
(__arm_vmullbq_poly_m_p8): Remove.
(__arm_vmullbq_poly_m_p16): Remove.
(__arm_vmulltq_poly_m_p8): Remove.
(__arm_vmulltq_poly_m_p16): Remove.
(__arm_vmullbq_poly_x_p8): Remove.
(__arm_vmullbq_poly_x_p16): Remove.
(__arm_vmulltq_poly_x_p8): Remove.
(__arm_vmulltq_poly_x_p16): Remove.
(__arm_vmulltq_poly): Remove.
(__arm_vmullbq_poly): Remove.
(__arm_vmullbq_poly_m): Remove.
(__arm_vmulltq_poly_m): Remove.
(__arm_vmullbq_poly_x): Remove.
(__arm_vmulltq_poly_x): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   2 +
 gcc/config/arm/arm-mve-builtins-base.def |   2 +
 gcc/config/arm/arm-mve-builtins-base.h   |   2 +
 gcc/config/arm/arm_mve.h | 248 ---
 4 files changed, 6 insertions(+), 248 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
b/gcc/config/arm/arm-mve-builtins-base.cc
index 3620c56865d..ed5eba656c1 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -331,6 +331,8 @@ FUNCTION_WITHOUT_N_NO_F (vmovntq, VMOVNTQ)
 FUNCTION_WITHOUT_N_NO_F (vmulhq, VMULHQ)
 FUNCTION (vmullbq_int, unspec_mve_function_exact_insn_vmull, (VMULLBQ_INT_S, 
VMULLBQ_INT_U, VMULLBQ_INT_M_S, VMULLBQ_INT_M_U))
 FUNCTION (vmulltq_int, unspec_mve_function_exact_insn_vmull, (VMULLTQ_INT_S, 
VMULLTQ_INT_U, VMULLTQ_INT_M_S, VMULLTQ_INT_M_U))
+FUNCTION (vmullbq_poly, unspec_mve_function_exact_insn_vmull_poly, 
(VMULLBQ_POLY_P, VMULLBQ_POLY_M_P))
+FUNCTION (vmulltq_poly, unspec_mve_function_exact_insn_vmull_poly, 
(VMULLTQ_POLY_P, VMULLTQ_POLY_M_P))
 FUNCTION_WITH_RTX_M_N (vmulq, MULT, VMULQ)
 FUNCTION_WITH_RTX_M_N_NO_F (vmvnq, NOT, VMVNQ)
 FUNCTION (vnegq, unspec_based_mve_function_exact_insn, (NEG, NEG, NEG, -1, -1, 
-1, VNEGQ_M_S, -1, VNEGQ_M_F, -1, -1, -1))
diff --git a/gcc/config/arm/arm-mve-builtins-base.def 
b/gcc/config/arm/arm-mve-builtins-base.def
index db811bec479..01dfbdef8a3 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -80,6 +80,8 @@ DEF_MVE_FUNCTION (vmovntq, binary_move_narrow, integer_16_32, 
m_or_none)
 DEF_MVE_FUNCTION (vmulhq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vmullbq_int, binary_widen, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vmulltq_int, binary_widen, all_integer, mx_or_none)
+DEF_MVE_FUNCTION (vmullbq_poly, binary_widen_poly, poly_8_16, mx_or_none)
+DEF_MVE_FUNCTION (vmulltq_poly, binary_widen_poly, poly_8_16, mx_or_none)
 DEF_MVE_FUNCTION (vmulq, binary_opt_n, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vmvnq, mvn, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vnegq, unary, all_signed, mx_or_none)
diff --git a/gcc/config/arm/arm-mve-builtins-base.h 
b/gcc/config/arm/arm-mve-builtins-base.h
index 5652fb7c701..c574c32ac53 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -104,6 +104,8 @@ extern const function_base *const vmovntq;
 extern const function_base *const vmulhq;
 extern const function_base *const vmullbq_int;
 extern const function_base *const vmulltq_int;
+extern const function_base *const vmullbq_poly;
+extern const function_base *const vmulltq_poly;
 extern const function_base *const vmulq;
 extern const function_base *const vmvnq;
 extern const function_base *const vnegq;
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 837864aaf29..b82d94e59bd 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -44,14 +44,10 @@
 #define vst4q(__addr, __value) __arm_vst4q(__addr, __value)
 #define vornq(__a, __b) __arm_vornq(__a, __b)
 #define vbicq(__a, __b) __arm_vbicq(__a, __b)
-#define vmulltq_poly(__a, __b) __arm_vmulltq_poly(__a, __b)
-#define v

[PATCH] arm: Fix MVE intrinsics support with LTO (PR target/110268)

2023-06-26 Thread Christophe Lyon via Gcc-patches
After the recent MVE intrinsics re-implementation, LTO stopped working
because the intrinsics would no longer be defined.

The main part of the patch is simple and similar to what we do for
AArch64:
- call handle_arm_mve_h() from arm_init_mve_builtins to declare the
  intrinsics when the compiler is in LTO mode
- actually implement arm_builtin_decl for MVE.

It was just a bit tricky to handle __ARM_MVE_PRESERVE_USER_NAMESPACE:
its value in the user code cannot be guessed at LTO time, so we always
have to assume that it was not defined.  This led to a few fixes in the
way we register MVE builtins as placeholders or not.  Without this
patch, we would just omit some versions of the intrinsics when
__ARM_MVE_PRESERVE_USER_NAMESPACE is true.  In fact, like for the C/C++
placeholders, we need to always keep entries for all of them to ensure
that we have a consistent numbering scheme.
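
The numbering issue can be shown with a toy registry in which a
function's code is simply its index in registration order.  This is a
sketch under simplifying assumptions, not the actual function_builder
code: registry, add_function, code_of_second and the vfooq/vbarq names
are hypothetical.

```cpp
#include <cstddef>

// Toy registry: a function's code is its registration index.
struct registry
{
  const char *names[8];
  bool placeholder[8];
  std::size_t count;
};

constexpr std::size_t
add_function (registry &r, const char *name, bool placeholder_p)
{
  r.names[r.count] = name;
  r.placeholder[r.count] = placeholder_p;
  return r.count++;
}

// Register two intrinsics and return the code given to the second one.
// KEEP_PLACEHOLDERS selects the behaviour after the patch: non-prefixed
// names are always registered, as placeholders when the user namespace
// must be preserved.
constexpr std::size_t
code_of_second (bool preserve_user_namespace, bool keep_placeholders)
{
  registry r = {};
  add_function (r, "__arm_vfooq", false);
  if (keep_placeholders || !preserve_user_namespace)
    add_function (r, "vfooq", preserve_user_namespace);
  return add_function (r, "__arm_vbarq", false);
}

// With placeholders, the compile step and the LTO step agree on codes.
static_assert (code_of_second (true, true) == code_of_second (false, true),
	       "consistent numbering");
// Omitting entries shifts every later code.
static_assert (code_of_second (true, false) != code_of_second (false, false),
	       "inconsistent numbering");
```

In this model the object file would be compiled with one numbering and
read back under LTO with another unless the placeholder entries are
kept, which matches the rationale given above.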

2023-06-26  Christophe Lyon   

PR target/110268
gcc/
* config/arm/arm-builtins.cc (arm_init_mve_builtins): Handle LTO.
(arm_builtin_decl): Handle MVE builtins.
* config/arm/arm-mve-builtins.cc (builtin_decl): New function.
(add_unique_function): Fix handling of
__ARM_MVE_PRESERVE_USER_NAMESPACE.
(add_overloaded_function): Likewise.
* config/arm/arm-protos.h (builtin_decl): New declaration.

gcc/testsuite/
* gcc.target/arm/pr110268-1.c: New test.
* gcc.target/arm/pr110268-2.c: New test.
---
 gcc/config/arm/arm-builtins.cc| 11 +++-
 gcc/config/arm/arm-mve-builtins.cc| 61 ---
 gcc/config/arm/arm-protos.h   |  1 +
 gcc/testsuite/gcc.target/arm/pr110268-1.c | 11 
 gcc/testsuite/gcc.target/arm/pr110268-2.c | 22 
 5 files changed, 76 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-2.c

diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc
index 36365e40a5b..fca7dcaf565 100644
--- a/gcc/config/arm/arm-builtins.cc
+++ b/gcc/config/arm/arm-builtins.cc
@@ -1918,6 +1918,15 @@ arm_init_mve_builtins (void)
   arm_builtin_datum *d = &mve_builtin_data[i];
   arm_init_builtin (fcode, d, "__builtin_mve");
 }
+
+  if (in_lto_p)
+{
+  arm_mve::handle_arm_mve_types_h ();
+  /* Under LTO, we cannot know whether
+__ARM_MVE_PRESERVE_USER_NAMESPACE was defined, so assume it
+was not.  */
+  arm_mve::handle_arm_mve_h (false);
+}
 }
 
 /* Set up all the NEON builtins, even builtins for instructions that are not
@@ -2723,7 +2732,7 @@ arm_builtin_decl (unsigned code, bool initialize_p 
ATTRIBUTE_UNUSED)
 case ARM_BUILTIN_GENERAL:
   return arm_general_builtin_decl (subcode);
 case ARM_BUILTIN_MVE:
-  return error_mark_node;
+  return arm_mve::builtin_decl (subcode);
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/arm/arm-mve-builtins.cc 
b/gcc/config/arm/arm-mve-builtins.cc
index 7033e41a571..e9a12f27411 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -493,6 +493,16 @@ handle_arm_mve_h (bool preserve_user_namespace)
 preserve_user_namespace);
 }
 
+/* Return the function decl with SVE function subcode CODE, or error_mark_node
+   if no such function exists.  */
+tree
+builtin_decl (unsigned int code)
+{
+  if (code >= vec_safe_length (registered_functions))
+return error_mark_node;
+  return (*registered_functions)[code]->decl;
+}
+
 /* Return true if CANDIDATE is equivalent to MODEL_TYPE for overloading
purposes.  */
 static bool
@@ -849,7 +859,6 @@ function_builder::add_function (const function_instance 
&instance,
 ? integer_zero_node
 : simulate_builtin_function_decl (input_location, name, fntype,
  code, NULL, attrs);
-
   registered_function &rfn = *ggc_alloc <registered_function> ();
   rfn.instance = instance;
   rfn.decl = decl;
@@ -889,15 +898,12 @@ function_builder::add_unique_function (const 
function_instance &instance,
   gcc_assert (!*rfn_slot);
   *rfn_slot = &rfn;
 
-  /* Also add the non-prefixed non-overloaded function, if the user namespace
- does not need to be preserved.  */
-  if (!preserve_user_namespace)
-{
-  char *noprefix_name = get_name (instance, false, false);
-  tree attrs = get_attributes (instance);
-  add_function (instance, noprefix_name, fntype, attrs, requires_float,
-   false, false);
-}
+  /* Also add the non-prefixed non-overloaded function, as placeholder
+ if the user namespace does not need to be preserved.  */
+  char *noprefix_name = get_name (instance, false, false);
+  attrs = get_attributes (instance);
+  add_function (instance, noprefix_name, fntype, attrs, requires_float,
+   false, preserve_user_namespace);
 
   /* Also add the function under its overloaded alias, i

Re: [PATCH] arm: Fix MVE intrinsics support with LTO (PR target/110268)

2023-06-26 Thread Christophe Lyon via Gcc-patches
On Mon, 26 Jun 2023 at 17:30, Prathamesh Kulkarni <
prathamesh.kulka...@linaro.org> wrote:

> On Mon, 26 Jun 2023 at 20:33, Christophe Lyon via Gcc-patches
>  wrote:
> >
> > After the recent MVE intrinsics re-implementation, LTO stopped working
> > because the intrinsics would no longer be defined.
> >
> > The main part of the patch is simple and similar to what we do for
> > AArch64:
> > - call handle_arm_mve_h() from arm_init_mve_builtins to declare the
> >   intrinsics when the compiler is in LTO mode
> > - actually implement arm_builtin_decl for MVE.
> >
> > It was just a bit tricky to handle __ARM_MVE_PRESERVE_USER_NAMESPACE:
> > its value in the user code cannot be guessed at LTO time, so we always
> > have to assume that it was not defined.  This led to a few fixes in the
> > way we register MVE builtins as placeholders or not.  Without this
> > patch, we would just omit some versions of the intrinsics when
> > __ARM_MVE_PRESERVE_USER_NAMESPACE is true. In fact, like for the C/C++
> > placeholders, we need to always keep entries for all of them to ensure
> > that we have a consistent numbering scheme.
> >
> > 2023-06-26  Christophe Lyon   
> >
> > PR target/110268
> > gcc/
> > * config/arm/arm-builtins.cc (arm_init_mve_builtins): Handle LTO.
> > (arm_builtin_decl): Handle MVE builtins.
> > * config/arm/arm-mve-builtins.cc (builtin_decl): New function.
> > (add_unique_function): Fix handling of
> > __ARM_MVE_PRESERVE_USER_NAMESPACE.
> > (add_overloaded_function): Likewise.
> > * config/arm/arm-protos.h (builtin_decl): New declaration.
> >
> > gcc/testsuite/
> > * gcc.target/arm/pr110268-1.c: New test.
> > * gcc.target/arm/pr110268-2.c: New test.
> > ---
> >  gcc/config/arm/arm-builtins.cc| 11 +++-
> >  gcc/config/arm/arm-mve-builtins.cc| 61 ---
> >  gcc/config/arm/arm-protos.h   |  1 +
> >  gcc/testsuite/gcc.target/arm/pr110268-1.c | 11 
> >  gcc/testsuite/gcc.target/arm/pr110268-2.c | 22 
> >  5 files changed, 76 insertions(+), 30 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-2.c
> >
> > diff --git a/gcc/config/arm/arm-builtins.cc
> b/gcc/config/arm/arm-builtins.cc
> > index 36365e40a5b..fca7dcaf565 100644
> > --- a/gcc/config/arm/arm-builtins.cc
> > +++ b/gcc/config/arm/arm-builtins.cc
> > @@ -1918,6 +1918,15 @@ arm_init_mve_builtins (void)
> >arm_builtin_datum *d = &mve_builtin_data[i];
> >arm_init_builtin (fcode, d, "__builtin_mve");
> >  }
> > +
> > +  if (in_lto_p)
> > +{
> > +  arm_mve::handle_arm_mve_types_h ();
> > +  /* Under LTO, we cannot know whether
> > +__ARM_MVE_PRESERVE_USER_NAMESPACE was defined, so assume it
> > +was not.  */
> > +  arm_mve::handle_arm_mve_h (false);
> > +}
> >  }
> >
> >  /* Set up all the NEON builtins, even builtins for instructions that
> are not
> > @@ -2723,7 +2732,7 @@ arm_builtin_decl (unsigned code, bool initialize_p
> ATTRIBUTE_UNUSED)
> >  case ARM_BUILTIN_GENERAL:
> >return arm_general_builtin_decl (subcode);
> >  case ARM_BUILTIN_MVE:
> > -  return error_mark_node;
> > +  return arm_mve::builtin_decl (subcode);
> >  default:
> >gcc_unreachable ();
> >  }
> > diff --git a/gcc/config/arm/arm-mve-builtins.cc
> b/gcc/config/arm/arm-mve-builtins.cc
> > index 7033e41a571..e9a12f27411 100644
> > --- a/gcc/config/arm/arm-mve-builtins.cc
> > +++ b/gcc/config/arm/arm-mve-builtins.cc
> > @@ -493,6 +493,16 @@ handle_arm_mve_h (bool preserve_user_namespace)
> >  preserve_user_namespace);
> >  }
> >
> > +/* Return the function decl with SVE function subcode CODE, or
> error_mark_node
> > +   if no such function exists.  */
> Hi Christophe,
> Sorry to nitpick -- s/SVE/MVE ? :)
>
Gasp, I must confess you are right ;-)

Thanks,

Christophe


> Thanks,
> Prathamesh
> > +tree
> > +builtin_decl (unsigned int code)
> > +{
> > +  if (code >= vec_safe_length (registered_functions))
> > +return error_mark_node;
> > +  return (*registered_functions)[code]->decl;
> > +}
> > +
> >  /* Return true if CANDIDATE is equivalent to MODEL_TYPE for overloading
> > purposes.  *

[PATCH 1/2] [testsuite,arm]: Make nomve_fp_1.c require arm_fp

2023-06-28 Thread Christophe Lyon via Gcc-patches
If GCC is configured with the default (soft) -mfloat-abi, and we don't
override the target_board test flags appropriately,
gcc.target/arm/mve/general-c/nomve_fp_1.c fails for lack of
-mfloat-abi=softfp or -mfloat-abi=hard, because it doesn't use
dg-add-options arm_v8_1m_mve (on purpose, see comment in the test).

Require and use the options needed for arm_fp to fix this problem.

2023-06-28  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/mve/general-c/nomve_fp_1.c: Require arm_fp.
---
 gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c 
b/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
index 21c2af16a61..c9d279ead68 100644
--- a/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/general-c/nomve_fp_1.c
@@ -1,9 +1,11 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* Do not use dg-add-options arm_v8_1m_mve, because this might expand to "",
which could imply mve+fp depending on the user settings. We want to make
sure the '+fp' extension is not enabled.  */
 /* { dg-options "-mfpu=auto -march=armv8.1-m.main+mve" } */
+/* { dg-add-options arm_fp } */
 
 #include 
 
-- 
2.34.1



[PATCH 2/2] [testsuite, arm]: Make mve_fp_fpu[12].c accept single or double precision FPU

2023-06-28 Thread Christophe Lyon via Gcc-patches
These tests currently expect a directive containing .fpu fpv5-sp-d16
and thus may fail if they are run with, for instance,
-march=armv8.1-m.main+mve.fp+fp.dp.

This patch accepts either fpv5-sp-d16 or fpv5-d16 to avoid the failure.
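
To make the effect of the relaxed pattern concrete, the sketch below
spells out its alternation as two plain substring checks.  It is only
an illustration of which assembler lines are accepted; the helper names
are hypothetical and the testsuite itself uses the Tcl regular
expression shown in the diff.

```cpp
// True if S begins with PREFIX.
constexpr bool
starts_with (const char *s, const char *prefix)
{
  while (*prefix)
    {
      if (*s != *prefix)
	return false;
      ++s;
      ++prefix;
    }
  return true;
}

// True if NEEDLE occurs anywhere in S.
constexpr bool
contains (const char *s, const char *needle)
{
  for (; *s; ++s)
    if (starts_with (s, needle))
      return true;
  return false;
}

// Same acceptance set as "\.fpu fpv5(-sp|)-d16": single- or
// double-precision FPv5 with 16 D registers.
constexpr bool
fpu_directive_ok (const char *asm_text)
{
  return contains (asm_text, ".fpu fpv5-sp-d16")
	 || contains (asm_text, ".fpu fpv5-d16");
}

static_assert (fpu_directive_ok ("\t.fpu fpv5-sp-d16\n"), "single precision");
static_assert (fpu_directive_ok ("\t.fpu fpv5-d16\n"), "double precision");
static_assert (!fpu_directive_ok ("\t.fpu fpv4-sp-d16\n"), "other FPU");
```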

2023-06-28  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: Fix .fpu
scan-assembler.
* gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
index e375327fb97..8358a616bb5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
@@ -12,4 +12,4 @@ foo1 (int8x16_t value)
   return b;
 }
 
-/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
+/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
index 1fca1100cf0..5dd2feefc35 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
@@ -12,4 +12,4 @@ foo1 (int8x16_t value)
   return b;
 }
 
-/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" }  } */
+/* { dg-final { scan-assembler "\.fpu fpv5(-sp|)-d16" }  } */
-- 
2.34.1



Re: [committed] libstdc++: Fix preprocessor conditions for std::from_chars [PR109921]

2023-06-29 Thread Christophe Lyon via Gcc-patches
On Thu, 29 Jun 2023 at 14:50, Jonathan Wakely  wrote:

>
>
> On Thu, 1 Jun 2023 at 12:05, Jonathan Wakely 
> wrote:
>
>> On Thu, 1 Jun 2023 at 10:30, Christophe Lyon via Libstdc++
>>  wrote:
>> >
>> > Hi,
>> >
>> >
>> > On Wed, 31 May 2023 at 14:25, Jonathan Wakely via Gcc-patches <
>> > gcc-patches@gcc.gnu.org> wrote:
>> >
>> > > Tested powerpc64le-linux. Pushed to trunk.
>> > >
>> > > -- >8 --
>> > >
>> > > We use the from_chars_strtod function with __strtof128 to read a
>> > > _Float128 value, but from_chars_strtod is not defined unless uselocale
>> > > is available. This can lead to compilation failures for some targets,
>> > > because we try to define the _Float128 overload in terms of a
>> > > non-existing from_chars_strtod function.
>> > >
>> > > Only try to use __strtof128 if uselocale is available, otherwise
>> > > fall back to the long double overload of std::from_chars (which might
>> > > fall back to the double overload, which should use fast_float).
>> > >
>> > > This ensures we always define the full set of overloads, even if they
>> > > are not always accurate for all values of the wider types.
>> > >
>> > > libstdc++-v3/ChangeLog:
>> > >
>> > > PR libstdc++/109921
>> > > * src/c++17/floating_from_chars.cc
>> (USE_STRTOF128_FOR_FROM_CHARS):
>> > > Only define when USE_STRTOD_FOR_FROM_CHARS is also defined.
>> > > (USE_STRTOD_FOR_FROM_CHARS): Do not undefine when long double
>> is
>> > > binary64.
>> > > (from_chars(const char*, const char*, double&, chars_format)):
>> > > Check __LDBL_MANT_DIG__ == __DBL_MANT_DIG__ here.
>> > > (from_chars(const char*, const char*, _Float128&,
>> chars_format))
>> > > Only use from_chars_strtod when USE_STRTOD_FOR_FROM_CHARS is
>> > > defined, otherwise parse a long double and convert to
>> _Float128.
>> > >
>> >
>> >
>> > This is causing a regression on aarch64:
>> >  FAIL: libstdc++-abi/abi_check
>>
>> This is now PR 110077.
>>
>
> Hi Christophe,
>
> Is this fixed for aarch64 now? I think it should be.
>
Hi Jonathan,

Yes, I now see
PASS: libstdc++-abi/abi_check

Thanks for fixing this.

Christophe
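
For readers following the fallback described in the quoted commit message (parse with a narrower overload, then convert to the wider type), here is a minimal C++17 sketch. It is not the libstdc++ implementation: wide_float and parse_wide_via_double are hypothetical names, and long double stands in for _Float128, which is not assumed to be available.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for a wider floating-point type; the real patch
// deals with _Float128, which is not assumed to be available here.
using wide_float = long double;

// Parse TEXT as a double and widen the result.  The value is only as
// accurate as double, which mirrors the trade-off described in the
// commit message.  Returns false if nothing could be parsed.
bool parse_wide_via_double(std::string_view text, wide_float &out)
{
  double narrow = 0.0;
  const char *first = text.data();
  const char *last = first + text.size();
  std::from_chars_result res = std::from_chars(first, last, narrow);
  if (res.ec != std::errc{})
    return false;
  out = static_cast<wide_float>(narrow);
  return true;
}
```

The overload for the wider type always exists this way, at the cost of precision for values that double cannot represent exactly.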



>
>>
>>
>> >
>> > The log says:
>> >
>> > 3 added symbols
>> > 0
>> > _ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE11_S_allocateERS3_m
>> > std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>,
>> > std::allocator<wchar_t> >::_S_allocate(std::allocator<wchar_t>&,
>> unsigned
>> > long)
>> > version status: compatible
>> > GLIBCXX_3.4.32
>> > type: function
>> > status: added
>> >
>> > 1
>> > _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE11_S_allocateERS3_m
>> > std::__cxx11::basic_string<char, std::char_traits<char>,
>> > std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned
>> long)
>> > version status: compatible
>> > GLIBCXX_3.4.32
>> > type: function
>> > status: added
>> >
>> > 2
>> > _ZSt10from_charsPKcS0_RDF128_St12chars_format
>> > std::from_chars(char const*, char const*, _Float128&, std::chars_format)
>> > version status: incompatible
>> > GLIBCXX_3.4.31
>> > type: function
>> > status: added
>> >
>> >
>> > 2 undesignated symbols
>> > 0
>> > _ZSt11__once_call
>> > std::__once_call
>> > version status: compatible
>> > GLIBCXX_3.4.11
>> > type: tls
>> > type size: 8
>> > status: undesignated
>> >
>> > 1
>> > _ZSt15__once_callable
>> > std::__once_callable
>> > version status: compatible
>> > GLIBCXX_3.4.11
>> > type: tls
>> > type size: 8
>> > status: undesignated
>> >
>> >
>> > 1 incompatible symbols
>> > 0
>> > _ZSt10from_charsPKcS0_RDF128_St12chars_format
>> > std::from_chars(char const*, char const*, _Float128&, std::chars_format)
>> > version status: incompatible
>> > GLIBCXX_3.4.31
>> > type: function
>> > status: added
>> >
>> >
>> >
>> >  libstdc++-v3 check-abi Summary 
>> >
>> > # of added symbols:  3
>> > # of missing symbols:0
>> > # of undesignated symbols:   2
>> > # of incompatible symbols:   1
>> >
>> >
>> > Can you have a look?
>> >
>> > Thanks,
>> > Christophe
>> >
>> > ---
>> > >  libstdc++-v3/src/c++17/floating_from_chars.cc | 20
>> ---
>> > >  1 file changed, 13 insertions(+), 7 deletions(-)
>> > >
>> > > diff --git a/libstdc++-v3/src/c++17/floating_from_chars.cc
>> > > b/libstdc++-v3/src/c++17/floating_from_chars.cc
>> > > index ebd428d5be3..eea878072b0 100644
>> > > --- a/libstdc++-v3/src/c++17/floating_from_chars.cc
>> > > +++ b/libstdc++-v3/src/c++17/floating_from_chars.cc
>> > > @@ -64,7 +64,7 @@
>> > >  // strtold for __ieee128
>> > >  extern "C" __ieee128 __strtoieee128(const char*, char**);
>> > >  #elif __FLT128_MANT_DIG__ == 113 && __LDBL_MANT_DIG__ != 113 \
>> > > -  && defined(__GLIBC_PREREQ)
>> > > +  && defined(__GLIBC_PREREQ) &&
>> defined(USE_STRTOD_FOR_FROM_CHARS)
>> > >  #define USE_STRTOF128_FOR_FROM_CHARS 1
>> > >  extern "C" _Float128 __strtof128(const char*, char**)
>> > >__asm ("strtof128")
>> > > @@ -77,10 +77,6 @@ extern "C" _Float128 __strtof128(const char*,
>> char**)
>> > >  #if _GLIBCXX_FLOAT_IS_

Re: [PATCH] arm: Fix MVE intrinsics support with LTO (PR target/110268)

2023-07-03 Thread Christophe Lyon via Gcc-patches
Ping?


On Mon, 26 Jun 2023 at 17:02, Christophe Lyon 
wrote:

> After the recent MVE intrinsics re-implementation, LTO stopped working
> because the intrinsics would no longer be defined.
>
> The main part of the patch is simple and similar to what we do for
> AArch64:
> - call handle_arm_mve_h() from arm_init_mve_builtins to declare the
>   intrinsics when the compiler is in LTO mode
> - actually implement arm_builtin_decl for MVE.
>
> It was just a bit tricky to handle __ARM_MVE_PRESERVE_USER_NAMESPACE:
> its value in the user code cannot be guessed at LTO time, so we always
> have to assume that it was not defined.  This led to a few fixes in the
> way we register MVE builtins as placeholders or not.  Without this
> patch, we would just omit some versions of the intrinsics when
> __ARM_MVE_PRESERVE_USER_NAMESPACE is true. In fact, like for the C/C++
> placeholders, we need to always keep entries for all of them to ensure
> that we have a consistent numbering scheme.
>
> 2023-06-26  Christophe Lyon   
>
> PR target/110268
> gcc/
> * config/arm/arm-builtins.cc (arm_init_mve_builtins): Handle LTO.
> (arm_builtin_decl): Handle MVE builtins.
> * config/arm/arm-mve-builtins.cc (builtin_decl): New function.
> (add_unique_function): Fix handling of
> __ARM_MVE_PRESERVE_USER_NAMESPACE.
> (add_overloaded_function): Likewise.
> * config/arm/arm-protos.h (builtin_decl): New declaration.
>
> gcc/testsuite/
> * gcc.target/arm/pr110268-1.c: New test.
> * gcc.target/arm/pr110268-2.c: New test.
> ---
>  gcc/config/arm/arm-builtins.cc| 11 +++-
>  gcc/config/arm/arm-mve-builtins.cc| 61 ---
>  gcc/config/arm/arm-protos.h   |  1 +
>  gcc/testsuite/gcc.target/arm/pr110268-1.c | 11 
>  gcc/testsuite/gcc.target/arm/pr110268-2.c | 22 
>  5 files changed, 76 insertions(+), 30 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-2.c
>
> diff --git a/gcc/config/arm/arm-builtins.cc
> b/gcc/config/arm/arm-builtins.cc
> index 36365e40a5b..fca7dcaf565 100644
> --- a/gcc/config/arm/arm-builtins.cc
> +++ b/gcc/config/arm/arm-builtins.cc
> @@ -1918,6 +1918,15 @@ arm_init_mve_builtins (void)
>arm_builtin_datum *d = &mve_builtin_data[i];
>arm_init_builtin (fcode, d, "__builtin_mve");
>  }
> +
> +  if (in_lto_p)
> +{
> +  arm_mve::handle_arm_mve_types_h ();
> +  /* Under LTO, we cannot know whether
> +__ARM_MVE_PRESERVE_USER_NAMESPACE was defined, so assume it
> +was not.  */
> +  arm_mve::handle_arm_mve_h (false);
> +}
>  }
>
>  /* Set up all the NEON builtins, even builtins for instructions that are
> not
> @@ -2723,7 +2732,7 @@ arm_builtin_decl (unsigned code, bool initialize_p
> ATTRIBUTE_UNUSED)
>  case ARM_BUILTIN_GENERAL:
>return arm_general_builtin_decl (subcode);
>  case ARM_BUILTIN_MVE:
> -  return error_mark_node;
> +  return arm_mve::builtin_decl (subcode);
>  default:
>gcc_unreachable ();
>  }
> diff --git a/gcc/config/arm/arm-mve-builtins.cc
> b/gcc/config/arm/arm-mve-builtins.cc
> index 7033e41a571..e9a12f27411 100644
> --- a/gcc/config/arm/arm-mve-builtins.cc
> +++ b/gcc/config/arm/arm-mve-builtins.cc
> @@ -493,6 +493,16 @@ handle_arm_mve_h (bool preserve_user_namespace)
>  preserve_user_namespace);
>  }
>
> +/* Return the function decl with SVE function subcode CODE, or
> error_mark_node
> +   if no such function exists.  */
> +tree
> +builtin_decl (unsigned int code)
> +{
> +  if (code >= vec_safe_length (registered_functions))
> +return error_mark_node;
> +  return (*registered_functions)[code]->decl;
> +}
> +
>  /* Return true if CANDIDATE is equivalent to MODEL_TYPE for overloading
> purposes.  */
>  static bool
> @@ -849,7 +859,6 @@ function_builder::add_function (const
> function_instance &instance,
>  ? integer_zero_node
>  : simulate_builtin_function_decl (input_location, name, fntype,
>   code, NULL, attrs);
> -
>registered_function &rfn = *ggc_alloc <registered_function> ();
>rfn.instance = instance;
>rfn.decl = decl;
> @@ -889,15 +898,12 @@ function_builder::add_unique_function (const
> function_instance &instance,
>gcc_assert (!*rfn_slot);
>*rfn_slot = &rfn;
>
> -  /* Also add the non-prefixed non-overloaded function, if the user
> namespace
> - does not need to be preserved.  */
> -  if (!preserve_user_namespace)
> -{
> -  char *noprefix_name = get_name (instance, false, false);
> -  tree attrs = get_attributes (instance);
> -  add_function (instance, noprefix_name, fntype, attrs,
> requires_float,
> -   false, false);
> -}
> +  /* Also add the non-prefixed non-overloaded function, as placeholder
> + if the user namespace does not need to 

Re: [PATCH] arm: Fix MVE intrinsics support with LTO (PR target/110268)

2023-07-06 Thread Christophe Lyon via Gcc-patches
On Wed, 5 Jul 2023 at 19:07, Kyrylo Tkachov  wrote:

> Hi Christophe,
>
> > -Original Message-
> > From: Christophe Lyon 
> > Sent: Monday, June 26, 2023 4:03 PM
> > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
> > Richard Sandiford 
> > Cc: Christophe Lyon 
> > Subject: [PATCH] arm: Fix MVE intrinsics support with LTO (PR
> target/110268)
> >
> > After the recent MVE intrinsics re-implementation, LTO stopped working
> > because the intrinsics would no longer be defined.
> >
> > The main part of the patch is simple and similar to what we do for
> > AArch64:
> > - call handle_arm_mve_h() from arm_init_mve_builtins to declare the
> >   intrinsics when the compiler is in LTO mode
> > - actually implement arm_builtin_decl for MVE.
> >
> > It was just a bit tricky to handle __ARM_MVE_PRESERVE_USER_NAMESPACE:
> > its value in the user code cannot be guessed at LTO time, so we always
> > have to assume that it was not defined.  This led to a few fixes in the
> > way we register MVE builtins as placeholders or not.  Without this
> > patch, we would just omit some versions of the intrinsics when
> > __ARM_MVE_PRESERVE_USER_NAMESPACE is true. In fact, like for the C/C++
> > placeholders, we need to always keep entries for all of them to ensure
> > that we have a consistent numbering scheme.
> >
> >   2023-06-26  Christophe Lyon   
> >
> >   PR target/110268
> >   gcc/
> >   * config/arm/arm-builtins.cc (arm_init_mve_builtins): Handle LTO.
> >   (arm_builtin_decl): Handle MVE builtins.
> >   * config/arm/arm-mve-builtins.cc (builtin_decl): New function.
> >   (add_unique_function): Fix handling of
> >   __ARM_MVE_PRESERVE_USER_NAMESPACE.
> >   (add_overloaded_function): Likewise.
> >   * config/arm/arm-protos.h (builtin_decl): New declaration.
> >
> >   gcc/testsuite/
> >   * gcc.target/arm/pr110268-1.c: New test.
> >   * gcc.target/arm/pr110268-2.c: New test.
> > ---
> >  gcc/config/arm/arm-builtins.cc| 11 +++-
> >  gcc/config/arm/arm-mve-builtins.cc| 61 ---
> >  gcc/config/arm/arm-protos.h   |  1 +
> >  gcc/testsuite/gcc.target/arm/pr110268-1.c | 11 
> >  gcc/testsuite/gcc.target/arm/pr110268-2.c | 22 
> >  5 files changed, 76 insertions(+), 30 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-2.c
> >
> > diff --git a/gcc/config/arm/arm-builtins.cc
> b/gcc/config/arm/arm-builtins.cc
> > index 36365e40a5b..fca7dcaf565 100644
> > --- a/gcc/config/arm/arm-builtins.cc
> > +++ b/gcc/config/arm/arm-builtins.cc
> > @@ -1918,6 +1918,15 @@ arm_init_mve_builtins (void)
> >arm_builtin_datum *d = &mve_builtin_data[i];
> >arm_init_builtin (fcode, d, "__builtin_mve");
> >  }
> > +
> > +  if (in_lto_p)
> > +{
> > +  arm_mve::handle_arm_mve_types_h ();
> > +  /* Under LTO, we cannot know whether
> > +  __ARM_MVE_PRESERVE_USER_NAMESPACE was defined, so assume
> > it
> > +  was not.  */
> > +  arm_mve::handle_arm_mve_h (false);
> > +}
> >  }
> >
> >  /* Set up all the NEON builtins, even builtins for instructions that
> are not
> > @@ -2723,7 +2732,7 @@ arm_builtin_decl (unsigned code, bool initialize_p
> > ATTRIBUTE_UNUSED)
> >  case ARM_BUILTIN_GENERAL:
> >return arm_general_builtin_decl (subcode);
> >  case ARM_BUILTIN_MVE:
> > -  return error_mark_node;
> > +  return arm_mve::builtin_decl (subcode);
> >  default:
> >gcc_unreachable ();
> >  }
> > diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-
> > builtins.cc
> > index 7033e41a571..e9a12f27411 100644
> > --- a/gcc/config/arm/arm-mve-builtins.cc
> > +++ b/gcc/config/arm/arm-mve-builtins.cc
> > @@ -493,6 +493,16 @@ handle_arm_mve_h (bool
> > preserve_user_namespace)
> >preserve_user_namespace);
> >  }
> >
> > +/* Return the function decl with SVE function subcode CODE, or
> > error_mark_node
> > +   if no such function exists.  */
> > +tree
> > +builtin_decl (unsigned int code)
> > +{
> > +  if (code >= vec_safe_length (registered_functions))
> > +return error_mark_node;
> > +  return (*registered_functions)[code]->decl;
> > +}
> > +
> >  /* Return true if CANDIDATE is equivalent to MODEL_TYPE for overloading
> > purposes.  */
> >  static bool
> > @@ -849,7 +859,6 @@ function_builder::add_function (const
> > function_instance &instance,
> >  ? integer_zero_node
> >  : simulate_builtin_function_decl (input_location, name, fntype,
> > code, NULL, attrs);
> > -
> >registered_function &rfn = *ggc_alloc <registered_function> ();
> >rfn.instance = instance;
> >rfn.decl = decl;
> > @@ -889,15 +898,12 @@ function_builder::add_unique_function (const
> > function_instance &instance,
> >gcc_assert (!*rfn_slot);
> >*rfn_slot = &rfn;
> >
> > -  /* Also add the non-prefixed non-overlo

[PATCH] doc: Document arm_v8_1m_main_cde_mve_fp

2023-07-07 Thread Christophe Lyon via Gcc-patches
The arm_v8_1m_main_cde_mve_fp family of effective targets was not
documented when it was introduced.

2023-07-07  Christophe Lyon  

gcc/
* doc/sourcebuild.texi (arm_v8_1m_main_cde_mve_fp): Document.
---
 gcc/doc/sourcebuild.texi | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 526020c7511..03fb2394705 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2190,6 +2190,12 @@ ARM target supports options to generate instructions 
from ARMv8.1-M with
 the Custom Datapath Extension (CDE) and M-Profile Vector Extension (MVE).
 Some multilibs may be incompatible with these options.
 
+@item arm_v8_1m_main_cde_mve_fp
+ARM target supports options to generate instructions from ARMv8.1-M
+with the Custom Datapath Extension (CDE) and M-Profile Vector
+Extension (MVE) with floating-point support.  Some multilibs may be
+incompatible with these options.
+
 @item arm_pacbti_hw
 Test system supports executing Pointer Authentication and Branch Target
 Identification instructions.
-- 
2.34.1



[PATCH] testsuite: Add _link flavor for several arm_arch* and arm* effective-targets

2023-07-07 Thread Christophe Lyon via Gcc-patches
For arm targets, we generate many effective-targets with
check_effective_target_FUNC_multilib and
check_effective_target_arm_arch_FUNC_multilib which check if we can
link and execute a simple program with a given set of flags/multilibs.

In some cases however, it's possible to link but not to execute a
program, so this patch adds similar _link effective-targets which only
check if link succeeds.

The patch does not update the documentation as it already lacks the
numerous existing related effective-targets.

2023-07-07  Christophe Lyon  

gcc/testsuite/
* lib/target-supports.exp (arm_*FUNC_link): New effective-targets.
---
 gcc/testsuite/lib/target-supports.exp | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index c04db2be7f9..d33bc077418 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5129,6 +5129,14 @@ foreach { armfunc armflag armdefs } {
return "$flags FLAG"
}
 
+proc check_effective_target_arm_arch_FUNC_link { } {
+   return [check_no_compiler_messages arm_arch_FUNC_link executable {
+   #include 
+   int dummy;
+   int main (void) { return 0; }
+   } [add_options_for_arm_arch_FUNC ""]]
+   }
+
proc check_effective_target_arm_arch_FUNC_multilib { } {
return [check_runtime arm_arch_FUNC_multilib {
int
@@ -5906,6 +5914,7 @@ proc add_options_for_arm_v8_2a_bf16_neon { flags } {
 #   arm_v8m_main_cde: Armv8-m CDE (Custom Datapath Extension).
 #   arm_v8m_main_cde_fp: Armv8-m CDE with FP registers.
 #   arm_v8_1m_main_cde_mve: Armv8.1-m CDE with MVE.
+#   arm_v8_1m_main_cde_mve_fp: Armv8.1-m CDE with MVE with FP support.
 # Usage:
 #   /* { dg-require-effective-target arm_v8m_main_cde_ok } */
 #   /* { dg-add-options arm_v8m_main_cde } */
@@ -5965,6 +5974,24 @@ foreach { armfunc armflag armdef arminc } {
return "$flags $et_FUNC_flags"
}
 
+proc check_effective_target_FUNC_link { } {
+   if { ! [check_effective_target_FUNC_ok] } {
+   return 0;
+   }
+   return [check_no_compiler_messages FUNC_link executable {
+   #if !(DEF)
+   #error "DEF failed"
+   #endif
+   #include <arm_cde.h>
+   INC
+   int
+   main (void)
+   {
+   return 0;
+   }
+   } [add_options_for_FUNC ""]]
+   }
+
proc check_effective_target_FUNC_multilib { } {
if { ! [check_effective_target_FUNC_ok] } {
return 0;
-- 
2.34.1



[PATCH v2] arm: Fix MVE intrinsics support with LTO (PR target/110268)

2023-07-10 Thread Christophe Lyon via Gcc-patches
After the recent MVE intrinsics re-implementation, LTO stopped working
because the intrinsics would no longer be defined.

The main part of the patch is simple and similar to what we do for
AArch64:
- call handle_arm_mve_h() from arm_init_mve_builtins to declare the
  intrinsics when the compiler is in LTO mode
- actually implement arm_builtin_decl for MVE.

It was just a bit tricky to handle __ARM_MVE_PRESERVE_USER_NAMESPACE:
its value in the user code cannot be guessed at LTO time, so we always
have to assume that it was not defined.  This led to a few fixes in the
way we register MVE builtins as placeholders or not.  Without this
patch, we would just omit some versions of the intrinsics when
__ARM_MVE_PRESERVE_USER_NAMESPACE is true. In fact, like for the C/C++
placeholders, we need to always keep entries for all of them to ensure
that we have a consistent numbering scheme.
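
The numbering argument above can be sketched with a much simplified C++ model. None of the names below (entry, registry, add_unique, lookup) are GCC's; they are hypothetical and only show why every name needs a slot regardless of the preserve-user-namespace setting, and how a bounds-checked lookup by code can work.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, much simplified model of a builtin registry.  None of
// these names are GCC's; they only illustrate the numbering argument.
struct entry
{
  std::string name;
  bool placeholder;  // true: slot is reserved but the name is not declared
};

class registry
{
public:
  // Always append an entry, so codes do not depend on the
  // preserve-user-namespace setting.  Returns the code of the new entry.
  std::size_t add(const std::string &name, bool placeholder)
  {
    entries_.push_back({name, placeholder});
    return entries_.size() - 1;
  }

  // Register the prefixed name and its non-prefixed alias.  The alias
  // becomes a placeholder when the user namespace must be preserved.
  void add_unique(const std::string &base, bool preserve_user_namespace)
  {
    add("__arm_" + base, false);
    add(base, preserve_user_namespace);
  }

  // Bounds-checked lookup: nullptr plays the role of error_mark_node.
  const entry *lookup(std::size_t code) const
  {
    return code < entries_.size() ? &entries_[code] : nullptr;
  }

  std::size_t size() const { return entries_.size(); }

private:
  std::vector<entry> entries_;
};
```

With this layout, a registry built with the namespace preserved and one built at LTO time (where it is assumed not to be preserved) assign the same code to the same function, which is the property the LTO reader relies on.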

2023-06-26  Christophe Lyon   

PR target/110268
gcc/
* config/arm/arm-builtins.cc (arm_init_mve_builtins): Handle LTO.
(arm_builtin_decl): Handle MVE builtins.
* config/arm/arm-mve-builtins.cc (builtin_decl): New function.
(add_unique_function): Fix handling of
__ARM_MVE_PRESERVE_USER_NAMESPACE.
(add_overloaded_function): Likewise.
* config/arm/arm-protos.h (builtin_decl): New declaration.

gcc/testsuite/
* gcc.target/arm/pr110268-1.c: New test.
* gcc.target/arm/pr110268-2.c: New test.
---
 gcc/config/arm/arm-builtins.cc| 11 +++-
 gcc/config/arm/arm-mve-builtins.cc| 61 ---
 gcc/config/arm/arm-protos.h   |  1 +
 gcc/testsuite/gcc.target/arm/pr110268-1.c | 12 +
 gcc/testsuite/gcc.target/arm/pr110268-2.c | 23 +
 5 files changed, 78 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-2.c

diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc
index 36365e40a5b..fca7dcaf565 100644
--- a/gcc/config/arm/arm-builtins.cc
+++ b/gcc/config/arm/arm-builtins.cc
@@ -1918,6 +1918,15 @@ arm_init_mve_builtins (void)
   arm_builtin_datum *d = &mve_builtin_data[i];
   arm_init_builtin (fcode, d, "__builtin_mve");
 }
+
+  if (in_lto_p)
+{
+  arm_mve::handle_arm_mve_types_h ();
+  /* Under LTO, we cannot know whether
+__ARM_MVE_PRESERVE_USER_NAMESPACE was defined, so assume it
+was not.  */
+  arm_mve::handle_arm_mve_h (false);
+}
 }
 
 /* Set up all the NEON builtins, even builtins for instructions that are not
@@ -2723,7 +2732,7 @@ arm_builtin_decl (unsigned code, bool initialize_p 
ATTRIBUTE_UNUSED)
 case ARM_BUILTIN_GENERAL:
   return arm_general_builtin_decl (subcode);
 case ARM_BUILTIN_MVE:
-  return error_mark_node;
+  return arm_mve::builtin_decl (subcode);
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/arm/arm-mve-builtins.cc 
b/gcc/config/arm/arm-mve-builtins.cc
index 7033e41a571..413d8100607 100644
--- a/gcc/config/arm/arm-mve-builtins.cc
+++ b/gcc/config/arm/arm-mve-builtins.cc
@@ -493,6 +493,16 @@ handle_arm_mve_h (bool preserve_user_namespace)
 preserve_user_namespace);
 }
 
+/* Return the function decl with MVE function subcode CODE, or error_mark_node
+   if no such function exists.  */
+tree
+builtin_decl (unsigned int code)
+{
+  if (code >= vec_safe_length (registered_functions))
+return error_mark_node;
+  return (*registered_functions)[code]->decl;
+}
+
 /* Return true if CANDIDATE is equivalent to MODEL_TYPE for overloading
purposes.  */
 static bool
@@ -849,7 +859,6 @@ function_builder::add_function (const function_instance 
&instance,
 ? integer_zero_node
 : simulate_builtin_function_decl (input_location, name, fntype,
  code, NULL, attrs);
-
   registered_function &rfn = *ggc_alloc <registered_function> ();
   rfn.instance = instance;
   rfn.decl = decl;
@@ -889,15 +898,12 @@ function_builder::add_unique_function (const 
function_instance &instance,
   gcc_assert (!*rfn_slot);
   *rfn_slot = &rfn;
 
-  /* Also add the non-prefixed non-overloaded function, if the user namespace
- does not need to be preserved.  */
-  if (!preserve_user_namespace)
-{
-  char *noprefix_name = get_name (instance, false, false);
-  tree attrs = get_attributes (instance);
-  add_function (instance, noprefix_name, fntype, attrs, requires_float,
-   false, false);
-}
+  /* Also add the non-prefixed non-overloaded function, as placeholder
+ if the user namespace does not need to be preserved.  */
+  char *noprefix_name = get_name (instance, false, false);
+  attrs = get_attributes (instance);
+  add_function (instance, noprefix_name, fntype, attrs, requires_float,
+   false, preserve_user_namespace);
 
   /* Also add the function under its overloaded alias, if we w

Re: [PATCH] testsuite: Add _link flavor for several arm_arch* and arm* effective-targets

2023-07-10 Thread Christophe Lyon via Gcc-patches
On Mon, 10 Jul 2023 at 15:46, Kyrylo Tkachov  wrote:

>
>
> > -Original Message-
> > From: Christophe Lyon 
> > Sent: Friday, July 7, 2023 8:52 AM
> > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
> > Richard Earnshaw 
> > Cc: Christophe Lyon 
> > Subject: [PATCH] testsuite: Add _link flavor for several arm_arch* and
> arm*
> > effective-targets
> >
> > For arm targets, we generate many effective-targets with
> > check_effective_target_FUNC_multilib and
> > check_effective_target_arm_arch_FUNC_multilib which check if we can
> > link and execute a simple program with a given set of flags/multilibs.
> >
> > In some cases however, it's possible to link but not to execute a
> > program, so this patch adds similar _link effective-targets which only
> > check if link succeeds.
> >
> > The patch does not update the documentation as it already lacks the
> > numerous existing related effective-targets.
>
> I think this looks ok but...
>
> >
> > 2023-07-07  Christophe Lyon  
> >
> >   gcc/testsuite/
> >   * lib/target-supports.exp (arm_*FUNC_link): New effective-targets.
> > ---
> >  gcc/testsuite/lib/target-supports.exp | 27 +++
> >  1 file changed, 27 insertions(+)
> >
> > diff --git a/gcc/testsuite/lib/target-supports.exp
> b/gcc/testsuite/lib/target-
> > supports.exp
> > index c04db2be7f9..d33bc077418 100644
> > --- a/gcc/testsuite/lib/target-supports.exp
> > +++ b/gcc/testsuite/lib/target-supports.exp
> > @@ -5129,6 +5129,14 @@ foreach { armfunc armflag armdefs } {
> >   return "$flags FLAG"
> >   }
> >
> > +proc check_effective_target_arm_arch_FUNC_link { } {
> > + return [check_no_compiler_messages arm_arch_FUNC_link
> > executable {
> > + #include 
> > + int dummy;
> > + int main (void) { return 0; }
> > + } [add_options_for_arm_arch_FUNC ""]]
> > + }
> > +
> >   proc check_effective_target_arm_arch_FUNC_multilib { } {
> >   return [check_runtime arm_arch_FUNC_multilib {
> >   int
> > @@ -5906,6 +5914,7 @@ proc add_options_for_arm_v8_2a_bf16_neon {
> > flags } {
> >  #   arm_v8m_main_cde: Armv8-m CDE (Custom Datapath Extension).
> >  #   arm_v8m_main_cde_fp: Armv8-m CDE with FP registers.
> >  #   arm_v8_1m_main_cde_mve: Armv8.1-m CDE with MVE.
> > +#   arm_v8_1m_main_cde_mve_fp: Armv8.1-m CDE with MVE with FP
> > support.
> >  # Usage:
> >  #   /* { dg-require-effective-target arm_v8m_main_cde_ok } */
> >  #   /* { dg-add-options arm_v8m_main_cde } */
> > @@ -5965,6 +5974,24 @@ foreach { armfunc armflag armdef arminc } {
> >   return "$flags $et_FUNC_flags"
> >   }
> >
> > +proc check_effective_target_FUNC_link { } {
> > + if { ! [check_effective_target_FUNC_ok] } {
> > + return 0;
> > + }
> > + return [check_no_compiler_messages FUNC_link executable {
> > + #if !(DEF)
> > + #error "DEF failed"
> > + #endif
> > + #include <arm_cde.h>
>
> ... why is arm_cde.h included here?
>
> It's the very same code as  check_effective_target_FUNC_multilib below.

I think it's needed in case the toolchain's default configuration is not
able to support CDE. I believe these tests would fail if the toolchain
defaults to -mfloat-abi=soft (the "usual" gnu/stubs-{soft|hard}.h error).

I added this chunk for consistency with the other one; it's not needed at
the moment.

Christophe



> > + INC
> > + int
> > + main (void)
> > + {
> > + return 0;
> > + }
> > + } [add_options_for_FUNC ""]]
> > + }
> > +
> >   proc check_effective_target_FUNC_multilib { } {
> >   if { ! [check_effective_target_FUNC_ok] } {
> >   return 0;
> > --
> > 2.34.1
>
>


Re: [PATCH] [arm] adjust tests for quotes around +cdecp

2023-02-20 Thread Christophe Lyon via Gcc-patches

Hi Alexandre,


On 2/17/23 08:17, Alexandre Oliva via Gcc-patches wrote:


Back when quotes were added around "+cdecp" in the "coproc must be
a constant immediate" error in arm-builtins.cc, tests for that message
lagged behind.  Fixed thusly.

Regstrapped on x86_64-linux-gnu.
Tested on arm-vxworks7 (gcc-12) and arm-eabi (trunk).  Ok to install?



It seems this changed with r12-6553-gc3782843badbf3, right?
I see this commit added quotes in several other places: are the two
tests you fix the only ones impacted?


Thanks,

Christophe


for  gcc/testsuite/ChangeLog

* gcc.target/arm/acle/cde-errors.c: Adjust messages for quote
around +cdecp.
* gcc.target/arm/acle/cde-mve-error-2.c: Likewise.
---
  gcc/testsuite/gcc.target/arm/acle/cde-errors.c |   52 ++---
  .../gcc.target/arm/acle/cde-mve-error-2.c  |   82 ++--
  2 files changed, 67 insertions(+), 67 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/acle/cde-errors.c 
b/gcc/testsuite/gcc.target/arm/acle/cde-errors.c
index 85a91666cd5ef..f38514848677e 100644
--- a/gcc/testsuite/gcc.target/arm/acle/cde-errors.c
+++ b/gcc/testsuite/gcc.target/arm/acle/cde-errors.c
@@ -47,19 +47,19 @@ uint64_t test_cde (uint32_t n, uint32_t m)
accum += __arm_cx3da (7, accum, n, m,   0); /* { dg-error 
{coprocessor 7 is not enabled with \+cdecp7} } */
  
/* `coproc` out of range.  */

-  accum += __arm_cx1   (8,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx1a  (8, (uint32_t)accum,   0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx2   (8, n, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx2a  (8, (uint32_t)accum, n,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx3   (8, n, m,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx3a  (8, (uint32_t)accum, n, m, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-
-  accum += __arm_cx1d  (8,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx1da (8, accum, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx2d  (8, n, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx2da (8, accum, n,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx3d  (8, n, m,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx3da (8, accum, n, m,   0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
+  accum += __arm_cx1   (8,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx1a  (8, (uint32_t)accum,   0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx2   (8, n, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx2a  (8, (uint32_t)accum, n,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx3   (8, n, m,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx3a  (8, (uint32_t)accum, n, m, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+
+  accum += __arm_cx1d  (8,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx1da (8, accum, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx2d  (8, n, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx2da (8, accum, n,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx3d  (8, n, m,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx3da (8, accum, n, m,   0); /* { dg-error {coproc must 

Re: [PATCH] [arm] adjust tests for quotes around +cdecp

2023-02-22 Thread Christophe Lyon via Gcc-patches




On 2/22/23 13:38, Alexandre Oliva wrote:

Hello, Christophe,

On Feb 20, 2023, Christophe Lyon  wrote:


On 2/17/23 08:17, Alexandre Oliva via Gcc-patches wrote:


Back when quotes were added around "+cdecp" in the "coproc must be
a constant immediate" error in arm-builtins.cc, tests for that message
lagged behind.  Fixed thusly.

Regstrapped on x86_64-linux-gnu.
Tested on arm-vxworks7 (gcc-12) and arm-eabi (trunk).  Ok to install?



It seems this changed with r12-6553-gc3782843badbf3, right?


Yup.


I see this commit added quotes in several other places: are the two
tests you fix the only ones impacted?


https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612176.html in
asm-flag-4.c also fixed fallout from that patch, I realize now, but that
was all that came up in our testing.

I didn't start from that patch, I was just going through test results,
investigating the failures and fixing them or at least annotating the
failures as expected.

It is conceivable that other quoted strings appear in tests that are
skipped by all of the target variants that we test.  Indeed, I went
through some arm-*-eabi variants not long ago, and these didn't come up.
So, in case you're wondering whether to look for the other strings in
the tests or somesuch, please don't assume I've already done so.


OK thanks for the clarification.

(and for the other cleanup patches!)

Christophe
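
The test adjustment discussed in this thread can be illustrated with a small C++ sketch. It assumes that an unanchored std::regex search approximates how a dg-error regexp is matched, and that the quotes appear as plain ASCII apostrophes in the diagnostic. The helper diagnostic_matches is hypothetical; the two pattern fragments are taken from the old and adjusted tests.

```cpp
#include <regex>
#include <string>

// Returns true when PATTERN matches somewhere in MESSAGE, which is how
// an unanchored dg-error regexp is assumed to behave here.
bool diagnostic_matches(const std::string &pattern, const std::string &message)
{
  return std::regex_search(message, std::regex(pattern));
}

// Pattern fragments taken from the old and the adjusted tests.  The "."
// on each side of "\+cdecp" absorbs one character, such as a quote.
const std::string old_pattern = R"(enabled with \+cdecp)";
const std::string new_pattern = R"(enabled with .\+cdecp.)";
```

Once the diagnostic puts a quote between "with " and "+cdecp", the old fragment no longer matches, while the adjusted one does.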


Re: [PATCH] [arm] adjust tests for quotes around +cdecp

2023-03-03 Thread Christophe Lyon via Gcc-patches




On 3/3/23 09:24, Alexandre Oliva wrote:
> Hello, Christophe,
>
> On Feb 22, 2023, Christophe Lyon wrote:
>
>> OK thanks for the clarification.
>
>> (and for the other cleanup patches!)
>
> Was this meant as approval?  I hadn't taken it as such.

Unfortunately, no, I don't have such powers :-)

> Otherwise, please consider this a ping :-)
> https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612188.html
>
> Thanks,



[PATCH 01/20] arm: [MVE intrinsics] factorize vcmp

2023-05-10 Thread Christophe Lyon via Gcc-patches
Factorize vcmp so that they use the same pattern.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (MVE_CMP_M, MVE_CMP_M_F, MVE_CMP_M_N)
(MVE_CMP_M_N_F, mve_cmp_op1): New.
(isu): Add VCMP*
(supf): Likewise.
* config/arm/mve.md (mve_vcmpq_n_): Rename into ...
(@mve_vcmpq_n_): ... this.
(mve_vcmpeqq_m_f, mve_vcmpgeq_m_f)
(mve_vcmpgtq_m_f, mve_vcmpleq_m_f)
(mve_vcmpltq_m_f, mve_vcmpneq_m_f): Merge into ...
(@mve_vcmpq_m_f): ... this.
(mve_vcmpcsq_m_u, mve_vcmpeqq_m_)
(mve_vcmpgeq_m_s, mve_vcmpgtq_m_s)
(mve_vcmphiq_m_u, mve_vcmpleq_m_s)
(mve_vcmpltq_m_s, mve_vcmpneq_m_): Merge into
...
(@mve_vcmpq_m_): ... this.
(mve_vcmpcsq_m_n_u, mve_vcmpeqq_m_n_)
(mve_vcmpgeq_m_n_s, mve_vcmpgtq_m_n_s)
(mve_vcmphiq_m_n_u, mve_vcmpleq_m_n_s)
(mve_vcmpltq_m_n_s, mve_vcmpneq_m_n_): Merge
into ...
(@mve_vcmpq_m_n_): ... this.
(mve_vcmpeqq_m_n_f, mve_vcmpgeq_m_n_f)
(mve_vcmpgtq_m_n_f, mve_vcmpleq_m_n_f)
(mve_vcmpltq_m_n_f, mve_vcmpneq_m_n_f): Merge into ...
(@mve_vcmpq_m_n_f): ... this.
---
 gcc/config/arm/iterators.md | 108 ++
 gcc/config/arm/mve.md   | 414 +++-
 2 files changed, 135 insertions(+), 387 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 3c70fd7f56d..ef9fae0412b 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -583,6 +583,47 @@ (define_int_iterator MVE_FP_CREATE_ONLY [
 VCREATEQ_F
 ])
 
+;; MVE comparison iterators
+(define_int_iterator MVE_CMP_M [
+VCMPCSQ_M_U
+VCMPEQQ_M_S VCMPEQQ_M_U
+VCMPGEQ_M_S
+VCMPGTQ_M_S
+VCMPHIQ_M_U
+VCMPLEQ_M_S
+VCMPLTQ_M_S
+VCMPNEQ_M_S VCMPNEQ_M_U
+])
+
+(define_int_iterator MVE_CMP_M_F [
+VCMPEQQ_M_F
+VCMPGEQ_M_F
+VCMPGTQ_M_F
+VCMPLEQ_M_F
+VCMPLTQ_M_F
+VCMPNEQ_M_F
+])
+
+(define_int_iterator MVE_CMP_M_N [
+VCMPCSQ_M_N_U
+VCMPEQQ_M_N_S VCMPEQQ_M_N_U
+VCMPGEQ_M_N_S
+VCMPGTQ_M_N_S
+VCMPHIQ_M_N_U
+VCMPLEQ_M_N_S
+VCMPLTQ_M_N_S
+VCMPNEQ_M_N_S VCMPNEQ_M_N_U
+])
+
+(define_int_iterator MVE_CMP_M_N_F [
+VCMPEQQ_M_N_F
+VCMPGEQ_M_N_F
+VCMPGTQ_M_N_F
+VCMPLEQ_M_N_F
+VCMPLTQ_M_N_F
+VCMPNEQ_M_N_F
+])
+
 (define_int_iterator MVE_VMAXVQ_VMINVQ [
 VMAXAVQ_S
 VMAXVQ_S VMAXVQ_U
@@ -655,6 +696,37 @@ (define_code_attr mve_addsubmul [
 (plus "vadd")
 ])
 
+(define_int_attr mve_cmp_op1 [
+(VCMPCSQ_M_U "cs")
+(VCMPEQQ_M_S "eq") (VCMPEQQ_M_U "eq")
+(VCMPGEQ_M_S "ge")
+(VCMPGTQ_M_S "gt")
+(VCMPHIQ_M_U "hi")
+(VCMPLEQ_M_S "le")
+(VCMPLTQ_M_S "lt")
+(VCMPNEQ_M_S "ne") (VCMPNEQ_M_U "ne")
+(VCMPEQQ_M_F "eq")
+(VCMPGEQ_M_F "ge")
+(VCMPGTQ_M_F "gt")
+(VCMPLEQ_M_F "le")
+(VCMPLTQ_M_F "lt")
+(VCMPNEQ_M_F "ne")
+(VCMPCSQ_M_N_U "cs")
+(VCMPEQQ_M_N_S "eq") (VCMPEQQ_M_N_U "eq")
+(VCMPGEQ_M_N_S "ge")
+(VCMPGTQ_M_N_S "gt")
+(VCMPHIQ_M_N_U "hi")
+(VCMPLEQ_M_N_S "le")
+(VCMPLTQ_M_N_S "lt")
+(VCMPNEQ_M_N_S "ne") (VCMPNEQ_M_N_U "ne")
+(VCMPEQQ_M_N_F "eq")
+(VCMPGEQ_M_N_F "ge")
+(VCMPGTQ_M_N_F "gt")
+(VCMPLEQ_M_N_F "le")
+(VCMPLTQ_M_N_F "lt")
+(VCMPNEQ_M_N_F "ne")
+])
+
 (define_int_attr mve_insn [
 (VABDQ_M_S "vabd") (VABDQ_M_U "vabd") (VABDQ_M_F "vabd")
 (VABDQ_S "vabd") (VABDQ_U "vabd") (VABDQ_F "vabd")
@@ -836,6 +908,26 @@ (define_int_attr isu[
 (VCLSQ_M_S "s")
 (VCLZQ_M_S "i")
 (VCLZQ_M_U "i")
+(VCMPCSQ_M_N_U "u")
+(VCMPCSQ_M_U "u")
+(VCMPEQQ_M_N_S "i")
+(VCMPEQQ_M_N_U "i")
+(VCMPEQQ_M_S "i")
+(VCMPEQQ_M_U "i")
+(VCMPGEQ_M_N_S "s")
+(VCMPGEQ_M_S "s")
+(VCMPGTQ_M_N_S "s")
+  

[PATCH 11/20] arm: [MVE intrinsics] rework vaddvq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Implement vaddvq using the new MVE builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vaddvq): New.
* config/arm/arm-mve-builtins-base.def (vaddvq): New.
* config/arm/arm-mve-builtins-base.h (vaddvq): New.
* config/arm/arm_mve.h (vaddvq): Remove.
(vaddvq_p): Remove.
(vaddvq_s8): Remove.
(vaddvq_s16): Remove.
(vaddvq_s32): Remove.
(vaddvq_u8): Remove.
(vaddvq_u16): Remove.
(vaddvq_u32): Remove.
(vaddvq_p_u8): Remove.
(vaddvq_p_s8): Remove.
(vaddvq_p_u16): Remove.
(vaddvq_p_s16): Remove.
(vaddvq_p_u32): Remove.
(vaddvq_p_s32): Remove.
(__arm_vaddvq_s8): Remove.
(__arm_vaddvq_s16): Remove.
(__arm_vaddvq_s32): Remove.
(__arm_vaddvq_u8): Remove.
(__arm_vaddvq_u16): Remove.
(__arm_vaddvq_u32): Remove.
(__arm_vaddvq_p_u8): Remove.
(__arm_vaddvq_p_s8): Remove.
(__arm_vaddvq_p_u16): Remove.
(__arm_vaddvq_p_s16): Remove.
(__arm_vaddvq_p_u32): Remove.
(__arm_vaddvq_p_s32): Remove.
(__arm_vaddvq): Remove.
(__arm_vaddvq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   1 +
 gcc/config/arm/arm-mve-builtins-base.def |   1 +
 gcc/config/arm/arm-mve-builtins-base.h   |   1 +
 gcc/config/arm/arm_mve.h | 200 ---
 4 files changed, 3 insertions(+), 200 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index cb572130c2b..7f90fc65ae2 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -244,6 +244,7 @@ namespace arm_mve {
 FUNCTION_WITHOUT_N (vabdq, VABDQ)
 FUNCTION (vabsq, unspec_based_mve_function_exact_insn, (ABS, ABS, ABS, -1, -1, -1, VABSQ_M_S, -1, VABSQ_M_F, -1, -1, -1))
 FUNCTION_WITH_RTX_M_N (vaddq, PLUS, VADDQ)
+FUNCTION_PRED_P_S_U (vaddvq, VADDVQ)
 FUNCTION_WITH_RTX_M (vandq, AND, VANDQ)
 FUNCTION_WITHOUT_N_NO_U_F (vclsq, VCLSQ)
 FUNCTION (vclzq, unspec_based_mve_function_exact_insn, (CLZ, CLZ, CLZ, -1, -1, -1, VCLZQ_M_S, VCLZQ_M_U, -1, -1, -1 ,-1))
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index 30e6aa1e1e6..d32745f334a 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -21,6 +21,7 @@
 DEF_MVE_FUNCTION (vabdq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vabsq, unary, all_signed, mx_or_none)
 DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_integer, mx_or_none)
+DEF_MVE_FUNCTION (vaddvq, unary_int32, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vandq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vclsq, unary, all_signed, mx_or_none)
 DEF_MVE_FUNCTION (vclzq, unary, all_integer, mx_or_none)
diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h
index 3dc9114045f..9080542e7e3 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -26,6 +26,7 @@ namespace functions {
 extern const function_base *const vabdq;
 extern const function_base *const vabsq;
 extern const function_base *const vaddq;
+extern const function_base *const vaddvq;
 extern const function_base *const vandq;
 extern const function_base *const vclsq;
 extern const function_base *const vclzq;
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index c3d18e4cc6f..11f1033deb9 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -43,7 +43,6 @@
 #ifndef __ARM_MVE_PRESERVE_USER_NAMESPACE
 #define vst4q(__addr, __value) __arm_vst4q(__addr, __value)
 #define vaddlvq(__a) __arm_vaddlvq(__a)
-#define vaddvq(__a) __arm_vaddvq(__a)
 #define vmovlbq(__a) __arm_vmovlbq(__a)
 #define vmovltq(__a) __arm_vmovltq(__a)
 #define vmvnq(__a) __arm_vmvnq(__a)
@@ -55,7 +54,6 @@
 #define vcaddq_rot90(__a, __b) __arm_vcaddq_rot90(__a, __b)
 #define vcaddq_rot270(__a, __b) __arm_vcaddq_rot270(__a, __b)
 #define vbicq(__a, __b) __arm_vbicq(__a, __b)
-#define vaddvq_p(__a, __p) __arm_vaddvq_p(__a, __p)
 #define vaddvaq(__a, __b) __arm_vaddvaq(__a, __b)
 #define vbrsrq(__a, __b) __arm_vbrsrq(__a, __b)
 #define vqshluq(__a, __imm) __arm_vqshluq(__a, __imm)
@@ -329,9 +327,6 @@
 #define vcvtq_f16_u16(__a) __arm_vcvtq_f16_u16(__a)
 #define vcvtq_f32_u32(__a) __arm_vcvtq_f32_u32(__a)
 #define vaddlvq_s32(__a) __arm_vaddlvq_s32(__a)
-#define vaddvq_s8(__a) __arm_vaddvq_s8(__a)
-#define vaddvq_s16(__a) __arm_vaddvq_s16(__a)
-#define vaddvq_s32(__a) __arm_vaddvq_s32(__a)
 #define vmovlbq_s8(__a) __arm_vmovlbq_s8(__a)
 #define vmovlbq_s16(__a) __arm_vmovlbq_s16(__a)
 #define vmovltq_s8(__a) __arm_vmovltq_s8(__a)
@@ -354,9 +349,6 @@
 #define vmvnq_u8(__a) __arm_vmvnq_u8(__a)
 #define vmvnq_u16(__a) __arm_vmvnq_u16(__a)
 #define vmvnq_u32(__a) __arm_vmvnq_u32(__a)
-#define vaddvq_u8(__a) __arm_vaddvq_u8(__a)
-#define v

[PATCH 09/20] arm: [MVE intrinsics] factorize vaddvq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Factorize vaddvq builtins so that they use parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (mve_insn): Add vaddv.
* config/arm/mve.md (@mve_vaddvq_): Rename into ...
(@mve_q_): ... this.
(mve_vaddvq_p_): Rename into ...
(@mve_q_p_): ... this.
* config/arm/vec-common.md: Use gen_mve_q instead of
gen_mve_vaddvq.
---
 gcc/config/arm/iterators.md  | 2 ++
 gcc/config/arm/mve.md| 8 
 gcc/config/arm/vec-common.md | 2 +-
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index aff4e7fb814..46c7ddeda67 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -762,6 +762,8 @@ (define_int_attr mve_insn [
 (VADDQ_M_N_S "vadd") (VADDQ_M_N_U "vadd") (VADDQ_M_N_F "vadd")
 (VADDQ_M_S "vadd") (VADDQ_M_U "vadd") (VADDQ_M_F "vadd")
 (VADDQ_N_S "vadd") (VADDQ_N_U "vadd") (VADDQ_N_F "vadd")
+(VADDVQ_P_S "vaddv") (VADDVQ_P_U "vaddv")
+(VADDVQ_S "vaddv") (VADDVQ_U "vaddv")
 (VANDQ_M_S "vand") (VANDQ_M_U "vand") (VANDQ_M_F "vand")
 (VBICQ_M_N_S "vbic") (VBICQ_M_N_U "vbic")
 (VBICQ_M_S "vbic") (VBICQ_M_U "vbic") (VBICQ_M_F "vbic")
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 0c4e4e60bc4..d772f4d4380 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -360,14 +360,14 @@ (define_insn "@mve_q_"
 ;;
 ;; [vaddvq_s, vaddvq_u])
 ;;
-(define_insn "@mve_vaddvq_"
+(define_insn "@mve_q_"
   [
(set (match_operand:SI 0 "s_register_operand" "=Te")
(unspec:SI [(match_operand:MVE_2 1 "s_register_operand" "w")]
 VADDVQ))
   ]
   "TARGET_HAVE_MVE"
-  "vaddv.%#\t%0, %q1"
+  ".%#\t%0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -773,7 +773,7 @@ (define_insn "mve_vaddvaq_"
 ;;
 ;; [vaddvq_p_u, vaddvq_p_s])
 ;;
-(define_insn "mve_vaddvq_p_"
+(define_insn "@mve_q_p_"
   [
(set (match_operand:SI 0 "s_register_operand" "=Te")
(unspec:SI [(match_operand:MVE_2 1 "s_register_operand" "w")
@@ -781,7 +781,7 @@ (define_insn "mve_vaddvq_p_"
 VADDVQ_P))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;vaddvt.%#%0, %q1"
+  "vpst\;t.%#\t%0, %q1"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index 6183c931e36..9af8429968d 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -559,7 +559,7 @@ (define_expand "reduc_plus_scal_"
   /* vaddv generates a 32 bits accumulator.  */
   rtx op0 = gen_reg_rtx (SImode);
 
-  emit_insn (gen_mve_vaddvq (VADDVQ_S, mode, op0, operands[1]));
+  emit_insn (gen_mve_q (VADDVQ_S, VADDVQ_S, mode, op0, operands[1]));
   emit_move_insn (operands[0], gen_lowpart (mode, op0));
 }
 
-- 
2.34.1
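
For readers following the vec-common.md hunk: the comment there says
vaddv produces a 32-bit accumulator, and the expander then takes the low
part in the element mode.  The scalar model below is only a sketch of
that arithmetic for 8-bit lanes; the function names are hypothetical and
nothing here is GCC code.  It relies on the fact that truncating a wider
sum gives the same bits as summing modulo the element width.

```cpp
#include <array>
#include <cstdint>

// Scalar model of an across-vector add into a 32-bit accumulator,
// loosely mirroring what the patch's comment says about vaddv.
int32_t vaddv_model(const std::array<int8_t, 16> &v) {
  int32_t acc = 0;
  for (int8_t lane : v)
    acc += lane;  // each lane is sign-extended before the add
  return acc;
}

// Model of the gen_lowpart step: keep only the low 8 bits of the sum.
// The final uint8_t -> int8_t conversion is modular on the usual
// compilers (and is guaranteed to be from C++20 on).
int8_t reduc_plus_model(const std::array<int8_t, 16> &v) {
  return static_cast<int8_t>(static_cast<uint8_t>(vaddv_model(v)));
}
```

For sixteen lanes all equal to 100 the accumulator holds 1600, and the
low byte handed back to the caller is 64.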



[PATCH 06/20] arm: [MVE intrinsics] factorize vdupq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Factorize vdup builtins so that they use parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (MVE_FP_M_N_VDUPQ_ONLY)
(MVE_FP_N_VDUPQ_ONLY): New.
(mve_insn): Add vdupq.
* config/arm/mve.md (mve_vdupq_n_f): Rename into ...
(@mve_q_n_f): ... this.
(mve_vdupq_n_): Rename into ...
(@mve_q_n_): ... this.
(mve_vdupq_m_n_): Rename into ...
(@mve_q_m_n_): ... this.
(mve_vdupq_m_n_f): Rename into ...
(@mve_q_m_n_f): ... this.
---
 gcc/config/arm/iterators.md | 10 ++
 gcc/config/arm/mve.md   | 20 ++--
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 878210471c8..aff4e7fb814 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -391,6 +391,14 @@ (define_int_iterator MVE_FP_M_VREV32Q_ONLY [
 VREV32Q_M_F
 ])
 
+(define_int_iterator MVE_FP_M_N_VDUPQ_ONLY [
+VDUPQ_M_N_F
+])
+
+(define_int_iterator MVE_FP_N_VDUPQ_ONLY [
+VDUPQ_N_F
+])
+
 ;; MVE integer binary operations.
 (define_code_iterator MVE_INT_BINARY_RTX [plus minus mult])
 
@@ -762,6 +770,8 @@ (define_int_attr mve_insn [
 (VCLSQ_S "vcls")
 (VCLZQ_M_S "vclz") (VCLZQ_M_U "vclz")
 (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate")
+(VDUPQ_M_N_S "vdup") (VDUPQ_M_N_U "vdup") (VDUPQ_M_N_F "vdup")
+(VDUPQ_N_S "vdup") (VDUPQ_N_U "vdup") (VDUPQ_N_F "vdup")
 (VEORQ_M_S "veor") (VEORQ_M_U "veor") (VEORQ_M_F "veor")
 (VHADDQ_M_N_S "vhadd") (VHADDQ_M_N_U "vhadd")
 (VHADDQ_M_S "vhadd") (VHADDQ_M_U "vhadd")
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 4dfcd6c4280..0c4e4e60bc4 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -179,14 +179,14 @@ (define_insn "mve_vq_f"
 ;;
 ;; [vdupq_n_f])
 ;;
-(define_insn "mve_vdupq_n_f"
+(define_insn "@mve_q_n_f"
   [
(set (match_operand:MVE_0 0 "s_register_operand" "=w")
(unspec:MVE_0 [(match_operand: 1 "s_register_operand" "r")]
-VDUPQ_N_F))
+MVE_FP_N_VDUPQ_ONLY))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vdup.%#\t%q0, %1"
+  ".%#\t%q0, %1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -310,14 +310,14 @@ (define_expand "mve_vmvnq_s"
 ;;
 ;; [vdupq_n_u, vdupq_n_s])
 ;;
-(define_insn "mve_vdupq_n_"
+(define_insn "@mve_q_n_"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
(unspec:MVE_2 [(match_operand: 1 "s_register_operand" "r")]
 VDUPQ_N))
   ]
   "TARGET_HAVE_MVE"
-  "vdup.%#\t%q0, %1"
+  ".%#\t%q0, %1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -2006,7 +2006,7 @@ (define_insn "@mve_vcmpq_m_"
 ;;
 ;; [vdupq_m_n_s, vdupq_m_n_u])
 ;;
-(define_insn "mve_vdupq_m_n_"
+(define_insn "@mve_q_m_n_"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
@@ -2015,7 +2015,7 @@ (define_insn "mve_vdupq_m_n_"
 VDUPQ_M_N))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;vdupt.%#\t%q0, %2"
+  "vpst\;t.%#\t%q0, %2"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 
@@ -2666,16 +2666,16 @@ (define_insn "mve_vcvttq_m_f32_f16v4sf"
 ;;
 ;; [vdupq_m_n_f])
 ;;
-(define_insn "mve_vdupq_m_n_f"
+(define_insn "@mve_q_m_n_f"
   [
(set (match_operand:MVE_0 0 "s_register_operand" "=w")
(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0")
   (match_operand: 2 "s_register_operand" "r")
   (match_operand: 3 "vpr_register_operand" 
"Up")]
-VDUPQ_M_N_F))
+MVE_FP_M_N_VDUPQ_ONLY))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vdupt.%#\t%q0, %2"
+  "vpst\;t.%#\t%q0, %2"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 
-- 
2.34.1



[PATCH 04/20] arm: [MVE intrinsics] factorize vrev16q vrev32q vrev64q

2023-05-10 Thread Christophe Lyon via Gcc-patches
Factorize vrev16q vrev32q vrev64q so that they use generic builtin
names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (MVE_V8HF, MVE_V16QI)
(MVE_FP_VREV64Q_ONLY, MVE_FP_M_VREV64Q_ONLY, MVE_FP_VREV32Q_ONLY)
(MVE_FP_M_VREV32Q_ONLY): New iterators.
(mve_insn): Add vrev16q, vrev32q, vrev64q.
* config/arm/mve.md (mve_vrev64q_f): Rename into ...
(@mve_q_f): ... this
(mve_vrev32q_fv8hf): Rename into @mve_q_f.
(mve_vrev64q_): Rename into ...
(@mve_q_): ... this.
(mve_vrev32q_): Rename into
@mve_q_.
(mve_vrev16q_v16qi): Rename into
@mve_q_.
(mve_vrev64q_m_): Rename into
@mve_q_m_.
(mve_vrev32q_m_fv8hf): Rename into @mve_q_m_f.
(mve_vrev32q_m_): Rename into
@mve_q_m_.
(mve_vrev64q_m_f): Rename into @mve_q_m_f.
(mve_vrev16q_m_v16qi): Rename into
@mve_q_m_.
---
 gcc/config/arm/iterators.md | 25 +
 gcc/config/arm/mve.md   | 72 ++---
 2 files changed, 61 insertions(+), 36 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index ef9fae0412b..878210471c8 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -1,3 +1,4 @@
+
 ;; Code and mode itertator and attribute definitions for the ARM backend
 ;; Copyright (C) 2010-2023 Free Software Foundation, Inc.
 ;; Contributed by ARM Ltd.
@@ -274,6 +275,8 @@ (define_mode_iterator MVE_5 [V8HI V4SI])
 (define_mode_iterator MVE_6 [V8HI V4SI])
 (define_mode_iterator MVE_7 [V16BI V8BI V4BI V2QI])
 (define_mode_iterator MVE_7_HI [HI V16BI V8BI V4BI V2QI])
+(define_mode_iterator MVE_V8HF [V8HF])
+(define_mode_iterator MVE_V16QI [V16QI])
 
 ;;
 ;; Code iterators
@@ -372,6 +375,22 @@ (define_int_iterator MVE_FP_M_UNARY [
 VRNDXQ_M_F
 ])
 
+(define_int_iterator MVE_FP_VREV64Q_ONLY [
+VREV64Q_F
+])
+
+(define_int_iterator MVE_FP_M_VREV64Q_ONLY [
+VREV64Q_M_F
+])
+
+(define_int_iterator MVE_FP_VREV32Q_ONLY [
+VREV32Q_F
+])
+
+(define_int_iterator MVE_FP_M_VREV32Q_ONLY [
+VREV32Q_M_F
+])
+
 ;; MVE integer binary operations.
 (define_code_iterator MVE_INT_BINARY_RTX [plus minus mult])
 
@@ -862,6 +881,12 @@ (define_int_attr mve_insn [
 (VQSUBQ_M_S "vqsub") (VQSUBQ_M_U "vqsub")
 (VQSUBQ_N_S "vqsub") (VQSUBQ_N_U "vqsub")
 (VQSUBQ_S "vqsub") (VQSUBQ_U "vqsub")
+(VREV16Q_M_S "vrev16") (VREV16Q_M_U "vrev16")
+(VREV16Q_S "vrev16") (VREV16Q_U "vrev16")
+(VREV32Q_M_S "vrev32") (VREV32Q_M_U "vrev32") (VREV32Q_M_F "vrev32")
+(VREV32Q_S "vrev32") (VREV32Q_U "vrev32") (VREV32Q_F "vrev32")
+(VREV64Q_M_S "vrev64") (VREV64Q_M_U "vrev64") (VREV64Q_M_F "vrev64")
+(VREV64Q_S "vrev64") (VREV64Q_U "vrev64") (VREV64Q_F "vrev64")
 (VRHADDQ_M_S "vrhadd") (VRHADDQ_M_U "vrhadd")
 (VRHADDQ_S "vrhadd") (VRHADDQ_U "vrhadd")
 (VRMULHQ_M_S "vrmulh") (VRMULHQ_M_U "vrmulh")
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 191d1268ad6..4dfcd6c4280 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -151,14 +151,14 @@ (define_insn "@mve_q_f"
 ;;
 ;; [vrev64q_f])
 ;;
-(define_insn "mve_vrev64q_f"
+(define_insn "@mve_q_f"
   [
(set (match_operand:MVE_0 0 "s_register_operand" "=&w")
(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w")]
-VREV64Q_F))
+MVE_FP_VREV64Q_ONLY))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vrev64.%# %q0, %q1"
+  ".%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -193,14 +193,14 @@ (define_insn "mve_vdupq_n_f"
 ;;
 ;; [vrev32q_f])
 ;;
-(define_insn "mve_vrev32q_fv8hf"
+(define_insn "@mve_q_f"
   [
-   (set (match_operand:V8HF 0 "s_register_operand" "=w")
-   (unspec:V8HF [(match_operand:V8HF 1 "s_register_operand" "w")]
-VREV32Q_F))
+   (set (match_operand:MVE_V8HF 0 "s_register_operand" "=w")
+   (unspec:MVE_V8HF [(match_operand:MVE_V8HF 1 "s_register_operand" "w")]
+MVE_FP_VREV32Q_ONLY))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vrev32.16 %q0, %q1"
+  ".\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 ;;
@@ -248,14 +248,14 @@ (define_insn "mve_vcvtq_to_f_"
 ;;
 ;; [vrev64q_u, vrev64q_s])
 ;;
-(define_insn "mve_vrev64q_"
+(define_insn "@mve_q_"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=&w")
(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")]
 VREV64Q))
   ]
   "TARGET_HAVE_MVE"
-  "vrev64.%# %q0, %q1"
+  ".%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -374,14 +374,14 @@ (

[PATCH 20/20] arm: [MVE intrinsics] rework vmovlbq vmovltq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Implement vmovlbq, vmovltq using the new MVE builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vmovlbq, vmovltq): New.
* config/arm/arm-mve-builtins-base.def (vmovlbq, vmovltq): New.
* config/arm/arm-mve-builtins-base.h (vmovlbq, vmovltq): New.
* config/arm/arm_mve.h (vmovlbq): Remove.
(vmovltq): Remove.
(vmovlbq_m): Remove.
(vmovltq_m): Remove.
(vmovlbq_x): Remove.
(vmovltq_x): Remove.
(vmovlbq_s8): Remove.
(vmovlbq_s16): Remove.
(vmovltq_s8): Remove.
(vmovltq_s16): Remove.
(vmovltq_u8): Remove.
(vmovltq_u16): Remove.
(vmovlbq_u8): Remove.
(vmovlbq_u16): Remove.
(vmovlbq_m_s8): Remove.
(vmovltq_m_s8): Remove.
(vmovlbq_m_u8): Remove.
(vmovltq_m_u8): Remove.
(vmovlbq_m_s16): Remove.
(vmovltq_m_s16): Remove.
(vmovlbq_m_u16): Remove.
(vmovltq_m_u16): Remove.
(vmovlbq_x_s8): Remove.
(vmovlbq_x_s16): Remove.
(vmovlbq_x_u8): Remove.
(vmovlbq_x_u16): Remove.
(vmovltq_x_s8): Remove.
(vmovltq_x_s16): Remove.
(vmovltq_x_u8): Remove.
(vmovltq_x_u16): Remove.
(__arm_vmovlbq_s8): Remove.
(__arm_vmovlbq_s16): Remove.
(__arm_vmovltq_s8): Remove.
(__arm_vmovltq_s16): Remove.
(__arm_vmovltq_u8): Remove.
(__arm_vmovltq_u16): Remove.
(__arm_vmovlbq_u8): Remove.
(__arm_vmovlbq_u16): Remove.
(__arm_vmovlbq_m_s8): Remove.
(__arm_vmovltq_m_s8): Remove.
(__arm_vmovlbq_m_u8): Remove.
(__arm_vmovltq_m_u8): Remove.
(__arm_vmovlbq_m_s16): Remove.
(__arm_vmovltq_m_s16): Remove.
(__arm_vmovlbq_m_u16): Remove.
(__arm_vmovltq_m_u16): Remove.
(__arm_vmovlbq_x_s8): Remove.
(__arm_vmovlbq_x_s16): Remove.
(__arm_vmovlbq_x_u8): Remove.
(__arm_vmovlbq_x_u16): Remove.
(__arm_vmovltq_x_s8): Remove.
(__arm_vmovltq_x_s16): Remove.
(__arm_vmovltq_x_u8): Remove.
(__arm_vmovltq_x_u16): Remove.
(__arm_vmovlbq): Remove.
(__arm_vmovltq): Remove.
(__arm_vmovlbq_m): Remove.
(__arm_vmovltq_m): Remove.
(__arm_vmovlbq_x): Remove.
(__arm_vmovltq_x): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   2 +
 gcc/config/arm/arm-mve-builtins-base.def |   2 +
 gcc/config/arm/arm-mve-builtins-base.h   |   2 +
 gcc/config/arm/arm_mve.h | 454 ---
 4 files changed, 6 insertions(+), 454 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index fdc0ff50b96..2dec15ac0b1 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -279,6 +279,8 @@ FUNCTION (vminnmq, unspec_based_mve_function_exact_insn, (UNKNOWN, UNKNOWN, SMIN
 FUNCTION_PRED_P_F (vminnmvq, VMINNMVQ)
 FUNCTION_WITH_RTX_M_NO_F (vminq, SMIN, UMIN, VMINQ)
 FUNCTION_PRED_P_S_U (vminvq, VMINVQ)
+FUNCTION_WITHOUT_N_NO_F (vmovlbq, VMOVLBQ)
+FUNCTION_WITHOUT_N_NO_F (vmovltq, VMOVLTQ)
 FUNCTION_WITHOUT_N_NO_F (vmovnbq, VMOVNBQ)
 FUNCTION_WITHOUT_N_NO_F (vmovntq, VMOVNTQ)
 FUNCTION_WITHOUT_N_NO_F (vmulhq, VMULHQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index dcfb426a7fb..b0de5af1013 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -48,6 +48,8 @@ DEF_MVE_FUNCTION (vminaq, binary_maxamina, all_signed, m_or_none)
 DEF_MVE_FUNCTION (vminavq, binary_maxavminav, all_signed, p_or_none)
 DEF_MVE_FUNCTION (vminq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vminvq, binary_maxvminv, all_integer, p_or_none)
+DEF_MVE_FUNCTION (vmovlbq, unary_widen, integer_8_16, mx_or_none)
+DEF_MVE_FUNCTION (vmovltq, unary_widen, integer_8_16, mx_or_none)
 DEF_MVE_FUNCTION (vmovnbq, binary_move_narrow, integer_16_32, m_or_none)
 DEF_MVE_FUNCTION (vmovntq, binary_move_narrow, integer_16_32, m_or_none)
 DEF_MVE_FUNCTION (vmulhq, binary, all_integer, mx_or_none)
diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h
index 5de70d5e1d4..fa2e97fd461 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -61,6 +61,8 @@ extern const function_base *const vminnmq;
 extern const function_base *const vminnmvq;
 extern const function_base *const vminq;
 extern const function_base *const vminvq;
+extern const function_base *const vmovlbq;
+extern const function_base *const vmovltq;
 extern const function_base *const vmovnbq;
 extern const function_base *const vmovntq;
 extern const function_base *const vmulhq;
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 21d7768a732..c0891b7592a 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.

[PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape

2023-05-10 Thread Christophe Lyon via Gcc-patches
This patch adds the unary_acc shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_acc): New.
* config/arm/arm-mve-builtins-shapes.h (unary_acc): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 28 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 29 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index bff1c3e843b..e77a0cc20ac 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1066,6 +1066,34 @@ struct unary_def : public overloaded_base<0>
 };
 SHAPE (unary)
 
+/* _t vfoo[_](_t)
+
+   i.e. a version of "unary" in which the source elements are half the
+   size of the destination scalar, but have the same type class.
+
+   Example: vaddlvq.
+   int64_t [__arm_]vaddlvq[_s32](int32x4_t a)
+   int64_t [__arm_]vaddlvq_p[_s32](int32x4_t a, mve_pred16_t p) */
+struct unary_acc_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sw0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+/* FIXME: check that the return value is actually
+   twice as wide as arg 0.  */
+return r.resolve_unary ();
+  }
+};
+SHAPE (unary_acc)
+
 /* _t foo_t0[_t1](_t)
 
where the target type  must be specified explicitly but the source
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index fc1bacbd4da..c062fe624c4 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -53,6 +53,7 @@ namespace arm_mve
 extern const function_shape *const create;
 extern const function_shape *const inherent;
 extern const function_shape *const unary;
+extern const function_shape *const unary_acc;
 extern const function_shape *const unary_convert;
 extern const function_shape *const unary_int32;
 extern const function_shape *const unary_int32_acc;
-- 
2.34.1
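
As an illustration of what the unary_acc comment describes (source
elements half the size of the destination scalar, same type class), here
is a scalar sketch of the vaddlvq example it cites: four int32_t lanes
reduced into one int64_t.  The function name is hypothetical and the
code models only the arithmetic, not the builtin machinery.

```cpp
#include <array>
#include <cstdint>

// Scalar model of int64_t vaddlvq_s32(int32x4_t a): the lanes are
// widened to 64 bits before being summed, so the reduction cannot
// overflow the way a 32-bit accumulator would.
int64_t vaddlvq_s32_model(const std::array<int32_t, 4> &a) {
  int64_t acc = 0;
  for (int32_t lane : a)
    acc += lane;  // implicit widening to int64_t
  return acc;
}
```

Summing { INT32_MAX, INT32_MAX, 1, 1 } gives 4294967296, a value that
needs the wider return type.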



[PATCH 07/20] arm: [MVE intrinsics] add unary_n shape

2023-05-10 Thread Christophe Lyon via Gcc-patches
This patch adds the unary_n shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_n): New.
* config/arm/arm-mve-builtins-shapes.h (unary_n): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 53 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 54 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index ea0112b3e99..c78683aaba2 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1094,6 +1094,59 @@ struct unary_convert_def : public overloaded_base<1>
 };
 SHAPE (unary_convert)
 
+/* _t vfoo[_n]_t0(_t)
+
+   Example: vdupq.
+   int16x8_t [__arm_]vdupq_n_s16(int16_t a)
+   int16x8_t [__arm_]vdupq_m[_n_s16](int16x8_t inactive, int16_t a, mve_pred16_t p)
+   int16x8_t [__arm_]vdupq_x_n_s16(int16_t a, mve_pred16_t p)  */
+struct unary_n_def : public overloaded_base<0>
+{
+  bool
+  explicit_type_suffix_p (unsigned int, enum predication_index pred,
+ enum mode_suffix_index) const override
+  {
+return pred != PRED_m;
+  }
+
+  bool
+  explicit_mode_suffix_p (enum predication_index pred,
+ enum mode_suffix_index mode) const override
+  {
+return ((mode == MODE_n)
+   && (pred != PRED_m));
+  }
+
+  bool
+  skip_overload_p (enum predication_index pred, enum mode_suffix_index mode)
+const override
+  {
+switch (mode)
+  {
+  case MODE_n:
+   return pred != PRED_m;
+
+  default:
+   gcc_unreachable ();
+  }
+  }
+
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_n, preserve_user_namespace);
+build_all (b, "v0,s0", group, MODE_n, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+return r.resolve_unary_n ();
+  }
+};
+SHAPE (unary_n)
+
 } /* end namespace arm_mve */
 
 #undef SHAPE
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index 59c4dc39c39..a35faec2542 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -54,6 +54,7 @@ namespace arm_mve
 extern const function_shape *const inherent;
 extern const function_shape *const unary;
 extern const function_shape *const unary_convert;
+extern const function_shape *const unary_n;
 
   } /* end namespace arm_mve::shapes */
 } /* end namespace arm_mve */
-- 
2.34.1
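
To make the naming rules in unary_n_def easier to follow, the sketch
below reproduces the three vdupq names from the shape's comment.  It is
a simplified model under stated assumptions: the Pred enum and both
functions are hypothetical stand-ins, and the real framework assembles
names differently.  The only rule taken from the patch is that the mode
and type suffixes are explicit unless the predication is PRED_m.

```cpp
#include <string>

// Hypothetical stand-in for the predication kinds used by the shape.
enum class Pred { none, m, x };

// Full intrinsic name for the "_n" mode, e.g. vdupq_m_n_s16.
std::string full_name(const std::string &base, Pred pred,
                      const std::string &type_suffix) {
  std::string name = base;
  if (pred == Pred::m) name += "_m";
  if (pred == Pred::x) name += "_x";
  return name + "_n_" + type_suffix;
}

// Shortest name the user may write.  Mirrors explicit_mode_suffix_p and
// explicit_type_suffix_p in unary_n_def: both suffixes are mandatory
// except for the _m (merging) predication, where they may be dropped.
std::string short_name(const std::string &base, Pred pred,
                       const std::string &type_suffix) {
  if (pred != Pred::m)
    return full_name(base, pred, type_suffix);
  return base + "_m";
}
```

For base "vdupq" and suffix "s16" this yields vdupq_n_s16, vdupq_m and
vdupq_x_n_s16, matching the bracketed parts of the prototypes in the
comment.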



[PATCH 14/20] arm: [MVE intrinsics] rework vaddvaq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Implement vaddvaq using the new MVE builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vaddvaq): New.
* config/arm/arm-mve-builtins-base.def (vaddvaq): New.
* config/arm/arm-mve-builtins-base.h (vaddvaq): New.
* config/arm/arm_mve.h (vaddvaq): Remove.
(vaddvaq_p): Remove.
(vaddvaq_u8): Remove.
(vaddvaq_s8): Remove.
(vaddvaq_u16): Remove.
(vaddvaq_s16): Remove.
(vaddvaq_u32): Remove.
(vaddvaq_s32): Remove.
(vaddvaq_p_u8): Remove.
(vaddvaq_p_s8): Remove.
(vaddvaq_p_u16): Remove.
(vaddvaq_p_s16): Remove.
(vaddvaq_p_u32): Remove.
(vaddvaq_p_s32): Remove.
(__arm_vaddvaq_u8): Remove.
(__arm_vaddvaq_s8): Remove.
(__arm_vaddvaq_u16): Remove.
(__arm_vaddvaq_s16): Remove.
(__arm_vaddvaq_u32): Remove.
(__arm_vaddvaq_s32): Remove.
(__arm_vaddvaq_p_u8): Remove.
(__arm_vaddvaq_p_s8): Remove.
(__arm_vaddvaq_p_u16): Remove.
(__arm_vaddvaq_p_s16): Remove.
(__arm_vaddvaq_p_u32): Remove.
(__arm_vaddvaq_p_s32): Remove.
(__arm_vaddvaq): Remove.
(__arm_vaddvaq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   1 +
 gcc/config/arm/arm-mve-builtins-base.def |   1 +
 gcc/config/arm/arm-mve-builtins-base.h   |   1 +
 gcc/config/arm/arm_mve.h | 202 ---
 4 files changed, 3 insertions(+), 202 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index 7f90fc65ae2..e87069b0467 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -245,6 +245,7 @@ FUNCTION_WITHOUT_N (vabdq, VABDQ)
 FUNCTION (vabsq, unspec_based_mve_function_exact_insn, (ABS, ABS, ABS, -1, -1, -1, VABSQ_M_S, -1, VABSQ_M_F, -1, -1, -1))
 FUNCTION_WITH_RTX_M_N (vaddq, PLUS, VADDQ)
 FUNCTION_PRED_P_S_U (vaddvq, VADDVQ)
+FUNCTION_PRED_P_S_U (vaddvaq, VADDVAQ)
 FUNCTION_WITH_RTX_M (vandq, AND, VANDQ)
 FUNCTION_WITHOUT_N_NO_U_F (vclsq, VCLSQ)
 FUNCTION (vclzq, unspec_based_mve_function_exact_insn, (CLZ, CLZ, CLZ, -1, -1, -1, VCLZQ_M_S, VCLZQ_M_U, -1, -1, -1 ,-1))
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index d32745f334a..413fe4a1ef0 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -21,6 +21,7 @@
 DEF_MVE_FUNCTION (vabdq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vabsq, unary, all_signed, mx_or_none)
 DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_integer, mx_or_none)
+DEF_MVE_FUNCTION (vaddvaq, unary_int32_acc, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vaddvq, unary_int32, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vandq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vclsq, unary, all_signed, mx_or_none)
diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h
index 9080542e7e3..5338b777444 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -26,6 +26,7 @@ namespace functions {
 extern const function_base *const vabdq;
 extern const function_base *const vabsq;
 extern const function_base *const vaddq;
+extern const function_base *const vaddvaq;
 extern const function_base *const vaddvq;
 extern const function_base *const vandq;
 extern const function_base *const vclsq;
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 11f1033deb9..74783570561 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -54,7 +54,6 @@
 #define vcaddq_rot90(__a, __b) __arm_vcaddq_rot90(__a, __b)
 #define vcaddq_rot270(__a, __b) __arm_vcaddq_rot270(__a, __b)
 #define vbicq(__a, __b) __arm_vbicq(__a, __b)
-#define vaddvaq(__a, __b) __arm_vaddvaq(__a, __b)
 #define vbrsrq(__a, __b) __arm_vbrsrq(__a, __b)
 #define vqshluq(__a, __imm) __arm_vqshluq(__a, __imm)
 #define vmlsdavxq(__a, __b) __arm_vmlsdavxq(__a, __b)
@@ -89,7 +88,6 @@
 #define vmlaq(__a, __b, __c) __arm_vmlaq(__a, __b, __c)
 #define vmladavq_p(__a, __b, __p) __arm_vmladavq_p(__a, __b, __p)
 #define vmladavaq(__a, __b, __c) __arm_vmladavaq(__a, __b, __c)
-#define vaddvaq_p(__a, __b, __p) __arm_vaddvaq_p(__a, __b, __p)
 #define vsriq(__a, __b, __imm) __arm_vsriq(__a, __b, __imm)
 #define vsliq(__a, __b, __imm) __arm_vsliq(__a, __b, __imm)
 #define vmlsdavxq_p(__a, __b, __p) __arm_vmlsdavxq_p(__a, __b, __p)
@@ -390,7 +388,6 @@
 #define vcaddq_rot90_u8(__a, __b) __arm_vcaddq_rot90_u8(__a, __b)
 #define vcaddq_rot270_u8(__a, __b) __arm_vcaddq_rot270_u8(__a, __b)
 #define vbicq_u8(__a, __b) __arm_vbicq_u8(__a, __b)
-#define vaddvaq_u8(__a, __b) __arm_vaddvaq_u8(__a, __b)
 #define vbrsrq_n_u8(__a, __b) __arm_vbrsrq_n_u8(__a, __b)
 #define vqshluq_n_s8(__a,  __imm) __arm_vqshluq_n_s8(__a,  __imm)
 #define vornq_s8(__a, __b) __arm_vornq_s8(__a, __b)
@@ -406,7 +4

[PATCH 10/20] arm: [MVE intrinsics] add unary_int32 shape

2023-05-10 Thread Christophe Lyon via Gcc-patches
This patch adds the unary_int32 shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_int32): New.
* config/arm/arm-mve-builtins-shapes.h (unary_int32): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 27 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 28 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index c78683aaba2..0bd91b24147 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1094,6 +1094,33 @@ struct unary_convert_def : public overloaded_base<1>
 };
 SHAPE (unary_convert)
 
+/* [u]int32_t vfoo[_<t0>](<T0>_t)
+
+   i.e. a version of "unary" which generates a scalar of type int32_t
+   or uint32_t depending on the signedness of the elements of the input
+   vector.
+
+   Example: vaddvq
+   int32_t [__arm_]vaddvq[_s16](int16x8_t a)
+   int32_t [__arm_]vaddvq_p[_s16](int16x8_t a, mve_pred16_t p)  */
+struct unary_int32_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sx32,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+return r.resolve_uniform (1);
+  }
+};
+SHAPE (unary_int32)
+
 /* _t vfoo[_n]_t0(_t)
 
Example: vdupq.
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h 
b/gcc/config/arm/arm-mve-builtins-shapes.h
index a35faec2542..f422550559e 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -54,6 +54,7 @@ namespace arm_mve
 extern const function_shape *const inherent;
 extern const function_shape *const unary;
 extern const function_shape *const unary_convert;
+extern const function_shape *const unary_int32;
 extern const function_shape *const unary_n;
 
   } /* end namespace arm_mve::shapes */
-- 
2.34.1
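
To make the unary_int32 signature above concrete, here is a scalar reference
model of vaddvq[_s16] and vaddvq_p[_s16] in plain C++. It is only a sketch and
not GCC or ACLE code: the model_* names are hypothetical, and the predicated
variant assumes that a 16-bit lane is active when the lower of its two
predicate bits is set.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of vaddvq[_s16]: add all eight 16-bit lanes into a
// 32-bit scalar, which is the "vector in, int32_t out" shape.
int32_t model_vaddvq_s16(const std::array<int16_t, 8>& a) {
    int32_t sum = 0;
    for (int16_t lane : a)
        sum += lane;
    return sum;
}

// Hypothetical model of vaddvq_p[_s16]: same reduction, but only over
// active lanes. Assumption: lane e is active when predicate bit 2*e is set
// (the predicate carries one bit per byte, so two bits per 16-bit lane).
int32_t model_vaddvq_p_s16(const std::array<int16_t, 8>& a, uint16_t p) {
    int32_t sum = 0;
    for (unsigned e = 0; e < 8; ++e)
        if (p & (1u << (2 * e)))
            sum += a[e];
    return sum;
}
```

The result type stays int32_t whatever the element width, which is what the
"sx32,v0" signature string in the shape expresses.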



[PATCH 16/20] arm: [MVE intrinsics] factorize vaddlvq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Factorize vaddlvq builtins so that they use parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (mve_insn): Add vaddlv.
* config/arm/mve.md (mve_vaddlvq_<supf>v4si): Rename into ...
(@mve_<mve_insn>q_<supf>v4si): ... this.
(mve_vaddlvq_p_<supf>v4si): Rename into ...
(@mve_<mve_insn>q_p_<supf>v4si): ... this.
---
 gcc/config/arm/iterators.md | 2 ++
 gcc/config/arm/mve.md   | 8 
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 00123c0a376..84dd97249f9 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -759,6 +759,8 @@ (define_int_attr mve_insn [
 (VABDQ_S "vabd") (VABDQ_U "vabd") (VABDQ_F "vabd")
 (VABSQ_M_F "vabs")
 (VABSQ_M_S "vabs")
+(VADDLVQ_P_S "vaddlv") (VADDLVQ_P_U "vaddlv")
+(VADDLVQ_S "vaddlv") (VADDLVQ_U "vaddlv")
 (VADDQ_M_N_S "vadd") (VADDQ_M_N_U "vadd") (VADDQ_M_N_F "vadd")
 (VADDQ_M_S "vadd") (VADDQ_M_U "vadd") (VADDQ_M_F "vadd")
 (VADDQ_N_S "vadd") (VADDQ_N_U "vadd") (VADDQ_N_F "vadd")
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 1ccbce3c89c..c5373fef9a2 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -500,14 +500,14 @@ (define_insn "@mve_q_"
 ;;
 ;; [vaddlvq_s vaddlvq_u])
 ;;
-(define_insn "mve_vaddlvq_<supf>v4si"
+(define_insn "@mve_<mve_insn>q_<supf>v4si"
   [
    (set (match_operand:DI 0 "s_register_operand" "=r")
         (unspec:DI [(match_operand:V4SI 1 "s_register_operand" "w")]
          VADDLVQ))
   ]
   "TARGET_HAVE_MVE"
-  "vaddlv.<supf>32\t%Q0, %R0, %q1"
+  "<mve_insn>.<supf>32\t%Q0, %R0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -666,7 +666,7 @@ (define_insn "mve_vcvtq_n_from_f_"
 ;;
 ;; [vaddlvq_p_s])
 ;;
-(define_insn "mve_vaddlvq_p_<supf>v4si"
+(define_insn "@mve_<mve_insn>q_p_<supf>v4si"
   [
    (set (match_operand:DI 0 "s_register_operand" "=r")
         (unspec:DI [(match_operand:V4SI 1 "s_register_operand" "w")
@@ -674,7 +674,7 @@ (define_insn "mve_vaddlvq_p_<supf>v4si"
          VADDLVQ_P))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;vaddlvt.<supf>32\t%Q0, %R0, %q1"
+  "vpst\;<mve_insn>t.<supf>32\t%Q0, %R0, %q1"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 
-- 
2.34.1
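
As a rough illustration of what "parameterized names" buys here: once the
mnemonic and the signedness suffix are looked up from the unspec code, a single
template can produce the assembly text for every variant. The C++ below is a
toy model of that lookup, not code from GCC. The enum, the helper names and the
name supf for the signedness attribute are assumptions made for the sketch;
only the "vaddlv" mapping comes from the iterators.md hunk above.

```cpp
#include <string>

// Hypothetical subset of the unspec codes touched by this patch.
enum Unspec { VADDLVQ_S, VADDLVQ_U, VADDLVQ_P_S, VADDLVQ_P_U };

// Mirrors the new mve_insn entries: every vaddlvq unspec maps to "vaddlv".
std::string mve_insn(Unspec) { return "vaddlv"; }

// Signedness suffix of the unspec (assumed attribute, called supf here).
std::string supf(Unspec u) {
    return (u == VADDLVQ_S || u == VADDLVQ_P_S) ? "s" : "u";
}

bool is_predicated(Unspec u) {
    return u == VADDLVQ_P_S || u == VADDLVQ_P_U;
}

// Builds the output template text as it would be spelled in the .md file
// (the \t and \; escapes are kept as literal backslash sequences).
std::string asm_template(Unspec u) {
    const std::string operands = "32\\t%Q0, %R0, %q1";
    if (is_predicated(u))
        return "vpst\\;" + mve_insn(u) + "t." + supf(u) + operands;
    return mve_insn(u) + "." + supf(u) + operands;
}
```

With this kind of lookup the two signed/unsigned variants of each pattern share
one definition, and the caller selects the variant by passing the unspec code.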



[PATCH 02/20] arm: [MVE intrinsics] add cmp shape

2023-05-10 Thread Christophe Lyon via Gcc-patches
This patch adds the cmp shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (cmp): New.
* config/arm/arm-mve-builtins-shapes.h (cmp): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 27 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 28 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index c9eac80d1e3..ea0112b3e99 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -974,6 +974,33 @@ struct binary_widen_n_def : public overloaded_base<0>
 };
 SHAPE (binary_widen_n)
 
+/* Shape for comparison operations that operate on
+   uniform types.
+
+   Examples: vcmpq.
+   mve_pred16_t [__arm_]vcmpeqq[_s16](int16x8_t a, int16x8_t b)
+   mve_pred16_t [__arm_]vcmpeqq[_n_s16](int16x8_t a, int16_t b)
+   mve_pred16_t [__arm_]vcmpeqq_m[_s16](int16x8_t a, int16x8_t b, mve_pred16_t p)
+   mve_pred16_t [__arm_]vcmpeqq_m[_n_s16](int16x8_t a, int16_t b, mve_pred16_t p)  */
+struct cmp_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "p,v0,v0", group, MODE_none, preserve_user_namespace);
+build_all (b, "p,v0,s0", group, MODE_n, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+return r.resolve_uniform_opt_n (2);
+  }
+};
+SHAPE (cmp)
+
 /* xN_t vfoo[_t0](uint64_t, uint64_t)
 
where there are N arguments in total.
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h 
b/gcc/config/arm/arm-mve-builtins-shapes.h
index 7f582d7375a..59c4dc39c39 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -49,6 +49,7 @@ namespace arm_mve
 extern const function_shape *const binary_rshift_narrow;
 extern const function_shape *const binary_rshift_narrow_unsigned;
 extern const function_shape *const binary_widen_n;
+extern const function_shape *const cmp;
 extern const function_shape *const create;
 extern const function_shape *const inherent;
 extern const function_shape *const unary;
-- 
2.34.1
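
For reference, the two signatures built by the cmp shape above ("p,v0,v0" and
"p,v0,s0") can be pictured with a scalar model of vcmpeqq[_s16] and
vcmpeqq[_n_s16]. This is a sketch in plain C++ with hypothetical model_* names,
not the GCC implementation. It relies on the predicate holding one bit per
byte, so each 16-bit lane owns two adjacent bits.

```cpp
#include <array>
#include <cstdint>

// Stand-in for mve_pred16_t in this sketch.
using pred16_model = uint16_t;

// Vector/vector form: lane e sets predicate bits 2*e and 2*e+1 when equal.
pred16_model model_vcmpeqq_s16(const std::array<int16_t, 8>& a,
                               const std::array<int16_t, 8>& b) {
    pred16_model p = 0;
    for (unsigned e = 0; e < 8; ++e)
        if (a[e] == b[e])
            p |= static_cast<pred16_model>(0x3u << (2 * e));
    return p;
}

// _n form: the scalar is compared against every lane, as if broadcast first.
pred16_model model_vcmpeqq_n_s16(const std::array<int16_t, 8>& a, int16_t b) {
    std::array<int16_t, 8> splat;
    splat.fill(b);
    return model_vcmpeqq_s16(a, splat);
}
```

Both forms take operands of one element type and return a predicate, which is
why the shape's resolve step only has to distinguish the vector and scalar
second argument.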



[PATCH 18/20] arm: [MVE intrinsics] factorize vmovlbq vmovltq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Factorize vmovlbq, vmovltq builtins so that they use the same
parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (mve_insn): Add vmovlb, vmovlt.
(VMOVLBQ, VMOVLTQ): Merge into ...
(VMOVLxQ): ... this.
(VMOVLTQ_M, VMOVLBQ_M): Merge into ...
(VMOVLxQ_M): ... this.
* config/arm/mve.md (mve_vmovltq_<supf><mode>)
(mve_vmovlbq_<supf><mode>): Merge into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vmovlbq_m_<supf><mode>, mve_vmovltq_m_<supf><mode>): Merge
into ...
(@mve_<mve_insn>q_m_<supf><mode>): ... this.
---
 gcc/config/arm/iterators.md | 10 +
 gcc/config/arm/mve.md   | 44 -
 2 files changed, 15 insertions(+), 39 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 84dd97249f9..2f6de937ef7 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -817,6 +817,10 @@ (define_int_attr mve_insn [
 (VMINVQ_S "vminv") (VMINVQ_U "vminv")
 (VMLAQ_M_N_S "vmla") (VMLAQ_M_N_U "vmla")
 (VMLASQ_M_N_S "vmlas") (VMLASQ_M_N_U "vmlas")
+(VMOVLBQ_M_S "vmovlb") (VMOVLBQ_M_U "vmovlb")
+(VMOVLBQ_S "vmovlb") (VMOVLBQ_U "vmovlb")
+(VMOVLTQ_M_S "vmovlt") (VMOVLTQ_M_U "vmovlt")
+(VMOVLTQ_S "vmovlt") (VMOVLTQ_U "vmovlt")
 (VMOVNBQ_M_S "vmovnb") (VMOVNBQ_M_U "vmovnb")
 (VMOVNBQ_S "vmovnb") (VMOVNBQ_U "vmovnb")
 (VMOVNTQ_M_S "vmovnt") (VMOVNTQ_M_U "vmovnt")
@@ -2318,8 +2322,7 @@ (define_int_iterator VCVTAQ [VCVTAQ_U VCVTAQ_S])
 (define_int_iterator VDUPQ_N [VDUPQ_N_U VDUPQ_N_S])
 (define_int_iterator VADDVQ [VADDVQ_U VADDVQ_S])
 (define_int_iterator VREV32Q [VREV32Q_U VREV32Q_S])
-(define_int_iterator VMOVLBQ [VMOVLBQ_S VMOVLBQ_U])
-(define_int_iterator VMOVLTQ [VMOVLTQ_U VMOVLTQ_S])
+(define_int_iterator VMOVLxQ [VMOVLBQ_S VMOVLBQ_U VMOVLTQ_U VMOVLTQ_S])
 (define_int_iterator VCVTPQ [VCVTPQ_S VCVTPQ_U])
 (define_int_iterator VCVTNQ [VCVTNQ_S VCVTNQ_U])
 (define_int_iterator VCVTMQ [VCVTMQ_S VCVTMQ_U])
@@ -2413,7 +2416,7 @@ (define_int_iterator VSLIQ_N [VSLIQ_N_S VSLIQ_N_U])
 (define_int_iterator VSRIQ_N [VSRIQ_N_S VSRIQ_N_U])
 (define_int_iterator VMLALDAVQ_P [VMLALDAVQ_P_U VMLALDAVQ_P_S])
 (define_int_iterator VQMOVNBQ_M [VQMOVNBQ_M_S VQMOVNBQ_M_U])
-(define_int_iterator VMOVLTQ_M [VMOVLTQ_M_U VMOVLTQ_M_S])
+(define_int_iterator VMOVLxQ_M [VMOVLBQ_M_U VMOVLBQ_M_S VMOVLTQ_M_U VMOVLTQ_M_S])
 (define_int_iterator VMOVNBQ_M [VMOVNBQ_M_U VMOVNBQ_M_S])
 (define_int_iterator VRSHRNTQ_N [VRSHRNTQ_N_U VRSHRNTQ_N_S])
 (define_int_iterator VORRQ_M_N [VORRQ_M_N_S VORRQ_M_N_U])
@@ -2421,7 +2424,6 @@ (define_int_iterator VREV32Q_M [VREV32Q_M_S VREV32Q_M_U])
 (define_int_iterator VREV16Q_M [VREV16Q_M_S VREV16Q_M_U])
 (define_int_iterator VQRSHRNTQ_N [VQRSHRNTQ_N_U VQRSHRNTQ_N_S])
 (define_int_iterator VMOVNTQ_M [VMOVNTQ_M_U VMOVNTQ_M_S])
-(define_int_iterator VMOVLBQ_M [VMOVLBQ_M_U VMOVLBQ_M_S])
 (define_int_iterator VMLALDAVAQ [VMLALDAVAQ_S VMLALDAVAQ_U])
 (define_int_iterator VQSHRNBQ_N [VQSHRNBQ_N_U VQSHRNBQ_N_S])
 (define_int_iterator VSHRNBQ_N [VSHRNBQ_N_U VSHRNBQ_N_S])
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index c5373fef9a2..f5cb8ef48ef 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -386,30 +386,17 @@ (define_insn "@mve_q_"
 ])
 
 ;;
-;; [vmovltq_u, vmovltq_s])
+;; [vmovlbq_s, vmovlbq_u]
+;; [vmovltq_u, vmovltq_s]
 ;;
-(define_insn "mve_vmovltq_<supf><mode>"
-  [
-   (set (match_operand:<V_double_width> 0 "s_register_operand" "=w")
-        (unspec:<V_double_width> [(match_operand:MVE_3 1 "s_register_operand" "w")]
-         VMOVLTQ))
-  ]
-  "TARGET_HAVE_MVE"
-  "vmovlt.<supf>%#<V_sz_elem>   %q0, %q1"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vmovlbq_s, vmovlbq_u])
-;;
-(define_insn "mve_vmovlbq_<supf><mode>"
+(define_insn "@mve_<mve_insn>q_<supf><mode>"
   [
    (set (match_operand:<V_double_width> 0 "s_register_operand" "=w")
         (unspec:<V_double_width> [(match_operand:MVE_3 1 "s_register_operand" "w")]
-         VMOVLBQ))
+         VMOVLxQ))
   ]
   "TARGET_HAVE_MVE"
-  "vmovlb.<supf>%#<V_sz_elem>   %q0, %q1"
+  "<mve_insn>.<supf>%#<V_sz_elem>\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -2904,34 +2891,21 @@ (define_insn "mve_vmlsldavxq_p_s"
   "vpst\;vmlsldavxt.s%# %Q0, %R0, %q1, %q2"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
+
 ;;
 ;; [vmovlbq_m_u, vmovlbq_m_s])
-;;
-(define_insn "mve_vmovlbq_m_<supf><mode>"
-  [
-   (set (match_operand:<V_double_width> 0 "s_register_operand" "=w")
-        (unspec:<V_double_width> [(match_operand:<V_double_width> 1 "s_register_operand" "0")
-                       (match_operand:MVE_3 2 "s_register_operand" "w")
-                       (match_operand:<MVE_VPRED> 3 "vpr_register_operand" "Up")]
-         VMOVLBQ_M))
-  ]
-  "TARGET_HAVE_MVE"
-  "vpst\;vmovlbt.<supf>%#<V_sz_elem>   %q0, %q2"
-  [(set_attr "type" "mve_move")
-   (set_attr "length""8")])
-;;
 ;; [vmovltq_m_u, vmovltq_m_s])
 ;;
-(define_insn "mve_vmovltq_m_<supf><mode>"
+(define_insn "@mve_<mve_insn>q_m_<supf><mode>"
   [
    (set (match_operand:<V_double_width> 0 "s_register_operand" "=w")
         (unspec:<V_double_width> [(match_operand:<V_double_width> 1 "s_register_operand" "0")
  

[PATCH 19/20] arm: [MVE intrinsics] add unary_widen shape

2023-05-10 Thread Christophe Lyon via Gcc-patches
This patch adds the unary_widen shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_widen): New.
* config/arm/arm-mve-builtins-shapes.h (unary_widen): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 46 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 47 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index e77a0cc20ac..ae73fc6b1b7 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1236,6 +1236,52 @@ struct unary_n_def : public overloaded_base<0>
 };
 SHAPE (unary_n)
 
+/* _t vfoo[_t0](_t)
+
+   i.e. a version of "unary" in which the source elements are half the
+   size of the destination, but have the same type class.
+
+   Example: vmovlbq.
+   int32x4_t [__arm_]vmovlbq[_s16](int16x8_t a)
+   int32x4_t [__arm_]vmovlbq_m[_s16](int32x4_t inactive, int16x8_t a, mve_pred16_t p)
+   int32x4_t [__arm_]vmovlbq_x[_s16](int16x8_t a, mve_pred16_t p)  */
+struct unary_widen_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "vw0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+unsigned int i, nargs;
+type_suffix_index type;
+tree res;
+if (!r.check_gp_argument (1, i, nargs)
+   || (type = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES)
+  return error_mark_node;
+
+type_suffix_index wide_suffix
+  = find_type_suffix (type_suffixes[type].tclass,
+ type_suffixes[type].element_bits * 2);
+
+/* Check the inactive argument has the wide type.  */
+if ((r.pred == PRED_m)
+   && (r.infer_vector_type (0) != wide_suffix))
+return r.report_no_such_form (type);
+
+if ((res = r.lookup_form (r.mode_suffix_id, type)))
+   return res;
+
+return r.report_no_such_form (type);
+  }
+};
+SHAPE (unary_widen)
+
 } /* end namespace arm_mve */
 
 #undef SHAPE
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h 
b/gcc/config/arm/arm-mve-builtins-shapes.h
index c062fe624c4..5a8d9fe2b2d 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -58,6 +58,7 @@ namespace arm_mve
 extern const function_shape *const unary_int32;
 extern const function_shape *const unary_int32_acc;
 extern const function_shape *const unary_n;
+extern const function_shape *const unary_widen;
 
   } /* end namespace arm_mve::shapes */
 } /* end namespace arm_mve */
-- 
2.34.1
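
The resolve logic of unary_widen above boils down to "find the suffix of the
same type class with twice the element width, and for _m forms require the
inactive argument to have it". A stand-alone C++ sketch of that lookup follows.
The table, the enum and the function names are simplified stand-ins invented
for this example; they are not the real type_suffixes table or the real
find_type_suffix signature.

```cpp
#include <cstddef>

// Simplified stand-ins for the type class and suffix table (hypothetical).
enum type_class { TYPE_signed, TYPE_unsigned };

struct suffix_info {
    const char* name;
    type_class tclass;
    unsigned element_bits;
};

constexpr suffix_info suffixes[] = {
    {"s8", TYPE_signed, 8},   {"s16", TYPE_signed, 16},   {"s32", TYPE_signed, 32},
    {"u8", TYPE_unsigned, 8}, {"u16", TYPE_unsigned, 16}, {"u32", TYPE_unsigned, 32},
};
constexpr std::size_t NUM_SUFFIXES = sizeof(suffixes) / sizeof(suffixes[0]);

// Returns the index of the suffix with the given class and width, or
// NUM_SUFFIXES when there is none (mirroring the "no such suffix" sentinel).
std::size_t find_suffix(type_class tclass, unsigned bits) {
    for (std::size_t i = 0; i < NUM_SUFFIXES; ++i)
        if (suffixes[i].tclass == tclass && suffixes[i].element_bits == bits)
            return i;
    return NUM_SUFFIXES;
}

// Same type class, element width doubled.
std::size_t wide_suffix_of(std::size_t narrow) {
    return find_suffix(suffixes[narrow].tclass, suffixes[narrow].element_bits * 2);
}

// For the _m form: the inactive argument must have the widened type.
bool inactive_type_ok(std::size_t inactive, std::size_t source) {
    return inactive == wide_suffix_of(source);
}
```

In this model, s16 widens to s32 and u8 to u16, while s32 has no wider entry
and yields the sentinel.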



[PATCH 17/20] arm: [MVE intrinsics] rework vaddlvq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Implement vaddlvq using the new MVE builtins framework.

Since we kept v4si hardcoded in the builtin name, we need to
special-case it in unspec_mve_function_exact_insn_pred_p.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vaddlvq): New.
* config/arm/arm-mve-builtins-base.def (vaddlvq): New.
* config/arm/arm-mve-builtins-base.h (vaddlvq): New.
* config/arm/arm-mve-builtins-functions.h
(unspec_mve_function_exact_insn_pred_p): Handle vaddlvq.
* config/arm/arm_mve.h (vaddlvq): Remove.
(vaddlvq_p): Remove.
(vaddlvq_s32): Remove.
(vaddlvq_u32): Remove.
(vaddlvq_p_s32): Remove.
(vaddlvq_p_u32): Remove.
(__arm_vaddlvq_s32): Remove.
(__arm_vaddlvq_u32): Remove.
(__arm_vaddlvq_p_s32): Remove.
(__arm_vaddlvq_p_u32): Remove.
(__arm_vaddlvq): Remove.
(__arm_vaddlvq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc |  1 +
 gcc/config/arm/arm-mve-builtins-base.def|  1 +
 gcc/config/arm/arm-mve-builtins-base.h  |  1 +
 gcc/config/arm/arm-mve-builtins-functions.h | 69 ++--
 gcc/config/arm/arm_mve.h| 72 -
 5 files changed, 51 insertions(+), 93 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
b/gcc/config/arm/arm-mve-builtins-base.cc
index e87069b0467..fdc0ff50b96 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -244,6 +244,7 @@ namespace arm_mve {
 FUNCTION_WITHOUT_N (vabdq, VABDQ)
 FUNCTION (vabsq, unspec_based_mve_function_exact_insn, (ABS, ABS, ABS, -1, -1, 
-1, VABSQ_M_S, -1, VABSQ_M_F, -1, -1, -1))
 FUNCTION_WITH_RTX_M_N (vaddq, PLUS, VADDQ)
+FUNCTION_PRED_P_S_U (vaddlvq, VADDLVQ)
 FUNCTION_PRED_P_S_U (vaddvq, VADDVQ)
 FUNCTION_PRED_P_S_U (vaddvaq, VADDVAQ)
 FUNCTION_WITH_RTX_M (vandq, AND, VANDQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def 
b/gcc/config/arm/arm-mve-builtins-base.def
index 413fe4a1ef0..dcfb426a7fb 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -20,6 +20,7 @@
 #define REQUIRES_FLOAT false
 DEF_MVE_FUNCTION (vabdq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vabsq, unary, all_signed, mx_or_none)
+DEF_MVE_FUNCTION (vaddlvq, unary_acc, integer_32, p_or_none)
 DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vaddvaq, unary_int32_acc, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vaddvq, unary_int32, all_integer, p_or_none)
diff --git a/gcc/config/arm/arm-mve-builtins-base.h 
b/gcc/config/arm/arm-mve-builtins-base.h
index 5338b777444..5de70d5e1d4 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -25,6 +25,7 @@ namespace functions {
 
 extern const function_base *const vabdq;
 extern const function_base *const vabsq;
+extern const function_base *const vaddlvq;
 extern const function_base *const vaddq;
 extern const function_base *const vaddvaq;
 extern const function_base *const vaddvq;
diff --git a/gcc/config/arm/arm-mve-builtins-functions.h 
b/gcc/config/arm/arm-mve-builtins-functions.h
index d069990dcab..ea926e42b81 100644
--- a/gcc/config/arm/arm-mve-builtins-functions.h
+++ b/gcc/config/arm/arm-mve-builtins-functions.h
@@ -408,32 +408,59 @@ public:
   expand (function_expander &e) const override
   {
 insn_code code;
-switch (e.pred)
+
+if ((m_unspec_for_sint == VADDLVQ_S)
+   || m_unspec_for_sint == VADDLVAQ_S)
   {
-  case PRED_none:
-   if (e.type_suffix (0).integer_p)
- if (e.type_suffix (0).unsigned_p)
-   code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, 
e.vector_mode (0));
- else
-   code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, 
e.vector_mode (0));
-   else
- code = code_for_mve_q_f (m_unspec_for_fp, e.vector_mode (0));
+   switch (e.pred)
+ {
+ case PRED_none:
+   if (e.type_suffix (0).unsigned_p)
+ code = code_for_mve_q_v4si (m_unspec_for_uint, m_unspec_for_uint);
+   else
+ code = code_for_mve_q_v4si (m_unspec_for_sint, m_unspec_for_sint);
+   return e.use_exact_insn (code);
 
-   return e.use_exact_insn (code);
+ case PRED_p:
+   if (e.type_suffix (0).unsigned_p)
+ code = code_for_mve_q_p_v4si (m_unspec_for_p_uint, 
m_unspec_for_p_uint);
+   else
+ code = code_for_mve_q_p_v4si (m_unspec_for_p_sint, 
m_unspec_for_p_sint);
+   return e.use_exact_insn (code);
 
-  case PRED_p:
-   if (e.type_suffix (0).integer_p)
- if (e.type_suffix (0).unsigned_p)
-   code = code_for_mve_q_p (m_unspec_for_p_uint, m_unspec_for_p_uint, 
e.vector_mode (0));
- else
-   code = code_for_mve_q_p (m_unspec_for_p_sint, m_unspec_for_p_sint, 
e.vector_mode (0));
-   else
- code = 

[PATCH 08/20] arm: [MVE intrinsics] rework vdupq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Implement vdupq using the new MVE builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (FUNCTION_ONLY_N): New.
(vdupq): New.
* config/arm/arm-mve-builtins-base.def (vdupq): New.
* config/arm/arm-mve-builtins-base.h (vdupq): New.
* config/arm/arm_mve.h (vdupq_n): Remove.
(vdupq_m): Remove.
(vdupq_n_f16): Remove.
(vdupq_n_f32): Remove.
(vdupq_n_s8): Remove.
(vdupq_n_s16): Remove.
(vdupq_n_s32): Remove.
(vdupq_n_u8): Remove.
(vdupq_n_u16): Remove.
(vdupq_n_u32): Remove.
(vdupq_m_n_u8): Remove.
(vdupq_m_n_s8): Remove.
(vdupq_m_n_u16): Remove.
(vdupq_m_n_s16): Remove.
(vdupq_m_n_u32): Remove.
(vdupq_m_n_s32): Remove.
(vdupq_m_n_f16): Remove.
(vdupq_m_n_f32): Remove.
(vdupq_x_n_s8): Remove.
(vdupq_x_n_s16): Remove.
(vdupq_x_n_s32): Remove.
(vdupq_x_n_u8): Remove.
(vdupq_x_n_u16): Remove.
(vdupq_x_n_u32): Remove.
(vdupq_x_n_f16): Remove.
(vdupq_x_n_f32): Remove.
(__arm_vdupq_n_s8): Remove.
(__arm_vdupq_n_s16): Remove.
(__arm_vdupq_n_s32): Remove.
(__arm_vdupq_n_u8): Remove.
(__arm_vdupq_n_u16): Remove.
(__arm_vdupq_n_u32): Remove.
(__arm_vdupq_m_n_u8): Remove.
(__arm_vdupq_m_n_s8): Remove.
(__arm_vdupq_m_n_u16): Remove.
(__arm_vdupq_m_n_s16): Remove.
(__arm_vdupq_m_n_u32): Remove.
(__arm_vdupq_m_n_s32): Remove.
(__arm_vdupq_x_n_s8): Remove.
(__arm_vdupq_x_n_s16): Remove.
(__arm_vdupq_x_n_s32): Remove.
(__arm_vdupq_x_n_u8): Remove.
(__arm_vdupq_x_n_u16): Remove.
(__arm_vdupq_x_n_u32): Remove.
(__arm_vdupq_n_f16): Remove.
(__arm_vdupq_n_f32): Remove.
(__arm_vdupq_m_n_f16): Remove.
(__arm_vdupq_m_n_f32): Remove.
(__arm_vdupq_x_n_f16): Remove.
(__arm_vdupq_x_n_f32): Remove.
(__arm_vdupq_n): Remove.
(__arm_vdupq_m): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |  10 +
 gcc/config/arm/arm-mve-builtins-base.def |   2 +
 gcc/config/arm/arm-mve-builtins-base.h   |   1 +
 gcc/config/arm/arm_mve.h | 333 ---
 4 files changed, 13 insertions(+), 333 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
b/gcc/config/arm/arm-mve-builtins-base.cc
index 76294ddb7fb..cb572130c2b 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -176,6 +176,15 @@ namespace arm_mve {
 UNSPEC##_M_S, UNSPEC##_M_U, UNSPEC##_M_F,  \
 -1, -1, -1))
 
+  /* Helper for builtins with only unspec codes, _m predicated
+ overrides, only _n version.  */
+#define FUNCTION_ONLY_N(NAME, UNSPEC) FUNCTION \
+  (NAME, unspec_mve_function_exact_insn,   \
+   (-1, -1, -1,                                                        \
+    UNSPEC##_N_S, UNSPEC##_N_U, UNSPEC##_N_F,                          \
+    -1, -1, -1,                                                        \
+    UNSPEC##_M_N_S, UNSPEC##_M_N_U, UNSPEC##_M_N_F))
+
   /* Helper for builtins with only unspec codes, _m predicated
  overrides, only _n version, no floating-point.  */
 #define FUNCTION_ONLY_N_NO_F(NAME, UNSPEC) FUNCTION\
@@ -247,6 +256,7 @@ FUNCTION (vcmpltq, 
unspec_based_mve_function_exact_insn_vcmp, (LT, UNKNOWN, LT,
 FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, 
UNKNOWN, UNKNOWN, VCMPCSQ_M_U, UNKNOWN, UNKNOWN, VCMPCSQ_M_N_U, UNKNOWN))
 FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, 
UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN))
 FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ)
+FUNCTION_ONLY_N (vdupq, VDUPQ)
 FUNCTION_WITH_RTX_M (veorq, XOR, VEORQ)
 FUNCTION_WITH_M_N_NO_F (vhaddq, VHADDQ)
 FUNCTION_WITH_M_N_NO_F (vhsubq, VHSUBQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def 
b/gcc/config/arm/arm-mve-builtins-base.def
index 2602cbf20e3..30e6aa1e1e6 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -33,6 +33,7 @@ DEF_MVE_FUNCTION (vcmpleq, cmp, all_signed, m_or_none)
 DEF_MVE_FUNCTION (vcmpltq, cmp, all_signed, m_or_none)
 DEF_MVE_FUNCTION (vcmpneq, cmp, all_integer, m_or_none)
 DEF_MVE_FUNCTION (vcreateq, create, all_integer_with_64, none)
+DEF_MVE_FUNCTION (vdupq, unary_n, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (veorq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vhaddq, binary_opt_n, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vhsubq, binary_opt_n, all_integer, mx_or_none)
@@ -104,6 +105,7 @@ DEF_MVE_FUNCTION (vcmpleq, cmp, all_float, m_or_none)
 DEF_MVE_FUNCTION (vcmpltq, cmp, all_float, m_or_none)
 DEF_MVE_FUNC

[PATCH 13/20] arm: [MVE intrinsics] add unary_int32_acc shape

2023-05-10 Thread Christophe Lyon via Gcc-patches
This patch adds the unary_int32_acc shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_int32_acc): New.
* config/arm/arm-mve-builtins-shapes.h (unary_int32_acc): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 34 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 35 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index 0bd91b24147..bff1c3e843b 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1121,6 +1121,40 @@ struct unary_int32_def : public overloaded_base<0>
 };
 SHAPE (unary_int32)
 
+/* [u]int32_t vfoo[_<t0>]([u]int32_t, <T0>_t)
+
+   i.e. a version of "unary" which accumulates into a scalar of type
+   int32_t or uint32_t depending on the signedness of the elements of
+   the input vector.
+
+   Example: vaddvaq.
+   int32_t [__arm_]vaddvaq[_s16](int32_t a, int16x8_t b)
+   int32_t [__arm_]vaddvaq_p[_s16](int32_t a, int16x8_t b, mve_pred16_t p)  */
+struct unary_int32_acc_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sx32,sx32,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+unsigned int i, nargs;
+type_suffix_index type;
+if (!r.check_gp_argument (2, i, nargs)
+   || !r.require_integer_immediate (0)
+   || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES)
+  return error_mark_node;
+
+return r.resolve_to (r.mode_suffix_id, type);
+  }
+};
+SHAPE (unary_int32_acc)
+
 /* _t vfoo[_n]_t0(_t)
 
Example: vdupq.
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h 
b/gcc/config/arm/arm-mve-builtins-shapes.h
index f422550559e..fc1bacbd4da 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -55,6 +55,7 @@ namespace arm_mve
 extern const function_shape *const unary;
 extern const function_shape *const unary_convert;
 extern const function_shape *const unary_int32;
+extern const function_shape *const unary_int32_acc;
 extern const function_shape *const unary_n;
 
   } /* end namespace arm_mve::shapes */
-- 
2.34.1
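
The extra leading scalar in the unary_int32_acc signature ("sx32,sx32,v0") is
the accumulator. A scalar model of vaddvaq[_s16] makes the difference from the
plain reduction visible. This is a sketch in plain C++ with a hypothetical
function name, and it assumes the 32-bit accumulation wraps around on overflow.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of vaddvaq[_s16]: start from the scalar a and add every
// lane of b. The sum is formed in uint32_t so that wrap-around is well
// defined in C++; converting back to int32_t is modular on common compilers
// (and guaranteed to be so from C++20 on).
int32_t model_vaddvaq_s16(int32_t a, const std::array<int16_t, 8>& b) {
    uint32_t acc = static_cast<uint32_t>(a);
    for (int16_t lane : b)
        acc += static_cast<uint32_t>(static_cast<int32_t>(lane));
    return static_cast<int32_t>(acc);
}
```

Calling it with an accumulator of 0 gives the same result as a plain lane sum.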



[PATCH 12/20] arm: [MVE intrinsics] factorize vaddvaq

2023-05-10 Thread Christophe Lyon via Gcc-patches
Factorize vaddvaq builtins so that they use parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (mve_insn): Add vaddva.
* config/arm/mve.md (mve_vaddvaq_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(mve_vaddvaq_p_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_p_<supf><mode>): ... this.
---
 gcc/config/arm/iterators.md | 2 ++
 gcc/config/arm/mve.md   | 8 
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 46c7ddeda67..00123c0a376 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -762,6 +762,8 @@ (define_int_attr mve_insn [
 (VADDQ_M_N_S "vadd") (VADDQ_M_N_U "vadd") (VADDQ_M_N_F "vadd")
 (VADDQ_M_S "vadd") (VADDQ_M_U "vadd") (VADDQ_M_F "vadd")
 (VADDQ_N_S "vadd") (VADDQ_N_U "vadd") (VADDQ_N_F "vadd")
+(VADDVAQ_P_S "vaddva") (VADDVAQ_P_U "vaddva")
+(VADDVAQ_S "vaddva") (VADDVAQ_U "vaddva")
 (VADDVQ_P_S "vaddv") (VADDVQ_P_U "vaddv")
 (VADDVQ_S "vaddv") (VADDVQ_U "vaddv")
 (VANDQ_M_S "vand") (VANDQ_M_U "vand") (VANDQ_M_F "vand")
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index d772f4d4380..1ccbce3c89c 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -758,7 +758,7 @@ (define_insn "@mve_q_n_"
 ;;
 ;; [vaddvaq_s, vaddvaq_u])
 ;;
-(define_insn "mve_vaddvaq_<supf><mode>"
+(define_insn "@mve_<mve_insn>q_<supf><mode>"
   [
    (set (match_operand:SI 0 "s_register_operand" "=Te")
         (unspec:SI [(match_operand:SI 1 "s_register_operand" "0")
@@ -766,7 +766,7 @@ (define_insn "mve_vaddvaq_<supf><mode>"
          VADDVAQ))
   ]
   "TARGET_HAVE_MVE"
-  "vaddva.<supf>%#<V_sz_elem>\t%0, %q2"
+  "<mve_insn>.<supf>%#<V_sz_elem>\t%0, %q2"
   [(set_attr "type" "mve_move")
 ])
 
@@ -1944,7 +1944,7 @@ (define_insn "@mve_q_m_"
 ;;
 ;; [vaddvaq_p_u, vaddvaq_p_s])
 ;;
-(define_insn "mve_vaddvaq_p_<supf><mode>"
+(define_insn "@mve_<mve_insn>q_p_<supf><mode>"
   [
    (set (match_operand:SI 0 "s_register_operand" "=Te")
         (unspec:SI [(match_operand:SI 1 "s_register_operand" "0")
@@ -1953,7 +1953,7 @@ (define_insn "mve_vaddvaq_p_<supf><mode>"
          VADDVAQ_P))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;vaddvat.<supf>%#<V_sz_elem>   %0, %q2"
+  "vpst\;<mve_insn>t.<supf>%#<V_sz_elem>\t%0, %q2"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 
-- 
2.34.1



[PATCH 05/20] arm: [MVE intrinsics] rework vrev16q vrev32q vrev64q

2023-05-10 Thread Christophe Lyon via Gcc-patches
Implement vrev16q, vrev32q, vrev64q using the new MVE builtins
framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vrev16q, vrev32q, vrev64q):
New.
* config/arm/arm-mve-builtins-base.def (vrev16q, vrev32q)
(vrev64q): New.
* config/arm/arm-mve-builtins-base.h (vrev16q, vrev32q)
(vrev64q): New.
* config/arm/arm_mve.h (vrev16q): Remove.
(vrev32q): Remove.
(vrev64q): Remove.
(vrev64q_m): Remove.
(vrev16q_m): Remove.
(vrev32q_m): Remove.
(vrev16q_x): Remove.
(vrev32q_x): Remove.
(vrev64q_x): Remove.
(vrev64q_f16): Remove.
(vrev64q_f32): Remove.
(vrev32q_f16): Remove.
(vrev16q_s8): Remove.
(vrev32q_s8): Remove.
(vrev32q_s16): Remove.
(vrev64q_s8): Remove.
(vrev64q_s16): Remove.
(vrev64q_s32): Remove.
(vrev64q_u8): Remove.
(vrev64q_u16): Remove.
(vrev64q_u32): Remove.
(vrev32q_u8): Remove.
(vrev32q_u16): Remove.
(vrev16q_u8): Remove.
(vrev64q_m_u8): Remove.
(vrev64q_m_s8): Remove.
(vrev64q_m_u16): Remove.
(vrev64q_m_s16): Remove.
(vrev64q_m_u32): Remove.
(vrev64q_m_s32): Remove.
(vrev16q_m_s8): Remove.
(vrev32q_m_f16): Remove.
(vrev16q_m_u8): Remove.
(vrev32q_m_s8): Remove.
(vrev64q_m_f16): Remove.
(vrev32q_m_u8): Remove.
(vrev32q_m_s16): Remove.
(vrev64q_m_f32): Remove.
(vrev32q_m_u16): Remove.
(vrev16q_x_s8): Remove.
(vrev16q_x_u8): Remove.
(vrev32q_x_s8): Remove.
(vrev32q_x_s16): Remove.
(vrev32q_x_u8): Remove.
(vrev32q_x_u16): Remove.
(vrev64q_x_s8): Remove.
(vrev64q_x_s16): Remove.
(vrev64q_x_s32): Remove.
(vrev64q_x_u8): Remove.
(vrev64q_x_u16): Remove.
(vrev64q_x_u32): Remove.
(vrev32q_x_f16): Remove.
(vrev64q_x_f16): Remove.
(vrev64q_x_f32): Remove.
(__arm_vrev16q_s8): Remove.
(__arm_vrev32q_s8): Remove.
(__arm_vrev32q_s16): Remove.
(__arm_vrev64q_s8): Remove.
(__arm_vrev64q_s16): Remove.
(__arm_vrev64q_s32): Remove.
(__arm_vrev64q_u8): Remove.
(__arm_vrev64q_u16): Remove.
(__arm_vrev64q_u32): Remove.
(__arm_vrev32q_u8): Remove.
(__arm_vrev32q_u16): Remove.
(__arm_vrev16q_u8): Remove.
(__arm_vrev64q_m_u8): Remove.
(__arm_vrev64q_m_s8): Remove.
(__arm_vrev64q_m_u16): Remove.
(__arm_vrev64q_m_s16): Remove.
(__arm_vrev64q_m_u32): Remove.
(__arm_vrev64q_m_s32): Remove.
(__arm_vrev16q_m_s8): Remove.
(__arm_vrev16q_m_u8): Remove.
(__arm_vrev32q_m_s8): Remove.
(__arm_vrev32q_m_u8): Remove.
(__arm_vrev32q_m_s16): Remove.
(__arm_vrev32q_m_u16): Remove.
(__arm_vrev16q_x_s8): Remove.
(__arm_vrev16q_x_u8): Remove.
(__arm_vrev32q_x_s8): Remove.
(__arm_vrev32q_x_s16): Remove.
(__arm_vrev32q_x_u8): Remove.
(__arm_vrev32q_x_u16): Remove.
(__arm_vrev64q_x_s8): Remove.
(__arm_vrev64q_x_s16): Remove.
(__arm_vrev64q_x_s32): Remove.
(__arm_vrev64q_x_u8): Remove.
(__arm_vrev64q_x_u16): Remove.
(__arm_vrev64q_x_u32): Remove.
(__arm_vrev64q_f16): Remove.
(__arm_vrev64q_f32): Remove.
(__arm_vrev32q_f16): Remove.
(__arm_vrev32q_m_f16): Remove.
(__arm_vrev64q_m_f16): Remove.
(__arm_vrev64q_m_f32): Remove.
(__arm_vrev32q_x_f16): Remove.
(__arm_vrev64q_x_f16): Remove.
(__arm_vrev64q_x_f32): Remove.
(__arm_vrev16q): Remove.
(__arm_vrev32q): Remove.
(__arm_vrev64q): Remove.
(__arm_vrev64q_m): Remove.
(__arm_vrev16q_m): Remove.
(__arm_vrev32q_m): Remove.
(__arm_vrev16q_x): Remove.
(__arm_vrev32q_x): Remove.
(__arm_vrev64q_x): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   3 +
 gcc/config/arm/arm-mve-builtins-base.def |   5 +
 gcc/config/arm/arm-mve-builtins-base.h   |   3 +
 gcc/config/arm/arm_mve.h | 820 ---
 4 files changed, 11 insertions(+), 820 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index 14870f5b1aa..76294ddb7fb 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -293,6 +293,9 @@ FUNCTION_ONLY_N_NO_U_F (vqshrunbq, VQSHRUNBQ)
 FUNCTION_ONLY_N_NO_U_F (vqshruntq, VQSHRUNTQ)
 FUNCTION_WITH_M_N_NO_F (vqsubq, VQSUBQ)
 FUNCTION (vreinterpretq, vreinterpretq_impl,)
+FUNCTION_WITHOUT_N_NO_F (vrev16q, VREV16Q)
+FUNCTION_WITHOUT_N (vrev32q, VREV32Q)
+FUNCTION_WITHOUT_N (vrev64q, VREV64Q)
 FUNCTION_WITHOUT_N_NO_F (vrhaddq, VRHADDQ)
 FUNCTION_

Re: Testsuite: Add 'torture-init-done', and use it to conditionalize implicit 'torture-init' (was: Testsuite: Add missing 'torture-init'/'torture-finish' around 'LTO_TORTURE_OPTIONS' usage (was: Let e

2023-05-10 Thread Christophe Lyon via Gcc-patches
Hi Thomas,


On Wed, 10 May 2023 at 09:52, Thomas Schwinge 
wrote:

> Hi Christophe!
>
> On 2023-05-09T21:14:07+0200, Christophe Lyon 
> wrote:
> > On Tue, 9 May 2023 at 17:17, Christophe Lyon
> > wrote:
> >> On Tue, 9 May 2023 at 11:00, Thomas Schwinge 
> >> wrote:
> >>> On 2023-05-09T09:32:55+0200, Christophe Lyon <christophe.l...@linaro.org>
> >>> wrote:
> >>> > On Wed, 3 May 2023 at 13:47, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> >>> >> On Wed, 3 May 2023, Thomas Schwinge wrote:
> >>> >> > "Let each 'lto_init' determine the default 'LTO_OPTIONS', and
> >>> >> > 'torture-init' the 'LTO_TORTURE_OPTIONS'"?
> >>> >
> >>> > This is causing issues on arm/aarch64, including:
> >>> >
> >>> > ERROR: can't read "LTO_TORTURE_OPTIONS": no such variable
> >>> > in gcc.target/arm/acle/acle.exp:
> >>> >
> >>> > ERROR: torture-init: LTO_TORTURE_OPTIONS is not empty as expected
> >>> > in gcc.target/aarch64/sls-mitigation/sls-mitigation.exp,
> >>> > gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp,
> >>> > gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp,
> >>> > gcc.target/aarch64/torture/aarch64-torture.exp
> >>> >
> >>> > and maybe others
> >>> >
> >>> > Are other targets affected too?
> >>>
> >>> Sorry for that -- it means, the safe-guards I added are working as
> >>> expected.
> >>>
> >>> Please test whether all these issues are gone with the attached
> >>> "Testsuite: Add missing 'torture-init'/'torture-finish' around
> >>> 'LTO_TORTURE_OPTIONS' usage"?
> >>
> >> Your patch seemed reasonable, but it doesn't work :-(
> >>
> >> Well now I get:
> >> ERROR: torture-init: LTO_TORTURE_OPTIONS is not empty as expected
> >> because gcc-dg-runtest itself calls torture-init
> >>
> >> but I'm not sure where LTO_TORTURE_OPTIONS is set
> >
> > Just checking, are you able to test your changes on arm (a cross
> > toolchain is OK)?
>
> Sorry, I don't currently have an arm/aarch64 toolchain built.
>
> > The problem shows up even if running only acle.exp, so it's quick once
> > you have built the toolchain once.
>
> I did a quick hack:
>
> --- gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-mitigation.exp
> +++ gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-mitigation.exp
> @@ -22,3 +21,0 @@
> -if {![istarget aarch64*-*-*] } then {
> -  return
> -}
> --- gcc/testsuite/gcc.target/arm/acle/acle.exp
> +++ gcc/testsuite/gcc.target/arm/acle/acle.exp
> @@ -20,3 +19,0 @@
> -if ![istarget arm*-*-*] then {
> -  return
> -}
>
> ..., and can confirm that I run into the DejaGnu/TCL ERRORs in my
> x86_64-pc-linux-gnu testing.
>
> > I spent some time looking at it, and the conflict is that the .exp file
> > calls torture-init and gcc-dg-runtest, which in turn calls torture-init
> > again, leading to the error.
>
> I see, thanks -- and sorry, once again.
>
> > I haven't checked the details of why there are similar failures on
> > aarch64.
>
> I now understand that the problem is the following: most '*.exp'
> files have 'torture-init' followed by 'set-torture-options' before
> 'gcc-dg-runtest' etc., and therefore don't run into the latter's
> "Some callers set torture options themselves; don't override those."
> code.  Some '*.exp' files however do 'torture-init' but not
> 'set-torture-options', and therefore we can no longer conditionalize
> the implicit 'torture-init' by '![torture-options-exist]'.
> Please in addition to the earlier
> "Testsuite: Add missing 'torture-init'/'torture-finish' around
> 'LTO_TORTURE_OPTIONS' usage"
> also apply the attached
> "Testsuite: Add 'torture-init-done', and use it to conditionalize implicit
> 'torture-init'".
> That hopefully should restore sanity -- if not, I'll get arm/aarch64
> toolchains built.
>
>
Thanks for the patch, it seems to work!

Christophe


>
> Grüße
>  Thomas
>
>
>
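
The double-initialization problem discussed above can be modelled in a few lines. The following is only a hypothetical C++ sketch: the real code is Tcl in the GCC testsuite, the struct and function names here are invented to mirror the procedure names quoted in the thread, and the error message is simplified. It shows why a guard keyed on "options exist" misfires for a caller that did 'torture-init' without 'set-torture-options', while a guard keyed on "init done" does not.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical C++ model of the DejaGnu procedures named in the thread.
// It is not the Tcl code from the attached patches.
struct TortureState {
  bool init_done = false;            // models 'torture-init-done'
  std::vector<std::string> options;  // models what 'set-torture-options' stores

  // Explicit initialization; a second call is reported as an error.
  void torture_init() {
    if (init_done)
      throw std::logic_error("torture-init: already initialized");
    init_done = true;
  }

  void set_torture_options(std::vector<std::string> opts) {
    options = std::move(opts);
  }

  bool torture_options_exist() const { return !options.empty(); }
};

// Old guard: the implicit init is keyed on whether options were set.
inline void runtest_old_guard(TortureState &s) {
  if (!s.torture_options_exist())
    s.torture_init();
}

// New guard: the implicit init is keyed on whether init already happened.
inline void runtest_new_guard(TortureState &s) {
  if (!s.init_done)
    s.torture_init();
}
```

With the old guard, a caller that initialized but set no options triggers a second initialization; with the new guard the implicit step is skipped.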


Re: [PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape

2023-05-11 Thread Christophe Lyon via Gcc-patches




On 5/10/23 16:52, Kyrylo Tkachov wrote:




-Original Message-
From: Christophe Lyon 
Sent: Wednesday, May 10, 2023 2:31 PM
To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
Richard Earnshaw ; Richard Sandiford

Cc: Christophe Lyon 
Subject: [PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape

This patch adds the unary_acc shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_acc): New.
* config/arm/arm-mve-builtins-shapes.h (unary_acc): New.
---
  gcc/config/arm/arm-mve-builtins-shapes.cc | 28 +++
  gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
  2 files changed, 29 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index bff1c3e843b..e77a0cc20ac 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1066,6 +1066,34 @@ struct unary_def : public overloaded_base<0>
  };
  SHAPE (unary)

+/* _t vfoo[_](_t)
+
+   i.e. a version of "unary" in which the source elements are half the
+   size of the destination scalar, but have the same type class.
+
+   Example: vaddlvq.
+   int64_t [__arm_]vaddlvq[_s32](int32x4_t a)
+   int64_t [__arm_]vaddlvq_p[_s32](int32x4_t a, mve_pred16_t p) */
+struct unary_acc_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sw0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+/* FIXME: check that the return value is actually
+   twice as wide as arg 0.  */


Any reason why we can't add that check now?
I'd rather not add new FIXMEs here...


I understand :-)

That's because the resolver only knows about the arguments, not the 
return value:

  /* The arguments to the overloaded function.  */
  vec &m_arglist;

I kept this like what already exists for AArch64/SVE, but we'll need to 
extend it to handle return values too, so that we can support all 
overloaded forms of vuninitialized

(see https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616003.html)

I meant this extension to be follow-up work for when most intrinsics have
been converted and the few remaining ones (e.g. vuninitialized) need an
improved framework.  That would also make it possible to fix the FIXME.


Thanks,

Christophe



Thanks,
Kyrill


+return r.resolve_unary ();
+  }
+};
+SHAPE (unary_acc)
+
  /* _t foo_t0[_t1](_t)

 where the target type  must be specified explicitly but the source
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index fc1bacbd4da..c062fe624c4 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -53,6 +53,7 @@ namespace arm_mve
  extern const function_shape *const create;
  extern const function_shape *const inherent;
  extern const function_shape *const unary;
+extern const function_shape *const unary_acc;
  extern const function_shape *const unary_convert;
  extern const function_shape *const unary_int32;
  extern const function_shape *const unary_int32_acc;
--
2.34.1
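
As an illustration of the width relationship that the shape comment and the FIXME refer to, here is a scalar C++ sketch. It is not the MVE intrinsic and not GCC code: the names model_vaddlvq and twice_as_wide are hypothetical, and std::array stands in for the vector type. It sums the elements into a scalar of the same type class that is twice as wide, and checks the "twice as wide" property at compile time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: maps an element type to the scalar of the same
// type class that is twice as wide.
template <typename T> struct twice_as_wide;
template <> struct twice_as_wide<int32_t> { using type = int64_t; };
template <> struct twice_as_wide<uint32_t> { using type = uint64_t; };

// Scalar model of an across-vector widening add (in the spirit of vaddlvq).
template <typename T, std::size_t N>
typename twice_as_wide<T>::type model_vaddlvq(const std::array<T, N> &a) {
  using wide = typename twice_as_wide<T>::type;
  // The property the FIXME would like the resolver to verify.
  static_assert(sizeof(wide) == 2 * sizeof(T),
                "result must be twice as wide as the source elements");
  wide sum = 0;
  for (T x : a)
    sum += static_cast<wide>(x);  // widen each element before adding
  return sum;
}
```

In this model the check is easy because the return type is derived from the argument type; the thread explains why the resolver itself has no access to the return value.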




Re: [PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape

2023-05-11 Thread Christophe Lyon via Gcc-patches




On 5/11/23 10:23, Kyrylo Tkachov wrote:




-Original Message-
From: Christophe Lyon 
Sent: Thursday, May 11, 2023 9:21 AM
To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org;
Richard Earnshaw ; Richard Sandiford

Subject: Re: [PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape



On 5/10/23 16:52, Kyrylo Tkachov wrote:




-Original Message-
From: Christophe Lyon 
Sent: Wednesday, May 10, 2023 2:31 PM
To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
Richard Earnshaw ; Richard Sandiford

Cc: Christophe Lyon 
Subject: [PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape

This patch adds the unary_acc shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_acc): New.
* config/arm/arm-mve-builtins-shapes.h (unary_acc): New.
---
   gcc/config/arm/arm-mve-builtins-shapes.cc | 28 +++
   gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
   2 files changed, 29 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index bff1c3e843b..e77a0cc20ac 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1066,6 +1066,34 @@ struct unary_def : public overloaded_base<0>
   };
   SHAPE (unary)

+/* _t vfoo[_](_t)
+
+   i.e. a version of "unary" in which the source elements are half the
+   size of the destination scalar, but have the same type class.
+
+   Example: vaddlvq.
+   int64_t [__arm_]vaddlvq[_s32](int32x4_t a)
+   int64_t [__arm_]vaddlvq_p[_s32](int32x4_t a, mve_pred16_t p) */
+struct unary_acc_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sw0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+/* FIXME: check that the return value is actually
+   twice as wide as arg 0.  */


Any reason why we can't add that check now?
I'd rather not add new FIXMEs here...


I understand :-)

That's because the resolver only knows about the arguments, not the
return value:
/* The arguments to the overloaded function.  */
vec &m_arglist;

I kept this like what already exists for AArch64/SVE, but we'll need to
extend it to handle return values too, so that we can support all
overloaded forms of vuninitialized
(see https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616003.html)

I meant this extension to be follow-up work for when most intrinsics have
been converted and the few remaining ones (e.g. vuninitialized) need an
improved framework.  That would also make it possible to fix the FIXME.


Thanks for explaining.
The series is ok for trunk then.


Great, thanks!


Kyrill



Thanks,

Christophe



Thanks,
Kyrill


+return r.resolve_unary ();
+  }
+};
+SHAPE (unary_acc)
+
   /* _t foo_t0[_t1](_t)

  where the target type  must be specified explicitly but the source
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index fc1bacbd4da..c062fe624c4 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -53,6 +53,7 @@ namespace arm_mve
   extern const function_shape *const create;
   extern const function_shape *const inherent;
   extern const function_shape *const unary;
+extern const function_shape *const unary_acc;
   extern const function_shape *const unary_convert;
   extern const function_shape *const unary_int32;
   extern const function_shape *const unary_int32_acc;
--
2.34.1




Re: [PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape

2023-05-11 Thread Christophe Lyon via Gcc-patches




On 5/11/23 10:30, Richard Sandiford wrote:

Christophe Lyon  writes:

On 5/10/23 16:52, Kyrylo Tkachov wrote:




-Original Message-
From: Christophe Lyon 
Sent: Wednesday, May 10, 2023 2:31 PM
To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ;
Richard Earnshaw ; Richard Sandiford

Cc: Christophe Lyon 
Subject: [PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape

This patch adds the unary_acc shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_acc): New.
* config/arm/arm-mve-builtins-shapes.h (unary_acc): New.
---
   gcc/config/arm/arm-mve-builtins-shapes.cc | 28 +++
   gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
   2 files changed, 29 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index bff1c3e843b..e77a0cc20ac 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1066,6 +1066,34 @@ struct unary_def : public overloaded_base<0>
   };
   SHAPE (unary)

+/* _t vfoo[_](_t)
+
+   i.e. a version of "unary" in which the source elements are half the
+   size of the destination scalar, but have the same type class.
+
+   Example: vaddlvq.
+   int64_t [__arm_]vaddlvq[_s32](int32x4_t a)
+   int64_t [__arm_]vaddlvq_p[_s32](int32x4_t a, mve_pred16_t p) */
+struct unary_acc_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sw0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+/* FIXME: check that the return value is actually
+   twice as wide as arg 0.  */


Any reason why we can't add that check now?
I'd rather not add new FIXMEs here...


I understand :-)

That's because the resolver only knows about the arguments, not the
return value:
/* The arguments to the overloaded function.  */
vec &m_arglist;

I kept this like what already exists for AArch64/SVE, but we'll need to
extend it to handle return values too, so that we can support all
overloaded forms of vuninitialized
(see https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616003.html)

I meant this extension to be follow-up work for when most intrinsics have
been converted and the few remaining ones (e.g. vuninitialized) need an
improved framework.  That would also make it possible to fix the FIXME.


We can't resolve based on the return type though.  It has to be
arguments only.  E.g.:

decltype(foo(a, b))

has to be well-defined, even though decltype (by design) provides no
context about "what the caller wants".



So in fact we can probably get rid of (most of) the remaining
definitions of vuninitializedq in arm_mve.h, but not by looking at the
return type (re-reading this, I'm wondering whether I overlooked this
when I started the series).


But for things like vaddlvq, does that mean we can't check that the
result is actually written to a location twice as wide as the argument?


Thanks,

Christophe



Thanks,
Richard
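
Richard's point about decltype can be illustrated with ordinary C++ overloads. This is a sketch with hypothetical types and function names (model_s32x4, model_u32x4, model_addlv), not the GCC resolver: the overload is chosen from the argument type alone, so decltype of the call expression is well-defined without knowing what the caller will do with the result. Two functions that differ only in their return type could not be declared as overloads at all.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for two vector types.
struct model_s32x4 { int32_t lane[4]; };
struct model_u32x4 { uint32_t lane[4]; };

// Overloads selected purely by the argument type.
inline int64_t model_addlv(const model_s32x4 &v) {
  int64_t sum = 0;
  for (int32_t x : v.lane)
    sum += x;
  return sum;
}

inline uint64_t model_addlv(const model_u32x4 &v) {
  uint64_t sum = 0;
  for (uint32_t x : v.lane)
    sum += x;
  return sum;
}

// decltype sees only the arguments, yet the result type is fully determined.
static_assert(
    std::is_same<decltype(model_addlv(std::declval<model_s32x4>())),
                 int64_t>::value,
    "signed vector selects the int64_t overload");
static_assert(
    std::is_same<decltype(model_addlv(std::declval<model_u32x4>())),
                 uint64_t>::value,
    "unsigned vector selects the uint64_t overload");
```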


[PATCH 02/24] arm: [MVE intrinsics] add unary_widen_acc shape

2023-05-11 Thread Christophe Lyon via Gcc-patches
This patch adds the unary_widen_acc shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (unary_widen_acc): New.
* config/arm/arm-mve-builtins-shapes.h (unary_widen_acc): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 34 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 35 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index ae73fc6b1b7..a7faf8299cb 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1282,6 +1282,40 @@ struct unary_widen_def : public overloaded_base<0>
 };
 SHAPE (unary_widen)
 
+/* _t vfoo[_](_t, _t)
+
+   i.e. a version of "unary" in which the source elements are half the
+   size of the destination scalar and accumulator, but have the same
+   type class.
+
+   Example: vaddlvaq.
+   int64_t [__arm_]vaddlvaq[_s32](int64_t a, int32x4_t b)
+   int64_t [__arm_]vaddlvaq_p[_s32](int64_t a, int32x4_t b, mve_pred16_t p)  */
+struct unary_widen_acc_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sw0,sw0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+unsigned int i, nargs;
+type_suffix_index type;
+if (!r.check_gp_argument (2, i, nargs)
+   || !r.require_derived_scalar_type (0, r.SAME_TYPE_CLASS)
+   || (type = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES)
+  return error_mark_node;
+
+return r.resolve_to (r.mode_suffix_id, type);
+  }
+};
+SHAPE (unary_widen_acc)
+
 } /* end namespace arm_mve */
 
 #undef SHAPE
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index 5a8d9fe2b2d..46cc26ef918 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -59,6 +59,7 @@ namespace arm_mve
 extern const function_shape *const unary_int32_acc;
 extern const function_shape *const unary_n;
 extern const function_shape *const unary_widen;
+extern const function_shape *const unary_widen_acc;
 
   } /* end namespace arm_mve::shapes */
 } /* end namespace arm_mve */
-- 
2.34.1
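
The checks performed by the resolve function in this patch can be pictured with a small standalone model. Everything below is a hypothetical sketch, not the GCC function_resolver API: ArgInfo, TypeClass and model_resolve_unary_widen_acc are invented names, only the unpredicated two-argument form is modelled, and only the checks visible in the patch (argument count, scalar accumulator of the same type class, vector second argument) are reproduced.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class TypeClass { SignedInt, UnsignedInt, Float };

// Hypothetical description of one call argument.
struct ArgInfo {
  bool is_vector;
  TypeClass cls;
  unsigned element_bits;
};

inline const char *class_letter(TypeClass c) {
  switch (c) {
    case TypeClass::SignedInt: return "s";
    case TypeClass::UnsignedInt: return "u";
    case TypeClass::Float: return "f";
  }
  return "";
}

// Returns true and fills 'suffix' (e.g. "s32") when the call
// (scalar accumulator, vector) fits the modelled shape.
inline bool model_resolve_unary_widen_acc(const std::vector<ArgInfo> &args,
                                          std::string &suffix) {
  if (args.size() != 2)
    return false;  // wrong number of arguments
  const ArgInfo &acc = args[0];
  const ArgInfo &vec = args[1];
  if (acc.is_vector || !vec.is_vector)
    return false;  // expect a scalar followed by a vector
  if (acc.cls != vec.cls)
    return false;  // accumulator must share the vector's type class
  suffix = class_letter(vec.cls) + std::to_string(vec.element_bits);
  return true;
}
```

The type suffix is inferred from the vector argument, which matches the idea in the thread that resolution is driven by arguments only.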



[PATCH 01/24] arm: [MVE intrinsics] factorize vaddlvaq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Factorize vaddlvaq builtins so that they use parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (mve_insn): Add vaddlva.
* config/arm/mve.md (mve_vaddlvaq_v4si): Rename into ...
(@mve_q_v4si): ... this.
(mve_vaddlvaq_p_v4si): Rename into ...
(@mve_q_p_v4si): ... this.
---
 gcc/config/arm/iterators.md | 2 ++
 gcc/config/arm/mve.md   | 8 
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 2f6de937ef7..ff146afd913 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -759,6 +759,8 @@ (define_int_attr mve_insn [
 (VABDQ_S "vabd") (VABDQ_U "vabd") (VABDQ_F "vabd")
 (VABSQ_M_F "vabs")
 (VABSQ_M_S "vabs")
+(VADDLVAQ_P_S "vaddlva") (VADDLVAQ_P_U "vaddlva")
+(VADDLVAQ_S "vaddlva") (VADDLVAQ_U "vaddlva")
 (VADDLVQ_P_S "vaddlv") (VADDLVQ_P_U "vaddlv")
 (VADDLVQ_S "vaddlv") (VADDLVQ_U "vaddlv")
 (VADDQ_M_N_S "vadd") (VADDQ_M_N_U "vadd") (VADDQ_M_N_F "vadd")
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index f5cb8ef48ef..b548eced4f5 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1222,7 +1222,7 @@ (define_insn "@mve_q_f"
 ;;
 ;; [vaddlvaq_s vaddlvaq_u])
 ;;
-(define_insn "mve_vaddlvaq_v4si"
+(define_insn "@mve_q_v4si"
   [
(set (match_operand:DI 0 "s_register_operand" "=r")
(unspec:DI [(match_operand:DI 1 "s_register_operand" "0")
@@ -1230,7 +1230,7 @@ (define_insn "mve_vaddlvaq_v4si"
 VADDLVAQ))
   ]
   "TARGET_HAVE_MVE"
-  "vaddlva.32\t%Q0, %R0, %q2"
+  ".32\t%Q0, %R0, %q2"
   [(set_attr "type" "mve_move")
 ])
 
@@ -2534,7 +2534,7 @@ (define_insn "@mve_q_m_f"
 ;;
 ;; [vaddlvaq_p_s vaddlvaq_p_u])
 ;;
-(define_insn "mve_vaddlvaq_p_v4si"
+(define_insn "@mve_q_p_v4si"
   [
(set (match_operand:DI 0 "s_register_operand" "=r")
(unspec:DI [(match_operand:DI 1 "s_register_operand" "0")
@@ -2543,7 +2543,7 @@ (define_insn "mve_vaddlvaq_p_v4si"
 VADDLVAQ_P))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;vaddlvat.32\t%Q0, %R0, %q2"
+  "vpst\;t.32\t%Q0, %R0, %q2"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 ;;
-- 
2.34.1
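
The idea behind "parameterized names" can be sketched as a lookup from an unspec code to the pieces of the mnemonic. This is a hypothetical C++ analogy, not machine-description syntax: the stems mirror the (UNSPEC "stem") pairs listed in the iterators.md hunk above, while the signedness letters and the way the pieces are concatenated are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

enum class Unspec { VADDLVAQ_S, VADDLVAQ_U, VADDLVQ_S, VADDLVQ_U };

// Stem lookup, mirroring the pairs added to mve_insn in the patch.
inline std::string model_mve_insn(Unspec u) {
  switch (u) {
    case Unspec::VADDLVAQ_S:
    case Unspec::VADDLVAQ_U:
      return "vaddlva";
    case Unspec::VADDLVQ_S:
    case Unspec::VADDLVQ_U:
      return "vaddlv";
  }
  return "";
}

// Signedness letter; the _S -> "s", _U -> "u" mapping is an assumption.
inline std::string model_supf(Unspec u) {
  switch (u) {
    case Unspec::VADDLVAQ_S:
    case Unspec::VADDLVQ_S:
      return "s";
    case Unspec::VADDLVAQ_U:
    case Unspec::VADDLVQ_U:
      return "u";
  }
  return "";
}

// One "template" serves every unspec instead of one hand-written string each.
inline std::string model_asm_mnemonic(Unspec u) {
  return model_mve_insn(u) + "." + model_supf(u) + "32";
}
```

A single pattern written against such lookups replaces several near-identical patterns, which is the kind of saving the later factorization patches in this series report in their diffstats.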



[PATCH 07/24] arm: [MVE intrinsics] add binary_acca_int32 shape

2023-05-11 Thread Christophe Lyon via Gcc-patches
This patch adds the binary_acca_int32 shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_acca_int32): New.
* config/arm/arm-mve-builtins-shapes.h (binary_acca_int32): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 37 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 38 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index e491c810b40..ceb13230da6 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -392,6 +392,43 @@ struct binary_acc_int32_def : public overloaded_base<0>
 };
 SHAPE (binary_acc_int32)
 
+/* <[u]int32>_t vfoo[_]([u]int32_t, _t, _t)
+
+   Example: vmladavaq.
+   int32_t [__arm_]vmladavaq[_s16](int32_t add, int16x8_t m1, int16x8_t m2)
+   int32_t [__arm_]vmladavaq_p[_s16](int32_t add, int16x8_t m1, int16x8_t m2, mve_pred16_t p)  */
+struct binary_acca_int32_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sx32,sx32,v0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+unsigned int i, nargs;
+type_suffix_index type;
+if (!r.check_gp_argument (3, i, nargs)
+   || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES)
+  return error_mark_node;
+
+unsigned int last_arg = i;
+for (i = 1; i < last_arg; i++)
+  if (!r.require_matching_vector_type (i, type))
+   return error_mark_node;
+
+if (!r.require_integer_immediate (0))
+  return error_mark_node;
+
+return r.resolve_to (r.mode_suffix_id, type);
+  }
+};
+SHAPE (binary_acca_int32)
+
 /* _t vfoo[_n_t0](_t, const int)
 
Shape for vector shift right operations that take a vector first
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index 9e877c9591a..7f68d41efe6 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -38,6 +38,7 @@ namespace arm_mve
 extern const function_shape *const binary_lshift;
 extern const function_shape *const binary_lshift_r;
 extern const function_shape *const binary_acc_int32;
+extern const function_shape *const binary_acca_int32;
 extern const function_shape *const binary_maxamina;
 extern const function_shape *const binary_maxavminav;
 extern const function_shape *const binary_maxvminv;
-- 
2.34.1



[PATCH 12/24] arm: [MVE intrinsics] factorize vmlaldavq vmlaldavxq vmlsldavq vmlsldavxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Factorize vmlaldavq, vmlaldavxq, vmlsldavq, vmlsldavxq builtins so
that they use parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (MVE_VMLxLDAVxQ, MVE_VMLxLDAVxQ_P): New.
(mve_insn): Add vmlaldav, vmlaldavx, vmlsldav, vmlsldavx.
(supf): Add VMLALDAVXQ_S, VMLSLDAVQ_S, VMLSLDAVXQ_S,
VMLALDAVXQ_P_S, VMLSLDAVQ_P_S, VMLSLDAVXQ_P_S.
* config/arm/mve.md (mve_vmlaldavq_)
(mve_vmlaldavxq_s, mve_vmlsldavq_s)
(mve_vmlsldavxq_s): Merge into ...
(@mve_q_): ... this.
(mve_vmlaldavq_p_, mve_vmlaldavxq_p_s)
(mve_vmlsldavq_p_s, mve_vmlsldavxq_p_s): Merge into
...
(@mve_q_p_): ... this.
---
 gcc/config/arm/iterators.md |  28 +
 gcc/config/arm/mve.md   | 114 +---
 2 files changed, 42 insertions(+), 100 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index cafb62a574e..227ba52aed5 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -727,6 +727,20 @@ (define_int_iterator MVE_VMLxDAVAQ_P [
 VMLADAVAXQ_P_S
 ])
 
+(define_int_iterator MVE_VMLxLDAVxQ [
+VMLALDAVQ_S VMLALDAVQ_U
+VMLALDAVXQ_S
+VMLSLDAVQ_S
+VMLSLDAVXQ_S
+])
+
+(define_int_iterator MVE_VMLxLDAVxQ_P [
+VMLALDAVQ_P_S VMLALDAVQ_P_U
+VMLALDAVXQ_P_S
+VMLSLDAVQ_P_S
+VMLSLDAVXQ_P_S
+])
+
 (define_int_iterator MVE_MOVN [
 VMOVNBQ_S VMOVNBQ_U
 VMOVNTQ_S VMOVNTQ_U
@@ -855,6 +869,10 @@ (define_int_attr mve_insn [
 (VMLADAVQ_S "vmladav") (VMLADAVQ_U "vmladav")
 (VMLADAVXQ_P_S "vmladavx")
 (VMLADAVXQ_S "vmladavx")
+(VMLALDAVQ_P_S "vmlaldav") (VMLALDAVQ_P_U "vmlaldav")
+(VMLALDAVQ_S "vmlaldav") (VMLALDAVQ_U "vmlaldav")
+(VMLALDAVXQ_P_S "vmlaldavx")
+(VMLALDAVXQ_S "vmlaldavx")
 (VMLAQ_M_N_S "vmla") (VMLAQ_M_N_U "vmla")
 (VMLASQ_M_N_S "vmlas") (VMLASQ_M_N_U "vmlas")
 (VMLSDAVAQ_P_S "vmlsdava")
@@ -865,6 +883,10 @@ (define_int_attr mve_insn [
 (VMLSDAVQ_S "vmlsdav")
 (VMLSDAVXQ_P_S "vmlsdavx")
 (VMLSDAVXQ_S "vmlsdavx")
+(VMLSLDAVQ_P_S "vmlsldav")
+(VMLSLDAVQ_S "vmlsldav")
+(VMLSLDAVXQ_P_S "vmlsldavx")
+(VMLSLDAVXQ_S "vmlsldavx")
 (VMOVLBQ_M_S "vmovlb") (VMOVLBQ_M_U "vmovlb")
 (VMOVLBQ_S "vmovlb") (VMOVLBQ_U "vmovlb")
 (VMOVLTQ_M_S "vmovlt") (VMOVLTQ_M_U "vmovlt")
@@ -2295,6 +2317,12 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
   (VMLSDAVQ_S "s")
   (VMLSDAVXQ_P_S "s")
   (VMLSDAVXQ_S "s")
+  (VMLALDAVXQ_S "s")
+  (VMLSLDAVQ_S "s")
+  (VMLSLDAVXQ_S "s")
+  (VMLALDAVXQ_P_S "s")
+  (VMLSLDAVQ_P_S "s")
+  (VMLSLDAVXQ_P_S "s")
   ])
 
 ;; Both kinds of return insn.
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index df7829bc183..584e6129ea5 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1405,62 +1405,20 @@ (define_insn "@mve_q_f"
 ])
 
 ;;
-;; [vmlaldavq_u, vmlaldavq_s])
+;; [vmlaldavq_u, vmlaldavq_s]
+;; [vmlaldavxq_s]
+;; [vmlsldavq_s]
+;; [vmlsldavxq_s]
 ;;
-(define_insn "mve_vmlaldavq_"
-  [
-   (set (match_operand:DI 0 "s_register_operand" "=r")
-   (unspec:DI [(match_operand:MVE_5 1 "s_register_operand" "w")
-   (match_operand:MVE_5 2 "s_register_operand" "w")]
-VMLALDAVQ))
-  ]
-  "TARGET_HAVE_MVE"
-  "vmlaldav.%#%Q0, %R0, %q1, %q2"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vmlaldavxq_s])
-;;
-(define_insn "mve_vmlaldavxq_s"
-  [
-   (set (match_operand:DI 0 "s_register_operand" "=r")
-   (unspec:DI [(match_operand:MVE_5 1 "s_register_operand" "w")
-   (match_operand:MVE_5 2 "s_register_operand" "w")]
-VMLALDAVXQ_S))
-  ]
-  "TARGET_HAVE_MVE"
-  "vmlaldavx.s%# %Q0, %R0, %q1, %q2"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vmlsldavq_s])
-;;
-(define_insn "mve_vmlsldavq_s"
-  [
-   (set (match_operand:DI 0 "s_register_operand" "=r")
-   (unspec:DI [(match_operand:MVE_5 1 "s_register_operand" "w")
-   (match_operand:MVE_5 2 "s_register_operand" "w")]
-VMLSLDAVQ_S))
-  ]
-  "TARGET_HAVE_MVE"
-  "vmlsldav.s%# %Q0, %R0, %q1, %q2"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vmlsldavxq_s])
-;;
-(define_insn "mve_vmlsldavxq_s"
+(define_insn "@mve_q_"
   [
(set (match_operand:DI 0 "s_regis

[PATCH 05/24] arm: [MVE intrinsics] factorize vmladav vmladavx vmlsdav vmlsdavx vmladava vmladavax vmlsdava vmlsdavax

2023-05-11 Thread Christophe Lyon via Gcc-patches
Factorize vmladav, vmladavx, vmlsdav, vmlsdavx, vmladava, vmladavax,
vmlsdava, vmlsdavax builtins so that they use the same parameterized
names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (MVE_VMLxDAVQ, MVE_VMLxDAVQ_P)
(MVE_VMLxDAVAQ, MVE_VMLxDAVAQ_P): New.
(mve_insn): Add vmladava, vmladavax, vmladav, vmladavx, vmlsdava,
vmlsdavax, vmlsdav, vmlsdavx.
(supf): Add VMLADAVAXQ_P_S, VMLADAVAXQ_S, VMLADAVXQ_P_S,
VMLADAVXQ_S, VMLSDAVAQ_P_S, VMLSDAVAQ_S, VMLSDAVAXQ_P_S,
VMLSDAVAXQ_S, VMLSDAVQ_P_S, VMLSDAVQ_S, VMLSDAVXQ_P_S,
VMLSDAVXQ_S.
* config/arm/mve.md (mve_vmladavq_)
(mve_vmladavxq_s, mve_vmlsdavq_s)
(mve_vmlsdavxq_s): Merge into ...
(@mve_q_): ... this.
(mve_vmlsdavaq_s, mve_vmladavaxq_s)
(mve_vmlsdavaxq_s, mve_vmladavaq_): Merge into
...
(@mve_q_): ... this.
(mve_vmladavq_p_, mve_vmladavxq_p_s)
(mve_vmlsdavq_p_s, mve_vmlsdavxq_p_s): Merge into ...
(@mve_q_p_): ... this.
(mve_vmladavaq_p_, mve_vmladavaxq_p_s)
(mve_vmlsdavaq_p_s, mve_vmlsdavaxq_p_s): Merge into
...
(@mve_q_p_): ... this.
---
 gcc/config/arm/iterators.md |  56 +
 gcc/config/arm/mve.md   | 236 +---
 2 files changed, 84 insertions(+), 208 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index ff146afd913..68f5314041b 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -699,6 +699,34 @@ (define_int_iterator MVE_VMAXAVMINAQ_M [
 VMINAQ_M_S
 ])
 
+(define_int_iterator MVE_VMLxDAVQ [
+VMLADAVQ_S VMLADAVQ_U
+VMLADAVXQ_S
+VMLSDAVQ_S
+VMLSDAVXQ_S
+])
+
+(define_int_iterator MVE_VMLxDAVQ_P [
+VMLADAVQ_P_S VMLADAVQ_P_U
+VMLADAVXQ_P_S
+VMLSDAVQ_P_S
+VMLSDAVXQ_P_S
+])
+
+(define_int_iterator MVE_VMLxDAVAQ [
+VMLADAVAQ_S VMLADAVAQ_U
+VMLSDAVAXQ_S
+VMLSDAVAQ_S
+VMLADAVAXQ_S
+])
+
+(define_int_iterator MVE_VMLxDAVAQ_P [
+VMLADAVAQ_P_S VMLADAVAQ_P_U
+VMLSDAVAXQ_P_S
+VMLSDAVAQ_P_S
+VMLADAVAXQ_P_S
+])
+
 (define_int_iterator MVE_MOVN [
 VMOVNBQ_S VMOVNBQ_U
 VMOVNTQ_S VMOVNTQ_U
@@ -817,8 +845,24 @@ (define_int_attr mve_insn [
 (VMINQ_M_S "vmin") (VMINQ_M_U "vmin")
 (VMINVQ_P_S "vminv") (VMINVQ_P_U "vminv")
 (VMINVQ_S "vminv") (VMINVQ_U "vminv")
+(VMLADAVAQ_P_S "vmladava") (VMLADAVAQ_P_U "vmladava")
+(VMLADAVAQ_S "vmladava") (VMLADAVAQ_U "vmladava")
+(VMLADAVAXQ_P_S "vmladavax")
+(VMLADAVAXQ_S "vmladavax")
+(VMLADAVQ_P_S "vmladav") (VMLADAVQ_P_U "vmladav")
+(VMLADAVQ_S "vmladav") (VMLADAVQ_U "vmladav")
+(VMLADAVXQ_P_S "vmladavx")
+(VMLADAVXQ_S "vmladavx")
 (VMLAQ_M_N_S "vmla") (VMLAQ_M_N_U "vmla")
 (VMLASQ_M_N_S "vmlas") (VMLASQ_M_N_U "vmlas")
+(VMLSDAVAQ_P_S "vmlsdava")
+(VMLSDAVAQ_S "vmlsdava")
+(VMLSDAVAXQ_P_S "vmlsdavax")
+(VMLSDAVAXQ_S "vmlsdavax")
+(VMLSDAVQ_P_S "vmlsdav")
+(VMLSDAVQ_S "vmlsdav")
+(VMLSDAVXQ_P_S "vmlsdavx")
+(VMLSDAVXQ_S "vmlsdavx")
 (VMOVLBQ_M_S "vmovlb") (VMOVLBQ_M_U "vmovlb")
 (VMOVLBQ_S "vmovlb") (VMOVLBQ_U "vmovlb")
 (VMOVLTQ_M_S "vmovlt") (VMOVLTQ_M_U "vmovlt")
@@ -2237,6 +2281,18 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
   (VCMPLTQ_M_S "s")
   (VCMPNEQ_M_N_S "s") (VCMPNEQ_M_N_U "u")
   (VCMPNEQ_M_S "s") (VCMPNEQ_M_U "u")
+  (VMLADAVAXQ_P_S "s")
+  (VMLADAVAXQ_S "s")
+  (VMLADAVXQ_P_S "s")
+  (VMLADAVXQ_S "s")
+  (VMLSDAVAQ_P_S "s")
+  (VMLSDAVAQ_S "s")
+  (VMLSDAVAXQ_P_S "s")
+  (VMLSDAVAXQ_S "s")
+  (VMLSDAVQ_P_S "s")
+  (VMLSDAVQ_S "s")
+  (VMLSDAVXQ_P_S "s")
+  (VMLSDAVXQ_S "s")
   ])
 
 ;; Both kinds of return insn.
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index b548eced4f5..f95525db583 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -985,62 +985,20 @@ (define_insn "@mve_q_"
 ])
 
 

[PATCH 04/24] arm: [MVE intrinsics] add binary_acc_int32 shape

2023-05-11 Thread Christophe Lyon via Gcc-patches
This patch adds the binary_acc_int32 shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_acc_int32): New.
* config/arm/arm-mve-builtins-shapes.h (binary_acc_int32): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 27 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 28 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index a7faf8299cb..e491c810b40 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -365,6 +365,33 @@ struct binary_def : public overloaded_base<0>
 };
 SHAPE (binary)
 
+/* <[u]int32>_t vfoo[_](_t, _t)
+
+   i.e. the shape for binary operations that operate on a pair of
+   vectors and produce an int32_t or a uint32_t depending on the
+   signedness of the input elements.
+
+   Example: vmladavq.
+   int32_t [__arm_]vmladavq[_s16](int16x8_t m1, int16x8_t m2)
+   int32_t [__arm_]vmladavq_p[_s16](int16x8_t m1, int16x8_t m2, mve_pred16_t 
p)  */
+struct binary_acc_int32_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sx32,v0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+return r.resolve_uniform (2);
+  }
+};
+SHAPE (binary_acc_int32)
+
 /* _t vfoo[_n_t0](_t, const int)
 
Shape for vector shift right operations that take a vector first
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h 
b/gcc/config/arm/arm-mve-builtins-shapes.h
index 46cc26ef918..9e877c9591a 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -37,6 +37,7 @@ namespace arm_mve
 extern const function_shape *const binary;
 extern const function_shape *const binary_lshift;
 extern const function_shape *const binary_lshift_r;
+extern const function_shape *const binary_acc_int32;
 extern const function_shape *const binary_maxamina;
 extern const function_shape *const binary_maxavminav;
 extern const function_shape *const binary_maxvminv;
-- 
2.34.1



[PATCH 03/24] arm: [MVE intrinsics] rework vaddlvaq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Implement vaddlvaq using the new MVE builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vaddlvaq): New.
* config/arm/arm-mve-builtins-base.def (vaddlvaq): New.
* config/arm/arm-mve-builtins-base.h (vaddlvaq): New.
* config/arm/arm_mve.h (vaddlvaq): Remove.
(vaddlvaq_p): Remove.
(vaddlvaq_u32): Remove.
(vaddlvaq_s32): Remove.
(vaddlvaq_p_s32): Remove.
(vaddlvaq_p_u32): Remove.
(__arm_vaddlvaq_u32): Remove.
(__arm_vaddlvaq_s32): Remove.
(__arm_vaddlvaq_p_s32): Remove.
(__arm_vaddlvaq_p_u32): Remove.
(__arm_vaddlvaq): Remove.
(__arm_vaddlvaq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |  1 +
 gcc/config/arm/arm-mve-builtins-base.def |  1 +
 gcc/config/arm/arm-mve-builtins-base.h   |  1 +
 gcc/config/arm/arm_mve.h | 74 
 4 files changed, 3 insertions(+), 74 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
b/gcc/config/arm/arm-mve-builtins-base.cc
index 2dec15ac0b1..070a41c2d89 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -244,6 +244,7 @@ namespace arm_mve {
 FUNCTION_WITHOUT_N (vabdq, VABDQ)
 FUNCTION (vabsq, unspec_based_mve_function_exact_insn, (ABS, ABS, ABS, -1, -1, 
-1, VABSQ_M_S, -1, VABSQ_M_F, -1, -1, -1))
 FUNCTION_WITH_RTX_M_N (vaddq, PLUS, VADDQ)
+FUNCTION_PRED_P_S_U (vaddlvaq, VADDLVAQ)
 FUNCTION_PRED_P_S_U (vaddlvq, VADDLVQ)
 FUNCTION_PRED_P_S_U (vaddvq, VADDVQ)
 FUNCTION_PRED_P_S_U (vaddvaq, VADDVAQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def 
b/gcc/config/arm/arm-mve-builtins-base.def
index b0de5af1013..62d2050b86d 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -20,6 +20,7 @@
 #define REQUIRES_FLOAT false
 DEF_MVE_FUNCTION (vabdq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vabsq, unary, all_signed, mx_or_none)
+DEF_MVE_FUNCTION (vaddlvaq, unary_widen_acc, integer_32, p_or_none)
 DEF_MVE_FUNCTION (vaddlvq, unary_acc, integer_32, p_or_none)
 DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vaddvaq, unary_int32_acc, all_integer, p_or_none)
diff --git a/gcc/config/arm/arm-mve-builtins-base.h 
b/gcc/config/arm/arm-mve-builtins-base.h
index fa2e97fd461..59754a03977 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -25,6 +25,7 @@ namespace functions {
 
 extern const function_base *const vabdq;
 extern const function_base *const vabsq;
+extern const function_base *const vaddlvaq;
 extern const function_base *const vaddlvq;
 extern const function_base *const vaddq;
 extern const function_base *const vaddvaq;
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index c0891b7592a..8b61593c6b0 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -66,7 +66,6 @@
 #define vmlsldavq(__a, __b) __arm_vmlsldavq(__a, __b)
 #define vmlaldavxq(__a, __b) __arm_vmlaldavxq(__a, __b)
 #define vrmlaldavhq(__a, __b) __arm_vrmlaldavhq(__a, __b)
-#define vaddlvaq(__a, __b) __arm_vaddlvaq(__a, __b)
 #define vrmlsldavhxq(__a, __b) __arm_vrmlsldavhxq(__a, __b)
 #define vrmlsldavhq(__a, __b) __arm_vrmlsldavhq(__a, __b)
 #define vrmlaldavhxq(__a, __b) __arm_vrmlaldavhxq(__a, __b)
@@ -103,7 +102,6 @@
 #define vrmlaldavhaxq(__a, __b, __c) __arm_vrmlaldavhaxq(__a, __b, __c)
 #define vrmlsldavhaq(__a, __b, __c) __arm_vrmlsldavhaq(__a, __b, __c)
 #define vrmlsldavhaxq(__a, __b, __c) __arm_vrmlsldavhaxq(__a, __b, __c)
-#define vaddlvaq_p(__a, __b, __p) __arm_vaddlvaq_p(__a, __b, __p)
 #define vrmlaldavhq_p(__a, __b, __p) __arm_vrmlaldavhq_p(__a, __b, __p)
 #define vrmlaldavhxq_p(__a, __b, __p) __arm_vrmlaldavhxq_p(__a, __b, __p)
 #define vrmlsldavhq_p(__a, __b, __p) __arm_vrmlsldavhq_p(__a, __b, __p)
@@ -474,14 +472,12 @@
 #define vctp64q_m(__a, __p) __arm_vctp64q_m(__a, __p)
 #define vctp32q_m(__a, __p) __arm_vctp32q_m(__a, __p)
 #define vctp16q_m(__a, __p) __arm_vctp16q_m(__a, __p)
-#define vaddlvaq_u32(__a, __b) __arm_vaddlvaq_u32(__a, __b)
 #define vrmlsldavhxq_s32(__a, __b) __arm_vrmlsldavhxq_s32(__a, __b)
 #define vrmlsldavhq_s32(__a, __b) __arm_vrmlsldavhq_s32(__a, __b)
 #define vrmlaldavhxq_s32(__a, __b) __arm_vrmlaldavhxq_s32(__a, __b)
 #define vrmlaldavhq_s32(__a, __b) __arm_vrmlaldavhq_s32(__a, __b)
 #define vcvttq_f16_f32(__a, __b) __arm_vcvttq_f16_f32(__a, __b)
 #define vcvtbq_f16_f32(__a, __b) __arm_vcvtbq_f16_f32(__a, __b)
-#define vaddlvaq_s32(__a, __b) __arm_vaddlvaq_s32(__a, __b)
 #define vabavq_s8(__a, __b, __c) __arm_vabavq_s8(__a, __b, __c)
 #define vabavq_s16(__a, __b, __c) __arm_vabavq_s16(__a, __b, __c)
 #define vabavq_s32(__a, __b, __c) __arm_vabavq_s32(__a, __b, __c)
@@ -615,7 +611,6 @@
 #define vrmlaldavhaxq_s32(__a, __b, __c) __arm_vrmlaldavhaxq_s32(__a, __b, __c)
 #define vrmlsldavhaq_s32(__a, __b, __c) __arm_vrmlsldavhaq

[PATCH 16/24] arm: [MVE intrinsics] add binary_acca_int64 shape

2023-05-11 Thread Christophe Lyon via Gcc-patches
This patch adds the binary_acca_int64 shape description.
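
As an illustration of the signature this shape describes, the following
is a rough scalar sketch of vmlaldavaq_s16, based on the prototype quoted
in the shape comment. The names model_vmlaldavaq_s16 and
check_model_vmlaldavaq_s16 are hypothetical helpers, not GCC or
arm_mve.h code, and the model assumes a wrapping 64-bit accumulator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model (not GCC code): start from the 64-bit
   accumulator ADD and add the products of matching int16 lanes.  */
static int64_t
model_vmlaldavaq_s16 (int64_t add, const int16_t m1[8],
		      const int16_t m2[8])
{
  /* Accumulate unsigned so that wrap-around is well defined in C.  */
  uint64_t acc = (uint64_t) add;
  for (int i = 0; i < 8; i++)
    acc += (uint64_t) ((int64_t) m1[i] * (int64_t) m2[i]);
  return (int64_t) acc;
}

/* Usage: 100 + 2 * (1 + 2 + ... + 8) == 172.  */
static void
check_model_vmlaldavaq_s16 (void)
{
  const int16_t m1[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  const int16_t m2[8] = { 2, 2, 2, 2, 2, 2, 2, 2 };
  assert (model_vmlaldavaq_s16 (100, m1, m2) == 172);
}
```

Compared with binary_acc_int64, the only difference visible at this
level is the extra leading scalar accumulator argument.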

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_acca_int64): New.
* config/arm/arm-mve-builtins-shapes.h (binary_acca_int64): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 37 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 38 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index f1c3844953a..af770fd3e39 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -452,6 +452,43 @@ struct binary_acca_int32_def : public overloaded_base<0>
 };
 SHAPE (binary_acca_int32)
 
+/* [u]int64_t vfoo[_]([u]int64_t, _t, _t)
+
+   Example: vmlaldavaq.
+   int64_t [__arm_]vmlaldavaq[_s16](int64_t add, int16x8_t m1, int16x8_t m2)
+   int64_t [__arm_]vmlaldavaq_p[_s16](int64_t add, int16x8_t m1, int16x8_t m2, 
mve_pred16_t p)  */
+struct binary_acca_int64_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "sx64,sx64,v0,v0", group, MODE_none, 
preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+unsigned int i, nargs;
+type_suffix_index type;
+if (!r.check_gp_argument (3, i, nargs)
+   || (type = r.infer_vector_type (1)) == NUM_TYPE_SUFFIXES)
+  return error_mark_node;
+
+unsigned int last_arg = i;
+for (i = 1; i < last_arg; i++)
+  if (!r.require_matching_vector_type (i, type))
+   return error_mark_node;
+
+if (!r.require_integer_immediate (0))
+  return error_mark_node;
+
+return r.resolve_to (r.mode_suffix_id, type);
+  }
+};
+SHAPE (binary_acca_int64)
+
 /* _t vfoo[_n_t0](_t, const int)
 
Shape for vector shift right operations that take a vector first
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h 
b/gcc/config/arm/arm-mve-builtins-shapes.h
index 73e82d2fd7a..1c4254122bc 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -40,6 +40,7 @@ namespace arm_mve
 extern const function_shape *const binary_acc_int32;
 extern const function_shape *const binary_acc_int64;
 extern const function_shape *const binary_acca_int32;
+extern const function_shape *const binary_acca_int64;
 extern const function_shape *const binary_maxamina;
 extern const function_shape *const binary_maxavminav;
 extern const function_shape *const binary_maxvminv;
-- 
2.34.1



[PATCH 09/24] arm: [MVE intrinsics] factorize vabavq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Factorize vabavq builtins so that they use parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (mve_insn): Add vabav.
* config/arm/mve.md (mve_vabavq_): Rename into ...
(@mve_q_): ... this.
(mve_vabavq_p_): Rename into ...
(@mve_q_p_): ... this.
---
 gcc/config/arm/iterators.md | 2 ++
 gcc/config/arm/mve.md   | 8 
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 68f5314041b..cafb62a574e 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -783,6 +783,8 @@ (define_int_attr mve_cmp_op1 [
 ])
 
 (define_int_attr mve_insn [
+(VABAVQ_P_S "vabav") (VABAVQ_P_U "vabav")
+(VABAVQ_S "vabav") (VABAVQ_U "vabav")
 (VABDQ_M_S "vabd") (VABDQ_M_U "vabd") (VABDQ_M_F "vabd")
 (VABDQ_S "vabd") (VABDQ_U "vabd") (VABDQ_F "vabd")
 (VABSQ_M_F "vabs")
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index f95525db583..df7829bc183 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1807,7 +1807,7 @@ (define_insn "mve_vrmlaldavhaq_v4si"
 ;;
 ;; [vabavq_s, vabavq_u])
 ;;
-(define_insn "mve_vabavq_"
+(define_insn "@mve_q_"
   [
(set (match_operand:SI 0 "s_register_operand" "=r")
(unspec:SI [(match_operand:SI 1 "s_register_operand" "0")
@@ -1816,7 +1816,7 @@ (define_insn "mve_vabavq_"
 VABAVQ))
   ]
   "TARGET_HAVE_MVE"
-  "vabav.%#\t%0, %q2, %q3"
+  ".%#\t%0, %q2, %q3"
   [(set_attr "type" "mve_move")
 ])
 
@@ -3107,7 +3107,7 @@ (define_insn "mve_vrmlsldavhaq_sv4si"
 ;;
 ;; [vabavq_p_s, vabavq_p_u])
 ;;
-(define_insn "mve_vabavq_p_"
+(define_insn "@mve_q_p_"
   [
(set (match_operand:SI 0 "s_register_operand" "=r")
(unspec:SI [(match_operand:SI 1 "s_register_operand" "0")
@@ -3117,7 +3117,7 @@ (define_insn "mve_vabavq_p_"
 VABAVQ_P))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;vabavt.%#\t%0, %q2, %q3"
+  "vpst\;t.%#\t%0, %q2, %q3"
   [(set_attr "type" "mve_move")
(set_attr "length" "8")])
 
-- 
2.34.1



[PATCH 13/24] arm: [MVE intrinsics] rework vmlaldavq vmlaldavxq vmlsldavq vmlsldavxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Implement vmlaldavq, vmlaldavxq, vmlsldavq, vmlsldavxq using the new
MVE builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vmlaldavq, vmlaldavxq)
(vmlsldavq, vmlsldavxq): New.
* config/arm/arm-mve-builtins-base.def (vmlaldavq, vmlaldavxq)
(vmlsldavq, vmlsldavxq): New.
* config/arm/arm-mve-builtins-base.h (vmlaldavq, vmlaldavxq)
(vmlsldavq, vmlsldavxq): New.
* config/arm/arm_mve.h (vmlaldavq): Remove.
(vmlsldavxq): Remove.
(vmlsldavq): Remove.
(vmlaldavxq): Remove.
(vmlaldavq_p): Remove.
(vmlaldavxq_p): Remove.
(vmlsldavq_p): Remove.
(vmlsldavxq_p): Remove.
(vmlaldavq_u16): Remove.
(vmlsldavxq_s16): Remove.
(vmlsldavq_s16): Remove.
(vmlaldavxq_s16): Remove.
(vmlaldavq_s16): Remove.
(vmlaldavq_u32): Remove.
(vmlsldavxq_s32): Remove.
(vmlsldavq_s32): Remove.
(vmlaldavxq_s32): Remove.
(vmlaldavq_s32): Remove.
(vmlaldavq_p_s16): Remove.
(vmlaldavxq_p_s16): Remove.
(vmlsldavq_p_s16): Remove.
(vmlsldavxq_p_s16): Remove.
(vmlaldavq_p_u16): Remove.
(vmlaldavq_p_s32): Remove.
(vmlaldavxq_p_s32): Remove.
(vmlsldavq_p_s32): Remove.
(vmlsldavxq_p_s32): Remove.
(vmlaldavq_p_u32): Remove.
(__arm_vmlaldavq_u16): Remove.
(__arm_vmlsldavxq_s16): Remove.
(__arm_vmlsldavq_s16): Remove.
(__arm_vmlaldavxq_s16): Remove.
(__arm_vmlaldavq_s16): Remove.
(__arm_vmlaldavq_u32): Remove.
(__arm_vmlsldavxq_s32): Remove.
(__arm_vmlsldavq_s32): Remove.
(__arm_vmlaldavxq_s32): Remove.
(__arm_vmlaldavq_s32): Remove.
(__arm_vmlaldavq_p_s16): Remove.
(__arm_vmlaldavxq_p_s16): Remove.
(__arm_vmlsldavq_p_s16): Remove.
(__arm_vmlsldavxq_p_s16): Remove.
(__arm_vmlaldavq_p_u16): Remove.
(__arm_vmlaldavq_p_s32): Remove.
(__arm_vmlaldavxq_p_s32): Remove.
(__arm_vmlsldavq_p_s32): Remove.
(__arm_vmlsldavxq_p_s32): Remove.
(__arm_vmlaldavq_p_u32): Remove.
(__arm_vmlaldavq): Remove.
(__arm_vmlsldavxq): Remove.
(__arm_vmlsldavq): Remove.
(__arm_vmlaldavxq): Remove.
(__arm_vmlaldavq_p): Remove.
(__arm_vmlaldavxq_p): Remove.
(__arm_vmlsldavq_p): Remove.
(__arm_vmlsldavxq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   4 +
 gcc/config/arm/arm-mve-builtins-base.def |   4 +
 gcc/config/arm/arm-mve-builtins-base.h   |   4 +
 gcc/config/arm/arm_mve.h | 366 ---
 4 files changed, 12 insertions(+), 366 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
b/gcc/config/arm/arm-mve-builtins-base.cc
index a81cf4cba5e..af1a2c9942a 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -285,10 +285,14 @@ FUNCTION_PRED_P_S (vmladavaxq, VMLADAVAXQ)
 FUNCTION_PRED_P_S_U (vmladavaq, VMLADAVAQ)
 FUNCTION_PRED_P_S_U (vmladavq, VMLADAVQ)
 FUNCTION_PRED_P_S (vmladavxq, VMLADAVXQ)
+FUNCTION_PRED_P_S_U (vmlaldavq, VMLALDAVQ)
+FUNCTION_PRED_P_S (vmlaldavxq, VMLALDAVXQ)
 FUNCTION_PRED_P_S (vmlsdavaq, VMLSDAVAQ)
 FUNCTION_PRED_P_S (vmlsdavaxq, VMLSDAVAXQ)
 FUNCTION_PRED_P_S (vmlsdavq, VMLSDAVQ)
 FUNCTION_PRED_P_S (vmlsdavxq, VMLSDAVXQ)
+FUNCTION_PRED_P_S (vmlsldavq, VMLSLDAVQ)
+FUNCTION_PRED_P_S (vmlsldavxq, VMLSLDAVXQ)
 FUNCTION_WITHOUT_N_NO_F (vmovlbq, VMOVLBQ)
 FUNCTION_WITHOUT_N_NO_F (vmovltq, VMOVLTQ)
 FUNCTION_WITHOUT_N_NO_F (vmovnbq, VMOVNBQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def 
b/gcc/config/arm/arm-mve-builtins-base.def
index 934f45bc220..f7f353b34a7 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -54,10 +54,14 @@ DEF_MVE_FUNCTION (vmladavaq, binary_acca_int32, 
all_integer, p_or_none)
 DEF_MVE_FUNCTION (vmladavaxq, binary_acca_int32, all_signed, p_or_none)
 DEF_MVE_FUNCTION (vmladavq, binary_acc_int32, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vmladavxq, binary_acc_int32, all_signed, p_or_none)
+DEF_MVE_FUNCTION (vmlaldavq, binary_acc_int64, integer_16_32, p_or_none)
+DEF_MVE_FUNCTION (vmlaldavxq, binary_acc_int64, signed_16_32, p_or_none)
 DEF_MVE_FUNCTION (vmlsdavaq, binary_acca_int32, all_signed, p_or_none)
 DEF_MVE_FUNCTION (vmlsdavaxq, binary_acca_int32, all_signed, p_or_none)
 DEF_MVE_FUNCTION (vmlsdavq, binary_acc_int32, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vmlsdavxq, binary_acc_int32, all_signed, p_or_none)
+DEF_MVE_FUNCTION (vmlsldavq, binary_acc_int64, signed_16_32, p_or_none)
+DEF_MVE_FUNCTION (vmlsldavxq, binary_acc_int64, signed_16_32, p_or_none)
 DEF_MVE_FUNCTION (vmovlbq, unary_widen, integer_8_16, mx_or_none)
 DEF_MVE_FUNCTION (vmovltq, unary_widen, integer_8_16, mx_or_none)
 DEF_MVE_FUNCTION (vmovnbq, binary_move_narrow, integer

[PATCH 20/24] arm: [MVE intrinsics] factorize vqdmladhq vqdmladhxq vqdmlsdhq vqdmlsdhxq vqrdmladhq vqrdmladhxq vqrdmlsdhq vqrdmlsdhxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Factorize vqdmladhq, vqdmladhxq, vqdmlsdhq, vqdmlsdhxq, vqrdmladhq,
vqrdmladhxq, vqrdmlsdhq, vqrdmlsdhxq builtins so that they use the
same parameterized names.

2022-12-12  Christophe Lyon  

gcc/
* config/arm/iterators.md (MVE_VQxDMLxDHxQ_S): New.
(mve_insn): Add vqdmladh, vqdmladhx, vqdmlsdh, vqdmlsdhx,
vqrdmladh, vqrdmladhx, vqrdmlsdh, vqrdmlsdhx.
(supf): Add VQDMLADHQ_S, VQDMLADHXQ_S, VQDMLSDHQ_S, VQDMLSDHXQ_S,
VQRDMLADHQ_S,VQRDMLADHXQ_S, VQRDMLSDHQ_S, VQRDMLSDHXQ_S.
* config/arm/mve.md (mve_vqrdmladhq_s)
(mve_vqrdmladhxq_s, mve_vqrdmlsdhq_s)
(mve_vqrdmlsdhxq_s, mve_vqdmlsdhxq_s)
(mve_vqdmlsdhq_s, mve_vqdmladhxq_s)
(mve_vqdmladhq_s): Merge into ...
(@mve_q_): ... this.
---
 gcc/config/arm/iterators.md |  27 
 gcc/config/arm/mve.md   | 127 
 2 files changed, 38 insertions(+), 116 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 7a88bc91182..c23ca7361c1 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -755,6 +755,17 @@ (define_int_iterator MVE_VMLxLDAVAxQ_P [
 VMLSLDAVAXQ_P_S
 ])
 
+(define_int_iterator MVE_VQxDMLxDHxQ_S [
+VQDMLADHQ_S
+VQDMLADHXQ_S
+VQDMLSDHQ_S
+VQDMLSDHXQ_S
+VQRDMLADHQ_S
+VQRDMLADHXQ_S
+VQRDMLSDHQ_S
+VQRDMLSDHXQ_S
+])
+
 (define_int_iterator MVE_VRMLxLDAVxQ [
 VRMLALDAVHQ_S VRMLALDAVHQ_U
 VRMLALDAVHXQ_S
@@ -948,11 +959,15 @@ (define_int_attr mve_insn [
 (VQADDQ_N_S "vqadd") (VQADDQ_N_U "vqadd")
 (VQADDQ_S "vqadd") (VQADDQ_U "vqadd")
 (VQDMLADHQ_M_S "vqdmladh")
+(VQDMLADHQ_S "vqdmladh")
 (VQDMLADHXQ_M_S "vqdmladhx")
+(VQDMLADHXQ_S "vqdmladhx")
 (VQDMLAHQ_M_N_S "vqdmlah")
 (VQDMLASHQ_M_N_S "vqdmlash")
 (VQDMLSDHQ_M_S "vqdmlsdh")
+(VQDMLSDHQ_S "vqdmlsdh")
 (VQDMLSDHXQ_M_S "vqdmlsdhx")
+(VQDMLSDHXQ_S "vqdmlsdhx")
 (VQDMULHQ_M_N_S "vqdmulh")
 (VQDMULHQ_M_S "vqdmulh")
 (VQDMULHQ_N_S "vqdmulh")
@@ -968,11 +983,15 @@ (define_int_attr mve_insn [
 (VQNEGQ_M_S "vqneg")
 (VQNEGQ_S "vqneg")
 (VQRDMLADHQ_M_S "vqrdmladh")
+(VQRDMLADHQ_S "vqrdmladh")
 (VQRDMLADHXQ_M_S "vqrdmladhx")
+(VQRDMLADHXQ_S "vqrdmladhx")
 (VQRDMLAHQ_M_N_S "vqrdmlah")
 (VQRDMLASHQ_M_N_S "vqrdmlash")
 (VQRDMLSDHQ_M_S "vqrdmlsdh")
+(VQRDMLSDHQ_S "vqrdmlsdh")
 (VQRDMLSDHXQ_M_S "vqrdmlsdhx")
+(VQRDMLSDHXQ_S "vqrdmlsdhx")
 (VQRDMULHQ_M_N_S "vqrdmulh")
 (VQRDMULHQ_M_S "vqrdmulh")
 (VQRDMULHQ_N_S "vqrdmulh")
@@ -2379,6 +2398,14 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U 
"u") (VREV16Q_S "s")
   (VMLSLDAVAQ_S "s")
   (VMLSLDAVAXQ_P_S "s")
   (VMLSLDAVAXQ_S "s")
+  (VQDMLADHQ_S "s")
+  (VQDMLADHXQ_S "s")
+  (VQDMLSDHQ_S "s")
+  (VQDMLSDHXQ_S "s")
+  (VQRDMLADHQ_S "s")
+  (VQRDMLADHXQ_S "s")
+  (VQRDMLSDHQ_S "s")
+  (VQRDMLSDHXQ_S "s")
   ])
 
 ;; Both kinds of return insn.
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index c6fd634b5c0..bf4d18455fe 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -2051,34 +2051,25 @@ (define_insn "mve_vqdmlashq_n_"
 ])
 
 ;;
-;; [vqrdmladhq_s])
+;; [vqdmladhq_s]
+;; [vqdmladhxq_s]
+;; [vqdmlsdhq_s]
+;; [vqdmlsdhxq_s]
+;; [vqrdmladhq_s]
+;; [vqrdmladhxq_s]
+;; [vqrdmlsdhq_s]
+;; [vqrdmlsdhxq_s]
 ;;
-(define_insn "mve_vqrdmladhq_s"
-  [
-   (set (match_operand:MVE_2 0 "s_register_operand" "=w")
-   (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
-  (match_operand:MVE_2 2 "s_register_operand" "w")
-  (match_operand:MVE_2 3 "s_register_operand" "w")]
-VQRDMLADHQ_S))
-  ]
-  "TARGET_HAVE_MVE"
-  "vqrdmladh.s%#\t%q0, %q2, %q3"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vqrdmladhxq_s])
-;;
-(define_insn "mve_vqrdmladhxq_s"
+(define_insn "@mve_q_"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
   (match_operand:MVE_2 2 "s_register_operand" "w")
   (match_operand:MVE_2 3 "s_register_operand" "w")]

[PATCH 22/24] arm: [MVE intrinsics] add ternary_n shape

2023-05-11 Thread Christophe Lyon via Gcc-patches
This patch adds the ternary_n shape description.
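
To make the "two vectors plus one scalar" signature concrete, here is a
rough scalar sketch of vmlaq_n_s8, based on the prototype quoted in the
shape comment. The names model_vmlaq_n_s8 and check_model_vmlaq_n_s8 are
hypothetical, not GCC or arm_mve.h code, and the model assumes each lane
wraps modulo 2^8 (non-saturating arithmetic).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model (not GCC code): for each of the 16 lanes,
   out = add + m1 * m2, where M2 is a scalar shared by all lanes.  */
static void
model_vmlaq_n_s8 (int8_t out[16], const int8_t add[16],
		  const int8_t m1[16], int8_t m2)
{
  for (int i = 0; i < 16; i++)
    /* Compute in int, then truncate to 8 bits.  */
    out[i] = (int8_t) (uint8_t) (add[i] + m1[i] * m2);
}

/* Usage: with add[i] == 1, m1[i] == i and m2 == 3, lane i holds
   1 + 3 * i.  */
static void
check_model_vmlaq_n_s8 (void)
{
  int8_t add[16], m1[16], out[16];
  for (int i = 0; i < 16; i++)
    {
      add[i] = 1;
      m1[i] = (int8_t) i;
    }
  model_vmlaq_n_s8 (out, add, m1, 3);
  assert (out[0] == 1);
  assert (out[15] == 46);
}
```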

2022-12-12  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (ternary_n): New.
* config/arm/arm-mve-builtins-shapes.h (ternary_n): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 27 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 28 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index 4455a253579..5a299a272f5 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1189,6 +1189,33 @@ struct ternary_def : public overloaded_base<0>
 };
 SHAPE (ternary)
 
+/* _t vfoo[_n_t0](_t, _t, _t)
+
+   i.e. the standard shape for ternary operations that operate on a
+   pair of vectors of the same type as the destination, and take a
+   third scalar argument of the same type as the vector elements.
+
+   Example: vmlaq.
+   int8x16_t [__arm_]vmlaq[_n_s8](int8x16_t add, int8x16_t m1, int8_t m2)
+   int8x16_t [__arm_]vmlaq_m[_n_s8](int8x16_t add, int8x16_t m1, int8_t m2, 
mve_pred16_t p)  */
+struct ternary_n_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_n, preserve_user_namespace);
+build_all (b, "v0,v0,v0,s0", group, MODE_n, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+return r.resolve_uniform (2, 1);
+  }
+};
+SHAPE (ternary_n)
+
 /* _t vfoo[_t0](_t)
 
i.e. the standard shape for unary operations that operate on
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h 
b/gcc/config/arm/arm-mve-builtins-shapes.h
index b3ddd0a9e8d..a28cd6a1547 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -57,6 +57,7 @@ namespace arm_mve
 extern const function_shape *const create;
 extern const function_shape *const inherent;
 extern const function_shape *const ternary;
+extern const function_shape *const ternary_n;
 extern const function_shape *const unary;
 extern const function_shape *const unary_acc;
 extern const function_shape *const unary_convert;
-- 
2.34.1



[PATCH 06/24] arm: [MVE intrinsics] rework vmladavq vmladavxq vmlsdavq vmlsdavxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Implement vmladavq, vmladavxq, vmlsdavq, vmlsdavxq using the new MVE
builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vmladavq, vmladavxq)
(vmlsdavq, vmlsdavxq): New.
* config/arm/arm-mve-builtins-base.def (vmladavq, vmladavxq)
(vmlsdavq, vmlsdavxq): New.
* config/arm/arm-mve-builtins-base.h (vmladavq, vmladavxq)
(vmlsdavq, vmlsdavxq): New.
* config/arm/arm_mve.h (vmladavq): Remove.
(vmlsdavxq): Remove.
(vmlsdavq): Remove.
(vmladavxq): Remove.
(vmladavq_p): Remove.
(vmlsdavxq_p): Remove.
(vmlsdavq_p): Remove.
(vmladavxq_p): Remove.
(vmladavq_u8): Remove.
(vmlsdavxq_s8): Remove.
(vmlsdavq_s8): Remove.
(vmladavxq_s8): Remove.
(vmladavq_s8): Remove.
(vmladavq_u16): Remove.
(vmlsdavxq_s16): Remove.
(vmlsdavq_s16): Remove.
(vmladavxq_s16): Remove.
(vmladavq_s16): Remove.
(vmladavq_u32): Remove.
(vmlsdavxq_s32): Remove.
(vmlsdavq_s32): Remove.
(vmladavxq_s32): Remove.
(vmladavq_s32): Remove.
(vmladavq_p_u8): Remove.
(vmlsdavxq_p_s8): Remove.
(vmlsdavq_p_s8): Remove.
(vmladavxq_p_s8): Remove.
(vmladavq_p_s8): Remove.
(vmladavq_p_u16): Remove.
(vmlsdavxq_p_s16): Remove.
(vmlsdavq_p_s16): Remove.
(vmladavxq_p_s16): Remove.
(vmladavq_p_s16): Remove.
(vmladavq_p_u32): Remove.
(vmlsdavxq_p_s32): Remove.
(vmlsdavq_p_s32): Remove.
(vmladavxq_p_s32): Remove.
(vmladavq_p_s32): Remove.
(__arm_vmladavq_u8): Remove.
(__arm_vmlsdavxq_s8): Remove.
(__arm_vmlsdavq_s8): Remove.
(__arm_vmladavxq_s8): Remove.
(__arm_vmladavq_s8): Remove.
(__arm_vmladavq_u16): Remove.
(__arm_vmlsdavxq_s16): Remove.
(__arm_vmlsdavq_s16): Remove.
(__arm_vmladavxq_s16): Remove.
(__arm_vmladavq_s16): Remove.
(__arm_vmladavq_u32): Remove.
(__arm_vmlsdavxq_s32): Remove.
(__arm_vmlsdavq_s32): Remove.
(__arm_vmladavxq_s32): Remove.
(__arm_vmladavq_s32): Remove.
(__arm_vmladavq_p_u8): Remove.
(__arm_vmlsdavxq_p_s8): Remove.
(__arm_vmlsdavq_p_s8): Remove.
(__arm_vmladavxq_p_s8): Remove.
(__arm_vmladavq_p_s8): Remove.
(__arm_vmladavq_p_u16): Remove.
(__arm_vmlsdavxq_p_s16): Remove.
(__arm_vmlsdavq_p_s16): Remove.
(__arm_vmladavxq_p_s16): Remove.
(__arm_vmladavq_p_s16): Remove.
(__arm_vmladavq_p_u32): Remove.
(__arm_vmlsdavxq_p_s32): Remove.
(__arm_vmlsdavq_p_s32): Remove.
(__arm_vmladavxq_p_s32): Remove.
(__arm_vmladavq_p_s32): Remove.
(__arm_vmladavq): Remove.
(__arm_vmlsdavxq): Remove.
(__arm_vmlsdavq): Remove.
(__arm_vmladavxq): Remove.
(__arm_vmladavq_p): Remove.
(__arm_vmlsdavxq_p): Remove.
(__arm_vmlsdavq_p): Remove.
(__arm_vmladavxq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   4 +
 gcc/config/arm/arm-mve-builtins-base.def |   4 +
 gcc/config/arm/arm-mve-builtins-base.h   |   4 +
 gcc/config/arm/arm_mve.h | 523 ---
 4 files changed, 12 insertions(+), 523 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc 
b/gcc/config/arm/arm-mve-builtins-base.cc
index 070a41c2d89..69af6f9139e 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -280,6 +280,10 @@ FUNCTION (vminnmq, unspec_based_mve_function_exact_insn, 
(UNKNOWN, UNKNOWN, SMIN
 FUNCTION_PRED_P_F (vminnmvq, VMINNMVQ)
 FUNCTION_WITH_RTX_M_NO_F (vminq, SMIN, UMIN, VMINQ)
 FUNCTION_PRED_P_S_U (vminvq, VMINVQ)
+FUNCTION_PRED_P_S_U (vmladavq, VMLADAVQ)
+FUNCTION_PRED_P_S (vmladavxq, VMLADAVXQ)
+FUNCTION_PRED_P_S (vmlsdavq, VMLSDAVQ)
+FUNCTION_PRED_P_S (vmlsdavxq, VMLSDAVXQ)
 FUNCTION_WITHOUT_N_NO_F (vmovlbq, VMOVLBQ)
 FUNCTION_WITHOUT_N_NO_F (vmovltq, VMOVLTQ)
 FUNCTION_WITHOUT_N_NO_F (vmovnbq, VMOVNBQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def 
b/gcc/config/arm/arm-mve-builtins-base.def
index 62d2050b86d..40d462fc7d2 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -49,6 +49,10 @@ DEF_MVE_FUNCTION (vminaq, binary_maxamina, all_signed, 
m_or_none)
 DEF_MVE_FUNCTION (vminavq, binary_maxavminav, all_signed, p_or_none)
 DEF_MVE_FUNCTION (vminq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vminvq, binary_maxvminv, all_integer, p_or_none)
+DEF_MVE_FUNCTION (vmladavq, binary_acc_int32, all_integer, p_or_none)
+DEF_MVE_FUNCTION (vmladavxq, binary_acc_int32, all_signed, p_or_none)
+DEF_MVE_FUNCTION (vmlsdavq, binary_acc_int32, all_integer, p_or_none)
+DEF_MVE_FUNCTION (vmlsdavxq, binary_acc_int32, all_signed, p_or_none)
 DEF_MVE_FU

[PATCH 14/24] arm: [MVE intrinsics] factorize vrmlaldavhq vrmlaldavhxq vrmlsldavhq vrmlsldavhxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Factorize vrmlaldavhq, vrmlaldavhxq, vrmlsldavhq, vrmlsldavhxq
builtins so that they use the same parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (MVE_VRMLxLDAVxQ, MVE_VRMLxLDAVHxQ_P):
New.
(mve_insn): Add vrmlaldavh, vrmlaldavhx, vrmlsldavh, vrmlsldavhx.
(supf): Add VRMLALDAVHXQ_P_S, VRMLALDAVHXQ_S, VRMLSLDAVHQ_P_S,
VRMLSLDAVHQ_S, VRMLSLDAVHXQ_P_S, VRMLSLDAVHXQ_S.
* config/arm/mve.md (mve_vrmlaldavhxq_sv4si)
(mve_vrmlsldavhq_sv4si, mve_vrmlsldavhxq_sv4si)
(mve_vrmlaldavhq_v4si): Merge into ...
(@mve_q_v4si): ... this.
(mve_vrmlaldavhxq_p_sv4si, mve_vrmlsldavhq_p_sv4si)
(mve_vrmlsldavhxq_p_sv4si, mve_vrmlaldavhq_p_v4si): Merge
into ...
(@mve_q_p_v4si): ... this.
---
 gcc/config/arm/iterators.md |  28 +
 gcc/config/arm/mve.md   | 117 +---
 2 files changed, 43 insertions(+), 102 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 227ba52aed5..729127d8586 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -741,6 +741,20 @@ (define_int_iterator MVE_VMLxLDAVxQ_P [
 VMLSLDAVXQ_P_S
 ])
 
+(define_int_iterator MVE_VRMLxLDAVxQ [
+VRMLALDAVHQ_S VRMLALDAVHQ_U
+VRMLALDAVHXQ_S
+VRMLSLDAVHQ_S
+VRMLSLDAVHXQ_S
+])
+
+(define_int_iterator MVE_VRMLxLDAVHxQ_P [
+VRMLALDAVHQ_P_S VRMLALDAVHQ_P_U
+VRMLALDAVHXQ_P_S
+VRMLSLDAVHQ_P_S
+VRMLSLDAVHXQ_P_S
+])
+
 (define_int_iterator MVE_MOVN [
 VMOVNBQ_S VMOVNBQ_U
 VMOVNTQ_S VMOVNTQ_U
@@ -979,6 +993,14 @@ (define_int_attr mve_insn [
 (VREV64Q_S "vrev64") (VREV64Q_U "vrev64") (VREV64Q_F "vrev64")
 (VRHADDQ_M_S "vrhadd") (VRHADDQ_M_U "vrhadd")
 (VRHADDQ_S "vrhadd") (VRHADDQ_U "vrhadd")
+(VRMLALDAVHQ_P_S "vrmlaldavh") (VRMLALDAVHQ_P_U "vrmlaldavh")
+(VRMLALDAVHQ_S "vrmlaldavh") (VRMLALDAVHQ_U "vrmlaldavh")
+(VRMLALDAVHXQ_P_S "vrmlaldavhx")
+(VRMLALDAVHXQ_S "vrmlaldavhx")
+(VRMLSLDAVHQ_P_S "vrmlsldavh")
+(VRMLSLDAVHQ_S "vrmlsldavh")
+(VRMLSLDAVHXQ_P_S "vrmlsldavhx")
+(VRMLSLDAVHXQ_S "vrmlsldavhx")
 (VRMULHQ_M_S "vrmulh") (VRMULHQ_M_U "vrmulh")
 (VRMULHQ_S "vrmulh") (VRMULHQ_U "vrmulh")
 (VRNDAQ_F "vrnda") (VRNDAQ_M_F "vrnda")
@@ -2323,6 +2345,12 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U 
"u") (VREV16Q_S "s")
   (VMLALDAVXQ_P_S "s")
   (VMLSLDAVQ_P_S "s")
   (VMLSLDAVXQ_P_S "s")
+  (VRMLALDAVHXQ_P_S "s")
+  (VRMLALDAVHXQ_S "s")
+  (VRMLSLDAVHQ_P_S "s")
+  (VRMLSLDAVHQ_S "s")
+  (VRMLSLDAVHXQ_P_S "s")
+  (VRMLSLDAVHXQ_S "s")
   ])
 
 ;; Both kinds of return insn.
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 584e6129ea5..e2259aa48e9 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1563,47 +1563,20 @@ (define_insn "mve_vqdmulltq_s"
 ])
 
 ;;
-;; [vrmlaldavhxq_s])
+;; [vrmlaldavhq_u vrmlaldavhq_s]
+;; [vrmlaldavhxq_s]
+;; [vrmlsldavhq_s]
+;; [vrmlsldavhxq_s]
 ;;
-(define_insn "mve_vrmlaldavhxq_sv4si"
-  [
-   (set (match_operand:DI 0 "s_register_operand" "=r")
-   (unspec:DI [(match_operand:V4SI 1 "s_register_operand" "w")
-   (match_operand:V4SI 2 "s_register_operand" "w")]
-VRMLALDAVHXQ_S))
-  ]
-  "TARGET_HAVE_MVE"
-  "vrmlaldavhx.s32 %Q0, %R0, %q1, %q2"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vrmlsldavhq_s])
-;;
-(define_insn "mve_vrmlsldavhq_sv4si"
-  [
-   (set (match_operand:DI 0 "s_register_operand" "=r")
-   (unspec:DI [(match_operand:V4SI 1 "s_register_operand" "w")
-   (match_operand:V4SI 2 "s_register_operand" "w")]
-VRMLSLDAVHQ_S))
-  ]
-  "TARGET_HAVE_MVE"
-  "vrmlsldavh.s32\t%Q0, %R0, %q1, %q2"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vrmlsldavhxq_s])
-;;
-(define_insn "mve_vrmlsldavhxq_sv4si"
+(define_insn "@mve_q_v4si"
   [
(set (match_operand:DI 0 "s_register_operand" "=r")
(unspec:DI [(match_operand:V4SI 1 "s_register_operand" "w")
(match_operand:V4SI 2 "s_register_operand" "w")]
-VRMLSLDAVHXQ_S))
+MVE_VRMLxLDAVxQ))
   ]
   "TARGET_HAVE_MVE"
-  "vrmlsldavhx.s32\t%Q0, %R0, %q1, %q2"
+  ".32\t%Q0, %R0, %q1, %q2"
   [(set_attr "type" "mve_move")
 ])
 
@@ -1653,21 +1626,6 @@ (define_insn "mve_vmullbq_poly_p"
   [(set_attr "type" "mve_move")
 ])
 
-;;
-;; [vrmlaldav

[PATCH 11/24] arm: [MVE intrinsics] add binary_acc_int64 shape

2023-05-11 Thread Christophe Lyon via Gcc-patches
This patch adds the binary_acc_int64 shape description.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_acc_int64): New.
* config/arm/arm-mve-builtins-shapes.h (binary_acc_int64): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 23 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 24 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc 
b/gcc/config/arm/arm-mve-builtins-shapes.cc
index ceb13230da6..f1c3844953a 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -392,6 +392,29 @@ struct binary_acc_int32_def : public overloaded_base<0>
 };
 SHAPE (binary_acc_int32)
 
+/* <[u]int64>_t vfoo[_](_t, _t)
+
+   Example: vmlaldavq.
+   int64_t [__arm_]vmlaldavq[_s16](int16x8_t m1, int16x8_t m2)
+   int64_t [__arm_]vmlaldavq_p[_s16](int16x8_t m1, int16x8_t m2, mve_pred16_t p)  */
+struct binary_acc_int64_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+         bool preserve_user_namespace) const override
+  {
+    b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+    build_all (b, "sx64,v0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    return r.resolve_uniform (2);
+  }
+};
+SHAPE (binary_acc_int64)
+
 /* <[u]int32>_t vfoo[_]([u]int32_t, _t, _t)
 
Example: vmladavaq.
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index 7f68d41efe6..73e82d2fd7a 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -38,6 +38,7 @@ namespace arm_mve
 extern const function_shape *const binary_lshift;
 extern const function_shape *const binary_lshift_r;
 extern const function_shape *const binary_acc_int32;
+extern const function_shape *const binary_acc_int64;
 extern const function_shape *const binary_acca_int32;
 extern const function_shape *const binary_maxamina;
 extern const function_shape *const binary_maxavminav;
-- 
2.34.1



[PATCH 10/24] arm: [MVE intrinsics] rework vabavq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Implement vabavq using the new MVE builtins framework.
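
For readers unfamiliar with the intrinsic, here is a minimal scalar
sketch of what vabavq_s8 computes: the sum of absolute lane differences
added to a 32-bit accumulator. The function name ref_vabavq_s8 and the
array arguments are hypothetical stand-ins for the real vector types,
not code from this patch.

```c
#include <stdint.h>

/* Hypothetical scalar reference model (not part of GCC): arrays of
   sixteen int8_t lanes stand in for int8x16_t.  */
uint32_t
ref_vabavq_s8 (uint32_t a, const int8_t b[16], const int8_t c[16])
{
  for (int i = 0; i < 16; i++)
    {
      /* Widen before subtracting so the difference cannot overflow.  */
      int diff = (int) b[i] - (int) c[i];
      a += (uint32_t) (diff < 0 ? -diff : diff);
    }

  return a;
}
```

The accumulator comes first and the two vectors follow, matching the
binary_acca_int32 shape used for vabavq in the .def entry.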

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vabavq): New.
* config/arm/arm-mve-builtins-base.def (vabavq): New.
* config/arm/arm-mve-builtins-base.h (vabavq): New.
* config/arm/arm_mve.h (vabavq): Remove.
(vabavq_p): Remove.
(vabavq_s8): Remove.
(vabavq_s16): Remove.
(vabavq_s32): Remove.
(vabavq_u8): Remove.
(vabavq_u16): Remove.
(vabavq_u32): Remove.
(vabavq_p_s8): Remove.
(vabavq_p_u8): Remove.
(vabavq_p_s16): Remove.
(vabavq_p_u16): Remove.
(vabavq_p_s32): Remove.
(vabavq_p_u32): Remove.
(__arm_vabavq_s8): Remove.
(__arm_vabavq_s16): Remove.
(__arm_vabavq_s32): Remove.
(__arm_vabavq_u8): Remove.
(__arm_vabavq_u16): Remove.
(__arm_vabavq_u32): Remove.
(__arm_vabavq_p_s8): Remove.
(__arm_vabavq_p_u8): Remove.
(__arm_vabavq_p_s16): Remove.
(__arm_vabavq_p_u16): Remove.
(__arm_vabavq_p_s32): Remove.
(__arm_vabavq_p_u32): Remove.
(__arm_vabavq): Remove.
(__arm_vabavq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   1 +
 gcc/config/arm/arm-mve-builtins-base.def |   1 +
 gcc/config/arm/arm-mve-builtins-base.h   |   1 +
 gcc/config/arm/arm_mve.h | 215 ---
 4 files changed, 3 insertions(+), 215 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index 8a5ab990337..a81cf4cba5e 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -241,6 +241,7 @@ namespace arm_mve {
     (-1, -1, UNSPEC##_F, \
      -1, -1, UNSPEC##_P_F))
 
+FUNCTION_PRED_P_S_U (vabavq, VABAVQ)
 FUNCTION_WITHOUT_N (vabdq, VABDQ)
 FUNCTION (vabsq, unspec_based_mve_function_exact_insn, (ABS, ABS, ABS, -1, -1, -1, VABSQ_M_S, -1, VABSQ_M_F, -1, -1, -1))
 FUNCTION_WITH_RTX_M_N (vaddq, PLUS, VADDQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index cf0ed4b58df..934f45bc220 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -18,6 +18,7 @@
.  */
 
 #define REQUIRES_FLOAT false
+DEF_MVE_FUNCTION (vabavq, binary_acca_int32, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vabdq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vabsq, unary, all_signed, mx_or_none)
 DEF_MVE_FUNCTION (vaddlvaq, unary_widen_acc, integer_32, p_or_none)
diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h
index 4f09bebf1cb..1d29a940200 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -23,6 +23,7 @@
 namespace arm_mve {
 namespace functions {
 
+extern const function_base *const vabavq;
 extern const function_base *const vabdq;
 extern const function_base *const vabsq;
 extern const function_base *const vaddlvaq;
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 86fa7fcf789..f8afe19e86e 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -65,7 +65,6 @@
 #define vrmlsldavhxq(__a, __b) __arm_vrmlsldavhxq(__a, __b)
 #define vrmlsldavhq(__a, __b) __arm_vrmlsldavhq(__a, __b)
 #define vrmlaldavhxq(__a, __b) __arm_vrmlaldavhxq(__a, __b)
-#define vabavq(__a, __b, __c) __arm_vabavq(__a, __b, __c)
 #define vbicq_m_n(__a, __imm, __p) __arm_vbicq_m_n(__a, __imm, __p)
 #define vrmlaldavhaq(__a, __b, __c) __arm_vrmlaldavhaq(__a, __b, __c)
 #define vshlcq(__a, __b, __imm) __arm_vshlcq(__a, __b, __imm)
@@ -104,7 +103,6 @@
 #define vmlsldavxq_p(__a, __b, __p) __arm_vmlsldavxq_p(__a, __b, __p)
 #define vsriq_m(__a, __b, __imm, __p) __arm_vsriq_m(__a, __b, __imm, __p)
 #define vqshluq_m(__inactive, __a, __imm, __p) __arm_vqshluq_m(__inactive, __a, __imm, __p)
-#define vabavq_p(__a, __b, __c, __p) __arm_vabavq_p(__a, __b, __c, __p)
 #define vbicq_m(__inactive, __a, __b, __p) __arm_vbicq_m(__inactive, __a, __b, __p)
 #define vbrsrq_m(__inactive, __a, __b, __p) __arm_vbrsrq_m(__inactive, __a, __b, __p)
 #define vcaddq_rot270_m(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m(__inactive, __a, __b, __p)
@@ -447,9 +445,6 @@
 #define vrmlaldavhq_s32(__a, __b) __arm_vrmlaldavhq_s32(__a, __b)
 #define vcvttq_f16_f32(__a, __b) __arm_vcvttq_f16_f32(__a, __b)
 #define vcvtbq_f16_f32(__a, __b) __arm_vcvtbq_f16_f32(__a, __b)
-#define vabavq_s8(__a, __b, __c) __arm_vabavq_s8(__a, __b, __c)
-#define vabavq_s16(__a, __b, __c) __arm_vabavq_s16(__a, __b, __c)
-#define vabavq_s32(__a, __b, __c) __arm_vabavq_s32(__a, __b, __c)
 #define vbicq_m_n_s16(__a,  __imm, __p) __arm_vbicq_m_n_s16(__a,  __imm, __p)
 #define vbicq_m_n_s32(__a,  __imm, __p) __arm_vbicq_m_n_s32(__a,  __imm, __p)
 #define vbicq_m_n_u16(_

[PATCH 21/24] arm: [MVE intrinsics] rework vqrdmladhq vqrdmladhxq vqrdmlsdhq vqrdmlsdhxq vqdmladhq vqdmladhxq vqdmlsdhq vqdmlsdhxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Implement vqrdmladhq, vqrdmladhxq, vqrdmlsdhq, vqrdmlsdhxq, vqdmladhq,
vqdmladhxq, vqdmlsdhq, vqdmlsdhxq using the new MVE builtins
framework.

2022-12-12  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vqdmladhq, vqdmladhxq)
(vqdmlsdhq, vqdmlsdhxq, vqrdmladhq, vqrdmladhxq, vqrdmlsdhq)
(vqrdmlsdhxq): New.
* config/arm/arm-mve-builtins-base.def (vqdmladhq, vqdmladhxq)
(vqdmlsdhq, vqdmlsdhxq, vqrdmladhq, vqrdmladhxq, vqrdmlsdhq)
(vqrdmlsdhxq): New.
* config/arm/arm-mve-builtins-base.h (vqdmladhq, vqdmladhxq)
(vqdmlsdhq, vqdmlsdhxq, vqrdmladhq, vqrdmladhxq, vqrdmlsdhq)
(vqrdmlsdhxq): New.
* config/arm/arm-mve-builtins.cc
(function_instance::has_inactive_argument): Handle vqrdmladhq,
vqrdmladhxq, vqrdmlsdhq, vqrdmlsdhxq, vqdmladhq, vqdmladhxq,
vqdmlsdhq, vqdmlsdhxq.
* config/arm/arm_mve.h (vqrdmlsdhxq): Remove.
(vqrdmlsdhq): Remove.
(vqrdmladhxq): Remove.
(vqrdmladhq): Remove.
(vqdmlsdhxq): Remove.
(vqdmlsdhq): Remove.
(vqdmladhxq): Remove.
(vqdmladhq): Remove.
(vqdmladhq_m): Remove.
(vqdmladhxq_m): Remove.
(vqdmlsdhq_m): Remove.
(vqdmlsdhxq_m): Remove.
(vqrdmladhq_m): Remove.
(vqrdmladhxq_m): Remove.
(vqrdmlsdhq_m): Remove.
(vqrdmlsdhxq_m): Remove.
(vqrdmlsdhxq_s8): Remove.
(vqrdmlsdhq_s8): Remove.
(vqrdmladhxq_s8): Remove.
(vqrdmladhq_s8): Remove.
(vqdmlsdhxq_s8): Remove.
(vqdmlsdhq_s8): Remove.
(vqdmladhxq_s8): Remove.
(vqdmladhq_s8): Remove.
(vqrdmlsdhxq_s16): Remove.
(vqrdmlsdhq_s16): Remove.
(vqrdmladhxq_s16): Remove.
(vqrdmladhq_s16): Remove.
(vqdmlsdhxq_s16): Remove.
(vqdmlsdhq_s16): Remove.
(vqdmladhxq_s16): Remove.
(vqdmladhq_s16): Remove.
(vqrdmlsdhxq_s32): Remove.
(vqrdmlsdhq_s32): Remove.
(vqrdmladhxq_s32): Remove.
(vqrdmladhq_s32): Remove.
(vqdmlsdhxq_s32): Remove.
(vqdmlsdhq_s32): Remove.
(vqdmladhxq_s32): Remove.
(vqdmladhq_s32): Remove.
(vqdmladhq_m_s8): Remove.
(vqdmladhq_m_s32): Remove.
(vqdmladhq_m_s16): Remove.
(vqdmladhxq_m_s8): Remove.
(vqdmladhxq_m_s32): Remove.
(vqdmladhxq_m_s16): Remove.
(vqdmlsdhq_m_s8): Remove.
(vqdmlsdhq_m_s32): Remove.
(vqdmlsdhq_m_s16): Remove.
(vqdmlsdhxq_m_s8): Remove.
(vqdmlsdhxq_m_s32): Remove.
(vqdmlsdhxq_m_s16): Remove.
(vqrdmladhq_m_s8): Remove.
(vqrdmladhq_m_s32): Remove.
(vqrdmladhq_m_s16): Remove.
(vqrdmladhxq_m_s8): Remove.
(vqrdmladhxq_m_s32): Remove.
(vqrdmladhxq_m_s16): Remove.
(vqrdmlsdhq_m_s8): Remove.
(vqrdmlsdhq_m_s32): Remove.
(vqrdmlsdhq_m_s16): Remove.
(vqrdmlsdhxq_m_s8): Remove.
(vqrdmlsdhxq_m_s32): Remove.
(vqrdmlsdhxq_m_s16): Remove.
(__arm_vqrdmlsdhxq_s8): Remove.
(__arm_vqrdmlsdhq_s8): Remove.
(__arm_vqrdmladhxq_s8): Remove.
(__arm_vqrdmladhq_s8): Remove.
(__arm_vqdmlsdhxq_s8): Remove.
(__arm_vqdmlsdhq_s8): Remove.
(__arm_vqdmladhxq_s8): Remove.
(__arm_vqdmladhq_s8): Remove.
(__arm_vqrdmlsdhxq_s16): Remove.
(__arm_vqrdmlsdhq_s16): Remove.
(__arm_vqrdmladhxq_s16): Remove.
(__arm_vqrdmladhq_s16): Remove.
(__arm_vqdmlsdhxq_s16): Remove.
(__arm_vqdmlsdhq_s16): Remove.
(__arm_vqdmladhxq_s16): Remove.
(__arm_vqdmladhq_s16): Remove.
(__arm_vqrdmlsdhxq_s32): Remove.
(__arm_vqrdmlsdhq_s32): Remove.
(__arm_vqrdmladhxq_s32): Remove.
(__arm_vqrdmladhq_s32): Remove.
(__arm_vqdmlsdhxq_s32): Remove.
(__arm_vqdmlsdhq_s32): Remove.
(__arm_vqdmladhxq_s32): Remove.
(__arm_vqdmladhq_s32): Remove.
(__arm_vqdmladhq_m_s8): Remove.
(__arm_vqdmladhq_m_s32): Remove.
(__arm_vqdmladhq_m_s16): Remove.
(__arm_vqdmladhxq_m_s8): Remove.
(__arm_vqdmladhxq_m_s32): Remove.
(__arm_vqdmladhxq_m_s16): Remove.
(__arm_vqdmlsdhq_m_s8): Remove.
(__arm_vqdmlsdhq_m_s32): Remove.
(__arm_vqdmlsdhq_m_s16): Remove.
(__arm_vqdmlsdhxq_m_s8): Remove.
(__arm_vqdmlsdhxq_m_s32): Remove.
(__arm_vqdmlsdhxq_m_s16): Remove.
(__arm_vqrdmladhq_m_s8): Remove.
(__arm_vqrdmladhq_m_s32): Remove.
(__arm_vqrdmladhq_m_s16): Remove.
(__arm_vqrdmladhxq_m_s8): Remove.
(__arm_vqrdmladhxq_m_s32): Remove.
(__arm_vqrdmladhxq_m_s16): Remove.
(__arm_vqrdmlsdhq_m_s8): Remove.
(__arm_vqrdmlsdhq_m_s32): Remove.
(__arm_vqrdmlsdhq_m_s16): Remove.
(__arm_vqrdmlsdhxq_m_s8): Remove.
(__arm_vqrdmlsdhxq_m_s32): Remove.

[PATCH 23/24] arm: [MVE intrinsics] factorize vmlaq_n vmlasq_n vqdmlahq_n vqdmlashq_n vqrdmlahq_n vqrdmlashq_n

2023-05-11 Thread Christophe Lyon via Gcc-patches
Factorize vmlaq_n, vmlasq_n, vqdmlahq_n, vqdmlashq_n, vqrdmlahq_n,
vqrdmlashq_n builtins so that they use the same parameterized names.
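
The idea behind the factorization can be sketched in plain C as a table
lookup: one generic routine plus per-unspec attributes replaces six
near-identical patterns. The enum, table and function names below are
hypothetical, and the "<insn>.<sign><size>" layout of the mnemonic is
an assumption made for illustration; only the mnemonic and sign strings
are taken from the iterators.md changes in this patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the unspec codes in MVE_VMLxQ_N.  */
enum vmlxq_n_unspec
{
  VMLAQ_N_S, VMLAQ_N_U,
  VMLASQ_N_S, VMLASQ_N_U,
  VQDMLAHQ_N_S, VQDMLASHQ_N_S,
  VQRDMLAHQ_N_S, VQRDMLASHQ_N_S,
  VMLXQ_N_COUNT
};

/* Per-unspec attributes, in enum order, playing the role of the
   mve_insn and supf int attributes.  */
static const struct
{
  const char *insn;
  const char *supf;
} vmlxq_n_attr[VMLXQ_N_COUNT] = {
  { "vmla", "s" },      /* VMLAQ_N_S */
  { "vmla", "u" },      /* VMLAQ_N_U */
  { "vmlas", "s" },     /* VMLASQ_N_S */
  { "vmlas", "u" },     /* VMLASQ_N_U */
  { "vqdmlah", "s" },   /* VQDMLAHQ_N_S */
  { "vqdmlash", "s" },  /* VQDMLASHQ_N_S */
  { "vqrdmlah", "s" },  /* VQRDMLAHQ_N_S */
  { "vqrdmlash", "s" }  /* VQRDMLASHQ_N_S */
};

/* Build a mnemonic such as "vmla.u8" from the table.  The layout is an
   assumption for this sketch.  Returns what snprintf returns.  */
int
format_vmlxq_n_mnemonic (char *buf, size_t len, enum vmlxq_n_unspec u,
                         int elem_bits)
{
  return snprintf (buf, len, "%s.%s%d", vmlxq_n_attr[u].insn,
                   vmlxq_n_attr[u].supf, elem_bits);
}
```

Adding a new instruction to the family then only means adding a table
row, which is the same economy the merged define_insn aims for.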

2022-12-12  Christophe Lyon  

gcc/
* config/arm/iterators.md (MVE_VMLxQ_N): New.
(mve_insn): Add vmla, vmlas, vqdmlah, vqdmlash, vqrdmlah,
vqrdmlash.
(supf): Add VQDMLAHQ_N_S, VQDMLASHQ_N_S, VQRDMLAHQ_N_S,
VQRDMLASHQ_N_S.
* config/arm/mve.md (mve_vmlaq_n_)
(mve_vmlasq_n_, mve_vqdmlahq_n_)
(mve_vqdmlashq_n_, mve_vqrdmlahq_n_)
(mve_vqrdmlashq_n_): Merge into ...
(@mve_q_n_): ... this.
---
 gcc/config/arm/iterators.md | 19 
 gcc/config/arm/mve.md   | 93 -
 2 files changed, 28 insertions(+), 84 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index c23ca7361c1..abd904da11e 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -699,6 +699,15 @@ (define_int_iterator MVE_VMAXAVMINAQ_M [
 VMINAQ_M_S
 ])
 
+(define_int_iterator MVE_VMLxQ_N [
+VMLAQ_N_S VMLAQ_N_U
+VMLASQ_N_S VMLASQ_N_U
+VQDMLAHQ_N_S
+VQDMLASHQ_N_S
+VQRDMLAHQ_N_S
+VQRDMLASHQ_N_S
+])
+
 (define_int_iterator MVE_VMLxDAVQ [
 VMLADAVQ_S VMLADAVQ_U
 VMLADAVXQ_S
@@ -917,7 +926,9 @@ (define_int_attr mve_insn [
 (VMLALDAVXQ_P_S "vmlaldavx")
 (VMLALDAVXQ_S "vmlaldavx")
 (VMLAQ_M_N_S "vmla") (VMLAQ_M_N_U "vmla")
+(VMLAQ_N_S "vmla") (VMLAQ_N_U "vmla")
 (VMLASQ_M_N_S "vmlas") (VMLASQ_M_N_U "vmlas")
+(VMLASQ_N_S "vmlas") (VMLASQ_N_U "vmlas")
 (VMLSDAVAQ_P_S "vmlsdava")
 (VMLSDAVAQ_S "vmlsdava")
 (VMLSDAVAXQ_P_S "vmlsdavax")
@@ -963,7 +974,9 @@ (define_int_attr mve_insn [
 (VQDMLADHXQ_M_S "vqdmladhx")
 (VQDMLADHXQ_S "vqdmladhx")
 (VQDMLAHQ_M_N_S "vqdmlah")
+(VQDMLAHQ_N_S "vqdmlah")
 (VQDMLASHQ_M_N_S "vqdmlash")
+(VQDMLASHQ_N_S "vqdmlash")
 (VQDMLSDHQ_M_S "vqdmlsdh")
 (VQDMLSDHQ_S "vqdmlsdh")
 (VQDMLSDHXQ_M_S "vqdmlsdhx")
@@ -987,7 +1000,9 @@ (define_int_attr mve_insn [
 (VQRDMLADHXQ_M_S "vqrdmladhx")
 (VQRDMLADHXQ_S "vqrdmladhx")
 (VQRDMLAHQ_M_N_S "vqrdmlah")
+(VQRDMLAHQ_N_S "vqrdmlah")
 (VQRDMLASHQ_M_N_S "vqrdmlash")
+(VQRDMLASHQ_N_S "vqrdmlash")
 (VQRDMLSDHQ_M_S "vqrdmlsdh")
 (VQRDMLSDHQ_S "vqrdmlsdh")
 (VQRDMLSDHXQ_M_S "vqrdmlsdhx")
@@ -2406,6 +2421,10 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
   (VQRDMLADHXQ_S "s")
   (VQRDMLSDHQ_S "s")
   (VQRDMLSDHXQ_S "s")
+  (VQDMLAHQ_N_S "s")
+  (VQDMLASHQ_N_S "s")
+  (VQRDMLAHQ_N_S "s")
+  (VQRDMLASHQ_N_S "s")
   ])
 
 ;; Both kinds of return insn.
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index bf4d18455fe..14634cbf333 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1955,34 +1955,23 @@ (define_insn "@mve_q_p_"
    (set_attr "length" "8")])
 
 ;;
-;; [vmlaq_n_u, vmlaq_n_s])
+;; [vmlaq_n_u, vmlaq_n_s]
+;; [vmlasq_n_u, vmlasq_n_s]
+;; [vqdmlahq_n_s]
+;; [vqdmlashq_n_s]
+;; [vqrdmlahq_n_s]
+;; [vqrdmlashq_n_s]
 ;;
-(define_insn "mve_vmlaq_n_"
-  [
-   (set (match_operand:MVE_2 0 "s_register_operand" "=w")
-   (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
-  (match_operand:MVE_2 2 "s_register_operand" "w")
-  (match_operand: 3 "s_register_operand" "r")]
-VMLAQ_N))
-  ]
-  "TARGET_HAVE_MVE"
-  "vmla.%#\t%q0, %q2, %3"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vmlasq_n_u, vmlasq_n_s])
-;;
-(define_insn "mve_vmlasq_n_"
+(define_insn "@mve_q_n_"
   [
    (set (match_operand:MVE_2 0 "s_register_operand" "=w")
         (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
                        (match_operand:MVE_2 2 "s_register_operand" "w")
                        (match_operand: 3 "s_register_operand" "r")]
-         VMLASQ_N))
+         MVE_VMLxQ_N))
   ]
   "TARGET_HAVE_MVE"
-  "vmlas.%#   %q0, %q2, %3"
+  ".%#\t%q0, %q2, %3"
   [(set_attr "type" "mve_move")
 ])
 
@@ -2018,38 +2007,6 @@ (define_insn "@mve_vpselq_"
   [(set_attr "type" "mve_move")
 ])
 
-;;
-;; [vqdmlahq_n_s])
-;;
-(define_insn "mve_vqdmlahq_n_"
-  [
-   (set (match_operand:MVE_2 0 "s_register_operand" "=w")
-   (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
-

[PATCH 19/24] arm: [MVE intrinsics] add ternary shape

2023-05-11 Thread Christophe Lyon via Gcc-patches
This patch adds the ternary shape description.
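
To make the relationship between the signature string and the argument
count concrete, here is a small C sketch that counts the arguments a
string such as "v0,v0,v0,v0" describes. It assumes the first
comma-separated field is the return type and the rest are arguments;
the function name signature_arg_count is hypothetical and does not
exist in GCC.

```c
#include <stddef.h>

/* Hypothetical helper (not part of GCC): count the arguments in a
   shape signature, assuming the first comma-separated field is the
   return type.  */
size_t
signature_arg_count (const char *signature)
{
  /* A non-empty string has at least one field.  */
  size_t fields = (*signature != '\0') ? 1 : 0;

  for (const char *p = signature; *p != '\0'; p++)
    if (*p == ',')
      fields++;

  /* Drop the return-type field.  */
  return fields > 0 ? fields - 1 : 0;
}
```

Under that assumption "v0,v0,v0,v0" yields three arguments, which
agrees with the count passed to resolve_uniform_opt_n in the patch.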

2022-12-12  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (ternary): New.
* config/arm/arm-mve-builtins-shapes.h (ternary): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 26 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 27 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index af770fd3e39..4455a253579 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1163,6 +1163,32 @@ struct inherent_def : public nonoverloaded_base
 };
 SHAPE (inherent)
 
+/* _t vfoo[_t0](_t, _t, _t)
+
+   i.e. the standard shape for ternary operations that operate on
+   uniform types.
+
+   Example: vqrdmlsdhxq.
+   int8x16_t [__arm_]vqrdmlsdhxq[_s8](int8x16_t inactive, int8x16_t a, int8x16_t b)
+   int8x16_t [__arm_]vqrdmlsdhxq_m[_s8](int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)  */
+struct ternary_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+         bool preserve_user_namespace) const override
+  {
+    b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+    build_all (b, "v0,v0,v0,v0", group, MODE_none, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    return r.resolve_uniform_opt_n (3);
+  }
+};
+SHAPE (ternary)
+
 /* _t vfoo[_t0](_t)
 
i.e. the standard shape for unary operations that operate on
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index 1c4254122bc..b3ddd0a9e8d 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -56,6 +56,7 @@ namespace arm_mve
 extern const function_shape *const cmp;
 extern const function_shape *const create;
 extern const function_shape *const inherent;
+extern const function_shape *const ternary;
 extern const function_shape *const unary;
 extern const function_shape *const unary_acc;
 extern const function_shape *const unary_convert;
-- 
2.34.1



[PATCH 18/24] arm: [MVE intrinsics] rework vmlaldavaq vmlaldavaxq vmlsldavaq vmlsldavaxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Implement vmlaldavaq, vmlaldavaxq, vmlsldavaq, vmlsldavaxq using the
new MVE builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vmlaldavaq, vmlaldavaxq)
(vmlsldavaq, vmlsldavaxq): New.
* config/arm/arm-mve-builtins-base.def (vmlaldavaq, vmlaldavaxq)
(vmlsldavaq, vmlsldavaxq): New.
* config/arm/arm-mve-builtins-base.h (vmlaldavaq, vmlaldavaxq)
(vmlsldavaq, vmlsldavaxq): New.
* config/arm/arm_mve.h (vmlaldavaq): Remove.
(vmlaldavaxq): Remove.
(vmlsldavaq): Remove.
(vmlsldavaxq): Remove.
(vmlaldavaq_p): Remove.
(vmlaldavaxq_p): Remove.
(vmlsldavaq_p): Remove.
(vmlsldavaxq_p): Remove.
(vmlaldavaq_s16): Remove.
(vmlaldavaxq_s16): Remove.
(vmlsldavaq_s16): Remove.
(vmlsldavaxq_s16): Remove.
(vmlaldavaq_u16): Remove.
(vmlaldavaq_s32): Remove.
(vmlaldavaxq_s32): Remove.
(vmlsldavaq_s32): Remove.
(vmlsldavaxq_s32): Remove.
(vmlaldavaq_u32): Remove.
(vmlaldavaq_p_s32): Remove.
(vmlaldavaq_p_s16): Remove.
(vmlaldavaq_p_u32): Remove.
(vmlaldavaq_p_u16): Remove.
(vmlaldavaxq_p_s32): Remove.
(vmlaldavaxq_p_s16): Remove.
(vmlsldavaq_p_s32): Remove.
(vmlsldavaq_p_s16): Remove.
(vmlsldavaxq_p_s32): Remove.
(vmlsldavaxq_p_s16): Remove.
(__arm_vmlaldavaq_s16): Remove.
(__arm_vmlaldavaxq_s16): Remove.
(__arm_vmlsldavaq_s16): Remove.
(__arm_vmlsldavaxq_s16): Remove.
(__arm_vmlaldavaq_u16): Remove.
(__arm_vmlaldavaq_s32): Remove.
(__arm_vmlaldavaxq_s32): Remove.
(__arm_vmlsldavaq_s32): Remove.
(__arm_vmlsldavaxq_s32): Remove.
(__arm_vmlaldavaq_u32): Remove.
(__arm_vmlaldavaq_p_s32): Remove.
(__arm_vmlaldavaq_p_s16): Remove.
(__arm_vmlaldavaq_p_u32): Remove.
(__arm_vmlaldavaq_p_u16): Remove.
(__arm_vmlaldavaxq_p_s32): Remove.
(__arm_vmlaldavaxq_p_s16): Remove.
(__arm_vmlsldavaq_p_s32): Remove.
(__arm_vmlsldavaq_p_s16): Remove.
(__arm_vmlsldavaxq_p_s32): Remove.
(__arm_vmlsldavaxq_p_s16): Remove.
(__arm_vmlaldavaq): Remove.
(__arm_vmlaldavaxq): Remove.
(__arm_vmlsldavaq): Remove.
(__arm_vmlsldavaxq): Remove.
(__arm_vmlaldavaq_p): Remove.
(__arm_vmlaldavaxq_p): Remove.
(__arm_vmlsldavaq_p): Remove.
(__arm_vmlsldavaxq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   4 +
 gcc/config/arm/arm-mve-builtins-base.def |   4 +
 gcc/config/arm/arm-mve-builtins-base.h   |   4 +
 gcc/config/arm/arm_mve.h | 368 ---
 4 files changed, 12 insertions(+), 368 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index 142ba9357a1..2b0c800013c 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -285,12 +285,16 @@ FUNCTION_PRED_P_S (vmladavaxq, VMLADAVAXQ)
 FUNCTION_PRED_P_S_U (vmladavaq, VMLADAVAQ)
 FUNCTION_PRED_P_S_U (vmladavq, VMLADAVQ)
 FUNCTION_PRED_P_S (vmladavxq, VMLADAVXQ)
+FUNCTION_PRED_P_S_U (vmlaldavaq, VMLALDAVAQ)
+FUNCTION_PRED_P_S (vmlaldavaxq, VMLALDAVAXQ)
 FUNCTION_PRED_P_S_U (vmlaldavq, VMLALDAVQ)
 FUNCTION_PRED_P_S (vmlaldavxq, VMLALDAVXQ)
 FUNCTION_PRED_P_S (vmlsdavaq, VMLSDAVAQ)
 FUNCTION_PRED_P_S (vmlsdavaxq, VMLSDAVAXQ)
 FUNCTION_PRED_P_S (vmlsdavq, VMLSDAVQ)
 FUNCTION_PRED_P_S (vmlsdavxq, VMLSDAVXQ)
+FUNCTION_PRED_P_S (vmlsldavaq, VMLSLDAVAQ)
+FUNCTION_PRED_P_S (vmlsldavaxq, VMLSLDAVAXQ)
 FUNCTION_PRED_P_S (vmlsldavq, VMLSLDAVQ)
 FUNCTION_PRED_P_S (vmlsldavxq, VMLSLDAVXQ)
 FUNCTION_WITHOUT_N_NO_F (vmovlbq, VMOVLBQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index 1dd3ad3489b..d61badb99d9 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -54,12 +54,16 @@ DEF_MVE_FUNCTION (vmladavaq, binary_acca_int32, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vmladavaxq, binary_acca_int32, all_signed, p_or_none)
 DEF_MVE_FUNCTION (vmladavq, binary_acc_int32, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vmladavxq, binary_acc_int32, all_signed, p_or_none)
+DEF_MVE_FUNCTION (vmlaldavaq, binary_acca_int64, integer_16_32, p_or_none)
+DEF_MVE_FUNCTION (vmlaldavaxq, binary_acca_int64, signed_16_32, p_or_none)
 DEF_MVE_FUNCTION (vmlaldavq, binary_acc_int64, integer_16_32, p_or_none)
 DEF_MVE_FUNCTION (vmlaldavxq, binary_acc_int64, signed_16_32, p_or_none)
 DEF_MVE_FUNCTION (vmlsdavaq, binary_acca_int32, all_signed, p_or_none)
 DEF_MVE_FUNCTION (vmlsdavaxq, binary_acca_int32, all_signed, p_or_none)
 DEF_MVE_FUNCTION (vmlsdavq, binary_acc_int32, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vmlsdavxq, binary_acc_int32, all_signed, p_or_none)
+DEF_MVE_FUNCTION 

[PATCH 17/24] arm: [MVE intrinsics] factorize vmlaldavaq vmlaldavaxq vmlsldavaq vmlsldavaxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Factorize vmlaldavaq, vmlaldavaxq, vmlsldavaq, vmlsldavaxq builtins so
that they use the same parameterized names.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/iterators.md (MVE_VMLxLDAVAxQ, MVE_VMLxLDAVAxQ_P):
New.
(mve_insn): Add vmlaldava, vmlaldavax, vmlsldava, vmlsldavax.
(supf): Add VMLALDAVAXQ_P_S, VMLALDAVAXQ_S, VMLSLDAVAQ_P_S,
VMLSLDAVAQ_S, VMLSLDAVAXQ_P_S, VMLSLDAVAXQ_S.
* config/arm/mve.md (mve_vmlaldavaq_)
(mve_vmlsldavaq_s, mve_vmlsldavaxq_s)
(mve_vmlaldavaxq_s): Merge into ...
(@mve_q_): ... this.
(mve_vmlaldavaq_p_, mve_vmlaldavaxq_p_)
(mve_vmlsldavaq_p_s, mve_vmlsldavaxq_p_s): Merge into
...
(@mve_q_p_): ... this.
---
 gcc/config/arm/iterators.md |  28 +
 gcc/config/arm/mve.md   | 121 +---
 2 files changed, 42 insertions(+), 107 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 729127d8586..7a88bc91182 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -741,6 +741,20 @@ (define_int_iterator MVE_VMLxLDAVxQ_P [
 VMLSLDAVXQ_P_S
 ])
 
+(define_int_iterator MVE_VMLxLDAVAxQ [
+VMLALDAVAQ_S VMLALDAVAQ_U
+VMLALDAVAXQ_S
+VMLSLDAVAQ_S
+VMLSLDAVAXQ_S
+])
+
+(define_int_iterator MVE_VMLxLDAVAxQ_P [
+VMLALDAVAQ_P_S VMLALDAVAQ_P_U
+VMLALDAVAXQ_P_S
+VMLSLDAVAQ_P_S
+VMLSLDAVAXQ_P_S
+])
+
 (define_int_iterator MVE_VRMLxLDAVxQ [
 VRMLALDAVHQ_S VRMLALDAVHQ_U
 VRMLALDAVHXQ_S
@@ -883,6 +897,10 @@ (define_int_attr mve_insn [
 (VMLADAVQ_S "vmladav") (VMLADAVQ_U "vmladav")
 (VMLADAVXQ_P_S "vmladavx")
 (VMLADAVXQ_S "vmladavx")
+(VMLALDAVAQ_P_S "vmlaldava") (VMLALDAVAQ_P_U "vmlaldava")
+(VMLALDAVAQ_S "vmlaldava") (VMLALDAVAQ_U "vmlaldava")
+(VMLALDAVAXQ_P_S "vmlaldavax")
+(VMLALDAVAXQ_S "vmlaldavax")
 (VMLALDAVQ_P_S "vmlaldav") (VMLALDAVQ_P_U "vmlaldav")
 (VMLALDAVQ_S "vmlaldav") (VMLALDAVQ_U "vmlaldav")
 (VMLALDAVXQ_P_S "vmlaldavx")
@@ -897,6 +915,10 @@ (define_int_attr mve_insn [
 (VMLSDAVQ_S "vmlsdav")
 (VMLSDAVXQ_P_S "vmlsdavx")
 (VMLSDAVXQ_S "vmlsdavx")
+(VMLSLDAVAQ_P_S "vmlsldava")
+(VMLSLDAVAQ_S "vmlsldava")
+(VMLSLDAVAXQ_P_S "vmlsldavax")
+(VMLSLDAVAXQ_S "vmlsldavax")
 (VMLSLDAVQ_P_S "vmlsldav")
 (VMLSLDAVQ_S "vmlsldav")
 (VMLSLDAVXQ_P_S "vmlsldavx")
@@ -2351,6 +2373,12 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
   (VRMLSLDAVHQ_S "s")
   (VRMLSLDAVHXQ_P_S "s")
   (VRMLSLDAVHXQ_S "s")
+  (VMLALDAVAXQ_P_S "s")
+  (VMLALDAVAXQ_S "s")
+  (VMLSLDAVAQ_P_S "s")
+  (VMLSLDAVAQ_S "s")
+  (VMLSLDAVAXQ_P_S "s")
+  (VMLSLDAVAXQ_S "s")
   ])
 
 ;; Both kinds of return insn.
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index e2259aa48e9..c6fd634b5c0 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -2550,34 +2550,21 @@ (define_insn "@mve_q_p_f"
    (set_attr "length" "8")])
 
 ;;
-;; [vmlaldavaq_s, vmlaldavaq_u])
+;; [vmlaldavaq_s, vmlaldavaq_u]
+;; [vmlaldavaxq_s]
+;; [vmlsldavaq_s]
+;; [vmlsldavaxq_s]
 ;;
-(define_insn "mve_vmlaldavaq_"
-  [
-   (set (match_operand:DI 0 "s_register_operand" "=r")
-   (unspec:DI [(match_operand:DI 1 "s_register_operand" "0")
-  (match_operand:MVE_5 2 "s_register_operand" "w")
-  (match_operand:MVE_5 3 "s_register_operand" "w")]
-VMLALDAVAQ))
-  ]
-  "TARGET_HAVE_MVE"
-  "vmlaldava.%#\t%Q0, %R0, %q2, %q3"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vmlaldavaxq_s])
-;;
-(define_insn "mve_vmlaldavaxq_s"
+(define_insn "@mve_q_"
   [
    (set (match_operand:DI 0 "s_register_operand" "=r")
         (unspec:DI [(match_operand:DI 1 "s_register_operand" "0")
                     (match_operand:MVE_5 2 "s_register_operand" "w")
                     (match_operand:MVE_5 3 "s_register_operand" "w")]
-         VMLALDAVAXQ_S))
+         MVE_VMLxLDAVAxQ))
   ]
   "TARGET_HAVE_MVE"
-  "vmlaldavax.s%#\t%Q0, %R0, %q2, %q3"
+  ".%#\t%Q0, %R0, %q2, %q3"
   [(set_attr "type" "mve_move")
 ])
 
@@ -2600,38 +2587,6 @@ (define_insn "@mve_q_p_"
   [(set_attr "type" "mve_move")
    (set_attr "length" "8")])
 
-;;
-;; [vmlsldavaq_s])
-;;
-(define_insn "mve_vmlsldavaq_s"
-  [
-   (set (

[PATCH 15/24] arm: [MVE intrinsics] rework vrmlaldavhq vrmlaldavhxq vrmlsldavhq vrmlsldavhxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Implement vrmlaldavhq, vrmlaldavhxq, vrmlsldavhq, vrmlsldavhxq using
the new MVE builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vrmlaldavhq, vrmlaldavhxq)
(vrmlsldavhq, vrmlsldavhxq): New.
* config/arm/arm-mve-builtins-base.def (vrmlaldavhq, vrmlaldavhxq)
(vrmlsldavhq, vrmlsldavhxq): New.
* config/arm/arm-mve-builtins-base.h (vrmlaldavhq, vrmlaldavhxq)
(vrmlsldavhq, vrmlsldavhxq): New.
* config/arm/arm-mve-builtins-functions.h
(unspec_mve_function_exact_insn_pred_p): Handle vrmlaldavhq,
vrmlaldavhxq, vrmlsldavhq, vrmlsldavhxq.
* config/arm/arm_mve.h (vrmlaldavhq): Remove.
(vrmlsldavhxq): Remove.
(vrmlsldavhq): Remove.
(vrmlaldavhxq): Remove.
(vrmlaldavhq_p): Remove.
(vrmlaldavhxq_p): Remove.
(vrmlsldavhq_p): Remove.
(vrmlsldavhxq_p): Remove.
(vrmlaldavhq_u32): Remove.
(vrmlsldavhxq_s32): Remove.
(vrmlsldavhq_s32): Remove.
(vrmlaldavhxq_s32): Remove.
(vrmlaldavhq_s32): Remove.
(vrmlaldavhq_p_s32): Remove.
(vrmlaldavhxq_p_s32): Remove.
(vrmlsldavhq_p_s32): Remove.
(vrmlsldavhxq_p_s32): Remove.
(vrmlaldavhq_p_u32): Remove.
(__arm_vrmlaldavhq_u32): Remove.
(__arm_vrmlsldavhxq_s32): Remove.
(__arm_vrmlsldavhq_s32): Remove.
(__arm_vrmlaldavhxq_s32): Remove.
(__arm_vrmlaldavhq_s32): Remove.
(__arm_vrmlaldavhq_p_s32): Remove.
(__arm_vrmlaldavhxq_p_s32): Remove.
(__arm_vrmlsldavhq_p_s32): Remove.
(__arm_vrmlsldavhxq_p_s32): Remove.
(__arm_vrmlaldavhq_p_u32): Remove.
(__arm_vrmlaldavhq): Remove.
(__arm_vrmlsldavhxq): Remove.
(__arm_vrmlsldavhq): Remove.
(__arm_vrmlaldavhxq): Remove.
(__arm_vrmlaldavhq_p): Remove.
(__arm_vrmlaldavhxq_p): Remove.
(__arm_vrmlsldavhq_p): Remove.
(__arm_vrmlsldavhxq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc |   4 +
 gcc/config/arm/arm-mve-builtins-base.def|   4 +
 gcc/config/arm/arm-mve-builtins-base.h  |   4 +
 gcc/config/arm/arm-mve-builtins-functions.h |   8 +-
 gcc/config/arm/arm_mve.h| 182 
 5 files changed, 18 insertions(+), 184 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index af1a2c9942a..142ba9357a1 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -326,6 +326,10 @@ FUNCTION_WITHOUT_N_NO_F (vrev16q, VREV16Q)
 FUNCTION_WITHOUT_N (vrev32q, VREV32Q)
 FUNCTION_WITHOUT_N (vrev64q, VREV64Q)
 FUNCTION_WITHOUT_N_NO_F (vrhaddq, VRHADDQ)
+FUNCTION_PRED_P_S_U (vrmlaldavhq, VRMLALDAVHQ)
+FUNCTION_PRED_P_S (vrmlaldavhxq, VRMLALDAVHXQ)
+FUNCTION_PRED_P_S (vrmlsldavhq, VRMLSLDAVHQ)
+FUNCTION_PRED_P_S (vrmlsldavhxq, VRMLSLDAVHXQ)
 FUNCTION_WITHOUT_N_NO_F (vrmulhq, VRMULHQ)
 FUNCTION_ONLY_F (vrndq, VRNDQ)
 FUNCTION_ONLY_F (vrndaq, VRNDAQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index f7f353b34a7..1dd3ad3489b 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -96,6 +96,10 @@ DEF_MVE_FUNCTION (vrev16q, unary, integer_8, mx_or_none)
 DEF_MVE_FUNCTION (vrev32q, unary, integer_8_16, mx_or_none)
 DEF_MVE_FUNCTION (vrev64q, unary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vrhaddq, binary, all_integer, mx_or_none)
+DEF_MVE_FUNCTION (vrmlaldavhq, binary_acc_int64, integer_32, p_or_none)
+DEF_MVE_FUNCTION (vrmlaldavhxq, binary_acc_int64, signed_32, p_or_none)
+DEF_MVE_FUNCTION (vrmlsldavhq, binary_acc_int64, signed_32, p_or_none)
+DEF_MVE_FUNCTION (vrmlsldavhxq, binary_acc_int64, signed_32, p_or_none)
 DEF_MVE_FUNCTION (vrmulhq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vrshlq, binary_round_lshift, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vrshrnbq, binary_rshift_narrow, integer_16_32, m_or_none)
diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h
index 08d07a7c6d5..9604991b168 100644
--- a/gcc/config/arm/arm-mve-builtins-base.h
+++ b/gcc/config/arm/arm-mve-builtins-base.h
@@ -108,6 +108,10 @@ extern const function_base *const vrev16q;
 extern const function_base *const vrev32q;
 extern const function_base *const vrev64q;
 extern const function_base *const vrhaddq;
+extern const function_base *const vrmlaldavhq;
+extern const function_base *const vrmlaldavhxq;
+extern const function_base *const vrmlsldavhq;
+extern const function_base *const vrmlsldavhxq;
 extern const function_base *const vrmulhq;
 extern const function_base *const vrndaq;
 extern const function_base *const vrndmq;
diff --git a/gcc/config/arm/arm-mve-builtins-functions.h b/gcc/config/arm/arm-mve-builtins-functions.h
index ea926e42b81..77a6269f0da 100644
--- a/gcc/config/arm/arm-mve-

[PATCH 08/24] arm: [MVE intrinsics] rework vmladavaq vmladavaxq vmlsdavaq vmlsdavaxq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Implement vmladavaq, vmladavaxq, vmlsdavaq, vmlsdavaxq using the new
MVE builtins framework.

2022-10-25  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vmladavaxq, vmladavaq)
(vmlsdavaq, vmlsdavaxq): New.
* config/arm/arm-mve-builtins-base.def (vmladavaxq, vmladavaq)
(vmlsdavaq, vmlsdavaxq): New.
* config/arm/arm-mve-builtins-base.h (vmladavaxq, vmladavaq)
(vmlsdavaq, vmlsdavaxq): New.
* config/arm/arm_mve.h (vmladavaq): Remove.
(vmlsdavaxq): Remove.
(vmlsdavaq): Remove.
(vmladavaxq): Remove.
(vmladavaq_p): Remove.
(vmladavaxq_p): Remove.
(vmlsdavaq_p): Remove.
(vmlsdavaxq_p): Remove.
(vmladavaq_u8): Remove.
(vmlsdavaxq_s8): Remove.
(vmlsdavaq_s8): Remove.
(vmladavaxq_s8): Remove.
(vmladavaq_s8): Remove.
(vmladavaq_u16): Remove.
(vmlsdavaxq_s16): Remove.
(vmlsdavaq_s16): Remove.
(vmladavaxq_s16): Remove.
(vmladavaq_s16): Remove.
(vmladavaq_u32): Remove.
(vmlsdavaxq_s32): Remove.
(vmlsdavaq_s32): Remove.
(vmladavaxq_s32): Remove.
(vmladavaq_s32): Remove.
(vmladavaq_p_s8): Remove.
(vmladavaq_p_s32): Remove.
(vmladavaq_p_s16): Remove.
(vmladavaq_p_u8): Remove.
(vmladavaq_p_u32): Remove.
(vmladavaq_p_u16): Remove.
(vmladavaxq_p_s8): Remove.
(vmladavaxq_p_s32): Remove.
(vmladavaxq_p_s16): Remove.
(vmlsdavaq_p_s8): Remove.
(vmlsdavaq_p_s32): Remove.
(vmlsdavaq_p_s16): Remove.
(vmlsdavaxq_p_s8): Remove.
(vmlsdavaxq_p_s32): Remove.
(vmlsdavaxq_p_s16): Remove.
(__arm_vmladavaq_u8): Remove.
(__arm_vmlsdavaxq_s8): Remove.
(__arm_vmlsdavaq_s8): Remove.
(__arm_vmladavaxq_s8): Remove.
(__arm_vmladavaq_s8): Remove.
(__arm_vmladavaq_u16): Remove.
(__arm_vmlsdavaxq_s16): Remove.
(__arm_vmlsdavaq_s16): Remove.
(__arm_vmladavaxq_s16): Remove.
(__arm_vmladavaq_s16): Remove.
(__arm_vmladavaq_u32): Remove.
(__arm_vmlsdavaxq_s32): Remove.
(__arm_vmlsdavaq_s32): Remove.
(__arm_vmladavaxq_s32): Remove.
(__arm_vmladavaq_s32): Remove.
(__arm_vmladavaq_p_s8): Remove.
(__arm_vmladavaq_p_s32): Remove.
(__arm_vmladavaq_p_s16): Remove.
(__arm_vmladavaq_p_u8): Remove.
(__arm_vmladavaq_p_u32): Remove.
(__arm_vmladavaq_p_u16): Remove.
(__arm_vmladavaxq_p_s8): Remove.
(__arm_vmladavaxq_p_s32): Remove.
(__arm_vmladavaxq_p_s16): Remove.
(__arm_vmlsdavaq_p_s8): Remove.
(__arm_vmlsdavaq_p_s32): Remove.
(__arm_vmlsdavaq_p_s16): Remove.
(__arm_vmlsdavaxq_p_s8): Remove.
(__arm_vmlsdavaxq_p_s32): Remove.
(__arm_vmlsdavaxq_p_s16): Remove.
(__arm_vmladavaq): Remove.
(__arm_vmlsdavaxq): Remove.
(__arm_vmlsdavaq): Remove.
(__arm_vmladavaxq): Remove.
(__arm_vmladavaq_p): Remove.
(__arm_vmladavaxq_p): Remove.
(__arm_vmlsdavaq_p): Remove.
(__arm_vmlsdavaxq_p): Remove.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   4 +
 gcc/config/arm/arm-mve-builtins-base.def |   4 +
 gcc/config/arm/arm-mve-builtins-base.h   |   4 +
 gcc/config/arm/arm_mve.h | 538 ---
 4 files changed, 12 insertions(+), 538 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index 69af6f9139e..8a5ab990337 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -280,8 +280,12 @@ FUNCTION (vminnmq, unspec_based_mve_function_exact_insn, (UNKNOWN, UNKNOWN, SMIN
 FUNCTION_PRED_P_F (vminnmvq, VMINNMVQ)
 FUNCTION_WITH_RTX_M_NO_F (vminq, SMIN, UMIN, VMINQ)
 FUNCTION_PRED_P_S_U (vminvq, VMINVQ)
+FUNCTION_PRED_P_S (vmladavaxq, VMLADAVAXQ)
+FUNCTION_PRED_P_S_U (vmladavaq, VMLADAVAQ)
 FUNCTION_PRED_P_S_U (vmladavq, VMLADAVQ)
 FUNCTION_PRED_P_S (vmladavxq, VMLADAVXQ)
+FUNCTION_PRED_P_S (vmlsdavaq, VMLSDAVAQ)
+FUNCTION_PRED_P_S (vmlsdavaxq, VMLSDAVAXQ)
 FUNCTION_PRED_P_S (vmlsdavq, VMLSDAVQ)
 FUNCTION_PRED_P_S (vmlsdavxq, VMLSDAVXQ)
 FUNCTION_WITHOUT_N_NO_F (vmovlbq, VMOVLBQ)
diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index 40d462fc7d2..cf0ed4b58df 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -49,8 +49,12 @@ DEF_MVE_FUNCTION (vminaq, binary_maxamina, all_signed, m_or_none)
 DEF_MVE_FUNCTION (vminavq, binary_maxavminav, all_signed, p_or_none)
 DEF_MVE_FUNCTION (vminq, binary, all_integer, mx_or_none)
 DEF_MVE_FUNCTION (vminvq, binary_maxvminv, all_integer, p_or_none)
+DEF_MVE_FUNCTION (vmladavaq, binary_acca_int32, all_integer, p_or_none)
+DEF_MVE_FUNCTION (vmladavaxq, binary_acca_i

[PATCH 24/24] arm: [MVE intrinsics] rework vmlaq vmlasq vqdmlahq vqdmlashq vqrdmlahq vqrdmlashq

2023-05-11 Thread Christophe Lyon via Gcc-patches
Implement vmlaq, vmlasq, vqdmlahq, vqdmlashq, vqrdmlahq, vqrdmlashq
using the new MVE builtins framework.

2022-12-12  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-base.cc (vmlaq, vmlasq, vqdmlahq)
(vqdmlashq, vqrdmlahq, vqrdmlashq): New.
* config/arm/arm-mve-builtins-base.def (vmlaq, vmlasq, vqdmlahq)
(vqdmlashq, vqrdmlahq, vqrdmlashq): New.
* config/arm/arm-mve-builtins-base.h (vmlaq, vmlasq, vqdmlahq)
(vqdmlashq, vqrdmlahq, vqrdmlashq): New.
* config/arm/arm-mve-builtins.cc
(function_instance::has_inactive_argument): Handle vmlaq, vmlasq,
vqdmlahq, vqdmlashq, vqrdmlahq, vqrdmlashq.
* config/arm/arm_mve.h (vqrdmlashq): Remove.
(vqrdmlahq): Remove.
(vqdmlashq): Remove.
(vqdmlahq): Remove.
(vmlasq): Remove.
(vmlaq): Remove.
(vmlaq_m): Remove.
(vmlasq_m): Remove.
(vqdmlashq_m): Remove.
(vqdmlahq_m): Remove.
(vqrdmlahq_m): Remove.
(vqrdmlashq_m): Remove.
(vmlasq_n_u8): Remove.
(vmlaq_n_u8): Remove.
(vqrdmlashq_n_s8): Remove.
(vqrdmlahq_n_s8): Remove.
(vqdmlahq_n_s8): Remove.
(vqdmlashq_n_s8): Remove.
(vmlasq_n_s8): Remove.
(vmlaq_n_s8): Remove.
(vmlasq_n_u16): Remove.
(vmlaq_n_u16): Remove.
(vqrdmlashq_n_s16): Remove.
(vqrdmlahq_n_s16): Remove.
(vqdmlashq_n_s16): Remove.
(vqdmlahq_n_s16): Remove.
(vmlasq_n_s16): Remove.
(vmlaq_n_s16): Remove.
(vmlasq_n_u32): Remove.
(vmlaq_n_u32): Remove.
(vqrdmlashq_n_s32): Remove.
(vqrdmlahq_n_s32): Remove.
(vqdmlashq_n_s32): Remove.
(vqdmlahq_n_s32): Remove.
(vmlasq_n_s32): Remove.
(vmlaq_n_s32): Remove.
(vmlaq_m_n_s8): Remove.
(vmlaq_m_n_s32): Remove.
(vmlaq_m_n_s16): Remove.
(vmlaq_m_n_u8): Remove.
(vmlaq_m_n_u32): Remove.
(vmlaq_m_n_u16): Remove.
(vmlasq_m_n_s8): Remove.
(vmlasq_m_n_s32): Remove.
(vmlasq_m_n_s16): Remove.
(vmlasq_m_n_u8): Remove.
(vmlasq_m_n_u32): Remove.
(vmlasq_m_n_u16): Remove.
(vqdmlashq_m_n_s8): Remove.
(vqdmlashq_m_n_s32): Remove.
(vqdmlashq_m_n_s16): Remove.
(vqdmlahq_m_n_s8): Remove.
(vqdmlahq_m_n_s32): Remove.
(vqdmlahq_m_n_s16): Remove.
(vqrdmlahq_m_n_s8): Remove.
(vqrdmlahq_m_n_s32): Remove.
(vqrdmlahq_m_n_s16): Remove.
(vqrdmlashq_m_n_s8): Remove.
(vqrdmlashq_m_n_s32): Remove.
(vqrdmlashq_m_n_s16): Remove.
(__arm_vmlasq_n_u8): Remove.
(__arm_vmlaq_n_u8): Remove.
(__arm_vqrdmlashq_n_s8): Remove.
(__arm_vqdmlashq_n_s8): Remove.
(__arm_vqrdmlahq_n_s8): Remove.
(__arm_vqdmlahq_n_s8): Remove.
(__arm_vmlasq_n_s8): Remove.
(__arm_vmlaq_n_s8): Remove.
(__arm_vmlasq_n_u16): Remove.
(__arm_vmlaq_n_u16): Remove.
(__arm_vqrdmlashq_n_s16): Remove.
(__arm_vqdmlashq_n_s16): Remove.
(__arm_vqrdmlahq_n_s16): Remove.
(__arm_vqdmlahq_n_s16): Remove.
(__arm_vmlasq_n_s16): Remove.
(__arm_vmlaq_n_s16): Remove.
(__arm_vmlasq_n_u32): Remove.
(__arm_vmlaq_n_u32): Remove.
(__arm_vqrdmlashq_n_s32): Remove.
(__arm_vqdmlashq_n_s32): Remove.
(__arm_vqrdmlahq_n_s32): Remove.
(__arm_vqdmlahq_n_s32): Remove.
(__arm_vmlasq_n_s32): Remove.
(__arm_vmlaq_n_s32): Remove.
(__arm_vmlaq_m_n_s8): Remove.
(__arm_vmlaq_m_n_s32): Remove.
(__arm_vmlaq_m_n_s16): Remove.
(__arm_vmlaq_m_n_u8): Remove.
(__arm_vmlaq_m_n_u32): Remove.
(__arm_vmlaq_m_n_u16): Remove.
(__arm_vmlasq_m_n_s8): Remove.
(__arm_vmlasq_m_n_s32): Remove.
(__arm_vmlasq_m_n_s16): Remove.
(__arm_vmlasq_m_n_u8): Remove.
(__arm_vmlasq_m_n_u32): Remove.
(__arm_vmlasq_m_n_u16): Remove.
(__arm_vqdmlahq_m_n_s8): Remove.
(__arm_vqdmlahq_m_n_s32): Remove.
(__arm_vqdmlahq_m_n_s16): Remove.
(__arm_vqrdmlahq_m_n_s8): Remove.
(__arm_vqrdmlahq_m_n_s32): Remove.
(__arm_vqrdmlahq_m_n_s16): Remove.
(__arm_vqrdmlashq_m_n_s8): Remove.
(__arm_vqrdmlashq_m_n_s32): Remove.
(__arm_vqrdmlashq_m_n_s16): Remove.
(__arm_vqdmlashq_m_n_s8): Remove.
(__arm_vqdmlashq_m_n_s16): Remove.
(__arm_vqdmlashq_m_n_s32): Remove.
(__arm_vmlasq): Remove.
(__arm_vmlaq): Remove.
(__arm_vqrdmlashq): Remove.
(__arm_vqdmlashq): Remove.
(__arm_vqrdmlahq): Remove.
(__arm_vqdmlahq): Remove.
(__arm_vmlaq_m): Remove.
(__arm_vmlasq_m): Remove.
(__arm_vqdmlahq_m): Remove.
(__arm_vqrdmlahq_m): Remove.
(__arm_vqrdmlashq_m): Remove.
(__arm_vqdmlashq

[PATCH 01/26] arm: [MVE intrinsics] add binary_widen_opt_n shape

2023-05-12 Thread Christophe Lyon via Gcc-patches
This patch adds the binary_widen_opt_n shape description.

2022-12-12  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_widen_opt_n): New.
* config/arm/arm-mve-builtins-shapes.h (binary_widen_opt_n): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 49 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 50 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index 5a299a272f5..ee4bc3f8ea4 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -1098,6 +1098,55 @@ struct binary_widen_n_def : public overloaded_base<0>
 };
 SHAPE (binary_widen_n)
 
+/* _t vfoo[_t0](_t, _t)
+   _t vfoo[_n_t0](_t, _t)
+
+   Example: vqdmullbq.
+   int32x4_t [__arm_]vqdmulltq[_n_s16](int16x8_t a, int16_t b)
+   int32x4_t [__arm_]vqdmulltq_m[_n_s16](int32x4_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
+   int32x4_t [__arm_]vqdmulltq[_s16](int16x8_t a, int16x8_t b)
+   int32x4_t [__arm_]vqdmulltq_m[_s16](int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)  */
+struct binary_widen_opt_n_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_none, preserve_user_namespace);
+build_all (b, "vw0,v0,v0", group, MODE_none, preserve_user_namespace);
+build_all (b, "vw0,v0,s0", group, MODE_n, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+unsigned int i, nargs;
+type_suffix_index type;
+if (!r.check_gp_argument (2, i, nargs)
+   || (type = r.infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES)
+  return error_mark_node;
+
+type_suffix_index wide_suffix
+  = find_type_suffix (type_suffixes[type].tclass,
+ type_suffixes[type].element_bits * 2);
+
+/* Skip last argument, may be scalar, will be checked below by
+   finish_opt_n_resolution.  */
+unsigned int last_arg = i--;
+for (; i > 0; i--)
+  if (!r.require_matching_vector_type (i, type))
+   return error_mark_node;
+
+/* Check the inactive argument has the wide type.  */
+if ((r.pred == PRED_m)
+   && (r.infer_vector_type (0) != wide_suffix))
+return r.report_no_such_form (type);
+
+return r.finish_opt_n_resolution (last_arg, 0, type);
+  }
+};
+SHAPE (binary_widen_opt_n)
+
 /* Shape for comparison operations that operate on
uniform types.
 
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index a28cd6a1547..07b12b4af68 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -53,6 +53,7 @@ namespace arm_mve
 extern const function_shape *const binary_rshift_narrow;
 extern const function_shape *const binary_rshift_narrow_unsigned;
 extern const function_shape *const binary_widen_n;
+extern const function_shape *const binary_widen_opt_n;
 extern const function_shape *const cmp;
 extern const function_shape *const create;
 extern const function_shape *const inherent;
-- 
2.34.1
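
The shape above pairs a result whose elements are twice as wide as the inputs with a second operand that may be either a vector or a scalar (the `_n` form). As a rough illustration, here is a portable scalar model of the arithmetic usually described for vqdmulltq on s16 inputs: a saturating doubling multiply of the top (odd-numbered) lanes. The function names `model_vqdmullt_s16` and `model_vqdmullt_n_s16` are hypothetical and are not part of the patch or of arm_mve.h; this is only a sketch of the semantics, not the compiler's implementation.

```c
#include <stdint.h>

/* Saturating doubling multiply of two 16-bit values into 32 bits.
   Only -32768 * -32768 overflows, and it saturates to INT32_MAX.  */
static int32_t
sat_doubling_mul (int16_t x, int16_t y)
{
  int64_t p = 2 * (int64_t) x * (int64_t) y;
  if (p > INT32_MAX)
    return INT32_MAX;
  if (p < INT32_MIN)
    return INT32_MIN;
  return (int32_t) p;
}

/* Vector form (hypothetical model): both operands are 8 x int16,
   the result is 4 x int32 built from the odd-numbered lanes.  */
void
model_vqdmullt_s16 (int32_t out[4], const int16_t a[8], const int16_t b[8])
{
  for (int i = 0; i < 4; i++)
    out[i] = sat_doubling_mul (a[2 * i + 1], b[2 * i + 1]);
}

/* _n form (hypothetical model): the second operand is one scalar
   applied to every selected lane.  */
void
model_vqdmullt_n_s16 (int32_t out[4], const int16_t a[8], int16_t b)
{
  for (int i = 0; i < 4; i++)
    out[i] = sat_doubling_mul (a[2 * i + 1], b);
}
```

The two entry points mirror the two `build_all` calls in the shape: one signature with a vector second argument and one with a scalar.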



[PATCH 09/26] arm: [MVE intrinsics] add binary_imm32 shape

2023-05-12 Thread Christophe Lyon via Gcc-patches
This patch adds the binary_imm32 shape description.

2022-12-12  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc (binary_imm32): New.
* config/arm/arm-mve-builtins-shapes.h (binary_imm32): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 27 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 28 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index 91540838e03..c2e138c12e1 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -489,6 +489,33 @@ struct binary_acca_int64_def : public overloaded_base<0>
 };
 SHAPE (binary_acca_int64)
 
+/* _t vfoo[_n_t0](_t, int32_t)
+
+   i.e. the shape for binary operations that operate on
+   a vector and an int32_t.
+
+   Example: vbrsrq.
+   int16x8_t [__arm_]vbrsrq[_n_s16](int16x8_t a, int32_t b)
+   int16x8_t [__arm_]vbrsrq_m[_n_s16](int16x8_t inactive, int16x8_t a, int32_t b, mve_pred16_t p)
+   int16x8_t [__arm_]vbrsrq_x[_n_s16](int16x8_t a, int32_t b, mve_pred16_t p)  */
+struct binary_imm32_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_n, preserve_user_namespace);
+build_all (b, "v0,v0,ss32", group, MODE_n, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+return r.resolve_uniform (1, 1);
+  }
+};
+SHAPE (binary_imm32)
+
 /* _t vfoo[_n_t0](_t, const int)
 
Shape for vector shift right operations that take a vector first
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index 6ae1443f26b..bba38194ce2 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -41,6 +41,7 @@ namespace arm_mve
 extern const function_shape *const binary_acc_int64;
 extern const function_shape *const binary_acca_int32;
 extern const function_shape *const binary_acca_int64;
+extern const function_shape *const binary_imm32;
 extern const function_shape *const binary_lshift_unsigned;
 extern const function_shape *const binary_maxamina;
 extern const function_shape *const binary_maxavminav;
-- 
2.34.1
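
In the binary_imm32 shape the overloaded name is resolved from the vector argument alone, while the second argument is always an int32_t regardless of the element type. The idea can be sketched in portable C11 with `_Generic`. Everything here is hypothetical: `vec_u16` and `vec_u8` are stand-ins for MVE vector types, `vfoo` is a placeholder name, and the lane-wise wrapping add is an arbitrary operation chosen only to make the example checkable. It is not how GCC's resolver works internally.

```c
#include <stdint.h>

/* Hypothetical stand-ins for two MVE vector types.  */
typedef struct { uint16_t lane[8]; } vec_u16;
typedef struct { uint8_t lane[16]; } vec_u8;

/* Placeholder operation: add the scalar to each lane, wrapping at
   the lane width.  The scalar is int32_t for every element type.  */
static vec_u16
vfoo_n_u16 (vec_u16 a, int32_t b)
{
  for (int i = 0; i < 8; i++)
    a.lane[i] = (uint16_t) (a.lane[i] + (uint32_t) b);
  return a;
}

static vec_u8
vfoo_n_u8 (vec_u8 a, int32_t b)
{
  for (int i = 0; i < 16; i++)
    a.lane[i] = (uint8_t) (a.lane[i] + (uint32_t) b);
  return a;
}

/* Overloaded entry point: only the type of the first argument
   selects the fully suffixed function.  */
#define vfoo(a, b) \
  _Generic ((a), vec_u16: vfoo_n_u16, vec_u8: vfoo_n_u8) ((a), (b))
```

A call such as `vfoo (v, 3)` picks `vfoo_n_u16` or `vfoo_n_u8` from the type of `v`, which is the same division of labour the shape describes: one type suffix inferred from the vector, a fixed 32-bit scalar.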



[PATCH 07/26] arm: [MVE intrinsics] factorize vqshluq

2023-05-12 Thread Christophe Lyon via Gcc-patches
Factorize vqshluq builtins so that they use parameterized names.

2022-12-12  Christophe Lyon  

gcc/
* config/arm/iterators.md (mve_insn): Add vqshlu.
(supf): Add VQSHLUQ_M_N_S, VQSHLUQ_N_S.
(VQSHLUQ_M_N, VQSHLUQ_N): New.
* config/arm/mve.md (mve_vqshluq_n_s): Change name into ...
(@mve_q_n_): ... this.
(mve_vqshluq_m_n_s): Change name into ...
(@mve_q_m_n_): ... this.
---
 gcc/config/arm/iterators.md |  6 ++
 gcc/config/arm/mve.md   | 12 ++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 116dd95fd88..d1d14488b56 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -1071,6 +1071,8 @@ (define_int_attr mve_insn [
 (VQSHLQ_N_S "vqshl") (VQSHLQ_N_U "vqshl")
 (VQSHLQ_R_S "vqshl") (VQSHLQ_R_U "vqshl")
 (VQSHLQ_S "vqshl") (VQSHLQ_U "vqshl")
+(VQSHLUQ_M_N_S "vqshlu")
+(VQSHLUQ_N_S "vqshlu")
 (VQSHRNBQ_M_N_S "vqshrnb") (VQSHRNBQ_M_N_U "vqshrnb")
 (VQSHRNBQ_N_S "vqshrnb") (VQSHRNBQ_N_U "vqshrnb")
 (VQSHRNTQ_M_N_S "vqshrnt") (VQSHRNTQ_M_N_U "vqshrnt")
@@ -2490,6 +2492,8 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
   (VRMLSLDAVHAXQ_P_S "s")
   (VRMLSLDAVHAXQ_S "s")
   (VRMLALDAVHAQ_P_S "s") (VRMLALDAVHAQ_P_U "u")
+  (VQSHLUQ_M_N_S "s")
+  (VQSHLUQ_N_S "s")
   ])
 
 ;; Both kinds of return insn.
@@ -2793,6 +2797,8 @@ (define_int_iterator VADCQ_M [VADCQ_M_U VADCQ_M_S])
 (define_int_iterator UQRSHLLQ [UQRSHLL_64 UQRSHLL_48])
 (define_int_iterator SQRSHRLQ [SQRSHRL_64 SQRSHRL_48])
 (define_int_iterator VSHLCQ_M [VSHLCQ_M_S VSHLCQ_M_U])
+(define_int_iterator VQSHLUQ_M_N [VQSHLUQ_M_N_S])
+(define_int_iterator VQSHLUQ_N [VQSHLUQ_N_S])
 
 ;; Define iterators for VCMLA operations
 (define_int_iterator VCMLA_OP [UNSPEC_VCMLA
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index b4faf7a4b18..7898361b859 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -1150,15 +1150,15 @@ (define_insn "@mve_q_r_"
 ;;
 ;; [vqshluq_n_s])
 ;;
-(define_insn "mve_vqshluq_n_s"
+(define_insn "@mve_q_n_"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
   (match_operand:SI 2 "" "")]
-VQSHLUQ_N_S))
+VQSHLUQ_N))
   ]
   "TARGET_HAVE_MVE"
-  "vqshlu.s%#\t%q0, %q1, %2"
+  ".%#\t%q0, %q1, %2"
   [(set_attr "type" "mve_move")
 ])
 
@@ -2653,17 +2653,17 @@ (define_insn "@mve_q_p_"
 ;;
 ;; [vqshluq_m_n_s])
 ;;
-(define_insn "mve_vqshluq_m_n_s"
+(define_insn "@mve_q_m_n_"
   [
(set (match_operand:MVE_2 0 "s_register_operand" "=w")
(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
   (match_operand:MVE_2 2 "s_register_operand" "w")
   (match_operand:SI 3 "" "")
   (match_operand: 4 "vpr_register_operand" "Up")]
-VQSHLUQ_M_N_S))
+VQSHLUQ_M_N))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\n\tvqshlut.s%#\t%q0, %q2, %3"
+  "vpst\n\tt.%#\t%q0, %q2, %3"
   [(set_attr "type" "mve_move")
(set_attr "length" "8")])
 
-- 
2.34.1
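
The factorization relies on two attribute lookups added in iterators.md: `mve_insn` maps both new unspecs to "vqshlu" and `supf` maps them to "s". The sketch below models those lookups as plain C functions and recombines them into the pattern names that the ChangeLog lists as the old names. The exact position of each attribute in the parameterized name is an assumption (the name templates are incomplete in the text above), the mode suffix is left out, and `pattern_name` is a hypothetical helper, not a GCC function.

```c
#include <stdio.h>

/* The two unspecs added to the iterators by the patch.  */
enum unspec { VQSHLUQ_N_S, VQSHLUQ_M_N_S };

/* Model of the mve_insn attribute: both unspecs map to "vqshlu".  */
static const char *
mve_insn (enum unspec u)
{
  (void) u;
  return "vqshlu";
}

/* Model of the supf attribute: both unspecs map to "s".  */
static const char *
supf (enum unspec u)
{
  (void) u;
  return "s";
}

/* Hypothetical helper: rebuild the pre-patch pattern name from the
   attribute values.  Returns what snprintf returns.  */
int
pattern_name (char *buf, size_t len, enum unspec u)
{
  const char *pred = (u == VQSHLUQ_M_N_S) ? "_m" : "";
  return snprintf (buf, len, "mve_%sq%s_n_%s", mve_insn (u), pred, supf (u));
}
```

With single-entry iterators such as VQSHLUQ_N the expansion yields exactly one instance per mode, so the generated patterns should match the ones that were previously written out by hand.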



[PATCH 06/26] arm: [MVE intrinsics] add binary_lshift_unsigned shape

2023-05-12 Thread Christophe Lyon via Gcc-patches
This patch adds the binary_lshift_unsigned shape description.

2022-12-12  Christophe Lyon  

gcc/
* config/arm/arm-mve-builtins-shapes.cc
(binary_lshift_unsigned): New.
* config/arm/arm-mve-builtins-shapes.h
(binary_lshift_unsigned): New.
---
 gcc/config/arm/arm-mve-builtins-shapes.cc | 58 +++
 gcc/config/arm/arm-mve-builtins-shapes.h  |  1 +
 2 files changed, 59 insertions(+)

diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc
index ee4bc3f8ea4..91540838e03 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.cc
+++ b/gcc/config/arm/arm-mve-builtins-shapes.cc
@@ -526,6 +526,64 @@ struct binary_rshift_def : public overloaded_base<0>
 SHAPE (binary_rshift)
 
 
+/* _t vfoo[_n_t0](_t, int)
+
+   Shape for vector saturating shift left operations that take a
+   vector of signed elements as first argument and an integer, and
+   produce a vector of unsigned elements.
+
+   Check that 'imm' is in the [0..#bits-1] range.
+
+   Example: vqshluq.
+   uint16x8_t [__arm_]vqshluq[_n_s16](int16x8_t a, const int imm)
+   uint16x8_t [__arm_]vqshluq_m[_n_s16](uint16x8_t inactive, int16x8_t a, const int imm, mve_pred16_t p)  */
+struct binary_lshift_unsigned_def : public overloaded_base<0>
+{
+  void
+  build (function_builder &b, const function_group_info &group,
+bool preserve_user_namespace) const override
+  {
+b.add_overloaded_functions (group, MODE_n, preserve_user_namespace);
+build_all (b, "vu0,vs0,ss32", group, MODE_n, preserve_user_namespace);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+unsigned int i, nargs;
+type_suffix_index type;
+if (!r.check_gp_argument (2, i, nargs)
+   || (type = r.infer_vector_type (i-1)) == NUM_TYPE_SUFFIXES)
+  return error_mark_node;
+
+if (r.pred == PRED_m)
+  {
+   /* With PRED_m, check that the 'inactive' first argument has
+  the expected unsigned type.  */
+   type_suffix_index return_type
+ = find_type_suffix (TYPE_unsigned, type_suffixes[type].element_bits);
+
+   if (!r.require_matching_vector_type (0, return_type))
+ return error_mark_node;
+  }
+
+for (; i < nargs; ++i)
+  if (!r.require_integer_immediate (i))
+   return error_mark_node;
+
+return r.resolve_to (r.mode_suffix_id, type);
+  }
+
+  bool
+  check (function_checker &c) const override
+  {
+unsigned int bits = c.type_suffix (0).element_bits;
+return c.require_immediate_range (1, 0, bits - 1);
+  }
+
+};
+SHAPE (binary_lshift_unsigned)
+
 /* _t vfoo[_t0](_t, _t)
 
i.e. binary operations that take a vector of unsigned elements as first argument and a
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h
index 07b12b4af68..6ae1443f26b 100644
--- a/gcc/config/arm/arm-mve-builtins-shapes.h
+++ b/gcc/config/arm/arm-mve-builtins-shapes.h
@@ -41,6 +41,7 @@ namespace arm_mve
 extern const function_shape *const binary_acc_int64;
 extern const function_shape *const binary_acca_int32;
 extern const function_shape *const binary_acca_int64;
+extern const function_shape *const binary_lshift_unsigned;
 extern const function_shape *const binary_maxamina;
 extern const function_shape *const binary_maxavminav;
 extern const function_shape *const binary_maxvminv;
-- 
2.34.1
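
Two properties of the binary_lshift_unsigned shape can be modelled per lane in portable C: the immediate must lie in [0..#bits-1], as enforced by `check`, and a signed input produces an unsigned result of the same width. The sketch below assumes the usual description of vqshluq, where negative inputs saturate to 0 and overflowing results saturate to the unsigned maximum. The names `imm_in_range` and `model_vqshlu_s16_lane` are hypothetical and do not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirror of the range rule in check (): the immediate must lie in
   [0, bits - 1] for an element of the given width.  */
bool
imm_in_range (int imm, unsigned int bits)
{
  return imm >= 0 && (unsigned int) imm <= bits - 1;
}

/* One-lane model of a saturating shift left from signed 16-bit to
   unsigned 16-bit.  The caller must pass imm in [0, 15].  */
uint16_t
model_vqshlu_s16_lane (int16_t a, int imm)
{
  if (a < 0)
    return 0;                       /* Negative inputs saturate to 0.  */
  /* a <= 32767 and imm <= 15, so the shift fits in 32 bits.  */
  uint32_t shifted = (uint32_t) a << imm;
  return shifted > UINT16_MAX ? UINT16_MAX : (uint16_t) shifted;
}
```

In the predicated `_m` form the `inactive` argument carries the unsigned type as well, which is why `resolve` looks up the unsigned suffix of the same element width before checking argument 0.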


