Re: [PATCH] niter: Fix up unused var warning [PR108457]

2023-01-20 Thread Richard Biener via Gcc-patches
On Thu, 19 Jan 2023, Jakub Jelinek wrote:

> Hi!
> 
> tree-ssa-loop-niter.cc (build_cltz_expr) gets an unused variable 'mode'
> warning on architectures where the C[LT]Z_DEFINED_VALUE_AT_ZERO
> macro(s) don't use their first argument.  That includes the
> defaults.h definitions:
> #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE)  0
> #define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE)  0
> Other uses of these macros avoid the problem by not introducing
> temporaries that are only used as arguments to those macros; the
> following patch does the same for consistency.  Plus some formatting
> fixes while at it.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2023-01-19  Jakub Jelinek  
> 
>   PR tree-optimization/108457
>   * tree-ssa-loop-niter.cc (build_cltz_expr): Use
>   SCALAR_INT_TYPE_MODE (utype) directly as C[LT]Z_DEFINED_VALUE_AT_ZERO
>   argument instead of a temporary.  Formatting fixes.
> 
> --- gcc/tree-ssa-loop-niter.cc.jj 2023-01-16 11:52:05.806885510 +0100
> +++ gcc/tree-ssa-loop-niter.cc    2023-01-19 13:10:42.872595970 +0100
> @@ -2252,16 +2252,16 @@ build_cltz_expr (tree src, bool leading,
>call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
>  integer_type_node, 1, src);
>int val;
> -  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (utype);
>int optab_defined_at_zero
> - = leading ? CLZ_DEFINED_VALUE_AT_ZERO (mode, val)
> -   : CTZ_DEFINED_VALUE_AT_ZERO (mode, val);
> + = (leading
> +? CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val)
> +: CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val));
>if (define_at_zero && !(optab_defined_at_zero == 2 && val == prec))
>   {
> tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
> build_zero_cst (TREE_TYPE (src)));
> -   call = fold_build3(COND_EXPR, integer_type_node, is_zero, call,
> -  build_int_cst (integer_type_node, prec));
> +   call = fold_build3 (COND_EXPR, integer_type_node, is_zero, call,
> +   build_int_cst (integer_type_node, prec));
>   }
>  }
>else if (prec == 2 * lli_prec)
> @@ -2275,22 +2275,22 @@ build_cltz_expr (tree src, bool leading,
>/* We count the zeroes in src1, and add the number in src2 when src1
>is 0.  */
>if (!leading)
> - std::swap(src1, src2);
> + std::swap (src1, src2);
>tree call1 = build_call_expr (fn, 1, src1);
>tree call2 = build_call_expr (fn, 1, src2);
>if (define_at_zero)
>   {
> tree is_zero2 = fold_build2 (NE_EXPR, boolean_type_node, src2,
>  build_zero_cst (TREE_TYPE (src2)));
> -   call2 = fold_build3(COND_EXPR, integer_type_node, is_zero2, call2,
> -   build_int_cst (integer_type_node, lli_prec));
> +   call2 = fold_build3 (COND_EXPR, integer_type_node, is_zero2, call2,
> +build_int_cst (integer_type_node, lli_prec));
>   }
>tree is_zero1 = fold_build2 (NE_EXPR, boolean_type_node, src1,
>  build_zero_cst (TREE_TYPE (src1)));
> -  call = fold_build3(COND_EXPR, integer_type_node, is_zero1, call1,
> -  fold_build2 (PLUS_EXPR, integer_type_node, call2,
> -   build_int_cst (integer_type_node,
> -  lli_prec)));
> +  call = fold_build3 (COND_EXPR, integer_type_node, is_zero1, call1,
> +   fold_build2 (PLUS_EXPR, integer_type_node, call2,
> +build_int_cst (integer_type_node,
> +   lli_prec)));
>  }
>else
>  {
> @@ -2302,14 +2302,13 @@ build_cltz_expr (tree src, bool leading,
>   {
> tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
> build_zero_cst (TREE_TYPE (src)));
> -   call = fold_build3(COND_EXPR, integer_type_node, is_zero, call,
> -  build_int_cst (integer_type_node, prec));
> +   call = fold_build3 (COND_EXPR, integer_type_node, is_zero, call,
> +   build_int_cst (integer_type_node, prec));
>   }
>  
>if (leading && prec < i_prec)
> - call = fold_build2(MINUS_EXPR, integer_type_node, call,
> -build_int_cst (integer_type_node,
> -   i_prec - prec));
> + call = fold_build2 (MINUS_EXPR, integer_type_node, call,
> + build_int_cst (integer_type_node, i_prec - prec));
>  }
>  
>return call;
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonal
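
For readers following the niter thread above, the warning can be reproduced
outside GCC with a few lines of C.  This is only a minimal sketch:
DEFINED_VALUE_AT_ZERO and mode_of are hypothetical stand-ins for the target
macro and for SCALAR_INT_TYPE_MODE, not the real GCC definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a target macro that ignores its first
   argument, like the defaults.h definitions quoted in the thread.  */
#define DEFINED_VALUE_AT_ZERO(MODE, VALUE) 0

/* Hypothetical stand-in for SCALAR_INT_TYPE_MODE.  */
int
mode_of (int precision)
{
  return precision / 8;
}

/* The variant that warns would read:

     int mode = mode_of (precision);
     int defined = DEFINED_VALUE_AT_ZERO (mode, val);

   After macro expansion 'mode' is never read, so -Wunused-variable
   fires.  Passing the expression directly leaves no temporary.  */
bool
zero_result_is_precision (int precision)
{
  int val = 0;
  int defined = DEFINED_VALUE_AT_ZERO (mode_of (precision), val);
  return defined == 2 && val == precision;
}
```

With the stand-in macro expanding to 0, zero_result_is_precision returns
false for every precision, which mirrors the defaults.h case where the
value at zero is undefined.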

[PATCH] RISC-V: Add TARGET_MIN_VLEN > 32 into iterators of EEW = 64 vector modes

2023-01-20 Thread juzhe . zhong
From: Ju-Zhe Zhong 

According to the RVV ISA, zve32x and zve32f do not support EEW == 64
vector types.  So it makes sense to add predicates to the iterators of
EEW = 64 vector modes.

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Add TARGET_MIN_VLEN > 32 predicates.

---
 gcc/config/riscv/vector-iterators.md | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 92c4bd0a6a3..1f29050622b 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -22,7 +22,8 @@
   VNx1QI VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN > 32")
   VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32")
   VNx1SI VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32")
-  VNx1DI VNx2DI VNx4DI (VNx8DI "TARGET_MIN_VLEN > 32")
+  (VNx1DI "TARGET_MIN_VLEN > 32") (VNx2DI "TARGET_MIN_VLEN > 32")
+  (VNx4DI "TARGET_MIN_VLEN > 32") (VNx8DI "TARGET_MIN_VLEN > 32")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
@@ -38,7 +39,8 @@
   (VNx4QI "TARGET_MIN_VLEN == 32") VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN > 32")
   (VNx2HI "TARGET_MIN_VLEN == 32") VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32")
   (VNx1SI "TARGET_MIN_VLEN == 32") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32")
-  VNx1DI VNx2DI VNx4DI (VNx8DI "TARGET_MIN_VLEN > 32")
+  (VNx1DI "TARGET_MIN_VLEN > 32") (VNx2DI "TARGET_MIN_VLEN > 32")
+  (VNx4DI "TARGET_MIN_VLEN > 32") (VNx8DI "TARGET_MIN_VLEN > 32")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN == 32")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
-- 
2.36.3



[PATCH 1/4] libbacktrace: change all pc related variables to uintptr_t

2023-01-20 Thread Björn Schäpers
From: Björn Schäpers 

It's the right thing to do, since the PC shouldn't go out of the
uintptr_t domain, and in backtrace_pcinfo the pc is uintptr_t.
This is a preparation for a following patch.

Tested on x86_64-linux and i686-w64-mingw32.
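
As an illustration of why uintptr_t is a natural fit here, the lookup over
"LOW <= PC < HIGH" ranges can be sketched as a bsearch whose key is the PC
itself.  This is a simplified sketch in the spirit of the search helpers in
dwarf.c, not the actual code; struct addr_range, range_search and find_range
are names made up for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified version of an address-range entry.
   The range is LOW <= PC < HIGH.  */
struct addr_range
{
  uintptr_t low;
  uintptr_t high;
  int id;
};

/* bsearch comparator: the key is a PC, the entry is a range.  */
static int
range_search (const void *vkey, const void *ventry)
{
  const uintptr_t *key = (const uintptr_t *) vkey;
  const struct addr_range *entry = (const struct addr_range *) ventry;

  if (*key < entry->low)
    return -1;
  else if (*key >= entry->high)
    return 1;
  else
    return 0;
}

/* Return the range containing PC, or NULL.  RANGES must be sorted by
   LOW and must not overlap.  */
const struct addr_range *
find_range (const struct addr_range *ranges, size_t count, uintptr_t pc)
{
  return (const struct addr_range *) bsearch (&pc, ranges, count,
					      sizeof ranges[0],
					      range_search);
}
```

Because the key and both bounds share one type, the comparisons need no
casts and cannot silently truncate on 32-bit hosts.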

-- >8 --

* dwarf.c: Change variables holding pc values from uint64_t to
uintptr_t.

Signed-off-by: Björn Schäpers 
---
 libbacktrace/dwarf.c | 44 ++--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 45cc9e77e40..0707ccddd3e 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -274,8 +274,8 @@ struct function
 struct function_addrs
 {
   /* Range is LOW <= PC < HIGH.  */
-  uint64_t low;
-  uint64_t high;
+  uintptr_t low;
+  uintptr_t high;
   /* Function for this address range.  */
   struct function *function;
 };
@@ -356,8 +356,8 @@ struct unit
 struct unit_addrs
 {
   /* Range is LOW <= PC < HIGH.  */
-  uint64_t low;
-  uint64_t high;
+  uintptr_t low;
+  uintptr_t high;
   /* Compilation unit for this address range.  */
   struct unit *u;
 };
@@ -1094,7 +1094,7 @@ resolve_addr_index (const struct dwarf_sections *dwarf_sections,
uint64_t addr_base, int addrsize, int is_bigendian,
uint64_t addr_index,
backtrace_error_callback error_callback, void *data,
-   uint64_t *address)
+   uintptr_t *address)
 {
   uint64_t offset;
   struct dwarf_buf addr_buf;
@@ -1194,7 +1194,7 @@ function_addrs_search (const void *vkey, const void *ventry)
 
 static int
 add_unit_addr (struct backtrace_state *state, void *rdata,
-  uint64_t lowpc, uint64_t highpc,
+  uintptr_t lowpc, uintptr_t highpc,
   backtrace_error_callback error_callback, void *data,
   void *pvec)
 {
@@ -1530,10 +1530,10 @@ lookup_abbrev (struct abbrevs *abbrevs, uint64_t code,
lowpc/highpc is set or ranges is set.  */
 
 struct pcrange {
-  uint64_t lowpc;  /* The low PC value.  */
+  uintptr_t lowpc; /* The low PC value.  */
   int have_lowpc;  /* Whether a low PC value was found.  */
   int lowpc_is_addr_index; /* Whether lowpc is in .debug_addr.  */
-  uint64_t highpc; /* The high PC value.  */
+  uintptr_t highpc;/* The high PC value.  */
   int have_highpc; /* Whether a high PC value was found.  */
   int highpc_is_relative;  /* Whether highpc is relative to lowpc.  */
   int highpc_is_addr_index;/* Whether highpc is in .debug_addr.  */
@@ -1613,16 +1613,16 @@ add_low_high_range (struct backtrace_state *state,
uintptr_t base_address, int is_bigendian,
struct unit *u, const struct pcrange *pcrange,
int (*add_range) (struct backtrace_state *state,
- void *rdata, uint64_t lowpc,
- uint64_t highpc,
+ void *rdata, uintptr_t lowpc,
+ uintptr_t highpc,
  backtrace_error_callback error_callback,
  void *data, void *vec),
void *rdata,
backtrace_error_callback error_callback, void *data,
void *vec)
 {
-  uint64_t lowpc;
-  uint64_t highpc;
+  uintptr_t lowpc;
+  uintptr_t highpc;
 
   lowpc = pcrange->lowpc;
   if (pcrange->lowpc_is_addr_index)
@@ -1663,7 +1663,7 @@ add_ranges_from_ranges (
 struct unit *u, uint64_t base,
 const struct pcrange *pcrange,
 int (*add_range) (struct backtrace_state *state, void *rdata,
- uint64_t lowpc, uint64_t highpc,
+ uintptr_t lowpc, uintptr_t highpc,
  backtrace_error_callback error_callback, void *data,
  void *vec),
 void *rdata,
@@ -1727,10 +1727,10 @@ add_ranges_from_rnglists (
 struct backtrace_state *state,
 const struct dwarf_sections *dwarf_sections,
 uintptr_t base_address, int is_bigendian,
-struct unit *u, uint64_t base,
+struct unit *u, uintptr_t base,
 const struct pcrange *pcrange,
 int (*add_range) (struct backtrace_state *state, void *rdata,
- uint64_t lowpc, uint64_t highpc,
+ uintptr_t lowpc, uintptr_t highpc,
  backtrace_error_callback error_callback, void *data,
  void *vec),
 void *rdata,
@@ -1796,8 +1796,8 @@ add_ranges_from_rnglists (
case DW_RLE_startx_endx:
  {
uint64_t index;
-   uint64_t low;
-   uint64_t high;
+   uintptr_t low;
+   uintptr_t high;
 
index = read_uleb128 (&rnglists_buf);
if (!resolve_addr_index (dwarf_sections, u->addr_base,
@@ -1819,8 +1819,8 @@ add_ranges_from_r

[PATCH 3/4] libbacktrace: work with aslr on windows

2023-01-20 Thread Björn Schäpers
From: Björn Schäpers 

Any underflow that might happen when computing the base address will be
countered by an overflow in dwarf.c.

Tested on x86_64-linux and i686-w64-mingw32.
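
The underflow/overflow remark relies on unsigned arithmetic wrapping modulo
2^N, which is well defined in C.  A minimal sketch of the idea follows;
load_bias and runtime_address are hypothetical helper names used only for
illustration, they do not exist in libbacktrace.

```c
#include <stdint.h>

/* Difference between where the module was actually loaded and the
   image base recorded in the PE header.  If the module was moved to a
   lower address the subtraction wraps around, which is fine for an
   unsigned type.  */
uintptr_t
load_bias (uintptr_t module_handle, uintptr_t image_base)
{
  return module_handle - image_base;
}

/* Adding the (possibly wrapped) bias to a link-time address wraps
   back and yields the real run-time address.  */
uintptr_t
runtime_address (uintptr_t link_time_address, uintptr_t bias)
{
  return link_time_address + bias;
}
```

For example, with an image base of 0x400000 and a module loaded at 0x10000,
the bias is a huge wrapped value, yet a link-time address of 0x401234 still
maps to 0x11234.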

-- >8 --

Fixes https://github.com/ianlancetaylor/libbacktrace/issues/89 and
https://github.com/ianlancetaylor/libbacktrace/issues/82.

* pecoff.c (coff_add): Set the base_address of the module, to
find the debug information on moved applications.

Signed-off-by: Björn Schäpers 
---
 libbacktrace/pecoff.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/libbacktrace/pecoff.c b/libbacktrace/pecoff.c
index 87b3c0cc647..296f1357b5f 100644
--- a/libbacktrace/pecoff.c
+++ b/libbacktrace/pecoff.c
@@ -39,6 +39,18 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #include "backtrace.h"
 #include "internal.h"
 
+#ifdef HAVE_WINDOWS_H
+#ifndef WIN32_LEAN_AND_MEAN
+#define WIN32_LEAN_AND_MEAN
+#endif
+
+#ifndef NOMINMAX
+#define NOMINMAX
+#endif
+
+#include <windows.h>
+#endif
+
 /* Coff file header.  */
 
 typedef struct {
@@ -610,6 +622,8 @@ coff_add (struct backtrace_state *state, int descriptor,
   int debug_view_valid;
   int is_64;
   uintptr_t image_base;
+  uintptr_t base_address = 0;
+  uintptr_t module_handle;
   struct dwarf_sections dwarf_sections;
 
   *found_sym = 0;
@@ -856,7 +870,12 @@ coff_add (struct backtrace_state *state, int descriptor,
  + (sections[i].offset - min_offset));
 }
 
-  if (!backtrace_dwarf_add (state, /* base_address */ 0, &dwarf_sections,
+#ifdef HAVE_WINDOWS_H
+module_handle = (uintptr_t) GetModuleHandleW (NULL);
+base_address = module_handle - image_base;
+#endif
+
+  if (!backtrace_dwarf_add (state, base_address, &dwarf_sections,
0, /* FIXME: is_bigendian */
NULL, /* altlink */
error_callback, data, fileline_fn,
-- 
2.38.1



[PATCH 2/4] libbacktrace: detect executable path on windows

2023-01-20 Thread Björn Schäpers
From: Björn Schäpers 

This is actually needed so that libstdc++'s <stacktrace> implementation
is able to work on Windows.

Tested on x86_64-linux and i686-w64-mingw32.
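
For reference, a standalone sketch of the lookup this patch adds.  It is not
the patch itself: it keys on _WIN32 instead of the configure-provided
HAVE_WINDOWS_H, the name executable_path is made up, and it additionally
treats a truncated result as a failure (GetModuleFileNameA returns the
buffer size when the path does not fit).  On other hosts it returns NULL,
like the fallback macro in the patch.

```c
#include <stddef.h>

#ifdef _WIN32
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif
#include <windows.h>
#endif

/* Store the path of the running executable in BUF (SIZE bytes) and
   return BUF, or return NULL if the path is unavailable.  */
const char *
executable_path (char *buf, size_t size)
{
#ifdef _WIN32
  DWORD len = GetModuleFileNameA (NULL, buf, (DWORD) size);

  /* 0 means failure; a return value equal to SIZE means the path was
     truncated to fit the buffer.  */
  if (len == 0 || len >= size)
    return NULL;
  return buf;
#else
  (void) buf;
  (void) size;
  return NULL;
#endif
}
```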

-- >8 --

* configure.ac: Add a check for windows.h.
* configure, config.h.in: Regenerate.
* fileline.c: Add windows_get_executable_path.
* fileline.c (fileline_initialize): Add a pass using
windows_get_executable_path.

Signed-off-by: Björn Schäpers 
---
 libbacktrace/config.h.in  |  3 +++
 libbacktrace/configure| 13 
 libbacktrace/configure.ac |  2 ++
 libbacktrace/fileline.c   | 43 ++-
 4 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/libbacktrace/config.h.in b/libbacktrace/config.h.in
index a21e2eaf525..355e820741b 100644
--- a/libbacktrace/config.h.in
+++ b/libbacktrace/config.h.in
@@ -100,6 +100,9 @@
 /* Define to 1 if you have the <unistd.h> header file. */
 #undef HAVE_UNISTD_H
 
+/* Define to 1 if you have the <windows.h> header file. */
+#undef HAVE_WINDOWS_H
+
 /* Define if -lz is available. */
 #undef HAVE_ZLIB
 
diff --git a/libbacktrace/configure b/libbacktrace/configure
index a5bd133f4e4..ef677423733 100755
--- a/libbacktrace/configure
+++ b/libbacktrace/configure
@@ -13403,6 +13403,19 @@ $as_echo "#define HAVE_LOADQUERY 1" >>confdefs.h
 
 fi
 
+for ac_header in windows.h
+do :
+  ac_fn_c_check_header_mongrel "$LINENO" "windows.h" "ac_cv_header_windows_h" "$ac_includes_default"
+if test "x$ac_cv_header_windows_h" = xyes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_WINDOWS_H 1
+_ACEOF
+
+fi
+
+done
+
+
 # Check for the fcntl function.
 if test -n "${with_target_subdir}"; then
case "${host}" in
diff --git a/libbacktrace/configure.ac b/libbacktrace/configure.ac
index 1daaa2f62d2..b5feb29bcdc 100644
--- a/libbacktrace/configure.ac
+++ b/libbacktrace/configure.ac
@@ -377,6 +377,8 @@ if test "$have_loadquery" = "yes"; then
   AC_DEFINE(HAVE_LOADQUERY, 1, [Define if AIX loadquery is available.])
 fi
 
+AC_CHECK_HEADERS(windows.h)
+
 # Check for the fcntl function.
 if test -n "${with_target_subdir}"; then
case "${host}" in
diff --git a/libbacktrace/fileline.c b/libbacktrace/fileline.c
index a40cd498114..73c2c8e8bc9 100644
--- a/libbacktrace/fileline.c
+++ b/libbacktrace/fileline.c
@@ -47,6 +47,18 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #include 
 #endif
 
+#ifdef HAVE_WINDOWS_H
+#ifndef WIN32_LEAN_AND_MEAN
+#define WIN32_LEAN_AND_MEAN
+#endif
+
+#ifndef NOMINMAX
+#define NOMINMAX
+#endif
+
+#include <windows.h>
+#endif
+
 #include "backtrace.h"
 #include "internal.h"
 
@@ -155,6 +167,28 @@ macho_get_executable_path (struct backtrace_state *state,
 
 #endif /* !defined (HAVE_MACH_O_DYLD_H) */
 
+#ifdef HAVE_WINDOWS_H
+
+static char *
+windows_get_executable_path (char *buf, backtrace_error_callback error_callback,
+void *data)
+{
+  if (GetModuleFileNameA (NULL, buf, MAX_PATH - 1) == 0)
+{
+  error_callback (data,
+ "could not get the filename of the current executable",
+ (int) GetLastError ());
+  return NULL;
+}
+  return buf;
+}
+
+#else /* !defined (HAVE_WINDOWS_H) */
+
+#define windows_get_executable_path(buf, error_callback, data) NULL
+
+#endif /* !defined (HAVE_WINDOWS_H) */
+
 /* Initialize the fileline information from the executable.  Returns 1
on success, 0 on failure.  */
 
@@ -168,7 +202,11 @@ fileline_initialize (struct backtrace_state *state,
   int called_error_callback;
   int descriptor;
   const char *filename;
+#ifdef HAVE_WINDOWS_H
+  char buf[MAX_PATH];
+#else
   char buf[64];
+#endif
 
   if (!state->threaded)
 failed = state->fileline_initialization_failed;
@@ -192,7 +230,7 @@ fileline_initialize (struct backtrace_state *state,
 
   descriptor = -1;
   called_error_callback = 0;
-  for (pass = 0; pass < 8; ++pass)
+  for (pass = 0; pass < 9; ++pass)
 {
   int does_not_exist;
 
@@ -224,6 +262,9 @@ fileline_initialize (struct backtrace_state *state,
case 7:
  filename = macho_get_executable_path (state, error_callback, data);
  break;
+   case 8:
+ filename = windows_get_executable_path (buf, error_callback, data);
+ break;
default:
  abort ();
}
-- 
2.38.1



[PATCH 4/4] libbacktrace: get debug information for loaded dlls

2023-01-20 Thread Björn Schäpers
From: Björn Schäpers 

Fixes https://github.com/ianlancetaylor/libbacktrace/issues/53, except
that libraries loaded after the backtrace_initialize are not handled.
But as far as I can see that's the same for elf.

Tested on x86_64-linux and i686-w64-mingw32.
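
The module enumeration follows a grow-and-retry pattern: EnumProcessModules
takes the buffer size in bytes and reports how many bytes the full list
needs, so the caller retries with a larger buffer when the first one was too
small.  Below is a portable sketch of that pattern only.  The enumerator
type, demo_enum and enumerate_all are hypothetical stand-ins invented for
the example; nothing here calls the Windows API, and malloc replaces
backtrace_alloc.

```c
#include <stdlib.h>

/* Hypothetical enumerator with EnumProcessModules-like semantics: it
   stores as many whole items as fit in CB bytes and reports the bytes
   needed for the complete list in *CB_NEEDED.  Returns 0 on error.  */
typedef int (*enum_fn) (void *ctx, int *items, size_t cb,
			size_t *cb_needed);

/* Demo enumerator: the list is 0 .. n-1, where n is *(size_t *) CTX.  */
int
demo_enum (void *ctx, int *items, size_t cb, size_t *cb_needed)
{
  size_t n = *(const size_t *) ctx;
  size_t fit = cb / sizeof *items;
  size_t i;

  for (i = 0; i < n && i < fit; ++i)
    items[i] = (int) i;
  *cb_needed = n * sizeof *items;
  return 1;
}

/* Return a malloc'ed array holding the complete list and store its
   length in *COUNT_OUT, or return NULL on failure.  */
int *
enumerate_all (enum_fn fn, void *ctx, size_t *count_out)
{
  size_t capacity = 4;

  for (;;)
    {
      size_t needed = 0;
      int *items = malloc (capacity * sizeof *items);

      if (items == NULL)
	return NULL;
      if (!fn (ctx, items, capacity * sizeof *items, &needed))
	{
	  free (items);
	  return NULL;
	}
      if (needed <= capacity * sizeof *items)
	{
	  *count_out = needed / sizeof *items;
	  return items;
	}
      /* Too small: retry with some slack in case the list grows
	 between the two calls.  */
      free (items);
      capacity = needed / sizeof *items + 2;
    }
}
```

A loop expresses the retry without the goto labels used in the patch; which
form reads better in pecoff.c is a matter of taste.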

-- >8 --

* pecoff.c (coff_add): New argument for the module handle of the
file, to get the base address.
* pecoff.c (backtrace_initialize): Iterate over loaded libraries
and call coff_add.

Signed-off-by: Björn Schäpers 
---
 libbacktrace/pecoff.c | 76 ---
 1 file changed, 72 insertions(+), 4 deletions(-)

diff --git a/libbacktrace/pecoff.c b/libbacktrace/pecoff.c
index 296f1357b5f..40395109e51 100644
--- a/libbacktrace/pecoff.c
+++ b/libbacktrace/pecoff.c
@@ -49,6 +49,7 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #endif
 
 #include <windows.h>
+#include <psapi.h>
 #endif
 
 /* Coff file header.  */
@@ -592,7 +593,8 @@ coff_syminfo (struct backtrace_state *state, uintptr_t addr,
 static int
 coff_add (struct backtrace_state *state, int descriptor,
  backtrace_error_callback error_callback, void *data,
- fileline *fileline_fn, int *found_sym, int *found_dwarf)
+ fileline *fileline_fn, int *found_sym, int *found_dwarf,
+ uintptr_t module_handle ATTRIBUTE_UNUSED)
 {
   struct backtrace_view fhdr_view;
   off_t fhdr_off;
@@ -623,7 +625,6 @@ coff_add (struct backtrace_state *state, int descriptor,
   int is_64;
   uintptr_t image_base;
   uintptr_t base_address = 0;
-  uintptr_t module_handle;
   struct dwarf_sections dwarf_sections;
 
   *found_sym = 0;
@@ -871,7 +872,6 @@ coff_add (struct backtrace_state *state, int descriptor,
 }
 
 #ifdef HAVE_WINDOWS_H
-module_handle = (uintptr_t) GetModuleHandleW (NULL);
 base_address = module_handle - image_base;
 #endif
 
@@ -914,12 +914,80 @@ backtrace_initialize (struct backtrace_state *state,
   int found_sym;
   int found_dwarf;
   fileline coff_fileline_fn;
+  uintptr_t module_handle = 0;
+
+#ifdef HAVE_WINDOWS_H
+  DWORD i;
+  DWORD module_count;
+  DWORD bytes_needed_for_modules;
+  HMODULE *modules;
+  char module_name[MAX_PATH];
+  int module_found_sym;
+  fileline module_fileline_fn;
+
+  module_handle = (uintptr_t) GetModuleHandleW (NULL);
+#endif
 
   ret = coff_add (state, descriptor, error_callback, data,
- &coff_fileline_fn, &found_sym, &found_dwarf);
+ &coff_fileline_fn, &found_sym, &found_dwarf, module_handle);
   if (!ret)
 return 0;
 
+#ifdef HAVE_WINDOWS_H
+  module_count = 1000;
+ alloc_modules:
+  modules = backtrace_alloc (state, module_count * sizeof(HMODULE),
+error_callback, data);
+  if (modules == NULL)
+goto skip_modules;
+  if (!EnumProcessModules (GetCurrentProcess (), modules,
+			   module_count * sizeof(HMODULE), &bytes_needed_for_modules))
+{
+  error_callback(data, "Could not enumerate process modules",
+(int) GetLastError ());
+  goto free_modules;
+}
+  if (bytes_needed_for_modules > module_count * sizeof(HMODULE))
+{
+  backtrace_free (state, modules, module_count * sizeof(HMODULE),
+ error_callback, data);
+  // Add an extra of 2, if some module is loaded in another thread.
+  module_count = bytes_needed_for_modules / sizeof(HMODULE) + 2;
+  modules = NULL;
+  goto alloc_modules;
+}
+
+  for (i = 0; i < bytes_needed_for_modules / sizeof(HMODULE); ++i)
+{
+  if (GetModuleFileNameA (modules[i], module_name, MAX_PATH - 1))
+   {
+ if (strcmp (filename, module_name) == 0)
+   continue;
+
+ module_handle = (uintptr_t) GetModuleHandleA (module_name);
+ if (module_handle == 0)
+   continue;
+
+ descriptor = backtrace_open (module_name, error_callback, data, NULL);
+ if (descriptor < 0)
+   continue;
+
+ coff_add (state, descriptor, error_callback, data,
+   &module_fileline_fn, &module_found_sym, &found_dwarf,
+   module_handle);
+ if (module_found_sym)
+   found_sym = 1;
+   }
+}
+
+ free_modules:
+  if (modules)
+backtrace_free(state, modules, module_count * sizeof(HMODULE),
+  error_callback, data);
+  modules = NULL;
+ skip_modules:
+#endif
+
   if (!state->threaded)
 {
   if (found_sym)
-- 
2.38.1



[PATCH] modula2/108144 - Fix multilib install of libgm2

2023-01-20 Thread Richard Biener via Gcc-patches
The following adjusts libgm2 to properly use the multilib build
infrastructure, thereby fixing the install with
--enable-version-specific-runtime-libs

In particular config-ml.pl needs to be applied to generated Makefiles
as documented in the manual and we have to avoid clobbering the
variables via make arguments.  The explicit install rules used different
ways to construct the multilib dir which isn't necessary and breaks
when MULTIDIR is now finally set correctly.  Instead use
$(toolexeclibdir).

This results in some dead variables in the Makefile.am (and there were
some before), I refrained from doing even more changes here.

Verified with an install with and without --enable-version-specific-runtime-libs
and checking the result.

OK?

Thanks,
Richard.

PR modula2/108144
libgm2/
* configure.ac: Apply config-ml.pl to the generated Makefiles.
Set multilib_arg, use AM_PROG_LIBTOOL.
* configure: Regenerate.
* Makefile.am (AM_MAKEFLAGS): Do not override MULTI* flags.
* Makefile.in: Regenerate.
* libm2cor/Makefile.am: Install to $(toolexeclibdir)$(M2LIBDIR)
rather than $(inst_libdir)/$(MULTIDIR)$(M2LIBDIR).
* libm2iso/Makefile.am: Likewise.
* libm2log/Makefile.am: Likewise.
* libm2min/Makefile.am: Likewise.
* libm2pim/Makefile.am: Likewise.
* libm2cor/Makefile.in: Regenerate.
* libm2iso/Makefile.in: Likewise.
* libm2log/Makefile.in: Likewise.
* libm2min/Makefile.in: Likewise.
* libm2pim/Makefile.in: Likewise.
---
 libgm2/Makefile.am  |  4 --
 libgm2/Makefile.in  |  4 --
 libgm2/configure| 79 -
 libgm2/configure.ac | 25 ++--
 libgm2/libm2cor/Makefile.am | 18 -
 libgm2/libm2cor/Makefile.in | 18 -
 libgm2/libm2iso/Makefile.am | 22 +--
 libgm2/libm2iso/Makefile.in | 22 +--
 libgm2/libm2log/Makefile.am | 18 -
 libgm2/libm2log/Makefile.in | 18 -
 libgm2/libm2min/Makefile.am | 18 -
 libgm2/libm2min/Makefile.in | 18 -
 libgm2/libm2pim/Makefile.am | 16 
 libgm2/libm2pim/Makefile.in | 16 
 14 files changed, 183 insertions(+), 113 deletions(-)

diff --git a/libgm2/Makefile.am b/libgm2/Makefile.am
index 524ea6c7124..0b593f6ff21 100644
--- a/libgm2/Makefile.am
+++ b/libgm2/Makefile.am
@@ -69,10 +69,6 @@ AM_MAKEFLAGS = \
"LIBCFLAGS_FOR_TARGET=$(LIBCFLAGS_FOR_TARGET)" \
"MAKE=$(MAKE)" \
"MAKEINFO=$(MAKEINFO) $(MAKEINFOFLAGS)" \
-"MULTIBUILDTOP=$(MULTIBUILDTOP)" \
-"MULTISUBDIR=$(MULTISUBDIR)" \
-"MULTIOSDIR=$(MULTIDIR)" \
-"MULTIFLAGS=$(MULTIFLAGS)" \
"PICFLAG=$(PICFLAG)" \
"PICFLAG_FOR_TARGET=$(PICFLAG_FOR_TARGET)" \
"SHELL=$(SHELL)" \
diff --git a/libgm2/Makefile.in b/libgm2/Makefile.in
index ac01eafe45c..da2ec7c2a09 100644
--- a/libgm2/Makefile.in
+++ b/libgm2/Makefile.in
@@ -368,10 +368,6 @@ AM_MAKEFLAGS = \
"LIBCFLAGS_FOR_TARGET=$(LIBCFLAGS_FOR_TARGET)" \
"MAKE=$(MAKE)" \
"MAKEINFO=$(MAKEINFO) $(MAKEINFOFLAGS)" \
-"MULTIBUILDTOP=$(MULTIBUILDTOP)" \
-"MULTISUBDIR=$(MULTISUBDIR)" \
-"MULTIOSDIR=$(MULTIDIR)" \
-"MULTIFLAGS=$(MULTIFLAGS)" \
"PICFLAG=$(PICFLAG)" \
"PICFLAG_FOR_TARGET=$(PICFLAG_FOR_TARGET)" \
"SHELL=$(SHELL)" \
diff --git a/libgm2/configure b/libgm2/configure
index 8b2c28cb163..8eb1bc81c66 100755
--- a/libgm2/configure
+++ b/libgm2/configure
@@ -6575,6 +6575,10 @@ fi
 
 
 
+enable_dlopen=yes
+
+
+
 case `pwd` in
   *\ * | *\*)
     { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: Libtool does not cope well with whitespace in \`pwd\`" >&5
@@ -9193,8 +9197,6 @@ done
 
 
 
-enable_dlopen=no
-
 
   enable_win32_dll=no
 
@@ -12704,7 +12706,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12707 "configure"
+#line 12709 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12810,7 +12812,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12813 "configure"
+#line 12815 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -16087,13 +16089,14 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
 # Only expand once:
 
 
-enable_dlopen=yes
-
 
 
-# AM_PROG_LIBTOOL
-
 
+if test "${multilib}" = "yes"; then
+  multilib_arg="--enable-multilib"
+else
+  multilib_arg=
+fi
 
 ac_fn_c_check_type "$LINENO" "struct timezone" "ac_cv_type_struct_timezone" "$ac_includes_default"
 if test "x$ac_cv_type_struct_timezone" = xyes; then :
@@ -19716,7 +19719,10 @@ fi
 
 
 
-ac_config_files="$ac_config_files Makefile libm2min/Makefile libm2pim/Makefile libm2iso/Makefile libm2cor/Makefile libm2log/Makefile"
+ac_config_files="$ac_config_files Makefile"
+
+
+ac_config_files="$ac_config_files libm2min/Makefile libm

Re: [PATCH] AArch64: Gate various crypto intrinsics availability based on features

2023-01-20 Thread Tejas Belagod via Gcc-patches



From: Kyrylo Tkachov 
Date: Tuesday, January 17, 2023 at 3:53 PM
To: Tejas Belagod , gcc-patches@gcc.gnu.org 

Cc: Richard Sandiford , Richard Earnshaw 

Subject: RE: [PATCH] AArch64: Gate various crypto intrinsics availability based 
on features
Hi Tejas,

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Tejas Belagod
> via Gcc-patches
> Sent: Monday, January 16, 2023 7:12 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Tejas Belagod ; Richard Sandiford
> ; Richard Earnshaw
> 
> Subject: [PATCH] AArch64: Gate various crypto intrinsics availability based on
> features
>
> The 64-bit variants of PMULL{2} and the AES instructions are available if
> FEAT_AES is implemented, according to the Arm ARM [1].  Similarly, FEAT_SHA1
> and FEAT_SHA256 enable the use of the SHA1 and SHA256 instruction variants.
> This patch fixes arm_neon.h to correctly reflect the feature availability
> based on '+aes' and '+sha2' as opposed to the ambiguous catch-all '+crypto'.
>
> [1] Section D17.2.61, C7.2.215
>
> 2022-01-11  Tejas Belagod  
>
> gcc/
>* config/aarch64/arm_neon.h: Gate AES and PMULL64 intrinsics
>under target feature +aes as opposed to +crypto. Gate SHA1 and SHA2
>intrinsics under +sha2.

The ChangeLog should list the intrinsics affected like
* config/aarch64/arm_neon.h (vmull_p64, vmull_high_p64): Gate under "nothing+aes"
For example.
Ok with a fixed ChangeLog.
Thanks,
Kyrill


Thanks for the review Kyrill, now pushed to master. OK to backport to 12?
Thanks,
Tejas.
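
For context on what is being gated: vmull_p64 is a 64x64 -> 128-bit
carry-less (polynomial) multiply.  The sketch below is illustrative only and
not part of the patch.  It uses the intrinsic when the compiler advertises
__ARM_FEATURE_AES (i.e. '+aes' is enabled) and a bit-by-bit fallback
otherwise; clmul64 is a made-up name, and the lane order in the intrinsic
path assumes a little-endian target.

```c
#include <stdint.h>

#if defined (__aarch64__) && defined (__ARM_FEATURE_AES)
#include <arm_neon.h>
#endif

/* Carry-less multiply of A and B; the 128-bit product is returned in
   *LO (bits 0..63) and *HI (bits 64..127).  */
void
clmul64 (uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
{
#if defined (__aarch64__) && defined (__ARM_FEATURE_AES)
  /* PMULL path, available when compiling with +aes.  */
  uint64x2_t r = vreinterpretq_u64_p128 (vmull_p64 ((poly64_t) a,
						      (poly64_t) b));
  *lo = vgetq_lane_u64 (r, 0);
  *hi = vgetq_lane_u64 (r, 1);
#else
  /* Portable fallback: XOR shifted copies of A for each set bit of B.  */
  uint64_t l = 0;
  uint64_t h = 0;
  int i;

  for (i = 0; i < 64; ++i)
    if ((b >> i) & 1)
      {
	l ^= a << i;
	if (i != 0)
	  h ^= a >> (64 - i);
      }
  *lo = l;
  *hi = h;
#endif
}
```

As a quick check, (x + 1) * (x + 1) = x^2 + 1 over GF(2), so
clmul64 (3, 3, ...) gives lo = 5 and hi = 0.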

>
> testsuite/
>
>* gcc.target/aarch64/acle/pmull64.c: New.
>* gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c: Replace '+crypto' with
>corresponding feature flag based on the intrinsic.
>* gcc.target/aarch64/aes-fuse-2.c: Likewise.
>* gcc.target/aarch64/aes_1.c: Likewise.
>* gcc.target/aarch64/aes_2.c: Likewise.
>* gcc.target/aarch64/aes_xor_combine.c: Likewise.
>* gcc.target/aarch64/sha1_1.c: Likewise.
>* gcc.target/aarch64/sha256_1.c: Likewise.
>* gcc.target/aarch64/target_attr_crypto_ice_1.c: Likewise.
> ---
>  gcc/config/aarch64/arm_neon.h | 35 ++-
>  .../gcc.target/aarch64/acle/pmull64.c | 14 
>  gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c |  4 +--
>  gcc/testsuite/gcc.target/aarch64/aes-fuse-2.c |  4 +--
>  gcc/testsuite/gcc.target/aarch64/aes_1.c  |  2 +-
>  gcc/testsuite/gcc.target/aarch64/aes_2.c  |  4 ++-
>  .../gcc.target/aarch64/aes_xor_combine.c  |  2 +-
>  gcc/testsuite/gcc.target/aarch64/sha1_1.c |  2 +-
>  gcc/testsuite/gcc.target/aarch64/sha256_1.c   |  2 +-
>  .../aarch64/target_attr_crypto_ice_1.c|  2 +-
>  10 files changed, 44 insertions(+), 27 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/pmull64.c
>
> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
> index cf6af728ca9..a795a387b38 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -7496,7 +7496,7 @@ vqrdmlshs_laneq_s32 (int32_t __a, int32_t __b, int32x4_t __c, const int __d)
>  #pragma GCC pop_options
>
>  #pragma GCC push_options
> -#pragma GCC target ("+nothing+crypto")
> +#pragma GCC target ("+nothing+aes")
>  /* vaes  */
>
>  __extension__ extern __inline uint8x16_t
> @@ -7526,6 +7526,22 @@ vaesimcq_u8 (uint8x16_t data)
>  {
>return __builtin_aarch64_crypto_aesimcv16qi_uu (data);
>  }
> +
> +__extension__ extern __inline poly128_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vmull_p64 (poly64_t __a, poly64_t __b)
> +{
> +  return
> +__builtin_aarch64_crypto_pmulldi_ppp (__a, __b);
> +}
> +
> +__extension__ extern __inline poly128_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vmull_high_p64 (poly64x2_t __a, poly64x2_t __b)
> +{
> +  return __builtin_aarch64_crypto_pmullv2di_ppp (__a, __b);
> +}
> +
>  #pragma GCC pop_options
>
>  /* vcage  */
> @@ -20772,7 +20788,7 @@ vrsrad_n_u64 (uint64_t __a, uint64_t __b, const int __c)
>  }
>
>  #pragma GCC push_options
> -#pragma GCC target ("+nothing+crypto")
> +#pragma GCC target ("+nothing+sha2")
>
>  /* vsha1  */
>
> @@ -20849,21 +20865,6 @@ vsha256su1q_u32 (uint32x4_t __tw0_3, uint32x4_t __w8_11, uint32x4_t __w12_15)
>   __w12_15);
>  }
>
> -__extension__ extern __inline poly128_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -vmull_p64 (poly64_t __a, poly64_t __b)
> -{
> -  return
> -__builtin_aarch64_crypto_pmulldi_ppp (__a, __b);
> -}
> -
> -__extension__ extern __inline poly128_t
> -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> -vmull_high_p64 (poly64x2_t __a, poly64x2_t __b)
> -{
> -  return __builtin_aarch64_crypto_pmullv2di_ppp (__a, __b);
> -}
> -
>  #pragma GCC pop_options
>
>  /* vshl */
> diff --git a/gcc/testsuite/gcc.target/

RE: [PATCH] AArch64: Gate various crypto intrinsics availability based on features

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches
Hi Tejas,

OK to backport, but can you please also check whether the older supported
releases need this?
OK for the other branches too, assuming testing passes.
Thanks,
Kyrill

From: Tejas Belagod 
Sent: Friday, January 20, 2023 12:57 PM
To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
Cc: Richard Sandiford ; Richard Earnshaw 

Subject: Re: [PATCH] AArch64: Gate various crypto intrinsics availability based 
on features



Thanks for the review Kyrill, now pushed to master. OK to backport to 12?
Thanks,
Tejas.

>
> testsuite/
>
>* gcc.target/aarch64/acle/pmull64.c: New.
>* gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c: Replace '+crypto'
> with
>corresponding feature flag based on the intrinsic.
>* gcc.target/aarch64/aes-fuse-2.c: Likewise.
>* gcc.target/aarch64/aes_1.c: Likewise.
>* gcc.target/aarch64/aes_2.c: Likewise.
>* gcc.target/aarch64/aes_xor_combine.c: Likewise.
>* gcc.target/aarch64/sha1_1.c: Likewise.
>* gcc.target/aarch64/sha256_1.c: Likewise.
>* gcc.target/aarch64/target_attr_crypto_ice_1.c: Likewise.
> ---
>  gcc/config/aarch64/arm_neon.h | 35 ++-
>  .../gcc.target/aarch64/acle/pmull64.c | 14 
>  gcc/testsuite/gcc.target/aarch64/aes-fuse-1.c |  4 +--
>  gcc/testsuite/gcc.target/aarch64/aes-fuse-2.c |  4 +--
>  gcc/testsuite/gcc.target/aarch64/aes_1.c  |  2 +-
>  gcc/testsuite/gcc.target/aarch64/aes_2.c  |  4 ++-
>  .../gcc.target/aarch64/aes_xor_combine.c  |  2 +-
>  gcc/testsuite/gcc.target/aarch64/sha1_1.c |  2 +-
>  gcc/testsuite/gcc.target/aarch64/sha256_1.c   |  2 +-
>  .../aarch64/target_attr_crypto_ice_1.c|  2 +-
>  10 files changed, 44 insertions(+), 27 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/pmull64.c
>
> diff --git a/gcc/config/aarch64/arm_neon.h
> b/gcc/config/aarch64/arm_neon.h
> index cf6af728ca9..a795a387b38 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -7496,7 +7496,7 @@ vqrdmlshs_laneq_s32 (int32_t __a, int32_t __b,
> int32x4_t __c, const int __d)
>  #pragma GCC pop_options
>
>  #pragma GCC push_options
> -#pragma GCC target ("+nothing+crypto")
> +#pragma GCC target ("+nothing+aes")
>  /* vaes  */
>
>  __extension__ extern __inline uint8x16_t
> @@ -7526,6 +7526,22 @@ vaesimcq_u8 (uint8x16_t data)
>  {
>return __builtin_aarch64_crypto_aesimcv16qi_uu (data);
>  }
> +
> +__extension__ extern __inline poly128_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vmull_p64 (poly64_t __a, poly64_t __b)
> +{
> +  return
> +__builtin_aarch64_crypto_pmulldi_ppp (__a, __b);
> +}
> +
> +__extension__ extern __inline poly128_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vmull_high_p64 (poly64x2_t __a, poly64x2_t __b)
> +{
> +  return __builtin_aarch64_crypto_pmullv2di_ppp (__a, __b);
> +}
> +
>  #pragma GCC pop_options
>
>  /* vcage  */
> @@ -20772,7 +20788,7 @@ vrsrad_n_u64 (uint64_t __a, uint64_t __b, const
> int __c)
>  }
>
>  #pragma GCC push_options
> -#pragma GCC target ("+nothing+crypto"

Re: [PATCH 3/4] libbacktrace: work with aslr on windows

2023-01-20 Thread Eli Zaretskii via Gcc-patches
> From: Björn Schäpers 
> Date: Fri, 20 Jan 2023 11:54:08 +0100
> 
> @@ -856,7 +870,12 @@ coff_add (struct backtrace_state *state, int descriptor,
> + (sections[i].offset - min_offset));
>  }
>  
> -  if (!backtrace_dwarf_add (state, /* base_address */ 0, &dwarf_sections,
> +#ifdef HAVE_WINDOWS_H
> +module_handle = (uintptr_t) GetModuleHandleW (NULL);
> +base_address = module_handle - image_base;
> +#endif
> +
> +  if (!backtrace_dwarf_add (state, base_address, &dwarf_sections,
>   0, /* FIXME: is_bigendian */
>   NULL, /* altlink */
>   error_callback, data, fileline_fn,

Why do you force using the "wide" APIs here?  Won't GetModuleHandle do
the job, whether it resolves to GetModuleHandleA or GetModuleHandleW?
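
For readers following the quoted hunk: the new base_address is the difference
between the address the image was actually loaded at and the preferred image
base recorded at link time.  A minimal sketch of that arithmetic in portable C,
assuming both addresses are already known; the names aslr_slide and relocate
are hypothetical and not part of libbacktrace:

```c
#include <stdint.h>

/* Offset between the run-time load address of a module and the image base
   it was linked for.  Unsigned arithmetic wraps around, so the result is
   still usable when the module is loaded below its preferred base.  */
static uintptr_t
aslr_slide (uintptr_t module_handle, uintptr_t image_base)
{
  return module_handle - image_base;
}

/* Translate a link-time address from the debug info into the address it
   has in the running process.  */
static uintptr_t
relocate (uintptr_t link_time_addr, uintptr_t slide)
{
  return link_time_addr + slide;
}
```

On Windows the first argument would come from the module handle of the running
executable, as in the hunk above; with a NULL module name no string is passed,
so the choice between the A and W variants should not matter for this call.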


libquadmath fix for 94756 and 87204

2023-01-20 Thread i.nixman--- via Gcc-patches

hello,

I have fixed:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94756
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87204

tested on i686-mingw-w64, x86_64-mingw-w64, and for i686 and x86_64 
linux.


could anyone check and apply please?



best!
diff --git a/libquadmath/printf/gmp-impl.h b/libquadmath/printf/gmp-impl.h
index 94d88efc57f..af0719321dc 100644
--- a/libquadmath/printf/gmp-impl.h
+++ b/libquadmath/printf/gmp-impl.h
@@ -33,15 +33,30 @@ MA 02111-1307, USA. */
 #define MAX(h,i) ((h) > (i) ? (h) : (i))
 #endif
 
-#define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
-#define BYTES_PER_MP_LIMB (BITS_PER_MP_LIMB / __CHAR_BIT__)
-typedef unsigned long int	mp_limb_t;
-typedef long int		mp_limb_signed_t;
+#ifdef __MINGW32__
+  /* for MinGW targets the Microsoft ABI requires that `long`
+ types will always have 32 bit, because of that we will use
+ `int32_t` for 32-bit builds and `int64_t` for 64-bit builds */
+# if __x86_64__
+   typedef  long long int mp_limb_signed_t;
+   typedef unsigned long long int mp_limb_t;
+#  define BITS_PER_MP_LIMB (__SIZEOF_LONG_LONG__ * __CHAR_BIT__)
+# else // !__x86_64__
+   typedef  long int mp_limb_signed_t;
+   typedef unsigned long int mp_limb_t;
+#  define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
+# endif // __x86_64__
+#else // !__MINGW32__
+  typedef  long int mp_limb_signed_t;
+  typedef unsigned long int mp_limb_t;
+# define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
+#endif // __MINGW32__
 
-typedef mp_limb_t * mp_ptr;
-typedef const mp_limb_t *	mp_srcptr;
-typedef long int		mp_size_t;
-typedef long int		mp_exp_t;
+#define BYTES_PER_MP_LIMB (BITS_PER_MP_LIMB / __CHAR_BIT__)
+typedef long int  mp_size_t;
+typedef long int  mp_exp_t;
+typedef mp_limb_t*mp_ptr;
+typedef const mp_limb_t  *mp_srcptr;
 
 /* Define stuff for longlong.h.  */
 typedef unsigned int UQItype	__attribute__ ((mode (QI)));
diff --git a/libquadmath/strtod/strtod_l.c b/libquadmath/strtod/strtod_l.c
index 0b0e85a3cf7..6790124e6fc 100644
--- a/libquadmath/strtod/strtod_l.c
+++ b/libquadmath/strtod/strtod_l.c
@@ -200,7 +200,7 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, int negative,
 
 	  round_limb = retval[RETURN_LIMB_SIZE - 1];
 	  round_bit = (MANT_DIG - 1) % BITS_PER_MP_LIMB;
-	  for (i = 0; i < RETURN_LIMB_SIZE; ++i)
+	  for (i = 0; i < RETURN_LIMB_SIZE - 1; ++i)
 	more_bits |= retval[i] != 0;
 	  MPN_ZERO (retval, RETURN_LIMB_SIZE);
 	}
@@ -215,9 +215,14 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, int negative,
 	  more_bits |= ((round_limb & ((((mp_limb_t) 1) << round_bit) - 1))
 			!= 0);
 
-	  (void) mpn_rshift (retval, &retval[shift / BITS_PER_MP_LIMB],
-			 RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB),
-			 shift % BITS_PER_MP_LIMB);
+/* mpn_rshift requires 0 < shift < BITS_PER_MP_LIMB.  */
+if ((shift % BITS_PER_MP_LIMB) != 0)
+  (void) mpn_rshift (retval, &retval[shift / BITS_PER_MP_LIMB],
+  RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB),
+  shift % BITS_PER_MP_LIMB);
+else
+  for (i = 0; i < RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB); i++)
+retval[i] = retval[i + (shift / BITS_PER_MP_LIMB)];
 	  MPN_ZERO (&retval[RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB)],
 		shift / BITS_PER_MP_LIMB);
 	}
@@ -276,7 +281,7 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, int negative,
 	}
 }
 
-  if (exponent > MAX_EXP)
+  if (exponent >= MAX_EXP)
 goto overflow;
 
 #ifdef HAVE_FENV_H
@@ -308,7 +313,7 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, int negative,
 }
 #endif
 
-  if (exponent > MAX_EXP)
+  if (exponent >= MAX_EXP)
   overflow:
 return overflow_value (negative);
 
@@ -688,7 +693,7 @@ STRTOF_INTERNAL (nptr, endptr, group)
 	  if (endptr != NULL)
 	*endptr = (STRING_TYPE *) cp;
 
-	  return retval;
+	  return negative ? -retval : retval;
 	}
 
   /* It is really a text we do not recognize.  */
@@ -1193,7 +1198,7 @@ STRTOF_INTERNAL (nptr, endptr, group)
   if (__builtin_expect (exponent > MAX_10_EXP + 1 - (intmax_t) int_no, 0))
 return overflow_value (negative);
 
-  if (__builtin_expect (exponent < MIN_10_EXP - (DIG + 1), 0))
+  if (__builtin_expect (exponent < MIN_10_EXP - (DIG + 2), 0))
 return underflow_value (negative);
 
   if (int_no > 0)
@@ -1360,7 +1365,7 @@ STRTOF_INTERNAL (nptr, endptr, group)
 
 assert (dig_no > int_no
 	&& exponent <= 0
-	&& exponent >= MIN_10_EXP - (DIG + 1));
+	&& exponent >= MIN_10_EXP - (DIG + 2));
 
 /* We need to compute MANT_DIG - BITS fractional bits that lie
within the mantissa of the result, the following bit for
@@ -1651,8 +1656,8 @@ STRTOF_INTERNAL (nptr, endptr, group)
 	  d1 = den[densize - 2];
 
 	  /* The division does not work if the upper limb of the two-limb
-	 nume

Re: libquadmath fix for 94756 and 87204

2023-01-20 Thread i.nixman--- via Gcc-patches

Updated patch.
Only the comment has been corrected.
diff --git a/libquadmath/printf/gmp-impl.h b/libquadmath/printf/gmp-impl.h
index 94d88efc57f..af0719321dc 100644
--- a/libquadmath/printf/gmp-impl.h
+++ b/libquadmath/printf/gmp-impl.h
@@ -33,15 +33,30 @@ MA 02111-1307, USA. */
 #define MAX(h,i) ((h) > (i) ? (h) : (i))
 #endif
 
-#define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
-#define BYTES_PER_MP_LIMB (BITS_PER_MP_LIMB / __CHAR_BIT__)
-typedef unsigned long int	mp_limb_t;
-typedef long int		mp_limb_signed_t;
+#ifdef __MINGW32__
+  /* for MinGW targets the Microsoft ABI requires `long`
+ type will always have 32 bit, because of that we will use
+ `long` for 32-bit builds and `long long` for 64-bit builds */
+# if __x86_64__
+   typedef  long long int mp_limb_signed_t;
+   typedef unsigned long long int mp_limb_t;
+#  define BITS_PER_MP_LIMB (__SIZEOF_LONG_LONG__ * __CHAR_BIT__)
+# else // !__x86_64__
+   typedef  long int mp_limb_signed_t;
+   typedef unsigned long int mp_limb_t;
+#  define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
+# endif // __x86_64__
+#else // !__MINGW32__
+  typedef  long int mp_limb_signed_t;
+  typedef unsigned long int mp_limb_t;
+# define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
+#endif // __MINGW32__
 
-typedef mp_limb_t * mp_ptr;
-typedef const mp_limb_t *	mp_srcptr;
-typedef long int		mp_size_t;
-typedef long int		mp_exp_t;
+#define BYTES_PER_MP_LIMB (BITS_PER_MP_LIMB / __CHAR_BIT__)
+typedef long int  mp_size_t;
+typedef long int  mp_exp_t;
+typedef mp_limb_t*mp_ptr;
+typedef const mp_limb_t  *mp_srcptr;
 
 /* Define stuff for longlong.h.  */
 typedef unsigned int UQItype	__attribute__ ((mode (QI)));
diff --git a/libquadmath/strtod/strtod_l.c b/libquadmath/strtod/strtod_l.c
index 0b0e85a3cf7..6790124e6fc 100644
--- a/libquadmath/strtod/strtod_l.c
+++ b/libquadmath/strtod/strtod_l.c
@@ -200,7 +200,7 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, int negative,
 
 	  round_limb = retval[RETURN_LIMB_SIZE - 1];
 	  round_bit = (MANT_DIG - 1) % BITS_PER_MP_LIMB;
-	  for (i = 0; i < RETURN_LIMB_SIZE; ++i)
+	  for (i = 0; i < RETURN_LIMB_SIZE - 1; ++i)
 	more_bits |= retval[i] != 0;
 	  MPN_ZERO (retval, RETURN_LIMB_SIZE);
 	}
@@ -215,9 +215,14 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, int negative,
 	  more_bits |= ((round_limb & ((((mp_limb_t) 1) << round_bit) - 1))
 			!= 0);
 
-	  (void) mpn_rshift (retval, &retval[shift / BITS_PER_MP_LIMB],
-			 RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB),
-			 shift % BITS_PER_MP_LIMB);
+/* mpn_rshift requires 0 < shift < BITS_PER_MP_LIMB.  */
+if ((shift % BITS_PER_MP_LIMB) != 0)
+  (void) mpn_rshift (retval, &retval[shift / BITS_PER_MP_LIMB],
+  RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB),
+  shift % BITS_PER_MP_LIMB);
+else
+  for (i = 0; i < RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB); i++)
+retval[i] = retval[i + (shift / BITS_PER_MP_LIMB)];
 	  MPN_ZERO (&retval[RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB)],
 		shift / BITS_PER_MP_LIMB);
 	}
@@ -276,7 +281,7 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, int negative,
 	}
 }
 
-  if (exponent > MAX_EXP)
+  if (exponent >= MAX_EXP)
 goto overflow;
 
 #ifdef HAVE_FENV_H
@@ -308,7 +313,7 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, int negative,
 }
 #endif
 
-  if (exponent > MAX_EXP)
+  if (exponent >= MAX_EXP)
   overflow:
 return overflow_value (negative);
 
@@ -688,7 +693,7 @@ STRTOF_INTERNAL (nptr, endptr, group)
 	  if (endptr != NULL)
 	*endptr = (STRING_TYPE *) cp;
 
-	  return retval;
+	  return negative ? -retval : retval;
 	}
 
   /* It is really a text we do not recognize.  */
@@ -1193,7 +1198,7 @@ STRTOF_INTERNAL (nptr, endptr, group)
   if (__builtin_expect (exponent > MAX_10_EXP + 1 - (intmax_t) int_no, 0))
 return overflow_value (negative);
 
-  if (__builtin_expect (exponent < MIN_10_EXP - (DIG + 1), 0))
+  if (__builtin_expect (exponent < MIN_10_EXP - (DIG + 2), 0))
 return underflow_value (negative);
 
   if (int_no > 0)
@@ -1360,7 +1365,7 @@ STRTOF_INTERNAL (nptr, endptr, group)
 
 assert (dig_no > int_no
 	&& exponent <= 0
-	&& exponent >= MIN_10_EXP - (DIG + 1));
+	&& exponent >= MIN_10_EXP - (DIG + 2));
 
 /* We need to compute MANT_DIG - BITS fractional bits that lie
within the mantissa of the result, the following bit for
@@ -1651,8 +1656,8 @@ STRTOF_INTERNAL (nptr, endptr, group)
 	  d1 = den[densize - 2];
 
 	  /* The division does not work if the upper limb of the two-limb
-	 numerator is greater than the denominator.  */
-	  if (mpn_cmp (num, &den[densize - numsize], numsize) > 0)
+	 numerator is greater or equal to than the denominator.  */
+	  if (mpn_cmp (num, &den[densize 

Re: [GCC][PATCH 13/15, v6] arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature.

2023-01-20 Thread Richard Earnshaw via Gcc-patches




On 18/01/2023 17:18, Srinath Parvathaneni via Gcc-patches wrote:

Hello,

This patch teaches the DWARF support in GCC about the RA_AUTH_CODE pseudo
hard-register and updates the ".save", ".cfi_register", ".cfi_offset" and
".cfi_restore" directives accordingly.
It also adds support for emitting the ".pacspval" directive when a
"pac ip, lr, sp" instruction is generated in the assembly.

The RA_AUTH_CODE register number is 107 and its DWARF register number is 143.

With this patch applied on top of the PACBTI series posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599658.html, compiling
the following test.c with the "-march=armv8.1-m.main+mve+pacbti
-mbranch-protection=pac-ret -mthumb -mfloat-abi=hard
-fasynchronous-unwind-tables -g -O0 -S" command-line options produces the
assembly output shown below:

$cat test.c

void fun1(int a);
void fun(int a,...)
{
   fun1(a);
}

int main()
{
   fun (10);
   return 0;
}

$ arm-none-eabi-gcc -march=armv8.1-m.main+mve+pacbti \
-mbranch-protection=pac-ret -mthumb -mfloat-abi=hard \
-fasynchronous-unwind-tables -g -O0 -S test.c

Assembly output:
...
fun:
...
 .pacspval
 pac ip, lr, sp
 .cfi_register 143, 12
 push    {r3, r7, ip, lr}
 .save {r3, r7, ra_auth_code, lr}
...
 .cfi_offset 143, -24
...
 .cfi_restore 143
...
 aut ip, lr, sp
 bx  lr
...
main:
...
 .pacspval
 pac ip, lr, sp
 .cfi_register 143, 12
 push    {r3, r7, ip, lr}
 .save {r3, r7, ra_auth_code, lr}
...
 .cfi_offset 143, -8
...
 .cfi_restore 143
...
 aut ip, lr, sp
 bx  lr
...

Regression tested on arm-none-eabi target and found no regressions.

Ok for master?

Regards,
Srinath.

2023-01-18  Srinath Parvathaneni  

 * config/arm/aout.h (ra_auth_code): Add entry in enum.
 (emit_multi_reg_push): Add RA_AUTH_CODE register to
 dwarf frame expression.
 (arm_emit_multi_reg_pop): Restore RA_AUTH_CODE register.
 (arm_expand_prologue): Update frame related information and reg notes
 for pac/pacbti insn.
 (arm_regno_class): Check for pac pseudo register.
 (arm_dbx_register_number): Assign ra_auth_code register number in 
dwarf.
 (arm_init_machine_status): Set pacspval_needed to zero.
 (arm_debugger_regno): Check for PAC register.
 (arm_unwind_emit_sequence): Print .save directive with ra_auth_code
 register.
 (arm_unwind_emit_set): Add entry for IP_REGNUM in switch case.
 (arm_unwind_emit): Update REG_CFA_REGISTER case.
 * config/arm/arm.h (FIRST_PSEUDO_REGISTER): Modify.
 (DWARF_PAC_REGNUM): Define.
 (IS_PAC_REGNUM): Likewise.
 (enum reg_class): Add PAC_REG entry.
 (machine_function): Add pacbti_needed state to structure.
 * config/arm/arm.md (RA_AUTH_CODE): Define.

gcc/testsuite/ChangeLog:

2023-01-18  Srinath Parvathaneni  

 * g++.target/arm/pac-1.C: New test.
 * gcc.target/arm/pac-15.c: Likewise.


OK.

R.


[og12] Fix 'libgomp.c/simd-math-1.c' configuration (was: [OG12] [committed] amdgcn: Enable SIMD vectorization of math library functions)

2023-01-20 Thread Thomas Schwinge
Hi!

On 2022-11-02T00:50:40+, Kwok Cheung Yeung  wrote:
> I have committed the following patches onto the devel/omp/gcc-12
> development branch:
>
> 863579c4e30 amdgcn: Enable SIMD vectorization of math functions

I've pushed to devel/omp/gcc-12
commit e7d4bcb974915bfe95be6c385641fc66a4201581
"Fix 'libgomp.c/simd-math-1.c' configuration", see attached.


Grüße
 Thomas


> bd9a6106b95 amdgcn: Add SIMD versions of math routines to libgcc
> d3a2a1cc424 amdgcn: Add builtins for vector floor/floorf
> a3c04a367a9 amdgcn: Fix expansion of builtin for vector fabs operation
>
> These patches implement a vectorized version of most of the C math
> library for AMD GCN. These routines will be used when math functions are
> used in auto-vectorized code.
>
> Note that -fno-math-errno must be specified on the command line in most
> cases before the compiler will consider using these functions.
>
> Vectors smaller than the native 64 element ones are also supported (by
> masking off the unused lanes), which can be useful for SLP vectorized code.
>
> Kwok Yeung


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
>From e7d4bcb974915bfe95be6c385641fc66a4201581 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Sat, 14 Jan 2023 10:28:09 +0100
Subject: [PATCH] Fix 'libgomp.c/simd-math-1.c' configuration

If nvptx offloading is configured in addition to GCN, we see:

FAIL: libgomp.c/simd-math-1.c (test for excess errors)
UNRESOLVED: libgomp.c/simd-math-1.c compilation failed to produce executable

x86_64-pc-linux-gnu-accel-nvptx-none-gcc: error: unrecognized command-line option '-mstack-size=300'

Thus, restrict that option to GCN offloading compilation, and on the other
hand, there's no reason to skip this test for non-GCN offloading execution:
even if not SIMD-vectorized there, we still benefit from correctness testing.

	libgomp/
	* testsuite/libgomp.c/simd-math-1.c: Fix configuration.
---
 libgomp/ChangeLog.omp | 4 
 libgomp/testsuite/libgomp.c/simd-math-1.c | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index 629efbc5832..23e93495b62 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,7 @@
+2023-01-20  Thomas Schwinge  
+
+	* testsuite/libgomp.c/simd-math-1.c: Fix configuration.
+
 2023-01-19  Tobias Burnus  
 
 	Backported from master:
diff --git a/libgomp/testsuite/libgomp.c/simd-math-1.c b/libgomp/testsuite/libgomp.c/simd-math-1.c
index caf032a77ae..1ebdfeb 100644
--- a/libgomp/testsuite/libgomp.c/simd-math-1.c
+++ b/libgomp/testsuite/libgomp.c/simd-math-1.c
@@ -2,9 +2,9 @@
sufficiently close) results as their scalar equivalents.  */
 
 /* { dg-do run } */
-/* { dg-skip-if "AMD GCN only" { ! amdgcn_offloading_enabled } } */
 /* { dg-options "-O2 -ftree-vectorize -fno-math-errno" } */
-/* { dg-additional-options "-foffload=-mstack-size=300 -foffload=-lm" } */
+/* { dg-additional-options -foffload-options=amdgcn-amdhsa=-mstack-size=300 } */
+/* { dg-additional-options -foffload-options=-lm } */
 
 #undef PRINT_RESULT
 #define VERBOSE 0
-- 
2.25.1



Re: libquadmath fix for 94756 and 87204

2023-01-20 Thread Jakub Jelinek via Gcc-patches
On Fri, Jan 20, 2023 at 02:06:01PM +, i.nixman--- via Gcc-patches wrote:
> I have fixed:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94756
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87204
> 
> tested on i686-mingw-w64, x86_64-mingw-w64, and for i686 and x86_64 linux.
> 
> could anyone check and apply please?
> 
> 
> 
> best!

> diff --git a/libquadmath/printf/gmp-impl.h b/libquadmath/printf/gmp-impl.h
> index 94d88efc57f..af0719321dc 100644
> --- a/libquadmath/printf/gmp-impl.h
> +++ b/libquadmath/printf/gmp-impl.h
> @@ -33,15 +33,30 @@ MA 02111-1307, USA. */
>  #define MAX(h,i) ((h) > (i) ? (h) : (i))
>  #endif
>  
> -#define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
> -#define BYTES_PER_MP_LIMB (BITS_PER_MP_LIMB / __CHAR_BIT__)
> -typedef unsigned long int mp_limb_t;
> -typedef long int mp_limb_signed_t;
> +#ifdef __MINGW32__
> +  /* for MinGW targets the Microsoft ABI requires that `long`
> + types will always have 32 bit, because of that we will use
> + `int32_t` for 32-bit builds and `int64_t` for 64-bit builds */
> +# if __x86_64__
> +   typedef  long long int mp_limb_signed_t;
> +   typedef unsigned long long int mp_limb_t;
> +#  define BITS_PER_MP_LIMB (__SIZEOF_LONG_LONG__ * __CHAR_BIT__)
> +# else // !__x86_64__
> +   typedef  long int mp_limb_signed_t;
> +   typedef unsigned long int mp_limb_t;
> +#  define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
> +# endif // __x86_64__
> +#else // !__MINGW32__
> +  typedef  long int mp_limb_signed_t;
> +  typedef unsigned long int mp_limb_t;
> +# define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
> +#endif // __MINGW32__

The above looks way too complicated for what it does.
If all you want is to change mp_limb* to be long long for mingw 64-bit,
then just do:
#if defined(__MINGW32__) && defined(__x86_64__)
typedef unsigned long long int  mp_limb_t;
typedef long long int           mp_limb_signed_t;
#define BITS_PER_MP_LIMB (__SIZEOF_LONG_LONG__ * __CHAR_BIT__)
#else
typedef unsigned long int       mp_limb_t;
typedef long int                mp_limb_signed_t;
#define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
#endif
and nothing else.  Or port the changes from glibc stdlib/gmp.h
etc. where there are macros to control what type is used for mp_limb_t etc.
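
The simpler variant can be checked in isolation.  The following is only a
sketch, assuming a GCC-compatible compiler (for the __SIZEOF_*__ and
__CHAR_BIT__ macros) and C11 (for _Static_assert); the helper limb_bit is
hypothetical and only there to exercise the limb width:

```c
#include <assert.h>
#include <limits.h>

/* On 64-bit MinGW `long` is only 32 bits wide, so use `long long` there to
   get a 64-bit limb; everywhere else keep `long`.  */
#if defined(__MINGW32__) && defined(__x86_64__)
typedef unsigned long long int  mp_limb_t;
typedef long long int           mp_limb_signed_t;
#define BITS_PER_MP_LIMB (__SIZEOF_LONG_LONG__ * __CHAR_BIT__)
#else
typedef unsigned long int       mp_limb_t;
typedef long int                mp_limb_signed_t;
#define BITS_PER_MP_LIMB (__SIZEOF_LONG__ * __CHAR_BIT__)
#endif
#define BYTES_PER_MP_LIMB (BITS_PER_MP_LIMB / __CHAR_BIT__)

/* Compile-time checks that the macro agrees with the chosen types.  */
_Static_assert (sizeof (mp_limb_t) * CHAR_BIT == BITS_PER_MP_LIMB,
                "BITS_PER_MP_LIMB must match the width of mp_limb_t");
_Static_assert (sizeof (mp_limb_t) == sizeof (mp_limb_signed_t),
                "signed and unsigned limb types must have the same size");

/* Hypothetical helper: a limb with only bit N set.  */
static mp_limb_t
limb_bit (unsigned int n)
{
  assert (n < BITS_PER_MP_LIMB);
  return (mp_limb_t) 1 << n;
}
```

The static assertions fail the build if the macro and the typedefs ever get
out of sync, which is the kind of mismatch the more elaborate nested #if
version makes easier to introduce.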

> -typedef mp_limb_t * mp_ptr;
> -typedef const mp_limb_t *mp_srcptr;
> -typedef long int		mp_size_t;
> -typedef long int		mp_exp_t;
> +#define BYTES_PER_MP_LIMB (BITS_PER_MP_LIMB / __CHAR_BIT__)
> +typedef long int  mp_size_t;
> +typedef long int  mp_exp_t;
> +typedef mp_limb_t*mp_ptr;
> +typedef const mp_limb_t  *mp_srcptr;

Why?

As for the rest, it would help if you could list the exact glibc commits
which you've ported to libquadmath and indicate if it is solely those
and nothing else.

The patch needs a ChangeLog entry too.

> --- a/libquadmath/strtod/strtod_l.c
> +++ b/libquadmath/strtod/strtod_l.c
> @@ -200,7 +200,7 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, 
> int negative,
>  
> round_limb = retval[RETURN_LIMB_SIZE - 1];
> round_bit = (MANT_DIG - 1) % BITS_PER_MP_LIMB;
> -   for (i = 0; i < RETURN_LIMB_SIZE; ++i)
> +   for (i = 0; i < RETURN_LIMB_SIZE - 1; ++i)
>   more_bits |= retval[i] != 0;
> MPN_ZERO (retval, RETURN_LIMB_SIZE);
>   }
> @@ -215,9 +215,14 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, 
> int negative,
> more_bits |= ((round_limb & ((((mp_limb_t) 1) << round_bit) - 1))
>   != 0);
>  
> -   (void) mpn_rshift (retval, &retval[shift / BITS_PER_MP_LIMB],
> -  RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB),
> -  shift % BITS_PER_MP_LIMB);
> +/* mpn_rshift requires 0 < shift < BITS_PER_MP_LIMB.  */
> +if ((shift % BITS_PER_MP_LIMB) != 0)
> +  (void) mpn_rshift (retval, &retval[shift / BITS_PER_MP_LIMB],
> +  RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB),
> +  shift % BITS_PER_MP_LIMB);
> +else
> +  for (i = 0; i < RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB); i++)
> +retval[i] = retval[i + (shift / BITS_PER_MP_LIMB)];
> MPN_ZERO (&retval[RETURN_LIMB_SIZE - (shift / BITS_PER_MP_LIMB)],
>   shift / BITS_PER_MP_LIMB);
>   }
> @@ -276,7 +281,7 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, 
> int negative,
>   }
>  }
>  
> -  if (exponent > MAX_EXP)
> +  if (exponent >= MAX_EXP)
>  goto overflow;
>  
>  #ifdef HAVE_FENV_H
> @@ -308,7 +313,7 @@ round_and_return (mp_limb_t *retval, intmax_t exponent, 
> int negative,
>  }
>  #endif
>  
> -  if (exponent > MAX_EXP)
> +  if (exponent >= MAX_EXP)
>overflow:
>  return overflow_value (negative);
>  
> @@ -688,7 +693,7 @@ STRTOF_INTERNAL (nptr, endptr, group)
> if (endptr !

[og12] Force '--param openacc-kernels=parloops' in 'libgomp.oacc-c-c++-common/abort-3.c'

2023-01-20 Thread Thomas Schwinge
Hi!

Not sure what's going on, but until we get to look into that, I've pushed
to devel/omp/gcc-12 commit 9dde5e1fd14eb336afe161c0b43c74b522e20f3e
"Force '--param openacc-kernels=parloops' in 
'libgomp.oacc-c-c++-common/abort-3.c'",
see attached.


Grüße
 Thomas


>From 9dde5e1fd14eb336afe161c0b43c74b522e20f3e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 17 Jan 2023 09:56:15 +0100
Subject: [PATCH] Force '--param openacc-kernels=parloops' in
 'libgomp.oacc-c-c++-common/abort-3.c'

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/abort-3.c: Force
	'--param openacc-kernels=parloops'.
---
 libgomp/ChangeLog.omp |  3 +
 .../libgomp.oacc-c-c++-common/abort-3.c   | 57 +++
 2 files changed, 60 insertions(+)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index 23e93495b62..0f8fca4e71c 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,5 +1,8 @@
 2023-01-20  Thomas Schwinge  
 
+	* testsuite/libgomp.oacc-c-c++-common/abort-3.c: Force
+	'--param openacc-kernels=parloops'.
+
 	* testsuite/libgomp.c/simd-math-1.c: Fix configuration.
 
 2023-01-19  Tobias Burnus  
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c
index bca425e8473..3ba4ef76ade 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/abort-3.c
@@ -1,4 +1,61 @@
 /* { dg-do run } */
+/* This test case is meant to 'abort' in device execution, which works as
+   expected with '--param openacc-kernels=parloops':
+
+   GCN:
+
+   GCN Kernel Aborted
+
+   nvptx:
+
+   libgomp: cuStreamSynchronize error: unspecified launch failure (perhaps abort was called)
+
+   ..., or:
+
+   libgomp: cuStreamSynchronize error: an illegal instruction was encountered
+
+   However, with '--param openacc-kernels=decompose', for '-O0', '-O1', this
+   does *not* 'abort' in device execution, but instead we run into whatever the
+   compiler generates for (implicit) host-side '__builtin_unreachable ()':
+
+   Segmentation fault (core dumped)
+
+   ..., or:
+
+   Illegal instruction (core dumped)
+
+   (This, unfortunately, still means "correct" execution of this test case...)
+
+   And, with '--param openacc-kernels=decompose', with '-O2' and higher, we
+   get things like the following:
+
+   GCN:
+
+   libgomp: Called kernel must be initialized
+
+   ..., potentially followed by:
+
+   libgomp: Duplicate node
+   WARNING: program timed out.
+
+   nvptx:
+
+   libgomp: cuModuleLoadData error: unspecified launch failure
+
+   ..., or:
+
+   libgomp: cuModuleLoadData error: an illegal instruction was encountered
+
+   That is, for nvptx, the code doesn't even load?
+
+   Worse, on one system, this process then shows 100 % CPU utilization, GPU
+   locks up; process un-SIGKILLable, system needs to be (forcefully) rebooted.
+
+   Until we understand what's happening (how the decomposed OpenACC 'kernels'
+   code is different from 'abort-1.c', for example), play it safe:
+
+   { dg-additional-options {--param openacc-kernels=parloops} }
+*/
 
 #include 
 #include 
-- 
2.25.1



[PATCH] file-prefix-map: Fix up -f*-prefix-map= [PR108464]

2023-01-20 Thread Jakub Jelinek via Gcc-patches
Hi!

On Tue, Nov 01, 2022 at 01:46:20PM -0600, Jeff Law via Gcc-patches wrote:
> > This does cause a change of behaviour if users were previously relying upon
> > symlinks or absolute paths not being resolved.
> 
> I'm not too worried about this scenario.

As mentioned in the PR, this patch breaks e.g. ccache testsuite.

I strongly doubt most of the users want such a behavior, because it
makes all filenames absolute when -f*-prefix-map= options remap one
absolute path to another one.
Say if I'm in /tmp and /tmp is the canonical path and there is
src/test.c file, with -fdebug-prefix-map=/tmp=/blah
previously there would be DW_AT_comp_dir "/blah" and it is still there,
but DW_AT_name which was previously "src/test.c" (relative against
DW_AT_comp_dir) is now "/blah/src/test.c" instead.
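
To make the difference concrete, here is a minimal sketch of prefix remapping
without any canonicalization, i.e. the behavior before the change being
discussed.  It is not the GCC implementation: struct prefix_map and
remap_plain are hypothetical names, plain strncmp stands in for
filename_ncmp, and malloc stands in for GC allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a single -fdebug-prefix-map=OLD=NEW entry.  */
struct prefix_map
{
  const char *old_prefix;
  const char *new_prefix;
};

/* Return a malloc'ed copy of FILENAME with OLD replaced by NEW when
   FILENAME starts with OLD, and an unchanged copy otherwise.  FILENAME is
   compared exactly as given; it is never turned into an absolute path.  */
static char *
remap_plain (const struct prefix_map *map, const char *filename)
{
  size_t old_len = strlen (map->old_prefix);
  int match = strncmp (filename, map->old_prefix, old_len) == 0;
  const char *rest = match ? filename + old_len : filename;
  size_t new_len = match ? strlen (map->new_prefix) : 0;
  size_t rest_len = strlen (rest) + 1;	/* Include the trailing NUL.  */
  char *s = malloc (new_len + rest_len);

  assert (s != NULL);
  memcpy (s, map->new_prefix, new_len);
  memcpy (s + new_len, rest, rest_len);
  return s;
}
```

With the map /tmp=/blah, the relative name src/test.c comes back unchanged,
while /tmp/src/test.c becomes /blah/src/test.c.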

Even worse, the canonicalization is only done on the remap_filename
argument, but not on the old_prefix side.  That is e.g. what breaks
ccache.  If there is
/tmp/foobar1 directory and
ln -sf foobar1 /tmp/foobar2
cd /tmp/foobar2
then -fdebug-prefix-map=`pwd`=/blah will just not work: while
src/test.c will be canonicalized to /tmp/foobar1/src/test.c,
old_prefix is still what the user provided, which is /tmp/foobar2.
Users would need to change their uses to
-fdebug-prefix-map=`realpath $(pwd)`=/blah

I'm attaching 3 patches, so far only compile tested.

The first patch just reverts the patch (and its follow-up patch).

The second introduces a new option, -f{,no}-canon-prefix-map which affects
the behavior of -f{file,macro,debug,profile}-prefix-map=, if on it
canonicalizes the old path of the prefix map option and compares that
against the canonicalized filename for absolute paths but not relative.

And last is like the second, but does that also for relative paths except
for filenames with no / (or / or \ on DOS based fs).  So, the third patch
gets an optional behavior of what has been on the trunk lately with the
difference that the old_prefix is canonicalized by the compiler.

Initially I thought I'd just add some magic syntax to the OLD=NEW
argument of those options (because there are 4 of them), but as noted
in the comments, = is a valid char in OLD (just not in NEW), so it would
be hard to come up with a syntax.  So instead there is a new option, which
one can turn on and off for different -f*-prefix-map= options if needed.

-fdebug-prefix-map=/path1=/mypath1 -fcanon-prefix-map \
-fdebug-prefix-map=/path2=/mypath2 -fno-canon-prefix-map \
-fdebug-prefix-map=/path3=/mypath3

will use the old behavior for the /path1 and /path3 handling and
the new one only for /path2 handling.
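
The ordering rule in that example can be sketched as follows.  This is only an
illustration of the described semantics, not code from the patch:
collect_maps and struct map_entry are hypothetical, and only
-fdebug-prefix-map= is recognized.  The flag starts out off and each map
entry records the value that was current when the entry was seen.

```c
#include <assert.h>
#include <string.h>

#define MAX_MAPS 8

/* Hypothetical record of one prefix map option.  */
struct map_entry
{
  const char *arg;	/* The OLD=NEW part of the option.  */
  int canonicalize;	/* Flag value in effect when it was parsed.  */
};

/* Walk ARGV in command-line order and store every -fdebug-prefix-map=
   option in OUT together with the -f{,no-}canon-prefix-map state at that
   point.  Returns the number of entries stored.  */
static int
collect_maps (int argc, const char *const *argv, struct map_entry *out)
{
  static const char prefix_opt[] = "-fdebug-prefix-map=";
  int canon = 0;	/* Off until -fcanon-prefix-map is seen.  */
  int n = 0;

  for (int i = 0; i < argc; i++)
    {
      if (strcmp (argv[i], "-fcanon-prefix-map") == 0)
        canon = 1;
      else if (strcmp (argv[i], "-fno-canon-prefix-map") == 0)
        canon = 0;
      else if (strncmp (argv[i], prefix_opt, sizeof prefix_opt - 1) == 0)
        {
          assert (n < MAX_MAPS);
          out[n].arg = argv[i] + sizeof prefix_opt - 1;
          out[n].canonicalize = canon;
          n++;
        }
    }
  return n;
}
```

Fed the five options from the example, this stores three entries with
canonicalize set to 0, 1 and 0 for /path1, /path2 and /path3 respectively.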

Thoughts on this?

Jakub
2023-01-20  Jakub Jelinek  

PR other/108464
* file-prefix-map.cc (remap_filename): Revert 2022-11-01 and 2022-11-07
changes.

--- gcc/file-prefix-map.cc
+++ gcc/file-prefix-map.cc
@@ -70,29 +70,19 @@ remap_filename (file_prefix_map *maps, const char *filename)
   file_prefix_map *map;
   char *s;
   const char *name;
-  char *realname;
   size_t name_len;
 
-  if (!filename || lbasename (filename) == filename)
-return filename;
-
-  realname = lrealpath (filename);
-
   for (map = maps; map; map = map->next)
-if (filename_ncmp (realname, map->old_prefix, map->old_len) == 0)
+if (filename_ncmp (filename, map->old_prefix, map->old_len) == 0)
   break;
   if (!map)
-{
-  free (realname);
-  return filename;
-}
-  name = realname + map->old_len;
+return filename;
+  name = filename + map->old_len;
   name_len = strlen (name) + 1;
 
   s = (char *) ggc_alloc_atomic (name_len + map->new_len);
   memcpy (s, map->new_prefix, map->new_len);
   memcpy (s + map->new_len, name, name_len);
-  free (realname);
   return s;
 }
 
2023-01-20  Jakub Jelinek  

PR other/108464
* common.opt (fcanon-prefix-map): New option.
* opts.cc: Include file-prefix-map.h.
(flag_canon_prefix_map): New variable.
(common_handle_option): Handle OPT_fcanon_prefix_map.
(gen_command_line_string): Ignore OPT_fcanon_prefix_map.
* file-prefix-map.h (flag_canon_prefix_map): Declare.
* file-prefix-map.cc (struct file_prefix_map): Add canonicalize
member.
(add_prefix_map): Initialize canonicalize member from
flag_canon_prefix_map, and if true and old_prefix is absolute
pathname, canonicalize it using lrealpath.
(remap_filename): Revert 2022-11-01 and 2022-11-07 changes,
use lrealpath result only for absolute filenames and only for
map->canonicalize map entries.
* lto-opts.cc (lto_write_options): Ignore OPT_fcanon_prefix_map.
* opts-global.cc (handle_common_deferred_options): Clear
flag_canon_prefix_map at the start and handle OPT_fcanon_prefix_map.
* doc/invoke.texi (-fcanon-prefix-map): Document.
(-ffile-prefix-map, -fdebug-prefix-map, -fprofile-prefix-map): Add
see also for -fcanon-prefix-map.
* doc/cppopts.texi (-fmacro-prefix-map): Likewi

Re: [PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-20 Thread Max Filippov via Gcc-patches
Hi Suwa-san,

On Wed, Jan 18, 2023 at 7:50 PM Takayuki 'January June' Suwa
 wrote:
>
> In the previous patch, if insn is JUMP_INSN or CALL_INSN, it bypasses the reg 
> check (possibly FAIL).
>
> =
> In the case of the CALL0 ABI, values that must be retained before and
> after function calls are placed in the callee-saved registers (A12
> through A15) and referenced later.  However, it is often the case that
> the save and the reference are each only once and a simple register-
> register move (the frame pointer is needed to recover the stack pointer
> and must be excluded).
>
> e.g. in the following example, if there are no other occurrences of
> register A14:
>
> ;; before
> ; prologue {
>   ...
> s32i.n  a14, sp, 16
>   ...
> ; } prologue
>   ...
> mov.n   a14, a6
>   ...
> call0   foo
>   ...
> mov.n   a8, a14
>   ...
> ; epilogue {
>   ...
> l32i.n  a14, sp, 16
>   ...
> ; } epilogue
>
> It can be possible like this:
>
> ;; after
> ; prologue {
>   ...
> (deleted)
>   ...
> ; } prologue
>   ...
> s32i.n  a6, sp, 16
>   ...
> call0   foo
>   ...
> l32i.n  a8, sp, 16
>   ...
> ; epilogue {
>   ...
> (deleted)
>   ...
> ; } epilogue
>
> This patch introduces a new peephole2 pattern that implements the above.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
> the use of callee-saved register that saves and restores only once
> for other register, by using its stack slot directly.
> ---
>  gcc/config/xtensa/xtensa.md | 62 +
>  1 file changed, 62 insertions(+)

There are still issues with this change in libgomp:

FAIL: libgomp.c/examples-4/target-1.c execution test
FAIL: libgomp.c/examples-4/target-2.c execution test

They come from the following function:

code produced before the change:
   .literal_position
   .literal .LC8, init@PLT
   .literal .LC9, 40
   .literal .LC10, 10
   .literal .LC11, -80
   .literal .LC12, 80
   .align  4
   .global vec_mult_ref
   .type   vec_mult_ref, @function
vec_mult_ref:
   l32r    a9, .LC11
   addi    sp, sp, -16
   l32r    a10, .LC9
   s32i.n  a12, sp, 8
   s32i.n  a13, sp, 4
   s32i.n  a0, sp, 12
   add.n   sp, sp, a9
   add.n   a12, sp, a10
   l32r    a9, .LC8
   mov.n   a13, a2
   mov.n   a3, sp
   mov.n   a2, a12
   callx0  a9
   l32r    a7, .LC10
   mov.n   a10, a12
   mov.n   a11, sp
   mov.n   a2, a13
   loop    a7, .L17_LEND
.L17:
   l32i.n  a9, a10, 0
   l32i.n  a6, a11, 0
   addi.n  a10, a10, 4
   mull    a9, a9, a6
   addi.n  a11, a11, 4
   s32i.n  a9, a2, 0
   addi.n  a2, a2, 4
   .L17_LEND:
   l32r    a9, .LC12
   add.n   sp, sp, a9
   l32i.n  a0, sp, 12
   l32i.n  a12, sp, 8
   l32i.n  a13, sp, 4
   addi    sp, sp, 16
   ret.n



with the change:
   .literal_position
   .literal .LC8, init@PLT
   .literal .LC9, 40
   .literal .LC10, 10
   .literal .LC11, -80
   .literal .LC12, 80
   .align  4
   .global vec_mult_ref
   .type   vec_mult_ref, @function
vec_mult_ref:
   l32r    a9, .LC11
   l32r    a10, .LC9
   addi    sp, sp, -16
   s32i.n  a12, sp, 8
   s32i.n  a0, sp, 12
   add.n   sp, sp, a9
   add.n   a12, sp, a10
   l32r    a9, .LC8
   s32i.n  a2, sp, 4
   mov.n   a3, sp
   mov.n   a2, a12
   callx0  a9
   l32r    a7, .LC10
   l32i.n  a2, sp, 4
   mov.n   a10, a12
   mov.n   a11, sp
   loop    a7, .L17_LEND
.L17:
   l32i.n  a9, a10, 0
   l32i.n  a6, a11, 0
   addi.n  a10, a10, 4
   mull    a9, a9, a6
   addi.n  a11, a11, 4
   s32i.n  a9, a2, 0
   addi.n  a2, a2, 4
   .L17_LEND:
   l32r    a9, .LC12
   add.n   sp, sp, a9
   l32i.n  a0, sp, 12
   l32i.n  a12, sp, 8
   addi    sp, sp, 16
   ret.n

The stack pointer is modified after the callee-saved registers are saved,
but the stack offset at which a2 is stored and reloaded does not take
this into account.
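
The arithmetic behind this can be sketched in a few lines of C. This is only an illustration under assumptions: the helper names are hypothetical, the starting stack pointer value is arbitrary, and the numbers 4 and -80 are taken from the listing above (the slot at `sp + 4` and the `.LC11` adjustment).

```c
#include <assert.h>
#include <stdint.h>

/* Address of a stack slot, given the current stack pointer and an
   sp-relative byte offset.  */
static uint32_t
slot_address (uint32_t sp, int32_t offset)
{
  return sp + (uint32_t) offset;
}

/* Offset that reaches the same slot after sp has been changed by
   SP_DELTA bytes (negative when the frame grows).  */
static int32_t
rebased_offset (int32_t offset, int32_t sp_delta)
{
  return offset - sp_delta;
}

/* Slot reserved at sp + 4 after "addi sp, sp, -16"; sp then moves by
   -80 before the store/load pair emitted by the peephole.  */
static void
check_vec_mult_ref_example (void)
{
  uint32_t sp_after_prologue = 0x1000;	/* Arbitrary illustrative value.  */
  uint32_t sp_after_adjust = sp_after_prologue - 80;

  /* Rebasing the offset reaches the original save slot...  */
  assert (slot_address (sp_after_adjust, rebased_offset (4, -80))
	  == slot_address (sp_after_prologue, 4));
  /* ...while reusing the old offset lands 80 bytes lower.  */
  assert (slot_address (sp_after_adjust, 4)
	  != slot_address (sp_after_prologue, 4));
}
```

In this model the slot would have to be addressed as `sp + 84` once the stack pointer has moved, whereas the generated code keeps using `sp + 4`.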

After this many attempts, and with issues that are this hard to
detect, I wonder whether the target backend is the right place
for this optimization.

-- 
Thanks.
-- Max


Re: [PATCH v3] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-20 Thread Max Filippov via Gcc-patches
Hi Suwa-san,

On Wed, Jan 18, 2023 at 9:06 PM Takayuki 'January June' Suwa
 wrote:
>
> Register-register move instructions that can be easily seen as
> unnecessary by the human eye may remain in the compiled result.
> For example:
>
> /* example */
> double test(double a, double b) {
>   return __builtin_copysign(a, b);
> }
>
> test:
> add.n   a3, a3, a3
> extui   a5, a5, 31, 1
> ssai    1
> ;; be in the same BB
> src a7, a5, a3  ;; No '0' in the source constraints
> ;; No CALL insns in this span
> ;; Both A3 and A7 are irrelevant to
> ;;   insns in this span
> mov.n   a3, a7  ;; An unnecessary reg-reg move
> ;; A7 is not used after this
> ret.n
>
> The last two instructions above, excluding the return instruction,
> could be done like this:
>
> src a3, a5, a3
>
> This symptom often occurs when handling DI/DFmode values with SImode
> instructions.  This patch solves the above problem using peephole2
> pattern.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
> the occurrence of general-purpose register used only once and for
> transferring intermediate value.
> ---
>  gcc/config/xtensa/xtensa.md | 45 +
>  1 file changed, 45 insertions(+)

With this change I see the following ICEs:

in the libgcc build:

gcc/libgcc/libgcov-interface.c: In function ‘__gcov_execl’:
gcc/libgcc/libgcov-interface.c:228:1: error: insn does not satisfy its
constraints:
 228 | }
 | ^
(insn 96 95 98 11 (set (reg/f:SI 1 sp)
   (minus:SI (reg/f:SI 1 sp)
   (reg:SI 8 a8 [85])))
"gcc/libgcc/libgcov-interface.c":218:20 4 {subsi3}
(expr_list:REG_DEAD (reg:SI 8 a8 [85])
   (nil)))
during RTL pass: cprop_hardreg


in the linux kernel build:

linux/lib/find_bit.c: In function ‘_find_next_bit’:
linux/lib/find_bit.c:70:1: error: unrecognizable insn:
  70 | }
 | ^
(insn 74 72 75 16 (set (reg:SI 10 a10)
   (asm_operands:SI ("ssai 8
   srli %0, %1, 16
   src  %0, %0, %1
   src  %0, %0, %0
   src  %0, %1, %0
") ("=&a") 0 [
   (reg/v:SI 10 a10 [orig:59 res ] [59])
   ]
[
   (asm_input:SI ("a") linux/arch/xtensa/include/uapi/asm/swab.h:24)
   ]
[] linux/arch/xtensa/include/uapi/asm/swab.h:24))
"linux/arch/xtensa/include/uapi/asm/swab.h":24:5 -1
(nil))
during RTL pass: cprop_hardreg
linux/lib/find_bit.c:70:1: internal compiler error: in
extract_constrain_insn, at recog.cc:2692
0x6c3214 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
   gcc/gcc/rtl-error.cc:108
0x6c3297 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
   gcc/gcc/rtl-error.cc:116
0x6b4735 extract_constrain_insn(rtx_insn*)
   gcc/gcc/recog.cc:2692
0xe1f67e copyprop_hardreg_forward_1
   gcc/gcc/regcprop.cc:826
0xe20a0f execute
   gcc/gcc/regcprop.cc:1408

-- 
Thanks.
-- Max


[og12] Fix 'libgomp.c/simd-math-1.c' configuration, again (was: [og12] Fix 'libgomp.c/simd-math-1.c' configuration (was: [OG12] [committed] amdgcn: Enable SIMD vectorization of math library functions)

2023-01-20 Thread Thomas Schwinge
Hi!

On 2023-01-20T15:16:26+0100, I wrote:
> On 2022-11-02T00:50:40+, Kwok Cheung Yeung  wrote:
>> I have committed the following patches onto the devel/omp/gcc-12
>> development branch:
>>
>> 863579c4e30 amdgcn: Enable SIMD vectorization of math functions
>
> I've pushed to devel/omp/gcc-12
> commit e7d4bcb974915bfe95be6c385641fc66a4201581
> "Fix 'libgomp.c/simd-math-1.c' configuration", see attached.

I've pushed to devel/omp/gcc-12
commit bbd4eb1772893ba99aa23a4eaf8950415624964e
"Fix 'libgomp.c/simd-math-1.c' configuration, again", see attached.

Thanks for the report, Tobias!


Grüße
 Thomas


From bbd4eb1772893ba99aa23a4eaf8950415624964e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 20 Jan 2023 17:17:21 +0100
Subject: [PATCH] Fix 'libgomp.c/simd-math-1.c' configuration, again

Tobias pointed out that as of my recent
og12 commit e7d4bcb974915bfe95be6c385641fc66a4201581
"Fix 'libgomp.c/simd-math-1.c' configuration",
in GCC configurations without GCN offloading configured, we'd get:

xgcc: error: GCC is not configured to support 'amdgcn-amdhsa' as '-foffload=' argument

("Interestingly", GCC doesn't complain for '-foffload-options=-lm' if there are
no offload targets configured...)

	libgomp/
	* testsuite/libgomp.c/simd-math-1.c: Fix configuration, again.
---
 libgomp/ChangeLog.omp | 2 ++
 libgomp/testsuite/libgomp.c/simd-math-1.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index 0f8fca4e71c..134d450f44a 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,5 +1,7 @@
 2023-01-20  Thomas Schwinge  
 
+	* testsuite/libgomp.c/simd-math-1.c: Fix configuration, again.
+
 	* testsuite/libgomp.oacc-c-c++-common/abort-3.c: Force
 	'--param openacc-kernels=parloops'.
 
diff --git a/libgomp/testsuite/libgomp.c/simd-math-1.c b/libgomp/testsuite/libgomp.c/simd-math-1.c
index 1ebdfeb..ea629696e55 100644
--- a/libgomp/testsuite/libgomp.c/simd-math-1.c
+++ b/libgomp/testsuite/libgomp.c/simd-math-1.c
@@ -3,7 +3,7 @@
 
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fno-math-errno" } */
-/* { dg-additional-options -foffload-options=amdgcn-amdhsa=-mstack-size=300 } */
+/* { dg-additional-options -foffload-options=amdgcn-amdhsa=-mstack-size=300 { target offload_target_amdgcn } } */
 /* { dg-additional-options -foffload-options=-lm } */
 
 #undef PRINT_RESULT
-- 
2.25.1



[PATCH 15/23] arm: improve tests for vqrdmlashq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmlashq_n_s16.c | 32 +++
 .../arm/mve/intrinsics/vqrdmlashq_n_s32.c | 32 +++
 .../arm/mve/intrinsics/vqrdmlashq_n_s8.c  | 32 +++
 3 files changed, 78 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
index 8ff8c34d529..2710f2f0442 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vqrdmlash.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo (int16x8_t a, int16x8_t b, int16_t c)
+foo (int16x8_t m1, int16x8_t m2, int16_t add)
 {
-  return vqrdmlashq_n_s16 (a, b, c);
+  return vqrdmlashq_n_s16 (m1, m2, add);
 }
 
-/* { dg-final { scan-assembler "vqrdmlash.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vqrdmlash.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo1 (int16x8_t a, int16x8_t b, int16_t c)
+foo1 (int16x8_t m1, int16x8_t m2, int16_t add)
 {
-  return vqrdmlashq (a, b, c);
+  return vqrdmlashq (m1, m2, add);
+}
+
+#ifdef __cplusplus
 }
+#endif
 
-/* { dg-final { scan-assembler "vqrdmlash.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
index 02583f0627b..5fefc3938c5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vqrdmlash.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo (int32x4_t a, int32x4_t b, int32_t c)
+foo (int32x4_t m1, int32x4_t m2, int32_t add)
 {
-  return vqrdmlashq_n_s32 (a, b, c);
+  return vqrdmlashq_n_s32 (m1, m2, add);
 }
 
-/* { dg-final { scan-assembler "vqrdmlash.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vqrdmlash.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo1 (int32x4_t a, int32x4_t b, int32_t c)
+foo1 (int32x4_t m1, int32x4_t m2, int32_t add)
 {
-  return vqrdmlashq (a, b, c);
+  return vqrdmlashq (m1, m2, add);
+}
+
+#ifdef __cplusplus
 }
+#endif
 
-/* { dg-final { scan-assembler "vqrdmlash.s32"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
index 0bd5bcac71f..df96fe85213 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vqrdmlash.s8    q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int8x16_t
-foo (int8x16_t a, int8x16_t b, int8_t c)
+foo (int8x16_t m1, int8x16_t m2, int8_t add)
 {
-  return vqrdmlashq_n_s8 (a, b, c);
+  return vqrdmlashq_n_s8 (m1, m2, add);
 }
 
-/* { dg-final { scan-assembler "vqrdmlash.s8"  }  } */
 
+/*
+**foo1:
+** ...
+** vqrdmlash.s8    q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int8x16_t
-foo1 (int8x16_t a, int8x16_t b, int8_t c)
+foo1 (int8x16_t m1, int8x16_t m2, int8_t add)
 {
-  return vqrdmlashq (a, b, c);
+  return vqrdmlashq (m1, m2, add);
+}
+
+#ifdef __cplusplus
 }
+#endif
 
-/* { dg-final { scan-assembler "vqrdmlash.s8"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
-- 
2.25.1
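
As a side note, the scalar-register alternation used in the expected function bodies above can be exercised in isolation. The sketch below is an assumption-laden illustration, not part of the patch: the testsuite itself matches with Tcl regular expressions, whereas this uses POSIX `regcomp`/`regexec`, the helper name `is_scalar_reg_operand` is hypothetical, and the non-capturing `(?:...)` group is rewritten as a plain group because POSIX extended regular expressions do not support it.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if OPERAND is one of the scalar registers accepted by the
   expected-body patterns (ip, fp or rN), 0 if it is not, and -1 if
   the pattern fails to compile.  */
static int
is_scalar_reg_operand (const char *operand)
{
  regex_t re;
  int matched;

  assert (operand != NULL);
  /* POSIX ERE has no "(?:...)", so a plain group is used instead.  */
  if (regcomp (&re, "^(ip|fp|r[0-9]+)$", REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  matched = regexec (&re, operand, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

Under these assumptions, operands such as `r3`, `ip` and `fp` are accepted, while a vector register such as `q0` is rejected.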



[PATCH 17/23] arm: improve tests for vqdmlsdhxq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqdmlsdhxq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhxq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhxq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhxq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vqdmlsdhxq_s32.c   | 24 +++--
 .../arm/mve/intrinsics/vqdmlsdhxq_s8.c| 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
index 6ab9743054c..1742d47291c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdhxt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdhxt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
index a34618e97fd..1c1b73a2251 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdhxt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdhxt.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
index fdbe89ab6b8..0a980a081a1 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s8   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq

[PATCH 01/23] arm: improve tests and fix vclsq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vclsq_s): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vclsq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vclsq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_x_s8.c: Likewise.
---
 gcc/config/arm/mve.md |  2 +-
 .../arm/mve/intrinsics/vclsq_m_s16.c  | 33 +--
 .../arm/mve/intrinsics/vclsq_m_s32.c  | 33 +--
 .../arm/mve/intrinsics/vclsq_m_s8.c   | 33 +--
 .../gcc.target/arm/mve/intrinsics/vclsq_s16.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclsq_s32.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclsq_s8.c  | 24 --
 .../arm/mve/intrinsics/vclsq_x_s16.c  | 33 +--
 .../arm/mve/intrinsics/vclsq_x_s32.c  | 33 +--
 .../arm/mve/intrinsics/vclsq_x_s8.c   | 33 +--
 10 files changed, 251 insertions(+), 29 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index f123edc449b..e35ea5d9f9c 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -469,7 +469,7 @@ (define_insn "mve_vclsq_s"
 VCLSQ_S))
   ]
   "TARGET_HAVE_MVE"
-  "vcls.s%#  %q0, %q1"
+  "vcls.s%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
index d0eb7008537..1996ac8b03e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclst.s16   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vclsq_m_s16 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vclst.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclst.s16   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vclsq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
index b6d7088a8e7..f51841d024e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclst.s32   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vclsq_m_s32 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vclst.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclst.s32   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vclsq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s8.c
index 28d4d966802..2975c4cda56 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s8.c
@@ -1,22 +1,49 @@
 /* { dg-require-

[PATCH 16/23] arm: improve tests for vqdmlsdhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqdmlsdhq_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhq_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhq_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhq_s16.c| 24 +++--
 .../arm/mve/intrinsics/vqdmlsdhq_s32.c| 24 +++--
 .../arm/mve/intrinsics/vqdmlsdhq_s8.c | 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
index d1e66864d10..f87287ab8cd 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
index cc80f211ec8..8155aaf843c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
index 5c9d81a6526..d39badc7707 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s8    q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m_s8 (inactive, a, b, p);

[PATCH 06/23] arm: improve tests for vmulltq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_m_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_m_p8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_p8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_x_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_x_p8.c: Likewise.
---
 .../arm/mve/intrinsics/vmulltq_int_m_s16.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_s32.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_s8.c | 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_u16.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_u32.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_u8.c | 34 ---
 .../arm/mve/intrinsics/vmulltq_int_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_s8.c   | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_u16.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_u32.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_u8.c   | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_x_s16.c| 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_s32.c| 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_s8.c | 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_u16.c| 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_u32.c| 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_u8.c | 33 --
 .../arm/mve/intrinsics/vmulltq_poly_m_p16.c   | 34 ---
 .../arm/mve/intrinsics/vmulltq_poly_m_p8.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_poly_p16.c | 24 +++--
 .../arm/mve/intrinsics/vmulltq_poly_p8.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_poly_x_p16.c   | 33 --
 .../arm/mve/intrinsics/vmulltq_poly_x_p8.c| 33 --
 24 files changed, 656 insertions(+), 72 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
index 25ecf7a2c51..7f573e9109e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulltt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmulltq_int_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmulltt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulltt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmulltq_int_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmulltt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_

[PATCH 04/23] arm: improve tests for vmulhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_u8.c: Likewise.
---
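Note (not part of the patch): each converted test wraps its functions in
`extern "C"` guards. This is presumably because check-function-bodies
locates a body by its assembler label, and a C++ compile would otherwise
emit a mangled label instead of `foo`. A minimal sketch of the guard
pattern follows; the function body is a hypothetical stand-in and does not
use any MVE intrinsic.

```c
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical stand-in for a test function.  With the guard in place
   its assembler label stays "foo" whether the file is compiled as C or
   as C++.  */
int
foo (int a)
{
  return -a;
}

#ifdef __cplusplus
}
#endif
```

The same file then compiles unchanged as C or C++, which is what lets one
expected-body comment serve both.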
 .../arm/mve/intrinsics/vmulhq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_u16.c | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_u32.c | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_u8.c  | 34 ---
 .../arm/mve/intrinsics/vmulhq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vmulhq_s32.c   | 24 +++--
 .../gcc.target/arm/mve/intrinsics/vmulhq_s8.c | 24 +++--
 .../arm/mve/intrinsics/vmulhq_u16.c   | 24 +++--
 .../arm/mve/intrinsics/vmulhq_u32.c   | 24 +++--
 .../gcc.target/arm/mve/intrinsics/vmulhq_u8.c | 24 +++--
 .../arm/mve/intrinsics/vmulhq_x_s16.c | 33 --
 .../arm/mve/intrinsics/vmulhq_x_s32.c | 33 --
 .../arm/mve/intrinsics/vmulhq_x_s8.c  | 33 --
 .../arm/mve/intrinsics/vmulhq_x_u16.c | 33 --
 .../arm/mve/intrinsics/vmulhq_x_u32.c | 33 --
 .../arm/mve/intrinsics/vmulhq_x_u8.c  | 33 --
 18 files changed, 492 insertions(+), 54 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
index 4971869a27b..a7d8460c265 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmulhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmulht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmulhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmulht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
index 3006de7fd24..997fdbe8d23 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vmulhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-fi

[PATCH 18/23] arm: improve tests for vqrdmlsdhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmlsdhq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmlsdhq_s32.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmlsdhq_s8.c| 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
index d0054b8ea97..6a5776215ca 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
index 7d3fe45eb4d..9539e249d6a 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
index c33f8ea903b..69e54f53a76 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s8   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq

[PATCH 05/23] arm: improve tests for vmullbq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_m_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_m_p8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_p8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_x_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_x_p8.c: Likewise.
---
 .../arm/mve/intrinsics/vmullbq_int_m_s16.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_s32.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_s8.c | 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_u16.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_u32.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_u8.c | 34 ---
 .../arm/mve/intrinsics/vmullbq_int_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_s8.c   | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_u16.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_u32.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_u8.c   | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_x_s16.c| 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_s32.c| 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_s8.c | 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_u16.c| 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_u32.c| 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_u8.c | 33 --
 .../arm/mve/intrinsics/vmullbq_poly_m_p16.c   | 34 ---
 .../arm/mve/intrinsics/vmullbq_poly_m_p8.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_poly_p16.c | 24 +++--
 .../arm/mve/intrinsics/vmullbq_poly_p8.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_poly_x_p16.c   | 33 --
 .../arm/mve/intrinsics/vmullbq_poly_x_p8.c| 33 --
 24 files changed, 656 insertions(+), 72 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
index be933274d77..a4cc5e52773 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmullbt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmullbq_int_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmullbt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmullbt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmullbq_int_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmullbt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_

[PATCH 03/23] arm: improve tests and fix vnegq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vnegq_f, mve_vnegq_s):
Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vnegq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vnegq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_s8.c: Likewise.
* gcc.target/arm/simd/mve-vneg.c: Update test.
* gcc.target/arm/simd/mve-vshr.c: Likewise.
---
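Note (not part of the patch): the mve.md hunks below replace the run of
spaces between the mnemonic and its operands with a tab. Assuming the
expected-body patterns separate mnemonic and operands with a tab (the
whitespace in this archived copy is collapsed), only the tab-separated
output can match them. A minimal sketch in plain C, using POSIX regcomp
as a stand-in for the Tcl matcher the testsuite really uses; the helper
line_matches and the sample strings are hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if LINE matches the POSIX extended
   regular expression PATTERN, 0 if it does not, and -1 if PATTERN
   fails to compile.  */
static int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Pattern in the style of the new tests: a tab after the mnemonic.  */
static const char vneg_pattern[] = "^\tvneg\\.f16\tq[0-9]+, q[0-9]+$";

/* Sample assembler lines: tab-separated, as the fixed template would
   print, and space-separated, as the old template would print.  */
static const char with_tab[] = "\tvneg.f16\tq0, q0";
static const char with_spaces[] = "\tvneg.f16  q0, q0";
```

With these definitions, line_matches (vneg_pattern, with_tab) succeeds
while line_matches (vneg_pattern, with_spaces) does not, which is the
behaviour the spacing fix relies on.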
 gcc/config/arm/mve.md |  4 +--
 .../gcc.target/arm/mve/intrinsics/vnegq_f16.c | 30 -
 .../gcc.target/arm/mve/intrinsics/vnegq_f32.c | 30 -
 .../arm/mve/intrinsics/vnegq_m_f16.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_m_f32.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_m_s16.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_m_s32.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_m_s8.c   | 33 +--
 .../gcc.target/arm/mve/intrinsics/vnegq_s16.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vnegq_s32.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vnegq_s8.c  | 24 --
 .../arm/mve/intrinsics/vnegq_x_f16.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_x_f32.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_x_s16.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_x_s32.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_x_s8.c   | 33 +--
 gcc/testsuite/gcc.target/arm/simd/mve-vneg.c  |  4 +--
 gcc/testsuite/gcc.target/arm/simd/mve-vshr.c  |  2 +-
 18 files changed, 433 insertions(+), 47 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 854371f7e11..0a243486bdb 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -252,7 +252,7 @@ (define_insn "mve_vnegq_f"
(neg:MVE_0 (match_operand:MVE_0 1 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vneg.f%#  %q0, %q1"
+  "vneg.f%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -401,7 +401,7 @@ (define_insn "mve_vnegq_s"
(neg:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE"
-  "vneg.s%#  %q0, %q1"
+  "vneg.s%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
index 9572c140d7e..9853cf6e6dd 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
@@ -1,13 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vneg.f16 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 float16x8_t
 foo (float16x8_t a)
 {
   return vnegq_f16 (a);
 }
 
-/* { dg-final { scan-assembler "vneg.f16"  }  } */
+
+/*
+**foo1:
+** ...
+** vneg.f16 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
+float16x8_t
+foo1 (float16x8_t a)
+{
+  return vnegq (a);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
index be73cc0c5f5..489cfc760ba 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
@@ -1,13 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vneg.f32 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 float32x4_t
 foo (float32x4_t a)
 {
   return vnegq_f32 (a);
 }
 
-/* { dg-final { scan-assembler "vneg.f32"  }  } */
+
+/*
+**foo1:
+** ...

[PATCH 21/23] arm: improve tests and fix vqnegq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vqnegq_s): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqnegq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqnegq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqnegq_s8.c: Likewise.
---
 gcc/config/arm/mve.md |  2 +-
 .../arm/mve/intrinsics/vqnegq_m_s16.c | 33 +--
 .../arm/mve/intrinsics/vqnegq_m_s32.c | 33 +--
 .../arm/mve/intrinsics/vqnegq_m_s8.c  | 33 +--
 .../arm/mve/intrinsics/vqnegq_s16.c   | 28 +---
 .../arm/mve/intrinsics/vqnegq_s32.c   | 24 --
 .../gcc.target/arm/mve/intrinsics/vqnegq_s8.c | 24 --
 7 files changed, 159 insertions(+), 18 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 600adf7d69b..4f94cf14a0b 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -374,7 +374,7 @@ (define_insn "mve_vqnegq_s"
 VQNEGQ_S))
   ]
   "TARGET_HAVE_MVE"
-  "vqneg.s%# %q0, %q1"
+  "vqneg.s%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
index 4f0145d2ebd..f3799a35b12 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s16  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vqnegq_m_s16 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqnegt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s16  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vqnegq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
index da4f90bad53..bbe64ff4d52 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s32  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vqnegq_m_s32 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqnegt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s32  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vqnegq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c
index ac1250b2fac..71fcdd7cba7 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s8  

[PATCH 12/23] arm: improve tests for vqdmladhxq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_s32.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_s8.c: Improve test.
---
 .../arm/mve/intrinsics/vqdmladhxq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vqdmladhxq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vqdmladhxq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vqdmladhxq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vqdmladhxq_s32.c   | 24 +++--
 .../arm/mve/intrinsics/vqdmladhxq_s8.c| 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
index c2446e69181..19c5ce5a64f 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmladhxq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladhxt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmladhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladhxt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
index 12b45517535..e00162addae 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmladhxq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladhxt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmladhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladhxt.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
index 146aa51306b..19767d2cd41 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s8   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {

[PATCH 13/23] arm: improve tests for vqrdmladhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmladhq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vqrdmladhq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vqrdmladhq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vqrdmladhq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmladhq_s32.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmladhq_s8.c| 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
index fce4f5a35ef..5b0e134a0ff 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmladhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmladhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
index e550b6a7995..6fdf3879cc2 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmladhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmladhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
index b07b28e5bcd..ef75f737161 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s8   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqrdmladhq

[PATCH 02/23] arm: improve tests and fix vclzq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (@mve_vclzq_s): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vclzq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vclzq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_u8.c: Likewise.
* gcc.target/arm/simd/mve-vclz.c: Update test.
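
Editorial illustration (not part of the patch): the "Fix spacing" change in mve.md presumably matters because each expected-body line used by check-function-bodies is matched as a regular expression, so the whitespace between mnemonic and operands has to agree with what the insn template prints. The sketch below uses POSIX regcomp/regexec as a stand-in for the Tcl regexps DejaGnu actually uses; line_matches and vclz_pattern are hypothetical names, and the pattern is simplified because POSIX ERE has no (?:...) groups.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if LINE matches the POSIX extended
   regex PATTERN, 0 if it does not, -1 if PATTERN fails to compile.  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && line != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0 ? 1 : 0;
}

/* Simplified expected-body line: a literal tab separates the mnemonic
   from its operands, as in the "vclz.i%#\t%q0, %q1" template.  */
const char vclz_pattern[] = "^\tvclz\\.i16\tq[0-9]+, q[0-9]+$";
```

Under these assumptions, line_matches (vclz_pattern, "\tvclz.i16\tq0, q1") succeeds, while the same call on a line that separates the mnemonic from its operands with two spaces does not.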
---
 gcc/config/arm/mve.md |  2 +-
 .../arm/mve/intrinsics/vclzq_m_s16.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_m_s32.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_m_s8.c   | 33 +--
 .../arm/mve/intrinsics/vclzq_m_u16.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_m_u32.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_m_u8.c   | 33 +--
 .../gcc.target/arm/mve/intrinsics/vclzq_s16.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclzq_s32.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclzq_s8.c  | 24 --
 .../gcc.target/arm/mve/intrinsics/vclzq_u16.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclzq_u32.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclzq_u8.c  | 28 +---
 .../arm/mve/intrinsics/vclzq_x_s16.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_x_s32.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_x_s8.c   | 33 +--
 .../arm/mve/intrinsics/vclzq_x_u16.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_x_u32.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_x_u8.c   | 33 +--
 gcc/testsuite/gcc.target/arm/simd/mve-vclz.c  |  6 ++--
 20 files changed, 506 insertions(+), 62 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index e35ea5d9f9c..854371f7e11 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -448,7 +448,7 @@ (define_insn "@mve_vclzq_s"
(clz:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE"
-  "vclz.i%#  %q0, %q1"
+  "vclz.i%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 (define_expand "mve_vclzq_u"
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
index 9670f8f56f3..620314e4ff2 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclzt.i16   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vclzq_m_s16 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vclzt.i16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclzt.i16   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vclzq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s32.c
index 18427354570..dfda1e67287 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s32.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /*

[PATCH 08/23] arm: improve tests for vcmlaq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vcmlaq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vcmlaq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot180_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot180_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot180_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot180_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot270_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot270_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot270_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot270_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot90_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot90_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot90_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot90_m_f32.c: Likewise.
---
 .../arm/mve/intrinsics/vcmlaq_f16.c   | 24 +++--
 .../arm/mve/intrinsics/vcmlaq_f32.c   | 24 +++--
 .../arm/mve/intrinsics/vcmlaq_m_f16.c | 34 ---
 .../arm/mve/intrinsics/vcmlaq_m_f32.c | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot180_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot180_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot180_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot180_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot270_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot270_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot270_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot270_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot90_f16.c | 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot90_f32.c | 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot90_m_f16.c   | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot90_m_f32.c   | 34 ---
 16 files changed, 416 insertions(+), 48 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
index fa7d0c05e8c..bb8a99790a0 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vcmla.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?: @.*|)
+** ...
+*/
 float16x8_t
 foo (float16x8_t a, float16x8_t b, float16x8_t c)
 {
   return vcmlaq_f16 (a, b, c);
 }
 
-/* { dg-final { scan-assembler "vcmla.f16"  }  } */
 
+/*
+**foo1:
+** ...
+** vcmla.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?: @.*|)
+** ...
+*/
 float16x8_t
 foo1 (float16x8_t a, float16x8_t b, float16x8_t c)
 {
   return vcmlaq (a, b, c);
 }
 
-/* { dg-final { scan-assembler "vcmla.f16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
index 166bf421f14..71ec4b8479c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vcmla.f32   q[0-9]+, q[0-9]+, q[0-9]+, #0(?: @.*|)
+** ...
+*/
 float32x4_t
 foo (float32x4_t a, float32x4_t b, float32x4_t c)
 {
   return vcmlaq_f32 (a, b, c);
 }
 
-/* { dg-final { scan-assembler "vcmla.f32"  }  } */
 
+/*
+**foo1:
+** ...
+** vcmla.f32   q[0-9]+, q[0-9]+, q[0-9]+, #0(?: @.*|)
+** ...
+*/
 float32x4_t
 foo1 (float32x4_t a, float32x4_t b, float32x4_t c)
 {
   return vcmlaq (a, b, c);
 }
 
-/* { dg-final { scan-assembler "vcmla.f32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_m_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_m_f16.c
index 0929f5a0a89..3db345d0791 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq

[PATCH 11/23] arm: improve tests for vqdmladhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmladhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmladhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmladhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqdmladhq_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vqdmladhq_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vqdmladhq_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vqdmladhq_s16.c| 24 +++--
 .../arm/mve/intrinsics/vqdmladhq_s32.c| 24 +++--
 .../arm/mve/intrinsics/vqdmladhq_s8.c | 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
index 51cdadc9ece..aa9c78c883b 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmladhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmladhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
index 7e43fed1503..4694a6f9ec5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmladhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmladhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
index adf591041e3..c8dc67fdd12 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s8 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqdmladhq_m_s8 (inactive, a, b, p);

[PATCH 19/23] arm: improve tests for vqrdmlsdhxq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c| 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c| 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c | 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhxq_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmlsdhxq_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmlsdhxq_s8.c   | 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
index 2fbd351f3b4..3598f50ccba 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmlsdhxq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdhxt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmlsdhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdhxt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
index 324a6e63398..1ab22edf9ca 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmlsdhxq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdhxt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmlsdhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdhxt.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
index 287868b1190..01103e99b61 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s8  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t 

[PATCH 22/23] arm: improve tests for vld2q*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vld2q_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vld2q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_u8.c: Likewise.
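
Editorial illustration (not part of the patch): the improved tests expect both a vld20 and a vld21 instruction because the two together perform one 2-way de-interleaving load. A minimal scalar model of the result layout, assuming the usual ACLE semantics for vld2q (even-indexed elements land in val[0], odd-indexed ones in val[1]); model_int16x8x2 and model_vld2q_s16 are hypothetical names that do not exist in arm_mve.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for int16x8x2_t: two vectors of eight lanes.  */
struct model_int16x8x2
{
  int16_t val[2][8];
};

/* Scalar model of a 2-way de-interleaving load: element 2*i goes to
   val[0][i] and element 2*i + 1 goes to val[1][i].  */
struct model_int16x8x2
model_vld2q_s16 (const int16_t *addr)
{
  struct model_int16x8x2 r;
  size_t i;

  assert (addr != NULL);
  for (i = 0; i < 8; i++)
    {
      r.val[0][i] = addr[2 * i];
      r.val[1][i] = addr[2 * i + 1];
    }
  return r;
}
```

The model only describes where each loaded element ends up; how the work is split between the two instructions is a hardware detail it does not attempt to capture.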
---
 .../gcc.target/arm/mve/intrinsics/vld2q_f16.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_f32.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_s16.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_s32.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_s8.c  | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_u16.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_u32.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_u8.c  | 33 ---
 8 files changed, 224 insertions(+), 40 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
index 24e7a2ea4d0..81690b1022e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
@@ -1,22 +1,45 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vld20.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 float16x8x2_t
-foo (float16_t const * addr)
+foo (float16_t const *addr)
 {
   return vld2q_f16 (addr);
 }
 
-/* { dg-final { scan-assembler "vld20.16"  }  } */
-/* { dg-final { scan-assembler "vld21.16"  }  } */
 
+/*
+**foo1:
+** ...
+** vld20.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 float16x8x2_t
-foo1 (float16_t const * addr)
+foo1 (float16_t const *addr)
 {
   return vld2q (addr);
 }
 
-/* { dg-final { scan-assembler "vld20.16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
index 727484caaf6..d2ae31fa9e5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
@@ -1,22 +1,45 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vld20.32 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.32 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 float32x4x2_t
-foo (float32_t const * addr)
+foo (float32_t const *addr)
 {
   return vld2q_f32 (addr);
 }
 
-/* { dg-final { scan-assembler "vld20.32"  }  } */
-/* { dg-final { scan-assembler "vld21.32"  }  } */
 
+/*
+**foo1:
+** ...
+** vld20.32 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.32 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 float32x4x2_t
-foo1 (float32_t const * addr)
+foo1 (float32_t const *addr)
 {
   return vld2q (addr);
 }
 
-/* { dg-final { scan-assembler "vld20.32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
index f2864a00478..fb4dc1b4fcf 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
@@ -1,22 +1,45 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vld20.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 int16x8x2_t
-foo (int16_t const * addr)
+foo (int16_t const *addr)
 {
   return vld2q_s16

[PATCH 10/23] arm: improve tests and fix vqabsq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vqabsq_s): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqabsq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqabsq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqabsq_s8.c: Likewise.
---
 gcc/config/arm/mve.md |  2 +-
 .../arm/mve/intrinsics/vqabsq_m_s16.c | 33 +--
 .../arm/mve/intrinsics/vqabsq_m_s32.c | 33 +--
 .../arm/mve/intrinsics/vqabsq_m_s8.c  | 33 +--
 .../arm/mve/intrinsics/vqabsq_s16.c   | 28 +---
 .../arm/mve/intrinsics/vqabsq_s32.c   | 28 +---
 .../gcc.target/arm/mve/intrinsics/vqabsq_s8.c | 24 --
 7 files changed, 161 insertions(+), 20 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 0a243486bdb..600adf7d69b 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -388,7 +388,7 @@ (define_insn "mve_vqabsq_s"
 VQABSQ_S))
   ]
   "TARGET_HAVE_MVE"
-  "vqabs.s%# %q0, %q1"
+  "vqabs.s%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
index e74e04ac92f..7172ac5cddd 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s16  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vqabsq_m_s16 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqabst.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s16  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vqabsq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
index f6ca8a6c3d6..297cb196f1a 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s32  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vqabsq_m_s32 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqabst.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s32  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vqabsq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c
index d89a5aa3fa5..83c69931239 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s8

[PATCH 07/23] arm: improve tests for vcaddq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vcaddq_rot270_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vcaddq_rot270_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u8.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_s16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_s32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_s8.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u8.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_x_f16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_f32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s8.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_u16.c  | 33 --

[PATCH 20/23] arm: improve tests for vqrdmulhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmulhq_m_n_s16.c| 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_n_s32.c| 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_n_s8.c | 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_n_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_n_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_n_s8.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_s16.c| 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_s32.c| 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_s8.c | 24 +++--
 12 files changed, 312 insertions(+), 36 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
index c4b6b7e22f8..fc3a33073aa 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmulht.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
 {
   return vqrdmulhq_m_n_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmulht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmulht.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
 {
   return vqrdmulhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmulht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
index 6de3eb1cb9a..897ad5bd28c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmulht.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32_t b, mve_pred16_t p)
 {
   return vqrdmulhq_m_n_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmulht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmulht.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32_t b, mve_pred16_t p)
 {
   return vqrdmulhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmulht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/ar

[PATCH 14/23] arm: improve tests for vqrdmladhxq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmladhxq_m_s16.c| 34 ---
 .../arm/mve/intrinsics/vqrdmladhxq_m_s32.c| 34 ---
 .../arm/mve/intrinsics/vqrdmladhxq_m_s8.c | 34 ---
 .../arm/mve/intrinsics/vqrdmladhxq_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmladhxq_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmladhxq_s8.c   | 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
index 677efdcd1e4..1f68671b3f9 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmladhxq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladhxt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmladhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladhxt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
index 8ee8bbb420b..eaea6e1f482 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmladhxq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladhxt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmladhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladhxt.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
index 7cfa88fee28..0f582a91f3a 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s8  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t 

[PATCH 09/23] arm: improve tests for vcmulq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vcmulq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vcmulq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_x_f32.c: Likewise.
---
 .../arm/mve/intrinsics/vcmulq_f16.c   | 24 +++--
 .../arm/mve/intrinsics/vcmulq_f32.c   | 24 +++--
 .../arm/mve/intrinsics/vcmulq_m_f16.c | 34 ---
 .../arm/mve/intrinsics/vcmulq_m_f32.c | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot180_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot180_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot180_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot180_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot180_x_f16.c  | 33 --
 .../arm/mve/intrinsics/vcmulq_rot180_x_f32.c  | 33 --
 .../arm/mve/intrinsics/vcmulq_rot270_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot270_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot270_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot270_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot270_x_f16.c  | 33 --
 .../arm/mve/intrinsics/vcmulq_rot270_x_f32.c  | 33 --
 .../arm/mve/intrinsics/vcmulq_rot90_f16.c | 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot90_f32.c | 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot90_m_f16.c   | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot90_m_f32.c   | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot90_x_f16.c   | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot90_x_f32.c   | 34 ---
 .../arm/mve/intrinsics/vcmulq_x_f16.c | 33 --
 .../arm/mve/intrinsics/vcmulq_x_f32.c | 33 --
 24 files changed, 656 insertions(+), 74 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
index 142c315ecf5..456370e1de1 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vcmul.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?:@.*|)
+** ...
+*/
 float16x8_t
 foo (float16x8_t a, float16x8_t b)
 {
   return vcmulq_f16 (a, b);
 }
 
-/* { dg-final { scan-assembler "vcmul.f16"  }  } */
 
+/*
+**foo1:
+** ...
+** vcmul.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?:@.*|)
+** ...
+*/
 float16x8_t
 foo1 (float16x8_t a, float16x8_t b)
 {
   return vcmulq (a, b);
 }
 
-/* { dg-final { scan-assembler "vcmul.f16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f32.c
index 158d750793d..64db652a1a1 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f32.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m

[PATCH 00/23] arm: rework MVE testsuite and rework backend where necessary (3rd chunk)

2023-01-20 Thread Andrea Corallo via Gcc-patches
Hi all,

this 3rd series, like the previous ones, reworks the arm MVE
testsuite for better coverage.  Along the way, some trivial fixes to
the backend are made.

23/23 also adds some extern "C" I forgot to add with the previous
series in order to fix those tests for C++.
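
For readers unfamiliar with the guard, here is a minimal sketch of the
pattern the tests in this series wrap around their functions.  It rests
on an assumption not stated in the cover letter: that the guard is
needed because check-function-bodies looks for a plain label such as
"foo" in the assembly, which a C++ compiler would otherwise mangle.
The names add_one and check_add_one are hypothetical and only serve
the illustration.

```c
#include <assert.h>

#ifdef __cplusplus
extern "C" {
#endif

/* With C linkage the symbol is emitted as plain `add_one` whether the
   file is compiled as C or as C++, so a body check keyed on that label
   can match in both modes.  */
int
add_one (int x)
{
  return x + 1;
}

/* Small self-check so the sketch can be exercised directly.  */
void
check_add_one (void)
{
  assert (add_one (41) == 42);
}

#ifdef __cplusplus
}
#endif
```

Compiled as C the guard lines vanish in preprocessing; compiled as C++
they only change the linkage of the enclosed declarations.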

Best Regards

  Andrea

Andrea Corallo (23):
  arm: improve tests and fix vclsq*
  arm: improve tests and fix vclzq*
  arm: improve tests and fix vnegq*
  arm: improve tests for vmulhq*
  arm: improve tests for vmullbq*
  arm: improve tests for vmulltq*
  arm: improve tests for vcaddq*
  arm: improve tests for vcmlaq*
  arm: improve tests for vcmulq*
  arm: improve tests and fix vqabsq*
  arm: improve tests for vqdmladhq*
  arm: improve tests for vqdmladhxq*
  arm: improve tests for vqrdmladhq*
  arm: improve tests for vqrdmladhxq*
  arm: improve tests for vqrdmlashq*
  arm: improve tests for vqdmlsdhq*
  arm: improve tests for vqdmlsdhxq*
  arm: improve tests for vqrdmlsdhq*
  arm: improve tests for vqrdmlsdhxq*
  arm: improve tests for vqrdmulhq*
  arm: improve tests and fix vqnegq*
  arm: improve tests for vld2q*
  arm: fix missing extern "C" in MVE tests

 gcc/config/arm/mve.md | 12 +++
 .../arm/mve/intrinsics/vcaddq_rot270_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u8.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_s16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_s32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_s8.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u8.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_x_f16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_f32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s8.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_u16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_u32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_u8.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_f16.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_f32.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_m_f16.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_f32.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_s16.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_s32.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_s8.c| 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_u16.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_u32.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_u8.c| 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_s16.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_s32.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_s8.c  | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_u16.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_u32.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_u8.c  | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_x_f16.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_f32.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_s16.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_s32.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_s8.c| 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_u16.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_u32.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_u8.c| 33 --
 .../arm/mve/intrinsics/vclsq_m_s16.c  | 33 --
 .../arm/mve/intrinsics/vclsq_m_s32.c  | 33 --
 .../arm/mve/intrinsics/vclsq_m_s8.c   | 33 --
 .../gcc.target/arm/mve/intrinsics/vclsq_s16.c | 28 ---
 .../gcc.target/arm/mve/intrinsics/vclsq_s32.c | 28 ---
 .../gcc.target/arm/mve/intrinsics/vclsq_s8.c  | 24 +++--
 .../arm/mve/intrinsics/v

[PATCH 23/23] arm: fix missing extern "C" in MVE tests

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vhaddq_n_s16.c: Add missing extern
"C".
* gcc.target/arm/mve/intrinsics/vhaddq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s8.c: 

Re: [PATCH 3/4] libbacktrace: work with aslr on windows

2023-01-20 Thread Gabriel Ravier via Gcc-patches

On 1/20/23 14:39, Eli Zaretskii via Gcc wrote:

From: Björn Schäpers 
Date: Fri, 20 Jan 2023 11:54:08 +0100

@@ -856,7 +870,12 @@ coff_add (struct backtrace_state *state, int descriptor,
  + (sections[i].offset - min_offset));
  }
  
-  if (!backtrace_dwarf_add (state, /* base_address */ 0, &dwarf_sections,

+#ifdef HAVE_WINDOWS_H
+module_handle = (uintptr_t) GetModuleHandleW (NULL);
+base_address = module_handle - image_base;
+#endif
+
+  if (!backtrace_dwarf_add (state, base_address, &dwarf_sections,
0, /* FIXME: is_bigendian */
NULL, /* altlink */
error_callback, data, fileline_fn,

Why do you force using the "wide" APIs here?  Won't GetModuleHandle do
the job, whether it resolves to GetModuleHandleA or GetModuleHandleW?


I would expect the reason to be either that:

- using wide APIs with Windows is generally considered to be a best 
practice, even when not strictly needed (and in this case I can't see 
any problem with doing so, unless maybe we want the code to work with 
Windows 95 or something like that...)


- using the narrow API somehow has an actual drawback; for example, it 
might not work if the name of the exe file that the NULL argument 
refers to contains wide characters
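

As an aside on the hunk quoted above, the adjustment it computes can be
sketched in portable C.  This is only an illustration under stated
assumptions: load_bias, runtime_address and check_load_bias are
hypothetical names, not libbacktrace functions, and the numeric values
stand in for what GetModuleHandleW (NULL) and the image base recorded
in the executable would supply on a real system.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between where the image was actually loaded and the base
   address recorded at link time.  Unsigned arithmetic wraps, so the
   result is still usable when the image is loaded below its preferred
   base.  */
uintptr_t
load_bias (uintptr_t module_handle, uintptr_t image_base)
{
  return module_handle - image_base;
}

/* Translate an address taken from the debug info (link-time view)
   into the address that is valid in the running process.  */
uintptr_t
runtime_address (uintptr_t link_time_address, uintptr_t bias)
{
  return link_time_address + bias;
}

/* Self-check with made-up addresses.  */
void
check_load_bias (void)
{
  /* Image relocated upwards.  */
  uintptr_t up = load_bias (0x7f0000u, 0x400000u);
  assert (up == 0x3f0000u);
  assert (runtime_address (0x401000u, up) == 0x7f1000u);

  /* Image relocated downwards: the bias wraps, the sum does not.  */
  uintptr_t down = load_bias (0x200000u, 0x400000u);
  assert (runtime_address (0x401000u, down) == 0x201000u);
}
```

When the image is loaded at its preferred base the bias is zero, which
matches the constant 0 that was passed before the patch.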




Re: [PATCH][GCC] arm: Add support for new frame unwinding instruction "0xb5".

2023-01-20 Thread Richard Earnshaw via Gcc-patches




On 10/11/2022 10:37, Srinath Parvathaneni via Gcc-patches wrote:

Hi,

This patch adds support for Arm frame unwinding instruction "0xb5" [1]. When
an exception is taken and the "0xb5" instruction is encountered during runtime
stack unwinding, we use the effective vsp as the modifier in pointer
authentication.  If the "0xb5" instruction has not been encountered by the time
stack unwinding completes, the CFA is used as the modifier instead.

[1] https://github.com/ARM-software/abi-aa/releases/download/2022Q3/ehabi32.pdf

Regression tested on arm-none-eabi target and found no regressions.

Ok for master?

Regards,
Srinath.

gcc/ChangeLog:

2022-11-09  Srinath Parvathaneni  

 * libgcc/config/arm/pr-support.c (__gnu_unwind_execute): Decode opcode
"0xb5".


### Attachment also inlined for ease of reply###


diff --git a/libgcc/config/arm/pr-support.c b/libgcc/config/arm/pr-support.c
index 
e48854587c667a959aa66ccc4982231f6ecc..73e4942a39b34a83c2da85def6b13e82ec501552
 100644
--- a/libgcc/config/arm/pr-support.c
+++ b/libgcc/config/arm/pr-support.c
@@ -107,7 +107,9 @@ __gnu_unwind_execute (_Unwind_Context * context, 
__gnu_unwind_state * uws)
_uw op;
int set_pc;
int set_pac = 0;
+  int set_pac_sp = 0;
_uw reg;
+  _uw sp;
  
set_pc = 0;

for (;;)
@@ -124,10 +126,11 @@ __gnu_unwind_execute (_Unwind_Context * context, 
__gnu_unwind_state * uws)
  #if defined(TARGET_HAVE_PACBTI)
  if (set_pac)
{
- _uw sp;
  _uw lr;
  _uw pac;
- _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32, &sp);
+ if (!set_pac_sp)
+   _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32,
+&sp);
  _Unwind_VRS_Get (context, _UVRSC_CORE, R_LR, _UVRSD_UINT32, &lr);
  _Unwind_VRS_Get (context, _UVRSC_PAC, R_IP,
   _UVRSD_UINT32, &pac);
@@ -259,7 +262,19 @@ __gnu_unwind_execute (_Unwind_Context * context, 
__gnu_unwind_state * uws)
  continue;
}
  
-	  if ((op & 0xfc) == 0xb4)  /* Obsolete FPA.  */

+ /* Use current VSP as modifier in PAC validation.  */
+ if (op == 0xb5)
+   {
+ if (set_pac)
+   _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32,
+&sp);
+ else
+   return _URC_FAILURE;


I don't think you need to worry about the case when set_pac is false; in 
fact, I don't think you need to even test set_pac here.  It's harmless 
if this opcode appears and then we never do the authentication, so just 
record the SP value at this point.



+ set_pac_sp = 1;
+ continue;
+   }
+
+ if ((op & 0xfd) == 0xb6)  /* Obsolete FPA.  */


No, this is logically impossible (0xfd is binary 1111_1101, while 0xb6 
is binary 1011_0110, so the 0x02 bit that 0xb6 requires can never be 
set after the mask).  But you don't need to change the condition here 
at all, because we've already taken out the case you're worried about 
immediately above (and ended that block with a 'continue').
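
The point about the masks can be checked by brute force over all 256
opcode values.  The sketch below is an illustration only;
count_matches, is_obsolete_fpa and check_unwind_masks are hypothetical
helper names, not part of pr-support.c.

```c
#include <assert.h>
#include <stdint.h>

/* Number of byte values OP for which (OP & MASK) == VALUE.  */
int
count_matches (uint8_t mask, uint8_t value)
{
  int n = 0;
  for (int op = 0; op <= 0xff; op++)
    if ((op & mask) == value)
      n++;
  return n;
}

/* The unchanged condition: matches the whole group 0xb4..0xb7.  */
int
is_obsolete_fpa (uint8_t op)
{
  return (op & 0xfc) == 0xb4;
}

void
check_unwind_masks (void)
{
  /* Masking with 0xfd clears the 0x02 bit, which 0xb6 has set, so the
     proposed condition can never be true.  */
  assert (count_matches (0xfd, 0xb6) == 0);

  /* The original mask accepts exactly four opcodes.  */
  assert (count_matches (0xfc, 0xb4) == 4);

  /* 0xb5 is one of them, which is why it has to be handled (and the
     loop continued) before this test is reached.  */
  assert (is_obsolete_fpa (0xb5));
}
```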



return _URC_FAILURE;
 > /* op & 0xf8 == 0xb8.  */





R.


Re: [GCC][PATCH v4] arm: Add pacbti related multilib support for armv8.1-m.main.

2023-01-20 Thread Richard Earnshaw via Gcc-patches




On 13/01/2023 17:46, Srinath Parvathaneni via Gcc-patches wrote:

Hi,

This patch adds support for pacbti multilib linking by making
"-mbranch-protection=none" the default multilib option for the
arm-none-eabi target.

Eg 1.

If the passed command line flags are (without mbranch-protection):
a) -march=armv8.1-m.main+mve -mfloat-abi=hard -mfpu=auto

"-mbranch-protection=none" will be used in the multilib matching.

Eg 2.

If the passed command line flags are (with mbranch-protection):
a) -march=armv8.1-m.main+mve+pacbti -mfloat-abi=hard -mfpu=auto  
-mbranch-protection=pac-ret

"-mbranch-protection=standard" will be used in the multilib matching.

Regression tested on arm-none-eabi and bootstrapped on arm-none-linux-gnueabihf.

Ok for master?

Regards,
Srinath.

gcc/ChangeLog:

2023-01-11  Srinath Parvathaneni  

 * config.gcc ($tm_file): Update variable.
 * config/arm/arm-mlib.h: Create new header file.
 * config/arm/t-rmprofile (MULTI_ARCH_DIRS_RM): Rename 
mbranch-protection
 multilib arch directory.
 (MULTILIB_REUSE): Add multilib reuse rules.
 (MULTILIB_MATCHES): Add multilib match rules.

gcc/testsuite/ChangeLog:

2023-01-11  Srinath Parvathaneni  

 * gcc.target/arm/multilib.exp (multilib_config "rmprofile"): Update
 tests.
 * gcc.target/arm/pac-12.c: New test.
 * gcc.target/arm/pac-13.c: Likewise.
 * gcc.target/arm/pac-14.c: Likewise.


OK.

R.


Re: [committed] C-SKY: Define SYSROOT_SUFFIX_SPEC.

2023-01-20 Thread Joseph Myers
On Fri, 13 Jan 2023, Xianmiao Qu via Gcc-patches wrote:

> The earlier patch
>   https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575418.html
> refined the way the sysroot suffix is generated, but it can't find
> the right path for all CPUs. SYSROOT_SUFFIX_SPEC should be defined
> to fix it.

I think this caused the build failures with build-many-glibcs.py shown by 
my bot.  SYSROOT_SUFFIX_SPEC should not be defined for a 
--disable-multilib build; in such a build you can expect a single sysroot 
without a suffix involved.  Thus, you should put the SYSROOT_SUFFIX_SPEC 
definition in a separate header only used if test "x${enable_multilib}" = 
xyes (as in the config.gcc code removed in the older patch you refer to 
above), or something similar.

https://sourceware.org/pipermail/libc-testresults/2023q1/010706.html

The error is:

/scratch/jmyers/glibc-bot/install/compilers/csky-linux-gnuabiv2/csky-glibc-linux-gnuabiv2/bin/ld:
 cannot find -lc: No such file or directory

(linking shared libgcc).

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH v2][GCC] arm: Add support for new frame unwinding instruction "0xb5".

2023-01-20 Thread Srinath Parvathaneni via Gcc-patches
Hi,

This patch adds support for Arm frame unwinding instruction "0xb5" [1]. When
an exception is taken and the "0xb5" instruction is encountered during runtime
stack unwinding, we use the effective vsp as the modifier in pointer
authentication.  If the "0xb5" instruction has not been encountered by the time
stack unwinding completes, the CFA is used as the modifier instead.

[1] https://github.com/ARM-software/abi-aa/releases/download/2022Q3/ehabi32.pdf
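
The rule described above can be modelled with a toy opcode loop.  This
is a sketch under stated assumptions, not the libgcc code: the type
pac_modifier_choice and the function choose_pac_modifier are
hypothetical, only three opcodes are modelled (0x00..0x3f as
"vsp += (imm << 2) + 4", 0xb0 as "finish", and 0xb5), and every other
opcode is simply ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
  uint32_t modifier;      /* Value that would feed PAC validation.  */
  int used_recorded_vsp;  /* 1 if a 0xb5 opcode fixed the modifier.  */
} pac_modifier_choice;

pac_modifier_choice
choose_pac_modifier (const uint8_t *ops, size_t n_ops, uint32_t vsp)
{
  uint32_t recorded_vsp = 0;
  int have_recorded_vsp = 0;

  assert (ops != NULL || n_ops == 0);

  for (size_t i = 0; i < n_ops; i++)
    {
      uint8_t op = ops[i];

      if (op == 0xb0)           /* Finish.  */
        break;

      if ((op & 0xc0) == 0x00)  /* vsp = vsp + (imm6 << 2) + 4.  */
        {
          vsp += ((uint32_t) (op & 0x3f) << 2) + 4;
          continue;
        }

      if (op == 0xb5)           /* Remember vsp as it is right now.  */
        {
          recorded_vsp = vsp;
          have_recorded_vsp = 1;
          continue;
        }

      /* All other opcodes are outside this toy model.  */
    }

  pac_modifier_choice choice;
  choice.used_recorded_vsp = have_recorded_vsp;
  /* Without 0xb5 the final vsp (the CFA) is the modifier.  */
  choice.modifier = have_recorded_vsp ? recorded_vsp : vsp;
  return choice;
}
```

For the sequence { 0x01, 0xb5, 0x02, 0xb0 } starting at vsp 0x1000 the
model picks 0x1008, the vsp at the moment 0xb5 was seen; without the
0xb5 it picks the final vsp, 0x1014.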

Regression tested on arm-none-eabi target and found no regressions.

Ok for master?

Regards,
Srinath.

gcc/ChangeLog:

2022-11-09  Srinath Parvathaneni  

* libgcc/config/arm/pr-support.c (__gnu_unwind_execute): Decode opcode 
"0xb5".


### Attachment also inlined for ease of reply###


diff --git a/libgcc/config/arm/pr-support.c b/libgcc/config/arm/pr-support.c
index e48854587c667a959aa66ccc4982231f6ecc..1fbc41e17c227c21af1937344ded2a7fd80e61df 100644
--- a/libgcc/config/arm/pr-support.c
+++ b/libgcc/config/arm/pr-support.c
@@ -107,7 +107,9 @@ __gnu_unwind_execute (_Unwind_Context * context, __gnu_unwind_state * uws)
   _uw op;
   int set_pc;
   int set_pac = 0;
+  int set_pac_sp = 0;
   _uw reg;
+  _uw sp;
 
   set_pc = 0;
   for (;;)
@@ -124,10 +126,11 @@ __gnu_unwind_execute (_Unwind_Context * context, __gnu_unwind_state * uws)
 #if defined(TARGET_HAVE_PACBTI)
  if (set_pac)
{
- _uw sp;
  _uw lr;
  _uw pac;
- _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32, &sp);
+ if (!set_pac_sp)
+   _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32,
+&sp);
  _Unwind_VRS_Get (context, _UVRSC_CORE, R_LR, _UVRSD_UINT32, &lr);
  _Unwind_VRS_Get (context, _UVRSC_PAC, R_IP,
   _UVRSD_UINT32, &pac);
@@ -259,6 +262,14 @@ __gnu_unwind_execute (_Unwind_Context * context, __gnu_unwind_state * uws)
  continue;
}
 
+ /* Use current VSP as modifier in PAC validation.  */
+ if (op == 0xb5)
+   {
+ _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32, &sp);
+ set_pac_sp = 1;
+ continue;
+   }
+
  if ((op & 0xfc) == 0xb4)  /* Obsolete FPA.  */
return _URC_FAILURE;
 








Re: git out-of-order commit (was Re: [PATCH] Fortran: Remove unused declaration)

2023-01-20 Thread Jason Merrill via Gcc-patches
On Thu, Jan 19, 2023 at 11:26 PM Bernhard Reutner-Fischer
 wrote:
>
> On 19 January 2023 20:39:08 CET, Jason Merrill  wrote:
> >On Sat, Nov 12, 2022 at 4:24 PM Harald Anlauf via Gcc-patches
> > wrote:
> >>
> >> Am 12.11.22 um 22:05 schrieb Bernhard Reutner-Fischer via Gcc-patches:
> >> > This function definition was removed years ago, remove its prototype.
> >> >
> >> > gcc/fortran/ChangeLog:
> >> >
> >> >   * gfortran.h (gfc_check_include): Remove declaration.
> >> > ---
> >> >   gcc/fortran/gfortran.h | 1 -
> >> >   1 file changed, 1 deletion(-)
> >> > ---
> >> > Regtests cleanly, ok for trunk?
> >> >
> >> > diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
> >> > index c4deec0d5b8..ce3ad61bb52 100644
> >> > --- a/gcc/fortran/gfortran.h
> >> > +++ b/gcc/fortran/gfortran.h
> >> > @@ -3208,7 +3208,6 @@ int gfc_at_eof (void);
> >> >   int gfc_at_bol (void);
> >> >   int gfc_at_eol (void);
> >> >   void gfc_advance_line (void);
> >> > -int gfc_check_include (void);
> >> >   int gfc_define_undef_line (void);
> >> >
> >> >   int gfc_wide_is_printable (gfc_char_t);
> >>
> >> OK, thanks.
> >
> >Somehow this was applied with a CommitDate in 2021, breaking scripts
> >that assume monotonically increasing CommitDate.  Anyone know how that
> >could have happened?
>
> Sorry for that.
> I think I cherry-picked this commit to master before pushing it, not 100%
> sure though.

You would have also needed to override the commit date with
GIT_COMMITTER_DATE.  Do you remember using that environment variable
at all?

> What shall we do now?

I don't think there's anything we can do about this commit at this
point; rewriting the git history would be a bigger disruption than
leaving it alone.

Martin, I wonder about having the hooks reject out-of-order CommitDate
in future?

Jason
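
The check such a hook would have to perform can be sketched as follows. This
is only an illustration in C of the monotonicity test itself; the real hooks
are not written this way, and first_out_of_order and its parameters are
hypothetical names. The input is assumed to be the committer timestamps of
the pushed commits, oldest first, preceded by the timestamp of the current
branch tip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first commit whose committer timestamp is older
   than that of its predecessor, or -1 if the sequence never goes
   backwards.  Equal timestamps are accepted.  */
ptrdiff_t
first_out_of_order (const int64_t *commit_dates, size_t n)
{
  assert (commit_dates != NULL || n == 0);

  for (size_t i = 1; i < n; i++)
    if (commit_dates[i] < commit_dates[i - 1])
      return (ptrdiff_t) i;
  return -1;
}
```

A hook built on this idea would reject the push whenever the result is not -1
and report the offending commit.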



Re: [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]

2023-01-20 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 19, 2023 at 03:40:19PM +0100, Tobias Burnus wrote:
> +  gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
> +
> +  gfc_se se;
> +  tree tree_var, a1, a2;
> +  a1 = integer_one_node;
> +  a2 = integer_zero_node;
> +
> +  gfc_init_se (&se, NULL);
> +  gfc_conv_expr_lhs (&se, code->ext.iterator->var);
> +  gfc_add_block_to_block (pblock, &se.pre);
> +  tree_var = se.expr;
> +
> +  {
> +/* FIXME: Handle non-unity iterations, cf. PR fortran/107424.

I think instead of non-unity etc. it is better to talk about
constant step 1 or -1.

> +   The issue is that for those a 'count' variable is used.  */
> +dovar_init *di;
> +unsigned ix;
> +tree t = tree_var;
> +while (TREE_CODE (t) == INDIRECT_REF)
> +  t = TREE_OPERAND (t, 0);
> +FOR_EACH_VEC_ELT (*inits, ix, di)
> +  {
> + tree t2 = di->var;
> + while (TREE_CODE (t2) == INDIRECT_REF)
> +   t2 = TREE_OPERAND (t2, 0);

The actual problem with non-simple loops in non-rectangular loop nests
arises both when it is an inner loop which uses some outer loop's iterator
and when it is the outer loop whose iterator is used; neither of those cases
will be handled properly.  In the former case, instead of
having lb and ub expressions in the canonicalized form var-outer * m + a,
lb will be 0 (that is fine) and ub will be
(var-outer * m2 + a2 + step - var-outer * m1 - a1) / step
or so (sure, we can simplify that to
(var-outer * (m2 - m1) + (a2 + step - a1)) / step
but the division remains).  The latter case is bad because we
need var-outer, but we actually compute some artificial count iterator
and var-outer is only initialized in the body of the loop.
This sorry_at seems to handle just one of those, when the outer
loop whose var-outer is referenced is not simple, no?
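
As a worked example of the count expression above, here is a small C sketch.
The function names are hypothetical and plain long arithmetic stands in for
the trees the front end builds; the clamp to zero mirrors the usual Fortran
DO trip-count rule. It computes the count once from lb = var-outer * m1 + a1
and ub = var-outer * m2 + a2, and once from the regrouped form, to show that
both agree and that the division by step is still there.

```c
#include <assert.h>

/* Trip count of "do var = lb, ub, step": (ub - lb + step) / step,
   clamped to zero when the loop body never runs.  */
long
trip_count (long lb, long ub, long step)
{
  assert (step != 0);
  long n = (ub - lb + step) / step;
  return n > 0 ? n : 0;
}

/* Count for an inner loop whose bounds depend on the outer iterator:
   lb = outer * m1 + a1, ub = outer * m2 + a2.  */
long
nonrect_trip_count (long outer, long m1, long a1, long m2, long a2,
		    long step)
{
  return trip_count (outer * m1 + a1, outer * m2 + a2, step);
}

/* The same count with the outer terms grouped together; the division
   by step cannot be removed.  */
long
nonrect_trip_count_grouped (long outer, long m1, long a1, long m2, long a2,
			    long step)
{
  assert (step != 0);
  long n = (outer * (m2 - m1) + (a2 + step - a1)) / step;
  return n > 0 ? n : 0;
}
```

For instance, with outer = 3, lb = outer and ub = 2 * outer + 1, a step of 2
gives the iterations 3, 5 and 7 in both forms.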

I wonder if it wouldn't be cleaner and easier to simply remember for
each loop, in an XALLOCAVEC array, whether it was simple or not and why.
From the:
  if (VAR_P (dovar))
    {
      if (integer_onep (step))
        simple = 1;
      else if (tree_int_cst_equal (step, integer_minus_one_node))
        simple = -1;
    }
  else
    dovar_decl
      = gfc_trans_omp_variable (code->ext.iterator->var->symtree->n.sym,
                                false);
remember whether it was simple (1/-1), VAR_P but !simple (then, if needed
for non-rect, we would sorry_at about the step not being constant 1 or -1),
or the !VAR_P case.
Then the non-rect sorry can be emitted for both cases easily,
especially if you precompute the:
  if (VAR_P (dovar))
    {
      if (integer_onep (step))
        simple_loop[i] = 1;
      else if (tree_int_cst_equal (step, integer_minus_one_node))
        simple_loop[i] = -1;
      else
        simple_loop[i] = 0;
    }
  else
    simple_loop[i] = 2;
early, and in this function check it for both loop_n and i.
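
To illustrate the bookkeeping suggested above, here is a standalone C sketch.
It does not use the real tree API: classify_loop, nonrect_pair_supported and
the boolean parameters are hypothetical stand-ins for VAR_P (dovar) and the
checks on step, and only the 1 / -1 / 0 / 2 encoding is taken from the
snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Classify one loop of the nest:
     1   iterator is a plain variable and step is constant 1
    -1   iterator is a plain variable and step is constant -1
     0   iterator is a plain variable, any other step
     2   iterator is not a plain variable.  */
int
classify_loop (bool dovar_is_var, bool step_is_constant, long step)
{
  if (!dovar_is_var)
    return 2;
  if (step_is_constant && step == 1)
    return 1;
  if (step_is_constant && step == -1)
    return -1;
  return 0;
}

/* A non-rectangular dependency of loop INNER on the iterator of loop OUTER
   is only handled when both loops are simple, so check both entries.  */
bool
nonrect_pair_supported (const int *simple_loop, int depth, int outer,
			int inner)
{
  assert (simple_loop != NULL);
  assert (0 <= outer && outer < inner && inner < depth);

  bool outer_simple = simple_loop[outer] == 1 || simple_loop[outer] == -1;
  bool inner_simple = simple_loop[inner] == 1 || simple_loop[inner] == -1;
  return outer_simple && inner_simple;
}
```

With the array filled in once up front, the place that diagnoses
non-rectangular bounds only needs the two lookups, and the stored value says
which of the two reasons to report.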

> + if (t == t2)
> +   {
> + HOST_WIDE_INT intval;
> + if (gfc_extract_hwi (code->ext.iterator->step, &intval, 0) == 0
> + && intval != 1 && intval != -1)
> +   sorry_at (gfc_get_location (&code->loc),
> + "non-rectangular loop nest with non-unit loop iteration"
> + " step for %qs", var->name);

I'd say step other than constant 1 or -1.

> +  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops

unit ?

> +  ! Then same, execpt use nonunit stride for 'k'

except, non-unit ?

> +  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops
> +  ! Then same, execpt use nonunit stride for 'k'

2x again
(and some more later).

Jakub



RE: [PATCH 01/23] arm: improve tests and fix vclsq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches
Hi Andrea,

> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:39 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 01/23] arm: improve tests and fix vclsq*
> 
> gcc/ChangeLog:
> 
>   * config/arm/mve.md (mve_vclsq_s): Fix spacing.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vclsq_m_s16.c: Improve test.

I'd prefer something a bit more descriptive, like "Use check-function-bodies 
instead of scan-assembler checks.  Use extern "C" for C++ testing."
Ok with a fixed ChangeLog.
Thanks,
Kyrill

>   * gcc.target/arm/mve/intrinsics/vclsq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclsq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclsq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclsq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclsq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclsq_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclsq_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclsq_x_s8.c: Likewise.

> ---
>  gcc/config/arm/mve.md |  2 +-
>  .../arm/mve/intrinsics/vclsq_m_s16.c  | 33 +--
>  .../arm/mve/intrinsics/vclsq_m_s32.c  | 33 +--
>  .../arm/mve/intrinsics/vclsq_m_s8.c   | 33 +--
>  .../gcc.target/arm/mve/intrinsics/vclsq_s16.c | 28 +---
>  .../gcc.target/arm/mve/intrinsics/vclsq_s32.c | 28 +---
>  .../gcc.target/arm/mve/intrinsics/vclsq_s8.c  | 24 --
>  .../arm/mve/intrinsics/vclsq_x_s16.c  | 33 +--
>  .../arm/mve/intrinsics/vclsq_x_s32.c  | 33 +--
>  .../arm/mve/intrinsics/vclsq_x_s8.c   | 33 +--
>  10 files changed, 251 insertions(+), 29 deletions(-)
> 
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index f123edc449b..e35ea5d9f9c 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -469,7 +469,7 @@ (define_insn "mve_vclsq_s"
>VCLSQ_S))
>]
>"TARGET_HAVE_MVE"
> -  "vcls.s%#  %q0, %q1"
> +  "vcls.s%#\t%q0, %q1"
>[(set_attr "type" "mve_move")
>  ])
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
> index d0eb7008537..1996ac8b03e 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
> @@ -1,22 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr   p0, (?:ip|fp|r[0-9]+)(?: @.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vclst.s16   q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
>  {
>return vclsq_m_s16 (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vclst.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr   p0, (?:ip|fp|r[0-9]+)(?: @.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vclst.s16   q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
>  {
>return vclsq_m (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
> index b6d7088a8e7..f51841d024e 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
> @@ -1,22 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr   p0, (?:ip|fp|r[0-9]+)(?: @.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vclst.s32   q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
>  {
>return vclsq_m_s32 (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vclst.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr   p0, (?:ip|fp|r[0-9]+)(?: @.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vclst.s32   q

RE: [PATCH 02/23] arm: improve tests and fix vclzq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:39 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 02/23] arm: improve tests and fix vclzq*
> 
> gcc/ChangeLog:
> 
>   * config/arm/mve.md (@mve_vclzq_s): Fix spacing.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vclzq_m_s16.c: Improve test.

As in patch 1/23, ok with a more descriptive entry.
Thanks,
Kyrill

>   * gcc.target/arm/mve/intrinsics/vclzq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_m_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_m_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_m_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_x_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_x_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vclzq_x_u8.c: Likewise.
>   * gcc.target/arm/simd/mve-vclz.c: Update test.
> ---
>  gcc/config/arm/mve.md |  2 +-
>  .../arm/mve/intrinsics/vclzq_m_s16.c  | 33 +--
>  .../arm/mve/intrinsics/vclzq_m_s32.c  | 33 +--
>  .../arm/mve/intrinsics/vclzq_m_s8.c   | 33 +--
>  .../arm/mve/intrinsics/vclzq_m_u16.c  | 33 +--
>  .../arm/mve/intrinsics/vclzq_m_u32.c  | 33 +--
>  .../arm/mve/intrinsics/vclzq_m_u8.c   | 33 +--
>  .../gcc.target/arm/mve/intrinsics/vclzq_s16.c | 28 +---
>  .../gcc.target/arm/mve/intrinsics/vclzq_s32.c | 28 +---
>  .../gcc.target/arm/mve/intrinsics/vclzq_s8.c  | 24 --
>  .../gcc.target/arm/mve/intrinsics/vclzq_u16.c | 28 +---
>  .../gcc.target/arm/mve/intrinsics/vclzq_u32.c | 28 +---
>  .../gcc.target/arm/mve/intrinsics/vclzq_u8.c  | 28 +---
>  .../arm/mve/intrinsics/vclzq_x_s16.c  | 33 +--
>  .../arm/mve/intrinsics/vclzq_x_s32.c  | 33 +--
>  .../arm/mve/intrinsics/vclzq_x_s8.c   | 33 +--
>  .../arm/mve/intrinsics/vclzq_x_u16.c  | 33 +--
>  .../arm/mve/intrinsics/vclzq_x_u32.c  | 33 +--
>  .../arm/mve/intrinsics/vclzq_x_u8.c   | 33 +--
>  gcc/testsuite/gcc.target/arm/simd/mve-vclz.c  |  6 ++--
>  20 files changed, 506 insertions(+), 62 deletions(-)
> 
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index e35ea5d9f9c..854371f7e11 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -448,7 +448,7 @@ (define_insn "@mve_vclzq_s"
>   (clz:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w")))
>]
>"TARGET_HAVE_MVE"
> -  "vclz.i%#  %q0, %q1"
> +  "vclz.i%#\t%q0, %q1"
>[(set_attr "type" "mve_move")
>  ])
>  (define_expand "mve_vclzq_u"
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
> index 9670f8f56f3..620314e4ff2 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
> @@ -1,22 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr   p0, (?:ip|fp|r[0-9]+)(?: @.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vclzt.i16   q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
>  {
>return vclzq_m_s16 (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vclzt.i16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr   p0, (?:ip|fp|r[0-9]+)(?: @.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vclzt.i16   q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
>  {
>return vclzq_m (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> +#ifdef __cplusplus
>

RE: [PATCH 03/23] arm: improve tests and fix vnegq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:39 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 03/23] arm: improve tests and fix vnegq*
> 
> gcc/ChangeLog:
> 
>   * config/arm/mve.md (mve_vnegq_f,
> mve_vnegq_s):
>   Fix spacing.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vnegq_f16.c: Improve test.

Ok as before.
Thanks,
Kyrill

>   * gcc.target/arm/mve/intrinsics/vnegq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_m_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_x_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_x_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vnegq_x_s8.c: Likewise.
>   * gcc.target/arm/simd/mve-vneg.c: Update test.
>   * gcc.target/arm/simd/mve-vshr.c: Likewise
> ---
>  gcc/config/arm/mve.md |  4 +--
>  .../gcc.target/arm/mve/intrinsics/vnegq_f16.c | 30 -
>  .../gcc.target/arm/mve/intrinsics/vnegq_f32.c | 30 -
>  .../arm/mve/intrinsics/vnegq_m_f16.c  | 33 +--
>  .../arm/mve/intrinsics/vnegq_m_f32.c  | 33 +--
>  .../arm/mve/intrinsics/vnegq_m_s16.c  | 33 +--
>  .../arm/mve/intrinsics/vnegq_m_s32.c  | 33 +--
>  .../arm/mve/intrinsics/vnegq_m_s8.c   | 33 +--
>  .../gcc.target/arm/mve/intrinsics/vnegq_s16.c | 28 +---
>  .../gcc.target/arm/mve/intrinsics/vnegq_s32.c | 28 +---
>  .../gcc.target/arm/mve/intrinsics/vnegq_s8.c  | 24 --
>  .../arm/mve/intrinsics/vnegq_x_f16.c  | 33 +--
>  .../arm/mve/intrinsics/vnegq_x_f32.c  | 33 +--
>  .../arm/mve/intrinsics/vnegq_x_s16.c  | 33 +--
>  .../arm/mve/intrinsics/vnegq_x_s32.c  | 33 +--
>  .../arm/mve/intrinsics/vnegq_x_s8.c   | 33 +--
>  gcc/testsuite/gcc.target/arm/simd/mve-vneg.c  |  4 +--
>  gcc/testsuite/gcc.target/arm/simd/mve-vshr.c  |  2 +-
>  18 files changed, 433 insertions(+), 47 deletions(-)
> 
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index 854371f7e11..0a243486bdb 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -252,7 +252,7 @@ (define_insn "mve_vnegq_f"
>   (neg:MVE_0 (match_operand:MVE_0 1 "s_register_operand" "w")))
>]
>"TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
> -  "vneg.f%#  %q0, %q1"
> +  "vneg.f%#\t%q0, %q1"
>[(set_attr "type" "mve_move")
>  ])
> 
> @@ -401,7 +401,7 @@ (define_insn "mve_vnegq_s"
>   (neg:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w")))
>]
>"TARGET_HAVE_MVE"
> -  "vneg.s%#  %q0, %q1"
> +  "vneg.s%#\t%q0, %q1"
>[(set_attr "type" "mve_move")
>  ])
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
> index 9572c140d7e..9853cf6e6dd 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
> @@ -1,13 +1,41 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
>  /* { dg-add-options arm_v8_1m_mve_fp } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vneg.f16   q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  float16x8_t
>  foo (float16x8_t a)
>  {
>return vnegq_f16 (a);
>  }
> 
> -/* { dg-final { scan-assembler "vneg.f16"  }  } */
> +
> +/*
> +**foo1:
> +**   ...
> +**   vneg.f16   q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
> +float16x8_t
> +foo1 (float16x8_t a)
> +{
> +  return vnegq (a);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
> index be73cc0c5f5..489cfc760ba 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
> @@ -1,13 +1,41 @@
>  /* { dg-require-effective-target arm_

RE: [PATCH 04/23] arm: improve tests for vmulhq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:39 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 04/23] arm: improve tests for vmulhq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c: Improve test.

Ok as before.
Thanks,
Kyrill

>   * gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_m_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_m_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_m_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_x_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_x_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulhq_x_u8.c: Likewise.
> ---
>  .../arm/mve/intrinsics/vmulhq_m_s16.c | 34 ---
>  .../arm/mve/intrinsics/vmulhq_m_s32.c | 34 ---
>  .../arm/mve/intrinsics/vmulhq_m_s8.c  | 34 ---
>  .../arm/mve/intrinsics/vmulhq_m_u16.c | 34 ---
>  .../arm/mve/intrinsics/vmulhq_m_u32.c | 34 ---
>  .../arm/mve/intrinsics/vmulhq_m_u8.c  | 34 ---
>  .../arm/mve/intrinsics/vmulhq_s16.c   | 24 +++--
>  .../arm/mve/intrinsics/vmulhq_s32.c   | 24 +++--
>  .../gcc.target/arm/mve/intrinsics/vmulhq_s8.c | 24 +++--
>  .../arm/mve/intrinsics/vmulhq_u16.c   | 24 +++--
>  .../arm/mve/intrinsics/vmulhq_u32.c   | 24 +++--
>  .../gcc.target/arm/mve/intrinsics/vmulhq_u8.c | 24 +++--
>  .../arm/mve/intrinsics/vmulhq_x_s16.c | 33 --
>  .../arm/mve/intrinsics/vmulhq_x_s32.c | 33 --
>  .../arm/mve/intrinsics/vmulhq_x_s8.c  | 33 --
>  .../arm/mve/intrinsics/vmulhq_x_u16.c | 33 --
>  .../arm/mve/intrinsics/vmulhq_x_u32.c | 33 --
>  .../arm/mve/intrinsics/vmulhq_x_u8.c  | 33 --
>  18 files changed, 492 insertions(+), 54 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
> index 4971869a27b..a7d8460c265 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr   p0, (?:ip|fp|r[0-9]+)(?: @.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vmulht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vmulhq_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vmulht.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr   p0, (?:ip|fp|r[0-9]+)(?: @.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vmulht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vmulhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vmulht.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
> index 3006de7fd24..997fdbe8d23 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "

Re: [wwwdocs] gcc-13/changes.html + projects/gomp/: OpenMP update

2023-01-20 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 18, 2023 at 01:39:43PM +0100, Tobias Burnus wrote:
> --- a/htdocs/gcc-13/changes.html
> +++ b/htdocs/gcc-13/changes.html
> @@ -53,12 +53,19 @@ a work-in-progress.
>https://gcc.gnu.org/projects/gomp/";>OpenMP
>
>  
> -  Reverse offload is now supported and the all clauses to the
> -  requires directive are now accepted. However, the
> -  requires_offload, unified_address
> -  and unified_shared_memory clauses imply the initial
> -  device (= the host) as the only available device. Fortran now
> -  supports non-rectangular loop nests, which were added for C/C++ in GCC 
> 11.
> +  Reverse offload is now supported with nvptx devices. Additionally, the
> +  requires handling has been improved and all clauses are
> +  now accepted. If a requirement cannot be fulfilled for an accessible
> +  device, this device is excluded from the list of available devices. 
> This
> +  may imply that the only device left is the host (the initial device).
> +  In particular, requires_offload is currently unsupported 
> on
> +  AMD GCN devices while unified_address and
> +  unified_shared_memory are unsupported by all non-host
> +  devices.

The above looks good to me.

> +
> +
> +  OpenMP 5.0: Fortran now supports non-rectangular loop nests, which were
> +  added for C/C++ in GCC 11.

But because of the sorry_at I'd say "supports some" instead of "supports".
Similarly, in the libgomp texi as well as in gomp/index.html, clarify that
C/C++ support is full (I'll look at PR108435 soon) and say that the Fortran
support is still partial.

> --- a/htdocs/projects/gomp/index.html
> +++ b/htdocs/projects/gomp/index.html
> @@ -547,9 +547,14 @@ than listed, depending on resolved corner cases and 
> optimizations.
>  
>
>
> -align clause/modifier in allocate 
> directive/clause and allocator directive
> +align clause in allocate directive
> +No
> +
> +  
> +  
> +align modifier in allocate clause
>   href="../../gcc-12/changes.html#languages">GCC 12
> -C/C++ on clause only
> +
>
>
>  thread_limit clause to target 
> construct


Jakub



RE: [PATCH 05/23] arm: improve tests for vmullbq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 05/23] arm: improve tests for vmullbq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c: Improve test.

Ok as before. For further patches in the series that do a similar thing please 
assume that I'd like a more descriptive entry here 😊
Thanks,
Kyrill

>   * gcc.target/arm/mve/intrinsics/vmullbq_int_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_m_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_m_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_m_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_x_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_x_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_int_x_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_poly_m_p16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_poly_m_p8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_poly_p16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_poly_p8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_poly_x_p16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmullbq_poly_x_p8.c: Likewise.
> ---
>  .../arm/mve/intrinsics/vmullbq_int_m_s16.c| 34 ---
>  .../arm/mve/intrinsics/vmullbq_int_m_s32.c| 34 ---
>  .../arm/mve/intrinsics/vmullbq_int_m_s8.c | 34 ---
>  .../arm/mve/intrinsics/vmullbq_int_m_u16.c| 34 ---
>  .../arm/mve/intrinsics/vmullbq_int_m_u32.c| 34 ---
>  .../arm/mve/intrinsics/vmullbq_int_m_u8.c | 34 ---
>  .../arm/mve/intrinsics/vmullbq_int_s16.c  | 24 +++--
>  .../arm/mve/intrinsics/vmullbq_int_s32.c  | 24 +++--
>  .../arm/mve/intrinsics/vmullbq_int_s8.c   | 24 +++--
>  .../arm/mve/intrinsics/vmullbq_int_u16.c  | 24 +++--
>  .../arm/mve/intrinsics/vmullbq_int_u32.c  | 24 +++--
>  .../arm/mve/intrinsics/vmullbq_int_u8.c   | 24 +++--
>  .../arm/mve/intrinsics/vmullbq_int_x_s16.c| 33 --
>  .../arm/mve/intrinsics/vmullbq_int_x_s32.c| 33 --
>  .../arm/mve/intrinsics/vmullbq_int_x_s8.c | 33 --
>  .../arm/mve/intrinsics/vmullbq_int_x_u16.c| 33 --
>  .../arm/mve/intrinsics/vmullbq_int_x_u32.c| 33 --
>  .../arm/mve/intrinsics/vmullbq_int_x_u8.c | 33 --
>  .../arm/mve/intrinsics/vmullbq_poly_m_p16.c   | 34 ---
>  .../arm/mve/intrinsics/vmullbq_poly_m_p8.c| 34 ---
>  .../arm/mve/intrinsics/vmullbq_poly_p16.c | 24 +++--
>  .../arm/mve/intrinsics/vmullbq_poly_p8.c  | 24 +++--
>  .../arm/mve/intrinsics/vmullbq_poly_x_p16.c   | 33 --
>  .../arm/mve/intrinsics/vmullbq_poly_x_p8.c| 33 --
>  24 files changed, 656 insertions(+), 72 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
> index be933274d77..a4cc5e52773 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> **   vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vmullbt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vmullbq_int_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vmullbt.s16"  }  } */
> 
> +/*

RE: [PATCH 06/23] arm: improve tests for vmulltq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 06/23] arm: improve tests for vmulltq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_m_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_m_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_m_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_x_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_x_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_int_x_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_poly_m_p16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_poly_m_p8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_poly_p16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_poly_p8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_poly_x_p16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmulltq_poly_x_p8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vmulltq_int_m_s16.c| 34 ---
>  .../arm/mve/intrinsics/vmulltq_int_m_s32.c| 34 ---
>  .../arm/mve/intrinsics/vmulltq_int_m_s8.c | 34 ---
>  .../arm/mve/intrinsics/vmulltq_int_m_u16.c| 34 ---
>  .../arm/mve/intrinsics/vmulltq_int_m_u32.c| 34 ---
>  .../arm/mve/intrinsics/vmulltq_int_m_u8.c | 34 ---
>  .../arm/mve/intrinsics/vmulltq_int_s16.c  | 24 +++--
>  .../arm/mve/intrinsics/vmulltq_int_s32.c  | 24 +++--
>  .../arm/mve/intrinsics/vmulltq_int_s8.c   | 24 +++--
>  .../arm/mve/intrinsics/vmulltq_int_u16.c  | 24 +++--
>  .../arm/mve/intrinsics/vmulltq_int_u32.c  | 24 +++--
>  .../arm/mve/intrinsics/vmulltq_int_u8.c   | 24 +++--
>  .../arm/mve/intrinsics/vmulltq_int_x_s16.c| 33 --
>  .../arm/mve/intrinsics/vmulltq_int_x_s32.c| 33 --
>  .../arm/mve/intrinsics/vmulltq_int_x_s8.c | 33 --
>  .../arm/mve/intrinsics/vmulltq_int_x_u16.c| 33 --
>  .../arm/mve/intrinsics/vmulltq_int_x_u32.c| 33 --
>  .../arm/mve/intrinsics/vmulltq_int_x_u8.c | 33 --
>  .../arm/mve/intrinsics/vmulltq_poly_m_p16.c   | 34 ---
>  .../arm/mve/intrinsics/vmulltq_poly_m_p8.c| 34 ---
>  .../arm/mve/intrinsics/vmulltq_poly_p16.c | 24 +++--
>  .../arm/mve/intrinsics/vmulltq_poly_p8.c  | 24 +++--
>  .../arm/mve/intrinsics/vmulltq_poly_x_p16.c   | 33 --
>  .../arm/mve/intrinsics/vmulltq_poly_x_p8.c| 33 --
>  24 files changed, 656 insertions(+), 72 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
> index 25ecf7a2c51..7f573e9109e 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> **   vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vmulltt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vmulltq_int_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vmulltt.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> **   vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**

Re: [PATCH v2][GCC] arm: Add support for new frame unwinding instruction "0xb5".

2023-01-20 Thread Richard Earnshaw via Gcc-patches




On 20/01/2023 17:27, Srinath Parvathaneni via Gcc-patches wrote:

Hi,

This patch adds support for the Arm frame unwinding instruction "0xb5" [1]. When
an exception is taken and a "0xb5" instruction is encountered during runtime
stack unwinding, we use the effective vsp as the modifier in pointer
authentication. If no "0xb5" instruction has been encountered by the time stack
unwinding completes, the CFA is used as the modifier in pointer authentication.

[1] https://github.com/ARM-software/abi-aa/releases/download/2022Q3/ehabi32.pdf
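
As a rough illustration of the modifier selection described above, here is a
minimal, self-contained C sketch. It is a toy model and not the libgcc code:
`toy_context` and `toy_pac_modifier` are hypothetical names, and only the EHABI
`00xxxxxx` (vsp increment) opcode and `0xb5` are modelled; every other opcode is
ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified unwind state: only the virtual stack pointer.  */
typedef struct
{
  uint32_t vsp;
} toy_context;

/* Walk N unwind opcodes and return the value that would serve as the
   pointer-authentication modifier in this toy model.  */
uint32_t
toy_pac_modifier (toy_context *ctx, const uint8_t *ops, size_t n)
{
  uint32_t modifier = 0;
  int modifier_captured = 0;

  for (size_t i = 0; i < n; i++)
    {
      uint8_t op = ops[i];

      if (op == 0xb5)
        {
          /* Snapshot the vsp at this point of the unwind sequence.  */
          modifier = ctx->vsp;
          modifier_captured = 1;
          continue;
        }

      if ((op & 0xc0) == 0x00)
        {
          /* EHABI 00xxxxxx: vsp = vsp + (xxxxxx << 2) + 4.  */
          ctx->vsp += ((uint32_t) (op & 0x3f) << 2) + 4;
          continue;
        }

      /* All other opcodes are ignored in this sketch.  */
    }

  /* No 0xb5 seen: fall back to the final vsp, i.e. the CFA.  */
  if (!modifier_captured)
    modifier = ctx->vsp;

  return modifier;
}
```

With the opcode sequence `0x01, 0xb5, 0x03` the modifier is the vsp as it stood
when `0xb5` was executed; without `0xb5` it is the vsp after all opcodes have
run.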

Regression tested on arm-none-eabi target and found no regressions.

Ok for master?

Regards,
Srinath.

gcc/ChangeLog:

2022-11-09  Srinath Parvathaneni  

 * libgcc/config/arm/pr-support.c (__gnu_unwind_execute): Decode opcode "0xb5".


### Attachment also inlined for ease of reply###


diff --git a/libgcc/config/arm/pr-support.c b/libgcc/config/arm/pr-support.c
index e48854587c667a959aa66ccc4982231f6ecc..1fbc41e17c227c21af1937344ded2a7fd80e61df 100644
--- a/libgcc/config/arm/pr-support.c
+++ b/libgcc/config/arm/pr-support.c
@@ -107,7 +107,9 @@ __gnu_unwind_execute (_Unwind_Context * context, __gnu_unwind_state * uws)
_uw op;
int set_pc;
int set_pac = 0;
+  int set_pac_sp = 0;
_uw reg;
+  _uw sp;
  
set_pc = 0;

for (;;)
@@ -124,10 +126,11 @@ __gnu_unwind_execute (_Unwind_Context * context, __gnu_unwind_state * uws)
  #if defined(TARGET_HAVE_PACBTI)
  if (set_pac)
{
- _uw sp;
  _uw lr;
  _uw pac;
- _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32, &sp);
+ if (!set_pac_sp)
+   _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32,
+&sp);
  _Unwind_VRS_Get (context, _UVRSC_CORE, R_LR, _UVRSD_UINT32, &lr);
  _Unwind_VRS_Get (context, _UVRSC_PAC, R_IP,
   _UVRSD_UINT32, &pac);
@@ -259,6 +262,14 @@ __gnu_unwind_execute (_Unwind_Context * context, __gnu_unwind_state * uws)
  continue;
}
  
+	  /* Use current VSP as modifier in PAC validation.  */

+ if (op == 0xb5)
+   {
+ _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32, &sp);
+ set_pac_sp = 1;
+ continue;
+   }
+
  if ((op & 0xfc) == 0xb4)  /* Obsolete FPA.  */
return _URC_FAILURE;
  






OK.

R.


RE: [PATCH 07/23] arm: improve tests for vcaddq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 07/23] arm: improve tests for vcaddq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_f16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_u8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vcaddq_rot270_f16.c| 24 +++--
>  .../arm/mve/intrinsics/vcaddq_rot270_f32.c| 24 +++--
>  .../arm/mve/intrinsics/vcaddq_rot270_m_f16.c  | 34 ---
>  .../arm/mve/intrinsics/vcaddq_rot270_m_f32.c  | 34 ---
>  .../arm/mve/intrinsics/vcaddq_rot270_m_s16.c  | 34 ---
>  .../arm/mve/intrinsics/vcaddq_rot270_m_s32.c  | 34 ---
>  .../arm/mve/intrinsics/vcaddq_rot270_m_s8.c   | 34 ---
>  .../arm/mve/intrinsics/vcaddq_rot270_m_u16.c  | 34 ---
>  .../arm/mve/intrinsics/vcaddq_rot270_m_u32.c  | 34 ---
>  .../arm/mve/intrinsics/vcaddq_rot270_m_u8.c   | 34 ---
>  .../arm/mve/intrinsics/vcaddq_rot270_s16.c| 24 +++--
>  .../arm/mve/intrinsics/vcaddq_rot270_s32.c| 24 +++--
>  .../arm/mve/intrinsics/vcaddq_rot270_s8.c | 24 +++--
>  .../arm/mve/intrinsics/vcaddq_rot270_u16.c| 24 +++--
>  .../arm/mve/intrinsics/vcaddq_rot270_u32.c| 24 +++--
>  .../arm/mve/intrinsics/vcaddq_rot270_u8.c | 24 +++--
>  .../arm/mve/intrinsics/vcaddq_rot270_x_f16.c  | 33 --
>  .../arm/mve/intrinsics/vcaddq_rot270_x_f32.c

RE: [PATCH 08/23] arm: improve tests for vcmlaq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 08/23] arm: improve tests for vcmlaq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vcmlaq_f16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot180_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot180_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot180_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot180_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot270_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot270_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot270_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot270_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot90_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot90_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot90_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmlaq_rot90_m_f32.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vcmlaq_f16.c   | 24 +++--
>  .../arm/mve/intrinsics/vcmlaq_f32.c   | 24 +++--
>  .../arm/mve/intrinsics/vcmlaq_m_f16.c | 34 ---
>  .../arm/mve/intrinsics/vcmlaq_m_f32.c | 34 ---
>  .../arm/mve/intrinsics/vcmlaq_rot180_f16.c| 24 +++--
>  .../arm/mve/intrinsics/vcmlaq_rot180_f32.c| 24 +++--
>  .../arm/mve/intrinsics/vcmlaq_rot180_m_f16.c  | 34 ---
>  .../arm/mve/intrinsics/vcmlaq_rot180_m_f32.c  | 34 ---
>  .../arm/mve/intrinsics/vcmlaq_rot270_f16.c| 24 +++--
>  .../arm/mve/intrinsics/vcmlaq_rot270_f32.c| 24 +++--
>  .../arm/mve/intrinsics/vcmlaq_rot270_m_f16.c  | 34 ---
>  .../arm/mve/intrinsics/vcmlaq_rot270_m_f32.c  | 34 ---
>  .../arm/mve/intrinsics/vcmlaq_rot90_f16.c | 24 +++--
>  .../arm/mve/intrinsics/vcmlaq_rot90_f32.c | 24 +++--
>  .../arm/mve/intrinsics/vcmlaq_rot90_m_f16.c   | 34 ---
>  .../arm/mve/intrinsics/vcmlaq_rot90_m_f32.c   | 34 ---
>  16 files changed, 416 insertions(+), 48 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
> index fa7d0c05e8c..bb8a99790a0 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
> @@ -1,21 +1,41 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
>  /* { dg-add-options arm_v8_1m_mve_fp } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vcmla.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?:@.*|)
> +**   ...
> +*/
>  float16x8_t
>  foo (float16x8_t a, float16x8_t b, float16x8_t c)
>  {
>return vcmlaq_f16 (a, b, c);
>  }
> 
> -/* { dg-final { scan-assembler "vcmla.f16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vcmla.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?:@.*|)
> +**   ...
> +*/
>  float16x8_t
>  foo1 (float16x8_t a, float16x8_t b, float16x8_t c)
>  {
>return vcmlaq (a, b, c);
>  }
> 
> -/* { dg-final { scan-assembler "vcmla.f16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
> index 166bf421f14..71ec4b8479c 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
> @@ -1,21 +1,41 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
>  /* { dg-add-options arm_v8_1m_mve_fp } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vcmla.f32   q[0-9]+, q[0-9]+, q[0-9]+, #0(?:@.*|)
> +**   ...
> +*/
>  float32x4_t
>  foo (float32x4_t a, float32x4_t b, float32x4_t c)
>  {
>return vcmlaq_f32 (a, b, c);
>  }
> 
> -/* { dg-final { scan-assembler "vcmla.f32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vcmla.f32   q[0-9]+, q[0-9]+, q[0-9]+, #0(?:@.*|)
> +**   ...
> +*/
>  float32x4_t
>  foo1 (float32x4_t a, float32x4_t b

RE: [PATCH 09/23] arm: improve tests for vcmulq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 09/23] arm: improve tests for vcmulq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vcmulq_f16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vcmulq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot180_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot180_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot180_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot180_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot180_x_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot180_x_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot270_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot270_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot270_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot270_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot270_x_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot270_x_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot90_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot90_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot90_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot90_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot90_x_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_rot90_x_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_x_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vcmulq_x_f32.c: Likewise.

Thanks for the patch and I'll review the rest of the series, but I think for 
the future such (almost mechanical) uniform updates to the testsuite would be 
better grouped into one big patch rather than split per instruction group.
As long as the tests pass normally after these changes I would expect to review 
the general approach for the changes in one or two testcases and trust that 
it's been applied correctly to the rest of the tests, rather than auditing 
every testcase.
Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vcmulq_f16.c   | 24 +++--
>  .../arm/mve/intrinsics/vcmulq_f32.c   | 24 +++--
>  .../arm/mve/intrinsics/vcmulq_m_f16.c | 34 ---
>  .../arm/mve/intrinsics/vcmulq_m_f32.c | 34 ---
>  .../arm/mve/intrinsics/vcmulq_rot180_f16.c| 24 +++--
>  .../arm/mve/intrinsics/vcmulq_rot180_f32.c| 24 +++--
>  .../arm/mve/intrinsics/vcmulq_rot180_m_f16.c  | 34 ---
>  .../arm/mve/intrinsics/vcmulq_rot180_m_f32.c  | 34 ---
>  .../arm/mve/intrinsics/vcmulq_rot180_x_f16.c  | 33 --
>  .../arm/mve/intrinsics/vcmulq_rot180_x_f32.c  | 33 --
>  .../arm/mve/intrinsics/vcmulq_rot270_f16.c| 24 +++--
>  .../arm/mve/intrinsics/vcmulq_rot270_f32.c| 24 +++--
>  .../arm/mve/intrinsics/vcmulq_rot270_m_f16.c  | 34 ---
>  .../arm/mve/intrinsics/vcmulq_rot270_m_f32.c  | 34 ---
>  .../arm/mve/intrinsics/vcmulq_rot270_x_f16.c  | 33 --
>  .../arm/mve/intrinsics/vcmulq_rot270_x_f32.c  | 33 --
>  .../arm/mve/intrinsics/vcmulq_rot90_f16.c | 24 +++--
>  .../arm/mve/intrinsics/vcmulq_rot90_f32.c | 24 +++--
>  .../arm/mve/intrinsics/vcmulq_rot90_m_f16.c   | 34 ---
>  .../arm/mve/intrinsics/vcmulq_rot90_m_f32.c   | 34 ---
>  .../arm/mve/intrinsics/vcmulq_rot90_x_f16.c   | 34 ---
>  .../arm/mve/intrinsics/vcmulq_rot90_x_f32.c   | 34 ---
>  .../arm/mve/intrinsics/vcmulq_x_f16.c | 33 --
>  .../arm/mve/intrinsics/vcmulq_x_f32.c | 33 --
>  24 files changed, 656 insertions(+), 74 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
> index 142c315ecf5..456370e1de1 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
> @@ -1,21 +1,41 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
>  /* { dg-add-options arm_v8_1m_mve_fp } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vcmul.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?:@.*|)
> +**   ...
> +*/
>  float16x8_t
> 

Re: [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]

2023-01-20 Thread Jakub Jelinek via Gcc-patches
On Fri, Jan 20, 2023 at 06:39:04PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > +   The issue is that for those a 'count' variable is used.  */
> > +dovar_init *di;
> > +unsigned ix;
> > +tree t = tree_var;
> > +while (TREE_CODE (t) == INDIRECT_REF)
> > +  t = TREE_OPERAND (t, 0);
> > +FOR_EACH_VEC_ELT (*inits, ix, di)
> > +  {
> > +   tree t2 = di->var;
> > +   while (TREE_CODE (t2) == INDIRECT_REF)
> > + t2 = TREE_OPERAND (t2, 0);
> 
> The actual problem with non-simple loops for non-rectangular loops is
> both in case it is an inner loop which uses some outer loop's iterator,
> or if it is outer loop whose iterator is used, both of those cases
> will not be handled properly.  The former case because instead of
> having lb and ub expressions in canonicalized form var-outer * m + a
> lb will be 0 (that is fine) and ub will be
> (var-outer * m2 + a2 + step - var-outer * m1 - a1) / step
> or so (sure, we can simplify that to
> (var-outer * (m2 - m1) + (a2 + step - a1)) / step
> but the division remains.  And the latter case is bad because we
> need var-outer but we actually compute some artificial count iterator
> and var-outer is only initialized in the body of the loop.
> These sorry_at calls seem to handle just one of those, when the outer
> loop whose var-outer is referenced is not simple, no?

Though, I wonder if we shouldn't for GCC 13 just sorry_at about
steps other than constant 1/-1 (in both outer loop with var-outer referenced
in inner loop and on inner loop that references it) and for the !VAR_P case
actually handle it if step 1/-1 by using simple like translation just with
an artificial iterator.
Say for:
subroutine foo (x, y, z)
  integer :: x, y, z
  !$omp do private (x)
  do x = y, z
  end do
end subroutine foo
we right now in *.original dump have:
D.4265 = *y;
D.4266 = *z;
D.4267 = (1 - D.4265) + D.4266;
#pragma omp for private(count.0) private(x)
for (count.0 = 0; count.0 < D.4267; count.0 = count.0 + 1)
  {
*x = D.4265 + NON_LVALUE_EXPR <count.0>;
L.1:;
  }
What I'd suggest is:
D.4265 = *y;
D.4266 = *z;
#pragma omp for private(x)
for (x.0 = D.4265; x.0 <= D.4266; x.0 = x.0 + 1)
  {
*x = x.0;
L.1:;
  }
or so.  This could be done independently from the non-rect stuff,
as a first change.

Jakub
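
For a unit step, the equivalence between the two loop forms in the message
above can be checked with a small C sketch. This is only an illustration under
stated assumptions: the function names are made up, the OpenMP pragma is
omitted, plain `int` bounds stand in for the dereferenced Fortran dummy
arguments, and `ub` is assumed to be below `INT_MAX` so the direct form does not
overflow.

```c
#include <stddef.h>

/* Count-based form: an artificial counter runs from 0 up to the trip
   count and the user variable is recomputed inside the body.  */
size_t
visit_with_count (int lb, int ub, int *out, size_t cap)
{
  size_t n = 0;
  int trip = (1 - lb) + ub;     /* D.4267 = (1 - D.4265) + D.4266 */

  for (int count = 0; count < trip; count = count + 1)
    {
      int x = lb + count;       /* *x = D.4265 + count.0 */
      if (n < cap)
        out[n] = x;
      n++;
    }
  return n;
}

/* Direct form: an artificial iterator x0 takes the user-visible values
   itself, so the loop bounds remain the plain lb and ub.  */
size_t
visit_direct (int lb, int ub, int *out, size_t cap)
{
  size_t n = 0;

  for (int x0 = lb; x0 <= ub; x0 = x0 + 1)
    {
      int x = x0;               /* *x = x.0 */
      if (n < cap)
        out[n] = x;
      n++;
    }
  return n;
}
```

Both functions visit the same sequence of values; the difference is that in the
direct form the loop header itself carries the lower and upper bounds, which is
what a non-rectangular inner loop would need to refer to.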



RE: [PATCH 10/23] arm: improve tests and fix vqabsq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 10/23] arm: improve tests and fix vqabsq*
> 
> gcc/ChangeLog:
> 
>   * config/arm/mve.md (mve_vqabsq_s): Fix spacing.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqabsq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqabsq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqabsq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  gcc/config/arm/mve.md |  2 +-
>  .../arm/mve/intrinsics/vqabsq_m_s16.c | 33 +--
>  .../arm/mve/intrinsics/vqabsq_m_s32.c | 33 +--
>  .../arm/mve/intrinsics/vqabsq_m_s8.c  | 33 +--
>  .../arm/mve/intrinsics/vqabsq_s16.c   | 28 +---
>  .../arm/mve/intrinsics/vqabsq_s32.c   | 28 +---
>  .../gcc.target/arm/mve/intrinsics/vqabsq_s8.c | 24 --
>  7 files changed, 161 insertions(+), 20 deletions(-)
> 
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index 0a243486bdb..600adf7d69b 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -388,7 +388,7 @@ (define_insn "mve_vqabsq_s"
>VQABSQ_S))
>]
>"TARGET_HAVE_MVE"
> -  "vqabs.s%# %q0, %q1"
> +  "vqabs.s%#\t%q0, %q1"
>[(set_attr "type" "mve_move")
>  ])
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
> index e74e04ac92f..7172ac5cddd 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
> @@ -1,22 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> **   vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqabst.s16  q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
>  {
>return vqabsq_m_s16 (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqabst.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> **   vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqabst.s16  q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
>  {
>return vqabsq_m (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
> index f6ca8a6c3d6..297cb196f1a 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
> @@ -1,22 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> **   vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqabst.s32  q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
>  {
>return vqabsq_m_s32 (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqabst.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> **   vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqabst.s32  q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
>  {
>return vqabsq_m (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c
> index d89a5aa3fa5..83c69931239 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsi

RE: [PATCH 11/23] arm: improve tests for vqdmladhq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 11/23] arm: improve tests for vqdmladhq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmladhq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmladhq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmladhq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqdmladhq_m_s16.c  | 34 ---
>  .../arm/mve/intrinsics/vqdmladhq_m_s32.c  | 34 ---
>  .../arm/mve/intrinsics/vqdmladhq_m_s8.c   | 34 ---
>  .../arm/mve/intrinsics/vqdmladhq_s16.c| 24 +++--
>  .../arm/mve/intrinsics/vqdmladhq_s32.c| 24 +++--
>  .../arm/mve/intrinsics/vqdmladhq_s8.c | 24 +++--
>  6 files changed, 156 insertions(+), 18 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
> index 51cdadc9ece..aa9c78c883b 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> **   vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmladht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vqdmladhq_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmladht.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> **   vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmladht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vqdmladhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmladht.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
> index 7e43fed1503..4694a6f9ec5 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmladht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqdmladhq_m_s32 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmladht.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmladht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqdmladhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmladht.s32"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
> index adf591041e3..c8dc67fdd12 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-add

RE: [PATCH 12/23] arm: improve tests for vqdmladhxq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Richard Earnshaw; Andrea Corallo
> Subject: [PATCH 12/23] arm: improve tests for vqdmladhxq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqdmladhxq_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqdmladhxq_s32.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqdmladhxq_s8.c: Improve test.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqdmladhxq_m_s16.c | 34 ---
>  .../arm/mve/intrinsics/vqdmladhxq_m_s32.c | 34 ---
>  .../arm/mve/intrinsics/vqdmladhxq_m_s8.c  | 34 ---
>  .../arm/mve/intrinsics/vqdmladhxq_s16.c   | 24 +++--
>  .../arm/mve/intrinsics/vqdmladhxq_s32.c   | 24 +++--
>  .../arm/mve/intrinsics/vqdmladhxq_s8.c| 24 +++--
>  6 files changed, 156 insertions(+), 18 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
> index c2446e69181..19c5ce5a64f 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmladhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqdmladhxq_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmladhxt.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmladhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqdmladhxq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmladhxt.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
> index 12b45517535..e00162addae 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmladhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqdmladhxq_m_s32 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmladhxt.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmladhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqdmladhxq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmladhxt.s32"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
> index 146aa51306b..19767d2cd41 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { d

RE: [PATCH 13/23] arm: improve tests for vqrdmladhq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Richard Earnshaw; Andrea Corallo
> Subject: [PATCH 13/23] arm: improve tests for vqrdmladhq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqrdmladhq_m_s16.c | 34 ---
>  .../arm/mve/intrinsics/vqrdmladhq_m_s32.c | 34 ---
>  .../arm/mve/intrinsics/vqrdmladhq_m_s8.c  | 34 ---
>  .../arm/mve/intrinsics/vqrdmladhq_s16.c   | 24 +++--
>  .../arm/mve/intrinsics/vqrdmladhq_s32.c   | 24 +++--
>  .../arm/mve/intrinsics/vqrdmladhq_s8.c| 24 +++--
>  6 files changed, 156 insertions(+), 18 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
> index fce4f5a35ef..5b0e134a0ff 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmladht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqrdmladhq_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmladht.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmladht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqrdmladhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmladht.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
> index e550b6a7995..6fdf3879cc2 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmladht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqrdmladhq_m_s32 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmladht.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmladht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqrdmladhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmladht.s32"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
> index b07b28e5bcd..ef75f737161 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8

RE: [PATCH 14/23] arm: improve tests for vqrdmladhxq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Richard Earnshaw; Andrea Corallo
> Subject: [PATCH 14/23] arm: improve tests for vqrdmladhxq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c: Improve
> test.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhxq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhxq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmladhxq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqrdmladhxq_m_s16.c| 34 ---
>  .../arm/mve/intrinsics/vqrdmladhxq_m_s32.c| 34 ---
>  .../arm/mve/intrinsics/vqrdmladhxq_m_s8.c | 34 ---
>  .../arm/mve/intrinsics/vqrdmladhxq_s16.c  | 24 +++--
>  .../arm/mve/intrinsics/vqrdmladhxq_s32.c  | 24 +++--
>  .../arm/mve/intrinsics/vqrdmladhxq_s8.c   | 24 +++--
>  6 files changed, 156 insertions(+), 18 deletions(-)
> 
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
> index 677efdcd1e4..1f68671b3f9 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmladhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqrdmladhxq_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmladhxt.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmladhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqrdmladhxq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmladhxt.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
> index 8ee8bbb420b..eaea6e1f482 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmladhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqrdmladhxq_m_s32 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmladhxt.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmladhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqrdmladhxq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmladhxt.s32"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
> index 7cfa88fee28..0f582a91f3a 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok }

RE: [PATCH 15/23] arm: improve tests for vqrdmlashq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Richard Earnshaw; Andrea Corallo
> Subject: [PATCH 15/23] arm: improve tests for vqrdmlashq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqrdmlashq_n_s16.c | 32 +++
>  .../arm/mve/intrinsics/vqrdmlashq_n_s32.c | 32 +++
>  .../arm/mve/intrinsics/vqrdmlashq_n_s8.c  | 32 +++
>  3 files changed, 78 insertions(+), 18 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
> index 8ff8c34d529..2710f2f0442 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
> @@ -1,21 +1,41 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vqrdmlash.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
> +**   ...
> +*/
>  int16x8_t
> -foo (int16x8_t a, int16x8_t b, int16_t c)
> +foo (int16x8_t m1, int16x8_t m2, int16_t add)
>  {
> -  return vqrdmlashq_n_s16 (a, b, c);
> +  return vqrdmlashq_n_s16 (m1, m2, add);
>  }
> 
> -/* { dg-final { scan-assembler "vqrdmlash.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vqrdmlash.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
> +**   ...
> +*/
>  int16x8_t
> -foo1 (int16x8_t a, int16x8_t b, int16_t c)
> +foo1 (int16x8_t m1, int16x8_t m2, int16_t add)
>  {
> -  return vqrdmlashq (a, b, c);
> +  return vqrdmlashq (m1, m2, add);
> +}
> +
> +#ifdef __cplusplus
>  }
> +#endif
> 
> -/* { dg-final { scan-assembler "vqrdmlash.s16"  }  } */
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
> index 02583f0627b..5fefc3938c5 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
> @@ -1,21 +1,41 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vqrdmlash.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
> +**   ...
> +*/
>  int32x4_t
> -foo (int32x4_t a, int32x4_t b, int32_t c)
> +foo (int32x4_t m1, int32x4_t m2, int32_t add)
>  {
> -  return vqrdmlashq_n_s32 (a, b, c);
> +  return vqrdmlashq_n_s32 (m1, m2, add);
>  }
> 
> -/* { dg-final { scan-assembler "vqrdmlash.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vqrdmlash.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
> +**   ...
> +*/
>  int32x4_t
> -foo1 (int32x4_t a, int32x4_t b, int32_t c)
> +foo1 (int32x4_t m1, int32x4_t m2, int32_t add)
>  {
> -  return vqrdmlashq (a, b, c);
> +  return vqrdmlashq (m1, m2, add);
> +}
> +
> +#ifdef __cplusplus
>  }
> +#endif
> 
> -/* { dg-final { scan-assembler "vqrdmlash.s32"  }  } */
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
> index 0bd5bcac71f..df96fe85213 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
> @@ -1,21 +1,41 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vqrdmlash.s8    q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
> +**   ...
> +*/
>  int8x16_t
> -foo (int8x16_t a, int8x16_t b, int8_t c)
> +foo (int8x16_t m1, int8x16_t m2, int8_t add)
>  {
> -  return vqrdmlashq_n_s8 (a, b, c);
> +  return vqrdmlashq_n_s8 (m1, m2, add);
>  }
> 
> -/* { dg-final { scan-assembler "vqrdmlash.s8"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vqrdmlash.s8    q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
> +**   ...
> +*/
>  int8x16_t
> -foo1 (int8x16_t a, int8x16_t b, int8_t c)
> +foo1 (i

RE: [PATCH 16/23] arm: improve tests for vqdmlsdhq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Richard Earnshaw; Andrea Corallo
> Subject: [PATCH 16/23] arm: improve tests for vqdmlsdhq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqdmlsdhq_m_s16.c  | 34 ---
>  .../arm/mve/intrinsics/vqdmlsdhq_m_s32.c  | 34 ---
>  .../arm/mve/intrinsics/vqdmlsdhq_m_s8.c   | 34 ---
>  .../arm/mve/intrinsics/vqdmlsdhq_s16.c| 24 +++--
>  .../arm/mve/intrinsics/vqdmlsdhq_s32.c| 24 +++--
>  .../arm/mve/intrinsics/vqdmlsdhq_s8.c | 24 +++--
>  6 files changed, 156 insertions(+), 18 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
> index d1e66864d10..f87287ab8cd 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmlsdht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqdmlsdhq_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmlsdht.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmlsdht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqdmlsdhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmlsdht.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
> index cc80f211ec8..8155aaf843c 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmlsdht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqdmlsdhq_m_s32 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmlsdht.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmlsdht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqdmlsdhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmlsdht.s32"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
> index 5c9d81a6526..d39badc7707 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-add

RE: [PATCH 17/23] arm: improve tests for vqdmlsdhxq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Richard Earnshaw; Andrea Corallo
> Subject: [PATCH 17/23] arm: improve tests for vqdmlsdhxq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhxq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhxq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlsdhxq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqdmlsdhxq_m_s16.c | 34 ---
>  .../arm/mve/intrinsics/vqdmlsdhxq_m_s32.c | 34 ---
>  .../arm/mve/intrinsics/vqdmlsdhxq_m_s8.c  | 34 ---
>  .../arm/mve/intrinsics/vqdmlsdhxq_s16.c   | 24 +++--
>  .../arm/mve/intrinsics/vqdmlsdhxq_s32.c   | 24 +++--
>  .../arm/mve/intrinsics/vqdmlsdhxq_s8.c| 24 +++--
>  6 files changed, 156 insertions(+), 18 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
> index 6ab9743054c..1742d47291c 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmlsdhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqdmlsdhxq_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmlsdhxt.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmlsdhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>    return vqdmlsdhxq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmlsdhxt.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
> index a34618e97fd..1c1b73a2251 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmlsdhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqdmlsdhxq_m_s32 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmlsdhxt.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsr    p0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqdmlsdhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>    return vqdmlsdhxq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqdmlsdhxt.s32"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
> index fdbe89ab6b8..0a980a081a1 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8

RE: [PATCH 18/23] arm: improve tests for vqrdmlsdhq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Richard Earnshaw; Andrea Corallo
> Subject: [PATCH 18/23] arm: improve tests for vqrdmlsdhq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqrdmlsdhq_m_s16.c | 34 ---
>  .../arm/mve/intrinsics/vqrdmlsdhq_m_s32.c | 34 ---
>  .../arm/mve/intrinsics/vqrdmlsdhq_m_s8.c  | 34 ---
>  .../arm/mve/intrinsics/vqrdmlsdhq_s16.c   | 24 +++--
>  .../arm/mve/intrinsics/vqrdmlsdhq_s32.c   | 24 +++--
>  .../arm/mve/intrinsics/vqrdmlsdhq_s8.c| 24 +++--
>  6 files changed, 156 insertions(+), 18 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
> index d0054b8ea97..6a5776215ca 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmlsdht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vqrdmlsdhq_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmlsdht.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmlsdht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vqrdmlsdhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmlsdht.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
> index 7d3fe45eb4d..9539e249d6a 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmlsdht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>return vqrdmlsdhq_m_s32 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmlsdht.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmlsdht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>return vqrdmlsdhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmlsdht.s32"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
> index c33f8ea903b..69e54f53a76 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8

RE: [PATCH 19/23] arm: improve tests for vqrdmlsdhxq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 19/23] arm: improve tests for vqrdmlsdhxq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c: Improve
> test.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c| 34 ---
>  .../arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c| 34 ---
>  .../arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c | 34 ---
>  .../arm/mve/intrinsics/vqrdmlsdhxq_s16.c  | 24 +++--
>  .../arm/mve/intrinsics/vqrdmlsdhxq_s32.c  | 24 +++--
>  .../arm/mve/intrinsics/vqrdmlsdhxq_s8.c   | 24 +++--
>  6 files changed, 156 insertions(+), 18 deletions(-)
> 
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
> index 2fbd351f3b4..3598f50ccba 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmlsdhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vqrdmlsdhxq_m_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmlsdhxt.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmlsdhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
>  {
>return vqrdmlsdhxq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmlsdhxt.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
> index 324a6e63398..1ab22edf9ca 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmlsdhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>return vqrdmlsdhxq_m_s32 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmlsdhxt.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmlsdhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
>  {
>return vqrdmlsdhxq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmlsdhxt.s32"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
> index 287868b1190..01103e99b61 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok }

RE: [PATCH 20/23] arm: improve tests for vqrdmulhq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 20/23] arm: improve tests for vqrdmulhq*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c: Improve
> test.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqrdmulhq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../arm/mve/intrinsics/vqrdmulhq_m_n_s16.c| 34 ---
>  .../arm/mve/intrinsics/vqrdmulhq_m_n_s32.c| 34 ---
>  .../arm/mve/intrinsics/vqrdmulhq_m_n_s8.c | 34 ---
>  .../arm/mve/intrinsics/vqrdmulhq_m_s16.c  | 34 ---
>  .../arm/mve/intrinsics/vqrdmulhq_m_s32.c  | 34 ---
>  .../arm/mve/intrinsics/vqrdmulhq_m_s8.c   | 34 ---
>  .../arm/mve/intrinsics/vqrdmulhq_n_s16.c  | 24 +++--
>  .../arm/mve/intrinsics/vqrdmulhq_n_s32.c  | 24 +++--
>  .../arm/mve/intrinsics/vqrdmulhq_n_s8.c   | 24 +++--
>  .../arm/mve/intrinsics/vqrdmulhq_s16.c| 24 +++--
>  .../arm/mve/intrinsics/vqrdmulhq_s32.c| 24 +++--
>  .../arm/mve/intrinsics/vqrdmulhq_s8.c | 24 +++--
>  12 files changed, 312 insertions(+), 36 deletions(-)
> 
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
> index c4b6b7e22f8..fc3a33073aa 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmulht.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
>  {
>return vqrdmulhq_m_n_s16 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmulht.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmulht.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
>  {
>return vqrdmulhq_m (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmulht.s16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
> index 6de3eb1cb9a..897ad5bd28c 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
> @@ -1,23 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmulht.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, int32_t b, mve_pred16_t p)
>  {
>return vqrdmulhq_m_n_s32 (inactive, a, b, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqrdmulht.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqrdmulht.s32  

RE: [PATCH 21/23] arm: improve tests and fix vqnegq*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 21/23] arm: improve tests and fix vqnegq*
> 
> gcc/ChangeLog:
> 
>   * config/arm/mve.md (mve_vqnegq_s): Fix spacing.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqnegq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqnegq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqnegq_s8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  gcc/config/arm/mve.md |  2 +-
>  .../arm/mve/intrinsics/vqnegq_m_s16.c | 33 +--
>  .../arm/mve/intrinsics/vqnegq_m_s32.c | 33 +--
>  .../arm/mve/intrinsics/vqnegq_m_s8.c  | 33 +--
>  .../arm/mve/intrinsics/vqnegq_s16.c   | 28 +---
>  .../arm/mve/intrinsics/vqnegq_s32.c   | 24 --
>  .../gcc.target/arm/mve/intrinsics/vqnegq_s8.c | 24 --
>  7 files changed, 159 insertions(+), 18 deletions(-)
> 
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index 600adf7d69b..4f94cf14a0b 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -374,7 +374,7 @@ (define_insn "mve_vqnegq_s"
>VQNEGQ_S))
>]
>"TARGET_HAVE_MVE"
> -  "vqneg.s%# %q0, %q1"
> +  "vqneg.s%#\t%q0, %q1"
>[(set_attr "type" "mve_move")
>  ])
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
> index 4f0145d2ebd..f3799a35b12 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
> @@ -1,22 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqnegt.s16  q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
>  {
>return vqnegq_m_s16 (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqnegt.s16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqnegt.s16  q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int16x8_t
>  foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
>  {
>return vqnegq_m (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
> index da4f90bad53..bbe64ff4d52 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
> @@ -1,22 +1,49 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqnegt.s32  q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
>  {
>return vqnegq_m_s32 (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> -/* { dg-final { scan-assembler "vqnegt.s32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vmsrp0, (?:ip|fp|r[0-9]+)(?:@.*|)
> +**   ...
> +**   vpst(?: @.*|)
> +**   ...
> +**   vqnegt.s32  q[0-9]+, q[0-9]+(?: @.*|)
> +**   ...
> +*/
>  int32x4_t
>  foo1 (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
>  {
>return vqnegq_m (inactive, a, p);
>  }
> 
> -/* { dg-final { scan-assembler "vpst" } } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c
> index ac1250b2fac..71fcdd7cba7 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics

RE: [PATCH 22/23] arm: improve tests for vld2q*

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 22/23] arm: improve tests for vld2q*
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vld2q_f16.c: Improve test.
>   * gcc.target/arm/mve/intrinsics/vld2q_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vld2q_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vld2q_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vld2q_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vld2q_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vld2q_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vld2q_u8.c: Likewise.

Ok.
Thanks,
Kyrill

> ---
>  .../gcc.target/arm/mve/intrinsics/vld2q_f16.c | 33 ---
>  .../gcc.target/arm/mve/intrinsics/vld2q_f32.c | 33 ---
>  .../gcc.target/arm/mve/intrinsics/vld2q_s16.c | 33 ---
>  .../gcc.target/arm/mve/intrinsics/vld2q_s32.c | 33 ---
>  .../gcc.target/arm/mve/intrinsics/vld2q_s8.c  | 33 ---
>  .../gcc.target/arm/mve/intrinsics/vld2q_u16.c | 33 ---
>  .../gcc.target/arm/mve/intrinsics/vld2q_u32.c | 33 ---
>  .../gcc.target/arm/mve/intrinsics/vld2q_u8.c  | 33 ---
>  8 files changed, 224 insertions(+), 40 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
> index 24e7a2ea4d0..81690b1022e 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
> @@ -1,22 +1,45 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
>  /* { dg-add-options arm_v8_1m_mve_fp } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vld20.16{q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?:@.*|)
> +**   ...
> +**   vld21.16{q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?:@.*|)
> +**   ...
> +*/
>  float16x8x2_t
> -foo (float16_t const * addr)
> +foo (float16_t const *addr)
>  {
>return vld2q_f16 (addr);
>  }
> 
> -/* { dg-final { scan-assembler "vld20.16"  }  } */
> -/* { dg-final { scan-assembler "vld21.16"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vld20.16{q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?:@.*|)
> +**   ...
> +**   vld21.16{q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?:@.*|)
> +**   ...
> +*/
>  float16x8x2_t
> -foo1 (float16_t const * addr)
> +foo1 (float16_t const *addr)
>  {
>return vld2q (addr);
>  }
> 
> -/* { dg-final { scan-assembler "vld20.16"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
> index 727484caaf6..d2ae31fa9e5 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
> @@ -1,22 +1,45 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
>  /* { dg-add-options arm_v8_1m_mve_fp } */
>  /* { dg-additional-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> 
>  #include "arm_mve.h"
> 
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/*
> +**foo:
> +**   ...
> +**   vld20.32{q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?:@.*|)
> +**   ...
> +**   vld21.32{q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?:@.*|)
> +**   ...
> +*/
>  float32x4x2_t
> -foo (float32_t const * addr)
> +foo (float32_t const *addr)
>  {
>return vld2q_f32 (addr);
>  }
> 
> -/* { dg-final { scan-assembler "vld20.32"  }  } */
> -/* { dg-final { scan-assembler "vld21.32"  }  } */
> 
> +/*
> +**foo1:
> +**   ...
> +**   vld20.32{q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?:@.*|)
> +**   ...
> +**   vld21.32{q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?:@.*|)
> +**   ...
> +*/
>  float32x4x2_t
> -foo1 (float32_t const * addr)
> +foo1 (float32_t const *addr)
>  {
>return vld2q (addr);
>  }
> 
> -/* { dg-final { scan-assembler "vld20.32"  }  } */
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
> index f2864a00478..fb4dc1b4fcf 100644
> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
> @@ -1,22 +1,45 @@
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-optio

RE: [PATCH 23/23] arm: fix missing extern "C" in MVE tests

2023-01-20 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo 
> Sent: Friday, January 20, 2023 4:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH 23/23] arm: fix missing extern "C" in MVE tests
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vhaddq_n_s16.c: Add missing extern
>   "C".
>   * gcc.target/arm/mve/intrinsics/vhaddq_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_n_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_n_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_n_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_n_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_n_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_n_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_n_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_n_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_n_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhaddq_x_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_n_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_n_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_n_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_n_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_n_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_n_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_n_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_n_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_n_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_n_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vhsubq_x_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmladavaxq_p_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmladavaxq_p_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmladavaxq_p_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmladavaxq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmladavaxq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmladavaxq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_n_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_n_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_n_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_n_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_n_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqaddq_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlahq_n_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlahq_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlahq_n_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s32.c: Likewise.
> 

Re: [PATCH] modula2/108144 - Fix multilib install of libgm2

2023-01-20 Thread Gaius Mulley via Gcc-patches
Richard Biener  writes:

> The following adjusts libgm2 to properly use the multilib build
> infrastructure, thereby fixing the install with
> --enable-version-specific-runtime-libs
>
> In particular config-ml.pl needs to be applied to generated Makefiles
> as documented in the manual and we have to avoid clobbering the
> variables via make arguments.  The explicit install rules used different
> ways to construct the multilib dir which isn't necessary and breaks
> when MULTIDIR is now finally set correctly.  Instead use
> $(toolexeclibdir).
>
> This results in some dead variables in the Makefile.am (and there were
> some before), I refrained from doing even more changes here.
>
> Verified with an install with and without 
> --enable-version-specific-runtime-libs
> and checking the result.
>
> OK?
>
> Thanks,
> Richard.

Many thanks for this fix - and the deep magic AC_FOREACH config-ml.in
recursion runes.  LGTM

regards,
Gaius


Re: [PATCH] modula2/108144 - Fix multilib install of libgm2

2023-01-20 Thread NightStrike via Gcc-patches
On Fri, Jan 20, 2023 at 1:40 PM Gaius Mulley via Gcc-patches
 wrote:
>
> Richard Biener  writes:
>
> > The following adjusts libgm2 to properly use the multilib build
> > infrastructure, thereby fixing the install with
> > --enable-version-specific-runtime-libs
> >
> > In particular config-ml.pl needs to be applied to generated Makefiles
> > as documented in the manual and we have to avoid clobbering the
> > variables via make arguments.  The explicit install rules used different
> > ways to construct the multilib dir which isn't necessary and breaks
> > when MULTIDIR is now finally set correctly.  Instead use
> > $(toolexeclibdir).
> >
> > This results in some dead variables in the Makefile.am (and there were
> > some before), I refrained from doing even more changes here.
> >
> > Verified with an install with and without 
> > --enable-version-specific-runtime-libs
> > and checking the result.
> >
> > OK?
> >
> > Thanks,
> > Richard.
>
> Many thanks for this fix - and the deep magic AC_FOREACH config-ml.in
> recursion runes.  LGTM
>
> regards,
> Gaius

AC_FOREACH is obsolete and shouldn't be used in new code.  It's been
replaced with m4_foreach_w:
https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Obsolete-Macros.html


Re: [PATCH 3/4] libbacktrace: work with aslr on windows

2023-01-20 Thread Eli Zaretskii via Gcc-patches
> Date: Fri, 20 Jan 2023 17:46:59 +0100
> Cc: gcc-patches@gcc.gnu.org, g...@gcc.gnu.org
> From: Gabriel Ravier 
> 
> On 1/20/23 14:39, Eli Zaretskii via Gcc wrote:
> >> From: Björn Schäpers 
> >> Date: Fri, 20 Jan 2023 11:54:08 +0100
> >>
> >> @@ -856,7 +870,12 @@ coff_add (struct backtrace_state *state, int descriptor,
> >>  + (sections[i].offset - min_offset));
> >>   }
> >>   
> >> -  if (!backtrace_dwarf_add (state, /* base_address */ 0, &dwarf_sections,
> >> +#ifdef HAVE_WINDOWS_H
> >> +module_handle = (uintptr_t) GetModuleHandleW (NULL);
> >> +base_address = module_handle - image_base;
> >> +#endif
> >> +
> >> +  if (!backtrace_dwarf_add (state, base_address, &dwarf_sections,
> >>0, /* FIXME: is_bigendian */
> >>NULL, /* altlink */
> >>error_callback, data, fileline_fn,
> > Why do you force using the "wide" APIs here?  Won't GetModuleHandle do
> > the job, whether it resolves to GetModuleHandleA or GetModuleHandleW?
> 
> I would expect the reason to be either that:
> 
> - using wide APIs with Windows is generally considered to be a best 
> practice, even when not strictly needed (and in this case I can't see 
> any problem with doing so, unless maybe we want the code to work with 
> Windows 95 or something like that...)

There's no reason to forcibly break GDB on platforms where wide APIs
are not available.

> - using the narrow API somehow has an actual drawback, for example maybe 
> it might not work if the name of the exe file the NULL will tell it to 
> get a handle to contains wide characters

Native Windows port of GDB doesn't support Unicode file names anyway,
which is why you used the *A APIs elsewhere in the patch, and
rightfully so.  So there's no reason to use "wide" APIs in this one
place, and every reason not to.
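
To make the point above concrete, here is a minimal sketch (not the
submitted patch) of the same base-address computation written against the
generic GetModuleHandle macro.  The helper names load_bias and
current_module_base are hypothetical; only GetModuleHandle, image_base and
the subtraction itself come from the thread.  With a NULL argument no file
name is passed at all, so the narrow/wide distinction should not matter.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Difference between the address the image was actually loaded at and the
// preferred ImageBase recorded in the PE header.  Unsigned arithmetic, so a
// "negative" bias wraps around in a well-defined way.
inline std::uintptr_t
load_bias (std::uintptr_t actual_base, std::uintptr_t image_base)
{
  return actual_base - image_base;
}

#ifdef _WIN32
// Hypothetical helper: GetModuleHandle is the macro that resolves to
// GetModuleHandleA or GetModuleHandleW depending on UNICODE.
inline std::uintptr_t
current_module_base ()
{
  return reinterpret_cast<std::uintptr_t> (GetModuleHandle (NULL));
}
#endif
```

On Windows the call would then read
load_bias (current_module_base (), image_base); elsewhere the bias simply
stays 0, as in the unpatched code.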


Re: [PATCH] c++: Quash bogus -Wunused-value with new [PR107797]

2023-01-20 Thread Jason Merrill via Gcc-patches

On 1/19/23 21:03, Marek Polacek wrote:

We shouldn't emit "right operand of comma operator has no effect"
when that comma operator was created by the compiler for "new int{}".
convert_to_void/COMPOUND_EXPR already checks warning_suppressed_p so
we can just suppress -Wunused-value.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/107797

gcc/cp/ChangeLog:

* cvt.cc (ocp_convert): copy_warning when creating a new
COMPOUND_EXPR.
* init.cc (build_new_1): Suppress -Wunused-value on
compiler-generated COMPOUND_EXPRs.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wunused-value-1.C: New test.
---
  gcc/cp/cvt.cc   |  6 --
  gcc/cp/init.cc  |  2 ++
  gcc/testsuite/g++.dg/warn/Wunused-value-1.C | 12 
  3 files changed, 18 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wunused-value-1.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index f816c474cef..52e96fbe590 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -3800,6 +3800,8 @@ build_new_1 (vec **placement, tree type, tree nelts,
if (cookie_expr)
  rval = build2 (COMPOUND_EXPR, TREE_TYPE (rval), cookie_expr, rval);
  
+  suppress_warning (rval, OPT_Wunused_value);


This makes sense, but IIUC since rval is built with no location, this 
just sets nowarning_flag?



if (rval == data_addr && TREE_CODE (alloc_expr) == TARGET_EXPR)
  /* If we don't have an initializer or a cookie, strip the TARGET_EXPR
 and return the call (which doesn't need to be adjusted).  */
diff --git a/gcc/cp/cvt.cc b/gcc/cp/cvt.cc
index 0cbfd8060cb..17827d06a4a 100644
--- a/gcc/cp/cvt.cc
+++ b/gcc/cp/cvt.cc
@@ -711,8 +711,10 @@ ocp_convert (tree type, tree expr, int convtype, int flags,
return error_mark_node;
   if (e == TREE_OPERAND (expr, 1))
return expr;
-  return build2_loc (EXPR_LOCATION (expr), COMPOUND_EXPR, TREE_TYPE (e),
-TREE_OPERAND (expr, 0), e);
+  e = build2_loc (EXPR_LOCATION (expr), COMPOUND_EXPR, TREE_TYPE (e),
+ TREE_OPERAND (expr, 0), e);
+  copy_warning (e, expr);


And so I don't know what effect this would have; copy_warning doesn't 
seem to propagate nowarning_flag, which seems like a bug.



+  return e;
 }
 
   complete_type (type);



diff --git a/gcc/testsuite/g++.dg/warn/Wunused-value-1.C b/gcc/testsuite/g++.dg/warn/Wunused-value-1.C
new file mode 100644
index 000..2ba5587fce0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wunused-value-1.C
@@ -0,0 +1,12 @@
+// PR c++/107797
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wunused" }
+
+void
+g ()
+{
+  (long) new int{};
+  long(new int{});
+  (long) new int();
+  long(new int());
+}

base-commit: 86caab6c5d1e26e1c54c3dceacc873d6e27bfc09




[ping3][PATCH 0/2] __bos and flex arrays

2023-01-20 Thread Siddhesh Poyarekar

ping!

On 2022-12-21 17:25, Siddhesh Poyarekar wrote:

Hi,

The first patch in the series is just a minor test cleanup that I did to
make sure all tests in a test case run (instead of aborting at first
failure) and print the ones that failed.  The second patch is the actual
fix.

The patch intends to make __bos/__bdos do the right thing with structs
containing flex arrays, either directly or within nested structs and
unions.  This should improve minimum object size estimation in some
cases and also bail out more consistently so that flex arrays don't
cause false positives in fortification.

I've tested this with a bootstrap on x86_64 and also with
--with-build-config=bootstrap-ubsan to make sure that there are no new
failures due to this change.

Siddhesh Poyarekar (2):
   testsuite: Run __bos tests to completion
   tree-object-size: More consistent behaviour with flex arrays

  .../g++.dg/ext/builtin-object-size1.C | 267 
  .../g++.dg/ext/builtin-object-size2.C | 267 
  .../gcc.dg/builtin-dynamic-object-size-0.c|  14 +-
  gcc/testsuite/gcc.dg/builtin-object-size-1.c  | 263 
  gcc/testsuite/gcc.dg/builtin-object-size-12.c |  12 +-
  gcc/testsuite/gcc.dg/builtin-object-size-13.c |  17 +-
  gcc/testsuite/gcc.dg/builtin-object-size-15.c |  11 +-
  gcc/testsuite/gcc.dg/builtin-object-size-2.c  | 287 +-
  gcc/testsuite/gcc.dg/builtin-object-size-3.c  | 263 
  gcc/testsuite/gcc.dg/builtin-object-size-4.c  | 267 
  gcc/testsuite/gcc.dg/builtin-object-size-6.c  | 267 
  gcc/testsuite/gcc.dg/builtin-object-size-7.c  |  52 ++--
  gcc/testsuite/gcc.dg/builtin-object-size-8.c  |  17 +-
  .../gcc.dg/builtin-object-size-common.h   |  12 +
  .../gcc.dg/builtin-object-size-flex-common.h  |  90 ++
  ...n-object-size-flex-nested-struct-nonzero.c |   6 +
  ...ltin-object-size-flex-nested-struct-zero.c |   6 +
  .../builtin-object-size-flex-nested-struct.c  |  22 ++
  ...in-object-size-flex-nested-union-nonzero.c |   6 +
  ...iltin-object-size-flex-nested-union-zero.c |   6 +
  .../builtin-object-size-flex-nested-union.c   |  28 ++
  .../gcc.dg/builtin-object-size-flex-nonzero.c |   6 +
  .../gcc.dg/builtin-object-size-flex-zero.c|   6 +
  .../gcc.dg/builtin-object-size-flex.c |  18 ++
  gcc/testsuite/gcc.dg/pr101836.c   |  11 +-
  gcc/testsuite/gcc.dg/strict-flex-array-3.c|  11 +-
  gcc/tree-object-size.cc   | 150 -
  27 files changed, 1275 insertions(+), 1107 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-common.h
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-common.h
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-struct-nonzero.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-struct-zero.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-struct.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-union-nonzero.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-union-zero.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nested-union.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-nonzero.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex-zero.c
  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-flex.c



Re: [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]

2023-01-20 Thread Jakub Jelinek via Gcc-patches
On Fri, Jan 20, 2023 at 07:00:18PM +0100, Jakub Jelinek via Gcc-patches wrote:
> Though, I wonder if we shouldn't for GCC 13 just sorry_at about
> steps other than constant 1/-1 (in both outer loop with var-outer referenced
> in inner loop and on inner loop that references it) and for the !VAR_P case
> actually handle it if step 1/-1 by using simple like translation just with
> an artificial iterator.

As for the steps other than constant 1/-1, we have 5 cases:
  do i = x, y, 25
or
  do i = 12, 72, z
or
  do i = x, y, -42
or
  do i = 42, -10, z
or
  do i = x, y, z
The 1st and 3rd are with constant step, 2nd and 4th with constant lower and
upper bounds and the last one has step and at least one of the bounds
non-constant.

I wonder if, in light of e.g. PR108431, which says that
do i = -huge(i), huge(i) is invalid (well, that one would be very wrong
even from the OpenMP POV because computing the number of iterations definitely
overflows), and the fact that we handle step 1 and -1 the simple way,
so that do i = huge(i) - 10, huge(i) will not work either, even
do i = huge(i) - 5, huge(i) - 1, 2 is undefined (similar reasoning: if
i after the loop needs to be set to huge(i) + 1, it is signed integer
overflow).  If yes, then perhaps at least the first 4 cases could be easily
handled (perhaps for GCC 13 just if clauses->non_rectangular only) as
for (i = x; i <= y; i += 25)
or
for (i = 12; i <= 72; i += z)
or
for (i = x; i >= y; i -= 42)
or
for (i = 42; i >= -10; i += z)

If those give equivalent behavior, then that would mean a sorry
only for the last case - the problem is that we then don't know at compile
time the direction.
Though perhaps even for that case we could play tricks, handle
  do i = x, y, z
as
if (z > 0)
  a = x, b = y, c = z;
else
  a = INT_MIN, b = too_lazy_to_compute_that_now, c = -z;
for (counter = a; counter <= b; counter += c)
{
  if (z > 0)
i = counter;
  else
i = counter - (unsigned) INT_MAX;
}
If that works, we'd also need to figure out how to handle that
in the non-rect cases.  But the m1 * var-outer + a1 and m2 * var-outer + a2
factors can be non-constant invariants, so again we could compute something
for them depending on if the outer or inner step was positive or negative.
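
Whether the simple lowering of the first four cases gives equivalent
behavior can be checked on small values.  The following is only a sketch
under assumptions: fortran_trip_count uses the usual Fortran DO iteration
count MAX ((last - first + step) / step, 0), lowered_trip_count counts the
iterations of the C-style loop with the comparison direction chosen from
the sign of the step, and both function names are made up for illustration.
It says nothing about the overflow cases near huge(i) discussed above.

```cpp
#include <cassert>

// Fortran DO trip count: MAX ((last - first + step) / step, 0).
// Integer division truncates toward zero in both Fortran and C++.
long
fortran_trip_count (long first, long last, long step)
{
  assert (step != 0);
  long n = (last - first + step) / step;
  return n > 0 ? n : 0;
}

// Trip count of the lowered loop: "i <= last" for a positive step,
// "i >= last" for a negative one.
long
lowered_trip_count (long first, long last, long step)
{
  assert (step != 0);
  long n = 0;
  if (step > 0)
    for (long i = first; i <= last; i += step)
      n++;
  else
    for (long i = first; i >= last; i += step)
      n++;
  return n;
}
```

For bounds well inside the range of the type the two counts agree; the
interesting part that remains is the value of i after the loop and the
case where the sign of the step is not known at compile time.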

Jakub



Clean up after newlib "nvptx: In offloading execution, map '_exit' to 'abort' [GCC PR85463]"

2023-01-20 Thread Thomas Schwinge
Hi!

Re the newlib commit 05a2d7a8b3277b469e7cb121115bba398adc8559
"nvptx: In offloading execution, map '_exit' to 'abort' [GCC PR85463]"
that I've just pushed to newlib main branch:

On 2023-01-19T23:00:05+0100, I wrote:
> This is still not properly resolving 
> '[nvptx] "exit" in offloaded region doesn't terminate process', but is
> one step into that direction, and allows for simplifying some GCC code.

> --- a/newlib/libc/machine/nvptx/_exit.c
> +++ b/newlib/libc/machine/nvptx/_exit.c

> @@ -26,7 +27,15 @@ void __attribute__((noreturn))
>  _exit (int status)
>  {
>if (__exitval_ptr)
> -*__exitval_ptr = status;
> -  for (;;)
> -asm ("exit;" ::: "memory");
> +{
> +  *__exitval_ptr = status;
> +  for (;;)
> +   asm ("exit;" ::: "memory");
> +}
> +  else /* offloading */
> +{
> +  /* Map to 'abort'; see 
> +'[nvptx] "exit" in offloaded region doesn't terminate process'.  */
> +  abort ();
> +}
>  }

That has put "the PR85463 stuff" into the one central place, and allows
for simplifying GCC as per the attached
'Clean up after newlib "nvptx: In offloading execution, map '_exit' to 'abort' [GCC PR85463]"',
which I've just pushed to GCC devel/omp/gcc-12 branch in
commit 094b379f461bb4b635327cde26eabc0966159fec, and intend to push to
GCC master branch once the latter depends on updated newlib for other
(functional) reasons.


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; limited liability company; Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: München; Commercial register:
München, HRB 106955
From 094b379f461bb4b635327cde26eabc0966159fec Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 19 Jan 2023 20:25:45 +0100
Subject: [PATCH] Clean up after newlib "nvptx: In offloading execution, map
 '_exit' to 'abort' [GCC PR85463]"

	PR target/85463
	libgfortran/
	* runtime/minimal.c [__nvptx__] (exit): Don't override.
	libgomp/
	* config/nvptx/error.c (exit): Don't override.
	* testsuite/libgomp.oacc-fortran/error_stop-1.f: Update.
	* testsuite/libgomp.oacc-fortran/error_stop-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/error_stop-3.f: Likewise.
	* testsuite/libgomp.oacc-fortran/stop-1.f: Likewise.
	* testsuite/libgomp.oacc-fortran/stop-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/stop-3.f: Likewise.
---
 libgfortran/ChangeLog.omp   |  4 
 libgfortran/runtime/minimal.c   |  8 
 libgomp/ChangeLog.omp   |  9 +
 libgomp/config/nvptx/error.c|  7 ---
 .../testsuite/libgomp.oacc-fortran/error_stop-1.f   |  8 +---
 .../testsuite/libgomp.oacc-fortran/error_stop-2.f   |  8 +---
 .../testsuite/libgomp.oacc-fortran/error_stop-3.f   |  8 +---
 libgomp/testsuite/libgomp.oacc-fortran/stop-1.f | 13 +
 libgomp/testsuite/libgomp.oacc-fortran/stop-2.f |  6 +-
 libgomp/testsuite/libgomp.oacc-fortran/stop-3.f | 12 
 10 files changed, 50 insertions(+), 33 deletions(-)
 create mode 100644 libgfortran/ChangeLog.omp

diff --git a/libgfortran/ChangeLog.omp b/libgfortran/ChangeLog.omp
new file mode 100644
index 000..b08c264daf9
--- /dev/null
+++ b/libgfortran/ChangeLog.omp
@@ -0,0 +1,4 @@
+2023-01-20  Thomas Schwinge  
+
+	PR target/85463
+	* runtime/minimal.c [__nvptx__] (exit): Don't override.
diff --git a/libgfortran/runtime/minimal.c b/libgfortran/runtime/minimal.c
index 326ff822ca7..5af2bada2f6 100644
--- a/libgfortran/runtime/minimal.c
+++ b/libgfortran/runtime/minimal.c
@@ -31,14 +31,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #endif
 
 
-#if __nvptx__
-/* Map "exit" to "abort"; see PR85463 '[nvptx] "exit" in offloaded region
-   doesn't terminate process'.  */
-# undef exit
-# define exit(status) do { (void) (status); abort (); } while (0)
-#endif
-
-
 #if __nvptx__
 /* 'printf' is all we have.  */
 # undef estr_vprintf
diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index 134d450f44a..33aa4b01350 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,5 +1,14 @@
 2023-01-20  Thomas Schwinge  
 
+	PR target/85463
+	* config/nvptx/error.c (exit): Don't override.
+	* testsuite/libgomp.oacc-fortran/error_stop-1.f: Update.
+	* testsuite/libgomp.oacc-fortran/error_stop-2.f: Likewise.
+	* testsuite/libgomp.oacc-fortran/error_stop-3.f: Likewise.
+	* testsuite/libgomp.oacc-fortran/stop-1.f: Likewise.
+	* testsuite/libgomp.oacc-fortran/stop-2.f: Likewise.
+	* testsuite/libgomp.oacc-fortran/stop-3.f: Likewise.
+
 	* testsuite/libgomp.c/simd-math-1.c: Fix configuration, again.
 
 	* testsuite/libgomp.oacc-c-c++-common/abort-3.c: Force
diff --git a/libgomp/config/nvptx/error.c b/libgomp/config/nvptx/error.c
index ab99130ed4a..1756eaeee53 100644
--- a/libgomp/config/nvptx/error.c
+

Re: [PATCH v3] c++: -Wdangling-reference with reference wrapper [PR107532]

2023-01-20 Thread Jason Merrill via Gcc-patches

On 1/19/23 21:03, Marek Polacek wrote:

On Thu, Jan 19, 2023 at 01:02:02PM -0500, Jason Merrill wrote:

On 1/18/23 20:13, Marek Polacek wrote:

On Wed, Jan 18, 2023 at 04:07:59PM -0500, Jason Merrill wrote:

On 1/18/23 12:52, Marek Polacek wrote:

Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

 const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

So I figured that perhaps we want to look at the object we're invoking
the member function(s) on and see if that is a temporary, as in, don't
warn about

 const Plane & meta = fm.planes().inner();

but do warn about

 const Plane & meta = FrameMetadata().planes().inner();

It's ugly, but better than asking users to add #pragmas into their code.


Hmm, that doesn't seem right; the former is only OK because Ref is in fact a
reference-like type.  If planes() returned a class that held data, we would
want to warn.


Sure, it's always some kind of tradeoff with warnings :/.

In this case, we might recognize the reference-like class because it has a
reference member and a constructor taking the same reference type.


That occurred to me too, but then I found out that std::reference_wrapper
actually uses T*, not T&, as you say.  But here's a patch to do that
(I hope).

That wouldn't help with std::reference_wrapper or std::ref_view because they
have pointer members instead of references, but perhaps loosening the check
to include that case would make sense?


Sorry, I don't understand what you mean by loosening the check.  I could
hardcode std::reference_wrapper and std::ref_view but I don't think that's
what you meant.


Indeed that's not what I meant, but as I was saying in our meeting I think
it's worth doing; the compiler has various tweaks to handle specific
standard-library classes better.
  
Okay, done in the patch below.  Except that I'm not including a test for
std::ranges::ref_view because I don't really know how that works.


Surely I cannot _not_ warn for any class that contains a T*.


I was thinking if a constructor takes a T& and the class has a T* that would
be close enough, though this also wouldn't handle the standard library
classes so the benefit is questionable.


Here's the patch so that we have some actual code to discuss...  Thanks.

-- >8 --
Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

Perhaps we want to look at the member function's enclosing class
to see if it's a reference wrapper class (meaning, has a reference
member and a constructor taking the same reference type) and don't
warn if so, supposing that the member function returns a reference
to a non-temporary object.

It's ugly, but better than asking users to add #pragmas into their code.

PR c++/107532

gcc/cp/ChangeLog:

* call.cc (do_warn_dangling_reference): Don't warn when the
member function comes from a reference wrapper class.


Let's factor the new code out into e.g. reference_like_class_p


Done.  Thanks,

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

   const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

Perhaps we want to look at the member function's enclosing class
to see if it's a reference wrapper class (meaning, has a reference
member and a constructor taking the same reference type, or is
std::reference_wrapper or std::ranges::ref_view) and don't warn if so,
supposing that the member function returns a reference to a non-temporary
object.

It's ugly, but better than asking users to add #pragmas into their code.
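
For readers without the PR at hand, the pattern being described looks
roughly like the sketch below.  Plane, FrameMetadata, planes() and inner()
are the names used in this thread; PlaneRef, the stride member and
stride_of are hypothetical, added only to make the example compile.
PlaneRef has a reference member and a constructor taking the same
reference type, which is the heuristic described above.

```cpp
struct Plane
{
  int stride = 0;
};

// Reference-like wrapper: a reference member plus a constructor taking
// the same reference type.
class PlaneRef
{
public:
  explicit PlaneRef (Plane &p) : plane_ (p) {}
  const Plane &inner () const { return plane_; }

private:
  Plane &plane_;
};

struct FrameMetadata
{
  Plane plane;
  PlaneRef planes () { return PlaneRef (plane); }
};

inline int
stride_of (FrameMetadata &fm)
{
  // The PlaneRef temporary dies at the end of the full expression, but
  // 'meta' refers to fm.plane, which outlives it, so nothing dangles.
  const Plane &meta = fm.planes ().inner ();
  return meta.stride;
}
```

If planes() instead returned a class holding its own Plane by value, the
same initialization would dangle, which is the case the warning is still
meant to catch.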

PR c++/107532

gcc/cp/ChangeLog:

* call.cc (reference_like_class_p): New.
(do_warn_dangling_reference): Don't warn when the member function comes
 

[og12] nvptx: Make 'nvptx_uniform_warp_check' fit for non-full-warp execution

2023-01-20 Thread Thomas Schwinge
Hi!

On 2022-12-15T19:27:08+0100, I wrote:
> [...] I'd like to make 'nvptx_uniform_warp_check'
> fit for non-full-warp execution.  For example, to be able to execute such
> code in single-threaded 'cuLaunchKernel' for execution of global
> constructors/destructors, where those may, for example, call into nvptx
> target libraries compiled with '-mgomp' (thus, '-muniform-simt').
>
> OK to push (after proper testing, and with TODO markers adjusted/removed)
> the attached
> "nvptx: Make 'nvptx_uniform_warp_check' fit for non-full-warp execution"?

For now pushed, still with TODO markers, to devel/omp/gcc-12 branch in
commit d26a2a299392af330b3576b62d4eb6c81820be29
"nvptx: Make 'nvptx_uniform_warp_check' fit for non-full-warp execution",
see attached.


Grüße
 Thomas


From d26a2a299392af330b3576b62d4eb6c81820be29 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 12 Dec 2022 22:05:37 +0100
Subject: [PATCH] nvptx: Make 'nvptx_uniform_warp_check' fit for non-full-warp
 execution

For example, this allows for '-muniform-simt' code to be executed
single-threaded, which currently fails (device-side 'trap'), as the 0xffffffff
mask isn't correct if not all 32 threads of a warp are active.  The same
issue/fix, I suppose but have not verified, would apply if we were to allow for
OpenACC 'vector_length' smaller than 32, for example for OpenACC 'serial'.

We use 'nvptx_uniform_warp_check' only for PTX ISA version less than 6.0.
Otherwise we're using 'nvptx_warpsync', which emits 'bar.warp.sync 0xffffffff',
which evidently appears to do the right thing.  (I've tested '-muniform-simt'
code executing single-threaded.)
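
The expected mask can be modelled on the host as follows.  This is a
sketch only: expected_warp_mask is a hypothetical name, and it assumes, as
the patch does, that lane IDs are assigned in ascending order.  The patch
computes (1 << %ntid.x) - 1 in 32-bit PTX registers and relies on the
shift amount being clamped; the sketch makes that clamp explicit and uses
a 64-bit intermediate, because shifting a 32-bit value by 32 is undefined
in C++.

```cpp
#include <cassert>
#include <cstdint>

// Mask with one bit set for each lane of the warp that was launched.
// ntid_x models '%ntid.x'; anything of 32 or more means a full warp.
inline std::uint32_t
expected_warp_mask (unsigned ntid_x)
{
  assert (ntid_x >= 1);
  unsigned lanes = ntid_x < 32 ? ntid_x : 32;
  return static_cast<std::uint32_t> ((std::uint64_t{1} << lanes) - 1);
}
```

A single-threaded launch thus expects mask 0x1, while any
'vector_length' that is a multiple of 32 still yields the full mask.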

	gcc/
	* config/nvptx/nvptx.md (nvptx_uniform_warp_check): Make fit for
	non-full-warp execution.
	gcc/testsuite/
	* gcc.target/nvptx/nvptx.exp
	(check_effective_target_default_ptx_isa_version_at_least_6_0):
	New.
	* gcc.target/nvptx/uniform-simt-5.c: New.
	libgomp/
	* plugin/plugin-nvptx.c (nvptx_exec): Assert what we know about
	'blockDimX'.
---
 gcc/ChangeLog.omp |  5 
 gcc/config/nvptx/nvptx.md | 16 ++-
 gcc/testsuite/ChangeLog.omp   |  7 +
 gcc/testsuite/gcc.target/nvptx/nvptx.exp  |  5 
 .../gcc.target/nvptx/uniform-simt-5.c | 28 +++
 libgomp/ChangeLog.omp |  3 ++
 libgomp/plugin/plugin-nvptx.c |  3 ++
 7 files changed, 66 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/nvptx/uniform-simt-5.c

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index 2d4b7513413..382cd5c80c2 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,3 +1,8 @@
+2023-01-20  Thomas Schwinge  
+
+	* config/nvptx/nvptx.md (nvptx_uniform_warp_check): Make fit for
+	non-full-warp execution.
+
 2023-01-19  Tobias Burnus  
 
 	Backported from master:
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 04c150b8982..d27126556ce 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -2321,10 +2321,24 @@
   "{",
   "\\t"		  ".reg.b32"	"\\t" "%%r_act;",
   "%.\\t"		  "vote.ballot.b32" "\\t" "%%r_act,1;",
+  /* For '%r_exp', we essentially need 'activemask.b32', but that is "Introduced in PTX ISA version 6.2", and this code here is used only 'if (!TARGET_PTX_6_0)'.  Thus, emulate it.
+ TODO Is that actually correct?  Wouldn't 'activemask.b32' rather replace our 'vote.ballot.b32' given that it registers the *currently active threads*?  */
+  /* Compute the "membermask" of all threads of the warp that are expected to be converged here.
+  	 For OpenACC, '%ntid.x' is 'vector_length', which per 'nvptx_goacc_validate_dims' always is a multiple of 32.
+	 For OpenMP, '%ntid.x' always is 32.
+  	 Thus, this is typically 0xffffffff, but additionally always for the case that not all 32 threads of the warp have been launched.
+	 This assume that lane IDs are assigned in ascending order.  */
+  //TODO Can we rely on '1 << 32 == 0', and '0 - 1 = 0xffffffff'?
+  //TODO https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/
+  //TODO https://stackoverflow.com/questions/54055195/activemask-vs-ballot-sync
+  "\\t"		  ".reg.b32"	"\\t" "%%r_exp;",
+  "%.\\t"		  "mov.b32"	"\\t" "%%r_exp, %%ntid.x;",
+  "%.\\t"		  "shl.b32"	"\\t" "%%r_exp, 1, %%r_exp;",
+  "%.\\t"		  "sub.u32"	"\\t" "%%r_exp, %%r_exp, 1;",
   "\\t"		  ".reg.pred"	"\\t" "%%r_do_abort;",
   "\\t"		  "mov.pred"	"\\t" "%%r_do_abort,0;",
   "%.\\t"		  "setp.ne.b32"	"\\t" "%%r_do_abort,%%r_act,"
-		  "0xffffffff;",
+		  "%%r_exp;",
   "@ %%r_do_abort\\t" "trap;",
   "@ %%r_do_ab

[og12] Add 'gcc.target/nvptx/softstack-decl-1.c', 'gcc.target/nvptx/uniform-simt-decl-1.c'

2023-01-20 Thread Thomas Schwinge
Hi!

On 2022-12-19T21:40:06+0100, Thomas Schwinge  wrote:
> ... to document the status quo re implicit (via 'need_softstack_decl',
> 'need_unisimt_decl') and explicit declarations of '__nvptx_stacks',
> '__nvptx_uni'.

For now pushed to devel/omp/gcc-12 branch in
commit 703ebfdb483fdade316ceb003a0d57ca132a090b
"Add 'gcc.target/nvptx/softstack-decl-1.c', 'gcc.target/nvptx/uniform-simt-decl-1.c'",
see attached.


Grüße
 Thomas


From 703ebfdb483fdade316ceb003a0d57ca132a090b Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 19 Dec 2022 17:10:52 +0100
Subject: [PATCH] Add 'gcc.target/nvptx/softstack-decl-1.c',
 'gcc.target/nvptx/uniform-simt-decl-1.c'

... to document the status quo re implicit (via 'need_softstack_decl',
'need_unisimt_decl') and explicit declarations of '__nvptx_stacks',
'__nvptx_uni'.

	gcc/testsuite/
	* gcc.target/nvptx/softstack-decl-1.c: New.
	* gcc.target/nvptx/uniform-simt-decl-1.c: Likewise.
---
 gcc/testsuite/ChangeLog.omp   |  3 ++
 .../gcc.target/nvptx/softstack-decl-1.c   | 20 +
 .../gcc.target/nvptx/uniform-simt-decl-1.c| 29 +++
 3 files changed, 52 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
 create mode 100644 gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c

diff --git a/gcc/testsuite/ChangeLog.omp b/gcc/testsuite/ChangeLog.omp
index 7339bf41482..5b3d9fe416b 100644
--- a/gcc/testsuite/ChangeLog.omp
+++ b/gcc/testsuite/ChangeLog.omp
@@ -1,5 +1,8 @@
 2023-01-20  Thomas Schwinge  
 
+	* gcc.target/nvptx/softstack-decl-1.c: New.
+	* gcc.target/nvptx/uniform-simt-decl-1.c: Likewise.
+
 	* gcc.target/nvptx/nvptx.exp
 	(check_effective_target_default_ptx_isa_version_at_least_6_0):
 	New.
diff --git a/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c b/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
new file mode 100644
index 000..c502eacc1b3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options {-save-temps -O0 -msoft-stack} } */
+
+extern void *__nvptx_stacks[32] __attribute__((shared,nocommon));
+
+void *f()
+{
+  /* Implicit '__nvptx_stacks' usage for frame; per 'init_softstack_frame':
+ { dg-final { scan-assembler-times {mov\.u64 %fstmp2, __nvptx_stacks;} 1 } }
+  */
+  void *stack_array[123];
+  /* Explicit '__nvptx_stacks' usage.  */ 
+  stack_array[5] = __nvptx_stacks[0];
+  return stack_array[5];
+}
+
+/* The implicit (via 'need_softstack_decl') and explicit declarations of
+   '__nvptx_stacks' are both emitted:
+   { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_stacks\[32\];} 2 } }
+*/
diff --git a/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c b/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
new file mode 100644
index 000..486456ab243
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options {-save-temps -O0 -muniform-simt} } */
+
+extern unsigned __nvptx_uni[32] __attribute__((shared,nocommon));
+
+enum memmodel
+{
+  MEMMODEL_RELAXED = 0,
+};
+
+int a = 0;
+
+int f (void)
+{
+  /* Explicit '__nvptx_uni' usage.  */
+  __builtin_printf("%u\n", __nvptx_uni[0]);
+
+  /* Implicit '__nvptx_uni' usage; per 'nvptx_init_unisimt_predicate':
+ { dg-final { scan-assembler-times {mov\.u64 %r[0-9]+, __nvptx_uni;} 1 } }
+  */
+  int expected = 1;
+  return __atomic_compare_exchange_n (&a, &expected, 0, 0, MEMMODEL_RELAXED,
+  MEMMODEL_RELAXED);
+}
+
+/* The implicit (via 'need_unisimt_decl') and explicit declarations of
+   '__nvptx_uni' are both emitted:
+   { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_uni\[32\];} 2 } }
+*/
-- 
2.25.1



[og12] nvptx: Prevent emitting duplicate declarations for '__nvptx_stacks', '__nvptx_uni'

2023-01-20 Thread Thomas Schwinge
Hi!

On 2022-12-19T21:40:07+0100, Thomas Schwinge  wrote:
> As I have reported to Nvidia in 2022-12-01 'NVIDIA Incident Report (3891704):
> ptxas: Duplicate declaration error: "cannot be resolved by a '.static'"',
> 'ptxas' has an inscrutable error mode for duplicate declarations:
>
> ptxas softstack-decl-1.o, line 11; error   : '.extern' variable '__nvptx_stacks' cannot be resolved by a '.static'
> ptxas fatal   : Ptx assembly aborted due to errors
> nvptx-as: ptxas returned 255 exit status
>
> ptxas uniform-simt-decl-1.o, line 12; error   : '.extern' variable '__nvptx_uni' cannot be resolved by a '.static'
> ptxas fatal   : Ptx assembly aborted due to errors
> nvptx-as: ptxas returned 255 exit status
>
> This is inscrutable, because (a) what is "cannot be resolved by a '.static'"
> supposed to tell me (there is no '.static' in PTX?), and (b) why aren't
> repeated declarations just verified to match the first, but otherwise a no-op
> (like in other programming languages)?

For now pushed to devel/omp/gcc-12 branch in
commit ea52f1ca16870e4228f8044588b1bf958d4723b0
"nvptx: Prevent emitting duplicate declarations for '__nvptx_stacks', '__nvptx_uni'",
see attached.


Grüße
 Thomas


From ea52f1ca16870e4228f8044588b1bf958d4723b0 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 19 Dec 2022 17:19:19 +0100
Subject: [PATCH] nvptx: Prevent emitting duplicate declarations for
 '__nvptx_stacks', '__nvptx_uni'

As I have reported to Nvidia in 2022-12-01 'NVIDIA Incident Report (3891704):
ptxas: Duplicate declaration error: "cannot be resolved by a '.static'"',
'ptxas' has an inscrutable error mode for duplicate declarations:

ptxas softstack-decl-1.o, line 11; error   : '.extern' variable '__nvptx_stacks' cannot be resolved by a '.static'
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status

ptxas uniform-simt-decl-1.o, line 12; error   : '.extern' variable '__nvptx_uni' cannot be resolved by a '.static'
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status

This is inscrutable, because (a) what is "cannot be resolved by a '.static'"
supposed to tell me (there is no '.static' in PTX?), and (b) why aren't
repeated declarations just verified to match the first, but otherwise a no-op
(like in other programming languages)?
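
The bookkeeping this calls for can be sketched outside of GCC as below.
Everything here is illustrative: DeclTracker and its member functions are
hypothetical stand-ins for 'nvptx_assemble_undefined_decl' and
'nvptx_file_end', the emitted strings are simplified and not valid PTX,
and the exact condition used at file end is an assumption.  Only the
need_softstack_decl/have_softstack_decl flag pair is taken from the patch.

```cpp
#include <string>
#include <vector>

struct DeclTracker
{
  // Set when some function implicitly references '__nvptx_stacks'.
  bool need_softstack_decl = false;
  // Set once a user-written declaration of it has been emitted.
  bool have_softstack_decl = false;
  // Stand-in for the assembler output stream.
  std::vector<std::string> out;

  // Model of emitting an explicit 'extern' declaration from user code.
  void assemble_undefined_decl (const std::string &name)
  {
    out.push_back (".extern " + name + ";");
    if (name == "__nvptx_stacks")
      have_softstack_decl = true;
  }

  // Model of the end-of-file hook: emit the implicit declaration only
  // if it is needed and has not been emitted already.
  void file_end ()
  {
    if (need_softstack_decl && !have_softstack_decl)
      out.push_back (".extern __nvptx_stacks;");
  }
};
```

The same pairing would apply to '__nvptx_uni' with need_unisimt_decl and
have_unisimt_decl.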

	gcc/
	* config/nvptx/nvptx.cc (nvptx_assemble_undefined_decl): Notice
	'__nvptx_stacks', '__nvptx_uni' declarations.
	(nvptx_file_end): Don't emit duplicate declarations for those.
	gcc/testsuite/
	* gcc.target/nvptx/softstack-decl-1.c: Make 'dg-do assemble',
	adjust.
	* gcc.target/nvptx/uniform-simt-decl-1.c: Likewise.
---
 gcc/ChangeLog.omp  |  4 
 gcc/config/nvptx/nvptx.cc  | 14 --
 gcc/testsuite/ChangeLog.omp|  4 
 gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c  |  8 
 .../gcc.target/nvptx/uniform-simt-decl-1.c |  8 
 5 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index 382cd5c80c2..127b450644b 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,5 +1,9 @@
 2023-01-20  Thomas Schwinge  
 
+	* config/nvptx/nvptx.cc (nvptx_assemble_undefined_decl): Notice
+	'__nvptx_stacks', '__nvptx_uni' declarations.
+	(nvptx_file_end): Don't emit duplicate declarations for those.
+
 	* config/nvptx/nvptx.md (nvptx_uniform_warp_check): Make fit for
 	non-full-warp execution.
 
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index da735cf82ff..9c284ed5b01 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -181,9 +181,11 @@ static GTY(()) tree global_lock_var;
 
 /* True if any function references __nvptx_stacks.  */
 static bool need_softstack_decl;
+static bool have_softstack_decl;
 
 /* True if any function references __nvptx_uni.  */
 static bool need_unisimt_decl;
+static bool have_unisimt_decl;
 
 static int nvptx_mach_max_workers ();
 
@@ -2572,6 +2574,13 @@ nvptx_assemble_undefined_decl (FILE *file, const char *name, const_tree decl)
 			 TREE_TYPE (decl), size ? tree_to_shwi (size) : 0,
 			 DECL_ALIGN (decl), true);
   nvptx_assemble_decl_end ();
+
+  static tree softstack_id = get_identifier ("__nvptx_stacks");
+  static tree unisimt_id = get_identifier ("__nvptx_uni");
+  if (DECL_NAME (decl) == softstack_id)
+have_softstack_decl = true;
+  else if (DECL_NAME (decl) == unisimt_id)
+have_unisimt_decl = true;
 }
 
 /* Output a pattern for a move instruction.  */
@@ -6052,7 +6061,7 @@ nvptx_file_end (void)
 write_shared_buffer (asm_out_file, gang_private_share

Re: [PATCH 3/4] libbacktrace: work with aslr on windows

2023-01-20 Thread Gabriel Ravier via Gcc-patches

On 1/20/23 20:19, Eli Zaretskii wrote:

Date: Fri, 20 Jan 2023 17:46:59 +0100
Cc: gcc-patches@gcc.gnu.org, g...@gcc.gnu.org
From: Gabriel Ravier 

On 1/20/23 14:39, Eli Zaretskii via Gcc wrote:

From: Björn Schäpers 
Date: Fri, 20 Jan 2023 11:54:08 +0100

@@ -856,7 +870,12 @@ coff_add (struct backtrace_state *state, int descriptor,
  + (sections[i].offset - min_offset));
   }
   
-  if (!backtrace_dwarf_add (state, /* base_address */ 0, &dwarf_sections,

+#ifdef HAVE_WINDOWS_H
+module_handle = (uintptr_t) GetModuleHandleW (NULL);
+base_address = module_handle - image_base;
+#endif
+
+  if (!backtrace_dwarf_add (state, base_address, &dwarf_sections,
0, /* FIXME: is_bigendian */
NULL, /* altlink */
error_callback, data, fileline_fn,

Why do you force using the "wide" APIs here?  Won't GetModuleHandle do
the job, whether it resolves to GetModuleHandleA or GetModuleHandleW?

I would expect the reason to be either that:

- using wide APIs with Windows is generally considered to be a best
practice, even when not strictly needed (and in this case I can't see
any problem with doing so, unless maybe we want the code to work with
Windows 95 or something like that...)

There's no reason to forcibly break GDB on platforms where wide APIs
are not available.

Are there even any platforms that have GetModuleHandleA but not
GetModuleHandleW? MSDN states that Windows XP and Windows Server 2003 
are the first versions to support both of the APIs, so if this is 
supposed to work on Windows 98, for instance, whether we're using 
GetModuleHandleA or GetModuleHandleW won't matter.



- using the narrow API somehow has an actual drawback, for example maybe
it might not work if the name of the exe file the NULL will tell it to
get a handle to contains wide characters

Native Windows port of GDB doesn't support Unicode file names anyway,
which is why you used the *A APIs elsewhere in the patch, and
rightfully so.  So there's no reason to use "wide" APIs in this one
place, and every reason not to.


(just as a clarification, I did not write this patch)
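
For reference, the arithmetic in the quoted hunk
('base_address = module_handle - image_base') can be sketched in portable C as
below.  This assumes that the handle returned for a NULL module name is the
address at which the executable was actually loaded, and that 'image_base' is
the preferred load address recorded in the PE header.  The helper names are
hypothetical and not part of libbacktrace.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the load bias is the difference between where the
   image was actually loaded and where it was linked to be loaded.  The
   subtraction is done on uintptr_t, so a "negative" bias simply wraps.  */
static uintptr_t
load_bias (uintptr_t module_handle, uintptr_t image_base)
{
  return module_handle - image_base;
}

/* Hypothetical helper: turn a link-time address from the debug info into
   a run-time address by adding the bias (again with wrap-around).  */
static uintptr_t
relocate (uintptr_t link_time_address, uintptr_t bias)
{
  return link_time_address + bias;
}
```

Because a NULL argument carries no file name, this computation does not depend
on whether the narrow or the wide variant produced the handle.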



[og12] nvptx: Support global constructors/destructors via 'collect2'

2023-01-20 Thread Thomas Schwinge
Hi!

On 2022-12-02T14:35:35+0100, I wrote:
> On 2022-12-01T22:13:38+0100, I wrote:
>> I'm working on support for global constructors/destructors with
>> GCC/nvptx
>
> See "nvptx: Support global constructors/destructors via 'collect2'"
> attached; OK to push?  (... with 'gcc/doc/install.texi' accordingly
> updated once 
> "'nm'" and newlib
> 
> "nvptx: Implement '_exit' instead of 'exit'" have been merged; any
> comments to those?)

For now pushed to devel/omp/gcc-12 branch in
commit fe07b0003bb2092bc34d4bed504be1868b88782d
"nvptx: Support global constructors/destructors via 'collect2'",
see attached.

> Per my quick scanning of 'gcc/config.gcc' history, for more than two
> decades, there was a clear trend to remove 'use_collect2=yes'
> configurations; now finally a new one is being added -- making sure we're
> not slowly dispensing with the need for the early 1990s piece of work
> that 'gcc/collect2*' is...  ;'-P

(I still find that "notable" and "funny" in a certain way.)  ;-*


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [og12] nvptx: Support global constructors/destructors via 'collect2'

2023-01-20 Thread Thomas Schwinge
Hi!

On 2023-01-20T21:41:26+0100, I wrote:
> On 2022-12-02T14:35:35+0100, I wrote:
>> On 2022-12-01T22:13:38+0100, I wrote:
>>> I'm working on support for global constructors/destructors with
>>> GCC/nvptx
>>
>> See "nvptx: Support global constructors/destructors via 'collect2'"
>> attached; OK to push?  (... with 'gcc/doc/install.texi' accordingly
>> updated once 
>> "'nm'" and newlib
>> 
>> "nvptx: Implement '_exit' instead of 'exit'" have been merged; any
>> comments to those?)
>
> For now pushed to devel/omp/gcc-12 branch in
> commit fe07b0003bb2092bc34d4bed504be1868b88782d
> "nvptx: Support global constructors/destructors via 'collect2'",
> see attached.

Now really attached.

>> Per my quick scanning of 'gcc/config.gcc' history, for more than two
>> decades, there was a clear trend to remove 'use_collect2=yes'
>> configurations; now finally a new one is being added -- making sure we're
>> not slowly dispensing with the need for the early 1990s piece of work
>> that 'gcc/collect2*' is...  ;'-P
>
> (I still find that "notable" and "funny" in a certain way.)  ;-*


Grüße
 Thomas


>From fe07b0003bb2092bc34d4bed504be1868b88782d Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Sun, 13 Nov 2022 14:19:30 +0100
Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2'

The function attributes 'constructor', 'destructor', and 'init_priority' now
work, as do the C++ features making use of this.  Test cases with effective
target 'global_constructor' and 'init_priority' now generally work, and
'check-gcc-c++' test results greatly improve; no more "sorry, unimplemented:
global constructors not supported on this target".
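
As a reminder of what these attributes do, here is a small sketch using the GNU
C 'constructor' and 'destructor' function attributes.  The function and
variable names are made up for illustration; priorities 101 and 102 are chosen
because values up to 100 are reserved for the implementation.

```c
#include <assert.h>

static int order[2];	/* Priorities, in the order the constructors ran.  */
static int count;	/* Number of constructors that have run.  */

/* Runs before 'main'; a lower priority number runs earlier.  */
__attribute__ ((constructor (101))) static void
init_first (void)
{
  order[count++] = 101;
}

__attribute__ ((constructor (102))) static void
init_second (void)
{
  order[count++] = 102;
}

/* Runs after 'main' returns or 'exit' is called.  */
__attribute__ ((destructor)) static void
fini (void)
{
  count = 0;
}
```

By the time 'main' starts, both constructors have run in priority order, so
'count' is 2 and 'order' holds 101 followed by 102.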

This depends on  "'nm'"
generally, and for global destructors support: newlib

"nvptx: Implement '_exit' instead of 'exit'".

	gcc/
	* collect2.cc (write_c_file_glob): Allow for
	'COLLECT2_MAIN_REFERENCE' override.
	* config.gcc : Set 'use_collect2=yes'.
	* config/nvptx/nvptx.h: Adjust.
	gcc/testsuite/
	* gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is
	'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may appear
	in identifiers.
	* lib/target-supports.exp
	(check_effective_target_global_constructor): Enable for nvptx.
	libgcc/
	* config.host : Add 'crtbegin.o',
	'crtend.o' to 'extra_parts'.
	* config/nvptx/crt0.c: Invoke '__do_global_ctors',
	'__do_global_dtors'.
	* config/nvptx/crtstuff.c: New.
	* config/nvptx/t-nvptx: Adjust.
---
 gcc/ChangeLog.omp |  5 ++
 gcc/collect2.cc   |  4 ++
 gcc/config.gcc|  1 +
 gcc/config/nvptx/nvptx.h  | 35 ++-
 gcc/testsuite/ChangeLog.omp   |  6 ++
 .../no_profile_instrument_function-attr-1.c   |  2 +-
 gcc/testsuite/lib/target-supports.exp |  3 +-
 libgcc/ChangeLog.omp  |  9 +++
 libgcc/config.host|  2 +-
 libgcc/config/nvptx/crt0.c|  6 ++
 libgcc/config/nvptx/crtstuff.c| 58 +++
 libgcc/config/nvptx/t-nvptx   | 15 -
 12 files changed, 139 insertions(+), 7 deletions(-)
 create mode 100644 libgcc/config/nvptx/crtstuff.c

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index 127b450644b..ca00bfb48f9 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,5 +1,10 @@
 2023-01-20  Thomas Schwinge  
 
+	* collect2.cc (write_c_file_glob): Allow for
+	'COLLECT2_MAIN_REFERENCE' override.
+	* config.gcc : Set 'use_collect2=yes'.
+	* config/nvptx/nvptx.h: Adjust.
+
 	* config/nvptx/nvptx.cc (nvptx_assemble_undefined_decl): Notice
 	'__nvptx_stacks', '__nvptx_uni' declarations.
 	(nvptx_file_end): Don't emit duplicate declarations for those.
diff --git a/gcc/collect2.cc b/gcc/collect2.cc
index d81c7f28f16..945a9ff86dd 100644
--- a/gcc/collect2.cc
+++ b/gcc/collect2.cc
@@ -2238,8 +2238,12 @@ write_c_file_glob (FILE *stream, const char *name ATTRIBUTE_UNUSED)
 fprintf (stream, "\tdereg_frame,\n");
   fprintf (stream, "\t0\n};\n\n");
 
+# ifdef COLLECT2_MAIN_REFERENCE
+  fprintf (stream, "%s\n\n", COLLECT2_MAIN_REFERENCE);
+# else
   fprintf (stream, "extern entry_pt %s;\n", NAME__MAIN);
   fprintf (stream, "entry_pt *__main_reference = %s;\n\n", NAME__MAIN);
+# endif
 }
 #endif /* ! LD_INIT_SWITCH */
 
diff --git a/gcc/config.gcc b/gcc/config.gcc
index e6b9c864b0d..9c9365886cf 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2835,

[og12] nvptx: Support global constructors/destructors via 'collect2' for offloading (was: nvptx: Support global constructors/destructors via 'collect2')

2023-01-20 Thread Thomas Schwinge
Hi!

On 2022-12-23T14:35:16+0100, I wrote:
> On 2022-12-02T14:35:35+0100, I wrote:
>> On 2022-12-01T22:13:38+0100, I wrote:
>>> I'm working on support for global constructors/destructors with
>>> GCC/nvptx
>>
>> See "nvptx: Support global constructors/destructors via 'collect2'"
>> [posted before]
>
> Building on that, attached is now the additional "for offloading" piece:
> "nvptx: Support global constructors/destructors via 'collect2' for 
> offloading".
> OK to push?

For now pushed to devel/omp/gcc-12 branch in
commit 689a5340c7e4286b451f1bc600342550c7c94da2
"nvptx: Support global constructors/destructors via 'collect2' for offloading",
see attached.

> I did manually test this (by putting a few constructors/destructors into
> 'libgomp/config/nvptx/oacc-parallel.c', and observing them be executed),
> and also in my WIP development tree with standard libgfortran
> constructors (with 'LIBGFOR_MINIMAL' disabled).


Grüße
 Thomas


>From 689a5340c7e4286b451f1bc600342550c7c94da2 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 30 Nov 2022 22:09:35 +0100
Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2'
 for offloading

This extends "nvptx: Support global constructors/destructors via 'collect2'"
for offloading.
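
For readers unfamiliar with 'crtstuff.c', the destructor walk that the '.entry'
wrappers end up calling can be sketched as follows.  This is a simplified
illustration with a hypothetical NULL-terminated table ('dtor_list') and
hypothetical destructors; the real list is assembled by 'collect2' and has a
different layout.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*func_ptr) (void);

static int calls;	/* Records which hypothetical destructors ran.  */

static void dtor_a (void) { calls += 1; }
static void dtor_b (void) { calls += 10; }

/* Hypothetical NULL-terminated table of destructors.  */
static func_ptr dtor_list[] = { dtor_a, dtor_b, NULL };

/* Call every entry once, front to back.  No attempt is made to remember
   where in the list we left off, matching the comment in the patch that
   the function is never re-entered.  */
static void
run_dtors (void)
{
  func_ptr *p;
  for (p = dtor_list; *p; )
    (*p++) ();
}
```

A single call to 'run_dtors' invokes each table entry exactly once and stops at
the NULL terminator.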

	libgcc/
	* config/nvptx/crtstuff.c ["mgomp"]
	(__do_global_ctors__entry__mgomp)
	(__do_global_dtors__entry__mgomp): New.
	[!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry):
	New.
	libgomp/
	* plugin/plugin-nvptx.c (nvptx_do_global_cdtors): New.
	(nvptx_close_device, GOMP_OFFLOAD_load_image)
	(GOMP_OFFLOAD_unload_image): Call it.
---
 libgcc/ChangeLog.omp   |   6 ++
 libgcc/config/nvptx/crtstuff.c |  64 ++-
 libgomp/ChangeLog.omp  |   4 ++
 libgomp/plugin/plugin-nvptx.c  | 113 -
 4 files changed, 185 insertions(+), 2 deletions(-)

diff --git a/libgcc/ChangeLog.omp b/libgcc/ChangeLog.omp
index 68a99cbe427..2e7bf5cc029 100644
--- a/libgcc/ChangeLog.omp
+++ b/libgcc/ChangeLog.omp
@@ -1,5 +1,11 @@
 2023-01-20  Thomas Schwinge  
 
+	* config/nvptx/crtstuff.c ["mgomp"]
+	(__do_global_ctors__entry__mgomp)
+	(__do_global_dtors__entry__mgomp): New.
+	[!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry):
+	New.
+
 	* config.host : Add 'crtbegin.o',
 	'crtend.o' to 'extra_parts'.
 	* config/nvptx/crt0.c: Invoke '__do_global_ctors',
diff --git a/libgcc/config/nvptx/crtstuff.c b/libgcc/config/nvptx/crtstuff.c
index 0823fc49901..8dc80687e0a 100644
--- a/libgcc/config/nvptx/crtstuff.c
+++ b/libgcc/config/nvptx/crtstuff.c
@@ -29,6 +29,14 @@
files (via 'CRT_BEGIN' and 'CRT_END'): 'crtbegin.o' and 'crtend.o', but we
do so anyway, for symmetry with other configurations.  */
 
+
+/* See 'crt0.c', 'mgomp.c'.  */
+#if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__)
+extern void *__nvptx_stacks[32] __attribute__((shared,nocommon));
+extern unsigned __nvptx_uni[32] __attribute__((shared,nocommon));
+#endif
+
+
 #ifdef CRT_BEGIN
 
 void
@@ -37,6 +45,33 @@ __do_global_ctors (void)
   DO_GLOBAL_CTORS_BODY;
 }
 
+/* Need '.entry' wrapper for offloading.  */
+
+# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__)
+
+__attribute__((kernel)) void __do_global_ctors__entry__mgomp (void *);
+
+void
+__do_global_ctors__entry__mgomp (void *nvptx_stacks_0)
+{
+  __nvptx_stacks[0] = nvptx_stacks_0;
+  __nvptx_uni[0] = 0;
+
+  __do_global_ctors ();
+}
+
+# else
+
+__attribute__((kernel)) void __do_global_ctors__entry (void);
+
+void
+__do_global_ctors__entry (void)
+{
+  __do_global_ctors ();
+}
+
+# endif
+
 #elif defined(CRT_END) /* ! CRT_BEGIN */
 
 void
@@ -45,7 +80,7 @@ __do_global_dtors (void)
   /* In this configuration here, there's no way that "this routine is run more
  than once [...] when exit is called recursively": for nvptx target, the
  call to '__do_global_dtors' is registered via 'atexit', which doesn't
- re-enter a function already run.
+ re-enter a function already run, and neither does nvptx offload target.
  Therefore, we do *not* "arrange to remember where in the list we left off
  processing".  */
   func_ptr *p;
@@ -53,6 +88,33 @@ __do_global_dtors (void)
 (*p++) ();
 }
 
+/* Need '.entry' wrapper for offloading.  */
+
+# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__)
+
+__attribute__((kernel)) void __do_global_dtors__entry__mgomp (void *);
+
+void
+__do_global_dtors__entry__mgomp (void *nvptx_stacks_0)
+{
+  __nvptx_stacks[0] = nvptx_stacks_0;
+  __nvptx_uni[0] = 0;
+
+  __do_global_dtors ();
+}
+
+# else
+
+__attribute__((kernel)) void __do_global_dtors__entry (void);
+
+void
+__do_global_dtors__entry (void)
+{
+  __do_global_dtors ();
+}
+
+# endif
+
 #els

nvptx, libgcc: Stub unwinding implementation

2023-01-20 Thread Thomas Schwinge
Hi!

We've been (t)asked to enable (portions of) GCC/Fortran I/O for nvptx
offloading, which means building a normal (non-'LIBGFOR_MINIMAL')
configuration of libgfortran.  One prerequisite patch, based on WIP work
by Andrew Stubbs, is: "nvptx, libgcc: Stub unwinding implementation", see
attached.  This I've just pushed to devel/omp/gcc-12 branch in
commit 26d3146736218ccfdaba4da1edf969bc190d, and would like to push
to master branch once other pending GCC patches have been accepted.


Grüße
 Thomas


>From 26d3146736218ccfdaba4da1edf969bc190d Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 21 Sep 2022 18:58:34 +0200
Subject: [PATCH] nvptx, libgcc: Stub unwinding implementation

Adding stub '_Unwind_Backtrace', '_Unwind_GetIPInfo' functions is necessary
for linking libbacktrace, as a normal (non-'LIBGFOR_MINIMAL') configuration
of libgfortran wants to do, for example.
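
To show what such a stub means for callers, here is a sketch with simplified,
hypothetical stand-ins for the 'unwind.h' types ('reason_code', 'trace_fn').
It is not the libgcc interface itself: the stub never invokes the callback, so
a caller that counts frames simply sees none.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the types in 'unwind.h'.  */
typedef int reason_code;
typedef reason_code (*trace_fn) (void *context, void *arg);

/* Stub: satisfies the linker, but walks no frames and never calls TRACE.  */
static reason_code
stub_backtrace (trace_fn trace, void *trace_argument)
{
  (void) trace;
  (void) trace_argument;
  return 0;
}

/* Callback that would count one frame per invocation.  */
static reason_code
count_frame (void *context, void *arg)
{
  (void) context;
  ++*(int *) arg;
  return 0;
}

/* Returns the number of frames reported by the stub, which is always 0.  */
static int
count_frames (void)
{
  int frames = 0;
  stub_backtrace (count_frame, &frames);
  return frames;
}
```

The result is an empty backtrace rather than a link failure, which is all that
is needed here.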

The file 'libgcc/config/nvptx/unwind-nvptx.c' is copied from
'libgcc/config/gcn/unwind-gcn.c'.

libgcc/ChangeLog:

	* config/nvptx/t-nvptx: Add unwind-nvptx.c.
	* config/nvptx/unwind-nvptx.c: New file.

Co-authored-by: Andrew Stubbs 
---
 libgcc/ChangeLog.omp   |  6 +
 libgcc/config/nvptx/t-nvptx|  3 ++-
 libgcc/config/nvptx/unwind-nvptx.c | 36 ++
 3 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 libgcc/config/nvptx/unwind-nvptx.c

diff --git a/libgcc/ChangeLog.omp b/libgcc/ChangeLog.omp
index 2e7bf5cc029..c46f49bf5b7 100644
--- a/libgcc/ChangeLog.omp
+++ b/libgcc/ChangeLog.omp
@@ -1,3 +1,9 @@
+2023-01-20  Thomas Schwinge  
+	Andrew Stubbs  
+
+	* config/nvptx/t-nvptx: Add unwind-nvptx.c.
+	* config/nvptx/unwind-nvptx.c: New file.
+
 2023-01-20  Thomas Schwinge  
 
 	* config/nvptx/crtstuff.c ["mgomp"]
diff --git a/libgcc/config/nvptx/t-nvptx b/libgcc/config/nvptx/t-nvptx
index 9a0454c3a4d..1845a38a35e 100644
--- a/libgcc/config/nvptx/t-nvptx
+++ b/libgcc/config/nvptx/t-nvptx
@@ -1,6 +1,7 @@
 LIB2ADD=$(srcdir)/config/nvptx/reduction.c \
 	$(srcdir)/config/nvptx/mgomp.c \
-	$(srcdir)/config/nvptx/atomic.c
+	$(srcdir)/config/nvptx/atomic.c \
+	$(srcdir)/config/nvptx/unwind-nvptx.c
 
 LIB2ADDEH=
 LIB2FUNCS_EXCLUDE=
diff --git a/libgcc/config/nvptx/unwind-nvptx.c b/libgcc/config/nvptx/unwind-nvptx.c
new file mode 100644
index 000..c657b2af6f3
--- /dev/null
+++ b/libgcc/config/nvptx/unwind-nvptx.c
@@ -0,0 +1,36 @@
+/* Stub unwinding implementation.
+
+   Copyright (C) 2019-2023 Free Software Foundation, Inc.
+
+   This file is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by the
+   Free Software Foundation; either version 3, or (at your option) any
+   later version.
+
+   This file is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#include "unwind.h"
+
+_Unwind_Reason_Code
+_Unwind_Backtrace(_Unwind_Trace_Fn trace, void * trace_argument)
+{
+  return 0;
+}
+
+_Unwind_Ptr
+_Unwind_GetIPInfo (struct _Unwind_Context *c, int *ip_before_insn)
+{
+  return 0;
+}
-- 
2.25.1



Ping: [PATCH 1/6] PowerPC: Add -mcpu=future

2023-01-20 Thread Michael Meissner via Gcc-patches
Ping patch.  We really would like to get the patches that enable the possible
future MMA+ instructions into GCC 13.

| Date: Wed, 9 Nov 2022 21:44:39 -0500
| Subject: [PATCH 1/6] PowerPC: Add -mcpu=future
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 2/6] PowerPC: Make -mcpu=future enable -mblock-ops-vector-pair.

2023-01-20 Thread Michael Meissner via Gcc-patches
Ping patch.  We really would like to get these possible future PowerPC patches
into GCC 13.

| Date: Wed, 9 Nov 2022 21:45:39 -0500
| Subject: [PATCH 2/6] PowerPC: Make -mcpu=future enable 
-mblock-ops-vector-pair.
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 3/6] PowerPC: Add support for accumulators in DMR registers.

2023-01-20 Thread Michael Meissner via Gcc-patches
Ping patch.  We really would like to get these possible future PowerPC patches
into GCC 13.

| Date: Wed, 9 Nov 2022 21:46:36 -0500
| Subject: [PATCH 3/6] PowerPC: Add support for accumulators in DMR registers.
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 4/6] PowerPC: Make MMA insns support DMR registers

2023-01-20 Thread Michael Meissner via Gcc-patches
Ping patch.  We really would like to get these possible future PowerPC insns
into GCC 13.

| Date: Wed, 9 Nov 2022 21:50:24 -0500
| Subject: [PATCH 4/6] PowerPC: Make MMA insns support DMR registers
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 5/6] PowerPC: Switch to dense math names for all MMA operations.

2023-01-20 Thread Michael Meissner via Gcc-patches
Ping patch.  We really would like to get these possible future PowerPC insns
into GCC 13.

| Date: Wed, 9 Nov 2022 21:51:48 -0500
| Subject: [PATCH 5/6] PowerPC: Switch to dense math names for all MMA 
operations.
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 6/6] PowerPC: Add support for 1,024 bit DMR registers.

2023-01-20 Thread Michael Meissner via Gcc-patches
Ping patch.  We really would like to get these possible future PowerPC insns
into GCC 13.

| Date: Wed, 9 Nov 2022 21:52:49 -0500
| Subject: [PATCH 6/6] PowerPC: Add support for 1,024 bit DMR registers.
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Ping: [PATCH 7] PowerPC: Add -mcpu=future saturating subtract built-ins.

2023-01-20 Thread Michael Meissner via Gcc-patches
Ping patch.  We really would like to get these possible future PowerPC insns
into GCC 13.

| Date: Sat, 12 Nov 2022 00:07:55 -0500
| Subject: [PATCH 7] PowerPC: Add -mcpu=future saturating subtract built-ins.
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

