[gcc r11-11548] Ada, Darwin : Use DSYMUTIL_FOR_TARGET in libgnat/gnarl builds.

2024-06-29 Thread Iain D Sandoe via Gcc-cvs
https://gcc.gnu.org/g:d08739dc3e99eaee3fea6375f31b14249265f227

commit r11-11548-gd08739dc3e99eaee3fea6375f31b14249265f227
Author: Iain Sandoe 
Date:   Fri Nov 12 16:36:25 2021 +

Ada, Darwin : Use DSYMUTIL_FOR_TARGET in libgnat/gnarl builds.

Most of the time we get away with using the dsymutil that is
installed with the latest Xcode; however, for some cross-compilation
cases that does not work.

We now have the ability to specify the correct dsymutil to use for
the toolchain (--with-dsymutil=) and we should use that specified
tool for debug link.  Fixes cross-compilers from x86-64 to powerpc.

Signed-off-by: Iain Sandoe 

gcc/ada/ChangeLog:

* gcc-interface/Makefile.in: Use DSYMUTIL_FOR_TARGET in
libgnat/libgnarl recipes.

Diff:
---
 gcc/ada/gcc-interface/Makefile.in | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/gcc-interface/Makefile.in b/gcc/ada/gcc-interface/Makefile.in
index 6cc95038c40..1f15652d2c8 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -805,8 +805,8 @@ gnatlib-shared-darwin:
libgnat$(soext)
cd $(RTSDIR); $(LN_S) libgnarl$(hyphen)$(LIBRARY_VERSION)$(soext) \
libgnarl$(soext)
-   cd $(RTSDIR); dsymutil libgnat$(hyphen)$(LIBRARY_VERSION)$(soext)
-   cd $(RTSDIR); dsymutil libgnarl$(hyphen)$(LIBRARY_VERSION)$(soext)
+   cd $(RTSDIR); $(DSYMUTIL_FOR_TARGET) libgnat$(hyphen)$(LIBRARY_VERSION)$(soext)
+   cd $(RTSDIR); $(DSYMUTIL_FOR_TARGET) libgnarl$(hyphen)$(LIBRARY_VERSION)$(soext)
 
 gnatlib-shared:
$(MAKE) $(FLAGS_TO_PASS) \


[gcc r11-11549] Include safe-ctype.h after C++ standard headers, to avoid over-poisoning

2024-06-29 Thread Iain D Sandoe via Gcc-cvs
https://gcc.gnu.org/g:5a419c22e67b30bfa10a59351c64663396a4c8f2

commit r11-11549-g5a419c22e67b30bfa10a59351c64663396a4c8f2
Author: Francois-Xavier Coudert 
Date:   Thu Mar 7 14:36:03 2024 +0100

Include safe-ctype.h after C++ standard headers, to avoid over-poisoning

When building gcc's C++ sources against recent libc++, the poisoning of
the ctype macros due to including safe-ctype.h before including C++
standard headers causes many compilation errors, similar to:

  In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
  In file included from /home/dim/src/gcc/master/gcc/system.h:233:
  In file included from /usr/include/c++/v1/vector:321:
  In file included from
  /usr/include/c++/v1/__format/formatter_bool.h:20:
  In file included from
  /usr/include/c++/v1/__format/formatter_integral.h:32:
  In file included from /usr/include/c++/v1/locale:202:
  /usr/include/c++/v1/__locale:546:5: error: '__abi_tag__' attribute
  only applies to structs, variables, functions, and namespaces
546 | _LIBCPP_INLINE_VISIBILITY
| ^
  /usr/include/c++/v1/__config:813:37: note: expanded from macro
  '_LIBCPP_INLINE_VISIBILITY'
813 | #  define _LIBCPP_INLINE_VISIBILITY _LIBCPP_HIDE_FROM_ABI
| ^
  /usr/include/c++/v1/__config:792:26: note: expanded from macro
  '_LIBCPP_HIDE_FROM_ABI'
792 |
__attribute__((__abi_tag__(_LIBCPP_TOSTRING(
  _LIBCPP_VERSIONED_IDENTIFIER
|  ^
  In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
  In file included from /home/dim/src/gcc/master/gcc/system.h:233:
  In file included from /usr/include/c++/v1/vector:321:
  In file included from
  /usr/include/c++/v1/__format/formatter_bool.h:20:
  In file included from
  /usr/include/c++/v1/__format/formatter_integral.h:32:
  In file included from /usr/include/c++/v1/locale:202:
  /usr/include/c++/v1/__locale:547:37: error: expected ';' at end of
  declaration list
547 | char_type toupper(char_type __c) const
| ^
  /usr/include/c++/v1/__locale:553:48: error: too many arguments
  provided to function-like macro invocation
553 | const char_type* toupper(char_type* __low, const
char_type* __high) const
|^
  /home/dim/src/gcc/master/gcc/../include/safe-ctype.h:146:9: note:
  macro 'toupper' defined here
146 | #define toupper(c) do_not_use_toupper_with_safe_ctype
| ^

This is because libc++ uses different transitive includes than
libstdc++, and some of those transitive includes pull in various ctype
declarations (typically via <locale>, as in the trace above).

There was already a special case for including <string> before
safe-ctype.h, so move the rest of the C++ standard header includes to
the same location, to fix the problem.
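
The ordering rule can be illustrated with a small stand-alone sketch.  It
is not the code from system.h: the poison replacement text and the helper
names (demo_toupper, upcase) are made up for illustration, and the macro
below only stands in for what safe-ctype.h does.  The point is that the
standard headers are fully preprocessed before the macro exists, so
their own uses of toupper are unaffected.

```cpp
// Standard headers come first: they may declare or use toupper internally.
#include <cassert>
#include <cctype>
#include <string>

// Stand-in for the poisoning done by safe-ctype.h (hypothetical
// replacement text; the real header uses its own identifier).
#undef toupper
#define toupper(c) do_not_use_toupper_in_this_sketch

// Locale-independent replacement, in the spirit of libiberty's macros.
// The name demo_toupper is an assumption made for this example.
static inline char demo_toupper (char c)
{
  return (c >= 'a' && c <= 'z') ? static_cast<char> (c - 'a' + 'A') : c;
}

// Upper-case a string without touching the poisoned identifier.
std::string upcase (const std::string &s)
{
  std::string out;
  for (char c : s)
    out.push_back (demo_toupper (c));
  return out;
}
```

Moving the #define above the #include lines is what can break a build
against a library whose headers mention toupper, which is the situation
the commit describes for libc++.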

gcc/ChangeLog:

* system.h: Include safe-ctype.h after C++ standard headers.

Signed-off-by: Dimitry Andric 
(cherry picked from commit 9970b576b7e4ae337af1268395ff221348c4b34a)

Diff:
---
 gcc/system.h | 39 ++-
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/gcc/system.h b/gcc/system.h
index b13e9429577..28c721565a4 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -194,27 +194,8 @@ extern int fprintf_unlocked (FILE *, const char *, ...);
 #undef fread_unlocked
 #undef fwrite_unlocked
 
-/* Include <string> before "safe-ctype.h" to avoid GCC poisoning
-   the ctype macros through safe-ctype.h */
-
-#ifdef __cplusplus
-#ifdef INCLUDE_STRING
-# include <string>
-#endif
-#endif
-
-/* There are an extraordinary number of issues with <ctype.h>.
-   The last straw is that it varies with the locale.  Use libiberty's
-   replacement instead.  */
-#include "safe-ctype.h"
-
-#include 
-
-#include 
-
-#if !defined (errno) && defined (HAVE_DECL_ERRNO) && !HAVE_DECL_ERRNO
-extern int errno;
-#endif
+/* Include C++ standard headers before "safe-ctype.h" to avoid GCC
+   poisoning the ctype macros through safe-ctype.h */
 
 #ifdef __cplusplus
 #if defined (INCLUDE_ALGORITHM) || !defined (HAVE_SWAP_IN_UTILITY)
@@ -229,6 +210,9 @@ extern int errno;
 #ifdef INCLUDE_SET
 # include <set>
 #endif
+#ifdef INCLUDE_STRING
+# include <string>
+#endif
 #ifdef INCLUDE_VECTOR
 # include <vector>
 #endif
@@ -244,6 +228,19 @@ extern int errno;
 # include 
 #endif
 
+/* There are an extraordinary number of issues with <ctype.h>.
+   The last straw is that it varies with the locale.  Use libiberty's
+   replacement instead.  */
+#include "safe-ctype.h"
+
+#include 
+
+#include 
+
+#if !defined (errno) && defined (HAVE_DECL_ERRNO) && !HAVE_DECL_ERRNO
+extern int errno;
+#endif
+
 /* Some of glibc's string inlines cause warnings.  Plus we'd rather
 

[gcc r11-11550] libcc1: fix <vector> include

2024-06-29 Thread Iain D Sandoe via Gcc-cvs
https://gcc.gnu.org/g:378f50f4c32af5111893989bfc5a191d3aa27bb7

commit r11-11550-g378f50f4c32af5111893989bfc5a191d3aa27bb7
Author: Francois-Xavier Coudert 
Date:   Sat Mar 16 09:50:00 2024 +0100

libcc1: fix <vector> include

Use INCLUDE_VECTOR before including system.h, instead of directly
including <vector>, to avoid running into poisoned identifiers.

Signed-off-by: Dimitry Andric 

libcc1/ChangeLog:

PR middle-end/111632
* libcc1plugin.cc: Fix include.
* libcp1plugin.cc: Fix include.

(cherry picked from commit 5213047b1d50af63dfabb5e5649821a6cb157e33)

Diff:
---
 libcc1/libcc1plugin.cc | 2 ++
 libcc1/libcp1plugin.cc | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/libcc1/libcc1plugin.cc b/libcc1/libcc1plugin.cc
index e80ecd8f4b3..e58fa55c0f5 100644
--- a/libcc1/libcc1plugin.cc
+++ b/libcc1/libcc1plugin.cc
@@ -31,6 +31,8 @@
 #undef PACKAGE_TARNAME
 #undef PACKAGE_VERSION
 
+#define INCLUDE_MEMORY
+#define INCLUDE_VECTOR
 #include "gcc-plugin.h"
 #include "system.h"
 #include "coretypes.h"
diff --git a/libcc1/libcp1plugin.cc b/libcc1/libcp1plugin.cc
index 27a6175e34e..378c65d43c2 100644
--- a/libcc1/libcp1plugin.cc
+++ b/libcc1/libcp1plugin.cc
@@ -32,6 +32,8 @@
 #undef PACKAGE_TARNAME
 #undef PACKAGE_VERSION
 
+#define INCLUDE_MEMORY
+#define INCLUDE_VECTOR
 #include "gcc-plugin.h"
 #include "system.h"
 #include "coretypes.h"


[gcc r15-1721] Match: Support imm form for unsigned scalar .SAT_ADD

2024-06-29 Thread Pan Li via Gcc-cvs
https://gcc.gnu.org/g:21e3565927eda5ce9907d91100623052fa8182cd

commit r15-1721-g21e3565927eda5ce9907d91100623052fa8182cd
Author: Pan Li 
Date:   Fri Jun 28 11:33:41 2024 +0800

Match: Support imm form for unsigned scalar .SAT_ADD

This patch adds support for the form of unsigned scalar .SAT_ADD in
which one of the operands is an immediate (IMM).  For example:

Form IMM:
  #define DEF_SAT_U_ADD_IMM_FMT_1(T)   \
  T __attribute__((noinline))  \
  sat_u_add_imm_##T##_fmt_1 (T x)  \
  {\
return (T)(x + 9) >= x ? (x + 9) : -1; \
  }

DEF_SAT_U_ADD_IMM_FMT_1(uint64_t)

Before this patch:
__attribute__((noinline))
uint64_t sat_u_add_imm_uint64_t_fmt_1 (uint64_t x)
{
  long unsigned int _1;
  uint64_t _3;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _1 = MIN_EXPR <x_2(D), 18446744073709551606>;
  _3 = _1 + 9;
  return _3;
;;succ:   EXIT

}

After this patch:
__attribute__((noinline))
uint64_t sat_u_add_imm_uint64_t_fmt_1 (uint64_t x)
{
  uint64_t _3;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _3 = .SAT_ADD (x_2(D), 9); [tail call]
  return _3;
;;succ:   EXIT

}
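
As a sanity check on the two dumps, the sketch below compares the source
form from the macro with the MIN-then-add shape of the "before" dump.
It is only an illustration under stated assumptions: the function names
are hypothetical, the immediate is fixed at 9, and the type is fixed at
uint64_t.  Both forms saturate to the all-ones value, which is what
.SAT_ADD (x, 9) expresses directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Source form from the commit message: (T)(x + 9) >= x ? (x + 9) : -1.
// Unsigned overflow wraps, so the comparison detects the overflow.
uint64_t sat_add_imm9_branch (uint64_t x)
{
  return (uint64_t) (x + 9) >= x ? (x + 9) : (uint64_t) -1;
}

// Shape of the "before" dump: clamp x to MAX - 9, then add 9.
// The two constants sum to MAX, the condition the new pattern checks.
uint64_t sat_add_imm9_min (uint64_t x)
{
  const uint64_t max = std::numeric_limits<uint64_t>::max ();
  return std::min (x, max - 9) + 9;
}
```

The two functions agree for every input, including the boundary values
MAX - 9 and MAX - 8 where the wrap-around starts.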

The following test suites pass with this patch:
1. The rv64gcv full regression test with newlib.
2. The x86 bootstrap test.
3. The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add imm form for .SAT_ADD matching.
* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Add .SAT_ADD matching under PLUS_EXPR.

Signed-off-by: Pan Li 

Diff:
---
 gcc/match.pd  | 24 
 gcc/tree-ssa-math-opts.cc |  2 ++
 2 files changed, 26 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 3fa3f2e8296..7fff7b5f9fe 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3154,6 +3154,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (match (unsigned_integer_sat_add @0 @1)
  (cond^ (gt @0 (usadd_left_part_1@2 @0 @1)) integer_minus_onep @2))
 
+/* Unsigned saturation add, case 9 (one op is imm):
+   SAT_U_ADD = (X + 3) >= x ? (X + 3) : -1.  */
+(match (unsigned_integer_sat_add @0 @1)
+ (plus (min @0 INTEGER_CST@2) INTEGER_CST@1)
+ (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+  && types_match (type, @0, @1))
+  (with
+   {
+unsigned precision = TYPE_PRECISION (type);
+wide_int cst_1 = wi::to_wide (@1);
+wide_int cst_2 = wi::to_wide (@2);
+wide_int max = wi::mask (precision, false, precision);
+wide_int sum = wi::add (cst_1, cst_2);
+   }
+   (if (wi::eq_p (max, sum))
+
+/* Unsigned saturation add, case 10 (one op is imm):
+   SAT_U_ADD = __builtin_add_overflow (X, 3, &ret) == 0 ? ret : -1.  */
+(match (unsigned_integer_sat_add @0 @1)
+ (cond^ (ne (imagpart (IFN_ADD_OVERFLOW@2 @0 INTEGER_CST@1)) integer_zerop)
+  integer_minus_onep (realpart @2))
+  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+  && types_match (type, @0
+
 /* Unsigned saturation sub, case 1 (branch with gt):
SAT_U_SUB = X > Y ? X - Y : 0  */
 (match (unsigned_integer_sat_sub @0 @1)
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 3783a874699..3b5433ec000 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -6195,6 +6195,8 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
  break;
 
case PLUS_EXPR:
+ match_unsigned_saturation_add (&gsi, as_a<gassign *> (stmt));
+ /* fall-through  */
case MINUS_EXPR:
  if (!convert_plusminus_to_widen (&gsi, stmt, code))
{


[gcc r15-1722] Fortran: fix ALLOCATE with SOURCE of deferred character length [PR114019]

2024-06-29 Thread Harald Anlauf via Gcc-cvs
https://gcc.gnu.org/g:7682d115402743090f20aca63a3b5e6c205dedff

commit r15-1722-g7682d115402743090f20aca63a3b5e6c205dedff
Author: Harald Anlauf 
Date:   Fri Jun 28 21:44:06 2024 +0200

Fortran: fix ALLOCATE with SOURCE of deferred character length [PR114019]

gcc/fortran/ChangeLog:

PR fortran/114019
* trans-stmt.cc (gfc_trans_allocate): Fix handling of case of
scalar character expression being used for SOURCE.

gcc/testsuite/ChangeLog:

PR fortran/114019
* gfortran.dg/allocate_with_source_33.f90: New test.

Diff:
---
 gcc/fortran/trans-stmt.cc  |  5 +-
 .../gfortran.dg/allocate_with_source_33.f90| 69 ++
 2 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/trans-stmt.cc b/gcc/fortran/trans-stmt.cc
index 93b633e212e..60275e18867 100644
--- a/gcc/fortran/trans-stmt.cc
+++ b/gcc/fortran/trans-stmt.cc
@@ -6464,7 +6464,10 @@ gfc_trans_allocate (gfc_code * code, gfc_omp_namelist *omp_allocate)
   else if (se.expr != NULL_TREE && temp_var_needed)
{
  tree var, desc;
- tmp = GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se.expr)) || is_coarray ?
+ tmp = (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se.expr))
+|| is_coarray
+|| (code->expr3->ts.type == BT_CHARACTER
+&& code->expr3->rank == 0)) ?
se.expr
  : build_fold_indirect_ref_loc (input_location, se.expr);
 
diff --git a/gcc/testsuite/gfortran.dg/allocate_with_source_33.f90 b/gcc/testsuite/gfortran.dg/allocate_with_source_33.f90
new file mode 100644
index 000..43a03625950
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/allocate_with_source_33.f90
@@ -0,0 +1,69 @@
+! { dg-do compile }
+! { dg-options "-O0" }
+!
+! PR fortran/114019 - allocation with source of deferred character length
+
+subroutine s
+  implicit none
+  character(1)  :: w   = "4"
+  character(*), parameter   :: str = "123"
+  character(5), pointer :: chr_pointer1
+  character(:), pointer :: chr_pointer2
+  character(:), pointer :: chr_ptr_arr(:)
+  character(5), allocatable :: chr_alloc1
+  character(:), allocatable :: chr_alloc2
+  character(:), allocatable :: chr_all_arr(:)
+  allocate (chr_pointer1, source=w// str//w)
+  allocate (chr_pointer2, source=w// str//w)
+  allocate (chr_ptr_arr,  source=w//[str//w])
+  allocate (chr_alloc1,   source=w// str//w)
+  allocate (chr_alloc2,   source=w// str//w)
+  allocate (chr_all_arr,  source=w//[str//w])
+  allocate (chr_pointer2, source=str)
+  allocate (chr_pointer2, source=w)
+  allocate (chr_alloc2,   source=str)
+  allocate (chr_alloc2,   source=w)
+  allocate (chr_pointer1, mold  =w// str//w)
+  allocate (chr_pointer2, mold  =w// str//w)
+  allocate (chr_ptr_arr,  mold  =w//[str//w])
+  allocate (chr_alloc1,   mold  =w// str//w)
+  allocate (chr_alloc2,   mold  =w// str//w)
+  allocate (chr_all_arr,  mold  =w//[str//w])
+  allocate (chr_pointer2, mold  =str)
+  allocate (chr_pointer2, mold  =w)
+  allocate (chr_alloc2,   mold  =str)
+  allocate (chr_alloc2,   mold  =w)
+end
+
+subroutine s2
+  implicit none
+  integer, parameter :: ck=4
+  character(kind=ck,len=1)  :: w   = ck_"4"
+  character(kind=ck,len=*), parameter   :: str = ck_"123"
+  character(kind=ck,len=5), pointer :: chr_pointer1
+  character(kind=ck,len=:), pointer :: chr_pointer2
+  character(kind=ck,len=:), pointer :: chr_ptr_arr(:)
+  character(kind=ck,len=5), allocatable :: chr_alloc1
+  character(kind=ck,len=:), allocatable :: chr_alloc2
+  character(kind=ck,len=:), allocatable :: chr_all_arr(:)
+  allocate (chr_pointer1, source=w// str//w)
+  allocate (chr_pointer2, source=w// str//w)
+  allocate (chr_ptr_arr,  source=w//[str//w])
+  allocate (chr_alloc1,   source=w// str//w)
+  allocate (chr_alloc2,   source=w// str//w)
+  allocate (chr_all_arr,  source=w//[str//w])
+  allocate (chr_pointer2, source=str)
+  allocate (chr_pointer2, source=w)
+  allocate (chr_alloc2,   source=str)
+  allocate (chr_alloc2,   source=w)
+  allocate (chr_pointer1, mold  =w// str//w)
+  allocate (chr_pointer2, mold  =w// str//w)
+  allocate (chr_ptr_arr,  mold  =w//[str//w])
+  allocate (chr_alloc1,   mold  =w// str//w)
+  allocate (chr_alloc2,   mold  =w// str//w)
+  allocate (chr_all_arr,  mold  =w//[str//w])
+  allocate (chr_pointer2, mold  =str)
+  allocate (chr_pointer2, mold  =w)
+  allocate (chr_alloc2,   mold  =str)
+  allocate (chr_alloc2,   mold  =w)
+end


[gcc r15-1723] [to-be-committed, RISC-V, V4] movmem for RISCV with V extension

2024-06-29 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:42946aa9b3228262e413481a3193bda85c20ef4b

commit r15-1723-g42946aa9b3228262e413481a3193bda85c20ef4b
Author: Sergei Lewis 
Date:   Sat Jun 29 14:34:31 2024 -0600

[to-be-committed,RISC-V,V4] movmem for RISCV with V extension

I hadn't updated my repo on the host where I handle email, so it picked
up the older version of this patch without the testsuite fix.  So, V4
with the testsuite option for lmul fixed.

--

And Sergei's movmem patch.  Just trivial testsuite adjustment for an
option name change and a whitespace fix from me.

I've spun this in my tester for rv32 and rv64.  I'll wait for pre-commit
CI before taking further action.

Just a reminder, this patch is designed to handle the case where we can
issue a single vector load/store which avoids all the complexities of
determining which direction to copy.
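
The reasoning can be sketched in portable C++, with a byte buffer
standing in for the vector register.  This is not the RISC-V expander:
CHUNK and move_if_fits are hypothetical names, and the size limit is an
arbitrary assumption.  Because every byte is read before any byte is
written, overlapping source and destination need no direction check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed capacity of one "vector operation" for this sketch.
constexpr std::size_t CHUNK = 128;

// Move n bytes if they fit in one chunk; return false otherwise, which
// corresponds to falling back to the library memmove.
bool move_if_fits (char *dst, const char *src, std::size_t n)
{
  if (n > CHUNK)
    return false;
  char tmp[CHUNK];
  std::memcpy (tmp, src, n);   // all reads happen first
  std::memcpy (dst, tmp, n);   // all writes happen afterwards
  return true;
}
```

Both forward and backward overlaps give the memmove result, since tmp
never aliases either operand.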

--

gcc/ChangeLog

* config/riscv/riscv.md (movmem): New expander.

gcc/testsuite/ChangeLog

PR target/112109
* gcc.target/riscv/rvv/base/movmem-1.c: New test

Diff:
---
 gcc/config/riscv/riscv.md  | 22 
 gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c | 60 ++
 2 files changed, 82 insertions(+)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ff37125e3f2..c0c960353eb 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2723,6 +2723,28 @@
 FAIL;
 })
 
+;; Inlining general memmove is a pessimisation: we can't avoid having to decide
+;; which direction to go at runtime, which is costly in instruction count
+;; however for situations where the entire move fits in one vector operation
+;; we can do all reads before doing any writes so we don't have to worry
+;; so generate the inline vector code in such situations
+;; nb. prefer scalar path for tiny memmoves.
+(define_expand "movmem"
+  [(parallel [(set (match_operand:BLK 0 "general_operand")
+   (match_operand:BLK 1 "general_operand"))
+(use (match_operand:P 2 "const_int_operand"))
+(use (match_operand:SI 3 "const_int_operand"))])]
+  "TARGET_VECTOR"
+{
+  if ((INTVAL (operands[2]) >= TARGET_MIN_VLEN / 8)
+   && (INTVAL (operands[2]) <= TARGET_MIN_VLEN)
+   && riscv_vector::expand_block_move (operands[0], operands[1],
+operands[2]))
+DONE;
+  else
+FAIL;
+})
+
 ;; Expand in-line code to clear the instruction cache between operand[0] and
 ;; operand[1].
 (define_expand "clear_cache"
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
new file mode 100644
index 000..d9d4a70a392
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-O3 -mrvv-max-lmul=dynamic" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define MIN_VECTOR_BYTES (__riscv_v_min_vlen / 8)
+
+/* Tiny memmoves should not be vectorised.
+** f1:
+**  li\s+a2,\d+
+**  tail\s+memmove
+*/
+char *
+f1 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES - 1);
+}
+
+/* Vectorise+inline minimum vector register width with LMUL=1
+** f2:
+**  (
+**  vsetivli\s+zero,16,e8,m1,ta,ma
+**  |
+**  li\s+[ta][0-7],\d+
+**  vsetvli\s+zero,[ta][0-7],e8,m1,ta,ma
+**  )
+**  vle8\.v\s+v\d+,0\(a1\)
+**  vse8\.v\s+v\d+,0\(a0\)
+**  ret
+*/
+char *
+f2 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES);
+}
+
+/* Vectorise+inline up to LMUL=8
+** f3:
+**  li\s+[ta][0-7],\d+
+**  vsetvli\s+zero,[ta][0-7],e8,m8,ta,ma
+**  vle8\.v\s+v\d+,0\(a1\)
+**  vse8\.v\s+v\d+,0\(a0\)
+**  ret
+*/
+char *
+f3 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES * 8);
+}
+
+/* Don't vectorise if the move is too large for one operation
+** f4:
+**  li\s+a2,\d+
+**  tail\s+memmove
+*/
+char *
+f4 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES * 8 + 1);
+}


[gcc r15-1724] [PR115565] cse: Don't use a valid regno for non-register in comparison_qty

2024-06-29 Thread Maciej W. Rozycki via Gcc-cvs
https://gcc.gnu.org/g:69bc5fb97dc3fada81869e00fa65d39f7def6acf

commit r15-1724-g69bc5fb97dc3fada81869e00fa65d39f7def6acf
Author: Maciej W. Rozycki 
Date:   Sat Jun 29 23:26:55 2024 +0100

[PR115565] cse: Don't use a valid regno for non-register in comparison_qty

Use INT_MIN rather than -1 in `comparison_qty' where a comparison is not
with a register, because the value of -1 is actually a valid reference
to register 0 in the case where it has not been assigned a quantity.

Using -1 makes `REG_QTY (REGNO (folded_arg1)) == ent->comparison_qty'
comparison in `fold_rtx' to incorrectly trigger in rare circumstances
and return true for a memory reference, making CSE consider a comparison
operation to evaluate to a constant expression and consequently make the
resulting code incorrectly execute or fail to execute conditional
blocks.

This has caused a miscompilation of rwlock.c from LinuxThreads for the
`alpha-linux-gnu' target, where `rwlock->__rw_writer != thread_self ()'
expression (where `thread_self' returns the thread pointer via a PALcode
call) has been decided to be always true (with `ent->comparison_qty'
using -1 for a reference to `rwlock->__rw_writer', while register 0
holding the thread pointer retrieved by `thread_self') and code for the
false case has been optimized away where it mustn't have, causing
program lockups.

The issue has been observed as a regression from commit 08a692679fb8
("Undefined cse.c behaviour causes 3.4 regression on HPUX") and up to
commit 932ad4d9b550 ("Make CSE path following use the CFG"), where CSE
has been restructured sufficiently for the issue not to trigger with the
original reproducer anymore.  However the original bug remains and can
trigger, because `comparison_qty' will still be assigned -1 for a memory
reference and the `reg_qty' member of a `cse_reg_info_table' entry will
still be assigned -1 for register 0 where the entry has not been
assigned a quantity, e.g. at initialization.

Use INT_MIN then as noted above, so that the value remains negative, for
consistency with the REGNO_QTY_VALID_P macro (even though not used on
`comparison_qty'), and then so that it should not ever match a valid
negated register number, fixing the regression with commit 08a692679fb8.
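
The collision can be shown with a small model.  It assumes, for
illustration only, that a register without an assigned quantity stores
the negated register number minus one, which matches the statement above
that -1 refers to register 0; the function and constant names are
hypothetical and not taken from cse.cc.

```cpp
#include <cassert>
#include <climits>

// Assumed encoding for this sketch: register N with no quantity
// assigned stores -N - 1, so register 0 stores -1.
int unassigned_reg_qty (int regno)
{
  return -regno - 1;
}

// Marker for "the comparison is not with a register".
const int OLD_NOT_A_REGISTER = -1;       // before the fix
const int NEW_NOT_A_REGISTER = INT_MIN;  // after the fix

// True when the marker is indistinguishable from an unassigned register.
bool collides (int marker, int regno)
{
  return unassigned_reg_qty (regno) == marker;
}
```

With the old marker, collides (OLD_NOT_A_REGISTER, 0) holds, which is
the false match described above; with INT_MIN no register number in any
realistic range produces a match.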

gcc/
PR rtl-optimization/115565
* cse.cc (record_jump_cond): Use INT_MIN rather than -1 for
`comparison_qty' if !REG_P.

Diff:
---
 gcc/cse.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cse.cc b/gcc/cse.cc
index c53deecbe54..65794ac5f2c 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -239,7 +239,7 @@ static int next_qty;
the constant being compared against, or zero if the comparison
is not against a constant.  `comparison_qty' holds the quantity
being compared against when the result is known.  If the comparison
-   is not with a register, `comparison_qty' is -1.  */
+   is not with a register, `comparison_qty' is INT_MIN.  */
 
 struct qty_table_elem
 {
@@ -4058,7 +4058,7 @@ record_jump_cond (enum rtx_code code, machine_mode mode, rtx op0, rtx op1)
   else
{
  ent->comparison_const = op1;
- ent->comparison_qty = -1;
+ ent->comparison_qty = INT_MIN;
}
 
   return;