date:20240922

[gcc r15-3777] libgcc, Darwin: From macOS 11, make that the earliest supported.

2024-09-22 Thread Iain D Sandoe via Gcc-cvs

https://gcc.gnu.org/g:43eab54939d37d4e634a692910d31adafc053e38

commit r15-3777-g43eab54939d37d4e634a692910d31adafc053e38
Author: Iain Sandoe 
Date:   Sun Sep 22 14:30:30 2024 +0100

libgcc, Darwin: From macOS 11, make that the earliest supported.

For libgcc, we have (so far) supported building a DSO that supports
earlier versions of the OS than the target.  From macOS 11, there are
APIs that do not exist on earlier OS versions, so limit the libgcc
range to macOS11..current.
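
As a rough illustration of the selection this change makes, the sketch below mimics the two case branches shown in the config.host hunk, using POSIX fnmatch() as a stand-in for shell pattern matching.  The function names are hypothetical and exist only for this example; the real logic lives in the shell script, and the inner case is only reached for Darwin hosts.  Darwin 20 is the kernel version that corresponds to macOS 11.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Hypothetical helper: return the t-darwin-min-* fragment that the two
   branches shown in the diff would prepend to tmake_file for HOST.
   Branch order matters, as it does in the shell case statement.  */
const char *
darwin_min_fragment (const char *host)
{
  /* Darwin 20 and later, i.e. macOS 11 and later.  */
  if (fnmatch ("*-*-darwin2*", host, 0) == 0)
    return "t-darwin-min-11";
  /* Darwin 18 and 19, i.e. macOS 10.14 and 10.15.  */
  if (fnmatch ("*-*-darwin1[89]*", host, 0) == 0)
    return "t-darwin-min-8";
  /* Older hosts are handled by later branches not modelled here.  */
  return "(later branch)";
}

/* Small self-check over a few example host triples.  */
void
check_darwin_min_fragment (void)
{
  assert (strcmp (darwin_min_fragment ("aarch64-apple-darwin23"),
                  "t-darwin-min-11") == 0);
  assert (strcmp (darwin_min_fragment ("x86_64-apple-darwin20.1.0"),
                  "t-darwin-min-11") == 0);
  assert (strcmp (darwin_min_fragment ("x86_64-apple-darwin19"),
                  "t-darwin-min-8") == 0);
  assert (strcmp (darwin_min_fragment ("x86_64-apple-darwin17"),
                  "(later branch)") == 0);
}
```

Before the change, Darwin 20+ hosts shared the darwin1[89] branch and so picked up t-darwin-min-8; afterwards they get their own fragment.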

libgcc/ChangeLog:

* config.host: From macOS 11, limit earliest macOS support
to macOS 11.
* config/t-darwin-min-11: New file.

Signed-off-by: Iain Sandoe 

Diff:
---
 libgcc/config.host            | 5 ++++-
 libgcc/config/t-darwin-min-11 | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/libgcc/config.host b/libgcc/config.host
index 9fae51d4ce7d..4fb4205478a8 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -236,7 +236,10 @@ case ${host} in
   esac
   tmake_file="$tmake_file t-slibgcc-darwin"
   case ${host} in
-*-*-darwin1[89]* | *-*-darwin2* )
+*-*-darwin2*)
+  tmake_file="t-darwin-min-11 $tmake_file"
+  ;;
+*-*-darwin1[89]*)
   tmake_file="t-darwin-min-8 $tmake_file"
   ;;
 *-*-darwin9* | *-*-darwin1[0-7]*)
diff --git a/libgcc/config/t-darwin-min-11 b/libgcc/config/t-darwin-min-11
new file mode 100644
index 000000000000..4009d41addb5
--- /dev/null
+++ b/libgcc/config/t-darwin-min-11
@@ -0,0 +1,3 @@
+# Support building with -mmacosx-version-min back to macOS 11.
+DARWIN_MIN_LIB_VERSION = -mmacosx-version-min=11
+DARWIN_MIN_CRT_VERSION = -mmacosx-version-min=11

[gcc r15-3778] testsuite, coroutines: Add tests for non-supension ramp returns.

2024-09-22 Thread Iain D Sandoe via Gcc-cvs

https://gcc.gnu.org/g:0312b66677590471b8b783b81f62b2e36b1b7ac1

commit r15-3778-g0312b66677590471b8b783b81f62b2e36b1b7ac1
Author: Iain Sandoe 
Date:   Sun Sep 22 14:59:13 2024 +0100

testsuite, coroutines: Add tests for non-supension ramp returns.

Although it is most common for the ramp function to see a return when a
coroutine first suspends, there are other possibilities.  For example all
the awaits could be ready - effectively the coroutine will then run to
completion and deallocation.  Another case is where the first active
suspension point causes the current routine to be cancelled and thence
destroyed.

These cases are tested here.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/special-termination-00-sync-completion.C: New test.
* g++.dg/coroutines/torture/special-termination-01-self-destruct.C: New test.

Signed-off-by: Iain Sandoe 

Diff:
---
 .../special-termination-00-sync-completion.C   | 127 
 .../torture/special-termination-01-self-destruct.C | 129 +
 2 files changed, 256 insertions(+)

diff --git a/gcc/testsuite/g++.dg/coroutines/torture/special-termination-00-sync-completion.C b/gcc/testsuite/g++.dg/coroutines/torture/special-termination-00-sync-completion.C
new file mode 100644
index 000000000000..21ce721d0ace
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/special-termination-00-sync-completion.C
@@ -0,0 +1,127 @@
+// { dg-do run }
+#include <coroutine>
+
+#ifndef OUTPUT
+#  define PRINT(X)
+#  define PRINTF(...)
+#else
+#include 
+#  define PRINT(X) puts(X)
+#  define PRINTF(...) printf(__VA_ARGS__)
+#endif
+
+// coroutine management object with simple interlocks for the underlying
+// coroutine.
+
+struct coro1 {
+  struct promise_type;
+  using handle_type = std::coroutine_handle<coro1::promise_type>;
+
+  handle_type handle = nullptr;
+
+  coro1 () : handle(0) {}
+  coro1 (handle_type _handle)
+: handle(_handle)
+  {
+PRINT("Created coro1 object from handle");
+// We're alive - let the promise object know so that it can notify us
+// if the coroutine terminates before this object is destroyed.
+handle.promise().set_owner (this);
+  }
+
+  coro1 (const coro1 &) = delete; // no copying
+
+  coro1 (coro1 &&s) : handle (s.handle) {
+s.handle = nullptr;
+PRINT("coro1 mv ctor ");
+handle.promise().set_owner (this);
+  }
+
+  coro1 &operator = (coro1 &&s) {
+handle = s.handle;
+s.handle = nullptr;
+PRINT("coro1 op=  ");
+handle.promise().set_owner (this);
+return *this;
+  }
+
+  ~coro1() {
+PRINT("Destroyed coro1");
+// We might come here before the coroutine has finished so...
+if ( handle )
+  {
+handle.promise().set_owner (nullptr);
+handle.destroy();
+  }
+  }
+
+  // Special awaiters.
+  struct suspend_always_prt {
+  bool await_ready() const noexcept { PRINT ("susp-always-is-not-ready") ; return false; }
+  void await_suspend(handle_type) const noexcept { PRINT ("susp-always-susp");}
+  void await_resume() const noexcept { PRINT ("susp-always-resume");}
+  ~suspend_always_prt() { PRINT ("susp-always-dtor"); }
+  };
+
+  struct suspend_never_prt {
+  bool await_ready () const noexcept { PRINT ("susp-never-is-ready") ; return true; }
+  void await_suspend (handle_type) const noexcept { PRINT ("susp-never-susp");}
+  void await_resume () const noexcept { PRINT ("susp-never-resume");}
+  ~suspend_never_prt () { PRINT ("susp-never-dtor"); }
+  };
+
+  struct promise_type {
+
+  promise_type () : vv(-1) {  PRINT ("Created Promise"); }
+
+  promise_type (int __x) : vv(__x) {  PRINTF ("Created Promise with %d\n",__x); }
+
+  ~promise_type () {
+  PRINT ("Destroyed Promise");
+  // The coroutine is about to become invalid - so remove the handle from the
+  // owner.
+  if (owner)
+owner->handle = nullptr;
+  }
+
+  // The coro1 ramp return object will be constructed from this.
+  auto get_return_object () {
+PRINT ("get_return_object: handle from promise");
+return handle_type::from_promise (*this);
+  }
+
+  auto initial_suspend () { return suspend_never_prt{}; }
+  auto final_suspend () noexcept { return suspend_never_prt{}; }
+
+  void return_value (int v) {
+PRINTF ("return_value (%d)\n", v);
+vv = v;
+  }
+
+  void unhandled_exception() { PRINT ("** unhandled exception"); }
+
+  int get_value () { return vv; }
+  void set_owner (coro1 *new_owner) { owner = new_owner; }
+
+  private:
+coro1 *owner = nullptr;
+int vv;
+  };
+};
+
+struct coro1
+finishes_synchronously (const int v)
+{
+  co_return v;
+}
+
+int main ()
+{
+  struct coro1 x = finishes_synchronously (42);
+  // the underlying coroutine is done and destroyed... and we should have
+  // signalled that by clearing coro1's handle.
+  if (x.handle)
+__builtin_abort ();
+  // ... so we just clean up the return object here.
+  return 0;
+}
diff --git 
a/gcc/testsuite/g++.dg/coroutines/torture/sp

[gcc r15-3780] RISC-V: Add testcases for form 4 of signed scalar SAT_ADD

2024-09-22 Thread Pan Li via Gcc-cvs

https://gcc.gnu.org/g:50c9c3cbdf26dfd2499b12eb6b9bc2218e598546

commit r15-3780-g50c9c3cbdf26dfd2499b12eb6b9bc2218e598546
Author: Pan Li 
Date:   Fri Sep 20 10:15:37 2024 +0800

RISC-V: Add testcases for form 4 of signed scalar SAT_ADD

Form 4:
  #define DEF_SAT_S_ADD_FMT_4(T, UT, MIN, MAX)           \
  T __attribute__((noinline))                            \
  sat_s_add_##T##_fmt_4 (T x, T y)                       \
  {                                                      \
    T sum;                                               \
    bool overflow = __builtin_add_overflow (x, y, &sum); \
    return !overflow ? sum : x < 0 ? MIN : MAX;          \
  }

DEF_SAT_S_ADD_FMT_4 (int64_t, uint64_t, INT64_MIN, INT64_MAX)

The following tests passed for this patch.
* The rv64gcv full regression test.

It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.
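
As a minimal sketch of what form 4 computes, the code below writes the macro body out by hand for int8_t and compares it exhaustively against a widen-and-clamp reference.  The reference function and the checker are hypothetical helpers for this illustration only; they are not part of sat_arith.h or the testsuite.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form 4 from the commit message, expanded by hand for T = int8_t.  */
int8_t
sat_s_add_int8_t_fmt_4 (int8_t x, int8_t y)
{
  int8_t sum;
  bool overflow = __builtin_add_overflow (x, y, &sum);
  return !overflow ? sum : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Hypothetical reference: add in a wider type, then clamp.  */
int8_t
sat_s_add_int8_t_ref (int8_t x, int8_t y)
{
  int wide = (int) x + (int) y;
  if (wide > INT8_MAX)
    return INT8_MAX;
  if (wide < INT8_MIN)
    return INT8_MIN;
  return (int8_t) wide;
}

/* Check every int8_t input pair; asserts on the first mismatch.  */
void
check_fmt_4_exhaustively (void)
{
  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    for (int y = INT8_MIN; y <= INT8_MAX; y++)
      assert (sat_s_add_int8_t_fmt_4 ((int8_t) x, (int8_t) y)
              == sat_s_add_int8_t_ref ((int8_t) x, (int8_t) y));
}
```

When the addition overflows, both operands must have the same sign, so testing x < 0 alone is enough to pick between MIN and MAX.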

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_add-13.c: New test.
* gcc.target/riscv/sat_s_add-14.c: New test.
* gcc.target/riscv/sat_s_add-15.c: New test.
* gcc.target/riscv/sat_s_add-16.c: New test.
* gcc.target/riscv/sat_s_add-run-13.c: New test.
* gcc.target/riscv/sat_s_add-run-14.c: New test.
* gcc.target/riscv/sat_s_add-run-15.c: New test.
* gcc.target/riscv/sat_s_add-run-16.c: New test.

Signed-off-by: Pan Li 

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h| 14 ++
 gcc/testsuite/gcc.target/riscv/sat_s_add-13.c | 30 +
 gcc/testsuite/gcc.target/riscv/sat_s_add-14.c | 32 +++
 gcc/testsuite/gcc.target/riscv/sat_s_add-15.c | 31 ++
 gcc/testsuite/gcc.target/riscv/sat_s_add-16.c | 29 
 gcc/testsuite/gcc.target/riscv/sat_s_add-run-13.c | 16 
 gcc/testsuite/gcc.target/riscv/sat_s_add-run-14.c | 16 
 gcc/testsuite/gcc.target/riscv/sat_s_add-run-15.c | 16 
 gcc/testsuite/gcc.target/riscv/sat_s_add-run-16.c | 16 
 9 files changed, 200 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index ab141bb17791..a2617b6db708 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -153,6 +153,17 @@ sat_s_add_##T##_fmt_3 (T x, T y)   \
 #define DEF_SAT_S_ADD_FMT_3_WRAP(T, UT, MIN, MAX) \
   DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX)
 
+#define DEF_SAT_S_ADD_FMT_4(T, UT, MIN, MAX)   \
+T __attribute__((noinline))\
+sat_s_add_##T##_fmt_4 (T x, T y)   \
+{  \
+  T sum;   \
+  bool overflow = __builtin_add_overflow (x, y, &sum); \
+  return !overflow ? sum : x < 0 ? MIN : MAX;  \
+}
+#define DEF_SAT_S_ADD_FMT_4_WRAP(T, UT, MIN, MAX) \
+  DEF_SAT_S_ADD_FMT_4(T, UT, MIN, MAX)
+
 #define RUN_SAT_S_ADD_FMT_1(T, x, y) sat_s_add_##T##_fmt_1(x, y)
 #define RUN_SAT_S_ADD_FMT_1_WRAP(T, x, y) RUN_SAT_S_ADD_FMT_1(T, x, y)
 
@@ -162,6 +173,9 @@ sat_s_add_##T##_fmt_3 (T x, T y)   \
 #define RUN_SAT_S_ADD_FMT_3(T, x, y) sat_s_add_##T##_fmt_3(x, y)
 #define RUN_SAT_S_ADD_FMT_3_WRAP(T, x, y) RUN_SAT_S_ADD_FMT_3(T, x, y)
 
+#define RUN_SAT_S_ADD_FMT_4(T, x, y) sat_s_add_##T##_fmt_4(x, y)
+#define RUN_SAT_S_ADD_FMT_4_WRAP(T, x, y) RUN_SAT_S_ADD_FMT_4(T, x, y)
+
 
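 /******************************************************************************/
 /* Saturation Sub (Unsigned and Signed)                                       */
 /******************************************************************************/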
diff --git a/gcc/testsuite/gcc.target/riscv/sat_s_add-13.c b/gcc/testsuite/gcc.target/riscv/sat_s_add-13.c
new file mode 100644
index 000000000000..0923764cde44
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_s_add-13.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_s_add_int8_t_fmt_4:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** xor\s+[atx][0-9]+,\s*a0,\s*a1
+** xor\s+[atx][0-9]+,\s*a0,\s*[atx][0-9]+
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*7
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*7
+** xori\s+[atx][0-9]+,\s*[atx][0-9]+,\s*1
+** and\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*1
+** srai\s+[atx][0-9]+,\s*[atx][0-9]+,\s*63
+** xori\s+[atx][0-9]+,\s*[atx][0-9]+,\s*127
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** and\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** addi\s+[atx][0

[gcc r15-3779] RISC-V: Add testcases for form 3 of signed scalar SAT_ADD

2024-09-22 Thread Pan Li via Gcc-cvs

https://gcc.gnu.org/g:20ec2c5dd4f21ee4d5c8246fbaddc507ee7fed3d

commit r15-3779-g20ec2c5dd4f21ee4d5c8246fbaddc507ee7fed3d
Author: Pan Li 
Date:   Fri Sep 20 10:01:40 2024 +0800

RISC-V: Add testcases for form 3 of signed scalar SAT_ADD

This patch would like to add testcases of the signed scalar SAT_ADD
for form 3.  Aka:

Form 3:
  #define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX)           \
  T __attribute__((noinline))                            \
  sat_s_add_##T##_fmt_3 (T x, T y)                       \
  {                                                      \
    T sum;                                               \
    bool overflow = __builtin_add_overflow (x, y, &sum); \
    return overflow ? x < 0 ? MIN : MAX : sum;           \
  }

DEF_SAT_S_ADD_FMT_3 (int64_t, uint64_t, INT64_MIN, INT64_MAX)

The following tests passed for this patch.
* The rv64gcv full regression test.

It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.
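
For illustration, the form 3 macro can be instantiated for int64_t exactly as in the commit message and exercised on a few boundary values.  The checker function below is a hypothetical addition for this sketch; the actual run tests added by the patch use their own data tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form 3 as quoted in the commit message.  UT is unused by this form.  */
#define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX)           \
T __attribute__((noinline))                            \
sat_s_add_##T##_fmt_3 (T x, T y)                       \
{                                                      \
  T sum;                                               \
  bool overflow = __builtin_add_overflow (x, y, &sum); \
  return overflow ? x < 0 ? MIN : MAX : sum;           \
}

DEF_SAT_S_ADD_FMT_3 (int64_t, uint64_t, INT64_MIN, INT64_MAX)

/* Hypothetical checker: boundary cases for the int64_t instance.  */
void
check_fmt_3_edge_cases (void)
{
  /* Positive overflow saturates to MAX.  */
  assert (sat_s_add_int64_t_fmt_3 (INT64_MAX, 1) == INT64_MAX);
  /* Negative overflow saturates to MIN.  */
  assert (sat_s_add_int64_t_fmt_3 (INT64_MIN, -1) == INT64_MIN);
  /* Mixed signs never overflow.  */
  assert (sat_s_add_int64_t_fmt_3 (INT64_MAX, INT64_MIN) == -1);
  assert (sat_s_add_int64_t_fmt_3 (40, 2) == 42);
}
```

The nested conditional parses as overflow ? (x < 0 ? MIN : MAX) : sum, so the inner selection only matters on the overflow path.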

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_add-10.c: New test.
* gcc.target/riscv/sat_s_add-11.c: New test.
* gcc.target/riscv/sat_s_add-12.c: New test.
* gcc.target/riscv/sat_s_add-9.c: New test.
* gcc.target/riscv/sat_s_add-run-10.c: New test.
* gcc.target/riscv/sat_s_add-run-11.c: New test.
* gcc.target/riscv/sat_s_add-run-12.c: New test.
* gcc.target/riscv/sat_s_add-run-9.c: New test.

Signed-off-by: Pan Li 

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h| 14 ++
 gcc/testsuite/gcc.target/riscv/sat_s_add-10.c | 32 +++
 gcc/testsuite/gcc.target/riscv/sat_s_add-11.c | 31 ++
 gcc/testsuite/gcc.target/riscv/sat_s_add-12.c | 29 
 gcc/testsuite/gcc.target/riscv/sat_s_add-9.c  | 30 +
 gcc/testsuite/gcc.target/riscv/sat_s_add-run-10.c | 16 
 gcc/testsuite/gcc.target/riscv/sat_s_add-run-11.c | 16 
 gcc/testsuite/gcc.target/riscv/sat_s_add-run-12.c | 16 
 gcc/testsuite/gcc.target/riscv/sat_s_add-run-9.c  | 16 
 9 files changed, 200 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index b4fbf5dc6623..ab141bb17791 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -142,12 +142,26 @@ sat_s_add_##T##_fmt_2 (T x, T y) \
   return x < 0 ? MIN : MAX;  \
 }
 
+#define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX)   \
+T __attribute__((noinline))\
+sat_s_add_##T##_fmt_3 (T x, T y)   \
+{  \
+  T sum;   \
+  bool overflow = __builtin_add_overflow (x, y, &sum); \
+  return overflow ? x < 0 ? MIN : MAX : sum;   \
+}
+#define DEF_SAT_S_ADD_FMT_3_WRAP(T, UT, MIN, MAX) \
+  DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX)
+
 #define RUN_SAT_S_ADD_FMT_1(T, x, y) sat_s_add_##T##_fmt_1(x, y)
 #define RUN_SAT_S_ADD_FMT_1_WRAP(T, x, y) RUN_SAT_S_ADD_FMT_1(T, x, y)
 
 #define RUN_SAT_S_ADD_FMT_2(T, x, y) sat_s_add_##T##_fmt_2(x, y)
 #define RUN_SAT_S_ADD_FMT_2_WRAP(T, x, y) RUN_SAT_S_ADD_FMT_2(T, x, y)
 
+#define RUN_SAT_S_ADD_FMT_3(T, x, y) sat_s_add_##T##_fmt_3(x, y)
+#define RUN_SAT_S_ADD_FMT_3_WRAP(T, x, y) RUN_SAT_S_ADD_FMT_3(T, x, y)
+
 
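 /******************************************************************************/
 /* Saturation Sub (Unsigned and Signed)                                       */
 /******************************************************************************/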
diff --git a/gcc/testsuite/gcc.target/riscv/sat_s_add-10.c b/gcc/testsuite/gcc.target/riscv/sat_s_add-10.c
new file mode 100644
index 000000000000..45329619f9d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_s_add-10.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_s_add_int16_t_fmt_3:
+** add\s+[atx][0-9]+,\s*a0,\s*a1
+** xor\s+[atx][0-9]+,\s*a0,\s*a1
+** xor\s+[atx][0-9]+,\s*a0,\s*[atx][0-9]+
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*15
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*15
+** xori\s+[atx][0-9]+,\s*[atx][0-9]+,\s*1
+** and\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*1
+** srai\s+[atx][0-9]+,\s*[atx][0-9]+,\s*63
+** li\s+[atx][0-9]+,\s*32768
+** addi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*-1
+** xor\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** and\s+[atx][0

[gcc r15-3783] RISC-V: Add testcases for form 2 of signed vector SAT_ADD

2024-09-22 Thread Pan Li via Gcc-cvs

https://gcc.gnu.org/g:a1e6bb6fb128a00a8355cb49afb9c1d290c1389b

commit r15-3783-ga1e6bb6fb128a00a8355cb49afb9c1d290c1389b
Author: Pan Li 
Date:   Fri Sep 20 16:09:56 2024 +0800

RISC-V: Add testcases for form 2 of signed vector SAT_ADD

Form 2:
  #define DEF_VEC_SAT_S_ADD_FMT_2(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_add_##T##_fmt_2 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T sum = (UT)x + (UT)y;                                         \
        if ((x ^ y) < 0 || (sum ^ x) >= 0)                             \
          out[i] = sum;                                                \
        else                                                           \
          out[i] = x < 0 ? MIN : MAX;                                  \
      }                                                                \
  }

DEF_VEC_SAT_S_ADD_FMT_2 (int8_t, uint8_t, INT8_MIN, INT8_MAX)

The following tests passed for this patch.
* The rv64gcv full regression test.

It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.
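
A small driver makes the behaviour of the form 2 loop concrete.  The sketch below instantiates the macro for int8_t as in the commit message and runs it over four hand-picked element pairs; the checker and its data are hypothetical and are not taken from the run tests in the patch.  It relies on GCC's wrap-around behaviour when converting an out-of-range value to int8_t.

```c
#include <assert.h>
#include <stdint.h>

/* Form 2 as quoted in the commit message.  */
#define DEF_VEC_SAT_S_ADD_FMT_2(T, UT, MIN, MAX)                     \
void __attribute__((noinline))                                       \
vec_sat_s_add_##T##_fmt_2 (T *out, T *op_1, T *op_2, unsigned limit) \
{                                                                    \
  unsigned i;                                                        \
  for (i = 0; i < limit; i++)                                        \
    {                                                                \
      T x = op_1[i];                                                 \
      T y = op_2[i];                                                 \
      T sum = (UT)x + (UT)y;                                         \
      if ((x ^ y) < 0 || (sum ^ x) >= 0)                             \
        out[i] = sum;                                                \
      else                                                           \
        out[i] = x < 0 ? MIN : MAX;                                  \
    }                                                                \
}

DEF_VEC_SAT_S_ADD_FMT_2 (int8_t, uint8_t, INT8_MIN, INT8_MAX)

/* Hypothetical checker with one element per interesting case.  */
void
check_vec_fmt_2 (void)
{
  int8_t a[4] = { 100, -100, 100, 50 };
  int8_t b[4] = { 100, -100, -100, 20 };
  int8_t out[4] = { 0 };

  vec_sat_s_add_int8_t_fmt_2 (out, a, b, 4);

  assert (out[0] == INT8_MAX);  /* Positive overflow.  */
  assert (out[1] == INT8_MIN);  /* Negative overflow.  */
  assert (out[2] == 0);         /* Mixed signs: no overflow possible.  */
  assert (out[3] == 70);        /* Same sign, in range.  */
}
```

The condition keeps the wrapped sum when the operands differ in sign or when the sum has the same sign as x; only the remaining case is a genuine overflow.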

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macro.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-5.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-6.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-7.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-5.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-6.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-7.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-run-8.c: New test.

Signed-off-by: Pan Li 

Diff:
---
 .../riscv/rvv/autovec/binop/vec_sat_s_add-5.c  |  9 
 .../riscv/rvv/autovec/binop/vec_sat_s_add-6.c  |  9 
 .../riscv/rvv/autovec/binop/vec_sat_s_add-7.c  |  9 
 .../riscv/rvv/autovec/binop/vec_sat_s_add-8.c  |  9 
 .../riscv/rvv/autovec/binop/vec_sat_s_add-run-5.c  | 17 +++
 .../riscv/rvv/autovec/binop/vec_sat_s_add-run-6.c  | 17 +++
 .../riscv/rvv/autovec/binop/vec_sat_s_add-run-7.c  | 17 +++
 .../riscv/rvv/autovec/binop/vec_sat_s_add-run-8.c  | 17 +++
 .../gcc.target/riscv/rvv/autovec/vec_sat_arith.h   | 24 ++
 9 files changed, 128 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-5.c
new file mode 100644
index 000000000000..8cf0d06efdb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-5.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details" } */
+
+#include "../vec_sat_arith.h"
+
+DEF_VEC_SAT_S_ADD_FMT_2(int8_t, uint8_t, INT8_MIN, INT8_MAX)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
+/* { dg-final { scan-assembler-times {vsadd\.vv} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-6.c
new file mode 100644
index 000000000000..a26d3943e27b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-6.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details" } */
+
+#include "../vec_sat_arith.h"
+
+DEF_VEC_SAT_S_ADD_FMT_2(int16_t, uint16_t, INT16_MIN, INT16_MAX)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
+/* { dg-final { scan-assembler-times {vsadd\.vv} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-7.c
new file mode 100644
index 000000000000..4ef1351dd299
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_s_add-7.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fdump-rtl-expand-details" } */
+
+#include "../vec_sat_arith.h"
+
+DEF_VEC_SAT_S_ADD_FMT_2(int32_t, uin

[gcc r15-3784] Match: Support form 2 for vector signed integer .SAT_ADD

2024-09-22 Thread Pan Li via Gcc-cvs

https://gcc.gnu.org/g:4fc92480675bd071dd3edbaa78bb73525137c4a6

commit r15-3784-g4fc92480675bd071dd3edbaa78bb73525137c4a6
Author: Pan Li 
Date:   Fri Sep 20 16:01:01 2024 +0800

Match: Support form 2 for vector signed integer .SAT_ADD

This patch would like to support the form 2 of the vector signed
integer .SAT_ADD.  Aka below example:

Form 2:
  #define DEF_VEC_SAT_S_ADD_FMT_2(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_add_##T##_fmt_2 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T sum = (UT)x + (UT)y;                                         \
        if ((x ^ y) < 0 || (sum ^ x) >= 0)                             \
          out[i] = sum;                                                \
        else                                                           \
          out[i] = x < 0 ? MIN : MAX;                                  \
      }                                                                \
  }

DEF_VEC_SAT_S_ADD_FMT_2(int8_t, uint8_t, INT8_MIN, INT8_MAX)

Before this patch:
 104   │   loop_len_79 = MIN_EXPR ;
 105   │   _50 = &MEM  [(int8_t *)vectp_op_1.9_77];
 106   │   vect_x_18.11_80 = .MASK_LEN_LOAD (_50, 8B, { -1, ... }, loop_len_79, 0);
 107   │   _70 = vect_x_18.11_80 >> 7;
 108   │   vect_x.12_81 = VIEW_CONVERT_EXPR(vect_x_18.11_80);
 109   │   _26 = (void *) ivtmp.47_20;
 110   │   _27 = &MEM  [(int8_t *)_26];
 111   │   vect_y_20.15_84 = .MASK_LEN_LOAD (_27, 8B, { -1, ... }, loop_len_79, 0);
 112   │   vect__7.21_90 = vect_x_18.11_80 ^ vect_y_20.15_84;
 113   │   mask__50.23_92 = vect__7.21_90 >= { 0, ... };
 114   │   vect_y.16_85 = VIEW_CONVERT_EXPR(vect_y_20.15_84);
 115   │   vect__6.17_86 = vect_x.12_81 + vect_y.16_85;
 116   │   vect_sum_21.18_87 = VIEW_CONVERT_EXPR(vect__6.17_86);
 117   │   vect__8.19_88 = vect_x_18.11_80 ^ vect_sum_21.18_87;
 118   │   mask__45.20_89 = vect__8.19_88 < { 0, ... };
 119   │   mask__44.24_93 = mask__45.20_89 & mask__50.23_92;
 120   │   _40 = .COND_XOR (mask__44.24_93, _70, { 127, ... }, vect_sum_21.18_87);
 121   │   _60 = (void *) ivtmp.49_6;
 122   │   _61 = &MEM  [(int8_t *)_60];
 123   │   .MASK_LEN_STORE (_61, 8B, { -1, ... }, loop_len_79, 0, _40);
 124   │   vectp_op_1.9_78 = vectp_op_1.9_77 + POLY_INT_CST [16, 16];
 125   │   ivtmp.47_4 = ivtmp.47_20 + POLY_INT_CST [16, 16];
 126   │   ivtmp.49_21 = ivtmp.49_6 + POLY_INT_CST [16, 16];
 127   │   ivtmp.51_98 = ivtmp.51_53;
 128   │   ivtmp.51_8 = ivtmp.51_53 + POLY_INT_CST [18446744073709551600, 18446744073709551600];

After this patch:
  88   │   _103 = .SELECT_VL (ivtmp_101, POLY_INT_CST [16, 16]);
  89   │   vect_x_18.11_90 = .MASK_LEN_LOAD (vectp_op_1.9_88, 8B, { -1, ... }, _103, 0);
  90   │   vect_y_20.14_94 = .MASK_LEN_LOAD (vectp_op_2.12_92, 8B, { -1, ... }, _103, 0);
  91   │   vect_patt_49.15_95 = .SAT_ADD (vect_x_18.11_90, vect_y_20.14_94);
  92   │   .MASK_LEN_STORE (vectp_out.16_97, 8B, { -1, ... }, _103, 0, vect_patt_49.15_95);
  93   │   vectp_op_1.9_89 = vectp_op_1.9_88 + _103;
  94   │   vectp_op_2.12_93 = vectp_op_2.12_92 + _103;
  95   │   vectp_out.16_98 = vectp_out.16_97 + _103;
  96   │   ivtmp_102 = ivtmp_101 - _103;

The following test suites passed for this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.
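
The comment added to match.pd in the diff describes the matched shape as a single conditional expression rather than the if/else written in form 2.  As a sanity check under stated assumptions, the sketch below writes both shapes out for int8_t and compares them over all inputs.  The function names are hypothetical, and the boolean negation written `~` in the match.pd comment is spelled `!` here because the operand is a C truth value.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar body of the form 2 loop, for T = int8_t.  */
int8_t
sat_s_add_form_2 (int8_t x, int8_t y)
{
  int8_t sum = (uint8_t) x + (uint8_t) y;
  if ((x ^ y) < 0 || (sum ^ x) >= 0)
    return sum;
  return x < 0 ? INT8_MIN : INT8_MAX;
}

/* The shape from the match.pd comment:
     (X ^ sum) < 0 & ~((X ^ Y) < 0) ? (-(T)(X < 0) ^ MAX) : sum  */
int8_t
sat_s_add_matched_shape (int8_t x, int8_t y)
{
  int8_t sum = (int8_t) ((uint8_t) x + (uint8_t) y);
  int overflow = ((x ^ sum) < 0) & !((x ^ y) < 0);
  /* -(T)(x < 0) is all-ones for negative x, so the XOR yields MIN;
     otherwise it is zero and the XOR yields MAX.  */
  return overflow ? (int8_t) (-(int8_t) (x < 0) ^ INT8_MAX) : sum;
}

/* Compare the two shapes on every int8_t input pair.  */
void
check_shapes_agree (void)
{
  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    for (int y = INT8_MIN; y <= INT8_MAX; y++)
      assert (sat_s_add_form_2 ((int8_t) x, (int8_t) y)
              == sat_s_add_matched_shape ((int8_t) x, (int8_t) y));
}
```

The two conditions are De Morgan duals of each other, which is why the if/else in form 2 can be recognised as the same saturating add.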

gcc/ChangeLog:

* match.pd: Add the case 3 for signed .SAT_ADD matching.

Signed-off-by: Pan Li 

Diff:
---
 gcc/match.pd | 16 
 1 file changed, 16 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index fdb59ff0d447..940292d0d497 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3251,6 +3251,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type)
   && types_match (type, @0, @1
 
+/* Signed saturation add, case 5:
+   T sum = (T)((UT)X + (UT)Y);
+   SAT_S_ADD = (X ^ sum) < 0 & ~((X ^ Y) < 0) ? (-(T)(X < 0) ^ MAX) : sum;
+
+   The T and UT are type pair like T=int8_t, UT=uint8_t.  */
+(match (signed_integer_sat_add @0 @1)
+ (cond^ (bit_and:c (lt (bit_xor @0 (nop_convert@2 (plus (nop_convert @0)
+(nop_convert @1
+  integer_zerop)
+

[gcc r15-3782] testsuite/gfortran.dg/unsigned_22.f90: Add missing close with delete, PR116701

2024-09-22 Thread Hans-Peter Nilsson via Gcc-cvs

https://gcc.gnu.org/g:3f37c6f47cd50c99350e93ef0dab31f7dc6d213a

commit r15-3782-g3f37c6f47cd50c99350e93ef0dab31f7dc6d213a
Author: Hans-Peter Nilsson 
Date:   Mon Sep 23 03:29:02 2024 +0200

testsuite/gfortran.dg/unsigned_22.f90: Add missing close with delete, PR116701

Without this patch, gfortran.dg/unsigned_22.f90 fails for
non-effective-target fd_truncate targets, i.e. targets that
don't support chsize or ftruncate.  See also
libgfortran/io/unix.c:raw_truncate.  It passes on the first
run, but leaves behind a file "fort.10" which is then picked
up by subsequent runs; since that file is to be rewritten,
the libgfortran machinery tries to truncate it, which fails.
The file is always left behind primarily because the
test-case lacks a deleting close-statement, apparently
accidentally.

Incidentally, this "fort.10" artefact is also picked up by
gfortran.dg/write_check3.f90, causing that test to fail too,
observable as a regression for non-fd_truncate targets since
the introduction of unsigned_22.f90.  Also, when running
e.g. the whole of gfortran.dg/dg.exp, "fort.10" is later
deleted by gfortran.dg/write_direct_eor.f90 (which passes
regardless), erasing the clue to the cause of the
write_check3 failure.  Also, running just
dg.exp=write_check3.f90 or manually repeating the commands
in gfortran.log showed no error.

N.B.: this close-statement will not help if unsigned_22 for
some reason fails, executing one of the "stop" statements,
but that's also the case for many other tests.

PR testsuite/116701
* gfortran.dg/unsigned_22.f90: Add missing close with delete.

Diff:
---
 gcc/testsuite/gfortran.dg/unsigned_22.f90 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gfortran.dg/unsigned_22.f90 b/gcc/testsuite/gfortran.dg/unsigned_22.f90
index bc2f810238de..2a8434ccb6ec 100644
--- a/gcc/testsuite/gfortran.dg/unsigned_22.f90
+++ b/gcc/testsuite/gfortran.dg/unsigned_22.f90
@@ -22,4 +22,5 @@ program memain
   read (10,*,iostat=iostat,iomsg=iomsg) u
   if (iostat == 0) error stop 7
   if (iomsg /= "Unsigned integer overflow while reading item 1 of list input") error stop 8
+  close(unit=10, status='delete')
  end program memain

[gcc r14-10703] Darwin: Allow for as versions that need '-' for std in.

2024-09-22 Thread Iain D Sandoe via Gcc-cvs

https://gcc.gnu.org/g:59fa909de87d3658462e6f8220b545c285581e78

commit r14-10703-g59fa909de87d3658462e6f8220b545c285581e78
Author: Iain Sandoe 
Date:   Wed Sep 18 17:46:32 2024 +0100

Darwin: Allow for as versions that need '-' for std in.

Recent versions of Xcode as require a dash to read from standard
input.  We can use this on all supported OS versions so make it
unconditional.  Patch from Mark Mentovai.

gcc/ChangeLog:

* config/darwin.h (AS_NEEDS_DASH_FOR_PIPED_INPUT): New.

Signed-off-by: Iain Sandoe 
(cherry picked from commit 33ccc1314dcdb0b988a9276ca6b6ce9b07bea21e)

Diff:
---
 gcc/config/darwin.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 377599074a75..0d8886c026c6 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -648,6 +648,8 @@ extern GTY(()) int darwin_ms_struct;
 #define ASM_OPTIONS "%{v} %{w:-W} %{I*}"
 #endif
 
+#define AS_NEEDS_DASH_FOR_PIPED_INPUT
+
 /* Default Darwin ASM_SPEC, very simple. */
 #define ASM_SPEC \
 "%{static} -arch %(darwin_arch) " \

[gcc r14-10702] Darwin: Recognise -weak_framework in the driver [PR116237].

2024-09-22 Thread Iain D Sandoe via Gcc-cvs

https://gcc.gnu.org/g:b292b6b9e104c2418b4b19c8495fef9effe9369f

commit r14-10702-gb292b6b9e104c2418b4b19c8495fef9effe9369f
Author: Iain Sandoe 
Date:   Mon Aug 5 13:19:28 2024 +0100

Darwin: Recognise -weak_framework in the driver [PR116237].

XCode compilers recognise the weak_framework linker option in the driver
and forward it.  This patch makes GCC adopt the same behaviour.

PR target/116237

gcc/ChangeLog:

* config/darwin.h (SUBTARGET_DRIVER_SELF_SPECS): Add a spec for
weak_framework.
* config/darwin.opt: Handle weak_framework driver option.

Signed-off-by: Iain Sandoe 
(cherry picked from commit 4cec7bc79db52bae159c3c60a415e2aea48051d8)

Diff:
---
 gcc/config/darwin.h   | 2 ++
 gcc/config/darwin.opt | 4 
 2 files changed, 6 insertions(+)

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index c09b9e9dc94d..377599074a75 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -264,6 +264,8 @@ extern GTY(()) int darwin_ms_struct;
   "%{weak_reference_mismatches*:\
 -Xlinker -weak_reference_mismatches -Xlinker %*} \
 %Do not export the global symbols listed 
in .
 
+weak_framework
+Driver RejectNegative Separate
+-weak_framework Make a weak link to the specified framework.
+
 weak_reference_mismatches
 Driver RejectNegative Separate
 -weak_reference_mismatches  Specifies what to do if a symbol import conflicts between file (weak in one and not in another) the default is to treat the symbol as non-weak.

[gcc r14-10704] libgcc, Darwin: From macOS 11, make that the earliest supported.

2024-09-22 Thread Iain D Sandoe via Gcc-cvs

https://gcc.gnu.org/g:7eba5b286e991d3e16321791805704815a02ee92

commit r14-10704-g7eba5b286e991d3e16321791805704815a02ee92
Author: Iain Sandoe 
Date:   Sun Sep 22 14:30:30 2024 +0100

libgcc, Darwin: From macOS 11, make that the earliest supported.

For libgcc, we have (so far) supported building a DSO that supports
earlier versions of the OS than the target.  From macOS 11, there are
APIs that do not exist on earlier OS versions, so limit the libgcc
range to macOS11..current.

libgcc/ChangeLog:

* config.host: From macOS 11, limit earliest macOS support
to macOS 11.
* config/t-darwin-min-11: New file.

Signed-off-by: Iain Sandoe 
(cherry picked from commit 43eab54939d37d4e634a692910d31adafc053e38)

Diff:
---
 libgcc/config.host            | 5 ++++-
 libgcc/config/t-darwin-min-11 | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/libgcc/config.host b/libgcc/config.host
index e75a7af647f6..733290370442 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -236,7 +236,10 @@ case ${host} in
   esac
   tmake_file="$tmake_file t-slibgcc-darwin"
   case ${host} in
-*-*-darwin1[89]* | *-*-darwin2* )
+*-*-darwin2*)
+  tmake_file="t-darwin-min-11 $tmake_file"
+  ;;
+*-*-darwin1[89]*)
   tmake_file="t-darwin-min-8 $tmake_file"
   ;;
 *-*-darwin9* | *-*-darwin1[0-7]*)
diff --git a/libgcc/config/t-darwin-min-11 b/libgcc/config/t-darwin-min-11
new file mode 100644
index 000000000000..4009d41addb5
--- /dev/null
+++ b/libgcc/config/t-darwin-min-11
@@ -0,0 +1,3 @@
+# Support building with -mmacosx-version-min back to macOS 11.
+DARWIN_MIN_LIB_VERSION = -mmacosx-version-min=11
+DARWIN_MIN_CRT_VERSION = -mmacosx-version-min=11

[gcc r15-3767] aarch64: Take into account when VF is higher than known scalar iters

2024-09-22 Thread Tamar Christina via Gcc-cvs

https://gcc.gnu.org/g:e84e5d034124c6733d3b36d8623c56090d4d17f7

commit r15-3767-ge84e5d034124c6733d3b36d8623c56090d4d17f7
Author: Tamar Christina 
Date:   Sun Sep 22 13:34:10 2024 +0100

aarch64: Take into account when VF is higher than known scalar iters

Consider low overhead loops like:

void
foo (char *restrict a, int *restrict b, int *restrict c, int n)
{
  for (int i = 0; i < 9; i++)
    {
      int res = c[i];
      int t = b[i];
      if (a[i] != 0)
        res = t;
      c[i] = res;
    }
}

For such loops we use latency-only costing since the loop bound is known and
small.

The current costing however does not consider the case where niters < VF.

So when comparing the scalar vs vector costs it doesn't keep in mind that the
scalar code can't perform VF iterations.  This makes it overestimate the cost
for the scalar loop and we incorrectly vectorize.

This patch takes the minimum of the VF and niters in such cases.
Before the patch we generate:

 note:  Original vector body cost = 46
 note:  Vector loop iterates at most 1 times
 note:  Scalar issue estimate:
 note:load operations = 2
 note:store operations = 1
 note:general operations = 1
 note:reduction latency = 0
 note:estimated min cycles per iteration = 1.00
 note:estimated cycles per vector iteration (for VF 32) = 32.00
 note:  SVE issue estimate:
 note:load operations = 5
 note:store operations = 4
 note:general operations = 11
 note:predicate operations = 12
 note:reduction latency = 0
 note:estimated min cycles per iteration without predication = 5.50
 note:estimated min cycles per iteration for predication = 12.00
 note:estimated min cycles per iteration = 12.00
 note:  Low iteration count, so using pure latency costs
 note:  Cost model analysis:

vs after:

 note:  Original vector body cost = 46
 note:  Known loop bounds, capping VF to 9 for analysis
 note:  Vector loop iterates at most 1 times
 note:  Scalar issue estimate:
 note:load operations = 2
 note:store operations = 1
 note:general operations = 1
 note:reduction latency = 0
 note:estimated min cycles per iteration = 1.00
 note:estimated cycles per vector iteration (for VF 9) = 9.00
 note:  SVE issue estimate:
 note:load operations = 5
 note:store operations = 4
 note:general operations = 11
 note:predicate operations = 12
 note:reduction latency = 0
 note:estimated min cycles per iteration without predication = 5.50
 note:estimated min cycles per iteration for predication = 12.00
 note:estimated min cycles per iteration = 12.00
 note:  Increasing body cost to 1472 because the scalar code could issue within the limit imposed by predicate operations
 note:  Low iteration count, so using pure latency costs
 note:  Cost model analysis:
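
The change in the "estimated cycles per vector iteration" line between the two
dumps above can be sketched roughly as follows.  This is only an illustration
of the capping idea, not the code in adjust_body_cost; the function names are
hypothetical.

```cpp
#include <algorithm>

// Hypothetical helper, not GCC code: the number of scalar iterations that
// one vector iteration can actually replace when the trip count is known.
constexpr unsigned effective_vf(unsigned vf, unsigned known_niters)
{
  return std::min(vf, known_niters);
}

// Estimated scalar cycles needed to do the work of one vector iteration.
constexpr double scalar_cycles_per_vector_iter(double cycles_per_scalar_iter,
                                               unsigned vf,
                                               unsigned known_niters)
{
  return cycles_per_scalar_iter * effective_vf(vf, known_niters);
}

// With niters >= VF the cap has no effect (32 cycles, as in the first dump).
static_assert(scalar_cycles_per_vector_iter(1.0, 32, 32) == 32.0,
              "uncapped estimate");
// With the 9-iteration loop the scalar side is only charged for 9 iterations.
static_assert(scalar_cycles_per_vector_iter(1.0, 32, 9) == 9.0,
              "capped estimate");
```

The values 32, 9 and 1.00 are taken from the dumps; everything else is an
assumption made for the example.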

gcc/ChangeLog:

* config/aarch64/aarch64.cc (adjust_body_cost):
Cap VF for low iteration loops.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/asrdiv_4.c: Update bounds.
* gcc.target/aarch64/sve/cond_asrd_2.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_6.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_7.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_8.c: Likewise.
* gcc.target/aarch64/sve/miniloop_1.c: Likewise.
* gcc.target/aarch64/sve/spill_6.c: Likewise.
* gcc.target/aarch64/sve/sve_iters_low_1.c: New test.
* gcc.target/aarch64/sve/sve_iters_low_2.c: New test.

Diff:
---
 gcc/config/aarch64/aarch64.cc| 13 +
 gcc/testsuite/gcc.target/aarch64/sve/asrdiv_4.c  | 12 ++--
 gcc/testsuite/gcc.target/aarch64/sve/cond_asrd_2.c   | 12 ++--
 gcc/testsuite/gcc.target/aarch64/sve/cond_uxt_6.c|  8 
 gcc/testsuite/gcc.target/aarch64/sve/cond_uxt_7.c|  8 
 gcc/testsuite/gcc.target/aarch64/sve/cond_uxt_8.c|  8 
 gcc/testsuite/gcc.target/aarch64/sve/miniloop_1.c|  2 +-
 gcc/testsuite/gcc.target/aarch64/sve/spill_6.c   |  8 
 .../gcc.target/aarch64/sve/sve_iters_low_1.c | 17 +
 .../gcc.target/aarch64/sve/sve_iters_low_2.c | 20 
 10 files changed, 79 insertions(+), 29 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 92763d403c75..68913beaee20 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -17565,6 +17565,19 @@ adjust_body_cost (loop_vec_info loop_vinfo,
 dump_printf_loc (MSG_NOTE, vect_location,
 "Origina

[gcc r15-3768] middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

2024-09-22 Thread Tamar Christina via Gcc-cvs

https://gcc.gnu.org/g:4150bcd205ebb60b949224758c05012c0dfab7a7

commit r15-3768-g4150bcd205ebb60b949224758c05012c0dfab7a7
Author: Tamar Christina 
Date:   Sun Sep 22 13:38:49 2024 +0100

middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

Currently the vectorizer cheats when lowering COND_EXPR during bool recog.
In the cases where the conditional is loop invariant or non-boolean it instead
converts the operation back into GENERIC and hides much of the operation from
the analysis part of the vectorizer.

i.e.

  a ? b : c

is transformed into:

  a != 0 ? b : c

however by doing so we can't perform any optimization on the masks, as they
aren't explicit until quite late during codegen.

To fix this, this patch lowers booleans earlier and so ensures that we are
always in GIMPLE.

When the value is a loop invariant boolean we have to generate an additional
conversion from bool to the integer mask form.

This is done by creating a loop invariant a ? -1 : 0 with the target mask
precision and then doing a normal != 0 comparison on that.
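
As a scalar model of the two steps just described (a sketch only; the names are
hypothetical and the 32-bit mask width is an assumption standing in for "the
target mask precision"):

```cpp
#include <cstdint>

// Hypothetical scalar model, not GCC code.
// Step 1 (loop invariant, computed once outside the loop): widen the bool
// to an all-ones / all-zeros integer mask.
constexpr std::int32_t bool_to_mask(bool a)
{
  return a ? -1 : 0;
}

// Step 2 (inside the loop): select with an ordinary != 0 comparison on the
// integer mask, which is the form the rest of the vectorizer already handles.
constexpr int select_by_mask(std::int32_t mask, int b, int c)
{
  return mask != 0 ? b : c;
}

static_assert(select_by_mask(bool_to_mask(true), 1, 2) == 1, "true picks b");
static_assert(select_by_mask(bool_to_mask(false), 1, 2) == 2, "false picks c");
```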

To support this the patch also adds the ability, during pattern matching, to
create a loop invariant pattern that won't be seen by the vectorizer and will
instead be materialized inside the loop preheader in the case of loops, or, in
the case of BB vectorization, in the first BB in the region.

gcc/ChangeLog:

* tree-vect-patterns.cc (append_inv_pattern_def_seq): New.
(vect_recog_bool_pattern): Lower COND_EXPRs.
* tree-vect-slp.cc (vect_slp_region): Materialize loop invariant
statements.
* tree-vect-loop.cc (vect_transform_loop): Likewise.
* tree-vect-stmts.cc (vectorizable_comparison_1): Remove
VECT_SCALAR_BOOLEAN_TYPE_P handling for vectype.
* tree-vectorizer.cc (vec_info::vec_info): Initialize
inv_pattern_def_seq.
* tree-vectorizer.h (LOOP_VINFO_INV_PATTERN_DEF_SEQ): New.
(class vec_info): Add inv_pattern_def_seq.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-conditional_store_1.c: New test.
* gcc.dg/vect/vect-conditional_store_5.c: New test.
* gcc.dg/vect/vect-conditional_store_6.c: New test.

Diff:
---
 .../gcc.dg/vect/bb-slp-conditional_store_1.c   | 15 +
 .../gcc.dg/vect/vect-conditional_store_5.c | 28 
 .../gcc.dg/vect/vect-conditional_store_6.c | 24 +
 gcc/tree-vect-loop.cc  | 12 +++
 gcc/tree-vect-patterns.cc  | 39 --
 gcc/tree-vect-slp.cc   | 14 
 gcc/tree-vect-stmts.cc |  6 +---
 gcc/tree-vectorizer.cc |  3 +-
 gcc/tree-vectorizer.h  |  7 
 9 files changed, 139 insertions(+), 9 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-conditional_store_1.c b/gcc/testsuite/gcc.dg/vect/bb-slp-conditional_store_1.c
new file mode 100644
index ..650a3bfbfb1d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-conditional_store_1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_float } */
+
+/* { dg-additional-options "-mavx2" { target avx2 } } */
+/* { dg-additional-options "-march=armv9-a" { target aarch64-*-* } } */
+
+void foo3 (float *restrict a, int *restrict c)
+{
+#pragma GCC unroll 8
+  for (int i = 0; i < 8; i++)
+c[i] = a[i] > 1.0;
+}
+
+/* { dg-final { scan-tree-dump "vectorized using SLP" "slp1" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-conditional_store_5.c b/gcc/testsuite/gcc.dg/vect/vect-conditional_store_5.c
new file mode 100644
index ..37d60fa76351
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-conditional_store_5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_masked_store } */
+
+/* { dg-additional-options "-mavx2" { target avx2 } } */
+/* { dg-additional-options "-march=armv9-a" { target aarch64-*-* } } */
+
+#include <stdbool.h>
+
+void foo3 (float *restrict a, int *restrict b, int *restrict c, int n, int stride)
+{
+  if (stride <= 1)
+return;
+
+  bool ai = a[0];
+
+  for (int i = 0; i < n; i++)
+{
+  int res = c[i];
+  int t = b[i+stride];
+  if (ai)
+t = res;
+  c[i] = t;
+}
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "VEC_COND_EXPR " "vect" { target aarch64-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-conditional_store_6.c b/gcc/testsuite/gcc.dg/vect/vect-conditional_store_6.c
new file mode 100644
index ..5e1aedf3726b
--- /

[gcc r15-3776] libstdc++: Disable std::formatter specialization

2024-09-22 Thread Jonathan Wakely via Libstdc++-cvs

https://gcc.gnu.org/g:0f52a92ab249bde64b7570d4cf549437a3283520

commit r15-3776-g0f52a92ab249bde64b7570d4cf549437a3283520
Author: Jonathan Wakely 
Date:   Fri Sep 20 17:26:35 2024 +0100

libstdc++: Disable std::formatter specialization

I noticed that char8_t was missing from the list of types that were
prevented from using the std::formatter partial specialization for
integer types. That partial specialization was also matching
cv-qualified integer types, because std::integral<const int> is true.

This change simplifies the constraints by introducing a new variable
template which is only true for cv-unqualified integer types, with
explicit specializations to exclude the character types. This should be
slightly more efficient than the previous constraints that checked
std::integral<_Tp> and (!__is_one_of<_Tp, char, wchar_t, char16_t, char32_t>).
It also avoids the need for a separate std::formatter specialization for
128-bit integers, as they can be handled by the new variable template too.
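
A stand-alone sketch of the technique described above, for readers without the
libstdc++ internals to hand.  The name is_formattable_integer_sketch is
hypothetical, and the primary template here is built from standard traits
rather than the internal __is_integer helper used by the real change.

```cpp
#include <type_traits>

// Hypothetical stand-in for the new variable template: true only for
// cv-unqualified integer types.
template<typename T>
  inline constexpr bool is_formattable_integer_sketch
    = std::is_integral_v<T> && std::is_same_v<std::remove_cv_t<T>, T>;

// Explicit specializations exclude the character types.
template<> inline constexpr bool is_formattable_integer_sketch<char> = false;
template<> inline constexpr bool is_formattable_integer_sketch<wchar_t> = false;
#ifdef __cpp_char8_t
template<> inline constexpr bool is_formattable_integer_sketch<char8_t> = false;
#endif
template<> inline constexpr bool is_formattable_integer_sketch<char16_t> = false;
template<> inline constexpr bool is_formattable_integer_sketch<char32_t> = false;

static_assert(is_formattable_integer_sketch<int>);
static_assert(is_formattable_integer_sketch<unsigned long long>);
static_assert(!is_formattable_integer_sketch<const int>);  // cv-qualified
static_assert(!is_formattable_integer_sketch<char>);       // character type
static_assert(!is_formattable_integer_sketch<double>);     // not an integer
```

A partial specialization constrained with a single variable template like this
needs only one boolean lookup per type, which is the efficiency argument made
in the commit message.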

libstdc++-v3/ChangeLog:

* include/std/format (__format::__is_formattable_integer): New
variable template and specializations.
(template struct formatter): Replace
constraints on first arg with __is_formattable_integer.
* testsuite/std/format/formatter/requirements.cc: Check that
std::formatter specializations for char8_t and const int are
disabled.

Diff:
---
 libstdc++-v3/include/std/format| 53 --
 .../testsuite/std/format/formatter/requirements.cc | 17 +++
 2 files changed, 45 insertions(+), 25 deletions(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 4c5377aabec6..100a53dfd76f 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -1468,8 +1468,9 @@ namespace __format
 
   // We can format a floating-point type iff it is usable with to_chars.
   template
-concept __formattable_float = requires (_Tp __t, char* __p)
-{ __format::to_chars(__p, __p, __t, chars_format::scientific, 6); };
+concept __formattable_float
+  = is_same_v, _Tp> && requires (_Tp __t, char* __p)
+  { __format::to_chars(__p, __p, __t, chars_format::scientific, 6); };
 
   template<__char _CharT>
 struct __formatter_fp
@@ -2184,32 +2185,33 @@ namespace __format
 #endif // USE_WCHAR_T
   /// @}
 
-  /// Format an integer.
-  template
-requires (!__is_one_of<_Tp, char, wchar_t, char16_t, char32_t>::value)
-struct formatter<_Tp, _CharT>
-{
-  formatter() = default;
-
-  [[__gnu__::__always_inline__]]
-  constexpr typename basic_format_parse_context<_CharT>::iterator
-  parse(basic_format_parse_context<_CharT>& __pc)
-  {
-   return _M_f.template _M_parse<_Tp>(__pc);
-  }
+/// @cond undocumented
+namespace __format
+{
+  // each cv-unqualified arithmetic type ArithmeticT other than
+  // char, wchar_t, char8_t, char16_t, or char32_t
+  template
+constexpr bool __is_formattable_integer = __is_integer<_Tp>::__value;
 
-  template
-   typename basic_format_context<_Out, _CharT>::iterator
-   format(_Tp __u, basic_format_context<_Out, _CharT>& __fc) const
-   { return _M_f.format(__u, __fc); }
+#if defined __SIZEOF_INT128__
+  template<> inline constexpr bool __is_formattable_integer<__int128>  = true;
+  template<> inline constexpr bool __is_formattable_integer
+  = true;
+#endif
 
-private:
-  __format::__formatter_int<_CharT> _M_f;
-};
+  template<> inline constexpr bool __is_formattable_integer = false;
+  template<> inline constexpr bool __is_formattable_integer = false;
+#ifdef _GLIBCXX_USE_CHAR8_T
+  template<> inline constexpr bool __is_formattable_integer = false;
+#endif
+  template<> inline constexpr bool __is_formattable_integer = false;
+  template<> inline constexpr bool __is_formattable_integer = false;
+}
+/// ~endcond
 
-#if defined __SIZEOF_INT128__ && defined __STRICT_ANSI__
+  /// Format an integer.
   template
-requires (__is_one_of<_Tp, __int128, unsigned __int128>::value)
+requires __format::__is_formattable_integer<_Tp>
 struct formatter<_Tp, _CharT>
 {
   formatter() = default;
@@ -2229,7 +2231,6 @@ namespace __format
 private:
   __format::__formatter_int<_CharT> _M_f;
 };
-#endif
 
 #if defined __glibcxx_to_chars
   /// Format a floating-point value.
@@ -2614,6 +2615,8 @@ namespace __format
 } // namespace __format
 /// @endcond
 
+// Concept std::formattable was introduced by P2286R8 "Formatting Ranges",
+// but we can't guard it with __cpp_lib_format_ranges until we define that!
 #if __cplusplus > 202002L
   // [format.formattable], concept formattable
   template
diff --git a/libstdc++-v3/testsuite/std/format/formatter/requirements.cc b/libstdc++-v3/testsuite/std/format/formatter/requirements.cc
index bde67e586efc..416b9a8ede52 100644
--- a/libstdc++-v3/testsuite/std/format/formatter/requ

[gcc r15-3774] libstdc++: Fix formatting of most negative chrono::duration [PR116755]

2024-09-22 Thread Jonathan Wakely via Gcc-cvs

https://gcc.gnu.org/g:482e651f5750e4648ade90e32ed45b094538e7f8

commit r15-3774-g482e651f5750e4648ade90e32ed45b094538e7f8
Author: Jonathan Wakely 
Date:   Wed Sep 18 17:20:29 2024 +0100

libstdc++: Fix formatting of most negative chrono::duration [PR116755]

When formatting chrono::duration::min() we were
causing undefined behaviour by trying to form the negative of the most
negative value. If we convert negative durations with integer rep to the
corresponding unsigned integer rep then we can safely represent all
values.
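
The underlying trick can be shown without chrono.  The helper below is a
hypothetical illustration, not the library code: it computes the magnitude of a
signed integer by converting to the corresponding unsigned type first, so the
negation happens in modular unsigned arithmetic and never overflows.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper: magnitude of a signed integer, well defined even for
// the most negative value.
template<typename Rep>
constexpr std::make_unsigned_t<Rep> magnitude(Rep value)
{
  static_assert(std::is_integral_v<Rep> && std::is_signed_v<Rep>,
                "signed integer types only");
  using URep = std::make_unsigned_t<Rep>;
  // Signed-to-unsigned conversion is modular, and so is the subtraction
  // once its result is converted back to URep.
  return value < 0 ? static_cast<URep>(URep(0) - static_cast<URep>(value))
                   : static_cast<URep>(value);
}

static_assert(magnitude(-5) == 5u, "ordinary negative value");
static_assert(magnitude(std::numeric_limits<long long>::min())
                == static_cast<unsigned long long>(
                     std::numeric_limits<long long>::max()) + 1u,
              "most negative value is representable as unsigned");
```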

libstdc++-v3/ChangeLog:

PR libstdc++/116755
* include/bits/chrono_io.h (formatter>::format):
Cast negative integral durations to unsigned rep.
* testsuite/20_util/duration/io.cc: Test the most negative
integer durations.

Diff:
---
 libstdc++-v3/include/bits/chrono_io.h | 16 ++--
 libstdc++-v3/testsuite/20_util/duration/io.cc |  8 
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/chrono_io.h b/libstdc++-v3/include/bits/chrono_io.h
index 0e4d23c9bb77..c7d2c9862fcf 100644
--- a/libstdc++-v3/include/bits/chrono_io.h
+++ b/libstdc++-v3/include/bits/chrono_io.h
@@ -1720,8 +1720,20 @@ namespace __format
   basic_format_context<_Out, _CharT>& __fc) const
{
  if constexpr (numeric_limits<_Rep>::is_signed)
-   if (__d < __d.zero())
- return _M_f._M_format(-__d, __fc, true);
+   if (__d < __d.zero()) [[unlikely]]
+ {
+   if constexpr (is_integral_v<_Rep>)
+ {
+   // -d is undefined for the most negative integer.
+   // Convert duration to corresponding unsigned rep.
+   using _URep = make_unsigned_t<_Rep>;
+   auto __ucnt = -static_cast<_URep>(__d.count());
+   auto __ud = chrono::duration<_URep, _Period>(__ucnt);
+   return _M_f._M_format(__ud, __fc, true);
+ }
+   else
+ return _M_f._M_format(-__d, __fc, true);
+ }
  return _M_f._M_format(__d, __fc, false);
}
 
diff --git a/libstdc++-v3/testsuite/20_util/duration/io.cc b/libstdc++-v3/testsuite/20_util/duration/io.cc
index 6b00689672c8..57020f4f9537 100644
--- a/libstdc++-v3/testsuite/20_util/duration/io.cc
+++ b/libstdc++-v3/testsuite/20_util/duration/io.cc
@@ -106,6 +106,14 @@ test_format()
   VERIFY( s == "500ms" );
   s = std::format("{:%Q %q}", u);
   VERIFY( s == "500 ms" );
+
+  // PR libstdc++/116755 extra minus sign for most negative value
+  auto minsec = std::chrono::seconds::min();
+  s = std::format("{}", minsec);
+  auto expected = std::format("{}s", minsec.count());
+  VERIFY( s == expected );
+  s = std::format("{:%Q%q}", minsec);
+  VERIFY( s == expected );
 }
 
 void

[gcc r15-3775] libstdc++: Fix condition for ranges::copy to use memmove [PR116754]

2024-09-22 Thread Jonathan Wakely via Gcc-cvs

https://gcc.gnu.org/g:83c6fe130a00c6c28cfffcc787a0a719966adfaf

commit r15-3775-g83c6fe130a00c6c28cfffcc787a0a719966adfaf
Author: Jonathan Wakely 
Date:   Wed Sep 18 17:47:49 2024 +0100

libstdc++: Fix condition for ranges::copy to use memmove [PR116754]

libstdc++-v3/ChangeLog:

PR libstdc++/116754
* include/bits/ranges_algobase.h (__copy_or_move): Fix order of
arguments to __memcpyable.

Diff:
---
 libstdc++-v3/include/bits/ranges_algobase.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/ranges_algobase.h b/libstdc++-v3/include/bits/ranges_algobase.h
index 2a36ba69775a..40c628b38182 100644
--- a/libstdc++-v3/include/bits/ranges_algobase.h
+++ b/libstdc++-v3/include/bits/ranges_algobase.h
@@ -286,7 +286,7 @@ namespace ranges
{
  if (!std::__is_constant_evaluated())
{
- if constexpr (__memcpyable<_Iter, _Out>::__value)
+ if constexpr (__memcpyable<_Out, _Iter>::__value)
{
  using _ValueTypeI = iter_value_t<_Iter>;
  auto __num = __last - __first;

[gcc r15-3773] libstdc++: Use constexpr instead of _GLIBCXX20_CONSTEXPR in

2024-09-22 Thread Jonathan Wakely via Gcc-cvs

https://gcc.gnu.org/g:b6463161c3cd0b1f764697290d9569c7153b8a5b

commit r15-3773-gb6463161c3cd0b1f764697290d9569c7153b8a5b
Author: Jonathan Wakely 
Date:   Wed Sep 18 16:17:28 2024 +0100

libstdc++: Use constexpr instead of _GLIBCXX20_CONSTEXPR in 

For the operator<=> overload we can use the 'constexpr' keyword
directly, because we know the language dialect is at least C++20.

libstdc++-v3/ChangeLog:

* include/bits/stl_vector.h (operator<=>): Use constexpr
instead of _GLIBCXX20_CONSTEXPR macro.

Diff:
---
 libstdc++-v3/include/bits/stl_vector.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 182ad41ed946..e284536ad31e 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -2078,7 +2078,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 { return (__x.size() == __y.size()
  && std::equal(__x.begin(), __x.end(), __y.begin())); }
 
-#if __cpp_lib_three_way_comparison
+#if __cpp_lib_three_way_comparison // >= C++20
   /**
*  @brief  Vector ordering relation.
*  @param  __x  A `vector`.
@@ -2091,8 +2091,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*  `<` and `>=` etc.
   */
   template
-[[nodiscard]] _GLIBCXX20_CONSTEXPR
-inline __detail::__synth3way_t<_Tp>
+[[nodiscard]]
+constexpr __detail::__synth3way_t<_Tp>
 operator<=>(const vector<_Tp, _Alloc>& __x, const vector<_Tp, _Alloc>& __y)
 {
   return std::lexicographical_compare_three_way(__x.begin(), __x.end(),

[gcc r15-3772] libstdc++: Silence -Wattributes warning in exception_ptr

2024-09-22 Thread Jonathan Wakely via Libstdc++-cvs

https://gcc.gnu.org/g:164c1b1f812da5d1e00fc10a415e80f7c508efcb

commit r15-3772-g164c1b1f812da5d1e00fc10a415e80f7c508efcb
Author: Jonathan Wakely 
Date:   Wed Sep 18 15:41:05 2024 +0100

libstdc++: Silence -Wattributes warning in exception_ptr

libstdc++-v3/ChangeLog:

* libsupc++/exception_ptr.h (__exception_ptr::_M_safe_bool_dummy):
Remove __attribute__((const)) from function returning void.

Diff:
---
 libstdc++-v3/libsupc++/exception_ptr.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libstdc++-v3/libsupc++/exception_ptr.h b/libstdc++-v3/libsupc++/exception_ptr.h
index 7c234ce0bf20..ee977a8a6eac 100644
--- a/libstdc++-v3/libsupc++/exception_ptr.h
+++ b/libstdc++-v3/libsupc++/exception_ptr.h
@@ -151,8 +151,7 @@ namespace std _GLIBCXX_VISIBILITY(default)
 
 #ifdef _GLIBCXX_EH_PTR_COMPAT
   // Retained for compatibility with CXXABI_1.3.
-  void _M_safe_bool_dummy() _GLIBCXX_USE_NOEXCEPT
-   __attribute__ ((__const__));
+  void _M_safe_bool_dummy() _GLIBCXX_USE_NOEXCEPT;
   bool operator!() const _GLIBCXX_USE_NOEXCEPT
__attribute__ ((__pure__));
   operator __safe_bool() const _GLIBCXX_USE_NOEXCEPT;

[gcc r15-3771] libstdc++: Silence -Woverloaded-virtual warning in cxx11-ios_failure.cc

2024-09-22 Thread Jonathan Wakely via Libstdc++-cvs

https://gcc.gnu.org/g:d842eb5ee6cb4d8a2795730ac88c4c2960f94bf4

commit r15-3771-gd842eb5ee6cb4d8a2795730ac88c4c2960f94bf4
Author: Jonathan Wakely 
Date:   Wed Sep 18 15:38:02 2024 +0100

libstdc++: Silence -Woverloaded-virtual warning in cxx11-ios_failure.cc

libstdc++-v3/ChangeLog:

* src/c++11/cxx11-ios_failure.cc (__iosfail_type_info): Unhide
the three-arg overload of __do_upcast.
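
For context, the general pattern behind this kind of fix looks like the sketch
below.  The class and function names are hypothetical; only the use of a
using-declaration to keep a base-class overload visible mirrors the change.

```cpp
struct Base
{
  virtual ~Base() = default;
  virtual int upcast(int a) const { return a; }
  virtual int upcast(int a, int b) const { return a + b; }
};

struct Derived : Base
{
  // Without this using-declaration, the override below would hide the
  // two-argument overload, which is what -Woverloaded-virtual reports.
  using Base::upcast;

  int upcast(int a) const override { return a * 10; }
};

// Small usage helper: both overloads are reachable through Derived.
inline int call_both(const Derived& d)
{
  return d.upcast(2) + d.upcast(2, 3);
}
```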

Diff:
---
 libstdc++-v3/src/c++11/cxx11-ios_failure.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libstdc++-v3/src/c++11/cxx11-ios_failure.cc b/libstdc++-v3/src/c++11/cxx11-ios_failure.cc
index bd3fd556bd43..70dddc82389e 100644
--- a/libstdc++-v3/src/c++11/cxx11-ios_failure.cc
+++ b/libstdc++-v3/src/c++11/cxx11-ios_failure.cc
@@ -94,6 +94,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 ~__iosfail_type_info();
 
+using __si_class_type_info::__do_upcast;
+
 bool
 __do_upcast (const __class_type_info *dst_type,
 void **obj_ptr) const override;

[gcc r15-3769] libstdc++: add default template parameters to algorithms

2024-09-22 Thread Jonathan Wakely via Libstdc++-cvs

https://gcc.gnu.org/g:dc47add79261679747302293e1a5e49ba96276b1

commit r15-3769-gdc47add79261679747302293e1a5e49ba96276b1
Author: Jonathan Wakely 
Date:   Fri Sep 13 10:18:46 2024 +0100

libstdc++: add default template parameters to algorithms

This implements P2248R8 + P3217R0, both approved for C++26.
The changes are mostly mechanical; the struggle is to keep readability
with the pre-P2248 signatures.

* For containers, "classic STL" algorithms and their parallel versions,
  introduce a macro and amend their declarations/definitions with it.
  The macro either expands to the defaulted parameter or to nothing
  in pre-C++26 modes.

* For range algorithms, we need to reorder their template parameters.
  I've done so unconditionally, because users cannot rely on template
  parameters of algorithms (this is explicitly authorized by
  [algorithms.requirements]/15). The defaults are then hidden behind
  another macro.
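
To illustrate what a defaulted value-type parameter buys the caller, here is a
minimal sketch using a hypothetical find_sketch algorithm.  It is not the
libstdc++ declaration and does not use the macros mentioned above; the default
is spelled with std::iterator_traits purely for the example.

```cpp
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical algorithm.  T defaults to the iterator's value type, so it
// does not have to be deduced from the third argument.
template<typename Iter,
         typename T = typename std::iterator_traits<Iter>::value_type>
Iter find_sketch(Iter first, Iter last, const T& value)
{
  for (; first != last; ++first)
    if (*first == value)
      return first;
  return last;
}

// A braced-init-list has no type to deduce from, so this call compiles only
// because T falls back to its default, std::pair<int, int>.
inline bool contains_three_four(const std::vector<std::pair<int, int>>& v)
{
  return find_sketch(v.begin(), v.end(), {3, 4}) != v.end();
}
```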

libstdc++-v3/ChangeLog:

* include/bits/iterator_concepts.h: Add projected_value_t.
* include/bits/algorithmfwd.h: Add the default template
parameter to the relevant forward declarations.
* include/pstl/glue_algorithm_defs.h: Likewise.
* include/bits/ranges_algo.h: Add the default template
parameter to range-based algorithms.
* include/bits/ranges_algobase.h: Likewise.
* include/bits/ranges_util.h: Likewise.
* include/bits/ranges_base.h: Add helper macros.
* include/bits/stl_iterator_base_types.h: Add helper macro.
* include/bits/version.def: Add the new feature-testing macro.
* include/bits/version.h: Regenerate.
* include/std/algorithm: Pull the feature-testing macro.
* include/std/ranges: Likewise.
* include/std/deque: Pull the feature-testing macro, add
the default for std::erase.
* include/std/forward_list: Likewise.
* include/std/list: Likewise.
* include/std/string: Likewise.
* include/std/vector: Likewise.
* testsuite/23_containers/default_template_value.cc: New test.
* testsuite/25_algorithms/default_template_value.cc: New test.

Signed-off-by: Giuseppe D'Angelo 
Co-authored-by: Jonathan Wakely 

Diff:
---
 libstdc++-v3/include/bits/algorithmfwd.h   |  48 ---
 libstdc++-v3/include/bits/iterator_concepts.h  |   7 +
 libstdc++-v3/include/bits/ranges_algo.h| 135 ++--
 libstdc++-v3/include/bits/ranges_algobase.h|  14 +-
 libstdc++-v3/include/bits/ranges_base.h|   6 +
 libstdc++-v3/include/bits/ranges_util.h|   9 +-
 .../include/bits/stl_iterator_base_types.h |   9 ++
 libstdc++-v3/include/bits/version.def  |   8 ++
 libstdc++-v3/include/bits/version.h|  10 ++
 libstdc++-v3/include/pstl/glue_algorithm_defs.h|  23 ++--
 libstdc++-v3/include/std/algorithm |   1 +
 libstdc++-v3/include/std/deque |   4 +-
 libstdc++-v3/include/std/forward_list  |   4 +-
 libstdc++-v3/include/std/list  |   4 +-
 libstdc++-v3/include/std/ranges|   1 +
 libstdc++-v3/include/std/string|   4 +-
 libstdc++-v3/include/std/vector|   4 +-
 .../23_containers/default_template_value.cc|  40 ++
 .../25_algorithms/default_template_value.cc| 142 +
 19 files changed, 392 insertions(+), 81 deletions(-)

diff --git a/libstdc++-v3/include/bits/algorithmfwd.h b/libstdc++-v3/include/bits/algorithmfwd.h
index 7f1f15970abe..df14864d210e 100644
--- a/libstdc++-v3/include/bits/algorithmfwd.h
+++ b/libstdc++-v3/include/bits/algorithmfwd.h
@@ -206,12 +206,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 any_of(_IIter, _IIter, _Predicate);
 #endif
 
-  template
+  template
 _GLIBCXX20_CONSTEXPR
 bool
 binary_search(_FIter, _FIter, const _Tp&);
 
-  template
+  template
 _GLIBCXX20_CONSTEXPR
 bool
 binary_search(_FIter, _FIter, const _Tp&, _Compare);
@@ -253,22 +254,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // count
   // count_if
 
-  template
+  template
 _GLIBCXX20_CONSTEXPR
 pair<_FIter, _FIter>
 equal_range(_FIter, _FIter, const _Tp&);
 
-  template
+  template
 _GLIBCXX20_CONSTEXPR
 pair<_FIter, _FIter>
 equal_range(_FIter, _FIter, const _Tp&, _Compare);
 
-  template
+  template
 _GLIBCXX20_CONSTEXPR
 void
 fill(_FIter, _FIter, const _Tp&);
 
-  template
+  template
 _GLIBCXX20_CONSTEXPR
 _OIter
 fill_n(_OIter, _Size, const _Tp&);
@@ -380,12 +383,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 void
 iter_swap(_FIter1, _FIter2);
 
-  template
+  template
 _GLIBCXX20_CONSTEXPR
 _FIter
 lower_bound(_FIter, _FIter, const _Tp&);
 
-  te

[gcc r15-3770] libstdc++: Reorder C++26 entries in version.def

2024-09-22 Thread Jonathan Wakely via Libstdc++-cvs

https://gcc.gnu.org/g:d024be89712c9e1c175793717dfc23e635b66254

commit r15-3770-gd024be89712c9e1c175793717dfc23e635b66254
Author: Jonathan Wakely 
Date:   Fri Sep 13 10:20:01 2024 +0100

libstdc++: Reorder C++26 entries in version.def

This puts the C++26 ftms definitions in alphabetical order.

libstdc++-v3/ChangeLog:

* include/bits/version.def: Sort C++26 entries alphabetically.
* include/bits/version.h: Regenerate.

Diff:
---
 libstdc++-v3/include/bits/version.def | 34 ++---
 libstdc++-v3/include/bits/version.h   | 40 +--
 2 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def
index c12b0de61598..f2e28175b087 100644
--- a/libstdc++-v3/include/bits/version.def
+++ b/libstdc++-v3/include/bits/version.def
@@ -1789,6 +1789,15 @@ ftms = {
   };
 };
 
+ftms = {
+  name = constexpr_new;
+  values = {
+v = 202406;
+cxxmin = 26;
+extra_cond = "__cpp_constexpr >= 202406L";
+  };
+};
+
 ftms = {
   name = fstream_native_handle;
   values = {
@@ -1798,6 +1807,14 @@ ftms = {
   };
 };
 
+ftms = {
+  name = ranges_concat;
+  values = {
+v = 202403;
+cxxmin = 26;
+  };
+};
+
 ftms = {
   name = ratio;
   values = {
@@ -1842,23 +1859,6 @@ ftms = {
   };
 };
 
-ftms = {
-  name = ranges_concat;
-  values = {
-v = 202403;
-cxxmin = 26;
-  };
-};
-
-ftms = {
-  name = constexpr_new;
-  values = {
-v = 202406;
-cxxmin = 26;
-extra_cond = "__cpp_constexpr >= 202406L";
-  };
-};
-
 // Standard test specifications.
 stds[97] = ">= 199711L";
 stds[03] = ">= 199711L";
diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h
index 4738def977fe..22526e851457 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -1978,6 +1978,16 @@
 #endif /* !defined(__cpp_lib_algorithm_default_value_type) && defined(__glibcxx_want_algorithm_default_value_type) */
 #undef __glibcxx_want_algorithm_default_value_type
 
+#if !defined(__cpp_lib_constexpr_new)
+# if (__cplusplus >  202302L) && (__cpp_constexpr >= 202406L)
+#  define __glibcxx_constexpr_new 202406L
+#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_constexpr_new)
+#   define __cpp_lib_constexpr_new 202406L
+#  endif
+# endif
+#endif /* !defined(__cpp_lib_constexpr_new) && defined(__glibcxx_want_constexpr_new) */
+#undef __glibcxx_want_constexpr_new
+
 #if !defined(__cpp_lib_fstream_native_handle)
 # if (__cplusplus >  202302L) && _GLIBCXX_HOSTED
 #  define __glibcxx_fstream_native_handle 202306L
@@ -1988,6 +1998,16 @@
 #endif /* !defined(__cpp_lib_fstream_native_handle) && defined(__glibcxx_want_fstream_native_handle) */
 #undef __glibcxx_want_fstream_native_handle
 
+#if !defined(__cpp_lib_ranges_concat)
+# if (__cplusplus >  202302L)
+#  define __glibcxx_ranges_concat 202403L
+#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_ranges_concat)
+#   define __cpp_lib_ranges_concat 202403L
+#  endif
+# endif
+#endif /* !defined(__cpp_lib_ranges_concat) && defined(__glibcxx_want_ranges_concat) */
+#undef __glibcxx_want_ranges_concat
+
 #if !defined(__cpp_lib_ratio)
 # if (__cplusplus >  202302L)
 #  define __glibcxx_ratio 202306L
@@ -2038,24 +2058,4 @@
 #endif /* !defined(__cpp_lib_to_string) && defined(__glibcxx_want_to_string) */
 #undef __glibcxx_want_to_string
 
-#if !defined(__cpp_lib_ranges_concat)
-# if (__cplusplus >  202302L)
-#  define __glibcxx_ranges_concat 202403L
-#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_ranges_concat)
-#   define __cpp_lib_ranges_concat 202403L
-#  endif
-# endif
-#endif /* !defined(__cpp_lib_ranges_concat) && defined(__glibcxx_want_ranges_concat) */
-#undef __glibcxx_want_ranges_concat
-
-#if !defined(__cpp_lib_constexpr_new)
-# if (__cplusplus >  202302L) && (__cpp_constexpr >= 202406L)
-#  define __glibcxx_constexpr_new 202406L
-#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_constexpr_new)
-#   define __cpp_lib_constexpr_new 202406L
-#  endif
-# endif
-#endif /* !defined(__cpp_lib_constexpr_new) && defined(__glibcxx_want_constexpr_new) */
-#undef __glibcxx_want_constexpr_new
-
 #undef __glibcxx_want_all

[gcc r13-9051] hppa: Add peephole2 optimizations for REG+D loads and stores

2024-09-22 Thread John David Anglin via Gcc-cvs

https://gcc.gnu.org/g:f4f2d6dd3d73321f177b2f9926a41bcd2e3a3300

commit r13-9051-gf4f2d6dd3d73321f177b2f9926a41bcd2e3a3300
Author: John David Anglin 
Date:   Wed Sep 18 11:02:32 2024 -0400

hppa: Add peephole2 optimizations for REG+D loads and stores

The PA 1.x architecture only supports long displacements in
integer loads and stores.  Floating-point loads and stores
only support short displacements.  As a result, we have to
wait until reload is complete before generating insns with
long displacements.

The PA 2.0 architecture supports long displacements in both
integer and floating-point loads and stores.

The peephole2 optimizations added in this change are only
enabled when 14-bit long displacements aren't supported for
floating-point loads and stores.
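
As a rough illustration of what a "14-bit long displacement" means, the sketch
below checks whether an offset fits in a signed 14-bit field.  The helper name
is hypothetical and this is not the base14_operand predicate used in the
patterns, which may apply further mode-dependent checks.

```cpp
// Hypothetical helper, not GCC code: true if DISP fits in a signed 14-bit
// immediate, i.e. lies in the range [-8192, 8191].
constexpr bool fits_signed_14(long disp)
{
  return disp >= -8192 && disp <= 8191;
}

static_assert(fits_signed_14(0), "zero fits");
static_assert(fits_signed_14(8191), "largest positive value");
static_assert(fits_signed_14(-8192), "most negative value");
static_assert(!fits_signed_14(8192), "one past the positive limit");
static_assert(!fits_signed_14(-8193), "one past the negative limit");
```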

2024-09-18  John David Anglin  

gcc/ChangeLog:

* config/pa/pa.h (GENERAL_REGNO_P): Define.
* config/pa/pa.md: Add SImode and SFmode peephole2
patterns to generate loads and stores with long
displacements.

Diff:
---
 gcc/config/pa/pa.h  |   3 ++
 gcc/config/pa/pa.md | 100 
 2 files changed, 103 insertions(+)

diff --git a/gcc/config/pa/pa.h b/gcc/config/pa/pa.h
index 6e29be282ad9..2e8b248c2fcc 100644
--- a/gcc/config/pa/pa.h
+++ b/gcc/config/pa/pa.h
@@ -475,6 +475,9 @@ extern rtx hppa_pic_save_rtx (void);
 #define INDEX_REG_CLASS GENERAL_REGS
 #define BASE_REG_CLASS GENERAL_REGS
 
+/* True if register is a general register.  */
+#define GENERAL_REGNO_P(N) ((N) >= 1 && (N) <= 31)
+
 #define FP_REG_CLASS_P(CLASS) \
   ((CLASS) == FP_REGS || (CLASS) == FPUPPER_REGS)
 
diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index d832a29683c2..c01719467856 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -2270,6 +2270,58 @@
(set_attr "pa_combine_type" "addmove")
(set_attr "length" "4")])
 
+; Rewrite RTL using a REG+D store.  This will allow the insn that
+; computes the address to be deleted if the register it sets is dead.
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand" "")
+   (plus:SI (match_operand:SI 1 "register_operand" "")
+(match_operand:SI 2 "const_int_operand" "")))
+   (set (mem:SI (match_dup 0))
+   (match_operand:SI 3 "register_operand" ""))]
+  "!TARGET_64BIT
+   && !INT14_OK_STRICT
+   && GENERAL_REGNO_P (REGNO (operands[0]))
+   && GENERAL_REGNO_P (REGNO (operands[3]))
+   && REGNO (operands[0]) != REGNO (operands[3])
+   && base14_operand (operands[2], E_SImode)"
+  [(set (mem:SI (plus:SI (match_dup 1) (match_dup 2))) (match_dup 3))
+   (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]
+  "")
+
+; Rewrite RTL using a REG+D load.  This will allow the insn that
+; computes the address to be deleted if the register it sets is dead.
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand" "")
+   (plus:SI (match_operand:SI 1 "register_operand" "")
+(match_operand:SI 2 "const_int_operand" "")))
+   (set (match_operand:SI 3 "register_operand" "")
+   (mem:SI (match_dup 0)))]
+  "!TARGET_64BIT
+   && !INT14_OK_STRICT
+   && GENERAL_REGNO_P (REGNO (operands[0]))
+   && GENERAL_REGNO_P (REGNO (operands[3]))
+   && REGNO (operands[0]) != REGNO (operands[3])
+   && REGNO (operands[1]) != REGNO (operands[3])
+   && base14_operand (operands[2], E_SImode)"
+  [(set (match_dup 3) (mem:SI (plus:SI (match_dup 1) (match_dup 2))))
+   (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]
+  "")
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand" "")
+   (plus:SI (match_operand:SI 1 "register_operand" "")
+(match_operand:SI 2 "const_int_operand" "")))
+   (set (match_operand:SI 3 "register_operand" "")
+   (mem:SI (match_dup 0)))]
+  "!TARGET_64BIT
+   && !INT14_OK_STRICT
+   && GENERAL_REGNO_P (REGNO (operands[0]))
+   && GENERAL_REGNO_P (REGNO (operands[3]))
+   && REGNO (operands[0]) == REGNO (operands[3])
+   && base14_operand (operands[2], E_SImode)"
+  [(set (match_dup 3) (mem:SI (plus:SI (match_dup 1) (match_dup 2))))]
+  "")
+
 ; Rewrite RTL using an indexed store.  This will allow the insn that
 ; computes the address to be deleted if the register it sets is dead.
 (define_peephole2
@@ -4497,6 +4549,54 @@
(set_attr "pa_combine_type" "addmove")
(set_attr "length" "4")])
 
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand" "")
+   (plus:SI (match_operand:SI 1 "register_operand" "")
+(match_operand:SI 2 "const_int_operand" "")))
+   (set (mem:SF (match_dup 0))
+   (match_operand:SF 3 "register_operand" ""))]
+  "!TARGET_64BIT
+   && !INT14_OK_STRICT
+   && GENERAL_REGNO_P (REGNO (operands[0]))
+   && GENERAL_REGNO_P (REGNO (operands[3]))
+   && REGNO (operands[0]) != REGNO (operands[3])
+   && base14_operand (operands[2], E_SImode)"
+  [(set (mem:SF (plus:SI (match_dup 1) (match_dup

[gcc r14-10700] hppa: Add peephole2 optimizations for REG+D loads and stores

2024-09-22 Thread John David Anglin via Gcc-cvs

https://gcc.gnu.org/g:cf4086628dd5546cc0efdd5afa102fee6b16385d

commit r14-10700-gcf4086628dd5546cc0efdd5afa102fee6b16385d
Author: John David Anglin 
Date:   Wed Sep 18 11:02:32 2024 -0400

hppa: Add peephole2 optimizations for REG+D loads and stores

The PA 1.x architecture only supports long displacements in
integer loads and stores.  Floating-point loads and stores
only support short displacements.  As a result, we have to
wait until reload is complete before generating insns with
long displacements.

The PA 2.0 architecture supports long displacements in both
integer and floating-point loads and stores.

The peephole2 optimizations added in this change are only
enabled when 14-bit long displacements aren't supported for
floating-point loads and stores.
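
The register conditions on the load peepholes can be checked with a toy
model.  The sketch below is illustrative only and is not GCC or PA-RISC
code: the machine struct, the function names and the modulo used to keep
addresses in range are all assumptions made for the example.  It runs the
original sequence (add, then load through the result) and the rewritten
sequence (REG+D load, then add) and compares the final state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy machine state: 32 registers and a small word-addressed memory.
   Hypothetical model for illustration, not taken from GCC.  */
enum { NREGS = 32, NWORDS = 64 };

struct machine
{
  uint32_t reg[NREGS];
  uint32_t mem[NWORDS];
};

/* Original sequence: r0 = r1 + d; r3 = mem[r0].  */
static void
load_before (struct machine *m, int r0, int r1, int r3, uint32_t d)
{
  m->reg[r0] = m->reg[r1] + d;
  m->reg[r3] = m->mem[m->reg[r0] % NWORDS];
}

/* Rewritten sequence: r3 = mem[r1 + d]; r0 = r1 + d.  */
static void
load_after (struct machine *m, int r0, int r1, int r3, uint32_t d)
{
  m->reg[r3] = m->mem[(m->reg[r1] + d) % NWORDS];
  m->reg[r0] = m->reg[r1] + d;
}

/* Return true if both sequences leave the machine in the same state.  */
static bool
load_rewrite_ok (int r0, int r1, int r3, uint32_t d)
{
  struct machine a, b;

  for (int i = 0; i < NREGS; i++)
    a.reg[i] = 3 * i + 1;
  for (int i = 0; i < NWORDS; i++)
    a.mem[i] = 1000 + i;
  b = a;

  load_before (&a, r0, r1, r3, d);
  load_after (&b, r0, r1, r3, d);
  return memcmp (&a, &b, sizeof a) == 0;
}
```

In this model, load_rewrite_ok (5, 6, 7, 8) is true, while
load_rewrite_ok (5, 6, 6, 8) and load_rewrite_ok (5, 6, 5, 8) are false.
The second case corresponds to operands[1] and operands[3] being the same
register, where the load clobbers the base before the add runs.  The third
corresponds to operands[0] and operands[3] being the same register, which
the patch handles with a separate pattern that emits only the load.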

2024-09-18  John David Anglin  

gcc/ChangeLog:

* config/pa/pa.h (GENERAL_REGNO_P): Define.
* config/pa/pa.md: Add SImode and SFmode peephole2
patterns to generate loads and stores with long
displacements.
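
As a rough standalone illustration of two of the checks the new patterns
rely on: GENERAL_REGNO_P below is copied from the pa.h hunk in this
commit, while fits_signed_14 is a hypothetical helper that only tests the
signed 14-bit range.  It is not GCC's base14_operand predicate, which may
apply further mode-dependent conditions.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the pa.h hunk in this commit: true if register number N
   is a general register.  */
#define GENERAL_REGNO_P(N) ((N) >= 1 && (N) <= 31)

/* Hypothetical helper (an assumption for this sketch): true if D fits
   in a signed 14-bit displacement field, i.e. -8192 <= D <= 8191.  */
static bool
fits_signed_14 (long d)
{
  return d >= -8192 && d <= 8191;
}
```

With these definitions, GENERAL_REGNO_P accepts register numbers 1 through
31 and rejects 0 and 32, and fits_signed_14 accepts 8191 and -8192 but
rejects 8192.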

Diff:
---
 gcc/config/pa/pa.h  |   3 ++
 gcc/config/pa/pa.md | 100 
 2 files changed, 103 insertions(+)

diff --git a/gcc/config/pa/pa.h b/gcc/config/pa/pa.h
index 127a0d1966d2..49c798e49338 100644
--- a/gcc/config/pa/pa.h
+++ b/gcc/config/pa/pa.h
@@ -480,6 +480,9 @@ extern rtx hppa_pic_save_rtx (void);
 #define INDEX_REG_CLASS GENERAL_REGS
 #define BASE_REG_CLASS GENERAL_REGS
 
+/* True if register is a general register.  */
+#define GENERAL_REGNO_P(N) ((N) >= 1 && (N) <= 31)
+
 #define FP_REG_CLASS_P(CLASS) \
   ((CLASS) == FP_REGS || (CLASS) == FPUPPER_REGS)
 
diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index 9e410f43052d..c03332761442 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -2280,6 +2280,58 @@
(set_attr "pa_combine_type" "addmove")
(set_attr "length" "4")])
 
+; Rewrite RTL using a REG+D store.  This will allow the insn that
+; computes the address to be deleted if the register it sets is dead.
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand" "")
+   (plus:SI (match_operand:SI 1 "register_operand" "")
+(match_operand:SI 2 "const_int_operand" "")))
+   (set (mem:SI (match_dup 0))
+   (match_operand:SI 3 "register_operand" ""))]
+  "!TARGET_64BIT
+   && !INT14_OK_STRICT
+   && GENERAL_REGNO_P (REGNO (operands[0]))
+   && GENERAL_REGNO_P (REGNO (operands[3]))
+   && REGNO (operands[0]) != REGNO (operands[3])
+   && base14_operand (operands[2], E_SImode)"
+  [(set (mem:SI (plus:SI (match_dup 1) (match_dup 2))) (match_dup 3))
+   (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]
+  "")
+
+; Rewrite RTL using a REG+D load.  This will allow the insn that
+; computes the address to be deleted if the register it sets is dead.
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand" "")
+   (plus:SI (match_operand:SI 1 "register_operand" "")
+(match_operand:SI 2 "const_int_operand" "")))
+   (set (match_operand:SI 3 "register_operand" "")
+   (mem:SI (match_dup 0)))]
+  "!TARGET_64BIT
+   && !INT14_OK_STRICT
+   && GENERAL_REGNO_P (REGNO (operands[0]))
+   && GENERAL_REGNO_P (REGNO (operands[3]))
+   && REGNO (operands[0]) != REGNO (operands[3])
+   && REGNO (operands[1]) != REGNO (operands[3])
+   && base14_operand (operands[2], E_SImode)"
+  [(set (match_dup 3) (mem:SI (plus:SI (match_dup 1) (match_dup 2))))
+   (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]
+  "")
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand" "")
+   (plus:SI (match_operand:SI 1 "register_operand" "")
+(match_operand:SI 2 "const_int_operand" "")))
+   (set (match_operand:SI 3 "register_operand" "")
+   (mem:SI (match_dup 0)))]
+  "!TARGET_64BIT
+   && !INT14_OK_STRICT
+   && GENERAL_REGNO_P (REGNO (operands[0]))
+   && GENERAL_REGNO_P (REGNO (operands[3]))
+   && REGNO (operands[0]) == REGNO (operands[3])
+   && base14_operand (operands[2], E_SImode)"
+  [(set (match_dup 3) (mem:SI (plus:SI (match_dup 1) (match_dup 2))))]
+  "")
+
 ; Rewrite RTL using an indexed store.  This will allow the insn that
 ; computes the address to be deleted if the register it sets is dead.
 (define_peephole2
@@ -4507,6 +4559,54 @@
(set_attr "pa_combine_type" "addmove")
(set_attr "length" "4")])
 
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand" "")
+   (plus:SI (match_operand:SI 1 "register_operand" "")
+(match_operand:SI 2 "const_int_operand" "")))
+   (set (mem:SF (match_dup 0))
+   (match_operand:SF 3 "register_operand" ""))]
+  "!TARGET_64BIT
+   && !INT14_OK_STRICT
+   && GENERAL_REGNO_P (REGNO (operands[0]))
+   && GENERAL_REGNO_P (REGNO (operands[3]))
+   && REGNO (operands[0]) != REGNO (operands[3])
+   && base14_operand (operands[2], E_SImode)"
+  [(set (mem:SF (plus:SI (match_dup 1) (match_dup

[gcc r15-3777] libgcc, Darwin: From macOS 11, make that the earliest supported.

[gcc r15-3778] testsuite, coroutines: Add tests for non-supension ramp returns.

[gcc r15-3780] RISC-V: Add testcases for form 4 of signed scalar SAT_ADD

[gcc r15-3779] RISC-V: Add testcases for form 3 of signed scalar SAT_ADD

[gcc r15-3783] RISC-V: Add testcases for form 2 of signed vector SAT_ADD

[gcc r15-3784] Match: Support form 2 for vector signed integer .SAT_ADD

[gcc r15-3782] testsuite/gfortran.dg/unsigned_22.f90: Add missing close with delete, PR116701

[gcc r14-10703] Darwin: Allow for as versions that need '-' for std in.

[gcc r14-10702] Darwin: Recognise -weak_framework in the driver [PR116237].

[gcc r14-10704] libgcc, Darwin: From macOS 11, make that the earliest supported.

[gcc r15-3767] aarch64: Take into account when VF is higher than known scalar iters

[gcc r15-3768] middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

[gcc r15-3776] libstdc++: Disable std::formatter specialization

[gcc r15-3774] libstdc++: Fix formatting of most negative chrono::duration [PR116755]

[gcc r15-3775] libstdc++: Fix condition for ranges::copy to use memmove [PR116754]

[gcc r15-3773] libstdc++: Use constexpr instead of _GLIBCXX20_CONSTEXPR in

[gcc r15-3772] libstdc++: Silence -Wattributes warning in exception_ptr

[gcc r15-3771] libstdc++: Silence -Woverloaded-virtual warning in cxx11-ios_failure.cc

[gcc r15-3769] libstdc++: add default template parameters to algorithms

[gcc r15-3770] libstdc++: Reorder C++26 entries in version.def

[gcc r13-9051] hppa: Add peephole2 optimizations for REG+D loads and stores

[gcc r14-10700] hppa: Add peephole2 optimizations for REG+D loads and stores
