[gcc r14-9827] Darwin: Sync coverage specs with gcc/gcc.cc.

2024-04-08 Thread Iain D Sandoe via Gcc-cvs
https://gcc.gnu.org/g:39cb6b880f723780faeef06383e67cfed2e3458d

commit r14-9827-g39cb6b880f723780faeef06383e67cfed2e3458d
Author: Iain Sandoe 
Date:   Sun Apr 7 19:25:33 2024 +0100

Darwin: Sync coverage specs with gcc/gcc.cc.

The specs for coverage ere out of date leading to test fails for
fcondition-coverage cases. Fixed by updating to match the specs
in gcc/gcc.cc.

gcc/ChangeLog:

* config/darwin.h (LINK_COMMAND_SPEC_A): Update coverage
specs.

Signed-off-by: Iain Sandoe 

Diff:
---
 gcc/config/darwin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 31019a0c49d..c09b9e9dc94 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -406,7 +406,7 @@ extern GTY(()) int darwin_ms_struct;
 %{!r:%{!nostdlib:%{!nodefaultlibs: " DARWIN_WEAK_CRTS "}}} \
 %o \
 %{!r:%{!nostdlib:%{!nodefaultlibs:\
-  %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \
+  %{fprofile-arcs|fcondition-coverage|fprofile-generate*|coverage:-lgcov} \
   %{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*:%*} 1): \
%{static|static-libgcc|static-libstdc++|static-libgfortran: \
  libgomp.a%s; : -lgomp }} \


[gcc r14-9828] RISC-V: Refine the error msg for RVV intrinisc required ext

2024-04-08 Thread Pan Li via Gcc-cvs
https://gcc.gnu.org/g:7d051f7d45789e1442d26c07bfc5e7fb77433b87

commit r14-9828-g7d051f7d45789e1442d26c07bfc5e7fb77433b87
Author: Pan Li 
Date:   Mon Apr 8 12:33:05 2024 +0800

RISC-V: Refine the error msg for RVV intrinisc required ext

The RVV intrinisc API has sorts of required extension from both
the march or target attribute.  It will have error message similar
to below:

built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension

However, it is not accurate as we have many additional sub extenstion
besides v extension.  For example, zvbb, zvbk, zvbc ... etc.  This patch
would like to refine the error message with a friendly hint for the
required extension.  For example as below:

vuint64m1_t
__attribute__((target("arch=+v")))
test_1 (vuint64m1_t op_1, vuint64m1_t op_2, size_t vl)
{
  return __riscv_vclmul_vv_u64m1 (op_1, op_2, vl);
}

When compile with march=rv64gc and target arch=+v, we will have error
message as below:

error: built-in function '__riscv_vclmul_vv_u64m1(op_1,  op_2,  vl)'
  requires the 'zvbc' ISA extension

Then the end-user will get the point that the *zvbc* extension is missing
for the intrinisc API easily.

The below tests are passed for this patch.
* The riscv fully regression tests.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-shapes.cc (build_one): Pass
required_ext arg when invoke add function.
(build_th_loadstore): Ditto.
(struct vcreate_def): Ditto.
(struct read_vl_def): Ditto.
(struct vlenb_def): Ditto.
* config/riscv/riscv-vector-builtins.cc 
(function_builder::add_function):
Introduce new arg required_ext to fill in the register func.
(function_builder::add_unique_function): Ditto.
(function_builder::add_overloaded_function): Ditto.
(expand_builtin): Leverage required_extensions_specified to
check if the required extension is provided.
* config/riscv/riscv-vector-builtins.h (reqired_ext_to_isa_name): 
New
func impl to convert the required_ext enum to the extension name.
(required_extensions_specified): New func impl to predicate if
the required extension is well feeded.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: 
Adjust
the error message for v extension.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: 
Ditto.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-1.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-10.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-2.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-3.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-4.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-5.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-6.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-7.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-8.c: New test.
* gcc.target/riscv/rvv/base/intrinsic_required_ext-9.c: New test.

Signed-off-by: Pan Li 

Diff:
---
 gcc/config/riscv/riscv-vector-builtins-shapes.cc   | 18 --
 gcc/config/riscv/riscv-vector-builtins.cc  | 23 +--
 gcc/config/riscv/riscv-vector-builtins.h   | 75 +-
 .../riscv/rvv/base/intrinsic_required_ext-1.c  | 10 +++
 .../riscv/rvv/base/intrinsic_required_ext-10.c | 11 
 .../riscv/rvv/base/intrinsic_required_ext-2.c  | 11 
 .../riscv/rvv/base/intrinsic_required_ext-3.c  | 11 
 .../riscv/rvv/base/intrinsic_required_ext-4.c  | 11 
 .../riscv/rvv/base/intrinsic_required_ext-5.c  | 11 
 .../riscv/rvv/base/intrinsic_required_ext-6.c  | 11 
 .../riscv/rvv/base/intrinsic_required_ext-7.c  | 11 
 .../riscv/rvv/base/intrinsic_required_ext-8.c  | 11 
 .../riscv/rvv/base/intrinsic_required_ext-9.c  | 11 
 .../rvv/base/target_attribute_v_with_intrinsic-7.c |  2 +-
 .../rvv/base/target_attribute_v_with_intrinsic-8.c |  2 +-
 15 files changed, 210 insertions(+), 19 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index c5ffcc1f2c4..7f983e82370 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -72,9 +72,10 @@ build_one (function_builder &b, const function_group_info 
&group,
   if (TARGET_XTHEADVECTOR && !check_type (return_type, argument_types))
 return;
 
-  b.add_overloaded_function (function_instance, *group.shape);
+  b.add_overload

[gcc r14-9829] tree-optimization/114624 - fix use-after-free in SCCP

2024-04-08 Thread Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:97d5cd8740384dbce5a83080916388f80d8976dd

commit r14-9829-g97d5cd8740384dbce5a83080916388f80d8976dd
Author: Richard Biener 
Date:   Mon Apr 8 10:38:49 2024 +0200

tree-optimization/114624 - fix use-after-free in SCCP

We're inspecting the replaced PHI node after releasing it.

PR tree-optimization/114624
* tree-scalar-evolution.cc (final_value_replacement_loop):
Get at the PHI arg location before releasing the PHI node.

* gcc.dg/torture/pr114624.c: New testcase.

Diff:
---
 gcc/testsuite/gcc.dg/torture/pr114624.c | 20 
 gcc/tree-scalar-evolution.cc|  4 ++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr114624.c 
b/gcc/testsuite/gcc.dg/torture/pr114624.c
new file mode 100644
index 000..ae031356982
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr114624.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+int a, b;
+int main() {
+  int c, d = 1;
+  while (a) {
+while (b)
+  if (d)
+while (a)
+  ;
+for (; b < 2; b++)
+  if (b)
+for (c = 0; c < 8; c++)
+  d = 0;
+  else
+for (a = 0; a < 2; a++)
+  ;
+  }
+  return 0;
+}
diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index 25e3130e2f1..b0a5e09a77c 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -3877,6 +3877,7 @@ final_value_replacement_loop (class loop *loop)
 to a GIMPLE sequence or to a statement list (keeping this a
 GENERIC interface).  */
   def = unshare_expr (def);
+  auto loc = gimple_phi_arg_location (phi, exit->dest_idx);
   remove_phi_node (&psi, false);
 
   /* Propagate constants immediately, but leave an unused initialization
@@ -3888,8 +3889,7 @@ final_value_replacement_loop (class loop *loop)
   gimple_seq stmts;
   def = force_gimple_operand (def, &stmts, false, NULL_TREE);
   gassign *ass = gimple_build_assign (rslt, def);
-  gimple_set_location (ass,
-  gimple_phi_arg_location (phi, exit->dest_idx));
+  gimple_set_location (ass, loc);
   gimple_seq_add_stmt (&stmts, ass);
 
   /* If def's type has undefined overflow and there were folded


[gcc r14-9830] contrib: Add 8057f9aa1f7e70490064de796d7a8d42d446caf8 to ignored commits.

2024-04-08 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:b93836d5ca3959f456df8bc47284741780475e03

commit r14-9830-gb93836d5ca3959f456df8bc47284741780475e03
Author: Jakub Jelinek 
Date:   Mon Apr 8 14:12:00 2024 +0200

contrib: Add 8057f9aa1f7e70490064de796d7a8d42d446caf8 to ignored commits.

This commit unfortunately added explanation to the git revert generated
message, breaking ChangeLog generation.

2024-04-08  Jakub Jelinek  

* gcc-changelog/git_update_version.py: Add
8057f9aa1f7e70490064de796d7a8d42d446caf8 to IGNORED_COMMITS.

Diff:
---
 contrib/gcc-changelog/git_update_version.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_update_version.py 
b/contrib/gcc-changelog/git_update_version.py
index 92639ba626f..3e79dab0096 100755
--- a/contrib/gcc-changelog/git_update_version.py
+++ b/contrib/gcc-changelog/git_update_version.py
@@ -38,7 +38,8 @@ IGNORED_COMMITS = (
 '86d8e0c0652ef5236a460b75c25e4f7093cc0651',
 'e4cba49413ca429dc82f6aa2e88129ecb3fdd943',
 '1957bedf29a1b2cc231972aba680fe80199d5498',
-'040e5b0edbca861196d9e2ea2af5e805769c8d5d')
+'040e5b0edbca861196d9e2ea2af5e805769c8d5d',
+'8057f9aa1f7e70490064de796d7a8d42d446caf8')
 
 FORMAT = '%(asctime)s:%(levelname)s:%(name)s:%(message)s'
 logging.basicConfig(level=logging.INFO, format=FORMAT,


[gcc r14-9832] ChangeLog: Add by hand ChangeLog entry for PR114361 revert.

2024-04-08 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:080cac15ce0c3e6b396b9161055cb76974882c07

commit r14-9832-g080cac15ce0c3e6b396b9161055cb76974882c07
Author: Jakub Jelinek 
Date:   Mon Apr 8 14:46:30 2024 +0200

ChangeLog: Add by hand ChangeLog entry for PR114361 revert.

This commit has been added to IGNORED_COMMITS, because it contained
bogus explanation of the standardized git revert message.

Diff:
---
 gcc/c/ChangeLog |  9 +
 gcc/testsuite/ChangeLog | 10 ++
 2 files changed, 19 insertions(+)

diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog
index 3d731f148b2..7d3f2362562 100644
--- a/gcc/c/ChangeLog
+++ b/gcc/c/ChangeLog
@@ -1,3 +1,12 @@
+2024-04-05  Martin Uecker  
+
+   Revert:
+   2024-04-02  Martin Uecker  
+
+   PR c/114361
+   * c-decl.cc (finish_struct): Set TYPE_CANONICAL when completing
+   strucute types.
+
 2024-04-02  Martin Uecker  
 
PR c/114361
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9ae2f94d2e4..07d9248c842 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -138,6 +138,16 @@
PR tree-optimization/114566
* gcc.target/i386/avx512f-pr114566.c: New test.
 
+2024-04-05  Martin Uecker  
+
+   Revert:
+   2024-04-02  Martin Uecker  
+
+   PR c/114361
+   * gcc.dg/pr114361.c: New test.
+   * gcc.dg/c23-tag-incomplete-1.c: New test.
+   * gcc.dg/c23-tag-incomplete-2.c: New test.
+
 2024-04-05  Jakub Jelinek  
 
* gdc.dg/dg.exp: Prune gcov*.d from the list of tests to run.


[gcc r14-9833] aarch64: Fix vld1/st1_x4 intrinsic test

2024-04-08 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:278cad85077509b73b1faf32d36f3889c2a5524b

commit r14-9833-g278cad85077509b73b1faf32d36f3889c2a5524b
Author: Swinney, Jonathan 
Date:   Mon Apr 8 14:02:33 2024 +0100

aarch64: Fix vld1/st1_x4 intrinsic test

The test for this intrinsic was failing silently and so it failed to
report the bug reported in 114521. This patch modifes the test to
report the result.

Bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521

Signed-off-by: Jonathan Swinney 

gcc/testsuite/
* gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Exit with a 
nonzero
code if the test fails.

Diff:
---
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
index 89b289bb21d..17db262a31a 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
@@ -3,6 +3,7 @@
 /* { dg-skip-if "unimplemented" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
+#include 
 #include 
 #include "arm-neon-ref.h"
 
@@ -71,13 +72,16 @@ VARIANT (float64, 2, q_f64)
 VARIANTS (TESTMETH)
 
 #define CHECKS(BASE, ELTS, SUFFIX) \
-  if (test_vld1##SUFFIX##_x4 () != 0)  \
-fprintf (stderr, "test_vld1##SUFFIX##_x4");
+  if (test_vld1##SUFFIX##_x4 () != 0) {\
+fprintf (stderr, "test_vld1" #SUFFIX "_x4 failed\n"); \
+failed = true; \
+  }
 
 int
 main (int argc, char **argv)
 {
+  bool failed = false;
   VARIANTS (CHECKS)
 
-  return 0;
+  return (failed) ? 1 : 0;
 }


[gcc r14-9834] s390: Fix s390_const_int_pool_entry_p and movdi peephole2 [PR114605]

2024-04-08 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:d5d84487dec06186fd9246b505f44ef68a66d6a2

commit r14-9834-gd5d84487dec06186fd9246b505f44ef68a66d6a2
Author: Jakub Jelinek 
Date:   Mon Apr 8 16:22:13 2024 +0200

s390: Fix s390_const_int_pool_entry_p and movdi peephole2 [PR114605]

The following testcase is miscompiled, because we have initially
a movti which loads the 0x3f803f80ULL TImode constant
from constant pool.  Later on we split it into a pair of DImode
loads.  Now, for the first load (why just that?, though not stage4
material) we trigger the peephole2 which uses s390_const_int_pool_entry_p.
That function doesn't check at all the constant pool mode though, sees
the constant pool at that address has a CONST_INT value and just assumes
that is the value to return, which is especially wrong for big-endian,
if it is a DImode load from offset 0, it should be loading 0 rather than
0x3f803f80ULL.
The following patch adds checks if we are extracing a MODE_INT mode,
if the constant pool has MODE_INT mode as well, punts if constant pool
has smaller mode size than the extraction one (then it would be UB),
if it has the same mode as before keeps using what it did before,
if constant pool has a larger mode than the one being extracted, uses
simplify_subreg.  I'd have used avoid_constant_pool_reference
instead which can handle also offsets into the constant pool constants,
but it can't handle UNSPEC_LTREF.

Another thing is that once that is fixed, we ICE when we extract constant
like 0, ior insn predicate require non-0 constant.  So, the patch also
fixes the peephole2 so that if either 32-bit half is zero, it uses a mere
load of the constant into register rather than a pair of such load and ior.

2024-04-08  Jakub Jelinek  

PR target/114605
* config/s390/s390.cc (s390_const_int_pool_entry_p): Punt
if mem doesn't have MODE_INT mode, or pool constant doesn't
have MODE_INT mode, or if pool constant mode is smaller than
mem mode.  If mem mode is different from pool constant mode,
try to simplify subreg.  If that doesn't work, punt, if it
does, use the simplified constant instead of the constant pool
constant.
* config/s390/s390.md (movdi from const pool peephole): If
either low or high 32-bit part is zero, just emit move insn
instead of move + ior.

* gcc.dg/pr114605.c: New test.

Diff:
---
 gcc/config/s390/s390.cc | 14 --
 gcc/config/s390/s390.md | 10 ++
 gcc/testsuite/gcc.dg/pr114605.c | 37 +
 3 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 372a2324403..3ee7ae7b8d4 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -9984,7 +9984,7 @@ s390_const_int_pool_entry_p (rtx mem, HOST_WIDE_INT *val)
  - (mem (unspec [(symbol_ref) (reg)] UNSPEC_LTREF)).
  - (mem (symbol_ref)).  */
 
-  if (!MEM_P (mem))
+  if (!MEM_P (mem) || GET_MODE_CLASS (GET_MODE (mem)) != MODE_INT)
 return false;
 
   rtx addr = XEXP (mem, 0);
@@ -9998,9 +9998,19 @@ s390_const_int_pool_entry_p (rtx mem, HOST_WIDE_INT *val)
 return false;
 
   rtx val_rtx = get_pool_constant (sym);
-  if (!CONST_INT_P (val_rtx))
+  machine_mode mode = get_pool_mode (sym);
+  if (!CONST_INT_P (val_rtx)
+  || GET_MODE_CLASS (mode) != MODE_INT
+  || GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (mem)))
 return false;
 
+  if (mode != GET_MODE (mem))
+{
+  val_rtx = simplify_subreg (GET_MODE (mem), val_rtx, mode, 0);
+  if (val_rtx == NULL_RTX || !CONST_INT_P (val_rtx))
+   return false;
+}
+
   if (val != nullptr)
 *val = INTVAL (val_rtx);
   return true;
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 8aa40ba5b7f..c607dce3cf0 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -2152,6 +2152,16 @@
   gcc_assert (ok);
   operands[2] = GEN_INT (val & 0xULL);
   operands[3] = GEN_INT (val & 0xULL);
+  if (operands[2] == const0_rtx)
+{
+  emit_move_insn (operands[0], operands[3]);
+  DONE;
+}
+  else if (operands[3] == const0_rtx)
+{
+  emit_move_insn (operands[0], operands[2]);
+  DONE;
+}
 })
 
 ;
diff --git a/gcc/testsuite/gcc.dg/pr114605.c b/gcc/testsuite/gcc.dg/pr114605.c
new file mode 100644
index 000..4b5c9c1c0a0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr114605.c
@@ -0,0 +1,37 @@
+/* PR target/114605 */
+/* { dg-do run } */
+/* { dg-options "-O0" } */
+
+typedef struct { const float *a; int b, c; float *d; } S;
+
+__attribute__((noipa)) void
+bar (void)
+{
+}
+
+__attribute__((noinline, optimize (2))) static void
+foo (S *e)
+{
+  const float *f;
+  float *g;
+  float h[4] = { 0.0, 0.0, 1.0, 1.0 };
+  if

[gcc r14-9835] RISC-V: Implement TLS Descriptors.

2024-04-08 Thread Kito Cheng via Gcc-cvs
https://gcc.gnu.org/g:97069657c4e40b209c7b774e12faaca13812a86c

commit r14-9835-g97069657c4e40b209c7b774e12faaca13812a86c
Author: Tatsuyuki Ishi 
Date:   Fri Mar 29 14:52:39 2024 +0900

RISC-V: Implement TLS Descriptors.

This implements TLS Descriptors (TLSDESC) as specified in [1].

The 4-instruction sequence is implemented as a single RTX insn for
simplicity, but this can be revisited later if instruction scheduling or
more flexible RA is desired.

The default remains to be the traditional TLS model, but can be configured
with --with-tls={trad,desc}. The choice can be revisited once toolchain
and libc support ships.

[1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373.

gcc/ChangeLog:

* config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
* config.gcc: Add --with-tls configuration option to change the
default TLS flavor.
* config/riscv/riscv.h: Add TARGET_TLSDESC determined from
-mtls-dialect and with_tls defaults.
* config/riscv/riscv-opts.h: Define enum riscv_tls_type for the
two TLS flavors.
* config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
* config/riscv/riscv.md: Add instruction sequence for TLSDESC.
* config/riscv/riscv.cc (riscv_symbol_insns): Add instruction
sequence length data for TLSDESC.
(riscv_legitimize_tls_address): Add lowering of TLSDESC.
* doc/install.texi: Document --with-tls for RISC-V.
* doc/invoke.texi: Document -mtls-dialect for RISC-V.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/tls_1.x: Add TLSDESC GD test case.
* gcc.target/riscv/tlsdesc.c: Same as above.

Diff:
---
 gcc/config.gcc   | 15 ++-
 gcc/config/riscv/riscv-opts.h|  6 ++
 gcc/config/riscv/riscv-protos.h  |  5 +++--
 gcc/config/riscv/riscv.cc| 24 
 gcc/config/riscv/riscv.h |  9 +++--
 gcc/config/riscv/riscv.md| 20 +++-
 gcc/config/riscv/riscv.opt   | 14 ++
 gcc/doc/install.texi |  3 +++
 gcc/doc/invoke.texi  | 13 -
 gcc/testsuite/gcc.target/riscv/tls_1.x   |  5 +
 gcc/testsuite/gcc.target/riscv/tlsdesc.c | 12 
 11 files changed, 115 insertions(+), 11 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index e62f98e93f4..2e320dd26c5 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2492,6 +2492,7 @@ riscv*-*-linux*)
# Force .init_array support.  The configure script cannot always
# automatically detect that GAS supports it, yet we require it.
gcc_cv_initfini_array=yes
+   with_tls=${with_tls:-trad}
;;
 riscv*-*-elf* | riscv*-*-rtems*)
tm_file="elfos.h newlib-stdint.h ${tm_file} riscv/elf.h"
@@ -2534,6 +2535,7 @@ riscv*-*-freebsd*)
# Force .init_array support.  The configure script cannot always
# automatically detect that GAS supports it, yet we require it.
gcc_cv_initfini_array=yes
+   with_tls=${with_tls:-trad}
;;
 
 loongarch*-*-linux*)
@@ -4671,7 +4673,7 @@ case "${target}" in
;;
 
riscv*-*-*)
-   supported_defaults="abi arch tune riscv_attribute isa_spec"
+   supported_defaults="abi arch tune riscv_attribute isa_spec tls"
 
case "${target}" in
riscv-* | riscv32*) xlen=32 ;;
@@ -4801,6 +4803,17 @@ case "${target}" in
;;
esac
fi
+   # Handle --with-tls.
+   case "$with_tls" in
+   "" \
+   | trad | desc)
+   # OK
+   ;;
+   *)
+   echo "Unknown TLS method used in --with-tls=$with_tls" 
1>&2
+   exit 1
+   ;;
+   esac
 
# Handle --with-multilib-list.
if test "x${with_multilib_list}" != xdefault; then
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 392b9169240..1b2dd5757a8 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -154,4 +154,10 @@ enum rvv_vector_bits_enum {
 #define TARGET_MAX_LMUL
\
   (int) (rvv_max_lmul == RVV_DYNAMIC ? RVV_M8 : rvv_max_lmul)
 
+/* TLS types.  */
+enum riscv_tls_type {
+  TLS_TRADITIONAL,
+  TLS_DESCRIPTORS
+};
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 4677d9c46cd..5d46a29d8b7 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -34,9 +34,10 @@ enum riscv_symbol_type {
   SYMB

[gcc r13-8594] ipa: Self-DCE of uses of removed call LHSs (PR 108007)

2024-04-08 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:40ddc0b05a47f999b24f20c1becb79004995731b

commit r13-8594-g40ddc0b05a47f999b24f20c1becb79004995731b
Author: Martin Jambor 
Date:   Mon Apr 8 17:34:33 2024 +0200

ipa: Self-DCE of uses of removed call LHSs (PR 108007)

PR 108007 is another manifestation where we rely on DCE to clean-up
after IPA-SRA and if the user explicitely switches DCE off, IPA-SRA
can leave behind statements which are fed uninitialized values and
trap, even though their results are themselves never used.

I have already fixed this for unused parameters in callees, this bug
shows that almost the same thing can happen for removed returns, on
the side of callers.  This means that the issue has to be fixed
elsewhere, in call redirection.  This patch adds a function which
looks for (and through, using a work-list) uses of operations fed
specific SSA names and removes them all.

That would have been easy if it wasn't for debug statements during
tree-inline (from which call redirection is also invoked).  Debug
statements are decoupled from the rest at this point and iterating
over uses of SSAs does not bring them up.  During tree-inline they are
handled especially at the end, I assume in order to make sure that
relative ordering of UIDs are the same with and without debug info.

This means that during tree-inline we need to make a hash of killed
SSAs, that we already have in copy_body_data, available to the
function making the purging.  So the patch duly does also that, making
the interface slightly ugly.  Moreover, all newly unused SSA names
need to be freed and as PR 112616 showed, it must be done in a defined
order, which is what newly added ipa_release_ssas_in_hash does.

This backport to gcc-13 also contains
54e505d0446f86b7ad383acbb8e5501f20872b64 in order not to reintroduce
PR 113757.

gcc/ChangeLog:

2024-04-05  Martin Jambor  

PR ipa/108007
PR ipa/112616
* cgraph.h (cgraph_edge): Add a parameter to
redirect_call_stmt_to_callee.
* ipa-param-manipulation.h (ipa_param_adjustments): Add a
parameter to modify_call.
(ipa_release_ssas_in_hash): Declare.
* cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): New
parameter killed_ssas, pass it to padjs->modify_call.
* ipa-param-manipulation.cc (purge_all_uses): New function.
(ipa_param_adjustments::modify_call): New parameter killed_ssas.
Instead of substituting uses, invoke purge_all_uses.  If
hash of killed SSAs has not been provided, create a temporary one
and release SSAs that have been added to it.
(compare_ssa_versions): New function.
(ipa_release_ssas_in_hash): Likewise.
* tree-inline.cc (redirect_all_calls): Create
id->killed_new_ssa_names earlier, pass it to edge redirection,
adjust a comment.
(copy_body): Release SSAs in id->killed_new_ssa_names.

gcc/testsuite/ChangeLog:

2024-01-15  Martin Jambor  

PR ipa/108007
PR ipa/112616
* gcc.dg/ipa/pr108007.c: New test.
* gcc.dg/ipa/pr112616.c: Likewise.

(cherry picked from commit a9a8426e534760b8d3a250e9bd3cff4db131a2be)

Diff:
---
 gcc/cgraph.cc   |  10 +++-
 gcc/cgraph.h|   9 ++-
 gcc/ipa-param-manipulation.cc   | 112 +---
 gcc/ipa-param-manipulation.h|   5 +-
 gcc/testsuite/g++.dg/ipa/pr113757.C |  14 +
 gcc/testsuite/gcc.dg/ipa/pr108007.c |  32 +++
 gcc/testsuite/gcc.dg/ipa/pr112616.c |  28 +
 gcc/tree-inline.cc  |  27 -
 8 files changed, 193 insertions(+), 44 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index ec663d23385..7a14c00b60a 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -1403,11 +1403,17 @@ cgraph_edge::redirect_callee (cgraph_node *n)
speculative indirect call, remove "speculative" of the indirect call and
also redirect stmt to it's final direct target.
 
+   When called from within tree-inline, KILLED_SSAs has to contain the pointer
+   to killed_new_ssa_names within the copy_body_data structure and SSAs
+   discovered to be useless (if LHS is removed) will be added to it, otherwise
+   it needs to be NULL.
+
It is up to caller to iteratively transform each "speculative"
direct call as appropriate.  */
 
 gimple *
-cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e)
+cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e,
+  hash_set  *killed_ssas)
 {
   tree decl = gimple_call_fndecl (e->call_stmt);
   gcall *new_stmt;
@@ -1527,7 +1533,7 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e)
remove_stmt_from_eh_lp (e->ca

[gcc r14-9836] aarch64: Fix expansion of svsudot [PR114607]

2024-04-08 Thread Richard Sandiford via Gcc-cvs
https://gcc.gnu.org/g:2c1c2485a4b1aca746ac693041e51ea6da5c64ca

commit r14-9836-g2c1c2485a4b1aca746ac693041e51ea6da5c64ca
Author: Richard Sandiford 
Date:   Mon Apr 8 16:53:32 2024 +0100

aarch64: Fix expansion of svsudot [PR114607]

Not sure how this happend, but: svsudot is supposed to be expanded
as USDOT with the operands swapped.  However, a thinko in the
expansion of svsudot meant that the arguments weren't in fact
swapped; the attempted swap was just a no-op.  And the testcases
blithely accepted that.

gcc/
PR target/114607
* config/aarch64/aarch64-sve-builtins-base.cc
(svusdot_impl::expand): Fix botched attempt to swap the operands
for svsudot.

gcc/testsuite/
PR target/114607
* gcc.target/aarch64/sve/acle/asm/sudot_s32.c: New test.

Diff:
---
 gcc/config/aarch64/aarch64-sve-builtins-base.cc   | 2 +-
 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c | 8 
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 5be2315a3c6..0d2edf3f19e 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -2809,7 +2809,7 @@ public:
version) is through the USDOT instruction but with the second and third
inputs swapped.  */
 if (m_su)
-  e.rotate_inputs_left (1, 2);
+  e.rotate_inputs_left (1, 3);
 /* The ACLE function has the same order requirements as for svdot.
While there's no requirement for the RTL pattern to have the same sort
of order as that for dot_prod, it's easier to read.
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c
index 4b452619eee..e06b69affab 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sudot_s32.c
@@ -6,7 +6,7 @@
 
 /*
 ** sudot_s32_tied1:
-** usdot   z0\.s, z2\.b, z4\.b
+** usdot   z0\.s, z4\.b, z2\.b
 ** ret
 */
 TEST_TRIPLE_Z (sudot_s32_tied1, svint32_t, svint8_t, svuint8_t,
@@ -17,7 +17,7 @@ TEST_TRIPLE_Z (sudot_s32_tied1, svint32_t, svint8_t, 
svuint8_t,
 ** sudot_s32_tied2:
 ** mov (z[0-9]+)\.d, z0\.d
 ** movprfx z0, z4
-** usdot   z0\.s, z2\.b, \1\.b
+** usdot   z0\.s, \1\.b, z2\.b
 ** ret
 */
 TEST_TRIPLE_Z_REV (sudot_s32_tied2, svint32_t, svint8_t, svuint8_t,
@@ -27,7 +27,7 @@ TEST_TRIPLE_Z_REV (sudot_s32_tied2, svint32_t, svint8_t, 
svuint8_t,
 /*
 ** sudot_w0_s32_tied:
 ** mov (z[0-9]+\.b), w0
-** usdot   z0\.s, z2\.b, \1
+** usdot   z0\.s, \1, z2\.b
 ** ret
 */
 TEST_TRIPLE_ZX (sudot_w0_s32_tied, svint32_t, svint8_t, uint8_t,
@@ -37,7 +37,7 @@ TEST_TRIPLE_ZX (sudot_w0_s32_tied, svint32_t, svint8_t, 
uint8_t,
 /*
 ** sudot_9_s32_tied:
 ** mov (z[0-9]+\.b), #9
-** usdot   z0\.s, z2\.b, \1
+** usdot   z0\.s, \1, z2\.b
 ** ret
 */
 TEST_TRIPLE_Z (sudot_9_s32_tied, svint32_t, svint8_t, uint8_t,


[gcc(refs/users/meissner/heads/work163-test)] Add power10 wide immediate fusion

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:079fec07cd162dcbabad089c5d5298143a3669f9

commit 079fec07cd162dcbabad089c5d5298143a3669f9
Author: Michael Meissner 
Date:   Mon Apr 8 12:35:00 2024 -0400

Add power10 wide immediate fusion

2024-04-08  Michael Meissner  

gcc/

PR target/108018
* config/rs6000/rs6000.md (tocref_p10_fusion): New insn.
(tocref): Don't allow tocrev if tocref_p10_fusion
would be used.

Diff:
---
 gcc/config/rs6000/rs6000.md | 43 ++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index abc809448ad..1acd34ca0ee 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11302,13 +11302,54 @@
"TARGET_XCOFF && TARGET_CMODEL != CMODEL_SMALL"
"la %0,%2@l(%1)")
 
+;; For power10 fusion, don't split the two pieces of the TOCREF, but keep them
+;; together so the instructions will fuse in case the user is using power10
+;; fusion via -mtune=power10 but is not enabling the PC-relative support.  If
+;; we split the instructions up, the scheduler will likely split the ADDIS and
+;; ADDI instruction.
+
+(define_insn "*tocref_p10_fusion"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=b")
+   (match_operand:P 1 "small_toc_ref" "R"))]
+   "TARGET_TOC && TARGET_P10_FUSION && TARGET_CMODEL != CMODEL_SMALL
+&& (TARGET_ELF || TARGET_XCOFF)
+&& legitimate_constant_pool_address_p (operands[1], QImode, false)"
+{
+  rtx op = operands[1];
+  machine_mode mode = mode;
+  rtx offset = const0_rtx;
+
+  if (GET_CODE (op) == PLUS && add_cint_operand (XEXP (op, 1), mode))
+{
+  offset = XEXP (op, 1);
+  op = XEXP (op, 0);
+}
+
+gcc_assert (GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL);
+operands[2] = XVECEXP (op, 0, 0);  /* symbol ref.  */
+operands[3] = XVECEXP (op, 0, 1);  /* register, usually r2.  */
+operands[4] = offset;
+
+if (INTVAL (offset) == 0)
+  return (TARGET_ELF
+ ? "addis %0,%3,%2@toc@ha\;addi %0,%0,%2@toc@l"
+ : "addis %0,%2@u(%3)\;la %0,%2@l(%3)");
+else
+  return (TARGET_ELF
+ ? "addis %0,%3,%2+%4@toc@ha\;addi %0,%0,%2+%4@toc@l"
+ : "addis %0,%2+%4@u(%3)\;la %0,%2+%4@l(%3)");
+}
+  [(set_attr "type" "*")
+   (set_attr "length" "8")])
+
 (define_insn_and_split "*tocref"
   [(set (match_operand:P 0 "gpc_reg_operand" "=b")
(match_operand:P 1 "small_toc_ref" "R"))]
"TARGET_TOC
+&& !(TARGET_P10_FUSION && (TARGET_ELF || TARGET_XCOFF))
 && legitimate_constant_pool_address_p (operands[1], QImode, false)"
"la %0,%a1"
-   "&& TARGET_CMODEL != CMODEL_SMALL && reload_completed"
+   "&& TARGET_CMODEL != CMODEL_SMALL && !TARGET_P10_FUSION && reload_completed"
   [(set (match_dup 0) (high:P (match_dup 1)))
(set (match_dup 0) (lo_sum:P (match_dup 0) (match_dup 1)))])


[gcc(refs/users/meissner/heads/work163-test)] Update ChangeLog.*

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:2123589bb1fed4fe083ccb58ff96d249a7ad603b

commit 2123589bb1fed4fe083ccb58ff96d249a7ad603b
Author: Michael Meissner 
Date:   Mon Apr 8 12:37:53 2024 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.test | 173 +
 1 file changed, 173 insertions(+)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
index 07f6729db60..00d82eaf4cf 100644
--- a/gcc/ChangeLog.test
+++ b/gcc/ChangeLog.test
@@ -1,5 +1,178 @@
+ Branch work163-test, patch #200 
+
+Add power10 wide immediate fusion
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   PR target/108018
+   * config/rs6000/rs6000.md (tocref_p10_fusion): New insn.
+   (tocref): Don't allow tocrev if tocref_p10_fusion
+   would be used.
+
+ Branch work163, patch #12 
+
+Add -mcpu=future tuning support.
+
+This patch makes -mtune=future use the same tuning decision as -mtune=power11.
+
+2024-03-18  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/power10.md (all reservations): Add future as an
+   alterntive to power10 and power11.
+
+ Branch work163, patch #11 
+
+Add -mcpu=future support.
+
+This patch adds the future option to the -mcpu= and -mtune= switches.
+
+This patch treats the future like a power11 in terms of costs and reassociation
+width.
+
+This patch issues a ".machine future" to the assembly file if you use
+-mcpu=power11.
+
+This patch defines _ARCH_PWR_FUTURE if the user uses -mcpu=future.
+
+This patch allows GCC to be configured with the --with-cpu=future and
+--with-tune=future options.
+
+This patch passes -mfuture to the assembler if the user uses -mcpu=future.
+
+2024-03-18  Michael Meissner  
+
+gcc/
+
+   * config.gcc (rs6000*-*-*, powerpc*-*-*): Add support for power11.
+   * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=power11.
+   * config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/driver-rs6000.cc (asm_names): Likewise.
+   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
+   _ARCH_PWR_FUTURE if -mcpu=future.
+   * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): New define.
+   (POWERPC_MASKS): Add future isa bit.
+   (power11 cpu): Add future definition.
+   * config/rs6000/rs6000-opts.h (PROCESSOR_FUTURE): Add future processor.
+   * config/rs6000/rs6000-string.cc (expand_compare_loop): Likewise.
+   * config/rs6000/rs6000-tables.opt: Regenerate.
+   * config/rs6000/rs6000.cc (rs6000_option_override_internal): Add future
+   support.
+   (rs6000_machine_from_flags): Likewise.
+   (rs6000_reassociation_width): Likewise.
+   (rs6000_adjust_cost): Likewise.
+   (rs6000_issue_rate): Likewise.
+   (rs6000_sched_reorder): Likewise.
+   (rs6000_sched_reorder2): Likewise.
+   (rs6000_register_move_cost): Likewise.
+   (rs6000_opt_masks): Likewise.
+   * config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/rs6000.md (cpu attribute): Add future.
+   * config/rs6000/rs6000.opt (-mpower11): Add internal future ISA flag.
+   * doc/invoke.texi (RS/6000 and PowerPC Options): Document -mcpu=future.
+
+ Branch work163, patch #3 
+
+Add -mcpu=power11 tests.
+
+This patch adds some simple tests for -mcpu=power11 support.  In order to run
+these tests, you need an assembler that supports the appropriate option for
+supporting the Power11 processor (-mpower11 under Linux or -mpwr11 under AIX).
+
+2024-03-18  Michael Meissner  
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/power11-1.c: New test.
+   * gcc.target/powerpc/power11-2.c: Likewise.
+   * gcc.target/powerpc/power11-3.c: Likewise.
+   * lib/target-supports.exp (check_effective_target_power11_ok): Add new
+   effective target.
+
+ Branch work163, patch #2 
+
+Add -mcpu=power11 tuning support.
+
+This patch makes -mtune=power11 use the same tuning decision as -mtune=power10.
+
+2024-03-18  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/power10.md (all reservations): Add power11 as an
+   alternative to power10.
+
+ Branch work163, patch #1 
+
+Add -mcpu=power11 support.
+
+This patch adds the power11 option to the -mcpu= and -mtune= switches.
+
+This patch treats the power11 like a power10 in terms of costs and 
reassociation
+width.
+
+This patch issues a ".machine power11" to the assembly file if you use
+-mcpu=power11.
+
+This patch defines _ARCH_PWR11 if the user uses -mcpu=power11.
+
+This patch allows GCC to be configured with the --with-cpu=power11 and
+--with-tune=power11 options.
+
+This patch passes -mpwr11 to the assembler if the user uses -mcpu=power11.
+
+This patch adds support for using "power11"

[gcc r14-9837] libstdc++: Combine two std::from_chars tests into one

2024-04-08 Thread Jonathan Wakely via Gcc-cvs
https://gcc.gnu.org/g:87bc20676ce606b0f75f12a35b24206df05a9f0a

commit r14-9837-g87bc20676ce606b0f75f12a35b24206df05a9f0a
Author: Jonathan Wakely 
Date:   Tue Apr 2 21:22:01 2024 +0100

libstdc++: Combine two std::from_chars tests into one

We don't need separate tests for the C++17 and C++20 cases, we can just
have one test that uses __cpp_char8_t to adjust whether it tests char8_t
or not. This means the C++20 one doesn't fail if -fno-char8_t is used.

libstdc++-v3/ChangeLog:

* testsuite/20_util/from_chars/1_neg.cc: Add char8_t cases,
using a struct of that name if -fno-char8_t is active.
* testsuite/20_util/from_chars/1_c++20_neg.cc: Removed.

Diff:
---
 .../testsuite/20_util/from_chars/1_c++20_neg.cc| 43 --
 libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc |  7 
 2 files changed, 7 insertions(+), 43 deletions(-)

diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
deleted file mode 100644
index d246eefb469..000
--- a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
+++ /dev/null
@@ -1,43 +0,0 @@
-// Copyright (C) 2017-2024 Free Software Foundation, Inc.
-//
-// This file is part of the GNU ISO C++ Library.  This library is free
-// software; you can redistribute it and/or modify it under the
-// terms of the GNU General Public License as published by the
-// Free Software Foundation; either version 3, or (at your option)
-// any later version.
-
-// This library is distributed in the hope that it will be useful,
-// but WITHOUT ANY WARRANTY; without even the implied warranty of
-// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-// GNU General Public License for more details.
-
-// You should have received a copy of the GNU General Public License along
-// with this library; see the file COPYING3.  If not see
-// .
-
-// { dg-do compile { target c++20 } }
-
-#include 
-
-void
-test01(const char* first, const char* last)
-{
-  wchar_t wc;
-  std::from_chars(first, last, wc); // { dg-error "no matching" }
-  std::from_chars(first, last, wc, 10); // { dg-error "no matching" }
-  char8_t c8;
-  std::from_chars(first, last, c8); // { dg-error "no matching" }
-  std::from_chars(first, last, c8, 10); // { dg-error "no matching" }
-  char16_t c16;
-  std::from_chars(first, last, c16); // { dg-error "no matching" }
-  std::from_chars(first, last, c16, 10); // { dg-error "no matching" }
-  char32_t c32;
-  std::from_chars(first, last, c32); // { dg-error "no matching" }
-  std::from_chars(first, last, c32, 10); // { dg-error "no matching" }
-  enum E { } e;
-  std::from_chars(first, last, e); // { dg-error "no matching" }
-  std::from_chars(first, last, e, 10); // { dg-error "no matching" }
-}
-
-// { dg-prune-output "enable_if" }
-// { dg-prune-output "cannot bind non-const lvalue reference" }
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc 
b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
index 7538d9448ac..3f4cd59f6fc 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
@@ -36,6 +36,13 @@ test01(const char* first, const char* last)
   enum E { } e;
   std::from_chars(first, last, e); // { dg-error "no matching" }
   std::from_chars(first, last, e, 10); // { dg-error "no matching" }
+
+#ifndef __cpp_char8_t
+  struct char8_t { };
+#endif
+  char8_t c8;
+  std::from_chars(first, last, c8); // { dg-error "no matching" }
+  std::from_chars(first, last, c8, 10); // { dg-error "no matching" }
 }
 
 // { dg-prune-output "enable_if" }


[gcc r14-9838] libstdc++: Fix tests that fail with -fno-char8_t

2024-04-08 Thread Jonathan Wakely via Gcc-cvs
https://gcc.gnu.org/g:cd77e152875d3bc9c8966fc20241d73aa47532b3

commit r14-9838-gcd77e152875d3bc9c8966fc20241d73aa47532b3
Author: Jonathan Wakely 
Date:   Tue Apr 2 20:53:11 2024 +0100

libstdc++: Fix tests that fail with -fno-char8_t

Adjust expected errors or skip tests as UNSUPPORTED if -fno-char8_t is
used in the test flags.

libstdc++-v3/ChangeLog:

* testsuite/20_util/integer_comparisons/equal_neg.cc: Use
no-opts selector for errors that depend on -fchar8_t.
* testsuite/20_util/integer_comparisons/greater_equal_neg.cc:
Likewise.
* testsuite/20_util/integer_comparisons/greater_neg.cc:
Likewise.
* testsuite/20_util/integer_comparisons/in_range_neg.cc:
Likewise.
* testsuite/20_util/integer_comparisons/less_equal_neg.cc:
Likewise.
* testsuite/20_util/integer_comparisons/less_neg.cc: Likewise.
* testsuite/20_util/integer_comparisons/not_equal_neg.cc:
Likewise.
* testsuite/21_strings/basic_string/hash/hash_char8_t.cc: Skip
if -fno-char8_t is used.
* testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc:
Likewise.
* testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc:
Likewise.
* 
testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc:
Likewise.
* testsuite/27_io/filesystem/path/factory/u8path-depr.cc: Use
char for u8 literal if char8_t is not available.
* testsuite/27_io/headers/iosfwd/synopsis.cc: Check
__cpp_char8_t.
* testsuite/29_atomics/atomic_integral/wait_notify.cc: Likewise.
* testsuite/29_atomics/headers/atomic/types_std_c++20_neg.cc:
Remove check for _GLIBCXX_USE_CHAR8_T.

Diff:
---
 libstdc++-v3/testsuite/20_util/integer_comparisons/equal_neg.cc | 4 ++--
 .../testsuite/20_util/integer_comparisons/greater_equal_neg.cc  | 4 ++--
 libstdc++-v3/testsuite/20_util/integer_comparisons/greater_neg.cc   | 4 ++--
 libstdc++-v3/testsuite/20_util/integer_comparisons/in_range_neg.cc  | 6 --
 .../testsuite/20_util/integer_comparisons/less_equal_neg.cc | 4 ++--
 libstdc++-v3/testsuite/20_util/integer_comparisons/less_neg.cc  | 4 ++--
 libstdc++-v3/testsuite/20_util/integer_comparisons/not_equal_neg.cc | 4 ++--
 libstdc++-v3/testsuite/21_strings/basic_string/hash/hash_char8_t.cc | 1 +
 .../testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc  | 1 +
 .../27_io/basic_ostream/inserters_character/char/deleted.cc | 1 +
 .../27_io/basic_ostream/inserters_character/wchar_t/deleted.cc  | 1 +
 libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-depr.cc | 4 +++-
 libstdc++-v3/testsuite/27_io/headers/iosfwd/synopsis.cc | 2 +-
 libstdc++-v3/testsuite/29_atomics/atomic_integral/wait_notify.cc| 2 ++
 .../testsuite/29_atomics/headers/atomic/types_std_c++20_neg.cc  | 2 --
 15 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/libstdc++-v3/testsuite/20_util/integer_comparisons/equal_neg.cc 
b/libstdc++-v3/testsuite/20_util/integer_comparisons/equal_neg.cc
index 06cfd6ee8f4..7290e9e2417 100644
--- a/libstdc++-v3/testsuite/20_util/integer_comparisons/equal_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/integer_comparisons/equal_neg.cc
@@ -25,8 +25,8 @@ bool c = std::cmp_equal(2, L'2'); // { dg-error "here" }
 bool d = std::cmp_equal(L'2', 2); // { dg-error "here" }
 bool e = std::cmp_equal(true, 1); // { dg-error "here" }
 bool f = std::cmp_equal(0, false); // { dg-error "here" }
-bool g = std::cmp_equal(97, u8'a'); // { dg-error "here" }
-bool h = std::cmp_equal(u8'a', 97); // { dg-error "here" }
+bool g = std::cmp_equal(97, u8'a'); // { dg-error "here" "" { target { no-opts 
"-fno-char8_t" } } }
+bool h = std::cmp_equal(u8'a', 97); // { dg-error "here" "" { target { no-opts 
"-fno-char8_t" } } }
 bool i = std::cmp_equal(97, u'a'); // { dg-error "here" }
 bool j = std::cmp_equal(u'a', 97); // { dg-error "here" }
 bool k = std::cmp_equal(97, U'a'); // { dg-error "here" }
diff --git 
a/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc 
b/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc
index b5f03b70733..cb60bfe9d07 100644
--- a/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc
@@ -25,8 +25,8 @@ bool c = std::cmp_greater_equal(2, L'2'); // { dg-error 
"constexpr" }
 bool d = std::cmp_greater_equal(L'2', 2); // { dg-error "constexpr" }
 bool e = std::cmp_greater_equal(true, 1); // { dg-error "constexpr" }
 bool f = std::cmp_greater_equal(0, false); // { dg-error "constexpr" }
-bool g = std::cmp_greater_equal(97, u8'a'); // { dg-error "constexpr" }
-bool h = std::cmp_greater_equal(u8'a', 97); // { dg-error "constexpr" }

[gcc r14-9839] libstdc++: Use char for _Utf8_view if char8_t isn't available [PR114519]

2024-04-08 Thread Jonathan Wakely via Libstdc++-cvs
https://gcc.gnu.org/g:feb6a2d3569095706170c9400e93add27a66e034

commit r14-9839-gfeb6a2d3569095706170c9400e93add27a66e034
Author: Jonathan Wakely 
Date:   Tue Apr 2 22:46:55 2024 +0100

libstdc++: Use char for _Utf8_view if char8_t isn't available [PR114519]

Instead of just omitting the definition of __unicode::_Utf8_view when
char8_t is disabled, we can make it use char instead.

libstdc++-v3/ChangeLog:

PR libstdc++/114519
* include/bits/unicode.h (_Utf8_view) [!__cpp_char8_t]: Define
using char instead of char8_t.
* testsuite/ext/unicode/view.cc: Use u8""sv literals to create
string views, instead of std::u8string_view.

Diff:
---
 libstdc++-v3/include/bits/unicode.h| 3 +++
 libstdc++-v3/testsuite/ext/unicode/view.cc | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/unicode.h 
b/libstdc++-v3/include/bits/unicode.h
index 0e95c86a0b0..29813b743dc 100644
--- a/libstdc++-v3/include/bits/unicode.h
+++ b/libstdc++-v3/include/bits/unicode.h
@@ -581,6 +581,9 @@ namespace __unicode
 #ifdef __cpp_char8_t
   template
 using _Utf8_view = _Utf_view;
+#else
+  template
+using _Utf8_view = _Utf_view;
 #endif
   template
 using _Utf16_view = _Utf_view;
diff --git a/libstdc++-v3/testsuite/ext/unicode/view.cc 
b/libstdc++-v3/testsuite/ext/unicode/view.cc
index 79ea2bbc6b7..ee23b0b1d8a 100644
--- a/libstdc++-v3/testsuite/ext/unicode/view.cc
+++ b/libstdc++-v3/testsuite/ext/unicode/view.cc
@@ -10,7 +10,7 @@ using namespace std::string_view_literals;
 constexpr void
 test_utf8_to_utf8()
 {
-  const std::u8string_view s8 = u8"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡";
+  const auto s8 = u8"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡"sv;
   uc::_Utf8_view v(s8);
   VERIFY( std::ranges::distance(v) == s8.size() );
   VERIFY( std::ranges::equal(v,  s8) );
@@ -19,7 +19,7 @@ test_utf8_to_utf8()
 constexpr void
 test_utf8_to_utf16()
 {
-  const std::u8string_view s8  = u8"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡";
+  const auto s8  = u8"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡"sv;
   const std::u16string_view s16 = u"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡";
   uc::_Utf16_view v(s8);
   VERIFY( std::ranges::distance(v) == s16.size() );


[gcc r14-9840] ipa: Compare jump functions in ICF (PR 113907)

2024-04-08 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:1162861439fd3c4b30fc3ccd49462e47e876f04a

commit r14-9840-g1162861439fd3c4b30fc3ccd49462e47e876f04a
Author: Martin Jambor 
Date:   Mon Apr 8 18:53:23 2024 +0200

ipa: Compare jump functions in ICF (PR 113907)

In PR 113907 comment #58, Honza found a case where ICF thinks bodies
of functions are equivalent but becaise of difference in aliases in a
memory access, different aggregate jump functions are associated with
supposedly equivalent call statements.  This patch adds a way to
compare jump functions and plugs it into ICF to avoid the issue.

gcc/ChangeLog:

2024-03-20  Martin Jambor  

PR ipa/113907
* ipa-prop.h (class ipa_vr): Declare new overload of a member 
function
equal_p.
(ipa_jump_functions_equivalent_p): Declare.
* ipa-prop.cc (ipa_vr::equal_p): New function.
(ipa_agg_pass_through_jf_equivalent_p): Likewise.
(ipa_agg_jump_functions_equivalent_p): Likewise.
(ipa_jump_functions_equivalent_p): Likewise.
* ipa-cp.h (values_equal_for_ipcp_p): Declare.
* ipa-cp.cc (values_equal_for_ipcp_p): Make function public.
* ipa-icf-gimple.cc: Include alloc-pool.h, symbol-summary.h, 
sreal.h,
ipa-cp.h and ipa-prop.h.
(func_checker::compare_gimple_call): Comapre jump functions.

gcc/testsuite/ChangeLog:

2024-03-20  Martin Jambor  

PR ipa/113907
* gcc.dg/lto/pr113907_0.c: New.
* gcc.dg/lto/pr113907_1.c: Likewise.
* gcc.dg/lto/pr113907_2.c: Likewise.

Diff:
---
 gcc/ipa-cp.cc |   2 +-
 gcc/ipa-cp.h  |   2 +
 gcc/ipa-icf-gimple.cc |  30 ++
 gcc/ipa-prop.cc   | 167 ++
 gcc/ipa-prop.h|   3 +
 gcc/testsuite/gcc.dg/lto/pr113907_0.c |  18 
 gcc/testsuite/gcc.dg/lto/pr113907_1.c |  35 +++
 gcc/testsuite/gcc.dg/lto/pr113907_2.c |  11 +++
 8 files changed, 267 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 2a1da631e9c..b7add455bd5 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -201,7 +201,7 @@ ipcp_lattice::is_single_const ()
 
 /* Return true iff X and Y should be considered equal values by IPA-CP.  */
 
-static bool
+bool
 values_equal_for_ipcp_p (tree x, tree y)
 {
   gcc_checking_assert (x != NULL_TREE && y != NULL_TREE);
diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h
index 0b3cfe4b526..7ff74fb5c98 100644
--- a/gcc/ipa-cp.h
+++ b/gcc/ipa-cp.h
@@ -289,4 +289,6 @@ public:
   bool virt_call = false;
 };
 
+bool values_equal_for_ipcp_p (tree x, tree y);
+
 #endif /* IPA_CP_H */
diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
index 8c2df7a354e..17f62bec068 100644
--- a/gcc/ipa-icf-gimple.cc
+++ b/gcc/ipa-icf-gimple.cc
@@ -41,7 +41,12 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-walk.h"
 
 #include "tree-ssa-alias-compare.h"
+#include "alloc-pool.h"
+#include "symbol-summary.h"
 #include "ipa-icf-gimple.h"
+#include "sreal.h"
+#include "ipa-cp.h"
+#include "ipa-prop.h"
 
 namespace ipa_icf_gimple {
 
@@ -714,6 +719,31 @@ func_checker::compare_gimple_call (gcall *s1, gcall *s2)
   && !compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
 return return_false_with_msg ("GIMPLE internal call LHS type mismatch");
 
+  if (!gimple_call_internal_p (s1))
+{
+  cgraph_edge *e1 = cgraph_node::get (m_source_func_decl)->get_edge (s1);
+  cgraph_edge *e2 = cgraph_node::get (m_target_func_decl)->get_edge (s2);
+  class ipa_edge_args *args1 = ipa_edge_args_sum->get (e1);
+  class ipa_edge_args *args2 = ipa_edge_args_sum->get (e2);
+  if ((args1 != nullptr) != (args2 != nullptr))
+   return return_false_with_msg ("ipa_edge_args mismatch");
+  if (args1)
+   {
+ int n1 = ipa_get_cs_argument_count (args1);
+ int n2 = ipa_get_cs_argument_count (args2);
+ if (n1 != n2)
+   return return_false_with_msg ("ipa_edge_args nargs mismatch");
+ for (int i = 0; i < n1; i++)
+   {
+ struct ipa_jump_func *jf1 = ipa_get_ith_jump_func (args1, i);
+ struct ipa_jump_func *jf2 = ipa_get_ith_jump_func (args2, i);
+ if (((jf1 != nullptr) != (jf2 != nullptr))
+ || (jf1 && !ipa_jump_functions_equivalent_p (jf1, jf2)))
+   return return_false_with_msg ("jump function mismatch");
+   }
+   }
+}
+
   return compare_operand (t1, t2, get_operand_access_type (&map, t1));
 }
 
diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index e8e4918d5a8..374e998aa64 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -156,6 +156,20 @@ ipa_vr::equal_p (const vrange &r) const
   return (types_compatible_p (m_type, r.type ()) && m_storage->equal_p (r));
 }
 
+bool
+ipa_vr::equal_p (const ipa_vr &o) const
+{
+  i

[gcc r14-9841] ICF&SRA: Make ICF and SRA agree on padding

2024-04-08 Thread Martin Jambor via Gcc-cvs
https://gcc.gnu.org/g:1e3312a25a7b34d6e3f549273e1674c7114e4408

commit r14-9841-g1e3312a25a7b34d6e3f549273e1674c7114e4408
Author: Martin Jambor 
Date:   Mon Apr 8 18:53:23 2024 +0200

ICF&SRA: Make ICF and SRA agree on padding

PR 113359 shows that (at least with -fno-strict-aliasing) ICF can
unify two functions which copy an aggregate type of the same size but
then SRA, through its total scalarization, can copy the aggregate by
pieces, skipping paddding, but the padding was not the same in the two
original functions that ICF unified.

This patch enhances SRA with the ability to collect padding
information which then can be compared from within ICF.  Unfortunately
SRA uses OPTION_SET_P when determining its limits, so ICF needs to
switch cfuns at least once to figure it out too.

gcc/ChangeLog:

2024-03-27  Martin Jambor  

PR ipa/113359
* ipa-icf-gimple.h (func_checker): New members
safe_for_total_scalarization_p, m_total_scalarization_limit_known_p
and m_total_scalarization_limit.
(func_checker::func_checker): Initialize new member variables.
* ipa-icf-gimple.cc: Include tree-sra.h.
(func_checker::func_checker): Initialize new member variables.
(func_checker::safe_for_total_scalarization_p): New function.
(func_checker::compare_operand): Use the new function.
* tree-sra.h (sra_get_max_scalarization_size): Declare.
(sra_total_scalarization_would_copy_same_data_p): Likewise.
* tree-sra.cc (prepare_iteration_over_array_elts): New function.
(class sra_padding_collecting): New.
(sra_padding_collecting::record_padding): Likewise.
(scalarizable_type_p): Rename to totally_scalarizable_type_p.  Add
ability to record padding when requested.
(totally_scalarize_subtree): Split out gathering information 
necessary
to iterate over array elements to prepare_iteration_over_array_elts.
Fix errornous early exit.
(analyze_all_variable_accesses): Adjust the call to
totally_scalarizable_type_p.  Move determining of total scalariation
size limit...
(sra_get_max_scalarization_size): ...here.
(check_ts_and_push_padding_to_vec): New function.
(sra_total_scalarization_would_copy_same_data_p): Likewise.

gcc/testsuite/ChangeLog:

2024-03-27  Martin Jambor  

PR ipa/113359
* gcc.dg/lto/pr113359-1_0.c: New.
* gcc.dg/lto/pr113359-1_1.c: Likewise.
* gcc.dg/lto/pr113359-2_0.c: Likewise.
* gcc.dg/lto/pr113359-2_1.c: Likewise.
* gcc.dg/lto/pr113359-3_0.c: Likewise.
* gcc.dg/lto/pr113359-3_1.c: Likewise.
* gcc.dg/lto/pr113359-4_0.c: Likewise.
* gcc.dg/lto/pr113359-4_1.c: Likewise.
* gcc.dg/lto/pr113359-5_0.c: Likewise.
* gcc.dg/lto/pr113359-5_1.c: Likewise.

Diff:
---
 gcc/ipa-icf-gimple.cc   |  41 +-
 gcc/ipa-icf-gimple.h|  15 +-
 gcc/testsuite/gcc.dg/lto/pr113359-1_0.c |  86 +++
 gcc/testsuite/gcc.dg/lto/pr113359-1_1.c |  38 +
 gcc/testsuite/gcc.dg/lto/pr113359-2_0.c |  87 +++
 gcc/testsuite/gcc.dg/lto/pr113359-2_1.c |  38 +
 gcc/testsuite/gcc.dg/lto/pr113359-3_0.c | 114 +++
 gcc/testsuite/gcc.dg/lto/pr113359-3_1.c |  49 +++
 gcc/testsuite/gcc.dg/lto/pr113359-4_0.c | 114 +++
 gcc/testsuite/gcc.dg/lto/pr113359-4_1.c |  49 +++
 gcc/testsuite/gcc.dg/lto/pr113359-5_0.c | 118 +++
 gcc/testsuite/gcc.dg/lto/pr113359-5_1.c |  50 +++
 gcc/tree-sra.cc | 252 +---
 gcc/tree-sra.h  |   3 +
 14 files changed, 999 insertions(+), 55 deletions(-)

diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
index 17f62bec068..c25eb24710f 100644
--- a/gcc/ipa-icf-gimple.cc
+++ b/gcc/ipa-icf-gimple.cc
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "attribs.h"
 #include "gimple-walk.h"
+#include "tree-sra.h"
 
 #include "tree-ssa-alias-compare.h"
 #include "alloc-pool.h"
@@ -64,7 +65,8 @@ func_checker::func_checker (tree source_func_decl, tree 
target_func_decl,
   : m_source_func_decl (source_func_decl), m_target_func_decl 
(target_func_decl),
 m_ignored_source_nodes (ignored_source_nodes),
 m_ignored_target_nodes (ignored_target_nodes),
-m_ignore_labels (ignore_labels), m_tbaa (tbaa)
+m_ignore_labels (ignore_labels), m_tbaa (tbaa),
+m_total_scalarization_limit_known_p (false)
 {
   function *source_func = DECL_STRUCT_FUNCTION (source_func_decl);
   function *target_func = DECL_STRUCT_FUNCTION (target_func_decl);
@@ -361,6 +363,36 @@ func_checker::operand_equal_p (const_tree t1, const_tree 

[gcc(refs/users/meissner/heads/work163-test)] Add power10 ori/oris and xori/xoris fusion support.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:e3a99edb9c370924e177b615cee8e91b8795548d

commit e3a99edb9c370924e177b615cee8e91b8795548d
Author: Michael Meissner 
Date:   Mon Apr 8 13:59:38 2024 -0400

Add power10 ori/oris and xori/xoris fusion support.

2024-04-08  Michael Meissner  

gcc/

* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.md (gen_wide_immediate): Add wide 
immediate
fusion patterns.
* config/rs6000/rs6000.md (IORXOR_OP): New code attribute.
(ior, xor splitter): Do not split ior/xor patterns that could be 
fused
on power10.

Diff:
---
 gcc/config/rs6000/fusion.md| 10 ++
 gcc/config/rs6000/genfusion.pl | 17 +
 gcc/config/rs6000/rs6000.md|  5 -
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 4ed9ae1d69f..45819627df6 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -3055,3 +3055,13 @@
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
(set_attr "length" "8")])
+
+;; ori_oris and xori_xoris fusion patterns generated by gen_wide_immediate
+(define_insn "*fuse_i_is"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=&r")
+(iorxor:DI
+  (match_operand:DI 1 "gpc_reg_operand" "r")
+  (match_operand:DI 2 "non_logical_cint_operand" "n")))]
+  "(TARGET_P10_FUSION)"
+  "i %0,%1,%w2\;is %0,%0,%v2"
+  [(set_attr "length" "8")])
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 2271be14bfa..d059de8afc9 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -367,6 +367,23 @@ EOF
   }
 }
 
+sub gen_wide_immediate
+{
+print <<"EOF";
+
+;; ori_oris and xori_xoris fusion patterns generated by gen_wide_immediate
+(define_insn "*fuse_i_is"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=&r")
+(iorxor:DI
+  (match_operand:DI 1 "gpc_reg_operand" "r")
+  (match_operand:DI 2 "non_logical_cint_operand" "n")))]
+  "(TARGET_P10_FUSION)"
+  "i %0,%1,%w2\\;is %0,%0,%v2"
+  [(set_attr "length" "8")])
+EOF
+}
+
 gen_ld_cmpi_p10();
 gen_logical_addsubf();
 gen_addadd;
+gen_wide_immediate();
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 1acd34ca0ee..f626b68ebb2 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -822,6 +822,9 @@
 (SF "TARGET_P8_VECTOR")
 (DI "TARGET_POWERPC64")])
 
+;; Convert iorxor iterator into or/xor instruction
+(define_code_attr IORXOR_OP [(ior "or") (xor "xor")])
+
 (include "darwin.md")
 
 ;; Start with fixed-point load and store insns.  Here we put only the more
@@ -3933,7 +3936,7 @@
   [(set (match_operand:GPR 0 "gpc_reg_operand")
(iorxor:GPR (match_operand:GPR 1 "gpc_reg_operand")
(match_operand:GPR 2 "non_logical_cint_operand")))]
-  ""
+  "mode != DImode || !TARGET_P10_FUSION"
   [(set (match_dup 3)
(iorxor:GPR (match_dup 1)
(match_dup 4)))


[gcc(refs/users/meissner/heads/work163-test)] Update ChangeLog.*

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:f7b4e5104da1e247943dca403f92678cb460da52

commit f7b4e5104da1e247943dca403f92678cb460da52
Author: Michael Meissner 
Date:   Mon Apr 8 14:01:05 2024 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.test | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
index 00d82eaf4cf..bb0dce0ec35 100644
--- a/gcc/ChangeLog.test
+++ b/gcc/ChangeLog.test
@@ -1,3 +1,18 @@
+ Branch work163-test, patch #201 
+
+Add power10 ori/oris and xori/xoris fusion support.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/fusion.md: Regenerate.
+   * config/rs6000/genfusion.md (gen_wide_immediate): Add wide immediate
+   fusion patterns.
+   * config/rs6000/rs6000.md (IORXOR_OP): New code attribute.
+   (ior, xor splitter): Do not split ior/xor patterns that could be fused
+   on power10.
+
  Branch work163-test, patch #200 
 
 Add power10 wide immediate fusion


[gcc] Created branch 'meissner/heads/work164' in namespace 'refs/users'

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164' was created in namespace 'refs/users' 
pointing to:

 f4f7c52472f... Update gcc fr.po


[gcc(refs/users/meissner/heads/work164)] Add ChangeLog.meissner and REVISION.

2024-04-08 Thread Michael Meissner via Libstdc++-cvs
https://gcc.gnu.org/g:c74c4da314fe579ee63310dfda6ab67106c2615c

commit c74c4da314fe579ee63310dfda6ab67106c2615c
Author: Michael Meissner 
Date:   Mon Apr 8 15:37:58 2024 -0400

Add ChangeLog.meissner and REVISION.

2024-04-08  Michael Meissner  

gcc/

* REVISION: New file for branch.
* ChangeLog.meissner: New file.

gcc/c-family/

* ChangeLog.meissner: New file.

gcc/c/

* ChangeLog.meissner: New file.

gcc/cp/

* ChangeLog.meissner: New file.

gcc/fortran/

* ChangeLog.meissner: New file.

gcc/testsuite/

* ChangeLog.meissner: New file.

libgcc/

* ChangeLog.meissner: New file.

Diff:
---
 gcc/ChangeLog.meissner   | 6 ++
 gcc/REVISION | 1 +
 gcc/c-family/ChangeLog.meissner  | 6 ++
 gcc/c/ChangeLog.meissner | 6 ++
 gcc/cp/ChangeLog.meissner| 6 ++
 gcc/fortran/ChangeLog.meissner   | 6 ++
 gcc/testsuite/ChangeLog.meissner | 6 ++
 libgcc/ChangeLog.meissner| 6 ++
 libstdc++-v3/ChangeLog.meissner  | 6 ++
 9 files changed, 49 insertions(+)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
new file mode 100644
index 000..de8f5b59651
--- /dev/null
+++ b/gcc/ChangeLog.meissner
@@ -0,0 +1,6 @@
+ Branch work164, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/REVISION b/gcc/REVISION
new file mode 100644
index 000..28f9ab58eb7
--- /dev/null
+++ b/gcc/REVISION
@@ -0,0 +1 @@
+work164 branch
diff --git a/gcc/c-family/ChangeLog.meissner b/gcc/c-family/ChangeLog.meissner
new file mode 100644
index 000..de8f5b59651
--- /dev/null
+++ b/gcc/c-family/ChangeLog.meissner
@@ -0,0 +1,6 @@
+ Branch work164, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/c/ChangeLog.meissner b/gcc/c/ChangeLog.meissner
new file mode 100644
index 000..de8f5b59651
--- /dev/null
+++ b/gcc/c/ChangeLog.meissner
@@ -0,0 +1,6 @@
+ Branch work164, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/cp/ChangeLog.meissner b/gcc/cp/ChangeLog.meissner
new file mode 100644
index 000..de8f5b59651
--- /dev/null
+++ b/gcc/cp/ChangeLog.meissner
@@ -0,0 +1,6 @@
+ Branch work164, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/fortran/ChangeLog.meissner b/gcc/fortran/ChangeLog.meissner
new file mode 100644
index 000..de8f5b59651
--- /dev/null
+++ b/gcc/fortran/ChangeLog.meissner
@@ -0,0 +1,6 @@
+ Branch work164, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/testsuite/ChangeLog.meissner b/gcc/testsuite/ChangeLog.meissner
new file mode 100644
index 000..de8f5b59651
--- /dev/null
+++ b/gcc/testsuite/ChangeLog.meissner
@@ -0,0 +1,6 @@
+ Branch work164, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/libgcc/ChangeLog.meissner b/libgcc/ChangeLog.meissner
new file mode 100644
index 000..de8f5b59651
--- /dev/null
+++ b/libgcc/ChangeLog.meissner
@@ -0,0 +1,6 @@
+ Branch work164, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/libstdc++-v3/ChangeLog.meissner b/libstdc++-v3/ChangeLog.meissner
new file mode 100644
index 000..de8f5b59651
--- /dev/null
+++ b/libstdc++-v3/ChangeLog.meissner
@@ -0,0 +1,6 @@
+ Branch work164, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+


[gcc] Created branch 'meissner/heads/work164-dmf' in namespace 'refs/users'

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164-dmf' was created in namespace 'refs/users' 
pointing to:

 c74c4da314f... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work164-dmf)] Add ChangeLog.dmf and update REVISION.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:49e37ddbfdfdca7c2a766b400e20c5c524a4538f

commit 49e37ddbfdfdca7c2a766b400e20c5c524a4538f
Author: Michael Meissner 
Date:   Mon Apr 8 15:38:53 2024 -0400

Add ChangeLog.dmf and update REVISION.

2024-04-08  Michael Meissner  

gcc/

* ChangeLog.dmf: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.dmf | 6 ++
 gcc/REVISION  | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
new file mode 100644
index 000..dfdbcd4a4b4
--- /dev/null
+++ b/gcc/ChangeLog.dmf
@@ -0,0 +1,6 @@
+ Branch work164-dmf, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/REVISION b/gcc/REVISION
index 28f9ab58eb7..be546585469 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work164 branch
+work164-dmf branch


[gcc] Created branch 'meissner/heads/work164-vpair' in namespace 'refs/users'

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164-vpair' was created in namespace 'refs/users' 
pointing to:

 c74c4da314f... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work164-vpair)] Add ChangeLog.vpair and update REVISION.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:5653a12f14e859cf5a1acb08a4602a7c88718cf0

commit 5653a12f14e859cf5a1acb08a4602a7c88718cf0
Author: Michael Meissner 
Date:   Mon Apr 8 15:39:52 2024 -0400

Add ChangeLog.vpair and update REVISION.

2024-04-08  Michael Meissner  

gcc/

* ChangeLog.vpair: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.vpair | 6 ++
 gcc/REVISION| 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
new file mode 100644
index 000..c0c0f7095a6
--- /dev/null
+++ b/gcc/ChangeLog.vpair
@@ -0,0 +1,6 @@
+ Branch work164-vpair, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/REVISION b/gcc/REVISION
index 28f9ab58eb7..650632a5790 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work164 branch
+work164-vpair branch


[gcc] Created branch 'meissner/heads/work164-bugs' in namespace 'refs/users'

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164-bugs' was created in namespace 'refs/users' 
pointing to:

 c74c4da314f... Add ChangeLog.meissner and REVISION.


[gcc(refs/users/meissner/heads/work164-bugs)] Add ChangeLog.bugs and update REVISION.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:727700c5eaf43514edb61c54b56f840502ed8d2d

commit 727700c5eaf43514edb61c54b56f840502ed8d2d
Author: Michael Meissner 
Date:   Mon Apr 8 15:41:00 2024 -0400

Add ChangeLog.bugs and update REVISION.

2024-04-08  Michael Meissner  

gcc/

* ChangeLog.bugs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.bugs | 6 ++
 gcc/REVISION   | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
new file mode 100644
index 000..4291f519935
--- /dev/null
+++ b/gcc/ChangeLog.bugs
@@ -0,0 +1,6 @@
+ Branch work164-bugs, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/REVISION b/gcc/REVISION
index 28f9ab58eb7..0ce156df690 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work164 branch
+work164-bugs branch


[gcc] Created branch 'meissner/heads/work164-test' in namespace 'refs/users'

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164-test' was created in namespace 'refs/users' 
pointing to:

 c74c4da314f... Add ChangeLog.meissner and REVISION.


[gcc] Created branch 'meissner/heads/work164-orig' in namespace 'refs/users'

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164-orig' was created in namespace 'refs/users' 
pointing to:

 f4f7c52472f... Update gcc fr.po


[gcc(refs/users/meissner/heads/work164-test)] Add ChangeLog.test and update REVISION.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:4194ef6e7ce01afdcef224d774abe0d0b1d2b45a

commit 4194ef6e7ce01afdcef224d774abe0d0b1d2b45a
Author: Michael Meissner 
Date:   Mon Apr 8 15:41:59 2024 -0400

Add ChangeLog.test and update REVISION.

2024-04-08  Michael Meissner  

gcc/

* ChangeLog.test: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.test | 6 ++
 gcc/REVISION   | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
new file mode 100644
index 000..8ee53e57958
--- /dev/null
+++ b/gcc/ChangeLog.test
@@ -0,0 +1,6 @@
+ Branch work164-test, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/REVISION b/gcc/REVISION
index 28f9ab58eb7..d7e751014df 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work164 branch
+work164-test branch


[gcc(refs/users/meissner/heads/work164-orig)] Add REVISION.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:3e90ba04c4fcdddc598a5c27d90be52a19a5c8b7

commit 3e90ba04c4fcdddc598a5c27d90be52a19a5c8b7
Author: Michael Meissner 
Date:   Mon Apr 8 15:43:07 2024 -0400

Add REVISION.

2024-04-08  Michael Meissner  

gcc/

* REVISION: New file for branch.

Diff:
---
 gcc/REVISION | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/REVISION b/gcc/REVISION
new file mode 100644
index 000..f975b69cfac
--- /dev/null
+++ b/gcc/REVISION
@@ -0,0 +1 @@
+work164-orig branch


[gcc r14-9843] Fortran: Accept again tab as alternative to space as separator [PR114304]

2024-04-08 Thread Tobias Burnus via Gcc-cvs
https://gcc.gnu.org/g:477c8a82f38e353a8c6313b38197c70b12deea80

commit r14-9843-g477c8a82f38e353a8c6313b38197c70b12deea80
Author: Tobias Burnus 
Date:   Mon Apr 8 21:47:51 2024 +0200

Fortran: Accept again tab as alternative to space as separator [PR114304]

This fixes a side-effect of/regression caused by r14-9822-g93adf88cc6744a,
which was for the same PR.

PR libfortran/114304

libgfortran/ChangeLog:

* io/list_read.c (eat_separator): Accept tab as alternative to 
space.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr114304-2.f90: New test.

Diff:
---
 gcc/testsuite/gfortran.dg/pr114304-2.f90 | 82 
 libgfortran/io/list_read.c   |  2 +-
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gfortran.dg/pr114304-2.f90 
b/gcc/testsuite/gfortran.dg/pr114304-2.f90
new file mode 100644
index 000..5ef5874f528
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr114304-2.f90
@@ -0,0 +1,82 @@
+! { dg-do run }
+!
+! PR fortran/114304
+!
+! Ensure that '\t' (tab) is supported as separator in list-directed input
+! While not really standard conform, this is widely used in user input and
+! widely supported.
+!
+
+use iso_c_binding
+implicit none
+character(len=*,kind=c_char), parameter :: tab = C_HORIZONTAL_TAB
+
+! Accept '' as variant to ' ' as separator
+! Check that  and  are handled
+
+character(len=*,kind=c_char), parameter :: nml_str &
+   = '&inparm'//C_CARRIAGE_RETURN // C_NEW_LINE // &
+ 'first'//tab//'='//tab//' .true.'// C_NEW_LINE // &
+ ' , other'//tab//' ='//tab//'3'//tab//', 2'//tab//'/'
+
+! Check that  is handled,
+
+! Note: For new line, Unix uses \n, Windows \r\n but old Apple systems used 
'\r'
+!
+! Gfortran does not seem to support all \r, but the following is supported
+! since ages, ! which seems to be a gfortran extension as ifort and flang 
don't like it.
+
+character(len=*,kind=c_char), parameter :: nml_str2 &
+   = '&inparm'//C_CARRIAGE_RETURN // C_NEW_LINE // &
+ 'first'//C_NEW_LINE//'='//tab//' .true.'// C_CARRIAGE_RETURN // &
+ ' , other'//tab//' ='//tab//'3'//tab//', 2'//tab//'/'
+
+character(len=*,kind=c_char), parameter :: str &
+   = tab//'1'//tab//'2,'//tab//'3'//tab//',4'//tab//','//tab//'5'//tab//'/'
+character(len=*,kind=c_char), parameter :: str2 &
+   = tab//'1'//tab//'2;'//tab//'3'//tab//';4'//tab//';'//tab//'5'//tab//'/'
+logical :: first
+integer :: other(4)
+integer :: ints(6)
+namelist /inparm/ first , other
+
+other = 1
+
+open(99, file="test.inp")
+write(99, '(a)') nml_str
+rewind(99)
+read(99,nml=inparm)
+close(99, status="delete")
+
+if (.not.first .or. any (other /= [3,2,1,1])) stop 1
+
+other = 9
+
+open(99, file="test.inp")
+write(99, '(a)') nml_str2
+rewind(99)
+read(99,nml=inparm)
+close(99, status="delete")
+
+if (.not.first .or. any (other /= [3,2,9,9])) stop 2
+
+ints = 66
+
+open(99, file="test.inp", decimal='point')
+write(99, '(a)') str
+rewind(99)
+read(99,*) ints
+close(99, status="delete")
+
+if (any (ints /= [1,2,3,4,5,66])) stop 3
+
+ints = 77 
+
+open(99, file="test.inp", decimal='comma')
+write(99, '(a)') str2
+rewind(99)
+read(99,*) ints
+close(99, status="delete")
+
+if (any (ints /= [1,2,3,4,5,77])) stop 4
+end
diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index b56f2a4e6d6..5bbbef26c26 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -463,7 +463,7 @@ eat_separator (st_parameter_dt *dtp)
 
   dtp->u.p.comma_flag = 0;
   c = next_char (dtp);
-  if (c == ' ')
+  if (c == ' ' || c == '\t')
 {
   eat_spaces (dtp);
   c = next_char (dtp);


[gcc r14-9845] New effective-target 'asm_goto_with_outputs'

2024-04-08 Thread Thomas Schwinge via Gcc-cvs
https://gcc.gnu.org/g:3fa8bff30ab58bd8b8018764d390ec2fcc8153bb

commit r14-9845-g3fa8bff30ab58bd8b8018764d390ec2fcc8153bb
Author: Thomas Schwinge 
Date:   Mon Mar 4 16:04:11 2024 +0100

New effective-target 'asm_goto_with_outputs'

After commit e16f90be2dc8af6c371fe79044c3e668fa3dda62
"testsuite: Fix up lra effective target", we get for nvptx target:

-PASS: gcc.c-torture/compile/asmgoto-2.c   -O0  (test for excess errors)
+ERROR: gcc.c-torture/compile/asmgoto-2.c   -O0 : no files matched glob 
pattern "lra1020113.c.[0-9][0-9][0-9]r.reload" for " dg-do 2 compile { target 
lra } "

Etc.

However, nvptx appears to support 'asm goto' with outputs, including the
new execution test case:

PASS: gcc.dg/pr107385.c execution test

Therefore, generally use new effective-target 'asm_goto_with_outputs' 
instead
of 'lra'.  One exceptions is 'gcc.dg/pr110079.c', which doesn't use 'asm 
goto'
with outputs, and continues using effective-target 'lra', with 
special-casing
nvptx target, to avoid ERROR for 'lra'.

gcc/
* doc/sourcebuild.texi (Effective-Target Keywords): Document
'asm_goto_with_outputs'.  Add comment to 'lra'.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_lra): Add
comment.
(check_effective_target_asm_goto_with_outputs): New.
* gcc.c-torture/compile/asmgoto-2.c: Use it.
* gcc.c-torture/compile/asmgoto-5.c: Likewise.
* gcc.c-torture/compile/asmgoto-6.c: Likewise.
* gcc.c-torture/compile/pr98096.c: Likewise.
* gcc.dg/pr100590.c: Likewise.
* gcc.dg/pr107385.c: Likewise.
* gcc.dg/pr108095.c: Likewise.
* gcc.dg/pr97954.c: Likewise.
* gcc.dg/torture/pr100329.c: Likewise.
* gcc.dg/torture/pr100398.c: Likewise.
* gcc.dg/torture/pr100519.c: Likewise.
* gcc.dg/torture/pr110422.c: Likewise.
* gcc.dg/pr110079.c: Special-case nvptx target.

Diff:
---
 gcc/doc/sourcebuild.texi|  6 ++
 gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c |  2 +-
 gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c |  2 +-
 gcc/testsuite/gcc.c-torture/compile/asmgoto-6.c |  3 +--
 gcc/testsuite/gcc.c-torture/compile/pr98096.c   |  2 +-
 gcc/testsuite/gcc.dg/pr100590.c |  2 +-
 gcc/testsuite/gcc.dg/pr107385.c |  2 +-
 gcc/testsuite/gcc.dg/pr108095.c |  2 +-
 gcc/testsuite/gcc.dg/pr110079.c |  2 +-
 gcc/testsuite/gcc.dg/pr97954.c  |  2 +-
 gcc/testsuite/gcc.dg/torture/pr100329.c |  2 +-
 gcc/testsuite/gcc.dg/torture/pr100398.c |  2 +-
 gcc/testsuite/gcc.dg/torture/pr100519.c |  2 +-
 gcc/testsuite/gcc.dg/torture/pr110422.c |  2 +-
 gcc/testsuite/lib/target-supports.exp   | 13 -
 15 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 7ef82fc9b00..7c0df90e822 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2871,6 +2871,9 @@ Target supports weak undefined symbols
 @item R_flag_in_section
 Target supports the 'R' flag in .section directive in assembly inputs.
 
+@item asm_goto_with_outputs
+Target supports 'asm goto' with outputs.
+
 @item automatic_stack_alignment
 Target supports automatic stack alignment.
 
@@ -2945,6 +2948,9 @@ Target is using an LLVM assembler and/or linker, instead 
of GNU Binutils.
 
 @item lra
 Target supports local register allocator (LRA).
+This must not be called (results in @code{ERROR}) for targets that
+don't do register allocation, and therefore neither use nor don't use
+LRA.
 
 @item lto
 Compiler has been configured to support link-time optimization (LTO).
diff --git a/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c 
b/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c
index 43e597bc59f..234c90e5295 100644
--- a/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c
+++ b/gcc/testsuite/gcc.c-torture/compile/asmgoto-2.c
@@ -1,5 +1,5 @@
 /* This test should be switched off for a new target with less than 4 
allocatable registers */
-/* { dg-do compile { target lra } } */
+/* { dg-do compile { target asm_goto_with_outputs } } */
 int
 foo (void)
 {
diff --git a/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c 
b/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c
index e1574a2903a..af1ba5a7001 100644
--- a/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c
+++ b/gcc/testsuite/gcc.c-torture/compile/asmgoto-5.c
@@ -1,5 +1,5 @@
 /* Test to generate output reload in asm goto on x86_64.  */
-/* { dg-do compile { target lra } } */
+/* { dg-do compile { target asm_goto_with_outputs } } */
 /* { dg-skip-if "no O0" { { i?86-*-* x86_64-*-* } && { ! ia32 } } { "-O0" } { 
"" } } */
 
 #if defined __x86_64__
diff --git a/gcc/testsuite/gcc.c-torture/compile/asmg

[gcc r14-9846] GCN: '--param=gcn-preferred-vectorization-factor=[default, 32, 64]'

2024-04-08 Thread Thomas Schwinge via Gcc-cvs
https://gcc.gnu.org/g:df7625c3af004a81c13d54bb8810e03932eeb59a

commit r14-9846-gdf7625c3af004a81c13d54bb8810e03932eeb59a
Author: Thomas Schwinge 
Date:   Sat Feb 24 00:29:14 2024 +0100

GCN: '--param=gcn-preferred-vectorization-factor=[default,32,64]'

..., and specify '--param=gcn-preferred-vectorization-factor=64' for
'gcc.target/gcn/[...]' test cases with 'scan-assembler' directives that
are specific to 64-lane vectors.  This resolves regressions introduced
in commit 6dedafe166cc02ae87b6a0699ad61ce3ffc46803
"amdgcn: Prefer V32 on RDNA devices".

gcc/
* config/gcn/gcn.opt (--param=gcn-preferred-vectorization-factor):
New.
* config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode) Use it.
* doc/invoke.texi (Optimize Options): Document it.
gcc/testsuite/
* gcc.target/gcn/cond_fmaxnm_1.c: Specify
'--param=gcn-preferred-vectorization-factor=64'.
* gcc.target/gcn/cond_fmaxnm_2.c: Likewise.
* gcc.target/gcn/cond_fmaxnm_3.c: Likewise.
* gcc.target/gcn/cond_fmaxnm_4.c: Likewise.
* gcc.target/gcn/cond_fmaxnm_5.c: Likewise.
* gcc.target/gcn/cond_fmaxnm_6.c: Likewise.
* gcc.target/gcn/cond_fmaxnm_7.c: Likewise.
* gcc.target/gcn/cond_fmaxnm_8.c: Likewise.
* gcc.target/gcn/cond_fminnm_1.c: Likewise.
* gcc.target/gcn/cond_fminnm_2.c: Likewise.
* gcc.target/gcn/cond_fminnm_3.c: Likewise.
* gcc.target/gcn/cond_fminnm_4.c: Likewise.
* gcc.target/gcn/cond_fminnm_5.c: Likewise.
* gcc.target/gcn/cond_fminnm_6.c: Likewise.
* gcc.target/gcn/cond_fminnm_7.c: Likewise.
* gcc.target/gcn/cond_fminnm_8.c: Likewise.
* gcc.target/gcn/cond_shift_3.c: Likewise.
* gcc.target/gcn/cond_shift_4.c: Likewise.
* gcc.target/gcn/cond_shift_8.c: Likewise.
* gcc.target/gcn/cond_shift_9.c: Likewise.
* gcc.target/gcn/cond_smax_1.c: Likewise.
* gcc.target/gcn/cond_smin_1.c: Likewise.
* gcc.target/gcn/cond_umax_1.c: Likewise.
* gcc.target/gcn/cond_umin_1.c: Likewise.
* gcc.target/gcn/simd-math-1.c: Likewise.
* gcc.target/gcn/simd-math-5-char.c: Likewise.
* gcc.target/gcn/simd-math-5-long.c: Likewise.
* gcc.target/gcn/simd-math-5-short.c: Likewise.
* gcc.target/gcn/simd-math-5.c: Likewise.
* gcc.target/gcn/smax_1.c: Likewise.
* gcc.target/gcn/smin_1.c: Likewise.
* gcc.target/gcn/umax_1.c: Likewise.
* gcc.target/gcn/umin_1.c: Likewise.

Diff:
---
 gcc/config/gcn/gcn.cc| 14 +-
 gcc/config/gcn/gcn.opt   | 16 
 gcc/doc/invoke.texi  |  8 
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_1.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_2.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_3.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_4.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_5.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_6.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_7.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fmaxnm_8.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_1.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_2.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_3.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_4.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_5.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_6.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_7.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_fminnm_8.c |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_shift_3.c  |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_shift_4.c  |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_shift_8.c  |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_shift_9.c  |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_smax_1.c   |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_smin_1.c   |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_umax_1.c   |  2 ++
 gcc/testsuite/gcc.target/gcn/cond_umin_1.c   |  2 ++
 gcc/testsuite/gcc.target/gcn/simd-math-1.c   |  3 ++-
 gcc/testsuite/gcc.target/gcn/simd-math-5-char.c  |  3 +++
 gcc/testsuite/gcc.target/gcn/simd-math-5-long.c  |  3 +++
 gcc/testsuite/gcc.target/gcn/simd-math-5-short.c |  3 +++
 gcc/testsuite/gcc.target/gcn/simd-math-5.c   |  3 +++
 gcc/testsuite/gcc.target/gcn/smax_1.c|  2 ++
 gcc/testsuite/gcc.target/gcn/smin_1.c|  2 ++
 gcc/testsuite/gcc.target/gcn/umax_1.c|  2 ++
 gcc/testsuite/gcc.target/gcn/umin_1.c|  2 ++
 36 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 700e

[gcc r14-9844] GCN, nvptx: Errors during device probing are fatal

2024-04-08 Thread Thomas Schwinge via Gcc-cvs
https://gcc.gnu.org/g:a02d7f0edc47495ffe456af7ab7718896e0a0c25

commit r14-9844-ga02d7f0edc47495ffe456af7ab7718896e0a0c25
Author: Thomas Schwinge 
Date:   Thu Mar 7 14:42:07 2024 +0100

GCN, nvptx: Errors during device probing are fatal

Currently, we silently disable libgomp GCN and nvptx plugins/devices in
presence of certain error conditions during device probing, thus typically
silently resorting to host-fallback execution.  Make such errors fatal, 
similar
as for any other device access later on, so that we early and reliably 
notice
when things go wrong.  (Keep just two cases non-fatal: (a) libgomp GCN or 
nvptx
plugins are available but 'libhsa-runtime64.so.1' or 'libcuda.so.1' are not,
and (b) those are available, but the corresponding devices are not.)

This resolves the issue that we've got execution test cases unexpectedly
PASSing, despite:

libgomp: GCN fatal error: Run-time could not be initialized
Runtime message: HSA_STATUS_ERROR_OUT_OF_RESOURCES: The runtime failed 
to allocate the necessary resources. This error may also occur when the core 
runtime library needs to spawn threads or create internal OS-specific events.

..., and therefore they were not offloaded to the GCN device, but ran in
host-fallback execution mode.  What happend in that scenario is that in
'init_hsa_context' during the initial 'GOMP_OFFLOAD_get_num_devices' we ran
into 'HSA_STATUS_ERROR_OUT_OF_RESOURCES', but it wasn't fatal, but just
silently disabled the libgomp plugin/device.

Especially "entertaining" were cases where such unintended host-fallback
execution happened during effective-target checks like
'offload_device_available' (host-fallback execution there meaning: no 
offload
device available), but actual test cases then were running with an offload
device available, and therefore mis-configured.

include/
* cuda/cuda.h (CUresult): Add 'CUDA_ERROR_NO_DEVICE'.
libgomp/
* plugin/plugin-gcn.c (init_hsa_context): Add and handle
'bool probe' parameter.  Adjust all users; errors during device
probing are fatal.
* plugin/plugin-nvptx.c (nvptx_get_num_devices): Aside from
'CUDA_ERROR_NO_DEVICE', errors during device probing are fatal.

Diff:
---
 include/cuda/cuda.h   |  1 +
 libgomp/plugin/plugin-gcn.c   | 14 --
 libgomp/plugin/plugin-nvptx.c |  4 +++-
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/cuda/cuda.h b/include/cuda/cuda.h
index 114aba4e074..0dca4b3a5c0 100644
--- a/include/cuda/cuda.h
+++ b/include/cuda/cuda.h
@@ -57,6 +57,7 @@ typedef enum {
   CUDA_ERROR_OUT_OF_MEMORY = 2,
   CUDA_ERROR_NOT_INITIALIZED = 3,
   CUDA_ERROR_DEINITIALIZED = 4,
+  CUDA_ERROR_NO_DEVICE = 100,
   CUDA_ERROR_INVALID_CONTEXT = 201,
   CUDA_ERROR_INVALID_HANDLE = 400,
   CUDA_ERROR_NOT_FOUND = 500,
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 1d183b61ca4..27947801ccd 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -1513,10 +1513,12 @@ assign_agent_ids (hsa_agent_t agent, void *data)
 }
 
 /* Initialize hsa_context if it has not already been done.
-   Return TRUE on success.  */
+   If !PROBE: returns TRUE on success.
+   If PROBE: returns TRUE on success or if the plugin/device shall be silently
+   ignored, and otherwise emits an error and returns FALSE.  */
 
 static bool
-init_hsa_context (void)
+init_hsa_context (bool probe)
 {
   hsa_status_t status;
   int agent_index = 0;
@@ -1531,7 +1533,7 @@ init_hsa_context (void)
GOMP_PLUGIN_fatal ("%s\n", msg);
   else
GCN_WARNING ("%s\n", msg);
-  return false;
+  return probe ? true : false;
 }
   status = hsa_fns.hsa_init_fn ();
   if (status != HSA_STATUS_SUCCESS)
@@ -3337,8 +3339,8 @@ GOMP_OFFLOAD_version (void)
 int
 GOMP_OFFLOAD_get_num_devices (unsigned int omp_requires_mask)
 {
-  if (!init_hsa_context ())
-return 0;
+  if (!init_hsa_context (true))
+exit (EXIT_FAILURE);
   /* Return -1 if no omp_requires_mask cannot be fulfilled but
  devices were present.  */
   if (hsa_context.agent_count > 0
@@ -3355,7 +3357,7 @@ GOMP_OFFLOAD_get_num_devices (unsigned int 
omp_requires_mask)
 bool
 GOMP_OFFLOAD_init_device (int n)
 {
-  if (!init_hsa_context ())
+  if (!init_hsa_context (false))
 return false;
   if (n >= hsa_context.agent_count)
 {
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index ced6e014ece..5aad3448a8d 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -604,12 +604,14 @@ nvptx_get_num_devices (void)
   CUresult r = CUDA_CALL_NOCHECK (cuInit, 0);
   /* This is not an error: e.g. we may have CUDA libraries installed but
  no devices available.  */
-  if (r != CUDA_SUCCESS)
+  if (r == CUDA_ERROR_NO_DEVICE)
{
   

[gcc(refs/users/meissner/heads/work164)] Add -mcpu=power11 support.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:029084c4dd049bf78695962f346b95f7ee6f9629

commit 029084c4dd049bf78695962f346b95f7ee6f9629
Author: Michael Meissner 
Date:   Mon Apr 8 16:11:17 2024 -0400

Add -mcpu=power11 support.

This patch adds the power11 option to the -mcpu= and -mtune= switches.

This patch treats the power11 like a power10 in terms of costs and 
reassociation
width.

This patch issues a ".machine power11" to the assembly file if you use
-mcpu=power11.

This patch defines _ARCH_PWR11 if the user uses -mcpu=power11.

This patch allows GCC to be configured with the --with-cpu=power11 and
--with-tune=power11 options.

This patch passes -mpwr11 to the assembler if the user uses -mcpu=power11.

This patch adds support for using "power11" in the __builtin_cpu_is built-in
function.

2024-04-08  Michael Meissner  

gcc/

* config.gcc (rs6000*-*-*, powerpc*-*-*): Add support for power11.
* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for 
-mcpu=power11.
* config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/driver-rs6000.cc (asm_names): Likewise.
* config/rs6000/ppc-auxv.h (PPC_PLATFORM_POWER11): New define.
* config/rs6000/rs6000-builtin.cc (cpu_is_info): Add power11.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
_ARCH_PWR11 if -mcpu=power11.
* config/rs6000/rs6000-cpus.def (ISA_POWER11_MASKS_SERVER): New 
define.
(POWERPC_MASKS): Add power11 isa bit.
(power11 cpu): Add power11 definition.
* config/rs6000/rs6000-opts.h (PROCESSOR_POWER11): Add power11 
processor.
* config/rs6000/rs6000-string.cc (expand_compare_loop): Likewise.
* config/rs6000/rs6000-tables.opt: Regenerate.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Add 
power11
support.
(rs6000_machine_from_flags): Likewise.
(rs6000_reassociation_width): Likewise.
(rs6000_adjust_cost): Likewise.
(rs6000_issue_rate): Likewise.
(rs6000_sched_reorder): Likewise.
(rs6000_sched_reorder2): Likewise.
(rs6000_register_move_cost): Likewise.
(rs6000_opt_masks): Likewise.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/rs6000.md (cpu attribute): Add power11.
* config/rs6000/rs6000.opt (-mpower11): Add internal power11 ISA 
flag.
* doc/invoke.texi (RS/6000 and PowerPC Options): Document 
-mcpu=power11.

Diff:
---
 gcc/config.gcc  |  6 --
 gcc/config/rs6000/aix71.h   |  1 +
 gcc/config/rs6000/aix72.h   |  1 +
 gcc/config/rs6000/aix73.h   |  1 +
 gcc/config/rs6000/driver-rs6000.cc  |  2 ++
 gcc/config/rs6000/ppc-auxv.h|  3 +--
 gcc/config/rs6000/rs6000-builtin.cc |  1 +
 gcc/config/rs6000/rs6000-c.cc   |  2 ++
 gcc/config/rs6000/rs6000-cpus.def   |  5 +
 gcc/config/rs6000/rs6000-opts.h |  3 ++-
 gcc/config/rs6000/rs6000-string.cc  |  1 +
 gcc/config/rs6000/rs6000-tables.opt |  3 +++
 gcc/config/rs6000/rs6000.cc | 32 
 gcc/config/rs6000/rs6000.h  |  1 +
 gcc/config/rs6000/rs6000.md |  2 +-
 gcc/config/rs6000/rs6000.opt|  3 +++
 gcc/doc/invoke.texi |  5 +++--
 17 files changed, 56 insertions(+), 16 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 2e320dd26c5..d9e4bef503a 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -531,7 +531,9 @@ powerpc*-*-*)
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h 
si2vmx.h"
extra_headers="${extra_headers} amo.h"
case x$with_cpu in
-   
xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower10|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500)
+   xpowerpc64 | xdefault64 | x6[23]0 | x970 | xG5 | xpower[3456789] \
+   | xpower1[01] | xpower6x | xrs64a | xcell | xa2 | xe500mc64 \
+   | xe5500 | xe6500)
cpu_is_64bit=yes
;;
esac
@@ -5591,7 +5593,7 @@ case "${target}" in
eval "with_$which=405"
;;
"" | common | native \
-   | power[3456789] | power10 | power5+ | power6x \
+   | power[3456789] | power1[01] | power5+ | power6x \
| powerpc | powerpc64 | powerpc64le \
| rs64 \
| 401 | 403 | 405 | 405fp | 440 | 440fp | 464 | 464fp \
diff --git a/gcc/config/rs6000/aix71.h b/gcc/config/rs6000/aix71.h
index 24bc301e37d..41037b3852d 100644
--- a/gcc/config/rs6000/aix71.h
+++ b/gcc/config/rs6000/aix71.h
@@ -79,6 +79,7 @@ do {  

[gcc(refs/users/meissner/heads/work164)] Add -mcpu=power11 tests.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:e37b896535e92c2d19c38f2dabc129172998453c

commit e37b896535e92c2d19c38f2dabc129172998453c
Author: Michael Meissner 
Date:   Mon Apr 8 16:12:44 2024 -0400

Add -mcpu=power11 tests.

This patch adds some simple tests for -mcpu=power11 support.  In order to 
run
these tests, you need an assembler that supports the appropriate option for
supporting the Power11 processor (-mpower11 under Linux or -mpwr11 under 
AIX).

2024-04-08  Michael Meissner  

gcc/testsuite/

* gcc.target/powerpc/power11-1.c: New test.
* gcc.target/powerpc/power11-2.c: Likewise.
* gcc.target/powerpc/power11-3.c: Likewise.
* lib/target-supports.exp (check_effective_target_power11_ok): Add 
new
effective target.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/power11-1.c | 13 +
 gcc/testsuite/gcc.target/powerpc/power11-2.c | 20 
 gcc/testsuite/gcc.target/powerpc/power11-3.c | 10 ++
 gcc/testsuite/lib/target-supports.exp| 17 +
 4 files changed, 60 insertions(+)

diff --git a/gcc/testsuite/gcc.target/powerpc/power11-1.c 
b/gcc/testsuite/gcc.target/powerpc/power11-1.c
new file mode 100644
index 000..6a2e802eedf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/power11-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target power11_ok } */
+/* { dg-options "-mdejagnu-cpu=power11 -O2" } */
+
+/* Basic check to see if the compiler supports -mcpu=power11.  */
+
+#ifndef _ARCH_PWR11
+#error "-mcpu=power11 is not supported"
+#endif
+
+void foo (void)
+{
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/power11-2.c 
b/gcc/testsuite/gcc.target/powerpc/power11-2.c
new file mode 100644
index 000..7b9904c1d29
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/power11-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target power11_ok } */
+/* { dg-options "-O2" } */
+
+/* Check if we can set the power11 target via a target attribute.  */
+
+__attribute__((__target__("cpu=power9")))
+void foo_p9 (void)
+{
+}
+
+__attribute__((__target__("cpu=power10")))
+void foo_p10 (void)
+{
+}
+
+__attribute__((__target__("cpu=power11")))
+void foo_p11 (void)
+{
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/power11-3.c 
b/gcc/testsuite/gcc.target/powerpc/power11-3.c
new file mode 100644
index 000..9b2d643cc0f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/power11-3.c
@@ -0,0 +1,10 @@
+/* { dg-do compile { target powerpc*-*-* } }  */
+/* { dg-require-effective-target power11_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2" }  */
+
+/* Check if we can set the power11 target via a target_clones attribute.  */
+
+__attribute__((__target_clones__("cpu=power11,cpu=power9,default")))
+void foo (void)
+{
+}
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 45435586de2..5784cf7b5dd 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7109,6 +7109,23 @@ proc check_effective_target_power10_ok { } {
 }
 }
 
+# Return 1 if this is a PowerPC target supporting -mcpu=power11.
+
+proc check_effective_target_power11_ok { } {
+if { ([istarget powerpc*-*-*]) } {
+   return [check_no_compiler_messages power11_ok object {
+   int main (void) {
+   #ifndef _ARCH_PWR11
+   #error "-mcpu=power11 is not supported"
+   #endif
+   return 0;
+   }
+   } "-mcpu=power11"]
+} else {
+   return 0
+}
+}
+
 # Return 1 if this is a PowerPC target supporting -mfloat128 via either
 # software emulation on power7/power8 systems or hardware support on power9.


[gcc(refs/users/meissner/heads/work164)] Add -mcpu=power11 tuning support.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:57ae31dbc4f42831ac381573163a4039e02942c5

commit 57ae31dbc4f42831ac381573163a4039e02942c5
Author: Michael Meissner 
Date:   Mon Apr 8 16:12:11 2024 -0400

Add -mcpu=power11 tuning support.

This patch makes -mtune=power11 use the same tuning decisions as 
-mtune=power10.

2024-04-08  Michael Meissner  

gcc/

* config/rs6000/power10.md (all reservations): Add power11 as an
alternative to power10.

Diff:
---
 gcc/config/rs6000/power10.md | 144 +--
 1 file changed, 72 insertions(+), 72 deletions(-)

diff --git a/gcc/config/rs6000/power10.md b/gcc/config/rs6000/power10.md
index fcc2199ab29..90312643858 100644
--- a/gcc/config/rs6000/power10.md
+++ b/gcc/config/rs6000/power10.md
@@ -1,4 +1,4 @@
-;; Scheduling description for the IBM POWER10 processor.
+;; Scheduling description for the IBM POWER10 and POWER11 processors.
 ;; Copyright (C) 2020-2024 Free Software Foundation, Inc.
 ;;
 ;; Contributed by Pat Haugen (pthau...@us.ibm.com).
@@ -97,12 +97,12 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-fused-load" 4
   (and (eq_attr "type" "fused_load_cmpi,fused_addis_load,fused_load_load")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-load" 4
@@ -110,13 +110,13 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-load-update" 4
   (and (eq_attr "type" "load")
(eq_attr "update" "yes")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-fpload-double" 4
@@ -124,7 +124,7 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-fpload-double" 4
@@ -132,14 +132,14 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-double" 4
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "64")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; SFmode loads are cracked and have additional 3 cycles over DFmode
@@ -148,27 +148,27 @@
   (and (eq_attr "type" "fpload")
(eq_attr "update" "no")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-single" 7
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-vecload" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_any_power10,LU_power10")
 
 ; lxvp
 (define_insn_reservation "power10-vecload-pair" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "256")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; Store Unit
@@ -178,12 +178,12 @@
(eq_attr "prefixed" "no")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_any_power10,STU_power10")
 
 (define_insn_reservation "power10-fused-store" 0
   (and (eq_attr "type" "fused_store_store")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,STU_power10")
 
 (define_insn_reservation "power10-prefixed-store" 0
@@ -191,52 +191,52 @@
(eq_attr "prefixed" "yes")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_even_power10,STU_power10")
 
 ; Update forms have 2 cycle latency for updated addr reg
 (define_insn_reservation "power10-store-update" 2
   (and (eq_attr "type" "store,fpstore")
(eq_attr "update" "yes")
-   (eq_attr "cpu" "power10"))
+   (eq_attr "cpu" "power10,power11"))
   "DU_any_power10,STU_powe

[gcc(refs/users/meissner/heads/work164)] Add -mcpu=future support.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:d480207ec78d59dbf6af1501cb8aaac605fe109c

commit d480207ec78d59dbf6af1501cb8aaac605fe109c
Author: Michael Meissner 
Date:   Mon Apr 8 16:15:07 2024 -0400

Add -mcpu=future support.

This patch adds the future option to the -mcpu= and -mtune= switches.

This patch treats the future like a power11 in terms of costs and 
reassociation
width.

This patch issues a ".machine future" to the assembly file if you use
-mcpu=power11.

This patch defines _ARCH_PWR_FUTURE if the user uses -mcpu=future.

This patch allows GCC to be configured with the --with-cpu=future and
--with-tune=future options.

This patch passes -mfuture to the assembler if the user uses -mcpu=future.

2024-04-08  Michael Meissner  

gcc/

* config.gcc (rs6000*-*-*, powerpc*-*-*): Add support for power11.
* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for 
-mcpu=power11.
* config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/driver-rs6000.cc (asm_names): Likewise.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
_ARCH_PWR_FUTURE if -mcpu=future.
* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): New 
define.
(POWERPC_MASKS): Add future isa bit.
(power11 cpu): Add future definition.
* config/rs6000/rs6000-opts.h (PROCESSOR_FUTURE): Add future 
processor.
* config/rs6000/rs6000-string.cc (expand_compare_loop): Likewise.
* config/rs6000/rs6000-tables.opt: Regenerate.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Add 
future
support.
(rs6000_machine_from_flags): Likewise.
(rs6000_reassociation_width): Likewise.
(rs6000_adjust_cost): Likewise.
(rs6000_issue_rate): Likewise.
(rs6000_sched_reorder): Likewise.
(rs6000_sched_reorder2): Likewise.
(rs6000_register_move_cost): Likewise.
(rs6000_opt_masks): Likewise.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/rs6000.md (cpu attribute): Add future.
* config/rs6000/rs6000.opt (-mpower11): Add internal future ISA 
flag.
* doc/invoke.texi (RS/6000 and PowerPC Options): Document 
-mcpu=future.

Diff:
---
 gcc/config.gcc  |  4 ++--
 gcc/config/rs6000/aix71.h   |  1 +
 gcc/config/rs6000/aix72.h   |  1 +
 gcc/config/rs6000/aix73.h   |  1 +
 gcc/config/rs6000/driver-rs6000.cc  |  2 ++
 gcc/config/rs6000/rs6000-c.cc   |  2 ++
 gcc/config/rs6000/rs6000-cpus.def   |  5 +
 gcc/config/rs6000/rs6000-opts.h |  3 ++-
 gcc/config/rs6000/rs6000-string.cc  |  1 +
 gcc/config/rs6000/rs6000-tables.opt |  3 +++
 gcc/config/rs6000/rs6000.cc | 30 ++
 gcc/config/rs6000/rs6000.h  |  1 +
 gcc/config/rs6000/rs6000.md |  2 +-
 gcc/config/rs6000/rs6000.opt|  3 +++
 gcc/doc/invoke.texi |  2 +-
 15 files changed, 48 insertions(+), 13 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d9e4bef503a..225d21c394d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -532,7 +532,7 @@ powerpc*-*-*)
extra_headers="${extra_headers} amo.h"
case x$with_cpu in
xpowerpc64 | xdefault64 | x6[23]0 | x970 | xG5 | xpower[3456789] \
-   | xpower1[01] | xpower6x | xrs64a | xcell | xa2 | xe500mc64 \
+   | xpower1[01] | xfuture | xpower6x | xrs64a | xcell | xa2 | 
xe500mc64 \
| xe5500 | xe6500)
cpu_is_64bit=yes
;;
@@ -5593,7 +5593,7 @@ case "${target}" in
eval "with_$which=405"
;;
"" | common | native \
-   | power[3456789] | power1[01] | power5+ | power6x \
+   | power[3456789] | power1[01] | power5+ | power6x | 
future \
| powerpc | powerpc64 | powerpc64le \
| rs64 \
| 401 | 403 | 405 | 405fp | 440 | 440fp | 464 | 464fp \
diff --git a/gcc/config/rs6000/aix71.h b/gcc/config/rs6000/aix71.h
index 41037b3852d..570ddcc451d 100644
--- a/gcc/config/rs6000/aix71.h
+++ b/gcc/config/rs6000/aix71.h
@@ -79,6 +79,7 @@ do {  
\
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
   mcpu=power9: -mpwr9; \
diff --git a/gcc/config/rs6000/aix72.h b/gcc/config/rs6000/aix72.h
index fe59f8319b4..242ca94bd06 100644
--- a/gcc/config/rs6000/aix72.h
+++ b/gcc/config/rs6000/aix72.h
@@ -79,6 +79,7 @@ do {   

[gcc(refs/users/meissner/heads/work164)] Add -mcpu=future tuning support.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:db22ab06475b3c5ae8b4cb0c516df3b1321501a4

commit db22ab06475b3c5ae8b4cb0c516df3b1321501a4
Author: Michael Meissner 
Date:   Mon Apr 8 16:15:51 2024 -0400

Add -mcpu=future tuning support.

This patch makes -mtune=future use the same tuning decision as 
-mtune=power11.

2024-04-08  Michael Meissner  

gcc/

* config/rs6000/power10.md (all reservations): Add future as an
alterntive to power10 and power11.

Diff:
---
 gcc/config/rs6000/power10.md | 145 ++-
 1 file changed, 73 insertions(+), 72 deletions(-)

diff --git a/gcc/config/rs6000/power10.md b/gcc/config/rs6000/power10.md
index 90312643858..1ec1bef0726 100644
--- a/gcc/config/rs6000/power10.md
+++ b/gcc/config/rs6000/power10.md
@@ -1,4 +1,5 @@
-;; Scheduling description for the IBM POWER10 and POWER11 processors.
+;; Scheduling description for the IBM POWER10 and POWER11 processors as well as
+;; potential future processors.
 ;; Copyright (C) 2020-2024 Free Software Foundation, Inc.
 ;;
 ;; Contributed by Pat Haugen (pthau...@us.ibm.com).
@@ -97,12 +98,12 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-fused-load" 4
   (and (eq_attr "type" "fused_load_cmpi,fused_addis_load,fused_load_load")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-load" 4
@@ -110,13 +111,13 @@
(eq_attr "update" "no")
(eq_attr "size" "!128")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-load-update" 4
   (and (eq_attr "type" "load")
(eq_attr "update" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-fpload-double" 4
@@ -124,7 +125,7 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "no")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 (define_insn_reservation "power10-prefixed-fpload-double" 4
@@ -132,14 +133,14 @@
(eq_attr "update" "no")
(eq_attr "size" "64")
(eq_attr "prefixed" "yes")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-double" 4
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "64")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; SFmode loads are cracked and have additional 3 cycles over DFmode
@@ -148,27 +149,27 @@
   (and (eq_attr "type" "fpload")
(eq_attr "update" "no")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10")
 
 (define_insn_reservation "power10-fpload-update-single" 7
   (and (eq_attr "type" "fpload")
(eq_attr "update" "yes")
(eq_attr "size" "32")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 (define_insn_reservation "power10-vecload" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,LU_power10")
 
 ; lxvp
 (define_insn_reservation "power10-vecload-pair" 4
   (and (eq_attr "type" "vecload")
(eq_attr "size" "256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,LU_power10+SXU_power10")
 
 ; Store Unit
@@ -178,12 +179,12 @@
(eq_attr "prefixed" "no")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_any_power10,STU_power10")
 
 (define_insn_reservation "power10-fused-store" 0
   (and (eq_attr "type" "fused_store_store")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 (define_insn_reservation "power10-prefixed-store" 0
@@ -191,52 +192,52 @@
(eq_attr "prefixed" "yes")
(eq_attr "size" "!128")
(eq_attr "size" "!256")
-   (eq_attr "cpu" "power10,power11"))
+   (eq_attr "cpu" "power10,power11,future"))
   "DU_even_power10,STU_power10")
 
 ; Update forms

[gcc(refs/users/meissner/heads/work164)] Update ChangeLog.*

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:ed369045c44e2e212cfa09ad4efbd567007a5089

commit ed369045c44e2e212cfa09ad4efbd567007a5089
Author: Michael Meissner 
Date:   Mon Apr 8 16:18:42 2024 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.meissner | 185 -
 1 file changed, 184 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index de8f5b59651..a2f14198bdb 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,6 +1,189 @@
+ Branch work164, patch #11 
+
+Add -mcpu=future tuning support.
+
+This patch makes -mtune=future use the same tuning decision as -mtune=power11.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/power10.md (all reservations): Add future as an
+   alterntive to power10 and power11.
+
+ Branch work164, patch #10 
+
+Add -mcpu=future support.
+
+This patch adds the future option to the -mcpu= and -mtune= switches.
+
+This patch treats the future like a power11 in terms of costs and reassociation
+width.
+
+This patch issues a ".machine future" to the assembly file if you use
+-mcpu=power11.
+
+This patch defines _ARCH_PWR_FUTURE if the user uses -mcpu=future.
+
+This patch allows GCC to be configured with the --with-cpu=future and
+--with-tune=future options.
+
+This patch passes -mfuture to the assembler if the user uses -mcpu=future.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config.gcc (rs6000*-*-*, powerpc*-*-*): Add support for power11.
+   * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=power11.
+   * config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/driver-rs6000.cc (asm_names): Likewise.
+   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
+   _ARCH_PWR_FUTURE if -mcpu=future.
+   * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): New define.
+   (POWERPC_MASKS): Add future isa bit.
+   (power11 cpu): Add future definition.
+   * config/rs6000/rs6000-opts.h (PROCESSOR_FUTURE): Add future processor.
+   * config/rs6000/rs6000-string.cc (expand_compare_loop): Likewise.
+   * config/rs6000/rs6000-tables.opt: Regenerate.
+   * config/rs6000/rs6000.cc (rs6000_option_override_internal): Add future
+   support.
+   (rs6000_machine_from_flags): Likewise.
+   (rs6000_reassociation_width): Likewise.
+   (rs6000_adjust_cost): Likewise.
+   (rs6000_issue_rate): Likewise.
+   (rs6000_sched_reorder): Likewise.
+   (rs6000_sched_reorder2): Likewise.
+   (rs6000_register_move_cost): Likewise.
+   (rs6000_opt_masks): Likewise.
+   * config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/rs6000.md (cpu attribute): Add future.
+   * config/rs6000/rs6000.opt (-mpower11): Add internal future ISA flag.
+   * doc/invoke.texi (RS/6000 and PowerPC Options): Document -mcpu=future.
+
+ Branch work164, patch #3 
+
+Add -mcpu=power11 tests.
+
+This patch adds some simple tests for -mcpu=power11 support.  In order to run
+these tests, you need an assembler that supports the appropriate option for
+supporting the Power11 processor (-mpower11 under Linux or -mpwr11 under AIX).
+
+2024-04-08  Michael Meissner  
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/power11-1.c: New test.
+   * gcc.target/powerpc/power11-2.c: Likewise.
+   * gcc.target/powerpc/power11-3.c: Likewise.
+   * lib/target-supports.exp (check_effective_target_power11_ok): Add new
+   effective target.
+
+ Branch work164, patch #2 
+
+Add -mcpu=power11 tuning support.
+
+This patch makes -mtune=power11 use the same tuning decisions as 
-mtune=power10.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/power10.md (all reservations): Add power11 as an
+   alternative to power10.
+
+ Branch work164, patch #1 
+
+Add -mcpu=power11 support.
+
+This patch adds the power11 option to the -mcpu= and -mtune= switches.
+
+This patch treats the power11 like a power10 in terms of costs and 
reassociation
+width.
+
+This patch issues a ".machine power11" to the assembly file if you use
+-mcpu=power11.
+
+This patch defines _ARCH_PWR11 if the user uses -mcpu=power11.
+
+This patch allows GCC to be configured with the --with-cpu=power11 and
+--with-tune=power11 options.
+
+This patch passes -mpwr11 to the assembler if the user uses -mcpu=power11.
+
+This patch adds support for using "power11" in the __builtin_cpu_is built-in
+function.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config.gcc (rs6000*-*-*, powerpc*-*-*): Add support for power11.
+   * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=power11.
+   * config/rs6000/aix72.h (ASM_CPU_SPEC): Likew

[gcc/meissner/heads/work164-bugs] (8 commits) Merge commit 'refs/users/meissner/heads/work164-bugs' of gi

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164-bugs' was updated to point to:

 f71af54db37... Merge commit 'refs/users/meissner/heads/work164-bugs' of gi

It previously pointed to:

 727700c5eaf... Add ChangeLog.bugs and update REVISION.

Diff:

Summary of changes (added commits):
---

  f71af54... Merge commit 'refs/users/meissner/heads/work164-bugs' of gi
  9aaed23... Add ChangeLog.bugs and update REVISION.
  ed36904... Update ChangeLog.* (*)
  db22ab0... Add -mcpu=future tuning support. (*)
  d480207... Add -mcpu=future support. (*)
  e37b896... Add -mcpu=power11 tests. (*)
  57ae31d... Add -mcpu=power11 tuning support. (*)
  029084c... Add -mcpu=power11 support. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work164-bugs' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work164-bugs)] Add ChangeLog.bugs and update REVISION.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:9aaed23449422c3fd26a6b62269db5ffdd77e697

commit 9aaed23449422c3fd26a6b62269db5ffdd77e697
Author: Michael Meissner 
Date:   Mon Apr 8 15:41:00 2024 -0400

Add ChangeLog.bugs and update REVISION.

2024-04-08  Michael Meissner  

gcc/

* ChangeLog.bugs: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.bugs | 6 ++
 gcc/REVISION   | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
new file mode 100644
index 000..4291f519935
--- /dev/null
+++ b/gcc/ChangeLog.bugs
@@ -0,0 +1,6 @@
+ Branch work164-bugs, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/REVISION b/gcc/REVISION
index 28f9ab58eb7..0ce156df690 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work164 branch
+work164-bugs branch


[gcc(refs/users/meissner/heads/work164-bugs)] Merge commit 'refs/users/meissner/heads/work164-bugs' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:f71af54db37427b76a5e98645f60a66a3222fe91

commit f71af54db37427b76a5e98645f60a66a3222fe91
Merge: 9aaed234494 727700c5eaf
Author: Michael Meissner 
Date:   Mon Apr 8 16:19:53 2024 -0400

Merge commit 'refs/users/meissner/heads/work164-bugs' of 
git+ssh://gcc.gnu.org/git/gcc into me/work164-bugs

Diff:


[gcc r14-9847] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-04-08 Thread Uros Bizjak via Gcc-cvs
https://gcc.gnu.org/g:eaccdba315b86d374a4e72b9dd8fefb0fc3cc5ee

commit r14-9847-geaccdba315b86d374a4e72b9dd8fefb0fc3cc5ee
Author: Uros Bizjak 
Date:   Mon Apr 8 20:54:30 2024 +0200

combine: Fix ICE in try_combine on pr112494.c [PR112560]

The compiler, configured with --enable-checking=yes,rtl,extra ICEs with:

internal compiler error: RTL check: expected elt 0 type 'e' or 'u', have 
'E' (rtx unspec) in try_combine, at combine.cc:3237

This is

3236  /* Just replace the CC reg with a new mode.  */
3237  SUBST (XEXP (*cc_use_loc, 0), newpat_dest);
3238  undobuf.other_insn = cc_use_insn;

in combine.cc, where *cc_use_loc is

(unspec:DI [
(reg:CC 17 flags)
] UNSPEC_PUSHFL)

combine assumes CC must be used inside of a comparison and uses XEXP (..., 
0)
without checking on the RTX type of the argument.

Replace cc_use_loc with the entire new RTX only in case cc_use_loc satisfies
COMPARISON_P predicate.  Otherwise scan the entire cc_use_loc RTX for CC reg
to be updated with a new mode.

PR rtl-optimization/112560

gcc/ChangeLog:

* combine.cc (try_combine): Replace cc_use_loc with the entire
new RTX only in case cc_use_loc satisfies COMPARISON_P predicate.
Otherwise scan the entire cc_use_loc RTX for CC reg to be updated
with a new mode.
* config/i386/i386.md (@pushf2): Allow all CC modes for
operand 1.

Diff:
---
 gcc/combine.cc  | 16 +---
 gcc/config/i386/i386.md |  4 ++--
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 745391016d0..71c9abc145c 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -3222,8 +3222,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
 #endif
  /* Cases for modifying the CC-using comparison.  */
  if (compare_code != orig_compare_code
- /* ??? Do we need to verify the zero rtx?  */
- && XEXP (*cc_use_loc, 1) == const0_rtx)
+ && COMPARISON_P (*cc_use_loc))
{
  /* Replace cc_use_loc with entire new RTX.  */
  SUBST (*cc_use_loc,
@@ -3233,8 +3232,19 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
}
  else if (compare_mode != orig_compare_mode)
{
+ subrtx_ptr_iterator::array_type array;
+
  /* Just replace the CC reg with a new mode.  */
- SUBST (XEXP (*cc_use_loc, 0), newpat_dest);
+ FOR_EACH_SUBRTX_PTR (iter, array, cc_use_loc, NONCONST)
+   {
+ rtx *loc = *iter;
+ if (REG_P (*loc)
+ && REGNO (*loc) == REGNO (newpat_dest))
+   {
+ SUBST (*loc, newpat_dest);
+ iter.skip_subrtxes ();
+   }
+   }
  undobuf.other_insn = cc_use_insn;
}
}
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index bb2c72f3473..10ae3113ae8 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2219,9 +2219,9 @@
 
 (define_insn "@pushfl2"
   [(set (match_operand:W 0 "push_operand" "=<")
-   (unspec:W [(match_operand:CC 1 "flags_reg_operand")]
+   (unspec:W [(match_operand 1 "flags_reg_operand")]
  UNSPEC_PUSHFL))]
-  ""
+  "GET_MODE_CLASS (GET_MODE (operands[1])) == MODE_CC"
   "pushf{}"
   [(set_attr "type" "push")
(set_attr "mode" "")])


[gcc(refs/users/meissner/heads/work164-dmf)] Merge commit 'refs/users/meissner/heads/work164-dmf' of git+ssh://gcc.gnu.org/git/gcc into me/work16

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:f930883e878cb195ec938987fef57421d704ee3b

commit f930883e878cb195ec938987fef57421d704ee3b
Merge: d4dad4336b6 49e37ddbfdf
Author: Michael Meissner 
Date:   Mon Apr 8 16:22:05 2024 -0400

Merge commit 'refs/users/meissner/heads/work164-dmf' of 
git+ssh://gcc.gnu.org/git/gcc into me/work164-dmf

Diff:


[gcc/meissner/heads/work164-dmf] (8 commits) Merge commit 'refs/users/meissner/heads/work164-dmf' of git

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164-dmf' was updated to point to:

 f930883e878... Merge commit 'refs/users/meissner/heads/work164-dmf' of git

It previously pointed to:

 49e37ddbfdf... Add ChangeLog.dmf and update REVISION.

Diff:

Summary of changes (added commits):
---

  f930883... Merge commit 'refs/users/meissner/heads/work164-dmf' of git
  d4dad43... Add ChangeLog.dmf and update REVISION.
  ed36904... Update ChangeLog.* (*)
  db22ab0... Add -mcpu=future tuning support. (*)
  d480207... Add -mcpu=future support. (*)
  e37b896... Add -mcpu=power11 tests. (*)
  57ae31d... Add -mcpu=power11 tuning support. (*)
  029084c... Add -mcpu=power11 support. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work164-dmf' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work164-dmf)] Add ChangeLog.dmf and update REVISION.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:d4dad4336b65a00d06880c52939c77aeb9f538af

commit d4dad4336b65a00d06880c52939c77aeb9f538af
Author: Michael Meissner 
Date:   Mon Apr 8 15:38:53 2024 -0400

Add ChangeLog.dmf and update REVISION.

2024-04-08  Michael Meissner  

gcc/

* ChangeLog.dmf: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.dmf | 6 ++
 gcc/REVISION  | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
new file mode 100644
index 000..dfdbcd4a4b4
--- /dev/null
+++ b/gcc/ChangeLog.dmf
@@ -0,0 +1,6 @@
+ Branch work164-dmf, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/REVISION b/gcc/REVISION
index 28f9ab58eb7..be546585469 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work164 branch
+work164-dmf branch


[gcc/meissner/heads/work164-test] (8 commits) Merge commit 'refs/users/meissner/heads/work164-test' of gi

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164-test' was updated to point to:

 fb11e874c9f... Merge commit 'refs/users/meissner/heads/work164-test' of gi

It previously pointed to:

 4194ef6e7ce... Add ChangeLog.test and update REVISION.

Diff:

Summary of changes (added commits):
---

  fb11e87... Merge commit 'refs/users/meissner/heads/work164-test' of gi
  618d7ef... Add ChangeLog.test and update REVISION.
  ed36904... Update ChangeLog.* (*)
  db22ab0... Add -mcpu=future tuning support. (*)
  d480207... Add -mcpu=future support. (*)
  e37b896... Add -mcpu=power11 tests. (*)
  57ae31d... Add -mcpu=power11 tuning support. (*)
  029084c... Add -mcpu=power11 support. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work164-test' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work164-test)] Add ChangeLog.test and update REVISION.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:618d7ef74adc6a6888e6cfba8669653b66f8ac10

commit 618d7ef74adc6a6888e6cfba8669653b66f8ac10
Author: Michael Meissner 
Date:   Mon Apr 8 15:41:59 2024 -0400

Add ChangeLog.test and update REVISION.

2024-04-08  Michael Meissner  

gcc/

* ChangeLog.test: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.test | 6 ++
 gcc/REVISION   | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.test b/gcc/ChangeLog.test
new file mode 100644
index 000..8ee53e57958
--- /dev/null
+++ b/gcc/ChangeLog.test
@@ -0,0 +1,6 @@
+ Branch work164-test, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/REVISION b/gcc/REVISION
index 28f9ab58eb7..d7e751014df 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work164 branch
+work164-test branch


[gcc(refs/users/meissner/heads/work164-test)] Merge commit 'refs/users/meissner/heads/work164-test' of git+ssh://gcc.gnu.org/git/gcc into me/work1

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:fb11e874c9f7a5012b6a2d29edf76ddd4b213606

commit fb11e874c9f7a5012b6a2d29edf76ddd4b213606
Merge: 618d7ef74ad 4194ef6e7ce
Author: Michael Meissner 
Date:   Mon Apr 8 16:25:26 2024 -0400

Merge commit 'refs/users/meissner/heads/work164-test' of 
git+ssh://gcc.gnu.org/git/gcc into me/work164-test

Diff:


[gcc/meissner/heads/work164-vpair] (8 commits) Merge commit 'refs/users/meissner/heads/work164-vpair' of g

2024-04-08 Thread Michael Meissner via Gcc-cvs
The branch 'meissner/heads/work164-vpair' was updated to point to:

 8d380e9ea06... Merge commit 'refs/users/meissner/heads/work164-vpair' of g

It previously pointed to:

 5653a12f14e... Add ChangeLog.vpair and update REVISION.

Diff:

Summary of changes (added commits):
---

  8d380e9... Merge commit 'refs/users/meissner/heads/work164-vpair' of g
  6113768... Add ChangeLog.vpair and update REVISION.
  ed36904... Update ChangeLog.* (*)
  db22ab0... Add -mcpu=future tuning support. (*)
  d480207... Add -mcpu=future support. (*)
  e37b896... Add -mcpu=power11 tests. (*)
  57ae31d... Add -mcpu=power11 tuning support. (*)
  029084c... Add -mcpu=power11 support. (*)

(*) This commit already exists in another branch.
Because the reference `refs/users/meissner/heads/work164-vpair' matches
your hooks.email-new-commits-only configuration,
no separate email is sent for this commit.


[gcc(refs/users/meissner/heads/work164-vpair)] Add ChangeLog.vpair and update REVISION.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:61137681da67b3bfa9f7fe648ded1d13d2ea597a

commit 61137681da67b3bfa9f7fe648ded1d13d2ea597a
Author: Michael Meissner 
Date:   Mon Apr 8 15:39:52 2024 -0400

Add ChangeLog.vpair and update REVISION.

2024-04-08  Michael Meissner  

gcc/

* ChangeLog.vpair: New file for branch.
* REVISION: Update.

Diff:
---
 gcc/ChangeLog.vpair | 6 ++
 gcc/REVISION| 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
new file mode 100644
index 000..c0c0f7095a6
--- /dev/null
+++ b/gcc/ChangeLog.vpair
@@ -0,0 +1,6 @@
+ Branch work164-vpair, baseline 
+
+2024-04-08   Michael Meissner  
+
+   Clone branch
+
diff --git a/gcc/REVISION b/gcc/REVISION
index 28f9ab58eb7..650632a5790 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-work164 branch
+work164-vpair branch


[gcc(refs/users/meissner/heads/work164-vpair)] Merge commit 'refs/users/meissner/heads/work164-vpair' of git+ssh://gcc.gnu.org/git/gcc into me/work

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:8d380e9ea06a66fd55172897dde9b50b808e368e

commit 8d380e9ea06a66fd55172897dde9b50b808e368e
Merge: 61137681da6 5653a12f14e
Author: Michael Meissner 
Date:   Mon Apr 8 16:26:58 2024 -0400

Merge commit 'refs/users/meissner/heads/work164-vpair' of 
git+ssh://gcc.gnu.org/git/gcc into me/work164-vpair

Diff:


[gcc(refs/users/meissner/heads/work164-dmf)] Use vector pair load/store for memcpy with -mcpu=future

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:c575074589818be10e22af065970a7bd3fcbdc04

commit c575074589818be10e22af065970a7bd3fcbdc04
Author: Michael Meissner 
Date:   Mon Apr 8 18:29:13 2024 -0400

Use vector pair load/store for memcpy with -mcpu=future

In the development for the power10 processor, GCC did not enable using the 
load
vector pair and store vector pair instructions when optimizing things like
memory copy.  This patch enables using those instructions if -mcpu=future is
used.

2024-04-08  Michael Meissner  

gcc/

* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Enable 
using
load vector pair and store vector pair instructions for memory copy
operations.
(POWERPC_MASKS): Make the bit for enabling using load vector pair 
and
store vector pair operations set and reset when the PowerPC 
processor is
changed.

Diff:
---
 gcc/config/rs6000/rs6000-cpus.def | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 47365534af8..4ddba142e44 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -90,6 +90,7 @@
  | OPTION_MASK_POWER11)
 
 #define ISA_FUTURE_MASKS_SERVER(ISA_POWER11_MASKS_SERVER   
\
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR\
 | OPTION_MASK_FUTURE)
 
 /* Flags that need to be turned off if -mno-vsx.  */
@@ -121,6 +122,7 @@
 
 /* Mask of all options to set the default isa flags based on -mcpu=.  */
 #define POWERPC_MASKS  (OPTION_MASK_ALTIVEC\
+| OPTION_MASK_BLOCK_OPS_VECTOR_PAIR\
 | OPTION_MASK_CMPB \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DFP  \


[gcc(refs/users/meissner/heads/work164-dmf)] Add wD constraint.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:b00e0644aaeaaca5044af30724f85f46fe132956

commit b00e0644aaeaaca5044af30724f85f46fe132956
Author: Michael Meissner 
Date:   Mon Apr 8 19:05:25 2024 -0400

Add wD constraint.

This patch adds a new constraint ('wD') that matches the accumulator 
registers
that overlap with VSX registers 0..31 on power10.  Future patches will add 
the
support for a separate accumulator register class that will be used when the
support for dense math registes is added.

2024-04-08   Michael Meissner  

* config/rs6000/constraints.md (wD): New constraint.
* config/rs6000/mma.md (mma_): Prepare for alternate 
accumulator
registers.  Use wD constraint instead of 'd' constraint.  Use
accumulator_operand instead of fpr_reg_operand.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0")]
MMA_ACC))]
   "TARGET_MMA"
   " %A0"
@@ -523,7 +523,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_VV))]
@@ -532,8 +532,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_AVV))]
@@ -542,7 +542,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:OO 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_PV))]
@@ -551,8 +551,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:OO 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_APV))]
@@ -561,7 +561,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -574,8 +574,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
(match_operand:SI 4 "const_0_to_15_operand" "n,n")
@@ -588,7 +588,7 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -601,8 +601,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")

[gcc(refs/users/meissner/heads/work164-dmf)] Add support for dense math registers.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:aae104ff2f11641f29cb4e74bbfbed7967ae1761

commit aae104ff2f11641f29cb4e74bbfbed7967ae1761
Author: Michael Meissner 
Date:   Mon Apr 8 19:24:31 2024 -0400

Add support for dense math registers.

The MMA subsystem added the notion of accumulator registers as an optional
feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped 
with
the VSX registers 0..31, but logically the accumulator registers were 
separate
from the FPR registers.  In ISA 3.1, it was anticipated that in future 
systems,
the accumulator registers may no overlap with the FPR registers.  This patch
adds the support for dense math registers as separate registers.

This particular patch does not change the MMA support to use the 
accumulators
within the dense math registers.  This patch just adds the basic support for
having separate DMRs.  The next patch will switch the MMA support to use the
accumulators if -mcpu=future is used.

For testing purposes, I added an undocumented option '-mdense-math' to 
enable
or disable the dense math support.

This patch adds a new constraint (wD).  If MMA is selected but dense math is
not selected (i.e. -mcpu=power10), the wD constraint will allow access to
accumulators that overlap with VSX registers 0..31.  If both MMA and dense 
math
are selected (i.e. -mcpu=future), the wD constraint will only allow dense 
math
registers.

This patch modifies the existing %A output modifier.  If MMA is selected but
dense math is not selected, then %A output modifier converts the VSX 
register
number to the accumulator number, by dividing it by 4.  If both MMA and 
dense
math are selected, then %A will map the separate DMR registers into 0..7.

The intention is that user code using extended asm can be modified to run on
both MMA without dense math and MMA with dense math:

1)  If possible, don't use extended asm, but instead use the MMA 
built-in
functions;

2)  If you do need to write extended asm, change the d constraints
targetting accumulators should now use wD;

3)  Only use the built-in zero, assemble and disassemble functions 
create
move data between vector quad types and dense math accumulators.
I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz directly in the
extended asm code.  The reason is these instructions assume there 
is a
1-to-1 correspondence between 4 adjacent FPR registers and an
accumulator that overlaps with those instructions.  With 
accumulators
now being separate registers, there no longer is a 1-to-1
correspondence.

It is possible that the mangling for DMRs and the GDB register numbers may
produce other changes in the future.

2024-04-08   Michael Meissner  

* config/rs6000/mma.md (movxo): Add comments about dense math 
registers.
(movxo_nodm): Rename from movxo and restrict the usage to machines
without dense math registers.
(movxo_dm): New insn for movxo support for machines with dense math
registers.
(mma_): Restrict usage to machines without dense math 
registers.
(mma_xxsetaccz): Make a define_expand, and add support for dense 
math
registers.
(mma_xxsetaccz_nodm): Rename from mma_xxsetaccz, and restrict to
machines without dense math registers.
(mma_dmsetaccz): New insn.
* config/rs6000/predicates.md (dmr_operand): New predicate.
(accumulator_operand): Add support for dense math registers.
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): 
Do
not de-prime accumulator when disassembling a vector quad.
* config/rs6000/rs6000-c.cc (rs6000_define_or_undefine_macro): 
Define
__DENSE_MATH__ if we have dense math registers.
* config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE.
(enum rs6000_reload_reg_type): Add RELOAD_REG_DMR.
(LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD
constraint.
(reload_reg_map): Likewise.
(rs6000_reg_names): Likewise.
(alt_reg_names): Likewise.
(rs6000_hard_regno_nregs_internal): Likewise.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
(rs6000_debug_reg_global): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_init_hard_regno_mode_ok): Likewise.
(rs6000_secondary_reload_memory): Add support for DMR registers.
(rs6000_secondary_reload_simple_move): Likewise.
(rs6000_preferred_reload_class): Likewise.
(rs6000_secondary_reload_class): Likewise.
(print_operand): Make %A handle both FPRs and D

[gcc(refs/users/meissner/heads/work164-dmf)] Revert all changes

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:b8c2eeecfc7fb41d8fdf00f5200181755038504f

commit b8c2eeecfc7fb41d8fdf00f5200181755038504f
Author: Michael Meissner 
Date:   Mon Apr 8 19:25:56 2024 -0400

Revert all changes

Diff:
---
 gcc/config/rs6000/constraints.md|   3 -
 gcc/config/rs6000/mma.md|  87 +-
 gcc/config/rs6000/predicates.md |  32 -
 gcc/config/rs6000/rs6000-builtin.cc |   5 +-
 gcc/config/rs6000/rs6000-c.cc   |   9 +-
 gcc/config/rs6000/rs6000.cc | 225 +++-
 gcc/config/rs6000/rs6000.h  |  44 +--
 gcc/config/rs6000/rs6000.md |   2 -
 gcc/doc/md.texi |   5 -
 9 files changed, 79 insertions(+), 333 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 277a30a8245..369a7b75042 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -107,9 +107,6 @@
(match_test "TARGET_P8_VECTOR")
(match_operand 0 "s5bit_cint_operand")))
 
-(define_register_constraint "wD" "rs6000_constraints[RS6000_CONSTRAINT_wD]"
-  "Accumulator register.")
-
 (define_constraint "wE"
   "@internal Vector constant that can be loaded with the XXSPLTIB instruction."
   (match_test "xxspltib_constant_nosplit (op, mode)"))
diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 29d2968fe2a..04e2d0066df 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -314,9 +314,7 @@
(set_attr "length" "*,*,8")])
 
 
-;; Vector quad support.  Under the original MMA, XOmode can only live in VSX
-;; registers 0..31.  With dense math, XOmode can live in either VSX registers
-;; (0..63) or DMR registers.
+;; Vector quad support.  XOmode can only live in FPRs.
 (define_expand "movxo"
   [(set (match_operand:XO 0 "nonimmediate_operand")
(match_operand:XO 1 "input_operand"))]
@@ -341,10 +339,10 @@
 gcc_assert (false);
 })
 
-(define_insn_and_split "*movxo_nodm"
+(define_insn_and_split "*movxo"
   [(set (match_operand:XO 0 "nonimmediate_operand" "=d,ZwO,d")
(match_operand:XO 1 "input_operand" "ZwO,d,d"))]
-  "TARGET_MMA_NO_DENSE_MATH
+  "TARGET_MMA
&& (gpc_reg_operand (operands[0], XOmode)
|| gpc_reg_operand (operands[1], XOmode))"
   "@
@@ -361,31 +359,6 @@
(set_attr "length" "*,*,16")
(set_attr "max_prefixed_insns" "2,2,*")])
 
-(define_insn_and_split "*movxo_dm"
-  [(set (match_operand:XO 0 "nonimmediate_operand" "=wa,ZwO,wa,wD,wD,wa")
-   (match_operand:XO 1 "input_operand""ZwO,wa, wa,wa,wD,wD"))]
-  "TARGET_MMA_DENSE_MATH
-   && (gpc_reg_operand (operands[0], XOmode)
-   || gpc_reg_operand (operands[1], XOmode))"
-  "@
-   #
-   #
-   #
-   dmxxinstdmr512 %0,%1,%Y1,0
-   dmmr %0,%1
-   dmxxextfdmr512 %0,%Y0,%1,0"
-  "&& reload_completed
-   && !dmr_operand (operands[0], XOmode)
-   && !dmr_operand (operands[1], XOmode)"
-  [(const_int 0)]
-{
-  rs6000_split_multireg_move (operands[0], operands[1]);
-  DONE;
-}
-  [(set_attr "type" "vecload,vecstore,veclogical,mma,mma,mma")
-   (set_attr "length" "*,*,16,*,*,*")
-   (set_attr "max_prefixed_insns" "2,2,*,*,*,*")])
-
 (define_expand "vsx_assemble_pair"
   [(match_operand:OO 0 "vsx_register_operand")
(match_operand:V16QI 1 "mma_assemble_input_operand")
@@ -526,17 +499,15 @@
   DONE;
 })
 
-;; MMA instructions that do not use their accumulators as an input, still must
-;; not allow their vector operands to overlap the registers used by the
-;; accumulator.  We enforce this by marking the output as early clobber.  The
-;; prime and de-prime instructions are not needed on systems with dense math
-;; registers.
+;; MMA instructions that do not use their accumulators as an input, still
+;; must not allow their vector operands to overlap the registers used by
+;; the accumulator.  We enforce this by marking the output as early clobber.
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "accumulator_operand" "=&wD")
+  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
(unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")]
MMA_ACC))]
-  "TARGET_MMA_NO_DENSE_MATH"
+  "TARGET_MMA"
   " %A0"
   [(set_attr "type" "mma")])
 
@@ -552,7 +523,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_VV))]
@@ -561,8 +532,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
-   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
+  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")

[gcc r14-9849] PR modula2/114648 cc1gm2 by default does not handle C pre-processor file and line directives

2024-04-08 Thread Gaius Mulley via Gcc-cvs
https://gcc.gnu.org/g:600bf396799a022e65938de572ad1a79a951b95a

commit r14-9849-g600bf396799a022e65938de572ad1a79a951b95a
Author: Gaius Mulley 
Date:   Tue Apr 9 02:35:11 2024 +0100

PR modula2/114648 cc1gm2 by default does not handle C pre-processor file 
and line directives

This patch fixes the default behavior of cc1gm2 to the description in
the documentation.  By default cc1gm2 will allow C preprocessor
directives (they can be turned off via -fno-cpp).

gcc/m2/ChangeLog:

PR modula2/114648
* gm2-compiler/M2Options.mod (LineDirectives): Initially
set to true.

gcc/testsuite/ChangeLog:

PR modula2/114648
* gm2/cpp/default/pass/AdvParse.def: New test.
* gm2/cpp/default/pass/AdvParse.mod: New test.
* gm2/cpp/default/pass/cpp-default-pass.exp: New test.

Signed-off-by: Gaius Mulley 

Diff:
---
 gcc/m2/gm2-compiler/M2Options.mod  |  2 +-
 gcc/testsuite/gm2/cpp/default/pass/AdvParse.def|  5 +++
 gcc/testsuite/gm2/cpp/default/pass/AdvParse.mod|  8 +
 .../gm2/cpp/default/pass/cpp-default-pass.exp  | 36 ++
 4 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/gcc/m2/gm2-compiler/M2Options.mod 
b/gcc/m2/gm2-compiler/M2Options.mod
index 09d62cbc732..d04cded17f0 100644
--- a/gcc/m2/gm2-compiler/M2Options.mod
+++ b/gcc/m2/gm2-compiler/M2Options.mod
@@ -1944,7 +1944,7 @@ BEGIN
ReturnChecking:= FALSE ;
CaseElseChecking  := FALSE ;
CPreProcessor := FALSE ;
-   LineDirectives:= FALSE ;
+   LineDirectives:= TRUE ;
ExtendedOpaque:= FALSE ;
UnboundedByReference  := FALSE ;
VerboseUnbounded  := FALSE ;
diff --git a/gcc/testsuite/gm2/cpp/default/pass/AdvParse.def 
b/gcc/testsuite/gm2/cpp/default/pass/AdvParse.def
new file mode 100644
index 000..391a803116f
--- /dev/null
+++ b/gcc/testsuite/gm2/cpp/default/pass/AdvParse.def
@@ -0,0 +1,5 @@
+DEFINITION MODULE AdvParse ;
+
+PROCEDURE foo ;
+
+END AdvParse.
diff --git a/gcc/testsuite/gm2/cpp/default/pass/AdvParse.mod 
b/gcc/testsuite/gm2/cpp/default/pass/AdvParse.mod
new file mode 100644
index 000..ed3727c63c6
--- /dev/null
+++ b/gcc/testsuite/gm2/cpp/default/pass/AdvParse.mod
@@ -0,0 +1,8 @@
+# 2 "AdvParse.bnf"
+IMPLEMENTATION MODULE AdvParse ;
+
+PROCEDURE foo ;
+BEGIN
+END foo ;
+
+END AdvParse.
\ No newline at end of file
diff --git a/gcc/testsuite/gm2/cpp/default/pass/cpp-default-pass.exp 
b/gcc/testsuite/gm2/cpp/default/pass/cpp-default-pass.exp
new file mode 100644
index 000..0b373cfd655
--- /dev/null
+++ b/gcc/testsuite/gm2/cpp/default/pass/cpp-default-pass.exp
@@ -0,0 +1,36 @@
+# Copyright (C) 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+# This file was written by Gaius Mulley (gaiusm...@gmail.com)
+# for GNU Modula-2.
+
+if $tracelevel then {
+strace $tracelevel
+}
+
+# load support procs
+load_lib gm2-torture.exp
+
+gm2_init_pim "${srcdir}/gm2/cpp/default/pass/"
+
+foreach testcase [lsort [glob -nocomplain $srcdir/$subdir/*.mod]] {
+# If we're only testing specific files and this isn't one of them, skip it.
+if ![runtest_file_p $runtests $testcase] then {
+   continue
+}
+
+gm2-torture $testcase
+}


[gcc r14-9851] testsuite: Add profile_update_atomic check to gcov-20.c [PR114614]

2024-04-08 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:9c97de682303b81c8886ac131fcfb3b122f2f1a6

commit r14-9851-g9c97de682303b81c8886ac131fcfb3b122f2f1a6
Author: Kewen Lin 
Date:   Mon Apr 8 21:02:17 2024 -0500

testsuite: Add profile_update_atomic check to gcov-20.c [PR114614]

As PR114614 shows, the newly added test case gcov-20.c by
commit r14-9789-g08a52331803f66 failed on targets which do
not support atomic profile update, there would be a message
like:

  warning: target does not support atomic profile update,
   single mode is selected

Since the test case adopts -fprofile-update=atomic, it
requires effective target check profile_update_atomic, this
patch is to add the check accordingly.

PR testsuite/114614

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-20.c: Add effective target check
profile_update_atomic.

Diff:
---
 gcc/testsuite/gcc.misc-tests/gcov-20.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-20.c 
b/gcc/testsuite/gcc.misc-tests/gcov-20.c
index 215faffc980..ca8c12aad2b 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-20.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-20.c
@@ -1,5 +1,6 @@
 /* { dg-options "-fcondition-coverage -ftest-coverage -fprofile-update=atomic" 
} */
 /* { dg-do run { target native } } */
+/* { dg-require-effective-target profile_update_atomic } */
 
 /* Some side effect to stop branches from being pruned */
 int x = 0;


[gcc r14-9850] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-08 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:26eb5f8fd173e2425ae7505528fc426de4b7e34c

commit r14-9850-g26eb5f8fd173e2425ae7505528fc426de4b7e34c
Author: Kewen Lin 
Date:   Mon Apr 8 21:01:36 2024 -0500

rs6000: Fix wrong align passed to build_aligned_type [PR88309]

As the comments in PR88309 show, there are two oversights
in rs6000_gimple_fold_builtin that pass align in bytes to
build_aligned_type but which actually requires align in
bits, it causes unexpected ICE or hanging in function
is_miss_rate_acceptable due to zero align_unit value.

This patch is to fix them by converting bytes to bits, add
an assertion on positive align_unit value and notes function
build_aligned_type requires align measured in bits in its
function comment.

PR target/88309

Co-authored-by: Andrew Pinski 

gcc/ChangeLog:

* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Fix
wrong align passed to function build_aligned_type.
* tree-ssa-loop-prefetch.cc (is_miss_rate_acceptable): Add an
assertion to ensure align_unit should be positive.
* tree.cc (build_qualified_type): Update function comments.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr88309.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc|  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr88309.c | 27 +++
 gcc/tree-ssa-loop-prefetch.cc  |  2 ++
 gcc/tree.cc|  3 ++-
 4 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 6698274031b..e7d6204074c 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1900,7 +1900,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree lhs_type = TREE_TYPE (lhs);
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
  required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_ltype = build_aligned_type (lhs_type, 4);
+   tree align_ltype = build_aligned_type (lhs_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg0.  The resulting type will match
   the type of arg1.  */
@@ -1944,7 +1944,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree arg2_type = ptr_type_node;
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
   required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_stype = build_aligned_type (arg0_type, 4);
+   tree align_stype = build_aligned_type (arg0_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg1.  */
gimple_seq stmts = NULL;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88309.c 
b/gcc/testsuite/gcc.target/powerpc/pr88309.c
new file mode 100644
index 000..c0078cf2b8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr88309.c
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2 -fprefetch-loop-arrays" } */
+
+/* Verify there is no ICE or hanging.  */
+
+#include 
+
+void b(float *c, vector float a, vector float, vector float)
+{
+  vector float d;
+  vector char ahbc;
+  vec_xst(vec_perm(a, d, ahbc), 0, c);
+}
+
+vector float e(vector unsigned);
+
+void f() {
+  float *dst;
+  int g = 0;
+  for (;; g += 16) {
+vector unsigned m, i;
+vector unsigned n, j;
+vector unsigned k, l;
+b(dst + g * 3, e(m), e(n), e(k));
+b(dst + (g + 4) * 3, e(i), e(j), e(l));
+  }
+}
diff --git a/gcc/tree-ssa-loop-prefetch.cc b/gcc/tree-ssa-loop-prefetch.cc
index bbd98e03254..70073cc4fe4 100644
--- a/gcc/tree-ssa-loop-prefetch.cc
+++ b/gcc/tree-ssa-loop-prefetch.cc
@@ -739,6 +739,8 @@ is_miss_rate_acceptable (unsigned HOST_WIDE_INT 
cache_line_size,
   if (delta >= (HOST_WIDE_INT) cache_line_size)
 return false;
 
+  gcc_assert (align_unit > 0);
+
   miss_positions = 0;
   total_positions = (cache_line_size / align_unit) * distinct_iters;
   max_allowed_miss_positions = (ACCEPTABLE_MISS_RATE * total_positions) / 1000;
diff --git a/gcc/tree.cc b/gcc/tree.cc
index f801712c9dd..787168e9255 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -5689,7 +5689,8 @@ build_qualified_type (tree type, int type_quals 
MEM_STAT_DECL)
   return t;
 }
 
-/* Create a variant of type T with alignment ALIGN.  */
+/* Create a variant of type T with alignment ALIGN which
+   is measured in bits.  */
 
 tree
 build_aligned_type (tree type, unsigned int align)


[gcc r14-9852] x86: Define __APX_INLINE_ASM_USE_GPR32__

2024-04-08 Thread H.J. Lu via Gcc-cvs
https://gcc.gnu.org/g:18e94e04dae724c61cbc13ace85fa68f2deda900

commit r14-9852-g18e94e04dae724c61cbc13ace85fa68f2deda900
Author: H.J. Lu 
Date:   Mon Apr 8 18:57:49 2024 -0700

x86: Define __APX_INLINE_ASM_USE_GPR32__

Define __APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32.
When __APX_INLINE_ASM_USE_GPR32__ is defined, inline asm statements
should contain only instructions compatible with r16-r31.

gcc/

PR target/114587
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32.

gcc/testsuite/

PR target/114587
* gcc.target/i386/apx-3.c: Likewise.

Diff:
---
 gcc/config/i386/i386-c.cc | 2 ++
 gcc/testsuite/gcc.target/i386/apx-3.c | 6 ++
 2 files changed, 8 insertions(+)

diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
index 226d277676c..07f4936ba91 100644
--- a/gcc/config/i386/i386-c.cc
+++ b/gcc/config/i386/i386-c.cc
@@ -751,6 +751,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
 def_or_undef (parse_in, "__AVX10_1_512__");
   if (isa_flag2 & OPTION_MASK_ISA2_APX_F)
 def_or_undef (parse_in, "__APX_F__");
+  if (ix86_apx_inline_asm_use_gpr32)
+def_or_undef (parse_in, "__APX_INLINE_ASM_USE_GPR32__");
   if (TARGET_IAMCU)
 {
   def_or_undef (parse_in, "__iamcu");
diff --git a/gcc/testsuite/gcc.target/i386/apx-3.c 
b/gcc/testsuite/gcc.target/i386/apx-3.c
new file mode 100644
index 000..1ba4ac036fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/apx-3.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mapx-inline-asm-use-gpr32" } */
+
+#ifndef __APX_INLINE_ASM_USE_GPR32__
+# error __APX_INLINE_ASM_USE_GPR32__ not defined
+#endif


[gcc(refs/users/meissner/heads/work164-dmf)] Add wD constraint.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:ddcfc9baba6d56ded3bd877409614d494803f26f

commit ddcfc9baba6d56ded3bd877409614d494803f26f
Author: Michael Meissner 
Date:   Mon Apr 8 21:34:32 2024 -0400

Add wD constraint.

This patch adds a new constraint ('wD') that matches the accumulator 
registers
that overlap with VSX registers 0..31 on power10.  Future patches will add 
the
support for a separate accumulator register class that will be used when the
support for dense math registes is added.

2024-04-08   Michael Meissner  

* config/rs6000/constraints.md (wD): New constraint.
* config/rs6000/mma.md (mma_): Prepare for alternate 
accumulator
registers.  Use wD constraint instead of 'd' constraint.  Use
accumulator_operand instead of fpr_reg_operand.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0")]
MMA_ACC))]
   "TARGET_MMA"
   " %A0"
@@ -523,7 +523,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_VV))]
@@ -532,8 +532,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_AVV))]
@@ -542,7 +542,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:OO 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
MMA_PV))]
@@ -551,8 +551,8 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:OO 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
MMA_APV))]
@@ -561,7 +561,7 @@
   [(set_attr "type" "mma")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -574,8 +574,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
(match_operand:SI 4 "const_0_to_15_operand" "n,n")
@@ -588,7 +588,7 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
(unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
(match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
(match_operand:SI 3 "const_0_to_15_operand" "n,n")
@@ -601,8 +601,8 @@
(set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-   (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
+  [(set (match_operand:XO 0 "accumulator_operand" "=&wD,&wD")
+   (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0")

[gcc(refs/users/meissner/heads/work164-dmf)] Add support for dense math registers.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:236e981f23100b440dc60543935d43d310ddf324

commit 236e981f23100b440dc60543935d43d310ddf324
Author: Michael Meissner 
Date:   Mon Apr 8 22:10:53 2024 -0400

Add support for dense math registers.

The MMA subsystem added the notion of accumulator registers as an optional
feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped 
with
the VSX registers 0..31, but logically the accumulator registers were 
separate
from the FPR registers.  In ISA 3.1, it was anticipated that in future 
systems,
the accumulator registers may no overlap with the FPR registers.  This patch
adds the support for dense math registers as separate registers.

This particular patch does not change the MMA support to use the 
accumulators
within the dense math registers.  This patch just adds the basic support for
having separate DMRs.  The next patch will switch the MMA support to use the
accumulators if -mcpu=future is used.

For testing purposes, I added an undocumented option '-mdense-math' to 
enable
or disable the dense math support.

This patch adds a new constraint (wD).  If MMA is selected but dense math is
not selected (i.e. -mcpu=power10), the wD constraint will allow access to
accumulators that overlap with VSX registers 0..31.  If both MMA and dense 
math
are selected (i.e. -mcpu=future), the wD constraint will only allow dense 
math
registers.

This patch modifies the existing %A output modifier.  If MMA is selected but
dense math is not selected, then %A output modifier converts the VSX 
register
number to the accumulator number, by dividing it by 4.  If both MMA and 
dense
math are selected, then %A will map the separate DMR registers into 0..7.

The intention is that user code using extended asm can be modified to run on
both MMA without dense math and MMA with dense math:

1)  If possible, don't use extended asm, but instead use the MMA 
built-in
functions;

2)  If you do need to write extended asm, change the d constraints
targetting accumulators should now use wD;

3)  Only use the built-in zero, assemble and disassemble functions 
create
move data between vector quad types and dense math accumulators.
I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz directly in the
extended asm code.  The reason is these instructions assume there 
is a
1-to-1 correspondence between 4 adjacent FPR registers and an
accumulator that overlaps with those instructions.  With 
accumulators
now being separate registers, there no longer is a 1-to-1
correspondence.

It is possible that the mangling for DMRs and the GDB register numbers may
produce other changes in the future.

2024-04-08   Michael Meissner  

* config/rs6000/mma.md (UNSPEC_MMA_DMSETDMRZ): New unspec.
(movxo): Add comments about dense math registers.
(movxo_nodm): Rename from movxo and restrict the usage to machines
without dense math registers.
(movxo_dm): New insn for movxo support for machines with dense math
registers.
(mma_): Restrict usage to machines without dense math 
registers.
(mma_xxsetaccz): Add a define_expand wrapper, and add support for 
dense
math registers.
(mma_dmsetaccz): New insn.
* config/rs6000/predicates.md (dmr_operand): New predicate.
(accumulator_operand): Add support for dense math registers.
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): 
Do
not issue a de-prime instruction when disassembling a vector quad 
on a
system with dense math registers.
* config/rs6000/rs6000-c.cc (rs6000_define_or_undefine_macro): 
Define
__DENSE_MATH__ if we have dense math registers.
* config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMR_REG_TYPE.
(enum rs6000_reload_reg_type): Add RELOAD_REG_DMR.
(LAST_RELOAD_REG_CLASS): Add support for DMR registers and the wD
constraint.
(reload_reg_map): Likewise.
(rs6000_reg_names): Likewise.
(alt_reg_names): Likewise.
(rs6000_hard_regno_nregs_internal): Likewise.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
(rs6000_debug_reg_global): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_init_hard_regno_mode_ok): Likewise.
(rs6000_secondary_reload_memory): Add support for DMR registers.
(rs6000_secondary_reload_simple_move): Likewise.
(rs6000_preferred_reload_class): Likewise.
(rs6000_secondary_reload_class): Likewise.
(print_operand): Make %A handle both FPRs and DMRs.
  

[gcc(refs/users/meissner/heads/work164-dmf)] PowerPC: Switch to dense math names for all MMA operations.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:5ebe63826f25ac7416e399f4216f86a41695da89

commit 5ebe63826f25ac7416e399f4216f86a41695da89
Author: Michael Meissner 
Date:   Mon Apr 8 22:15:44 2024 -0400

PowerPC: Switch to dense math names for all MMA operations.

This patch changes the assembler instruction names for MMA instructions from
the original name used in power10 to the new name when used with the dense 
math
system.  I.e. xvf64gerpp becomes dmxvf64gerpp.  The assembler will emit the
same bits for either spelling.

For the non-prefixed MMA instructions, we add a 'dm' prefix in front of the
instruction.  However, the prefixed instructions have a 'pm' prefix, and we 
add
the 'dm' prefix afterwards.  To prevent having two sets of parallel int
attributes, we remove the "pm" prefix from the instruction string in the
attributes, and add it later, both in the insn name and in the output 
template.

2024-04-08   Michael Meissner  

gcc/

* config/rs6000/mma.md (vvi4i4i8): Change the instruction to not 
have a
"pm" prefix.
(avvi4i4i8): Likewise.
(vvi4i4i2): Likewise.
(avvi4i4i2): Likewise.
(vvi4i4): Likewise.
(avvi4i4): Likewise.
(pvi4i2): Likewise.
(apvi4i2): Likewise.
(vvi4i4i4): Likewise.
(avvi4i4i4): Likewise.
(mma_): Add support for running on DMF systems, generating the 
dense
math instruction and using the dense math accumulators.
(mma_): Likewise.
(mma_): Likewise.
(mma_): Likewise.
(mma_pm): Add support for running on DMF systems, 
generating
the dense math instruction and using the dense math accumulators.
Rename the insn with a 'pm' prefix and add either 'pm' or 'pmdm'
prefixes based on whether we have the original MMA specification or 
if
we have dense math support.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.
(mma_pm): Likewise.

Diff:
---
 gcc/config/rs6000/mma.md | 157 +++
 1 file changed, 104 insertions(+), 53 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index ae6e7e9695b..2e04eb653fa 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -225,44 +225,47 @@
 (UNSPEC_MMA_XVF64GERNP "xvf64gernp")
 (UNSPEC_MMA_XVF64GERNN "xvf64gernn")])
 
-(define_int_attr vvi4i4i8  [(UNSPEC_MMA_PMXVI4GER8 "pmxvi4ger8")])
+;; The "pm" prefix is not in these expansions, so that we can generate
+;; pmdmxvi4ger8 on systems with dense math registers and xvi4ger8 on systems
+;; without dense math registers.
+(define_int_attr vvi4i4i8  [(UNSPEC_MMA_PMXVI4GER8 "xvi4ger8")])
 
-(define_int_attr avvi4i4i8 [(UNSPEC_MMA_PMXVI4GER8PP   
"pmxvi4ger8pp")])
+(define_int_attr avvi4i4i8 [(UNSPEC_MMA_PMXVI4GER8PP   "xvi4ger8pp")])
 
-(define_int_attr vvi4i4i2  [(UNSPEC_MMA_PMXVI16GER2"pmxvi16ger2")
-(UNSPEC_MMA_PMXVI16GER2S   "pmxvi16ger2s")
-(UNSPEC_MMA_PMXVF16GER2"pmxvf16ger2")
-(UNSPEC_MMA_PMXVBF16GER2   
"pmxvbf16ger2")])
+(define_int_attr vvi4i4i2  [(UNSPEC_MMA_PMXVI16GER2"xvi16ger2")
+(UNSPEC_MMA_PMXVI16GER2S   "xvi16ger2s")
+(UNSPEC_MMA_PMXVF16GER2"xvf16ger2")
+(UNSPEC_MMA_PMXVBF16GER2   "xvbf16ger2")])
 
-(define_int_attr avvi4i4i2 [(UNSPEC_MMA_PMXVI16GER2PP  "pmxvi16ger2pp")
-(UNSPEC_MMA_PMXVI16GER2SPP 
"pmxvi16ger2spp")
-(UNSPEC_MMA_PMXVF16GER2PP  "pmxvf16ger2pp")
-(UNSPEC_MMA_PMXVF16GER2PN  "pmxvf16ger2pn")
-(UNSPEC_MMA_PMXVF16GER2NP  "pmxvf16ger2np")
-(UNSPEC_MMA_PMXVF16GER2NN  "pmxvf16ger2nn")
-(UNSPEC_MMA_PMXVBF16GER2PP 
"pmxvbf16ger2pp")
-(UNSPEC_MMA_PMXVBF16GER2PN 
"pmxvbf16ger2pn")
-(UNSPEC_MMA_PMXVBF16GER2NP 
"pmxvbf16ger2np")
-(UNSPEC_MMA_PMXVBF16GER2NN 
"pmxvbf16ger2nn")])
+(define_int_attr avvi4i4i2 [(UNSPEC_MMA_PMXVI16GER2PP  "xvi16ger2pp")
+(UNSPEC_MMA_PMXVI16GER2SPP "xvi16ger2spp")
+(UNSPEC_MMA_PMXVF16GER2PP  "xvf16ger2pp")
+(UNSPEC_MMA_PMXVF16GER2PN  "xvf16ger2pn")

[gcc(refs/users/meissner/heads/work164-dmf)] Add dense math test for new instruction names.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:97bf659e44a71e0f35a82dc7454b9d8eaf60cd81

commit 97bf659e44a71e0f35a82dc7454b9d8eaf60cd81
Author: Michael Meissner 
Date:   Mon Apr 8 22:17:17 2024 -0400

Add dense math test for new instruction names.

2024-04-08   Michael Meissner  

gcc/testsuite/

* gcc.target/powerpc/dm-double-test.c: New test.
* lib/target-supports.exp (check_effective_target_ppc_dmr_ok): New
target test.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/dm-double-test.c | 194 ++
 gcc/testsuite/lib/target-supports.exp |  23 +++
 2 files changed, 217 insertions(+)

diff --git a/gcc/testsuite/gcc.target/powerpc/dm-double-test.c 
b/gcc/testsuite/gcc.target/powerpc/dm-double-test.c
new file mode 100644
index 000..66c19779585
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/dm-double-test.c
@@ -0,0 +1,194 @@
+/* Test derived from mma-double-1.c, modified for dense math.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_dense_math_ok } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+#include 
+#include 
+#include 
+
+typedef unsigned char vec_t __attribute__ ((vector_size (16)));
+typedef double v4sf_t __attribute__ ((vector_size (16)));
+#define SAVE_ACC(ACC, ldc, J)  \
+ __builtin_mma_disassemble_acc (result, ACC); \
+ rowC = (v4sf_t *) &CO[0*ldc+J]; \
+  rowC[0] += result[0]; \
+  rowC = (v4sf_t *) &CO[1*ldc+J]; \
+  rowC[0] += result[1]; \
+  rowC = (v4sf_t *) &CO[2*ldc+J]; \
+  rowC[0] += result[2]; \
+  rowC = (v4sf_t *) &CO[3*ldc+J]; \
+ rowC[0] += result[3];
+
+void
+DM (int m, int n, int k, double *A, double *B, double *C)
+{
+  __vector_quad acc0, acc1, acc2, acc3, acc4, acc5, acc6, acc7;
+  v4sf_t result[4];
+  v4sf_t *rowC;
+  for (int l = 0; l < n; l += 4)
+{
+  double *CO;
+  double *AO;
+  AO = A;
+  CO = C;
+  C += m * 4;
+  for (int j = 0; j < m; j += 16)
+   {
+ double *BO = B;
+ __builtin_mma_xxsetaccz (&acc0);
+ __builtin_mma_xxsetaccz (&acc1);
+ __builtin_mma_xxsetaccz (&acc2);
+ __builtin_mma_xxsetaccz (&acc3);
+ __builtin_mma_xxsetaccz (&acc4);
+ __builtin_mma_xxsetaccz (&acc5);
+ __builtin_mma_xxsetaccz (&acc6);
+ __builtin_mma_xxsetaccz (&acc7);
+ unsigned long i;
+
+ for (i = 0; i < k; i++)
+   {
+ vec_t *rowA = (vec_t *) & AO[i * 16];
+ __vector_pair rowB;
+ vec_t *rb = (vec_t *) & BO[i * 4];
+ __builtin_mma_assemble_pair (&rowB, rb[1], rb[0]);
+ __builtin_mma_xvf64gerpp (&acc0, rowB, rowA[0]);
+ __builtin_mma_xvf64gerpp (&acc1, rowB, rowA[1]);
+ __builtin_mma_xvf64gerpp (&acc2, rowB, rowA[2]);
+ __builtin_mma_xvf64gerpp (&acc3, rowB, rowA[3]);
+ __builtin_mma_xvf64gerpp (&acc4, rowB, rowA[4]);
+ __builtin_mma_xvf64gerpp (&acc5, rowB, rowA[5]);
+ __builtin_mma_xvf64gerpp (&acc6, rowB, rowA[6]);
+ __builtin_mma_xvf64gerpp (&acc7, rowB, rowA[7]);
+   }
+ SAVE_ACC (&acc0, m, 0);
+ SAVE_ACC (&acc2, m, 4);
+ SAVE_ACC (&acc1, m, 2);
+ SAVE_ACC (&acc3, m, 6);
+ SAVE_ACC (&acc4, m, 8);
+ SAVE_ACC (&acc6, m, 12);
+ SAVE_ACC (&acc5, m, 10);
+ SAVE_ACC (&acc7, m, 14);
+ AO += k * 16;
+ BO += k * 4;
+ CO += 16;
+   }
+  B += k * 4;
+}
+}
+
+void
+init (double *matrix, int row, int column)
+{
+  for (int j = 0; j < column; j++)
+{
+  for (int i = 0; i < row; i++)
+   {
+ matrix[j * row + i] = (i * 16 + 2 + j) / 0.123;
+   }
+}
+}
+
+void
+init0 (double *matrix, double *matrix1, int row, int column)
+{
+  for (int j = 0; j < column; j++)
+for (int i = 0; i < row; i++)
+  matrix[j * row + i] = matrix1[j * row + i] = 0;
+}
+
+
+void
+print (const char *name, const double *matrix, int row, int column)
+{
+  printf ("Matrix %s has %d rows and %d columns:\n", name, row, column);
+  for (int i = 0; i < row; i++)
+{
+  for (int j = 0; j < column; j++)
+   {
+ printf ("%f ", matrix[j * row + i]);
+   }
+  printf ("\n");
+}
+  printf ("\n");
+}
+
+int
+main (int argc, char *argv[])
+{
+  int rowsA, colsB, common;
+  int i, j, k;
+  int ret = 0;
+
+  for (int t = 16; t <= 128; t += 16)
+{
+  for (int t1 = 4; t1 <= 16; t1 += 4)
+   {
+ rowsA = t;
+ colsB = t1;
+ common = 1;
+ /* printf ("Running test for rows = %d,cols = %d\n", t, t1); */
+ double A[rowsA * common];
+ double B[common * colsB];
+ double C[rowsA * colsB];
+ double D[rowsA * colsB];
+
+
+ init (A, rowsA, common);
+ init (B, common, colsB);
+ init0 (C, D, rowsA, colsB);
+ DM (rowsA, colsB, common, A, B, C);
+
+  

[gcc(refs/users/meissner/heads/work164-dmf)] PowerPC: Add support for 1, 024 bit DMR registers.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:4174abc0dd30ebbb2feaf03f229b5c991c0c7b97

commit 4174abc0dd30ebbb2feaf03f229b5c991c0c7b97
Author: Michael Meissner 
Date:   Mon Apr 8 23:01:07 2024 -0400

PowerPC: Add support for 1,024 bit DMR registers.

This patch is a prelimianry patch to add the full 1,024 bit dense math 
register
(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of 
the
DMR register.

This patch only adds the new 1,024 bit register support.  It does not add
support for any instructions that need 1,024 bit registers instead of 512 
bit
registers.

I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
registers.  The 'wD' constraint added in previous patches is used for these
registers.  I added support to do load and store of DMRs via the VSX 
registers,
since there are no load/store dense math instructions.  I added the new 
keyword
'__dmr' to create 1,024 bit types that can be loaded into DMRs.  At 
present, I
don't have aliases for __dmr512 and __dmr1024 that we've discussed 
internally.

The patches have been tested on both little and big endian systems.  Can I 
check
it into the master branch?

2024-04-08   Michael Meissner  

gcc/

* config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec.
(UNSPEC_DM_INSERT512_LOWER): Likewise.
(UNSPEC_DM_EXTRACT512): Likewise.
(UNSPEC_DMR_RELOAD_FROM_MEMORY): Likewise.
(UNSPEC_DMR_RELOAD_TO_MEMORY): Likewise.
(movtdo): New define_expand and define_insn_and_split to implement 
1,024
bit DMR registers.
(movtdo_insert512_upper): New insn.
(movtdo_insert512_lower): Likewise.
(movtdo_extract512): Likewise.
(reload_dmr_from_memory): Likewise.
(reload_dmr_to_memory): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMR
support.
(rs6000_init_builtins): Add support for __dmr keyword.
* config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add 
support
for TDOmode.
(rs6000_function_arg): Likewise.
* config/rs6000/rs6000-modes.def (TDOmode): New mode.
* config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add
support for TDOmode.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
(rs6000_hard_regno_mode_ok): Likewise.
(rs6000_modes_tieable_p): Likewise.
(rs6000_debug_reg_global): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_init_hard_regno_mode_ok): Add support for TDOmode.  Setup 
reload
hooks for DMR mode.
(reg_offset_addressing_ok_p): Add support for TDOmode.
(rs6000_emit_move): Likewise.
(rs6000_secondary_reload_simple_move): Likewise.
(rs6000_preferred_reload_class): Likewise.
(rs6000_secondary_reload_class): Likewise.
(rs6000_mangle_type): Add mangling for __dmr type.
(rs6000_dmr_register_move_cost): Add support for TDOmode.
(rs6000_split_multireg_move): Likewise.
(rs6000_invalid_conversion): Likewise.
* config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
(enum rs6000_builtin_type_index): Add DMR type nodes.
(dmr_type_node): Likewise.
(ptr_dmr_type_node): Likewise.

gcc/testsuite/

* gcc.target/powerpc/dm-1024bit.c: New test.

Diff:
---
 gcc/config/rs6000/mma.md  | 154 ++
 gcc/config/rs6000/rs6000-builtin.cc   |  17 +++
 gcc/config/rs6000/rs6000-call.cc  |  10 +-
 gcc/config/rs6000/rs6000-modes.def|   4 +
 gcc/config/rs6000/rs6000.cc   | 101 -
 gcc/config/rs6000/rs6000.h|   6 +-
 gcc/testsuite/gcc.target/powerpc/dm-1024bit.c |  63 +++
 7 files changed, 321 insertions(+), 34 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 2e04eb653fa..8461499e1c3 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -92,6 +92,11 @@
UNSPEC_MMA_XXMFACC
UNSPEC_MMA_XXMTACC
UNSPEC_MMA_DMSETDMRZ
+   UNSPEC_DM_INSERT512_UPPER
+   UNSPEC_DM_INSERT512_LOWER
+   UNSPEC_DM_EXTRACT512
+   UNSPEC_DMR_RELOAD_FROM_MEMORY
+   UNSPEC_DMR_RELOAD_TO_MEMORY
   ])
 
 (define_c_enum "unspecv"
@@ -793,3 +798,152 @@
 }
   [(set_attr "type" "mma")
(set_attr "prefixed" "yes")])
+
+;; TDOmode (__dmr keyword for 1,024 bit registers).
+(define_expand "movtdo"
+  [(set (match_operand:TDO 0 "nonimmediate_operand")
+   (match_operand:TDO 1 "input_operand"))]
+  "TARGET_MMA_DENSE_MATH"
+{
+  rs6000_emit_move (operands[0], operands[1], TDOmode);
+  DONE;
+})
+
+(define_insn_and_split "*movtdo"
+  [(set (match_operand:TDO 0 "nonimme

[gcc(refs/users/meissner/heads/work164-dmf)] Support load/store vector with right length.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:bb0e760624b02e7f9fc22dca10925cd82efa

commit bb0e760624b02e7f9fc22dca10925cd82efa
Author: Michael Meissner 
Date:   Mon Apr 8 23:28:46 2024 -0400

Support load/store vector with right length.

This patch adds support for new instructions that may be added to the 
PowerPC
architecture in the future to enhance the load and store vector with length
instructions.

The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvient to 
use
since the count for the number of bytes must be in the top 8 bits of the GPR
register, instead of the bottom 8 bits.  This meant that code generating 
these
instructions typically had to do a shift left by 56 bits to get the count 
into
the right position.  In a future version of the PowerPC architecture, new
variants of these instructions might be added that expect the count to be in
the bottom 8 bits of the GPR register.  These patches add this support to 
GCC
if the user uses the -mcpu=future option.

I discovered that the code in rs6000-string.cc to generate ISA 3.1 
lxvl/stxvl
future lxvll/stxvll instructions would generate these instructions on 
32-bit.
However the patterns for these instructions is only done on 64-bit systems. 
 So
I added a check for 64-bit support before generating the instructions.

The patches have been tested on both little and big endian systems.  Can I 
check
it into the master branch?

2024-04-08   Michael Meissner  

gcc/

* config/rs6000/rs6000-string.cc (expand_block_move): Do not 
generate
lxvl and stxvl on 32-bit.
* config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl 
with
the shift count automaticaly used in the insn.
(lxvrl): New insn for -mcpu=future.
(lxvrll): Likewise.
(stxvl): If -mcpu=future, generate the stxvl with the shift count
automaticaly used in the insn.
(stxvrl): New insn for -mcpu=future.
(stxvrll): Likewise.

gcc/testsuite/

* gcc.target/powerpc/lxvrl.c: New test.
* lib/target-supports.exp 
(check_effective_target_powerpc_future_ok):
New effective target.

Diff:
---
 gcc/config/rs6000/rs6000-string.cc   |   1 +
 gcc/config/rs6000/vsx.md | 122 +--
 gcc/testsuite/gcc.target/powerpc/lxvrl.c |  32 
 gcc/testsuite/lib/target-supports.exp|  12 +++
 4 files changed, 146 insertions(+), 21 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-string.cc 
b/gcc/config/rs6000/rs6000-string.cc
index e74ccf41937..c6737e66cbe 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -2787,6 +2787,7 @@ expand_block_move (rtx operands[], bool might_overlap)
 
   if (TARGET_MMA && TARGET_BLOCK_OPS_UNALIGNED_VSX
  && TARGET_BLOCK_OPS_VECTOR_PAIR
+ && TARGET_POWERPC64
  && bytes >= 32
  && (align >= 256 || !STRICT_ALIGNMENT))
{
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f135fa079bd..9520191e613 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5629,20 +5629,32 @@
   DONE;
 })
 
-;; Load VSX Vector with Length
+;; Load VSX Vector with Length.  If we have lxvrl, we don't have to do an
+;; explicit shift left into a pseudo.
 (define_expand "lxvl"
-  [(set (match_dup 3)
-(ashift:DI (match_operand:DI 2 "register_operand")
-   (const_int 56)))
-   (set (match_operand:V16QI 0 "vsx_register_operand")
-   (unspec:V16QI
-[(match_operand:DI 1 "gpc_reg_operand")
-  (mem:V16QI (match_dup 1))
- (match_dup 3)]
-UNSPEC_LXVL))]
+  [(use (match_operand:V16QI 0 "vsx_register_operand"))
+   (use (match_operand:DI 1 "gpc_reg_operand"))
+   (use (match_operand:DI 2 "gpc_reg_operand"))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
 {
-  operands[3] = gen_reg_rtx (DImode);
+  rtx shift_len = gen_rtx_ASHIFT (DImode, operands[2], GEN_INT (56));
+  rtx len;
+
+  if (TARGET_FUTURE)
+len = shift_len;
+  else
+{
+  len = gen_reg_rtx (DImode);
+  emit_insn (gen_rtx_SET (len, shift_len));
+}
+
+  rtx dest = operands[0];
+  rtx addr = operands[1];
+  rtx mem = gen_rtx_MEM (V16QImode, addr);
+  rtvec rv = gen_rtvec (3, addr, mem, len);
+  rtx lxvl = gen_rtx_UNSPEC (V16QImode, rv, UNSPEC_LXVL);
+  emit_insn (gen_rtx_SET (dest, lxvl));
+  DONE;
 })
 
 (define_insn "*lxvl"
@@ -5666,6 +5678,34 @@
   "lxvll %x0,%1,%2"
   [(set_attr "type" "vecload")])
 
+;; For lxvrl and lxvrll, use the combiner to eliminate the shift.  The
+;; define_expand for lxvl will already incorporate the shift in generating the
+;; insn.  The lxvll buitl-in function required the user to have already done
+;; the shift.  Defining lxvrll this way, will optimize cases where the user has
+;; done the shift immediately before the built-in

[gcc(refs/users/meissner/heads/work164-dmf)] Add saturating subtract built-ins.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:6ac16950e262822264bae44d33157e440af81bf4

commit 6ac16950e262822264bae44d33157e440af81bf4
Author: Michael Meissner 
Date:   Mon Apr 8 23:34:30 2024 -0400

Add saturating subtract built-ins.

This patch adds support for a saturating subtract built-in function that 
may be
added to a future PowerPC processor.  Note, if it is added, the name of the
built-in function may change before GCC 13 is released.  If the name 
changes,
we will submit a patch changing the name.

I also added support for providing dense math built-in functions, even 
though
at present, we have not added any new built-in functions for dense math.  
It is
likely we will want to add new dense math built-in functions as the dense 
math
support is fleshed out.

The patches have been tested on both little and big endian systems.  Can I 
check
it into the master branch?

2024-04-08   Michael Meissner  

gcc/

* config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add 
support
for flagging invalid use of future built-in functions.
(rs6000_builtin_is_supported): Add support for future built-in
functions.
* config/rs6000/rs6000-builtins.def 
(__builtin_saturate_subtract32): New
built-in function for -mcpu=future.
(__builtin_saturate_subtract64): Likewise.
* config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add 
stanzas
for -mcpu=future built-ins.
(stanza_map): Likewise.
(enable_string): Likewise.
(struct attrinfo): Likewise.
(parse_bif_attrs): Likewise.
(write_decls): Likewise.
* config/rs6000/rs6000.md (sat_sub3): Add saturating subtract
built-in insn declarations.
(sat_sub3_dot): Likewise.
(sat_sub3_dot2): Likewise.
* doc/extend.texi (Future PowerPC built-ins): New section.

gcc/testsuite/

* gcc.target/powerpc/subfus-1.c: New test.
* gcc.target/powerpc/subfus-2.c: Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc | 17 
 gcc/config/rs6000/rs6000-builtins.def   | 10 +
 gcc/config/rs6000/rs6000-gen-builtins.cc| 35 ++---
 gcc/config/rs6000/rs6000.md | 60 +
 gcc/doc/extend.texi | 24 
 gcc/testsuite/gcc.target/powerpc/subfus-1.c | 32 +++
 gcc/testsuite/gcc.target/powerpc/subfus-2.c | 32 +++
 7 files changed, 205 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 976a42a74cd..1af38698bf3 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -139,6 +139,17 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins fncode)
 case ENB_MMA:
   error ("%qs requires the %qs option", name, "-mmma");
   break;
+case ENB_FUTURE:
+  error ("%qs requires the %qs option", name, "-mcpu=future");
+  break;
+case ENB_FUTURE_64:
+  error ("%qs requires the %qs option and either the %qs or %qs option",
+name, "-mcpu=future", "-m64", "-mpowerpc64");
+  break;
+case ENB_DM:
+  error ("%qs requires the %qs or %qs options", name, "-mcpu=future",
+"-mdense-math");
+  break;
 default:
 case ENB_ALWAYS:
   gcc_unreachable ();
@@ -194,6 +205,12 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
   return TARGET_HTM;
 case ENB_MMA:
   return TARGET_MMA;
+case ENB_FUTURE:
+  return TARGET_FUTURE;
+case ENB_FUTURE_64:
+  return TARGET_FUTURE && TARGET_POWERPC64;
+case ENB_DM:
+  return TARGET_DENSE_MATH;
 default:
   gcc_unreachable ();
 }
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 3bc7fed6956..437ab0e09e9 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -139,6 +139,8 @@
 ;   endian   Needs special handling for endianness
 ;   ibmldRestrict usage to the case when TFmode is IBM-128
 ;   ibm128   Restrict usage to the case where __ibm128 is supported or if ibmld
+;   future   Restrict usage to future instructions
+;   dm   Restrict usage to dense math
 ;
 ; Each attribute corresponds to extra processing required when
 ; the built-in is expanded.  All such special processing should
@@ -4131,3 +4133,11 @@
 
   void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
 STXVP nothing {mma,pair}
+
+[future]
+  const signed int __builtin_saturate_subtract32 (signed int, signed int);
+  SAT_SUBSI sat_subsi3 {}
+
+[future-64]
+  const signed long __builtin_saturate_subtract64 (signed long,  signed long);
+  SAT_SUBDI sat_subdi3 {}
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.cc 
b/gcc/config/rs6000/rs6

[gcc(refs/users/meissner/heads/work164-dmf)] Add -mcpu=future2

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:08dd6bc886f7d7a42332d19c66f47c735676dc11

commit 08dd6bc886f7d7a42332d19c66f47c735676dc11
Author: Michael Meissner 
Date:   Mon Apr 8 23:40:42 2024 -0400

Add -mcpu=future2

2024-04-08  Michael Meissner  

gcc/

* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for 
-mcpu=future2.
* config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
_ARCH_PWR_FUTURE2 if -mcpu=future.
* config/rs6000/rs6000-cpus.def (ISA_FUTURE2_MASKS_SERVER): New 
macro.
(POWERPC_MASKS): Add support for -mcpu=future2.
(future2 processor): Likewise.
* config/rs6000/rs6000-tables.opt: Regenerate
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Add support for 
-mcpu=future2.
* config/rs6000/rs6000.opt (-mfuture2): New internal option.

gcc/testsuite/

* lib/target-supports.exp 
(check_effective_target_powerpc_future2_ok):
New effective target test.

Diff:
---
 gcc/config/rs6000/aix71.h |  1 +
 gcc/config/rs6000/aix72.h |  1 +
 gcc/config/rs6000/aix73.h |  1 +
 gcc/config/rs6000/rs6000-c.cc |  2 ++
 gcc/config/rs6000/rs6000-cpus.def |  5 +
 gcc/config/rs6000/rs6000-tables.opt   |  3 +++
 gcc/config/rs6000/rs6000.cc   |  1 +
 gcc/config/rs6000/rs6000.h|  1 +
 gcc/config/rs6000/rs6000.opt  |  4 
 gcc/testsuite/lib/target-supports.exp | 13 +
 10 files changed, 32 insertions(+)

diff --git a/gcc/config/rs6000/aix71.h b/gcc/config/rs6000/aix71.h
index 570ddcc451d..34bbad65dbe 100644
--- a/gcc/config/rs6000/aix71.h
+++ b/gcc/config/rs6000/aix71.h
@@ -79,6 +79,7 @@ do {  
\
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future2: -mfuture; \
   mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
diff --git a/gcc/config/rs6000/aix72.h b/gcc/config/rs6000/aix72.h
index 242ca94bd06..f5e66084553 100644
--- a/gcc/config/rs6000/aix72.h
+++ b/gcc/config/rs6000/aix72.h
@@ -79,6 +79,7 @@ do {  
\
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future2: -mfuture; \
   mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
diff --git a/gcc/config/rs6000/aix73.h b/gcc/config/rs6000/aix73.h
index 2bd6b4bb3c4..1140f0f6098 100644
--- a/gcc/config/rs6000/aix73.h
+++ b/gcc/config/rs6000/aix73.h
@@ -79,6 +79,7 @@ do {  
\
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{mcpu=native: %(asm_cpu_native); \
+  mcpu=future2: -mfuture; \
   mcpu=future: -mfuture; \
   mcpu=power11: -mpwr11; \
   mcpu=power10: -mpwr10; \
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index acd44058876..f0461e5817a 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -451,6 +451,8 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR11");
   if ((flags & OPTION_MASK_FUTURE) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR_FUTURE");
+  if ((flags & OPTION_MASK_FUTURE2) != 0)
+rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR_FUTURE2");
   if ((flags & OPTION_MASK_SOFT_FLOAT) != 0)
 rs6000_define_or_undefine_macro (define_p, "_SOFT_FLOAT");
   if ((flags & OPTION_MASK_RECIP_PRECISION) != 0)
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 4ddba142e44..fee97c96197 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -93,6 +93,9 @@
 | OPTION_MASK_BLOCK_OPS_VECTOR_PAIR\
 | OPTION_MASK_FUTURE)
 
+#define ISA_FUTURE2_MASKS_SERVER (ISA_FUTURE_MASKS_SERVER  \
+ | OPTION_MASK_FUTURE2)
+
 /* Flags that need to be turned off if -mno-vsx.  */
 #define OTHER_VSX_VECTOR_MASKS (OPTION_MASK_EFFICIENT_UNALIGNED_VSX\
 | OPTION_MASK_FLOAT128_KEYWORD \
@@ -133,6 +136,7 @@
 | OPTION_MASK_FLOAT128_KEYWORD \
 | OPTION_MASK_FPRND\
 | OPTION_MASK_FUTURE   \
+| OPTION_MASK_FUTURE2  \
 | OPTION_MASK_POWER10  \
 | OPTION_MASK_POWER11  \
 | OPTION_MASK_P10_FUSION   \
@@ -269,3 +

[gcc(refs/users/meissner/heads/work164-dmf)] Add xvrlw support.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:6f6c28cb2beae8629e4177e160cd1b2b01e72fb1

commit 6f6c28cb2beae8629e4177e160cd1b2b01e72fb1
Author: Michael Meissner 
Date:   Mon Apr 8 23:52:39 2024 -0400

Add xvrlw support.

2024-04-08  Michael Meissner  

gcc/

* config/rs6000/altivec.md (xvrlw): New insn.
* config/rs6000/rs6000.h (TARGET_XVRLW): New macro.

gcc/testsuite/

* gcc.target/powerpc/xvrlw.c: New test.

Diff:
---
 gcc/config/rs6000/altivec.md | 14 +
 gcc/config/rs6000/rs6000.h   |  3 +++
 gcc/testsuite/gcc.target/powerpc/xvrlw.c | 34 
 3 files changed, 51 insertions(+)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 4d4c94ff0a0..fd3397b16f6 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1883,6 +1883,20 @@
 }
   [(set_attr "type" "vecperm")])
 
+;; -mcpu=future2 adds a vector rotate left word variant.  There is no vector
+;; byte/half-word/double-word/quad-word rotate left.  This insn occurs before
+;; altivec_vrl and will match for -mcpu=future, while other cpus will
+;; match the generic insn.
+(define_insn "*xvrlw"
+  [(set (match_operand:V4SI 0 "register_operand" "=v,wa")
+   (rotate:V4SI (match_operand:V4SI 1 "register_operand" "v,wa")
+(match_operand:V4SI 2 "register_operand" "v,wa")))]
+  "TARGET_XVRLW"
+  "@
+   vrlw %0,%1,%2
+   xvrlw %x0,%x1,%x2"
+  [(set_attr "type" "vecsimple")])
+
 (define_insn "altivec_vrl"
   [(set (match_operand:VI2 0 "register_operand" "=v")
 (rotate:VI2 (match_operand:VI2 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 37afa67f184..f52e0474e48 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -577,6 +577,9 @@ extern int rs6000_vector_align[];
 /* Whether we have PADDIS support.  */
 #define TARGET_PADDIS  TARGET_FUTURE2
 
+/* Whether we have XVRLW support.  */
+#define TARGET_XVRLW   TARGET_FUTURE2
+
 /* Whether the various reciprocal divide/square root estimate instructions
exist, and whether we should automatically generate code for the instruction
by default.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/xvrlw.c 
b/gcc/testsuite/gcc.target/powerpc/xvrlw.c
new file mode 100644
index 000..846f2e337c5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/xvrlw.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_future2_ok } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mdejagnu-cpu=future2 -O2" } */
+
+/* Test whether the xvrl (vector word rotate left using VSX registers insead of
+   Altivec registers is generated.  */
+
+#include 
+
+typedef vector unsigned int  v4si_t;
+
+v4si_t
+rotl_v4si_scalar (v4si_t x, unsigned long n)
+{
+  __asm__ (" # %x0" : "+f" (x));
+  return (x << n) | (x >> (32 - n));
+}
+
+v4si_t
+rotr_v4si_scalar (v4si_t x, unsigned long n)
+{
+  __asm__ (" # %x0" : "+f" (x));
+  return (x >> n) | (x << (32 - n));
+}
+
+v4si_t
+rotl_v4si_vector (v4si_t x, v4si_t y)
+{
+  __asm__ (" # %x0" : "+f" (x));
+  return vec_rl (x, y);
+}
+
+/* { dg-final { scan-assembler-times {\mxvrl\M} 3  } } */


[gcc(refs/users/meissner/heads/work164-dmf)] Add paddis support.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a111516557ce455bbcade91f94299af03c9f3bb4

commit a111516557ce455bbcade91f94299af03c9f3bb4
Author: Michael Meissner 
Date:   Mon Apr 8 23:46:29 2024 -0400

Add paddis support.

2024-03-22  Michael Meissner  

gcc/

* config/rs6000/constraints.md (eU): New constraint.
(eV): Likewise.
* config/rs6000/predicates.md (paddis_operand): New predicate.
(paddis_paddi_operand): Likewise.
(add_operand): Add paddis support.
* config/rs6000/rs6000.cc (num_insns_constant_gpr): Add paddis 
support.
(num_insns_constant_multi): Likewise.
(print_operand): Add %B for paddis support.
* config/rs6000/rs6000.h (TARGET_PADDIS): New macro.
(SIGNED_INTEGER_32BIT_P): Likewise.
* config/rs6000/rs6000.md (isa attribute): Add paddis support.
(enabled attribute); Likewise.
(add3): Likewise.
(adddi3 splitter): New splitter for paddis.
(movdi_internal64): Add paddis support.
(movdi splitter): New splitter for paddis.

gcc/testsuite/

* gcc.target/powerpc/paddis.c: New test.

Diff:
---
 gcc/config/rs6000/constraints.md  | 10 
 gcc/config/rs6000/predicates.md   | 52 -
 gcc/config/rs6000/rs6000.cc   | 25 
 gcc/config/rs6000/rs6000.h|  4 ++
 gcc/config/rs6000/rs6000.md   | 96 ++-
 gcc/testsuite/gcc.target/powerpc/paddis.c | 24 
 6 files changed, 197 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 277a30a8245..4d8d21fd6bb 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -222,6 +222,16 @@
   "An IEEE 128-bit constant that can be loaded into VSX registers."
   (match_operand 0 "easy_vector_constant_ieee128"))
 
+(define_constraint "eU"
+  "@internal integer constant that can be loaded with paddis"
+  (and (match_code "const_int")
+   (match_operand 0 "paddis_operand")))
+
+(define_constraint "eV"
+  "@internal integer constant that can be loaded with paddis + paddi"
+  (and (match_code "const_int")
+   (match_operand 0 "paddis_paddi_operand")))
+
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index b325000690b..0b7c0bf4b0f 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -369,6 +369,53 @@
   return SIGNED_INTEGER_34BIT_P (INTVAL (op));
 })
 
+;; Return 1 if op is a 64-bit constant that uses the paddis instruction
+(define_predicate "paddis_operand"
+  (match_code "const_int")
+{
+  if (!TARGET_PADDIS && TARGET_POWERPC64)
+return 0;
+
+  /* If addi, addis, or paddi can handle the number, don't return true.  */
+  HOST_WIDE_INT value = INTVAL (op);
+  if (SIGNED_INTEGER_34BIT_P (value))
+return false;
+
+  /* If the number is too large for padds, return false.  */
+  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
+return false;
+
+  /* If the bottom 32-bits are non-zero, paddis can't handle it.  */
+  if ((value & HOST_WIDE_INT_C(0x)) != 0)
+return false;
+
+  return true;
+})
+
+;; Return 1 if op is a 64-bit constant that needs the paddis instruction and an
+;; addi/addis/paddi instruction combination.
+(define_predicate "paddis_paddi_operand"
+  (match_code "const_int")
+{
+  if (!TARGET_PADDIS && TARGET_POWERPC64)
+return 0;
+
+  /* If addi, addis, or paddi can handle the number, don't return true.  */
+  HOST_WIDE_INT value = INTVAL (op);
+  if (SIGNED_INTEGER_34BIT_P (value))
+return false;
+
+  /* If the number is too large for padds, return false.  */
+  if (!SIGNED_INTEGER_32BIT_P (value >> 32))
+return false;
+
+  /* If the bottom 32-bits are zero, we can use paddis alone to handle it.  */
+  if ((value & HOST_WIDE_INT_C(0x)) == 0)
+return false;
+
+  return true;
+})
+
 ;; Return 1 if op is a register that is not special.
 ;; Disallow (SUBREG:SF (REG:SI)) and (SUBREG:SI (REG:SF)) on VSX systems where
 ;; you need to be careful in moving a SFmode to SImode and vice versa due to
@@ -1050,7 +1097,10 @@
   (if_then_else (match_code "const_int")
 (match_test "satisfies_constraint_I (op)
 || satisfies_constraint_L (op)
-|| satisfies_constraint_eI (op)")
+|| satisfies_constraint_eI (op)
+|| satisfies_constraint_eU (op)
+|| satisfies_constraint_eV (op)")
+
 (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 42e9fa1dc56..963c42df3d7 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.

[gcc(refs/users/meissner/heads/work164-dmf)] Update ChangeLog.*

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:49632a09712aad4bde818f869b02faadade8e1da

commit 49632a09712aad4bde818f869b02faadade8e1da
Author: Michael Meissner 
Date:   Mon Apr 8 23:57:17 2024 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.dmf | 161 +-
 1 file changed, 160 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
index dfdbcd4a4b4..7182dca56a7 100644
--- a/gcc/ChangeLog.dmf
+++ b/gcc/ChangeLog.dmf
@@ -1,6 +1,165 @@
+ Branch work164-dmf, patch #11 from work164 

+
+Add -mcpu=future tuning support.
+
+This patch makes -mtune=future use the same tuning decision as -mtune=power11.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/power10.md (all reservations): Add future as an
+   alterntive to power10 and power11.
+
+ Branch work164-dmf, patch #10 from work164 

+
+Add -mcpu=future support.
+
+This patch adds the future option to the -mcpu= and -mtune= switches.
+
+This patch treats the future like a power11 in terms of costs and reassociation
+width.
+
+This patch issues a ".machine future" to the assembly file if you use
+-mcpu=power11.
+
+This patch defines _ARCH_PWR_FUTURE if the user uses -mcpu=future.
+
+This patch allows GCC to be configured with the --with-cpu=future and
+--with-tune=future options.
+
+This patch passes -mfuture to the assembler if the user uses -mcpu=future.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config.gcc (rs6000*-*-*, powerpc*-*-*): Add support for power11.
+   * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=power11.
+   * config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/driver-rs6000.cc (asm_names): Likewise.
+   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
+   _ARCH_PWR_FUTURE if -mcpu=future.
+   * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): New define.
+   (POWERPC_MASKS): Add future isa bit.
+   (power11 cpu): Add future definition.
+   * config/rs6000/rs6000-opts.h (PROCESSOR_FUTURE): Add future processor.
+   * config/rs6000/rs6000-string.cc (expand_compare_loop): Likewise.
+   * config/rs6000/rs6000-tables.opt: Regenerate.
+   * config/rs6000/rs6000.cc (rs6000_option_override_internal): Add future
+   support.
+   (rs6000_machine_from_flags): Likewise.
+   (rs6000_reassociation_width): Likewise.
+   (rs6000_adjust_cost): Likewise.
+   (rs6000_issue_rate): Likewise.
+   (rs6000_sched_reorder): Likewise.
+   (rs6000_sched_reorder2): Likewise.
+   (rs6000_register_move_cost): Likewise.
+   (rs6000_opt_masks): Likewise.
+   * config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/rs6000.md (cpu attribute): Add future.
+   * config/rs6000/rs6000.opt (-mpower11): Add internal future ISA flag.
+   * doc/invoke.texi (RS/6000 and PowerPC Options): Document -mcpu=future.
+
+ Branch work164-dmf, patch #3 from work164 

+
+Add -mcpu=power11 tests.
+
+This patch adds some simple tests for -mcpu=power11 support.  In order to run
+these tests, you need an assembler that supports the appropriate option for
+supporting the Power11 processor (-mpower11 under Linux or -mpwr11 under AIX).
+
+2024-04-08  Michael Meissner  
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/power11-1.c: New test.
+   * gcc.target/powerpc/power11-2.c: Likewise.
+   * gcc.target/powerpc/power11-3.c: Likewise.
+   * lib/target-supports.exp (check_effective_target_power11_ok): Add new
+   effective target.
+
+ Branch work164-dmf, patch #2 from work164 

+
+Add -mcpu=power11 tuning support.
+
+This patch makes -mtune=power11 use the same tuning decisions as 
-mtune=power10.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/power10.md (all reservations): Add power11 as an
+   alternative to power10.
+
+ Branch work164-dmf, patch #1 from work164 

+
+Add -mcpu=power11 support.
+
+This patch adds the power11 option to the -mcpu= and -mtune= switches.
+
+This patch treats the power11 like a power10 in terms of costs and 
reassociation
+width.
+
+This patch issues a ".machine power11" to the assembly file if you use
+-mcpu=power11.
+
+This patch defines _ARCH_PWR11 if the user uses -mcpu=power11.
+
+This patch allows GCC to be configured with the --with-cpu=power11 and
+--with-tune=power11 options.
+
+This patch passes -mpwr11 to the assembler if the user uses -mcpu=power11.
+
+This patch adds support for using "power11" in the __builtin_cpu_is built-in
+function.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config.gcc (rs6000*-*-*, powerpc*-*-*): Add support for power11.
+   * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support fo

[gcc(refs/users/meissner/heads/work164-dmf)] Update ChangeLog.*

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:dd2213d6a6c95bd6ba7cdc2a5da4fd44ed57f1e4

commit dd2213d6a6c95bd6ba7cdc2a5da4fd44ed57f1e4
Author: Michael Meissner 
Date:   Tue Apr 9 00:00:15 2024 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.dmf | 464 ++
 1 file changed, 464 insertions(+)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
index 7182dca56a7..4017ffe6242 100644
--- a/gcc/ChangeLog.dmf
+++ b/gcc/ChangeLog.dmf
@@ -1,3 +1,467 @@
+ Branch work164-dmf, patch #122 
+
+Add xvrlw support.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/altivec.md (xvrlw): New insn.
+   * config/rs6000/rs6000.h (TARGET_XVRLW): New macro.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/xvrlw.c: New test.
+
+ Branch work164-dmf, patch #121 
+
+Add paddis support.
+
+2024-03-22  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/constraints.md (eU): New constraint.
+   (eV): Likewise.
+   * config/rs6000/predicates.md (paddis_operand): New predicate.
+   (paddis_paddi_operand): Likewise.
+   (add_operand): Add paddis support.
+   * config/rs6000/rs6000.cc (num_insns_constant_gpr): Add paddis support.
+   (num_insns_constant_multi): Likewise.
+   (print_operand): Add %B for paddis support.
+   * config/rs6000/rs6000.h (TARGET_PADDIS): New macro.
+   (SIGNED_INTEGER_32BIT_P): Likewise.
+   * config/rs6000/rs6000.md (isa attribute): Add paddis support.
+   (enabled attribute); Likewise.
+   (add3): Likewise.
+   (adddi3 splitter): New splitter for paddis.
+   (movdi_internal64): Add paddis support.
+   (movdi splitter): New splitter for paddis.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/paddis.c: New test.
+
+ Branch work164-dmf, patch #120 
+
+Add -mcpu=future2
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future2.
+   * config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
+   _ARCH_PWR_FUTURE2 if -mcpu=future.
+   * config/rs6000/rs6000-cpus.def (ISA_FUTURE2_MASKS_SERVER): New macro.
+   (POWERPC_MASKS): Add support for -mcpu=future2.
+   (future2 processor): Likewise.
+   * config/rs6000/rs6000-tables.opt: Regenerate
+   * config/rs6000/rs6000.h (ASM_CPU_SPEC): Add support for -mcpu=future2.
+   * config/rs6000/rs6000.opt (-mfuture2): New internal option.
+
+gcc/testsuite/
+
+   * lib/target-supports.exp (check_effective_target_powerpc_future2_ok):
+   New effective target test.
+
+ Branch work164-dmf, patch #111 
+
+Add saturating subtract built-ins.
+
+This patch adds support for a saturating subtract built-in function that may be
+added to a future PowerPC processor.  Note, if it is added, the name of the
+built-in function may change before GCC 13 is released.  If the name changes,
+we will submit a patch changing the name.
+
+I also added support for providing dense math built-in functions, even though
+at present, we have not added any new built-in functions for dense math.  It is
+likely we will want to add new dense math built-in functions as the dense math
+support is fleshed out.
+
+The patches have been tested on both little and big endian systems.  Can I 
check
+it into the master branch?
+
+2024-04-08   Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-builtin.cc (rs6000_invalid_builtin): Add support
+   for flagging invalid use of future built-in functions.
+   (rs6000_builtin_is_supported): Add support for future built-in
+   functions.
+   * config/rs6000/rs6000-builtins.def (__builtin_saturate_subtract32): New
+   built-in function for -mcpu=future.
+   (__builtin_saturate_subtract64): Likewise.
+   * config/rs6000/rs6000-gen-builtins.cc (enum bif_stanza): Add stanzas
+   for -mcpu=future built-ins.
+   (stanza_map): Likewise.
+   (enable_string): Likewise.
+   (struct attrinfo): Likewise.
+   (parse_bif_attrs): Likewise.
+   (write_decls): Likewise.
+   * config/rs6000/rs6000.md (sat_sub3): Add saturating subtract
+   built-in insn declarations.
+   (sat_sub3_dot): Likewise.
+   (sat_sub3_dot2): Likewise.
+   * doc/extend.texi (Future PowerPC built-ins): New section.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/subfus-1.c: New test.
+   * gcc.target/powerpc/subfus-2.c: Likewise.
+
+ Branch work164-dmf, patch #110 
+
+Support load/store vector with right length.
+
+This patch adds support for new instructions that may be added to the PowerPC
+architecture in the future to enhance the load and store vector with length
+instructions.
+
+The current instructions (

[gcc(refs/users/meissner/heads/work164-bugs)] Add power10 wide immediate fusion

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:36b37616daeec9f84d6afa97a6e2d235bebe103d

commit 36b37616daeec9f84d6afa97a6e2d235bebe103d
Author: Michael Meissner 
Date:   Tue Apr 9 00:27:11 2024 -0400

Add power10 wide immediate fusion

2024-04-09  Michael Meissner  

gcc/

PR target/108018
* config/rs6000/rs6000.md (tocref_p10_fusion): New insn.
(tocref): Don't allow tocrev if tocref_p10_fusion
would be used.

Diff:
---
 gcc/config/rs6000/rs6000.md | 43 ++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index abc809448ad..1acd34ca0ee 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11302,13 +11302,54 @@
"TARGET_XCOFF && TARGET_CMODEL != CMODEL_SMALL"
"la %0,%2@l(%1)")
 
+;; For power10 fusion, don't split the two pieces of the TOCREF, but keep them
+;; together so the instructions will fuse in case the user is using power10
+;; fusion via -mtune=power10 but is not enabling the PC-relative support.  If
+;; we split the instructions up, the scheduler will likely split the ADDIS and
+;; ADDI instruction.
+
+(define_insn "*tocref_p10_fusion"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=b")
+   (match_operand:P 1 "small_toc_ref" "R"))]
+   "TARGET_TOC && TARGET_P10_FUSION && TARGET_CMODEL != CMODEL_SMALL
+&& (TARGET_ELF || TARGET_XCOFF)
+&& legitimate_constant_pool_address_p (operands[1], QImode, false)"
+{
+  rtx op = operands[1];
+  machine_mode mode = mode;
+  rtx offset = const0_rtx;
+
+  if (GET_CODE (op) == PLUS && add_cint_operand (XEXP (op, 1), mode))
+{
+  offset = XEXP (op, 1);
+  op = XEXP (op, 0);
+}
+
+gcc_assert (GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL);
+operands[2] = XVECEXP (op, 0, 0);  /* symbol ref.  */
+operands[3] = XVECEXP (op, 0, 1);  /* register, usually r2.  */
+operands[4] = offset;
+
+if (INTVAL (offset) == 0)
+  return (TARGET_ELF
+ ? "addis %0,%3,%2@toc@ha\;addi %0,%0,%2@toc@l"
+ : "addis %0,%2@u(%3)\;la %0,%2@l(%3)");
+else
+  return (TARGET_ELF
+ ? "addis %0,%3,%2+%4@toc@ha\;addi %0,%0,%2+%4@toc@l"
+ : "addis %0,%2+%4@u(%3)\;la %0,%2+%4@l(%3)");
+}
+  [(set_attr "type" "*")
+   (set_attr "length" "8")])
+
 (define_insn_and_split "*tocref"
   [(set (match_operand:P 0 "gpc_reg_operand" "=b")
(match_operand:P 1 "small_toc_ref" "R"))]
"TARGET_TOC
+&& !(TARGET_P10_FUSION && (TARGET_ELF || TARGET_XCOFF))
 && legitimate_constant_pool_address_p (operands[1], QImode, false)"
"la %0,%a1"
-   "&& TARGET_CMODEL != CMODEL_SMALL && reload_completed"
+   "&& TARGET_CMODEL != CMODEL_SMALL && !TARGET_P10_FUSION && reload_completed"
   [(set (match_dup 0) (high:P (match_dup 1)))
(set (match_dup 0) (lo_sum:P (match_dup 0) (match_dup 1)))])


[gcc(refs/users/meissner/heads/work164-bugs)] Add power10 ori/oris and xori/xoris fusion support.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:9b813c320ee59176e3279391ef7c47ef6ea8ba3a

commit 9b813c320ee59176e3279391ef7c47ef6ea8ba3a
Author: Michael Meissner 
Date:   Tue Apr 9 00:30:46 2024 -0400

Add power10 ori/oris and xori/xoris fusion support.

2024-04-09  Michael Meissner  

gcc/

* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.md (gen_wide_immediate): Add wide 
immediate
fusion patterns.
* config/rs6000/rs6000.md (IORXOR_OP): New code attribute.
(ior, xor splitter): Do not split ior/xor patterns that could be 
fused
on power10.

Diff:
---
 gcc/config/rs6000/fusion.md| 10 ++
 gcc/config/rs6000/genfusion.pl | 17 +
 gcc/config/rs6000/rs6000.md|  5 -
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 4ed9ae1d69f..45819627df6 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -3055,3 +3055,13 @@
   [(set_attr "type" "fused_vector")
(set_attr "cost" "6")
(set_attr "length" "8")])
+
+;; ori_oris and xori_xoris fusion patterns generated by gen_wide_immediate
+(define_insn "*fuse_i_is"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=&r")
+(iorxor:DI
+  (match_operand:DI 1 "gpc_reg_operand" "r")
+  (match_operand:DI 2 "non_logical_cint_operand" "n")))]
+  "(TARGET_P10_FUSION)"
+  "i %0,%1,%w2\;is %0,%0,%v2"
+  [(set_attr "length" "8")])
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 2271be14bfa..d059de8afc9 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -367,6 +367,23 @@ EOF
   }
 }
 
+sub gen_wide_immediate
+{
+print <<"EOF";
+
+;; ori_oris and xori_xoris fusion patterns generated by gen_wide_immediate
+(define_insn "*fuse_i_is"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=&r")
+(iorxor:DI
+  (match_operand:DI 1 "gpc_reg_operand" "r")
+  (match_operand:DI 2 "non_logical_cint_operand" "n")))]
+  "(TARGET_P10_FUSION)"
+  "i %0,%1,%w2\\;is %0,%0,%v2"
+  [(set_attr "length" "8")])
+EOF
+}
+
 gen_ld_cmpi_p10();
 gen_logical_addsubf();
 gen_addadd;
+gen_wide_immediate();
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 1acd34ca0ee..f626b68ebb2 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -822,6 +822,9 @@
 (SF "TARGET_P8_VECTOR")
 (DI "TARGET_POWERPC64")])
 
+;; Convert iorxor iterator into or/xor instruction
+(define_code_attr IORXOR_OP [(ior "or") (xor "xor")])
+
 (include "darwin.md")
 
 ;; Start with fixed-point load and store insns.  Here we put only the more
@@ -3933,7 +3936,7 @@
   [(set (match_operand:GPR 0 "gpc_reg_operand")
(iorxor:GPR (match_operand:GPR 1 "gpc_reg_operand")
(match_operand:GPR 2 "non_logical_cint_operand")))]
-  ""
+  "mode != DImode || !TARGET_P10_FUSION"
   [(set (match_dup 3)
(iorxor:GPR (match_dup 1)
(match_dup 4)))


[gcc(refs/users/meissner/heads/work164-bugs)] Update ChangeLog.*

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:d332fba26fbf1202b38fe1d51002abbcbdd4bcbf

commit d332fba26fbf1202b38fe1d51002abbcbdd4bcbf
Author: Michael Meissner 
Date:   Tue Apr 9 00:34:39 2024 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 176 -
 1 file changed, 175 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index 4291f519935..9c793434348 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,6 +1,180 @@
+ Branch work164-bugs, patch #201 
+
+Add power10 ori/oris and xori/xoris fusion support.
+
+2024-04-09  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/fusion.md: Regenerate.
+   * config/rs6000/genfusion.md (gen_wide_immediate): Add wide immediate
+   fusion patterns.
+   * config/rs6000/rs6000.md (IORXOR_OP): New code attribute.
+   (ior, xor splitter): Do not split ior/xor patterns that could be fused
+   on power10.
+
+ Branch work164-bugs, patch #200 
+
+Add power10 wide immediate fusion
+
+2024-04-09  Michael Meissner  
+
+gcc/
+
+   PR target/108018
+   * config/rs6000/rs6000.md (tocref_p10_fusion): New insn.
+   (tocref): Don't allow tocrev if tocref_p10_fusion
+   would be used.
+
+ Branch work164-bugs, patch #10 from work164 

+
+Add -mcpu=future support.
+
+This patch adds the future option to the -mcpu= and -mtune= switches.
+
+This patch treats the future like a power11 in terms of costs and reassociation
+width.
+
+This patch issues a ".machine future" to the assembly file if you use
+-mcpu=power11.
+
+This patch defines _ARCH_PWR_FUTURE if the user uses -mcpu=future.
+
+This patch allows GCC to be configured with the --with-cpu=future and
+--with-tune=future options.
+
+This patch passes -mfuture to the assembler if the user uses -mcpu=future.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config.gcc (rs6000*-*-*, powerpc*-*-*): Add support for power11.
+   * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=power11.
+   * config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/driver-rs6000.cc (asm_names): Likewise.
+   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
+   _ARCH_PWR_FUTURE if -mcpu=future.
+   * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): New define.
+   (POWERPC_MASKS): Add future isa bit.
+   (power11 cpu): Add future definition.
+   * config/rs6000/rs6000-opts.h (PROCESSOR_FUTURE): Add future processor.
+   * config/rs6000/rs6000-string.cc (expand_compare_loop): Likewise.
+   * config/rs6000/rs6000-tables.opt: Regenerate.
+   * config/rs6000/rs6000.cc (rs6000_option_override_internal): Add future
+   support.
+   (rs6000_machine_from_flags): Likewise.
+   (rs6000_reassociation_width): Likewise.
+   (rs6000_adjust_cost): Likewise.
+   (rs6000_issue_rate): Likewise.
+   (rs6000_sched_reorder): Likewise.
+   (rs6000_sched_reorder2): Likewise.
+   (rs6000_register_move_cost): Likewise.
+   (rs6000_opt_masks): Likewise.
+   * config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
+   * config/rs6000/rs6000.md (cpu attribute): Add future.
+   * config/rs6000/rs6000.opt (-mpower11): Add internal future ISA flag.
+   * doc/invoke.texi (RS/6000 and PowerPC Options): Document -mcpu=future.
+
+ Branch work164-bugs, patch #3 from work164 

+
+Add -mcpu=power11 tests.
+
+This patch adds some simple tests for -mcpu=power11 support.  In order to run
+these tests, you need an assembler that supports the appropriate option for
+supporting the Power11 processor (-mpower11 under Linux or -mpwr11 under AIX).
+
+2024-04-08  Michael Meissner  
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/power11-1.c: New test.
+   * gcc.target/powerpc/power11-2.c: Likewise.
+   * gcc.target/powerpc/power11-3.c: Likewise.
+   * lib/target-supports.exp (check_effective_target_power11_ok): Add new
+   effective target.
+
+ Branch work164-bugs, patch #2 from work164 

+
+Add -mcpu=power11 tuning support.
+
+This patch makes -mtune=power11 use the same tuning decisions as 
-mtune=power10.
+
+2024-04-08  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/power10.md (all reservations): Add power11 as an
+   alternative to power10.
+
+ Branch work164-bugs, patch #1 from work164 

+
+Add -mcpu=power11 support.
+
+This patch adds the power11 option to the -mcpu= and -mtune= switches.
+
+This patch treats the power11 like a power10 in terms of costs and 
reassociation
+width.
+
+This patch issues a ".machine power11" to the assembly file if you use
+-mcpu=power11.
+
+This patch defines _ARCH_PWR11 if the user uses -mcpu=

[gcc(refs/users/meissner/heads/work164-vpair)] Power10: Add options to disable load and store vector pair.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:76066d95bf3f71b749084b11fc6dab1ee7cbd813

commit 76066d95bf3f71b749084b11fc6dab1ee7cbd813
Author: Michael Meissner 
Date:   Tue Apr 9 00:42:14 2024 -0400

Power10: Add options to disable load and store vector pair.

In working on some future patches that involve utilizing vector pair
instructions, I wanted to be able to tune my program to enable or disable 
using
the vector pair load or store operations while still keeping the other
operations on the vector pair.

This patch adds two undocumented tuning options.  The -mno-load-vector-pair
option would tell GCC to generate two load vector instructions instead of a
single load vector pair.  The -mno-store-vector-pair option would tell GCC 
to
generate two store vector instructions instead of a single store vector 
pair.

If either -mno-load-vector-pair is used, GCC will not generate the indexed
stxvpx instruction.  Similarly if -mno-store-vector-pair is used, GCC will 
not
generate the indexed lxvpx instruction.  The reason for this is to enable
splitting the {,p}lxvp or {,p}stxvp instructions after reload without 
needing a
scratch GPR register.

The default for -mcpu=power10 is that both load vector pair and store vector
pair are enabled.

I added code so that the user code can modify these settings using either a
'#pragma GCC target' directive or used __attribute__((__target__(...))) in 
the
function declaration.

I added tests for the switches, #pragma, and attribute options.

I have built this on both little endian power10 systems and big endian 
power9
systems doing the normal bootstrap and test.  There were no regressions in 
any
of the tests, and the new tests passed.  Can I check this patch into the 
master
branch?

2024-04-09  Michael Meissner  

gcc/

* config/rs6000/mma.md (movoo): Add support for 
-mno-load-vector-pair and
-mno-store-vector-pair.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add support 
for
-mload-vector-pair and -mstore-vector-pair.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): Only allow
indexed mode for OOmode if we are generating both load vector pair 
and
store vector pair instructions.
(rs6000_option_override_internal): Add support for 
-mno-load-vector-pair
and -mno-store-vector-pair.
(rs6000_opt_masks): Likewise.
* config/rs6000/rs6000.md (isa attribute): Add lxvp and stxvp
attributes.
(enabled attribute): Likewise.
* config/rs6000/rs6000.opt (-mload-vector-pair): New option.
(-mstore-vector-pair): Likewise.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-attribute.c: New test.
* gcc.target/powerpc/vector-pair-pragma.c: New test.
* gcc.target/powerpc/vector-pair-switch1.c: New test.
* gcc.target/powerpc/vector-pair-switch2.c: New test.
* gcc.target/powerpc/vector-pair-switch3.c: New test.
* gcc.target/powerpc/vector-pair-switch4.c: New test.

Diff:
---
 gcc/config/rs6000/mma.md  | 19 +--
 gcc/config/rs6000/rs6000-cpus.def |  8 ++--
 gcc/config/rs6000/rs6000.cc   | 30 --
 gcc/config/rs6000/rs6000.md   | 10 +-
 gcc/config/rs6000/rs6000.opt  |  8 
 5 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 04e2d0066df..6a7d8a836db 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -292,27 +292,34 @@
 gcc_assert (false);
 })
 
+;; If the user used -mno-store-vector-pair or -mno-load-vector pair, use an
+;; alternative that does not allow indexed addresses so we can split the load
+;; or store.
 (define_insn_and_split "*movoo"
-  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa")
-   (match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
+  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,wa,ZwO,QwO,wa")
+   (match_operand:OO 1 "input_operand" "ZwO,QwO,wa,wa,wa"))]
   "TARGET_MMA
&& (gpc_reg_operand (operands[0], OOmode)
|| gpc_reg_operand (operands[1], OOmode))"
   "@
lxvp%X1 %x0,%1
+   #
stxvp%X0 %x1,%0
+   #
#"
   "&& reload_completed
-   && (!MEM_P (operands[0]) && !MEM_P (operands[1]))"
+   && ((MEM_P (operands[0]) && !TARGET_STORE_VECTOR_PAIR)
+   || (MEM_P (operands[1]) && !TARGET_LOAD_VECTOR_PAIR)
+   || (!MEM_P (operands[0]) && !MEM_P (operands[1])))"
   [(const_int 0)]
 {
   rs6000_split_multireg_move (operands[0], operands[1]);
   DONE;
 }
-  [(set_attr "type" "vecload,vecstore,veclogical")
+  [(set_attr "type" "vecload,vecload,vecstore,vecstore,veclogical")
(set_attr "size" "256")
-   (set_attr "length" "*

[gcc(refs/users/meissner/heads/work164-vpair)] Revert all changes

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:9a6e5ecfb5ac36496f700ee14b6aeed59de991a4

commit 9a6e5ecfb5ac36496f700ee14b6aeed59de991a4
Author: Michael Meissner 
Date:   Tue Apr 9 00:43:54 2024 -0400

Revert all changes

Diff:
---
 gcc/config/rs6000/mma.md  | 19 ++-
 gcc/config/rs6000/rs6000-cpus.def |  8 ++--
 gcc/config/rs6000/rs6000.cc   | 30 ++
 gcc/config/rs6000/rs6000.md   | 10 +-
 gcc/config/rs6000/rs6000.opt  |  8 
 5 files changed, 11 insertions(+), 64 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 6a7d8a836db..04e2d0066df 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -292,34 +292,27 @@
 gcc_assert (false);
 })
 
-;; If the user used -mno-store-vector-pair or -mno-load-vector pair, use an
-;; alternative that does not allow indexed addresses so we can split the load
-;; or store.
 (define_insn_and_split "*movoo"
-  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,wa,ZwO,QwO,wa")
-   (match_operand:OO 1 "input_operand" "ZwO,QwO,wa,wa,wa"))]
+  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa")
+   (match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
   "TARGET_MMA
&& (gpc_reg_operand (operands[0], OOmode)
|| gpc_reg_operand (operands[1], OOmode))"
   "@
lxvp%X1 %x0,%1
-   #
stxvp%X0 %x1,%0
-   #
#"
   "&& reload_completed
-   && ((MEM_P (operands[0]) && !TARGET_STORE_VECTOR_PAIR)
-   || (MEM_P (operands[1]) && !TARGET_LOAD_VECTOR_PAIR)
-   || (!MEM_P (operands[0]) && !MEM_P (operands[1])))"
+   && (!MEM_P (operands[0]) && !MEM_P (operands[1]))"
   [(const_int 0)]
 {
   rs6000_split_multireg_move (operands[0], operands[1]);
   DONE;
 }
-  [(set_attr "type" "vecload,vecload,vecstore,vecstore,veclogical")
+  [(set_attr "type" "vecload,vecstore,veclogical")
(set_attr "size" "256")
-   (set_attr "length" "*,8,*,8,8")
-   (set_attr "isa" "lxvp,*,stxvp,*,*")])
+   (set_attr "length" "*,*,8")])
+
 
 ;; Vector quad support.  XOmode can only live in FPRs.
 (define_expand "movxo"
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 6e0b2449b18..47365534af8 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -77,12 +77,10 @@
 /* Flags that need to be turned off if -mno-power10.  */
 /* We comment out PCREL_OPT here to disable it by default because SPEC2017
performance was degraded by it.  */
-#define OTHER_POWER10_MASKS(OPTION_MASK_LOAD_VECTOR_PAIR   \
-| OPTION_MASK_MMA  \
+#define OTHER_POWER10_MASKS(OPTION_MASK_MMA\
 | OPTION_MASK_PCREL\
 /* | OPTION_MASK_PCREL_OPT */  \
-| OPTION_MASK_PREFIXED \
-| OPTION_MASK_STORE_VECTOR_PAIR)
+| OPTION_MASK_PREFIXED)
 
 #define ISA_3_1_MASKS_SERVER   (ISA_3_0_MASKS_SERVER   \
 | OPTION_MASK_POWER10  \
@@ -133,7 +131,6 @@
 | OPTION_MASK_FLOAT128_KEYWORD \
 | OPTION_MASK_FPRND\
 | OPTION_MASK_FUTURE   \
-| OPTION_MASK_LOAD_VECTOR_PAIR \
 | OPTION_MASK_POWER10  \
 | OPTION_MASK_POWER11  \
 | OPTION_MASK_P10_FUSION   \
@@ -161,7 +158,6 @@
 | OPTION_MASK_QUAD_MEMORY_ATOMIC   \
 | OPTION_MASK_RECIP_PRECISION  \
 | OPTION_MASK_SOFT_FLOAT   \
-| OPTION_MASK_STORE_VECTOR_PAIR\
 | OPTION_MASK_STRICT_ALIGN_OPTIONAL\
 | OPTION_MASK_VSX)
 
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index da7cfa3c09e..2921e72aea8 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -2722,9 +2722,7 @@ rs6000_setup_reg_addr_masks (void)
  /* Vector pairs can do both indexed and offset loads if the
 instructions are enabled, otherwise they can only do offset loads
 since it will be broken into two vector moves.  Vector quads can
-only do offset loads.  If the user restricted generation of either
-of the LXVP or STXVP instructions, do not allow indexed mode so
-that we can split the load/store.  */
+only do offset loads.  */
  else if ((addr_mask != 0) && TARGET_MMA
   && (m2 == OOmode || m2 ==

[gcc(refs/users/meissner/heads/work164-vpair)] Power10: Add options to disable load and store vector pair.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:a6a927f86a58862aa30bda8f609fb5641fc76810

commit a6a927f86a58862aa30bda8f609fb5641fc76810
Author: Michael Meissner 
Date:   Tue Apr 9 00:44:44 2024 -0400

Power10: Add options to disable load and store vector pair.

In working on some future patches that involve utilizing vector pair
instructions, I wanted to be able to tune my program to enable or disable 
using
the vector pair load or store operations while still keeping the other
operations on the vector pair.

This patch adds two undocumented tuning options.  The -mno-load-vector-pair
option would tell GCC to generate two load vector instructions instead of a
single load vector pair.  The -mno-store-vector-pair option would tell GCC 
to
generate two store vector instructions instead of a single store vector 
pair.

If either -mno-load-vector-pair is used, GCC will not generate the indexed
stxvpx instruction.  Similarly if -mno-store-vector-pair is used, GCC will 
not
generate the indexed lxvpx instruction.  The reason for this is to enable
splitting the {,p}lxvp or {,p}stxvp instructions after reload without 
needing a
scratch GPR register.

The default for -mcpu=power10 is that both load vector pair and store vector
pair are enabled.

I added code so that the user code can modify these settings using either a
'#pragma GCC target' directive or used __attribute__((__target__(...))) in 
the
function declaration.

I added tests for the switches, #pragma, and attribute options.

I have built this on both little endian power10 systems and big endian 
power9
systems doing the normal bootstrap and test.  There were no regressions in 
any
of the tests, and the new tests passed.  Can I check this patch into the 
master
branch?

2024-04-09  Michael Meissner  

gcc/

* config/rs6000/mma.md (movoo): Add support for 
-mno-load-vector-pair and
-mno-store-vector-pair.
* config/rs6000/rs6000-cpus.def (OTHER_POWER10_MASKS): Add support 
for
-mload-vector-pair and -mstore-vector-pair.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): Only allow
indexed mode for OOmode if we are generating both load vector pair 
and
store vector pair instructions.
(rs6000_option_override_internal): Add support for 
-mno-load-vector-pair
and -mno-store-vector-pair.
(rs6000_opt_masks): Likewise.
* config/rs6000/rs6000.md (isa attribute): Add lxvp and stxvp
attributes.
(enabled attribute): Likewise.
* config/rs6000/rs6000.opt (-mload-vector-pair): New option.
(-mstore-vector-pair): Likewise.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-attribute.c: New test.
* gcc.target/powerpc/vector-pair-pragma.c: New test.
* gcc.target/powerpc/vector-pair-switch1.c: New test.
* gcc.target/powerpc/vector-pair-switch2.c: New test.
* gcc.target/powerpc/vector-pair-switch3.c: New test.
* gcc.target/powerpc/vector-pair-switch4.c: New test.

Diff:
---
 gcc/config/rs6000/mma.md   | 19 +---
 gcc/config/rs6000/rs6000-cpus.def  |  8 +++-
 gcc/config/rs6000/rs6000.cc| 30 +++-
 gcc/config/rs6000/rs6000.md| 10 +++-
 gcc/config/rs6000/rs6000.opt   |  8 
 .../gcc.target/powerpc/vector-pair-attribute.c | 39 +++
 .../gcc.target/powerpc/vector-pair-pragma.c| 55 ++
 .../gcc.target/powerpc/vector-pair-switch1.c   | 16 +++
 .../gcc.target/powerpc/vector-pair-switch2.c   | 17 +++
 .../gcc.target/powerpc/vector-pair-switch3.c   | 17 +++
 .../gcc.target/powerpc/vector-pair-switch4.c   | 17 +++
 11 files changed, 225 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 04e2d0066df..6a7d8a836db 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -292,27 +292,34 @@
 gcc_assert (false);
 })
 
+;; If the user used -mno-store-vector-pair or -mno-load-vector pair, use an
+;; alternative that does not allow indexed addresses so we can split the load
+;; or store.
 (define_insn_and_split "*movoo"
-  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,ZwO,wa")
-   (match_operand:OO 1 "input_operand" "ZwO,wa,wa"))]
+  [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,wa,ZwO,QwO,wa")
+   (match_operand:OO 1 "input_operand" "ZwO,QwO,wa,wa,wa"))]
   "TARGET_MMA
&& (gpc_reg_operand (operands[0], OOmode)
|| gpc_reg_operand (operands[1], OOmode))"
   "@
lxvp%X1 %x0,%1
+   #
stxvp%X0 %x1,%0
+   #
#"
   "&& reload_completed
-   && (!MEM_P (operands[0]) && !MEM_P (oper

[gcc(refs/users/meissner/heads/work164-vpair)] Peter's patches for subreg support.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:f975eab7d1e0a30db8fb7ce8107a08be5e018d05

commit f975eab7d1e0a30db8fb7ce8107a08be5e018d05
Author: Michael Meissner 
Date:   Tue Apr 9 00:49:34 2024 -0400

Peter's patches for subreg support.

2024-03-12  Peter Bergner  

gcc/

PR target/109116
* gcc/config/rs6000/rs6000.cc (rs6000_modes_tieable_p): Make OOmode
tieable with 128-bit vector modes.

2024-01-23  Peter Bergner  

gcc/

PR target/109116
* gcc/config/rs6000/mma.md (vsx_disassemble_pair): Use SUBREG's 
instead
of UNSPEC's.
(mma_disassemble_acc): Likewise.

Diff:
---
 gcc/config/rs6000/mma.md| 50 -
 gcc/config/rs6000/rs6000.cc |  9 +---
 2 files changed, 10 insertions(+), 49 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 6a7d8a836db..831e646c473 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -405,29 +405,8 @@
(match_operand 2 "const_0_to_1_operand")]
   "TARGET_MMA"
 {
-  rtx src;
-  int regoff = INTVAL (operands[2]);
-  src = gen_rtx_UNSPEC (V16QImode,
-   gen_rtvec (2, operands[1], GEN_INT (regoff)),
-   UNSPEC_MMA_EXTRACT);
-  emit_move_insn (operands[0], src);
-  DONE;
-})
-
-(define_insn_and_split "*vsx_disassemble_pair"
-  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
-   (unspec:V16QI [(match_operand:OO 1 "vsx_register_operand" "wa")
- (match_operand 2 "const_0_to_1_operand")]
- UNSPEC_MMA_EXTRACT))]
-  "TARGET_MMA
-   && vsx_register_operand (operands[1], OOmode)"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  int reg = REGNO (operands[1]);
-  int regoff = INTVAL (operands[2]);
-  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
+  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
+  rtx src = simplify_gen_subreg (V16QImode, operands[1], OOmode, regoff);
   emit_move_insn (operands[0], src);
   DONE;
 })
@@ -479,29 +458,8 @@
(match_operand 2 "const_0_to_3_operand")]
   "TARGET_MMA"
 {
-  rtx src;
-  int regoff = INTVAL (operands[2]);
-  src = gen_rtx_UNSPEC (V16QImode,
-   gen_rtvec (2, operands[1], GEN_INT (regoff)),
-   UNSPEC_MMA_EXTRACT);
-  emit_move_insn (operands[0], src);
-  DONE;
-})
-
-(define_insn_and_split "*mma_disassemble_acc"
-  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
-   (unspec:V16QI [(match_operand:XO 1 "fpr_reg_operand" "d")
- (match_operand 2 "const_0_to_3_operand")]
- UNSPEC_MMA_EXTRACT))]
-  "TARGET_MMA
-   && fpr_reg_operand (operands[1], XOmode)"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  int reg = REGNO (operands[1]);
-  int regoff = INTVAL (operands[2]);
-  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
+  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
+  rtx src = simplify_gen_subreg (V16QImode, operands[1], XOmode, regoff);
   emit_move_insn (operands[0], src);
   DONE;
 })
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index da7cfa3c09e..f12eb15be1f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1975,9 +1975,12 @@ rs6000_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
 static bool
 rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
 {
-  if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
-  || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode)
-return mode1 == mode2;
+   if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
+   || mode2 == PTImode || mode2 == XOmode)
+ return mode1 == mode2;
+ 
+  if (mode2 == OOmode)
+return ALTIVEC_OR_VSX_VECTOR_MODE (mode1);
 
   if (ALTIVEC_OR_VSX_VECTOR_MODE (mode1))
 return ALTIVEC_OR_VSX_VECTOR_MODE (mode2);


[gcc(refs/users/meissner/heads/work164-vpair)] Add support for vector pair unary and binary operations.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:473ef47c46ad9efc28bb01e376783d331596fdd4

commit 473ef47c46ad9efc28bb01e376783d331596fdd4
Author: Michael Meissner 
Date:   Tue Apr 9 00:55:10 2024 -0400

Add support for vector pair unary and binary operations.

2024-04-09  Michael Meissner  

gcc/

* config/rs6000/rs6000-builtins.def (__builtin_vpair_*): Add new
built-in functions for vector pair support.
* config/rs6000/rs6000-protos.h (enum vpair_split_unary): New
enumeration.
(vpair_split_unary): New declaration.
(vpair_split_binary): Likewise.
* config/rs6000/rs6000.cc (vpair_split_unary): New function to split
vector pair operations.
(vpair_split_binary): Likewise.
* config/rs6000/rs6000.md (toplevel): Include vector-pair.md.
* config/rs6000/t-rs6000 (MD_INCLUDES): Add vector-pair.md.
* config/rs6000/vector-pair.md: New file.
* doc/extend.texi (PowerPC Vector Pair Built-in Functions): Add
documentation for the new vector pair built-in functions.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-1.c: New test.
* gcc.target/powerpc/vector-pair-2.c: Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtins.def|  56 
 gcc/config/rs6000/rs6000-protos.h|  12 ++
 gcc/config/rs6000/rs6000.cc  |  67 ++
 gcc/config/rs6000/rs6000.md  |   1 +
 gcc/config/rs6000/t-rs6000   |   1 +
 gcc/config/rs6000/vector-pair.md | 160 +++
 gcc/doc/extend.texi  |  51 
 gcc/testsuite/gcc.target/powerpc/vector-pair-1.c |  87 
 gcc/testsuite/gcc.target/powerpc/vector-pair-2.c |  86 
 9 files changed, 521 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 3bc7fed6956..83e7206e989 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -4131,3 +4131,59 @@
 
   void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
 STXVP nothing {mma,pair}
+
+;; Vector pair built-in functions with float elements
+  v256 __builtin_vpair_f32_abs (v256);
+VPAIR_F32_ABS vpair_abs_v8sf2 {mma}
+
+  v256 __builtin_vpair_f32_add (v256, v256);
+VPAIR_F32_ADD vpair_add_v8sf3 {mma}
+
+  v256 __builtin_vpair_f32_div (v256, v256);
+VPAIR_F32_DIV vpair_div_v8sf3 {mma}
+
+  v256 __builtin_vpair_f32_max (v256, v256);
+VPAIR_F32_MAX vpair_smax_v8sf3 {mma}
+
+  v256 __builtin_vpair_f32_min (v256, v256);
+VPAIR_F32_MIN vpair_smin_v8sf3 {mma}
+
+  v256 __builtin_vpair_f32_mul (v256, v256);
+VPAIR_F32_MUL vpair_mul_v8sf3 {mma}
+
+  v256 __builtin_vpair_f32_nabs (v256);
+VPAIR_F32_NABS vpair_nabs_v8sf2 {mma}
+
+  v256 __builtin_vpair_f32_neg (v256);
+VPAIR_F32_NEG vpair_neg_v8sf2 {mma}
+
+  v256 __builtin_vpair_f32_sub (v256, v256);
+VPAIR_F32_SUB vpair_sub_v8sf3 {mma}
+
+;; Vector pair built-in functions with double elements
+  v256 __builtin_vpair_f64_abs (v256);
+VPAIR_F64_ABS vpair_abs_v4df2 {mma}
+
+  v256 __builtin_vpair_f64_add (v256, v256);
+VPAIR_F64_ADD vpair_add_v4df3 {mma}
+
+  v256 __builtin_vpair_f64_div (v256, v256);
+VPAIR_F64_DIV vpair_div_v4df3 {mma}
+
+  v256 __builtin_vpair_f64_max (v256, v256);
+VPAIR_F64_MAX vpair_smax_v4df3 {mma}
+
+  v256 __builtin_vpair_f64_min (v256, v256);
+VPAIR_F64_MIN vpair_smin_v4df3 {mma}
+
+  v256 __builtin_vpair_f64_mul (v256, v256);
+VPAIR_F64_MUL vpair_mul_v4df3 {mma}
+
+  v256 __builtin_vpair_f64_nabs (v256);
+VPAIR_F64_NABS vpair_nabs_v4df2 {mma}
+
+  v256 __builtin_vpair_f64_neg (v256);
+VPAIR_F64_NEG vpair_neg_v4df2 {mma}
+
+  v256 __builtin_vpair_f64_sub (v256, v256);
+VPAIR_F64_SUB vpair_sub_v4df3 {mma}
diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 09a57a806fa..4d6ecc83436 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -162,6 +162,18 @@ extern bool rs6000_pcrel_p (void);
 extern bool rs6000_fndecl_pcrel_p (const_tree);
 extern void rs6000_output_addr_vec_elt (FILE *, int);
 
+/* If we are splitting a vector pair unary operator into two separate vector
+   operations, we need to generate a NEG if this is NABS.  */
+
+enum vpair_split_unary {
+  VPAIR_SPLIT_NORMAL,  /* No extra processing is needed.  */
+  VPAIR_SPLIT_NEGATE   /* Wrap operation with a NEG.  */
+};
+
+extern void vpair_split_unary (rtx [], machine_mode, enum rtx_code,
+  enum vpair_split_unary);
+extern void vpair_split_binary (rtx [], machine_mode, enum rtx_code);
+
 /* Different PowerPC instruction formats that are used by GCC.  There are
various other instruction formats used by the PowerPC hardware, but these
formats are not currently u

[gcc(refs/users/meissner/heads/work164-vpair)] Add support for vector pair fma operations.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:dd2a7e91bc7f77c9bf37472daf343a6a22d977a0

commit dd2a7e91bc7f77c9bf37472daf343a6a22d977a0
Author: Michael Meissner 
Date:   Tue Apr 9 01:00:03 2024 -0400

Add support for vector pair fma operations.

2024-04-09  Michael Meissner  

gcc/

* config/rs6000/rs6000-builtins.def (__builtin_vpair_f32_fma): New
built-in.
(__builtin_vpair_f32_fms): Likewise.
(__builtin_vpair_f32_nfma): Likewise.
(__builtin_vpair_f32_nfms): Likewise.
(__builtin_vpair_f64_fma): Likewise.
(__builtin_vpair_f64_fms): Likewise.
(__builtin_vpair_f64_nfma): Likewise.
* config/rs6000/rs6000/rs6000-proto.h (enum vpair_split_fma): New
enumeration.
(vpair_split_fma): New declaration.
* config/rs6000/rs6000.cc (vpair_split_fma): New function to split
vector pair FMA operations.
* config/rs6000/vector-pair.md (UNSPEC_VPAIR_FMA): New unspec.
(vpair_stdname): Add UNSPEC_VPAIR_FMA.
(VPAIR_OP): Likewise.
(vpair_fma_4): New insns.
(vpair_fms_4): Likewise.
(vpair_nfma_4): Likewise.
(vpair_nfms_4): Likewise.
* doc/extend.texi (PowerPC Vector Pair Built-in Functions): 
Document new
vector pair fma built-in functions.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-3.c: New test.
* gcc.target/powerpc/vector-pair-4.c: Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtins.def| 24 ++
 gcc/config/rs6000/rs6000-protos.h| 13 
 gcc/config/rs6000/rs6000.cc  | 71 ++
 gcc/config/rs6000/vector-pair.md | 96 
 gcc/doc/extend.texi  | 25 ++
 gcc/testsuite/gcc.target/powerpc/vector-pair-3.c | 57 ++
 gcc/testsuite/gcc.target/powerpc/vector-pair-4.c | 57 ++
 7 files changed, 343 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 83e7206e989..4362cbb8fc7 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -4142,6 +4142,12 @@
   v256 __builtin_vpair_f32_div (v256, v256);
 VPAIR_F32_DIV vpair_div_v8sf3 {mma}
 
+  v256 __builtin_vpair_f32_fma (v256, v256, v256);
+VPAIR_F32_FMA vpair_fma_v8sf4 {mma}
+
+  v256 __builtin_vpair_f32_fms (v256, v256, v256);
+VPAIR_F32_FMS vpair_fms_v8sf4 {mma}
+
   v256 __builtin_vpair_f32_max (v256, v256);
 VPAIR_F32_MAX vpair_smax_v8sf3 {mma}
 
@@ -4157,6 +4163,12 @@
   v256 __builtin_vpair_f32_neg (v256);
 VPAIR_F32_NEG vpair_neg_v8sf2 {mma}
 
+  v256 __builtin_vpair_f32_nfma (v256, v256, v256);
+VPAIR_F32_NFMA vpair_nfma_v8sf4 {mma}
+
+  v256 __builtin_vpair_f32_nfms (v256, v256, v256);
+VPAIR_F32_NFMS vpair_nfms_v8sf4 {mma}
+
   v256 __builtin_vpair_f32_sub (v256, v256);
 VPAIR_F32_SUB vpair_sub_v8sf3 {mma}
 
@@ -4170,6 +4182,12 @@
   v256 __builtin_vpair_f64_div (v256, v256);
 VPAIR_F64_DIV vpair_div_v4df3 {mma}
 
+  v256 __builtin_vpair_f64_fma (v256, v256, v256);
+VPAIR_F64_FMA vpair_fma_v4df4 {mma}
+
+  v256 __builtin_vpair_f64_fms (v256, v256, v256);
+VPAIR_F64_FMS vpair_fms_v4df4 {mma}
+
   v256 __builtin_vpair_f64_max (v256, v256);
 VPAIR_F64_MAX vpair_smax_v4df3 {mma}
 
@@ -4185,5 +4203,11 @@
   v256 __builtin_vpair_f64_neg (v256);
 VPAIR_F64_NEG vpair_neg_v4df2 {mma}
 
+  v256 __builtin_vpair_f64_nfma (v256, v256, v256);
+VPAIR_F64_NFMA vpair_nfma_v4df4 {mma}
+
+  v256 __builtin_vpair_f64_nfms (v256, v256, v256);
+VPAIR_F64_NFMS vpair_nfms_v4df4 {mma}
+
   v256 __builtin_vpair_f64_sub (v256, v256);
 VPAIR_F64_SUB vpair_sub_v4df3 {mma}
diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 4d6ecc83436..aed4081c87b 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -174,6 +174,19 @@ extern void vpair_split_unary (rtx [], machine_mode, enum 
rtx_code,
   enum vpair_split_unary);
 extern void vpair_split_binary (rtx [], machine_mode, enum rtx_code);
 
+/* When we are splitting a vector pair FMA operation into two vector 
operations, we
+   may need to modify the code generated.  This enumeration encodes the
+   different choices.  */
+
+enum vpair_split_fma {
+  VPAIR_SPLIT_FMA, /* Fused multiply-add.  */
+  VPAIR_SPLIT_FMS, /* Fused multiply-subtract.  */
+  VPAIR_SPLIT_NFMA,/* Fused negate multiply-add.  */
+  VPAIR_SPLIT_NFMS /* Fused negate multiply-subtract.  */
+};
+
+extern void vpair_split_fma (rtx [], machine_mode, enum vpair_split_fma);
+
 /* Different PowerPC instruction formats that are used by GCC.  There are
various other instruction formats used by the PowerPC hardware, but these
formats are 

[gcc(refs/users/meissner/heads/work164-vpair)] Add vector pair init and splat.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:aa0d0f245a6d4a763b754944463a9c1800394e35

commit aa0d0f245a6d4a763b754944463a9c1800394e35
Author: Michael Meissner 
Date:   Tue Apr 9 01:08:44 2024 -0400

Add vector pair init and splat.

2024-04-09  Michael Meissner  

gcc/

* config/rs6000/rs6000-builtins.def (__builtin_vpair_zero): New
built-in function.
(__builtin_vpair_f32_splat): Likewise.
(__builtin_vpair_f64_splat): Likewise.
* config/rs6000/vector-pair.md (UNSPEC_VPAIR_ZERO): New unspec.
(UNSPEC_VPAIR_SPLAT): Likewise.
(VPAIR_SPLAT_VMODE): New mode iterator.
(VPAIR_SPLAT_ELEMENT_TO_VMODE): New mode attribute.
(vpair_splat_name): Likewise.
(vpair_zero): New insn.
(vpair_splat_): New define_expand.
(vpair_splat__internal): New insns.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-5.c: New test.
* gcc.target/powerpc/vector-pair-6.c: Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-builtins.def|  10 +++
 gcc/config/rs6000/vector-pair.md | 102 ++-
 gcc/doc/extend.texi  |   9 ++
 gcc/testsuite/gcc.target/powerpc/vector-pair-5.c |  56 +
 gcc/testsuite/gcc.target/powerpc/vector-pair-6.c |  56 +
 5 files changed, 232 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 4362cbb8fc7..b757a8630ff 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -4132,6 +4132,10 @@
   void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
 STXVP nothing {mma,pair}
 
+;; Vector pair built-in functions.
+  v256 __builtin_vpair_zero ();
+VPAIR_ZERO vpair_zero {mma}
+
 ;; Vector pair built-in functions with float elements
   v256 __builtin_vpair_f32_abs (v256);
 VPAIR_F32_ABS vpair_abs_v8sf2 {mma}
@@ -4169,6 +4173,9 @@
   v256 __builtin_vpair_f32_nfms (v256, v256, v256);
 VPAIR_F32_NFMS vpair_nfms_v8sf4 {mma}
 
+  v256 __builtin_vpair_f32_splat (float);
+VPAIR_F32_SPLAT vpair_splat_v8sf {mma}
+
   v256 __builtin_vpair_f32_sub (v256, v256);
 VPAIR_F32_SUB vpair_sub_v8sf3 {mma}
 
@@ -4209,5 +4216,8 @@
   v256 __builtin_vpair_f64_nfms (v256, v256, v256);
 VPAIR_F64_NFMS vpair_nfms_v4df4 {mma}
 
+  v256 __builtin_vpair_f64_splat (double);
+VPAIR_F64_SPLAT vpair_splat_v4df {mma}
+
   v256 __builtin_vpair_f64_sub (v256, v256);
 VPAIR_F64_SUB vpair_sub_v4df3 {mma}
diff --git a/gcc/config/rs6000/vector-pair.md b/gcc/config/rs6000/vector-pair.md
index 73ae46e6d40..39b419c6814 100644
--- a/gcc/config/rs6000/vector-pair.md
+++ b/gcc/config/rs6000/vector-pair.md
@@ -38,7 +38,9 @@
UNSPEC_VPAIR_NEG
UNSPEC_VPAIR_PLUS
UNSPEC_VPAIR_SMAX
-   UNSPEC_VPAIR_SMIN])
+   UNSPEC_VPAIR_SMIN
+   UNSPEC_VPAIR_ZERO
+   UNSPEC_VPAIR_SPLAT])
 
 ;; Vector pair element ID that defines the scaler element within the vector 
pair.
 (define_c_enum "vpair_element"
@@ -98,6 +100,104 @@
 ;; Map the scalar element ID into the appropriate insn type for divide.
 (define_int_attr vpair_divtype [(VPAIR_ELEMENT_FLOAT  "vecfdiv")
(VPAIR_ELEMENT_DOUBLE "vecdiv")])
+
+;; Mode iterator for the vector modes that we provide splat operations for.
+(define_mode_iterator VPAIR_SPLAT_VMODE [V4SF V2DF])
+
+;; Map element mode to 128-bit vector mode for splat operations
+(define_mode_attr VPAIR_SPLAT_ELEMENT_TO_VMODE [(SF "V4SF")
+   (DF "V2DF")])
+
+;; Map either element mode or vector mode into the name for the splat insn.
+(define_mode_attr vpair_splat_name [(SF   "v8sf")
+   (DF   "v4df")
+   (V4SF "v8sf")
+   (V2DF "v4df")])
+
+;; Initialize a vector pair to 0
+(define_insn_and_split "vpair_zero"
+  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
+   (unspec:OO [(const_int 0)] UNSPEC_VPAIR_ZERO))]
+  "TARGET_MMA"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 1) (match_dup 3))
+   (set (match_dup 2) (match_dup 3))]
+{
+  rtx op0 = operands[0];
+
+  operands[1] = simplify_gen_subreg (V2DFmode, op0, OOmode, 0);
+  operands[2] = simplify_gen_subreg (V2DFmode, op0, OOmode, 16);
+  operands[3] = CONST0_RTX (V2DFmode);
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "vecperm")])
+
+;; Create a vector pair with a value splat'ed (duplicated) to all of the
+;; elements.
+(define_expand "vpair_splat_"
+  [(use (match_operand:OO 0 "vsx_register_operand"))
+   (use (match_operand:SFDF 1 "input_operand"))]
+  "TARGET_MMA"
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  machine_mode element_mode = mode;
+
+  if (op1 == CONST0_RTX (element_mode))
+{
+  emit_insn (gen_vpair_zero (op0));
+  DONE;
+}
+
+  machine_mode vector_mode = mode;

[gcc(refs/users/meissner/heads/work164-vpair)] Add vector pair optimizations.

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:f66efd9225badd375ff92f1e57397efc2877

commit f66efd9225badd375ff92f1e57397efc2877
Author: Michael Meissner 
Date:   Tue Apr 9 01:13:59 2024 -0400

Add vector pair optimizations.

2024-03-12  Michael Meissner  

gcc/

* config/rs6000/vector-pair.md (vpair_add_neg_3): 
New
combiner insn to convert vector plus/neg into a minus operation.
(vpair_fma__merge): Optimize multiply, 
add/subtract, and
negation into fma operations if the user specifies to create fmas.
(vpair_fma__merge): Likewise.
(vpair_fma__merge2): Likewise.
(vpair_nfma__merge): Likewise.
(vpair_nfms__merge): Likewise.
(vpair_nfms__merge2): Likewise.

gcc/testsuite/

* gcc.target/powerpc/vector-pair-7.c: New test.
* gcc.target/powerpc/vector-pair-8.c: Likewise.
* gcc.target/powerpc/vector-pair-9.c: Likewise.
* gcc.target/powerpc/vector-pair-10.c: Likewise.
* gcc.target/powerpc/vector-pair-11.c: Likewise.
* gcc.target/powerpc/vector-pair-12xs.c: Likewise.

Diff:
---
 gcc/config/rs6000/vector-pair.md  | 224 ++
 gcc/testsuite/gcc.target/powerpc/vector-pair-10.c |  61 ++
 gcc/testsuite/gcc.target/powerpc/vector-pair-11.c |  65 +++
 gcc/testsuite/gcc.target/powerpc/vector-pair-12.c |  65 +++
 gcc/testsuite/gcc.target/powerpc/vector-pair-7.c  |  18 ++
 gcc/testsuite/gcc.target/powerpc/vector-pair-8.c  |  18 ++
 gcc/testsuite/gcc.target/powerpc/vector-pair-9.c  |  61 ++
 7 files changed, 512 insertions(+)

diff --git a/gcc/config/rs6000/vector-pair.md b/gcc/config/rs6000/vector-pair.md
index 39b419c6814..7a81acbdc05 100644
--- a/gcc/config/rs6000/vector-pair.md
+++ b/gcc/config/rs6000/vector-pair.md
@@ -261,6 +261,31 @@
(set (attr "type") (if_then_else (match_test " == DIV")
(const_string "")
(const_string "")))])
+
+;; Optimize vector pair add of a negative value into a subtract.
+(define_insn_and_split "*vpair_add_neg_3"
+  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
+   (unspec:OO
+[(match_operand:OO 1 "vsx_register_operand" "wa")
+ (unspec:OO
+  [(match_operand:OO 2 "vsx_register_operand" "wa")
+   (const_int VPAIR_FP_ELEMENT)]
+  UNSPEC_VPAIR_NEG)
+ (const_int VPAIR_FP_ELEMENT)]
+VPAIR_FP_BINARY))]
+  "TARGET_MMA"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+   (unspec:OO
+[(match_dup 1)
+ (match_dup 2)
+ (const_int VPAIR_FP_ELEMENT)]
+UNSPEC_VPAIR_MINUS))]
+{
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "")])
 
 ;; Vector pair fused-multiply (FMA) operations.  The last argument in the
 ;; UNSPEC is a CONST_INT which identifies what the scalar element is.
@@ -354,3 +379,202 @@
 }
   [(set_attr "length" "8")
(set_attr "type" "")])
+
+;; Optimize vector pair multiply and vector pair add into vector pair fma,
+;; providing the compiler would do this optimization for scalar and vectors.
+;; Unlike most of the define_insn_and_splits, this can be done before register
+;; allocation.
+(define_insn_and_split "*vpair_fma__merge"
+  [(set (match_operand:OO 0 "vsx_register_operand" "=wa,wa")
+   (unspec:OO
+[(unspec:OO
+  [(match_operand:OO 1 "vsx_register_operand" "%wa,wa")
+   (match_operand:OO 2 "vsx_register_operand" "wa,0")
+   (const_int VPAIR_FP_ELEMENT)]
+  UNSPEC_VPAIR_MULT)
+ (match_operand:OO 3 "vsx_register_operand" "0,wa")
+ (const_int VPAIR_FP_ELEMENT)]
+UNSPEC_VPAIR_PLUS))]
+  "TARGET_MMA && flag_fp_contract_mode == FP_CONTRACT_FAST"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+   (unspec:OO
+[(match_dup 1)
+ (match_dup 2)
+ (match_dup 3)
+ (const_int VPAIR_FP_ELEMENT)]
+UNSPEC_VPAIR_FMA))]
+{
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "")])
+
+;; Merge multiply and subtract.
+(define_insn_and_split "*vpair_fma__merge"
+  [(set (match_operand:OO 0 "vsx_register_operand" "=wa,wa")
+   (unspec:OO
+[(unspec:OO
+  [(match_operand:OO 1 "vsx_register_operand" "%wa,wa")
+   (match_operand:OO 2 "vsx_register_operand" "wa,0")
+   (const_int VPAIR_FP_ELEMENT)]
+  UNSPEC_VPAIR_MULT)
+ (match_operand:OO 3 "vsx_register_operand" "0,wa")
+ (const_int VPAIR_FP_ELEMENT)]
+UNSPEC_VPAIR_MINUS))]
+  "TARGET_MMA && flag_fp_contract_mode == FP_CONTRACT_FAST"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+   (unspec:OO
+[(match_dup 1)
+ (match_dup 2)
+ (unspec:OO
+  [(match_dup 3)
+   (const_int VPAIR_FP_ELEMENT)]
+  UNSPEC_VPAIR_NEG)
+ (const_int VPAIR_FP_ELEMENT)]
+UNSPEC_VPAIR_FMA))]
+{
+}
+  [(set_attr "length" "8")
+   (set_attr "type" ""

[gcc(refs/users/meissner/heads/work164-vpair)] Update ChangeLog.*

2024-04-08 Thread Michael Meissner via Gcc-cvs
https://gcc.gnu.org/g:e994ae407e4dd762c46590ee49bbf691265eaa87

commit e994ae407e4dd762c46590ee49bbf691265eaa87
Author: Michael Meissner 
Date:   Tue Apr 9 01:15:05 2024 -0400

Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.vpair | 350 +++-
 1 file changed, 349 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.vpair b/gcc/ChangeLog.vpair
index c0c0f7095a6..3a6f5d80c0a 100644
--- a/gcc/ChangeLog.vpair
+++ b/gcc/ChangeLog.vpair
@@ -1,6 +1,354 @@
+ Branch work164-vpair, patch #305 
+
+Add vector pair optimizations.
+
+2024-03-12  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/vector-pair.md (vpair_add_neg_3): New
+   combiner insn to convert vector plus/neg into a minus operation.
+   (vpair_fma__merge): Optimize multiply, add/subtract, and
+   negation into fma operations if the user specifies to create fmas.
+   (vpair_fma__merge): Likewise.
+   (vpair_fma__merge2): Likewise.
+   (vpair_nfma__merge): Likewise.
+   (vpair_nfms__merge): Likewise.
+   (vpair_nfms__merge2): Likewise.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/vector-pair-7.c: New test.
+   * gcc.target/powerpc/vector-pair-8.c: Likewise.
+   * gcc.target/powerpc/vector-pair-9.c: Likewise.
+   * gcc.target/powerpc/vector-pair-10.c: Likewise.
+   * gcc.target/powerpc/vector-pair-11.c: Likewise.
+   * gcc.target/powerpc/vector-pair-12xs.c: Likewise.
+
+ Branch work164-vpair, patch #304 
+
+Add vector pair init and splat.
+
+2024-04-09  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-builtins.def (__builtin_vpair_zero): New
+   built-in function.
+   (__builtin_vpair_f32_splat): Likewise.
+   (__builtin_vpair_f64_splat): Likewise.
+   * config/rs6000/vector-pair.md (UNSPEC_VPAIR_ZERO): New unspec.
+   (UNSPEC_VPAIR_SPLAT): Likewise.
+   (VPAIR_SPLAT_VMODE): New mode iterator.
+   (VPAIR_SPLAT_ELEMENT_TO_VMODE): New mode attribute.
+   (vpair_splat_name): Likewise.
+   (vpair_zero): New insn.
+   (vpair_splat_): New define_expand.
+   (vpair_splat__internal): New insns.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/vector-pair-5.c: New test.
+   * gcc.target/powerpc/vector-pair-6.c: Likewise.
+
+ Branch work164-vpair, patch #303 
+
+Add support for vector pair fma operations.
+
+2024-04-09  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-builtins.def (__builtin_vpair_f32_fma): New
+   built-in.
+   (__builtin_vpair_f32_fms): Likewise.
+   (__builtin_vpair_f32_nfma): Likewise.
+   (__builtin_vpair_f32_nfms): Likewise.
+   (__builtin_vpair_f64_fma): Likewise.
+   (__builtin_vpair_f64_fms): Likewise.
+   (__builtin_vpair_f64_nfma): Likewise.
+   * config/rs6000/rs6000/rs6000-proto.h (enum vpair_split_fma): New
+   enumeration.
+   (vpair_split_fma): New declaration.
+   * config/rs6000/rs6000.cc (vpair_split_fma): New function to split
+   vector pair FMA operations.
+   * config/rs6000/vector-pair.md (UNSPEC_VPAIR_FMA): New unspec.
+   (vpair_stdname): Add UNSPEC_VPAIR_FMA.
+   (VPAIR_OP): Likewise.
+   (vpair_fma_4): New insns.
+   (vpair_fms_4): Likewise.
+   (vpair_nfma_4): Likewise.
+   (vpair_nfms_4): Likewise.
+   * doc/extend.texi (PowerPC Vector Pair Built-in Functions): Document new
+   vector pair fma built-in functions.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/vector-pair-3.c: New test.
+   * gcc.target/powerpc/vector-pair-4.c: Likewise.
+
+ Branch work164-vpair, patch #302 
+
+Add support for vector pair unary and binary operations.
+
+2024-04-09  Michael Meissner  
+
+gcc/
+
+   * config/rs6000/rs6000-builtins.def (__builtin_vpair_*): Add new
+   built-in functions for vector pair support.
+   * config/rs6000/rs6000-protos.h (enum vpair_split_unary): New
+   enumeration.
+   (vpair_split_unary): New declaration.
+   (vpair_split_binary): Likewise.
+   * config/rs6000/rs6000.cc (vpair_split_unary): New function to split
+   vector pair operations.
+   (vpair_split_binary): Likewise.
+   * config/rs6000/rs6000.md (toplevel): Include vector-pair.md.
+   * config/rs6000/t-rs6000 (MD_INCLUDES): Add vector-pair.md.
+   * config/rs6000/vector-pair.md: New file.
+   * doc/extend.texi (PowerPC Vector Pair Built-in Functions): Add
+   documentation for the new vector pair built-in functions.
+
+gcc/testsuite/
+
+   * gcc.target/powerpc/vector-pair-1.c: New test.
+   * gcc.target/powerpc/vector-pair-2.c: Likewise.
+
+ Branch work164-vpair, patch #301 
+
+Peter's patches for subreg support.
+
+2024-03-12  Peter Bergner  
+
+gcc/
+
+   PR target/109116
+   * gcc/config/rs6000/rs6000.cc (rs6000_modes

[gcc r14-9853] libquadmath: Use soft-fp for sqrtq finite positive arguments [PR114623]

2024-04-08 Thread Jakub Jelinek via Gcc-cvs
https://gcc.gnu.org/g:481ba4fb5fce8257f5dbeb994dac2748c0237fa2

commit r14-9853-g481ba4fb5fce8257f5dbeb994dac2748c0237fa2
Author: Jakub Jelinek 
Date:   Tue Apr 9 08:17:25 2024 +0200

libquadmath: Use soft-fp for sqrtq finite positive arguments [PR114623]

sqrt should be 0.5ulp precise, but the current implementation is less
precise than that.
The following patch uses the soft-fp code (like e.g. glibc for x86) for it
if possible.  I didn't want to replicate the libgcc infrastructure for
choosing the right sfp-machine.h, so the patch just uses a single generic
implementation.  As the code is used solely for the finite positive 
arguments,
it shouldn't generate NaNs (so the exact form of canonical QNaN/SNaN is
irrelevant), and sqrt for these shouldn't produce underflows/overflows 
either,
for < 1.0 arguments it always returns larger values than the argument and 
for
> 1.0 smaller values than the argument.

2024-04-09  Jakub Jelinek  

PR libquadmath/114623
* sfp-machine.h: New file.
* math/sqrtq.c: Include from libgcc/soft-fp also soft-fp.h and 
quad.h
if possible.
(USE_SOFT_FP): Define in that case.
(sqrtq): Use soft-fp based implementation for the finite positive
arguments if possible.

Diff:
---
 libquadmath/math/sqrtq.c  | 25 +-
 libquadmath/sfp-machine.h | 54 +++
 2 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/libquadmath/math/sqrtq.c b/libquadmath/math/sqrtq.c
index 56ea5d3243c..8ca2828d42c 100644
--- a/libquadmath/math/sqrtq.c
+++ b/libquadmath/math/sqrtq.c
@@ -1,6 +1,17 @@
 #include "quadmath-imp.h"
 #include 
 #include 
+#if __has_include("../../libgcc/soft-fp/soft-fp.h") \
+&& __has_include("../../libgcc/soft-fp/quad.h") \
+&& defined(FE_TONEAREST) \
+&& defined(FE_UPWARD) \
+&& defined(FE_DOWNWARD) \
+&& defined(FE_TOWARDZERO) \
+&& defined(FE_INEXACT)
+#define USE_SOFT_FP 1
+#include "../../libgcc/soft-fp/soft-fp.h"
+#include "../../libgcc/soft-fp/quad.h"
+#endif
 
 __float128
 sqrtq (const __float128 x)
@@ -20,6 +31,18 @@ sqrtq (const __float128 x)
   return (x - x) / (x - x);
 }
 
+#if USE_SOFT_FP
+  FP_DECL_EX;
+  FP_DECL_Q (X);
+  FP_DECL_Q (Y);
+
+  FP_INIT_ROUNDMODE;
+  FP_UNPACK_Q (X, x);
+  FP_SQRT_Q (Y, X);
+  FP_PACK_Q (y, Y);
+  FP_HANDLE_EXCEPTIONS;
+  return y;
+#else
   if (x <= DBL_MAX && x >= DBL_MIN)
   {
 /* Use double result as starting point.  */
@@ -59,5 +82,5 @@ sqrtq (const __float128 x)
   y -= 0.5q * (y - x / y);
   y -= 0.5q * (y - x / y);
   return y;
+#endif
 }
-
diff --git a/libquadmath/sfp-machine.h b/libquadmath/sfp-machine.h
new file mode 100644
index 000..e37d5b3c656
--- /dev/null
+++ b/libquadmath/sfp-machine.h
@@ -0,0 +1,54 @@
+/* libquadmath uses soft-fp only for sqrtq and only for
+   the positive finite case, so it doesn't care about
+   NaN representation, nor tininess after rounding vs.
+   before rounding, all it cares about is current rounding
+   mode and raising inexact exceptions.  */
+#if __SIZEOF_LONG__ == 8
+#define _FP_W_TYPE_SIZE 64
+#define _FP_I_TYPE long long
+#define _FP_NANFRAC_Q _FP_QNANBIT_Q, 0
+#else
+#define _FP_W_TYPE_SIZE 32
+#define _FP_I_TYPE int
+#define _FP_NANFRAC_Q _FP_QNANBIT_Q, 0, 0, 0
+#endif
+#define _FP_W_TYPE unsigned _FP_I_TYPE
+#define _FP_WS_TYPE signed _FP_I_TYPE
+#define _FP_QNANNEGATEDP 0
+#define _FP_NANSIGN_Q 1
+#define _FP_KEEPNANFRACP 1
+#define _FP_TININESS_AFTER_ROUNDING 0
+#define _FP_DECL_EX \
+  unsigned int fp_roundmode __attribute__ ((unused)) = FP_RND_NEAREST;
+#define FP_ROUNDMODE fp_roundmode
+#define FP_INIT_ROUNDMODE \
+  do   \
+{  \
+  switch (fegetround ())   \
+{  \
+case FE_UPWARD:\
+  fp_roundmode = FP_RND_PINF;  \
+  break;   \
+case FE_DOWNWARD:  \
+  fp_roundmode = FP_RND_MINF;  \
+  break;   \
+case FE_TOWARDZERO:\
+  fp_roundmode = FP_RND_ZERO;  \
+  break;   \
+   default:\
+ break;\
+}  \
+}  \
+  while (0)
+#define FP_HANDLE_EXCEPTIONS \
+  do   \
+{  \
+  if (_fex & FP_EX_INEXACT)\
+   {   \
+ volatile double eight = 8.0;  \
+ volatile double eps   \
+   = DBL_EPSILON;  \
+ eight += eps; \
+   }   \
+}  \
+  while (0)


[gcc r14-9854] RTEMS: Add multilib configuration for aarch64

2024-04-08 Thread Sebastian Huber via Gcc-cvs
https://gcc.gnu.org/g:ddee4376d15ddde9280c9a6725ddd76bf33f2871

commit r14-9854-gddee4376d15ddde9280c9a6725ddd76bf33f2871
Author: Sebastian Huber 
Date:   Mon Mar 25 08:00:02 2024 +0100

RTEMS: Add multilib configuration for aarch64

Add a multilib with workarounds for Cortex-A53 errata.

gcc/ChangeLog:

* config.gcc (aarch64-*-rtems*): Add target makefile fragment
t-aarch64-rtems.
* config/aarch64/t-aarch64-rtems: New file.

Diff:
---
 gcc/config.gcc |  1 +
 gcc/config/aarch64/t-aarch64-rtems | 42 ++
 2 files changed, 43 insertions(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 2e320dd26c5..63a88bce484 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1199,6 +1199,7 @@ aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
 ;;
aarch64-*-rtems*)
tm_file="${tm_file} aarch64/rtems.h rtems.h"
+   tmake_file="${tmake_file} aarch64/t-aarch64-rtems"
;;
esac
case $target in
diff --git a/gcc/config/aarch64/t-aarch64-rtems 
b/gcc/config/aarch64/t-aarch64-rtems
new file mode 100644
index 000..7598d636165
--- /dev/null
+++ b/gcc/config/aarch64/t-aarch64-rtems
@@ -0,0 +1,42 @@
+# Multilibs for aarch64 RTEMS targets.
+#
+# Copyright (C) 2024 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+MULTILIB_OPTIONS  =
+MULTILIB_DIRNAMES =
+MULTILIB_REQUIRED =
+
+MULTILIB_OPTIONS  += mabi=ilp32
+MULTILIB_DIRNAMES += ilp32
+
+MULTILIB_OPTIONS  += mno-outline-atomics
+MULTILIB_DIRNAMES += nooa
+
+MULTILIB_OPTIONS  += mcpu=cortex-a53
+MULTILIB_DIRNAMES += a53
+
+MULTILIB_OPTIONS  += mfix-cortex-a53-835769
+MULTILIB_DIRNAMES += fix835769
+
+MULTILIB_OPTIONS  += mfix-cortex-a53-843419
+MULTILIB_DIRNAMES += fix843419
+
+MULTILIB_REQUIRED += mabi=ilp32
+MULTILIB_REQUIRED += 
mabi=ilp32/mno-outline-atomics/mcpu=cortex-a53/mfix-cortex-a53-835769/mfix-cortex-a53-843419
+MULTILIB_REQUIRED += 
mno-outline-atomics/mcpu=cortex-a53/mfix-cortex-a53-835769/mfix-cortex-a53-843419


[gcc r14-9855] middle-end/114604 - ranger allocates bitmap without initialized obstack

2024-04-08 Thread Richard Biener via Gcc-cvs
https://gcc.gnu.org/g:d76df699b8ff792575e9df4d214c21fed0ed3b6b

commit r14-9855-gd76df699b8ff792575e9df4d214c21fed0ed3b6b
Author: Richard Biener 
Date:   Mon Apr 8 10:50:18 2024 +0200

middle-end/114604 - ranger allocates bitmap without initialized obstack

The following fixes ranger bitmap allocation when invoked from IPA
context where the global bitmap obstack possibly isn't initialized.
Instead of trying to use one of the ranger obstacks the following
simply initializes the global bitmap obstack around an active ranger.

PR middle-end/114604
* gimple-range.cc (enable_ranger): Initialize the global
bitmap obstack.
(disable_ranger): Release it.

Diff:
---
 gcc/gimple-range.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index c16b776c1e3..4d3b1ce8588 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -689,6 +689,8 @@ enable_ranger (struct function *fun, bool use_imm_uses)
 {
   gimple_ranger *r;
 
+  bitmap_obstack_initialize (NULL);
+
   gcc_checking_assert (!fun->x_range_query);
   r = new gimple_ranger (use_imm_uses);
   fun->x_range_query = r;
@@ -705,6 +707,8 @@ disable_ranger (struct function *fun)
   gcc_checking_assert (fun->x_range_query);
   delete fun->x_range_query;
   fun->x_range_query = NULL;
+
+  bitmap_obstack_release (NULL);
 }
 
 //