[gcc r15-702] rs6000: Fix ICE on IEEE128 long double without vsx [PR114402]

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:c547e353597ac4e0af09c2faca8c5a16744dcea4

commit r15-702-gc547e353597ac4e0af09c2faca8c5a16744dcea4
Author: Kewen Lin 
Date:   Mon May 20 21:01:06 2024 -0500

rs6000: Fix ICE on IEEE128 long double without vsx [PR114402]

As PR114402 shows, we support the IEEE128 format for long
double even without vsx support, but the test case hits an
ICE on cbranch.  For now, we only support the compare:CCFP
pattern for IEEE128 fp if TARGET_FLOAT128_HW, so function
rs6000_generate_compare checks
!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode) to make
IEEE128 fp handling without TARGET_FLOAT128_HW go with a
libcall.  Unfortunately, IEEE128 without vsx support doesn't
satisfy FLOAT128_VECTOR_P (mode), so it proceeds to an
unmatched compare:CCFP pattern, which triggers the ICE.

So this patch makes rs6000_generate_compare consider IEEE128
without vsx as well, so that it ends up with a libcall.
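
The predicate change can be illustrated with a small standalone model.
This is only a sketch: the struct, the enum and every function name
below are hypothetical stand-ins, not GCC's real macros, and the
assumption that the "vector" predicate requires VSX is taken from the
description above rather than from the rs6000 headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for target state; these are NOT GCC's real
   macros, only a model of the behaviour the commit message describes.  */
struct target_model
{
  bool float128_hw; /* models TARGET_FLOAT128_HW */
  bool vsx;         /* models VSX being enabled */
};

enum compare_path
{
  PATH_LIBCALL,     /* comparison is done through a library call */
  PATH_CCFP_PATTERN /* comparison needs a compare:CCFP insn pattern */
};

/* Assumption: an IEEE128 mode only satisfies the "vector" predicate
   when VSX is available, as the commit message implies.  */
static bool
model_float128_vector_p (const struct target_model *t, bool ieee128_mode)
{
  return ieee128_mode && t->vsx;
}

/* Models FLOAT128_IEEE_P: true for any IEEE128 mode, VSX or not.  */
static bool
model_float128_ieee_p (bool ieee128_mode)
{
  return ieee128_mode;
}

/* Path selection before the patch.  */
static enum compare_path
choose_path_old (const struct target_model *t, bool ieee128_mode)
{
  if (!t->float128_hw && model_float128_vector_p (t, ieee128_mode))
    return PATH_LIBCALL;
  return PATH_CCFP_PATTERN;
}

/* Path selection after the patch.  */
static enum compare_path
choose_path_new (const struct target_model *t, bool ieee128_mode)
{
  if (!t->float128_hw && model_float128_ieee_p (ieee128_mode))
    return PATH_LIBCALL;
  return PATH_CCFP_PATTERN;
}
```

In this model, an IEEE128 comparison with neither float128 hardware
nor VSX takes the CCFP path under the old check (the case that has no
matching pattern) and the libcall path under the new one.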

PR target/114402

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_generate_compare): Make IEEE128
handling without vsx go with libcall.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114402.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr114402.c | 16 
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index e713a1e1d570..d18e262d81de 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -15283,7 +15283,7 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
   rtx op0 = XEXP (cmp, 0);
   rtx op1 = XEXP (cmp, 1);
 
-  if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
+  if (!TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))
 comp_mode = CCmode;
   else if (FLOAT_MODE_P (mode))
 comp_mode = CCFPmode;
@@ -15315,7 +15315,7 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
 
   /* IEEE 128-bit support in VSX registers when we do not have hardware
  support.  */
-  if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
+  if (!TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))
 {
   rtx libfunc = NULL_RTX;
   bool check_nan = false;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114402.c b/gcc/testsuite/gcc.target/powerpc/pr114402.c
new file mode 100644
index ..9323c5ee991d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114402.c
@@ -0,0 +1,16 @@
+/* Explicitly disable VSX when VSX is on.  */
+/* { dg-options "-mno-vsx" { target powerpc_vsx } } */
+
+/* Verify there is no ICE.  */
+
+long double a;
+long double b;
+
+int
+foo ()
+{
+  if (a > b)
+return 0;
+  else
+return 1;
+}


[gcc r15-703] rs6000: Add assert !TARGET_VSX if !TARGET_ALTIVEC and strip a useless check

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:b390b0115696353ba579706531fbd3bcf39281c5

commit r15-703-gb390b0115696353ba579706531fbd3bcf39281c5
Author: Kewen Lin 
Date:   Mon May 20 21:01:06 2024 -0500

rs6000: Add assert !TARGET_VSX if !TARGET_ALTIVEC and strip a useless check

In function rs6000_option_override_internal, we have the
checks and adjustments like:

  if (TARGET_P8_VECTOR && !TARGET_ALTIVEC)
rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR;

  if (TARGET_P8_VECTOR && !TARGET_VSX)
rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR;

But in fact earlier code already guarantees !TARGET_VSX if
!TARGET_ALTIVEC, so we can remove the former check and
adjustment.  This patch removes it accordingly and also
adds an explicit assertion.
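
Why the first check is redundant can be verified exhaustively with a
small model.  This is a sketch under assumptions: struct flags and the
adjust_* functions are hypothetical and only mimic the two statements
quoted above; they are not the real rs6000_isa_flags handling.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three relevant ISA flags.  */
struct flags
{
  bool altivec;
  bool vsx;
  bool p8_vector;
};

/* Adjustment before the patch: two separate checks.  */
static struct flags
adjust_old (struct flags f)
{
  if (f.p8_vector && !f.altivec)
    f.p8_vector = false;
  if (f.p8_vector && !f.vsx)
    f.p8_vector = false;
  return f;
}

/* Adjustment after the patch: assert the invariant, keep one check.  */
static struct flags
adjust_new (struct flags f)
{
  assert (f.altivec || !f.vsx);
  if (f.p8_vector && !f.vsx)
    f.p8_vector = false;
  return f;
}

/* Count the flag combinations that satisfy the invariant
   (!altivec implies !vsx) for which old and new results differ.  */
static int
count_mismatches (void)
{
  int mismatches = 0;
  for (int bits = 0; bits < 8; bits++)
    {
      struct flags f = { bits & 1, (bits >> 1) & 1, (bits >> 2) & 1 };
      if (!f.altivec && f.vsx)
        continue; /* excluded by the invariant */
      if (adjust_old (f).p8_vector != adjust_new (f).p8_vector)
        mismatches++;
    }
  return mismatches;
}
```

Whenever !altivec holds, the invariant forces !vsx, so the remaining
check already clears the p8_vector flag and count_mismatches finds no
differing combination.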

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Remove
useless check on TARGET_P8_VECTOR && !TARGET_ALTIVEC and add an
assertion on !TARGET_VSX if !TARGET_ALTIVEC.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index d18e262d81de..e4dc629ddcc9 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3940,8 +3940,9 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_FPRND;
 }
 
-  if (TARGET_P8_VECTOR && !TARGET_ALTIVEC)
-rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR;
+  /* Assert !TARGET_VSX if !TARGET_ALTIVEC and make some adjustments
+ based on either !TARGET_VSX or !TARGET_ALTIVEC concise.  */
+  gcc_assert (TARGET_ALTIVEC || !TARGET_VSX);
 
   if (TARGET_P8_VECTOR && !TARGET_VSX)
 rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR;


[gcc r15-704] rs6000: Clean up TF and TD check with FLOAT128_2REG_P

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:2eb1dff8b34a8a7da02b988878172d1b8f203d96

commit r15-704-g2eb1dff8b34a8a7da02b988878172d1b8f203d96
Author: Kewen Lin 
Date:   Mon May 20 21:01:06 2024 -0500

rs6000: Clean up TF and TD check with FLOAT128_2REG_P

Commit r6-2116-g2c83faf86827bf cleaned up the TFmode and
TDmode checks with FLOAT128_2REG_P, but it missed updating
an assertion.  This patch brings that assertion in line.

By the way, I noticed this while preparing a patch to get
rid of TFmode.

gcc/ChangeLog:

* config/rs6000/rs6000-call.cc (rs6000_darwin64_record_arg_recurse):
Clean up TFmode and TDmode check with FLOAT128_2REG_P.

Diff:
---
 gcc/config/rs6000/rs6000-call.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-call.cc b/gcc/config/rs6000/rs6000-call.cc
index 1f8f93a2ee78..a039ff75f3c9 100644
--- a/gcc/config/rs6000/rs6000-call.cc
+++ b/gcc/config/rs6000/rs6000-call.cc
@@ -1391,7 +1391,7 @@ rs6000_darwin64_record_arg_recurse (CUMULATIVE_ARGS *cum, const_tree type,
if (cum->fregno + n_fpreg > FP_ARG_MAX_REG + 1)
  {
gcc_assert (cum->fregno == FP_ARG_MAX_REG
-   && (mode == TFmode || mode == TDmode));
+   && FLOAT128_2REG_P (mode));
/* Long double or _Decimal128 split over regs and memory.  */
mode = DECIMAL_FLOAT_MODE_P (mode) ? DDmode : DFmode;
cum->use_stack=1;


[gcc r15-705] rs6000: Drop useless vector_{load, store}_ defines

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:a6f8b2b63391ff14c2bf6e1b75abd99546dfbfb8

commit r15-705-ga6f8b2b63391ff14c2bf6e1b75abd99546dfbfb8
Author: Kewen Lin 
Date:   Mon May 20 21:01:06 2024 -0500

rs6000: Drop useless vector_{load,store}_ defines

When I was working on a patch to get rid of TFmode, I
noticed that define_expands vector_load_ and
vector_store_ are useless.  This patch is to clean up
both.

gcc/ChangeLog:

* config/rs6000/vector.md (define_expand vector_load_): Remove.
(vector_store_): Likewise.

Diff:
---
 gcc/config/rs6000/vector.md | 14 --
 1 file changed, 14 deletions(-)

diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index f9796fb3781b..59489e068399 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -163,20 +163,6 @@
 }
 })
 
-;; Generic vector floating point load/store instructions.  These will match
-;; insns defined in vsx.md or altivec.md depending on the switches.
-(define_expand "vector_load_"
-  [(set (match_operand:VEC_M 0 "vfloat_operand")
-   (match_operand:VEC_M 1 "memory_operand"))]
-  "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)"
-  "")
-
-(define_expand "vector_store_"
-  [(set (match_operand:VEC_M 0 "memory_operand")
-   (match_operand:VEC_M 1 "vfloat_operand"))]
-  "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)"
-  "")
-
 ;; Splits if a GPR register was chosen for the move
 (define_split
   [(set (match_operand:VEC_L 0 "nonimmediate_operand")


[gcc r15-706] rs6000: Remove useless entries in rreg

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:2cd8dfd7d599ad6205e40c4e57275ce6ebd073aa

commit r15-706-g2cd8dfd7d599ad6205e40c4e57275ce6ebd073aa
Author: Kewen Lin 
Date:   Mon May 20 21:01:07 2024 -0500

rs6000: Remove useless entries in rreg

When I was working on a trial patch to get rid of TFmode,
I noticed that mode attribute rreg is only used with mode
iterator SFDF, which means that only the SF and DF key-value
pairs are useful and the others are useless.  This patch
cleans them up.

gcc/ChangeLog:

* config/rs6000/rs6000.md (mode attribute rreg): Remove useless
entries with modes TF, TD, V4SF and V2DF.

Diff:
---
 gcc/config/rs6000/rs6000.md | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ac5651d7420c..7d0019ab410a 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -729,11 +729,7 @@
(DI "Y")])
 
 (define_mode_attr rreg [(SF   "f")
-   (DF   "wa")
-   (TF   "f")
-   (TD   "f")
-   (V4SF "wa")
-   (V2DF "wa")])
+   (DF   "wa")])
 
 (define_mode_attr rreg2 [(SF   "f")
 (DF   "d")])


[gcc r15-708] testsuite: Fix typo in torture/vector-{1,2}.c

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:f672ab0ae1f4c00276125b9ff49884886834f5c3

commit r15-708-gf672ab0ae1f4c00276125b9ff49884886834f5c3
Author: Kewen Lin 
Date:   Mon May 20 21:01:07 2024 -0500

testsuite: Fix typo in torture/vector-{1,2}.c

While making some cleanup patches, I happened to find that
test cases vector-{1,2}.c have the typo "powerpc64--*-*" in
their target selector, which should be powerpc64-*-*.  We
didn't catch this before because all our testing machines
support VMX insns, so the tests always pass.  But they would
break on a test machine that doesn't support VMX, so this
patch fixes the typo to ensure robustness.
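
The effect of the doubled dash can be reproduced with ordinary glob
matching.  This is a sketch under assumptions: it uses POSIX fnmatch
as a stand-in for the glob-style matching of target selectors, the
helper triplet_matches is hypothetical, and the target triplet used in
the usage note is only an illustrative value.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>

/* Return true if TRIPLET matches the glob PATTERN.  With flags == 0,
   '*' matches any sequence of characters, including '-'.  */
static bool
triplet_matches (const char *pattern, const char *triplet)
{
  return fnmatch (pattern, triplet, 0) == 0;
}
```

For a triplet such as powerpc64-unknown-linux-gnu, the pattern
"powerpc64--*-*" does not match because it demands two consecutive
dashes after "powerpc64", whereas "powerpc64-*-*" does match.  With
the typo, the vmx_hw requirement was therefore never applied to such
targets.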

gcc/testsuite/ChangeLog:

* gcc.dg/torture/vector-1.c: Fix typo.
* gcc.dg/torture/vector-2.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.dg/torture/vector-1.c | 2 +-
 gcc/testsuite/gcc.dg/torture/vector-2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/torture/vector-1.c b/gcc/testsuite/gcc.dg/torture/vector-1.c
index 205fee6d6de9..1b98ee26ff3c 100644
--- a/gcc/testsuite/gcc.dg/torture/vector-1.c
+++ b/gcc/testsuite/gcc.dg/torture/vector-1.c
@@ -4,7 +4,7 @@
 /* { dg-options "-msse" { target { i?86-*-* x86_64-*-* } } } */
 /* { dg-require-effective-target sse_runtime { target { i?86-*-* x86_64-*-* } } } */
 /* { dg-options "-mabi=altivec" { target { powerpc-*-* powerpc64-*-* } } } */
-/* { dg-require-effective-target vmx_hw { target { powerpc-*-* powerpc64--*-* } } } */
+/* { dg-require-effective-target vmx_hw { target { powerpc-*-* powerpc64-*-* } } } */
 
 #define vector __attribute__((vector_size(16) ))
 
diff --git a/gcc/testsuite/gcc.dg/torture/vector-2.c b/gcc/testsuite/gcc.dg/torture/vector-2.c
index b004d0057754..c9a3a44d4dff 100644
--- a/gcc/testsuite/gcc.dg/torture/vector-2.c
+++ b/gcc/testsuite/gcc.dg/torture/vector-2.c
@@ -4,7 +4,7 @@
 /* { dg-options "-msse" { target { i?86-*-* x86_64-*-* } } } */
 /* { dg-require-effective-target sse_runtime { target { i?86-*-* x86_64-*-* } } } */
 /* { dg-options "-mabi=altivec" { target { powerpc-*-* powerpc64-*-* } } } */
-/* { dg-require-effective-target vmx_hw { target { powerpc-*-* powerpc64--*-* } } } */
+/* { dg-require-effective-target vmx_hw { target { powerpc-*-* powerpc64-*-* } } } */
 
 #define vector __attribute__((vector_size(16) ))


[gcc r15-707] rs6000: Remove useless operands[3]

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:1a87deddf470c728e85cc9ca802b51ed2b1efbd6

commit r15-707-g1a87deddf470c728e85cc9ca802b51ed2b1efbd6
Author: Kewen Lin 
Date:   Mon May 20 21:01:07 2024 -0500

rs6000: Remove useless operands[3]

As shown, three uses of operands[3] are totally useless, so
this patch is to remove them to avoid any confusion.

gcc/ChangeLog:

* config/rs6000/rs6000.md (@ieee_128bit_vsx_neg2): Remove
the use of operands[3].
(@ieee_128bit_vsx_neg2): Likewise.
(*ieee_128bit_vsx_nabs2): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000.md | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 7d0019ab410a..f035e68ff0f8 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -9260,7 +9260,6 @@
   if (GET_CODE (operands[2]) == SCRATCH)
 operands[2] = gen_reg_rtx (V16QImode);
 
-  operands[3] = gen_reg_rtx (V16QImode);
   emit_insn (gen_ieee_128bit_negative_zero (operands[2]));
 }
   [(set_attr "length" "8")
@@ -9289,7 +9288,6 @@
   if (GET_CODE (operands[2]) == SCRATCH)
 operands[2] = gen_reg_rtx (V16QImode);
 
-  operands[3] = gen_reg_rtx (V16QImode);
   emit_insn (gen_ieee_128bit_negative_zero (operands[2]));
 }
   [(set_attr "length" "8")
@@ -9321,7 +9319,6 @@
   if (GET_CODE (operands[2]) == SCRATCH)
 operands[2] = gen_reg_rtx (V16QImode);
 
-  operands[3] = gen_reg_rtx (V16QImode);
   emit_insn (gen_ieee_128bit_negative_zero (operands[2]));
 }
   [(set_attr "length" "8")


[gcc r15-712] testsuite, rs6000: Remove powerpcspe test cases and checks

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:7fa32ad7a4afc7dc93a0c50204fe0b5c00ac4865

commit r15-712-g7fa32ad7a4afc7dc93a0c50204fe0b5c00ac4865
Author: Kewen Lin 
Date:   Mon May 20 21:01:08 2024 -0500

testsuite, rs6000: Remove powerpcspe test cases and checks

Since r9-4728 removed the powerpcspe support, this follow-up
patch removes the remaining pieces in the testsuite.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_vect_cmdline_needed): Remove
check_effective_target_powerpc_spe.
(check_effective_target_powerpc_spe_nocache): Remove.
(check_effective_target_powerpc_spe): Remove.
(check_ppc_cpu_supports_hw_available): Remove powerpc*-*-eabispe check.
(check_p8vector_hw_available): Likewise.
(check_p9vector_hw_available): Likewise.
(check_p9modulo_hw_available): Likewise.
(check_ppc_float128_sw_available): Likewise.
(check_ppc_float128_hw_available): Likewise.
(check_vsx_hw_available): Likewise.
(check_vmx_hw_available): Likewise.
(check_ppc_recip_hw_available): Likewise.
(check_dfp_hw_available): Likewise.
(check_htm_hw_available): Likewise.
* g++.dg/ext/spe1.C: Remove.
* g++.dg/other/opaque-1.C: Remove.
* g++.dg/other/opaque-2.C: Remove.
* g++.dg/other/opaque-3.C: Remove.
* g++.target/powerpc/simd-5.C: Remove.

Diff:
---
 gcc/testsuite/g++.dg/ext/spe1.C   | 10 --
 gcc/testsuite/g++.dg/other/opaque-1.C | 31 ---
 gcc/testsuite/g++.dg/other/opaque-2.C | 19 
 gcc/testsuite/g++.dg/other/opaque-3.C | 12 
 gcc/testsuite/g++.target/powerpc/simd-5.C | 44 --
 gcc/testsuite/lib/target-supports.exp | 51 +++
 6 files changed, 5 insertions(+), 162 deletions(-)

diff --git a/gcc/testsuite/g++.dg/ext/spe1.C b/gcc/testsuite/g++.dg/ext/spe1.C
deleted file mode 100644
index b98d4b27b3d7..
--- a/gcc/testsuite/g++.dg/ext/spe1.C
+++ /dev/null
@@ -1,10 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-mcpu=8540 -mspe -mabi=spe -mfloat-gprs=single -O0" } */
-/* { dg-skip-if "not an SPE target" { ! powerpc_spe_nocache } } */
-
-typedef int v2si __attribute__ ((vector_size (8)));
-
-/* The two specializations must be considered different.  */
-template  class X { };
-template <>class X<__ev64_opaque__> { };
-template <>class X   { };
diff --git a/gcc/testsuite/g++.dg/other/opaque-1.C b/gcc/testsuite/g++.dg/other/opaque-1.C
deleted file mode 100644
index 669776b9f976..
--- a/gcc/testsuite/g++.dg/other/opaque-1.C
+++ /dev/null
@@ -1,31 +0,0 @@
-/* { dg-do run } */
-/* { dg-options "-mcpu=8540 -mspe -mabi=spe -mfloat-gprs=single" } */
-/* { dg-skip-if "not an SPE target" { ! powerpc_spe_nocache } } */
-
-#define __vector __attribute__((vector_size(8)))
-typedef float __vector __ev64_fs__;
-
-__ev64_fs__ f;
-__ev64_opaque__ o;
-
-int here = 0;
-
-void bar (__ev64_opaque__ x)
-{
-  here = 0;
-}
-
-void bar (__ev64_fs__ x)
-{ 
-  here = 888;
-}
-
-int main ()
-{
-  f = o;
-  o = f;
-  bar (f);
-  if (here != 888)
-return 1;
-  return 0;
-}
diff --git a/gcc/testsuite/g++.dg/other/opaque-2.C b/gcc/testsuite/g++.dg/other/opaque-2.C
deleted file mode 100644
index 414f87e6c9a0..
--- a/gcc/testsuite/g++.dg/other/opaque-2.C
+++ /dev/null
@@ -1,19 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-mcpu=8540 -mspe -mabi=spe -mfloat-gprs=single" } */
-/* { dg-skip-if "not an SPE target" { ! powerpc_spe_nocache } } */
-
-#define __vector __attribute__((vector_size(8)))
-typedef float __vector __ev64_fs__;
-
-__ev64_fs__ f;
-__ev64_opaque__ o;
-
-extern void bar (__ev64_opaque__);
-
-int main ()
-{
-  f = o;
-  o = f;
-  bar (f);
-  return 0;
-}
diff --git a/gcc/testsuite/g++.dg/other/opaque-3.C b/gcc/testsuite/g++.dg/other/opaque-3.C
deleted file mode 100644
index f915f840510c..
--- a/gcc/testsuite/g++.dg/other/opaque-3.C
+++ /dev/null
@@ -1,12 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-mcpu=8540 -mspe -mabi=spe -mfloat-gprs=single" } */
-/* { dg-skip-if "not an SPE target" { ! powerpc_spe_nocache } } */
-
-__ev64_opaque__ o;
-#define v __attribute__((vector_size(8)))
-v unsigned int *p;
-
-void m()
-{
-  o = __builtin_spe_evldd(p, 5);
-}
diff --git a/gcc/testsuite/g++.target/powerpc/simd-5.C b/gcc/testsuite/g++.target/powerpc/simd-5.C
deleted file mode 100644
index 71e117ead2aa..
--- a/gcc/testsuite/g++.target/powerpc/simd-5.C
+++ /dev/null
@@ -1,44 +0,0 @@
-// Test EH with V2SI SIMD registers actually restores correct values.
-// Origin: Joseph Myers 
-// { dg-options "-O" }
-// { dg-do run { target { powerpc_spe && { ! *-*-vxworks* } } } }
-
-extern "C" void abort (void);
-extern "C" int memcmp (const void *, const voi

[gcc r15-709] testsuite, rs6000: Remove some checks with aix[456]

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:fa8250630dcd5ab50e2e957747d817cae4403c82

commit r15-709-gfa8250630dcd5ab50e2e957747d817cae4403c82
Author: Kewen Lin 
Date:   Mon May 20 21:01:07 2024 -0500

testsuite, rs6000: Remove some checks with aix[456]

Since r12-75-g0745b6fa66c69c dropped aix6 support, we don't
need to check for aix[456].* when testing.  This patch
removes such checks.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_powerpc_altivec_ok): Remove checks for
aix[456].*
(check_effective_target_powerpc_p9modulo_ok): Likewise.
(check_effective_target_powerpc_float128_sw_ok): Likewise.
(check_effective_target_powerpc_float128_hw_ok): Likewise.
(check_effective_target_powerpc_vsx_ok): Likewise.

Diff:
---
 gcc/testsuite/lib/target-supports.exp | 29 -
 1 file changed, 29 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ec9baa4f32a3..d38c16354ff0 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -6959,11 +6959,6 @@ proc check_effective_target_powerpc_altivec_ok { } {
 # Paired Single, then not ok
 if { [istarget powerpc-*-linux*paired*] } { return 0 }
 
-# AltiVec is not supported on AIX before 5.3.
-if { [istarget powerpc*-*-aix4*]
-|| [istarget powerpc*-*-aix5.1*]
-|| [istarget powerpc*-*-aix5.2*] } { return 0 }
-
 # Return true iff compiling with -maltivec does not error.
 return [check_no_compiler_messages powerpc_altivec_ok object {
int dummy;
@@ -6976,12 +6971,6 @@ proc check_effective_target_powerpc_p9modulo_ok { } {
 if { ([istarget powerpc*-*-*]
  && ![istarget powerpc-*-linux*paired*])
 || [istarget rs6000-*-*] } {
-   # AltiVec is not supported on AIX before 5.3.
-   if { [istarget powerpc*-*-aix4*]
-|| [istarget powerpc*-*-aix5.1*] 
-|| [istarget powerpc*-*-aix5.2*] } {
-   return 0
-   }
return [check_no_compiler_messages powerpc_p9modulo_ok object {
int main (void) {
int i = 5, j = 3, r = -1;
@@ -7112,12 +7101,6 @@ proc check_effective_target_powerpc_float128_sw_ok { } {
 if { ([istarget powerpc*-*-*]
  && ![istarget powerpc-*-linux*paired*])
 || [istarget rs6000-*-*] } {
-   # AltiVec is not supported on AIX before 5.3.
-   if { [istarget powerpc*-*-aix4*]
-|| [istarget powerpc*-*-aix5.1*] 
-|| [istarget powerpc*-*-aix5.2*] } {
-   return 0
-   }
# Darwin doesn't have VSX, so no soft support for float128.
if { [istarget *-*-darwin*] } {
return 0
@@ -7142,12 +7125,6 @@ proc check_effective_target_powerpc_float128_hw_ok { } {
 if { ([istarget powerpc*-*-*]
  && ![istarget powerpc-*-linux*paired*])
 || [istarget rs6000-*-*] } {
-   # AltiVec is not supported on AIX before 5.3.
-   if { [istarget powerpc*-*-aix4*]
-|| [istarget powerpc*-*-aix5.1*] 
-|| [istarget powerpc*-*-aix5.2*] } {
-   return 0
-   }
# Darwin doesn't run on any machine with float128 h/w so far.
if { [istarget *-*-darwin*] } {
return 0
@@ -7211,12 +7188,6 @@ proc check_effective_target_powerpc_vsx_ok { } {
 if { ([istarget powerpc*-*-*]
  && ![istarget powerpc-*-linux*paired*])
 || [istarget rs6000-*-*] } {
-   # VSX is not supported on AIX before 7.1.
-   if { [istarget powerpc*-*-aix4*]
-|| [istarget powerpc*-*-aix5*]
-|| [istarget powerpc*-*-aix6*] } {
-   return 0
-   }
# Darwin doesn't have VSX, even if it's used with an assembler
# which recognises the insns.
if { [istarget *-*-darwin*] } {


[gcc r15-711] testsuite, rs6000: Remove powerpc_popcntb_ok

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:f4598e71cf28478ecad2bc6a47f500e30bd65eb6

commit r15-711-gf4598e71cf28478ecad2bc6a47f500e30bd65eb6
Author: Kewen Lin 
Date:   Mon May 20 21:01:07 2024 -0500

testsuite, rs6000: Remove powerpc_popcntb_ok

There are three uses of effective target powerpc_popcntb_ok,
all of them in compile-only tests, but powerpc_popcntb_ok
checks for executable generation, which is too heavy.  This
patch removes powerpc_popcntb_ok and adjusts its three uses
accordingly.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_powerpc_popcntb_ok): Remove.
* gcc.target/powerpc/cmpb-2.c: Adjust with dg-skip-if as
powerpc_popcntb_ok gets removed.
* gcc.target/powerpc/cmpb-3.c: Likewise.
* gcc.target/powerpc/cmpb32-2.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/cmpb-2.c   |  3 ++-
 gcc/testsuite/gcc.target/powerpc/cmpb-3.c   |  3 ++-
 gcc/testsuite/gcc.target/powerpc/cmpb32-2.c |  3 ++-
 gcc/testsuite/lib/target-supports.exp   | 20 
 4 files changed, 6 insertions(+), 23 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
index 02b84d0731d5..44a554bee4a2 100644
--- a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
@@ -1,6 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
+/* Skip powerpc*-*-darwin* powerpc-*-eabi as dropped popcntb_ok.  */
+/* { dg-skip-if "" { powerpc*-*-darwin* powerpc-*-eabi } } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_popcntb_ok } */
 /* { dg-options "-mdejagnu-cpu=power5" } */
 
 void abort ();
diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb-3.c b/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
index 75641bdb22cc..43de37a571d5 100644
--- a/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
@@ -1,6 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
+/* Skip powerpc*-*-darwin* powerpc-*-eabi as dropped popcntb_ok.  */
+/* { dg-skip-if "" { powerpc*-*-darwin* powerpc-*-eabi } } */
 /* { dg-require-effective-target ilp32 } */
-/* { dg-require-effective-target powerpc_popcntb_ok } */
 /* { dg-options "-mdejagnu-cpu=power6" } */
 
 void abort ();
diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
index d4264ab6e7d3..0713c44fcff2 100644
--- a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
@@ -1,5 +1,6 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-require-effective-target powerpc_popcntb_ok } */
+/* Skip powerpc*-*-darwin* powerpc-*-eabi as dropped popcntb_ok.  */
+/* { dg-skip-if "" { powerpc*-*-darwin* powerpc-*-eabi } } */
 /* { dg-options "-mdejagnu-cpu=power5" } */
 
 void abort ();
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 82dea149c257..34027b64e520 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3946,26 +3946,6 @@ proc check_effective_target_unsigned_char {} {
 }]
 }
 
-proc check_effective_target_powerpc_popcntb_ok { } {
-return [check_cached_effective_target powerpc_popcntb_ok {
-
-   # Disable on Darwin.
-   if { [istarget powerpc-*-eabi] || [istarget powerpc*-*-eabispe] || [istarget *-*-darwin*]} {
-   expr 0
-   } else {
-   check_runtime_nocache powerpc_popcntb_ok {
-   volatile int r;
-   volatile int a = 0x12345678;
-   int main()
-   {
-   asm volatile ("popcntb %0,%1" : "=r" (r) : "r" (a));
-   return 0;
-   }
-   } "-mcpu=power5"
-   }
-}]
-}
-
 # Return 1 if the target supports executing DFP hardware instructions,
 # 0 otherwise.  Cache the result.


[gcc r15-713] libgcc, rs6000: Remove powerpcspe related code

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:5d1d2e955d1379da77b000f6445c208ff25cd137

commit r15-713-g5d1d2e955d1379da77b000f6445c208ff25cd137
Author: Kewen Lin 
Date:   Mon May 20 21:01:08 2024 -0500

libgcc, rs6000: Remove powerpcspe related code

Since r9-4728 removed the powerpcspe support, this follow-up
patch removes the remaining pieces in libgcc.

libgcc/ChangeLog:

* config.host: Remove powerpc-*-eabispe* support.
* config/rs6000/linux-unwind.h (ppc_fallback_frame_state): Remove
__SPE__ code.
* config/rs6000/t-savresfgpr (LIB2ADD_ST): Remove e500crtres32gpr.S,
e500crtres32gpr.S, e500crtsav64gpr.S, e500crtsav64gprctr.S,
e500crtres64gpr.S, e500crtsav32gpr.S, e500crtsavg32gpr.S,
e500crtres64gprctr.S, e500crtsavg64gprctr.S, e500crtresx32gpr.S,
e500crtrest32gpr.S, e500crtrest64gpr.S and e500crtresx64gpr.S.
* config/rs6000/e500crtres32gpr.S: Remove.
* config/rs6000/e500crtres64gpr.S: Remove.
* config/rs6000/e500crtres64gprctr.S: Remove.
* config/rs6000/e500crtrest32gpr.S: Remove.
* config/rs6000/e500crtrest64gpr.S: Remove.
* config/rs6000/e500crtresx32gpr.S: Remove.
* config/rs6000/e500crtresx64gpr.S: Remove.
* config/rs6000/e500crtsav32gpr.S: Remove.
* config/rs6000/e500crtsav64gpr.S: Remove.
* config/rs6000/e500crtsav64gprctr.S: Remove.
* config/rs6000/e500crtsavg32gpr.S: Remove.
* config/rs6000/e500crtsavg64gpr.S: Remove.
* config/rs6000/e500crtsavg64gprctr.S: Remove.

Diff:
---
 libgcc/config.host |  4 --
 libgcc/config/rs6000/e500crtres32gpr.S | 73 
 libgcc/config/rs6000/e500crtres64gpr.S | 73 
 libgcc/config/rs6000/e500crtres64gprctr.S  | 90 -
 libgcc/config/rs6000/e500crtrest32gpr.S| 75 
 libgcc/config/rs6000/e500crtrest64gpr.S| 74 
 libgcc/config/rs6000/e500crtresx32gpr.S| 75 
 libgcc/config/rs6000/e500crtresx64gpr.S| 75 
 libgcc/config/rs6000/e500crtsav32gpr.S | 73 
 libgcc/config/rs6000/e500crtsav64gpr.S | 72 ---
 libgcc/config/rs6000/e500crtsav64gprctr.S  | 91 --
 libgcc/config/rs6000/e500crtsavg32gpr.S| 73 
 libgcc/config/rs6000/e500crtsavg64gpr.S| 73 
 libgcc/config/rs6000/e500crtsavg64gprctr.S | 90 -
 libgcc/config/rs6000/linux-unwind.h| 11 
 libgcc/config/rs6000/t-savresfgpr  | 15 +
 16 files changed, 1 insertion(+), 1036 deletions(-)

diff --git a/libgcc/config.host b/libgcc/config.host
index 694602d31859..9fae51d4ce7d 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1238,10 +1238,6 @@ powerpc*-*-freebsd*)
 powerpc-*-netbsd*)
tmake_file="$tmake_file rs6000/t-netbsd rs6000/t-crtstuff"
;;
-powerpc-*-eabispe*)
-   tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-savresfgpr rs6000/t-crtstuff t-crtstuff-pic t-fdpbit"
-   extra_parts="$extra_parts crtbegin.o crtend.o crtbeginS.o crtendS.o crtbeginT.o ecrti.o ecrtn.o ncrti.o ncrtn.o"
-   ;;
 powerpc-*-eabisimaltivec*)
tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-crtstuff t-crtstuff-pic t-fdpbit"
extra_parts="$extra_parts crtbegin.o crtend.o crtbeginS.o crtendS.o crtbeginT.o ecrti.o ecrtn.o ncrti.o ncrtn.o"
diff --git a/libgcc/config/rs6000/e500crtres32gpr.S b/libgcc/config/rs6000/e500crtres32gpr.S
deleted file mode 100644
index b19703073cad..
--- a/libgcc/config/rs6000/e500crtres32gpr.S
+++ /dev/null
@@ -1,73 +0,0 @@
-/*
- * Special support for e500 eabi and SVR4
- *
- *   Copyright (C) 2008-2024 Free Software Foundation, Inc.
- *   Written by Nathan Froyd
- * 
- * This file is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License as published by the
- * Free Software Foundation; either version 3, or (at your option) any
- * later version.
- * 
- * This file is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- * 
- * Under Section 7 of GPL version 3, you are granted additional
- * permissions described in the GCC Runtime Library Exception, version
- * 3.1, as published by the Free Software Foundation.
- *
- * You should have received a copy of the GNU General Public License and
- * a copy of the GCC Runtime Library Exception along with this program;
- * see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
- * .
- */ 

[gcc r15-710] testsuite, rs6000: Remove all linux*paired* checks and cases

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:458b23bc8b3e2b11a6ea19c69f42ba85abb7d0fe

commit r15-710-g458b23bc8b3e2b11a6ea19c69f42ba85abb7d0fe
Author: Kewen Lin 
Date:   Mon May 20 21:01:07 2024 -0500

testsuite, rs6000: Remove all linux*paired* checks and cases

r9-115-g559289370f76bf dropped paired single support, but
we still have some test checks and cases for it.  This patch
gets rid of them.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_vect_int): Remove
the check on powerpc-*-linux*paired*.
(check_effective_target_vect_intfloat_cvt): Likewise.
(check_effective_target_vect_uintfloat_cvt): Likewise.
(check_effective_target_vect_floatint_cvt): Likewise.
(check_effective_target_vect_floatuint_cvt): Likewise.
(check_effective_target_powerpc_altivec_ok): Likewise.
(check_effective_target_powerpc_p9modulo_ok): Likewise.
(check_effective_target_powerpc_float128_sw_ok): Likewise.
(check_effective_target_powerpc_float128_hw_ok): Likewise.
(check_effective_target_powerpc_vsx_ok): Likewise.
(check_effective_target_powerpc_htm_ok): Likewise.
(check_effective_target_vect_shift): Likewise.
(check_effective_target_vect_char_add): Likewise.
(check_effective_target_vect_shift_char): Likewise.
(check_effective_target_vect_long): Likewise.
(check_effective_target_ifn_copysign): Likewise.
(check_effective_target_vect_sdot_hi): Likewise.
(check_effective_target_vect_udot_hi): Likewise.
(check_effective_target_vect_pack_trunc): Likewise.
(check_effective_target_vect_int_mult): Likewise.
* gcc.target/powerpc/paired-1.c: Remove.
* gcc.target/powerpc/paired-10.c: Remove.
* gcc.target/powerpc/paired-2.c: Remove.
* gcc.target/powerpc/paired-3.c: Remove.
* gcc.target/powerpc/paired-4.c: Remove.
* gcc.target/powerpc/paired-5.c: Remove.
* gcc.target/powerpc/paired-6.c: Remove.
* gcc.target/powerpc/paired-7.c: Remove.
* gcc.target/powerpc/paired-8.c: Remove.
* gcc.target/powerpc/paired-9.c: Remove.
* gcc.target/powerpc/ppc-paired.c: Remove.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/paired-1.c   | 33 ---
 gcc/testsuite/gcc.target/powerpc/paired-10.c  | 25 
 gcc/testsuite/gcc.target/powerpc/paired-2.c   | 35 
 gcc/testsuite/gcc.target/powerpc/paired-3.c   | 34 ---
 gcc/testsuite/gcc.target/powerpc/paired-4.c   | 34 ---
 gcc/testsuite/gcc.target/powerpc/paired-5.c   | 34 ---
 gcc/testsuite/gcc.target/powerpc/paired-6.c   | 34 ---
 gcc/testsuite/gcc.target/powerpc/paired-7.c   | 34 ---
 gcc/testsuite/gcc.target/powerpc/paired-8.c   | 25 
 gcc/testsuite/gcc.target/powerpc/paired-9.c   | 25 
 gcc/testsuite/gcc.target/powerpc/ppc-paired.c | 45 
 gcc/testsuite/lib/target-supports.exp | 59 +--
 12 files changed, 20 insertions(+), 397 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/paired-1.c b/gcc/testsuite/gcc.target/powerpc/paired-1.c
deleted file mode 100644
index 19a66a15b30b..
--- a/gcc/testsuite/gcc.target/powerpc/paired-1.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* { dg-do compile { target { powerpc-*-linux*paired* && ilp32} } } */
-/* { dg-options "-mpaired -ffinite-math-only " } */
-
-/* Test PowerPC PAIRED extensions.  */
-
-#include 
-
-static float in1[2] __attribute__ ((aligned (8))) =
-{6.0, 7.0};
-static float in2[2] __attribute__ ((aligned (8))) =
-{4.0, 3.0};
-
-static float out[2] __attribute__ ((aligned (8)));
-
-vector float a, b, c, d;
-void
-test_api ()
-{
-  b = paired_lx (0, in1);
-  c = paired_lx (0, in2);
-
-  a = paired_sub (b, c);
-
-  paired_stx (a, 0, out);
-}
-
-int
-main ()
-{
-  test_api ();
-  return (0);
-}
-
diff --git a/gcc/testsuite/gcc.target/powerpc/paired-10.c b/gcc/testsuite/gcc.target/powerpc/paired-10.c
deleted file mode 100644
index 1f904c258413..
--- a/gcc/testsuite/gcc.target/powerpc/paired-10.c
+++ /dev/null
@@ -1,25 +0,0 @@
-/* { dg-do compile { target { powerpc-*-linux*paired* && ilp32 } } } */
-/* { dg-options "-mpaired -ffinite-math-only " } */
-
-/* Test PowerPC PAIRED extensions.  */
-
-#include 
-
-static float out[2] __attribute__ ((aligned (8)));
-void
-test_api (float y, float x)
-{
-  vector float c = {x, y};
-  vector float b = {0.0, 8.0};
-  vector float a;
-
-  a = paired_sub (b, c);
-  paired_stx (a, 0, out);
-}
-
-
-int main ()
-{
-  test_api (6, 7);
-  return (0); 
-}
diff --git a/gcc/testsuite/gcc.target/powerpc/paired-2.c b/gcc/testsuite/gcc.target/powerpc/paired-2.c
deleted file mode 100644
index 181bbf1c39cd..000

[gcc r15-714] testsuite, rs6000: Remove effective target powerpc_405_nocache

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:7a9a6091b81d8579ab0470e4e21b5682d4ee4ef4

commit r15-714-g7a9a6091b81d8579ab0470e4e21b5682d4ee4ef4
Author: Kewen Lin 
Date:   Mon May 20 21:01:08 2024 -0500

testsuite, rs6000: Remove effective target powerpc_405_nocache

With the introduction of -mdejagnu-cpu=, a test case that
specifies -mdejagnu-cpu=405 overrides any other -mcpu= that
may be given, so it is always compiled for PowerPC 405.
This patch removes the effective target powerpc_405_nocache
and updates all its uses.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/405-dlmzb-strlen-1.c: Remove the line using
powerpc_405_nocache check.
* gcc.target/powerpc/405-macchw-1.c: Likewise.
* gcc.target/powerpc/405-macchw-2.c: Likewise.
* gcc.target/powerpc/405-macchwu-1.c: Likewise.
* gcc.target/powerpc/405-macchwu-2.c: Likewise.
* gcc.target/powerpc/405-machhw-1.c: Likewise.
* gcc.target/powerpc/405-machhw-2.c: Likewise.
* gcc.target/powerpc/405-machhwu-1.c: Likewise.
* gcc.target/powerpc/405-machhwu-2.c: Likewise.
* gcc.target/powerpc/405-maclhw-1.c: Likewise.
* gcc.target/powerpc/405-maclhw-2.c: Likewise.
* gcc.target/powerpc/405-maclhwu-1.c: Likewise.
* gcc.target/powerpc/405-maclhwu-2.c: Likewise.
* gcc.target/powerpc/405-mulchw-1.c: Likewise.
* gcc.target/powerpc/405-mulchw-2.c: Likewise.
* gcc.target/powerpc/405-mulchwu-1.c: Likewise.
* gcc.target/powerpc/405-mulchwu-2.c: Likewise.
* gcc.target/powerpc/405-mulhhw-1.c: Likewise.
* gcc.target/powerpc/405-mulhhw-2.c: Likewise.
* gcc.target/powerpc/405-mulhhwu-1.c: Likewise.
* gcc.target/powerpc/405-mulhhwu-2.c: Likewise.
* gcc.target/powerpc/405-mullhw-1.c: Likewise.
* gcc.target/powerpc/405-mullhw-2.c: Likewise.
* gcc.target/powerpc/405-mullhwu-1.c: Likewise.
* gcc.target/powerpc/405-mullhwu-2.c: Likewise.
* gcc.target/powerpc/405-nmacchw-1.c: Likewise.
* gcc.target/powerpc/405-nmacchw-2.c: Likewise.
* gcc.target/powerpc/405-nmachhw-1.c: Likewise.
* gcc.target/powerpc/405-nmachhw-2.c: Likewise.
* gcc.target/powerpc/405-nmaclhw-1.c: Likewise.
* gcc.target/powerpc/405-nmaclhw-2.c: Likewise.
* lib/target-supports.exp
(check_effective_target_powerpc_405_nocache): Remove.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/405-dlmzb-strlen-1.c |  1 -
 gcc/testsuite/gcc.target/powerpc/405-macchw-1.c   |  6 +-
 gcc/testsuite/gcc.target/powerpc/405-macchw-2.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-macchwu-1.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-macchwu-2.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-machhw-1.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-machhw-2.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-machhwu-1.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-machhwu-2.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-maclhw-1.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-maclhw-2.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-maclhwu-1.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-maclhwu-2.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mulchw-1.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mulchw-2.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mulchwu-1.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mulchwu-2.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mulhhw-1.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mulhhw-2.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mulhhwu-1.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mulhhwu-2.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mullhw-1.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mullhw-2.c   |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mullhwu-1.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-mullhwu-2.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-nmacchw-1.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-nmacchw-2.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-nmachhw-1.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-nmachhw-2.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-nmaclhw-1.c  |  1 -
 gcc/testsuite/gcc.target/powerpc/405-nmaclhw-2.c  |  1 -
 gcc/testsuite/lib/target-supports.exp | 17 -
 32 files changed, 5 insertions(+), 48 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/405-dlmzb-strlen-1.c b/gcc/testsuite/gcc.target/powerpc/405-dlmzb-strlen-1.c
index 5ee427a3b4a9..984ffe7144c4 100644
--- a/gcc/testsuite/gcc.target/powerpc/405-dlmzb-strlen-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/405-dlmzb-strlen-1.c
@@ -4,7 +4,6 @@
 /* { dg-skip-if "" { powerpc*

[gcc r15-715] testsuite, rs6000: Make powerpc_vsx consider current_compiler_flags [PR114842]

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:95080f2a40c5dfc098b75029c30380ecf03875dc

commit r15-715-g95080f2a40c5dfc098b75029c30380ecf03875dc
Author: Kewen Lin 
Date:   Mon May 20 21:01:08 2024 -0500

testsuite, rs6000: Make powerpc_vsx consider current_compiler_flags [PR114842]

As noted in PR114842, most of the test cases which require
the effective target check powerpc_vsx_ok actually care about
whether the VSX feature is enabled, and they should adopt the
effective target powerpc_vsx instead.  Since we already have
a number of test cases with an explicit -mvsx in dg-options
etc., and they should still be tested as before even when
vsx is not enabled by default, this patch makes powerpc_vsx
consider current_compiler_flags.

PR testsuite/114842

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_powerpc_vsx): Take
current_compiler_flags into account.

Diff:
---
 gcc/testsuite/lib/target-supports.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index cf5512074ad5..8689c11214d4 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7140,7 +7140,7 @@ proc check_effective_target_powerpc_vsx { } {
  nope no vsx
#endif
}
-}]
+} [current_compiler_flags]]
 }
 
 # Return 1 if this is a PowerPC target supporting -mvsx


[gcc r15-716] testsuite, rs6000: Make powerpc_altivec consider current_compiler_flags [PR114842]

2024-05-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:3bb8cdbd60cdb4dab45b97235dc045d6b0a1

commit r15-716-g3bb8cdbd60cdb4dab45b97235dc045d6b0a1
Author: Kewen Lin 
Date:   Mon May 20 21:01:08 2024 -0500

testsuite, rs6000: Make powerpc_altivec consider current_compiler_flags [PR114842]

As noted in PR114842, most of the test cases which require
the effective target check powerpc_altivec_ok actually care
about whether the ALTIVEC feature is enabled, and they should
adopt the effective target powerpc_altivec instead.  Since
we already have a number of test cases with an explicit
-maltivec in dg-options etc., and they should still be tested
as before even when altivec is not enabled by default, this
patch makes powerpc_altivec consider current_compiler_flags,
as we do for powerpc_vsx.

PR testsuite/114842

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_powerpc_altivec):
Take current_compiler_flags into account.

Diff:
---
 gcc/testsuite/lib/target-supports.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 8689c11214d4..3f0f8532dc36 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7323,7 +7323,7 @@ proc check_effective_target_sparc_vis { } {
#else
int dummy;
#endif
-   }]
+   } [current_compiler_flags]]
 } else {
return 0
 }


[gcc r15-3059] testsuite, rs6000: Remove all powerpc-*paired* uses

2024-08-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:118a7241f4fe7132cfd7b028ffd5ad39056ec601

commit r15-3059-g118a7241f4fe7132cfd7b028ffd5ad39056ec601
Author: Kewen Lin 
Date:   Wed Aug 21 00:26:20 2024 -0500

testsuite, rs6000: Remove all powerpc-*paired* uses

Similar to r15-710-g458b23bc8b3e2b, which removed all uses of
powerpc-*-linux*paired*, this patch removes the remaining
powerpc-*paired* uses, which I failed to catch because the
search keyword included "*linux*".

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_vect_support_and_set_flags): Remove
the if arm checking powerpc-*paired*.
(check_750cl_hw_available): Remove.
(check_effective_target_vect_unpack): Remove the check on
powerpc-*paired*.

Diff:
---
 gcc/testsuite/lib/target-supports.exp | 35 ++-
 1 file changed, 2 insertions(+), 33 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 11ba77ca404d..91995bff65f7 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2848,30 +2848,6 @@ proc check_ppc_cpu_supports_hw_available { } {
 }]
 }
 
-# Return 1 if the target supports executing 750CL paired-single instructions, 0
-# otherwise.  Cache the result.
-
-proc check_750cl_hw_available { } {
-return [check_cached_effective_target 750cl_hw_available {
-   # If this is not the right target then we can skip the test.
-   if { ![istarget powerpc-*paired*] } {
-   expr 0
-   } else {
-   check_runtime_nocache 750cl_hw_available {
-int main()
-{
-#ifdef __MACH__
-  asm volatile ("ps_mul v0,v0,v0");
-#else
-  asm volatile ("ps_mul 0,0,0");
-#endif
-  return 0;
-}
-   } "-mpaired"
-   }
-}]
-}
-
 # Return 1 if the target supports executing power8 vector instructions, 0
 # otherwise.  Cache the result.
 
@@ -8329,7 +8305,7 @@ proc check_effective_target_vect_pack_trunc { } {
 
 proc check_effective_target_vect_unpack { } {
 return [check_cached_effective_target_indexed vect_unpack {
-  expr { ([istarget powerpc*-*-*] && ![istarget powerpc-*paired*])
+  expr { [istarget powerpc*-*-*]
 || [istarget i?86-*-*] || [istarget x86_64-*-*]
 || [istarget ia64-*-*]
 || [istarget aarch64*-*-*]
@@ -11702,14 +11678,7 @@ proc check_vect_support_and_set_flags { } {
 global dg-do-what-default
 global EFFECTIVE_TARGETS
 
-if  [istarget powerpc-*paired*]  {
-   lappend DEFAULT_VECTCFLAGS "-mpaired"
-   if [check_750cl_hw_available] {
-   set dg-do-what-default run
-   } else {
-   set dg-do-what-default compile
-   }
-} elseif [istarget powerpc*-*-*] {
+if [istarget powerpc*-*-*] {
# Skip targets not supporting -maltivec.
if ![is-effective-target powerpc_altivec_ok] {
return 0


[gcc r15-3060] rs6000: Fix vsx_le_perm_store_* splitters for !reload_completed

2024-08-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:ae53e4b99eaad43424f2b0cc1bbabb3b454fb6d8

commit r15-3060-gae53e4b99eaad43424f2b0cc1bbabb3b454fb6d8
Author: Kewen Lin 
Date:   Wed Aug 21 00:26:20 2024 -0500

rs6000: Fix vsx_le_perm_store_* splitters for !reload_completed

For vsx_le_perm_store_* we have two splitters, one for
!reload_completed and the other for reload_completed.
As Richard pointed out in [1], operand 1 here is a pure
input for DF and most passes, but it could be used as the
destination of its own vector rotation (64 bit).  So for
the reload_completed case we re-compute the source (back to
the original value), while for !reload_completed we
generate one new pseudo; both cases are then fine if operand
1 is still live after this insn.  But according to the
source code, the !reload_completed case can logically
reuse operand 1, because the new pseudo generation is
conditional on can_create_pseudo_p, and that would cause a
wrong result when operand 1 is still live.  Considering this,
and that there is no splitting for this when reload_in_progress,
this patch fixes the code to assert can_create_pseudo_p
there, so that both the !reload_completed and reload_completed
cases ensure operand 1 is unchanged (pure input).  It
also prepares for the follow-up patch which strips
the unnecessary INOUT constraint modifier "+".

This also fixes an oversight in the splitter for VSX_LE_128
(!reload_completed), which should use operand 1 rather than
operand 0.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660145.html

gcc/ChangeLog:

* config/rs6000/vsx.md
(*vsx_le_perm_store_{,,
v8hi,v16qi,} !reload_completed splitters): Assert
can_create_pseudo_p and always generate one new pseudo for
operand 1.

Diff:
---
 gcc/config/rs6000/vsx.md | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 27069d070e15..89eaef183d99 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -703,8 +703,8 @@
   /* Otherwise, fall through to transform into a swapping store.  */
 }
 
-  operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1]) 
-   : operands[1];
+  gcc_assert (can_create_pseudo_p ());
+  operands[2] = gen_reg_rtx_and_attrs (operands[1]);
 })
 
 ;; The post-reload split requires that we re-permute the source
@@ -775,8 +775,8 @@
   /* Otherwise, fall through to transform into a swapping store.  */
 }
 
-  operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1]) 
-   : operands[1];
+  gcc_assert (can_create_pseudo_p ());
+  operands[2] = gen_reg_rtx_and_attrs (operands[1]);
 })
 
 ;; The post-reload split requires that we re-permute the source
@@ -854,8 +854,8 @@
   /* Otherwise, fall through to transform into a swapping store.  */
 }
 
-  operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1]) 
-   : operands[1];
+  gcc_assert (can_create_pseudo_p ());
+  operands[2] = gen_reg_rtx_and_attrs (operands[1]);
 })
 
 ;; The post-reload split requires that we re-permute the source
@@ -947,8 +947,8 @@
   /* Otherwise, fall through to transform into a swapping store.  */
 }
 
-  operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[1]) 
-   : operands[1];
+  gcc_assert (can_create_pseudo_p ());
+  operands[2] = gen_reg_rtx_and_attrs (operands[1]);
 })
 
 ;; The post-reload split requires that we re-permute the source
@@ -1076,9 +1076,8 @@
&& !altivec_indexed_or_indirect_operand (operands[0], mode)"
   [(const_int 0)]
 {
-  rtx tmp = (can_create_pseudo_p ()
-? gen_reg_rtx_and_attrs (operands[0])
-: operands[0]);
+  gcc_assert (can_create_pseudo_p ());
+  rtx tmp = gen_reg_rtx_and_attrs (operands[1]);
   rs6000_emit_le_vsx_permute (tmp, operands[1], mode);
   rs6000_emit_le_vsx_permute (operands[0], tmp, mode);
   DONE;


[gcc r15-3061] rs6000: Remove "+" constraint modifier from *vsx_le_perm_store_* insns

2024-08-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:34292a1ae89a13baf974ff2ecb21dcf89aab4617

commit r15-3061-g34292a1ae89a13baf974ff2ecb21dcf89aab4617
Author: Kewen Lin 
Date:   Wed Aug 21 00:26:20 2024 -0500

rs6000: Remove "+" constraint modifier from *vsx_le_perm_store_* insns

Since *vsx_le_perm_store_* can be split into a vector
permute and a vector store, after reload_completed we reuse
operand 1 as the destination of the vector permute, so
operand 1 was given the constraint modifier "+".  But since
it's taken as pure input in DF and most passes, as Richard
pointed out in [1], to ensure correctness when operand 1
is still live we actually restore operand 1's value
after the store with another vector permute, that is:
  op1 = vector permute op1 (doubleword swapping)
  op0 = op2
  op1 = vector permute op1 (doubleword swapping)
This means op1's value isn't changed by this insn.
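
The permute/store/permute sequence can be pictured with a small C++
model.  This is only a sketch under stated assumptions: Vec128,
swap_doublewords and le_perm_store are hypothetical names, a register
is modelled as two 64-bit halves, and the middle step is simplified
to storing the swapped register directly.  It is not the code GCC
emits.

```cpp
#include <array>
#include <cstdint>
#include <utility>

// Hypothetical model of a 128-bit VSX register as two 64-bit doublewords.
using Vec128 = std::array<std::uint64_t, 2>;

// Models "vector permute (doubleword swapping)": exchange the two halves.
inline Vec128 swap_doublewords(Vec128 v) {
  std::swap(v[0], v[1]);
  return v;
}

// Models the three-step sequence: permute, store, permute back.
// `mem` stands in for operand 0 (memory) and `reg` for operand 1.
// Assumption: the store writes the swapped register as-is.
inline void le_perm_store(Vec128 &mem, Vec128 &reg) {
  reg = swap_doublewords(reg);  // op1 = vector permute op1
  mem = reg;                    // store the swapped value
  reg = swap_doublewords(reg);  // op1 = vector permute op1 (restores op1)
}
```

Because swapping the two doublewords twice is the identity, reg holds
its original value after the call, which is why the operand can be
treated as a pure input.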

So according to the comments from Richard and Segher in
that thread, this patch is to remove the "+" constraint
modifier of operand 1 from *vsx_le_perm_store_* insns.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660145.html

gcc/ChangeLog:

* config/rs6000/vsx.md (define_insn
*vsx_le_perm_store_{,
,v8hi,v16qi,}): Remove constraint modifier
"+" from operand 1.

Diff:
---
 gcc/config/rs6000/vsx.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 89eaef183d99..b2fc39acf4e8 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -659,7 +659,7 @@
 
 (define_insn "*vsx_le_perm_store_"
   [(set (match_operand:VSX_D 0 "indexed_or_indirect_operand" "=Z")
-(match_operand:VSX_D 1 "vsx_register_operand" "+wa"))]
+(match_operand:VSX_D 1 "vsx_register_operand" "wa"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
@@ -729,7 +729,7 @@
 
 (define_insn "*vsx_le_perm_store_"
   [(set (match_operand:VSX_W 0 "indexed_or_indirect_operand" "=Z")
-(match_operand:VSX_W 1 "vsx_register_operand" "+wa"))]
+(match_operand:VSX_W 1 "vsx_register_operand" "wa"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
@@ -804,7 +804,7 @@
 
 (define_insn "*vsx_le_perm_store_v8hi"
   [(set (match_operand:V8HI 0 "indexed_or_indirect_operand" "=Z")
-(match_operand:V8HI 1 "vsx_register_operand" "+wa"))]
+(match_operand:V8HI 1 "vsx_register_operand" "wa"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
@@ -889,7 +889,7 @@
 
 (define_insn "*vsx_le_perm_store_v16qi"
   [(set (match_operand:V16QI 0 "indexed_or_indirect_operand" "=Z")
-(match_operand:V16QI 1 "vsx_register_operand" "+wa"))]
+(match_operand:V16QI 1 "vsx_register_operand" "wa"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
@@ -1059,7 +1059,7 @@
 
 (define_insn "*vsx_le_perm_store_"
   [(set (match_operand:VSX_LE_128 0 "memory_operand" "=Z,Q")
-(match_operand:VSX_LE_128 1 "vsx_register_operand" "+wa,r"))]
+(match_operand:VSX_LE_128 1 "vsx_register_operand" "wa,r"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
&& !altivec_indexed_or_indirect_operand (operands[0], mode)"
   "@


[gcc r15-1591] go: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

2024-06-24 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:fafd87830937d5a0eddeb4e1110910ad817c11b4

commit r15-1591-gfafd87830937d5a0eddeb4e1110910ad817c11b4
Author: Kewen Lin 
Date:   Tue Jun 25 00:04:47 2024 -0500

go: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type.  To prepare for that, this
patch replaces the uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in go with TYPE_PRECISION of {float,{,long_}double}_type_node.
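
As a rough illustration of the replacement, the lookup below compares
the requested bit width against the precision stored on each type
node instead of against a global macro.  It is a sketch only:
TypeNode and float_type_for_bits are hypothetical stand-ins for
GCC's tree nodes and TYPE_PRECISION, and returning nullptr for an
unmatched width is an assumption made for the example.

```cpp
#include <string>

// Hypothetical stand-in for a GCC type node.  Real code uses `tree`
// and reads the width with TYPE_PRECISION (float_type_node), etc.
struct TypeNode {
  std::string name;
  int precision;  // width in bits
};

// Pick the node whose precision matches `bits`, checking float,
// double and long double in that order, as the patched code does.
// Returns nullptr when nothing matches (assumption for this sketch).
inline const TypeNode *float_type_for_bits(int bits,
                                           const TypeNode &f,
                                           const TypeNode &d,
                                           const TypeNode &ld) {
  if (bits == f.precision)
    return &f;
  if (bits == d.precision)
    return &d;
  if (bits == ld.precision)
    return &ld;
  return nullptr;
}
```

The width now travels with the type node, so the lookup keeps working
once the *_TYPE_SIZE macros are gone.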

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/go/ChangeLog:

* go-gcc.cc (Gcc_backend::float_type): Use TYPE_PRECISION of
{float,double,long_double}_type_node to replace
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.
(Gcc_backend::complex_type): Likewise.

Diff:
---
 gcc/go/go-gcc.cc | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc
index bc9732c3db3..6aa751f9f30 100644
--- a/gcc/go/go-gcc.cc
+++ b/gcc/go/go-gcc.cc
@@ -993,11 +993,11 @@ Btype*
 Gcc_backend::float_type(int bits)
 {
   tree type;
-  if (bits == FLOAT_TYPE_SIZE)
+  if (bits == TYPE_PRECISION (float_type_node))
 type = float_type_node;
-  else if (bits == DOUBLE_TYPE_SIZE)
+  else if (bits == TYPE_PRECISION (double_type_node))
 type = double_type_node;
-  else if (bits == LONG_DOUBLE_TYPE_SIZE)
+  else if (bits == TYPE_PRECISION (long_double_type_node))
 type = long_double_type_node;
   else
 {
@@ -1014,11 +1014,11 @@ Btype*
 Gcc_backend::complex_type(int bits)
 {
   tree type;
-  if (bits == FLOAT_TYPE_SIZE * 2)
+  if (bits == TYPE_PRECISION (float_type_node) * 2)
 type = complex_float_type_node;
-  else if (bits == DOUBLE_TYPE_SIZE * 2)
+  else if (bits == TYPE_PRECISION (double_type_node) * 2)
 type = complex_double_type_node;
-  else if (bits == LONG_DOUBLE_TYPE_SIZE * 2)
+  else if (bits == TYPE_PRECISION (long_double_type_node) * 2)
 type = complex_long_double_type_node;
   else
 {


[gcc r15-1592] rust: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

2024-06-24 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:bcd1b7a097031d33bc74943bb260d12ff801cf3f

commit r15-1592-gbcd1b7a097031d33bc74943bb260d12ff801cf3f
Author: Kewen Lin 
Date:   Tue Jun 25 00:04:49 2024 -0500

rust: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type.  To prepare for that, this
patch replaces the uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in rust with TYPE_PRECISION of {float,{,long_}double}_type_node.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/rust/ChangeLog:

* rust-gcc.cc (float_type): Use TYPE_PRECISION of
{float,double,long_double}_type_node to replace
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.

Diff:
---
 gcc/rust/rust-gcc.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/rust/rust-gcc.cc b/gcc/rust/rust-gcc.cc
index f17e19a2dfc..38169c08985 100644
--- a/gcc/rust/rust-gcc.cc
+++ b/gcc/rust/rust-gcc.cc
@@ -411,11 +411,11 @@ tree
 float_type (int bits)
 {
   tree type;
-  if (bits == FLOAT_TYPE_SIZE)
+  if (bits == TYPE_PRECISION (float_type_node))
 type = float_type_node;
-  else if (bits == DOUBLE_TYPE_SIZE)
+  else if (bits == TYPE_PRECISION (double_type_node))
 type = double_type_node;
-  else if (bits == LONG_DOUBLE_TYPE_SIZE)
+  else if (bits == TYPE_PRECISION (long_double_type_node))
 type = long_double_type_node;
   else
 {


[gcc r15-1593] vms: Replace use of LONG_DOUBLE_TYPE_SIZE

2024-06-24 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:7eddf6e857bc79cfa0bee3b9ad89a7e16a81d1e8

commit r15-1593-g7eddf6e857bc79cfa0bee3b9ad89a7e16a81d1e8
Author: Kewen Lin 
Date:   Tue Jun 25 00:04:51 2024 -0500

vms: Replace use of LONG_DOUBLE_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type.  To prepare for that, this
patch replaces the use of LONG_DOUBLE_TYPE_SIZE in the vms port
with TYPE_PRECISION of long_double_type_node.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/ChangeLog:

* config/vms/vms.cc (vms_patch_builtins): Use TYPE_PRECISION of
long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE.

Diff:
---
 gcc/config/vms/vms.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/vms/vms.cc b/gcc/config/vms/vms.cc
index d468c79e559..2fcc673c8a9 100644
--- a/gcc/config/vms/vms.cc
+++ b/gcc/config/vms/vms.cc
@@ -141,6 +141,7 @@ vms_patch_builtins (void)
   if (builtin_decl_implicit_p (BUILT_IN_FWRITE_UNLOCKED))
 set_builtin_decl_implicit_p (BUILT_IN_FWRITE_UNLOCKED, false);
 
+  unsigned long_double_type_size = TYPE_PRECISION (long_double_type_node);
   /* Define aliases for names.  */
   for (i = 0; i < NBR_CRTL_NAMES; i++)
 {
@@ -179,7 +180,7 @@ vms_patch_builtins (void)
  vms_add_crtl_xlat (alt, nlen + 1, res, rlen);
 
  /* Long double version.  */
- res[rlen - 1] = (LONG_DOUBLE_TYPE_SIZE == 128 ? 'X' : 'T');
+ res[rlen - 1] = (long_double_type_size == 128 ? 'X' : 'T');
  alt[nlen] = 'l';
  vms_add_crtl_xlat (alt, nlen + 1, res, rlen);
 
@@ -223,7 +224,7 @@ vms_patch_builtins (void)
   if (n->flags & VMS_CRTL_FLOAT64)
 res[rlen++] = 't';
 
-  if ((n->flags & VMS_CRTL_FLOAT128) && LONG_DOUBLE_TYPE_SIZE == 128)
+  if ((n->flags & VMS_CRTL_FLOAT128) && long_double_type_size == 128)
 res[rlen++] = 'x';
 
   memcpy (res + rlen, n->name, nlen);


[gcc r15-1594] Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook mode_for_floating_type

2024-06-24 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:55947b32c38a40777aedbd105bd94b43a42c2a10

commit r15-1594-g55947b32c38a40777aedbd105bd94b43a42c2a10
Author: Kewen Lin 
Date:   Tue Jun 25 00:04:53 2024 -0500

Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook mode_for_floating_type

Currently, to determine which mode is used for a
floating point type, we call mode_for_size with the given
type precision (size) to get the first mode which has
this size in the specified class.  On Powerpc, we have
three modes (TF/KF/IF) with the same mode precision 128
(see [1]), so this processing forces us to place TF
first.  It would require more adjustments in some generic
code to avoid unexpected mode conversions, and it would
be even worse if we eventually get rid of TF one day.
As Joseph pointed out in [2], "floating types should have
their mode, not a poorly defined precision value".  As
Joseph and Richi suggested, this patch introduces one hook
mode_for_floating_type which returns the corresponding mode
for type float, double or long double.  The default
implementation returns SFmode for float and DFmode for
double or long double.  For ports which need special
treatment, there are some other patches for their own port
specific implementation (referring to how
{,LONG_}DOUBLE_TYPE_SIZE get used there).  All generic uses
of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE are replaced depending
on the context: some with TYPE_PRECISION of the
corresponding type node, others with GET_MODE_PRECISION
on the mode from mode_for_floating_type.
This patch also poisons {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE,
so most port-specific defines of
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE are removed, but some are
worth keeping for readability and are renamed with a
port specific prefix.
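
The default behaviour described above can be sketched in a few lines
of C++.  The enumerators mirror the TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE
names from the ChangeLog, but FloatMode, the function signatures and
the port override (with its ieee128_long_double flag) are hypothetical
and only illustrate the idea.  They are not GCC's actual declarations.

```cpp
// Hypothetical mirrors of GCC names, for illustration only.
enum TypeIndex { TI_FLOAT_TYPE, TI_DOUBLE_TYPE, TI_LONG_DOUBLE_TYPE };
enum class FloatMode { SF, DF, KF };

// Default described in the commit log: SFmode for float,
// DFmode for double or long double.
inline FloatMode default_mode_for_floating_type(TypeIndex ti) {
  return ti == TI_FLOAT_TYPE ? FloatMode::SF : FloatMode::DF;
}

// Hypothetical port override: a port whose long double is a 128-bit
// format names the mode directly instead of searching by size.
inline FloatMode example_port_mode_for_floating_type(
    TypeIndex ti, bool ieee128_long_double) {
  if (ti == TI_LONG_DOUBLE_TYPE && ieee128_long_double)
    return FloatMode::KF;
  return default_mode_for_floating_type(ti);
}
```

Returning a mode rather than a size avoids the ambiguity noted above,
where several modes share precision 128 and a size-based search can
only return whichever one happens to come first.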

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651017.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/jit/ChangeLog:

* jit-recording.cc (recording::memento_of_get_type::get_size):
Update macros {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
targetm.c.mode_for_floating_type with
TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE.

gcc/ChangeLog:

* coretypes.h (enum tree_index): Forward declaration.
* defaults.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* doc/rtl.texi: Update document by replacing
{FLOAT,DOUBLE}_TYPE_SIZE with C type {float,double}.
* doc/tm.texi.in: Document new hook mode_for_floating_type, remove
document entries for {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE and
update document for WIDEST_HARDWARE_FP_SIZE.
* doc/tm.texi: Regenerate.
* emit-rtl.cc (init_emit_once): Replace DOUBLE_TYPE_SIZE by
calling targetm.c.mode_for_floating_type with TI_DOUBLE_TYPE.
* real.h (REAL_VALUE_TO_TARGET_LONG_DOUBLE): Use TYPE_PRECISION of
long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE.
* system.h (FLOAT_TYPE_SIZE): Poison.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* target.def (mode_for_floating_type): New hook.
* targhooks.cc (default_mode_for_floating_type): New function.
(default_scalar_mode_supported_p): Update macros
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
targetm.c.mode_for_floating_type with
TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE.
* targhooks.h (default_mode_for_floating_type): New declaration.
* tree-core.h (enum tree_index): Specify underlying type unsigned
to sync with forward declaration in coretypes.h.
(NUM_FLOATN_TYPES): Explicitly convert to int.
(NUM_FLOATNX_TYPES): Likewise.
(NUM_FLOATN_NX_TYPES): Likewise.
* tree.cc (build_common_tree_nodes): Update macros
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
targetm.c.mode_for_floating_type with
TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE and set type mode accordingly.
* config/arc/arc.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/bpf/bpf.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/epiphany/epiphany.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/fr30/fr30.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewis

[gcc r15-1644] rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

2024-06-26 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22

commit r15-1644-g62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

Commit r12-4496 changes some define_expands and define_insns
for vector merge high/low char, which are altivec_vmrg[hl]b.
These defines are mainly for the built-in functions
vec_merge{h,l} and some internal gen function needs.  These
functions should consider endianness.  Taking vec_mergeh as
an example, PVIPR defines that vec_mergeh "Merges the first
halves (in element order) of two vectors", and it does note
that it's in element order.  So it's mapped to vmrghb on BE
and to vmrglb on LE.
Although the mapped insns are different, as discussed in
PR106069, the RTL pattern should still be the same.  That
held before commit r12-4496, but starting from commit
r12-4496 the patterns differ between BE and LE.
Similar to the 32-bit element case in the commit log of
r15-1504, this 8-bit element pattern on LE doesn't actually
match what the underlying insn is intended to represent, so
once some optimization like combine makes changes based on
it, unexpected consequences follow.  The newly constructed
test case pr106069-1.c is a typical example of this issue.

So this patch fixes the wrong RTL pattern and ensures the
associated RTL patterns become the same as before, with
the same semantics as their mapped insns.  With the
proposed patch, expanders like altivec_vmrghb expand
into altivec_vmrghb_direct_be or altivec_vmrglb_direct_le
depending on endianness; "direct" easily shows which
insn would be generated, and _be and _le are mainly for the
different RTL patterns per endianness.
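
To make the "first halves in element order" semantics concrete, the
sketch below models a vec_select over a vec_concat with the index
list 0, 16, 1, 17, ..., 7, 23, which is the selection visible in the
altivec_vmrghb_direct_be pattern.  V16QI and merge_first_halves are
hypothetical names for this example, and the sketch deliberately does
not model how the result maps to vmrghb or vmrglb on each endianness.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a V16QI value: 16 byte elements in RTL
// element order.
using V16QI = std::array<std::uint8_t, 16>;

// Models (vec_select (vec_concat a b) [0 16 1 17 ... 7 23]):
// interleave the first eight elements of `a` and `b`.
inline V16QI merge_first_halves(const V16QI &a, const V16QI &b) {
  V16QI r{};
  for (std::size_t i = 0; i < 8; ++i) {
    r[2 * i] = a[i];      // element i of the 32-element concat
    r[2 * i + 1] = b[i];  // element 16 + i of the concat
  }
  return r;
}
```

The point of the fix is that this element-order meaning stays the
same on both endiannesses, and only the machine instruction chosen to
implement it differs.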

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghb_direct): Rename to ...
(altivec_vmrghb_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrghb_direct_le): New define_insn.
(altivec_vmrglb_direct): Rename to ...
(altivec_vmrglb_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrglb_direct_le): New define_insn.
(altivec_vmrghb): Adjust by calling gen_altivec_vmrghb_direct_be
for BE and gen_altivec_vmrglb_direct_le for LE.
(altivec_vmrglb): Adjust by calling gen_altivec_vmrglb_direct_be
for BE and gen_altivec_vmrghb_direct_le for LE.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghb_direct by
CODE_FOR_altivec_vmrghb_direct_be for BE and
CODE_FOR_altivec_vmrghb_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglb_direct by
CODE_FOR_altivec_vmrglb_direct_be for BE and
CODE_FOR_altivec_vmrglb_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-1.c: New test.

Diff:
---
 gcc/config/rs6000/altivec.md  | 66 +--
 gcc/config/rs6000/rs6000.cc   |  8 ++--
 gcc/testsuite/gcc.target/powerpc/pr106069-1.c | 39 
 3 files changed, 95 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index dcc71cc0f52..a0e8a35b843 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1152,15 +1152,16 @@
    (use (match_operand:V16QI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct
-                                                : gen_altivec_vmrglb_direct;
-  if (!BYTES_BIG_ENDIAN)
-    std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (
+      gen_altivec_vmrghb_direct_be (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (
+      gen_altivec_vmrglb_direct_le (operands[0], operands[2], operands[1]));
   DONE;
 })
 
-(define_insn "altivec_vmrghb_direct"
+(define_insn "altivec_vmrghb_direct_be"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
         (vec_select:V16QI
           (vec_concat:V32QI
@@ -1174,7 +1175,25 @@
 (const_int 5) (const_int 21)
 (const_int 6) (const_int 22)
 (const_int 7) (const_int 23)])))]
-  "TARGET_ALTIVEC"
+  "TARGET_ALTIVEC && BYTES_BIG_ENDIAN"
+  "vmrghb %0,%1,%2"
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrghb_direct_le"
+  [(set (match_operand:V16QI 0 "register_operand" "=v")
+   (vec_select:V16QI
+ (vec_concat:V32QI
+   (match_operand:V16QI 2 "register_operand" "v")
+   (match

[gcc r15-1645] rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

2024-06-26 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:812c70bf4981958488331d4ea5af8709b5321da1

commit r15-1645-g812c70bf4981958488331d4ea5af8709b5321da1
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low short, namely altivec_vmrg[hl]h.
These patterns mainly serve the built-in functions vec_merge{h,l}
and some internal gen function needs.  These functions must
take endianness into account.  Taking vec_mergeh as an example,
the PVIPR defines it as "Merges the first halves (in element
order) of two vectors", explicitly noting element order, so it
is mapped to vmrghh on BE and to vmrglh on LE.
Although the mapped insns differ, as discussed in PR106069 the
RTL pattern should still be the same on both.  That held
before commit r12-4496, which changed it into different
patterns on BE and LE.  Similar to the 32-bit element case in
the commit log of r15-1504, this 16-bit element pattern on LE
doesn't actually match what the underlying insn is intended to
represent, so once an optimization such as combine transforms
it, the result is unexpected.  The newly constructed test case
pr106069-2.c is a typical example of this issue for element
type short.

This patch fixes the wrong RTL pattern and ensures that the
associated RTL patterns become the same as before, so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghh expand into
altivec_vmrghh_direct_be or altivec_vmrglh_direct_le depending
on endianness.  "direct" shows which insn is generated, while
_be and _le distinguish the RTL patterns used for each
endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghh_direct): Rename to ...
(altivec_vmrghh_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrghh_direct_le): New define_insn.
(altivec_vmrglh_direct): Rename to ...
(altivec_vmrglh_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrglh_direct_le): New define_insn.
(altivec_vmrghh): Adjust by calling gen_altivec_vmrghh_direct_be
for BE and gen_altivec_vmrglh_direct_le for LE.
(altivec_vmrglh): Adjust by calling gen_altivec_vmrglh_direct_be
for BE and gen_altivec_vmrghh_direct_le for LE.
(vec_widen_umult_hi_v16qi): Adjust the call to
gen_altivec_vmrghh_direct by gen_altivec_vmrghh for BE
and by gen_altivec_vmrglh for LE.
(vec_widen_smult_hi_v16qi): Likewise.
(vec_widen_umult_lo_v16qi): Adjust the call to
gen_altivec_vmrglh_direct by gen_altivec_vmrglh for BE
and by gen_altivec_vmrghh for LE.
(vec_widen_smult_lo_v16qi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghh_direct by
CODE_FOR_altivec_vmrghh_direct_be for BE and
CODE_FOR_altivec_vmrghh_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglh_direct by
CODE_FOR_altivec_vmrglh_direct_be for BE and
CODE_FOR_altivec_vmrglh_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-2.c: New test.

Diff:
---
 gcc/config/rs6000/altivec.md  | 76 +++
 gcc/config/rs6000/rs6000.cc   |  8 +--
 gcc/testsuite/gcc.target/powerpc/pr106069-2.c | 37 +
 3 files changed, 94 insertions(+), 27 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index a0e8a35b843..5af9bf920a2 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1203,17 +1203,18 @@
    (use (match_operand:V8HI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghh_direct
-                                                : gen_altivec_vmrglh_direct;
-  if (!BYTES_BIG_ENDIAN)
-    std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (
+      gen_altivec_vmrghh_direct_be (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (
+      gen_altivec_vmrglh_direct_le (operands[0], operands[2], operands[1]));
   DONE;
 })
 
-(define_insn "altivec_vmrghh_direct"
+(define_insn "altivec_vmrghh_direct_be"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
-(vec_select:V8HI
+   (vec_select:V8HI
  (vec_concat:V16HI
(match_operand:V8HI 

[gcc r12-10587] rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

2024-06-27 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:96ef3367067219c8e3eb88c0474a1090cc7749b4

commit r12-10587-g96ef3367067219c8e3eb88c0474a1090cc7749b4
Author: Kewen Lin 
Date:   Thu Jun 20 20:23:56 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low word, namely altivec_vmrg[hl]w and
vsx_xxmrg[hl]w_<mode>.  These patterns mainly serve the
built-in functions vec_merge{h,l}, __builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si and some internal gen function
needs.  These functions must take endianness into account.
Taking vec_mergeh as an example, the PVIPR defines it as
"Merges the first halves (in element order) of two vectors",
explicitly noting element order, so it is mapped to vmrghw on
BE and to vmrglw on LE.  Although the mapped insns differ, as
discussed in PR106069 the RTL pattern should still be the
same on both.  That held before commit r12-4496, when
define_expand altivec_vmrghw expanded into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
     (parallel [(const_int 0) (const_int 4)
                (const_int 1) (const_int 5)])))]

on both BE and LE.  But commit r12-4496 changed it to
expand into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
     (parallel [(const_int 0) (const_int 4)
                (const_int 1) (const_int 5)])))]

on BE, and

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
     (parallel [(const_int 2) (const_int 6)
                (const_int 3) (const_int 7)])))]

on LE.  Although the mapped insns are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern on LE is completely
wrong and inconsistent with the mapped insn.  As long as
optimization passes leave this pattern alone, the generated
code is still fine even though the pattern doesn't represent
its mapped insn, which is why simple testing on the bif
doesn't expose this issue.  But once an optimization pass
such as combine transforms this wrong pattern, the result is
unexpected, because the pattern doesn't match the semantics
that the expanded insn is intended to represent.
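
To make the index lists above concrete, here is a minimal C++ model of
vec_select over vec_concat for four-element vectors.  It is only a
sketch: the helper names are hypothetical, and it models the RTL
element numbering rather than any particular insn.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using V4 = std::array<int, 4>;
using V8 = std::array<int, 8>;
using Idx4 = std::array<std::size_t, 4>;

// (vec_concat a b): the elements of a followed by the elements of b.
inline V8 vec_concat (const V4 &a, const V4 &b)
{
  V8 r{};
  for (std::size_t i = 0; i < 4; ++i)
    {
      r[i] = a[i];
      r[i + 4] = b[i];
    }
  return r;
}

// (vec_select v (parallel [idx...])): result[i] = v[idx[i]].
inline V4 vec_select (const V8 &v, const Idx4 &idx)
{
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = v[idx[i]];
  return r;
}

// The selection [0 4 1 5] quoted in the commit message.
inline V4 select_0415 (const V4 &a, const V4 &b)
{
  const Idx4 idx = {{0, 4, 1, 5}};
  return vec_select (vec_concat (a, b), idx);
}

// The selection [2 6 3 7] quoted in the commit message.
inline V4 select_2637 (const V4 &a, const V4 &b)
{
  const Idx4 idx = {{2, 6, 3, 7}};
  return vec_select (vec_concat (a, b), idx);
}
```

With a = {10, 11, 12, 13} and b = {20, 21, 22, 23}, the [0 4 1 5]
selection gives {10, 20, 11, 21}, the first halves merged in element
order, while [2 6 3 7] gives {12, 22, 13, 23}, the second halves.  So
as RTL the latter does not describe vec_mergeh, whichever insn it is
mapped to.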

This patch fixes the wrong RTL pattern and ensures that the
associated RTL patterns become the same as before, so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghw expand into
altivec_vmrghw_direct_v4si_be or altivec_vmrglw_direct_v4si_le
depending on endianness.  "direct" shows which insn is
generated, while _be and _le distinguish the RTL patterns used
for each endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghw_direct_<mode>): Rename
to ...
(altivec_vmrghw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrghw_direct_<mode>_le): New define_insn.
(altivec_vmrglw_direct_<mode>): Rename to ...
(altivec_vmrglw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrglw_direct_<mode>_le): New define_insn.
(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
(vec_widen_umult_hi_v8hi): Adjust the call to
gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
and by gen_altivec_vmrglw for LE.
(vec_widen_smult_hi_v8hi): Likewise.
(vec_widen_umult_lo_v8hi): Adjust the call to
gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
and by gen_altivec_vmrghw for LE.
(vec_widen_smult_lo_v8hi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghw_direct_v4si by
CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrghw_direct_v4si_le for LE.  And replace
CODE_FOR_altivec_vmrglw_direct_v4si by
CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.
* config/rs6000/vsx.md (vsx_xxmrghw_<mode>): Adjust by calling
gen_altivec_vmrghw_d

[gcc r13-8876] rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

2024-06-27 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:361bfcec901ca882130e338aebaa2ebc6ea2dc3b

commit r13-8876-g361bfcec901ca882130e338aebaa2ebc6ea2dc3b
Author: Kewen Lin 
Date:   Thu Jun 20 20:23:56 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low word, namely altivec_vmrg[hl]w and
vsx_xxmrg[hl]w_<mode>.  These patterns mainly serve the
built-in functions vec_merge{h,l}, __builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si and some internal gen function
needs.  These functions must take endianness into account.
Taking vec_mergeh as an example, the PVIPR defines it as
"Merges the first halves (in element order) of two vectors",
explicitly noting element order, so it is mapped to vmrghw on
BE and to vmrglw on LE.  Although the mapped insns differ, as
discussed in PR106069 the RTL pattern should still be the
same on both.  That held before commit r12-4496, when
define_expand altivec_vmrghw expanded into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
     (parallel [(const_int 0) (const_int 4)
                (const_int 1) (const_int 5)])))]

on both BE and LE.  But commit r12-4496 changed it to
expand into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
     (parallel [(const_int 0) (const_int 4)
                (const_int 1) (const_int 5)])))]

on BE, and

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
     (parallel [(const_int 2) (const_int 6)
                (const_int 3) (const_int 7)])))]

on LE.  Although the mapped insns are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern on LE is completely
wrong and inconsistent with the mapped insn.  As long as
optimization passes leave this pattern alone, the generated
code is still fine even though the pattern doesn't represent
its mapped insn, which is why simple testing on the bif
doesn't expose this issue.  But once an optimization pass
such as combine transforms this wrong pattern, the result is
unexpected, because the pattern doesn't match the semantics
that the expanded insn is intended to represent.

This patch fixes the wrong RTL pattern and ensures that the
associated RTL patterns become the same as before, so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghw expand into
altivec_vmrghw_direct_v4si_be or altivec_vmrglw_direct_v4si_le
depending on endianness.  "direct" shows which insn is
generated, while _be and _le distinguish the RTL patterns used
for each endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghw_direct_<mode>): Rename
to ...
(altivec_vmrghw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrghw_direct_<mode>_le): New define_insn.
(altivec_vmrglw_direct_<mode>): Rename to ...
(altivec_vmrglw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrglw_direct_<mode>_le): New define_insn.
(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
(vec_widen_umult_hi_v8hi): Adjust the call to
gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
and by gen_altivec_vmrglw for LE.
(vec_widen_smult_hi_v8hi): Likewise.
(vec_widen_umult_lo_v8hi): Adjust the call to
gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
and by gen_altivec_vmrghw for LE.
(vec_widen_smult_lo_v8hi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghw_direct_v4si by
CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrghw_direct_v4si_le for LE.  And replace
CODE_FOR_altivec_vmrglw_direct_v4si by
CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.
* config/rs6000/vsx.md (vsx_xxmrghw_<mode>): Adjust by calling
gen_altivec_vmrghw_di

[gcc r14-10355] rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

2024-06-27 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:ef8b60dd48faeaf2b4e28c35401fa10d2a3e53fb

commit r14-10355-gef8b60dd48faeaf2b4e28c35401fa10d2a3e53fb
Author: Kewen Lin 
Date:   Thu Jun 20 20:23:56 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low word, namely altivec_vmrg[hl]w and
vsx_xxmrg[hl]w_<mode>.  These patterns mainly serve the
built-in functions vec_merge{h,l}, __builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si and some internal gen function
needs.  These functions must take endianness into account.
Taking vec_mergeh as an example, the PVIPR defines it as
"Merges the first halves (in element order) of two vectors",
explicitly noting element order, so it is mapped to vmrghw on
BE and to vmrglw on LE.  Although the mapped insns differ, as
discussed in PR106069 the RTL pattern should still be the
same on both.  That held before commit r12-4496, when
define_expand altivec_vmrghw expanded into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
     (parallel [(const_int 0) (const_int 4)
                (const_int 1) (const_int 5)])))]

on both BE and LE.  But commit r12-4496 changed it to
expand into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
     (parallel [(const_int 0) (const_int 4)
                (const_int 1) (const_int 5)])))]

on BE, and

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
     (parallel [(const_int 2) (const_int 6)
                (const_int 3) (const_int 7)])))]

on LE.  Although the mapped insns are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern on LE is completely
wrong and inconsistent with the mapped insn.  As long as
optimization passes leave this pattern alone, the generated
code is still fine even though the pattern doesn't represent
its mapped insn, which is why simple testing on the bif
doesn't expose this issue.  But once an optimization pass
such as combine transforms this wrong pattern, the result is
unexpected, because the pattern doesn't match the semantics
that the expanded insn is intended to represent.

This patch fixes the wrong RTL pattern and ensures that the
associated RTL patterns become the same as before, so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghw expand into
altivec_vmrghw_direct_v4si_be or altivec_vmrglw_direct_v4si_le
depending on endianness.  "direct" shows which insn is
generated, while _be and _le distinguish the RTL patterns used
for each endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghw_direct_<mode>): Rename
to ...
(altivec_vmrghw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrghw_direct_<mode>_le): New define_insn.
(altivec_vmrglw_direct_<mode>): Rename to ...
(altivec_vmrglw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrglw_direct_<mode>_le): New define_insn.
(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
(vec_widen_umult_hi_v8hi): Adjust the call to
gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
and by gen_altivec_vmrglw for LE.
(vec_widen_smult_hi_v8hi): Likewise.
(vec_widen_umult_lo_v8hi): Adjust the call to
gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
and by gen_altivec_vmrghw for LE.
(vec_widen_smult_lo_v8hi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghw_direct_v4si by
CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrghw_direct_v4si_le for LE.  And replace
CODE_FOR_altivec_vmrglw_direct_v4si by
CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.
* config/rs6000/vsx.md (vsx_xxmrghw_<mode>): Adjust by calling
gen_altivec_vmrghw_d

[gcc r15-1763] isel: Fold more in gimple_expand_vec_cond_expr [PR115659]

2024-07-02 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:56670281c6db19d75c7b63e38971ab84681b245c

commit r15-1763-g56670281c6db19d75c7b63e38971ab84681b245c
Author: Kewen Lin 
Date:   Tue Jul 2 02:13:35 2024 -0500

isel: Fold more in gimple_expand_vec_cond_expr [PR115659]

As PR115659 shows, assuming c = x CMP y, there are some
folding opportunities for the patterns r = c ? -1 : z and
r = c ? z : 0.

r = c ? -1 : z can be folded into:
  - r = c | z (if ior_optab is supported)
  - or r = c ? c : z

while r = c ? z : 0 can be folded into:
  - r = c & z (if and_optab is supported)
  - or r = c ? z : c

This patch teaches ISEL to handle them and also removes the
redundant gsi_replace, since the caller of
gimple_expand_vec_cond_expr will handle it.
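
The two foldings rely on each lane of c being either all ones (-1) or
all zeros.  A minimal lane-wise C++ sketch of that reasoning follows;
the helper names are hypothetical and a fixed four-lane int32 vector
is assumed purely for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::int32_t, 4>;

// c = x < y per lane: -1 (all ones) when true, 0 when false.
inline Lanes vec_cmp_lt (const Lanes &x, const Lanes &y)
{
  Lanes c{};
  for (std::size_t i = 0; i < c.size (); ++i)
    c[i] = x[i] < y[i] ? -1 : 0;
  return c;
}

// r = c ? a : b per lane.
inline Lanes vec_cond (const Lanes &c, const Lanes &a, const Lanes &b)
{
  Lanes r{};
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = c[i] ? a[i] : b[i];
  return r;
}

// r = a | b per lane.
inline Lanes vec_ior (const Lanes &a, const Lanes &b)
{
  Lanes r{};
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = a[i] | b[i];
  return r;
}

// r = a & b per lane.
inline Lanes vec_and (const Lanes &a, const Lanes &b)
{
  Lanes r{};
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = a[i] & b[i];
  return r;
}

// Every lane set to v.
inline Lanes splat (std::int32_t v)
{
  Lanes r{};
  r.fill (v);
  return r;
}
```

With c = vec_cmp_lt (x, y), vec_cond (c, splat (-1), z) matches both
vec_ior (c, z) and vec_cond (c, c, z), and vec_cond (c, z, splat (0))
matches both vec_and (c, z) and vec_cond (c, z, c), for any z.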

PR tree-optimization/115659

gcc/ChangeLog:

* gimple-isel.cc (gimple_expand_vec_cond_expr): Add more foldings for
patterns x CMP y ? -1 : z and x CMP y ? z : 0.

Diff:
---
 gcc/gimple-isel.cc | 48 +---
 1 file changed, 41 insertions(+), 7 deletions(-)

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 54c1801038b..60719eafc65 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -240,16 +240,50 @@ gimple_expand_vec_cond_expr (struct function *fun, gimple_stmt_iterator *gsi,
           can_compute_op0 = expand_vec_cmp_expr_p (op0a_type, op0_type,
                                                    tcode);
 
-          /* Try to fold x CMP y ? -1 : 0 to x CMP y.  */
           if (can_compute_op0
-              && integer_minus_onep (op1)
-              && integer_zerop (op2)
               && TYPE_MODE (TREE_TYPE (lhs)) == TYPE_MODE (TREE_TYPE (op0)))
             {
-              tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), op0);
-              gassign *new_stmt = gimple_build_assign (lhs, conv_op);
-              gsi_replace (gsi, new_stmt, true);
-              return new_stmt;
+              /* Assuming c = x CMP y.  */
+              bool op1_minus_onep = integer_minus_onep (op1);
+              bool op2_zerop = integer_zerop (op2);
+              tree vtype = TREE_TYPE (lhs);
+              machine_mode vmode = TYPE_MODE (vtype);
+              /* Try to fold r = c ? -1 : 0 to r = c.  */
+              if (op1_minus_onep && op2_zerop)
+                {
+                  tree conv_op = build1 (VIEW_CONVERT_EXPR, vtype, op0);
+                  return gimple_build_assign (lhs, conv_op);
+                }
+              /* Try to fold r = c ? -1 : z to r = c | z, or
+                 r = c ? c : z.  */
+              if (op1_minus_onep)
+                {
+                  tree conv_op = build1 (VIEW_CONVERT_EXPR, vtype, op0);
+                  tree new_op1 = make_ssa_name (vtype);
+                  gassign *new_stmt = gimple_build_assign (new_op1, conv_op);
+                  gsi_insert_seq_before (gsi, new_stmt, GSI_SAME_STMT);
+                  if (optab_handler (ior_optab, vmode) != CODE_FOR_nothing)
+                    /* r = c | z */
+                    return gimple_build_assign (lhs, BIT_IOR_EXPR, new_op1,
+                                                op2);
+                  /* r = c ? c : z */
+                  op1 = new_op1;
+                }
+              /* Try to fold r = c ? z : 0 to r = c & z, or
+                 r = c ? z : c.  */
+              else if (op2_zerop)
+                {
+                  tree conv_op = build1 (VIEW_CONVERT_EXPR, vtype, op0);
+                  tree new_op2 = make_ssa_name (vtype);
+                  gassign *new_stmt = gimple_build_assign (new_op2, conv_op);
+                  gsi_insert_seq_before (gsi, new_stmt, GSI_SAME_STMT);
+                  if (optab_handler (and_optab, vmode) != CODE_FOR_nothing)
+                    /* r = c & z */
+                    return gimple_build_assign (lhs, BIT_AND_EXPR, new_op2,
+                                                op1);
+                  /* r = c ? z : c */
+                  op2 = new_op2;
+                }
             }
 
           /* When the compare has EH we do not want to forward it when


[gcc r15-1766] sparc: define SPARC_LONG_DOUBLE_TYPE_SIZE for vxworks [PR115739]

2024-07-02 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:39e679e25deca32e73870f7f7a9c4f2c108d4a5e

commit r15-1766-g39e679e25deca32e73870f7f7a9c4f2c108d4a5e
Author: Kewen Lin 
Date:   Tue Jul 2 03:58:06 2024 -0500

sparc: define SPARC_LONG_DOUBLE_TYPE_SIZE for vxworks [PR115739]

Commit r15-1594 removed the define of LONG_DOUBLE_TYPE_SIZE in
sparc.cc, based on the assumption that each OS has its own
define (see the comments in sparc.h), but this exposes an
issue on vxworks, which lacks the define.

We could bring back the default SPARC_LONG_DOUBLE_TYPE_SIZE in
sparc.cc, but according to the comments in sparc.h, I think
it's better to define this in vxworks.h.  I also went through
all the sparc supported triples, and vxworks is the only one
that misses this define.

PR target/115739

gcc/ChangeLog:

* config/sparc/vxworks.h (SPARC_LONG_DOUBLE_TYPE_SIZE): New define.

Diff:
---
 gcc/config/sparc/vxworks.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/sparc/vxworks.h b/gcc/config/sparc/vxworks.h
index c1a9310fb3f..4cdb3b1685d 100644
--- a/gcc/config/sparc/vxworks.h
+++ b/gcc/config/sparc/vxworks.h
@@ -62,3 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 /* This platform supports the probing method of stack checking (RTP mode).
8K is reserved in the stack to propagate exceptions in case of overflow.  */
 #define STACK_CHECK_PROTECT 8192
+
+/* SPARC_LONG_DOUBLE_TYPE_SIZE should be defined per OS.  */
+#undef SPARC_LONG_DOUBLE_TYPE_SIZE
+#define SPARC_LONG_DOUBLE_TYPE_SIZE (BITS_PER_WORD * 2)


[gcc r14-10371] rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

2024-07-02 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:0e495e8e3fde11e430a77db6b477319ed0ae0b7c

commit r14-10371-g0e495e8e3fde11e430a77db6b477319ed0ae0b7c
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low char, namely altivec_vmrg[hl]b.
These patterns mainly serve the built-in functions vec_merge{h,l}
and some internal gen function needs.  These functions must
take endianness into account.  Taking vec_mergeh as an example,
the PVIPR defines it as "Merges the first halves (in element
order) of two vectors", explicitly noting element order, so it
is mapped to vmrghb on BE and to vmrglb on LE.
Although the mapped insns differ, as discussed in PR106069 the
RTL pattern should still be the same on both.  That held
before commit r12-4496, which changed it into different
patterns on BE and LE.  Similar to the 32-bit element case in
the commit log of r15-1504, this 8-bit element pattern on LE
doesn't actually match what the underlying insn is intended to
represent, so once an optimization such as combine transforms
it, the result is unexpected.  The newly constructed test case
pr106069-1.c is a typical example of this issue.

This patch fixes the wrong RTL pattern and ensures that the
associated RTL patterns become the same as before, so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghb expand into
altivec_vmrghb_direct_be or altivec_vmrglb_direct_le depending
on endianness.  "direct" shows which insn is generated, while
_be and _le distinguish the RTL patterns used for each
endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghb_direct): Rename to ...
(altivec_vmrghb_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrghb_direct_le): New define_insn.
(altivec_vmrglb_direct): Rename to ...
(altivec_vmrglb_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrglb_direct_le): New define_insn.
(altivec_vmrghb): Adjust by calling gen_altivec_vmrghb_direct_be
for BE and gen_altivec_vmrglb_direct_le for LE.
(altivec_vmrglb): Adjust by calling gen_altivec_vmrglb_direct_be
for BE and gen_altivec_vmrghb_direct_le for LE.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghb_direct by
CODE_FOR_altivec_vmrghb_direct_be for BE and
CODE_FOR_altivec_vmrghb_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglb_direct by
CODE_FOR_altivec_vmrglb_direct_be for BE and
CODE_FOR_altivec_vmrglb_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-1.c: New test.

(cherry picked from commit 62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22)

Diff:
---
 gcc/config/rs6000/altivec.md  | 66 +--
 gcc/config/rs6000/rs6000.cc   |  8 ++--
 gcc/testsuite/gcc.target/powerpc/pr106069-1.c | 39 
 3 files changed, 95 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index dcc71cc0f52..a0e8a35b843 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1152,15 +1152,16 @@
    (use (match_operand:V16QI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct
-                                                : gen_altivec_vmrglb_direct;
-  if (!BYTES_BIG_ENDIAN)
-    std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (
+      gen_altivec_vmrghb_direct_be (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (
+      gen_altivec_vmrglb_direct_le (operands[0], operands[2], operands[1]));
   DONE;
 })
 
-(define_insn "altivec_vmrghb_direct"
+(define_insn "altivec_vmrghb_direct_be"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
         (vec_select:V16QI
           (vec_concat:V32QI
@@ -1174,7 +1175,25 @@
 (const_int 5) (const_int 21)
 (const_int 6) (const_int 22)
 (const_int 7) (const_int 23)])))]
-  "TARGET_ALTIVEC"
+  "TARGET_ALTIVEC && BYTES_BIG_ENDIAN"
+  "vmrghb %0,%1,%2"
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrghb_direct_le"
+  [(set (match_operand:V16QI 0 "register_operand" "=v")
+   (vec_select:V16QI
+ (vec_concat:V32Q

[gcc r14-10372] rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

2024-07-02 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:052f78d010d224c7289f1cf6eec784ac4eeed351

commit r14-10372-g052f78d010d224c7289f1cf6eec784ac4eeed351
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low short, namely altivec_vmrg[hl]h.
These patterns mainly serve the built-in functions vec_merge{h,l}
and some internal gen function needs.  These functions must
take endianness into account.  Taking vec_mergeh as an example,
the PVIPR defines it as "Merges the first halves (in element
order) of two vectors", explicitly noting element order, so it
is mapped to vmrghh on BE and to vmrglh on LE.
Although the mapped insns differ, as discussed in PR106069 the
RTL pattern should still be the same on both.  That held
before commit r12-4496, which changed it into different
patterns on BE and LE.  Similar to the 32-bit element case in
the commit log of r15-1504, this 16-bit element pattern on LE
doesn't actually match what the underlying insn is intended to
represent, so once an optimization such as combine transforms
it, the result is unexpected.  The newly constructed test case
pr106069-2.c is a typical example of this issue for element
type short.

This patch fixes the wrong RTL pattern and ensures that the
associated RTL patterns become the same as before, so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghh expand into
altivec_vmrghh_direct_be or altivec_vmrglh_direct_le depending
on endianness.  "direct" shows which insn is generated, while
_be and _le distinguish the RTL patterns used for each
endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghh_direct): Rename to ...
(altivec_vmrghh_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrghh_direct_le): New define_insn.
(altivec_vmrglh_direct): Rename to ...
(altivec_vmrglh_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrglh_direct_le): New define_insn.
(altivec_vmrghh): Adjust by calling gen_altivec_vmrghh_direct_be
for BE and gen_altivec_vmrglh_direct_le for LE.
(altivec_vmrglh): Adjust by calling gen_altivec_vmrglh_direct_be
for BE and gen_altivec_vmrghh_direct_le for LE.
(vec_widen_umult_hi_v16qi): Adjust the call to
gen_altivec_vmrghh_direct by gen_altivec_vmrghh for BE
and by gen_altivec_vmrglh for LE.
(vec_widen_smult_hi_v16qi): Likewise.
(vec_widen_umult_lo_v16qi): Adjust the call to
gen_altivec_vmrglh_direct by gen_altivec_vmrglh for BE
and by gen_altivec_vmrghh for LE.
(vec_widen_smult_lo_v16qi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghh_direct by
CODE_FOR_altivec_vmrghh_direct_be for BE and
CODE_FOR_altivec_vmrghh_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglh_direct by
CODE_FOR_altivec_vmrglh_direct_be for BE and
CODE_FOR_altivec_vmrglh_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-2.c: New test.

(cherry picked from commit 812c70bf4981958488331d4ea5af8709b5321da1)

Diff:
---
 gcc/config/rs6000/altivec.md  | 76 +++
 gcc/config/rs6000/rs6000.cc   |  8 +--
 gcc/testsuite/gcc.target/powerpc/pr106069-2.c | 37 +
 3 files changed, 94 insertions(+), 27 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index a0e8a35b843..5af9bf920a2 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1203,17 +1203,18 @@
(use (match_operand:V8HI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghh_direct
-   : gen_altivec_vmrglh_direct;
-  if (!BYTES_BIG_ENDIAN)
-std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
+  if (BYTES_BIG_ENDIAN)
+emit_insn (
+  gen_altivec_vmrghh_direct_be (operands[0], operands[1], operands[2]));
+  else
+emit_insn (
+  gen_altivec_vmrglh_direct_le (operands[0], operands[2], operands[1]));
   DONE;
 })
 
-(define_insn "altivec_vmrghh_direct"
+(define_insn "altivec_vmrghh_direct_be"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
-(vec_select:V8HI
+ 

[gcc r13-8885] rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

2024-07-02 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:ffdd377fc07cdc7b62669d354e23f30940eaaffe

commit r13-8885-gffdd377fc07cdc7b62669d354e23f30940eaaffe
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low char, namely altivec_vmrg[hl]b.
These patterns mainly serve the built-in functions
vec_merge{h,l} and some internal gen function uses.  These
functions must consider endianness.  Taking vec_mergeh as an
example, PVIPR defines that vec_mergeh "Merges the first halves
(in element order) of two vectors", explicitly in element
order, so it is mapped to vmrghb on BE and to vmrglb on LE.
Although the mapped insns differ, as discussed in PR106069,
the RTL pattern should still be the same.  That held before
commit r12-4496, which changed it into different patterns on
BE and LE.  Similar to the 32-bit element case described in
the commit log of r15-1504, this 8-bit element pattern on LE
doesn't actually match what the underlying insn is intended
to represent, so once an optimization such as combine
transforms the RTL based on it, unexpected results can follow.
The newly constructed test case pr106069-1.c is a typical
example of this issue.

So this patch fixes the wrong RTL pattern and ensures the
associated RTL patterns are again the same as before, so that
they have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghb expand into
altivec_vmrghb_direct_be or altivec_vmrglb_direct_le
depending on endianness.  "direct" shows which insn is
generated, while _be and _le distinguish the RTL patterns,
which differ by endianness.
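
As an illustration only (not part of the commit), the
vec_select-over-vec_concat form used by the merge-high byte pattern
can be modelled in plain C.  The full selector 0 16 1 17 ... 7 23 is
an assumption extrapolated from the tail (5 21 6 22 7 23) visible in
the altivec_vmrghb_direct_be pattern in the diff below; the helper
names select_from_concat and merge_high_bytes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of (vec_select:V16QI (vec_concat:V32QI a b) (parallel [sel...])):
   concatenate the two inputs, then pick 16 elements by index.  */
static void
select_from_concat (const uint8_t a[16], const uint8_t b[16],
                    const uint8_t sel[16], uint8_t out[16])
{
  uint8_t cat[32];
  for (size_t i = 0; i < 16; i++)
    {
      cat[i] = a[i];
      cat[16 + i] = b[i];
    }
  for (size_t i = 0; i < 16; i++)
    {
      assert (sel[i] < 32);     /* Selector must index the 32-element concat.  */
      out[i] = cat[sel[i]];
    }
}

/* Assumed merge-high selector: 0 16 1 17 ... 7 23, i.e. interleave the
   first eight elements of A with the first eight elements of B.  */
static void
merge_high_bytes (const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
  uint8_t sel[16];
  for (size_t i = 0; i < 8; i++)
    {
      sel[2 * i] = (uint8_t) i;
      sel[2 * i + 1] = (uint8_t) (16 + i);
    }
  select_from_concat (a, b, sel, out);
}
```

The sketch only models the element numbering of the RTL pattern; it
says nothing about which hardware insn implements it on BE or LE.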

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghb_direct): Rename to ...
(altivec_vmrghb_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrghb_direct_le): New define_insn.
(altivec_vmrglb_direct): Rename to ...
(altivec_vmrglb_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrglb_direct_le): New define_insn.
(altivec_vmrghb): Adjust by calling gen_altivec_vmrghb_direct_be
for BE and gen_altivec_vmrglb_direct_le for LE.
(altivec_vmrglb): Adjust by calling gen_altivec_vmrglb_direct_be
for BE and gen_altivec_vmrghb_direct_le for LE.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghb_direct by
CODE_FOR_altivec_vmrghb_direct_be for BE and
CODE_FOR_altivec_vmrghb_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglb_direct by
CODE_FOR_altivec_vmrglb_direct_be for BE and
CODE_FOR_altivec_vmrglb_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-1.c: New test.

(cherry picked from commit 62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22)

Diff:
---
 gcc/config/rs6000/altivec.md  | 66 +--
 gcc/config/rs6000/rs6000.cc   |  8 ++--
 gcc/testsuite/gcc.target/powerpc/pr106069-1.c | 39 
 3 files changed, 95 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 92e2e4a4090..47664204bc5 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1152,15 +1152,16 @@
(use (match_operand:V16QI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct
-   : gen_altivec_vmrglb_direct;
-  if (!BYTES_BIG_ENDIAN)
-std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
+  if (BYTES_BIG_ENDIAN)
+emit_insn (
+  gen_altivec_vmrghb_direct_be (operands[0], operands[1], operands[2]));
+  else
+emit_insn (
+  gen_altivec_vmrglb_direct_le (operands[0], operands[2], operands[1]));
   DONE;
 })
 
-(define_insn "altivec_vmrghb_direct"
+(define_insn "altivec_vmrghb_direct_be"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
(vec_select:V16QI
  (vec_concat:V32QI
@@ -1174,7 +1175,25 @@
 (const_int 5) (const_int 21)
 (const_int 6) (const_int 22)
 (const_int 7) (const_int 23)])))]
-  "TARGET_ALTIVEC"
+  "TARGET_ALTIVEC && BYTES_BIG_ENDIAN"
+  "vmrghb %0,%1,%2"
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrghb_direct_le"
+  [(set (match_operand:V16QI 0 "register_operand" "=v")
+   (vec_select:V16QI
+ (vec_concat:V32QI

[gcc r13-8886] rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

2024-07-02 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:bab38d9271ce3f26cb64b8cb712351eb3fedd559

commit r13-8886-gbab38d9271ce3f26cb64b8cb712351eb3fedd559
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low short, namely altivec_vmrg[hl]h.
These patterns mainly serve the built-in functions
vec_merge{h,l} and some internal gen function uses.  These
functions must consider endianness.  Taking vec_mergeh as an
example, PVIPR defines that vec_mergeh "Merges the first halves
(in element order) of two vectors", explicitly in element
order, so it is mapped to vmrghh on BE and to vmrglh on LE.
Although the mapped insns differ, as discussed in PR106069,
the RTL pattern should still be the same.  That held before
commit r12-4496, which changed it into different patterns on
BE and LE.  Similar to the 32-bit element case described in
the commit log of r15-1504, this 16-bit element pattern on LE
doesn't actually match what the underlying insn is intended
to represent, so once an optimization such as combine
transforms the RTL based on it, unexpected results can follow.
The newly constructed test case pr106069-2.c is a typical
example of this issue for element type short.

So this patch fixes the wrong RTL pattern and ensures the
associated RTL patterns are again the same as before, so that
they have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghh expand into
altivec_vmrghh_direct_be or altivec_vmrglh_direct_le
depending on endianness.  "direct" shows which insn is
generated, while _be and _le distinguish the RTL patterns,
which differ by endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghh_direct): Rename to ...
(altivec_vmrghh_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrghh_direct_le): New define_insn.
(altivec_vmrglh_direct): Rename to ...
(altivec_vmrglh_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrglh_direct_le): New define_insn.
(altivec_vmrghh): Adjust by calling gen_altivec_vmrghh_direct_be
for BE and gen_altivec_vmrglh_direct_le for LE.
(altivec_vmrglh): Adjust by calling gen_altivec_vmrglh_direct_be
for BE and gen_altivec_vmrghh_direct_le for LE.
(vec_widen_umult_hi_v16qi): Adjust the call to
gen_altivec_vmrghh_direct by gen_altivec_vmrghh for BE
and by gen_altivec_vmrglh for LE.
(vec_widen_smult_hi_v16qi): Likewise.
(vec_widen_umult_lo_v16qi): Adjust the call to
gen_altivec_vmrglh_direct by gen_altivec_vmrglh for BE
and by gen_altivec_vmrghh for LE.
(vec_widen_smult_lo_v16qi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghh_direct by
CODE_FOR_altivec_vmrghh_direct_be for BE and
CODE_FOR_altivec_vmrghh_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglh_direct by
CODE_FOR_altivec_vmrglh_direct_be for BE and
CODE_FOR_altivec_vmrglh_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-2.c: New test.

(cherry picked from commit 812c70bf4981958488331d4ea5af8709b5321da1)

Diff:
---
 gcc/config/rs6000/altivec.md  | 76 +++
 gcc/config/rs6000/rs6000.cc   |  8 +--
 gcc/testsuite/gcc.target/powerpc/pr106069-2.c | 37 +
 3 files changed, 94 insertions(+), 27 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 47664204bc5..6557393a97c 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1203,17 +1203,18 @@
(use (match_operand:V8HI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghh_direct
-   : gen_altivec_vmrglh_direct;
-  if (!BYTES_BIG_ENDIAN)
-std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
+  if (BYTES_BIG_ENDIAN)
+emit_insn (
+  gen_altivec_vmrghh_direct_be (operands[0], operands[1], operands[2]));
+  else
+emit_insn (
+  gen_altivec_vmrglh_direct_le (operands[0], operands[2], operands[1]));
   DONE;
 })
 
-(define_insn "altivec_vmrghh_direct"
+(define_insn "altivec_vmrghh_direct_be"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
-(vec_select:V8HI
+  

[gcc r12-10594] rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

2024-07-02 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:13f0528c782c3732052973a5d340769af8182c8f

commit r12-10594-g13f0528c782c3732052973a5d340769af8182c8f
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low char on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low char, namely altivec_vmrg[hl]b.
These patterns mainly serve the built-in functions
vec_merge{h,l} and some internal gen function uses.  These
functions must consider endianness.  Taking vec_mergeh as an
example, PVIPR defines that vec_mergeh "Merges the first halves
(in element order) of two vectors", explicitly in element
order, so it is mapped to vmrghb on BE and to vmrglb on LE.
Although the mapped insns differ, as discussed in PR106069,
the RTL pattern should still be the same.  That held before
commit r12-4496, which changed it into different patterns on
BE and LE.  Similar to the 32-bit element case described in
the commit log of r15-1504, this 8-bit element pattern on LE
doesn't actually match what the underlying insn is intended
to represent, so once an optimization such as combine
transforms the RTL based on it, unexpected results can follow.
The newly constructed test case pr106069-1.c is a typical
example of this issue.

So this patch fixes the wrong RTL pattern and ensures the
associated RTL patterns are again the same as before, so that
they have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghb expand into
altivec_vmrghb_direct_be or altivec_vmrglb_direct_le
depending on endianness.  "direct" shows which insn is
generated, while _be and _le distinguish the RTL patterns,
which differ by endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghb_direct): Rename to ...
(altivec_vmrghb_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrghb_direct_le): New define_insn.
(altivec_vmrglb_direct): Rename to ...
(altivec_vmrglb_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrglb_direct_le): New define_insn.
(altivec_vmrghb): Adjust by calling gen_altivec_vmrghb_direct_be
for BE and gen_altivec_vmrglb_direct_le for LE.
(altivec_vmrglb): Adjust by calling gen_altivec_vmrglb_direct_be
for BE and gen_altivec_vmrghb_direct_le for LE.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghb_direct by
CODE_FOR_altivec_vmrghb_direct_be for BE and
CODE_FOR_altivec_vmrghb_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglb_direct by
CODE_FOR_altivec_vmrglb_direct_be for BE and
CODE_FOR_altivec_vmrglb_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-1.c: New test.

(cherry picked from commit 62520e4e9f7e2fe8a16ee57a4bd35da2e921ae22)

Diff:
---
 gcc/config/rs6000/altivec.md  | 66 +--
 gcc/config/rs6000/rs6000.cc   |  8 ++--
 gcc/testsuite/gcc.target/powerpc/pr106069-1.c | 39 
 3 files changed, 95 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 0c408a9e839..b8baae679c4 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1152,15 +1152,16 @@
(use (match_operand:V16QI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct
-   : gen_altivec_vmrglb_direct;
-  if (!BYTES_BIG_ENDIAN)
-std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
+  if (BYTES_BIG_ENDIAN)
+emit_insn (
+  gen_altivec_vmrghb_direct_be (operands[0], operands[1], operands[2]));
+  else
+emit_insn (
+  gen_altivec_vmrglb_direct_le (operands[0], operands[2], operands[1]));
   DONE;
 })
 
-(define_insn "altivec_vmrghb_direct"
+(define_insn "altivec_vmrghb_direct_be"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
(vec_select:V16QI
  (vec_concat:V32QI
@@ -1174,7 +1175,25 @@
 (const_int 5) (const_int 21)
 (const_int 6) (const_int 22)
 (const_int 7) (const_int 23)])))]
-  "TARGET_ALTIVEC"
+  "TARGET_ALTIVEC && BYTES_BIG_ENDIAN"
+  "vmrghb %0,%1,%2"
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrghb_direct_le"
+  [(set (match_operand:V16QI 0 "register_operand" "=v")
+   (vec_select:V16QI
+ (vec_concat:V32Q

[gcc r12-10595] rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

2024-07-02 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:ca6eea0eb33de8b2e23e0bef3466575bb14ab63f

commit r12-10595-gca6eea0eb33de8b2e23e0bef3466575bb14ab63f
Author: Kewen Lin 
Date:   Wed Jun 26 02:16:17 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low short, namely altivec_vmrg[hl]h.
These patterns mainly serve the built-in functions
vec_merge{h,l} and some internal gen function uses.  These
functions must consider endianness.  Taking vec_mergeh as an
example, PVIPR defines that vec_mergeh "Merges the first halves
(in element order) of two vectors", explicitly in element
order, so it is mapped to vmrghh on BE and to vmrglh on LE.
Although the mapped insns differ, as discussed in PR106069,
the RTL pattern should still be the same.  That held before
commit r12-4496, which changed it into different patterns on
BE and LE.  Similar to the 32-bit element case described in
the commit log of r15-1504, this 16-bit element pattern on LE
doesn't actually match what the underlying insn is intended
to represent, so once an optimization such as combine
transforms the RTL based on it, unexpected results can follow.
The newly constructed test case pr106069-2.c is a typical
example of this issue for element type short.

So this patch fixes the wrong RTL pattern and ensures the
associated RTL patterns are again the same as before, so that
they have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghh expand into
altivec_vmrghh_direct_be or altivec_vmrglh_direct_le
depending on endianness.  "direct" shows which insn is
generated, while _be and _le distinguish the RTL patterns,
which differ by endianness.

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghh_direct): Rename to ...
(altivec_vmrghh_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrghh_direct_le): New define_insn.
(altivec_vmrglh_direct): Rename to ...
(altivec_vmrglh_direct_be): ... this.  Add condition 
BYTES_BIG_ENDIAN.
(altivec_vmrglh_direct_le): New define_insn.
(altivec_vmrghh): Adjust by calling gen_altivec_vmrghh_direct_be
for BE and gen_altivec_vmrglh_direct_le for LE.
(altivec_vmrglh): Adjust by calling gen_altivec_vmrglh_direct_be
for BE and gen_altivec_vmrghh_direct_le for LE.
(vec_widen_umult_hi_v16qi): Adjust the call to
gen_altivec_vmrghh_direct by gen_altivec_vmrghh for BE
and by gen_altivec_vmrglh for LE.
(vec_widen_smult_hi_v16qi): Likewise.
(vec_widen_umult_lo_v16qi): Adjust the call to
gen_altivec_vmrglh_direct by gen_altivec_vmrglh for BE
and by gen_altivec_vmrghh for LE.
(vec_widen_smult_lo_v16qi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghh_direct by
CODE_FOR_altivec_vmrghh_direct_be for BE and
CODE_FOR_altivec_vmrghh_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglh_direct by
CODE_FOR_altivec_vmrglh_direct_be for BE and
CODE_FOR_altivec_vmrglh_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-2.c: New test.

(cherry picked from commit 812c70bf4981958488331d4ea5af8709b5321da1)

Diff:
---
 gcc/config/rs6000/altivec.md  | 76 +++
 gcc/config/rs6000/rs6000.cc   |  8 +--
 gcc/testsuite/gcc.target/powerpc/pr106069-2.c | 37 +
 3 files changed, 94 insertions(+), 27 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index b8baae679c4..50689e418ed 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1203,17 +1203,18 @@
(use (match_operand:V8HI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghh_direct
-   : gen_altivec_vmrglh_direct;
-  if (!BYTES_BIG_ENDIAN)
-std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
+  if (BYTES_BIG_ENDIAN)
+emit_insn (
+  gen_altivec_vmrghh_direct_be (operands[0], operands[1], operands[2]));
+  else
+emit_insn (
+  gen_altivec_vmrglh_direct_le (operands[0], operands[2], operands[1]));
   DONE;
 })
 
-(define_insn "altivec_vmrghh_direct"
+(define_insn "altivec_vmrghh_direct_be"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
-(vec_select:V8HI
+ 

[gcc r15-1889] rs6000: Consider explicit VSX when masking off ALTIVEC [PR115688]

2024-07-07 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:f90ca62566c1d20da585d95ced99f6a1903fc2cc

commit r15-1889-gf90ca62566c1d20da585d95ced99f6a1903fc2cc
Author: Kewen Lin 
Date:   Sun Jul 7 22:38:34 2024 -0500

rs6000: Consider explicit VSX when masking off ALTIVEC [PR115688]

PR115688 exposes an inconsistent state in which we have VSX
enabled but ALTIVEC disabled.  There is one hunk:

  if (main_target_opt && !main_target_opt->x_rs6000_altivec_abi)
rs6000_isa_flags &= ~((OPTION_MASK_VSX | OPTION_MASK_ALTIVEC)
  & ~rs6000_isa_flags_explicit);

which disables VSX and ALTIVEC together, considering only
whether each of them is explicitly set.  In the given case, VSX
is explicitly specified, while ALTIVEC is only implicitly enabled
as part of the set ISA_2_6_MASKS_SERVER.  When the above hunk
runs, VSX is kept because it is explicitly enabled but ALTIVEC
gets masked off, which is unexpected.

This patch considers explicit VSX when masking off ALTIVEC: it
does not mask off ALTIVEC if TARGET_VSX holds and VSX is
explicitly set.
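
The masking behaviour described above can be sketched in standalone
C.  This is a simplified model, not the rs6000 code itself: the bit
values of MASK_ALTIVEC and MASK_VSX and the function names are
hypothetical, and TARGET_VSX is modelled as "the VSX bit is still set
in the flags".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, chosen only for this sketch.  */
#define MASK_ALTIVEC ((uint64_t) 1 << 0)
#define MASK_VSX     ((uint64_t) 1 << 1)

/* Old behaviour: clear VSX and ALTIVEC in one step, keeping each bit
   only if that bit itself was explicitly set.  */
static uint64_t
mask_off_old (uint64_t flags, uint64_t explicit_flags)
{
  return flags & ~((MASK_VSX | MASK_ALTIVEC) & ~explicit_flags);
}

/* New behaviour: clear non-explicit VSX first; clear non-explicit
   ALTIVEC only when VSX did not survive.  */
static uint64_t
mask_off_new (uint64_t flags, uint64_t explicit_flags)
{
  flags &= ~(MASK_VSX & ~explicit_flags);
  if (!(flags & MASK_VSX))
    flags &= ~(MASK_ALTIVEC & ~explicit_flags);
  return flags;
}

/* The situation from the commit message: VSX explicit, ALTIVEC implicit.  */
static void
check_explicit_vsx_case (void)
{
  uint64_t flags = MASK_VSX | MASK_ALTIVEC;
  uint64_t explicit_flags = MASK_VSX;

  /* Old code leaves VSX without ALTIVEC, the inconsistent state.  */
  assert (mask_off_old (flags, explicit_flags) == MASK_VSX);
  /* New code keeps ALTIVEC alongside the explicit VSX.  */
  assert (mask_off_new (flags, explicit_flags) == (MASK_VSX | MASK_ALTIVEC));
}
```

When neither flag is explicit, both versions of the sketch clear both
bits, so the change only affects the explicit-VSX case.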

PR target/115688

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Consider
explicit VSX when masking off ALTIVEC.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr115688.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  8 ++--
 gcc/testsuite/gcc.target/powerpc/pr115688.c | 14 ++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 58553ff66f46..2cbea6ea2d7c 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3933,8 +3933,12 @@ rs6000_option_override_internal (bool global_init_p)
  not for 32-bit.  Don't move this before the above code using ignore_masks,
  since it can reset the cleared VSX/ALTIVEC flag again.  */
   if (main_target_opt && !main_target_opt->x_rs6000_altivec_abi)
-rs6000_isa_flags &= ~((OPTION_MASK_VSX | OPTION_MASK_ALTIVEC)
- & ~rs6000_isa_flags_explicit);
+{
+  rs6000_isa_flags &= ~(OPTION_MASK_VSX & ~rs6000_isa_flags_explicit);
+  /* Don't mask off ALTIVEC if it is enabled by an explicit VSX.  */
+  if (!TARGET_VSX)
+   rs6000_isa_flags &= ~(OPTION_MASK_ALTIVEC & ~rs6000_isa_flags_explicit);
+}
 
   if (TARGET_CRYPTO && !TARGET_ALTIVEC)
 {
diff --git a/gcc/testsuite/gcc.target/powerpc/pr115688.c b/gcc/testsuite/gcc.target/powerpc/pr115688.c
new file mode 100644
index ..5222e66ef170
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr115688.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target powerpc*-*-linux* } } */
+/* { dg-options "-mdejagnu-cpu=power5 -O2" } */
+
+/* Ignore some error messages on "target attribute or
+   pragma changes AltiVec ABI".  */
+/* { dg-excess-errors "pr115688" { target ilp32 } } */
+
+/* Verify there is no ICE under 32 bit env.  */
+
+__attribute__((target("vsx")))
+int test (void)
+{
+  return 0;
+}


[gcc r15-1890] isel: Fold more in gimple_expand_vec_cond_expr with andc and iorc [PR115659]

2024-07-07 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:f379596e0ba99df249d6e8b3f2e66edfcea916fe

commit r15-1890-gf379596e0ba99df249d6e8b3f2e66edfcea916fe
Author: Kewen Lin 
Date:   Mon Jul 8 00:14:59 2024 -0500

isel: Fold more in gimple_expand_vec_cond_expr with andc and iorc [PR115659]

As PR115659 shows, assuming c = x CMP y, there are some
folding opportunities for patterns r = c ? 0/z : z/-1:
  - r = c ? 0 : z can be folded into r = ~c & z.
  - r = c ? z : -1 can be folded into r = ~c | z.

But BIT_AND/BIT_IOR applied to a BIT_NOT operand is a
compound operation, so it is arguable whether it beats
vector selection.  So this patch introduces new optabs
andc and iorc and their corresponding internal functions
BIT_{ANDC,IORC}.  If a target defines such optabs for
vector modes, it means the target supports these hardware
insns, and they should be no worse than vector selection.
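
The two foldings can be checked on a single vector lane in plain C.
This is a sketch under the assumption that a true comparison lane is
all ones and a false lane is all zeros; the helper names lane_mask,
select_lane, bit_andc and bit_iorc are hypothetical stand-ins, not
GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of a vector comparison result: all ones if true, else zero.  */
static uint32_t
lane_mask (int cond)
{
  return cond ? UINT32_MAX : 0;
}

/* Lane-wise r = c ? t : f for a mask c that is all ones or all zeros.  */
static uint32_t
select_lane (uint32_t c, uint32_t t, uint32_t f)
{
  return (c & t) | (~c & f);
}

/* Stand-ins for the new operations: the second operand is complemented.  */
static uint32_t
bit_andc (uint32_t a, uint32_t b)
{
  return a & ~b;
}

static uint32_t
bit_iorc (uint32_t a, uint32_t b)
{
  return a | ~b;
}

/* Check both foldings for one value Z and both possible masks.  */
static void
check_folds (uint32_t z)
{
  for (int truth = 0; truth <= 1; truth++)
    {
      uint32_t c = lane_mask (truth);
      /* r = c ? 0 : z   ==  ~c & z  */
      assert (select_lane (c, 0, z) == bit_andc (z, c));
      /* r = c ? z : -1  ==  ~c | z  */
      assert (select_lane (c, z, UINT32_MAX) == bit_iorc (z, c));
    }
}
```

Calling check_folds with a few values, such as 0, 0x12345678 and
UINT32_MAX, exercises both mask states for each folding.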

PR tree-optimization/115659

gcc/ChangeLog:

* doc/md.texi: Document andcm3 and iorcm3.
* gimple-isel.cc (gimple_expand_vec_cond_expr): Add more foldings for
patterns x CMP y ? 0 : z and x CMP y ? z : -1.
* internal-fn.def (BIT_ANDC): New internal function.
(BIT_IORC): Likewise.
* optabs.def (andc, iorc): New optab.

Diff:
---
 gcc/doc/md.texi | 10 ++
 gcc/gimple-isel.cc  | 26 ++
 gcc/internal-fn.def |  4 
 gcc/optabs.def  |  2 ++
 4 files changed, 42 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 4fd7da095feb..7f4335e0aac1 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5543,6 +5543,16 @@ means of constraints requiring operands 1 and 0 to be the same location.
 @itemx @samp{and@var{m}3}, @samp{ior@var{m}3}, @samp{xor@var{m}3}
 Similar, for other arithmetic operations.
 
+@cindex @code{andc@var{m}3} instruction pattern
+@item @samp{andc@var{m}3}
+Like @code{and@var{m}3}, but it uses bitwise-complement of operand 2
+rather than operand 2 itself.
+
+@cindex @code{iorc@var{m}3} instruction pattern
+@item @samp{iorc@var{m}3}
+Like @code{ior@var{m}3}, but it uses bitwise-complement of operand 2
+rather than operand 2 itself.
+
 @cindex @code{addv@var{m}4} instruction pattern
 @item @samp{addv@var{m}4}
 Like @code{add@var{m}3} but takes a @code{code_label} as operand 3 and
diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 60719eafc651..e4ab42ad05ba 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -284,6 +284,32 @@ gimple_expand_vec_cond_expr (struct function *fun, gimple_stmt_iterator *gsi,
  /* r = c ? z : c */
  op2 = new_op2;
}
+ bool op1_zerop = integer_zerop (op1);
+ bool op2_minus_onep = integer_minus_onep (op2);
+ /* Try to fold r = c ? 0 : z to r = .BIT_ANDC (z, c).  */
+ if (op1_zerop
+ && (direct_internal_fn_supported_p (IFN_BIT_ANDC, vtype,
+ OPTIMIZE_FOR_BOTH)))
+   {
+ tree conv_op = build1 (VIEW_CONVERT_EXPR, vtype, op0);
+ tree new_op = make_ssa_name (vtype);
+ gassign *new_stmt = gimple_build_assign (new_op, conv_op);
+ gsi_insert_seq_before (gsi, new_stmt, GSI_SAME_STMT);
+ return gimple_build_call_internal (IFN_BIT_ANDC, 2, op2,
+new_op);
+   }
+ /* Try to fold r = c ? z : -1 to r = .BIT_IORC (z, c).  */
+ else if (op2_minus_onep
+  && (direct_internal_fn_supported_p (IFN_BIT_IORC, vtype,
+  OPTIMIZE_FOR_BOTH)))
+   {
+ tree conv_op = build1 (VIEW_CONVERT_EXPR, vtype, op0);
+ tree new_op = make_ssa_name (vtype);
+ gassign *new_stmt = gimple_build_assign (new_op, conv_op);
+ gsi_insert_seq_before (gsi, new_stmt, GSI_SAME_STMT);
+ return gimple_build_call_internal (IFN_BIT_IORC, 2, op1,
+new_op);
+   }
}
 
  /* When the compare has EH we do not want to forward it when
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 915d329c05a0..0b45f322f0db 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -595,6 +595,10 @@ DEF_INTERNAL_FN (DIVMODBITINT, ECF_LEAF, ". O . O . R . R . ")
 DEF_INTERNAL_FN (FLOATTOBITINT, ECF_LEAF | ECF_NOTHROW, ". O . . ")
 DEF_INTERNAL_FN (BITINTTOFLOAT, ECF_PURE | ECF_LEAF, ". R . ")
 
+/* Bitwise functions.  */
+DEF_INTERNAL_OPTAB_FN (BIT_ANDC, ECF_CONST, andc, binary)
+DEF_INTERNAL_OPTAB_FN (BIT_IORC, ECF_CONST, iorc, binary)
+
 #undef DEF_INTERNAL_WIDENING_OPTAB_FN
 #undef DEF_INTERNAL_SIGNED_COND_FN
 #undef DEF_INTERNAL_COND_FN
diff --git a/gcc/optabs.def b/gcc/optabs.de

[gcc r15-1891] rs6000: Replace orc with iorc [PR115659]

2024-07-07 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:6425dae07aa4be58abade03455c2d9744f73d4e1

commit r15-1891-g6425dae07aa4be58abade03455c2d9744f73d4e1
Author: Kewen Lin 
Date:   Mon Jul 8 00:15:00 2024 -0500

rs6000: Replace orc with iorc [PR115659]

Since the iorc optab has been introduced, this patch updates the
expander names and all related uses, such as bif expanders and
gen functions, accordingly.

PR tree-optimization/115659

gcc/ChangeLog:

* config/rs6000/rs6000-builtins.def: Update some bif expanders by
replacing orc3 with iorc3.
* config/rs6000/rs6000-string.cc (expand_cmp_vec_sequence): Update gen
function by replacing orc3 with iorc3.
* config/rs6000/rs6000.md (orc3): Rename to ...
(iorc3): ... this.

Diff:
---
 gcc/config/rs6000/rs6000-builtins.def | 24 
 gcc/config/rs6000/rs6000-string.cc|  2 +-
 gcc/config/rs6000/rs6000.md   |  2 +-
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 3bc7fed69568..736890fe6cb8 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2147,40 +2147,40 @@
 NEG_V2DI negv2di2 {}
 
   const vsc __builtin_altivec_orc_v16qi (vsc, vsc);
-ORC_V16QI orcv16qi3 {}
+ORC_V16QI iorcv16qi3 {}
 
   const vuc __builtin_altivec_orc_v16qi_uns (vuc, vuc);
-ORC_V16QI_UNS orcv16qi3 {}
+ORC_V16QI_UNS iorcv16qi3 {}
 
   const vsq __builtin_altivec_orc_v1ti (vsq, vsq);
-ORC_V1TI orcv1ti3 {}
+ORC_V1TI iorcv1ti3 {}
 
   const vuq __builtin_altivec_orc_v1ti_uns (vuq, vuq);
-ORC_V1TI_UNS orcv1ti3 {}
+ORC_V1TI_UNS iorcv1ti3 {}
 
   const vd __builtin_altivec_orc_v2df (vd, vd);
-ORC_V2DF orcv2df3 {}
+ORC_V2DF iorcv2df3 {}
 
   const vsll __builtin_altivec_orc_v2di (vsll, vsll);
-ORC_V2DI orcv2di3 {}
+ORC_V2DI iorcv2di3 {}
 
   const vull __builtin_altivec_orc_v2di_uns (vull, vull);
-ORC_V2DI_UNS orcv2di3 {}
+ORC_V2DI_UNS iorcv2di3 {}
 
   const vf __builtin_altivec_orc_v4sf (vf, vf);
-ORC_V4SF orcv4sf3 {}
+ORC_V4SF iorcv4sf3 {}
 
   const vsi __builtin_altivec_orc_v4si (vsi, vsi);
-ORC_V4SI orcv4si3 {}
+ORC_V4SI iorcv4si3 {}
 
   const vui __builtin_altivec_orc_v4si_uns (vui, vui);
-ORC_V4SI_UNS orcv4si3 {}
+ORC_V4SI_UNS iorcv4si3 {}
 
   const vss __builtin_altivec_orc_v8hi (vss, vss);
-ORC_V8HI orcv8hi3 {}
+ORC_V8HI iorcv8hi3 {}
 
   const vus __builtin_altivec_orc_v8hi_uns (vus, vus);
-ORC_V8HI_UNS orcv8hi3 {}
+ORC_V8HI_UNS iorcv8hi3 {}
 
   const vsc __builtin_altivec_vclzb (vsc);
 VCLZB clzv16qi2 {}
diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
index 917f5572a6d3..c4c62e8e2f94 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -743,7 +743,7 @@ expand_cmp_vec_sequence (unsigned HOST_WIDE_INT bytes_to_compare,
  rtx cmp_combined = gen_reg_rtx (load_mode);
  emit_insn (gen_altivec_eqv16qi (cmp_res, s1data, s2data));
  emit_insn (gen_altivec_eqv16qi (cmp_zero, s1data, zero_reg));
- emit_insn (gen_orcv16qi3 (vec_result, cmp_zero, cmp_res));
+ emit_insn (gen_iorcv16qi3 (vec_result, cmp_zero, cmp_res));
  emit_insn (gen_altivec_vcmpequb_p (cmp_combined, vec_result, zero_reg));
}
}
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index a5d205947895..276a5c9cf2d3 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7324,7 +7324,7 @@
 
 ;; The canonical form is to have the negated element first, so we need to
 ;; reverse arguments.
-(define_expand "orc3"
+(define_expand "iorc3"
   [(set (match_operand:BOOL_128 0 "vlogical_operand")
(ior:BOOL_128
 (not:BOOL_128 (match_operand:BOOL_128 2 "vlogical_operand"))


[gcc r15-1991] rs6000: Remove vcond{,u} expanders

2024-07-11 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:f7e4000397842671fe7e5c0473f1fa62707e1db9

commit r15-1991-gf7e4000397842671fe7e5c0473f1fa62707e1db9
Author: Kewen Lin 
Date:   Fri Jul 12 01:32:57 2024 -0500

rs6000: Remove vcond{,u} expanders

As PR114189 shows, the middle-end will soon obsolete the vcond,
vcondu and vcondeq optabs.  This patch removes all
vcond{,u} expanders in the rs6000 port and makes the function
rs6000_emit_vector_cond_expr, which was called by those
expanders, static.

PR target/115659

gcc/ChangeLog:

* config/rs6000/rs6000-protos.h (rs6000_emit_vector_cond_expr): Remove.
* config/rs6000/rs6000.cc (rs6000_emit_vector_cond_expr): Add static
qualifier as it is only called by rs6000_emit_swsqrt now.
* config/rs6000/vector.md (vcond): Remove.
(vcond): Remove.
(vcondv4sfv4si): Likewise.
(vcondv4siv4sf): Likewise.
(vcondv2dfv2di): Likewise.
(vcondv2div2df): Likewise.
(vcondu): Likewise.
(vconduv4sfv4si): Likewise.
(vconduv2dfv2di): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000-protos.h |   1 -
 gcc/config/rs6000/rs6000.cc   |   2 +-
 gcc/config/rs6000/vector.md   | 160 --
 3 files changed, 1 insertion(+), 162 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 09a57a806faf..b40557a85577 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -126,7 +126,6 @@ extern void rs6000_emit_dot_insn (rtx dst, rtx src, int dot, rtx ccreg);
 extern bool rs6000_emit_set_const (rtx, rtx);
 extern bool rs6000_emit_cmove (rtx, rtx, rtx, rtx);
 extern bool rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
-extern int rs6000_emit_vector_cond_expr (rtx, rtx, rtx, rtx, rtx, rtx);
 extern void rs6000_emit_minmax (rtx, enum rtx_code, rtx, rtx);
 extern void rs6000_expand_atomic_compare_and_swap (rtx op[]);
 extern rtx swap_endian_selector_for_mode (machine_mode mode);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 2cbea6ea2d7c..195f2af9062e 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16149,7 +16149,7 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
OP_FALSE are two VEC_COND_EXPR operands.  CC_OP0 and CC_OP1 are the two
operands for the relation operation COND.  */
 
-int
+static int
 rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
  rtx cond, rtx cc_op0, rtx cc_op1)
 {
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 59489e068399..0d3e0a24e118 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -331,166 +331,6 @@
 })
 
 
-;; Vector comparisons
-(define_expand "vcond"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-   (if_then_else:VEC_F
-(match_operator 3 "comparison_operator"
-[(match_operand:VEC_F 4 "vfloat_operand")
- (match_operand:VEC_F 5 "vfloat_operand")])
-(match_operand:VEC_F 1 "vfloat_operand")
-(match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
-{
-  if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
-   operands[3], operands[4], operands[5]))
-DONE;
-  else
-gcc_unreachable ();
-})
-
-(define_expand "vcond"
-  [(set (match_operand:VEC_I 0 "vint_operand")
-   (if_then_else:VEC_I
-(match_operator 3 "comparison_operator"
-[(match_operand:VEC_I 4 "vint_operand")
- (match_operand:VEC_I 5 "vint_operand")])
-(match_operand:VEC_I 1 "vector_int_reg_or_same_bit")
-(match_operand:VEC_I 2 "vector_int_reg_or_same_bit")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
-{
-  if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
-   operands[3], operands[4], operands[5]))
-DONE;
-  else
-gcc_unreachable ();
-})
-
-(define_expand "vcondv4sfv4si"
-  [(set (match_operand:V4SF 0 "vfloat_operand")
-   (if_then_else:V4SF
-(match_operator 3 "comparison_operator"
-[(match_operand:V4SI 4 "vint_operand")
- (match_operand:V4SI 5 "vint_operand")])
-(match_operand:V4SF 1 "vfloat_operand")
-(match_operand:V4SF 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)
-   && VECTOR_UNIT_ALTIVEC_P (V4SImode)"
-{
-  if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
-   operands[3], operands[4], operands[5]))
-DONE;
-  else
-gcc_unreachable ();
-})
-
-(define_expand "vcondv4siv4sf"
-  [(set (match_operand:V4SI 0 "vint_operand")
-   (if_then_else:V4SI
-(match_operator 3 "comparison_operator"
-

[gcc r15-2083] expr: Allow same precision modes conversion between {ibm_extended, ieee_quad}_format

2024-07-16 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:3f6e6d4b408a26f69816f18d88dde4d983677488

commit r15-2083-g3f6e6d4b408a26f69816f18d88dde4d983677488
Author: Kewen Lin 
Date:   Wed Jul 17 00:14:18 2024 -0500

expr: Allow same precision modes conversion between {ibm_extended, 
ieee_quad}_format

For historical reasons, rs6000 defines KFmode, TFmode
and IFmode with different mode precisions, which causes
some issues and needs some workarounds (see PR112993).
So we are going to make all rs6000 128-bit scalar FP modes
have 128-bit precision.  To prepare for that, this patch
makes function convert_mode_scalar allow conversion between
FP modes of the same precision if their underlying formats are
ibm_extended_format and ieee_quad_format respectively, just
like the existing special treatment of arm_bfloat_half_format
<-> ieee_half_format.  It also factors all the relevant
checks out into a lambda function.  Besides, similar to the
ieee fp16 -> bfloat conversion, it adopts trunc_optab rather
than sext_optab for the ibm128 to ieee128 conversion.
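
The rule described above can be sketched outside of GCC as follows.
This is only an illustration: the enums fp_format and conv_tab and
both function names are hypothetical stand-ins for GCC's real_format
objects and optabs, and the decimal float case is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's real_format objects.  */
enum fp_format
{
  FMT_IEEE_HALF,
  FMT_ARM_BFLOAT_HALF,
  FMT_IBM_EXTENDED,
  FMT_IEEE_QUAD,
  FMT_IEEE_DOUBLE
};

/* Hypothetical stand-ins for sext_optab and trunc_optab.  */
enum conv_tab { TAB_SEXT, TAB_TRUNC };

/* Return true if FROM and TO form one of the two binary format pairs
   that may be converted although their modes have equal precision.  */
bool
same_precision_pair_ok (enum fp_format from, enum fp_format to)
{
  if ((from == FMT_ARM_BFLOAT_HALF && to == FMT_IEEE_HALF)
      || (from == FMT_IEEE_HALF && to == FMT_ARM_BFLOAT_HALF))
    return true;
  if ((from == FMT_IBM_EXTENDED && to == FMT_IEEE_QUAD)
      || (from == FMT_IEEE_QUAD && to == FMT_IBM_EXTENDED))
    return true;
  return false;
}

/* Pick the conversion table for an accepted same-precision pair:
   truncation for fp16 -> bfloat and ibm128 -> ieee128, extension for
   the opposite directions.  */
enum conv_tab
same_precision_tab (enum fp_format from, enum fp_format to)
{
  if ((from == FMT_IEEE_HALF && to == FMT_ARM_BFLOAT_HALF)
      || (from == FMT_IBM_EXTENDED && to == FMT_IEEE_QUAD))
    return TAB_TRUNC;
  return TAB_SEXT;
}
```

In this model ibm128 -> ieee128 selects the truncation table and
ieee128 -> ibm128 the extension table, which mirrors the libgcc
routines named in the patch (__trunctfkf2 but no __extendtfkf2).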

PR target/112993

gcc/ChangeLog:

* expr.cc (convert_mode_scalar): Allow same precision conversion
between scalar floating point modes whose underlying formats are
ibm_extended_format and ieee_quad_format, and refactor assertion
with new lambda function acceptable_same_precision_modes.  Use
trunc_optab rather than sext_optab for ibm128 to ieee128 conversion.
* optabs-libfuncs.cc (gen_trunc_conv_libfunc): Use trunc_optab rather
than sext_optab for ibm128 to ieee128 conversion.

Diff:
---
 gcc/expr.cc| 39 ++-
 gcc/optabs-libfuncs.cc |  4 +++-
 2 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index ffbac5136923..2089c2b86a98 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -338,6 +338,29 @@ convert_mode_scalar (rtx to, rtx from, int unsignedp)
   enum rtx_code equiv_code = (unsignedp < 0 ? UNKNOWN
  : (unsignedp ? ZERO_EXTEND : SIGN_EXTEND));
 
+  auto acceptable_same_precision_modes
+= [] (scalar_mode from_mode, scalar_mode to_mode) -> bool
+{
+  if (DECIMAL_FLOAT_MODE_P (from_mode) != DECIMAL_FLOAT_MODE_P (to_mode))
+   return true;
+
+  /* arm_bfloat_half_format <-> ieee_half_format */
+  if ((REAL_MODE_FORMAT (from_mode) == &arm_bfloat_half_format
+  && REAL_MODE_FORMAT (to_mode) == &ieee_half_format)
+ || (REAL_MODE_FORMAT (to_mode) == &arm_bfloat_half_format
+ && REAL_MODE_FORMAT (from_mode) == &ieee_half_format))
+   return true;
+
+  /* ibm_extended_format <-> ieee_quad_format */
+  if ((REAL_MODE_FORMAT (from_mode) == &ibm_extended_format
+  && REAL_MODE_FORMAT (to_mode) == &ieee_quad_format)
+ || (REAL_MODE_FORMAT (from_mode) == &ieee_quad_format
+ && REAL_MODE_FORMAT (to_mode) == &ibm_extended_format))
+   return true;
+
+  return false;
+};
+
   if (to_real)
 {
   rtx value;
@@ -346,18 +369,16 @@ convert_mode_scalar (rtx to, rtx from, int unsignedp)
 
   gcc_assert ((GET_MODE_PRECISION (from_mode)
   != GET_MODE_PRECISION (to_mode))
- || (DECIMAL_FLOAT_MODE_P (from_mode)
- != DECIMAL_FLOAT_MODE_P (to_mode))
- || (REAL_MODE_FORMAT (from_mode) == &arm_bfloat_half_format
- && REAL_MODE_FORMAT (to_mode) == &ieee_half_format)
- || (REAL_MODE_FORMAT (to_mode) == &arm_bfloat_half_format
- && REAL_MODE_FORMAT (from_mode) == &ieee_half_format));
+ || acceptable_same_precision_modes (from_mode, to_mode));
 
   if (GET_MODE_PRECISION (from_mode) == GET_MODE_PRECISION (to_mode))
{
- if (REAL_MODE_FORMAT (to_mode) == &arm_bfloat_half_format
- && REAL_MODE_FORMAT (from_mode) == &ieee_half_format)
-   /* libgcc implements just __trunchfbf2, not __extendhfbf2.  */
+ if ((REAL_MODE_FORMAT (to_mode) == &arm_bfloat_half_format
+  && REAL_MODE_FORMAT (from_mode) == &ieee_half_format)
+ || (REAL_MODE_FORMAT (to_mode) == &ieee_quad_format
+ && REAL_MODE_FORMAT (from_mode) == &ibm_extended_format))
+   /* libgcc implements just __trunchfbf2, not __extendhfbf2;
+  and __trunctfkf2, not __extendtfkf2.  */
tab = trunc_optab;
  else
/* Conversion between decimal float and binary float, same
diff --git a/gcc/optabs-libfuncs.cc b/gcc/optabs-libfuncs.cc
index 26729910d92b..ab97eace80e5 100644
--- a/gcc/optabs-libfuncs.cc
+++ b/gcc/optabs-libfuncs.cc
@@ -591,7 +591,9 @@ gen_trunc_conv_libfunc (convert_optab tab,
 
   if (GET_MODE_PRECISION (float_fmode) <= GET_MODE_PRECISION (float_tmode)
   && (REAL_MODE_FORMAT (float_tmode) != &arm_bfloat_half_format

[gcc r15-2085] fortran: Teach get_real_kind_from_node for Power 128 fp modes [PR112993]

2024-07-16 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:de6969fd311307e34904fc1f85603a9d92938974

commit r15-2085-gde6969fd311307e34904fc1f85603a9d92938974
Author: Kewen Lin 
Date:   Wed Jul 17 00:16:59 2024 -0500

fortran: Teach get_real_kind_from_node for Power 128 fp modes [PR112993]

Previously the effective target fortran_real_c_float128 never
passed on Power, regardless of whether the default 128-bit long
double is ibmlongdouble or ieeelongdouble.  That is because
TFmode is always used for kind 16 real and has precision
127, while the node float128_type_node for c_float128 has
type precision 128, so get_real_kind_from_node can't find a
match, as it only compares gfc_real_kinds[i].mode_precision
with the type precision.

With TFmode/IFmode/KFmode changed to have the same mode
precision 128, fortran_real_c_float128 can now pass with
ieeelongdouble enabled by default, and test cases guarded
with it get tested accordingly.  But with ibmlongdouble
enabled by default, since TFmode has precision 128, which
is the same as the type precision 128 of float128_type_node,
get_real_kind_from_node considers that the kind for TFmode
matches float128_type_node.  That is wrong, as in this
configuration TFmode has the IBM extended format.  So this
patch teaches get_real_kind_from_node to check one more field
that differs between the underlying real formats, which
avoids the unexpected match when more than one mode has
the same precision.
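
A reduced model of the extra check might look like the sketch below.
It is not the gfortran code: struct real_kind_info, real_kind_for and
the two tables are hypothetical, and the significand widths (113 for
IEEE quad, 106 for IBM extended) are assumptions of this example.

```c
#include <stddef.h>

/* Hypothetical, reduced model of an entry in gfc_real_kinds.  */
struct real_kind_info
{
  int kind;            /* Fortran kind number; 0 terminates the table.  */
  int mode_precision;  /* Precision of the machine mode, in bits.  */
  int digits;          /* Significand width of the real format.  */
};

/* Illustrative tables for the two long double defaults.  */
const struct real_kind_info kinds_ibm_default[] = {
  { 4, 32, 24 }, { 8, 64, 53 }, { 16, 128, 106 }, { 0, 0, 0 }
};
const struct real_kind_info kinds_ieee_default[] = {
  { 4, 32, 24 }, { 8, 64, 53 }, { 16, 128, 113 }, { 0, 0, 0 }
};

/* Return the kind whose mode precision equals TYPE_PRECISION and, for
   kind 16, whose significand width also equals TYPE_DIGITS.  Return -4
   if there is no such kind.  */
int
real_kind_for (const struct real_kind_info *kinds,
               int type_precision, int type_digits)
{
  for (size_t i = 0; kinds[i].kind != 0; i++)
    if (kinds[i].mode_precision == type_precision)
      {
        if (kinds[i].kind == 16 && kinds[i].digits != type_digits)
          continue;
        return kinds[i].kind;
      }
  return -4;
}
```

With the IBM table, a 128-bit type with a 113-bit significand finds no
kind, while the IEEE table yields kind 16; a precision check alone
would accept both.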

PR target/112993

gcc/fortran/ChangeLog:

* trans-types.cc (get_real_kind_from_node): Consider the case where
more than one mode has the same precision.

Diff:
---
 gcc/fortran/trans-types.cc | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/trans-types.cc b/gcc/fortran/trans-types.cc
index 0ef67723fcd3..f7b80a9761c4 100644
--- a/gcc/fortran/trans-types.cc
+++ b/gcc/fortran/trans-types.cc
@@ -183,7 +183,21 @@ get_real_kind_from_node (tree type)
 
   for (i = 0; gfc_real_kinds[i].kind != 0; i++)
 if (gfc_real_kinds[i].mode_precision == TYPE_PRECISION (type))
-  return gfc_real_kinds[i].kind;
+  {
+   /* On Power, we have three 128-bit scalar floating-point modes
+  and all of their types have 128 bit type precision, so we
+  should check underlying real format details further.  */
+#if defined(HAVE_TFmode) && defined(HAVE_IFmode) && defined(HAVE_KFmode)
+   if (gfc_real_kinds[i].kind == 16)
+ {
+   machine_mode mode = TYPE_MODE (type);
+   const struct real_format *fmt = REAL_MODE_FORMAT (mode);
+   if (fmt->p != gfc_real_kinds[i].digits)
+ continue;
+ }
+#endif
+   return gfc_real_kinds[i].kind;
+  }
 
   return -4;
 }


[gcc r15-2084] rs6000: Make all 128 bit scalar FP modes have 128 bit precision [PR112993]

2024-07-16 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:33dca0a4c1c421625cedb2d6105ef1c05f6b774e

commit r15-2084-g33dca0a4c1c421625cedb2d6105ef1c05f6b774e
Author: Kewen Lin 
Date:   Wed Jul 17 00:14:43 2024 -0500

rs6000: Make all 128 bit scalar FP modes have 128 bit precision [PR112993]

On rs6000, there are three 128-bit scalar floating-point
modes: TFmode, IFmode and KFmode.  For historical
reasons, we define them with different mode precisions,
that is KFmode 126, TFmode 127 and IFmode 128.  But in
fact all of them should have the same mode precision 128.
This special setting has caused some issues, like the
unexpected failures mentioned in [1], and has also made us
introduce some workarounds, such as the workaround in
build_common_tree_nodes for KFmode 126 and the workaround in
range_compatible_p for the same mode but different precision
issue.

This patch makes these three 128-bit scalar floating-point
modes TFmode, IFmode and KFmode have 128-bit mode
precision, and keeps their order the same as before so that
machine-independent parts of the compiler do not try
to widen IFmode to TFmode.  Besides, build_common_tree_nodes
adopts the newly added hook mode_for_floating_type, so we
don't need to worry about an unexpected mode for the long
double type node.

With the proposed change, function convert_mode_scalar
adopts sext_optab for converting an ieee128 format mode to
an ibm128 format mode, and trunc_optab for converting an
ibm128 format mode to an ieee128 format mode.  Thus this patch
removes the useless extend and trunc optab support and
supplements new define_expands expandkftf2 and trunctfkf2 to
align with the convert_mode_scalar implementation.  It also
unnames two define_insn_and_splits to avoid conflicts and make
them clearer.  Considering that in the current implementation
there is no chance to have a KF <-> IF conversion (since either
of them would be TF already), it adds two dummy define_expands
to assert this.

[1] https://inbox.sourceware.org/gcc-patches/
718677e7-614d-7977-312d-05a75e1fd...@linux.ibm.com/

PR target/112993

gcc/ChangeLog:

* config/rs6000/rs6000-modes.def (IFmode, KFmode, TFmode): Define
with FLOAT_MODE instead of FRACTIONAL_FLOAT_MODE, don't use special
precisions any more.
(rs6000-modes.h): Remove include.
* config/rs6000/rs6000-modes.h: Remove.
* config/rs6000/rs6000.h (rs6000-modes.h): Remove include.
* config/rs6000/t-rs6000: Remove rs6000-modes.h include.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
all uses of FLOAT_PRECISION_TFmode with 128.
(rs6000_c_mode_for_floating_type): Likewise.
* config/rs6000/rs6000.md (define_expand extendiftf2): Remove.
(define_expand extendifkf2): Remove.
(define_expand extendtfkf2): Remove.
(define_expand trunckftf2): Remove.
(define_expand trunctfif2): Remove.
(define_expand extendtfif2): Add new assertion.
(define_expand expandkftf2): New.
(define_expand trunciftf2): Add new assertion.
(define_expand trunctfkf2): New.
(define_expand truncifkf2): Change with gcc_unreachable.
(define_expand expandkfif2): New.
(define_insn_and_split extendkftf2): Rename to  ...
(define_insn_and_split *extendkftf2): ... this.
(define_insn_and_split trunctfkf2): Rename to ...
(define_insn_and_split *extendtfkf2): ... this.

Diff:
---
 gcc/config/rs6000/rs6000-modes.def | 31 +++---
 gcc/config/rs6000/rs6000-modes.h   | 36 
 gcc/config/rs6000/rs6000.cc|  9 ++---
 gcc/config/rs6000/rs6000.h |  5 ---
 gcc/config/rs6000/rs6000.md| 67 +++---
 gcc/config/rs6000/t-rs6000 |  1 -
 6 files changed, 41 insertions(+), 108 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def
index 094b246c834d..b69593c40a61 100644
--- a/gcc/config/rs6000/rs6000-modes.def
+++ b/gcc/config/rs6000/rs6000-modes.def
@@ -18,12 +18,11 @@
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */
 
-/* We order the 3 128-bit floating point types so that IFmode (IBM 128-bit
-   floating point) is the 128-bit floating point type with the highest
-   precision (128 bits).  This so that machine independent parts of the
-   compiler do not try to widen IFmode to TFmode on ISA 3.0 (power9) that has
-   hardware support for IEEE 128-bit.  We set TFmode (long double mode) in
-   between, and KFmode (explicit __float128) below it.
+/* We order the 3 128-bit floating point type modes here as KFmode, TFmode and
+   IFmode, it is the same as the previous order, to make machine in

[gcc r15-2086] ranger: Revert the workaround introduced in PR112788 [PR112993]

2024-07-16 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:fa86f510f51e6d940a28ea997fca3a6e3f50b4d3

commit r15-2086-gfa86f510f51e6d940a28ea997fca3a6e3f50b4d3
Author: Kewen Lin 
Date:   Wed Jul 17 00:17:42 2024 -0500

ranger: Revert the workaround introduced in PR112788 [PR112993]

This reverts commit r14-6478-gfda8e2f8292a90 "range:
Workaround different type precision between _Float128 and
long double [PR112788]".  As the fixes for PR112993 make
all 128-bit scalar floating-point modes have the same 128-bit
precision, this workaround isn't needed any more.

PR target/112993

gcc/ChangeLog:

* value-range.h (range_compatible_p): Remove the workaround on
different type precision between _Float128 and long double.

Diff:
---
 gcc/value-range.h | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/gcc/value-range.h b/gcc/value-range.h
index 334ea1bc338c..03af758d152c 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -1764,13 +1764,7 @@ range_compatible_p (tree type1, tree type2)
   // types_compatible_p requires conversion in both directions to be useless.
   // GIMPLE only requires a cast one way in order to be compatible.
   // Ranges really only need the sign and precision to be the same.
-  return TYPE_SIGN (type1) == TYPE_SIGN (type2)
-&& (TYPE_PRECISION (type1) == TYPE_PRECISION (type2)
-// FIXME: As PR112788 shows, for now on rs6000 _Float128 has
-// type precision 128 while long double has type precision 127
-// but both have the same mode so their precision is actually
-// the same, workaround it temporarily.
-|| (SCALAR_FLOAT_TYPE_P (type1)
-&& TYPE_MODE (type1) == TYPE_MODE (type2)));
+  return (TYPE_PRECISION (type1) == TYPE_PRECISION (type2)
+ && TYPE_SIGN (type1) == TYPE_SIGN (type2));
 }
 #endif // GCC_VALUE_RANGE_H


[gcc r15-2087] tree: Remove KFmode workaround [PR112993]

2024-07-16 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:b5c813ed6035cf6ef831927e66e184a5847afbe6

commit r15-2087-gb5c813ed6035cf6ef831927e66e184a5847afbe6
Author: Kewen Lin 
Date:   Wed Jul 17 00:19:00 2024 -0500

tree: Remove KFmode workaround [PR112993]

The fix for PR112993 makes KFmode have 128-bit mode precision,
so we no longer need this workaround to fix up the type precision
and can just go with the mode precision.  So this patch removes
the KFmode workaround.
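
For reference, the removed workaround derived a minimum type precision
from the real format as p + ceil_log2 (emax - emin).  The sketch below
replays that arithmetic with hypothetical helper names; the format
parameters used in the note after it (p, emin, emax as recorded for the
IEEE binary formats) are assumptions of this example.

```c
/* Smallest N such that 2**N >= X, for X >= 1.  */
int
ceil_log2_int (unsigned int x)
{
  int n = 0;
  while ((1u << n) < x)
    n++;
  return n;
}

/* Minimum precision implied by a binary format with P significand
   bits and exponent bounds EMIN and EMAX: the significand width plus
   the number of bits needed to cover the exponent range.  */
int
min_precision (int p, int emin, int emax)
{
  return p + ceil_log2_int ((unsigned int) (emax - emin));
}
```

For IEEE binary128 (p = 113, emin = -16381, emax = 16384) the exponent
range is 32765, which needs 15 bits, so the result is 113 + 15 = 128.
Since the mode precision now is 128 as well, the fix-up has nothing
left to adjust.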

PR target/112993

gcc/ChangeLog:

* tree.cc (build_common_tree_nodes): Drop the workaround for rs6000
KFmode precision adjustment.

Diff:
---
 gcc/tree.cc | 9 -
 1 file changed, 9 deletions(-)

diff --git a/gcc/tree.cc b/gcc/tree.cc
index 2d2d5b6db6ed..a2d431662bd5 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -9633,15 +9633,6 @@ build_common_tree_nodes (bool signed_char)
   if (!targetm.floatn_mode (n, extended).exists (&mode))
continue;
   int precision = GET_MODE_PRECISION (mode);
-  /* Work around the rs6000 KFmode having precision 113 not
-128.  */
-  const struct real_format *fmt = REAL_MODE_FORMAT (mode);
-  gcc_assert (fmt->b == 2 && fmt->emin + fmt->emax == 3);
-  int min_precision = fmt->p + ceil_log2 (fmt->emax - fmt->emin);
-  if (!extended)
-   gcc_assert (min_precision == n);
-  if (precision < min_precision)
-   precision = min_precision;
   FLOATN_NX_TYPE_NODE (i) = make_node (REAL_TYPE);
   TYPE_PRECISION (FLOATN_NX_TYPE_NODE (i)) = precision;
   layout_type (FLOATN_NX_TYPE_NODE (i));


[gcc r15-2088] rs6000: Change optab for ibm128 and ieee128 conversion

2024-07-16 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:dd4d71ca4d8d4252eb33a3202380524e6d43ba05

commit r15-2088-gdd4d71ca4d8d4252eb33a3202380524e6d43ba05
Author: Kewen Lin 
Date:   Wed Jul 17 00:19:30 2024 -0500

rs6000: Change optab for ibm128 and ieee128 conversion

Currently, for conversions between the 128-bit floating-point
ibm128 and ieee128 formats, the corresponding libcalls are:
  ibm128 -> ieee128 "__trunctfkf2"
  ieee128 -> ibm128 "__extendkftf2"
and the generic code (like convert_mode_scalar) also
adopts sext_optab for ieee128 -> ibm128 and trunc_optab
for ibm128 -> ieee128.  But in the rs6000 port, as functions
rs6000_expand_float128_convert and init_float128_ieee show,
we adopt sext_optab for ibm128 -> ieee128 with "__trunctfkf2"
and trunc_optab for ieee128 -> ibm128 with "__extendkftf2".

To make them consistent and avoid surprises, this patch
adjusts the rs6000 internal handling by adopting trunc_optab
for ibm128 -> ieee128 with "__trunctfkf2" and sext_optab for
ieee128 -> ibm128 with "__extendkftf2".
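
The mapping after this change can be summarized by a small lookup.
This is a sketch only: the types fp128_format, cvt_kind and
fp128_conversion and the function fp128_conversion_for are hypothetical
and do not exist in GCC; only the two libcall names come from the text
above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two 128-bit formats and the optabs.  */
enum fp128_format { FP128_IBM, FP128_IEEE };
enum cvt_kind { CVT_NONE, CVT_SEXT, CVT_TRUNC };

struct fp128_conversion
{
  enum cvt_kind tab;    /* Table the conversion is registered in.  */
  const char *libfunc;  /* libgcc routine, NULL if no call is needed.  */
};

/* Return the table and libcall used to convert FROM to TO.  */
struct fp128_conversion
fp128_conversion_for (enum fp128_format from, enum fp128_format to)
{
  struct fp128_conversion c = { CVT_NONE, NULL };
  if (from == FP128_IBM && to == FP128_IEEE)
    {
      c.tab = CVT_TRUNC;
      c.libfunc = "__trunctfkf2";
    }
  else if (from == FP128_IEEE && to == FP128_IBM)
    {
      c.tab = CVT_SEXT;
      c.libfunc = "__extendkftf2";
    }
  return c;
}
```

The point of the patch is that the table now agrees with the prefix of
the libcall name: "trunc" routines live in the truncation table and
"extend" routines in the extension table.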

gcc/ChangeLog:

* config/rs6000/rs6000.cc (init_float128_ieee): Use trunc_optab rather
than sext_optab for converting FLOAT128_IBM_P mode to FLOAT128_IEEE_P
mode, and use sext_optab rather than trunc_optab for converting
FLOAT128_IEEE_P mode to FLOAT128_IBM_P mode.
(rs6000_expand_float128_convert): Likewise.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 905e6cb6a942..2c0a7fc8cefa 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11476,13 +11476,13 @@ init_float128_ieee (machine_mode mode)
   set_conv_libfunc (trunc_optab, SFmode, mode, "__trunckfsf2");
   set_conv_libfunc (trunc_optab, DFmode, mode, "__trunckfdf2");
 
-  set_conv_libfunc (sext_optab, mode, IFmode, "__trunctfkf2");
+  set_conv_libfunc (trunc_optab, mode, IFmode, "__trunctfkf2");
   if (mode != TFmode && FLOAT128_IBM_P (TFmode))
-   set_conv_libfunc (sext_optab, mode, TFmode, "__trunctfkf2");
+   set_conv_libfunc (trunc_optab, mode, TFmode, "__trunctfkf2");
 
-  set_conv_libfunc (trunc_optab, IFmode, mode, "__extendkftf2");
+  set_conv_libfunc (sext_optab, IFmode, mode, "__extendkftf2");
   if (mode != TFmode && FLOAT128_IBM_P (TFmode))
-   set_conv_libfunc (trunc_optab, TFmode, mode, "__extendkftf2");
+   set_conv_libfunc (sext_optab, TFmode, mode, "__extendkftf2");
 
   set_conv_libfunc (sext_optab, mode, SDmode, "__dpd_extendsdkf");
   set_conv_libfunc (sext_optab, mode, DDmode, "__dpd_extendddkf");
@@ -15640,7 +15640,7 @@ rs6000_expand_float128_convert (rtx dest, rtx src, bool unsigned_p)
case E_IFmode:
case E_TFmode:
  if (FLOAT128_IBM_P (src_mode))
-   cvt = sext_optab;
+   cvt = trunc_optab;
  else
do_move = true;
  break;
@@ -15702,7 +15702,7 @@ rs6000_expand_float128_convert (rtx dest, rtx src, bool unsigned_p)
case E_IFmode:
case E_TFmode:
  if (FLOAT128_IBM_P (dest_mode))
-   cvt = trunc_optab;
+   cvt = sext_optab;
  else
do_move = true;
  break;


[gcc r15-2190] testsuite: powerpc: fix dg-do run typo

2024-07-21 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:913bab282d95e842907fec5a552a74ef64a6d4f6

commit r15-2190-g913bab282d95e842907fec5a552a74ef64a6d4f6
Author: Sam James 
Date:   Sun Jul 21 20:36:08 2024 -0500

testsuite: powerpc: fix dg-do run typo

'dg-run' is not a valid dejagnu directive, 'dg-do run' is needed here
for the test to be executed.

PR target/108699

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr108699.c: Fix 'dg-run' typo.

Signed-off-by: Sam James 

Diff:
---
 gcc/testsuite/gcc.target/powerpc/pr108699.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr108699.c b/gcc/testsuite/gcc.target/powerpc/pr108699.c
index f02bac130cc7..beb8b601fd51 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr108699.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr108699.c
@@ -1,4 +1,4 @@
-/* { dg-run } */
+/* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model" } */
 
 #define N 16


[gcc r15-2214] rs6000: Escalate warning to error for VSX with explicit no-altivec etc.

2024-07-22 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:04da747a063850333b062e48d0531debe314dff9

commit r15-2214-g04da747a063850333b062e48d0531debe314dff9
Author: Kewen Lin 
Date:   Tue Jul 23 00:47:49 2024 -0500

rs6000: Escalate warning to error for VSX with explicit no-altivec etc.

As discussed in PR115688, when users currently specify
-mvsx and -mno-altivec explicitly, the compiler emits a warning
rather than an error.  But since both options are given
explicitly, emitting a hard error is better.

So this patch escalates the related warnings to errors
when two explicitly given options are incompatible.
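
The altivec part of the new policy can be modelled as below.  This is a
simplified sketch, not the rs6000 code: struct vsx_request, enum diag
and vsx_altivec_diag are hypothetical, VSX is assumed to be enabled,
and the soft-float and indexed-addressing checks are assumed to have
passed already.

```c
#include <stdbool.h>

enum diag { DIAG_NONE, DIAG_WARNING, DIAG_ERROR };

/* Hypothetical summary of the relevant option state.  */
struct vsx_request
{
  bool vsx_explicit;      /* -mvsx was given by the user.  */
  bool altivec_enabled;   /* Altivec is currently on.  */
  bool altivec_explicit;  /* -maltivec or -mno-altivec was given.  */
};

/* Return which diagnostic the combination deserves.  Only an
   explicitly disabled altivec is a conflict; an explicit -mvsx on top
   of that turns the warning into an error.  */
enum diag
vsx_altivec_diag (const struct vsx_request *r)
{
  if (r->altivec_enabled || !r->altivec_explicit)
    return DIAG_NONE;
  return r->vsx_explicit ? DIAG_ERROR : DIAG_WARNING;
}
```

In both the warning and the error case the patch still clears the VSX
flag afterwards, so later code sees a consistent set of ISA flags.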

PR target/115713

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Emit error
messages when explicit VSX encounters explicit soft-float, no-altivec
or avoid-indexed-addresses.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/warn-1.c: Move to ...
* gcc.target/powerpc/error-1.c: ... here.  Adjust dg-warning with
dg-error and remove ineffective scan.

Diff:
---
 gcc/config/rs6000/rs6000.cc| 41 --
 .../gcc.target/powerpc/{warn-1.c => error-1.c} |  3 +-
 2 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index eddd2adbab59..019bb7ccc380 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3830,32 +3830,37 @@ rs6000_option_override_internal (bool global_init_p)
   /* Add some warnings for VSX.  */
   if (TARGET_VSX)
 {
-  const char *msg = NULL;
+  bool explicit_vsx_p = rs6000_isa_flags_explicit & OPTION_MASK_VSX;
   if (!TARGET_HARD_FLOAT)
{
- if (rs6000_isa_flags_explicit & OPTION_MASK_VSX)
-   msg = N_("%<-mvsx%> requires hardware floating point");
- else
+ if (explicit_vsx_p)
{
- rs6000_isa_flags &= ~ OPTION_MASK_VSX;
- rs6000_isa_flags_explicit |= OPTION_MASK_VSX;
+ if (rs6000_isa_flags_explicit & OPTION_MASK_SOFT_FLOAT)
+   error ("%<-mvsx%> and %<-msoft-float%> are incompatible");
+ else
+   warning (0, N_("%<-mvsx%> requires hardware floating-point"));
}
+ rs6000_isa_flags &= ~OPTION_MASK_VSX;
+ rs6000_isa_flags_explicit |= OPTION_MASK_VSX;
}
   else if (TARGET_AVOID_XFORM > 0)
-   msg = N_("%<-mvsx%> needs indexed addressing");
-  else if (!TARGET_ALTIVEC && (rs6000_isa_flags_explicit
-  & OPTION_MASK_ALTIVEC))
-{
- if (rs6000_isa_flags_explicit & OPTION_MASK_VSX)
-   msg = N_("%<-mvsx%> and %<-mno-altivec%> are incompatible");
+   {
+ if (explicit_vsx_p && OPTION_SET_P (TARGET_AVOID_XFORM))
+   error ("%<-mvsx%> and %<-mavoid-indexed-addresses%>"
+  " are incompatible");
  else
-   msg = N_("%<-mno-altivec%> disables vsx");
-}
-
-  if (msg)
+   warning (0, N_("%<-mvsx%> needs indexed addressing"));
+ rs6000_isa_flags &= ~OPTION_MASK_VSX;
+ rs6000_isa_flags_explicit |= OPTION_MASK_VSX;
+   }
+  else if (!TARGET_ALTIVEC
+  && (rs6000_isa_flags_explicit & OPTION_MASK_ALTIVEC))
{
- warning (0, msg);
- rs6000_isa_flags &= ~ OPTION_MASK_VSX;
+ if (explicit_vsx_p)
+   error ("%<-mvsx%> and %<-mno-altivec%> are incompatible");
+ else
+   warning (0, N_("%<-mno-altivec%> disables vsx"));
+ rs6000_isa_flags &= ~OPTION_MASK_VSX;
  rs6000_isa_flags_explicit |= OPTION_MASK_VSX;
}
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/warn-1.c b/gcc/testsuite/gcc.target/powerpc/error-1.c
similarity index 70%
rename from gcc/testsuite/gcc.target/powerpc/warn-1.c
rename to gcc/testsuite/gcc.target/powerpc/error-1.c
index 76ac0c4e26e5..d38eba8bb8ad 100644
--- a/gcc/testsuite/gcc.target/powerpc/warn-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/error-1.c
@@ -3,7 +3,7 @@
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O -mvsx -mno-altivec" } */
 
-/* { dg-warning "'-mvsx' and '-mno-altivec' are incompatible" "" { target *-*-* } 0 } */
+/* { dg-error "'-mvsx' and '-mno-altivec' are incompatible" "" { target *-*-* } 0 } */
 
 double
 foo (double *x, double *y)
@@ -16,4 +16,3 @@ foo (double *x, double *y)
   return z[0] * z[1];
 }
 
-/* { dg-final { scan-assembler-not "xsadddp" } } */


[gcc r15-2215] rs6000: Consider explicitly set options in target option parsing [PR115713]

2024-07-22 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:e6db8848d956f5e712dd621d33630b799ff60a72

commit r15-2215-ge6db8848d956f5e712dd621d33630b799ff60a72
Author: Kewen Lin 
Date:   Tue Jul 23 00:48:00 2024 -0500

rs6000: Consider explicitly set options in target option parsing [PR115713]

In rs6000_inner_target_options, when enabling VSX we enable
altivec and disable -mavoid-indexed-addresses implicitly,
but this doesn't consider the case where the options altivec
and avoid-indexed-addresses have been specified explicitly.
As the test case in PR115713#c1 shows, with target attribute
"no-altivec,vsx", VSX unexpectedly sets the altivec flag and
the expected error is not emitted.

This patch avoids the automatic enablement when they
are explicitly specified.  With this change, the existing
test case ppc-target-4.c also requires an adjustment: it
now specifies altivec explicitly in the target attribute
(since it requires the altivec feature and the command line
specifies no-altivec).
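
The guarded implicit enablement can be sketched as follows.  The mask
values and the function mask_to_enable are hypothetical and chosen
only for this illustration; they are not the rs6000 definitions.

```c
/* Hypothetical flag bits, not the real OPTION_MASK_* values.  */
#define MASK_ALTIVEC 0x1u
#define MASK_VSX     0x2u

/* Return the set of flags to turn on when REQUESTED is enabled.
   EXPLICIT_FLAGS holds the flags the user already specified, either
   way, and must not be overridden implicitly.  */
unsigned int
mask_to_enable (unsigned int requested, unsigned int explicit_flags)
{
  unsigned int mask = requested;
  if (requested == MASK_VSX && !(explicit_flags & MASK_ALTIVEC))
    mask |= MASK_ALTIVEC;
  return mask;
}
```

With "no-altivec,vsx" the altivec bit is already marked explicit when
vsx is processed, so it stays off and the later consistency check can
report the conflict.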

PR target/115713

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_inner_target_options): Avoid
enabling altivec or disabling avoid-indexed-addresses automatically
when they get specified explicitly.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr115713-1.c: New test.
* gcc.target/powerpc/ppc-target-4.c: Adjust by specifying altivec
in target attribute.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  7 +--
 gcc/testsuite/gcc.target/powerpc/ppc-target-4.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr115713-1.c   | 20 
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 019bb7ccc380..ce888d3caa65 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -24669,8 +24669,11 @@ rs6000_inner_target_options (tree args, bool attr_p)
  {
if (mask == OPTION_MASK_VSX)
  {
-   mask |= OPTION_MASK_ALTIVEC;
-   TARGET_AVOID_XFORM = 0;
+   if (!(rs6000_isa_flags_explicit
+ & OPTION_MASK_ALTIVEC))
+ mask |= OPTION_MASK_ALTIVEC;
+   if (!OPTION_SET_P (TARGET_AVOID_XFORM))
+ TARGET_AVOID_XFORM = 0;
  }
  }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c b/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c
index 43a98b353cf7..db9ba500e0e1 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-target-4.c
@@ -18,7 +18,7 @@
 #error "__VSX__ should not be defined."
 #endif
 
-#pragma GCC target("vsx")
+#pragma GCC target("altivec,vsx")
 #include <altivec.h>
 #pragma GCC reset_options
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr115713-1.c b/gcc/testsuite/gcc.target/powerpc/pr115713-1.c
new file mode 100644
index 000000000000..1b93a78682a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr115713-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* Force power7 to avoid possible error message on AltiVec ABI change.  */
+/* { dg-options "-mdejagnu-cpu=power7" } */
+
+/* Verify there is an error message for incompatible -maltivec and -mvsx
+   even when they are specified by target attributes.  */
+
+int __attribute__ ((target ("no-altivec,vsx")))
+test1 (void)
+{
+  /* { dg-error "'-mvsx' and '-mno-altivec' are incompatible" "" { target *-*-* } .-1 } */
+  return 0;
+}
+
+int __attribute__ ((target ("vsx,no-altivec")))
+test2 (void)
+{
+  /* { dg-error "'-mvsx' and '-mno-altivec' are incompatible" "" { target *-*-* } .-1 } */
+  return 0;
+}


[gcc r15-2216] rs6000: Update option set in rs6000_inner_target_options [PR115713]

2024-07-22 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:f4062e3615a32597afdb6c8066c87f680276

commit r15-2216-gf4062e3615a32597afdb6c8066c87f680276
Author: Kewen Lin 
Date:   Tue Jul 23 00:48:14 2024 -0500

rs6000: Update option set in rs6000_inner_target_options [PR115713]

When function rs6000_inner_target_options parses target
options, it updates the explicit option set information for
rs6000_opt_masks through rs6000_isa_flags_explicit, but it
fails to update that information for rs6000_opt_vars, which
can result in unexpected consequences, as the associated test
case shows.  This patch fixes rs6000_inner_target_options
to update the option set for rs6000_opt_vars as well.
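
The offset-based update can be illustrated in isolation.  Everything in
this sketch is hypothetical (struct options, opts, opts_set, opt_vars
and apply_opt_var); it only mirrors the idea of writing through the
same byte offset into a value structure and a parallel "was it set"
structure.

```c
#include <stddef.h>

/* Hypothetical option storage; opts plays the role of global_options
   and opts_set the role of global_options_set.  */
struct options
{
  int avoid_xform;
  int other;
};

struct options opts;
struct options opts_set;

/* One table entry: option name plus the offset of its variable.  */
struct opt_var
{
  const char *name;
  size_t offset;
};

const struct opt_var opt_vars[] = {
  { "avoid-indexed-addresses", offsetof (struct options, avoid_xform) },
};

/* Store the option value and record that it was given explicitly.  */
void
apply_opt_var (const struct opt_var *v, int invert)
{
  *(int *) ((char *) &opts + v->offset) = !invert;
  *(int *) ((char *) &opts_set + v->offset) = 1;
}
```

Without the second store, a later check of whether the option was set
explicitly would see 0 for options given through a target attribute or
pragma.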

PR target/115713

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_inner_target_options): Update option
set information for rs6000_opt_vars.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr115713-2.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.cc   |  3 ++-
 gcc/testsuite/gcc.target/powerpc/pr115713-2.c | 22 ++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index ce888d3caa65..85211565eb4c 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -24694,7 +24694,8 @@ rs6000_inner_target_options (tree args, bool attr_p)
if (strcmp (r, rs6000_opt_vars[i].name) == 0)
  {
size_t j = rs6000_opt_vars[i].global_offset;
-   *((int *) ((char *)&global_options + j)) = !invert;
+   *((int *) ((char *) &global_options + j)) = !invert;
+   *((int *) ((char *) &global_options_set + j)) = 1;
error_p = false;
not_valid_p = false;
break;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr115713-2.c b/gcc/testsuite/gcc.target/powerpc/pr115713-2.c
new file mode 100644
index 000000000000..47b39c0fabaf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr115713-2.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* Force power7 to avoid possible error message on AltiVec ABI change.  */
+/* { dg-options "-mdejagnu-cpu=power7" } */
+
+/* Verify there is an error message for -mvsx incompatible with
+   -mavoid-indexed-addresses even when they are specified by
+   target attributes.  */
+
+int __attribute__ ((target ("avoid-indexed-addresses,vsx")))
+test1 (void)
+{
+  /* { dg-error "'-mvsx' and '-mavoid-indexed-addresses' are incompatible" "" { target *-*-* } .-1 } */
+  return 0;
+}
+
+int __attribute__ ((target ("vsx,avoid-indexed-addresses")))
+test2 (void)
+{
+  /* { dg-error "'-mvsx' and '-mavoid-indexed-addresses' are incompatible" "" { target *-*-* } .-1 } */
+  return 0;
+}
+


[gcc r14-9851] testsuite: Add profile_update_atomic check to gcov-20.c [PR114614]

2024-04-08 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:9c97de682303b81c8886ac131fcfb3b122f2f1a6

commit r14-9851-g9c97de682303b81c8886ac131fcfb3b122f2f1a6
Author: Kewen Lin 
Date:   Mon Apr 8 21:02:17 2024 -0500

testsuite: Add profile_update_atomic check to gcov-20.c [PR114614]

As PR114614 shows, the test case gcov-20.c newly added by
commit r14-9789-g08a52331803f66 fails on targets which do
not support atomic profile update, with a message like:

  warning: target does not support atomic profile update,
   single mode is selected

Since the test case uses -fprofile-update=atomic, it
requires the effective target check profile_update_atomic.
This patch adds the check accordingly.

PR testsuite/114614

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-20.c: Add effective target check
profile_update_atomic.

Diff:
---
 gcc/testsuite/gcc.misc-tests/gcov-20.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-20.c b/gcc/testsuite/gcc.misc-tests/gcov-20.c
index 215faffc980..ca8c12aad2b 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-20.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-20.c
@@ -1,5 +1,6 @@
 /* { dg-options "-fcondition-coverage -ftest-coverage -fprofile-update=atomic" } */
 /* { dg-do run { target native } } */
+/* { dg-require-effective-target profile_update_atomic } */
 
 /* Some side effect to stop branches from being pruned */
 int x = 0;


[gcc r14-9850] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-08 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:26eb5f8fd173e2425ae7505528fc426de4b7e34c

commit r14-9850-g26eb5f8fd173e2425ae7505528fc426de4b7e34c
Author: Kewen Lin 
Date:   Mon Apr 8 21:01:36 2024 -0500

rs6000: Fix wrong align passed to build_aligned_type [PR88309]

As the comments in PR88309 show, there are two oversights
in rs6000_gimple_fold_builtin that pass the alignment in
bytes to build_aligned_type, which actually requires it in
bits.  This causes an unexpected ICE or hang in function
is_miss_rate_acceptable due to a zero align_unit value.

This patch fixes them by converting bytes to bits, adds
an assertion that align_unit is positive, and notes in the
function comment of build_aligned_type that align is
measured in bits.
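
The unit mismatch can be illustrated with a small sketch. It assumes 8-bit bytes and assumes that a byte-granular align_unit is obtained by dividing the bit alignment by the number of bits per byte; the commit message implies this but does not show it, and both helper names below are hypothetical.

```c
#include <limits.h>

/* Convert an alignment in bytes to the bit count that
   build_aligned_type expects.  */
static unsigned int
align_bytes_to_bits (unsigned int bytes)
{
  return bytes * CHAR_BIT;
}

/* Assumed model of how a byte-granular align_unit is derived from a
   type alignment kept in bits; this is not GCC's actual code.  */
static unsigned int
align_unit_from_bits (unsigned int bits)
{
  return bits / CHAR_BIT;
}
```

Under these assumptions, passing 4 where bits are expected yields align_unit_from_bits (4) == 0, so a later cache_line_size / align_unit would divide by zero, whereas align_bytes_to_bits (4) gives the 32 used in the fix.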

PR target/88309

Co-authored-by: Andrew Pinski 

gcc/ChangeLog:

* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Fix
wrong align passed to function build_aligned_type.
* tree-ssa-loop-prefetch.cc (is_miss_rate_acceptable): Add an
assertion to ensure align_unit should be positive.
* tree.cc (build_qualified_type): Update function comments.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr88309.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc|  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr88309.c | 27 +++
 gcc/tree-ssa-loop-prefetch.cc  |  2 ++
 gcc/tree.cc|  3 ++-
 4 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 6698274031b..e7d6204074c 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1900,7 +1900,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree lhs_type = TREE_TYPE (lhs);
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
  required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_ltype = build_aligned_type (lhs_type, 4);
+   tree align_ltype = build_aligned_type (lhs_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg0.  The resulting type will match
   the type of arg1.  */
@@ -1944,7 +1944,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree arg2_type = ptr_type_node;
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
   required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_stype = build_aligned_type (arg0_type, 4);
+   tree align_stype = build_aligned_type (arg0_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg1.  */
gimple_seq stmts = NULL;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88309.c b/gcc/testsuite/gcc.target/powerpc/pr88309.c
new file mode 100644
index 000..c0078cf2b8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr88309.c
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2 -fprefetch-loop-arrays" } */
+
+/* Verify there is no ICE or hanging.  */
+
+#include <altivec.h>
+
+void b(float *c, vector float a, vector float, vector float)
+{
+  vector float d;
+  vector char ahbc;
+  vec_xst(vec_perm(a, d, ahbc), 0, c);
+}
+
+vector float e(vector unsigned);
+
+void f() {
+  float *dst;
+  int g = 0;
+  for (;; g += 16) {
+vector unsigned m, i;
+vector unsigned n, j;
+vector unsigned k, l;
+b(dst + g * 3, e(m), e(n), e(k));
+b(dst + (g + 4) * 3, e(i), e(j), e(l));
+  }
+}
diff --git a/gcc/tree-ssa-loop-prefetch.cc b/gcc/tree-ssa-loop-prefetch.cc
index bbd98e03254..70073cc4fe4 100644
--- a/gcc/tree-ssa-loop-prefetch.cc
+++ b/gcc/tree-ssa-loop-prefetch.cc
@@ -739,6 +739,8 @@ is_miss_rate_acceptable (unsigned HOST_WIDE_INT cache_line_size,
   if (delta >= (HOST_WIDE_INT) cache_line_size)
 return false;
 
+  gcc_assert (align_unit > 0);
+
   miss_positions = 0;
   total_positions = (cache_line_size / align_unit) * distinct_iters;
   max_allowed_miss_positions = (ACCEPTABLE_MISS_RATE * total_positions) / 1000;
diff --git a/gcc/tree.cc b/gcc/tree.cc
index f801712c9dd..787168e9255 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -5689,7 +5689,8 @@ build_qualified_type (tree type, int type_quals MEM_STAT_DECL)
   return t;
 }
 
-/* Create a variant of type T with alignment ALIGN.  */
+/* Create a variant of type T with alignment ALIGN which
+   is measured in bits.  */
 
 tree
 build_aligned_type (tree type, unsigned int align)


[gcc r14-9886] testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662]

2024-04-10 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:4923ed49b93352bcf9e43cafac38345e4a54c3f8

commit r14-9886-g4923ed49b93352bcf9e43cafac38345e4a54c3f8
Author: Kewen Lin 
Date:   Wed Apr 10 02:59:43 2024 -0500

testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662]

pr113359-2_*.c define a struct with unsigned long members
ay and az, which are 4 bytes wide at -m32, while the
related constants CL1 and CL2 used for the equality checks
are always 8 bytes.  This makes the compiler consider the
check below

  69   if (a.ay != CL1)
  70 __builtin_abort ();

to always abort and optimize away the following call to
getb, so the expected wpa dump line on "Semantic equality"
is missing.

This patch changes the types to unsigned long long
accordingly.
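
The width mismatch can be sketched with fixed-width types, so that the effect does not depend on -m32. The value of CL1 below is hypothetical (the real constant is not shown in this message), and both function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical 8-byte constant; the real CL1 from the test is not
   shown in this message.  */
#define CL1 0x1122334455667788ULL

/* A 4-byte member, as unsigned long is at -m32: the store truncates,
   so the later comparison against the 8-byte constant can never hold.  */
static int
narrow_member_equals_cl1 (void)
{
  uint32_t ay = (uint32_t) CL1;
  return ay == CL1;
}

/* An 8-byte member, as with unsigned long long: the value survives.  */
static int
wide_member_equals_cl1 (void)
{
  uint64_t ay = CL1;
  return ay == CL1;
}
```

A compiler can evaluate the narrow comparison at compile time, which is how the abort becomes unconditional and the code after it becomes dead.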

PR testsuite/114662

gcc/testsuite/ChangeLog:

* gcc.dg/lto/pr113359-2_0.c: Use unsigned long long instead of
unsigned long.
* gcc.dg/lto/pr113359-2_1.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.dg/lto/pr113359-2_0.c | 8 
 gcc/testsuite/gcc.dg/lto/pr113359-2_1.c | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c b/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c
index 8b2d5bdfab2..8495667599d 100644
--- a/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c
+++ b/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c
@@ -8,15 +8,15 @@
 struct SA
 {
   unsigned int ax;
-  unsigned long ay;
-  unsigned long az;
+  unsigned long long ay;
+  unsigned long long az;
 };
 
 struct SB
 {
   unsigned int bx;
-  unsigned long by;
-  unsigned long bz;
+  unsigned long long by;
+  unsigned long long bz;
 };
 
 struct ZA
diff --git a/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c b/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c
index 61bc0547981..8320f347efe 100644
--- a/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c
+++ b/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c
@@ -5,15 +5,15 @@
 struct SA
 {
   unsigned int ax;
-  unsigned long ay;
-  unsigned long az;
+  unsigned long long ay;
+  unsigned long long az;
 };
 
 struct SB
 {
   unsigned int bx;
-  unsigned long by;
-  unsigned long bz;
+  unsigned long long by;
+  unsigned long long bz;
 };
 
 struct ZA


[gcc r14-10011] testsuite, rs6000: Fix builtins-6-p9-runnable.c for BE [PR114744]

2024-04-17 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:6e62ede7aaccc6ebe027c8e00224f65e226072e9

commit r14-10011-g6e62ede7aaccc6ebe027c8e00224f65e226072e9
Author: Kewen Lin 
Date:   Wed Apr 17 22:20:07 2024 -0500

testsuite, rs6000: Fix builtins-6-p9-runnable.c for BE [PR114744]

Test case builtins-6-p9-runnable.c doesn't work well on BE
due to two problems:
  - When applying vec_xl_len to data_128 and data_u128
    with length 8, the test expects to load 128[01] from
    memory.  But when 128[01] is assigned to a {vector}
    {u,}int128 type variable, the value isn't guaranteed
    to be at the beginning of the storage (in the low part
    of memory), so the loaded value can be unexpected (as
    shown on BE).  This patch introduces getU128, which
    ensures the given value shows up as expected, and also
    updates some dumping code for debugging.
  - When applying vec_xl_len_r with length 16, on BE it
    behaves just like a normal vector load, so the expected
    data should not be reversed from the original.
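
The first problem can be shown at a smaller scale. The sketch below is a scaled-down analog, not the test's code: a 64-bit integer and a 4-byte partial load stand in for the 128-bit vector and vec_xl_len with length 8, and U64, value_at_low_address, load_prefix and is_little_endian are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 on a little-endian host, 0 otherwise.  */
static int
is_little_endian (void)
{
  uint16_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 1;
}

/* Partial load: read only the first 4 bytes (lowest addresses) of V.  */
static uint32_t
load_prefix (uint64_t v)
{
  uint32_t out;
  memcpy (&out, &v, sizeof out);
  return out;
}

/* Analog of the U128 union in the patch: overlay the wide integer
   with two halves so that a value can be placed at the low address.  */
typedef union
{
  uint64_t whole;
  struct
  {
    uint32_t first;
    uint32_t second;
  } parts;
} U64;

/* Analog of getU128: store VALUE at the beginning of the storage.  */
static uint64_t
value_at_low_address (uint32_t value)
{
  U64 u;
  u.parts.first = value;
  u.parts.second = 0;
  return u.whole;
}
```

A plain assignment of 1280 followed by load_prefix returns 1280 only on little-endian hosts; on big-endian the low-order bytes sit at the high addresses and the prefix reads as 0. Building the value through the union makes the partial load return 1280 on either endianness.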

PR testsuite/114744

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/builtins-6-p9-runnable.c: Adjust for BE by fixing
data_{u,}128, their uses and vec_uc_expected1, also adjust some formats.

Diff:
---
 .../gcc.target/powerpc/builtins-6-p9-runnable.c| 119 +++--
 1 file changed, 64 insertions(+), 55 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-6-p9-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-6-p9-runnable.c
index 20fdd3bb4ec..36101c2b861 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-6-p9-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-6-p9-runnable.c
@@ -337,6 +337,30 @@ void print_f (vector float vec_expected,
 }
 #endif
 
+typedef union
+{
+  vector __int128_t i1;
+  __int128_t i2;
+  vector __uint128_t u1;
+  __uint128_t u2;
+  struct
+  {
+long long d1;
+long long d2;
+  };
+} U128;
+
+/* For a given long long VALUE, ensure it's stored from the beginning
+   of {u,}int128 memory storage (the low address), it avoids to load
+   unexpected data without the whole vector length.  */
+
+static inline void
+getU128 (U128 *pu, unsigned long long value)
+{
+  pu->d1 = value;
+  pu->d2 = 0;
+}
+
 int main() {
   int i, j;
   size_t len;
@@ -835,21 +859,24 @@ int main() {
 #endif
 }
 
+  U128 u_temp;
   /* vec_xl_len() tests */
   for (i = 0; i < 100; i++)
 {
-  data_c[i] = i;
-  data_uc[i] = i+1;
-  data_ssi[i] = i+10;
-  data_usi[i] = i+11;
-  data_si[i] = i+100;
-  data_ui[i] = i+101;
-  data_sll[i] = i+1000;
-  data_ull[i] = i+1001;
-  data_f[i] = i+10.0;
-  data_d[i] = i+100.0;
-  data_128[i] = i + 1280;
-  data_u128[i] = i + 1281;
+   data_c[i] = i;
+   data_uc[i] = i + 1;
+   data_ssi[i] = i + 10;
+   data_usi[i] = i + 11;
+   data_si[i] = i + 100;
+   data_ui[i] = i + 101;
+   data_sll[i] = i + 1000;
+   data_ull[i] = i + 1001;
+   data_f[i] = i + 10.0;
+   data_d[i] = i + 100.0;
+   getU128 (&u_temp, i + 1280);
+   data_128[i] = u_temp.i2;
+   getU128 (&u_temp, i + 1281);
+   data_u128[i] = u_temp.u2;
 }
 
   len = 16;
@@ -1160,34 +1187,38 @@ int main() {
 #endif
 }
 
-  vec_s128_expected1 = (vector __int128_t){1280};
+  getU128 (&u_temp, 1280);
+  vec_s128_expected1 = u_temp.i1;
   vec_s128_result1 = vec_xl_len (data_128, len);
 
   if (vec_s128_expected1[0] != vec_s128_result1[0])
 {
 #ifdef DEBUG
-   printf("Error: vec_xl_len(), len = %d, vec_s128_result1[0] = %lld %llu; ",
- len, vec_s128_result1[0] >> 64,
- vec_s128_result1[0] & (__int128_t)0x);
-   printf("vec_s128_expected1[0] = %lld %llu\n",
- vec_s128_expected1[0] >> 64,
- vec_s128_expected1[0] & (__int128_t)0x);
+  U128 u1, u2;
+  u1.i1 = vec_s128_result1;
+  u2.i1 = vec_s128_expected1;
+  printf ("Error: vec_xl_len(), len = %d,"
+ "vec_s128_result1[0] = %llx %llx; ",
+ len, u1.d1, u1.d2);
+  printf ("vec_s128_expected1[0] = %llx %llx\n", u2.d1, u2.d2);
 #else
abort ();
 #endif
 }
 
   vec_u128_result1 = vec_xl_len (data_u128, len);
-  vec_u128_expected1 = (vector __uint128_t){1281};
+  getU128 (&u_temp, 1281);
+  vec_u128_expected1 = u_temp.u1;
   if (vec_u128_expected1[0] != vec_u128_result1[0])
 #ifdef DEBUG
 {
-   printf("Error: vec_xl_len(), len = %d, vec_u128_result1[0] = %lld; ",
- len, vec_u128_result1[0] >> 64,
- vec_u128_result1[0] & (__int128_t)0x);
-   printf("vec_u128_expected1[0] = %lld\n",
- vec_u128_expected1[0] >> 64,
- vec_u128_expected1[0] & (__int128_t)0x);
+  U128 u1, u2;
+  u1.u1 = vec_u128_result1;
+ 

[gcc r13-8646] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-24 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:a9f174f01f25fa6df989707dc2fec29ef78aad24

commit r13-8646-ga9f174f01f25fa6df989707dc2fec29ef78aad24
Author: Kewen Lin 
Date:   Mon Apr 8 21:01:36 2024 -0500

rs6000: Fix wrong align passed to build_aligned_type [PR88309]

As the comments in PR88309 show, there are two oversights
in rs6000_gimple_fold_builtin that pass the alignment in
bytes to build_aligned_type, which actually requires it in
bits.  This causes an unexpected ICE or hang in function
is_miss_rate_acceptable due to a zero align_unit value.

This patch fixes them by converting bytes to bits, adds
an assertion that align_unit is positive, and notes in the
function comment of build_aligned_type that align is
measured in bits.

PR target/88309

Co-authored-by: Andrew Pinski 

gcc/ChangeLog:

* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Fix
wrong align passed to function build_aligned_type.
* tree-ssa-loop-prefetch.cc (is_miss_rate_acceptable): Add an
assertion to ensure align_unit should be positive.
* tree.cc (build_qualified_type): Update function comments.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr88309.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc|  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr88309.c | 27 +++
 gcc/tree-ssa-loop-prefetch.cc  |  2 ++
 gcc/tree.cc|  3 ++-
 4 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 534698e7d3e..2b4412e0403 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1896,7 +1896,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree lhs_type = TREE_TYPE (lhs);
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
  required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_ltype = build_aligned_type (lhs_type, 4);
+   tree align_ltype = build_aligned_type (lhs_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg0.  The resulting type will match
   the type of arg1.  */
@@ -1940,7 +1940,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree arg2_type = ptr_type_node;
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
   required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_stype = build_aligned_type (arg0_type, 4);
+   tree align_stype = build_aligned_type (arg0_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg1.  */
gimple_seq stmts = NULL;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88309.c b/gcc/testsuite/gcc.target/powerpc/pr88309.c
new file mode 100644
index 000..c0078cf2b8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr88309.c
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2 -fprefetch-loop-arrays" } */
+
+/* Verify there is no ICE or hanging.  */
+
+#include <altivec.h>
+
+void b(float *c, vector float a, vector float, vector float)
+{
+  vector float d;
+  vector char ahbc;
+  vec_xst(vec_perm(a, d, ahbc), 0, c);
+}
+
+vector float e(vector unsigned);
+
+void f() {
+  float *dst;
+  int g = 0;
+  for (;; g += 16) {
+vector unsigned m, i;
+vector unsigned n, j;
+vector unsigned k, l;
+b(dst + g * 3, e(m), e(n), e(k));
+b(dst + (g + 4) * 3, e(i), e(j), e(l));
+  }
+}
diff --git a/gcc/tree-ssa-loop-prefetch.cc b/gcc/tree-ssa-loop-prefetch.cc
index 130c00f3b3a..5a79a9e6a5e 100644
--- a/gcc/tree-ssa-loop-prefetch.cc
+++ b/gcc/tree-ssa-loop-prefetch.cc
@@ -739,6 +739,8 @@ is_miss_rate_acceptable (unsigned HOST_WIDE_INT cache_line_size,
   if (delta >= (HOST_WIDE_INT) cache_line_size)
 return false;
 
+  gcc_assert (align_unit > 0);
+
   miss_positions = 0;
   total_positions = (cache_line_size / align_unit) * distinct_iters;
   max_allowed_miss_positions = (ACCEPTABLE_MISS_RATE * total_positions) / 1000;
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 207293c48cb..1d1c240b257 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -5660,7 +5660,8 @@ build_qualified_type (tree type, int type_quals MEM_STAT_DECL)
   return t;
 }
 
-/* Create a variant of type T with alignment ALIGN.  */
+/* Create a variant of type T with alignment ALIGN which
+   is measured in bits.  */
 
 tree
 build_aligned_type (tree type, unsigned int align)


[gcc r12-10393] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-24 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:43c8cb0e003996b3a7a9f98923f602561f3f0ec7

commit r12-10393-g43c8cb0e003996b3a7a9f98923f602561f3f0ec7
Author: Kewen Lin 
Date:   Mon Apr 8 21:01:36 2024 -0500

rs6000: Fix wrong align passed to build_aligned_type [PR88309]

As the comments in PR88309 show, there are two oversights
in rs6000_gimple_fold_builtin that pass the alignment in
bytes to build_aligned_type, which actually requires it in
bits.  This causes an unexpected ICE or hang in function
is_miss_rate_acceptable due to a zero align_unit value.

This patch fixes them by converting bytes to bits, adds
an assertion that align_unit is positive, and notes in the
function comment of build_aligned_type that align is
measured in bits.

PR target/88309

Co-authored-by: Andrew Pinski 

gcc/ChangeLog:

* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Fix
wrong align passed to function build_aligned_type.
* tree-ssa-loop-prefetch.cc (is_miss_rate_acceptable): Add an
assertion to ensure align_unit should be positive.
* tree.cc (build_qualified_type): Update function comments.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr88309.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc|  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr88309.c | 27 +++
 gcc/tree-ssa-loop-prefetch.cc  |  2 ++
 gcc/tree.cc|  3 ++-
 4 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index cc385a2b277..39a07a27c86 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1920,7 +1920,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree lhs_type = TREE_TYPE (lhs);
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
  required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_ltype = build_aligned_type (lhs_type, 4);
+   tree align_ltype = build_aligned_type (lhs_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg0.  The resulting type will match
   the type of arg1.  */
@@ -1964,7 +1964,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree arg2_type = ptr_type_node;
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
   required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_stype = build_aligned_type (arg0_type, 4);
+   tree align_stype = build_aligned_type (arg0_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg1.  */
gimple_seq stmts = NULL;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88309.c b/gcc/testsuite/gcc.target/powerpc/pr88309.c
new file mode 100644
index 000..c0078cf2b8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr88309.c
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2 -fprefetch-loop-arrays" } */
+
+/* Verify there is no ICE or hanging.  */
+
+#include <altivec.h>
+
+void b(float *c, vector float a, vector float, vector float)
+{
+  vector float d;
+  vector char ahbc;
+  vec_xst(vec_perm(a, d, ahbc), 0, c);
+}
+
+vector float e(vector unsigned);
+
+void f() {
+  float *dst;
+  int g = 0;
+  for (;; g += 16) {
+vector unsigned m, i;
+vector unsigned n, j;
+vector unsigned k, l;
+b(dst + g * 3, e(m), e(n), e(k));
+b(dst + (g + 4) * 3, e(i), e(j), e(l));
+  }
+}
diff --git a/gcc/tree-ssa-loop-prefetch.cc b/gcc/tree-ssa-loop-prefetch.cc
index aebd7c9206f..543e142f85d 100644
--- a/gcc/tree-ssa-loop-prefetch.cc
+++ b/gcc/tree-ssa-loop-prefetch.cc
@@ -739,6 +739,8 @@ is_miss_rate_acceptable (unsigned HOST_WIDE_INT cache_line_size,
   if (delta >= (HOST_WIDE_INT) cache_line_size)
 return false;
 
+  gcc_assert (align_unit > 0);
+
   miss_positions = 0;
   total_positions = (cache_line_size / align_unit) * distinct_iters;
   max_allowed_miss_positions = (ACCEPTABLE_MISS_RATE * total_positions) / 1000;
diff --git a/gcc/tree.cc b/gcc/tree.cc
index e6593de87b6..ead4c1421cd 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -5649,7 +5649,8 @@ build_qualified_type (tree type, int type_quals MEM_STAT_DECL)
   return t;
 }
 
-/* Create a variant of type T with alignment ALIGN.  */
+/* Create a variant of type T with alignment ALIGN which
+   is measured in bits.  */
 
 tree
 build_aligned_type (tree type, unsigned int align)


[gcc r11-11363] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-24 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:02f1b5361188c9d833cef39caf723d31d44ba5d5

commit r11-11363-g02f1b5361188c9d833cef39caf723d31d44ba5d5
Author: Kewen Lin 
Date:   Mon Apr 8 21:01:36 2024 -0500

rs6000: Fix wrong align passed to build_aligned_type [PR88309]

As the comments in PR88309 show, there are two oversights
in rs6000_gimple_fold_builtin that pass the alignment in
bytes to build_aligned_type, which actually requires it in
bits.  This causes an unexpected ICE or hang in function
is_miss_rate_acceptable due to a zero align_unit value.

This patch fixes them by converting bytes to bits, adds
an assertion that align_unit is positive, and notes in the
function comment of build_aligned_type that align is
measured in bits.

PR target/88309

Co-authored-by: Andrew Pinski 

gcc/ChangeLog:

* config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): Fix
wrong align passed to function build_aligned_type.
* tree-ssa-loop-prefetch.c (is_miss_rate_acceptable): Add an
assertion to ensure align_unit should be positive.
* tree.c (build_qualified_type): Update function comments.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr88309.c: New test.

(cherry picked from commit 26eb5f8fd173e2425ae7505528fc426de4b7e34c)

Diff:
---
 gcc/config/rs6000/rs6000-call.c|  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr88309.c | 27 +++
 gcc/tree-ssa-loop-prefetch.c   |  2 ++
 gcc/tree.c |  3 ++-
 4 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 1be4797e834..c555f7857d1 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -12658,7 +12658,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree lhs_type = TREE_TYPE (lhs);
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
  required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_ltype = build_aligned_type (lhs_type, 4);
+   tree align_ltype = build_aligned_type (lhs_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg0.  The resulting type will match
   the type of arg1.  */
@@ -12702,7 +12702,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
tree arg2_type = ptr_type_node;
/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
   required alignment (power) is 4 bytes regardless of data type.  */
-   tree align_stype = build_aligned_type (arg0_type, 4);
+   tree align_stype = build_aligned_type (arg0_type, 32);
/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
   the tree using the value from arg1.  */
gimple_seq stmts = NULL;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88309.c b/gcc/testsuite/gcc.target/powerpc/pr88309.c
new file mode 100644
index 000..c0078cf2b8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr88309.c
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2 -fprefetch-loop-arrays" } */
+
+/* Verify there is no ICE or hanging.  */
+
+#include <altivec.h>
+
+void b(float *c, vector float a, vector float, vector float)
+{
+  vector float d;
+  vector char ahbc;
+  vec_xst(vec_perm(a, d, ahbc), 0, c);
+}
+
+vector float e(vector unsigned);
+
+void f() {
+  float *dst;
+  int g = 0;
+  for (;; g += 16) {
+vector unsigned m, i;
+vector unsigned n, j;
+vector unsigned k, l;
+b(dst + g * 3, e(m), e(n), e(k));
+b(dst + (g + 4) * 3, e(i), e(j), e(l));
+  }
+}
diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c
index 98062eb4616..8d73b14dfc4 100644
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -739,6 +739,8 @@ is_miss_rate_acceptable (unsigned HOST_WIDE_INT cache_line_size,
   if (delta >= (HOST_WIDE_INT) cache_line_size)
 return false;
 
+  gcc_assert (align_unit > 0);
+
   miss_positions = 0;
   total_positions = (cache_line_size / align_unit) * distinct_iters;
   max_allowed_miss_positions = (ACCEPTABLE_MISS_RATE * total_positions) / 1000;
diff --git a/gcc/tree.c b/gcc/tree.c
index 79e03204a6e..8b5b0b7508c 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -6711,7 +6711,8 @@ build_qualified_type (tree type, int type_quals MEM_STAT_DECL)
   return t;
 }
 
-/* Create a variant of type T with alignment ALIGN.  */
+/* Create a variant of type T with alignment ALIGN which
+   is measured in bits.  */
 
 tree
 build_aligned_type (tree type, unsigned int align)


[gcc r15-2427] rs6000: Use standard name uabd for absdu insns

2024-07-30 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:169341f0893a009736f9715db969909880d0e876

commit r15-2427-g169341f0893a009736f9715db969909880d0e876
Author: Kewen Lin 
Date:   Tue Jul 30 21:20:51 2024 -0500

rs6000: Use standard name uabd for absdu insns

r14-1832 added a recognition pattern, ifn and optab for ABD
(ABsolute Difference).  We have had some vector absolute
difference unsigned instructions since ISA 3.0, but as the
associated test cases show, they are not exploited well
because we don't define them with a standard name.  So this
patch renames the pattern to the standard name first.  It
also merges the define_expand and define_insn, as a separate
define_expand isn't needed.  Besides, it adjusts the RTL
pattern to use generic umax and umin rather than
UNSPEC_VADU, which is more meaningful and can catch umin/umax
opportunities.
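
The umax/umin form of the pattern corresponds to the following scalar identity. This is only an illustrative sketch for one 8-bit element; uabd_u8 is a hypothetical helper, not part of the patch.

```c
#include <stdint.h>

/* Unsigned absolute difference written as max minus min, the same
   shape as the (minus (umax a b) (umin a b)) RTL in the new pattern.  */
static uint8_t
uabd_u8 (uint8_t a, uint8_t b)
{
  uint8_t hi = a > b ? a : b;  /* umax */
  uint8_t lo = a > b ? b : a;  /* umin */
  return (uint8_t) (hi - lo);
}
```

Because the larger operand is always the minuend, the subtraction cannot wrap, which is what distinguishes this from a plain a - b on unsigned elements.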

gcc/ChangeLog:

* config/rs6000/altivec.md (p9_vadu<mode>3): Rename to ...
(uabd<mode>3): ... this.  Update RTL pattern with umin and umax rather
than UNSPEC_VADU.
(vadu<mode>3): Remove.
(UNSPEC_VADU): Remove.
(usadv16qi): Replace gen_p9_vaduv16qi3 with gen_uabdv16qi3.
(usadv8hi): Replace gen_p9_vaduv8hi3 with gen_uabdv8hi3.
* config/rs6000/rs6000-builtins.def (__builtin_altivec_vadub): Replace
expander with uabdv16qi3.
(__builtin_altivec_vaduh): Adjust expander with uabdv8hi3.
(__builtin_altivec_vaduw): Adjust expander with uabdv4si3.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/abd-vectorize-1.c: New test.
* gcc.target/powerpc/abd-vectorize-2.c: New test.

Diff:
---
 gcc/config/rs6000/altivec.md   | 25 ++-
 gcc/config/rs6000/rs6000-builtins.def  |  6 ++--
 gcc/testsuite/gcc.target/powerpc/abd-vectorize-1.c | 27 
 gcc/testsuite/gcc.target/powerpc/abd-vectorize-2.c | 37 ++
 4 files changed, 77 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 5af9bf920a2e..aa9d8fffc901 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -119,7 +119,6 @@
UNSPEC_STVLXL
UNSPEC_STVRX
UNSPEC_STVRXL
-   UNSPEC_VADU
UNSPEC_VSLV
UNSPEC_VSRV
UNSPEC_VMULWHUB
@@ -4323,19 +4322,15 @@
   [(set_attr "type" "vecsimple")])
 
 ;; Vector absolute difference unsigned
-(define_expand "vadu<mode>3"
-  [(set (match_operand:VI 0 "register_operand")
-(unspec:VI [(match_operand:VI 1 "register_operand")
-   (match_operand:VI 2 "register_operand")]
- UNSPEC_VADU))]
-  "TARGET_P9_VECTOR")
-
-;; Vector absolute difference unsigned
-(define_insn "p9_vadu<mode>3"
+(define_insn "uabd<mode>3"
   [(set (match_operand:VI 0 "register_operand" "=v")
-(unspec:VI [(match_operand:VI 1 "register_operand" "v")
-   (match_operand:VI 2 "register_operand" "v")]
- UNSPEC_VADU))]
+   (minus:VI
+ (umax:VI
+   (match_operand:VI 1 "register_operand" "v")
+   (match_operand:VI 2 "register_operand" "v"))
+ (umin:VI
+   (match_dup 1)
+   (match_dup 2))))]
   "TARGET_P9_VECTOR"
   "vabsdu %0,%1,%2"
   [(set_attr "type" "vecsimple")])
@@ -4500,7 +4495,7 @@
   rtx zero = gen_reg_rtx (V4SImode);
   rtx psum = gen_reg_rtx (V4SImode);
 
-  emit_insn (gen_p9_vaduv16qi3 (absd, operands[1], operands[2]));
+  emit_insn (gen_uabdv16qi3 (absd, operands[1], operands[2]));
   emit_insn (gen_altivec_vspltisw (zero, const0_rtx));
   emit_insn (gen_altivec_vsum4ubs (psum, absd, zero));
   emit_insn (gen_addv4si3 (operands[0], psum, operands[3]));
@@ -4521,7 +4516,7 @@
   rtx zero = gen_reg_rtx (V4SImode);
   rtx psum = gen_reg_rtx (V4SImode);
 
-  emit_insn (gen_p9_vaduv8hi3 (absd, operands[1], operands[2]));
+  emit_insn (gen_uabdv8hi3 (absd, operands[1], operands[2]));
   emit_insn (gen_altivec_vspltisw (zero, const0_rtx));
   emit_insn (gen_altivec_vsum4shs (psum, absd, zero));
   emit_insn (gen_addv4si3 (operands[0], psum, operands[3]));
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 12d131d016d6..0c3c884c1104 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2345,13 +2345,13 @@
 VFIRSTMISMATCHOREOSINDEX_V4SI first_mismatch_or_eos_index_v4si {}
 
   const vsc __builtin_altivec_vadub (vsc, vsc);
-VADUB vaduv16qi3 {}
+VADUB uabdv16qi3 {}
 
   const vss __builtin_altivec_vaduh (vss, vss);
-VADUH vaduv8hi3 {}
+VADUH uabdv8hi3 {}
 
   const vsi __builtin_altivec_vaduw (vsi, vsi);
-VADUW vaduv4si3 {}
+VADUW uabdv4si3 {}
 
   const vsll __builtin_altivec_vbpermd (vsll, vsc);
 VBPERMD altivec_vbpermd {}
diff --git a/gcc/testsuite/gcc.target/powerpc/abd-vectorize-1.c b/gcc/testsuite/gcc.target/powerpc/abd-vectorize-1.c
new file mode 100644
index ..d63b887b4b8f
--- /d

[gcc r15-2428] rs6000: Relax some FLOAT128 expander condition for FLOAT128_IEEE_P [PR105359]

2024-07-30 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:993a3c0894c487dce5efc6cfb5b31a8358905e8f

commit r15-2428-g993a3c0894c487dce5efc6cfb5b31a8358905e8f
Author: Kewen Lin 
Date:   Tue Jul 30 21:21:15 2024 -0500

rs6000: Relax some FLOAT128 expander condition for FLOAT128_IEEE_P [PR105359]

As PR105359 shows, we disable some FLOAT128 expanders for
64-bit long double, but in fact IEEE float128 types like
__ieee128 are only guarded with TARGET_FLOAT128_TYPE and
TARGET_LONG_DOUBLE_128 is only checked when determining if
we can reuse long_double_type_node.  So this patch is to
relax all affected FLOAT128 expander conditions for
FLOAT128_IEEE_P.  By the way, currently IBM double double
type __ibm128 is guarded by TARGET_LONG_DOUBLE_128, so we
have to use TARGET_LONG_DOUBLE_128 for it.  IMHO, it's not
necessary and can be enhanced later.
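
The relaxed expander condition can be read as a small predicate. The sketch below models it with plain booleans; the parameter names are hypothetical stand-ins for TARGET_HARD_FLOAT, TARGET_LONG_DOUBLE_128 and the result of FLOAT128_IEEE_P for the mode, and the functions are not part of GCC.

```c
#include <stdbool.h>

/* Condition before the patch: IEEE128 modes were rejected whenever
   long double was 64-bit.  */
static bool
old_expander_enabled (bool hard_float, bool long_double_128, bool ieee_mode)
{
  (void) ieee_mode;  /* Not consulted by the old condition.  */
  return hard_float && long_double_128;
}

/* Condition after the patch: IEEE128 modes no longer depend on the
   long double width; other modes such as IBM double double still do.  */
static bool
new_expander_enabled (bool hard_float, bool long_double_128, bool ieee_mode)
{
  return hard_float && (long_double_128 || ieee_mode);
}
```

The only input for which the two differ is hard float with 64-bit long double and an IEEE128 mode, which is the case reported in the PR.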

Btw, for all test cases mentioned in PR105359, I removed
the xfails and tested them with explicit -mlong-double-64.
Both pr79004.c and float128-hw.c pass, while
float128-hw4.c isn't tested (it is unsupported because 64-bit
long double conflicts with -mabi=ieeelongdouble).

PR target/105359

gcc/ChangeLog:

* config/rs6000/rs6000.md (@extenddf2): Don't check
TARGET_LONG_DOUBLE_128 for FLOAT128_IEEE_P modes.
(extendsf2): Likewise.
(truncdf2): Likewise.
(truncsf2): Likewise.
(floatsi2): Likewise.
(fix_truncsi2): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr79004.c: Remove xfails.

Diff:
---
 gcc/config/rs6000/rs6000.md| 18 --
 gcc/testsuite/gcc.target/powerpc/pr79004.c | 14 ++
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index cfb22a3cb7da..d352a1431add 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8845,7 +8845,8 @@
 (define_expand "@extenddf2"
   [(set (match_operand:FLOAT128 0 "gpc_reg_operand")
(float_extend:FLOAT128 (match_operand:DF 1 "gpc_reg_operand")))]
-  "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128"
+  "TARGET_HARD_FLOAT
+   && (TARGET_LONG_DOUBLE_128 || FLOAT128_IEEE_P (mode))"
 {
   if (FLOAT128_IEEE_P (mode))
 rs6000_expand_float128_convert (operands[0], operands[1], false);
@@ -8903,7 +8904,8 @@
 (define_expand "extendsf2"
   [(set (match_operand:FLOAT128 0 "gpc_reg_operand")
(float_extend:FLOAT128 (match_operand:SF 1 "gpc_reg_operand")))]
-  "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128"
+  "TARGET_HARD_FLOAT
+   && (TARGET_LONG_DOUBLE_128 || FLOAT128_IEEE_P (mode))"
 {
   if (FLOAT128_IEEE_P (mode))
 rs6000_expand_float128_convert (operands[0], operands[1], false);
@@ -8919,7 +8921,8 @@
 (define_expand "truncdf2"
   [(set (match_operand:DF 0 "gpc_reg_operand")
(float_truncate:DF (match_operand:FLOAT128 1 "gpc_reg_operand")))]
-  "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128"
+  "TARGET_HARD_FLOAT
+   && (TARGET_LONG_DOUBLE_128 || FLOAT128_IEEE_P (mode))"
 {
   if (FLOAT128_IEEE_P (mode))
 {
@@ -8956,7 +8959,8 @@
 (define_expand "truncsf2"
   [(set (match_operand:SF 0 "gpc_reg_operand")
(float_truncate:SF (match_operand:FLOAT128 1 "gpc_reg_operand")))]
-  "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128"
+  "TARGET_HARD_FLOAT
+   && (TARGET_LONG_DOUBLE_128 || FLOAT128_IEEE_P (mode))"
 {
   if (FLOAT128_IEEE_P (mode))
 rs6000_expand_float128_convert (operands[0], operands[1], false);
@@ -8973,7 +8977,8 @@
   [(parallel [(set (match_operand:FLOAT128 0 "gpc_reg_operand")
   (float:FLOAT128 (match_operand:SI 1 "gpc_reg_operand")))
  (clobber (match_scratch:DI 2))])]
-  "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128"
+  "TARGET_HARD_FLOAT
+   && (TARGET_LONG_DOUBLE_128 || FLOAT128_IEEE_P (mode))"
 {
   rtx op0 = operands[0];
   rtx op1 = operands[1];
@@ -9009,7 +9014,8 @@
 (define_expand "fix_truncsi2"
   [(set (match_operand:SI 0 "gpc_reg_operand")
(fix:SI (match_operand:FLOAT128 1 "gpc_reg_operand")))]
-  "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128"
+  "TARGET_HARD_FLOAT
+   && (TARGET_LONG_DOUBLE_128 || FLOAT128_IEEE_P (mode))"
 {
   rtx op0 = operands[0];
   rtx op1 = operands[1];
diff --git a/gcc/testsuite/gcc.target/powerpc/pr79004.c b/gcc/testsuite/gcc.target/powerpc/pr79004.c
index 60c576cd36b6..ac89a4c9f327 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr79004.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr79004.c
@@ -100,12 +100,10 @@ void to_uns_short_store_n (TYPE a, unsigned short *p, long n) { p[n] = (unsigned
 void to_uns_int_store_n (TYPE a, unsigned int *p, long n) { p[n] = (unsigned int)a; }
 void to_uns_long_store_n (TYPE a, unsigned long *p, long n) { p[n] = (unsigned long)a; }
 
-/* On targets with 64-bit long double, some opcodes to deal with __float128 are
-   disabled, see PR target/105359.  */
-/* { dg-final { scan-assembler-not

[gcc r15-2658] testsuite: Adjust fam-in-union-alone-in-struct-2.c to support BE [PR116148]

2024-08-01 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:86e2dc89c5b8c9d9cca649a34a650e381a05b3a5

commit r15-2658-g86e2dc89c5b8c9d9cca649a34a650e381a05b3a5
Author: Kewen Lin 
Date:   Thu Aug 1 19:29:22 2024 -0500

testsuite: Adjust fam-in-union-alone-in-struct-2.c to support BE [PR116148]

As Andrew pointed out in PR116148, fam-in-union-alone-in-struct-2.c
was designed for little-endian; the recent commit r15-2403 made it
run on BE as well, which exposed PR116148.

This patch is to adjust the expected data for members in with_fam_2_v
and with_fam_3_v by considering endianness, also update with_fam_3_v.b[1]
from 0x5f6f7f7f to 0x5f6f7f8f to avoid two "7f"s.
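
As a rough illustration of why the expected bytes differ, the sketch
below models how a char array overlaying the int array sees each byte
under the two layouts.  The helper names byte_le, byte_be and
check_expected_bytes are hypothetical and not part of the test; the
sketch assumes 32-bit int and uses unsigned char to avoid sign issues.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte I of the word array as seen through an overlaying char array,
   assuming a little-endian layout.  */
unsigned char
byte_le (const uint32_t *words, size_t i)
{
  return (unsigned char) (words[i / 4] >> (8 * (i % 4)));
}

/* Same, assuming a big-endian layout.  */
unsigned char
byte_be (const uint32_t *words, size_t i)
{
  return (unsigned char) (words[i / 4] >> (8 * (3 - i % 4)));
}

/* Checks the values the patch expects for with_fam_3_v.a[0] and
   with_fam_3_v.a[7] with the new initializer.  */
void
check_expected_bytes (void)
{
  const uint32_t b[2] = { 0x1f2f3f4f, 0x5f6f7f8f };
  assert (byte_le (b, 0) == 0x4f && byte_le (b, 7) == 0x5f);
  assert (byte_be (b, 0) == 0x1f && byte_be (b, 7) == 0x8f);
}
```

With the old value 0x5f6f7f7f, bytes 6 and 7 are both 0x7f in the
big-endian layout, so a check on a[7] could not tell them apart; that
is the ambiguity the new value 0x5f6f7f8f avoids.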

PR testsuite/116148

gcc/testsuite/ChangeLog:

* c-c++-common/fam-in-union-alone-in-struct-2.c: Define macros
WITH_FAM_2_V_B[03] and WITH_FAM_3_V_A[07] according to endianness,
update the checking with these macros and initialize with_fam_3_v.b[1]
with 0x5f6f7f8f instead of 0x5f6f7f7f.

Diff:
---
 .../c-c++-common/fam-in-union-alone-in-struct-2.c  | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c
index 93f9d5128f6e..7845a7fbab3e 100644
--- a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c
+++ b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c
@@ -16,7 +16,7 @@ union with_fam_2 {
 union with_fam_3 {
   char a[];  
   int b[];  
-} with_fam_3_v = {.b = {0x1f2f3f4f, 0x5f6f7f7f}};
+} with_fam_3_v = {.b = {0x1f2f3f4f, 0x5f6f7f8f}};
 
 struct only_fam {
   int b[]; 
@@ -28,16 +28,28 @@ struct only_fam_2 {
   int b[]; 
 } only_fam_2_v = {{7, 11}};
 
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+#define WITH_FAM_2_V_B0 0x4f
+#define WITH_FAM_2_V_B3 0x1f
+#define WITH_FAM_3_V_A0 0x4f
+#define WITH_FAM_3_V_A7 0x5f
+#else
+#define WITH_FAM_2_V_B0 0x1f
+#define WITH_FAM_2_V_B3 0x4f
+#define WITH_FAM_3_V_A0 0x1f
+#define WITH_FAM_3_V_A7 0x8f
+#endif
+
 int main ()
 {
   if (with_fam_1_v.b[3] != 4
   || with_fam_1_v.b[0] != 1)
 __builtin_abort ();
-  if (with_fam_2_v.b[3] != 0x1f
-  || with_fam_2_v.b[0] != 0x4f)
+  if (with_fam_2_v.b[3] != WITH_FAM_2_V_B3
+  || with_fam_2_v.b[0] != WITH_FAM_2_V_B0)
 __builtin_abort ();
-  if (with_fam_3_v.a[0] != 0x4f
-  || with_fam_3_v.a[7] != 0x5f)
+  if (with_fam_3_v.a[0] != WITH_FAM_3_V_A0
+  || with_fam_3_v.a[7] != WITH_FAM_3_V_A7)
 __builtin_abort ();
   if (only_fam_v.b[0] != 7
   || only_fam_v.b[1] != 11)


[gcc r15-2783] testsuite, rs6000: Make {vmx,vsx,p8vector}_hw check for altivec/vsx feature

2024-08-07 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:22b4e4fae86c86e15dd3d44cd653c70d65e0a993

commit r15-2783-g22b4e4fae86c86e15dd3d44cd653c70d65e0a993
Author: Kewen Lin 
Date:   Wed Aug 7 02:03:54 2024 -0500

testsuite, rs6000: Make {vmx,vsx,p8vector}_hw check for altivec/vsx feature

Different from p9vector_hw, vmx_hw/vsx_hw/p8vector_hw checks
can still succeed without Altivec/VSX feature support.  We
have many runnable test cases that only check these *_hw
targets without also checking whether the Altivec/VSX feature
is enabled, so they can fail when tested with Altivec/VSX
explicitly disabled.  So I think it's reasonable to also check
that the Altivec/VSX feature is enabled while checking that the
testing environment is able to execute some instructions, since
these instructions rely on these features.  So similar to what
we test for p9vector_hw, this patch modifies the C functions
used for vmx_hw, vsx_hw and p8vector_hw to use the corresponding
vector types and constraints.  For p8vector_hw, besides the VSX
feature, it also requires ISA 2.07 support.  A good thing is
that now almost all of the test cases using p8vector_hw have
specified -mdejagnu-cpu=power8 always or if !has_arch_pwr8.
Considering checking _ARCH_PWR8 in p8vector_hw can stop test
cases being tested even if test case itself has specified
-mdejagnu-cpu=power8, this patch doesn't force p8vector_hw to
check _ARCH_PWR8, instead it updates all existing test cases
which adopt p8vector_hw but don't have -mdejagnu-cpu=power8.
By the way, all test cases adopting p9vector_hw are all fine.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_vsx_hw_available): Modify C source
code used for testing with type vector long long and constraint wa
which require VSX feature.
(check_p8vector_hw_available): Likewise.
(check_vmx_hw_available): Modify C source code used for testing with
type vector int and constraint v which require Altivec feature.
* gcc.target/powerpc/divkc3-1.c: Specify -mdejagnu-cpu=power8 for
!has_arch_pwr8 to ensure power8 support.
* gcc.target/powerpc/mulkc3-1.c: Likewise.
* gcc.target/powerpc/pr96264.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/divkc3-1.c |  1 +
 gcc/testsuite/gcc.target/powerpc/mulkc3-1.c |  1 +
 gcc/testsuite/gcc.target/powerpc/pr96264.c  |  1 +
 gcc/testsuite/lib/target-supports.exp   | 24 +---
 4 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/divkc3-1.c b/gcc/testsuite/gcc.target/powerpc/divkc3-1.c
index 89bf04f12a97..96fb5c212042 100644
--- a/gcc/testsuite/gcc.target/powerpc/divkc3-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/divkc3-1.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target { powerpc64*-*-* && p8vector_hw } } } */
 /* { dg-options "-mfloat128 -mvsx" } */
+/* { dg-additional-options "-mdejagnu-cpu=power8" { target { ! has_arch_pwr8 } } } */
 
 void abort ();
 
diff --git a/gcc/testsuite/gcc.target/powerpc/mulkc3-1.c b/gcc/testsuite/gcc.target/powerpc/mulkc3-1.c
index b975a91dbd7a..1b0a1e24814a 100644
--- a/gcc/testsuite/gcc.target/powerpc/mulkc3-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/mulkc3-1.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target { powerpc64*-*-* && p8vector_hw } } } */
 /* { dg-options "-mfloat128 -mvsx" } */
+/* { dg-additional-options "-mdejagnu-cpu=power8" { target { ! has_arch_pwr8 } } } */
 
 void abort ();
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr96264.c b/gcc/testsuite/gcc.target/powerpc/pr96264.c
index 9f7d885daf2a..906720fdcd11 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr96264.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr96264.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target { powerpc64le-*-* } } } */
 /* { dg-options "-Os -fno-forward-propagate -fschedule-insns -fno-tree-ter -Wno-psabi" } */
+/* { dg-additional-options "-mdejagnu-cpu=power8" { target { ! has_arch_pwr8 } } } */
 /* { dg-require-effective-target p8vector_hw } */
 
 typedef unsigned char __attribute__ ((__vector_size__ (64))) v512u8;
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index f8e5f5f36d03..26820b146d48 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2887,11 +2887,9 @@ proc check_p8vector_hw_available { } {
check_runtime_nocache p8vector_hw_available {
int main()
{
-   #ifdef __MACH__
- asm volatile ("xxlorc vs0,vs0,vs0");
-   #else
- asm volatile ("xxlorc 0,0,0");
-   #endif
+ vector long long v1 = {0x1, 0x2};
+ vector long long v2;
+ asm ("xxlorc %0,%1,%1" : "=wa" (v2) : "wa" (v1));
  return 0;
}
} $options
@@ -3188,11 +3186,9 @@ proc check_vsx_hw_ava

[gcc r15-2784] testsuite, rs6000: Remove useless powerpc_{altivec,vsx}_ok

2024-08-07 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:9b4b4dd108f262c95f5ee8aff911e4193a26e55a

commit r15-2784-g9b4b4dd108f262c95f5ee8aff911e4193a26e55a
Author: Kewen Lin 
Date:   Wed Aug 7 02:03:54 2024 -0500

testsuite, rs6000: Remove useless powerpc_{altivec,vsx}_ok

Checking the existing powerpc_{altivec,vsx}_ok test cases,
I found some test cases that don't even need the
powerpc_{altivec,vsx} checks: some of them already have
another effective target check that covers
powerpc_{altivec,vsx}, and some of them don't actually
require the VSX/AltiVec feature at all.  So this patch
removes such useless checks.

PR testsuite/114842

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/amo2.c: Remove powerpc_vsx_ok effective target
check as p9vector_hw already covers it.
* gcc.target/powerpc/p9-sign_extend-runnable.c: Likewise.
* gcc.target/powerpc/clone2.c: Remove powerpc_vsx_ok effective target
check as ppc_cpu_supports_hw already covers it.
* gcc.target/powerpc/pr47251.c: Remove powerpc_vsx_ok effective target
check as it doesn't need VSX.
* gcc.target/powerpc/pr60137.c: Likewise.
* gcc.target/powerpc/pr80098-1.c: Likewise.
* gcc.target/powerpc/pr80098-2.c: Likewise.
* gcc.target/powerpc/pr80098-3.c: Likewise.
* gcc.target/powerpc/sd-pwr6.c: Likewise.
* gcc.target/powerpc/pr57744.c: Remove powerpc_vsx_ok effective target
check and option -mvsx as it doesn't need VSX.
* gcc.target/powerpc/pr69548.c: Remove powerpc_vsx_ok effective target
check as it doesn't need VSX, remove lp64 and use int128 instead.
* gcc.target/powerpc/vec-cmpne-long.c: Remove powerpc_vsx_ok effective
target check as p8vector_hw already covers it.
* gcc.target/powerpc/darwin-save-world-1.c: Remove powerpc_altivec_ok
effective target check as vmx_hw already covers it.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/amo2.c| 1 -
 gcc/testsuite/gcc.target/powerpc/clone2.c  | 1 -
 gcc/testsuite/gcc.target/powerpc/darwin-save-world-1.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c | 1 -
 gcc/testsuite/gcc.target/powerpc/pr47251.c | 1 -
 gcc/testsuite/gcc.target/powerpc/pr57744.c | 3 +--
 gcc/testsuite/gcc.target/powerpc/pr60137.c | 1 -
 gcc/testsuite/gcc.target/powerpc/pr69548.c | 6 +++---
 gcc/testsuite/gcc.target/powerpc/pr80098-1.c   | 1 -
 gcc/testsuite/gcc.target/powerpc/pr80098-2.c   | 1 -
 gcc/testsuite/gcc.target/powerpc/pr80098-3.c   | 1 -
 gcc/testsuite/gcc.target/powerpc/sd-pwr6.c | 1 -
 gcc/testsuite/gcc.target/powerpc/vec-cmpne-long.c  | 1 -
 13 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/amo2.c b/gcc/testsuite/gcc.target/powerpc/amo2.c
index 9cb493da53e9..592f0fb3f92d 100644
--- a/gcc/testsuite/gcc.target/powerpc/amo2.c
+++ b/gcc/testsuite/gcc.target/powerpc/amo2.c
@@ -1,5 +1,4 @@
 /* { dg-do run { target { powerpc*-*-linux* && { lp64 && p9vector_hw } } } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O2 -mvsx -mpower9-misc" } */
 /* { dg-additional-options "-mdejagnu-cpu=power9" { target { ! has_arch_pwr9 } } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/clone2.c b/gcc/testsuite/gcc.target/powerpc/clone2.c
index e64940b79527..4098e878c213 100644
--- a/gcc/testsuite/gcc.target/powerpc/clone2.c
+++ b/gcc/testsuite/gcc.target/powerpc/clone2.c
@@ -1,6 +1,5 @@
 /* { dg-do run { target { powerpc*-*-linux* } } } */
 /* { dg-options "-mvsx -O2" } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-require-effective-target ppc_cpu_supports_hw } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/powerpc/darwin-save-world-1.c b/gcc/testsuite/gcc.target/powerpc/darwin-save-world-1.c
index 3326765f4fb7..27fc1d30a8bb 100644
--- a/gcc/testsuite/gcc.target/powerpc/darwin-save-world-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/darwin-save-world-1.c
@@ -1,7 +1,7 @@
 /* { dg-do run { target powerpc*-*-* } } */
 /* { dg-options "-maltivec" } */
 /* { dg-require-effective-target powerpc_altivec } */
-/* { dg-skip-if "need to be able to execute AltiVec" { ! { powerpc_altivec_ok && vmx_hw } } } */
+/* { dg-skip-if "need to be able to execute AltiVec" { ! vmx_hw } } */
 
 /* With altivec turned on, Darwin wants to save the world but we did not mark lr as being saved any more
as saving the lr is not needed for saving altivec registers.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c b/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c
index f0514993bc0d..595aa4768ccb 100644
--- a/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.

[gcc r15-2788] testsuite, rs6000: Adjust pr78056-[1357].c and remove pr78056-[246].c

2024-08-07 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:9db55ec0547e171eed8e7a7c50c8dad79d62fd65

commit r15-2788-g9db55ec0547e171eed8e7a7c50c8dad79d62fd65
Author: Kewen Lin 
Date:   Wed Aug 7 02:03:55 2024 -0500

testsuite, rs6000: Adjust pr78056-[1357].c and remove pr78056-[246].c

When cleaning up the remaining powerpc_{vsx,altivec}_ok test
cases, I found some issues related to pr78056-*.c.
Firstly, the test points of pr78056-[246].c are no longer
available since r9-3164 dropped many HAVE_AS_* and the expected
warnings were dropped with them, so this patch removes these
tests.  Secondly, pr78056-1.c and pr78056-3.c include altivec.h
but don't use any builtins, so checking powerpc_altivec is
enough (no need to check powerpc_vsx).  And pr78056-5.c doesn't
require any altivec/vsx feature, so powerpc_vsx_ok can be
removed.  Lastly, pr78056-7.c should just use powerpc_fprs
instead of dfp_hw as it only cares about insn fcpsgn.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr78056-1.c: Check for powerpc_altivec rather than
powerpc_vsx.
* gcc.target/powerpc/pr78056-3.c: Likewise.
* gcc.target/powerpc/pr78056-5.c: Drop powerpc_vsx_ok check.
* gcc.target/powerpc/pr78056-7.c: Check for powerpc_fprs rather than
dfp_hw.
* gcc.target/powerpc/pr78056-2.c: Remove.
* gcc.target/powerpc/pr78056-4.c: Remove.
* gcc.target/powerpc/pr78056-6.c: Remove.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/pr78056-1.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr78056-2.c | 18 --
 gcc/testsuite/gcc.target/powerpc/pr78056-3.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr78056-4.c | 19 ---
 gcc/testsuite/gcc.target/powerpc/pr78056-5.c |  2 --
 gcc/testsuite/gcc.target/powerpc/pr78056-6.c | 25 -
 gcc/testsuite/gcc.target/powerpc/pr78056-7.c |  2 --
 7 files changed, 4 insertions(+), 70 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr78056-1.c b/gcc/testsuite/gcc.target/powerpc/pr78056-1.c
index 72640007dbb6..49ebafe39b65 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr78056-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr78056-1.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-skip-if "" { powerpc*-*-aix* } } */
-/* { dg-options "-mdejagnu-cpu=power8 -mvsx" } */
-/* { dg-require-effective-target powerpc_vsx } */
+/* { dg-options "-mdejagnu-cpu=power8" } */
+/* { dg-require-effective-target powerpc_altivec } */
 
 /* This test should succeed on both 32- and 64-bit configurations.  */
 #include 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr78056-2.c b/gcc/testsuite/gcc.target/powerpc/pr78056-2.c
deleted file mode 100644
index 5cda9d6193b2..
--- a/gcc/testsuite/gcc.target/powerpc/pr78056-2.c
+++ /dev/null
@@ -1,18 +0,0 @@
-/* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-skip-if "" { powerpc_vsx_ok } } */
-/* { dg-skip-if "" { powerpc*-*-aix* } } */
-/* { dg-options "-mdejagnu-cpu=power8 -mvsx" } */
-
-/* This test should succeed on both 32- and 64-bit configurations.  */
-#include 
-
-/* Though the command line specifies power8 target, this function is
-   to support power9. Expect an error message here because this target
-   does not support power9.  */
-__attribute__((target("cpu=power9")))
-int get_random ()
-{ /* { dg-warning "lacks power9 support" } */
-  return __builtin_darn_32 (); /* { dg-warning "implicit declaration" } */
-}
-
diff --git a/gcc/testsuite/gcc.target/powerpc/pr78056-3.c b/gcc/testsuite/gcc.target/powerpc/pr78056-3.c
index cf57d058e8be..745552b244d0 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr78056-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr78056-3.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-options "-mdejagnu-cpu=power7" } */
-/* { dg-require-effective-target powerpc_vsx } */
 /* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-mdejagnu-cpu=power7" } */
+/* { dg-require-effective-target powerpc_altivec } */
 
 /* This test should succeed on both 32- and 64-bit configurations.  */
 #include 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr78056-4.c b/gcc/testsuite/gcc.target/powerpc/pr78056-4.c
deleted file mode 100644
index 0bea0f895fac..
--- a/gcc/testsuite/gcc.target/powerpc/pr78056-4.c
+++ /dev/null
@@ -1,19 +0,0 @@
-/* { dg-do compile { target { powerpc*-*-* } } } */
-/* powerpc_vsx_ok represents power7 */
-/* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-skip-if "" { powerpc_vsx_ok } } */
-/* { dg-skip-if "" { powerpc*-*-aix* } } */
-/* { dg-options "-mdejagnu-cpu=power7" } */
-
-/* This test should succeed on both 32- and 64-bit configurations.  */
-#include 
-
-/* Though the command line specifies power7 target, this function is
-   to support power8, which will fail because this platform does not
-   support power8

[gcc r15-2785] testsuite, rs6000: Replace powerpc_vsx_ok with powerpc_altivec etc.

2024-08-07 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:4ddd886fececd83456d2d03dd6c191903dbca321

commit r15-2785-g4ddd886fececd83456d2d03dd6c191903dbca321
Author: Kewen Lin 
Date:   Wed Aug 7 02:03:54 2024 -0500

testsuite, rs6000: Replace powerpc_vsx_ok with powerpc_altivec etc.

This is a follow-up to the previous patch that replaced
powerpc_vsx_ok with powerpc_vsx, focusing on those test cases
which don't really require the VSX feature but used powerpc_vsx_ok
before.  They actually require some other effective target check:
some of them just require the ALTIVEC feature, some of them
just require hard float support, and some of them just require
ISA 2.06, etc.

By the way, ppc-fpconv-4.c is the only one missing powerpc_fprs
among ppc-fpconv-*.c after this replacement, so I also fix it
here.

PR testsuite/114842

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/bswap64-2.c: Replace powerpc_vsx_ok check with
has_arch_pwr7.
* gcc.target/powerpc/ppc-fpconv-2.c: Replace powerpc_vsx_ok check with
powerpc_fprs.
* gcc.target/powerpc/ppc-fpconv-6.c: Likewise.
* gcc.target/powerpc/ppc-pow.c: Likewise.
* gcc.target/powerpc/ppc-target-1.c: Likewise.
* gcc.target/powerpc/ppc-target-2.c: Likewise.
* gcc.target/powerpc/ppc-target-3.c: Likewise.
* gcc.target/powerpc/ppc-target-4.c: Likewise.
* gcc.target/powerpc/ppc-fpconv-4.c: Check for powerpc_fprs.
* gcc.target/powerpc/fold-vec-select-char.c: Replace powerpc_vsx_ok
with powerpc_altivec check and move it after dg-options line.
* gcc.target/powerpc/fold-vec-select-float.c: Likewise.
* gcc.target/powerpc/fold-vec-select-int.c: Likewise.
* gcc.target/powerpc/fold-vec-select-short.c: Likewise.
* gcc.target/powerpc/p9-novsx.c: Likewise.
* gcc.target/powerpc/p9-options-1.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/bswap64-2.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/fold-vec-select-char.c  | 2 +-
 gcc/testsuite/gcc.target/powerpc/fold-vec-select-float.c | 6 +++---
 gcc/testsuite/gcc.target/powerpc/fold-vec-select-int.c   | 2 +-
 gcc/testsuite/gcc.target/powerpc/fold-vec-select-short.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/p9-novsx.c  | 2 +-
 gcc/testsuite/gcc.target/powerpc/p9-options-1.c  | 2 +-
 gcc/testsuite/gcc.target/powerpc/ppc-fpconv-2.c  | 2 +-
 gcc/testsuite/gcc.target/powerpc/ppc-fpconv-4.c  | 1 +
 gcc/testsuite/gcc.target/powerpc/ppc-fpconv-6.c  | 2 +-
 gcc/testsuite/gcc.target/powerpc/ppc-pow.c   | 2 +-
 gcc/testsuite/gcc.target/powerpc/ppc-target-1.c  | 3 ++-
 gcc/testsuite/gcc.target/powerpc/ppc-target-2.c  | 3 ++-
 gcc/testsuite/gcc.target/powerpc/ppc-target-3.c  | 2 +-
 gcc/testsuite/gcc.target/powerpc/ppc-target-4.c  | 2 +-
 15 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/bswap64-2.c b/gcc/testsuite/gcc.target/powerpc/bswap64-2.c
index 6c3d8ca05289..70d872b5e304 100644
--- a/gcc/testsuite/gcc.target/powerpc/bswap64-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/bswap64-2.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-options "-O2 -mpopcntd" } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-require-effective-target has_arch_pwr7 } */
 /* { dg-final { scan-assembler "ldbrx" } } */
 /* { dg-final { scan-assembler "stdbrx" } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-select-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-select-char.c
index e055c017536b..17e28914aae4 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-select-char.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-select-char.c
@@ -2,8 +2,8 @@
inputs produce the right code.  */
 
 /* { dg-do compile } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-maltivec -O2" } */
+/* { dg-require-effective-target powerpc_altivec } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-select-float.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-select-float.c
index 1656fbff2ca5..848bd750ff85 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-select-float.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-select-float.c
@@ -1,9 +1,9 @@
-/* Verify that overloaded built-ins for vec_sel with float 
-   inputs for VSX produce the right code.  */
+/* Verify that overloaded built-ins for vec_sel with float
+   inputs produce the right code.  */
 
 /* { dg-do compile } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-maltivec -O2" } */
+/* { dg-require-effective-target powerpc_altivec } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-select-int.c 
b/gcc/testsuite/gcc.target/powe

[gcc r15-2786] testsuite, rs6000: Replace powerpc_vsx_ok with powerpc_vsx

2024-08-07 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:cdca23875296edd78327d3da6890bb334c28f2fd

commit r15-2786-gcdca23875296edd78327d3da6890bb334c28f2fd
Author: Kewen Lin 
Date:   Wed Aug 7 02:03:55 2024 -0500

testsuite, rs6000: Replace powerpc_vsx_ok with powerpc_vsx

Following up the previous r15-886, this patch cleans up
the remaining powerpc_vsx_ok uses which actually should be
powerpc_vsx instead.

PR testsuite/114842

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/error-1.c: Replace powerpc_vsx_ok check with
powerpc_vsx.
* gcc.target/powerpc/warn-2.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-ors-longlong.c: Likewise.
* gcc.target/powerpc/ppc-fortran/pr80108-1.f90: Replace powerpc_vsx_ok
check with powerpc_vsx and remove useless -mfloat128.
* gcc.target/powerpc/pragma_power8.c: Replace powerpc_vsx_ok check with
powerpc_vsx.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/error-1.c   | 2 +-
 gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-longlong.c | 4 ++--
 gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr80108-1.f90   | 4 ++--
 gcc/testsuite/gcc.target/powerpc/pragma_power8.c | 5 -
 gcc/testsuite/gcc.target/powerpc/warn-2.c| 2 +-
 5 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/error-1.c b/gcc/testsuite/gcc.target/powerpc/error-1.c
index d38eba8bb8ad..9327076baf00 100644
--- a/gcc/testsuite/gcc.target/powerpc/error-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/error-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-require-effective-target powerpc_vsx } */
 /* { dg-options "-O -mvsx -mno-altivec" } */
 
 /* { dg-error "'-mvsx' and '-mno-altivec' are incompatible" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-longlong.c
index 60af61a7f163..aae4694f5519 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-longlong.c
@@ -4,7 +4,7 @@
 /* { dg-do compile } */
 /* { dg-options "-mvsx -O2" } */
 /* { dg-additional-options "-mdejagnu-cpu=power8" { target { ! has_arch_pwr8 } } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-require-effective-target powerpc_vsx } */
 
 #include 
 
@@ -154,7 +154,7 @@ test6_nor (vector unsigned long long x, vector unsigned long long y)
 
 // The number of xxlor instructions generated varies between 6 and 24 for
 // older systems (power6,power7), as well as for 32-bit versus 64-bit targets.
-// For simplicity, this test now only targets "powerpc_vsx_ok" environments
+// For simplicity, this test now only targets "powerpc_vsx" environments
 // where the answer is expected to be 6.
 
 /* { dg-final { scan-assembler-times {\mxxlor\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr80108-1.f90 b/gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr80108-1.f90
index 00392b5fed99..e0e157bd245a 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr80108-1.f90
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr80108-1.f90
@@ -1,7 +1,7 @@
 ! Originally contributed by Tobias Burnas.
 ! { dg-do compile { target { powerpc*-*-* } } }
-! { dg-require-effective-target powerpc_vsx_ok }
-! { dg-options "-mdejagnu-cpu=405 -mpower9-minmax -mfloat128" }
+! { dg-require-effective-target powerpc_vsx }
+! { dg-options "-mdejagnu-cpu=405 -mpower9-minmax" }
 ! { dg-excess-errors "expect error due to conflicting target options" }
 ! Since the error message is not associated with a particular line
 ! number, we cannot use the dg-error directive and cannot specify a
diff --git a/gcc/testsuite/gcc.target/powerpc/pragma_power8.c b/gcc/testsuite/gcc.target/powerpc/pragma_power8.c
index 8de815e5a9e5..43ea6dd406e5 100644
--- a/gcc/testsuite/gcc.target/powerpc/pragma_power8.c
+++ b/gcc/testsuite/gcc.target/powerpc/pragma_power8.c
@@ -1,6 +1,9 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* Ensure there is no explicit -mno-vsx etc., otherwise
+   the below bif __builtin_vec_vcmpeq_p replies on power8
+   vsx would fail.  */
+/* { dg-require-effective-target powerpc_vsx } */
 /* { dg-options "-mdejagnu-cpu=power6 -maltivec -O2" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/powerpc/warn-2.c b/gcc/testsuite/gcc.target/powerpc/warn-2.c
index 29c6ce50cd71..ba294cb52e55 100644
--- a/gcc/testsuite/gcc.target/powerpc/warn-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/warn-2.c
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-require-effective-target

[gcc r15-2787] testsuite, rs6000: Fix some run cases with appropriate _hw

2024-08-07 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:3ab04f1f1dbfbf3ff0f8a934e27ac2adbd16d93a

commit r15-2787-g3ab04f1f1dbfbf3ff0f8a934e27ac2adbd16d93a
Author: Kewen Lin 
Date:   Wed Aug 7 02:03:55 2024 -0500

testsuite, rs6000: Fix some run cases with appropriate _hw

When cleaning up the remaining powerpc_{vsx,altivec}_ok test
cases, I found some dg-do run test cases which should use
the appropriate {p8vector,vmx}_hw check instead.  This
patch adjusts them accordingly.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/swaps-p8-46.c: Check for p8vector_hw rather than
powerpc_vsx_ok.
* gcc.target/powerpc/ppc64-abi-2.c: Check for vmx_hw rather than
powerpc_altivec_ok.
* gcc.target/powerpc/pr96139-c.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/pr96139-c.c   | 2 +-
 gcc/testsuite/gcc.target/powerpc/swaps-p8-46.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c b/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
index b490fc3c2fd8..2a5a76020049 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
@@ -1,4 +1,4 @@
-/* { dg-do run { target { { powerpc*-*-linux* && lp64 } && powerpc_altivec_ok } } } */
+/* { dg-do run { target { { powerpc*-*-linux* && lp64 } && vmx_hw } } } */
 /* { dg-options "-O2 -fprofile -mprofile-kernel -maltivec -mabi=altivec -mno-pcrel" } */
 #include 
 #include 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr96139-c.c b/gcc/testsuite/gcc.target/powerpc/pr96139-c.c
index 3ada26034280..b39c559ec0ba 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr96139-c.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr96139-c.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -Wall -maltivec" } */
-/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-require-effective-target vmx_hw } */
 
 /*
  * Based on test created by sjmunroe for pr96139
diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-46.c b/gcc/testsuite/gcc.target/powerpc/swaps-p8-46.c
index 3b5154b12312..d0392f25eeec 100644
--- a/gcc/testsuite/gcc.target/powerpc/swaps-p8-46.c
+++ b/gcc/testsuite/gcc.target/powerpc/swaps-p8-46.c
@@ -1,5 +1,5 @@
 /* { dg-do run { target le } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-require-effective-target p8vector_hw } */
 /* { dg-options "-mdejagnu-cpu=power8 -mvsx -O2 " } */
 
 typedef __attribute__ ((__aligned__ (8))) unsigned long long __m64;


[gcc r15-2899] LRA: Don't emit move for substituted CONSTANT_P operand [PR116170]

2024-08-13 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:49d5e21d41aed827038f6766140e2449a64a9726

commit r15-2899-g49d5e21d41aed827038f6766140e2449a64a9726
Author: Kewen Lin 
Date:   Tue Aug 13 04:28:28 2024 -0500

LRA: Don't emit move for substituted CONSTANT_P operand [PR116170]

Commit r15-2084 exposes an ICE in LRA.  Before r15-2084,
KFmode had 126-bit precision while V1TImode had 128-bit
precision, so the subreg (subreg:V1TI (reg:KF 131) 0) was
paradoxical_subreg_p, which stopped some passes from
optimizing it.  After r15-2084, KFmode has the same mode
precision as V1TImode, so passes are able to optimize more,
which leads to this ICE in LRA as described below:

For insn 106 (set (mem:V1TI ...) (subreg:V1TI (reg:KF 133) 0)),
which matches pattern

(define_insn "*vsx_le_perm_store_<mode>"
  [(set (match_operand:VSX_LE_128 0 "memory_operand" "=Z,Q")
(match_operand:VSX_LE_128 1 "vsx_register_operand" "+wa,r"))]
  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
   && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
  "@
   #
   #"
  [(set_attr "type" "vecstore,store")
   (set_attr "length" "12,8")
   (set_attr "isa" "<VSisa>,*")])

LRA performs equivalence substitution on r133 with
(const_double:KF 0.0), selects alternative 0 and fixes up
operand 1 for constraint "wa".  Because operand 1 is OP_INOUT,
it also considers assigning back to the old operand, that is:

  lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);

But because old has been changed to a const_double by the
equivalence substitution, the move actually assigns to a
const_double, which is invalid and causes the ICE.

Since reg:KF 133 is equivalent to (const_double:KF 0.0),
even though this operand is OP_INOUT, IMHO there should not be
any following uses of reg:KF 133; otherwise it would not have
had the chance to be equivalent to (const_double:KF 0.0).  So
this patch guards the lra_emit_move with !CONSTANT_P to exclude
such cases.
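
The guard can be pictured with a toy model in plain C.  This is only a
sketch and not GCC code: struct toy_operand, the toy_* enums and
toy_needs_write_back are hypothetical stand-ins for the rtx operand, its
OP_* type, the REG_UNUSED note lookup and CONSTANT_P.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for LRA's operand types.  */
enum toy_op_type { TOY_OP_IN, TOY_OP_OUT, TOY_OP_INOUT };

/* Hypothetical stand-in for what the old operand looks like after
   equivalence substitution: still a register, or a constant.  */
enum toy_kind { TOY_REG, TOY_CONST };

struct toy_operand
{
  enum toy_kind kind;
  enum toy_op_type type;
  bool has_unused_note;	/* models find_reg_note (..., REG_UNUSED, old) */
};

/* Return true when a move from the reload register back to the old
   operand should be emitted.  The last test models the new
   !CONSTANT_P (old) check: a constant can never be a destination.  */
bool
toy_needs_write_back (const struct toy_operand *old)
{
  return old->type != TOY_OP_IN
	 && !old->has_unused_note
	 && old->kind != TOY_CONST;
}

void
toy_write_back_example (void)
{
  struct toy_operand reg = { TOY_REG, TOY_OP_INOUT, false };
  struct toy_operand cst = { TOY_CONST, TOY_OP_INOUT, false };
  assert (toy_needs_write_back (&reg));
  assert (!toy_needs_write_back (&cst));
}
```

Without the last test, the OP_INOUT constant operand in the model would
still request a write-back, which corresponds to the invalid move
described above.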

PR rtl-optimization/116170

gcc/ChangeLog:

* lra-constraints.cc (curr_insn_transform): Don't emit move back to
old operand if it's CONSTANT_P.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr116170.c: New test.

Diff:
---
 gcc/lra-constraints.cc  |  4 +++-
 gcc/testsuite/gcc.target/powerpc/pr116170.c | 18 ++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 92b343fa99a0..90cbe6c012b7 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -4742,7 +4742,9 @@ curr_insn_transform (bool check_only_p)
}
  *loc = new_reg;
  if (type != OP_IN
- && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX)
+ && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX
+ /* OLD can be an equivalent constant here.  */
+ && !CONSTANT_P (old))
{
  start_sequence ();
  lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr116170.c b/gcc/testsuite/gcc.target/powerpc/pr116170.c
new file mode 100644
index ..6f6ca0f1ae93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr116170.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_sw } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2 -fstack-protector-strong -ffloat-store" } */
+
+/* Verify there is no ICE.  */
+
+int a, d;
+_Float128 b, c;
+void
+e ()
+{
+  int f = 0;
+  if (a)
+if (b || c)
+  f = 1;
+  if (d)
+e (f ? 0 : b);
+}


[gcc r15-2907] testsuite: Fix fam-in-union-alone-in-struct-2.c with unsigned char [PR116148]

2024-08-13 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:10972e6fb59cf83e12dcf7d5d6db01aa8e38dc52

commit r15-2907-g10972e6fb59cf83e12dcf7d5d6db01aa8e38dc52
Author: Kewen Lin 
Date:   Tue Aug 13 21:25:13 2024 -0500

testsuite: Fix fam-in-union-alone-in-struct-2.c with unsigned char [PR116148]

As PR116148#c7 shows, fam-in-union-alone-in-struct-2.c still
fails on hppa, which is a BE environment.  Further checking
(also confirmed by John in PR116148#c12) shows that this is
because plain char is signed on hppa, so the value of
with_fam_3_v.a[7], "8f", gets sign-extended to "ff8f" and
the verification fails.  This patch changes plain char to
unsigned char to avoid that.
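
The effect can be reproduced outside the test with a few lines of C.
This is a minimal sketch, assuming a two's complement target; the
function names are made up for illustration and do not appear in the
test case.

```c
#include <assert.h>

/* Widen BYTE to int the way a target with signed plain char would:
   the top bit of the byte is treated as a sign bit.  */
int
widen_as_signed_char (unsigned char byte)
{
  const signed char *p = (const signed char *) &byte;
  return *p;
}

/* Widen BYTE to int as unsigned char: the value is preserved.  */
int
widen_as_unsigned_char (unsigned char byte)
{
  return byte;
}

void
sign_extension_example (void)
{
  /* 0x8f has its top bit set, so the signed view becomes negative;
     its low 16 bits are the "ff8f" mentioned above.  */
  assert ((widen_as_signed_char (0x8f) & 0xffff) == 0xff8f);
  assert (widen_as_unsigned_char (0x8f) == 0x8f);
}
```

Bytes below 0x80, such as 0x1f, widen to the same value either way, which
is why only the last byte of the initializer exposes the difference.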

PR testsuite/116148

gcc/testsuite/ChangeLog:

* c-c++-common/fam-in-union-alone-in-struct-2.c: Change the type of
member a[] of union with_fam_3 to unsigned char.

Diff:
---
 gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c
index 7845a7fbab3e..49960a3b45f7 100644
--- a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c
+++ b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-2.c
@@ -14,7 +14,7 @@ union with_fam_2 {
 } with_fam_2_v = {.a = 0x1f2f3f4f};
 
 union with_fam_3 {
-  char a[];  
+  unsigned char a[];  
   int b[];  
 } with_fam_3_v = {.b = {0x1f2f3f4f, 0x5f6f7f8f}};


[gcc r15-884] rs6000: Don't clobber return value when eh_return called [PR114846]

2024-05-28 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:e5fc5d42d25c86ae48178db04ce64d340a834614

commit r15-884-ge5fc5d42d25c86ae48178db04ce64d340a834614
Author: Kewen Lin 
Date:   Tue May 28 21:13:40 2024 -0500

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, when eh_return
is involved, restoring some registers for EH RETURN DATA in
the epilogue can currently clobber the register holding the
return value.  Following the existing handling in some other
targets, this patch makes the eh_return expander emit a new
define_insn_and_split eh_return_internal, which directly
calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN, instead of specially treating a
normal return when crtl->calls_eh_return is set.

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.cc (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-logue.cc   |  7 +++
 gcc/config/rs6000/rs6000.md | 15 +++
 gcc/testsuite/gcc.target/powerpc/pr114846.c | 20 
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc
index 60ba15a8bc3..bd5d56ba002 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -4308,9 +4308,6 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   rs6000_stack_t *info = rs6000_stack_info ();
 
-  if (epilogue_type == EPILOGUE_TYPE_NORMAL && crtl->calls_eh_return)
-epilogue_type = EPILOGUE_TYPE_EH_RETURN;
-
   int strategy = info->savres_strategy;
   bool using_load_multiple = !!(strategy & REST_MULTIPLE);
   bool restoring_GPRs_inline = !!(strategy & REST_INLINE_GPRS);
@@ -4788,7 +4785,9 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   /* In the ELFv2 ABI we need to restore all call-saved CR fields from
  *separate* slots if the routine calls __builtin_eh_return, so
- that they can be independently restored by the unwinder.  */
+ that they can be independently restored by the unwinder.  Since
+ it is for CR fields restoring, it should be done for any epilogue
+ types (not EPILOGUE_TYPE_EH_RETURN specific).  */
   if (DEFAULT_ABI == ABI_ELFv2 && crtl->calls_eh_return)
 {
   int i, cr_off = info->ehcr_offset;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index f035e68ff0f..a5d20594789 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14274,6 +14274,8 @@
   ""
 {
   emit_insn (gen_eh_set_lr (Pmode, operands[0]));
+  emit_jump_insn (gen_eh_return_internal ());
+  emit_barrier ();
   DONE;
 })
 
@@ -14290,6 +14292,19 @@
   DONE;
 })
 
+(define_insn_and_split "eh_return_internal"
+  [(eh_return)]
+  ""
+  "#"
+  "epilogue_completed"
+  [(const_int 0)]
+{
+  if (!TARGET_SCHED_PROLOG)
+emit_insn (gen_blockage ());
+  rs6000_emit_epilogue (EPILOGUE_TYPE_EH_RETURN);
+  DONE;
+})
+
 (define_insn "prefetch"
   [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
 (match_operand:SI 1 "const_int_operand" "n")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114846.c b/gcc/testsuite/gcc.target/powerpc/pr114846.c
new file mode 100644
index 000..efe2300b73a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114846.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-require-effective-target builtin_eh_return } */
+
+/* Ensure it runs successfully.  */
+
+__attribute__ ((noipa))
+int f (int *a, long offset, void *handler)
+{
+  if (*a == 5)
+return 5;
+  __builtin_eh_return (offset, handler);
+}
+
+int main ()
+{
+  int t = 5;
+  if (f (&t, 0, 0) != 5)
+__builtin_abort ();
+  return 0;
+}


[gcc r15-1031] ada: Replace use of LONG_DOUBLE_TYPE_SIZE

2024-06-05 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:6fa25aa970cb82ee7fd6884d75bb14673b14dbbe

commit r15-1031-g6fa25aa970cb82ee7fd6884d75bb14673b14dbbe
Author: Kewen Lin 
Date:   Wed Jun 5 04:22:25 2024 -0500

ada: Replace use of LONG_DOUBLE_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type.  To prepare for that, this
patch replaces the use of LONG_DOUBLE_TYPE_SIZE in ada
with TYPE_PRECISION of long_double_type_node.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/ada/ChangeLog:

* gcc-interface/decl.cc (gnat_to_gnu_entity): Use TYPE_PRECISION of
long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE.

Diff:
---
 gcc/ada/gcc-interface/decl.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/gcc-interface/decl.cc b/gcc/ada/gcc-interface/decl.cc
index f6a4c0631b6..8b72c96c439 100644
--- a/gcc/ada/gcc-interface/decl.cc
+++ b/gcc/ada/gcc-interface/decl.cc
@@ -520,7 +520,8 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
  esize = UI_To_Int (Esize (gnat_entity));
 
  if (IN (kind, Float_Kind))
-   max_esize = fp_prec_to_size (LONG_DOUBLE_TYPE_SIZE);
+   max_esize
+ = fp_prec_to_size (TYPE_PRECISION (long_double_type_node));
  else if (IN (kind, Access_Kind))
max_esize = POINTER_SIZE * 2;
  else


[gcc r15-1032] d: Replace use of LONG_DOUBLE_TYPE_SIZE

2024-06-05 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:b36461f126148b027e7541aaf356d5322a0fbc08

commit r15-1032-gb36461f126148b027e7541aaf356d5322a0fbc08
Author: Kewen Lin 
Date:   Wed Jun 5 04:22:25 2024 -0500

d: Replace use of LONG_DOUBLE_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type.  To prepare for that, this
patch removes the only use of LONG_DOUBLE_TYPE_SIZE in d.
Iain found that LONG_DOUBLE_TYPE_SIZE is poorly named and
was used incorrectly here, so this patch follows his advice
and uses int_size_in_bytes instead.
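
Why a "size" derived from a precision-like value can differ from the
storage size is visible from ordinary C.  The following is a sketch
using only standard headers; the function names are made up for
illustration and are unrelated to the GCC internals touched by this
patch.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* Number of bytes an object of type long double occupies, padding
   included.  */
int
long_double_storage_bytes (void)
{
  return (int) sizeof (long double);
}

/* Number of base-FLT_RADIX digits in the long double significand.  */
int
long_double_significand_digits (void)
{
  return LDBL_MANT_DIG;
}

void
long_double_size_example (void)
{
  /* The storage always has room for the significand, but it may
     also hold sign, exponent and padding bits.  */
  assert (long_double_storage_bytes () * CHAR_BIT
	  >= long_double_significand_digits ());
}
```

For instance, the x87 extended format carries 80 value bits but is
commonly stored in 12 or 16 bytes, so dividing a bit count by the byte
width does not necessarily give the object size.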

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

Co-authored-by: Iain Buclaw 

gcc/d/ChangeLog:

* d-target.cc (Target::_init): Use int_size_in_bytes of
long_double_type_node to replace the expression with
LONG_DOUBLE_TYPE_SIZE for c.long_doublesize assignment.

Diff:
---
 gcc/d/d-target.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/d/d-target.cc b/gcc/d/d-target.cc
index 127b9d7ce7c..dd46e535891 100644
--- a/gcc/d/d-target.cc
+++ b/gcc/d/d-target.cc
@@ -163,7 +163,7 @@ Target::_init (const Param &)
   this->c.intsize = (INT_TYPE_SIZE / BITS_PER_UNIT);
   this->c.longsize = (LONG_TYPE_SIZE / BITS_PER_UNIT);
   this->c.long_longsize = (LONG_LONG_TYPE_SIZE / BITS_PER_UNIT);
-  this->c.long_doublesize = (LONG_DOUBLE_TYPE_SIZE / BITS_PER_UNIT);
+  this->c.long_doublesize = int_size_in_bytes (long_double_type_node);
   this->c.wchar_tsize = (WCHAR_TYPE_SIZE / BITS_PER_UNIT);
 
   this->c.bitFieldStyle = targetm.ms_bitfield_layout_p (unknown_type_node)


[gcc r15-1033] fortran: Replace uses of {FLOAT, {, LONG_}DOUBLE}_TYPE_SIZE

2024-06-05 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:37a4800939bd90400e03a8fa561d2a0df394bced

commit r15-1033-g37a4800939bd90400e03a8fa561d2a0df394bced
Author: Kewen Lin 
Date:   Wed Jun 5 04:22:25 2024 -0500

fortran: Replace uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type.  To prepare for that, this
patch replaces the uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in fortran with TYPE_PRECISION of
{float,{,long_}double}_type_node.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/fortran/ChangeLog:

* trans-intrinsic.cc (build_round_expr): Use TYPE_PRECISION of
long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE.
* trans-types.cc (gfc_build_real_type): Use TYPE_PRECISION of
{float,double,long_double}_type_node to replace
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE.

Diff:
---
 gcc/fortran/trans-intrinsic.cc |  3 ++-
 gcc/fortran/trans-types.cc | 10 ++
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc
index 912c1000e18..96839705112 100644
--- a/gcc/fortran/trans-intrinsic.cc
+++ b/gcc/fortran/trans-intrinsic.cc
@@ -395,7 +395,8 @@ build_round_expr (tree arg, tree restype)
  don't have an appropriate function that converts directly to the integer
  type (such as kind == 16), just use ROUND, and then convert the result to
  an integer.  We might also need to convert the result afterwards.  */
-  if (resprec <= INT_TYPE_SIZE && argprec <= LONG_DOUBLE_TYPE_SIZE)
+  if (resprec <= INT_TYPE_SIZE
+  && argprec <= TYPE_PRECISION (long_double_type_node))
 fn = builtin_decl_for_precision (BUILT_IN_IROUND, argprec);
   else if (resprec <= LONG_TYPE_SIZE)
 fn = builtin_decl_for_precision (BUILT_IN_LROUND, argprec);
diff --git a/gcc/fortran/trans-types.cc b/gcc/fortran/trans-types.cc
index 8466c595e06..0ef67723fcd 100644
--- a/gcc/fortran/trans-types.cc
+++ b/gcc/fortran/trans-types.cc
@@ -873,13 +873,15 @@ gfc_build_real_type (gfc_real_info *info)
   int mode_precision = info->mode_precision;
   tree new_type;
 
-  if (mode_precision == FLOAT_TYPE_SIZE)
+  if (mode_precision == TYPE_PRECISION (float_type_node))
 info->c_float = 1;
-  if (mode_precision == DOUBLE_TYPE_SIZE)
+  if (mode_precision == TYPE_PRECISION (double_type_node))
 info->c_double = 1;
-  if (mode_precision == LONG_DOUBLE_TYPE_SIZE && !info->c_float128)
+  if (mode_precision == TYPE_PRECISION (long_double_type_node)
+  && !info->c_float128)
 info->c_long_double = 1;
-  if (mode_precision != LONG_DOUBLE_TYPE_SIZE && mode_precision == 128)
+  if (mode_precision != TYPE_PRECISION (long_double_type_node)
+  && mode_precision == 128)
 {
   /* TODO: see PR101835.  */
   info->c_float128 = 1;


[gcc r15-1034] darwin: Replace use of LONG_DOUBLE_TYPE_SIZE

2024-06-05 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:58ecd2eb507ab216861408cf10ec05efc4e8344e

commit r15-1034-g58ecd2eb507ab216861408cf10ec05efc4e8344e
Author: Kewen Lin 
Date:   Wed Jun 5 04:23:04 2024 -0500

darwin: Replace use of LONG_DOUBLE_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type.  To prepare for that, this
patch replaces the use of LONG_DOUBLE_TYPE_SIZE in darwin
with TYPE_PRECISION of long_double_type_node.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/ChangeLog:

* config/darwin.cc (darwin_patch_builtins): Use TYPE_PRECISION of
long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE.

Diff:
---
 gcc/config/darwin.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/darwin.cc b/gcc/config/darwin.cc
index 63b8c509405..9129378be37 100644
--- a/gcc/config/darwin.cc
+++ b/gcc/config/darwin.cc
@@ -3620,7 +3620,7 @@ darwin_patch_builtin (enum built_in_function fncode)
 void
 darwin_patch_builtins (void)
 {
-  if (LONG_DOUBLE_TYPE_SIZE != 128)
+  if (TYPE_PRECISION (long_double_type_node) != 128)
 return;
 
 #define PATCH_BUILTIN(fncode) darwin_patch_builtin (fncode);


[gcc r15-1362] m2: Remove uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

2024-06-16 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:96fe23eb8a9ebac6b64aeb55db88d219177a345a

commit r15-1362-g96fe23eb8a9ebac6b64aeb55db88d219177a345a
Author: Kewen Lin 
Date:   Sun Jun 16 21:50:19 2024 -0500

m2: Remove uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE

Joseph pointed out "floating types should have their mode,
not a poorly defined precision value" in the discussion[1].
As he and Richi suggested, the existing macros
{FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE will be replaced with a
hook mode_for_floating_type.  To prepare for that, this
patch removes the uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE
in m2.  Currently they are used only in assertions, which can
be replaced with a TYPE_SIZE check on the corresponding type
node: the call to layout_type was dropped because it would
return early once TYPE_SIZE is set, and these assertions
ensure that dropping the call is safe.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html

gcc/m2/ChangeLog:

* gm2-gcc/m2type.cc (build_m2_short_real_node): Adjust assertion with
TYPE_SIZE check.
(build_m2_real_node): Likewise.
(build_m2_long_real_node): Add assertion with TYPE_SIZE check.

Diff:
---
 gcc/m2/gm2-gcc/m2type.cc | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/m2/gm2-gcc/m2type.cc b/gcc/m2/gm2-gcc/m2type.cc
index 5773a5cbd190..7ed184518cb1 100644
--- a/gcc/m2/gm2-gcc/m2type.cc
+++ b/gcc/m2/gm2-gcc/m2type.cc
@@ -1416,7 +1416,7 @@ static tree
 build_m2_short_real_node (void)
 {
   /* Define `SHORTREAL'.  */
-  ASSERT_CONDITION (TYPE_PRECISION (float_type_node) == FLOAT_TYPE_SIZE);
+  ASSERT_CONDITION (TYPE_SIZE (float_type_node));
   return float_type_node;
 }
 
@@ -1424,7 +1424,7 @@ static tree
 build_m2_real_node (void)
 {
   /* Define `REAL'.  */
-  ASSERT_CONDITION (TYPE_PRECISION (double_type_node) == DOUBLE_TYPE_SIZE);  
+  ASSERT_CONDITION (TYPE_SIZE (double_type_node));
   return double_type_node;
 }
 
@@ -1432,12 +1432,13 @@ static tree
 build_m2_long_real_node (void)
 {
   tree longreal;
-  
+
   /* Define `LONGREAL'.  */
   if (M2Options_GetIEEELongDouble ())
 longreal = float128_type_node;
   else
 longreal = long_double_type_node;
+  ASSERT_CONDITION (TYPE_SIZE (longreal));
   return longreal;
 }


[gcc r15-1504] rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

2024-06-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:52c112800d9f44457c4832309a48c00945811313

commit r15-1504-g52c112800d9f44457c4832309a48c00945811313
Author: Kewen Lin 
Date:   Thu Jun 20 20:23:56 2024 -0500

rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low word, namely altivec_vmrg[hl]w and
vsx_xxmrg[hl]w_<mode>.  These are mainly used for the
built-in functions vec_merge{h,l}, __builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si and some internal gen function
needs.  These functions should take endianness into account.
Taking vec_mergeh as an example, PVIPR defines it as "Merges
the first halves (in element order) of two vectors"; note
that it is in element order.  So it is mapped to vmrghw on
BE and to vmrglw on LE.  Although the mapped insns differ,
as discussed in PR106069, the RTL pattern should still be
the same.  That was the case before commit r12-4496, when
define_expand altivec_vmrghw got expanded into:

  (vec_select:VSX_W
 (vec_concat:
(match_operand:VSX_W 1 "register_operand" "wa,v")
(match_operand:VSX_W 2 "register_operand" "wa,v"))
(parallel [(const_int 0) (const_int 4)
   (const_int 1) (const_int 5)])))]

on both BE and LE then.  But commit r12-4496 changed it to
expand into:

  (vec_select:VSX_W
 (vec_concat:
(match_operand:VSX_W 1 "register_operand" "wa,v")
(match_operand:VSX_W 2 "register_operand" "wa,v"))
(parallel [(const_int 0) (const_int 4)
   (const_int 1) (const_int 5)])))]

on BE, and

  (vec_select:VSX_W
 (vec_concat:
(match_operand:VSX_W 1 "register_operand" "wa,v")
(match_operand:VSX_W 2 "register_operand" "wa,v"))
(parallel [(const_int 2) (const_int 6)
   (const_int 3) (const_int 7)])))]

on LE.  Although the mapped insns are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern is completely
wrong and inconsistent with the mapped insn.  If optimization
passes leave this pattern alone, things still work even though
the pattern doesn't represent its mapped insn, which is why
simple testing on the bif doesn't expose this issue.  But once
an optimization pass such as combine makes changes based
on this wrong pattern, the result is unexpected, because the
pattern doesn't match the semantics that the expanded insn
is intended to represent.

So this patch fixes the wrong RTL patterns, making the
associated RTL patterns the same as before so that they
have the same semantics as their mapped insns.  With this
patch, expanders like altivec_vmrghw expand into
altivec_vmrghw_direct_v4si_be or altivec_vmrglw_direct_v4si_le
depending on endianness; "direct" makes it easy to see which
insn would be generated, while _be and _le distinguish the
different RTL patterns per endianness.
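
What the two selector lists in the patterns above mean can be checked
with a small C model of vec_select applied to vec_concat.  This is only
an illustrative sketch: select_from_concat is a made-up helper, and
plain int arrays stand in for the four-word vector operands.

```c
#include <assert.h>

/* Model of (vec_select (vec_concat OP1 OP2) (parallel SEL)): element i
   of OUT is element SEL[i] of the 8-element concatenation of OP1 and
   OP2.  */
void
select_from_concat (const int op1[4], const int op2[4],
		    const int sel[4], int out[4])
{
  int cat[8];
  for (int i = 0; i < 4; i++)
    {
      cat[i] = op1[i];
      cat[i + 4] = op2[i];
    }
  for (int i = 0; i < 4; i++)
    {
      assert (sel[i] >= 0 && sel[i] < 8);
      out[i] = cat[sel[i]];
    }
}

void
merge_selector_example (void)
{
  const int a[4] = { 10, 11, 12, 13 };
  const int b[4] = { 20, 21, 22, 23 };
  const int first_halves[4] = { 0, 4, 1, 5 };
  const int second_halves[4] = { 2, 6, 3, 7 };
  int out[4];

  /* {0, 4, 1, 5} interleaves the first halves of both operands.  */
  select_from_concat (a, b, first_halves, out);
  assert (out[0] == 10 && out[1] == 20 && out[2] == 11 && out[3] == 21);

  /* {2, 6, 3, 7} interleaves the second halves instead.  */
  select_from_concat (a, b, second_halves, out);
  assert (out[0] == 12 && out[1] == 22 && out[2] == 13 && out[3] == 23);
}
```

The two selectors pick disjoint elements, so a pass that reasons from
the RTL pattern sees a different operation depending on which list is
attached to the insn.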

Co-authored-by: Xionghu Luo 

PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghw_direct_<mode>): Rename
to ...
(altivec_vmrghw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrghw_direct_<mode>_le): New define_insn.
(altivec_vmrglw_direct_<mode>): Rename to ...
(altivec_vmrglw_direct_<mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrglw_direct_<mode>_le): New define_insn.
(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
(vec_widen_umult_hi_v8hi): Adjust the call to
gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
and by gen_altivec_vmrglw for LE.
(vec_widen_smult_hi_v8hi): Likewise.
(vec_widen_umult_lo_v8hi): Adjust the call to
gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
and by gen_altivec_vmrghw for LE
(vec_widen_smult_lo_v8hi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghw_direct_v4si by
CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrghw_direct_v4si_le for LE.  And replace
CODE_FOR_altivec_vmrglw_direct_v4si by
CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.
* config/rs6000/vsx.md (vsx_xxmrghw_<mode>): Adjust by calling
gen_altivec_vmrghw_di

[gcc r14-10342] rs6000: Don't clobber return value when eh_return called [PR114846]

2024-06-23 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:2b5e8f918ef0027d2af8e53c4e114e1d133fc609

commit r14-10342-g2b5e8f918ef0027d2af8e53c4e114e1d133fc609
Author: Kewen Lin 
Date:   Tue May 28 21:13:40 2024 -0500

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, when eh_return
is involved, restoring some registers for EH RETURN DATA in
the epilogue can currently clobber the register holding the
return value.  Following the existing handling in some other
targets, this patch makes the eh_return expander emit a new
define_insn_and_split eh_return_internal, which directly
calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN, instead of specially treating a
normal return when crtl->calls_eh_return is set.

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.cc (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)

Diff:
---
 gcc/config/rs6000/rs6000-logue.cc   |  7 +++
 gcc/config/rs6000/rs6000.md | 15 +++
 gcc/testsuite/gcc.target/powerpc/pr114846.c | 20 
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc
index 60ba15a8bc3..bd5d56ba002 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -4308,9 +4308,6 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   rs6000_stack_t *info = rs6000_stack_info ();
 
-  if (epilogue_type == EPILOGUE_TYPE_NORMAL && crtl->calls_eh_return)
-epilogue_type = EPILOGUE_TYPE_EH_RETURN;
-
   int strategy = info->savres_strategy;
   bool using_load_multiple = !!(strategy & REST_MULTIPLE);
   bool restoring_GPRs_inline = !!(strategy & REST_INLINE_GPRS);
@@ -4788,7 +4785,9 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   /* In the ELFv2 ABI we need to restore all call-saved CR fields from
  *separate* slots if the routine calls __builtin_eh_return, so
- that they can be independently restored by the unwinder.  */
+ that they can be independently restored by the unwinder.  Since
+ it is for CR fields restoring, it should be done for any epilogue
+ types (not EPILOGUE_TYPE_EH_RETURN specific).  */
   if (DEFAULT_ABI == ABI_ELFv2 && crtl->calls_eh_return)
 {
   int i, cr_off = info->ehcr_offset;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ac5651d7420..d4120c3b9ce 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14281,6 +14281,8 @@
   ""
 {
   emit_insn (gen_eh_set_lr (Pmode, operands[0]));
+  emit_jump_insn (gen_eh_return_internal ());
+  emit_barrier ();
   DONE;
 })
 
@@ -14297,6 +14299,19 @@
   DONE;
 })
 
+(define_insn_and_split "eh_return_internal"
+  [(eh_return)]
+  ""
+  "#"
+  "epilogue_completed"
+  [(const_int 0)]
+{
+  if (!TARGET_SCHED_PROLOG)
+emit_insn (gen_blockage ());
+  rs6000_emit_epilogue (EPILOGUE_TYPE_EH_RETURN);
+  DONE;
+})
+
 (define_insn "prefetch"
   [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
 (match_operand:SI 1 "const_int_operand" "n")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114846.c b/gcc/testsuite/gcc.target/powerpc/pr114846.c
new file mode 100644
index 000..efe2300b73a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114846.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-require-effective-target builtin_eh_return } */
+
+/* Ensure it runs successfully.  */
+
+__attribute__ ((noipa))
+int f (int *a, long offset, void *handler)
+{
+  if (*a == 5)
+return 5;
+  __builtin_eh_return (offset, handler);
+}
+
+int main ()
+{
+  int t = 5;
+  if (f (&t, 0, 0) != 5)
+__builtin_abort ();
+  return 0;
+}


[gcc r13-8866] rs6000: Don't clobber return value when eh_return called [PR114846]

2024-06-23 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:dd54ed4ae417935300a3c4bb356d37c2ae7f731e

commit r13-8866-gdd54ed4ae417935300a3c4bb356d37c2ae7f731e
Author: Kewen Lin 
Date:   Tue May 28 21:13:40 2024 -0500

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, when eh_return
is involved, restoring some registers for EH RETURN DATA in
the epilogue can currently clobber the register holding the
return value.  Following the existing handling in some other
targets, this patch makes the eh_return expander emit a new
define_insn_and_split eh_return_internal, which directly
calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN, instead of specially treating a
normal return when crtl->calls_eh_return is set.

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.cc (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)

Diff:
---
 gcc/config/rs6000/rs6000-logue.cc   |  7 +++
 gcc/config/rs6000/rs6000.md | 15 +++
 gcc/testsuite/gcc.target/powerpc/pr114846.c | 20 
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc
index d6c9c6e5b52..baadbbd692e 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -4311,9 +4311,6 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   rs6000_stack_t *info = rs6000_stack_info ();
 
-  if (epilogue_type == EPILOGUE_TYPE_NORMAL && crtl->calls_eh_return)
-epilogue_type = EPILOGUE_TYPE_EH_RETURN;
-
   int strategy = info->savres_strategy;
   bool using_load_multiple = !!(strategy & REST_MULTIPLE);
   bool restoring_GPRs_inline = !!(strategy & REST_INLINE_GPRS);
@@ -4791,7 +4788,9 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   /* In the ELFv2 ABI we need to restore all call-saved CR fields from
  *separate* slots if the routine calls __builtin_eh_return, so
- that they can be independently restored by the unwinder.  */
+ that they can be independently restored by the unwinder.  Since
+ it is for CR fields restoring, it should be done for any epilogue
+ types (not EPILOGUE_TYPE_EH_RETURN specific).  */
   if (DEFAULT_ABI == ABI_ELFv2 && crtl->calls_eh_return)
 {
   int i, cr_off = info->ehcr_offset;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 7e33fb4039a..8d8118197da 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14147,6 +14147,8 @@
   ""
 {
   emit_insn (gen_eh_set_lr (Pmode, operands[0]));
+  emit_jump_insn (gen_eh_return_internal ());
+  emit_barrier ();
   DONE;
 })
 
@@ -14163,6 +14165,19 @@
   DONE;
 })
 
+(define_insn_and_split "eh_return_internal"
+  [(eh_return)]
+  ""
+  "#"
+  "epilogue_completed"
+  [(const_int 0)]
+{
+  if (!TARGET_SCHED_PROLOG)
+emit_insn (gen_blockage ());
+  rs6000_emit_epilogue (EPILOGUE_TYPE_EH_RETURN);
+  DONE;
+})
+
 (define_insn "prefetch"
   [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
 (match_operand:SI 1 "const_int_operand" "n")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114846.c b/gcc/testsuite/gcc.target/powerpc/pr114846.c
new file mode 100644
index 000..efe2300b73a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114846.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-require-effective-target builtin_eh_return } */
+
+/* Ensure it runs successfully.  */
+
+__attribute__ ((noipa))
+int f (int *a, long offset, void *handler)
+{
+  if (*a == 5)
+return 5;
+  __builtin_eh_return (offset, handler);
+}
+
+int main ()
+{
+  int t = 5;
+  if (f (&t, 0, 0) != 5)
+__builtin_abort ();
+  return 0;
+}


[gcc r12-10579] rs6000: Don't clobber return value when eh_return called [PR114846]

2024-06-23 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:0fd6ae9b20913ab84d596448e14411eedbd324f9

commit r12-10579-g0fd6ae9b20913ab84d596448e14411eedbd324f9
Author: Kewen Lin 
Date:   Tue May 28 21:13:40 2024 -0500

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, when eh_return
is involved, restoring the EH RETURN DATA registers in the
epilogue can currently clobber the register holding the
return value.  Following the existing handling in some other
targets, this patch makes the eh_return expander emit a new
define_insn_and_split, eh_return_internal, which directly
calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN instead of the previous way of treating a
normal return with crtl->calls_eh_return specially.
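
The failure mode can be illustrated with a toy model.  This is only a
sketch, not GCC code: Frame, emit_epilogue and value_seen_by_caller are
hypothetical names, and it assumes that the EH return data registers
start at r3, the same register that carries the return value.

```cpp
#include <array>
#include <cassert>

// Toy model only: none of these names exist in GCC.
enum class EpilogueType { Normal, EhReturn };

struct Frame {
  std::array<long, 32> gpr{};     // simulated general purpose registers
  std::array<long, 4> eh_slot{};  // saved EH return data, assumed to map to r3..r6
  bool calls_eh_return = false;
};

// Restores the EH return data registers only for an EhReturn epilogue.
// With old_behaviour set, a Normal epilogue is silently promoted to
// EhReturn whenever the function calls eh_return somewhere.
void emit_epilogue(Frame &f, EpilogueType type, bool old_behaviour) {
  if (old_behaviour && type == EpilogueType::Normal && f.calls_eh_return)
    type = EpilogueType::EhReturn;
  if (type == EpilogueType::EhReturn)
    for (int i = 0; i < 4; ++i)
      f.gpr[3 + i] = f.eh_slot[i];
}

// Models the "return 5;" path of a function that also calls eh_return.
long value_seen_by_caller(bool old_behaviour) {
  Frame f;
  f.calls_eh_return = true;
  f.eh_slot = {-1, -1, -1, -1};  // whatever happens to be in the slots
  f.gpr[3] = 5;                  // return value placed in r3
  emit_epilogue(f, EpilogueType::Normal, old_behaviour);
  return f.gpr[3];
}
```

In this model value_seen_by_caller (true) loses the 5, while
value_seen_by_caller (false) keeps it, which is the effect the patch
is after for the normal return path.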

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.cc (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)

Diff:
---
 gcc/config/rs6000/rs6000-logue.cc   |  7 +++
 gcc/config/rs6000/rs6000.md | 15 +++
 gcc/testsuite/gcc.target/powerpc/pr114846.c | 20 
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc
index a868ede24fb..33077b72611 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -4283,9 +4283,6 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   rs6000_stack_t *info = rs6000_stack_info ();
 
-  if (epilogue_type == EPILOGUE_TYPE_NORMAL && crtl->calls_eh_return)
-epilogue_type = EPILOGUE_TYPE_EH_RETURN;
-
   int strategy = info->savres_strategy;
   bool using_load_multiple = !!(strategy & REST_MULTIPLE);
   bool restoring_GPRs_inline = !!(strategy & REST_INLINE_GPRS);
@@ -4763,7 +4760,9 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   /* In the ELFv2 ABI we need to restore all call-saved CR fields from
  *separate* slots if the routine calls __builtin_eh_return, so
- that they can be independently restored by the unwinder.  */
+ that they can be independently restored by the unwinder.  Since
+ it is for CR fields restoring, it should be done for any epilogue
+ types (not EPILOGUE_TYPE_EH_RETURN specific).  */
   if (DEFAULT_ABI == ABI_ELFv2 && crtl->calls_eh_return)
 {
   int i, cr_off = info->ehcr_offset;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2f69c89c689..c38bebde185 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14039,6 +14039,8 @@
   ""
 {
   emit_insn (gen_eh_set_lr (Pmode, operands[0]));
+  emit_jump_insn (gen_eh_return_internal ());
+  emit_barrier ();
   DONE;
 })
 
@@ -14055,6 +14057,19 @@
   DONE;
 })
 
+(define_insn_and_split "eh_return_internal"
+  [(eh_return)]
+  ""
+  "#"
+  "epilogue_completed"
+  [(const_int 0)]
+{
+  if (!TARGET_SCHED_PROLOG)
+emit_insn (gen_blockage ());
+  rs6000_emit_epilogue (EPILOGUE_TYPE_EH_RETURN);
+  DONE;
+})
+
 (define_insn "prefetch"
   [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
 (match_operand:SI 1 "const_int_operand" "n")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114846.c b/gcc/testsuite/gcc.target/powerpc/pr114846.c
new file mode 100644
index 000..efe2300b73a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114846.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-require-effective-target builtin_eh_return } */
+
+/* Ensure it runs successfully.  */
+
+__attribute__ ((noipa))
+int f (int *a, long offset, void *handler)
+{
+  if (*a == 5)
+return 5;
+  __builtin_eh_return (offset, handler);
+}
+
+int main ()
+{
+  int t = 5;
+  if (f (&t, 0, 0) != 5)
+__builtin_abort ();
+  return 0;
+}


[gcc r11-11535] rs6000: Don't clobber return value when eh_return called [PR114846]

2024-06-23 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:549701628b64a7c4ac9bb5f9623e83a8dc1d828c

commit r11-11535-g549701628b64a7c4ac9bb5f9623e83a8dc1d828c
Author: Kewen Lin 
Date:   Tue May 28 21:13:40 2024 -0500

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, when eh_return
is involved, restoring the EH RETURN DATA registers in the
epilogue can currently clobber the register holding the
return value.  Following the existing handling in some other
targets, this patch makes the eh_return expander emit a new
define_insn_and_split, eh_return_internal, which directly
calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN instead of the previous way of treating a
normal return with crtl->calls_eh_return specially.

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)

Diff:
---
 gcc/config/rs6000/rs6000-logue.c|  7 +++
 gcc/config/rs6000/rs6000.md | 15 +++
 gcc/testsuite/gcc.target/powerpc/pr114846.c | 20 
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c
index bdcb37c72f5..f84af5515bb 100644
--- a/gcc/config/rs6000/rs6000-logue.c
+++ b/gcc/config/rs6000/rs6000-logue.c
@@ -4283,9 +4283,6 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   rs6000_stack_t *info = rs6000_stack_info ();
 
-  if (epilogue_type == EPILOGUE_TYPE_NORMAL && crtl->calls_eh_return)
-epilogue_type = EPILOGUE_TYPE_EH_RETURN;
-
   int strategy = info->savres_strategy;
   bool using_load_multiple = !!(strategy & REST_MULTIPLE);
   bool restoring_GPRs_inline = !!(strategy & REST_INLINE_GPRS);
@@ -4763,7 +4760,9 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   /* In the ELFv2 ABI we need to restore all call-saved CR fields from
  *separate* slots if the routine calls __builtin_eh_return, so
- that they can be independently restored by the unwinder.  */
+ that they can be independently restored by the unwinder.  Since
+ it is for CR fields restoring, it should be done for any epilogue
+ types (not EPILOGUE_TYPE_EH_RETURN specific).  */
   if (DEFAULT_ABI == ABI_ELFv2 && crtl->calls_eh_return)
 {
   int i, cr_off = info->ehcr_offset;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3462205b532..59091593780 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -13764,6 +13764,8 @@
   ""
 {
   emit_insn (gen_eh_set_lr (Pmode, operands[0]));
+  emit_jump_insn (gen_eh_return_internal ());
+  emit_barrier ();
   DONE;
 })
 
@@ -13780,6 +13782,19 @@
   DONE;
 })
 
+(define_insn_and_split "eh_return_internal"
+  [(eh_return)]
+  ""
+  "#"
+  "epilogue_completed"
+  [(const_int 0)]
+{
+  if (!TARGET_SCHED_PROLOG)
+emit_insn (gen_blockage ());
+  rs6000_emit_epilogue (EPILOGUE_TYPE_EH_RETURN);
+  DONE;
+})
+
 (define_insn "prefetch"
   [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
 (match_operand:SI 1 "const_int_operand" "n")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114846.c b/gcc/testsuite/gcc.target/powerpc/pr114846.c
new file mode 100644
index 000..efe2300b73a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114846.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-require-effective-target builtin_eh_return } */
+
+/* Ensure it runs successfully.  */
+
+__attribute__ ((noipa))
+int f (int *a, long offset, void *handler)
+{
+  if (*a == 5)
+return 5;
+  __builtin_eh_return (offset, handler);
+}
+
+int main ()
+{
+  int t = 5;
+  if (f (&t, 0, 0) != 5)
+__builtin_abort ();
+  return 0;
+}


[gcc r15-5302] rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p4

2024-11-14 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:2e22882f3ec88f540c2255ddce4fb69ac69911b7

commit r15-5302-g2e22882f3ec88f540c2255ddce4fb69ac69911b7
Author: Kewen Lin 
Date:   Fri Nov 15 03:46:33 2024 +

rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p4

All kinds of vector float comparison operators are now
supported by an RTL comparison pattern in vector.md, so we can
just emit an rtx comparison insn with the given comparison
operator in function rs6000_emit_vector_compare instead of
checking and handling the reverse condition cases.

This is part 4, which further handles the comparison operators
LT/UNGE.  In rs6000_emit_vector_compare, LT is handled by
switching to code GT, swapping the operands and trying again,
which is exactly the same as what we have in vector.md:

; lt(a,b)   = gt(b,a)

As for UNGE, rs6000_emit_vector_compare uses the reversed
code LT and then inverts the result with one_cmpl, which is
also the same as what's in vector.md:

; unge(a,b) = ~lt(a,b)
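
As a scalar sanity check of the two identities above (a sketch only;
the helper names are made up and nothing here is GCC code), including
the unordered NaN case:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Scalar stand-ins for the per-lane comparisons.
bool gt(double a, double b) { return a > b; }
bool lt(double a, double b) { return a < b; }
// UNGE is true when the operands are unordered or a >= b.
bool unge(double a, double b) { return std::isunordered(a, b) || a >= b; }

// Checks lt(a,b) == gt(b,a) and unge(a,b) == !lt(a,b) on a few values.
bool lt_unge_identities_hold() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  const double vals[] = {-1.0, 0.0, 2.5, nan};
  for (double a : vals)
    for (double b : vals) {
      if (lt(a, b) != gt(b, a)) return false;
      if (unge(a, b) != !lt(a, b)) return false;
    }
  return true;
}
```

lt_unge_identities_hold () is expected to return true, since an
ordered "less than" is false whenever either operand is a NaN.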

This patch should not introduce any functionality change either.

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_vector_compare_inner): Emit rtx
comparison for operators LT/UNGE of MODE_VECTOR_FLOAT directly.
(rs6000_emit_vector_compare): Move assertion of no MODE_VECTOR_FLOAT to
function beginning.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 9cde30853f76..16e7b3521019 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16015,6 +16015,7 @@ static rtx
 rs6000_emit_vector_compare_inner (enum rtx_code code, rtx op0, rtx op1)
 {
   machine_mode mode = GET_MODE (op0);
+  gcc_assert (GET_MODE_CLASS (mode) != MODE_VECTOR_FLOAT);
 
   switch (code)
 {
@@ -16024,7 +16025,6 @@ rs6000_emit_vector_compare_inner (enum rtx_code code, rtx op0, rtx op1)
 case EQ:
 case GT:
 case GTU:
-  gcc_assert (GET_MODE_CLASS (mode) != MODE_VECTOR_FLOAT);
   rtx mask = gen_reg_rtx (mode);
   emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, mode, op0, op1)));
   return mask;
@@ -16049,18 +16049,8 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
  comparison operators in a comparison rtl pattern, we can
  just emit the comparison rtx insn directly here.  Besides,
  we should have a centralized place to handle the possibility
- of raising invalid exception.  For EQ/GT/GE/UNORDERED/
- ORDERED/LTGT/UNEQ, they are handled equivalently as before;
- for NE/UNLE/UNLT, they are handled with reversed code
- and inverting, it's the same as before; for LE/UNGT, they
- are handled with LE ior EQ previously, emitting directly
- here will make use of GE later, it's slightly better;
-
- FIXME: Handle the remaining vector float comparison operators
- here.  */
-  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT
-  && rcode != LT
-  && rcode != UNGE)
+ of raising invalid exception.  */
+  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT)
 {
   mask = gen_reg_rtx (dmode);
   emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
@@ -16088,23 +16078,17 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   try_again = true;
   break;
 case NE:
-case UNGE:
   /* Invert condition and try again.
 e.g., A != B becomes ~(A==B).  */
   {
-   enum rtx_code rev_code;
enum insn_code nor_code;
rtx mask2;
 
-   rev_code = reverse_condition_maybe_unordered (rcode);
-   if (rev_code == UNKNOWN)
- return NULL_RTX;
-
nor_code = optab_handler (one_cmpl_optab, dmode);
if (nor_code == CODE_FOR_nothing)
  return NULL_RTX;
 
-   mask2 = rs6000_emit_vector_compare (rev_code, op0, op1, dmode);
+   mask2 = rs6000_emit_vector_compare (EQ, op0, op1, dmode);
if (!mask2)
  return NULL_RTX;


[gcc r15-5303] rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p1

2024-11-14 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:311bcf9d4c3950e75a8ea83f8b1dd1facffd1910

commit r15-5303-g311bcf9d4c3950e75a8ea83f8b1dd1facffd1910
Author: Kewen Lin 
Date:   Fri Nov 15 03:46:33 2024 +

rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p1

The current handling in rs6000_emit_vector_compare is a bit
complicated to me, especially now that we emit vector float
comparison insns with the given code directly.  So it's better
to refactor the handling of vector integer comparison here.

This is part 1, which removes the helper function
rs6000_emit_vector_compare_inner and moves its logic into
rs6000_emit_vector_compare.  This patch doesn't introduce any
functionality change.

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_vector_compare_inner): Remove.
(rs6000_emit_vector_compare): Emit rtx comparison for operators
EQ/GT/GTU directly.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 37 +
 1 file changed, 9 insertions(+), 28 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 16e7b3521019..44477657bc29 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16009,30 +16009,6 @@ output_cbranch (rtx op, const char *label, int reversed, rtx_insn *insn)
   return string;
 }
 
-/* Return insn for VSX or Altivec comparisons.  */
-
-static rtx
-rs6000_emit_vector_compare_inner (enum rtx_code code, rtx op0, rtx op1)
-{
-  machine_mode mode = GET_MODE (op0);
-  gcc_assert (GET_MODE_CLASS (mode) != MODE_VECTOR_FLOAT);
-
-  switch (code)
-{
-default:
-  break;
-
-case EQ:
-case GT:
-case GTU:
-  rtx mask = gen_reg_rtx (mode);
-  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, mode, op0, op1)));
-  return mask;
-}
-
-  return NULL_RTX;
-}
-
 /* Emit vector compare for operands OP0 and OP1 using code RCODE.
DMODE is expected destination mode. This is a recursive function.  */
 
@@ -16057,10 +16033,15 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   return mask;
 }
 
-  /* See if the comparison works as is.  */
-  mask = rs6000_emit_vector_compare_inner (rcode, op0, op1);
-  if (mask)
-return mask;
+  /* For any of vector integer comparison operators for which we
+ have direct hardware instructions, just emit it directly
+ here.  */
+  if (rcode == EQ || rcode == GT || rcode == GTU)
+{
+  mask = gen_reg_rtx (dmode);
+  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
+  return mask;
+}
 
   bool swap_operands = false;
   bool try_again = false;


[gcc r15-5306] rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p4

2024-11-14 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:c8e5c0e01ecfc8b7bb98359242d36614155a6606

commit r15-5306-gc8e5c0e01ecfc8b7bb98359242d36614155a6606
Author: Kewen Lin 
Date:   Fri Nov 15 03:46:33 2024 +

rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p4

The current handling in rs6000_emit_vector_compare is a bit
complicated to me, especially now that we emit vector float
comparison insns with the given code directly.  So it's better
to refactor the handling of vector integer comparison here.

This is part 4, which reworks the handling of GE/GEU/LE/LEU
and makes the function no longer recursive.  This patch
doesn't introduce any functionality change.
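
For integers, the previous "GT/LT ior EQ" forms and the single
"compare then invert" forms agree.  The following is a scalar sketch
only (gt and ge_le_rewrites_hold are hypothetical names, not GCC
functions) that checks this exhaustively for one-byte signed and
unsigned values:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Scalar stand-in for the native GT/GTU compare; signedness comes from T.
template <typename T>
bool gt(T a, T b) { return a > b; }

// Compares the old two-compare forms with the inverted single compare:
//   ge(a,b) = gt(a,b) | eq(a,b) = ~gt(b,a)
//   le(a,b) = lt(a,b) | eq(a,b) = ~gt(a,b)
template <typename T>
bool ge_le_rewrites_hold() {
  const int lo = std::numeric_limits<T>::min();
  const int hi = std::numeric_limits<T>::max();
  for (int i = lo; i <= hi; ++i)
    for (int j = lo; j <= hi; ++j) {
      const T a = static_cast<T>(i), b = static_cast<T>(j);
      const bool ge_old = gt(a, b) || a == b;
      const bool le_old = gt(b, a) || a == b;
      if (ge_old != !gt(b, a)) return false;
      if (le_old != !gt(a, b)) return false;
    }
  return true;
}
```

Instantiating it with std::int8_t models GE/LE and with std::uint8_t
models GEU/LEU.  Note that the GE/GEU form needs the operands swapped
before the inversion.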

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_vector_compare): Refine the
handlings for operators GE/GEU/LE/LEU.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 87 +
 1 file changed, 17 insertions(+), 70 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 1f20782d66b1..bad5e4196537 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16010,7 +16010,7 @@ output_cbranch (rtx op, const char *label, int reversed, rtx_insn *insn)
 }
 
 /* Emit vector compare for operands OP0 and OP1 using code RCODE.
-   DMODE is expected destination mode. This is a recursive function.  */
+   DMODE is expected destination mode.  */
 
 static rtx
 rs6000_emit_vector_compare (enum rtx_code rcode,
@@ -16019,7 +16019,7 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
 {
   gcc_assert (VECTOR_UNIT_ALTIVEC_OR_VSX_P (dmode));
   gcc_assert (GET_MODE (op0) == GET_MODE (op1));
-  rtx mask;
+  rtx mask = gen_reg_rtx (dmode);
 
   /* In vector.md, we support all kinds of vector float point
  comparison operators in a comparison rtl pattern, we can
@@ -16028,7 +16028,6 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
  of raising invalid exception.  */
   if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT)
 {
-  mask = gen_reg_rtx (dmode);
   emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
   return mask;
 }
@@ -16037,11 +16036,7 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
  have direct hardware instructions, just emit it directly
  here.  */
   if (rcode == EQ || rcode == GT || rcode == GTU)
-{
-  mask = gen_reg_rtx (dmode);
-  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
-  return mask;
-}
+emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
   else if (rcode == LT || rcode == LTU)
 {
   /* lt{,u}(a,b) = gt{,u}(b,a)  */
@@ -16049,76 +16044,28 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   std::swap (op0, op1);
   mask = gen_reg_rtx (dmode);
   emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));
-  return mask;
 }
-  else if (rcode == NE)
+  else if (rcode == NE || rcode == LE || rcode == LEU)
 {
-  /* ne(a,b) = ~eq(a,b)  */
+  /* ne(a,b) = ~eq(a,b); le{,u}(a,b) = ~gt{,u}(a,b)  */
+  enum rtx_code code = reverse_condition (rcode);
   mask = gen_reg_rtx (dmode);
-  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (EQ, dmode, op0, op1)));
+  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));
+  enum insn_code nor_code = optab_handler (one_cmpl_optab, dmode);
+  gcc_assert (nor_code != CODE_FOR_nothing);
+  emit_insn (GEN_FCN (nor_code) (mask, mask));
+} else {
+  /* ge{,u}(a,b) = ~gt{,u}(b,a)  */
+  gcc_assert (rcode == GE || rcode == GEU);
+  enum rtx_code code = rcode == GE ? GT : GTU;
+  mask = gen_reg_rtx (dmode);
+  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));
   enum insn_code nor_code = optab_handler (one_cmpl_optab, dmode);
   gcc_assert (nor_code != CODE_FOR_nothing);
   emit_insn (GEN_FCN (nor_code) (mask, mask));
-  return mask;
-}
-
-  switch (rcode)
-{
-case GE:
-case GEU:
-case LE:
-case LEU:
-  /* Try GT/GTU/LT/LTU OR EQ */
-  {
-   rtx c_rtx, eq_rtx;
-   enum insn_code ior_code;
-   enum rtx_code new_code;
-
-   switch (rcode)
- {
- case  GE:
-   new_code = GT;
-   break;
-
- case GEU:
-   new_code = GTU;
-   break;
-
- case LE:
-   new_code = LT;
-   break;
-
- case LEU:
-   new_code = LTU;
-   break;
-
- default:
-   gcc_unreachable ();
- }
-
-   ior_code = optab_handler (ior_optab, dmode);
-   if (ior_code == CODE_FOR_nothing)
- return NULL_RTX;
-
-   c_rtx = rs6000_emit_vector_compare (new_code, op0, op1, dmode);
-   if (!c_rtx)
- return NULL_RTX;
-
-   eq_rtx = rs6000_emit_vector_compare (EQ, op0, op1, dmode);
-   if (!eq_rtx)
- return NULL_

[gcc r15-5301] rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p3

2024-11-14 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:893ee27356b05e706c79e4551b628fb93645623e

commit r15-5301-g893ee27356b05e706c79e4551b628fb93645623e
Author: Kewen Lin 
Date:   Fri Nov 15 03:46:32 2024 +

rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p3

All kinds of vector float comparison operators are now
supported by an RTL comparison pattern in vector.md, so we can
just emit an rtx comparison insn with the given comparison
operator in function rs6000_emit_vector_compare instead of
checking and handling the reverse condition cases.

This is part 3, which further handles the comparison operators
LE/UNGT.  In rs6000_emit_vector_compare, UNGT is handled with
the reversed code LE followed by an inversion via one_cmpl_optab,
and LE is handled with LT ior EQ, while vector.md supports:

; le(a,b)   = ge(b,a)
; ungt(a,b) = ~le(a,b)
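
To see why this is an improvement, here is a lane-wise sketch (purely
illustrative; V4F, V4M and the helper names are assumptions, not GCC
or Power intrinsics) comparing the old two-compare form of LE with
the single swapped GE compare:

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

using V4F = std::array<float, 4>;
using V4M = std::array<std::uint32_t, 4>;  // all-ones lane = true, zero = false

// Applies a scalar predicate to each lane and builds a mask.
template <typename Pred>
V4M lanewise(const V4F &a, const V4F &b, Pred p) {
  V4M m{};
  for (std::size_t i = 0; i < 4; ++i)
    m[i] = p(a[i], b[i]) ? 0xFFFFFFFFu : 0u;
  return m;
}

// Old form: two compares plus an inclusive or.
V4M le_via_lt_ior_eq(const V4F &a, const V4F &b) {
  V4M lt = lanewise(a, b, [](float x, float y) { return x < y; });
  V4M eq = lanewise(a, b, [](float x, float y) { return x == y; });
  V4M m{};
  for (std::size_t i = 0; i < 4; ++i) m[i] = lt[i] | eq[i];
  return m;
}

// New form: le(a,b) = ge(b,a), a single compare.
V4M le_via_ge_swapped(const V4F &a, const V4F &b) {
  return lanewise(b, a, [](float x, float y) { return x >= y; });
}

// ungt(a,b) = ~le(a,b)
V4M ungt_via_not_le(const V4F &a, const V4F &b) {
  V4M m = le_via_ge_swapped(a, b);
  for (std::size_t i = 0; i < 4; ++i) m[i] = ~m[i];
  return m;
}

bool le_ungt_masks_agree() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  const V4F a = {1.0f, 2.0f, nan, 3.0f};
  const V4F b = {2.0f, 2.0f, 1.0f, nan};
  const V4M ungt_ref = lanewise(a, b, [](float x, float y) {
    return std::isunordered(x, y) || x > y;
  });
  return le_via_lt_ior_eq(a, b) == le_via_ge_swapped(a, b)
         && ungt_via_not_le(a, b) == ungt_ref;
}
```

Both LE forms produce the same mask, including on NaN lanes, but the
swapped GE form needs one compare instead of two compares and an ior.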

The associated test case shows it's an improvement.

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_vector_compare): Emit rtx
comparison for operators LE/UNGT of MODE_VECTOR_FLOAT directly.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/vcond-fp.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  9 -
 gcc/testsuite/gcc.target/powerpc/vcond-fp.c | 26 ++
 2 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 793fb95b660b..9cde30853f76 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16052,15 +16052,15 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
  of raising invalid exception.  For EQ/GT/GE/UNORDERED/
  ORDERED/LTGT/UNEQ, they are handled equivalently as before;
  for NE/UNLE/UNLT, they are handled with reversed code
- and inverting, it's the same as before.
+ and inverting, it's the same as before; for LE/UNGT, they
+ are handled with LE ior EQ previously, emitting directly
+ here will make use of GE later, it's slightly better;
 
  FIXME: Handle the remaining vector float comparison operators
  here.  */
   if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT
   && rcode != LT
-  && rcode != LE
-  && rcode != UNGE
-  && rcode != UNGT)
+  && rcode != UNGE)
 {
   mask = gen_reg_rtx (dmode);
   emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
@@ -16089,7 +16089,6 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   break;
 case NE:
 case UNGE:
-case UNGT:
   /* Invert condition and try again.
 e.g., A != B becomes ~(A==B).  */
   {
diff --git a/gcc/testsuite/gcc.target/powerpc/vcond-fp.c b/gcc/testsuite/gcc.target/powerpc/vcond-fp.c
new file mode 100644
index ..2a9f056a2aa2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vcond-fp.c
@@ -0,0 +1,26 @@
+/* { dg-require-effective-target powerpc_vsx } */
+/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model" } */
+/* { dg-additional-options "-mdejagnu-cpu=power8" { target { ! has_arch_pwr8 } } } */
+
+/* Test we use xvcmpge[sd]p rather than xvcmpeq[sd]p and xvcmpgt[sd]p
+   for UNGT and LE handlings.  */
+
+#define UNGT(a, b) (!__builtin_islessequal ((a), (b)))
+#define LE(a, b) (((a) <= (b)))
+
+#define TEST_VECT(NAME, TYPE)                                                  \
+  __attribute__ ((noipa)) void test_##NAME##_##TYPE (TYPE *x, TYPE *y,         \
+						     int *res, int n)          \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      res[i] = NAME (x[i], y[i]);                                              \
+  }
+
+#define TEST(TYPE)                                                             \
+  TEST_VECT (UNGT, TYPE)                                                       \
+  TEST_VECT (LE, TYPE)
+
+TEST (float)
+TEST (double)
+
+/* { dg-final { scan-assembler-not {\mxvcmp(gt|eq)[sd]p\M} } } */


[gcc r15-5299] rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p1

2024-11-14 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:a2da2fca004fd3002d45ba298f6203c7972f9eb6

commit r15-5299-ga2da2fca004fd3002d45ba298f6203c7972f9eb6
Author: Kewen Lin 
Date:   Fri Nov 15 03:46:32 2024 +

rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p1

All kinds of vector float comparison operators are now
supported by an RTL comparison pattern in vector.md, so we can
just emit an rtx comparison insn with the given comparison
operator in function rs6000_emit_vector_compare instead of
checking and handling the reverse condition cases.

This is part 1, which only handles the operators that were
already emitted with an rtx comparison in function
rs6000_emit_vector_compare_inner, namely EQ/GT/GE/ORDERED/
UNORDERED/UNEQ/LTGT.  There is no functionality change.

With this change, rs6000_emit_vector_compare_inner only
handles vector integer comparisons; it will be cleaned up
later in the vector integer comparison rework.

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_vector_compare_inner): Move
MODE_VECTOR_FLOAT handlings out.
(rs6000_emit_vector_compare): Emit rtx comparison for operators
EQ/GT/GE/UNORDERED/ORDERED/UNEQ/LTGT of MODE_VECTOR_FLOAT directly, and
adjust one call site of rs6000_emit_vector_compare_inner to
rs6000_emit_vector_compare.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 47 ++---
 1 file changed, 31 insertions(+), 16 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 950fd947fda3..692acbb76535 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16014,7 +16014,6 @@ output_cbranch (rtx op, const char *label, int reversed, rtx_insn *insn)
 static rtx
 rs6000_emit_vector_compare_inner (enum rtx_code code, rtx op0, rtx op1)
 {
-  rtx mask;
   machine_mode mode = GET_MODE (op0);
 
   switch (code)
@@ -16022,19 +16021,11 @@ rs6000_emit_vector_compare_inner (enum rtx_code code, rtx op0, rtx op1)
 default:
   break;
 
-case GE:
-  if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
-   return NULL_RTX;
-  /* FALLTHRU */
-
 case EQ:
 case GT:
 case GTU:
-case ORDERED:
-case UNORDERED:
-case UNEQ:
-case LTGT:
-  mask = gen_reg_rtx (mode);
+  gcc_assert (GET_MODE_CLASS (mode) != MODE_VECTOR_FLOAT);
+  rtx mask = gen_reg_rtx (mode);
   emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, mode, op0, op1)));
   return mask;
 }
@@ -16050,18 +16041,42 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
rtx op0, rtx op1,
machine_mode dmode)
 {
-  rtx mask;
-  bool swap_operands = false;
-  bool try_again = false;
-
   gcc_assert (VECTOR_UNIT_ALTIVEC_OR_VSX_P (dmode));
   gcc_assert (GET_MODE (op0) == GET_MODE (op1));
+  rtx mask;
+
+  /* In vector.md, we support all kinds of vector float point
+ comparison operators in a comparison rtl pattern, we can
+ just emit the comparison rtx insn directly here.  Besides,
+ we should have a centralized place to handle the possibility
+ of raising invalid exception.  As the first step, only check
+ operators EQ/GT/GE/UNORDERED/ORDERED/LTGT/UNEQ for now, they
+ are handled equivalently as before.
+
+ FIXME: Handle the remaining vector float comparison operators
+ here.  */
+  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT
+  && (rcode == EQ
+ || rcode == GT
+ || rcode == GE
+ || rcode == UNORDERED
+ || rcode == ORDERED
+ || rcode == LTGT
+ || rcode == UNEQ))
+{
+  mask = gen_reg_rtx (dmode);
+  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
+  return mask;
+}
 
   /* See if the comparison works as is.  */
   mask = rs6000_emit_vector_compare_inner (rcode, op0, op1);
   if (mask)
 return mask;
 
+  bool swap_operands = false;
+  bool try_again = false;
+
   switch (rcode)
 {
 case LT:
@@ -16161,7 +16176,7 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   if (swap_operands)
std::swap (op0, op1);
 
-  mask = rs6000_emit_vector_compare_inner (rcode, op0, op1);
+  mask = rs6000_emit_vector_compare (rcode, op0, op1, dmode);
   if (mask)
return mask;
 }


[gcc r15-5300] rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p2

2024-11-14 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:5210565ec17728eab289104aedd09d50731da8ec

commit r15-5300-g5210565ec17728eab289104aedd09d50731da8ec
Author: Kewen Lin 
Date:   Fri Nov 15 03:46:32 2024 +

rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p2

All kinds of vector float comparison operators are now
supported by an RTL comparison pattern in vector.md, so we can
just emit an rtx comparison insn with the given comparison
operator in function rs6000_emit_vector_compare instead of
checking and handling the reverse condition cases.

This is part 2, which further handles the comparison operators
NE/UNLE/UNLT.  In rs6000_emit_vector_compare, they are handled
with the reversed code, obtained from function
reverse_condition_maybe_unordered, followed by an inversion via
one_cmpl_optab.  That is the same as what we have in vector.md:

; ne(a,b)   = ~eq(a,b)
; unle(a,b) = ~gt(a,b)
; unlt(a,b) = ~ge(a,b)
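
The "reverse, then invert" scheme can be sketched on scalars as
follows.  This is an illustration under assumptions: Code, eval and
reversed are hypothetical names, and reversed only mirrors the
mapping listed above for these three codes.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar stand-ins for the RTL codes discussed above.
enum class Code { EQ, GT, GE, NE, UNLE, UNLT };

bool eval(Code c, double a, double b) {
  const bool unord = std::isunordered(a, b);
  switch (c) {
    case Code::EQ:   return a == b;
    case Code::GT:   return a > b;
    case Code::GE:   return a >= b;
    case Code::NE:   return a != b;  // also true when unordered
    case Code::UNLE: return unord || a <= b;
    case Code::UNLT: return unord || a < b;
  }
  return false;
}

// Mirrors the mapping above: NE -> EQ, UNLE -> GT, UNLT -> GE.
Code reversed(Code c) {
  switch (c) {
    case Code::NE:   return Code::EQ;
    case Code::UNLE: return Code::GT;
    case Code::UNLT: return Code::GE;
    default:         return c;
  }
}

// Checks code(a,b) == !reversed_code(a,b), including NaN operands.
bool reversed_then_inverted_matches() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  const double vals[] = {-3.0, 0.0, 7.0, nan};
  const Code codes[] = {Code::NE, Code::UNLE, Code::UNLT};
  for (Code c : codes)
    for (double a : vals)
      for (double b : vals)
        if (eval(c, a, b) != !eval(reversed(c), a, b)) return false;
  return true;
}
```

The check passes because each reversed code is an ordered comparison
that is false on NaN operands, so its inverse is true exactly when
the unordered-or form is.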

The operators on the right-hand side are already supported as of part 1.
This patch should not introduce any functionality change either.

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_vector_compare): Emit rtx
comparison for operators NE/UNLE/UNLT of MODE_VECTOR_FLOAT directly.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 692acbb76535..793fb95b660b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16049,20 +16049,18 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
  comparison operators in a comparison rtl pattern, we can
  just emit the comparison rtx insn directly here.  Besides,
  we should have a centralized place to handle the possibility
- of raising invalid exception.  As the first step, only check
- operators EQ/GT/GE/UNORDERED/ORDERED/LTGT/UNEQ for now, they
- are handled equivalently as before.
+ of raising invalid exception.  For EQ/GT/GE/UNORDERED/
+ ORDERED/LTGT/UNEQ, they are handled equivalently as before;
+ for NE/UNLE/UNLT, they are handled with reversed code
+ and inverting, it's the same as before.
 
  FIXME: Handle the remaining vector float comparison operators
  here.  */
   if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT
-  && (rcode == EQ
- || rcode == GT
- || rcode == GE
- || rcode == UNORDERED
- || rcode == ORDERED
- || rcode == LTGT
- || rcode == UNEQ))
+  && rcode != LT
+  && rcode != LE
+  && rcode != UNGE
+  && rcode != UNGT)
 {
   mask = gen_reg_rtx (dmode);
   emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
@@ -16090,8 +16088,6 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   try_again = true;
   break;
 case NE:
-case UNLE:
-case UNLT:
 case UNGE:
 case UNGT:
   /* Invert condition and try again.


[gcc r15-5307] rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p5

2024-11-14 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:96a468842ef8b5d9b971428c7ba4e14fdab5ea94

commit r15-5307-g96a468842ef8b5d9b971428c7ba4e14fdab5ea94
Author: Kewen Lin 
Date:   Fri Nov 15 03:46:33 2024 +

rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p5

The current handling in rs6000_emit_vector_compare is a bit
complicated to me, especially now that we emit vector float
comparison insns with the given code directly.  So it's better
to refactor the handling of vector integer comparison here.

This is part 5, which refactors all the handling of vector
integer comparison to make it neat.  This patch doesn't
introduce any functionality change.
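
The refactored shape (pick a native code, optionally swap the
operands, optionally invert) can be modelled on scalars.  This is a
sketch under assumptions, not the GCC implementation: Code, Lowering,
lower, reference and lowered are hypothetical names, and 32-bit
scalars stand in for vector lanes.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

enum class Code { EQ, NE, GT, GTU, LT, LTU, LE, LEU, GE, GEU };

// One native compare (EQ/GT/GTU) plus two optional fix-ups.
struct Lowering { Code native; bool swap_operands; bool need_invert; };

Lowering lower(Code rcode) {
  switch (rcode) {
    case Code::EQ: case Code::GT: case Code::GTU:
      return {rcode, false, false};            // native compare
    case Code::LT:  return {Code::GT, true, false};   // lt(a,b) = gt(b,a)
    case Code::LTU: return {Code::GTU, true, false};
    case Code::NE:  return {Code::EQ, false, true};   // ne(a,b) = ~eq(a,b)
    case Code::LE:  return {Code::GT, false, true};   // le(a,b) = ~gt(a,b)
    case Code::LEU: return {Code::GTU, false, true};
    case Code::GE:  return {Code::GT, true, true};    // ge(a,b) = ~gt(b,a)
    case Code::GEU: return {Code::GTU, true, true};
  }
  return {Code::EQ, false, false};  // not reached
}

// Direct meaning of each code; the *U codes compare as unsigned.
bool reference(Code c, std::int32_t a, std::int32_t b) {
  const auto ua = static_cast<std::uint32_t>(a);
  const auto ub = static_cast<std::uint32_t>(b);
  switch (c) {
    case Code::EQ:  return a == b;
    case Code::NE:  return a != b;
    case Code::GT:  return a > b;
    case Code::GTU: return ua > ub;
    case Code::LT:  return a < b;
    case Code::LTU: return ua < ub;
    case Code::LE:  return a <= b;
    case Code::LEU: return ua <= ub;
    case Code::GE:  return a >= b;
    case Code::GEU: return ua >= ub;
  }
  return false;
}

// Evaluates a code using only EQ/GT/GTU, a swap and an inversion.
bool lowered(Code c, std::int32_t a, std::int32_t b) {
  const Lowering l = lower(c);
  if (l.swap_operands) std::swap(a, b);
  const bool r = reference(l.native, a, b);
  return l.need_invert ? !r : r;
}

bool lowering_matches_reference() {
  const std::int32_t vals[] = {INT32_MIN, -1, 0, 1, INT32_MAX};
  const Code codes[] = {Code::EQ, Code::NE, Code::GT, Code::GTU, Code::LT,
                        Code::LTU, Code::LE, Code::LEU, Code::GE, Code::GEU};
  for (Code c : codes)
    for (std::int32_t a : vals)
      for (std::int32_t b : vals)
        if (lowered(c, a, b) != reference(c, a, b)) return false;
  return true;
}
```

Keeping the decision in one table-like switch and the emission in one
place is the design choice the refactoring makes; the model only
checks that the three knobs are enough to cover all ten codes.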

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_vector_compare): Refactor the
handlings of vector integer comparison.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 68 +
 1 file changed, 44 insertions(+), 24 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index bad5e4196537..0d7ee1e5bdf2 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16032,34 +16032,54 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   return mask;
 }
 
-  /* For any of vector integer comparison operators for which we
- have direct hardware instructions, just emit it directly
- here.  */
-  if (rcode == EQ || rcode == GT || rcode == GTU)
-emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
-  else if (rcode == LT || rcode == LTU)
+  bool swap_operands = false;
+  bool need_invert = false;
+  enum rtx_code code = rcode;
+
+  switch (rcode)
 {
+case EQ:
+case GT:
+case GTU:
+  /* Emit directly with native hardware insn.  */
+  break;
+case LT:
+case LTU:
   /* lt{,u}(a,b) = gt{,u}(b,a)  */
-  enum rtx_code code = swap_condition (rcode);
-  std::swap (op0, op1);
-  mask = gen_reg_rtx (dmode);
-  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));
+  code = swap_condition (rcode);
+  swap_operands = true;
+  break;
+case NE:
+case LE:
+case LEU:
+  /* ne(a,b) = ~eq(a,b); le{,u}(a,b) = ~gt{,u}(a,b)  */
+  code = reverse_condition (rcode);
+  need_invert = true;
+  break;
+case GE:
+  /* ge(a,b) = ~gt(b,a)  */
+  code = GT;
+  swap_operands = true;
+  need_invert = true;
+  break;
+case GEU:
+  /* geu(a,b) = ~gtu(b,a)  */
+  code = GTU;
+  swap_operands = true;
+  need_invert = true;
+  break;
+default:
+  gcc_unreachable ();
+  break;
 }
-  else if (rcode == NE || rcode == LE || rcode == LEU)
+
+  if (swap_operands)
+std::swap (op0, op1);
+
+  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));
+
+  if (need_invert)
 {
-  /* ne(a,b) = ~eq(a,b); le{,u}(a,b) = ~gt{,u}(a,b)  */
-  enum rtx_code code = reverse_condition (rcode);
-  mask = gen_reg_rtx (dmode);
-  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));
-  enum insn_code nor_code = optab_handler (one_cmpl_optab, dmode);
-  gcc_assert (nor_code != CODE_FOR_nothing);
-  emit_insn (GEN_FCN (nor_code) (mask, mask));
-} else {
-  /* ge{,u}(a,b) = ~gt{,u}(b,a)  */
-  gcc_assert (rcode == GE || rcode == GEU);
-  enum rtx_code code = rcode == GE ? GT : GTU;
-  mask = gen_reg_rtx (dmode);
-  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));
   enum insn_code nor_code = optab_handler (one_cmpl_optab, dmode);
   gcc_assert (nor_code != CODE_FOR_nothing);
   emit_insn (GEN_FCN (nor_code) (mask, mask));
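
The case analysis in the refactored switch can be checked on scalars.  The
following is only a sketch, not GCC code: Cmp, Plan, lower, native, lowered,
reference and agrees are hypothetical names, and one 32-bit integer stands in
for a vector lane.  It assumes, as the patch comments state, that only EQ, GT
and GTU are emitted natively.

```cpp
#include <cstdint>

// Hypothetical scalar model of the comparison codes handled by the switch.
enum class Cmp { EQ, NE, GT, GTU, LT, LTU, GE, GEU, LE, LEU };

struct Plan {
  Cmp code;            // comparison actually emitted (EQ, GT or GTU)
  bool swap_operands;  // emit with (op1, op0) instead of (op0, op1)
  bool need_invert;    // complement the resulting mask afterwards
};

// Mirrors the case analysis described in the patch.
constexpr Plan lower(Cmp rcode) {
  switch (rcode) {
    case Cmp::EQ:  return {Cmp::EQ, false, false};
    case Cmp::GT:  return {Cmp::GT, false, false};
    case Cmp::GTU: return {Cmp::GTU, false, false};
    case Cmp::LT:  return {Cmp::GT, true, false};   // lt(a,b) = gt(b,a)
    case Cmp::LTU: return {Cmp::GTU, true, false};  // ltu(a,b) = gtu(b,a)
    case Cmp::NE:  return {Cmp::EQ, false, true};   // ne(a,b) = ~eq(a,b)
    case Cmp::LE:  return {Cmp::GT, false, true};   // le(a,b) = ~gt(a,b)
    case Cmp::LEU: return {Cmp::GTU, false, true};  // leu(a,b) = ~gtu(a,b)
    case Cmp::GE:  return {Cmp::GT, true, true};    // ge(a,b) = ~gt(b,a)
    case Cmp::GEU: return {Cmp::GTU, true, true};   // geu(a,b) = ~gtu(b,a)
  }
  return {rcode, false, false};  // not reached for valid codes
}

// The three comparisons assumed to be native; call only with EQ, GT or GTU.
constexpr bool native(Cmp code, std::int32_t a, std::int32_t b) {
  return code == Cmp::EQ   ? a == b
         : code == Cmp::GT ? a > b
                           : static_cast<std::uint32_t>(a) >
                                 static_cast<std::uint32_t>(b);
}

// Apply the plan: optional operand swap, native compare, optional inversion.
constexpr bool lowered(Cmp rcode, std::int32_t a, std::int32_t b) {
  const Plan p = lower(rcode);
  const bool r = p.swap_operands ? native(p.code, b, a) : native(p.code, a, b);
  return p.need_invert ? !r : r;
}

// Direct evaluation of each comparison, used as the reference result.
constexpr bool reference(Cmp rcode, std::int32_t a, std::int32_t b) {
  const std::uint32_t ua = static_cast<std::uint32_t>(a);
  const std::uint32_t ub = static_cast<std::uint32_t>(b);
  switch (rcode) {
    case Cmp::EQ:  return a == b;
    case Cmp::NE:  return a != b;
    case Cmp::GT:  return a > b;
    case Cmp::GTU: return ua > ub;
    case Cmp::LT:  return a < b;
    case Cmp::LTU: return ua < ub;
    case Cmp::GE:  return a >= b;
    case Cmp::GEU: return ua >= ub;
    case Cmp::LE:  return a <= b;
    case Cmp::LEU: return ua <= ub;
  }
  return false;
}

// True if the lowered form matches the reference for all ten codes.
constexpr bool agrees(std::int32_t a, std::int32_t b) {
  for (int i = 0; i < 10; ++i) {
    const Cmp c = static_cast<Cmp>(i);
    if (lowered(c, a, b) != reference(c, a, b)) return false;
  }
  return true;
}
```

For example, agrees (-1, 1) exercises the case where the signed and unsigned
orderings differ.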


[gcc r15-5304] rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p2

2024-11-14 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:d35ee820b43e80a1298deecc60fdee32d9416eff

commit r15-5304-gd35ee820b43e80a1298deecc60fdee32d9416eff
Author: Kewen Lin 
Date:   Fri Nov 15 03:46:33 2024 +

rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p2

The current handling in rs6000_emit_vector_compare seems a bit
complicated to me, especially now that we emit the vector float
comparison insn with the given code directly.  So it's better
to refactor the handling of vector integer comparison here.

This is part 2; it refactors the handling of LT and LTU.
This patch doesn't introduce any functional change.

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_vector_compare): Refine the
handlings for operators LT and LTU.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 32 +---
 1 file changed, 9 insertions(+), 23 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 44477657bc29..718d65951e7f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16042,22 +16042,18 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
   return mask;
 }
-
-  bool swap_operands = false;
-  bool try_again = false;
+  else if (rcode == LT || rcode == LTU)
+{
+  /* lt{,u}(a,b) = gt{,u}(b,a)  */
+  enum rtx_code code = swap_condition (rcode);
+  std::swap (op0, op1);
+  mask = gen_reg_rtx (dmode);
+  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));
+  return mask;
+}
 
   switch (rcode)
 {
-case LT:
-  rcode = GT;
-  swap_operands = true;
-  try_again = true;
-  break;
-case LTU:
-  rcode = GTU;
-  swap_operands = true;
-  try_again = true;
-  break;
 case NE:
   /* Invert condition and try again.
 e.g., A != B becomes ~(A==B).  */
@@ -16131,16 +16127,6 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   return NULL_RTX;
 }
 
-  if (try_again)
-{
-  if (swap_operands)
-   std::swap (op0, op1);
-
-  mask = rs6000_emit_vector_compare (rcode, op0, op1, dmode);
-  if (mask)
-   return mask;
-}
-
   /* You only get two chances.  */
   return NULL_RTX;
 }


[gcc r15-5305] rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p3

2024-11-14 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:35c83e66c0085bc67fcb21b4413bace452ce0ca0

commit r15-5305-g35c83e66c0085bc67fcb21b4413bace452ce0ca0
Author: Kewen Lin 
Date:   Fri Nov 15 03:46:33 2024 +

rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p3

The current handling in rs6000_emit_vector_compare seems a bit
complicated to me, especially now that we emit the vector float
comparison insn with the given code directly.  So it's better
to refactor the handling of vector integer comparison here.

This is part 3; it refactors the handling of NE.
This patch doesn't introduce any functional change.

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_vector_compare): Refactor the
handlings for operator NE.

Diff:
---
 gcc/config/rs6000/rs6000.cc | 30 ++
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 718d65951e7f..1f20782d66b1 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16051,29 +16051,19 @@ rs6000_emit_vector_compare (enum rtx_code rcode,
   emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));
   return mask;
 }
+  else if (rcode == NE)
+{
+  /* ne(a,b) = ~eq(a,b)  */
+  mask = gen_reg_rtx (dmode);
+  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (EQ, dmode, op0, op1)));
+  enum insn_code nor_code = optab_handler (one_cmpl_optab, dmode);
+  gcc_assert (nor_code != CODE_FOR_nothing);
+  emit_insn (GEN_FCN (nor_code) (mask, mask));
+  return mask;
+}
 
   switch (rcode)
 {
-case NE:
-  /* Invert condition and try again.
-e.g., A != B becomes ~(A==B).  */
-  {
-   enum insn_code nor_code;
-   rtx mask2;
-
-   nor_code = optab_handler (one_cmpl_optab, dmode);
-   if (nor_code == CODE_FOR_nothing)
- return NULL_RTX;
-
-   mask2 = rs6000_emit_vector_compare (EQ, op0, op1, dmode);
-   if (!mask2)
- return NULL_RTX;
-
-   mask = gen_reg_rtx (dmode);
-   emit_insn (GEN_FCN (nor_code) (mask, mask2));
-   return mask;
-  }
-  break;
 case GE:
 case GEU:
 case LE:


[gcc r15-5549] rs6000: Simplify some conditions or code related to TARGET_DIRECT_MOVE

2024-11-21 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:33386d14210aa6e5cc9e1d65652261fbfc087b95

commit r15-5549-g33386d14210aa6e5cc9e1d65652261fbfc087b95
Author: Kewen Lin 
Date:   Thu Nov 21 07:41:33 2024 +

rs6000: Simplify some conditions or code related to TARGET_DIRECT_MOVE

When I was making a patch to rework TARGET_P8_VECTOR, I
noticed that there are some redundant checks and dead code
related to TARGET_DIRECT_MOVE, so I split this out as a
separate preparatory patch.  It consists of:
  - Check only one of TARGET_DIRECT_MOVE and TARGET_P8_VECTOR,
    chosen according to the context, rather than checking both,
    since they are actually the same (TARGET_DIRECT_MOVE is
    defined as TARGET_P8_VECTOR).
  - Simplify TARGET_VSX && TARGET_DIRECT_MOVE to
    TARGET_DIRECT_MOVE, since direct move ensures that VSX is
    enabled.
  - Replace some TARGET_POWERPC64 && TARGET_DIRECT_MOVE with
    TARGET_DIRECT_MOVE_64BIT to simplify them.
  - Remove some dead code guarded by TARGET_DIRECT_MOVE where
    the condition never holds.
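
The simplifications listed above are boolean identities once one flag aliases
another.  The sketch below checks them exhaustively.  It is not GCC code:
Flags, direct_move, direct_move_64bit and simplifications_hold are
hypothetical names, and the rule "p8 vector implies vsx" is an assumption taken
from the statement that direct move ensures VSX is enabled.

```cpp
// Hypothetical stand-ins for the target flags involved.
struct Flags {
  bool vsx;
  bool p8_vector;
  bool powerpc64;
};

// TARGET_DIRECT_MOVE is an alias of TARGET_P8_VECTOR.
constexpr bool direct_move(Flags f) { return f.p8_vector; }

// Assumed shape of TARGET_DIRECT_MOVE_64BIT after the patch.
constexpr bool direct_move_64bit(Flags f) {
  return direct_move(f) && f.powerpc64;
}

// Check every simplification on every consistent flag combination.
constexpr bool simplifications_hold() {
  for (int bits = 0; bits < 8; ++bits) {
    const Flags f{(bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0};
    if (f.p8_vector && !f.vsx) continue;  // assumed impossible configuration

    const bool dm = direct_move(f);
    // TARGET_P8_VECTOR && TARGET_DIRECT_MOVE  ->  TARGET_P8_VECTOR
    if ((f.p8_vector && dm) != f.p8_vector) return false;
    // TARGET_DIRECT_MOVE || TARGET_P8_VECTOR  ->  TARGET_P8_VECTOR
    if ((dm || f.p8_vector) != f.p8_vector) return false;
    // TARGET_VSX && TARGET_DIRECT_MOVE  ->  TARGET_DIRECT_MOVE
    if ((f.vsx && dm) != dm) return false;
    // !TARGET_DIRECT_MOVE || !TARGET_POWERPC64  ->  !TARGET_DIRECT_MOVE_64BIT
    if ((!dm || !f.powerpc64) != !direct_move_64bit(f)) return false;
  }
  return true;
}
```

Only the third identity depends on the VSX assumption; the others follow from
the alias alone.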

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Simplify
TARGET_P8_VECTOR && TARGET_DIRECT_MOVE as TARGET_P8_VECTOR.
(rs6000_output_move_128bit): Simplify TARGET_VSX && TARGET_DIRECT_MOVE
as TARGET_DIRECT_MOVE.
* config/rs6000/rs6000.h (TARGET_XSCVDPSPN): Simplify conditions
TARGET_DIRECT_MOVE || TARGET_P8_VECTOR as TARGET_P8_VECTOR.
(TARGET_XSCVSPDPN): Likewise.
(TARGET_DIRECT_MOVE_128): Simplify TARGET_DIRECT_MOVE &&
TARGET_POWERPC64 as TARGET_DIRECT_MOVE_64BIT.
(TARGET_VEXTRACTUB): Likewise.
(TARGET_DIRECT_MOVE_64BIT): Simplify TARGET_P8_VECTOR &&
TARGET_DIRECT_MOVE as TARGET_DIRECT_MOVE.
* config/rs6000/rs6000.md (signbit2, @signbit2_dm,
*signbit2_dm_mem, floatsi2_lfiwax,
floatsi2_lfiwax__mem_zext,
floatunssi2_lfiwzx, float2,
*float2_internal, floatuns2,
*floatuns2_internal, p8_mtvsrd_v16qidi2,
p8_mtvsrd_df, p8_xxpermdi_, reload_vsx_from_gpr,
p8_mtvsrd_sf, reload_vsx_from_gprsf, p8_mfvsrd_3_,
reload_gpr_from_vsx, reload_gpr_from_vsxsf, unpack_dm):
Simplify TARGET_DIRECT_MOVE && TARGET_POWERPC64 as
TARGET_DIRECT_MOVE_64BIT.
(unpack_nodm): Simplify !TARGET_DIRECT_MOVE || !TARGET_POWERPC64
as !TARGET_DIRECT_MOVE_64BIT.
(fix_truncsi2, fix_truncsi2_stfiwx,
fix_truncsi2_internal): Simplify TARGET_P8_VECTOR &&
TARGET_DIRECT_MOVE as TARGET_DIRECT_MOVE.
(fix_truncsi2_stfiwx, fixuns_truncsi2_stfiwx): Remove some
dead code as the guard TARGET_DIRECT_MOVE there never holds.
(fixuns_truncsi2_stfiwx): Change TARGET_P8_VECTOR with
TARGET_DIRECT_MOVE which is a better fit.
* config/rs6000/vsx.md (define_peephole2 for SFmode in GPR): Simplify
TARGET_DIRECT_MOVE && TARGET_POWERPC64 as TARGET_DIRECT_MOVE_64BIT.

Diff:
---
 gcc/config/rs6000/rs6000.cc |  4 +--
 gcc/config/rs6000/rs6000.h  | 11 +++-
 gcc/config/rs6000/rs6000.md | 62 ++---
 gcc/config/rs6000/vsx.md|  2 +-
 4 files changed, 32 insertions(+), 47 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 0d7ee1e5bdf2..9cdf704824ce 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4055,7 +4055,7 @@ rs6000_option_override_internal (bool global_init_p)
  support. If we only have ISA 2.06 support, and the user did not specify
  the switch, leave it set to -1 so the movmisalign patterns are enabled,
  but we don't enable the full vectorization support  */
-  if (TARGET_ALLOW_MOVMISALIGN == -1 && TARGET_P8_VECTOR && TARGET_DIRECT_MOVE)
+  if (TARGET_ALLOW_MOVMISALIGN == -1 && TARGET_P8_VECTOR)
 TARGET_ALLOW_MOVMISALIGN = 1;
 
   else if (TARGET_ALLOW_MOVMISALIGN && !TARGET_VSX)
@@ -13799,7 +13799,7 @@ rs6000_output_move_128bit (rtx operands[])
? "mfvsrd %0,%x1\n\tmfvsrld %L0,%x1"
: "mfvsrd %L0,%x1\n\tmfvsrld %0,%x1");
 
- else if (TARGET_VSX && TARGET_DIRECT_MOVE && src_vsx_p)
+ else if (TARGET_DIRECT_MOVE && src_vsx_p)
return "#";
}
 
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index d460eb065448..e0c41e1dfd26 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -469,13 +469,11 @@ extern int rs6000_vector_align[];
 
 /* TARGET_DIRECT_MOVE is redundant to TARGET_P8_VECTOR, so alias it to that.  */
 #define TARGET_DIRECT_MOVE TARGET_P8_VECTOR
-#define TARGET_XSCVDPSPN   (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
-#define TARGET_XSCVSPDPN   (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
+#define TARGET_XSCVDPSPN   TARGET

[gcc r15-5551] rs6000: Add veqv support to *eqv3_internal1

2024-11-21 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:2441dc2495d257c4894a4d0c8d36cfbdc851579c

commit r15-5551-g2441dc2495d257c4894a4d0c8d36cfbdc851579c
Author: Kewen Lin 
Date:   Thu Nov 21 07:41:33 2024 +

rs6000: Add veqv support to *eqv3_internal1

When making a patch to replace TARGET_P8_VECTOR, I noticed
that for *eqv3_internal1, unlike the other logical
operations, we only exploited the vsx version.  I think this
is an oversight, so this patch considers veqv as well.

gcc/ChangeLog:

* config/rs6000/rs6000.md (*eqv3_internal1): Generate
insn veqv if TARGET_ALTIVEC and operands are altivec_register_operand.

Diff:
---
 gcc/config/rs6000/rs6000.md | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2598059280bf..ca91a24795b1 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7557,9 +7557,12 @@
  (match_operand:BOOL_128 2 "vlogical_operand" ""]
   "TARGET_P8_VECTOR"
 {
-  if (vsx_register_operand (operands[0], mode))
+  if (TARGET_VSX && vsx_register_operand (operands[0], mode))
 return "xxleqv %x0,%x1,%x2";
 
+  if (TARGET_ALTIVEC && altivec_register_operand (operands[0], mode))
+return "veqv %0,%1,%2";
+
   return "#";
 }
   "TARGET_P8_VECTOR && reload_completed


[gcc r15-5553] rs6000: Use standard name {add, sub}v1ti3 for altivec_v{add, sub}uqm

2024-11-21 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:baf536754f615c808f02592b765cdd900f240359

commit r15-5553-gbaf536754f615c808f02592b765cdd900f240359
Author: Kewen Lin 
Date:   Thu Nov 21 07:41:33 2024 +

rs6000: Use standard name {add,sub}v1ti3 for altivec_v{add,sub}uqm

This patch renames define_insn altivec_v{add,sub}uqm to the
standard names.  As the associated test case shows, without
this patch we end up with scalar {add,subf}c/{add,subf}e;
the standard names help to exploit v{add,sub}uqm.

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vadduqm): Rename to ...
(addv1ti3): ... this.
(altivec_vsubuqm): Rename to ...
(subv1ti3): ... this.
* config/rs6000/rs6000-builtins.def (__builtin_altivec_vadduqm):
Replace bif expander altivec_vadduqm with addv1ti3.
(__builtin_altivec_vsubuqm): Replace bif expander altivec_vsubuqm with
subv1ti3.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/p8vector-int128-3.c: New test.

Diff:
---
 gcc/config/rs6000/altivec.md   |  4 ++--
 gcc/config/rs6000/rs6000-builtins.def  |  4 ++--
 .../gcc.target/powerpc/p8vector-int128-3.c | 23 ++
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 687c3c0ac7e1..b6a778ef6179 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -4426,7 +4426,7 @@
 ;; ISA 2.07 128-bit binary support to target the VMX/altivec registers without
 ;; having to worry about the register allocator deciding GPRs are better.
 
-(define_insn "altivec_vadduqm"
+(define_insn "addv1ti3"
   [(set (match_operand:V1TI 0 "register_operand" "=v")
(plus:V1TI (match_operand:V1TI 1 "register_operand" "v")
   (match_operand:V1TI 2 "register_operand" "v")))]
@@ -4443,7 +4443,7 @@
   "vaddcuq %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
-(define_insn "altivec_vsubuqm"
+(define_insn "subv1ti3"
   [(set (match_operand:V1TI 0 "register_operand" "=v")
(minus:V1TI (match_operand:V1TI 1 "register_operand" "v")
(match_operand:V1TI 2 "register_operand" "v")))]
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 0e9dc05dbcff..69046fd22442 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2012,7 +2012,7 @@
 VADDUDM addv2di3 {}
 
   const vsq __builtin_altivec_vadduqm (vsq, vsq);
-VADDUQM altivec_vadduqm {}
+VADDUQM addv1ti3 {}
 
   const vsll __builtin_altivec_vbpermq (vsc, vsc);
 VBPERMQ altivec_vbpermq {}
@@ -2150,7 +2150,7 @@
 VSUBUDM subv2di3 {}
 
   const vsq __builtin_altivec_vsubuqm (vsq, vsq);
-VSUBUQM altivec_vsubuqm {}
+VSUBUQM subv1ti3 {}
 
   const vsll __builtin_altivec_vupkhsw (vsi);
 VUPKHSW altivec_vupkhsw {}
diff --git a/gcc/testsuite/gcc.target/powerpc/p8vector-int128-3.c b/gcc/testsuite/gcc.target/powerpc/p8vector-int128-3.c
new file mode 100644
index ..5559410e46b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p8vector-int128-3.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O2" } */
+/* { dg-require-effective-target powerpc_vsx } */
+/* { dg-require-effective-target int128 } */
+
+#ifndef TYPE
+#define TYPE vector __int128_t
+#endif
+
+TYPE
+do_adduqm (TYPE p, TYPE q)
+{
+  return p + q;
+}
+
+TYPE
+do_subuqm (TYPE p, TYPE q)
+{
+  return p - q;
+}
+
+/* { dg-final { scan-assembler-times "vadduqm" 1 } } */
+/* { dg-final { scan-assembler-times "vsubuqm" 1 } } */
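
For comparison, the scalar fallback named in the commit message splits each
128-bit operation into two 64-bit steps linked by a carry or borrow, whereas
v{add,sub}uqm does the whole modulo-2^128 operation in one instruction.  A
minimal sketch of that scalar carry chain follows; U128, add128 and sub128 are
hypothetical names, not part of the patch.

```cpp
#include <cstdint>

// Hypothetical 128-bit value held as two 64-bit halves.
struct U128 {
  std::uint64_t hi;
  std::uint64_t lo;
};

// Addition modulo 2^128 as a two-step carry chain.
constexpr U128 add128(U128 a, U128 b) {
  const std::uint64_t lo = a.lo + b.lo;           // low halves, as addc would
  const std::uint64_t carry = lo < a.lo ? 1 : 0;  // wraparound means carry out
  const std::uint64_t hi = a.hi + b.hi + carry;   // high halves plus carry, as adde would
  return {hi, lo};
}

// Subtraction modulo 2^128 as a two-step borrow chain.
constexpr U128 sub128(U128 a, U128 b) {
  const std::uint64_t lo = a.lo - b.lo;             // low halves
  const std::uint64_t borrow = a.lo < b.lo ? 1 : 0; // low half underflowed
  const std::uint64_t hi = a.hi - b.hi - borrow;    // high halves minus borrow
  return {hi, lo};
}
```

Adding 1 to a value whose low half is all ones carries into the high half,
which is the dependency the single vector instruction hides.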


[gcc r15-5550] rs6000: Remove ISA_3_0_MASKS_IEEE and check P9_VECTOR instead

2024-11-21 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:0719ade048d66c91eebdcce07ae69e90a8385e1e

commit r15-5550-g0719ade048d66c91eebdcce07ae69e90a8385e1e
Author: Kewen Lin 
Date:   Thu Nov 21 07:41:33 2024 +

rs6000: Remove ISA_3_0_MASKS_IEEE and check P9_VECTOR instead

When working to get rid of mask bit OPTION_MASK_P8_VECTOR,
I noticed that the check on ISA_3_0_MASKS_IEEE is actually
equivalent to checking TARGET_P9_VECTOR, since it checks
all three mask bits together and p9 vector guarantees that
p8 vector and vsx are enabled.  So this patch adjusts this
first, as a preparatory patch for the following patch that
changes all uses of OPTION_MASK_P8_VECTOR and TARGET_P8_VECTOR.

gcc/ChangeLog:

* config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_IEEE): Remove.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
ISA_3_0_MASKS_IEEE check with TARGET_P9_VECTOR.

Diff:
---
 gcc/config/rs6000/rs6000-cpus.def | 6 --
 gcc/config/rs6000/rs6000.cc   | 5 ++---
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 84fac8bdac1d..6f96d47eb728 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -68,12 +68,6 @@
  | OPTION_MASK_P9_VECTOR)  \
 & ~OTHER_FUSION_MASKS)
 
-/* Support for the IEEE 128-bit floating point hardware requires a lot of the
-   VSX instructions that are part of ISA 3.0.  */
-#define ISA_3_0_MASKS_IEEE (OPTION_MASK_VSX\
-| OPTION_MASK_P8_VECTOR\
-| OPTION_MASK_P9_VECTOR)
-
 /* Flags that need to be turned off if -mno-power10.  */
 /* We comment out PCREL_OPT here to disable it by default because SPEC2017
performance was degraded by it.  */
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 9cdf704824ce..fa0e2ce8eea5 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4189,12 +4189,11 @@ rs6000_option_override_internal (bool global_init_p)
  because sometimes the compiler wants to put things in an integer
  container, and if we don't have __int128 support, it is impossible.  */
   if (TARGET_FLOAT128_TYPE && !TARGET_FLOAT128_HW && TARGET_64BIT
-  && (rs6000_isa_flags & ISA_3_0_MASKS_IEEE) == ISA_3_0_MASKS_IEEE
+  && TARGET_P9_VECTOR
   && !(rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW))
 rs6000_isa_flags |= OPTION_MASK_FLOAT128_HW;
 
-  if (TARGET_FLOAT128_HW
-  && (rs6000_isa_flags & ISA_3_0_MASKS_IEEE) != ISA_3_0_MASKS_IEEE)
+  if (TARGET_FLOAT128_HW && (!TARGET_P9_VECTOR))
 {
   if ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW) != 0)
error ("%qs requires full ISA 3.0 support", "%<-mfloat128-hardware%>");
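
The equivalence this patch relies on can be checked exhaustively.  The sketch
below is not GCC code: the bit values and the names kVsx, kP8Vector,
kP9Vector, all_three_set, p9_vector, consistent and checks_agree are
hypothetical, and the implication chain p9 vector -> p8 vector -> vsx is the
assumption stated in the commit message.

```cpp
#include <cstdint>

// Hypothetical bit assignments; the real OPTION_MASK_* values differ.
constexpr std::uint64_t kVsx = 1u << 0;
constexpr std::uint64_t kP8Vector = 1u << 1;
constexpr std::uint64_t kP9Vector = 1u << 2;
constexpr std::uint64_t kIsa30MasksIeee = kVsx | kP8Vector | kP9Vector;

// The removed check: all three mask bits must be set together.
constexpr bool all_three_set(std::uint64_t flags) {
  return (flags & kIsa30MasksIeee) == kIsa30MasksIeee;
}

// The replacement check: only the p9 vector bit is tested.
constexpr bool p9_vector(std::uint64_t flags) {
  return (flags & kP9Vector) != 0;
}

// Assumed invariant: p9 vector implies p8 vector, which implies vsx.
constexpr bool consistent(std::uint64_t flags) {
  const bool vsx = (flags & kVsx) != 0;
  const bool p8 = (flags & kP8Vector) != 0;
  const bool p9 = (flags & kP9Vector) != 0;
  return (!p9 || p8) && (!p8 || vsx);
}

// Both checks give the same answer on every consistent combination.
constexpr bool checks_agree() {
  for (std::uint64_t flags = 0; flags < 8; ++flags) {
    if (consistent(flags) && all_three_set(flags) != p9_vector(flags))
      return false;
  }
  return true;
}
```

The two checks differ only on combinations the invariant rules out, such as
the p9 vector bit set alone.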


[gcc r15-5554] rs6000: Adjust FLOAT128 signbit2 expander for P8 LE [PR114567]

2024-11-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:10e702789eeabcc88451e34c2a5c7dccb96190a5

commit r15-5554-g10e702789eeabcc88451e34c2a5c7dccb96190a5
Author: Kewen Lin 
Date:   Thu Nov 21 07:41:34 2024 +

rs6000: Adjust FLOAT128 signbit2 expander for P8 LE [PR114567]

As the associated test case shows, the assembly generated for
signbit is sub-optimal for a _Float128 argument from memory on
P8 LE.  On P8 LE, the p8swap pass puts an explicit AND -16 on
the memory address, which makes mode_dependent_address_p
consider it invalid to change its mode, so combine fails to
make use of the existing pattern signbit2_dm_mem.  Considering
that it's always more efficient to use an 8-byte load and
shift on P8 LE, this patch adjusts the current expander to
treat this case specially.

PR target/114567

gcc/ChangeLog:

* config/rs6000/rs6000.md (expander signbit2): Adjust.
(*signbit2_dm_mem): Rename to ...
(signbit2_dm_mem): ... this.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114567.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.md | 22 ++
 gcc/testsuite/gcc.target/powerpc/pr114567.c | 17 +
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ca91a24795b1..95be36d5a726 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5287,7 +5287,7 @@
 ;; when little-endian.
 (define_expand "signbit2"
   [(set (match_dup 2)
-   (float_truncate:DF (match_operand:FLOAT128 1 "gpc_reg_operand")))
+   (float_truncate:DF (match_operand:FLOAT128 1 "reg_or_mem_operand")))
(set (match_dup 3)
(subreg:DI (match_dup 2) 0))
(set (match_dup 4)
@@ -5303,12 +5303,26 @@
   rtx dest = operands[0];
   rtx src = operands[1];
   rtx tmp = gen_reg_rtx (DImode);
+  /* For P8 LE, we generate memory access with subreg:V1TI which
+ prevents the related gen_signbitkf2_dm_mem being matched so
+ directly emit it here and leave the other cases alone.  */
+  if (!BYTES_BIG_ENDIAN
+  && !TARGET_P9_VECTOR
+  && memory_operand (src, mode))
+emit_insn (gen_signbitkf2_dm_mem (tmp, src));
+  else
+{
+  if (!gpc_reg_operand (src, mode))
+src = copy_to_mode_reg (mode, src);
+  gcc_assert (gpc_reg_operand (src, mode));
+  emit_insn (gen_signbit2_dm (mode, tmp, src));
+}
   rtx dest_di = gen_lowpart (DImode, dest);
-
-  emit_insn (gen_signbit2_dm (mode, tmp, src));
   emit_insn (gen_lshrdi3 (dest_di, tmp, GEN_INT (63)));
   DONE;
 }
+  if (!gpc_reg_operand (operands[1], mode))
+operands[1] = copy_to_mode_reg (mode, operands[1]);
   operands[2] = gen_reg_rtx (DFmode);
   operands[3] = gen_reg_rtx (DImode);
   if (TARGET_POWERPC64)
@@ -5354,7 +5368,7 @@
 ;; Optimize IEEE 128-bit signbit on to avoid loading the value into a vector
 ;; register and then doing a direct move if the value comes from memory.  On
 ;; little endian, we have to load the 2nd double-word to get the sign bit.
-(define_insn_and_split "*signbit2_dm_mem"
+(define_insn_and_split "signbit2_dm_mem"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=b")
(unspec:DI [(match_operand:SIGNBIT 1 "memory_operand" "m")]
   UNSPEC_SIGNBIT))]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114567.c b/gcc/testsuite/gcc.target/powerpc/pr114567.c
new file mode 100644
index ..b904387dca4f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114567.c
@@ -0,0 +1,17 @@
+/* { dg-options "-O2 -mabi=ibmlongdouble -Wno-psabi" } */
+/* { dg-additional-options "-mdejagnu-cpu=power8" { target { ! has_arch_pwr8 } } } */
+/* { dg-require-effective-target powerpc_vsx } */
+/* { dg-require-effective-target float128 } */
+
+/* Verify there is no lxv.*x? and mfvsrd (vector load and move).  */
+
+int
+sbm (_Float128 *a)
+{
+  return __builtin_signbit (*a);
+}
+
+/* { dg-final { scan-assembler-times {\ml(d|wz)\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mlxv\M} } } */
+/* { dg-final { scan-assembler-not {\mlxvd2x\M} } } */
+/* { dg-final { scan-assembler-not {\mmfvsrd\M} } } */
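
The "8-byte load and shift" idea can be sketched independently of the
compiler: the sign of an IEEE 128-bit value is the top bit of its most
significant doubleword, which is the second doubleword in memory on little
endian and the first on big endian.  This is not GCC code; load_doubleword,
sign_bit_from_memory and the byte arrays are hypothetical names, with the
arrays holding the binary128 encodings of -1.0 and +1.0.

```cpp
#include <cstdint>

// Assemble one 64-bit doubleword from 8 bytes stored in the given byte order.
constexpr std::uint64_t load_doubleword(const unsigned char* p,
                                        bool little_endian) {
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i) {
    const int idx = little_endian ? 7 - i : i;  // most significant byte first
    v = (v << 8) | p[idx];
  }
  return v;
}

// Read only the doubleword that holds the sign, then shift it down by 63.
constexpr unsigned sign_bit_from_memory(const unsigned char* value,
                                        bool little_endian) {
  const unsigned char* high = little_endian ? value + 8 : value;
  return static_cast<unsigned>(load_doubleword(high, little_endian) >> 63);
}

// -1.0 in binary128: sign 1, biased exponent 0x3FFF, fraction 0, so the top
// 16 bits are 0xBFFF; +1.0 has top 16 bits 0x3FFF.
constexpr unsigned char kMinusOneBE[16] = {0xBF, 0xFF};
constexpr unsigned char kMinusOneLE[16] = {0, 0, 0, 0, 0, 0, 0, 0,
                                           0, 0, 0, 0, 0, 0, 0xFF, 0xBF};
constexpr unsigned char kPlusOneLE[16] = {0, 0, 0, 0, 0, 0, 0, 0,
                                          0, 0, 0, 0, 0, 0, 0xFF, 0x3F};
```

Only 8 of the 16 bytes are read in either byte order, which is why no vector
load or move from a vector register is needed.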


[gcc r15-5552] rs6000: Remove entry for V1TImode from VI_unit

2024-11-20 Thread Kewen Lin via Gcc-cvs
https://gcc.gnu.org/g:ca96c1d1bc04b498e401571e99296e526db5db58

commit r15-5552-gca96c1d1bc04b498e401571e99296e526db5db58
Author: Kewen Lin 
Date:   Thu Nov 21 07:41:33 2024 +

rs6000: Remove entry for V1TImode from VI_unit

When making a patch to adjust the VECTOR_P8_VECTOR rs6000_vector
enum, I noticed that V1TImode's mode attribute in VI_unit,
VECTOR_UNIT_ALTIVEC_P (V1TImode), is never true.
VECTOR_UNIT_ALTIVEC_P checks whether vector_unit[V1TImode] is
equal to VECTOR_ALTIVEC, but vector_unit[V1TImode] can only
be VECTOR_NONE or VECTOR_P8_VECTOR, so it can never be
VECTOR_ALTIVEC:
  rs6000_vector_unit[V1TImode]
  = (TARGET_P8_VECTOR) ? VECTOR_P8_VECTOR : VECTOR_NONE;

Checking all uses of VI_unit shows that the mode iterator used
is one of VI2, VI, VP_small and VP, and none of them includes
V1TImode, so the entry for V1TImode is useless.  I guess it was
designed so that one mode attribute covers all integer
vector modes, but later we separated the V1TI handling into its
own patterns (those guarded with TARGET_VADDUQM).  Anyway,
this patch removes this useless and confusing entry.

gcc/ChangeLog:

* config/rs6000/altivec.md (mode attr for V1TI in VI_unit): Remove.

Diff:
---
 gcc/config/rs6000/altivec.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 00dad4b91f1c..687c3c0ac7e1 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -227,8 +227,7 @@
 (define_mode_attr VI_unit [(V16QI "VECTOR_UNIT_ALTIVEC_P (V16QImode)")
   (V8HI "VECTOR_UNIT_ALTIVEC_P (V8HImode)")
   (V4SI "VECTOR_UNIT_ALTIVEC_P (V4SImode)")
-  (V2DI "VECTOR_UNIT_P8_VECTOR_P (V2DImode)")
-  (V1TI "VECTOR_UNIT_ALTIVEC_P (V1TImode)")])
+  (V2DI "VECTOR_UNIT_P8_VECTOR_P (V2DImode)")])
 
 ;; Vector pack/unpack
 (define_mode_iterator VP [V2DI V4SI V8HI])