RE:Look for the diamond tools 2018/6/7 8:03:13

2018-06-06 Thread peter
| Hi dear customer,

Have a great day.  My name is Peter.

We are the facotory in China, produce stone abrasive and diamond tools.

Our main products as below:

1. Cutting tools.
2. Grinding & polishing tools.
3. Other tools to process stone, concrete, ceramics etc.

We also can make tools as your requirement.

Do not hesitate to contact us, if you need our products.

Thanks and best regards,
Peter Ye |
| 2018/6/7 8:03:19 |





--

Quanzhou Danar Diamond Tools & Abrasive Co.,Ltd
Email: i...@danartool.com 
Mobile Ph:+8613645919851(WhatsApp&Wechat)  
Skype:892721...@qq.com
Website:  www.danartool.com

[PING][PATCH, rs6000, C/C++] Fix PR target/86324: divkc3-1.c FAILs when compiling with -mabi=ieeelongdouble

2018-07-02 Thread Peter Bergner
I'd like to PING:

  https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01713.html

I've included the entire patch below, since I missed the test cases in
the original submission and Segher asked for some updated text for the
hook documentation which I've included below.

Peter


gcc/
PR target/86324
* target.def (translate_mode_attribute): New hook.
* targhooks.h (default_translate_mode_attribute): Declare.
* targhooks.c (default_translate_mode_attribute): New function.
* doc/tm.texi.in (TARGET_TRANSLATE_MODE_ATTRIBUTE): New hook.
* doc/tm.texi: Regenerate.
* config/rs6000/rs6000.c (TARGET_TRANSLATE_MODE_ATTRIBUTE): Define.
(rs6000_translate_mode_attribute): New function.

gcc/c-family/
PR target/86324
* c-attribs.c (handle_mode_attribute): Call new translate_mode_attribute
target hook.

gcc/testsuite/
PR target/86324
gcc.target/powerpc/pr86324-1.c: New test.
gcc.target/powerpc/pr86324-2.c: Likewise.

Index: gcc/target.def
===
--- gcc/target.def  (revision 262159)
+++ gcc/target.def  (working copy)
@@ -3310,6 +3310,16 @@ constants can be done inline.  The funct
  HOST_WIDE_INT, (const_tree constant, HOST_WIDE_INT basic_align),
  default_constant_alignment)
 
+DEFHOOK
+(translate_mode_attribute,
+ "Define this hook if during mode attribute processing, the port should\n\
+translate machine_mode @var{mode} to another mode.  For example, rs6000's\n\
+@code{KFmode}, when it is the same as @code{TFmode}.\n\
+\n\
+The default version of the hook returns that mode that was passed in.",
+ machine_mode, (machine_mode mode),
+ default_translate_mode_attribute)
+
 /* True if MODE is valid for the target.  By "valid", we mean able to
be manipulated in non-trivial ways.  In particular, this means all
the arithmetic is supported.  */
Index: gcc/targhooks.c
===
--- gcc/targhooks.c (revision 262159)
+++ gcc/targhooks.c (working copy)
@@ -393,6 +393,14 @@ default_mangle_assembler_name (const cha
   return get_identifier (stripped);
 }
 
+/* The default implementation of TARGET_TRANSLATE_MODE_ATTRIBUTE.  */
+
+machine_mode
+default_translate_mode_attribute (machine_mode mode)
+{
+  return mode;
+}
+
 /* True if MODE is valid for the target.  By "valid", we mean able to
be manipulated in non-trivial ways.  In particular, this means all
the arithmetic is supported.
Index: gcc/targhooks.h
===
--- gcc/targhooks.h (revision 262159)
+++ gcc/targhooks.h (working copy)
@@ -72,6 +72,7 @@ extern void default_print_operand_addres
 extern bool default_print_operand_punct_valid_p (unsigned char);
 extern tree default_mangle_assembler_name (const char *);
 
+extern machine_mode default_translate_mode_attribute (machine_mode);
 extern bool default_scalar_mode_supported_p (scalar_mode);
 extern bool default_libgcc_floating_mode_supported_p (scalar_float_mode);
 extern opt_scalar_float_mode default_floatn_mode (int, bool);
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in  (revision 262159)
+++ gcc/doc/tm.texi.in  (working copy)
@@ -3336,6 +3336,8 @@ stack.
 
 @hook TARGET_REF_MAY_ALIAS_ERRNO
 
+@hook TARGET_TRANSLATE_MODE_ATTRIBUTE
+
 @hook TARGET_SCALAR_MODE_SUPPORTED_P
 
 @hook TARGET_VECTOR_MODE_SUPPORTED_P
Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi (revision 262159)
+++ gcc/doc/tm.texi (working copy)
@@ -4255,6 +4255,14 @@ hook returns true for both @code{ptr_mod
 Define this to return nonzero if the memory reference @var{ref}  may alias 
with the system C library errno location.  The default  version of this hook 
assumes the system C library errno location  is either a declaration of type 
int or accessed by dereferencing  a pointer to int.
 @end deftypefn
 
+@deftypefn {Target Hook} machine_mode TARGET_TRANSLATE_MODE_ATTRIBUTE 
(machine_mode @var{mode})
+Define this hook if during mode attribute processing, the port should
+translate machine_mode @var{mode} to another mode.  For example, rs6000's
+@code{KFmode}, when it is the same as @code{TFmode}.
+
+The default version of the hook returns that mode that was passed in.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_SCALAR_MODE_SUPPORTED_P (scalar_mode 
@var{mode})
 Define this to return nonzero if the port is prepared to handle
 insns involving scalar mode @var{mode}.  For a scalar mode to be
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 262159)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1802,6 +1802,9 @@ static const struct attribute_s

Re: [PING][PATCH, rs6000, C/C++] Fix PR target/86324: divkc3-1.c FAILs when compiling with -mabi=ieeelongdouble

2018-07-06 Thread Peter Bergner
On 7/5/18 2:36 PM, Jeff Law wrote:
> On 07/02/2018 03:50 PM, Peter Bergner wrote:
>> I'd like to PING:
>>
>>   https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01713.html
>>
>> I've included the entire patch below, since I missed the test cases in
>> the original submission and Segher asked for some updated text for the
>> hook documentation which I've included below.
>
> OK.

Is that an ok for GCC 8 as well which I asked for in the initial patch
submission?

Peter




Re: [PATCHv2] Change the library search path when using --with-advance-toolchain

2019-10-23 Thread Peter Bergner
On 10/5/19 12:20 PM, Segher Boessenkool wrote:
> On Fri, Oct 04, 2019 at 06:31:34PM -0300, Tulio Magno Quites Machado Filho 
> wrote:
>> Remove all -L directories from LINK_OS_EXTRA_SPEC32 and
>> LINK_OS_EXTRA_SPEC64 so that user directories specified at
>> build time have higher preference over the advance toolchain libraries.
>>
>> Set MD_STARTFILE_PREFIX to $prefix/lib/ and MD_STARTFILE_PREFIX_1 to
>> $at/lib/ so that a compiler library has preference over the Advance
>> Toolchain libraries.
> 
> This is fine, approved for all branches.  Thank you!  And thanks to Mike
> for the testing.

I've committed the back ports to the FSF 7, 8 and 9 branches now after
clean bootstraps and regtesting.

Peter




[PATCH, rs6000][committed] Fix PR92090: Allow MODE_PARTIAL_INT modes for integer constant input operands.

2019-11-07 Thread Peter Bergner
Before, LRA, we have an insn that sets a TImode pseudo with an integer
constant and a following insn that copies that TImode pseudo to a PTImode
pseudo.  During LRA spilling, we generate a new insn that sets a PTImode
pseudo to that constant directly and we ICE because we do not recognize
that as a valid insn.  The fix below fixes the ICE reported in PR92090 by
modifying our input_operand predicate to allow MODE_PARTIAL_INT modes for
integer constant input operands.

This patch (preapproved by Segher) passed bootstrap and regtesting
with no errors.  Committed.

Peter


gcc/
PR other/92090
* config/rs6000/predicates.md (input_operand): Allow MODE_PARTIAL_INT
modes for integer constants.

gcc/testsuite/
PR other/92090
* gcc.target/powerpc/pr92090.c: New test.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 277861)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -1047,8 +1047,7 @@ (define_predicate "input_operand"
 return 1;
 
   /* Allow any integer constant.  */
-  if (GET_MODE_CLASS (mode) == MODE_INT
-  && CONST_SCALAR_INT_P (op))
+  if (SCALAR_INT_MODE_P (mode) && CONST_SCALAR_INT_P (op))
 return 1;
 
   /* Allow easy vector constants.  */
Index: gcc/testsuite/gcc.target/powerpc/pr92090.c
===
--- gcc/testsuite/gcc.target/powerpc/pr92090.c  (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr92090.c  (working copy)
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power8 -Os -mbig" } */
+
+/* Verify that we don't ICE.  */
+
+_Atomic int a;
+_Atomic long double b, c;
+int j;
+void foo (void);
+void bar (int, int, int, int);
+
+void
+bug (void)
+{
+  b = 1;
+  int d, e, f, g;
+  while (a)
+;
+  for (int h = 0; h < 1; h++)
+{
+  double i = b /= 3;
+  foo ();
+  if (i)
+   {
+ if (i == 1)
+   d++;
+ e++;
+ b = 0;
+   }
+  else
+   {
+ if (i == 2)
+   f++;
+ g++;
+ b = 1;
+   }
+}
+  bar (d, e, f, g);
+  c = 1;
+  for (int h; h; h++)
+j = 0;
+}


[PATCH] rs6000: Fix PR93136, gcc.dg/vmx/ops.c and several other test break after r279772

2020-01-09 Thread Peter Bergner
My fix to PR92923 seems to have caused the vmx/ops.c and vsx-vector-6.p*.c
test failures.  The ops.c issue is we need a new option to quiet a warning
we didn't see when we were emitting VIEW_CONVERT_EXPRs.  The other test
cases just need a slight adjustment to some of their counts.  However, we
were seeing double/triple/... counting due to using "xxland" instead of
{\mxxland\M} for our regex, so that was also counting xxlandc too.
I adjusted all of the insn regexs to use \m and \M to fix that.

I must say that the vsx-vector-6.p*.c tests are fragile!  They're so big and
reusing source operands, that the compiler can sometimes optimize several
builtin calls together, meaning we don't see as many vsx instructions as
we have calls to builtins.  I didn't bother trying to fix that, since that
is a lot more work!  I just wanted to vent! :-)

I've confirmed the updated test cases now pass on both BE and LE.
Ok for trunk?

Peter

PR target/92923
PR target/93136
* gcc.dg/vmx/ops.c: Add -flax-vector-conversions to dg-options.
* gcc.target/powerpc/vsx-vector-6.p7.c: Adjust scan-assembler-times
regex directives.  Adjust expected instruction counts.
* gcc.target/powerpc/vsx-vector-6.p8.c: Likewise.
* gcc.target/powerpc/vsx-vector-6.p9.c: Likewise.

Index: gcc/testsuite/gcc.dg/vmx/ops.c
===
--- gcc/testsuite/gcc.dg/vmx/ops.c  (revision 279852)
+++ gcc/testsuite/gcc.dg/vmx/ops.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mno-vsx -Wno-deprecated" 
} */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mno-vsx -Wno-deprecated 
-flax-vector-conversions" } */
 #include 
 #include 
 extern char * *var_char_ptr;
Index: gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p7.c
===
--- gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p7.c  (revision 279852)
+++ gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p7.c  (working copy)
@@ -5,37 +5,38 @@
 
 /* Expected instruction counts for Power 7 */
 
-/* { dg-final { scan-assembler-times "xvabsdp" 1 } } */
-/* { dg-final { scan-assembler-times "xvadddp" 1 } } */
-/* { dg-final { scan-assembler-times "xxlnor" 5 } } */
+/* { dg-final { scan-assembler-times {\mxvabsdp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvadddp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxxlnor\M} 5 } } */
 /* { dg-final { scan-assembler-times {\mxvcmpeqdp\s} 1 } } */
 /* { dg-final { scan-assembler-times {\mxvcmpeqdp\.\s} 5 } } */
 /* { dg-final { scan-assembler-times {\mxvcmpgtdp\s} 2 } } */
 /* { dg-final { scan-assembler-times {\mxvcmpgtdp\.\s} 5 } } */
 /* { dg-final { scan-assembler-times {\mxvcmpgedp\s} 1 } } */
 /* { dg-final { scan-assembler-times {\mxvcmpgedp\.\s} 6 } } */
-/* { dg-final { scan-assembler-times "xvrdpim" 1 } } */
-/* { dg-final { scan-assembler-times "xvmaddadp" 1 } } */
-/* { dg-final { scan-assembler-times "xvmsubadp" 1 } } */
-/* { dg-final { scan-assembler-times "xvsubdp" 1 } } */
-/* { dg-final { scan-assembler-times "xvmaxdp" 1 } } */
-/* { dg-final { scan-assembler-times "xvmindp" 1 } } */
-/* { dg-final { scan-assembler-times "xvmuldp" 1 } } */
-/* { dg-final { scan-assembler-times "vperm" 2 } } */
-/* { dg-final { scan-assembler-times "xvrdpic" 2 } } */
-/* { dg-final { scan-assembler-times "xvsqrtdp" 1 } } */
-/* { dg-final { scan-assembler-times "xvrdpiz" 1 } } */
-/* { dg-final { scan-assembler-times "xvmsubasp" 1 } } */
-/* { dg-final { scan-assembler-times "xvnmaddasp" 1 } } */
-/* { dg-final { scan-assembler-times "xvnmaddadp" 1 } } */
-/* { dg-final { scan-assembler-times "xvnmsubadp" 1 } } */
-/* { dg-final { scan-assembler-times "vmsumshs" 2 } } */
-/* { dg-final { scan-assembler-times "xxland" 13 } } */
-/* { dg-final { scan-assembler-times "xxlxor" 2 } } */
-/* { dg-final { scan-assembler-times "xxsel" 4 } } */
-/* { dg-final { scan-assembler-times "xvrdpip" 1 } } */
-/* { dg-final { scan-assembler-times "xvdivdp" 1 } } */
-/* { dg-final { scan-assembler-times "xvrdpi" 7 } } */
+/* { dg-final { scan-assembler-times {\mxvrdpim\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvmaddadp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvmsubadp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvsubdp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvmuldp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvperm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxv

Re: [PATCH] rs6000: Fix PR93136, gcc.dg/vmx/ops.c and several other test break after r279772

2020-01-09 Thread Peter Bergner
On 1/9/20 4:51 PM, Segher Boessenkool wrote:
> On Thu, Jan 09, 2020 at 01:44:59PM -0600, Peter Bergner wrote:
>> The other test cases just need a slight adjustment to some of their
>> counts.
> 
> What were the changes?  Or, I'll just trust you looked at it and it is
> okay :-)

I did look.  Most of the changes brought us closer to one vsx instruction
for each builtin than what we had before.  We still have some optimization
across builtin functions occurring, so we still don't get a one-to-one mapping
of builtin call to vsx instruction.



>> I must say that the vsx-vector-6.p*.c tests are fragile!  They're so big and
>> reusing source operands, that the compiler can sometimes optimize several
>> builtin calls together, meaning we don't see as many vsx instructions as
>> we have calls to builtins.  I didn't bother trying to fix that, since that
>> is a lot more work!  I just wanted to vent! :-)
> 
> Splitting out separate functions in the testcase shouldn't be so much
> work?  Or am I too optimistic :-)
> 
> This should make the test a good deal less prone to random changes in
> output caused by the lunar cycle.

Ok, let me take a stab at rewriting the tests to be more similar to the
pr92923-[12].c and see how much work that is.  I do agree that it would
be nice not having the insn counts be so fragile.

Peter


Re: [PATCH] rs6000: Fix PR93136, gcc.dg/vmx/ops.c and several other test break after r279772

2020-02-07 Thread Peter Bergner
On 1/9/20 6:29 PM, Peter Bergner wrote:
> On 1/9/20 4:51 PM, Segher Boessenkool wrote:
>> Splitting out separate functions in the testcase shouldn't be so much
>> work?  Or am I too optimistic :-)
>>
>> This should make the test a good deal less prone to random changes in
>> output caused by the lunar cycle.
> 
> Ok, let me take a stab at rewriting the tests to be more similar to the
> pr92923-[12].c and see how much work that is.  I do agree that it would
> be nice not having the insn counts be so fragile.

Sorry for taking so long to get back to this.  I split the functions into
smaller chunks, but didn't need to go all the way to one function per
builtin call.  Does this look better?  The POWER7 and POWER8 files are
identical, except for their -mcpu= option.  The POWER9 file has some slight
differences to the other files, because of new instructions added that we
make sure of.

I've confirmed the updated test cases now pass on both LE and BE.
Ok for trunk?

Peter

PR target/93136
* gcc.dg/vmx/ops.c: Add -flax-vector-conversions to dg-options.
* gcc.target/powerpc/vsx-vector-6.h: Split tests into smaller functions.
* gcc.target/powerpc/vsx-vector-6.p7.c: Adjust scan-assembler-times
regex directives.  Adjust expected instruction counts.
* gcc.target/powerpc/vsx-vector-6.p8.c: Likewise.
* gcc.target/powerpc/vsx-vector-6.p9.c: Likewise.

diff --git a/gcc/testsuite/gcc.dg/vmx/ops.c b/gcc/testsuite/gcc.dg/vmx/ops.c
index 6aff80bbd1a..4a0650c06bd 100644
--- a/gcc/testsuite/gcc.dg/vmx/ops.c
+++ b/gcc/testsuite/gcc.dg/vmx/ops.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mno-vsx -Wno-deprecated" 
} */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mno-vsx -Wno-deprecated 
-flax-vector-conversions" } */
 #include 
 #include 
 extern char * *var_char_ptr;
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6.h 
b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6.h
index a891b64e6fa..0106e8d2901 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6.h
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6.h
@@ -1,167 +1,154 @@
-/* This test code is included into vsx-vector-6-be.c and vsx-vector-6-le.c.  
-   The two files have the tests for the number of instructions generated for
-   LE versus BE.  */
+/* This test code is included into vsx-vector-6.p7.c, vsx-vector-6.p8.c
+   and vsx-vector-6.p9.c.  The .c files have the tests for the number
+   of instructions generated for each cpu type.  */
 
 #include 
 
-void foo (vector double *out, vector double *in, vector long *p_l, vector bool 
long *p_b,
- vector unsigned char *p_uc, int *i, vector float *p_f,
- vector bool char *outbc, vector bool int *outbi,
- vector bool short *outbsi, vector int *outsi,
- vector unsigned int *outui, vector signed char *outsc,
- vector unsigned char *outuc)
+typedef struct {
+  vector double d;
+  vector float f;
+  vector long sl;
+  vector int si;
+  vector short ss;
+  vector char sc;
+  vector unsigned int ui;
+  vector unsigned short int us;
+  vector unsigned char uc;
+  vector bool long long bll;
+  vector bool long bl;
+  vector bool int bi;
+  vector bool short bs;
+  vector bool char bc;
+} opnd_t;
+
+void
+func_1op (opnd_t *dst, opnd_t *src)
 {
-  vector double in0 = in[0];
-  vector double in1 = in[1];
-  vector double in2 = in[2];
-  vector long inl = *p_l;
-  vector bool long inb = *p_b;
-  vector bool long long inbl0;
-  vector bool long long inbl1;
-  vector unsigned char uc = *p_uc;
-  vector float inf0;
-  vector float inf1;
-  vector float inf2;
-  vector char inc0;
-  vector char inc1;
-  vector bool char inbc0;
-  vector bool char inbc1;
-  vector bool short inbs0;
-  vector bool short inbs1;
-  vector bool int inbi0;
-  vector bool int inbi1;
-  vector signed short int inssi0, inssi1;
-  vector unsigned short int inusi0, inusi1;
-  vector signed int insi0, insi1;
-  vector unsigned int inui0, inui1;
-  vector unsigned char inuc0, inuc1;
-  
-  *out++ = vec_abs (in0);
-  *out++ = vec_add (in0, in1);
-  *out++ = vec_and (in0, in1);
-  *out++ = vec_and (in0, inb);
-  *out++ = vec_and (inb, in0);
-  *out++ = vec_andc (in0, in1);
-  *out++ = vec_andc (in0, inb);
-  *out++ = vec_andc (inb, in0);
-  *out++ = vec_andc (inbl0, in0);
-  *out++ = vec_andc (in0, inbl0);
-
-  *out++ = vec_ceil (in0);
-  *p_b++ = vec_cmpeq (in0, in1);
-  *p_b++ = vec_cmpgt (in0, in1);
-  *p_b++ = vec_cmpge (in0, in1);
-  *p_b++ = vec_cmplt (in0, in1);
-  *p_b++ = vec_cmple (in0, in1);
-  *out++ = vec_div (in0, in1);
-  *out++ = vec_floor (in0);
-  *out++ = vec_madd (in0, in1, in2);
-  *out++ = vec_msub (in0, in1, in2);
-  *out++ = vec_max (in0, in1);
-  *out++ = vec_min (in0, in1);
-  *out++ = vec_msub (in0, in1, in2);
-  *out++ = vec_mul (in0, in1);
-  *out++ = vec_nearbyint (in0);
-  *out++ = vec_nmadd 

Re: [PATCH] rs6000: Fix PR93136, gcc.dg/vmx/ops.c and several other test break after r279772

2020-02-08 Thread Peter Bergner
On 2/8/20 11:13 AM, Segher Boessenkool wrote:
>> +/* { dg-final { scan-assembler-times {\mvperm[r]?\M} 1 } } */
> 
> You can write this without the square brackets, fwiw.
> 
> Okay for trunk.  Thank you!

Ok, I pushed this change with your suggestion to remove the square brackets
on the above regex.  Thanks.

Peter




Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-04 Thread Peter Bergner
On 12/4/19 1:16 PM, Segher Boessenkool wrote:
> For future patches: it is much easier to review if you make the big,
> mechanical move a separate (earlier) patch.

Will do.




>> I have also
>> included a small patch to disable running the powerpc/dfp/ tests even for
>> powerpc*-linux when --disable-decimal-float is used.
> 
> What is the reason for that?
[snip]
> This isn't run from powerpc.exp, so it needs to still do that first check.
> And it's up to the Darwin maintainers whether they want that second part
> (there are many more tests and testsuites that disable *-darwin* while
> that isn't really necessary).
> 
> Why do you need/want the check_effective_target_dfp test?  If for example
> this is to prevent ICEs, that is no good, that is *hiding* problems.
> 
> I suspect it is to stop the testsuite from complaining if you configure
> with --disable-decimal-float.  What is different there then, compared to
> targets that do actually not support decimal float?

Well, yes.  I saw those tests being run for my --disable-decimal-float
runs, which resulted in FAILs for all of those tests.  They had ICE's
on unpatched trunk and FAILed gracefully using my patch, but they all
still FAILed, since these are DFP tests and DFP is disabled.
There's no sense in running these tests when DFP is disabled, either
manually due to --disable-decimal-float or implicitly because of the
specific target.

Why isn't just testing check_effective_target_dfp enough to disable the
tests on Darwin, --disable-decimal-float, etc.?  That would seem to imply
that one of those targets we're testing against enables DFP, but somehow
we don't want to run the tests or they all still FAIL for some reason???



> Okay for trunk minus the changes to powerpc-dfp.exp (we can iterate on
> that).  Thanks!

Ok, I committed that part of the patch.  Thanks!

Peter



Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-04 Thread Peter Bergner
On 12/4/19 2:47 PM, Iain Sandoe wrote:
> Peter Bergner  wrote:
>>
>> Why isn't just testing check_effective_target_dfp enough to disable the
>> tests on Darwin, --disable-decimal-float, etc.?
> 
> … It should be a better solution - I will confirm this.

Thanks for checking.  The nice thing about this solution (if it works
for Darwin and the other targets) is that if you eventually add DFP
support, then these tests will just automatically start being run for
you.  Otherwise, you'll have to hunt through the DFP tests looking for
a dg-skip-if darwin test.

Peter




Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-04 Thread Peter Bergner
On 12/4/19 2:50 PM, Segher Boessenkool wrote:
> It would be nice to keep *some* dfp test(s) to make sure we don't ICE.
> If we disabled all such tests already, like with this patch, we wouldn't
> have ICEd or seen this problem.  That can be a separate test of course
> (and could be outside gcc.target/powerpc/dfp/).

Sure, I can add a test in gcc.target/powerpc/ that uses both a builtin
and an overloaded builtin to make sure we don't ICE when DFP is disabled.




> OTOH, if you add this check, we can lose the
> 
> /* { dg-require-effective-target powerpc_p9vector_ok } */
> /* { dg-skip-if "" { powerpc*-*-aix* } } */
> 
> from all the dtstsfi-* tests, etc.  (Well, no, need to keep the p9 part).

Agreed on not needing the dg-skip-if tests.  Not only on the powerpc/dfp/
tests, but we should be able to remove them from the DFP tests that are
in gcc.target/powerpc/ too.


> Making such changes to the testsuite needs testing on *all* subtargets :-(

Right.  I'll come up with a patch and hopefully Iain and David can test
on Darwin and AIX and I can test on Linux with --enable-decimal-float
and --disable-decimal-float.  That should cover the major subtargets and
if it works there, I'd expect it to work on the minor subtargets too.



>> Why isn't just testing check_effective_target_dfp enough to disable the
>> tests on Darwin, --disable-decimal-float, etc.?  That would seem to imply
>> that one of those targets we're testing against enables DFP, but somehow
>> we don't want to run the tests or they all still FAIL for some reason???
> 
> It should be enough.  Currently we just directly skip all tests on OSes
> that do not support DFP, but that is not as nice as the effective target.

Right, if using the effective target test, any target that adds DFP support
in the future will automatically get these tests runs for it, which is what
we want.


Peter




Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-05 Thread Peter Bergner
On 12/5/19 2:44 AM, Iain Sandoe wrote:
> Segher Boessenkool  wrote:
>> On Wed, Dec 04, 2019 at 03:53:30PM -0600, Peter Bergner wrote:
>>> Sure, I can add a test in gcc.target/powerpc/ that uses both a builtin
>>> and an overloaded builtin to make sure we don't ICE when DFP is disabled.
>>
>> Great, thanks!
> 
> so that would be a/some dg-do compile test(s), then?
> (presumably, gated on effective_target_dfp … see below)

Yes, only a dg-do compile test.  For DFP enabled targets, they shouldn't
see any errors at all and for DFP disabled targets, I'd insert some
dg-error checks gated on ![check_effective_target_dfp] looking for the
"error: decimal floating-point not supported for this target" errors.
Any ICE would flag a real test case error.


> (on r278957, with the system assembler which doesn’t support DFP insns
>  and gcc.target dfp.exp not yet renamed)
> 
> # Skip these tests for targets that don't support this extension.
> if { ![check_effective_target_dfp] } {
> return;
> }
> 
> Works on Darwin to skip the entire set  (not too much of a  surprise, since
> it’s used for the GCC and G++ cases already).

Great, thanks for checking.


> It’s possible (even likely) that Darwin can be built with an assembler that
> supports DFP instructions, even tho it has no OS support.  Usually, my policy
> is that we would do compile tests then, since that excercises more code.

My --disable-decimal-float run also worked and I had an assembler that
supports all of the DFP insns, so I'd expect it to work for you too.
The check_effective_target_dfp tests really is checking for whether the
DFP modes exist in the compiler and that isn't gated on the assembler...
at least fully.




> ..but then we need to gate run tests on the availability of runtime support.

There exists check_effective_target_dfprt for that.



> I see that the GCC and G++ testsuites have also….
> 
> # If the decimal float is supported in the compiler but not yet in the
> # runtime, treat all tests as compile-only.
> global dg-do-what-default
> set save-dg-do-what-default ${dg-do-what-default}
> if { ![check_effective_target_dfprt] } {
> verbose "dfp.exp: runtime support for decimal float does not exist" 2
> set dg-do-what-default compile
> } else {
> verbose "dfp.exp: runtime support for decimal float exists, use it" 2
> set dg-do-what-default run
> }
> verbose "dfp.exp: dg-do-what-default is ${dg-do-what-default}” 2
> 
> Do you think there’s any merit in doing something like that here too?

Adding this to powerpc/dfp/dfp.exp won't do anything, since all of the
powerpc/dfp/ tests are dg-do compile tests.


> (I guess a finer-grained alternative is check_effective_target_dfprt in any
>  run-tests)
> 
> .. or I can just force a false return from effective_target_dfp as we
>  do for other cases where assembler support does not imply system 
>  support.

I've already verified that if decimal float is disabled, but you do
have an assembler that supports dfp insns, check_effective_target_dfp
still works correctly.


Peter


Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-09 Thread Peter Bergner
On 12/6/19 5:12 PM, Segher Boessenkool wrote:
> On Thu, Dec 05, 2019 at 08:44:57AM +, Iain Sandoe wrote:
>> .. or I can just force a false return from effective_target_dfp as we
>>  do for other cases where assembler support does not imply system 
>>  support.
> 
> That's what I would do, yes.

I'm not sure that's necessary.  DFP enablement isn't triggered by
assembler support.  Just the gcc/configure fragment (ignoring manually
using --enable-decimal-float):

  case $target in
powerpc*-*-linux* | i?86*-*-linux* | x86_64*-*-linux* | s390*-*-linux* | \
i?86*-*-elfiamcu | i?86*-*-gnu* | x86_64*-*-gnu* | \
i?86*-*-mingw* | x86_64*-*-mingw* | \
i?86*-*-cygwin* | x86_64*-*-cygwin*)
  enable_decimal_float=yes
  ;;
*)
  { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: decimal float is not 
supported for this target, ignored" >&5
$as_echo "$as_me: WARNING: decimal float is not supported for this target, 
ignored" >&2;}
  enable_decimal_float=no
  ;;

So I don't think there is anything to do wrt Darwin here.

Peter




Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-10 Thread Peter Bergner
On 12/4/19 5:03 PM, Segher Boessenkool wrote:
> On Wed, Dec 04, 2019 at 03:53:30PM -0600, Peter Bergner wrote:
>> Right.  I'll come up with a patch and hopefully Iain and David can test
>> on Darwin and AIX and I can test on Linux with --enable-decimal-float
>> and --disable-decimal-float.  That should cover the major subtargets and
>> if it works there, I'd expect it to work on the minor subtargets too.

Ok, how about the patch below?  If Iain and David could test this on Darwin
and AIX respectively, that would be great.  I'm currently testing this on
powerpc64le-linux, with and without --disable-decimal-float.

The pr92661.c test case is the DFP test case you wanted added to make sure
we do not ICE, even when DFP is disabled.  The dfp-[dt]d*.c changes are
due to me seeing them being run (and FAILing) on my --disable-decimal-float
runs.  Clearly, they shouldn't be run when DFP is disabled.

All of the powerpc/dfp/* tests had powerpc*-*-* target tests, but that is
covered by the dfp.exp target tests, so I removed them along with the
now unneeded dg-skip-if AIX tests.  If dg-do compile is the default, do
we want to just remove that whole line?

How is this looking?

Peter


PR bootstrap/92661
* gcc.target/powerpc/pr92661.c: New test.
* gcc.target/powerpc/dfp-dd.c: Add dg-require-effective-target dfp_hw.
Remove unneeded powerpc_fprs test.
* gcc.target/powerpc/dfp-td.c: Likewise.
* gcc.target/powerpc/dfp-dd-2.c: Add dg-require-effective-target dfp.
* gcc.target/powerpc/dfp-td-2.c: Likewise.
* gcc.target/powerpc/dfp-td-3.c: Likewise.
* gcc.target/powerpc/dfp/dfp.exp: Remove rs6000-*-* and
powerpc*-*-darwin* target tests.  Add check_effective_target_dfp test.
* gcc.target/powerpc/dfp/dtstsfi-0.c: Remove unneeded target test.
Remove unneeded dg-skip-if.
* gcc.target/powerpc/dfp/dtstsfi-1.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-10.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-11.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-12.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-13.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-14.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-15.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-16.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-17.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-18.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-19.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-2.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-20.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-21.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-22.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-23.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-24.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-25.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-26.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-27.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-28.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-29.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-3.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-30.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-31.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-32.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-33.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-34.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-35.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-36.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-37.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-38.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-39.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-4.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-40.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-41.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-42.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-43.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-44.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-45.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-46.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-47.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-48.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-49.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-5.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-50.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-51.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-52.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-53.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-54.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-55.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-56.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-57.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-58.c: Likewise

Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-10 Thread Peter Bergner
On 12/10/19 12:27 PM, Peter Bergner wrote:
> Ok, how about the patch below?  If Iain and David could test this on Darwin
> and AIX respectively, that would be great.  I'm currently testing this on
> powerpc64le-linux, with and without --disable-decimal-float.

So my --enable-decimal-float builds showed no regressions or differences
in the testsuite results (as expected).  My --disable-decimal-float runs
showed no new regressions.  The only differences in the testsuite results
were due to all of the FAILs on the base compiler build that are fixed
with the patch.

Peter





[PATCH] rs6000: Fix PR92923, __builtin_vec_xor() causes subregs to be used when not using V4SImode vectors

2019-12-17 Thread Peter Bergner
PR92923 shows a problem where builtin function usage is causing performance
problems due to unneeded subreg usage.  These are being caused by the front-
end emitting unneeded VIEW_CONVERT_EXPR's on the builtin functions operands.
These in tern where caused by a lack of overloaded builtin entries in the
rs6000 backend.  The following patch adds just enough new definitions to
match what our vector documentation says we must support.  I have also
added new test cases so that we will catch any regressions in this area.

This passed bootstrap and regression testing with no errors.  Ok for trunk?

This is a bug on the release branches too, but given how big this patch
ended up being, I don't think we want to backport this.

Peter


gcc/
PR target/92923
* config/rs6000/rs6000-builtin.def (VAND, VANDC, VNOR, VOR,
VXOR): Delete.
(EQV_V16QI_UNS, EQV_V8HI_UNS, EQV_V4SI_UNS, EQV_V2DI_UNS, EQV_V1TI_UNS,
NAND_V16QI_UNS, NAND_V8HI_UNS, NAND_V4SI_UNS, NAND_V2DI_UNS,
NAND_V1TI_UNS, ORC_V16QI_UNS, ORC_V8HI_UNS, ORC_V4SI_UNS, ORC_V2DI_UNS,
ORC_V1TI_UNS, VAND_V16QI_UNS, VAND_V16QI, VAND_V8HI_UNS, VAND_V8HI,
VAND_V4SI_UNS, VAND_V4SI, VAND_V2DI_UNS, VAND_V2DI, VAND_V4SF,
VAND_V2DF, VANDC_V16QI_UNS, VANDC_V16QI, VANDC_V8HI_UNS, VANDC_V8HI,
VANDC_V4SI_UNS, VANDC_V4SI, VANDC_V2DI_UNS, VANDC_V2DI, VANDC_V4SF,
VANDC_V2DF, VNOR_V16QI_UNS, VNOR_V16QI, VNOR_V8HI_UNS, VNOR_V8HI,
VNOR_V4SI_UNS, VNOR_V4SI, VNOR_V2DI_UNS, VNOR_V2DI, VNOR_V4SF,
VNOR_V2DF, VOR_V16QI_UNS, VOR_V16QI, VOR_V8HI_UNS, VOR_V8HI,
VOR_V4SI_UNS, VOR_V4SI, VOR_V2DI_UNS, VOR_V2DI, VOR_V4SF, VOR_V2DF,
VXOR_V16QI_UNS, VXOR_V16QI, VXOR_V8HI_UNS, VXOR_V8HI,
VXOR_V4SI_UNS, VXOR_V4SI, VXOR_V2DI_UNS, VXOR_V2DI, VXOR_V4SF,
VXOR_V2DF): Add definitions.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins)
: Remove.
: Add
definitions.
: Change unsigned usages to use the new *_UNS
definition names.
(rs6000_gimple_fold_builtin) : Use new definition names.
(builtin_function_type) : Handle unsigned
builtins.

   gcc/testsuite/
   PR target/92923
   * gcc.target/powerpc/pr92923-1.c: New test.
   * gcc.target/powerpc/pr92923-2.c: Likewise.

Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def(revision 279479)
+++ gcc/config/rs6000/rs6000-builtin.def(working copy)
@@ -1000,8 +1000,26 @@ BU_ALTIVEC_2 (VADDUHS, "vadduhs",
 BU_ALTIVEC_2 (VADDSHS,   "vaddshs",CONST,  altivec_vaddshs)
 BU_ALTIVEC_2 (VADDUWS,   "vadduws",CONST,  altivec_vadduws)
 BU_ALTIVEC_2 (VADDSWS,   "vaddsws",CONST,  altivec_vaddsws)
-BU_ALTIVEC_2 (VAND,  "vand",   CONST,  andv4si3)
-BU_ALTIVEC_2 (VANDC, "vandc",  CONST,  andcv4si3)
+BU_ALTIVEC_2 (VAND_V16QI_UNS, "vand_v16qi_uns",CONST,  andv16qi3)
+BU_ALTIVEC_2 (VAND_V16QI, "vand_v16qi",CONST,  andv16qi3)
+BU_ALTIVEC_2 (VAND_V8HI_UNS,  "vand_v8hi_uns", CONST,  andv8hi3)
+BU_ALTIVEC_2 (VAND_V8HI,  "vand_v8hi", CONST,  andv8hi3)
+BU_ALTIVEC_2 (VAND_V4SI_UNS,  "vand_v4si_uns", CONST,  andv4si3)
+BU_ALTIVEC_2 (VAND_V4SI,  "vand_v4si", CONST,  andv4si3)
+BU_ALTIVEC_2 (VAND_V2DI_UNS,  "vand_v2di_uns", CONST,  andv2di3)
+BU_ALTIVEC_2 (VAND_V2DI,  "vand_v2di", CONST,  andv2di3)
+BU_ALTIVEC_2 (VAND_V4SF,  "vand_v4sf", CONST,  andv4sf3)
+BU_ALTIVEC_2 (VAND_V2DF,  "vand_v2df", CONST,  andv2df3)
+BU_ALTIVEC_2 (VANDC_V16QI_UNS,"vandc_v16qi_uns",CONST, andcv16qi3)
+BU_ALTIVEC_2 (VANDC_V16QI,"vandc_v16qi",   CONST,  andcv16qi3)
+BU_ALTIVEC_2 (VANDC_V8HI_UNS, "vandc_v8hi_uns",CONST,  andcv8hi3)
+BU_ALTIVEC_2 (VANDC_V8HI, "vandc_v8hi",CONST,  andcv8hi3)
+BU_ALTIVEC_2 (VANDC_V4SI_UNS, "vandc_v4si_uns",CONST,  andcv4si3)
+BU_ALTIVEC_2 (VANDC_V4SI, "vandc_v4si",CONST,  andcv4si3)
+BU_ALTIVEC_2 (VANDC_V2DI_UNS, "vandc_v2di_uns",CONST,  andcv2di3)
+BU_ALTIVEC_2 (VANDC_V2DI, "vandc_v2di",CONST,  andcv2di3)
+BU_ALTIVEC_2 (VANDC_V4SF, "vandc_v4sf",CONST,  andcv4sf3)
+BU_ALTIVEC_2 (VANDC_V2DF, "vandc_v2df",CONST,  andcv2df3)
 BU_ALTIVEC_2 (VAVGUB,"vavgub", CONST,  uavgv16qi3_ceil)
 BU_ALTIVEC_2 (VAVGSB,"vavgsb", CONST,  avgv16qi3_ceil)
 BU_ALTIVEC_2 (VAVGUH,"vavguh", CONST,  uavgv8hi3_ceil)
@@ -1057,8 +1075,27 @@ BU_ALTIVEC_2 (VMULOUH, "vmulouh",
 BU_ALTIVEC_2 (VMULOSH,

Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-18 Thread Peter Bergner
On 12/18/19 8:15 AM, Segher Boessenkool wrote:
>> +/* { dg-do compile { target { powerpc*-*-* } } } */
>> +/* { dg-options "-w -O2 -mdejagnu-cpu=power9" } */
> 
> You don't need that target clause in gcc.target/powerpc (and dg-do compile
> is the default, but having it explicit is also fine of course).

I think leaving the bare dg-do compile (ie, no target) is nice,
for newbies who don't know that dg-do compile is the default.



>> --- gcc/testsuite/gcc.target/powerpc/dfp-dd.c(revision 278980)
>> +++ gcc/testsuite/gcc.target/powerpc/dfp-dd.c(working copy)
>> @@ -1,6 +1,7 @@
>>  /* Test generation of DFP instructions for POWER6.  */
>>  /* Origin: Janis Johnson  */
>> -/* { dg-do compile { target { powerpc*-*-linux* && powerpc_fprs } } } */
>> +/* { dg-do compile { target { powerpc*-*-linux* } } } */
>> +/* { dg-require-effective-target dfp_hw } */
> 
> You can remove powerpc_fprs now because it became redundant?  Cool.

Right, hard dfp support requires we have hard float support.


> But dfp_hw is the wrong conditions for a dg-do compile test.

Ok, yes.  Looking closer, that dfp_hw is a runtime test and not
what we want.  I'll change this to using "hard_dfp" which is a
compile time test.



> Nice cleanups!  Please fix that dfp_hw thing, and then, okay for trunk,
> Thanks!

Will do, thanks.  I'll commit this after making these changes and
rerunning the updated test cases.

Peter



Re: [PATCH] rs6000: Fix PR92923, __builtin_vec_xor() causes subregs to be used when not using V4SImode vectors

2019-12-20 Thread Peter Bergner
On 12/20/19 9:35 AM, Segher Boessenkool wrote:
> Something automated to verify what we implement is what we have documented
> would be neat to have.  Maybe this becomes feasible with the rewrite of
> the builtin stuff :-)

Agreed!



>> This passed bootstrap and regression testing with no errors.  Ok for trunk?
> 
> On what kind of system did you test?
> 
> I'd like to see this tested on both BE and LE, and various processor
> generations -- but we'll see if it regresses anyway, and it is still
> stage 3.  So, okay for trunk, just please keep an eye out for
> regressions (in January that is :-) )

It was an LE test.  I fixed up the problems below and have kicked off
a BE run now and will run the testsuite in both 32-bit and 64-bit modes.
I leave on vacation tomorrow, so I'll wait until I return next week
before committing so I can watch for any regressions.


>>  * config/rs6000/rs6000-builtin.def (VAND, VANDC, VNOR, VOR,
>>  VXOR): Delete.
> 
> You can end a changelog line in a colon just fine, fwiw.

Ok.



>> --- gcc/testsuite/gcc.target/powerpc/pr92923-1.c (nonexistent)
>> +++ gcc/testsuite/gcc.target/powerpc/pr92923-1.c (working copy)
>> @@ -0,0 +1,454 @@
>> +/* { dg-do compile { target { powerpc*-*-* } } } */
> 
> You don't need this target clause, everything in gc.target/powerpc has it
> already.
> 
>> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> 
> I'm not sure if this is necessary, or just cargo culting :-)

I think maybe both of those were cut/paste issues.  I've removed them both.




>> +/* { dg-final { scan-tree-dump-times "VIEW_CONVERT_EXPR" 0 "gimple" } } */
> 
> That's scan-tree-dump-not.

Ahh, that is better.  Fixed.


> Same comments for the p8 test of course.  Okay with or without those
> adjusted (they aren't wrong, just weird style).

Fixed too.


Peter





Re: [PATCH] rs6000: Fix PR92923, __builtin_vec_xor() causes subregs to be used when not using V4SImode vectors

2019-12-30 Thread Peter Bergner
On 12/20/19 12:20 PM, Peter Bergner wrote:
>> On what kind of system did you test?
>>
>> I'd like to see this tested on both BE and LE, and various processor
>> generations -- but we'll see if it regresses anyway, and it is still
>> stage 3.  So, okay for trunk, just please keep an eye out for
>> regressions (in January that is :-) )
> 
> It was an LE test.  I fixed up the problems below and have kicked off
> a BE run now and will run the testsuite in both 32-bit and 64-bit modes.
> I leave on vacation tomorrow, so I'll wait until I return next week
> before committing so I can watch for any regressions.

The BE run was clean and I'm back from vacation, so I committed
the updated patch.  Thanks.

Peter




Re: [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries

2021-11-13 Thread Peter Zijlstra
On Sat, Nov 13, 2021 at 03:37:24PM -0500, David Malcolm wrote:

> This approach is much less expressive that the custom addres space
> approach; it would only cover the trust boundary aspect; it wouldn't
> cover any differences between generic pointers and __user, vs __iomem,
> __percpu, and __rcu which I admit I only dimly understand.

__iomem would point at device memory, which can have curious side
effects or is yet another trust boundary, depending on device and usage.

__percpu is an address space that denotes a per-cpu variable's relative
offset, it needs be combined with a per-cpu offset to get a 'real'
pointer, on x86_64 %gs segment offset is used for this purpose, other
architectures are less fortunate. The whole per_cpu()/this_cpu_*()
family of APIs accepts such pointers.

__rcu is the regular kernel address space, but denotes that the object
pointed to has RCU lifetime management. The attribute is laundered
through rcu_dereference() to remove the __rcu qualifier.

> Possibly silly question: is it always a bug for the value of a kernel
> pointer to leak into user space?  i.e. should I be complaining about an
> infoleak if the value of a trusted_ptr itself is written to
> *untrusted_ptr?  e.g.

Yes, always. Leaking kernel pointers is unconditionally bad.


Re: [PATCH 2/6] Add returns_zero_on_success/failure attributes

2021-11-15 Thread Peter Zijlstra
On Mon, Nov 15, 2021 at 12:33:16PM +0530, Prathamesh Kulkarni wrote:
> On Sun, 14 Nov 2021 at 02:07, David Malcolm via Gcc-patches

> > +/* Handle "returns_zero_on_failure" and "returns_zero_on_success" 
> > attributes;
> > +   arguments as in struct attribute_spec.handler.  */
> > +
> > +static tree
> > +handle_returns_zero_on_attributes (tree *node, tree name, tree, int,
> > +  bool *no_add_attrs)
> > +{
> > +  if (!INTEGRAL_TYPE_P (TREE_TYPE (*node)))
> > +{
> > +  error ("%qE attribute on a function not returning an integral type",
> > +name);
> > +  *no_add_attrs = true;
> > +}
> > +  return NULL_TREE;
> Hi David,
> Just curious if a warning should be emitted if the function is marked
> with the attribute but it's return value isn't actually 0 ?
> 
> There are other constants like -1 or 1 that are often used to indicate
> error, so maybe tweak the attribute to
> take the integer as an argument ?
> Sth like returns_int_on_success(cst) / returns_int_on_failure(cst) ?
> 
> Also, would it make sense to extend it for pointers too for returning
> NULL on success / failure ?

Please also consider that in Linux we use the 'last' page for error code
returns. That is, a function returning a pointer could return '(void
*)-EFAULT' also see linux/err.h


Re: [PATCH V3] rs6000: Don't ICE when compiling the __builtin_vsx_splat_2di built-in [PR113950]

2024-03-15 Thread Peter Bergner
On 3/6/24 3:27 AM, Kewen.Lin wrote:
> on 2024/3/4 02:55, jeevitha wrote:
>> The following patch has been bootstrapped and regtested on powerpc64le-linux.
>>  
>> When we expand the __builtin_vsx_splat_2di function, we were allowing 
>> immediate
>> value for second operand which causes an unrecognizable insn ICE. Even though
>> the immediate value was forced into a register, it wasn't correctly assigned
>> to the second operand. So corrected the assignment of op1 to operands[1].
[snip]
> As the discussions in the thread of the previous versions, I think
> Segher agreed this approach, so OK for trunk with the above nits
> tweaked, thanks!

The bogus vsx_splat_ code goes all the way back to GCC 8, so we
should backport this fix.  Segher and Ke Wen, can we get an approval
to backport this to all the open release branches (GCC 13, 12, 11)?
Thanks.

Jeevitha, once we get approval, please perform the backports.

Peter




Re: [PATCH] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2024-03-18 Thread Peter Bergner
On 3/18/24 9:36 AM, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Feb 23, 2024 at 03:04:13PM +0530, jeevitha wrote:
>> PTImode attribute assists in generating even/odd register pairs on 128 bits.
> 
> It is a mode, not an attribute.  Attributes are on declarations, while
> modes are on a much more fundamental level (every value has a mode, in
> GCC!)
> 
>> When the user specifies PTImode as an attribute, it breaks because there is 
>> no
>> internal type to handle this mode . We have created a tree node with dummy 
>> type
>> to handle PTImode.
> 
> You are talking about the mode attribute.  Aha.

Correct.



>> We are not documenting this dummy type since users are not
>> allowed to use this type externally.
> 
> Not sure what this is meant to mean?  What does "allowed to" mean, even?
> We do not forbid people from doing anything (we can discourage them
> though).  Or dso you mean something just doesn't work?

The use case is the Linux kernel will use the mode attribute as shown in
the test case.  The type she created was needed for the code to "work"
internally, but we don't want to advertise it to users, so that's why we're
not documenting it.  Yes, pesky users who go digging through the GCC source
could find it and use it, but we don't want to encourage them. :-)


>>  PR target/106895
>>  * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add fields
>>  to hold PTImode type.
> 
> An enum does not have fields.  What do you actually mean?

Yeah, as per your follow-on comment, I think a simple "Add RS6000_BTI_INTPTI."
should suffice.


Peter



Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-03-22 Thread Peter Bergner
so same problem with the comment, but the following
change...

> -  else if (align_words < GP_ARG_NUM_REG)
> +  else if (align_words < GP_ARG_NUM_REG
> +|| (cum->hidden_string_length
> +&& cum->actual_parm_length <= GP_ARG_NUM_REG))
{
  if (TARGET_32BIT && TARGET_POWERPC64)
return rs6000_mixed_function_arg (mode, type, align_words);

  return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words);
}
  else
return NULL_RTX;

The old code for the unused hidden parameter (which was the 9th param) would
fall thru to the "return NULL_RTX;" which would make the callee assume there
was a parameter save area allocated.  Now instead, we'll return a reg rtx,
probably of r11 (r3 thru r10 are our param regs) and I'm guessing we'll now
see a copy of r11 into a pseudo like we do for the other param regs.
Is that a problem? Given it's an unused parameter, it'll probably get deleted
as dead code, but could it cause any issues?  What if we have more than one
unused hidden parameter and we return r12 and r13 which have specific uses
in our ABIs (eg, r13 is our TCB pointer), so it may not actually look dead.
Have you verified what the callee RTL looks like after expand for these
unused hidden parameters?  Is there a rtx we can return that isn't a NULL_RTX
which triggers the assumption of a parameter save area, but isn't a reg rtx
which might lead to some rtl being generated?  Would a (const_int 0) or
something else work?



> +  /* Actual parameter length ignoring hidden parameter.
> + This is done to C++ wrapper calling fortran procedures
> + which has hidden parameter that are not used.  */

I think a simpler comment will suffice:

  /* Actual parameter count ignoring unused hidden parameters.  */


Peter





Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-03-23 Thread Peter Bergner
On 3/23/24 4:33 AM, Ajit Agarwal wrote:
>>> -  else if (align_words < GP_ARG_NUM_REG)
>>> +  else if (align_words < GP_ARG_NUM_REG
>>> +  || (cum->hidden_string_length
>>> +  && cum->actual_parm_length <= GP_ARG_NUM_REG))
>> {
>>   if (TARGET_32BIT && TARGET_POWERPC64)
>> return rs6000_mixed_function_arg (mode, type, align_words);
>>
>>   return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words);
>> }
>>   else
>> return NULL_RTX;
>>
>> The old code for the unused hidden parameter (which was the 9th param) would
>> fall thru to the "return NULL_RTX;" which would make the callee assume there
>> was a parameter save area allocated.  Now instead, we'll return a reg rtx,
>> probably of r11 (r3 thru r10 are our param regs) and I'm guessing we'll now
>> see a copy of r11 into a pseudo like we do for the other param regs.
>> Is that a problem? Given it's an unused parameter, it'll probably get deleted
>> as dead code, but could it cause any issues?  What if we have more than one
>> unused hidden parameter and we return r12 and r13 which have specific uses
>> in our ABIs (eg, r13 is our TCB pointer), so it may not actually look dead.
>> Have you verified what the callee RTL looks like after expand for these
>> unused hidden parameters?  Is there a rtx we can return that isn't a NULL_RTX
>> which triggers the assumption of a parameter save area, but isn't a reg rtx
>> which might lead to some rtl being generated?  Would a (const_int 0) or
>> something else work?
>>
>>
> For the above use case it will return 
> 
> (reg:DI 5 %r5) and below check entry_parm = 
> (reg:DI 5 %r5) and the following check will not return TRUE and hence
>parameter save area will not be allocated.

Why r5?!?!   The 8th (integer) param would return r10, so I'd assume if
the next param was a hidden param, then it'd get the next gpr, so r11.
How does it jump back to r5 which may have been used by the 3rd param?





> It will not generate any rtx in the callee rtl code but it just used to
> check whether to allocate parameter save area or not when number of args > 8.
> 
> /* If there is no incoming register, we need a stack.  */
>   entry_parm = rs6000_function_arg (args_so_far, arg);
>   if (entry_parm == NULL)
> return true;
> 
>   /* Likewise if we need to pass both in registers and on the stack.  */
>   if (GET_CODE (entry_parm) == PARALLEL
>   && XEXP (XVECEXP (entry_parm, 0, 0), 0) == NULL_RTX)
> return true;

Yes, this code in rs6000_parm_needs_stack() uses the rs6000_function_arg()
return value as a boolean to tell us whether a parameter save area is required
so what we return is unimportant other than to know it's not NULL_RTX.

I'm more concerned about the use of the target hook targetm.calls.function_arg
used in the generic parts of the compiler.  What will that code do differently
now that we return a reg rtx rather than NULL_RTX?  Might that code use
the reg rtx to emit something?  I'd feel better if you could verify what
happens in that code when we return a reg rtx for that 9th hidden param which
isn't really being passed in a register.


Peter




[PATCH] lto: Don't assume a posix shell is usable on windows [PR110710]

2024-03-26 Thread Peter Damianov
lto-wrapper generates Makefiles that use the following:
touch -r file file.tmp && mv file.tmp file
to truncate files.
If there is no suitable "touch" or "mv" available, then this errors with
"The system cannot find the file specified".

The solution here is to check if sh -c true works, before trying to use it in
the Makefile. If it doesn't, then fall back to "copy /y nul file" instead.
The fallback doesn't work exactly the same (the modified time of the file gets
updated), but this doesn't seem to matter in practice.

I tested this both in environments both with and without sh present, and
observed no issues.

Signed-off-by: Peter Damianov 
---
 gcc/lto-wrapper.cc | 35 ---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc
index 5186d040ce0..8dee0aaa2d8 100644
--- a/gcc/lto-wrapper.cc
+++ b/gcc/lto-wrapper.cc
@@ -1389,6 +1389,27 @@ make_exists (void)
   return errmsg == NULL && exit_status == 0 && err == 0;
 }
 
+/* Test that an sh command is present and working, return true if so.
+   This is only relevant for windows hosts, where a /bin/sh shell cannot
+   be assumed to exist. */
+
+static bool
+sh_exists (void)
+{
+  const char *sh = "sh";
+  const char *sh_args[] = {sh, "-c", "true", NULL};
+#ifdef _WIN32
+  int exit_status = 0;
+  int err = 0;
+  const char *errmsg
+= pex_one (PEX_SEARCH, sh_args[0], CONST_CAST (char **, sh_args),
+  "sh", NULL, NULL, &exit_status, &err);
+  return errmsg == NULL && exit_status == 0 && err == 0;
+#else
+  return true;
+#endif
+}
+
 /* Execute gcc. ARGC is the number of arguments. ARGV contains the arguments. 
*/
 
 static void
@@ -1402,6 +1423,7 @@ run_gcc (unsigned argc, char *argv[])
   const char *collect_gcc;
   char *collect_gcc_options;
   int parallel = 0;
+  bool have_sh = sh_exists ();
   int jobserver = 0;
   bool jobserver_requested = false;
   int auto_parallel = 0;
@@ -2016,6 +2038,7 @@ cont:
  argv_ptr[5] = NULL;
  if (parallel)
{
+ fprintf (mstream, "SHELL=%s\n", have_sh ? "sh" : "cmd");
  fprintf (mstream, "%s:\n\t@%s ", output_name, new_argv[0]);
  for (j = 1; new_argv[j] != NULL; ++j)
fprintf (mstream, " '%s'", new_argv[j]);
@@ -2024,9 +2047,15 @@ cont:
 truncate them as soon as we have processed it.  This
 reduces temporary disk-space usage.  */
  if (! save_temps)
-   fprintf (mstream, "\t@-touch -r \"%s\" \"%s.tem\" > /dev/null "
-"2>&1 && mv \"%s.tem\" \"%s\"\n",
-input_name, input_name, input_name, input_name); 
+   {
+ fprintf (mstream,
+  have_sh
+  ? "\t@-touch -r \"%s\" \"%s.tem\" > /dev/null "
+"2>&1 && mv \"%s.tem\" \"%s\"\n"
+  : "\t@-copy /y nul \"%s\" > NUL "
+"2>&1\n",
+  input_name, input_name, input_name, input_name);
+   }
}
  else
{
-- 
2.39.2



Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-03 Thread Peter Bergner
On 4/3/24 7:40 AM, H.J. Lu wrote:
> We can't profile indirect calls to IFUNC resolvers nor their callees as
> it requires TLS which hasn't been set up yet when the dynamic linker is
> resolving IFUNC symbols.
> 
> Add an IFUNC resolver caller marker to cgraph_node and set it if the
> function is called by an IFUNC resolver.  Disable indirect call profiling
> for IFUNC resolvers and their callees.

The IFUNC resolvers on Power do not use TLS, so isn't this a little too
conservative?  Should this be triggered via a target hook so architectures
that don't use TLS in their IFUNC resolvers could still profile them?

Peter




Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-03 Thread Peter Bergner
On 4/3/24 10:45 AM, Andrew Pinski wrote:
> On Wed, Apr 3, 2024 at 8:32 AM Peter Bergner  wrote:
> I think you misunderstood the patch/situtation. Most ifunc resolves
> don't use TLS at all; what is happening here is that the profiler
> (-fprofile-generate) is adding TLS usage to the ifunc resolver which
> then causes issues. And the use of TLS causes a PLT call to be inside
> the ifun which causes all the fun stuff.
> 
> This is not about ifunc resolves using TLS directly in code but rather
> indirectly via -fprofile-generate.

Ah, ok.  Thanks to you and H.J. for clarifying.

Peter



[PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-05 Thread Peter Bergner
This is a cleanup patch in preparation to fixing the real bug in PR101865.
TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
Also replace all usages of OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR
and delete the now dead mask.

This passed bootstrap and retesting on powerpc64le-linux with no regressions.
Ok for trunk?

Eventually we'll want to backport this along with the follow-on patch that
actually fixes PR101865.

Peter


gcc/
PR target/101865
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete redundant
OPTION_MASK_DIRECT_MOVE usage.  Delete TARGET_DIRECT_MOVE dead code.
(rs6000_opt_masks): Neuter the "direct-move" option.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete useless
comment.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
OPTION_MASK_DIRECT_MOVE.
(OTHER_VSX_VECTOR_MASKS): Likewise.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.opt (mno-direct-move): New.
(mdirect-move): Remove Mask and Var.


diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 68bc45d65ba..77d045c9f6e 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -471,6 +471,8 @@ extern int rs6000_vector_align[];
 #define TARGET_EXTSWSLI(TARGET_MODULO && TARGET_POWERPC64)
 #define TARGET_MADDLD  TARGET_MODULO
 
+/* TARGET_DIRECT_MOVE is redundant to TARGET_P8_VECTOR, so alias it to that.  
*/
+#define TARGET_DIRECT_MOVE TARGET_P8_VECTOR
 #define TARGET_XSCVDPSPN   (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
 #define TARGET_XSCVSPDPN   (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
 #define TARGET_VADDUQM (TARGET_P8_VECTOR && TARGET_POWERPC64)
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 6ba9df4f02e..c241371147c 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3811,7 +3811,7 @@ rs6000_option_override_internal (bool global_init_p)
  Testing for direct_move matches power8 and later.  */
   if (!BYTES_BIG_ENDIAN
   && !(processor_target_table[tune_index].target_enable
-  & OPTION_MASK_DIRECT_MOVE))
+  & OPTION_MASK_P8_VECTOR))
 rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
 
   /* Add some warnings for VSX.  */
@@ -3853,8 +3853,7 @@ rs6000_option_override_internal (bool global_init_p)
   && (rs6000_isa_flags_explicit & (OPTION_MASK_SOFT_FLOAT
   | OPTION_MASK_ALTIVEC
   | OPTION_MASK_VSX)) != 0)
-rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO
-  | OPTION_MASK_DIRECT_MOVE)
+rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO)
 & ~rs6000_isa_flags_explicit);
 
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
@@ -3939,13 +3938,6 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_FPRND;
 }
 
-  if (TARGET_DIRECT_MOVE && !TARGET_VSX)
-{
-  if (rs6000_isa_flags_explicit & OPTION_MASK_DIRECT_MOVE)
-   error ("%qs requires %qs", "-mdirect-move", "-mvsx");
-  rs6000_isa_flags &= ~OPTION_MASK_DIRECT_MOVE;
-}
-
   if (TARGET_P8_VECTOR && !TARGET_ALTIVEC)
 rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR;
 
@@ -24429,7 +24421,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
false, true  },
   { "cmpb",OPTION_MASK_CMPB,   false, true  },
   { "crypto",  OPTION_MASK_CRYPTO, false, true  },
-  { "direct-move", OPTION_MASK_DIRECT_MOVE,false, true  },
+  { "direct-move", 0,  false, true  },
   { "dlmzb",   OPTION_MASK_DLMZB,  false, true  },
   { "efficient-unaligned-vsx", OPTION_MASK_EFFICIENT_UNALIGNED_VSX,
false, true  },
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index ce0b14a8d37..647f20de7f2 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -429,19 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  /* Not

Re: [PATCH, rs6000] Split TARGET_POWER8 from TARGET_DIRECT_MOVE [PR101865] (2/2)

2024-04-07 Thread Peter Bergner
I'm picking up Will's patches for this bug.  As an FYI, this is the bug where
_ARCH_PWR8 is conditional on TARGET_DIRECT_MOVE which can be disabled with
-mno-vsx which is bad.

I already posted the cleanup patch that the updated patch for this bug will rely
on, that removed the OPTION_MASK_DIRECT_MOVE because it is fully redundant with
OPTION_MASK_P8_VECTOR.  I've also incorporated some of Ke Wen's review comments
on Will's original patch.  I have a couple of comments on your review though...


On 10/17/22 1:08 PM, Segher Boessenkool wrote:
> On Mon, Sep 19, 2022 at 11:13:20AM -0500, will schmidt wrote:
>> @@ -24046,10 +24045,11 @@ static struct rs6000_opt_mask const 
>> rs6000_opt_masks[] =
>>{ "block-ops-vector-pair",OPTION_MASK_BLOCK_OPS_VECTOR_PAIR,
>>  false, true  },
>>{ "cmpb", OPTION_MASK_CMPB,   false, true  },
>>{ "crypto",   OPTION_MASK_CRYPTO, false, 
>> true  },
>>{ "direct-move",  OPTION_MASK_DIRECT_MOVE,false, true  },
>> +  { "power8",   OPTION_MASK_POWER8, false, 
>> true  },
> 
> Why would we want a #pragma power8 ?

Agreed, we don't want that.  We have target attribute cpu=power8 for that.



>> +mpower8
>> +Target Mask(POWER8) Var(rs6000_isa_flags)
>> +Use instructions added in ISA 2.07 (power8).
> 
> There should not be such an option.  It is set by -mcpu=power8 and
> later, but can never be enabled or disabled direfctly by the user.

So we need an OPTION_MASK_POWER8 to be created for use in rs6000_isa_flags, but
the only way I see that we can do that is to create an option in rs6000.opt.
Did I miss that there is another way?  Otherwise, I was thinking of creating a
dummy option that is WarnRemoved from the start ala:

+;; This option exists only for its MASK.  It is not intended for users.
+mpower8
+Target Mask(POWER8) Var(rs6000_isa_flags) WarnRemoved
+

Is there a better way?  The problem is P8 created lots of new instructions, but
they were basically all vector and htm instructions.  There were no general
GPR or FPR instructions (ie, what we'd think of as base architecture) added,
so there's no other OPTION_MASK_*/TARGET_* we can use as a P8 base architecture
test.

I'll note I tried just a bare "Target Mask(POWER8) Var(rs6000_isa_flags)" with 
no
option name mentioned at all, but that didn't work, as no OPTION_MASK_POWER8 was
created.

Peter




Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Peter Bergner
On 4/8/24 3:55 AM, Kewen.Lin wrote:
> on 2024/4/6 06:28, Peter Bergner wrote:
>> +mno-direct-move
>> +Target Undocumented WarnRemoved
>> +
>>  mdirect-move
>> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
>> +Target Undocumented WarnRemoved
> 
> When reviewing my previous patch to "neuter option -mpower{8,9}-vector",
> Segher mentioned that we don't need to keep such option warning all the
> time and can drop it like in a release later as users should be aware of
> this information then, I agreed and considering that patch disabling
> -m[no-]direct-move was r8-7845-g57f108f5a1e1b2, I think we can just remove
> m[no-]direct-move here?  What do you think?


I'm fine with that if that is what we want.  So something like the following?

+;; This option existed in the past, but now is always silently ignored.
mdirect-move
-Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
+Target Undocumented Ignore


The above seems to silently ignore both -mdirect-move and -mno-direct-move
which I think is what we want.  That said, it's not what we've done with
other options, but maybe those just need to be changed too?


Peter


;; This option existed in the past, but now is always off.
mno-mfpgpr
Target RejectNegative Undocumented Ignore

mmfpgpr
Target RejectNegative Undocumented WarnRemoved

[snip other examples]


Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Peter Bergner
On 4/8/24 9:37 PM, Kewen.Lin wrote:
> on 2024/4/8 21:21, Peter Bergner wrote:
> I prefer to remove it completely, that is:
> 
>> -mdirect-move
>> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
> 
> The reason why you still kept it is to keep a historical record here?

I believe we've never completely removed an option before.  I think the
thought was, if some software package explicitly used the option, then
they shouldn't see an 'unrecognized command-line option' error, but
rather either a warning that the option was removed or just silently
ignore it.  Ie, we don't want to make a package that used to build with
an old compiler now break its build because the option doesn't exist
anymore.



> Segher pointed out to me that this kind of option complete removal should be
> stage 1 stuff, so let's defer to make it in a separated patch next release
> (including some other options like mfpgpr you showed below etc.). :)

If we're going to completely remove it, then for sure, it's a stage1 thing.
I'd like to hear Segher's thoughts on whether we should completely remove
it or just silently ignore it.



> For the original patch,
> 
>> +mno-direct-move
>> +Target Undocumented WarnRemoved
> 
> s/WarnRemoved/Ignore/ to match some other existing practice, there is no
> warning now if specifying -mno-direct-move and it would be good to keep
> the same behavior for users.

If we want to silently ignore -mdirect-move and -mno-direct-move, then we
just need to do:

mdirect-move
-Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
+Target Undocumented Ignore

There's no need to mention -mno-direct-move at all then.  It was only in the
case I thought we wanted to warn against it's use that I added -mno-direct-move.



>> That said, it's not what we've done with
>> other options, but maybe those just need to be changed too?
> 
> Yes, I think they need to be changed too (next release).

If that's the consensus with Segher, sure, we can plan on that in stage1.

Peter




Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-09 Thread Peter Bergner
On 4/9/24 12:37 AM, Kewen.Lin wrote:
> Since removing it completely is a stage1 thing, I prefer to keep mdirect-move
> and -mno-direct-move handlings as before, WarnRemoved and Ignore separately.

Ok, current trunk ignores -mno-direct-move and warns on -mdirect-move, so to
keep the same behavior for GCC 14 (before removing in stage1), we want just:

mdirect-move
-Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
+Target Undocumented WarnRemoved

which is what I'll commit and push after one last round of testing.
Thanks.

Peter




Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-09 Thread Peter Bergner
On 4/9/24 3:19 PM, Peter Bergner wrote:
> Ok, current trunk ignores -mno-direct-move and warns on -mdirect-move, so to
> keep the same behavior for GCC 14 (before removing in stage1), we want just:
> 
> mdirect-move
> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
> +Target Undocumented WarnRemoved
> 
> which is what I'll commit and push after one last round of testing.

Testing was clean as expected, so I pushed the commit.  Thanks.

Peter



[PATCH] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-04-11 Thread Peter Bergner
FYI: This patch is an update to Will Schmidt's patches to fix PR101865:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601825.html
  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601823.html

...taking into consideration patch reviews received than.  I also found
a few more locations that needed patching, as well as simplifying the
testsuite test cases by removing the need to scan for the predefined macros.



The bug in PR101865 is the _ARCH_PWR8 predefined macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used.  The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.

Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it.  For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.

The passed bootstrap and regtest on powerpc64le-linux.  Ok for trunk?

This is also broken on the release branches, so ok for backports after
some burn-in time on trunk?

Peter


2024-04-11  Will Schmidt  
    Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
TARGET_POWER8.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use
OPTION_MASK_POWER8.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_POWER8.
(ISA_2_7_MASKS_SERVER): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Update
comment.  Use OPTION_MASK_POWER8 and TARGET_POWER8.
* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
(define_attr "enabled"): Handle it.
(define_insn "prefetch"): Use TARGET_POWER8.
* config/rs6000/rs6000.opt (mdo-not-use-this-option): New.

gcc/testsuite/
PR target/101865
* gcc.target/powerpc/predefined-p7-novsx.c: New test.
* gcc.target/powerpc/predefined-p8-noaltivec-novsx.c: New test.
* gcc.target/powerpc/predefined-p8-noaltivec.c: New test.
* gcc.target/powerpc/predefined-p8-novsx.c: New test.
* gcc.target/powerpc/predefined-p8-pragma-vsx.c: New test.
* gcc.target/powerpc/predefined-p9-novsx.c: New test.

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index e7d6204074c..320affd79e3 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -165,7 +165,7 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
 case ENB_P7_64:
   return TARGET_POPCNTD && TARGET_POWERPC64;
 case ENB_P8:
-  return TARGET_DIRECT_MOVE;
+  return TARGET_POWER8;
 case ENB_P8V:
   return TARGET_P8_VECTOR;
 case ENB_P9:
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 647f20de7f2..bd493ab87c5 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -429,7 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
+  if ((flags & OPTION_MASK_POWER8) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 45dd5a85901..6ee678e69c3 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -47,6 +47,7 @@
fusion here, instead set it in rs6000.cc if we are tuning for a power8
system.  */
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
+| OPTION_MASK_POWER8   \
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
@@ -130,6 +131,7 @@
 | OPTION_MASK_MODULO   \
 | OPTION_MASK_MULHW\
 | OPTION_MASK_NO_UPDATE\
+| OPTION_MASK_POWER8   \
 | OPTION_MASK_P8_FUSION\
 | OPTION_MASK_P8_VECTOR\
   

[PATCH] Fortran: fix passing array component to polymorphic argument [PR105658]

2024-02-15 Thread Peter Hill
Dear all,

The attached patch fixes PR105658 by forcing an array temporary to be
created. This is required when passing an array component, but this
didn't happen if the dummy argument was an unlimited polymorphic type.

The problem bit of code is in `gfc_conv_expr_descriptor`, near L7828:

  subref_array_target = (is_subref_array (expr)
 && (se->direct_byref
|| expr->ts.type == BT_CHARACTER));
  need_tmp = (gfc_ref_needs_temporary_p (expr->ref)
  && !subref_array_target);

where `need_tmp` is being evaluated to 0.  The logic here isn't clear
to me, and this function is used in several places, which is why I
went with setting `parmse.force_tmp = 1` in `gfc_conv_procedure_call`
and using the same conditional as the later branch for the
non-polymorphic case (near the call to `gfc_conv_subref_array_arg`)

If this patch is ok, please could someone commit it for me? This is my
first patch for GCC, so apologies in advance if the commit message is
missing something.

Tested on x86_64-pc-linux-gnu.

The bug is present in gfortran back to 4.9, so should it also be backported?

Cheers,
Peter

 PR fortran/105658

gcc/fortran/ChangeLog

* trans-expr.cc (gfc_conv_procedure_call): When passing an
array component reference of intrinsic type to a procedure
with an unlimited polymorphic dummy argument, a temporary
should be created.

gcc/testsuite/ChangeLog

* gfortran.dg/PR105658.f90: New test.
---
 gcc/fortran/trans-expr.cc  |  8 
 gcc/testsuite/gfortran.dg/PR105658.f90 | 25 +
 2 files changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/PR105658.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index a0593b76f18..7fd3047c4e9 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6439,6 +6439,14 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
   CLASS object for the unlimited polymorphic formal.  */
gfc_find_vtab (&e->ts);
gfc_init_se (&parmse, se);
+   /* The actual argument is a component reference to an array
+  of derived types, so we need to force creation of a
+  temporary */
+   if (e->expr_type == EXPR_VARIABLE
+   && is_subref_array (e)
+   && !(fsym && fsym->attr.pointer))
+ parmse.force_tmp = 1;
+
gfc_conv_intrinsic_to_class (&parmse, e, fsym->ts);

  }
diff --git a/gcc/testsuite/gfortran.dg/PR105658.f90
b/gcc/testsuite/gfortran.dg/PR105658.f90
new file mode 100644
index 000..407ee25f77c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/PR105658.f90
@@ -0,0 +1,25 @@
+! { dg-do compile }
+! { dg-options "-Warray-temporaries" }
+! Test fix for incorrectly passing array component to unlimited
polymorphic procedure
+
+module test_PR105658_mod
+  implicit none
+  type :: foo
+integer :: member1
+integer :: member2
+  end type foo
+contains
+  subroutine print_poly(array)
+class(*), dimension(:), intent(in) :: array
+select type(array)
+type is (integer)
+  print*, array
+end select
+  end subroutine print_poly
+
+  subroutine do_print(thing)
+type(foo), dimension(3), intent(in) :: thing
+call print_poly(thing%member1) ! { dg-warning "array temporary" }
+  end subroutine do_print
+
+end module test_PR105658_mod
-- 
2.43.0


Re: [PATCH] Fortran: fix passing array component to polymorphic argument [PR105658]

2024-02-19 Thread Peter Hill
Hi Harald,

Thanks for your help, please see the updated and signed-off patch below.

> (I am not entirely sure whether we need to exclude pointer and
> allocatable attributes here explicitly, given the constraints
> in F2023:15.5.2.6, but other may have an opinion, too.
> The above should be safe anyway.)

I've included them in the patch here, but it does seem to work fine
without checking those attributes here -- and invalid code is still
caught with that change.

It also occurred to me that array temporaries aren't _required_ here
(for arrays of derived type components), but in the general case with
a type with differently sized components, the stride wouldn't be a
multiple of the component's type's size. Is it possible in principle
to have an arbitrary stride?

Cheers,
Peter

>From 907a104facfc7f35f48ebcfa9ef5f8f5430d4d3c Mon Sep 17 00:00:00 2001
From: Peter Hill 
Date: Thu, 15 Feb 2024 16:58:33 +
Subject: [PATCH] Fortran: fix passing array component ref to polymorphic
 procedures

 PR fortran/105658

gcc/fortran/ChangeLog

* trans-expr.cc (gfc_conv_intrinsic_to_class): When passing an
array component reference of intrinsic type to a procedure
with an unlimited polymorphic dummy argument, a temporary
should be created.

gcc/testsuite/ChangeLog

* gfortran.dg/PR105658.f90: New test.

Signed-off-by: Peter Hill 
---
 gcc/fortran/trans-expr.cc  |  9 +
 gcc/testsuite/gfortran.dg/PR105658.f90 | 50 ++
 2 files changed, 59 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/PR105658.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index a0593b76f18..004081aa6c3 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -1019,6 +1019,14 @@ gfc_conv_intrinsic_to_class (gfc_se *parmse, gfc_expr *e,
   tmp = gfc_typenode_for_spec (&class_ts);
   var = gfc_create_var (tmp, "class");

+  /* Force a temporary for component or substring references */
+  if (unlimited_poly
+  && class_ts.u.derived->components->attr.dimension
+  && !class_ts.u.derived->components->attr.allocatable
+  && !class_ts.u.derived->components->attr.class_pointer
+  && is_subref_array (e))
+parmse->force_tmp = 1;
+
   /* Set the vptr.  */
   ctree = gfc_class_vptr_get (var);

@@ -6439,6 +6447,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
   CLASS object for the unlimited polymorphic formal.  */
gfc_find_vtab (&e->ts);
gfc_init_se (&parmse, se);
+
gfc_conv_intrinsic_to_class (&parmse, e, fsym->ts);

  }
diff --git a/gcc/testsuite/gfortran.dg/PR105658.f90
b/gcc/testsuite/gfortran.dg/PR105658.f90
new file mode 100644
index 000..8aacecf806e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/PR105658.f90
@@ -0,0 +1,50 @@
+! { dg-do compile }
+! { dg-options "-Warray-temporaries" }
+! Test fix for incorrectly passing array component to unlimited
polymorphic procedure
+
+module test_PR105658_mod
+   implicit none
+   type :: foo
+ integer :: member1
+ integer :: member2
+   end type foo
+contains
+   subroutine print_poly(array)
+ class(*), dimension(:), intent(in) :: array
+ select type(array)
+ type is (integer)
+   print*, array
+ type is (character(*))
+   print *, array
+ end select
+   end subroutine print_poly
+
+   subroutine do_print(thing)
+ type(foo), dimension(3), intent(in) :: thing
+ type(foo), parameter :: y(3) = [foo(1,2),foo(3,4),foo(5,6)]
+ integer :: i, j, uu(5,6)
+
+ call print_poly(thing%member1)   ! { dg-warning "array temporary" }
+ call print_poly(y%member2)   ! { dg-warning "array temporary" }
+ call print_poly(y(1::2)%member2) ! { dg-warning "array temporary" }
+
+ ! The following array sections work without temporaries
+ uu = reshape([(((10*i+j),i=1,5),j=1,6)],[5,6])
+ print *, uu(2,2::2)
+ call print_poly (uu(2,2::2)) ! no temp needed!
+ print *, uu(1::2,6)
+ call print_poly (uu(1::2,6)) ! no temp needed!
+   end subroutine do_print
+
+   subroutine do_print2(thing2)
+ class(foo), dimension(:), intent(in) :: thing2
+ call print_poly (thing2% member2) ! { dg-warning "array temporary" }
+   end subroutine do_print2
+
+   subroutine do_print3 ()
+ character(3) :: c(3) = ["abc","def","ghi"]
+ call print_poly (c(1::2))  ! no temp needed!
+ call print_poly (c(1::2)(2:3)) ! { dg-warning "array temporary" }
+   end subroutine do_print3
+
+end module test_PR105658_mod
-- 
2.43.0


[PATCH] rs6000: Update instruction counts due to combine changes [PR112103]

2024-02-19 Thread Peter Bergner
rs6000: Update instruction counts due to combine changes [PR112103]

The PR91865 combine fix changed instruction counts slightly for rlwinm-0.c.
Adjust expected instruction counts accordingly.

This passed on both powerpc64le-linux and powerpc64-linux running the
testsuite in both 32-bit and 64-bit modes.  Ok for trunk?

FYI, I will open a new bug to track the removing of the superfluous
insns detected in PR112103.


Peter


gcc/testsuite/
PR target/112103
* gcc.target/powerpc/rlwinm-0.c: Adjust expected instruction counts.

diff --git a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c 
b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
index 4f4fca2d8ef..a10d9174306 100644
--- a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
@@ -4,10 +4,10 @@
 /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 6739 { target ilp32 } } } 
*/
 /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9716 { target lp64 } } } 
*/
 /* { dg-final { scan-assembler-times {(?n)^\s+blr} 3375 } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 3081 { target lp64 } } } 
*/
+/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 3090 { target lp64 } } } 
*/
 
 /* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3197 { target ilp32 } } 
} */
-/* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3093 { target lp64 } } } 
*/
+/* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3084 { target lp64 } } } 
*/
 /* { dg-final { scan-assembler-times {(?n)^\s+rotlwi} 154 } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+srwi} 13 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+srdi} 13 { target lp64 } } } */


Re: [PATCH] rs6000: Update instruction counts due to combine changes [PR112103]

2024-02-20 Thread Peter Bergner
On 2/20/24 3:29 AM, Kewen.Lin wrote:
> on 2024/2/20 06:35, Peter Bergner wrote:
>> rs6000: Update instruction counts due to combine changes [PR112103]
>>
>> The PR91865 combine fix changed instruction counts slightly for rlwinm-0.c.
>> Adjust expected instruction counts accordingly.
>>
>> This passed on both powerpc64le-linux and powerpc64-linux running the
>> testsuite in both 32-bit and 64-bit modes.  Ok for trunk?
> 
> OK for trunk, thanks for fixing!

Ok, pushed.  Thanks.


>> FYI, I will open a new bug to track the removing of the superfluous
>> insns detected in PR112103.
> 
> Hope this test case will become not fragile any more once this filed
> issue gets fixed. :)

I think this will become less fragile after we fix PR114004 which is
the bug I opened to track fixing the superfluous insn that was emitted
that we found in this bug.  The fragility was due to the superfluous
insn being different before and after Roger's patch.  Once we don't
emit it anymore, this test case should be less fragile.

Peter



Re: [PATCH] rs6000: Neuter option -mpower{8,9}-vector [PR109987]

2024-02-20 Thread Peter Bergner
On 2/20/24 3:27 AM, Kewen.Lin wrote:
> on 2024/2/20 02:45, Segher Boessenkool wrote:
>> On Tue, Jan 16, 2024 at 10:50:01AM +0800, Kewen.Lin wrote:
>>> it consists of some aspects:
>>>   - effective target powerpc_p{8,9}vector_ok are removed
>>> and replaced with powerpc_vsx_ok.
>>
>> So all such testcases already arrange to have p8 or p9 some other way?

Shouldn't that be replaced with powerpc_vsx instead of powerpc_vsx_ok?
That way we know VSX code gen is enabled for the options being used,
even those in RUNTESTFLAGS.

I thought we agreed that powerpc_vsx_ok was almost always useless and
we always want to use powerpc_vsx.  ...or did I miss that we removed
the old powerpc_vsx_ok and renamed powerpc_vsx to powerpc_vsx_ok?



>>>   - Some test cases are updated with explicit -mvsx.
>>>   - Some test cases with those two option mixed are adjusted
>>> to keep the test points, like -mpower8-vector
>>> -mno-power9-vector are updated with -mdejagnu-cpu=power8
>>> -mvsx etc.
>>
>> -mcpu=power8 implies -mvsx already.

Then we can omit the explicit -msx option, correct?  Ie, if the
user forces -mno-vsx in RUNTESTFLAGS, then we'll just skip the
test case as UNSUPPORTED rather than trying to compile some
vsx test case with vsx disabled via the options.



Peter


Re: [PATCH] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-26 Thread Peter Bergner
On 2/26/24 4:49 AM, Kewen.Lin wrote:
> on 2024/2/26 14:18, jeevitha wrote:
>> Hi All,
>> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
>> index 6111cc90eb7..e5688ff972a 100644
>> --- a/gcc/config/rs6000/vsx.md
>> +++ b/gcc/config/rs6000/vsx.md
>> @@ -4660,7 +4660,7 @@
>>  (define_expand "vsx_splat_"
>>[(set (match_operand:VSX_D 0 "vsx_register_operand")
>>  (vec_duplicate:VSX_D
>> - (match_operand: 1 "input_operand")))]
>> + (match_operand: 1 "splat_input_operand")))]
>>"VECTOR_MEM_VSX_P (mode)"
>>  {
>>rtx op1 = operands[1];
> 
> This hunk actually does force_reg already:
> 
> ...
>   else if (!REG_P (op1))
> op1 = force_reg (mode, op1);
> 
> but it's assigning to op1 unexpectedly (an omission IMHO), so just
> simply fix it with:
> 
>   else if (!REG_P (op1))
> -op1 = force_reg (mode, op1);
> +operands[1] = force_reg (mode, op1);

I agree op1 was an oversight and it should be operands[1].
That said, I think using more precise predicates is a good thing,
so I think we should use both Jeevitha's predicate change and
your operands[1] change.

I'll note that Jeevitha originally had the operands[1] change, but I
didn't look closely enough at the issue or the pattern and mentioned
that these kinds of bugs can be caused by too loose constraints and
predicates, which is when she found the updated predicate to use.
I believe she already even bootstrapped and regtested the operands[1]
only change.  Jeevitha???




>> +/* PR target/113950 */
>> +/* { dg-do compile } */
> 
> We need an effective target to ensure vsx support, for now it's 
> powerpc_vsx_ok.
> ie: /* { dg-require-effective-target powerpc_vsx_ok } */

Agreed.


>> +/* { dg-options "-O1" } */

I think we should also use a -mcpu=XXX option to ensure VSX is enabled
when compiling these VSX built-in functions.  I'm fine using any CPU
(power7 or later) where the ICE exists with an unpatched compiler.
Otherwise, testing will be limited to our server systems that have
VSX enabled by default.


Peter





Re: [PATCH] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-26 Thread Peter Bergner
On 2/26/24 7:55 PM, Kewen.Lin wrote:
> on 2024/2/26 23:07, Peter Bergner wrote:
>> so I think we should use both Jeevitha's predicate change and
>> your operands[1] change.
> 
> Since either the original predicate change or operands[1] change
> can fix this issue, I think it's implied that only either of them
> is enough, so we can remove "else if (!REG_P (op1))" arm (or even
> replaced with one else arm to assert REG_P (op1))?

splat_input_operand allows, mem, reg and subreg, so I don't think
we can just assert on REG_P (op1), since op1 could be a subreg.
I do agree we can remove the "if (!REG_P (op1))" test on the else
branch, since force_reg() has an early exit for regs, so a simple:

  ...
  else
operands[1] = force_reg (mode, op1);

..should work.




> Good point, or maybe just an explicit -mvsx like some existing ones, which
> can avoid to only test some fixed cpu type.

If a simple "-O1 -vsx" is enough to expose the ICE on an unpacthed
compiler and a PASS on a patched compiler, then I'm all for it.
Jeevitha, can you try confirming that?


Peter




Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-27 Thread Peter Bergner
On 2/27/24 6:40 AM, Segher Boessenkool wrote:
> On Tue, Feb 27, 2024 at 02:02:38AM +0530, jeevitha wrote:
>> There is no immediate value splatting instruction in Power. Currently, those
>> values need to be stored in a register or memory. To address this issue, I
>> have updated the predicate for the second operand in vsx_splat to
>> splat_input_operand and corrected the assignment of op1 to operands[1].
>> These changes ensure that operand1 is stored in a register.
> 
> input_operand allows a lot of things that splat_input_operand does not,
> not just immediate operands.  NAK.
> 
> (For example, *all* memory is okay for input_operand, always).
> 
> I'm not saying we do not want to restrict these things, but a commit
> that doesn't discuss this at all is not okay.  Sorry.

So it seems you're not NAKing the use of splat_input_operand, but
just that it needs more explanation in the git log entry, correct?

Yes, input_operand accepts a lot more things than splat_input_operand
does, but the multiple define_insns this define_expand feeds, uses
gpc_reg_operand, memory_operand and splat_input_operand for their
operands[1] operand (splat_input_operand accepts reg and mem too),
so it seems to match better what the patterns will be accepting and
I always thought that using predicates that more accurately reflect
what the define_insns expect/accept lead to better code gen.

Mike, was it just an oversight to not use splat_input_operand for the
vsx_splat_ expander or was input_operand a conscious decision?

If input_operand was used purposely, then we can just fall back to
the s/op1/operands[1]/ change which we already know fixes the bug.


Peter




Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-28 Thread Peter Bergner
On 2/28/24 8:31 AM, Segher Boessenkool wrote:
> On Tue, Feb 27, 2024 at 04:50:02PM -0600, Peter Bergner wrote:
>> So it seems you're not NAKing the use of splat_input_operand, but
>> just that it needs more explanation in the git log entry, correct?
> 
> I NAK the patch.  _Of course_ there needs to be *something* done, there
> is a bug after all, it needs to be fixed.
> 
> But no, there are big questions about if splat_input_operand is correct
> as well.  This needs to be justified in the patch submission.

Ok, then Jeevitha, repost the patch with the s/op1/operands[1]/ only change.
Jeevitha has already bootstrapped and regtested that change and it does
fix the bug.

Clearly, the splat_input_operand change needs more discussion and would
be a follow-on patch...if we decide to do it at all.

Peter



Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-19 Thread Peter Bergner
On 1/11/24 11:29 AM, Michael Meissner wrote:
> This is version 2 of the patch.  The only difference is I made the test case
> simpler to read.
[snip]
> gcc/
> 
>   PR target/112886
>   * config/rs6000/rs6000.cc (print_operand): Add %S output modifier.
>   * doc/md.texi (Modifiers): Mention %S can be used like %x.
> 
> gcc/testsuite/
> 
>   PR target/112886
>   * /gcc.target/powerpc/pr112886.c: New test.

This resolves my issue with the first patch, so LGTM.

Peter




Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-23 Thread Peter Bergner
On 1/23/24 8:30 PM, Kewen.Lin wrote:
>> -output_operand_lossage ("invalid %%x value");
>> +output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 
>> 'x'));
> 
> Nit: Seems simpler with
>   
>   output_operand_lossage ("invalid %%%c value", (char) code);

Agreed, good catch.




>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112886.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr112886.c
>> new file mode 100644
>> index 000..4e59dcda6ea
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr112886.c
>> @@ -0,0 +1,29 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target power10_ok } */
> 
> I think this needs one more:
> 
> /* { dg-require-effective-target powerpc_vsx_ok } */

I agree with this...



>> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> 
> ... and
> 
> /* { dg-options "-mdejagnu-cpu=power10 -O2 -mvsx" } */
> 
> , otherwise with explicit -mno-vsx, this test case would fail.

But not with this.  The -mdejagnu-cpu=power10 option already enables -mvsx.
If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them.
The options set in RUNTESTFLAGS come after the options in the dg-options
line, so even adding -mvsx like the above won't help the test case PASS
if we didn't have the powerpc_vsx_ok test.  In other words, the -mvsx option
doesn't help with anything.

All we need is the new powerpc_vsx_ok check and that will guard against the FAIL
in the case the user forces -mno-vsx.  In that case, we'll just get an 
UNSUPPORTED
and that is fine.

Peter





Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-24 Thread Peter Bergner
On 1/24/24 12:04 AM, Kewen.Lin wrote:
> on 2024/1/24 11:11, Peter Bergner wrote:
>> But not with this.  The -mdejagnu-cpu=power10 option already enables -mvsx.
>> If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them.
>> The options set in RUNTESTFLAGS come after the options in the dg-options
>> line, so even adding -mvsx like the above won't help the test case PASS
> 
> But this is NOT true, at least on one of internal Power10 machine 
> (ltcden2-lp1).
> 
> With the command below:
>   
>   make check-gcc-c RUNTESTFLAGS="--target_board=unix/-mno-vsx 
> powerpc.exp=pr112886.c"
> 
> this test case fails without the explicit -mvsx in dg-options.
> 
> From the verbose dumping, the compilation command looks like:
> 
> /home/linkw/gcc/build/gcc-test-debug/gcc/xgcc 
> -B/home/linkw/gcc/build/gcc-test-debug/gcc/
> /home/linkw/gcc/gcc-test/gcc/testsuite/gcc.target/powerpc/pr112886.c  
> -mno-vsx 
> -fdiagnostics-plain-output  -mdejagnu-cpu=power10 -O2 -ffat-lto-objects 
> -fno-ident -S
> -o pr112886.s
> 
> "-mno-vsx" comes **before** "-mdejagnu-cpu=power10 -O2" rather than **after**.
> 
> I guess it might be due to different behaviors of different versions of 
> runtest framework?

That is confusing, unless as you say, the behavior changed.  The whole reason 
we added
-mdejagnu-cpu= (and the dg-skip usage before that) was due to encountering 
problems
when the test case wanted a specific -mcpu= value and the user overrode it in 
their
RUNTESTFLAGS and that can only happen when its options come last on the command 
line.

Then again, why didn't the powerpc_vsx_ok test not save us here?



> So there can be two cases with user explicitly specified -mno-vsx:
> 
> 1) RUNTESTFLAGS comes after dg-options (assuming same order for -mvsx in 
> powerpc_vsx_ok)
> 
>   powerpc_vsx_ok test failed, so UNSUPPORTED
> 
>   // with explicit -mvsx does nothing as you said.
> 
> 2) RUNTESTFLAGS comes before dg-options
> 
>   powerpc_vsx_ok test succeeds, but FAIL.
>   
>  // with suggested -mvsx, make it match the powerpc_vsx_ok checking and the 
> case not fail.
> 
> As above I think we still need to append the "-mvsx" explicitly.  As 
> tested/verified, it
> does help the case not to fail on ltcden2-lp1.

I'd like to verify that the behavior did change before we enforce adding that 
option.
The problem is, there are a HUGE number of older test cases that would need 
updating
to "fix" them too.  ...and not just adding -mnsx, but -maltivec and basically 
any
-mfoo option where the test case is expecting the feature foo to be used/tested.
It would be a huge mess.

Peter



Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Peter Bergner
On 12/13/23 2:05 AM, Jakub Jelinek wrote:
> On Wed, Dec 13, 2023 at 08:51:16AM +0100, Richard Biener wrote:
>> On Tue, 12 Dec 2023, Peter Bergner wrote:
>>
>>> On 12/12/23 8:36 PM, Jason Merrill wrote:
>>>> This test is failing for me below C++17, I think you need
>>>>
>>>> // { dg-do compile { target c++17 } }
>>>> or
>>>> // { dg-require-effective-target c++17 }
>>>
>>> Sorry about that.  Should we do the above or should we just add
>>> -std=c++17 to dg-options?  ...or do we need to do both?
>>
>> Just do the above, the C++ testsuite iterates over all standards,
>> adding -std=c++17 would just run that 5 times.  But the above
>> properly skips unsupported cases.
> 
> I believe if one uses explicit -std=gnu++17 or -std=c++17 in dg-options
> then it will not iterate:
> # If the testcase specifies a standard, use that one.
> # If not, run it under several standards, allowing GNU extensions
> # if there's a dg-options line.
> if ![search_for $test "-std=*++"] {
> and otherwise how many times exactly it iterates depends on what the user
> asked for or what effective target is there (normally the default is
> to iterate 4 times (98,14,17,20), one can use e.g.
> GXX_TESTSUITE_STDS=98,11,14,17,20,23,26 to iterate 7 times, or the
> default also changes if c++23, c++26 or c++11_only effective targets
> are present somewhere in the test.
> 
> But sure, if the test is valid in C++17, 20, 23, 26, then
> // { dg-do compile { target c++17 } }
> is best (unless the test is mostly language version independent and
> very expensive to compile or run).

I confirmed the test case builds with C++17, 20, 23, 26 and errors out
with C++11, so I went with your solution.  Thanks for the input and
sorry for the breakage.  Pushed.

Peter


testsuite: Add dg-do compile target c++17 directive for testcase [PR112822]

Add dg-do compile target directive that limits the test case to being built
on c++17 compiles or greater.

2023-12-13  Peter Bergner  

gcc/testsuite/
PR tree-optimization/112822
* g++.dg/pr112822.C: Add dg-do compile target c++17 directive.

diff --git a/gcc/testsuite/g++.dg/pr112822.C b/gcc/testsuite/g++.dg/pr112822.C
index d1490405493..a8557522467 100644
--- a/gcc/testsuite/g++.dg/pr112822.C
+++ b/gcc/testsuite/g++.dg/pr112822.C
@@ -1,4 +1,5 @@
 /* PR tree-optimization/112822 */
+/* { dg-do compile { target c++17 } } */
 /* { dg-options "-w -O2" } */
 
 /* Verify we do not ICE on the following noisy creduced test case.  */




Re: [PATCH] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]

2023-12-13 Thread Peter Bergner
On 11/24/23 3:28 AM, Kewen.Lin wrote:
>> +  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
> 
> Is it intentional to keep GET_MODE_SIZE (V16QImode) instead of 16?
> I think if one day NUM_POLY_INT_COEFFS isn't 1 on rs6000 any more,
> we have to add one explicit .to_constant () here.  So I prefer this
> to use 16 directly, maybe one comment above to indicate what's for
> the value 16.

I normally don't like hard coding constants in the code, so used
GET_MODE_SIZE (V16QImode) as the number of bytes of a vector register,
but if that's going to cause an issue in the future, I'm fine using 16.
Changed.



>> +  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
> 
> Likewise.

Changed.




>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>> index 5f56c3ed85b..f2efa46c147 100644
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -1964,9 +1964,12 @@ rs6000_hard_regno_mode_ok (unsigned int regno, 
>> machine_mode mode)
>>  static bool
>>  rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
>>  {
>> -  if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
>> -  || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode)
>> -return mode1 == mode2;
>> +   if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
>> +   || mode2 == PTImode || mode2 == XOmode)
>> + return mode1 == mode2;
>> + 
>> +  if (mode2 == OOmode)
>> +return ALTIVEC_OR_VSX_VECTOR_MODE (mode1);
> 
> I vaguely remembered that Segher mentioned it's unexpected for opaque
> modes to have tieable modes excepting for themselves, but if this is the
> only way to get rid of those extra moves, I guess we can special-case
> them here.  Looking forward to Segher's comments on this part.

To be honest, my original patch didn't have this.  I think it was Mike who
said we want or need this.  Mike, why do we want/need this again?

That said, the documentation for TARGET_MODES_TIEABLE_P says:

  This hook returns true if a value of mode mode1 is accessible in
  mode mode2 without copying.

Given OOmode (ie, __vector_pair) under the covers is two consecutive
vector registers, and we use them/initialize them with two vectors,
then mode1 being a vector mode could be accesible from an OOmode mode2
without copying, meaning we could access it directly from the registers
holding mode2.

Segher, your input to the above an the subreg portion of the patch in general?

Peter





Re: [PATCH V2] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]

2023-12-14 Thread Peter Bergner
On 7/16/23 10:40 PM, P Jeevitha via Gcc-patches wrote:
> Normally, GPR2 is the TOC pointer and is defined as a fixed and non-volatile
> register. However, it can be used as volatile for PCREL addressing. Therefore,
> modified r2 to be non-fixed in FIXED_REGISTERS and set it to fixed if it is 
> not
> PCREL and also when the user explicitly requests TOC or fixed. If the register
> r2 is fixed, it is made as non-volatile. Changes in register preservation 
> roles
> can be accomplished with the help of available target hooks
> (TARGET_CONDITIONAL_REGISTER_USAGE).
> 
> 2023-07-12  Jeevitha Palanisamy  
> 
> gcc/
>   PR target/PR110320
>   * config/rs6000/rs6000.cc (rs6000_conditional_register_usage): Change
>   GPR2 to volatile and non-fixed register for PCREL.
> 
> gcc/testsuite/
>   PR target/PR110320
>   * gcc.target/powerpc/pr110320-1.c: New testcase.
>   * gcc.target/powerpc/pr110320-2.c: New testcase.
>   * gcc.target/powerpc/pr110320-3.c: New testcase.
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 44b448d2ba6..9aa04ec5d57 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10193,9 +10193,13 @@ rs6000_conditional_register_usage (void)
>  for (i = 32; i < 64; i++)
>fixed_regs[i] = call_used_regs[i] = 1;
>  
> +  /* For non PC-relative code, GPR2 is unavailable for register allocation.  
> */
> +  if (FIXED_R2 && !rs6000_pcrel_p ())
> +fixed_regs[2] = 1;
> +
>/* The TOC register is not killed across calls in a way that is
>   visible to the compiler.  */
> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> +  if (fixed_regs[2] && (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2))
>  call_used_regs[2] = 0;

Segher and Ke Wen,

As we discussed on my PR111045 patch that disabled PCREL on everything other
than ELFv2 and Segher said, well, it should work on ELFv1, modulo fixing bugs.
Segher said we should attempt to fix those bugs before we ship the next release
and if we miss that, we can push my PR111045 patch then to disable it.

On a related note, Jeevitha's patch above allows using r2 for normal register
allocation if r2 is not fixed and pcrel is enabled.  Given pcrel with this patch
enables pcrel on ELFv1, that means this patch can also enable using r2 for 
normal
register allocation on ELFv1.  Is that safe?  Should we add a check above where
we set fixed_regs[2] = 1, to also check for whether this is not an ELFv2 
compile?
...or Segher, should we leave this as is and add it to the things to check for
non-ELFv2 compiles before the next release and possible disable it then if we
know/aren't sure whether it legal?

So I guessing I'm wondering, should Jeevitha push the above approved patch as
is, or should we modify it so r2 is only available for RA on ELFv2 and pcrel?

Peter




Re: [PATCH V2] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]

2023-12-14 Thread Peter Bergner
On 12/14/23 9:57 PM, Peter Bergner wrote:
> On 7/16/23 10:40 PM, P Jeevitha via Gcc-patches wrote:
>> +  /* For non PC-relative code, GPR2 is unavailable for register allocation. 
>>  */
>> +  if (FIXED_R2 && !rs6000_pcrel_p ())
>> +fixed_regs[2] = 1;
[snip]
> On a related note, Jeevitha's patch above allows using r2 for normal register
> allocation if r2 is not fixed and pcrel is enabled.  Given pcrel with this 
> patch
> enables pcrel on ELFv1, that means this patch can also enable using r2 for 
> normal
> register allocation on ELFv1.

Nevermind, I'm daft and r2 usage is not allowed on ELFv1.  The rs6000_pcrel_p()
call above is always false for non ELFv2 compiles, so we'll mark r2 as fixed
for ELFv1.  Move along, nothing to see. :-)

That said, I think we need a "dg-require-effective-target powerpc_elfv2" for
the first test case where we're checking that we do use r2 for normal RA.
That'll only be true on ELFv2 compiles, hence the need for the extra target
requirement.  I've asked Jeevitha to add that to the pr111045-1.c test
case and verify it fixes the failure of that test case on her BE run.

Peter







Re: [PATCH] PR target/112886, Add %S to print_operand for vector pair support

2024-01-09 Thread Peter Bergner
On 1/5/24 4:18 PM, Michael Meissner wrote:
> @@ -14504,13 +14504,17 @@ print_operand (FILE *file, rtx x, int code)
>   print_operand (file, x, 0);
>return;
>  
> +case 'S':
>  case 'x':
> -  /* X is a FPR or Altivec register used in a VSX context.  */
> +  /* X is a FPR or Altivec register used in a VSX context.  %x prints
> +  the VSX register number, %S prints the 2nd register number for
> +  vector pair, decimal 128-bit floating and IBM 128-bit binary floating
> +  values.  */
>if (!REG_P (x) || !VSX_REGNO_P (REGNO (x)))
> - output_operand_lossage ("invalid %%x value");
> + output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 
> 'x'));
>else
>   {
> -   int reg = REGNO (x);
> +   int reg = REGNO (x) + (code == 'S' ? 1 : 0);
> int vsx_reg = (FP_REGNO_P (reg)
>? reg - 32
>: reg - FIRST_ALTIVEC_REGNO + 32);

The above looks good to me.  However:


> +: "=v" (*p)
> +: "v" (*q), "v" (*r));

These really should use "wa" rather than "v", since these are
VSX instructions... or did you use those to ensure you got
Altivec registers numbers assigned?



> +/* { dg-final { scan-assembler-times {\mxvadddp 
> (3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3])\M}
>  2 } } */

...and this is really ugly and hard to read/understand.  Can't we use
register variables to make it simpler?  Something like the following
which tests having both FPR and Altivec reg numbers assigned?

...
void
test (__vector_pair *ptr)
{
  register __vector_pair p asm ("vs10");
  register __vector_pair q asm ("vs42");
  register __vector_pair r asm ("vs44");
  q = ptr[1];
  r = ptr[2];
  __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
   : "=wa" (p)
   : "wa" (q), "wa" (r));
  ptr[2] = p;
}

/* { dg-final { scan-assembler-times {\mxvadddp 10,42,44\M} 1 } } */
/* { dg-final { scan-assembler-times {\mxvadddp 11,43,45\M} 1 } } */
...

Peter



Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 10:50 AM, Martin Jambor wrote:
> The testcase has reasonable size but it is specific to ppc64le and its
> altivec vectors.  My plan is to ask the bug reporter to massage it into
> a target specific testcase in bugzilla.  Alternatively I can try to
> craft a testcase from scratch but that will take time.

I rewrote the Altivec specific part of the testcase to use generic C code
and it still ICEs for me on ppc64le using an unpatched compiler.  Therefore,
I think we can just add the updated testcase to the generic g++ tests. 

I'll note I was wrong in the bugzilla comments, -O3 -mcpu=power10 is not
required to hit the ICE.  A simple -O2 on ppc64le is enough to hit the ICE.

Ok for trunk?

Peter


testsuite: Add testcase for already fixed PR [PR112822]

gcc/testsuite/
PR tree-optimization/112822
* g++.dg/pr112822.C: New test.

diff --git a/gcc/testsuite/g++.dg/pr112822.C b/gcc/testsuite/g++.dg/pr112822.C
new file mode 100644
index 000..3921d5c1bbe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr112822.C
@@ -0,0 +1,369 @@
+/* PR target/112822 */
+/* { dg-options "-w -O2" } */
+
+/* Verify we do not ICE on the following noisy creduced test case.  */
+
+namespace b {
+typedef int c;
+template  struct d;
+template  struct d { using f = e; };
+template  struct aa;
+template  struct aa { using f = h; };
+template  using ab = typename d::f;
+template  using n = typename aa::f;
+template  class af {
+public:
+  typedef __complex__ ah;
+  template  af operator+=(e) {
+ah o;
+x = o;
+return *this;
+  }
+  ah x;
+};
+} // namespace b
+namespace {
+enum { p };
+enum { ac, ad };
+struct ae;
+struct al;
+struct ag;
+typedef b::c an;
+namespace ai {
+template  struct ak { typedef aj f; };
+template  using ar = typename ak::f;
+template  struct am {
+  enum { at };
+};
+template  struct ao {
+  enum { at };
+};
+template  struct ap;
+template  struct aq {
+  enum { at };
+};
+} // namespace ai
+template  struct ay;
+template  class as;
+template  class ba;
+template  class aw;
+template  class be;
+template  class az;
+namespace ai {
+template  struct bg;
+template ::bd>
+struct bk;
+template  struct bf;
+template  struct bm;
+template  struct bh;
+template ::bj>::at> struct bp {
+  typedef bi f;
+};
+template  struct br {
+  typedef typename bp::f>::f f;
+};
+template  struct bn;
+template  struct bn {
+  typedef aw f;
+};
+template  struct bx {
+  typedef typename bn::bs, aj ::bo>::f f;
+};
+template  struct bt { typedef b::n<0, aj, aj> f; };
+template ::f> struct cb {
+  enum { bw };
+  typedef b::n::f> f;
+};
+template ::bs> struct by {
+  typedef be f;
+};
+template  struct bz {
+  typedef typename by::f f;
+};
+template  struct ch;
+template  struct ch { typedef ci bd; };
+} // namespace ai
+template > struct cg;
+template  struct cg { typedef aj cn; };
+namespace ai {
+template  cj cp;
+template  void cl(bu *cr, cj cs) { ct(cr, cs); }
+typedef __attribute__((altivec(vector__))) double co;
+void ct(double *cr, co cs) { *(co *)cr = cs; }
+struct cq {
+  co q;
+};
+template <> struct bm> { typedef cq f; };
+template <> struct bh { typedef cq bj; };
+void ct(b::af *cr, cq cs) { ct((double *)cr, cs.q); }
+template  struct cx {
+  template  void cu(cw *a, cj) {
+cl(a, cp);
+  }
+};
+} // namespace ai
+template  class ba : public ay {
+public:
+  typedef ai::ap bu;
+  typedef b::n::bo, bu, b::n::at, bu, bu>> cv;
+  typedef ay db;
+  db::dc;
+  cv coeff(an dd, an col) const { return dc().coeff(dd, col); }
+};
+template  class cz : public ba::at> {
+public:
+  ai::ap b;
+  enum { da, dg, dh, bv, bq, di = dg, bo };
+};
+template  class be : public cz {
+public:
+  typedef typename ai::ap::bu bu;
+  typedef cz db;
+  db::dc;
+  template  cd &operator+=(const be &);
+  template  az df(de);
+};
+template  struct ay {
+  cd &dc() { return *static_cast(this); }
+  cd dc() const;
+};
+template  class dl;
+namespace ai {
+template  struct ap> {
+  typedef bb dj;
+  typedef bc r;
+  typedef ap s;
+  typedef ap t;
+  typedef typename cg::cn bu;
+  typedef typename ch::bd>::bd cf;
+  enum { bo };
+};
+} // namespace ai
+template 
+class az : public dl, ai::ap, ai::bg::bd>> {
+public:
+  typedef dk bb;
+  typedef Rhs_ bc;
+  typedef typename ai::bt::f LhsNested;
+  typedef typename ai::bt::f dn;
+  typedef ai::ar u;
+  typedef ai::ar RhsNestedCleaned;
+  u lhs();
+  RhsNestedCleaned rhs();
+};
+template 
+class dl : public ai::bz, al>::f {};
+namespace ai {
+template  struct v { typedef ag w; };
+template  struct evaluator_traits_base {
+  typedef typename v::cf>::w w;
+};
+template  struct ax : evaluator_traits_base {};
+template  struct y { static const bool at = false; };
+template  class plainobjectbase_evaluator_data {
+public:
+  plainobjectbase_evaluator_data(bu *ptr, an) : data(ptr) {}
+  an outerStride() { return z; }
+  bu *data;
+};
+template  struct evaluator {
+  

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 12:45 PM, Peter Bergner wrote:
> +/* PR target/112822 */

Oops, this should be:

/* PR tree-optimization/112822 */

It's fixed on my end.

Peter






Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 1:26 PM, Richard Biener wrote:
>> Am 12.12.2023 um 19:51 schrieb Peter Bergner :
>>
>> On 12/12/23 12:45 PM, Peter Bergner wrote:
>>> +/* PR target/112822 */
>>
>> Oops, this should be:
>>
>> /* PR tree-optimization/112822 */
>>
>> It's fixed on my end.
> 
> Ok

Pushed now that Martin has pushed his fix.  Thanks!

Peter




Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 8:36 PM, Jason Merrill wrote:
> This test is failing for me below C++17, I think you need
> 
> // { dg-do compile { target c++17 } }
> or
> // { dg-require-effective-target c++17 }

Sorry about that.  Should we do the above or should we just add
-std=c++17 to dg-options?  ...or do we need to do both?

Peter





[PATCH] configure: Only create serdep.tmp if needed

2023-01-14 Thread Peter Foley
There's no reason to create this file if none of the serial configure
options are passed.

ChangeLog:

* configure: Regenerate.
* configure.ac: Only create serdep.tmp if needed

Signed-off-by: Peter Foley 
---
 configure| 2 ++
 configure.ac | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/configure b/configure
index 85883099410..47caaeb23fe 100755
--- a/configure
+++ b/configure
@@ -9918,7 +9918,9 @@ esac
 # These force 'configure's to be done one at a time, to avoid problems
 # with contention over a shared config.cache.
 rm -f serdep.tmp
+if  "x${enable_serial_build_configure}" = xyes || 
"x${enable_serial_host_configure}" = xyes || 
"x${enable_serial_target_configure}" = xyes ; then
 echo '# serdep.tmp' > serdep.tmp
+fi
 olditem=
 test "x${enable_serial_build_configure}" = xyes &&
 for item in ${build_configdirs} ; do
diff --git a/configure.ac b/configure.ac
index 2b612dce6e9..847df0c99b5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3071,7 +3071,9 @@ esac
 # These force 'configure's to be done one at a time, to avoid problems
 # with contention over a shared config.cache.
 rm -f serdep.tmp
+if [ "x${enable_serial_build_configure}" = xyes || 
"x${enable_serial_host_configure}" = xyes || 
"x${enable_serial_target_configure}" = xyes ]; then
 echo '# serdep.tmp' > serdep.tmp
+fi
 olditem=
 test "x${enable_serial_build_configure}" = xyes &&
 for item in ${build_configdirs} ; do
-- 
2.39.0



[PATCH v2] configure: Only create serdep.tmp if needed

2023-01-16 Thread Peter Foley
There's no reason to create this file if none of the serial configure
options are passed.

v2: Use test instead of [ to avoid running afoul of autoconf quoting.

ChangeLog:

* configure: Regenerate.
* configure.ac: Only create serdep.tmp if needed

Signed-off-by: Peter Foley 
---
 configure| 2 ++
 configure.ac | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/configure b/configure
index 85883099410..0494e2fa2bf 100755
--- a/configure
+++ b/configure
@@ -9918,7 +9918,9 @@ esac
 # These force 'configure's to be done one at a time, to avoid problems
 # with contention over a shared config.cache.
 rm -f serdep.tmp
+if test "x${enable_serial_build_configure}" = xyes || test 
"x${enable_serial_host_configure}" = xyes || test 
"x${enable_serial_target_configure}" = xyes; then
 echo '# serdep.tmp' > serdep.tmp
+fi
 olditem=
 test "x${enable_serial_build_configure}" = xyes &&
 for item in ${build_configdirs} ; do
diff --git a/configure.ac b/configure.ac
index 2b612dce6e9..f5cce5830bc 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3071,7 +3071,9 @@ esac
 # These force 'configure's to be done one at a time, to avoid problems
 # with contention over a shared config.cache.
 rm -f serdep.tmp
+if test "x${enable_serial_build_configure}" = xyes || test 
"x${enable_serial_host_configure}" = xyes || test 
"x${enable_serial_target_configure}" = xyes; then
 echo '# serdep.tmp' > serdep.tmp
+fi
 olditem=
 test "x${enable_serial_build_configure}" = xyes &&
 for item in ${build_configdirs} ; do
-- 
2.39.0



[PATCH] ada: Respect GNATMAKE

2023-01-16 Thread Peter Foley
Use the GNATMAKE variables consistently.
Avoids failures when bootstraping with a custom GNATMAKE value.

gcc/ada/ChangeLog:

* Make-generated.in: Use GNATMAKE.
* gcc-interface/Makefile.in: Ditto.

Signed-off-by: Peter Foley 
---
 gcc/ada/Make-generated.in | 6 +++---
 gcc/ada/gcc-interface/Makefile.in | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/Make-generated.in b/gcc/ada/Make-generated.in
index 948fc508a56..34c86b2cd63 100644
--- a/gcc/ada/Make-generated.in
+++ b/gcc/ada/Make-generated.in
@@ -18,7 +18,7 @@ GEN_IL_FLAGS = -gnata -gnat2012 -gnatw.g -gnatyg -gnatU 
$(GEN_IL_INCLUDES)
 ada/seinfo_tables.ads ada/seinfo_tables.adb ada/sinfo.h ada/einfo.h 
ada/nmake.ads ada/nmake.adb ada/seinfo.ads ada/sinfo-nodes.ads 
ada/sinfo-nodes.adb ada/einfo-entities.ads ada/einfo-entities.adb: 
ada/stamp-gen_il ; @true
 ada/stamp-gen_il: $(fsrcdir)/ada/gen_il*
$(MKDIR) ada/gen_il
-   cd ada/gen_il; gnatmake -q -g $(GEN_IL_FLAGS) gen_il-main
+   cd ada/gen_il; $(GNATMAKE) -q -g $(GEN_IL_FLAGS) gen_il-main
# Ignore errors to work around finalization issues in older compilers
- cd ada/gen_il; ./gen_il-main
$(fsrcdir)/../move-if-change ada/gen_il/seinfo_tables.ads 
ada/seinfo_tables.ads
@@ -39,14 +39,14 @@ ada/stamp-gen_il: $(fsrcdir)/ada/gen_il*
 # would cause bootstrapping with older compilers to fail. You can call it by
 # hand, as a sanity check that these files are legal.
 ada/seinfo_tables.o: ada/seinfo_tables.ads ada/seinfo_tables.adb
-   cd ada ; gnatmake $(GEN_IL_INCLUDES) seinfo_tables.adb -gnatU -gnatX
+   cd ada ; $(GNATMAKE) $(GEN_IL_INCLUDES) seinfo_tables.adb -gnatU -gnatX
 
 ada/snames.h ada/snames.ads ada/snames.adb : ada/stamp-snames ; @true
 ada/stamp-snames : ada/snames.ads-tmpl ada/snames.adb-tmpl ada/snames.h-tmpl 
ada/xsnamest.adb ada/xutil.ads ada/xutil.adb
-$(MKDIR) ada/bldtools/snamest
$(RM) $(addprefix ada/bldtools/snamest/,$(notdir $^))
$(CP) $^ ada/bldtools/snamest
-   cd ada/bldtools/snamest; gnatmake -q xsnamest ; ./xsnamest
+   cd ada/bldtools/snamest; $(GNATMAKE) -q xsnamest ; ./xsnamest
$(fsrcdir)/../move-if-change ada/bldtools/snamest/snames.ns 
ada/snames.ads
$(fsrcdir)/../move-if-change ada/bldtools/snamest/snames.nb 
ada/snames.adb
$(fsrcdir)/../move-if-change ada/bldtools/snamest/snames.nh ada/snames.h
diff --git a/gcc/ada/gcc-interface/Makefile.in 
b/gcc/ada/gcc-interface/Makefile.in
index da6a56fcec8..c8c38acf447 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -616,7 +616,7 @@ OSCONS_EXTRACT=$(GCC_FOR_ADA_RTS) $(GNATLIBCFLAGS_FOR_C) -S 
s-oscons-tmplt.i
-$(MKDIR) ./bldtools/oscons
$(RM) $(addprefix ./bldtools/oscons/,$(notdir $^))
$(CP) $^ ./bldtools/oscons
-   (cd ./bldtools/oscons ; gnatmake -q xoscons)
+   (cd ./bldtools/oscons ; $(GNATMAKE) -q xoscons)
 
 $(RTSDIR)/s-oscons.ads: ../stamp-gnatlib1-$(RTSDIR) s-oscons-tmplt.c gsocket.h 
./bldtools/oscons/xoscons
$(RM) $(RTSDIR)/s-oscons-tmplt.i $(RTSDIR)/s-oscons-tmplt.s
-- 
2.39.0



Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-10 Thread Peter Bergner
On 8/27/23 9:06 PM, Kewen.Lin wrote:
> Assuming we only have ELFv2_ABI_CHECK in PCREL_SUPPORTED_BY_OS, we
> can have either TARGET_PCREL or !TARGET_PCREL after the checking.
> For the latter, it's fine and don't need any checks. For the former,
> if it's implicit, for !TARGET_PREFIXED we will clean it silently;
> while if it's explicit, for !TARGET_PREFIXED we will emit an error.
> TARGET_PREFIXED checking has considered Power10, so it's also
> concerned accordingly.
[snip]
> Yeah, looking forward to their opinions.  IMHO, with the current proposed
> change, pcrel doesn't look like a pure Power10 hardware feature, it also
> quite relies on ABIs, that's why I thought it seems good not to turn it
> on by default for Power10.

Ok, how about the patch below?  This removes OPTION_MASK_PCREL from the
power10 flags, so instead of our options override code needing to disable
PCREL on the systems that don't support it, we now enable it only on those
systems that do support it.

Jeevitha, can you test this patch to see whether it fixes the testsuite
issue caused by your earlier patch that was approved, but not yet pushed?
That was the use GPR2 for register allocation, correct?  Note, you'll need
to update the patch to replace the rs6000_pcrel_p() usage with just
TARGET_PCREL, since this patch removes rs6000_pcrel_p().

If testing is clean and everyone is OK with the patch, I'll officially
submit it for review with git log entry, etc.

Peter


gcc/
* config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Only test the ABI.
* config/rs6000/rs6000-cpus.def (RS6000_CPU): Remove OPTION_MASK_PCREL
from power10.
* config/rs6000/predicates.md: Use TARGET_PCREL.
* config/rs6000/rs6000-logue.cc (rs6000_decl_ok_for_sibcall): Likewise.
(rs6000_global_entry_point_prologue_needed_p): Likewise.
(rs6000_output_function_prologue): Likewise.
* config/rs6000/rs6000.md: Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Rework
the logic for enabling PCREL by default.
(rs6000_legitimize_tls_address): Use TARGET_PCREL.
(rs6000_call_template_1): Likewise.
(rs6000_indirect_call_template_1): Likewise.
(rs6000_longcall_ref): Likewise.
(rs6000_call_aix): Likewise.
(rs6000_sibcall_aix): Likewise.
(rs6000_pcrel_p): Remove.
* config/rs6000/rs6000-protos.h (rs6000_pcrel_p): Likewise.

diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 98b7255c95f..5b77bd7fd51 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -563,8 +563,5 @@ extern int dot_symbols;
 #define TARGET_FLOAT128_ENABLE_TYPE 1
 
 /* Enable using prefixed PC-relative addressing on POWER10 if the ABI
-   supports it.  The ELF v2 ABI only supports PC-relative relocations for
-   the medium code model.  */
-#define PCREL_SUPPORTED_BY_OS  (TARGET_POWER10 && TARGET_PREFIXED  \
-&& ELFv2_ABI_CHECK \
-&& TARGET_CMODEL == CMODEL_MEDIUM)
+   supports it.  */
+#define PCREL_SUPPORTED_BY_OS  (ELFv2_ABI_CHECK)
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 4f350da378c..fe01a2312ae 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -256,7 +256,8 @@ RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | 
ISA_2_7_MASKS_SERVER
| OPTION_MASK_HTM)
 RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER
| OPTION_MASK_HTM)
-RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 | 
ISA_3_1_MASKS_SERVER)
+RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64
+   | (ISA_3_1_MASKS_SERVER & ~OPTION_MASK_PCREL))
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, OPTION_MASK_PPC_GFXOPT
| MASK_POWERPC64)
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index ef7d3f214c4..0b76541fc0a 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1216,7 +1216,7 @@
 && SYMBOL_REF_DECL (op) != NULL
 && TREE_CODE (SYMBOL_REF_DECL (op)) == FUNCTION_DECL
 && (rs6000_fndecl_pcrel_p (SYMBOL_REF_DECL (op))
-!= rs6000_pcrel_p ()))")))
+!= TARGET_PCREL))")))
 
 ;; Return 1 if this operand is a valid input for a move insn.
 (define_predicate "input_operand"
diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index 98846f781ec..9e08d9bb4d2 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-lo

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-10 Thread Peter Bergner
On 8/25/23 6:20 AM, Kewen.Lin wrote:
> btw, I was also expecting that we don't implicitly set
> OPTION_MASK_PCREL any more for Power10, that is to remove
> OPTION_MASK_PCREL from OTHER_POWER10_MASKS.

So my patch removes the flag from the default power10 flags, like
you want.  However, it doesn't remove it from OTHER_POWER10_MASKS,
since that is used to set ISA_3_1_MASKS_SERVER and I didn't want
to change how rs6000_machine_from_flags() behaves, so instead, I
just explicitly mask it off when defining the power10 default flags.

Peter



Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-14 Thread Peter Bergner
On 11/13/23 8:33 PM, Kewen.Lin wrote:
>> if (PCREL_SUPPORTED_BY_OS)
> 
>> +  else
>> +{
>> +  if (TARGET_PCREL
>> +  && (rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
>> +error ("use of %qs is invalid for this target", "-mpcrel");
>>rs6000_isa_flags &= ~OPTION_MASK_PCREL;
>>  }
> 
> now, I think it should be fine not to explicitly mask it off Power10
> default flags?  Then leave Power10 default flags unchanged seems more
> consistent.  The others look good to me!  Thanks again.

Ah, good catch, yes, I'll remove the clearing of the bit, since we know
it's either already clear (because that is the default value) or if it's
set, then that's the error condition we're catching here.  That also means
that we can remove the test for rs6000_isa_flags_explicit & OPTION_MASK_PCREL) 
!= 0
too, since at this point, TARGET_PCREL can only be true if it was explicitly
enabled with -mpcrel by the user.  Therefor, the code can now look like:

  if (PCREL_SUPPORTED_BY_OS)

  else if (TARGET_PCREL)
error ("use of %qs is invalid for this target", "-mpcrel");

I'll make that change, redo the bootstrap and regtesting and then
officially submit the patch.  Thanks!

Peter



Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-14 Thread Peter Bergner
On 11/12/23 6:08 AM, Lehua Ding wrote:
> V3 Changes:
>   1. fix three ICE.
>   2. rebase


I tested this on powerpc64le-linux and powerpc64-linux.  The LE build
bootstrapped fine and it looks like only one testsuite FAIL which I have
to look into why it's FAILing.

The BE build did bootstrap, but the 32-bit and 64-bit testsuite runs both
had lots of FAILs (over 100 between them both) which I have yet to look
into what is happening.

I'll also note I have done no performance testing yet until I have an
idea of what the testsuite failures are.  I think a patch like this that
can affect the performance of all architectures needs some performance
testing to ensure we don't have unintended performance degradations.
I'll have someone on my team kick off some builds once I have a handle
on the testsuite FAILs.

Peter




Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-14 Thread Peter Bergner
On 11/13/23 11:37 PM, Lehua Ding wrote:
> On 2023/11/14 3:37, Vladimir Makarov wrote:
>> Also besides testing major targets I'd recommend testing at least one big
>> endian target (I'd recommend ppc64be. gcc110.fsfrance.org could be used
>> for this).  Plenty RA issues occur because BE targets are not tested.
> 
> You said the address looks a bit wrong, it should be this gcc110.fsffrance.org
> right? I looked for it and it looks like you have to go to portal.cfarm.net
> first to apply for an account on this site, I'll try that, thanks a lot.


The compile farm just went through with a domain name change, so the Power7 BE
gcc110.fsffrance.org system is now reachable via cfarm110.cfarm.net.
You are correct on the address for requesting a cfarm account.

That said, I posted results using your V3 patches for both LE and BE Power
in my other reply. 

Peter



[PATCH] rs6000: Only enable PCREL on supported ABIs [PR111045]

2023-11-14 Thread Peter Bergner
PCREL data accesses are only officially supported on ELFv2.  We currently
incorrectly enable PCREL on all Power10 compiles in which prefix instructions
are also enabled.  Rework the option override code so we only enable PCREL
for those ABIs that actually support it.

Jeevitha has confirmed this patch fixes the testsuite fallout seen with her
PR110320 patch.

This has been bootstrapped and regtested with no regressions on the following
builds: powerpc64le-linux, powerpc64le-linux --with-cpu=power10 and
powerpc64-linux - testsuite run in both 32-bit and 64-bit modes.
Ok for trunk?

Ok for the release branches after some burn-in on trunk?

Peter


gcc/
PR target/111045
* config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Only test the ABI.
* config/rs6000/rs6000-cpus.def (RS6000_CPU): Remove OPTION_MASK_PCREL
from power10.
* config/rs6000/predicates.md: Use TARGET_PCREL.
* config/rs6000/rs6000-logue.cc (rs6000_decl_ok_for_sibcall): Likewise.
(rs6000_global_entry_point_prologue_needed_p): Likewise.
(rs6000_output_function_prologue): Likewise.
* config/rs6000/rs6000.md: Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Rework
the logic for enabling PCREL by default.
(rs6000_legitimize_tls_address): Use TARGET_PCREL.
(rs6000_call_template_1): Likewise.
(rs6000_indirect_call_template_1): Likewise.
(rs6000_longcall_ref): Likewise.
(rs6000_call_aix): Likewise.
(rs6000_sibcall_aix): Likewise.
(rs6000_pcrel_p): Remove.
* config/rs6000/rs6000-protos.h (rs6000_pcrel_p): Likewise.

gcc/testsuite/
PR target/111045
* gcc.target/powerpc/pr111045.c: New test.
* gcc.target/powerpc/float128-constant.c: Add instruction counts for
non-pcrel compiles.

diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 98b7255c95f..5b77bd7fd51 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -563,8 +563,5 @@ extern int dot_symbols;
 #define TARGET_FLOAT128_ENABLE_TYPE 1
 
 /* Enable using prefixed PC-relative addressing on POWER10 if the ABI
-   supports it.  The ELF v2 ABI only supports PC-relative relocations for
-   the medium code model.  */
-#define PCREL_SUPPORTED_BY_OS  (TARGET_POWER10 && TARGET_PREFIXED  \
-&& ELFv2_ABI_CHECK \
-&& TARGET_CMODEL == CMODEL_MEDIUM)
+   supports it.  */
+#define PCREL_SUPPORTED_BY_OS  (ELFv2_ABI_CHECK)
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 4f350da378c..fe01a2312ae 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -256,7 +256,8 @@ RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | 
ISA_2_7_MASKS_SERVER
| OPTION_MASK_HTM)
 RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER
| OPTION_MASK_HTM)
-RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 | 
ISA_3_1_MASKS_SERVER)
+RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64
+   | (ISA_3_1_MASKS_SERVER & ~OPTION_MASK_PCREL))
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, OPTION_MASK_PPC_GFXOPT
| MASK_POWERPC64)
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index ef7d3f214c4..0b76541fc0a 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1216,7 +1216,7 @@
 && SYMBOL_REF_DECL (op) != NULL
 && TREE_CODE (SYMBOL_REF_DECL (op)) == FUNCTION_DECL
 && (rs6000_fndecl_pcrel_p (SYMBOL_REF_DECL (op))
-!= rs6000_pcrel_p ()))")))
+!= TARGET_PCREL))")))
 
 ;; Return 1 if this operand is a valid input for a move insn.
 (define_predicate "input_operand"
diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index 98846f781ec..9e08d9bb4d2 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -1106,7 +1106,7 @@ rs6000_decl_ok_for_sibcall (tree decl)
 r2 for its caller's TOC.  Such a function may make sibcalls to any
 function, whether local or external, without restriction based on
 TOC-save/restore rules.  */
-  if (rs6000_pcrel_p ())
+  if (TARGET_PCREL)
return true;
 
   /* Otherwise, under the AIX or ELFv2 ABIs we can't allow sibcalls
@@ -2583,7 +2583,7 @@ rs6000_global_entry_point_prologue_needed_p (void)
 return false;
 
   /* PC-relative functions never generate a global entry point prologue.  */
-  if (rs6000_pcrel_p ())
+  if (TARGET_PCREL)
 return f

Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-14 Thread Peter Bergner
On 11/14/23 9:12 PM, Lehua Ding wrote:
> I've applied for machine permissions on the compile farm, can you give
> me the way to compile and run tests on PPC64BE machine? I'll take a look
> at it too, thanks a lot.

That's an old system, with too old system libgmp, etc.  Let me attempt a
build there so I can give you correct build directions for that system.

That said, unfortunately, that system is currently almost out of available
disk space:

  [bergner@gcc1-power7 ~]$ df -h
  Filesystem  Size  Used Avail Use% Mounted on
  ...
  /dev/md41.6T  1.6T  9.0G 100% /home

Segher, can you please send out an admin note for people to clean up
unneeded space on cfarm110?  Thanks.

Peter




[PATCH] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]

2023-11-15 Thread Peter Bergner
PR109116 exposes an issue where using unspecs to access each vector component
of an opaque mode variable leads to unneeded register copies, because our rtl
optimizers cannot handle unspecs.  Instead, use subregs to access each vector
component of the opaque mode variable, which our optimizers know how to handle.

I did not include a test case with the patch, since writing a test case that
attempts to ensure we don't emit unneeded register copies is nearly impossible
since those copies can still be generated for reasons other than the causes
in this patch.  I have verified that this patch does improve code generation
for some unit tests and our AI libraries team has confirmed that performance
of their tests improved when using this patch.

This passed bootstrap and regtesting with no regressions on powerpc64le-linux
and powerpc64-linux.  Ok for trunk?

Peter


gcc/
PR target/109116
* config/rs6000/mma.md (vsx_disassemble_pair): Expand into a vector
register sized subreg.
* config/rs6000/mma.md (*vsx_disassemble_pair): Delete.
(mma_disassemble_acc): Expand into a vector register sized subreg.
(*mma_disassemble_acc): Delete.
* config/rs6000/rs6000.cc (rs6000_modes_tieable_p): Allow vector modes
to tie with OOmode.

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 575751d477e..2ca405469e2 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -398,29 +398,8 @@ (define_expand "vsx_disassemble_pair"
(match_operand 2 "const_0_to_1_operand")]
   "TARGET_MMA"
 {
-  rtx src;
-  int regoff = INTVAL (operands[2]);
-  src = gen_rtx_UNSPEC (V16QImode,
-   gen_rtvec (2, operands[1], GEN_INT (regoff)),
-   UNSPEC_MMA_EXTRACT);
-  emit_move_insn (operands[0], src);
-  DONE;
-})
-
-(define_insn_and_split "*vsx_disassemble_pair"
-  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
-   (unspec:V16QI [(match_operand:OO 1 "vsx_register_operand" "wa")
- (match_operand 2 "const_0_to_1_operand")]
- UNSPEC_MMA_EXTRACT))]
-  "TARGET_MMA
-   && vsx_register_operand (operands[1], OOmode)"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  int reg = REGNO (operands[1]);
-  int regoff = INTVAL (operands[2]);
-  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
+  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
+  rtx src = simplify_gen_subreg (V16QImode, operands[1], OOmode, regoff);
   emit_move_insn (operands[0], src);
   DONE;
 })
@@ -472,29 +451,8 @@ (define_expand "mma_disassemble_acc"
(match_operand 2 "const_0_to_3_operand")]
   "TARGET_MMA"
 {
-  rtx src;
-  int regoff = INTVAL (operands[2]);
-  src = gen_rtx_UNSPEC (V16QImode,
-   gen_rtvec (2, operands[1], GEN_INT (regoff)),
-   UNSPEC_MMA_EXTRACT);
-  emit_move_insn (operands[0], src);
-  DONE;
-})
-
-(define_insn_and_split "*mma_disassemble_acc"
-  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
-   (unspec:V16QI [(match_operand:XO 1 "fpr_reg_operand" "d")
- (match_operand 2 "const_0_to_3_operand")]
- UNSPEC_MMA_EXTRACT))]
-  "TARGET_MMA
-   && fpr_reg_operand (operands[1], XOmode)"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  int reg = REGNO (operands[1]);
-  int regoff = INTVAL (operands[2]);
-  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
+  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
+  rtx src = simplify_gen_subreg (V16QImode, operands[1], XOmode, regoff);
   emit_move_insn (operands[0], src);
   DONE;
 })
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5f56c3ed85b..f2efa46c147 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1964,9 +1964,12 @@ rs6000_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
 static bool
 rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
 {
-  if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
-  || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode)
-return mode1 == mode2;
+   if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
+   || mode2 == PTImode || mode2 == XOmode)
+ return mode1 == mode2;
+ 
+  if (mode2 == OOmode)
+return ALTIVEC_OR_VSX_VECTOR_MODE (mode1);
 
   if (ALTIVEC_OR_VSX_VECTOR_MODE (mode1))
 return ALTIVEC_OR_VSX_VECTOR_MODE (mode2);


[PATCH] diagnostics: Follow DECL_ABSTRACT_ORIGIN links in lhd_decl_printable_name [PR102061]

2024-07-02 Thread Peter Damianov
Currently, if a warning references a cloned function, the name of the cloned
function will be emitted in the "In function 'xyz'" part of the diagnostic,
which users aren't supposed to see. This patch follows the DECL_ABSTRACT_ORIGIN
links until encountering the original function.

gcc/ChangeLog:
PR diagnostics/102061
* langhooks.cc (lhd_decl_printable_name): Follow DECL_ABSTRACT_ORIGIN
links to the source

Signed-off-by: Peter Damianov 
---

I would add a testcase but I'm not familiar with that process, and would need
some help. I also did not bootstrap or test this patch, I'm posting to see if
the CI will do it for me.

I used "while" because I'm not sure if there can be clones of clones or not.
The second check is because I see comments elsewhere that say:
"DECL_ABSTRACT_ORIGIN can point to itself", so I want to avoid a potential
infinite loop.

 gcc/langhooks.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc
index 61f2b676256..89a89b74535 100644
--- a/gcc/langhooks.cc
+++ b/gcc/langhooks.cc
@@ -223,6 +223,8 @@ lhd_get_alias_set (tree ARG_UNUSED (t))
 const char *
 lhd_decl_printable_name (tree decl, int ARG_UNUSED (verbosity))
 {
+  while (DECL_ABSTRACT_ORIGIN(decl) && DECL_ABSTRACT_ORIGIN(decl) != decl)
+decl = DECL_ABSTRACT_ORIGIN(decl);
   gcc_assert (decl && DECL_NAME (decl));
   return IDENTIFIER_POINTER (DECL_NAME (decl));
 }
-- 
2.39.2



[PATCH v2] diagnostics: Follow DECL_ORIGIN in lhd_decl_printable_name [PR102061]

2024-07-03 Thread Peter Damianov
Currently, if a warning references a cloned function, the name of the cloned
function will be emitted in the "In function 'xyz'" part of the diagnostic,
which users aren't supposed to see. This patch follows the DECL_ORIGIN link
to get the name of the original function.

gcc/ChangeLog:
PR diagnostics/102061
* langhooks.cc (lhd_decl_printable_name): Follow DECL_ORIGIN
link

Signed-off-by: Peter Damianov 
---
v2: use DECL_ORIGIN instead of DECL_ABSTRACT_ORIGIN and remove loop

 gcc/langhooks.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc
index 61f2b676256..943b8345a95 100644
--- a/gcc/langhooks.cc
+++ b/gcc/langhooks.cc
@@ -223,6 +223,7 @@ lhd_get_alias_set (tree ARG_UNUSED (t))
 const char *
 lhd_decl_printable_name (tree decl, int ARG_UNUSED (verbosity))
 {
+  decl = DECL_ORIGIN(decl);
   gcc_assert (decl && DECL_NAME (decl));
   return IDENTIFIER_POINTER (DECL_NAME (decl));
 }
-- 
2.39.2



Re: [PATCH] rs6000: ROP - Emit hashst and hashchk insns on Power8 and later [PR114759]

2024-07-03 Thread Peter Bergner
On 7/3/24 4:01 AM, Kewen.Lin wrote:
>> -  if (TARGET_POWER10
>> +  if (TARGET_POWER8
>>&& info->calls_p
>>&& DEFAULT_ABI == ABI_ELFv2
>>&& rs6000_rop_protect)
> 
> Nit: I noticed that this is the only place to change
> info->rop_hash_size to non-zero, and ...
> 
>> @@ -3277,7 +3277,7 @@ rs6000_emit_prologue (void)
>>/* NOTE: The hashst isn't needed if we're going to do a sibcall,
>>   but there's no way to know that here.  Harmless except for
>>   performance, of course.  */
>> -  if (TARGET_POWER10 && rs6000_rop_protect && info->rop_hash_size != 0)
>> +  if (TARGET_POWER8 && rs6000_rop_protect && info->rop_hash_size != 0)
> 
> ... this condition and ...
> 
>>  {
>>gcc_assert (DEFAULT_ABI == ABI_ELFv2);
>>rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
>> @@ -5056,7 +5056,7 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
>>  
>>/* The ROP hash check must occur after the stack pointer is restored
>>   (since the hash involves r1), and is not performed for a sibcall.  */
>> -  if (TARGET_POWER10
>> +  if (TARGET_POWER8>&& rs6000_rop_protect
>>&& info->rop_hash_size != 0
> 
> ... here, both check info->rop_hash_size isn't zero, I think we can drop these
> two TARGET_POWER10 (TARGET_POWER8) and rs6000_rop_protect checks?  Instead 
> just
> update the inner gcc_assert (now checking DEFAULT_ABI == ABI_ELFv2) by extra
> checkings on TARGET_POWER8 && rs6000_rop_protect?
> 
> The other looks good to me, ok for trunk with this nit tweaked (if you agree
> with it and re-tested well), thanks!

I agree with you, because the next patch I haven't submitted yet (waiting
on this to get in), makes that simplification as part of the adding earlier
checking of invalid options. :-)  The follow-on patch will not only remove
the TARGET_* and the 2nd/3rd rs6000_rop_protect usage, but will also remove
the test and asserts of ELFv2...because we've already verified valid option
usage earlier in the normal options handling code.

Therefore, I'd like to keep this patch as simple as possible and limited to
the TARGET_POWER10 -> TARGET_POWER8 change and the cleanup of those tests is
coming in the next patch...which has already been tested.

Peter




[PING*2][PATCH v2] rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

2024-07-03 Thread Peter Bergner
Ping * 2.   [Message-ID: <1e003d78-3b2e-4263-830a-7c00a3e9d...@linux.ibm.com>]

Segher, this resolves the issues you mentioned in your review.

Peter



On 6/18/24 5:59 PM, Peter Bergner wrote:
> Updated patch.  This passed bootstrap and regtesting on powerpc64le-linux
> with no regressions.  Ok for trunk?
>
> Changes from v1:
> 1. Moved the disabling of shrink-wrapping to rs6000_emit_prologue
>and beefed up comment.  Used a more accurate test.
> 2. Added comment to the test case on why rop_ok is needed.
>
> Peter
>
>
> rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]
>
> Only disable shrink-wrapping when using -mrop-protect when we know we
> will be emitting the ROP-protect hash instructions (ie, non-leaf functions).
>
> 2024-06-17  Peter Bergner  
>
> gcc/
>   PR target/114759
>   * config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move
>   the disabling of shrink-wrapping from here
>   * config/rs6000/rs6000-logue.cc (rs6000_emit_prologue): ...to here.
>
> gcc/testsuite/
>   PR target/114759
>   * gcc.target/powerpc/pr114759-1.c: New test.
> ---
>  gcc/config/rs6000/rs6000-logue.cc |  5 +
>  gcc/config/rs6000/rs6000.cc   |  4 
>  gcc/testsuite/gcc.target/powerpc/pr114759-1.c | 16 
>  3 files changed, 21 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-1.c
>
> diff --git a/gcc/config/rs6000/rs6000-logue.cc 
> b/gcc/config/rs6000/rs6000-logue.cc
> index 193e2122c0f..c384e48e378 100644
> --- a/gcc/config/rs6000/rs6000-logue.cc
> +++ b/gcc/config/rs6000/rs6000-logue.cc
> @@ -3018,6 +3018,11 @@ rs6000_emit_prologue (void)
> && (lookup_attribute ("no_split_stack",
>   DECL_ATTRIBUTES 
> (cfun->decl))
> == NULL));
> +  /* If we are inserting ROP-protect hash instructions, disable shrink-wrap
> + until the bug where the hashst insn is emitted in the wrong location
> + is fixed.  See PR101324 for details.  */
> +  if (info->rop_hash_size)
> +flag_shrink_wrap = 0;
>  
>frame_pointer_needed_indeed
>  = frame_pointer_needed && df_regs_ever_live_p 
> (HARD_FRAME_POINTER_REGNUM);
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index e4dc629ddcc..fd6e013c346 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -3427,10 +3427,6 @@ rs6000_override_options_after_change (void)
>  }
>else if (!OPTION_SET_P (flag_cunroll_grow_size))
>  flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
> -
> -  /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
> -  if (rs6000_rop_protect)
> -flag_shrink_wrap = 0;
>  }
>  
>  #ifdef TARGET_USES_LINUX64_OPT
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-1.c 
> b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c
> new file mode 100644
> index 000..579e08e920f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect 
> -fdump-rtl-pro_and_epilogue" } */
> +/* { dg-require-effective-target rop_ok } Only enable on supported ABIs. */
> +
> +/* Verify we still attempt shrink-wrapping when using -mrop-protect
> +   and there are no function calls.  */
> +
> +long
> +foo (long arg)
> +{
> +  if (arg)
> +asm ("" ::: "r20");
> +  return 0;
> +}
> +
> +/* { dg-final { scan-rtl-dump-times "Performing shrink-wrapping" 1 
> "pro_and_epilogue" } } */



[PING*3][PATCH v2] rs6000: ROP - Do not disable shrink-wrapping for, leaf functions [PR114759]

2024-07-09 Thread Peter Bergner
Ping * 3.   [Message-ID: <1e003d78-3b2e-4263-830a-7c00a3e9d...@linux.ibm.com>]

Segher, this resolves the issues you mentioned in your review.

Peter



On 6/18/24 5:59 PM, Peter Bergner wrote:
> Updated patch.  This passed bootstrap and regtesting on powerpc64le-linux
> with no regressions.  Ok for trunk?
>
> Changes from v1:
> 1. Moved the disabling of shrink-wrapping to rs6000_emit_prologue
>and beefed up comment.  Used a more accurate test.
> 2. Added comment to the test case on why rop_ok is needed.
>
> Peter
>
>
> rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]
>
> Only disable shrink-wrapping when using -mrop-protect when we know we
> will be emitting the ROP-protect hash instructions (ie, non-leaf functions).
>
> 2024-06-17  Peter Bergner  
>
> gcc/
>   PR target/114759
>   * config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move
>   the disabling of shrink-wrapping from here
>   * config/rs6000/rs6000-logue.cc (rs6000_emit_prologue): ...to here.
>
> gcc/testsuite/
>   PR target/114759
>   * gcc.target/powerpc/pr114759-1.c: New test.
> ---
>  gcc/config/rs6000/rs6000-logue.cc |  5 +
>  gcc/config/rs6000/rs6000.cc   |  4 
>  gcc/testsuite/gcc.target/powerpc/pr114759-1.c | 16 
>  3 files changed, 21 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-1.c
>
> diff --git a/gcc/config/rs6000/rs6000-logue.cc 
> b/gcc/config/rs6000/rs6000-logue.cc
> index 193e2122c0f..c384e48e378 100644
> --- a/gcc/config/rs6000/rs6000-logue.cc
> +++ b/gcc/config/rs6000/rs6000-logue.cc
> @@ -3018,6 +3018,11 @@ rs6000_emit_prologue (void)
> && (lookup_attribute ("no_split_stack",
>   DECL_ATTRIBUTES 
> (cfun->decl))
> == NULL));
> +  /* If we are inserting ROP-protect hash instructions, disable shrink-wrap
> + until the bug where the hashst insn is emitted in the wrong location
> + is fixed.  See PR101324 for details.  */
> +  if (info->rop_hash_size)
> +flag_shrink_wrap = 0;
>  
>frame_pointer_needed_indeed
>  = frame_pointer_needed && df_regs_ever_live_p 
> (HARD_FRAME_POINTER_REGNUM);
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index e4dc629ddcc..fd6e013c346 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -3427,10 +3427,6 @@ rs6000_override_options_after_change (void)
>  }
>else if (!OPTION_SET_P (flag_cunroll_grow_size))
>  flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
> -
> -  /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
> -  if (rs6000_rop_protect)
> -flag_shrink_wrap = 0;
>  }
>  
>  #ifdef TARGET_USES_LINUX64_OPT
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-1.c 
> b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c
> new file mode 100644
> index 000..579e08e920f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect 
> -fdump-rtl-pro_and_epilogue" } */
> +/* { dg-require-effective-target rop_ok } Only enable on supported ABIs. */
> +
> +/* Verify we still attempt shrink-wrapping when using -mrop-protect
> +   and there are no function calls.  */
> +
> +long
> +foo (long arg)
> +{
> +  if (arg)
> +asm ("" ::: "r20");
> +  return 0;
> +}
> +
> +/* { dg-final { scan-rtl-dump-times "Performing shrink-wrapping" 1 
> "pro_and_epilogue" } } */




[PATCH] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]

2024-07-09 Thread Peter Bergner
Hi Kewen,

Here is that promised cleanup patch we discussed in the previous patch review.
I'll note this patch is dependent on the previous patch you approved.  I have
not pushed that yet (in case you looked) since I'm waiting on Segher to approve
the updated patch for not disabling shrink-wrapping for leaf-functions in the
presence of -mrop-protect first.


We currently silently ignore the -mrop-protect option for old CPUs we don't
support with the ROP hash insns, but we throw an error for unsupported ABIs.
This patch treats unsupported CPUs and ABIs similarly by throwing an error
for both.  This matches clang behavior and allows us to simplify our ROP
tests in the code that generates our prologue and epilogue code.

This passed bootstrap and regtesting on powerpc64le-linux and powerpc64-linux
with no regressions.  Ok for trunk?

I'll note I did not create a test case for unsupported ABIs, since I'll be
working on adding ROP support for powerpc-linux and powerpc64-linux next.

Peter



2024-06-26  Peter Bergner  

gcc/
PR target/114759
* config/rs6000/rs6000.cc (rs6000_override_options_after_change):
Disallow CPUs and ABIs that do no support the ROP protection insns.
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Remove now
unneeded tests.
(rs6000_emit_prologue): Likewise.
Remove unneeded gcc_assert.
(rs6000_emit_epilogue): Likewise.
* config/rs6000/rs6000.md: Likewise.

gcc/testsuite/
PR target/114759
* gcc.target/powerpc/pr114759-3.c: New test.
---
 gcc/config/rs6000/rs6000.cc   | 11 ++
 gcc/config/rs6000/rs6000-logue.cc | 22 +--
 gcc/config/rs6000/rs6000.md   |  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr114759-3.c | 19 
 4 files changed, 38 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-3.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index fd6e013c346..e9642fd5310 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3427,6 +3427,17 @@ rs6000_override_options_after_change (void)
 }
   else if (!OPTION_SET_P (flag_cunroll_grow_size))
 flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
+
+  if (rs6000_rop_protect)
+{
+  /* Disallow CPU targets we don't support.  */
+  if (!TARGET_POWER8)
+   error ("%<-mrop-protect%> requires %<-mcpu=power8%> or later");
+
+  /* Disallow ABI targets we don't support.  */
+  if (DEFAULT_ABI != ABI_ELFv2)
+   error ("%<-mrop-protect%> requires the ELFv2 ABI");
+}
 }
 
 #ifdef TARGET_USES_LINUX64_OPT
diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index bd363b625a4..fdb6414f486 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -716,17 +716,11 @@ rs6000_stack_info (void)
   info->calls_p = (!crtl->is_leaf || cfun->machine->ra_needs_full_frame);
   info->rop_hash_size = 0;
 
-  if (TARGET_POWER8
-  && info->calls_p
-  && DEFAULT_ABI == ABI_ELFv2
-  && rs6000_rop_protect)
+  /* If we want ROP protection and this function makes a call, indicate
+ we need to create a stack slot to save the hashed return address in.  */
+  if (rs6000_rop_protect
+  && info->calls_p)
 info->rop_hash_size = 8;
-  else if (rs6000_rop_protect && DEFAULT_ABI != ABI_ELFv2)
-{
-  /* We can't check this in rs6000_option_override_internal since
-DEFAULT_ABI isn't established yet.  */
-  error ("%qs requires the ELFv2 ABI", "-mrop-protect");
-}
 
   /* Determine if we need to save the condition code registers.  */
   if (save_reg_p (CR2_REGNO)
@@ -3277,9 +3271,8 @@ rs6000_emit_prologue (void)
   /* NOTE: The hashst isn't needed if we're going to do a sibcall,
  but there's no way to know that here.  Harmless except for
  performance, of course.  */
-  if (TARGET_POWER8 && rs6000_rop_protect && info->rop_hash_size != 0)
+  if (info->rop_hash_size)
 {
-  gcc_assert (DEFAULT_ABI == ABI_ELFv2);
   rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
   rtx addr = gen_rtx_PLUS (Pmode, stack_ptr,
   GEN_INT (info->rop_hash_save_offset));
@@ -5056,12 +5049,9 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   /* The ROP hash check must occur after the stack pointer is restored
  (since the hash involves r1), and is not performed for a sibcall.  */
-  if (TARGET_POWER8
-  && rs6000_rop_protect
-  && info->rop_hash_size != 0
+  if (info->rop_hash_size
   && epilogue_type != EPILOGUE_TYPE_SIBCALL)
 {
-  gcc_assert (DEFAULT_ABI == ABI_ELFv2);

Re: [PATCH] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]

2024-07-10 Thread Peter Bergner
On 7/10/24 1:01 AM, Kewen.Lin wrote:
>> +  if (rs6000_rop_protect)
>> +{
>> +  /* Disallow CPU targets we don't support.  */
>> +  if (!TARGET_POWER8)
>> +error ("%<-mrop-protect%> requires %<-mcpu=power8%> or later");
>> +
>> +  /* Disallow ABI targets we don't support.  */
>> +  if (DEFAULT_ABI != ABI_ELFv2)
>> +error ("%<-mrop-protect%> requires the ELFv2 ABI");
>> +}
> 
> I wonder if there is some reason to put this hunk here.  IMHO we want the hunk
> in rs6000_option_override_internal instead since no optimization options can
> affect cpu type and DEFAULT_ABI? 

So the original code that used to disable shrink-wrapping that looked like:

  /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
  if (rs6000_rop_protect)
flag_shrink_wrap = 0;

...used to be in rs6000_option_override_internal, but was moved to
rs6000_override_options_after_change as part of PR101324 (commit cff7879a381d).
I guess I just placed this code here since it was the correct location for
that old usage (it's changed in the patch I'm waiting for Segher to re-review),
but I will look at moving it to rs6000_option_override_internal.  I think I
thought we could use a target attribute to enable -mrop-protect, but looking
closer, we don't actually allow that option there. 



> And we probably want to unset rs6000_rop_protect to align with the handlings
> on other options?

I'm not sure I know what you mean?  Why would we unset rs6000_rop_protect?
Either we've concluded the current target options allow ROP code gen or not
and for the cases where we don't/can't allow ROP, we want to give the user
and error to match clang's behavior and how we already handle unsupported
ABIs.  So what is it you're trying to describe here?

Peter



[PATCH v2] rs6000: Fix .machine cpu selection w/ altivec [PR97367]

2024-07-12 Thread Peter Bergner
René's patch seems to have stalled, so here is an updated version of the
patch with the requested changes to his patch.

I'll note I have added an additional code change, which is to also emit a
".machine altivec" if Altivec is enabled.  The problem this fixes is for
cpus like the G5, which is basically a power4 plus an Altivec unit, its
".machine power4" doesn't enable the assembler to recognize Altivec insns.
That isn't a problem if you use gcc -mcpu=G5 to assemble the assembler file,
since gcc passes -maltivec to the assembler.  However, if you try to assemble
the assembler file with as by hand, you'll get "unrecognized opcode" errors.
I did not do the same for VSX, since all ".machine " for cpus that
support VSX already enable VSX insn recognition, so it's not needed.


rs6000: Fix .machine cpu selection w/ altivec [PR97367]

There are various non-IBM CPUs with altivec, so we cannot use that
flag to determine which .machine cpu to use, so ignore it.
Emit an additional ".machine altivec" if Altivec is enabled so
that the assembler doesn't require an explicit -maltivec option
to assemble any Altivec instructions for those targets where
the ".machine cpu" is insufficient to enable Altivec.  For example,
-mcpu=G5 emits a ".machine power4".

This passed bootstrap and regtesting on powrpc64-linux (running the testsuite
in both 32-bit and 64-bit modes) with no regressions.

Ok for trunk and the release branches after some trunk burn-in time?

Peter


2024-07-12  René Rebe  
Peter Bergner  

gcc/
PR target/97367
* config/rs6000/rs6000.c (rs6000_machine_from_flags): Do not consider
OPTION_MASK_ALTIVEC.
(emit_asm_machine): For Altivec compiles, emit a ".machine altivec".

gcc/testsuite/
PR target/97367
* gcc.target/powerpc/pr97367.c: New test.

Signed-of-by: René Rebe 
---
 gcc/config/rs6000/rs6000.cc|  5 -
 gcc/testsuite/gcc.target/powerpc/pr97367.c | 13 +
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97367.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 2cbea6ea2d7..2cb8f35739b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -5888,7 +5888,8 @@ rs6000_machine_from_flags (void)
   HOST_WIDE_INT flags = rs6000_isa_flags;
 
   /* Disable the flags that should never influence the .machine selection.  */
-  flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | 
OPTION_MASK_ISEL);
+  flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | OPTION_MASK_ISEL
+| OPTION_MASK_ALTIVEC);
 
   if ((flags & (ISA_3_1_MASKS_SERVER & ~ISA_3_0_MASKS_SERVER)) != 0)
 return "power10";
@@ -5913,6 +5914,8 @@ void
 emit_asm_machine (void)
 {
   fprintf (asm_out_file, "\t.machine %s\n", rs6000_machine);
+  if (TARGET_ALTIVEC)
+fprintf (asm_out_file, "\t.machine altivec\n");
 }
 #endif
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr97367.c 
b/gcc/testsuite/gcc.target/powerpc/pr97367.c
new file mode 100644
index 000..f9118dbcdec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr97367.c
@@ -0,0 +1,13 @@
+/* PR target/97367 */
+/* { dg-options "-mdejagnu-cpu=G5" } */
+
+/* Verify we emit a ".machine power4" and ".machine altivec" rather
+   than a ".machine power7".  */
+
+int dummy (void)
+{
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\.\mmachine power4\M} } } */
+/* { dg-final { scan-assembler {\.\mmachine altivec\M} } } */
-- 
2.45.2



Re: [PATCH] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]

2024-07-15 Thread Peter Bergner
On 7/11/24 1:24 AM, Kewen.Lin wrote:
> Sorry for the confusion, I meant for most target options when we emit some 
> error
> message meanwhile we also unset it, such as:
> 
>   if (TARGET_CRYPTO && !TARGET_ALTIVEC)
> {
>   if (rs6000_isa_flags_explicit & OPTION_MASK_CRYPTO)
>   error ("%qs requires %qs", "-mcrypto", "-maltivec");
>   rs6000_isa_flags &= ~OPTION_MASK_CRYPTO;
> }

That is not what is happening here though.  The code here to disable
crypto is for the case where crypto is enabled implicitly, via say
-mcpu=power8, but the user has also explicitly disabled Altivec which
crypto depends on.  In that case, we do not emit an error or warning and
we silently disable crypto.  This is similar to -mcpu=power8 -mno-altivec
where we silently disable VSX.  The other cases you showed are of similar
scenarios.

In my ROP code, we know ROP was explicitly enabled (it is never turned on
implicitly with any -mcpu= option) and the target cpu and/or ABI does
not support it, so there's nothing more to do, other than to emit an error.
There is no matching implicit use case where we silently disable ROP as
there was in your case above.  Therefore, I think the code as I showed it
is correct as is...other than the code snipit location, which I have moved
and am testing now.

Peter





[PATCH v2] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]

2024-07-15 Thread Peter Bergner
Hi Kewen,

Here's the updated patch per your review comments, minus your suggestion
to disable the ROP mask which I mentioned isn't needed in my other reply.

This passed bootstrap and regtesting with no regressions on powerpc64le-linux.
Ok for trunk?

Peter


Changes from v1:
  * Moved checks for invalid targets from rs6000_override_options_after_change
to rs6000_option_override_internal.


rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns 
[PR114759]

We currently silently ignore the -mrop-protect option for old CPUs we don't
support with the ROP hash insns, but we throw an error for unsupported ABIs.
This patch treats unsupported CPUs and ABIs similarly by throwing an error
both both.  This matches clang behavior and allows us to simplify our tests
in the code that generates our prologue and epilogue code.

2024-07-15  Peter Bergner  

gcc/
PR target/114759
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Disallow
CPUs and ABIs that do no support the ROP protection insns.
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Remove now
unneeded tests.
(rs6000_emit_prologue): Likewise.
Remove unneeded gcc_assert.
(rs6000_emit_epilogue): Likewise.
* config/rs6000/rs6000.md: Likewise.

gcc/testsuite/
PR target/114759
* gcc.target/powerpc/pr114759-3.c: New test.
---
 gcc/config/rs6000/rs6000-logue.cc | 22 +--
 gcc/config/rs6000/rs6000.cc   | 12 ++
 gcc/config/rs6000/rs6000.md   |  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr114759-3.c | 19 
 4 files changed, 39 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-3.c

diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index bd363b625a4..fdb6414f486 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -716,17 +716,11 @@ rs6000_stack_info (void)
   info->calls_p = (!crtl->is_leaf || cfun->machine->ra_needs_full_frame);
   info->rop_hash_size = 0;
 
-  if (TARGET_POWER8
-  && info->calls_p
-  && DEFAULT_ABI == ABI_ELFv2
-  && rs6000_rop_protect)
+  /* If we want ROP protection and this function makes a call, indicate
+ we need to create a stack slot to save the hashed return address in.  */
+  if (rs6000_rop_protect
+  && info->calls_p)
 info->rop_hash_size = 8;
-  else if (rs6000_rop_protect && DEFAULT_ABI != ABI_ELFv2)
-{
-  /* We can't check this in rs6000_option_override_internal since
-DEFAULT_ABI isn't established yet.  */
-  error ("%qs requires the ELFv2 ABI", "-mrop-protect");
-}
 
   /* Determine if we need to save the condition code registers.  */
   if (save_reg_p (CR2_REGNO)
@@ -3277,9 +3271,8 @@ rs6000_emit_prologue (void)
   /* NOTE: The hashst isn't needed if we're going to do a sibcall,
  but there's no way to know that here.  Harmless except for
  performance, of course.  */
-  if (TARGET_POWER8 && rs6000_rop_protect && info->rop_hash_size != 0)
+  if (info->rop_hash_size)
 {
-  gcc_assert (DEFAULT_ABI == ABI_ELFv2);
   rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
   rtx addr = gen_rtx_PLUS (Pmode, stack_ptr,
   GEN_INT (info->rop_hash_save_offset));
@@ -5056,12 +5049,9 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   /* The ROP hash check must occur after the stack pointer is restored
  (since the hash involves r1), and is not performed for a sibcall.  */
-  if (TARGET_POWER8
-  && rs6000_rop_protect
-  && info->rop_hash_size != 0
+  if (info->rop_hash_size
   && epilogue_type != EPILOGUE_TYPE_SIBCALL)
 {
-  gcc_assert (DEFAULT_ABI == ABI_ELFv2);
   rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
   rtx addr = gen_rtx_PLUS (Pmode, stack_ptr,
   GEN_INT (info->rop_hash_save_offset));
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index fd6e013c346..1cee9c2011d 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4825,6 +4825,18 @@ rs6000_option_override_internal (bool global_init_p)
}
 }
 
+  /* We only support ROP protection on certain targets.  */
+  if (rs6000_rop_protect)
+{
+  /* Disallow CPU targets we don't support.  */
+  if (!TARGET_POWER8)
+   error ("%<-mrop-protect%> requires %<-mcpu=power8%> or later");
+
+  /* Disallow ABI targets we don't support.  */
+  if (DEFAULT_ABI != ABI_ELFv2)
+   error ("%<-mrop-protect%> requires the ELFv2 ABI");
+}
+
   /* Initialize all of the registers.  */
   rs6

Re: [PATCH] rs6000, update effective target for tests builtins-10*.c and, vec_perm-runnable-i128.c

2024-07-15 Thread Peter Bergner
On 7/15/24 5:43 PM, Carl Love wrote:
> -/* { dg-do run } */
> +/* { dg-do run { target { lp64 } && { int128 } } } */

Why isn't this just:

  /* { dg-do run { target int128 } } */

???   The int128 test should disable this on 32-bit systems just fine.

Peter




Re: [PATCH] rs6000: Error on CPUs and ABIs that don't support the ROP, protection insns [PR114759]

2024-07-15 Thread Peter Bergner
On 7/15/24 9:19 PM, Kewen.Lin wrote:
> on 2024/7/16 04:30, Peter Bergner wrote:
>> On 7/11/24 1:24 AM, Kewen.Lin wrote:
>>> Sorry for the confusion, I meant for most target options when we emit some 
>>> error
>>> message meanwhile we also unset it, such as:
>>>
>>>   if (TARGET_CRYPTO && !TARGET_ALTIVEC)
>>> {
>>>   if (rs6000_isa_flags_explicit & OPTION_MASK_CRYPTO)
>>> error ("%qs requires %qs", "-mcrypto", "-maltivec");
>>>   rs6000_isa_flags &= ~OPTION_MASK_CRYPTO;
>>> }
>>
>> That is not what is happening here though.  The code here to disable
>> crypto is for the case where crypto is enabled implicitly, via say
>> -mcpu=power8, but the user has also explicitly disabled Altivec which
>> crypto depends on.  In that case, we do not emit an error or warning and
> 
> But if it's just for the case implicit enabling, the unmask should be
> in an else arm and not for both implicit and explicit, the code still does
> unmasking for explicit enabling.   Since it's very unlikely to cause some
> unexpected behaviors in following processing even in future, it's your call. 
> :)

I think that is just Mike's coding style.  Yes, the clearing of the flag
could be in an else block, but clearing it even in the error case is
harmless, since we're going to emit an error and exit anyway.

Peter




Re: [PATCH ver 2] rs6000, update effective target for tests builtins-10*.c and, vec_perm-runnable-i128.c

2024-07-16 Thread Peter Bergner
On 7/16/24 6:19 PM, Carl Love wrote:
> use __int128 types that are not supported on all platforms.  The
> __int128 type is only supported on 64-bit platforms.  Need to check that
> the platform is 64-bits and support the __int128 type.  Add the int128 and
> lp64 flags to the target test.

The test cases themselves look good, but you need to update your git log entry
to not mention the lp64/64-bits since you removed them.  Yes, currently, only
64-bit targets support __int128, but our hope is that one day, even 32-bit
targets will as well.  So how about the following text instead?


...
use __int128 types that are not supported on all platforms.  Update the
tests to check int128 effective target to avoid unsupported type errors
on unsupported platforms.


Peter


Re: [PATCH] rs6000: Relax some FLOAT128 expander condition for FLOAT128_IEEE_P [PR105359]

2024-07-17 Thread Peter Bergner
On 7/17/24 4:09 AM, Kewen.Lin wrote:
>   * config/rs6000/rs6000.md (@extenddf2): Remove condition
>   TARGET_LONG_DOUBLE_128 for FLOAT128_IEEE_P modes.

This all LGTM, except this ChangeLog fragment doesn't match the code changes
below.  Rather than removing TARGET_LONG_DOUBLE_128, you've added
FLOAT128_IEEE_P (mode)).


> -  "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128"
> +  "TARGET_HARD_FLOAT
> +   && (TARGET_LONG_DOUBLE_128 || FLOAT128_IEEE_P (mode))"


Peter



Re: [PATCH] rs6000: Relax some FLOAT128 expander condition for FLOAT128_IEEE_P [PR105359]

2024-07-18 Thread Peter Bergner
On 7/18/24 12:19 AM, Kewen.Lin wrote:
> I guess you meant "Remove" is expected to remove some code explicitly and
> can be misleading here, if so how about "Don't check TARGET_LONG_DOUBLE_128
> for FLOAT128_IEEE_P modes"?

Yeah, I think that is more clear.  Thanks!

Peter




Re: [PATCH v2] rs6000: Fix .machine cpu selection w/ altivec [PR97367]

2024-07-18 Thread Peter Bergner
On 7/18/24 4:14 AM, Kewen.Lin wrote:
>> +/* { dg-final { scan-assembler {\.\mmachine power4\M} } } */
>> +/* { dg-final { scan-assembler {\.\mmachine altivec\M} } } */
> 
> Nit: Both \m looks useless and can be removed.

Fine with me.  Is that because the \. acts like a \m?



>> Ok for trunk and the release branches after some trunk burn-in time?
> 
> OK for all with/without the below minor nit tweaked.

Great, thanks!


Peter




Re: [PATCH 3/3] Add power11 tests

2024-07-18 Thread Peter Bergner
On 7/18/24 8:23 AM, Segher Boessenkool wrote:
> Everything in gcc.target/powerpc is only run for target powerpc*-*-*
> anyway, so make this just
> 
> /* { dg-do compile } */
> 
> please.  (Or nothing, it is the default after all, but you may want to
> have it explicit).

Personally, I do like seeing the /* { dg-do compile } */ even though
that is the default in this testsuite directory.




>> +/* { dg-require-effective-target powerpc_vsx_ok } */
> 
> Why is this needed?  It needs a comment saying that.

After Kewen's testsuite patch, powerpc_vsx_ok is pretty useless as it only
checks whether the assembler knows about vsx, and what people normally want
is powerpc_vsx which verifies that the compiler options used to compile the
test support generating VSX insns.  That said, do we really need that here?
Can't we compile this test case with -mcpu=power4 as the default (ie, no
VSX) and the clones should be able to generate power* and VSX just fine,
correct?  I don't understand the requirement on VSX here.




> Saying that this is "to test if can use the target attr" is nonsense.
> That does not need Linux either btw.  Target selection might not work
> correctly currently on some other systems, but this is not a run test!

Agreed.  Also, if the clones stuff really doesn't work for those
targets even in a compile test, then rather than diabling this way,
I think I'd like to see a target-supports clones_ok test or similar.

Peter




Re: [PATCH v2] rs6000: Fix .machine cpu selection w/ altivec [PR97367]

2024-07-18 Thread Peter Bergner
On 7/18/24 9:14 AM, Peter Bergner wrote:
> On 7/18/24 4:14 AM, Kewen.Lin wrote:
>>> +/* { dg-final { scan-assembler {\.\mmachine power4\M} } } */
>>> +/* { dg-final { scan-assembler {\.\mmachine altivec\M} } } */
>>
>> Nit: Both \m looks useless and can be removed.
> 
> Fine with me.  Is that because the \. acts like a \m?
> 
> 
> 
>>> Ok for trunk and the release branches after some trunk burn-in time?
>>
>> OK for all with/without the below minor nit tweaked.

Ok, I pushed the commit with the tweak to trunk and will let it burn-in
there for a bit before backporting to the release branches.

Thanks René and Kewen!

Peter



[PATCH, Obvious] rs6000: Catch unsupported ABI errors when using -mrop-protect [PR114759,PR115988]

2024-07-18 Thread Peter Bergner
I'm not sure how my testing of the pr114759-3.c test case didn't see
the excess errors on BE, as I did test there.  Anyway, the following
obvious patches fixes the problem Bill saw.  Tested on LE and 32-bit
and 64-bit BE. 

I'll push this tomorrow if there are no objections.

Peter


rs6000: Catch unsupported ABI errors when using -mrop-protect 
[PR114759,PR115988]

2024-07-18  Peter Bergner  

gcc/testsuite/
PR target/114759
PR target/115988
* gcc.target/powerpc/pr114759-3.c: Catch unsupported ABI errors.
---
 gcc/testsuite/gcc.target/powerpc/pr114759-3.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-3.c 
b/gcc/testsuite/gcc.target/powerpc/pr114759-3.c
index 6770a9aec3b..e2f1d42e111 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr114759-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr114759-3.c
@@ -2,7 +2,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -mdejagnu-cpu=power7 -mrop-protect" } */
 
-/* Verify we emit an error if we use -mrop-protect with an unsupported cpu.  */
+/* Verify we emit an error if we use -mrop-protect with an unsupported cpu
+   or ABI.  */
 
 extern void foo (void);
 
@@ -17,3 +18,4 @@ bar (void)
in the final line (which is all that dg-error inspects). Hence, we have
to tell dg-error to ignore the line number.  */
 /* { dg-error "'-mrop-protect' requires '-mcpu=power8'" "PR114759" { target 
*-*-* } 0 } */
+/* { dg-error "'-mrop-protect' requires the ELFv2 ABI" "PR114759" { target { ! 
rop_ok } } 0 } */


Re: [REPOST 1/3] Add support for -mcpu=power11

2024-07-19 Thread Peter Bergner
On 7/19/24 12:34 PM, Michael Meissner wrote:
> On Thu, Jul 18, 2024 at 08:08:44AM -0500, Segher Boessenkool wrote:
>>> --- a/gcc/config/rs6000/ppc-auxv.h
>>> +++ b/gcc/config/rs6000/ppc-auxv.h
>>> @@ -47,9 +47,8 @@
>>>  #define PPC_PLATFORM_PPC47612
>>>  #define PPC_PLATFORM_POWER813
>>>  #define PPC_PLATFORM_POWER914
>>> -
>>> -/* This is not yet official.  */
>>>  #define PPC_PLATFORM_POWER10   15
>>> +#define PPC_PLATFORM_POWER11   16
>>
>> Please add a comment where the official thing is?
>>
>> It is in glibc dl-procinfo.h, and there *cannot* be more than 16
>> platforms currently, so how can this work?  Or do we get data
>> directly from the kernel, or what?
>>
>> But you tested it, so you do obviously get PPC_PLATFORM_POWER11 from
>> somewhere.  I cannot see from where though?
> 
> In other discussions, I was told that 16 with be the platform number for the
> kernel in the future.

The AT_PLATFORM from the kernel is a string, not an integer.
To make __builtin_cpu_is ("power11") work efficiently, GLIBC stores
an integer representing each cpu AT_PLATFORM string in the TCB.
It is therefore GLIBC which "owns" the integer versions of the
platform values and yes, 16 is the correct value for Power11 and
that value exists in upstream GLIBC.

Peter




Re: [PATCH] rs6000: Don't pass -many to the assembler [PR112868]

2024-05-22 Thread Peter Bergner
On 5/21/24 8:27 AM, jeevitha wrote:
> The following patch has been bootstrapped and regtested with default 
> configuration
> [--enable-checking=yes] and with --enable-checking=release on 
> powerpc64le-linux.
> 
> This patch removes passing the -many assembler option for release builds. Now,
> GCC no longer passes -many under any conditions to the assembler.
> 
> 2024-05-15  Jeevitha Palanisamy  
> 
> gcc/
>   PR target/112868
>   * config/rs6000/rs6000.h (ASM_OPT_ANY): Removed Define.
>   (ASM_CPU_SPEC): Remove ASM_OPT_ANY usage.

You are missing a ChangeLog entry for the target-supports.exp change plus
there is no mention of why it's needed in the git log entry.

Otherwise, the rest LGTM.

Peter




Re: [PATCH-1v2, rs6000] Implement optab_isinf for SFDF and IEEE128

2024-05-22 Thread Peter Bergner
On 5/19/24 10:28 PM, HAO CHEN GUI wrote:
> +(define_expand "isinf2"
> +  [(use (match_operand:SI 0 "gpc_reg_operand"))
> +   (use (match_operand:SFDF 1 "gpc_reg_operand"))]
> +  "TARGET_HARD_FLOAT && TARGET_P9_VECTOR"
> +{
> +  emit_insn (gen_xststdcp (operands[0], operands[1], GEN_INT (0x30)));
> +  DONE;
> +})

Is there a reason not to use the vsx_register_operand predicate for op1
which matches the predicate for the operand of the xststdcp pattern
we're passing op1 to?

Ditto for the other optab patches you've submitted.

Peter



[PATCH] .gitattributes: disable crlf translation

2024-05-22 Thread Peter Damianov
By default, git has the "autocrlf" """feature""" enabled. This causes the files
to have CRLF line endings when checked out on windows, which in the case of
configure, causes confusing errors like:

./gcc/configure: line 14: $'\r': command not found
./gcc/configure: line 29: syntax error near unexpected token `newline'
'/gcc/configure: line 29: ` ;;

when it is invoked.

Any files damaged in this way can be fixed with:
$ git config core.autocrlf false
$ git reset
$ git checkout .

But, it's better to simply avoid this problem in the first place.
This behavior is never helpful or desired for gcc.

Signed-off-by: Peter Damianov 
---
 .gitattributes | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.gitattributes b/.gitattributes
index e75bfc595bf..1e116987c98 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -8,3 +8,6 @@ ChangeLog 
whitespace=indent-with-non-tab,space-before-tab,trailing-space
 # Use together with git config diff.md.xfuncname '^\(define.*$'
 # which is run by contrib/gcc-git-customization.sh too.
 *.md diff=md
+
+# Disable lf -> crlf translation on windows.
+* -crlf
-- 
2.39.2



[PATCH] libcpp: Correct typo 'r' -> '\r'

2024-05-25 Thread Peter Damianov
libcpp/ChangeLog:
* lex.cc (do_peek_prev): Correct typo in argument to __builtin_expect()

Signed-off-by: Peter Damianov 
---
 libcpp/lex.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index c9e44e6..de752bdc9c8 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -5038,7 +5038,7 @@ do_peek_prev (const unsigned char *peek, const unsigned 
char *bound)
 
   unsigned char c = *--peek;
   if (__builtin_expect (c == '\n', false)
-  || __builtin_expect (c == 'r', false))
+  || __builtin_expect (c == '\r', false))
 {
   if (peek == bound)
return peek;
-- 
2.39.2



[PATCH v3 2/3] diagnostics: Don't hardcode auto_enable_urls to false for mingw hosts

2024-06-03 Thread Peter Damianov
Windows terminal and mintty both have support for link escape sequences, and so
auto_enable_urls shouldn't be hardcoded to false. For older versions of the
windows console, mingw_ansi_fputs's console API translation logic does mangle
these sequences, but there's nothing useful it could do even if this weren't
the case, so check if the ansi escape sequences are supported at all.

conhost.exe doesn't support link escape sequences, but printing them does not
cause any problems.

gcc/ChangeLog:
* diagnostic-color.cc (auto_enable_urls): Don't hardcode to return
false on mingw hosts.
* diagnostic-color.cc (auto_enable_urls): Return true if console
supports ansi escape sequences.

Signed-off-by: Peter Damianov 
---
v3: Fix minor comment formatting nit.

 gcc/diagnostic-color.cc | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/gcc/diagnostic-color.cc b/gcc/diagnostic-color.cc
index 261a14b6c5f..184419cea36 100644
--- a/gcc/diagnostic-color.cc
+++ b/gcc/diagnostic-color.cc
@@ -384,9 +384,6 @@ parse_env_vars_for_urls ()
 static bool
 auto_enable_urls ()
 {
-#ifdef __MINGW32__
-  return false;
-#else
   const char *term, *colorterm;
 
   /* First check the terminal is capable of printing color escapes,
@@ -394,6 +391,21 @@ auto_enable_urls ()
   if (!should_colorize ())
 return false;
 
+#ifdef __MINGW32__
+  HANDLE handle;
+  DWORD mode;
+
+  handle = GetStdHandle (STD_ERROR_HANDLE);
+  if ((handle == INVALID_HANDLE_VALUE) || (handle == NULL))
+return false;
+
+  /* If ansi escape sequences aren't supported by the console, then URLs will
+ print mangled from mingw_ansi_fputs's console API translation. It wouldn't
+ be useful even if this weren't the case.  */
+  if (GetConsoleMode (handle, &mode) && !(mode & 
ENABLE_VIRTUAL_TERMINAL_PROCESSING))
+return false;
+#endif
+
   /* xfce4-terminal is known to not implement URLs at this time.
  Recently new installations (0.8) will safely ignore the URL escape
  sequences, but a large number of legacy installations (0.6.3) print
@@ -428,7 +440,6 @@ auto_enable_urls ()
 return false;
 
   return true;
-#endif
 }
 
 /* Determine if URLs should be enabled, based on RULE,
-- 
2.39.2



[PATCH v3 1/3] diagnostics: Enable escape sequence processing on windows consoles

2024-06-03 Thread Peter Damianov
Since windows 10 release v1511, the windows console has had support for VT100
escape sequences. We should try to enable this, and utilize it where possible.

gcc/ChangeLog:
* diagnostic-color.cc (should_colorize): Enable processing of VT100
escape sequences on windows consoles

Signed-off-by: Peter Damianov 
---
Pinging these patches. The wine ordeal is a problem, but disabling the
diagnostics colors with the environment variable should be resolving that.
I don't want to intentionally make testing harder, but until wine fixes:
https://bugs.winehq.org/show_bug.cgi?id=49780

There is really nothing I can do.

 gcc/diagnostic-color.cc | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/gcc/diagnostic-color.cc b/gcc/diagnostic-color.cc
index cbe57ce763f..261a14b6c5f 100644
--- a/gcc/diagnostic-color.cc
+++ b/gcc/diagnostic-color.cc
@@ -300,12 +300,23 @@ should_colorize (void)
  pp_write_text_to_stream() in pretty-print.cc calls fputs() on
  that stream.  However, the code below for non-Windows doesn't seem
  to care about it either...  */
-  HANDLE h;
-  DWORD m;
+  HANDLE handle;
+  DWORD mode;
+  BOOL isconsole = false;
 
-  h = GetStdHandle (STD_ERROR_HANDLE);
-  return (h != INVALID_HANDLE_VALUE) && (h != NULL)
- && GetConsoleMode (h, &m);
+  handle = GetStdHandle (STD_ERROR_HANDLE);
+
+  if ((handle != INVALID_HANDLE_VALUE) && (handle != NULL))
+isconsole = GetConsoleMode (handle, &mode);
+
+  if (isconsole)
+{
+  /* Try to enable processing of VT100 escape sequences */
+  mode |= ENABLE_PROCESSED_OUTPUT | ENABLE_VIRTUAL_TERMINAL_PROCESSING;
+  SetConsoleMode (handle, mode);
+}
+
+  return isconsole;
 #else
   char const *t = getenv ("TERM");
   /* emacs M-x shell sets TERM="dumb".  */
-- 
2.39.2



[PATCH v3 3/3] pretty-print: Don't translate escape sequences to windows console API

2024-06-03 Thread Peter Damianov
Modern versions of windows (after windows 10 v1511) support VT100 escape
sequences, so translation for them is not necessary. The translation also
mangles embedded warning documentation links.

gcc/ChangeLog:
* pretty-print.cc (mingw_ansi_fputs): Don't translate escape sequences 
if
the console has ENABLE_VIRTUAL_TERMINAL_PROCESSING.

Signed-off-by: Peter Damianov 
---
v3: fix minor comment formatting nit.

 gcc/pretty-print.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/pretty-print.cc b/gcc/pretty-print.cc
index eb59bf424b7..505230ed4d6 100644
--- a/gcc/pretty-print.cc
+++ b/gcc/pretty-print.cc
@@ -674,8 +674,9 @@ mingw_ansi_fputs (const char *str, FILE *fp)
   /* Don't mess up stdio functions with Windows APIs.  */
   fflush (fp);
 
-  if (GetConsoleMode (h, &mode))
-/* If it is a console, translate ANSI escape codes as needed.  */
+  if (GetConsoleMode (h, &mode) && !(mode & 
ENABLE_VIRTUAL_TERMINAL_PROCESSING))
+/* If it is a console, and doesn't support ANSI escape codes, translate
+   them as needed.  */
 for (;;)
   {
if ((esc_code = find_esc_head (&prefix_len, &esc_head, read)) == 0)
-- 
2.39.2



[PATCH v3] diagnostics: Follow DECL_ORIGIN in lhd_print_error_function [PR102061]

2024-08-07 Thread Peter Damianov
Currently, if a warning references a cloned function, the name of the cloned
function will be emitted in the "In function 'xyz'" part of the diagnostic,
which users aren't supposed to see. This patch follows the DECL_ORIGIN link
to get the name of the original function, so the internal compiler details
aren't exposed.

gcc/ChangeLog:
PR diagnostics/102061
* langhooks.cc (lhd_print_error_function): Follow DECL_ORIGIN
links.
* gcc.dg/pr102061.c: New testcase.

Signed-off-by: Peter Damianov 
---
v3: also follow DECL_ORIGIN when emitting "inlined from" warnings, I missed 
this before.
Add testcase.

 gcc/langhooks.cc|  3 +++
 gcc/testsuite/gcc.dg/pr102061.c | 35 +
 2 files changed, 38 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr102061.c

diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc
index 61f2b676256..7a2a66b3c39 100644
--- a/gcc/langhooks.cc
+++ b/gcc/langhooks.cc
@@ -395,6 +395,8 @@ lhd_print_error_function (diagnostic_context *context, 
const char *file,
  else
fndecl = current_function_decl;
 
+ fndecl = DECL_ORIGIN(fndecl);
+
  if (TREE_CODE (TREE_TYPE (fndecl)) == METHOD_TYPE)
pp_printf
  (context->printer, _("In member function %qs"),
@@ -439,6 +441,7 @@ lhd_print_error_function (diagnostic_context *context, 
const char *file,
}
  if (fndecl)
{
+ fndecl = DECL_ORIGIN(fndecl);
  expanded_location s = expand_location (*locus);
  pp_comma (context->printer);
  pp_newline (context->printer);
diff --git a/gcc/testsuite/gcc.dg/pr102061.c b/gcc/testsuite/gcc.dg/pr102061.c
new file mode 100644
index 000..dbdd23965e7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr102061.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+/* { dg-message "inlined from 'bar'" "" { target *-*-* } 0 } */
+/* { dg-excess-errors "" } */
+
+static inline void
+foo (char *p)
+{
+  __builtin___memcpy_chk (p, "abc", 3, __builtin_object_size (p, 0));
+}
+static void
+bar (char *p) __attribute__((noinline));
+static void
+bar (char *p)
+{
+  foo (p);
+}
+void f(char*) __attribute__((noipa));
+char buf[2];
+void
+baz (void) __attribute__((noinline));
+void
+baz (void)
+{
+  bar (buf);
+  f(buf);
+}
+
+void f(char*)
+{}
+
+int main(void)
+{
+baz();
+}
-- 
2.39.2



Re: [PATCH] rs6000: Add TARGET_P10_VECTOR for Power10 vector insns [PR116266]

2024-08-09 Thread Peter Bergner
On 8/9/24 4:50 AM, Kewen.Lin wrote:
> As PR116266 shows, we miss TARGET_P10_VECTOR to guard those
> Power10 related vector instructions as well as their
> according built-in functions.  This patch is to introduce...

LGTM.

The only change I would suggest is s/according/associated/ in
the sentence above.  Thanks for fixing this!

Peter



Re: [PATCH] rs6000: Add TARGET_P10_VECTOR for Power10 vector insns [PR116266]

2024-08-09 Thread Peter Bergner
On 8/9/24 12:54 PM, Segher Boessenkool wrote:
>> --- a/gcc/config/rs6000/altivec.md
>> +++ b/gcc/config/rs6000/altivec.md
>> @@ -623,7 +623,7 @@ (define_insn "altivec_eqv1ti"
>>[(set (match_operand:V1TI 0 "altivec_register_operand" "=v")
>>   (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v")
>>(match_operand:V1TI 2 "altivec_register_operand" "v")))]
>> -  "TARGET_POWER10"
>> +  "TARGET_P10_VECTOR"
>>"vcmpequq %0,%1,%2"
>>[(set_attr "type" "veccmpfx")])
> 
> This very first one is incorrect, already.  This is a Vector insn
> (it needs MSR[VEC]=1), not a VSX insn (for which MSR[VSX]=1 is needed).
> 
> We test TARGET_ALTIVEC for that, not TARGET_VSX.

I guess you are correct that *_VECTOR is not specific enough because
yeah, we could have -mcpu=power10 -maltivec -mno-vsx so we'd need two
macros, TARGET_P10_ALTIVEC and TARGET_P10_VSX rather than one catch-all.
That said...


> In general, we want to get rid of TARGET_Pxxx_VECTOR, not introduce new
> stuff like it!

I'm fine with the TARGET_P10_* macro, since it's more readable than saying
TARGET_POWER10 && TARGET_ALTIVEC && TARGET_VSX, especially when we use the
negated version.

What we really wanted to get rid of, was having a separate OPTION_MASK_Pxxx_*
flag for things like this which we used to have.  I agree that was bad.
This isn't that though.  This is just a convenience macro to make the code
more (IMHO) readable.

Peter





Re: [PATCH] lra: emit caller-save register spills before call insn [PR116028]

2024-08-09 Thread Peter Bergner
On 8/9/24 12:02 PM, Vladimir Makarov wrote:
> I believe your should reverse the original patch and all the patches you
> submitted to fix the issues with the original patch.

I agree this commit should be reverted and Kyrill has pushed that already,
so bootstrap should be restored now.  Thank you Kyrill.  Surya was testing
her revert commit too, but was waiting for her bootstrap to complete before
pushing (per process rules).  It's just the cfarm systems she has access
to were being slow.

I do *not* agree that all the previous patches should be reverted, since they
fixed *real* bugs and really only exposed other latent issues which Surya
has been working on.



> You should do a better testing you patches

It is not correct to presume she did not test her patches thoroughly.
I know she has, but with the large number of supported architectures
and ABIs on those architectures, she cannot possibly test everything.
That's why we have stage1 and our patch revert process.


Peter




[PATCH v4] diagnostics: Follow DECL_ORIGIN in lhd_print_error_function [PR102061]

2024-08-10 Thread Peter Damianov
Currently, if a warning references a cloned function, the name of the cloned
function will be emitted in the "In function 'xyz'" part of the diagnostic,
which users aren't supposed to see. This patch follows the DECL_ORIGIN link
to get the name of the original function, so the internal compiler details
aren't exposed.

gcc/ChangeLog:
PR diagnostics/102061
* langhooks.cc (lhd_print_error_function): Follow DECL_ORIGIN
links.
* gcc.dg/pr102061.c: New testcase.

Signed-off-by: Peter Damianov 
---
v4: Address formatting nits, add comment.
v4: Rework testcase. It is now shorter and covers more, including both
"inlined from" and "in function" parts of diagnostics.

I would still appreciate a review regarding cp_print_error_function, perhaps
with more info about whether it needs adjustment or whatever circumstances it
is used in.

 gcc/langhooks.cc|  6 ++
 gcc/testsuite/gcc.dg/pr102061.c | 31 +++
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr102061.c

diff --git a/gcc/langhooks.cc b/gcc/langhooks.cc
index 61f2b676256..270f7aee1c1 100644
--- a/gcc/langhooks.cc
+++ b/gcc/langhooks.cc
@@ -395,6 +395,11 @@ lhd_print_error_function (diagnostic_context *context, 
const char *file,
  else
fndecl = current_function_decl;
 
+ // Follow DECL_ORIGIN link, in case this is a cloned function.
+ // Otherwise, we will emit names like "foo.constprop" or "bar.isra"
+ // in the diagnostic.
+ fndecl = DECL_ORIGIN (fndecl);
+
  if (TREE_CODE (TREE_TYPE (fndecl)) == METHOD_TYPE)
pp_printf
  (context->printer, _("In member function %qs"),
@@ -439,6 +444,7 @@ lhd_print_error_function (diagnostic_context *context, 
const char *file,
}
  if (fndecl)
{
+ fndecl = DECL_ORIGIN (fndecl);
  expanded_location s = expand_location (*locus);
  pp_comma (context->printer);
  pp_newline (context->printer);
diff --git a/gcc/testsuite/gcc.dg/pr102061.c b/gcc/testsuite/gcc.dg/pr102061.c
new file mode 100644
index 000..aff1062ca5a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr102061.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+/* { dg-message "In function 'foo'" "" { target *-*-* } 0 } */
+/* { dg-message "inlined from 'bar'" "" { target *-*-* } 0 } */
+/* { dg-message "isra" "" { xfail *-*-* } 0 } */
+/* { dg-excess-errors "" } */
+
+// The warnings generated here should not contain names of clones, like
+// 'foo.isra' and 'bar.isra'
+
+// Emit warning with "In function 'foo'"
+__attribute__((noinline))
+static int foo(char* p) {
+__builtin_strncpy(p, p, 1);
+return 0;
+}
+
+// Emit warning with "inlined from 'bar'"
+// For some reason, this function needs to be infinite recursive
+// for the warning to show up in an isra clone.
+static int bar(char* p) {
+__builtin_strncpy(p, p, 1);
+bar(p);
+return 0;
+}
+
+void baz() {
+char c[0];
+foo(c);
+bar(c);
+}
-- 
2.39.2



  1   2   3   4   5   6   7   8   9   10   >