Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition

2023-05-31 Thread Peter Bergner via Gcc-patches
On 5/22/23 4:04 AM, Kewen.Lin wrote:
> on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
>> @@ -3161,12 +3161,15 @@
>>void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
>>  TR_STXVRBX vsx_stxvrbx {stvec}
>>  
>> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
>> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
>>  TR_STXVRHX vsx_stxvrhx {stvec}
>>  
>> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
>> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
>>  TR_STXVRWX vsx_stxvrwx {stvec}
> 
> Good catching!

This hunk should be its own patch and commit, as it is independent of
the other change.  Especially since other built-ins also don't have
{,un}simgned long * as arguments, not just __builtin_altivec_tr_stxvr*x.




>> +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long *);
>> +TR_STXVRLX vsx_stxvrdx {stvec}
>> +
> 
> This is mapped to the one used for type long long, it's a hard mapping,
> IMHO it's wrong and not consistent with what the users expect, since on Power
> the size of type long int is 4 bytes at -m32 while 8 bytes at -m64, this
> implementation binding to 8 bytes can cause trouble in 32-bit.  I wonder if
> it's a good idea to add one overloaded version for type long int, for now
> openxl also emits error message for long int type pointer (see its doc [1]),
> users can use casting to make it to the acceptable pointer types (long long
> or int as its size).

I'm the person who noticed that we don't accept signed/unsigned long * as
an argument type and asked Carl to investigate.  I find it hard to believe
we accept all integer pointer types, except long *.  I agree that it shouldn't
always map to long long *, since as you say, that's wrong for -m32.
My hope was that we could somehow automagically handle the long * types
in the built-in machinery, mapping them to either the int * built-in or
the long long * built-in depending on -m32 or -m64.  Again, this limitation
is no limited to __builtin_altivec_tr_stx* built-ins, but others as well,
so I was kind of hoping for a general solution that would fix them all.
I'm not sure of that's possible though.

Peter




Re: [PATCH v4] tree-ssa-sink: Improve code sinking pass

2023-05-31 Thread Peter Bergner via Gcc-patches
This is not a review of the patch itself, but...

On 5/31/23 2:01 AM, Ajit Agarwal wrote:
> tree-ssa-sink: Improve code sinking pass
> 
> Code Sinking sinks the blocks after call.This increases register pressure
> for callee-saved registers. Improves code sinking before call in the use
> blocks or immediate dominator of use blocks.

I think the wording of your git log comment could be improved a little.
How about something like the following?

Currently, code sinking will sink code after function calls.  This increases
register pressure for callee-saved registers.  The following patch improves
code sinking by placing the sunk code before calls in the use block or in
the immediate dominator of the use blocks.



> gcc/ChangeLog:
> 
>   * tree-ssa-sink.cc (statement_sink_location): Move statements before
>   calls.
>   (def_use_same_block): New function.
>   (select_best_block): Add heuristics to select the best blocks in the
>   immediate post dominator.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
>   * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.

Please don't forget to add "PR tree-optimization/81953" to both sections
of the ChangeLog entries.

Peter




Re: [PATCH] rs6000: Change bitwise xor to inequality operator [PR106907]

2023-06-15 Thread Peter Bergner via Gcc-patches
On 6/12/23 6:18 AM, P Jeevitha wrote:
> Bitwise xor performed on bool
> is similar to checking inequality. So changed to inequality
> operator (!=) instead of bitwise xor (^).
[snip'
> -   if (swapped ^ !BYTES_BIG_ENDIAN
[snip]
> +   if (swapped != !BYTES_BIG_ENDIAN

I know Andreas mentioned using "swapped != !BYTES_BIG_ENDIAN" in
the bugzilla, but that's the same as "swapped == BYTES_BIG_ENDIAN",
and it doesn't contain a double-negative and seems a little clearer.

It's up to Segher though...and if we go with this, then the ChangeLog
entry needs to be updated slightly since we're no longer testing for
inequality.

Peter



Re: [PATCH] rs6000/test: Update some cases with -mdejagnu-tune

2022-07-22 Thread Peter Bergner via Gcc-patches
On 7/22/22 1:17 PM, Segher Boessenkool wrote:
> On Fri, Jul 22, 2022 at 10:22:51AM +0800, Kewen.Lin wrote:
>> on 2022/7/22 02:48, Segher Boessenkool wrote:
>> As PR106345 shows, GCC can use an explicit tune setting when it's
>> configured, even if there is one "-mdejagnu-cpu=", it doesn't
>> override the explicit given one, so we need one explicit
>> "-mdejagnu-tune=".
> 
> And that is the problem.  GCC's automatic setting is *not* an explicit
> option, not given by the user.  --with-tune= should not result in adding
> an -mtune= option in the resulting compiler, it should not set command-
> line options.
>
[snip]
> And it should not do that.

Currently, our rs6000.h has this:

/* Only for use in the testsuite: -mdejagnu-cpu= simply overrides -mcpu=.
   With older versions of Dejagnu the command line arguments you set in
   RUNTESTFLAGS override those set in the testcases; with this option,
   the testcase will always win.  Ditto for -mdejagnu-tune=.  */
#define DRIVER_SELF_SPECS \
  "%{mdejagnu-cpu=*: %

Re: [PATCH] rs6000/test: Update some cases with -mdejagnu-tune

2022-07-22 Thread Peter Bergner via Gcc-patches
On 7/22/22 1:53 PM, Peter Bergner wrote:
> So I think the way the code above *should* work is:
>   1) Any -mdejagnu-cpu= usage should filter out all -mcpu= and -mtune= 
> options.
>   2) Any -mdejagnu-tune= usage should filter all -mtune= options.
>  It should not filter out any -mcpu= options.

Like this:

diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 3b8941a8658..26874943795 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -86,7 +86,7 @@
RUNTESTFLAGS override those set in the testcases; with this option,
the testcase will always win.  Ditto for -mdejagnu-tune=.  */
 #define DRIVER_SELF_SPECS \
-  "%{mdejagnu-cpu=*: %

[PING][PATCH] c: Handle initializations of opaque types [PR106016] (need review of expr.cc hunk)

2022-07-25 Thread Peter Bergner via Gcc-patches
I'd like to ping the following patch.  Segher has approved the
test case change, so I just need a review for the expr.cc change.

Peter



Message-ID: <009c391d-3994-8755-0d22-9e80faf91...@linux.ibm.com>
Date: Fri, 17 Jun 2022 23:50:35 -0500
To: GCC Patches 
Subject: [PATCH] c: Handle initializations of opaque types [PR106016]


On 6/17/22 11:50 PM, Peter Bergner via Gcc-patches wrote:
> The initial commit that added opaque types thought that there couldn't
> be any valid initializations for variables of these types, but the test
> case in the bug report shows that isn't true.  The solution is to handle
> OPAQUE_TYPE initializations just like the other scalar types.
> 
> This passed bootstrap and regtesting with no regressions on powerpc64le-linux.
> Ok for trunk?  This is an issue in GCC 12 and 11 too.  Ok for the release
> branches after some burn-in on trunk?
> 
> Peter
> 
> gcc/
>   PR c/106016
>   * expr.cc (count_type_elements): Handle OPAQUE_TYPE.
> 
> gcc/testsuite/
>   PR c/106016
>   * gcc.target/powerpc/pr106016.c: New test.
> 
> diff --git a/gcc/expr.cc b/gcc/expr.cc
> index 78c839ab425..1675198a146 100644
> --- a/gcc/expr.cc
> +++ b/gcc/expr.cc
> @@ -6423,13 +6423,13 @@ count_type_elements (const_tree type, bool for_ctor_p)
>  case OFFSET_TYPE:
>  case REFERENCE_TYPE:
>  case NULLPTR_TYPE:
> +case OPAQUE_TYPE:
>return 1;
>  
>  case ERROR_MARK:
>return 0;
>  
>  case VOID_TYPE:
> -case OPAQUE_TYPE:
>  case METHOD_TYPE:
>  case FUNCTION_TYPE:
>  case LANG_TYPE:
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106016.c 
> b/gcc/testsuite/gcc.target/powerpc/pr106016.c
> new file mode 100644
> index 000..3db8345dcc6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106016.c
> @@ -0,0 +1,14 @@
> +/* PR target/106016 */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
> +
> +/* Make sure we do not ICE on the following test case.  */
> +
> +extern void bar (__vector_quad *);
> +
> +void
> +foo (__vector_quad *a, __vector_quad *b)
> +{
> +  __vector_quad arr[2] = {*a, *b};
> +  bar (&arr[0]);
> +}



Re: [PING][PATCH] c: Handle initializations of opaque types [PR106016] (need review of expr.cc hunk)

2022-07-26 Thread Peter Bergner via Gcc-patches
On 7/26/22 1:57 AM, Richard Biener via Gcc-patches wrote:
>> On 6/17/22 11:50 PM, Peter Bergner via Gcc-patches wrote:
>>> The initial commit that added opaque types thought that there couldn't
>>> be any valid initializations for variables of these types, but the test
>>> case in the bug report shows that isn't true.  The solution is to handle
>>> OPAQUE_TYPE initializations just like the other scalar types.
>>>
>>> This passed bootstrap and regtesting with no regressions on 
>>> powerpc64le-linux.
>>> Ok for trunk?  This is an issue in GCC 12 and 11 too.  Ok for the release
>>> branches after some burn-in on trunk?
> 
> OK.

Ok, pushed to trunk.  I'll push the backports after a little burn-in
on trunk.  Thanks!

Peter



Re: [PATCH V2] rs6000: Don't allow AltiVec address in movoo & movxo pattern [PR110411]

2023-08-06 Thread Peter Bergner via Gcc-patches
On 7/19/23 11:46 AM, jeevitha via Gcc-patches wrote:
> gcc/
>   PR target/110411
>   * config/rs6000/mma.md (define_insn_and_split movoo): Disallow
>   AltiVec address in movoo and movxo pattern.

No need to mention movxo here, since the next change covers movxo.
And maybe better as "Disallow AltiVec address operands."?


>   (define_insn_and_split movxo): Likewise.

Fine.


>   *config/rs6000/predicates.md (vsx_quad_dform_memory_operand):Remove
 ^   ^
Need a space in the two spots above.

I cannot approve it, but it looks good to me with the above bits fixed.

Peter




Re: [PATCH ver 2] rs6000, add overloaded DFP quantize support

2023-08-16 Thread Peter Bergner via Gcc-patches
On 8/16/23 7:19 PM, Carl Love wrote:
> +(define_insn "dfp_dquan_"
> +  [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
> +(unspec:DDTD [(match_operand:DDTD 1 "gpc_reg_operand" "d")
> +   (match_operand:DDTD 2 "gpc_reg_operand" "d")
> +   (match_operand:QI 3 "immediate_operand" "i")]
> + UNSPEC_DQUAN))]
> +  "TARGET_DFP"
> +  "dqua %0,%1,%2,%3"
> +  [(set_attr "type" "dfp")
> +   (set_attr "size" "")])

operand 3 refers to the RMC operand field of the insn we are emitting.
RMC is a two bit unsigned operand, so I think the predicate should be
const_0_to_3_operand rather than immediate_operand.  It's always best
to use a tighter predicate if we have one. Ditto for the other patterns
with an RMC operand.

I don't think we allow anything other than an integer for that operand
value, so I _think_ that "n" is probably a better constraint than "i"?
Ke Wen/Segher???


> +(define_insn "dfp_dquan_i"
> +  [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
> +(unspec:DDTD [(match_operand:SI 1 "const_int_operand" "n")
> +   (match_operand:DDTD 2 "gpc_reg_operand" "d")
> +   (match_operand:SI 3 "immediate_operand" "i")]
> + UNSPEC_DQUAN))]
> +  "TARGET_DFP"
> +  "dquai %1,%0,%2,%3"
> +  [(set_attr "type" "dfp")
> +   (set_attr "size" "")])

operand 1 refers to the TE operand field and that is a 5-bit signed operand.
For that, I think we should be using the s5bit_cint_operand predicate,
rather than const_int_operand.



Peter


Re: [PING][PATCH] ira: update allocated_hardreg_p[] in improve_allocation() [PR110254]

2023-08-18 Thread Peter Bergner via Gcc-patches
On 8/2/23 8:23 AM, Vladimir Makarov wrote:
>>> gcc/
>>> PR rtl-optimization/PR110254
>>> * ira-color.cc (improve_allocation): Update array
> 
> I guess you missed the next line in the changelog.  I suspect it should be 
> "Update array allocated_hard_reg_p."
> 
> Please, fix it before committing the patch.

Is this a fix we want backported?

Peter




Re: [PATCH 1/2] go: update usage of TARGET_AIX to TARGET_AIX_OS

2023-06-22 Thread Peter Bergner via Gcc-patches
On 6/16/23 12:01 PM, Ian Lance Taylor via Gcc-patches wrote:
> On Fri, Jun 16, 2023 at 9:00 AM Paul E. Murphy via Gcc-patches
>  wrote:
>>
>> TARGET_AIX is defined to a non-zero value on linux and maybe other
>> powerpc64le targets.  This leads to unexpected behavior such as
>> dropping the .go_export section when linking a shared library
>> on linux/powerpc64le.
>>
>> Instead, use TARGET_AIX_OS to toggle AIX specific behavior.
>>
>> Fixes golang/go#60798.
>>
>> gcc/go/ChangeLog:
>>
>> * go-backend.cc [TARGET_AIX]: Rename and update usage to
>> TARGET_AIX_OS.
>> * go-lang.cc: Likewise.
> 
> This is OK.
> 
> Thanks.
> 
> Ian

I pushed this to trunk for Paul.

Peter



Re: [PATCH 1/2] go: update usage of TARGET_AIX to TARGET_AIX_OS

2023-06-22 Thread Peter Bergner via Gcc-patches
On 6/22/23 6:37 PM, Peter Bergner via Gcc-patches wrote:
> On 6/16/23 12:01 PM, Ian Lance Taylor via Gcc-patches wrote:
>> On Fri, Jun 16, 2023 at 9:00 AM Paul E. Murphy via Gcc-patches
>>  wrote:
>>>
>>> TARGET_AIX is defined to a non-zero value on linux and maybe other
>>> powerpc64le targets.  This leads to unexpected behavior such as
>>> dropping the .go_export section when linking a shared library
>>> on linux/powerpc64le.
>>>
>>> Instead, use TARGET_AIX_OS to toggle AIX specific behavior.
>>>
>>> Fixes golang/go#60798.
>>>
>>> gcc/go/ChangeLog:
>>>
>>> * go-backend.cc [TARGET_AIX]: Rename and update usage to
>>> TARGET_AIX_OS.
>>> * go-lang.cc: Likewise.
>>
>> This is OK.
>>
>> Thanks.
>>
>> Ian
> 
> I pushed this to trunk for Paul.

I see this is broken on the release branches too.  Are backports ok
after some burn-in on trunk?

Peter





Re: [PATCH 2/2] rust: update usage of TARGET_AIX to TARGET_AIX_OS

2023-06-22 Thread Peter Bergner via Gcc-patches
On 6/21/23 11:20 AM, Paul E Murphy via Gcc-patches wrote:
> 
> 
> On 6/19/23 3:39 AM, Thomas Schwinge wrote:
>> Hi Paul!
>>
>> On 2023-06-16T11:00:02-0500, "Paul E. Murphy via Gcc-patches" 
>>  wrote:
>>> This was noticed when fixing the gccgo usage of the macro, the
>>> rust usage is very similar.
>>>
>>> TARGET_AIX is defined as a non-zero value on linux/powerpc64le
>>> which may cause unexpected behavior.  TARGET_AIX_OS should be
>>> used to toggle AIX specific behavior.
>>>
>>> gcc/rust/ChangeLog:
>>>
>>>    * rust-object-export.cc [TARGET_AIX]: Rename and update
>>>    usage to TARGET_AIX_OS.
>>
>> I don't have rights to formally approve this GCC/Rust change, but I'll
>> note that it follows "as obvious" (see
>> , "Obvious fixes") to the
>> corresponding GCC/Go change, which has been approved:
>> ,
>> and which is where this GCC/Rust code has been copied from, so I suggest
>> you push both patches at once.
>>
>>
>> Grüße
>>   Thomas
> 
> Hi Thomas,
> 
> Thank you for reviewing.  I do not have commit access, so I cannot push this 
> myself.  If this is OK, could one of the rust maintainers push this patch?
> 
> Thanks,
> Paul

I pushed this to trunk for Paul.

Peter



Re: [PATCH v5] tree-ssa-sink: Improve code sinking pass

2023-06-22 Thread Peter Bergner via Gcc-patches
On 6/1/23 11:54 PM, Ajit Agarwal via Gcc-patches wrote:
> 
> 
> On 01/06/23 2:06 pm, Bernhard Reutner-Fischer wrote:
>> On 1 June 2023 09:20:08 CEST, Ajit Agarwal  wrote:
>>> Hello All:
>>>
>>> This patch improves code sinking pass to sink statements before call to 
>>> reduce
>>> register pressure.
>>> Review comments are incorporated.
>>
>> Hi Ajit!
>>
>> I had two comments for v4 that you did not address in v5 or followed up.
>> thanks,
> 
> Which comments I didn't address. Please let me know.

I believe he's referring to these two comments:

  > + && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb))
  > +   {
  > + if (def_use_same_block (use))
  > +   return best_bb;
  > +
  > + return new_best_bb;
  > +   }
  > +   return best_bb;
  > +}
  >  

  Many returns.
  I'd have said
  && !def_use_same_block (use)
return new_best_bb;
  else
return best_bb;

  and rephrase the comment above list of Things to consider accordingly.


I agree with Bernhard's comment that it could be rewritten to be clearer.
Although, the "else" isn't really required.  So Bernhard's version would
look like:

  if (new_best_bb
  && use
  && new_best_bb != best_bb
  && new_best_bb != early_bb
  && !is_gimple_call (stmt)
  && gsi_end_p (gsi_start_phis (new_best_bb))
  && gimple_bb (use) != early_bb
  && !is_gimple_call (use)
  && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb (use))
  && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb)
  && !def_use_same_block (use))
return new_best_bb;
  else
return best_bb;

...or just:

  if (new_best_bb
  && use
  && new_best_bb != best_bb
  && new_best_bb != early_bb
  && !is_gimple_call (stmt)
  && gsi_end_p (gsi_start_phis (new_best_bb))
  && gimple_bb (use) != early_bb
  && !is_gimple_call (use)
  && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb (use))
  && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb)
  && !def_use_same_block (use))
return new_best_bb;

  return best_bb;


Either works.


Peter



Re: [PATCH] rs6000, __builtin_set_fpscr_rn add retrun value

2023-06-29 Thread Peter Bergner via Gcc-patches
On 6/29/23 4:13 AM, Kewen.Lin wrote:
> on 2023/6/19 23:57, Carl Love wrote:
>> The following patch changes the return type of the
>>  __builtin_set_fpscr_rn builtin from void to double.  The return value
>> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
>> XE, NI, RN bit positions when the builtin is called.  The builtin then
>> updated the RN field with the new value provided as an argument to the
>> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
>> check that the builtin returns the current value of the FPSCR fields
>> and then updates the RN field.
> 
> But this patch also introduces one more overloading instance with argument
> type double, I had a check on glibc usage of mffscrn/mffscrni, I don't
> think it's necessary to add this new instance, as it takes the given
> rounding mode as integral type.

I agree with Ke Wen, I don't know why there is an extra overloaded
instance.  I assumed we'd have just the one we had before, except now
its return type is double and not void.

That said, we need to announce to users that the return type has
changed, so new code compiled with a new GCC can get the new return
value, but if that new code is compiled with an old GCC (still has void
return type), it knows it needs to use some other method to get the
FPSCR value, if it needs it.  We normally do that with a predefine
macro (#define ...) that the user can test for in their code, ala:

#ifdef __SET_FPSCR_RN_RETURNS_FPSCR__  /* Or whatever name.  */
  ret = __builtin_set_fpscr_rn (..);
#else
  __builtin_set_fpscr_rn (..);
  ret = ;
#endif

We add those predefined macros in rs6000-c.cc:rs6000_target_modify_macros()
and it should be gated via the same conditions that the built-in itself
is enabled.  Look at my addition of the __TM_FENCE__ macro that let the
user know our HTM insn patterns were fixed to now act as memory barriers.


Peter



Re: [PATCH 1/2] go: update usage of TARGET_AIX to TARGET_AIX_OS

2023-06-30 Thread Peter Bergner via Gcc-patches
On 6/22/23 10:30 PM, Ian Lance Taylor wrote:
> On Thu, Jun 22, 2023, 4: 47 PM Peter Bergner  
> wrote: On 6/22/23 6: 37 PM, Peter Bergner via Gcc-patches wrote: > On 6/16/23 
> >> On Fri, Jun 16, 2023 at 9:00 AM Paul E. Murphy via Gcc-patches
> >> mailto:gcc-patches@gcc.gnu.org>> wrote:
> >>>
> >>> TARGET_AIX is defined to a non-zero value on linux and maybe other
> >>> powerpc64le targets.  This leads to unexpected behavior such as
> >>> dropping the .go_export section when linking a shared library
> >>> on linux/powerpc64le.
> >>>
> >>> Instead, use TARGET_AIX_OS to toggle AIX specific behavior.
> >>>
> >>> Fixes golang/go#60798.
> >>>
> >>> gcc/go/ChangeLog:
> >>>
> >>>         * go-backend.cc [TARGET_AIX]: Rename and update usage to
> >>>         TARGET_AIX_OS.
> >>>         * go-lang.cc: Likewise.
> >>
> >> This is OK.
> >>
> >> Thanks.
> >>
> >> Ian
> >
> > I pushed this to trunk for Paul.
> 
> I see this is broken on the release branches too.  Are backports ok
> after some burn-in on trunk?
> 
> Yes.  Thanks.

Ok, I backported the Go fix to GCC 13, 12, 11 and 10 (before the 10.5 freeze).
I also backported to the rust change to GCC 13, which was the first release
with rust.   Thanks.

Peter



Re: [PATCH] rs6000: Update the vsx-vector-6.* tests.

2023-06-30 Thread Peter Bergner via Gcc-patches
On 6/30/23 5:20 PM, Carl Love wrote:
> So, we have the issue that looking at the assembly gives different
> instruction counts then what 
> 
>dg-final { scan-assembler-times {\mxxlor\M} }
> 
> comes up with???

I recommend not even counting xxlor at all, since the majority of
them come from vsx register copies and whether and how many we
generate seemingly varies with the phase of the moon, day of the
week, etc. etc. 

If you really want to verify an xxlor count, you almost have to extract
the given test into it's own file so it's not corrupted by any of the
other tests and it has to be as small as possible and compiled with
a fair amount of optimization.  Even then you may get some copies.
So I'd recommend just removing the xxlor counts altogether.

Peter




Re: [PATCH] rs6000: Update the vsx-vector-6.* tests.

2023-06-30 Thread Peter Bergner via Gcc-patches
On 6/30/23 6:50 PM, Carl Love wrote:
> With a little help from Peter and Julian Wang.  Objdump decodes some of
> the xxlor instructions as xxmr instsructions.  The xxmr is a new
> mnemonic which will be out in the next ISA.  But objdump already
> produces it.  So if you add the counts for grep xxlor and grep xxmr you
> get a total of 34 which agress with the count of xxlor in the gcc -S
> generated assembly.

Right, xxmr is new and objdump defaults to emitting it for xxlor's used
as copies.   You can use the -Mraw objdump option to display the real
mnemonics instead of any extended mnemonics.

Peter





Re: [PATCH] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]

2023-07-06 Thread Peter Bergner via Gcc-patches
On 6/28/23 3:07 AM, Kewen.Lin wrote:
> I think the reason why we need to check common_deferred_options is at this 
> time
> we can't distinguish the fixed_regs[2] is from the initialization or command 
> line
> user explicit specification.  But could we just update the FIXED_REGISTERS 
> without
> FIXED_R2 and set FIXED_R2 when it's needed in this function instead?  Then I'd
> expect that when we find fixed_regs[2] is set at the beginning of this 
> function, it
> would mean users specify it explicitly and then we don't need this option 
> checking?

Correct, rs6000_conditional_register_usage() is called after the handling of the
-ffixed-* options, so looking at fixed_regs[2] cannot tell us whether the user
used the -ffixed-r2 option or not if we initialize the FIXED_REGISTERS[2] slot
to 1.  I think we went this route for two reasons:

  1) We don't have to worry about anyone in the future adding more uses of
 FIXED_REGISTERS and needing to update the value depending on ABI, options,
 etc.
  2) The options in common_deferred_options are "rare" options, so the common
 case is that common_deferred_options will be NULL and we'll never drop
 into that section.

I believe the untested patch below should also work, without having to scan
the (uncommonly used) options.  Jeevitha, can you bootstrap and regtest the
patch below?



> Besides, IMHO we need a corresponding test case to cover this -ffixed-r2 
> handling.

Good idea.  I think we can duplicate the pr110320_2.c test case, replacing the
-mno-pcrel option with -ffixed-r2.  Jeevitha, can you give that a try?




>> +/* { dg-require-effective-target power10_ok } */
>> +/* { dg-require-effective-target powerpc_pcrel } */
> 
> Do we have some environment combination which supports powerpc_pcrel but not
> power10_ok?  I'd expect that only powerpc_pcrel is enough.

I think I agree testing for powerpc_pcrel should be enough.


Peter





diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index d197c3f3289..7c356a73ac6 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10160,9 +10160,13 @@ rs6000_conditional_register_usage (void)
 for (i = 32; i < 64; i++)
   fixed_regs[i] = call_used_regs[i] = 1;
 
+  /* For non PC-relative code, GPR2 is unavailable for register allocation.  */
+  if (FIXED_R2 && !rs6000_pcrel_p ())
+fixed_regs[2] = 1;
+
   /* The TOC register is not killed across calls in a way that is
  visible to the compiler.  */
-  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
+  if (fixed_regs[2] && (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2))
 call_used_regs[2] = 0;
 
   if (DEFAULT_ABI == ABI_V4 && flag_pic == 2)
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 3503614efbd..2a24fbdf9fd 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -812,7 +812,7 @@ enum data_align { align_abi, align_opt, align_both };
 
 #define FIXED_REGISTERS  \
   {/* GPRs */ \
-   0, 1, FIXED_R2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, FIXED_R13, 0, 0, \
+   0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, FIXED_R13, 0, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
/* FPRs */ \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \



Re: [PATCH] rs6000: Don't ICE when generating vector pair load/store insns [PR110411]

2023-07-06 Thread Peter Bergner via Gcc-patches
On 7/6/23 12:33 PM, Segher Boessenkool wrote:
> On Wed, Jul 05, 2023 at 05:21:18PM +0530, P Jeevitha wrote:
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -9894,6 +9894,8 @@ rs6000_legitimate_address_p (machine_mode mode, rtx x, 
>> bool reg_ok_strict)
>>  
>>/* Handle unaligned altivec lvx/stvx type addresses.  */
>>if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
>> +  && mode !=  OOmode
>> +  && mode !=  XOmode
>>&& GET_CODE (x) == AND
>>&& CONST_INT_P (XEXP (x, 1))
>>&& INTVAL (XEXP (x, 1)) == -16)
> 
> Why do we need this for OOmode and XOmode here, but not for the other
> modes that are equally not allowed?  That makes no sense.

VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) already filters those modes out
(eg, SImode, DFmode, etc.), just not OOmode and XOmode, since those both
are modes used in/with VSX registers.



> Should you check for anything that is more than a register, for example?
> If so, do *that*?

Well rs6000_legitimate_address_p() is only passed the MEM rtx, so we have
no idea if this is a load or store, so we're clueless on number of regs
needed to hold this mode.  The best we could do is something like

  GET_MODE_SIZE (mode) == GET_MODE_SIZE (V16QImode)

or some such thing.  Would you prefer something like that?



>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr110411.c
>> @@ -0,0 +1,21 @@
>> +/* PR target/110411 */
>> +/* { dg-options "-O2 -mdejagnu-cpu=power10 -S -mblock-ops-vector-pair" } */
> 
> -S in testcases is wrong.  Why do you want this?  It is *good* if this
> is hauled through the assembler as well!  If you *really* want this you
> use "dg-do assemble", but you shouldn't.

For test cases checking for ICEs, we don't need to assemble, so I agree,
we just need to remove the -S option, which is implied by this being a
dg-do compile test case (the default for this test directory).


Peter




Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

2023-07-06 Thread Peter Bergner via Gcc-patches
On 6/30/23 7:58 PM, Carl Love via Gcc-patches wrote:
> rs6000, __builtin_set_fpscr_rn add retrun value

s/retrun/return/

Maybe better written as:

rs6000: Add return value to __builtin_set_fpscr_rn


> Change the return value from void to double.  The return value consists of
> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions.  Add an
> overloaded version which accepts a double argument.

You're not adding an overloaded version anymore, so I think you can just
remove the last sentence.



> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the
> double reterun value and the new double argument.

s/reterun/return/   ...and there is no double argument anymore, so that
part can be removed.



>   * config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
>   define_expand.

Too many '('.



>   (rs6000_set_fpscr_rn): Addedreturn argument.  Updated to use new

Looks like a  after Added instead of a space.


>   rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
>_expands.

Don't split define_expand across two lines.



>   * doc/extend.texi (__builtin_set_fpscr_rn): Update description for
>   the return value and new double argument.  Add descripton for
>   __SET_FPSCR_RN_RETURNS_FPSCR__ macro.

s/descripton/description/






> +  /* Tell the user the __builtin_set_fpscr_rn now returns the FPSCR fields
> + in a double.  Originally the builtin returned void.  */

Either:
  1) s/Tell the user the __builtin_set_fpscr_rn/Tell the user 
__builtin_set_fpscr_rn/ 
  2) s/the __builtin_set_fpscr_rn now/the __builtin_set_fpscr_rn built-in now/ 


> +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
> +  rs6000_define_or_undefine_macro (define_p, 
> "__SET_FPSCR_RN_RETURNS_FPSCR__");

This doesn't look like it's indented correctly.




> +(define_expand "rs6000_get_fpscr_fields"
> + [(match_operand:DF 0 "gpc_reg_operand")]
> +  "TARGET_HARD_FLOAT"
> +{
> +  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
> + RN) from the FPSCR and return them.  */
> +  rtx tmp_df = gen_reg_rtx (DFmode);
> +  rtx tmp_di = gen_reg_rtx (DImode);
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (0x000700FFULL)));
> +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> +  emit_move_insn (operands[0], tmp_rtn);
> +  DONE;
> +})

This doesn't look correct.  You first set tmp_di to a new reg rtx but then
throw that away with the return value of simplify_gen_subreg().  I'm guessing
you want that tmp_di as a gen_reg_rtx for the destination of the gen_anddi3, so
you probably want a different rtx for the subreg that feeds the gen_anddi3.



> +(define_expand "rs6000_update_fpscr_rn_field"
> + [(match_operand:DI 0 "gpc_reg_operand")]
> +  "TARGET_HARD_FLOAT"
> +{
> +  /* Insert the new RN value from operands[0] into FPSCR bit [62:63].  */
> +  rtx tmp_di = gen_reg_rtx (DImode);
> +  rtx tmp_df = gen_reg_rtx (DFmode);
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);

Ditto.




> +The @code{__builtin_set_fpscr_rn} builtin allows changing both of the 
> floating
> +point rounding mode bits and returning the various FPSCR fields before the RN
> +field is updated.  The builtin returns a double consisting of the initial 
> value
> +of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, and RN bit positions with 
> all
> +other bits set to zero. The builtin argument is a 2-bit value for the new RN
> +field value.  The argument can either be an @code{const int} or stored in a
> +variable.  Earlier versions of @code{__builtin_set_fpscr_rn} returned void.  
> A
> +@code{__SET_FPSCR_RN_RETURNS_FPSCR__} macro has been added.  If defined, then
> +the @code{__builtin_set_fpscr_rn} builtin returns the FPSCR fields.  If not
> +defined, the @code{__builtin_set_fpscr_rn} does not return a vaule.  If the
> +@option{-msoft-float} option is used, the @code{__builtin_set_fpscr_rn} 
> builtin
> +will not return a value.

Multiple occurrences of "builtin" that should be spelled "built-in" (not in the
built-in function name itself though).



> +/* Originally the __builtin_set_fpscr_rn builtin was defined to return
> +   void.  It was later extended to return a double with the various
> +   FPSCR bits.  The extended builtin is inteded to be a drop in replacement
> +   for the original version.  This test is for the original version of the
> +   builtin and should work exactly as before.  */

Ditto.




> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> @@ -0,0 +1,153 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */

powerpc*-*-* is the default for this test directory, so you can drop that,
but you need to disable this test for soft-float systems, so you probably want:

  /* { dg-do run { target powerpc_fprs } } */

I know you didn't write it, but tes

Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

2023-07-06 Thread Peter Bergner via Gcc-patches
On 7/6/23 5:54 PM, Peter Bergner wrote:
> On 6/30/23 7:58 PM, Carl Love via Gcc-patches wrote:
>> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
>> @@ -0,0 +1,153 @@
>> +/* { dg-do run { target { powerpc*-*-* } } } */
> 
> powerpc*-*-* is the default for this test directory, so you can drop that,
> but you need to disable this test for soft-float systems, so you probably 
> want:
> 
>   /* { dg-do run { target powerpc_fprs } } */

We actually want something like powerpc_fprs_hw, but that doesn't exist.

Peter




Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

2023-07-07 Thread Peter Bergner via Gcc-patches
On 7/7/23 12:08 AM, Kewen.Lin wrote:
> on 2023/7/7 07:00, Peter Bergner wrote:
>> On 7/6/23 5:54 PM, Peter Bergner wrote:
>>> On 6/30/23 7:58 PM, Carl Love via Gcc-patches wrote:
 +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
 @@ -0,0 +1,153 @@
 +/* { dg-do run { target { powerpc*-*-* } } } */
>>>
>>> powerpc*-*-* is the default for this test directory, so you can drop that,
>>> but you need to disable this test for soft-float systems, so you probably 
>>> want:
>>>
>>>   /* { dg-do run { target powerpc_fprs } } */
>>
>> We actually want something like powerpc_fprs_hw, but that doesn't exist.
>>
> 
> Yeah, good point!  I noticed that we have a few test cases which need to
> check soft-float env as well but they don't, I didn't find any related
> issues have been reported, so I would assume that there are very few
> actual testings on this area.  Based on this, I'm not sure if it's worthy
> to add a new effective target for it.  Personally I'm happy with just using
> powerpc_fprs here to keep it simple. :)

I think powerpc_fprs_hw can be added later by someone if they care.
Using powerpc_fprs is an improvement over powerpc*-*-*, since it will
reduce some FAILs, just not all of them a powerpc_fprs_hw would.
I doubt many people are running the testsuite on real ppc hardware
that doesn't have an FP unit.

Peter



Re: [PATCH] rs6000: Don't ICE when generating vector pair load/store insns [PR110411]

2023-07-07 Thread Peter Bergner via Gcc-patches
On 7/6/23 6:28 PM, Segher Boessenkool wrote:
> On Thu, Jul 06, 2023 at 02:48:19PM -0500, Peter Bergner wrote:
>> On 7/6/23 12:33 PM, Segher Boessenkool wrote:
>>> On Wed, Jul 05, 2023 at 05:21:18PM +0530, P Jeevitha wrote:
 --- a/gcc/config/rs6000/rs6000.cc
 +++ b/gcc/config/rs6000/rs6000.cc
 @@ -9894,6 +9894,8 @@ rs6000_legitimate_address_p (machine_mode mode, rtx 
 x, bool reg_ok_strict)
  
/* Handle unaligned altivec lvx/stvx type addresses.  */
if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
 +  && mode !=  OOmode
 +  && mode !=  XOmode
&& GET_CODE (x) == AND
&& CONST_INT_P (XEXP (x, 1))
&& INTVAL (XEXP (x, 1)) == -16)
>>>
>>> Why do we need this for OOmode and XOmode here, but not for the other
>>> modes that are equally not allowed?  That makes no sense.
>>
>> VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) already filters those modes out
>> (eg, SImode, DFmode, etc.), just not OOmode and XOmode, since those both
>> are modes used in/with VSX registers.
> 
> It does not filter anything out, no.  That simply checks if a datum of
> that mode can be loaded into vector registers or not.  For example
> SImode could very well be loaded into vector registers!  (It just is not
> such a great idea).

I spent some time looking at how the compiler fixes this up in the
-mno-block-ops-vector-pair case and I see the constraints used in the
vsx_mov_64bit pattern for loads and stores disallows these types
of addresses, so LRA fixes them up for us.  Clearly movoo should do the
same and that is enough to fix the ICE.  I'll work with Jeevitha on
submitting a patch using that solution.

That said, I think it would be good to modify rs6000_legitimate_address_p
to disallow these altivec style addresses for OOmode and XOmode, since we
know early-on that they're not going to be valid, but that would be a
different patch.




> dg-do compile *does* invoke the assembler, btw.  As it should.

There is dg-do "preprocess", "compile", "assemble", "link" and "run"
(ignoring "precompile" and "repo").  Dg-do compile produces an assembly
file, but doesn't actually call the assembler, which we don't strictly
need for a test case that checks whether GCC ICEs or not.  If you want
to run the assembler too and then stop, then you'd want dg-do assemble.

Peter




[PATCH, OBVIOUS] rs6000: Remove redundant MEM_P predicate usage

2023-07-10 Thread Peter Bergner via Gcc-patches
While helping someone on the team debug an issue, I noticed some redundant
tests in a couple of our predicates which can be removed.  I'm going to
commit the following as obvious once bootstrap and regtesting come back
clean.

Peter


rs6000: Remove redundant MEM_P predicate usage

The quad_memory_operand and vsx_quad_dform_memory_operand predicates contain
a (match_code "mem") test, making their MEM_P usage redundant.  Remove them.

gcc/
* config/rs6000/predicates.md (quad_memory_operand): Remove redundant
MEM_P usage.
(vsx_quad_dform_memory_operand): Likewise.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 8479331482e..3552d908e9d 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -912,7 +912,7 @@ (define_predicate "quad_memory_operand"
   if (!TARGET_QUAD_MEMORY && !TARGET_SYNC_TI)
 return false;
 
-  if (GET_MODE_SIZE (mode) != 16 || !MEM_P (op) || MEM_ALIGN (op) < 128)
+  if (GET_MODE_SIZE (mode) != 16 || MEM_ALIGN (op) < 128)
 return false;
 
   return quad_address_p (XEXP (op, 0), mode, false);
@@ -924,7 +924,7 @@ (define_predicate "quad_memory_operand"
 (define_predicate "vsx_quad_dform_memory_operand"
   (match_code "mem")
 {
-  if (!TARGET_P9_VECTOR || !MEM_P (op) || GET_MODE_SIZE (mode) != 16)
+  if (!TARGET_P9_VECTOR || GET_MODE_SIZE (mode) != 16)
 return false;
 
   return quad_address_p (XEXP (op, 0), mode, false);


Re: [PATCH ver3] rs6000, Add return value to __builtin_set_fpscr_rn

2023-07-10 Thread Peter Bergner via Gcc-patches
On 7/10/23 2:18 PM, Carl Love wrote:
> +  /* Get the current FPSCR fields, bits 29:31 (DRN) and bits 56:63 (VE, OE, 
> UE,
> +  ZE, XE, NI, RN) from the FPSCR and return them.  */

The 'Z' above should line up directly under the 'G' in Get.


> -  /* Insert new RN mode into FSCPR.  */
> -  emit_insn (gen_rs6000_mffs (tmp_df));
> -  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> -  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> -  emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
> +  /* Insert the new RN value from tmp_rn into FPSCR bit [62:63].  */
> +  emit_insn (gen_anddi3 (tmp_di1, tmp_di2, GEN_INT (-4)));
> +  emit_insn (gen_iordi3 (tmp_di1, tmp_di1, tmp_rn));

This is an expander, so you shouldn't reuse temporaries as multiple
destination pseudos, since that limits the register allocator's freedom.
I know the old code did it, but since you're changing the line, you
might as well use a new temp.


I cannot approve it, but it LGTM with those fixed.

Peter




Re: [PATCH, OBVIOUS] rs6000: Remove redundant MEM_P predicate usage

2023-07-10 Thread Peter Bergner via Gcc-patches
On 7/10/23 11:47 AM, Peter Bergner wrote:
> While helping someone on the team debug an issue, I noticed some redundant
> tests in a couple of our predicates which can be removed.  I'm going to
> commit the following as obvious once bootstrap and regtesting come back
> clean.
> 
> Peter
> 
> 
> rs6000: Remove redundant MEM_P predicate usage
> 
> The quad_memory_operand and vsx_quad_dform_memory_operand predicates contain
> a (match_code "mem") test, making their MEM_P usage redundant.  Remove them.
> 
> gcc/
>   * config/rs6000/predicates.md (quad_memory_operand): Remove redundant
>   MEM_P usage.
>   (vsx_quad_dform_memory_operand): Likewise.

Testing was clean as expected.  Pushed to trunk.

Peter



Re: [PATCH] rs6000: Remove redundant initialization [PR106907]

2023-07-10 Thread Peter Bergner via Gcc-patches
On 6/29/23 4:31 AM, Kewen.Lin via Gcc-patches wrote:
> This is okay for trunk (no backports needed btw), this fix can even be
> taken as obvious, thanks!
> 
>>
>> 2023-06-07  Jeevitha Palanisamy  
>>
>> gcc/
>>  PR target/106907
> 
> One curious question is that this PR106907 seemed not to report this issue,
> is there another PR reporting this?  Or do I miss something?

I think Jeevitha just ran cppcheck by hand and noticed the "new" warnings
and added them to the list of things to fixup.  Yeah, it would be nice to
add the new warnings to the PR for historical reasons.

Peter





[PATCH] rs6000: Accept const pointer operands for MMA builtins [PR109073]

2023-03-08 Thread Peter Bergner via Gcc-patches
PR109073 shows a problem where GCC 11 and GCC 10 do not accept a const
__vector_pair pointer operand to some MMA builtins, which GCC 12 and later
correctly accept.  Fixed here by initializing the builtins to accept const
pointers.

This patch was tested in both GCC 11 and GCC 10 on powerpc64le-linux and
showed no regressions.  Ok for backports?

Peter


gcc/

PR target/109073
* config/rs6000/rs6000-call.c (mma_init_builtins): Accept const pointer
operands for lxvp, stxvp and disassemble builtins.

gcc/testsuite/

PR target/109073
* gcc.target/powerpc/mma-builtin-4.c): New const * test. Update
expected instruction counts.
* gcc.target/powerpc/mma-builtin-5.c: Likewise.
* gcc.target/powerpc/mma-builtin-7.c: Likewise.


diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 1be4797e834..3b6d40f0aef 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -14343,22 +14343,30 @@ mma_init_builtins (void)
{
  op[nopnds++] = build_pointer_type (void_type_node);
  if (d->code == MMA_BUILTIN_DISASSEMBLE_ACC)
-   op[nopnds++] = build_pointer_type (vector_quad_type_node);
+   op[nopnds++] = build_pointer_type (build_qualified_type
+(vector_quad_type_node,
+ TYPE_QUAL_CONST));
  else
-   op[nopnds++] = build_pointer_type (vector_pair_type_node);
+   op[nopnds++] = build_pointer_type (build_qualified_type
+(vector_pair_type_node,
+ TYPE_QUAL_CONST));
}
   else if (d->code == VSX_BUILTIN_LXVP)
{
  op[nopnds++] = vector_pair_type_node;
  op[nopnds++] = sizetype;
- op[nopnds++] = build_pointer_type (vector_pair_type_node);
+ op[nopnds++] = build_pointer_type (build_qualified_type
+  (vector_pair_type_node,
+   TYPE_QUAL_CONST));
}
   else if (d->code == VSX_BUILTIN_STXVP)
{
  op[nopnds++] = void_type_node;
  op[nopnds++] = vector_pair_type_node;
  op[nopnds++] = sizetype;
- op[nopnds++] = build_pointer_type (vector_pair_type_node);
+ op[nopnds++] = build_pointer_type (build_qualified_type
+  (vector_pair_type_node,
+   TYPE_QUAL_CONST));
}
   else
{
diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-4.c 
b/gcc/testsuite/gcc.target/powerpc/mma-builtin-4.c
index a9fb0107d12..0ba650fcee7 100644
--- a/gcc/testsuite/gcc.target/powerpc/mma-builtin-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-4.c
@@ -46,6 +46,15 @@ bar2 (vec_t *dst, __vector_pair *src)
   dst[4] = res[1];
 }
 
+void
+bar3 (vec_t *dst, const __vector_pair *src)
+{
+  vec_t res[2];
+  __builtin_vsx_disassemble_pair (res, src);
+  dst[0] = res[0];
+  dst[4] = res[1];
+}
+
 #if !__has_builtin (__builtin_vsx_assemble_pair)
 #  error "__has_builtin (__builtin_vsx_assemble_pair) failed"
 #endif
@@ -67,7 +76,7 @@ bar2 (vec_t *dst, __vector_pair *src)
 #endif
 
 /* { dg-final { scan-assembler-times {\mlxv\M} 6 } } */
-/* { dg-final { scan-assembler-times {\mlxvp\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mstxv\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mlxvp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mstxv\M} 6 } } */
 /* { dg-final { scan-assembler-times {\mstxvp\M} 3 } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-5.c 
b/gcc/testsuite/gcc.target/powerpc/mma-builtin-5.c
index 00503b7343d..998c436a8bb 100644
--- a/gcc/testsuite/gcc.target/powerpc/mma-builtin-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-5.c
@@ -31,6 +31,17 @@ bar (vec_t *dst, __vector_quad *src)
   dst[12] = res[3];
 }
 
+void
+bar2 (vec_t *dst, const __vector_quad *src)
+{
+  vec_t res[4];
+  __builtin_mma_disassemble_acc (res, src);
+  dst[0] = res[0];
+  dst[4] = res[1];
+  dst[8] = res[2];
+  dst[12] = res[3];
+}
+
 #if !__has_builtin (__builtin_mma_assemble_acc)
 #  error "__has_builtin (__builtin_mma_assemble_acc) failed"
 #endif
@@ -40,8 +51,8 @@ bar (vec_t *dst, __vector_quad *src)
 #endif
 
 /* { dg-final { scan-assembler-times {\mlxv\M} 8 } } */
-/* { dg-final { scan-assembler-times {\mlxvp\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mstxv\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mlxvp\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mstxv\M} 8 } } */
 /* { dg-final { scan-assembler-times {\mstxvp\M} 4 } } */
-/* { dg-final { scan-assembler-times {\mxxmfacc\M} 3 } } */
-/* { dg-final { scan-assembler-times {\mxxmtacc\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxxmfacc\M} 4 } } */
+/* { dg-final { scan-assembler-times {\m

Re: [PATCH] rs6000: Accept const pointer operands for MMA builtins [PR109073]

2023-03-09 Thread Peter Bergner via Gcc-patches
On 3/9/23 8:55 AM, Segher Boessenkool wrote:
> On Thu, Mar 09, 2023 at 05:30:53PM +0800, Kewen.Lin wrote:
>> on 2023/3/9 07:01, Peter Bergner via Gcc-patches wrote:
>>> This patch was tested in both GCC 11 and GCC 10 on powerpc64le-linux and
>>> showed no regressions.  Ok for backports?
> 
> It isn't truly a backport. You can put it on 11 and 10 at the same time,
> there is no benefit doing it on 11 only first.

Correct.  I just meant that they're targeted at the two release branches
and not trunk.



>>>   op[nopnds++] = build_pointer_type (void_type_node);
>>>   if (d->code == MMA_BUILTIN_DISASSEMBLE_ACC)
>>> -   op[nopnds++] = build_pointer_type (vector_quad_type_node);
>>> +   op[nopnds++] = build_pointer_type (build_qualified_type
>>> +(vector_quad_type_node,
>>> + TYPE_QUAL_CONST));
>>
>> Nit: Maybe we can build them out of the loop once and then just use the
>> built one in the loop.
> 
> Or as globals even.  Currently we have X and pointer to X, but no
> pointer to const X (and no const X either, but that isn't so useful).
> 
> The generic code doesn't have this either, hrm.

I can have a look at that, but was trying to keep the change as small
as possible.  Especially since we're not trying to create code that
will be "easier" to maintain in the future, because this is all changed
in GCC12 with Bill's builtin re-write.



>> Simply testing __builtin_mma_xxmtacc and __builtin_mma_xxmfacc as below:
>>
>> $ cat test.C
>> void foo0(const __vector_quad *acc) {
>>   __builtin_mma_xxmtacc(acc);
>>   __builtin_mma_xxmfacc(acc);
>> }
>>
>> test.C:2:25: error: invalid conversion from ‘const __vector_quad*’ to 
>> ‘__vector_quad*’ [-fpermissive]
>> 2 |   __builtin_mma_xxmtacc(acc);
>>
>> test.C:3:25: error: invalid conversion from ‘const __vector_quad*’ to 
>> ‘__vector_quad*’ [-fpermissive]
>> 3 |   __builtin_mma_xxmfacc(acc);
>>
>> They also suffered the same error on gcc11 branch but not on trunk.
> 
> Yeah, there is more to be done here.

Well I'm sure there are non-MMA builtins that have the same issue.
I was just fixing the ones Chip ran into and similar builtins.
I don't think we want to go and make everything work like it does
on trunk, especially when no one has complained about hitting
them.

As for the __builtin_mma_xxm[ft]acc() errors, I'm not sure any actual
code will ever hit this.  All realistic examples declare a __vector_quad
var and the pointer passed to the builtin comes from doing &var as the
operand.  Clearly we cannot have a "const __vector_quad var;" and
use that in the builtins.


>> Besides, I'm not sure if the existing bif declarations using 
>> ptr_vector_pair_type_node
>> and ptr_vector_quad_type_node are all intentional, at least it looks weird 
>> to me that
>> we declare const __vector_pair* for this __builtin_vsx_stxvp, which is meant 
>> to store 32
>> bytes into the memory provided by the pointer biasing the sizetype offset, 
>> but the "const"
>> qualifier seems to tell that this bif doesn't modify the memory pointed by 
>> the given pointer.

I'm not a language lawyer and I don't play one on TV.  What we're accepting
here, is a pointer with a "const" value that points to non-const memory.
I'll double check the trunk code, but I don't think it allows (and we don't
want it to) using a pointer (const or non-const) that points to a const memory
...at least for the stxvp builtin.


> Since the patch is a strict improvement already, it is okay for 11 and
> 10.  But you (Peter) may want to flesh it out a bit first?  Or first
> commit only this if that works better for you.

I'll see about making some of the changes above and then I'll report back.

Peter



Re: [PATCH] libffi: fix handling of homogeneous float128 structs [PR109447]

2023-05-09 Thread Peter Bergner via Gcc-patches
On 5/5/23 4:42 PM, Jakub Jelinek wrote:
> On Thu, May 04, 2023 at 02:29:34PM -0500, Peter Bergner wrote:
>> Merged from upstream libffi commit: 464b4b66e3cf3b5489e730c1466ee1bf825560e0
>>
>> 2023-05-03  Dan Horák 
>>
>> libffi/
>>  PR libffi/109447
>>  * src/powerpc/ffi_linux64.c (ffi_prep_args64): Update arg.f128 pointer.
> 
> Ok for 14/13.2/12.4 (i.e. after 12.3 is out)/11.4

Thanks, I've pushed the GCC trunk and GCC 13 commits and now that GCC 12.3
is released, I have pushed the GCC 12 backport too.

I have yet to push to GCC 11 yet, due to bootstrap is broken when building
GCC 11 on our Fedora 36 system, so I cannot test there yet.  It also seems
GCC 11 is missing some IEEE128 changes from upstream libffi that GCC 12 and
later have, so it might not even be appropriate, but I'll wait for bootstrap
to be restored before making any decisions.  The problem seem to be that the
system ld.gold which is used to link libgo.so is dying with an undefined
runtime symbol:

/usr/bin/ld.gold: 
/home/bergner/gcc/build/gcc-fsf-11-baselib-regtest/powerpc64le-linux/libstdc++-v3/src/.libs/libstdc++.so.6:
 version `GLIBCXX_3.4.30' not found (required by /usr/bin/ld.gold)
collect2: error: ld returned 1 exit status

Running the link command by hand or via a make in powerpc64le-linux/libgo,
the link succeeds.  It's only when I type make in the top level build dir
where I see this error.  It's almost as if the top level build machinery
adds a LD_LIBRARY_PATH=... forcing the system ld.gold (which was built
with a gcc12 based compiler) to pick up the build's gcc11 libstdc++ which
doesn't have that symbol, rather than the gcc12 system libstdc++.  Has anyone
seen anything like that before?

Peter



Re: [PATCH] libffi: fix handling of homogeneous float128 structs [PR109447]

2023-05-09 Thread Peter Bergner via Gcc-patches
On 5/9/23 3:50 PM, Andreas Schwab wrote:
> On Mai 09 2023, Peter Bergner via Gcc-patches wrote:
> 
>> It's almost as if the top level build machinery
>> adds a LD_LIBRARY_PATH=...
> 
> See how the toplevel Makefile sets LD_LIBRARY_PATH (via RPATH_ENVVAR) if
> gcc-bootstrap is set.

I'm sorry to be dense, but can you point to the specific line?  In my
$GCC_BUILD/Makefile, the only mention of LD_LIBRARY_PATH is:

  RPATH_ENVVAR = LD_LIBRARY_PATH

...so that isn't setting LD_LIBRARY_PATH, but using it.

Peter





Re: [PATCH] libffi: fix handling of homogeneous float128 structs [PR109447]

2023-05-13 Thread Peter Bergner via Gcc-patches
On 5/10/23 2:34 AM, Andreas Schwab wrote:
> On Mai 09 2023, Peter Bergner via Gcc-patches wrote:
>> I'm sorry to be dense, but can you point to the specific line?  In my
>> $GCC_BUILD/Makefile, the only mention of LD_LIBRARY_PATH is:
>>
>>   RPATH_ENVVAR = LD_LIBRARY_PATH
>>
>> ...so that isn't setting LD_LIBRARY_PATH, but using it.
> 
> Have you considered searching for uses of RPATH_ENVVAR?

Ah, I misread that as RPATH_ENVVAR = $LD_LIBRARY_PATH and was
expecting to see "export LD_LIBRARY_PATH=..." in the code.
Thanks for the pointer!

Peter




Re: [PATCH] rs6000: Update powerpc test fold-vec-extract-int.p8.c

2023-05-18 Thread Peter Bergner via Gcc-patches
On 5/18/23 6:16 AM, Ajit Agarwal via Gcc-patches wrote:
> -/* { dg-final { scan-assembler-times {\mrldicl\M} 7 { target { le } } } } */
> +/* { dg-final { scan-assembler-times {\mrldicl\M} 5 { target { le } } } } */
>  /* { dg-final { scan-assembler-times {\mrldicl\M} 4 { target { lp64 && be } 
> } } } */

Can you please check whether the big-endian count needs updating too?
Thanks.

Peter




Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition

2023-05-18 Thread Peter Bergner via Gcc-patches
On 5/10/23 1:06 PM, Carl Love wrote:
> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
>  TR_STXVRHX vsx_stxvrhx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
>  TR_STXVRWX vsx_stxvrwx {stvec}

In my estimation, these two changes are "obvious" fixes.




> +  void __builtin_vec_xst_trunc (vsq, signed long long, signed long *);
> +TR_STXVRLX  TR_STXVRLX_S
> +  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long *);
> +TR_STXVRLX  TR_STXVRLX_U

Not a comment on these two changes, and not a request to expand this
specific patch, but I believe I saw other built-ins that were missing
signed long */unsigned long * versions where they could/should accept
them.  Can you double-check whether there are other built-ins that
need similar changes and if so, please post a separate patch to fix
those as well?  Thanks.

Peter



Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions

2023-05-23 Thread Peter Bergner via Gcc-patches
On 5/23/23 12:24 AM, Kewen.Lin wrote:
> on 2023/5/23 01:31, Carl Love wrote:
>> The builtins were requested for use in GLibC.  As of version 2.31 they
>> were added as inline asm.  They requested a builtin so the asm could be
>> removed.
>
> So IMHO we also want the similar support for mffscrn, that is to make
> use of mffscrn and mffscrni on Power9 and later, but falls back to 
> __builtin_set_fpscr_rn + mffs similar on older platforms.

So __builtin_set_fpscr_rn everything we want (sets the RN bits) and
uses mffscrn/mffscrni on P9 and later and uses older insns on pre-P9.
The only problem is we don't return the current FPSCR bits, as the bif
is defined to return void.  Crazy idea, but could we extend the built-in
with an overload that returns the FPSCR bits?  To be honest, I like
the __builtin_set_fpscr_rn name better than __builtin_mffscrn[i].
The built-in machinery can see that the usage is expecting a return value
or not and for the pre-P9 code, can skip generating the ending mffs if
we don't want the return value.

Peter




Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions

2023-05-24 Thread Peter Bergner via Gcc-patches
On 5/24/23 10:20 AM, Carl Love wrote:
> Extending the builtin to pre Power 9 is straight forward and I agree
> would make good sense to do.
> 
> I am a bit concerned on how to extend __builtin_set_fpscr_rn to add the
> new functionality.  Peter suggests overloading the builtin to either
> return void or returns FPSCR bits.  It is my understanding that the
> return value for a given builtin had to be the same, i.e. you can't
> overload the return value. Maybe you can with Bill's new
> infrastructure?  I recall having problems trying to overload the return
> value in the past and Bill said you couldn't do it.  I play with this
> and see if I can overload the return value.

In this case, I don't think we need a built-in overload, but just change
the current built-in to return a double rather than void.  All of the
old code should still work, since they'll just ignore the return
value.  As I said, the built-in machinery can see whether we're assigning
the built-in return value to a variable or not, ie, the difference between

  __builtin_set_fpscr_rn ();

versus:

  foo = __builtin_set_fpscr_rn ();

In the former case, the built-in can expand exactly as how it does now.
In the later case, we'll use the target rtx we're passed in as the
destination of the mffscrn[i] insn for P9/10 and for pre-P9, we'll
use the target for an additional mffs instruction which we don't
generate now.  Note we'd only generate the mffs when we're passed in
a target rtx as in the second case.  The first case we won't.

This is all assuming Segher is fine with the change this way.  If we do
go this way, I would recommend adding a predefined macro that users can
test for to know whether the built-in returns a value or not.

If Segher doesn't like this idea, then it's all moot! :-)

Peter


Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Peter Bergner via Gcc-patches
On 1/28/21 1:47 PM, Segher Boessenkool wrote:
> On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote:
>> The second patch I want you to review is:
> 
> "This patch replaces the following three patches:"
> 
> Please send a patch that modifies *current* code, and that is *tested*
> with that.  With a good explanation, and a commit message for *that*
> patch (not for other, older patches).

Isn't the email Mike linked to which was submitted yesterday exactly this?
Admittedly, it doesn't have a "commit message header", but it does seem to
be against current code and he did say he tested it and there is some
explanation given, just not in a commit message form.  Is it just a nice
commit message you're looking for now?


Peter





[PATCH, rs6000, expand, hooks]: Fix PR98872, handle uninitialized opaque mode variables

2021-02-04 Thread Peter Bergner via Gcc-patches
Adding Richard since he's reviewed the generic opaque mode code in
the past and this patch contains some more eneric support.

GCC handles pseudos that are used uninitialized, by emitting a
(set (reg: ) CONST0_RTX(regmode)) before their uninitialized
pseudo usage.  Currently, CONST0_RTX(mode) is NULL for opaque modes,
which leads to an ICE.  The solution is to enhance init_emit_once() to
add initialization of CONST0_RTX for opaque modes using a target hook,
since CONST0_RTX probably are different for each opaque mode and each
target.  The default hook throws an error to force the target to think
hard about what their CONST0_RTX values should be for each mode.

This passed bootstrap and regtesting on x86_64-linux and powerpc64le-linux
with no regressions.  Ok for mainline?

Peter


gcc/
PR target/98872
* config/rs6000/mma.md (*movoo): Accept zero constraint.
(mma_xxsetaccz): Use CONST0_RTX.
* config/rs6000/predicates.md: Recognize OOmode CONST0_RTX.
* config/rs6000/rs6000.c (TARGET_OPAQUE_CONST0_RTX): Define.
(rs6000_split_multireg_move): Handle splitting an OOmode register
set to CONST0_RTX.
(rs6000_opaque_const0_rtx): New function.
* emit-rtl.c (init_emit_once): Initialize CONST0_RTX for opaque modes.
* hooks.c (hook_rtx_mode_unreachable): New function.
* hooks.h (hook_rtx_mode_unreachable): Declare
* target.def (opaque_const0_rtx): New target hook.
* doc/tm.texi.in: Document it.
* doc/tm.texi: Regenerate.

gcc/testsuite/
* gcc.target/powerpc/pr98872.c: New test.

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 87569f1c31d..fab849a4f12 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -272,7 +272,7 @@
 
 (define_insn_and_split "*movoo"
   [(set (match_operand:OO 0 "nonimmediate_operand" "=wa,m,wa")
-   (match_operand:OO 1 "input_operand" "m,wa,wa"))]
+   (match_operand:OO 1 "input_operand" "m,wa,waO"))]
   "TARGET_MMA
&& (gpc_reg_operand (operands[0], OOmode)
|| gpc_reg_operand (operands[1], OOmode))"
@@ -473,9 +473,7 @@
(const_int 0))]
   "TARGET_MMA"
 {
-  rtx xo0 = gen_rtx_UNSPEC (XOmode, gen_rtvec (1, const0_rtx),
-   UNSPEC_MMA_XXSETACCZ);
-  emit_insn (gen_rtx_SET (operands[0], xo0));
+  emit_insn (gen_rtx_SET (operands[0], CONST0_RTX (XOmode)));
   DONE;
 })
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 5d1952e59d3..30805ab0619 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1081,6 +1081,10 @@
   && easy_vector_constant (op, mode))
 return 1;
 
+  /* For OOmode, zero is an easy constant.  */
+  if (mode == OOmode && op == CONST0_RTX (mode))
+return 1;
+
   /* For floating-point or multi-word mode, the only remaining valid type
  is a register.  */
   if (SCALAR_FLOAT_MODE_P (mode)
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index f5565a1a253..c726aa09d26 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1752,6 +1752,10 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 
 #undef TARGET_INVALID_CONVERSION
 #define TARGET_INVALID_CONVERSION rs6000_invalid_conversion
+
+#undef TARGET_OPAQUE_CONST0_RTX
+#define TARGET_OPAQUE_CONST0_RTX rs6000_opaque_const0_rtx
+
 
 
 /* Processor table.  */
@@ -16624,6 +16628,19 @@ rs6000_split_multireg_move (rtx dst, rtx src)
  return;
}
 
+  /* Split the clearing of an OOmode register pair into clearing
+of its two constituent registers.  */
+  if (REG_P (dst) && mode == OOmode && src == CONST0_RTX (mode))
+   {
+ int regno = REGNO (dst);
+ gcc_assert (VSX_REGNO_P (regno));
+ emit_insn (gen_rtx_SET (gen_rtx_REG (reg_mode, regno),
+ CONST0_RTX (reg_mode)));
+ emit_insn (gen_rtx_SET (gen_rtx_REG (reg_mode, regno + 1),
+ CONST0_RTX (reg_mode)));
+ return;
+   }
+
   /* Register -> register moves can use common code.  */
 }
 
@@ -27477,6 +27494,25 @@ rs6000_output_addr_vec_elt (FILE *file, int value)
   fprintf (file, "\n");
 }
 
+/* Implement TARGET_OPAQUE_CONST0_RTX.  */
+
+rtx
+rs6000_opaque_const0_rtx (machine_mode mode)
+{
+  gcc_assert (OPAQUE_MODE_P (mode));
+
+  switch (mode)
+{
+case E_OOmode:
+  return const0_rtx;
+case E_XOmode:
+  return gen_rtx_UNSPEC (XOmode, gen_rtvec (1, const0_rtx),
+UNSPEC_MMA_XXSETACCZ);
+default:
+  gcc_unreachable ();
+}
+}
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-rs6000.h"
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index b23284e5e56..4ac26491481 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -6408,6 +6408,11 @@ init_emit_once (void)
 if (GET_MODE_CLASS ((machine_mode) i) == MODE_CC)
   const_tiny_rtx[0][i] = co

[PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-04 Thread Peter Bergner via Gcc-patches
The LLVM and GCC teams agreed to rename the __builtin_mma_assemble_pair and
__builtin_mma_disassemble_pair built-ins to __builtin_vsx_assemble_pair and
__builtin_vsx_disassemble_pair respectively.  It's too late to remove the
old names, so this patch adds support for creating compatibility built-ins
(ie, multiple built-in functions generate the same code) and then creates
compatibility built-ins using the new names.

This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
Ok for mainline?

This will need backporting to GCC 10.  Ok there too once it's baked on
trunk for a little while?

Peter


gcc/
* gcc/config/rs6000/rs6000-builtin.def (BU_COMPAT): Add support macro
for defining compatibility built-ins.
(vsx_assemble_pair): Add compatibility built-in.
* gcc/config/rs6000/rs6000-call.c (struct builtin_compatibility): New.
(bdesc_compat): New.
(RS6000_BUILTIN_COMPAT): Define.
(rs6000_init_builtins): Register compatibility built-ins.
* gcc/testsuite/gcc.target/powerpc/mma-builtin-4.c: Add tests for
__builtin_vsx_assemble_pair and __builtin_vsx_disassemble_pair.
Update expected instruction counts.


diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 058a32abf4c..268f5d5b52b 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -43,6 +43,10 @@
ATTRbuiltin attribute information.
ICODE   Insn code of the function that implements the builtin.  */
 
+#ifndef RS6000_BUILTIN_COMPAT
+  #undef BU_COMPAT
+  #define BU_COMPAT(ENUM, COMPAT_NAME)
+
 #ifndef RS6000_BUILTIN_0
   #error "RS6000_BUILTIN_0 is not defined."
 #endif
@@ -87,6 +91,36 @@
   #error "RS6000_BUILTIN_X is not defined."
 #endif
 
+#else
+  /* Compatibility builtins.  These builtins are simply mapped into
+ their compatible builtin function identified by ENUM.  */
+  #undef BU_COMPAT
+  #define BU_COMPAT(ENUM, COMPAT_NAME) { ENUM, "__builtin_" COMPAT_NAME },
+
+  #undef RS6000_BUILTIN_0
+  #undef RS6000_BUILTIN_1
+  #undef RS6000_BUILTIN_2
+  #undef RS6000_BUILTIN_3
+  #undef RS6000_BUILTIN_4
+  #undef RS6000_BUILTIN_A
+  #undef RS6000_BUILTIN_D
+  #undef RS6000_BUILTIN_H
+  #undef RS6000_BUILTIN_M
+  #undef RS6000_BUILTIN_P
+  #undef RS6000_BUILTIN_X
+  #define RS6000_BUILTIN_0(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_1(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_2(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_3(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_4(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_A(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_D(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_H(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_M(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_P(ENUM, NAME, MASK, ATTR, ICODE)
+  #define RS6000_BUILTIN_X(ENUM, NAME, MASK, ATTR, ICODE)
+#endif
+
 #ifndef BU_AV_1
 /* Define convenience macros using token pasting to allow fitting everything in
one line.  */
@@ -3137,8 +3171,10 @@ BU_MMA_1 (XXSETACCZ, "xxsetaccz",MISC, 
mma_xxsetaccz)
 
 BU_MMA_2 (DISASSEMBLE_ACC, "disassemble_acc",  QUAD, mma_disassemble_acc)
 BU_MMA_2 (DISASSEMBLE_PAIR,"disassemble_pair", PAIR, mma_disassemble_pair)
+BU_COMPAT (MMA_BUILTIN_DISASSEMBLE_PAIR, "vsx_disassemble_pair")
 
 BU_MMA_3 (ASSEMBLE_PAIR,"assemble_pair",   MISC, mma_assemble_pair)
+BU_COMPAT (MMA_BUILTIN_ASSEMBLE_PAIR, "vsx_assemble_pair")
 BU_MMA_3 (XVBF16GER2,  "xvbf16ger2",   MISC, mma_xvbf16ger2)
 BU_MMA_3 (XVF16GER2,   "xvf16ger2",MISC, mma_xvf16ger2)
 BU_MMA_3 (XVF32GER,"xvf32ger", MISC, mma_xvf32ger)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index ae0c761f0a4..240533dec55 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -89,6 +89,12 @@
 #define TARGET_NO_PROTOTYPE 0
 #endif
 
+struct builtin_compatibility
+{
+  const enum rs6000_builtins code;
+  const char *const name;
+};
+
 struct builtin_description
 {
   const HOST_WIDE_INT mask;
@@ -8839,6 +8845,13 @@ def_builtin (const char *name, tree type, enum 
rs6000_builtins code)
 (int)code, name, attr_string);
 }
 
+static const struct builtin_compatibility bdesc_compat[] =
+{
+#define RS6000_BUILTIN_COMPAT
+#include "rs6000-builtin.def"
+};
+#undef RS6000_BUILTIN_COMPAT
+
 /* Simple ternary operations: VECd = foo (VECa, VECb, VECc).  */
 
 #undef RS6000_BUILTIN_0
@@ -13445,6 +13458,18 @@ rs6000_init_builtins (void)
 #ifdef SUBTARGET_INIT_BUILTINS
   SUBTARGET_INIT_BUILTINS;
 #endif
+
+  /* Register the compatibility builtins after all of the normal
+ builtins have been defined.  */
+  const struct builtin_compatibility *d = bdesc_compat;
+  unsigned i;
+  for (i = 0; i < ARRAY_SIZE (bdesc_compat); i++, d++)
+{
+  tree decl = rs6000_builtin_decls[(

Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-04 Thread Peter Bergner via Gcc-patches
On 2/4/21 3:16 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Feb 04, 2021 at 02:40:20PM -0600, Peter Bergner wrote:
>> The LLVM and GCC teams agreed to rename the __builtin_mma_assemble_pair and
>> __builtin_mma_disassemble_pair built-ins to __builtin_vsx_assemble_pair and
>> __builtin_vsx_disassemble_pair respectively.  It's too late to remove the
>> old names, so this patch adds support for creating compatibility built-ins
>> (ie, multiple built-in functions generate the same code) and then creates
>> compatibility built-ins using the new names.
> 
> This needs a documentation update.  You can just delete the old names
> there (i.e. rename everything there).

Good catch!  I had meant to do that and totally forgot! :-(


>> +#ifndef RS6000_BUILTIN_COMPAT
>> +  #undef BU_COMPAT
>> +  #define BU_COMPAT(ENUM, COMPAT_NAME)
> 
> Please do not do #undef unless necessary: it hides bugs (and that causes
> more bugs).

I thought it was needed, since rs6000-builtin.def is #included into
rs6000-call.c and rs6000.h multiple times.  The first 11 times, we must
make sure it expands to nothing/empty string and on the 12th time, it
expands to create the bdesc_compat array.  However, it looks like if you
redefine it to be the same value each time, then you do not need to
#undef it, so it looks like just the last usage when we create the
bdesc_compat array will we need to #undef it.  I'll give that a try
and see how it works out.



>>  BU_MMA_3 (ASSEMBLE_PAIR,"assemble_pair",MISC, mma_assemble_pair)
>> +BU_COMPAT (MMA_BUILTIN_ASSEMBLE_PAIR, "vsx_assemble_pair")
> 
> You should do those the other way around (the mma_ one is the
> compatibility one).  This matters, because if you disable the
> compatibility builtins the vsx_ one should still be there, but not the
> old name.  (It also makes more sense of course).

I actually did it that way initially, but decided to do it this was
just to make the patch smaller.  You are correct that it's "more"
correct to rename it though. 

Peter





Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-05 Thread Peter Bergner via Gcc-patches
On 2/5/21 4:28 AM, Florian Weimer wrote:
> * Peter Bergner via Gcc-patches:
> 
>> The LLVM and GCC teams agreed to rename the __builtin_mma_assemble_pair and
>> __builtin_mma_disassemble_pair built-ins to __builtin_vsx_assemble_pair and
>> __builtin_vsx_disassemble_pair respectively.  It's too late to remove the
>> old names, so this patch adds support for creating compatibility built-ins
>> (ie, multiple built-in functions generate the same code) and then creates
>> compatibility built-ins using the new names.
> 
> Maybe add a check that the compatibility builtins are flagged as
> availble using __has_builtin?

Do you mean add a test in the testsuite for this?  I can check on
adding that to the test case.

Peter




[PATCH, pushed] rs6000: Fix invalid address used in MMA built-in function

2021-02-11 Thread Peter Bergner via Gcc-patches
The mma_assemble_input_operand predicate is too lenient on the memory
operands it will accept, leading to an ICE when illegitimate addresses
are passed in.  The solution is to only accept memory operands with
addresses that are valid for quad word memory accesses.  The test case
is a minimized test case from the Eigen library.  The creduced test case
is very noisy with respect to warnings, so the test case has added -w to
silence them.

Thanks to Jakub for diagnosing the bug!

This passed bootstrap and regtesting with no regressions on powerpc64le-linux.
Approved offline by Segher.  Committed and pushed.

Peter


gcc/
PR target/99041
* config/rs6000/predicates.md (mma_assemble_input_operand): Restrict
memory addresses that are legal for quad word accesses.

gcc/testsuite/
PR target/99041
* g++.target/powerpc/pr99041.C: New test.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 76328ecff3d..bd26c62b3a4 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1156,7 +1156,9 @@
 ;; Return 1 if this operand is valid for a MMA assemble accumulator insn.
 (define_special_predicate "mma_assemble_input_operand"
   (match_test "(mode == V16QImode
-   && (vsx_register_operand (op, mode) || MEM_P (op)))"))
+   && (vsx_register_operand (op, mode)
+   || (MEM_P (op)
+   && quad_address_p (XEXP (op, 0), mode, false"))
 
 ;; Return 1 if this operand is valid for an MMA disassemble insn.
 (define_predicate "mma_disassemble_output_operand"
diff --git a/gcc/testsuite/g++.target/powerpc/pr99041.C 
b/gcc/testsuite/g++.target/powerpc/pr99041.C
new file mode 100644
index 000..c83f9806570
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr99041.C
@@ -0,0 +1,84 @@
+/* PR target/99041 */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -w" } */
+
+/* Verify we do not ICE on the following source.  */
+
+long a, b, c, d, e;
+ enum { f };
+ enum { g, aa };
+ enum { h };
+ template < typename > struct ai;
+ template < typename > struct ad;
+ template < typename, int, int ah, int = 0, int = f, int = ah > class ab;
+ template < typename > class ar;
+ template < typename > class ak;
+ template < typename, int, int = 1, bool = false > class aj;
+ template < typename > class an;
+ template < typename, typename > class al;
+ template < typename, typename, int = h > class am;
+ template < typename, unsigned > class ao;
+ template < typename > struct ap;
+ template < typename aq > struct av { typedef ar< aq > ac; };
+ template < typename > struct as {   typedef av< am< al< int, an< aj< ab< 
double, 5, 2 >, false > > >, int > >::ac   ac; };
+ template < typename at > at au(const typename ap< at >::ac *);
+ template < typename at, int > at cc(typename ap< at >::ac *aw) {   return au< 
at >(aw); }
+ typedef __attribute__((altivec(vector__))) double ax;
+ template <> struct ap< ax > { typedef double ac; };
+ template <> ax au(const double *aw) { return __builtin_vec_vsx_ld(0, aw); }
+ template < typename > struct ay {};
+ template < typename aq > class ae : public ad< aq > { public:   typedef 
typename ai< aq >::ba ba; };
+ template < typename aq > class ar : public ae< aq > { public:   ak< aq > 
bk(); };
+ template < typename > struct ad {};
+ template < int > class bc;
+ template < typename bd, typename bf, int bg > struct ai< am< bd, bf, bg > > { 
  typedef typename bd::ba ba; };
+ template < typename bh, typename bj, int bg > class am : public bc< bg > { 
public:   bh bu();   bj k(); };
+ template < int bg > class bc : public as< am< int, int, bg > >::ac {};
+ template < typename, typename, typename > struct l;
+ template < typename m, typename bl, typename ag > void n(m bm, bl bn, ag bo) 
{   l< m, bl, ag >::bz(bm, bn, bo); }
+ template < typename, typename, typename, typename > struct bp;
+ class o { public:   o(double *, long);   double &operator()(long i, long j) { 
return p[aa ? j + i * w : w]; }   template < typename q, int t > q br(long i, 
long j) { double &cl = operator()(i, j); return cc< q, t >(&cl);   }   
double *p;   long w; };
+ class bt : public o { public:   bt(double *cf, long bv) : o(cf, bv) {} };
+ struct bw {   enum { bx }; };
+ template < typename bq > class ak { public:   template < typename by > void 
operator=(by) { am< al< const an< const aj< aj< ab< double, 5, 5, 2 >, -1, 
-1 >, -1 > >, const an< const aj< aj< ab< double, 5, 5, 2 >, -1, -1 
>, -1 > > >, ao< aj< ab< double, 5, 0 >, 1 >, 0 > > ck; 
n(ca, ck, ay< typename by::ba >());   }   bq ca; };
+ template < typename aq > class dn : public av< aq >::ac {};
+ template < typename cd, int af, int ah, int ce, int co, int cg > struct ai< 
ab< cd, af, ah, ce, co, cg > > {   typedef cd ba;   enum { bs }; };
+ template < typename, int, int, int, int co, int cg > class ab : public dn< 
ab< int, co, cg 

rs6000: Fix invalid splits when using Altivec style addresses [PR98959]

2021-02-12 Thread Peter Bergner via Gcc-patches
The rs6000_emit_le_vsx_* functions assume they are not passed an Altivec
style "& ~16" address.  However, some of our expanders and splitters do
not verify we do not have an Altivec style address before calling those
functions, leading to an ICE.  The solution here is to guard the expanders
and splitters to ensure we do not call them if we're given an Altivec style
address.

This fixes the ICE.  Ok for mainline if my powerpc64le-linux regtesting
comes back clean? 

We'll want backports once this has time to bake on mainline for a while.
Ok there too assuming my regtests there are clean?

Peter


2021-02-12  Peter Bergner  

gcc/
PR target/98959
* config/rs6000/rs6000.c (rs6000_emit_le_vsx_permute): Add an assert
to ensure we do not have an Altivec style address.
* config/rs6000/vsx.md (*vsx_le_perm_load_): Disable if passed
an Altivec style address.
(*vsx_le_perm_store_): Likewise.
(splitters after *vsx_le_perm_store_): Likewise.
(vsx_load_): Disable special expander if passed an Altivec
style address.
(vsx_store_): Likewise.

gcc/testsuite/
PR target/98959
* gcc.target/powerpc/pr98959.c: New test.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ec068c58aa5..e147cbdb52f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -10059,6 +10059,11 @@ rs6000_const_vec (machine_mode mode)
 void
 rs6000_emit_le_vsx_permute (rtx dest, rtx source, machine_mode mode)
 {
+  if (MEM_P (dest))
+gcc_assert (!altivec_indexed_or_indirect_operand (dest, mode));
+  if (MEM_P (source))
+gcc_assert (!altivec_indexed_or_indirect_operand (source, mode));
+
   /* Scalar permutations are easier to express in integer modes rather than
  floating-point modes, so cast them here.  We use V1TImode instead
  of TImode to ensure that the values don't go through GPRs.  */
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 3e0518631df..f6fe88d3600 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -987,11 +987,13 @@ (define_insn_and_split "*vsx_le_undo_permute_"
 (define_insn_and_split "*vsx_le_perm_load_"
   [(set (match_operand:VSX_LE_128 0 "vsx_register_operand" "=wa,r")
 (match_operand:VSX_LE_128 1 "memory_operand" "Z,Q"))]
-  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
+  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
+   && !altivec_indexed_or_indirect_operand (operands[1], mode)"
   "@
#
#"
-  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
+  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
+   && !altivec_indexed_or_indirect_operand (operands[1], mode)"
   [(const_int 0)]
 {
   rtx tmp = (can_create_pseudo_p ()
@@ -1008,7 +1010,8 @@ (define_insn_and_split "*vsx_le_perm_load_"
 (define_insn "*vsx_le_perm_store_"
   [(set (match_operand:VSX_LE_128 0 "memory_operand" "=Z,Q")
 (match_operand:VSX_LE_128 1 "vsx_register_operand" "+wa,r"))]
-  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
+  "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
+   & !altivec_indexed_or_indirect_operand (operands[0], mode)"
   "@
#
#"
@@ -1019,7 +1022,8 @@ (define_insn "*vsx_le_perm_store_"
 (define_split
   [(set (match_operand:VSX_LE_128 0 "memory_operand")
 (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
-  "!BYTES_BIG_ENDIAN && TARGET_VSX && !reload_completed && !TARGET_P9_VECTOR"
+  "!BYTES_BIG_ENDIAN && TARGET_VSX && !reload_completed && !TARGET_P9_VECTOR
+   && !altivec_indexed_or_indirect_operand (operands[0], mode)"
   [(const_int 0)]
 {
   rtx tmp = (can_create_pseudo_p ()
@@ -1075,7 +1079,8 @@ (define_peephole2
 (define_split
   [(set (match_operand:VSX_LE_128 0 "memory_operand")
 (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
-  "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR"
+  "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
+   && !altivec_indexed_or_indirect_operand (operands[0], mode)"
   [(const_int 0)]
 {
   rs6000_emit_le_vsx_permute (operands[1], operands[1], mode);
@@ -1241,7 +1246,8 @@ (define_expand "vsx_load_"
   "VECTOR_MEM_VSX_P (mode)"
 {
   /* Expand to swaps if needed, prior to swap optimization.  */
-  if (!BYTES_BIG_ENDIAN && !TARGET_P9_VECTOR)
+  if (!BYTES_BIG_ENDIAN && !TARGET_P9_VECTOR
+  && !altivec_indexed_or_indirect_operand(operands[1], mode))
 {
   rs6000_emit_le_vsx_move (operands[0], operands[1], mode);
   DONE;
@@ -1254,7 +1260,8 @@ (define_expand "vsx_store_"
   "VECTOR_MEM_VSX_P (mode)"
 {
   /* Expand to swaps if needed, prior to swap optimization.  */
-  if (!BYTES_BIG_ENDIAN && !TARGET_P9_VECTOR)
+  if (!BYTES_BIG_ENDIAN && !TARGET_P9_VECTOR
+  && !altivec_indexed_or_indirect_operand(operands[0], mode))
 {
   rs6000_emit_le_vsx_move (operands[0], operands[1], mode);
   DONE;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr98959

rtl-optimization: Fix uninitialized use of opaque mode variable ICE [PR98872]

2021-02-12 Thread Peter Bergner via Gcc-patches
On 2/8/21 7:29 AM, Segher Boessenkool wrote:
>> I think we should either add a new rtx code for constant opaque modes
>> or make init-regs just emit the clobber for opaque modes (and not emit
>> the move).
> 
> Thanks for looking Richard.  That last option sounds good to me as well.

Ok, guarding the emit_move_insn with a CONST0_RTX test which causes us
to only emit the clobber for opaque modes fixes the problem.  Nice!
That means we can eliminate the rest of the patch other than the test case!



>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr98872.c
>>> @@ -0,0 +1,20 @@
>>> +/* PR target/98872 */
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target power10_ok } */
>>> +/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
>>> +
>>> +/* Verify we do not ICE on the tests below.  */
> 
> Do the existing tests already check the expected code for this?

Yes, our mma-builtin-*.c tests check for expected output.



The updated patch below fixes the bug too and is much simpler! :-)


rtl-optimization: Fix uninitialized use of opaque mode variable ICE [PR98872]

The initialize_uninitialized_regs function emits (set (reg:) (CONST0_RTX))
for all uninitialized pseudo uses.  However, some modes (eg, opaque modes)
may not have a CONST0_RTX defined, leading to an ICE when we try and create
the initialization insn.  The fix is to skip emitting the initialization
if there is no CONST0_RTX defined for the mode.

This following patch fixes the ICE and is currently regtesting.
Ok for trunk if the bootstrap and regtesting come back clean?

Peter


2021-02-12  Peter Bergner  

gcc/
PR rtl-optimization/98872
* init-regs.c (initialize_uninitialized_regs): Skip initialization
if CONST0_RTX is NULL.

gcc/testsuite/
PR rtl-optimization/98872
* gcc.target/powerpc/pr98872.c: New test.

diff --git a/gcc/init-regs.c b/gcc/init-regs.c
index 903c6541f10..72e898f3e33 100644
--- a/gcc/init-regs.c
+++ b/gcc/init-regs.c
@@ -105,7 +105,10 @@ initialize_uninitialized_regs (void)
 
  start_sequence ();
  emit_clobber (reg);
- emit_move_insn (reg, CONST0_RTX (GET_MODE (reg)));
+ /* PR98872: Only emit an initialization if MODE has a
+CONST0_RTX defined.  */
+ if (CONST0_RTX (GET_MODE (reg)))
+   emit_move_insn (reg, CONST0_RTX (GET_MODE (reg)));
  move_insn = get_insns ();
  end_sequence ();
  emit_insn_before (move_insn, insn);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr98872.c 
b/gcc/testsuite/gcc.target/powerpc/pr98872.c
new file mode 100644
index 000..f33ad9b48b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr98872.c
@@ -0,0 +1,19 @@
+/* PR target/98872 */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+
+/* Verify we do not ICE on the following tests.  */
+
+void
+foo (__vector_quad *dst)
+{
+  __vector_quad acc;
+  *dst = acc;
+}
+
+void
+bar (__vector_pair *dst)
+{
+  __vector_pair pair;
+  *dst = pair;
+}



Re: rtl-optimization: Fix uninitialized use of opaque mode variable ICE [PR98872]

2021-02-12 Thread Peter Bergner via Gcc-patches
On 2/12/21 4:21 PM, Peter Bergner wrote:
> rtl-optimization: Fix uninitialized use of opaque mode variable ICE [PR98872]
> 
> The initialize_uninitialized_regs function emits (set (reg:) (CONST0_RTX))
> for all uninitialized pseudo uses.  However, some modes (eg, opaque modes)
> may not have a CONST0_RTX defined, leading to an ICE when we try and create
> the initialization insn.  The fix is to skip emitting the initialization
> if there is no CONST0_RTX defined for the mode.
> 
> This following patch fixes the ICE and is currently regtesting.
> Ok for trunk if the bootstrap and regtesting come back clean?
> 
> Peter
> 
> 
> 2021-02-12  Peter Bergner  
> 
> gcc/
>   PR rtl-optimization/98872
>   * init-regs.c (initialize_uninitialized_regs): Skip initialization
>   if CONST0_RTX is NULL.
> 
> gcc/testsuite/
>   PR rtl-optimization/98872
>   * gcc.target/powerpc/pr98872.c: New test.

Testing came back clean with no regressions.

Peter



Re: rtl-optimization: Fix uninitialized use of opaque mode variable ICE [PR98872]

2021-02-15 Thread Peter Bergner via Gcc-patches
On 2/15/21 6:25 AM, Richard Sandiford wrote:
> Peter Bergner  writes:
>> 2021-02-12  Peter Bergner  
>>
>> gcc/
>>  PR rtl-optimization/98872
>>  * init-regs.c (initialize_uninitialized_regs): Skip initialization
>>  if CONST0_RTX is NULL.
>>
>> gcc/testsuite/
>>  PR rtl-optimization/98872
>>  * gcc.target/powerpc/pr98872.c: New test.
> 
> OK, thanks.

Ok, patch pushed to mainline.  Thanks!

Peter



Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-23 Thread Peter Bergner via Gcc-patches
On 2/5/21 12:28 PM, Segher Boessenkool wrote:
> On Fri, Feb 05, 2021 at 04:11:30PM +0100, Florian Weimer wrote:
>> * Peter Bergner:
>>> On 2/5/21 4:28 AM, Florian Weimer wrote:
 Maybe add a check that the compatibility builtins are flagged as
 availble using __has_builtin?
>>>
>>> Do you mean add a test in the testsuite for this?  I can check on
>>> adding that to the test case.
>>
>> Right, in the test case.  Given that it's a new kind of built-in.
>> (Not sure if it makes sense.)
> 
> There aren't many such tests yet, so it will be helpful just because of
> that.  But __has_builtin should work for any builtin function
> whatsoever, and there is nothing special about these compatibility
> builtins (it is just a name, it is defined as any other).

__has_builtin does work for the compat builtins, so I added tests using it.
Here is the updated patch given the review comments:



rs6000: Add support for compatibility built-ins

The LLVM and GCC teams agreed to rename the __builtin_mma_assemble_pair and
__builtin_mma_disassemble_pair built-ins to __builtin_vsx_assemble_pair and
__builtin_vsx_disassemble_pair respectively.  It's too late to remove the
old names, so this patch renames the built-ins to the new names and then
adds support for creating compatibility built-ins (ie, multiple built-in
functions generate the same code) and then creates compatibility built-ins
using the old names.

This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
Ok for mainline?

This will need backporting to GCC 10.  Ok there too once it's baked on
trunk for a little while?

Peter

2021-02-23  Peter Bergner  

gcc/
* config/rs6000/mma.md (mma_assemble_pair): Rename from this...
(vsx_assemble_pair): ...to this.
(*mma_assemble_pair): Rename from this...
(*vsx_assemble_pair): ...to this.
(mma_disassemble_pair): Rename from this...
(vsx_disassemble_pair): ...to this.
(*mma_disassemble_pair): Rename from this...
(*vsx_disassemble_pair): ...to this.
* gcc/config/rs6000/rs6000-builtin.def (BU_MMA_V2, BU_MMA_V3,
BU_COMPAT): New macros.
(mma_assemble_pair): Rename from this...
(vsx_assemble_pair): ...to this.
(mma_disassemble_pair): Rename from this...
(vsx_disassemble_pair): ...to this.
(mma_assemble_pair): Add compatibility built-in.
(mma_disassemble_pair): Likewise.
* config/rs6000/rs6000-call.c (struct builtin_compatibility): New.
(RS6000_BUILTIN_COMPAT): Define.
(bdesc_compat): New.
(mma_expand_builtin): Use VSX_BUILTIN_DISASSEMBLE_PAIR_INTERNAL.
(rs6000_gimple_fold_mma_builtin): Use MMA_BUILTIN_DISASSEMBLE_PAIR
and VSX_BUILTIN_ASSEMBLE_PAIR.
(rs6000_init_builtins): Register compatibility built-ins.
(mma_init_builtins): USE VSX_BUILTIN_ASSEMBLE_PAIR,
VSX_BUILTIN_ASSEMBLE_PAIR_INTERNAL, VSX_BUILTIN_DISASSEMBLE_PAIR and
VSX_BUILTIN_DISASSEMBLE_PAIR_INTERNAL.
* doc/extend.texi (__builtin_mma_assemble_pair): Rename from this...
(__builtin_vsx_assemble_pair): ...to this.
(__builtin_mma_disassemble_pair): Rename from this...
(__builtin_vsx_disassemble_pair): ...to this.

gcc/testsuite/
* gcc/testsuite/gcc.target/powerpc/mma-builtin-4.c: Add tests for
__builtin_vsx_assemble_pair and __builtin_vsx_disassemble_pair.
Add __has_builtin tests for built-ins.
Update expected instruction counts.

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 87569f1c31d..c40501f2e09 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -321,7 +321,7 @@
(set_attr "length" "*,*,16")
(set_attr "max_prefixed_insns" "2,2,*")])
 
-(define_expand "mma_assemble_pair"
+(define_expand "vsx_assemble_pair"
   [(match_operand:OO 0 "vsx_register_operand")
(match_operand:V16QI 1 "mma_assemble_input_operand")
(match_operand:V16QI 2 "mma_assemble_input_operand")]
@@ -334,7 +334,7 @@
   DONE;
 })
 
-(define_insn_and_split "*mma_assemble_pair"
+(define_insn_and_split "*vsx_assemble_pair"
   [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
(unspec:OO [(match_operand:V16QI 1 "mma_assemble_input_operand" "mwa")
(match_operand:V16QI 2 "mma_assemble_input_operand" "mwa")]
@@ -351,7 +351,7 @@
   DONE;
 })
 
-(define_expand "mma_disassemble_pair"
+(define_expand "vsx_disassemble_pair"
   [(match_operand:V16QI 0 "mma_disassemble_output_operand")
(match_operand:OO 1 "vsx_register_operand")
(match_operand 2 "const_0_to_1_operand")]
@@ -366,7 +366,7 @@
   DONE;
 })
 
-(define_insn_and_split "*mma_disassemble_pair"
+(define_insn_and_split "*vsx_disassemble_pair"
   [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
(unspec:V16QI [(match_operand:OO 1 "vsx_register_operand" "wa")
  (match_operand 2 "const_0_to_1_operand")]
diff --git a/gcc/config/rs600

Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-23 Thread Peter Bergner via Gcc-patches
On 2/23/21 4:53 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Feb 23, 2021 at 04:00:42PM -0600, Peter Bergner wrote:
>>  (mma_assemble_pair): Add compatibility built-in.
> s/Add/New/ is better (it makes clear you do not add something to the
> (already existing) mma_assemble_pair, that it is in fact new here).
> 
>>  (mma_init_builtins): USE VSX_BUILTIN_ASSEMBLE_PAIR,
> 
> s/USE/Use/
> 
> The patch is okay for trunk and for 10.  Thank you!

Fixed and pushed to trunk.  I'll push the backport after a day or two
of burn in on trunk.  Thanks!

Peter




rs6000: Fix ICE in rs6000_init_builtins when compiling with -mcpu=440 [PR99279]

2021-02-25 Thread Peter Bergner via Gcc-patches
The initialization of compat builtins assumes the builtin we are creating
a compatible builtin for exists and ICEs if it doesn't.  However, there are
valid reasons why some builtins are disabled for a particular compile.
In this case, the MMA builtins are disabled for -mcpu=440 (and other cpus),
so instead of ICEing, we should just skip adding the MMA compat builtin.

This passed bootstrap and regtesting on powerpc64-linux, with running the
testsuite in both 32-bit and 64-bit modes, with no regressions.
Ok for mainline?

The compat builtin patch was approved for backporting to GCC10, so we'll
need this fix to go along with it.

Peter


2021-02-25  Peter Bergner  

gcc/
PR target/99279
* config/rs6000/rs6000-call.c (rs6000_init_builtins): Replace assert
with an "if" test.

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index d2bd03e..f567625 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13468,9 +13468,9 @@ rs6000_init_builtins (void)
   for (i = 0; i < ARRAY_SIZE (bdesc_compat); i++, d++)
 {
   tree decl = rs6000_builtin_decls[(int)d->code];
-  gcc_assert (decl != NULL);
-  add_builtin_function (d->name, TREE_TYPE (decl), (int)d->code,
-   BUILT_IN_MD, NULL, NULL_TREE);
+  if (decl != NULL)
+   add_builtin_function (d->name, TREE_TYPE (decl), (int)d->code,
+ BUILT_IN_MD, NULL, NULL_TREE);
 }
 }
 


Re: rs6000: Fix ICE in rs6000_init_builtins when compiling with -mcpu=440 [PR99279]

2021-02-25 Thread Peter Bergner via Gcc-patches
On 2/25/21 7:08 PM, David Edelsohn wrote:
> On Thu, Feb 25, 2021 at 8:05 PM Peter Bergner  wrote:
>>
>> The initialization of compat builtins assumes the builtin we are creating
>> a compatible builtin for exists and ICEs if it doesn't.  However, there are
>> valid reasons why some builtins are disabled for a particular compile.
>> In this case, the MMA builtins are disabled for -mcpu=440 (and other cpus),
>> so instead of ICEing, we should just skip adding the MMA compat builtin.
>>
>> This passed bootstrap and regtesting on powerpc64-linux, with running the
>> testsuite in both 32-bit and 64-bit modes, with no regressions.
>> Ok for mainline?
> 
> Okay.

Thanks, pushed.



On 2/25/21 7:09 PM, Segher Boessenkool wrote:
> On Thu, Feb 25, 2021 at 07:05:26PM -0600, Peter Bergner wrote:
>> The compat builtin patch was approved for backporting to GCC10, so we'll
>> need this fix to go along with it.
> 
> Okay for that as well of course.

Thanks, I'll give this a day or two and then push the two to gcc10.
Do you want them committed separately or squashed into one commit
since the 2nd patch fixes the issue in the first patch?

Peter



[PATCH] rs6000: Fix ICE expanding lxvp and stxvp gimple built-ins [PR101849]

2021-08-10 Thread Peter Bergner via Gcc-patches
PR101849 shows we ICE on a test case when we pass a non __vector_pair *
pointer to the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins
that is cast to __vector_pair *.  The problem is that when we expand
the built-in, the cast has already been removed from gimple and we are
only given the base pointer.  The solution used here (which fixes the ICE)
is to catch this case and convert the pointer to a __vector_pair * pointer
when expanding the built-in.

This passed bootstrap and regression testing on powerpc64le-linux with
no regressions.  Ok for mainline?  This also affects GCC 11 and 10, so
ok there too after it has baked on trunk for a few days?

Peter


gcc/
PR target/101849
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Cast
pointer to __vector_pair *.

gcc/testsuite/
PR target/101849
* gcc.target/powerpc/pr101849.c: New test.


diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 904e104c058..d04011c0489 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11919,6 +11919,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator 
*gsi)
   tree offset = gimple_call_arg (stmt, 0);
   tree ptr = gimple_call_arg (stmt, 1);
   tree lhs = gimple_call_lhs (stmt);
+  if (TREE_TYPE (TREE_TYPE (ptr)) != vector_pair_type_node)
+   ptr = build1 (VIEW_CONVERT_EXPR,
+ build_pointer_type (vector_pair_type_node), ptr);
   tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
   TREE_TYPE (ptr), ptr, offset));
   gimplify_assign (lhs, mem, &new_seq);
@@ -11932,6 +11935,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator 
*gsi)
   tree src = gimple_call_arg (stmt, 0);
   tree offset = gimple_call_arg (stmt, 1);
   tree ptr = gimple_call_arg (stmt, 2);
+  if (TREE_TYPE (TREE_TYPE (ptr)) != vector_pair_type_node)
+   ptr = build1 (VIEW_CONVERT_EXPR,
+ build_pointer_type (vector_pair_type_node), ptr);
   tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
   TREE_TYPE (ptr), ptr, offset));
   gimplify_assign (mem, src, &new_seq);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr101849.c 
b/gcc/testsuite/gcc.target/powerpc/pr101849.c
new file mode 100644
index 000..6d2e3b79282
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr101849.c
@@ -0,0 +1,19 @@
+/* PR target/101849 */
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+
+/* Verify we do not ICE on the tests below.  */
+
+__vector_pair vp;
+void
+foo (double *x)
+{
+   vp = __builtin_vsx_lxvp (0, (__vector_pair *)(void *)x);
+}
+
+void
+bar (__vector_pair *src, double *x)
+{
+  __builtin_vsx_stxvp (*src, 0, (__vector_pair *)(void *)x);
+}


Re: [PATCH] rs6000: Fix ICE expanding lxvp and stxvp gimple built-ins [PR101849]

2021-08-13 Thread Peter Bergner via Gcc-patches
On 8/12/21 1:20 PM, David Edelsohn wrote:
> On Tue, Aug 10, 2021 at 7:37 PM Peter Bergner  wrote:
>> gcc/
>> PR target/101849
>> * config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Cast
>> pointer to __vector_pair *.
>>
>> gcc/testsuite/
>> PR target/101849
>> * gcc.target/powerpc/pr101849.c: New test.
> 
> Okay.
> 
> Thanks, David

V2:
  The previous patch was "ok'd" by David, but in the bugzilla, richi made
a suggestion that using build2, the ptr's type is ignored and we could just
pass in the pointer type we want, which is a little simpler, so that is
what I have done with this version, which I think is better than the
previous one.  I have also expanded the test case a little to test both
constant and non-constant offsets.

This version passed bootstrap and regression testing on powerpc64le-linux
with no regressions.  Ok for mainline?  This also affects GCC 11 and 10,
so ok there too after it has baked on trunk for a few days?

...or do we want to stick with the previous patch?

Peter



rs6000: Fix ICE expanding lxvp and stxvp gimple built-ins [PR101849]

PR101849 shows we ICE on a test case when we pass a non __vector_pair *
pointer to the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins
that is cast to __vector_pair *.  The problem is that when we expand
the built-in, the cast has already been removed from gimple and we are
only given the base pointer.  The solution used here (which fixes the ICE)
is to always use a __vector_pair * pointer when expanding the built-in.

gcc/
PR target/101849
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Force
pointer to __vector_pair *.

gcc/testsuite/
PR target/101849
* gcc.target/powerpc/pr101849.c: New test.


diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 904e104c058..90d7171fa46 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11919,8 +11919,10 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator 
*gsi)
   tree offset = gimple_call_arg (stmt, 0);
   tree ptr = gimple_call_arg (stmt, 1);
   tree lhs = gimple_call_lhs (stmt);
-  tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
-  TREE_TYPE (ptr), ptr, offset));
+  tree ptr_t = build_pointer_type (vector_pair_type_node);
+  tree mem = build2 (MEM_REF, vector_pair_type_node,
+build2 (POINTER_PLUS_EXPR, ptr_t, ptr, offset),
+build_int_cst (ptr_t, 0));
   gimplify_assign (lhs, mem, &new_seq);
   pop_gimplify_context (NULL);
   gsi_replace_with_seq (gsi, new_seq, true);
@@ -11932,8 +11934,10 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator 
*gsi)
   tree src = gimple_call_arg (stmt, 0);
   tree offset = gimple_call_arg (stmt, 1);
   tree ptr = gimple_call_arg (stmt, 2);
-  tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
-  TREE_TYPE (ptr), ptr, offset));
+  tree ptr_t = build_pointer_type (vector_pair_type_node);
+  tree mem = build2 (MEM_REF, vector_pair_type_node,
+build2 (POINTER_PLUS_EXPR, ptr_t, ptr, offset),
+build_int_cst (ptr_t, 0));
   gimplify_assign (mem, src, &new_seq);
   pop_gimplify_context (NULL);
   gsi_replace_with_seq (gsi, new_seq, true);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr101849.c 
b/gcc/testsuite/gcc.target/powerpc/pr101849.c
new file mode 100644
index 000..823fbfe9647
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr101849.c
@@ -0,0 +1,22 @@
+/* PR target/101849 */
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+
+/* Verify we do not ICE on the tests below.  */
+
+void
+foo (__vector_pair *dst, double *x, long offset)
+{
+  dst[0] = __builtin_vsx_lxvp (0, (__vector_pair *)(void *)x);
+  dst[1] = __builtin_vsx_lxvp (32, (__vector_pair *)(void *)x);
+  dst[2] = __builtin_vsx_lxvp (offset, (__vector_pair *)(void *)x);
+}
+
+void
+bar (__vector_pair *src, double *x, long offset)
+{
+  __builtin_vsx_stxvp (src[0], 0, (__vector_pair *)(void *)x);
+  __builtin_vsx_stxvp (src[1], 32, (__vector_pair *)(void *)x);
+  __builtin_vsx_stxvp (src[2], offset, (__vector_pair *)(void *)x);
+}




Re: [PATCH] rs6000: Fix ICE expanding lxvp and stxvp gimple built-ins [PR101849]

2021-08-19 Thread Peter Bergner via Gcc-patches
On 8/13/21 12:15 PM, Bill Schmidt wrote:
> Honestly, I don't see how it matters.  So far as I can tell, all you've done
> here is hand-inlined what build_simple_mem_ref would do.  So I guess I have
> a slight preference for your original patch (but with the new test case,
> of course).

Ok, I ended up pushing the original patch then with the expanded test case.
I'll let this bake on trunk for a bit before back porting.  Thanks everyone.

Peter


[PATCH] rs6000: Disable optimizing multiple xxsetaccz instructions into one xxsetaccz

2021-08-27 Thread Peter Bergner via Gcc-patches
Fwprop will happily optimize two xxsetaccz instructions into one xxsetaccz
by propagating the results of the first to the uses of the second.
We really don't want that to happen given the late priming/depriming of
accumulators.  I fixed this by making the xxsetaccz source operand an
unspec volatile.  I also removed the mma_xxsetaccz define_expand and
define_insn_and_split and replaced it with a simple define_insn.
The expand and splitter patterns were leftovers from the pre opaque mode
code when the xxsetaccz code was part of the movpxi pattern, and we don't
need them now.

Rather than a new test case, I was able to just modify the current test case
to add another __builtin_mma_xxsetaccz call which shows the bad code gen
with unpatched compilers.

This passed bootstrap on powerpc64le-linux with no regressions.
Ok for trunk?  We'll need this for sure in GCC11.  Ok there too after
some trunk burn in time?

GCC10 suffers from the same issue, but since the code is different, I'll
have to determine a different solution which I'll post as a separate
patch.

Peter


gcc/
* config/rs6000/mma.md (unspec): Delete UNSPEC_MMA_XXSETACCZ.
(unspecv): Add UNSPECV_MMA_XXSETACCZ.
(*mma_xxsetaccz): Delete.
(mma_xxsetaccz): Change to define_insn.  Remove match_operand.
Use UNSPECV_MMA_XXSETACCZ.
* config/rs6000/rs6000.c (rs6000_rtx_costs): Use UNSPECV_MMA_XXSETACCZ.

gcc/testsuite/
* gcc.target/powerpc/mma-builtin-6.c: Add second call to xxsetacc
built-in.  Update instruction counts.

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 1f6fc03d2ac..b26ae7a5d04 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -91,7 +91,10 @@ (define_c_enum "unspec"
UNSPEC_MMA_XVI8GER4SPP
UNSPEC_MMA_XXMFACC
UNSPEC_MMA_XXMTACC
-   UNSPEC_MMA_XXSETACCZ
+  ])
+
+(define_c_enum "unspecv"
+  [UNSPECV_MMA_XXSETACCZ
   ])
 
 ;; MMA instructions with 1 accumulator argument
@@ -469,26 +472,12 @@ (define_insn "mma_"
 
 ;; We can't have integer constants in XOmode so we wrap this in an UNSPEC.
 
-(define_expand "mma_xxsetaccz"
-  [(set (match_operand:XO 0 "fpr_reg_operand")
-   (const_int 0))]
-  "TARGET_MMA"
-{
-  rtx xo0 = gen_rtx_UNSPEC (XOmode, gen_rtvec (1, const0_rtx),
-   UNSPEC_MMA_XXSETACCZ);
-  emit_insn (gen_rtx_SET (operands[0], xo0));
-  DONE;
-})
-
-(define_insn_and_split "*mma_xxsetaccz"
+(define_insn "mma_xxsetaccz"
   [(set (match_operand:XO 0 "fpr_reg_operand" "=d")
-   (unspec:XO [(match_operand 1 "const_0_to_1_operand" "O")]
-UNSPEC_MMA_XXSETACCZ))]
+   (unspec_volatile:XO [(const_int 0)]
+   UNSPECV_MMA_XXSETACCZ))]
   "TARGET_MMA"
   "xxsetaccz %A0"
-  "&& reload_completed"
-  [(set (match_dup 0) (unspec:XO [(match_dup 1)] UNSPEC_MMA_XXSETACCZ))]
-  ""
   [(set_attr "type" "mma")
(set_attr "length" "4")])
 
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index e073b26b430..40dc71c8171 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21919,7 +21919,7 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int 
outer_code,
   break;
 
 case UNSPEC:
-  if (XINT (x, 1) == UNSPEC_MMA_XXSETACCZ)
+  if (XINT (x, 1) == UNSPECV_MMA_XXSETACCZ)
{
  *total = 0;
  return true;
diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-6.c 
b/gcc/testsuite/gcc.target/powerpc/mma-builtin-6.c
index 0c6517211e3..715b28138e9 100644
--- a/gcc/testsuite/gcc.target/powerpc/mma-builtin-6.c
+++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-6.c
@@ -5,14 +5,16 @@
 void
 foo (__vector_quad *dst)
 {
-  __vector_quad acc;
-  __builtin_mma_xxsetaccz (&acc);
-  *dst = acc;
+  __vector_quad acc0, acc1;
+  __builtin_mma_xxsetaccz (&acc0);
+  __builtin_mma_xxsetaccz (&acc1);
+  dst[0] = acc0;
+  dst[1] = acc1;
 }
 
 /* { dg-final { scan-assembler-not {\mlxv\M} } } */
 /* { dg-final { scan-assembler-not {\mlxvp\M} } } */
 /* { dg-final { scan-assembler-not {\mxxmtacc\M} } } */
-/* { dg-final { scan-assembler-times {\mxxsetaccz\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mxxmfacc\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mstxvp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxxsetaccz\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxxmfacc\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mstxvp\M} 4 } } */


Re: [PATCH] rs6000: inefficient 64-bit constant generation for consecutive 1-bits

2020-09-15 Thread Peter Bergner via Gcc-patches
> rs6000_is_valid_shift_mask handles this already (but it requires you to
> pass in the shift needed).  rs6000_is_valid_mask will handle it.
> rs6000_is_valid_and_mask does not get a shift count parameter, so cannot
> use rldic currently.

After talking with you off line, I changed to using rs6000_is_valid_mask.
The did mean I had to change num_insns_constant_gpr to take a mode param
so it could be passed down to rs6000_is_valid_mask. 




> Please improve something there instead?
> 
>> -  HOST_WIDE_INT ud1, ud2, ud3, ud4;
>> +  HOST_WIDE_INT ud1, ud2, ud3, ud4, value = c;
> 
> Do not put declarations for uninitialised and initialised variables on
> one line, please.

The new patch doesn't even touch this function anymore.



>> +(define_insn "rldic"
>> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
>> +(unspec:DI [(match_operand:DI 1 "gpc_reg_operand" "r")
>> +(match_operand:DI 2 "u6bit_cint_operand" "n")
>> +(match_operand:DI 3 "u6bit_cint_operand" "n")]
>> +   UNSPEC_RLDIC))]
>> +  "TARGET_POWERPC64"
>> +  "rldic %0,%1,%2,%3")

...and this is gone too.  I've replaced it with a generic splitter
that matches an already existing define_insn (rotl3_mask).

 
>> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,8,8" } } */
>> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,24,8" } } */
>> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,8" } } */
>> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,48" } } */
>> +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,23" } } */
> 
> Please use {} quotes, and \m\M.  \d can be helpful, too.

That was how I wrote it initially, but for some reason, it wouldn't match
at all.  Do I need extra \'s for my regexs when using {}?

\d is any digit?  Yeah, that would be better.  Gotta find a manpage or ???
that describes what regex patterns are allowed.


This all said, Alan's rtx_costs patch touches this same area and he talked
about removing a similar splitter, so I think I will wait for his code to
be committed and then rework this on top of his changes.

Peter




Re: [PATCH 2/4, revised patch applied] PowerPC: Rename functions for min, max, cmove

2020-09-15 Thread Peter Bergner via Gcc-patches
On 9/15/20 1:38 PM, Alexandre Oliva wrote:
>> +case SFmode:
>> +case DFmode:
> 
> gcc110 (ppc64) in the build farm didn't like this.  The bootstrap
> compiler barfs on these expressions, because of some constexpr issue I
> haven't really looked into.
> 
> I'm testing this patch.  I'll check it in when I'm done.
> 
> 
> use E_*mode instead of just *mode

Bill's nightly testing on one of our old systems just hit this too.
Thanks for fixing and testing the fix!

Peter




Re: [PATCH 1/2] rs6000: Support _mm_insert_epi{8,32,64}

2020-09-25 Thread Peter Bergner via Gcc-patches
On 9/24/20 6:22 PM, Segher Boessenkool wrote:
>> +  result [(__N & 0b)] = __D;
> 
> Hrm, GCC supports binary constants like this since 2007, so okay.  But I
> have to wonder if this improves anything over hex (or decimal even!)
> The parens are superfluous (and only hinder legibility), fwiw.

+1 for using hex constants when using them with logical ops like '&'.

Peter




Re: [PATCH] rs6000: Fix invalid address passed to __builtin_mma_disassemble_acc [PR104923]

2022-03-14 Thread Peter Bergner via Gcc-patches
I forgot to CC the gcc-patches mailing list on the original patch submission.
Adding it in now.  Sorry.



On 3/14/22 8:24 PM, Segher Boessenkool wrote:
>> gcc/
>>     PR target/104923
>>     * config/rs6000/predicates.md (mma_disassemble_output_operand):
>>     Restrict acceptable MEM addresses.
>>
>> gcc/testsuite/
>>     PR target/104923
>>     * gcc.target/powerpc/pr104923.c: New test.
> 
> Changelogs are indented with tabs, not with four spaces.  Maybe the
> commit script will fix this up though, dunno.

They are actually indented correctly.  I think when I was fixing one
of my Thunderbird setting for my IBM account, it reverted my settings
to always send in plain-text and probably munged the whitespace on
me.  I think I fixed that now.  Thanks for letting me know it was
broken.




> Including that tab the maximum line length is 80.  You don't have to end
> that first line with a colon (it looks like something is missing like
> this)

Ok, I can move "Restrict" to the line above and still stay within 80 chars.



>> +  if (MEM_P (op))
>> +    return indexed_or_indirect_address (XEXP (op, 0), mode)
>> +       || quad_address_p (XEXP (op, 0), mode, false);
> 
> The indent of this last line is wrong.  The || should align with
> indexed_or_indirect_address, and every leading eight spaces is a tab.

Again, it's indented correctly for me, but got munged when sending.
Hopefully fixed now.


> You might want to name that common expression, "rtx addr = XEXP (op, 0);"
> or something.  Dunno what is best

Will do.



> Please put that new MEM_P code first, followed by a blank line, and only
> then do the SUBREG thing.  As written it will allow subregs of mem.  And
> the blank line is important of course ;-)

Will do.


> Okay for trunk with those changes.  Also okay for 10 and 11 after an
> appropriate soak period.  Thanks!

Thanks.  I'll verify things are still working with the changes agreed
to above before committing.  Thanks.


> Btw.  A good cleanup would be to have mma_assemble_input_operand written
> like this, too?

Ok, I'll have a look at doing that as part of a separate change.

Peter



Re: [PATCH] rs6000: Fix invalid address passed to __builtin_mma_disassemble_acc [PR104923]

2022-03-15 Thread Peter Bergner via Gcc-patches
On 3/14/22 10:06 PM, Peter Bergner wrote:
> On 3/14/22 8:24 PM, Segher Boessenkool wrote:
>> You might want to name that common expression, "rtx addr = XEXP (op, 0);"
>> or something.  Dunno what is best
> 
> Will do.
> 
> 
>> Please put that new MEM_P code first, followed by a blank line, and only
>> then do the SUBREG thing.  As written it will allow subregs of mem.  And
>> the blank line is important of course ;-)
> 
> Will do.
> 
> 
>> Okay for trunk with those changes.  Also okay for 10 and 11 after an
>> appropriate soak period.  Thanks!

Testing of the updates came back clean so I pushed the fix.
I'll wait a few days before pushing the backports.  Thanks!

Peter



Re: [PATCH] rs6000: Allow using -mlong-double-64 after -mabi={ibm,ieee}longdouble [PR104208, PR87496]

2022-03-15 Thread Peter Bergner via Gcc-patches
On 3/4/22 8:14 PM, Peter Bergner wrote:
> On 3/4/22 11:33 AM, Peter Bergner wrote:
>>> Ok pushed to trunk.  I haven't determined yet whether we need this on GCC 
>>> 11 yet.
>>> I'll check on that and report back.  Thanks!
>>
>> I've confirmed that GCC 11 fails the same way and that the backported patch
>> fixes the issue there too.  Ok for GCC 11 assuming my full regression testing
>> is clean?
>>
>> GCC 10 has the same checking code, so it looks to need the backport as well.
>> I'll go ahead and backport and regression test it there too.
> 
> The backports to GCC 11 and GCC 10 bootstrapped and regtested with no 
> regressions.
> Ok for the GCC 11 and GCC 10 release branches after a day or two of baking on
> trunk?

Ping.

The trunk patch has been confirmed to fix the glibc build errors and no issues
with the patch has surfaced, so ok for the GCC11 and GCC10 release branches?

Peter





Re: [PATCH] rs6000: Allow using -mlong-double-64 after -mabi={ibm,ieee}longdouble [PR104208, PR87496]

2022-03-18 Thread Peter Bergner via Gcc-patches
On 3/16/22 7:33 AM, Segher Boessenkool wrote:
> On Tue, Mar 15, 2022 at 02:49:39PM -0500, Peter Bergner wrote:
>> The trunk patch has been confirmed to fix the glibc build errors and no 
>> issues
>> with the patch has surfaced, so ok for the GCC11 and GCC10 release branches?
> 
> If you can confirm it fixes the glibc build error on 11 and 10 as well,
> then yes please.  Thanks!

I confirmed I see the same glibc issue with unpatched gcc11 and gcc10 compilers
and that the patched gcc11 and gcc10 fix the issue so I went ahead and pushed
the backports.  Thanks!

Peter




rs6000/testsuite: Use -mdejagnu-cpu= and -mdejagnu-tune= options

2022-03-25 Thread Peter Bergner via Gcc-patches
This patch updates the POWER testsuite test cases using -mcpu= and -mtune=
to use the preferred -mdejagnu-cpu= and -mdejagnu-tune= options.  This also
obviates the need for the dg-skip-if directive, since the user cannot
override the -mcpu= value being used to compile the test case.

This passed regression testing with no regressions on powerpc64le-linux.
Ok for trunk?

Peter


gcc/testsuite/

* g++.dg/pr65240-1.C: Use -mdejagnu-cpu=.  Remove dg-skip-if.
* g++.dg/pr65240-2.C: Likewise.
* g++.dg/pr65240-3.C: Likewise.
* g++.dg/pr65240-4.C: Likewise.
* g++.dg/pr65242.C: Likewise.
* g++.dg/pr67211.C: Likewise.
* g++.dg/pr69667.C: Likewise.
* g++.dg/pr71294.C: Likewise.
* g++.dg/pr84279.C: Likewise.
* g++.dg/torture/ppc-ldst-array.C: Likewise.
* gfortran.dg/nint_p7.f90: Likewise.
* gfortran.dg/pr102860.f90: Likewise.
* gcc.target/powerpc/fusion.c: Use -mdejagnu-cpu= and -mdejagnu-tune=.
* gcc.target/powerpc/fusion2.c: Likewise.
* gcc.target/powerpc/int_128bit-runnable.c: Use -mdejagnu-cpu=.
* gcc.target/powerpc/test_mffsl.c: Likewise.
* gfortran.dg/pr47614.f: Likewise.
* gfortran.dg/pr58968.f: Likewise.


diff --git a/gcc/testsuite/g++.dg/pr65240-1.C b/gcc/testsuite/g++.dg/pr65240-1.C
index d2e25b65fca..ff8910df6a1 100644
--- a/gcc/testsuite/g++.dg/pr65240-1.C
+++ b/gcc/testsuite/g++.dg/pr65240-1.C
@@ -1,8 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
-/* { dg-options "-mcpu=power8 -O3 -ffast-math -mcmodel=small -mno-fp-in-toc 
-Wno-return-type" } */
+/* { dg-options "-mdejagnu-cpu=power8 -O3 -ffast-math -mcmodel=small 
-mno-fp-in-toc -Wno-return-type" } */
 
 /* target/65240, compiler got a 'insn does not satisfy its constraints' error. 
 */
 
diff --git a/gcc/testsuite/g++.dg/pr65240-2.C b/gcc/testsuite/g++.dg/pr65240-2.C
index 38d5020bd19..bdb7a62d73d 100644
--- a/gcc/testsuite/g++.dg/pr65240-2.C
+++ b/gcc/testsuite/g++.dg/pr65240-2.C
@@ -1,8 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
-/* { dg-options "-mcpu=power8 -O3 -ffast-math -mcmodel=small -mfp-in-toc 
-Wno-return-type" } */
+/* { dg-options "-mdejagnu-cpu=power8 -O3 -ffast-math -mcmodel=small 
-mfp-in-toc -Wno-return-type" } */
 
 /* target/65240, compiler got a 'insn does not satisfy its constraints' error. 
 */
 
diff --git a/gcc/testsuite/g++.dg/pr65240-3.C b/gcc/testsuite/g++.dg/pr65240-3.C
index e8463c91494..f37db9025d1 100644
--- a/gcc/testsuite/g++.dg/pr65240-3.C
+++ b/gcc/testsuite/g++.dg/pr65240-3.C
@@ -1,8 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
-/* { dg-options "-mcpu=power8 -O3 -ffast-math -mcmodel=medium 
-Wno-return-type" } */
+/* { dg-options "-mdejagnu-cpu=power8 -O3 -ffast-math -mcmodel=medium 
-Wno-return-type" } */
 
 /* target/65240, compiler got a 'insn does not satisfy its constraints' error. 
 */
 
diff --git a/gcc/testsuite/g++.dg/pr65240-4.C b/gcc/testsuite/g++.dg/pr65240-4.C
index a119752d18e..efb6a6c06e7 100644
--- a/gcc/testsuite/g++.dg/pr65240-4.C
+++ b/gcc/testsuite/g++.dg/pr65240-4.C
@@ -1,8 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power7" } } */
-/* { dg-options "-mcpu=power7 -O3 -ffast-math -Wno-return-type" } */
+/* { dg-options "-mdejagnu-cpu=power7 -O3 -ffast-math -Wno-return-type" } */
 
 /* target/65240, compiler got a 'insn does not satisfy its constraints' error. 
 */
 
diff --git a/gcc/testsuite/g++.dg/pr65242.C b/gcc/testsuite/g++.dg/pr65242.C
index be2ddaa85b2..662f375015f 100644
--- a/gcc/testsuite/g++.dg/pr65242.C
+++ b/gcc/testsuite/g++.dg/pr65242.C
@@ -1,8 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
-/* { dg-options "-mcpu=power8 -O3" } */
+/* { dg-options "-mdejagnu-cpu=power8 -O3" } */
 
 class A {
 public:
diff --git a/gcc/testsuite/g++.dg/pr67211.C b/gcc/testsuite/g++.dg/pr67211.C
index cb3d342c122..ac241818ab5 100644
--- a/gcc/testsuite/g++.dg/pr672

Re: rs6000/testsuite: Use -mdejagnu-cpu= and -mdejagnu-tune= options

2022-03-25 Thread Peter Bergner via Gcc-patches
On 3/25/22 4:08 PM, Segher Boessenkool wrote:
> On Fri, Mar 25, 2022 at 02:51:38PM -0500, Peter Bergner wrote:
>> This patch updates the POWER testsuite test cases using -mcpu= and -mtune=
>> to use the preferred -mdejagnu-cpu= and -mdejagnu-tune= options.  This also
>> obviates the need for the dg-skip-if directive, since the user cannot
>> override the -mcpu= value being used to compile the test case.
> 
> So this is all testcases that say "do not override -mcpu"?

Not all of them, but most of them, yes.


> It seems likely many of these tests should move to g++.target/powerpc .

Probably, that can be a follow on patch.  Maybe a good first patch for Surya.



> Those that should not should likely not use -mcpu= in the first place
> (instead, those tests should use has_arch_pwrN).

If the test cases explicitly added -mcpu=, I'm guessing they need them
to test whatever the test case is checking for.  If we remove the -mcpu=
and reply on dg-require has_arch_pwrN or whatever, then the test case
would only run whenever the default flags match that, right?  So it
would seem we'd get less test coverage than before.



>> +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/test_mffsl.c
> 
> I missed these two in reviewing when the -mcpu= was introduced, oops.

It's WAY too easy to miss those, since -mcpu= is such a common option
that we see and use everyday, we almost expect to see it, so it doesn't
look out of place or wrong.


> Okay for trunk.  Also okay for backports if you want / if you think it
> useful.  Thanks!

Thanks, commit pushed.  I had not thought about backports, but if it
helps stabilize our test results there, it can't hurt.  I'll have a
look and see if the tests even exist there or not.

Peter





Re: [PATCH] rs6000: Adjust mov optabs for opaque modes [PR103353]

2022-04-01 Thread Peter Bergner via Gcc-patches
On 4/1/22 3:50 PM, will schmidt wrote:
> Is there a testcase, new or existing, that illustrates this error path?

Well, the already existsing test case pr101849.c is where the issue was seen,
but only when compiled by hand outside of the test harness and using only the
-maltivec option and not the -mcpu=power10 that the test case uses.

That said, I agree, pr101849.c should probably be copied into another test case
file (pr103353.c) that uses options that trigger the issue so we can be sure the
fix won't regress.

Peter




[PATCH] rs6000: Handle pcrel sibcalls to longcall functions [PR104894]

2022-04-05 Thread Peter Bergner via Gcc-patches
Before PCREL in POWER10, we were not allowed to perform sibcalls to longcall
functions since callee's return would skip the TOC restore in the caller.
However, with PCREL we can now safely perform a sibling call to longcall
functions.  The problem with the current code in rs6000_sibcall_aix is that it
asserts we do not have a longcall and fixing that, it generates a direct call
to a PLT stub, when it should generate an indirect sibcall due to -fno-plt.
The solution here is to check for a pcrel longcall and emit an indirect sibcall
using an inline plt stub in that case.

This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
Ok for trunk?

This is marked as a GCC 11/12 regression.  Ok for a GCC 11 backport after
some burn-in on trunk?

Peter


gcc/
PR target/104894
* config/rs6000/rs6000.cc (rs6000_sibcall_aix): Handle pcrel sibcalls
to longcall functions.

gcc/testsuite/
PR target/104894
* gcc.target/powerpc/pr104894.c: New test.
* gcc.target/powerpc/pr104894-2.c: New test.


diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index cb18db06a2d..d38a1d61cfe 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -25659,11 +25659,21 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx 
tlsarg, rtx cookie)
   rtx r12 = NULL_RTX;
   rtx func_addr = func_desc;
 
-  gcc_assert (INTVAL (cookie) == 0);
-
   if (global_tlsarg)
 tlsarg = global_tlsarg;
 
+  /* Handle longcall attributes.  */
+  if ((INTVAL (cookie) & CALL_LONG) != 0
+  && GET_CODE (func_desc) == SYMBOL_REF)
+{
+  /* Only PCREL can do a sibling call to a longcall function
+because we don't need to restore the TOC register.  */
+  gcc_assert (rs6000_pcrel_p ());
+  func_desc = rs6000_longcall_ref (func_desc, tlsarg);
+}
+  else
+gcc_assert (INTVAL (cookie) == 0);
+
   /* For ELFv2, r12 and CTR need to hold the function address
  for an indirect call.  */
   if (GET_CODE (func_desc) != SYMBOL_REF && DEFAULT_ABI == ABI_ELFv2)
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104894.c 
b/gcc/testsuite/gcc.target/powerpc/pr104894.c
new file mode 100644
index 000..f46fe88168f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104894.c
@@ -0,0 +1,20 @@
+/* PR target/104894 */
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -fno-plt" } */
+
+/* Verify we do not ICE on the following test case and that we emit an
+   indirect sibcall, with r12 and CTR containing the function address.  */
+
+void foo (void);
+
+void
+bar (void)
+{
+  foo ();
+}
+
+/* { dg-final { scan-assembler-times {\mmtctr 12\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mbctr\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mbl\M} } } */
+/* { dg-final { scan-assembler-not {\mbctrl\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104894-2.c 
b/gcc/testsuite/gcc.target/powerpc/pr104894-2.c
new file mode 100644
index 000..d1a011ef4d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104894-2.c
@@ -0,0 +1,22 @@
+/* PR target/104894 */
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -fno-plt" } */
+
+/* Verify we do not ICE on the following test case and that we emit one
+   indirect call and one indirect sibcall, with r12 and CTR containing
+   the function addresses.  */
+
+void foo (void);
+
+void
+bar (void)
+{
+  foo ();
+  foo ();
+}
+
+/* { dg-final { scan-assembler-times {\mmtctr 12\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mbctrl\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mbctr\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mbl\M} } } */


Re: [PATCH] rs6000: Handle pcrel sibcalls to longcall functions [PR104894]

2022-04-05 Thread Peter Bergner via Gcc-patches
On 4/5/22 5:32 PM, Segher Boessenkool wrote:
> On Tue, Apr 05, 2022 at 05:06:50PM -0500, Peter Bergner wrote:
>>  
>> +  /* Handle longcall attributes.  */
>> +  if ((INTVAL (cookie) & CALL_LONG) != 0
>> +  && GET_CODE (func_desc) == SYMBOL_REF)
> 
> Don't say "!= 0" here please.
> 
>   if (INTVAL (cookie) & CALL_LONG && SYMBOL_REF_P (func_desc))

Yes, cut & paste.  I'll change it.



>> +{
>> +  /* Only PCREL can do a sibling call to a longcall function
>> + because we don't need to restore the TOC register.  */
> 
> Don't say "Only", it is the "New" syndrome: it is out of date before you
> know it :-(  You can just leave it away here.

Ok, I can drop "Only" and keep the rest the same.




>> +  gcc_assert (rs6000_pcrel_p ());
>> +  func_desc = rs6000_longcall_ref (func_desc, tlsarg);
>> +}
>> +  else
>> +gcc_assert (INTVAL (cookie) == 0);
> 
> So in the old code the cookie could *only* contain the CALL_LONG flag,
> now it can contain any others as long as it has that flag as well.
> Please fix.

No, the old code only allowed INTVAL (cookie) == 0, which means no
attributes are allowed.  The new code now allows the CALL_LONG attribute
iff the function is a SYMBOL_REF.  This is only allowed for pcrel calls
though.  I debated on whether to do a gcc_assert on rs6000_pcrel_p() or
fold the rs6000_pcrel_p() into the if () and let the original assert
on INTVAL (cookie) == 0 catch the illegal uses.  It's up to you on
what you prefer.



> Not every LONG_CALL needs a TOC restore though?  

I believe if the function we're calling has the CALL_LONG attribute
set, we have to assume that the TOC needs to be restored.  Now whether
the actual callee's TOC at run time is different or not from the callers
is another question.  We're just forced to assume they are different.


> I won't ask you to make it work in those cases of course, but change
> the comment to not be as misleading?  Taking the "Only" out is good
> already I think :-)

Yes, as I said above, I will drop "Only" from the comment.




> You probably should have the same condition here as actually doing a
> longcall as well, something involving SYMBOL_REF_FUNCTION_P?

I believe if we're here in rs6000_sibcall_aix() and func_desc is a
SYMBOL_REF, then it also must be SYMBOL_REF_FUNCTION_P, correct?
Otherwise, why would we be attempting to do a sibcall to it?

Peter




Re: [PATCH] rs6000: Handle pcrel sibcalls to longcall functions [PR104894]

2022-04-06 Thread Peter Bergner via Gcc-patches
On 4/5/22 10:33 PM, Peter Bergner via Gcc-patches wrote:
> On 4/5/22 5:32 PM, Segher Boessenkool wrote:
>> On Tue, Apr 05, 2022 at 05:06:50PM -0500, Peter Bergner wrote:
>>>  
>>> +  /* Handle longcall attributes.  */
>>> +  if ((INTVAL (cookie) & CALL_LONG) != 0
>>> +  && GET_CODE (func_desc) == SYMBOL_REF)
>>
>> Don't say "!= 0" here please.
>>
>>   if (INTVAL (cookie) & CALL_LONG && SYMBOL_REF_P (func_desc))
> 
> Yes, cut & paste.  I'll change it.
[snip]
>> Don't say "Only", it is the "New" syndrome: it is out of date before you
>> know it :-(  You can just leave it away here.
> 
> Ok, I can drop "Only" and keep the rest the same.

So the updated change looks like below with the ChangeLog entry and tests being 
the same:

Is this better and ok for trunk?

Peter


@@ -25659,11 +25659,20 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx 
tlsarg, rtx cookie)
   rtx r12 = NULL_RTX;
   rtx func_addr = func_desc;
 
-  gcc_assert (INTVAL (cookie) == 0);
-
   if (global_tlsarg)
 tlsarg = global_tlsarg;
 
+  /* Handle longcall attributes.  */
+  if (INTVAL (cookie) & CALL_LONG && SYMBOL_REF_P (func_desc))
+{
+  /* PCREL can do a sibling call to a longcall function
+because we don't need to restore the TOC register.  */
+  gcc_assert (rs6000_pcrel_p ());
+  func_desc = rs6000_longcall_ref (func_desc, tlsarg);
+}
+  else
+gcc_assert (INTVAL (cookie) == 0);
+
   /* For ELFv2, r12 and CTR need to hold the function address
  for an indirect call.  */
   if (GET_CODE (func_desc) != SYMBOL_REF && DEFAULT_ABI == ABI_ELFv2)



Re: [PATCH] rs6000: Handle pcrel sibcalls to longcall functions [PR104894]

2022-04-11 Thread Peter Bergner via Gcc-patches
On 4/11/22 4:13 PM, Segher Boessenkool wrote:
> On Wed, Apr 06, 2022 at 02:33:52PM -0500, Peter Bergner wrote:
>> On 4/5/22 10:33 PM, Peter Bergner via Gcc-patches wrote:
>>> On 4/5/22 5:32 PM, Segher Boessenkool wrote:
>>>> On Tue, Apr 05, 2022 at 05:06:50PM -0500, Peter Bergner wrote:
>> So the updated change looks like below with the ChangeLog entry and tests 
>> being the same:
> 
> Please don't send patches in existing threads, it confuses things.

It wasn't really meant as a patch, but more of a confirmation I made
the changes you wanted...and a ping. :-) 




>> Is this better and ok for trunk?
> 
> Assuming you write a good changelog for this, it is fine.  Thanks!

Done and pushed.  Thanks!   We need this on GCC11 and GCC10 as well.
With GCC11 due soon, I'd like this in there.  Ok for backports after
a day or so of trunk burn-in?

Peter



Re: Ping: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-04-20 Thread Peter Bergner via Gcc-patches
On 4/20/22 11:01 AM, Michael Meissner wrote:
> Ping patch.
> 
> | Date: Tue, 12 Apr 2022 21:14:55 -0400
> | From: Michael Meissner 
> | Subject: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR 
> target/102059
> | Message-ID: 
> 
> I feel this is an important patch.  Please look at it and approve the patch or
> give me feedback on how to change it.  Note, I will be in today (April 20th)
> and tomorrow (April 21st), but I will be away from a computer on April 22-25
> (Friday through Monday).

I agree this is important and we want this is in so we can get it backported.
I'm being pinged about this from a customer who is using GCC10 and this issue
is holding them back, so the quicker we get this in, the better.

Peter




Re: [PATCH, V2] Use system default for long double if not specified on PowerPC.

2022-02-04 Thread Peter Bergner via Gcc-patches
On 2/4/22 12:03 PM, Segher Boessenkool wrote:
> On Fri, Feb 04, 2022 at 04:43:53PM +0100, Andreas Schwab wrote:
>> On Feb 04 2022, Michael Meissner via Gcc-patches wrote:
>>> If the user did not specify a default long double format when configuring
>>> GCC, use the long double default from the host compiler.
>>
>> That doesn't make any sense.  The host compiler can be any random
>> compiler completely unrelated to the target.
> 
> Yes, see .
[snip]
> I already NAKed this patch for weeks, and I do it again now.

Did you NAK the patch due to its specific implementation or are you
even against the aim of the patch, namely that gcc configure tries
to determine the long double default of the underlying system and
matches that?

Peter




Re: [PATCH, V2] Use system default for long double if not specified on PowerPC.

2022-02-08 Thread Peter Bergner via Gcc-patches
On 2/4/22 4:26 PM, Segher Boessenkool wrote:
> As I said before, I didn't even read the patch, just the one line
> summary was enough for a NAK.  If the patch in fact does something else,
> then it is still incorrect, and needs a very different subject and
> summary.
> 
> I hope you see how "using the default of the underlying system" is
> questionable in itself, but is something completely different from using
> the default of the build compiler, which makes even less sense.
> 
> You want a configure flag to set the default long double format to be
> IEEE QP.  This cannot be enabled by default until a (big) majority of
> systems "in the wild" will work with that (only on powerpc64le-linux
> or some *big* thing like that is fine, only default it to enabled there
> then).  At that point in time, configure shouls complain, and the user
> would have to explicitly *disable* it to build without that support.

Can you please clarify one thing for me.  Do you think it's possible
that we can come up with some type of configure patch that automatically
sets the long double default given something on the system we can test
for or do you think that's impossible and we'll just have to live with
explicitly using a configure option to set the default?

...at least until some time in the far future when enough systems/distros
have changed to IEEE128 that we can change the default without a test.

Peter



[PATCH] rs6000: Retry tbegin. instructions that can fail intermittently

2022-02-15 Thread Peter Bergner via Gcc-patches
The HTM tbegin. instruction can fail intermittently due to many reasons.
This can lead to the htm-1.c testsuite test case FAILing from time to time.
The solution is to allow retrying the instruction a few times before aborting.

Bill has seen this fail a 1-2 times per 1 runs.  With the change below, I
haven't seen the test case fail in over 1 million runs.

Ok for trunk and backports to the release branches?

Peter


gcc/testsuite/
* gcc.target/powerpc/htm-1.c: Retry intermittent failing tbegins.

diff --git a/gcc/testsuite/gcc.target/powerpc/htm-1.c 
b/gcc/testsuite/gcc.target/powerpc/htm-1.c
index f27e32ca281..399a7ec8812 100644
--- a/gcc/testsuite/gcc.target/powerpc/htm-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/htm-1.c
@@ -12,14 +12,21 @@ main (void)
 {
   long i;
   unsigned long mask = 0;
+  unsigned long retry_count = 0;
 
 repeat:
   if (__builtin_tbegin (0))
 {
   mask++;
+  retry_count = 0;
 }
   else
-abort();
+{
+  /* Retry a limited number of times before aborting.  */
+  if (retry_count++ < 10)
+   goto repeat;
+  abort ();
+}
 
   if (mask == 1)
 {


Re: [PATCH] rs6000: Retry tbegin. instructions that can fail intermittently

2022-02-15 Thread Peter Bergner via Gcc-patches
On 2/15/22 3:54 PM, Segher Boessenkool wrote:
> On Tue, Feb 15, 2022 at 01:03:09PM -0600, Peter Bergner wrote:
>> The HTM tbegin. instruction can fail intermittently due to many reasons.
> 
> Just a few really.  But, if the transaction fails it will appear as if
> the tbegin. had an error as well, is that what you are seeing?

Yeah, exactly.  The transaction itself fails, which rolls us back to
the tbegin. with an error return value so we fall into the else/error
clause.



> That is the way any HTM code should be written in the first place
> (except for rollback-only transactions, but let's not go there --
> besides, it is normal for those to fail as well, and there needs to be a
> fallback there as well :-) )

Agreed and I'm not sure why I didn't write it that way to begin with.
Maybe I thought it was so simple that the likelihood of it failing was
so small we'd never see it?  Anyway, we do now, so...



> The patch is fine.  Okay for trunk and backports (after soak time ofc).
> Thanks!

Thanks, pushed.

Peter



Re: [PATCH] rs6000: Workaround for new ifcvt behavior [PR104335]

2022-02-18 Thread Peter Bergner via Gcc-patches
On 2/17/22 1:04 PM, Robin Dapp via Gcc-patches wrote:
>> Please send patches as plain text, not as base64.
> 
> It seems like Thunderbird does not support this anymore since later
> versions, grml.  Probably need to look for another mail client.

I use Thunderbird with no problems.  I use the Quicktext addon/extension
and it gives me a toolbar option "Other" which has a "Insert file as text"
which allows adding a patch to your email inline with no whitespace munging.

Peter


[PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-11-29 Thread Peter Bergner via Gcc-patches
Sorry for dropping the ball on testing the patch from the bugzilla!

The following patch fixes the ICE reported in the bugzilla on the pre-existing
gcc testsuite test case, bootstraps and shows no testsuite regressions
on powerpc64le-linux.  Ok for trunk?

Peter


For -ftrivial-auto-var-init=*, skip initializing the register variable if it
is an opaque type, because CONST0_RTX(mode) is not defined for opaque modes.

gcc/
PR middle-end/103127
* internal-fn.c (expand_DEFERRED_INIT): Skip if VAR_TYPE is opaque.

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 0cba95411a6..7cc0e9d5293 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3070,6 +3070,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
 }
   else
 {
+  /* Skip variables of opaque types that are in a register.  */
+  if (OPAQUE_TYPE_P (var_type))
+   return;
+
   /* If this variable is in a register use expand_assignment.
 For boolean scalars force zero-init.  */
   tree init;


Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-11-30 Thread Peter Bergner via Gcc-patches
On 11/30/21 2:37 AM, Richard Biener wrote:
> On Mon, Nov 29, 2021 at 11:56 PM Qing Zhao  wrote:
> I think that's inconsistent indeed.  Peter, what are "opaque"
> registers?  rs6000-modes.def suggests
> that there's __vector_pair and __vector_quad, what's the GIMPLE types
> for those?  It seems they
> are either SSA names or expanded to pseudo registers but there's no
> constants for them.

The __vector_pair and __vector_quad types are target specific types
for use with our Matrix-Math-Assist (MMA) unit and they are only
usable with our associated MMA built-in functions.  What they hold
is really dependent on which MMA built-ins you use on them.
You can think of them a generic (and large) vector type where the
subtype is undefined...or defined by which built-in function you
happen to be using.

We do not have any constants defined for them.  How we initialize them
is either by loading values from memory into them or by zeroing them
out using the xxsetaccz instruction (only for __vector_quads).




> Can they be initialized?  I see they can be copied at least.

__vector_quads can be zero initialized using the __builtin_mma_xxsetaccz()
built-in function.  We don't have a method (or use case) for zero initializing
__vector_pairs.



> If such "things" cannot be initialized they should indeed be exempt
> from auto-init.  The
> documentation suggests that they act as bit-bucked but even bit-buckets should
> be initializable, thus why exactly does CONST0_RTX not exist for them?

We used to have CONST0_RTX defined (but nothing else), but we had problems
with the compiler CSEing the initialization for multiple __vector_quads and
then copying the values around.  We'd end up with one xxsetaccz instruction
and copies out of that accumulator register into the other accumulator
registers.  Copies are VERY expensive, while xxsetaccz's are cheap, so we
don't want that.  That said, I think a fix I put in to disable fwprop on
these types may have been the culprit for that problem, so maybe we could
add the CONST0_RTX back?  I'd have to verify that.  If so, then we'd at least
be able to support -ftrivial-auto-var-init=zero.  The =pattern version
would be more problematical...unless the value for pattern was loaded from
memory.

Peter




Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-11-30 Thread Peter Bergner via Gcc-patches
On 11/30/21 11:51 AM, Qing Zhao wrote:
> So, looks like that the variable with OPAQUE_TYPE cannot be all explicitly 
> initialized even at source code level. 
> 
> The only way to initialize such variable (only __vector_quad, not for 
> __vector_pairs) at source code level is through call to 
> __builtin_mma_xxsetaccz as:
> 
> void
> foo (__vector_quad *dst)
> {
>   __vector_quad acc;
>   __builtin_mma_xxsetaccz(&acc);
>   *dst = acc;
> }
> 
> Is this the correct understanding?

Correct.  Or via...


> Is there way to initialize such variable to other values than zero at source 
> code level?

Not for any constant values.  You can load it from memory though like below,
which is also allowed for __vector_pair:

void
foo (__vector_quad *dst, __vector_quad *src)
{
  __vector_quad acc;
  acc = *src;
  ...
}
void
bar (__vector_pair *dst, __vector_pair *src)
{
  __vector_pair pair;
  pair = *src;
  ...
}

We do not accept things like:

  acc = 0;
  acc = {0, 0, ... };
  etc.

Peter


Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-11-30 Thread Peter Bergner via Gcc-patches
On 11/30/21 1:50 PM, Qing Zhao via Gcc-patches wrote:
>> void
>> bar (__vector_pair *dst, __vector_pair *src)
>> {
>>  __vector_pair pair;
>>  pair = *src;
>>  ...
>> }
> 
> However, even with the above, the memory pointed by “src” still need to
> be initialized somewhere. How to provide the initial value to the variable
> in the beginning for __vector_pair type?

Well no initialization is required here in this function.  Isn't that what
matters here?  When generating code for bar(), we assume that src already
points to initialized memory.

As for what src points to, that could be initialized how any other memory/
array could be initialized, so either a static array, read in some data
from a file into an array, compute the array values in a loop, etc. etc.

Peter



Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-11-30 Thread Peter Bergner via Gcc-patches
On 11/30/21 2:44 PM, Qing Zhao via Gcc-patches wrote:
> Sorry for the confusing…
> My major question is:  
> 
> for a variable of type __vector_pair,  could it be in a register?

Yes.  To be pedantic, it will live in a vector register pair.


> If it could be in a register, can we initialize this register with some 
> constant value? 

For a __vector_pair, no, not as it is setup now.  We also do not have a
use case where we would want to initialize a __vector_pair to a constant.
Our normal (only?) use case with a __vector_pair is to load it up with
some actual data from memory that represents a (partial) row of a matrix. 

For __vector_quad, it too lives in a register (accumulator register) and
represents a small matrix.  We have the __builtin_mma_xxsetaccz (&acc)
builtin to initialize it to a zero constant.

Peter



Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-12-01 Thread Peter Bergner via Gcc-patches
On 12/1/21 9:06 AM, Qing Zhao wrote:
>> On Dec 1, 2021, at 3:01 AM, Richard Biener  
>> wrote:
>> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init
>> instead of fixing up things at expansion time.
> 
> Agreed. 
> 
> So, Peter, will you update the routine “is_var_need_auto_init” in gimplify.c
> to exclude OPAQUE_TYPE to fix this issue?

Sure, I can give that a try and verify it fixes the ICE too.

Peter




Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-12-01 Thread Peter Bergner via Gcc-patches
On 12/1/21 3:01 AM, Richard Biener wrote:
> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init
> instead of fixing up things at expansion time.

The following fixes the ICE.  The bootstrap/regtesting is still running though.

Peter


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 8624f8221fd..326476f0238 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1829,6 +1829,7 @@ is_var_need_auto_init (tree decl)
  || !DECL_HARD_REGISTER (decl))
   && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
   && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl)))
+  && !OPAQUE_TYPE_P (TREE_TYPE (decl))
   && !is_empty_type (TREE_TYPE (decl)))
 return true;
   return false;


Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-12-01 Thread Peter Bergner via Gcc-patches
On 12/1/21 1:07 PM, Richard Biener wrote:
> On December 1, 2021 6:42:21 PM GMT+01:00, Peter Bergner 
>  wrote:
>> On 12/1/21 3:01 AM, Richard Biener wrote:
>>> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init
>>> instead of fixing up things at expansion time.
>>
>> The following fixes the ICE.  The bootstrap/regtesting is still running 
>> though.
> 
> OK. 

Great, pushed.  For posterity, below is what I committed.

Peter


middle-end: Skip initialization of opaque type variables [PR103127]

For -ftrivial-auto-var-init=*, skip initializing the variable if it is an
opaque type, because CONST0_RTX(mode) is not defined for opaque modes.

2021-12-01  Peter Bergner  

gcc/
PR middle-end/103127
* gimplify.c (is_var_need_auto_init): Handle opaque types.

gcc/testsuite/
PR middle-end/103127
* gcc.target/powerpc/pr103127.c: New test.

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 8624f8221fd..326476f0238 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1829,6 +1829,7 @@ is_var_need_auto_init (tree decl)
  || !DECL_HARD_REGISTER (decl))
   && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
   && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl)))
+  && !OPAQUE_TYPE_P (TREE_TYPE (decl))
   && !is_empty_type (TREE_TYPE (decl)))
 return true;
   return false;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103127.c 
b/gcc/testsuite/gcc.target/powerpc/pr103127.c
new file mode 100644
index 000..801fc0a4620
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103127.c
@@ -0,0 +1,19 @@
+/* PR target/103127 */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -ftrivial-auto-var-init=zero" } */
+
+/* Verify we do not ICE on the following tests.  */
+
+void
+foo (__vector_quad *dst)
+{
+  __vector_quad acc;
+  *dst = acc;
+}
+
+void
+bar (__vector_pair *dst)
+{
+  __vector_pair pair;
+  *dst = pair;
+}



[PING] [PATCH] rs6000: testsuite: Add rop_ok effective-target function

2021-12-02 Thread Peter Bergner via Gcc-patches
I'd like to ping the following patch.

The test case in the previously OK'd fix for PR101324 depends on this change,
so I've had to hold off committing it until this is in.  Thanks.

  https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584208.html

Peter



Message-ID: 


This patch adds a new effective-target function that tests whether
it is safe to emit the ROP-protect instructions and updates the
ROP test cases to use it.

Segher, as we discussed offline, this uses the double [] which you said
isn't needed in general regex's, but for some reason is needed in the gcc
testsuite regex.

Tested on powerpc64le*-linux with no regressions.  Ok for mainline?

Peter


gcc/testsuite/
* lib/target-supports.exp (check_effective_target_rop_ok): New function.
* gcc.target/powerpc/rop-1.c: Use it.
* gcc.target/powerpc/rop-2.c: Likewise.
* gcc.target/powerpc/rop-3.c: Likewise.
* gcc.target/powerpc/rop-4.c: Likewise.
* gcc.target/powerpc/rop-5.c: Likewise.


Re: [PATCH] rs6000: testsuite: Add rop_ok effective-target function

2021-12-03 Thread Peter Bergner via Gcc-patches
On 12/2/21 5:15 PM, Segher Boessenkool wrote:
>> Tested on powerpc64le*-linux with no regressions.  Ok for mainline?
> 
> What can "*" be there other than the empty string?  Which valuse of "*"
> did you test?  :-)

Heh, too used to typing powerpc64*-linux.  Yeah, in this case * == "".



> Okay for trunk.  Thanks!

Thanks, pushed.

Peter




Re: rs6000: Fix up flag_shrink_wrap handling in presence of -mrop-protect [PR101324]

2021-12-03 Thread Peter Bergner via Gcc-patches
On 10/29/21 4:45 PM, Segher Boessenkool wrote:
> On Wed, Oct 27, 2021 at 10:17:39PM -0500, Peter Bergner wrote:
>> 2021-10-27  Martin Liska  
>>
>> gcc/
>>  PR target/101324
>>  * config/rs6000/rs6000.c (rs6000_option_override_internal): Move the
>>  disabling of shrink-wrapping when using -mrop-protect from here...
>>  (rs6000_override_options_after_change): ...to here.
>>
>> 2021-10-27  Peter Bergner  
>>
>> gcc/testsuite/
>>  PR target/101324
>>  * gcc.target/powerpc/pr101324.c: New test.
> 
> Okay for trunk with similar robustification.  Thanks!

With the rop_ok change finally committed, I have finally pushed this
change with your suggested test case changes.  Thanks Martin for the
fix and Segher for the reviews.

Peter



Re: rs6000: Fix up flag_shrink_wrap handling in presence of -mrop-protect [PR101324]

2021-12-03 Thread Peter Bergner via Gcc-patches
On 12/3/21 2:39 PM, Peter Bergner wrote:
> On 10/29/21 4:45 PM, Segher Boessenkool wrote:
>> On Wed, Oct 27, 2021 at 10:17:39PM -0500, Peter Bergner wrote:
>>> 2021-10-27  Martin Liska  
>>>
>>> gcc/
>>> PR target/101324
>>> * config/rs6000/rs6000.c (rs6000_option_override_internal): Move the
>>> disabling of shrink-wrapping when using -mrop-protect from here...
>>> (rs6000_override_options_after_change): ...to here.
>>>
>>> 2021-10-27  Peter Bergner  
>>>
>>> gcc/testsuite/
>>> PR target/101324
>>> * gcc.target/powerpc/pr101324.c: New test.
>>
>> Okay for trunk with similar robustification.  Thanks!
> 
> With the rop_ok change finally committed, I have finally pushed this
> change with your suggested test case changes.  Thanks Martin for the
> fix and Segher for the reviews.

So I just checked and we have the same failure on GCC11 too.
Ok for the GCC11 release branch after this has burned-in on
trunk for a couple of days?

Ditto for the rop_ok testsuite patch that goes with this one?

Peter



Re: rs6000: Fix up flag_shrink_wrap handling in presence of -mrop-protect [PR101324]

2021-12-03 Thread Peter Bergner via Gcc-patches
On 12/3/21 3:27 PM, Peter Bergner wrote:
> On 12/3/21 2:39 PM, Peter Bergner wrote:
>> On 10/29/21 4:45 PM, Segher Boessenkool wrote:
>>> On Wed, Oct 27, 2021 at 10:17:39PM -0500, Peter Bergner wrote:
 2021-10-27  Martin Liska  

 gcc/
PR target/101324
* config/rs6000/rs6000.c (rs6000_option_override_internal): Move the
disabling of shrink-wrapping when using -mrop-protect from here...
(rs6000_override_options_after_change): ...to here.

 2021-10-27  Peter Bergner  

 gcc/testsuite/
PR target/101324
* gcc.target/powerpc/pr101324.c: New test.
>>>
>>> Okay for trunk with similar robustification.  Thanks!
>>
>> With the rop_ok change finally committed, I have finally pushed this
>> change with your suggested test case changes.  Thanks Martin for the
>> fix and Segher for the reviews.
> 
> So I just checked and we have the same failure on GCC11 too.
> Ok for the GCC11 release branch after this has burned-in on
> trunk for a couple of days?
> 
> Ditto for the rop_ok testsuite patch that goes with this one?

FYI, both commits backported cleanly and showed no testsuite regressions
on powerrpc64le-linux.

Peter





Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]

2021-12-03 Thread Peter Bergner via Gcc-patches
On 12/2/21 9:46 PM, Kewen.Lin via Gcc-patches wrote:
> on 2021/11/30 上午12:57, Segher Boessenkool wrote:
>> On Wed, Sep 01, 2021 at 02:55:51PM +0800, Kewen.Lin wrote:
>>> This patch is to fix the inconsistent behaviors for non-LTO mode
>>> and LTO mode.  As Martin pointed out, currently the function
>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>>> NULL, but it's wrong, we should use the command line options
>>> from target_option_default_node as default.
>>
>> This is not documented.
>>
> 
> Yeah, but according to the document for the target attribute [1],
> "Multiple target back ends implement the target attribute to specify
> that a function is to be compiled with different target options than
> specified on the command line. The original target command-line options
> are ignored. ", it seems to say the function without any target
> attribute/pragma will be compiled with target options specified on the
> command line.  I think it's a normal expectation for users.
> 
> Excepting for the inconsistent behaviors between LTO and non-LTO,
> it can also make the below case different.

I thought Martin and richi mentioned that target attribute options
are treated as if they are appended to the end of the command line
options, so they can potentially override earlier options, but they
don't actually ignore them?

Peter



Re: [PATCH] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c

2021-12-03 Thread Peter Bergner via Gcc-patches
On 12/2/21 4:24 PM, Segher Boessenkool wrote:
> On Thu, Dec 02, 2021 at 10:43:24AM -0600, Bill Schmidt wrote:
>> The new built-in infrastructure is now enabled!
> 
> Congratulations, and thanks for all the work!

A big +1!

Peter





Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]

2021-12-06 Thread Peter Bergner via Gcc-patches
On 12/6/21 3:35 AM, Martin Liška wrote:
> On 12/4/21 00:23, Peter Bergner wrote:
>> I thought Martin and richi mentioned that target attribute options
>> are treated as if they are appended to the end of the command line
>> options, so they can potentially override earlier options, but they
>> don't actually ignore them?
> 
> No, the described behavior is true for optimize attribute:
> 
> optimize (string, …)
> ...
> The optimize attribute arguments of a function behave behave as if appended 
> to the command-line.
> 
> but:
> 
> target (string, …)
> ...
> The original target command-line options are ignored.

Ok, you learn something new every day.  Thanks for setting me straight!

Peter




Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]

2021-12-06 Thread Peter Bergner via Gcc-patches
On 12/6/21 7:06 AM, Segher Boessenkool wrote:
> On Fri, Dec 03, 2021 at 11:46:53AM +0800, Kewen.Lin wrot
>> Without this fix, bar inlines foo but get the error as below:
>>
>> In function ‘foo’,
>> inlined from ‘bar’ at test.c:8:8:
>> test.c:3:9: error: ‘__builtin_ttest’ requires the ‘-mhtm’ option
>> 3 |   *b += __builtin_ttest();
>>   | ^
>>
>> Since bar doesn't support foo's required HTM feature.
>>
>> With this fix, bar doesn't inline foo and the case gets compiled well.
> 
> And if you inline something no-htm into something that allows htm?  That
> should work fine, be allowed just fine.

It is only ok to inline a function to be compiled with no-htm into a
function that will be compiled with htm if the no-htm function is
being compiled with no-htm by default.  If the user explicitly used
-fno-htm on the function, then we should not inline the htm function
into it.  Ditto for the other way around.  Meaning, the caller's
option flags should be a superset of the callee's flags, but the
explicit flags used on both functions need to match exactly.

Peter



[PATCH] rs6000: Fix check_effective_target_rop_ok [PR103556, PR103586]

2021-12-07 Thread Peter Bergner via Gcc-patches
The new rop_ok effective target test doesn't correctly compute its expression
result.  Solution is to wrap the test within a expr {} statement.

This has been verified to work on both powerpc64le-linux and powerpc64-linux.
The original test case accidentally worked on LE, but failed on BE.  Now we
correctly run the rop tests on LE and mark them as UNSUPPORTED on BE.

Ok for trunk?

Peter

gcc/testsuite/
PR target/103556
PR target/103586
* lib/target-supports.exp (check_effective_target_rop_ok): Use expr {}.

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 7956c387c55..50eef9d453b 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -6631,8 +6631,8 @@ proc check_effective_target_powerpc_elfv2 { } {
 # Return 1 if this is a PowerPC target supporting -mrop-protect
 
 proc check_effective_target_rop_ok { } {
-return [check_effective_target_power10_ok]
-   && [check_effective_target_powerpc_elfv2]
+return [expr { [check_effective_target_power10_ok]
+  && [check_effective_target_powerpc_elfv2] }]
 }
 
 # The VxWorks SPARC simulator accepts only EM_SPARC executables and


Re: [PATCH] rs6000: Fix check_effective_target_rop_ok [PR103556, PR103586]

2021-12-07 Thread Peter Bergner via Gcc-patches
On 12/7/21 12:20 PM, Peter Bergner wrote:
>  proc check_effective_target_rop_ok { } {
> -return [check_effective_target_power10_ok]
> - && [check_effective_target_powerpc_elfv2]
> +return [expr { [check_effective_target_power10_ok]
> +&& [check_effective_target_powerpc_elfv2] }]

Speaking with Segher offline, he mentioned that a newline ends a
statement, so the easiest "fix" is to just place the original code
on one line.  That was approved by Segher and committed.  Thanks.

Peter




Re: [PATCH] sched1: Fix -fcompare-debug issue in schedule_region [PR105586]

2022-08-23 Thread Peter Bergner via Gcc-patches
On 8/23/22 6:49 AM, Surya Kumari Jangala wrote:
> sched1: Fix -fcompare-debug issue in schedule_region [PR105586]
> 
> In schedule_region(), a basic block that does not contain any real insns
> is not scheduled and the dfa state at the entry of the bb is not copied
> to the fallthru basic block. However a DEBUG insn is treated as a real
> insn, and if a bb contains non-real insns and a DEBUG insn, it's dfa
> state is copied to the fallthru bb. This was resulting in
> -fcompare-debug failure as the incoming dfa state of the fallthru block
> is different with -g. We should always copy the dfa state of a bb to
> it's fallthru bb even if the bb does not contain real insns.
> 
> 2022-08-22  Surya Kumari Jangala  
> 
> gcc/
>   PR rtl-optimization/105586
>   * sched-rgn.cc (schedule_region): Always copy dfa state to
>   fallthru block.


It looks good to me, but I cannot approve it.  That said, you're missing
a ChangeLog entry for your new helper function.  The ChangeLog mentions
what changed, not why it changed.  Maybe something like the following?


gcc/
PR rtl-optimization/105586
* sched-rgn.cc (save_state_for_fallthru_edge): New function.
(schedule_region): Use it for all blocks.


Peter


[PATCH] rs6000: Don't ICE when we disassemble an MMA variable [PR101322]

2022-08-26 Thread Peter Bergner via Gcc-patches
When we expand an MMA disassemble built-in with C++ using a pointer that
is casted to a valid MMA type, the type isn't passed down to the expand
machinery and we end up using the base type of the pointer which leads to
an ICE.  This patch enforces we always use the correct MMA type regardless
of the pointer type being used.

This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
Ok for trunk and backports after some burn-in time?

Peter

gcc/
PR target/101322
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin):
Enforce the use of a valid MMA pointer type.

gcc/testsuite/
PR target/101322
* g++.target/powerpc/pr101322.C: New test.

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 12afa86854c..e796e74f072 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1085,7 +1085,12 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator 
*gsi,
   unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
   tree dst_ptr = gimple_call_arg (stmt, 0);
   tree src_ptr = gimple_call_arg (stmt, 1);
-  tree src_type = TREE_TYPE (src_ptr);
+  tree src_type = (fncode == RS6000_BIF_DISASSEMBLE_ACC)
+ ? build_pointer_type (vector_quad_type_node)
+ : build_pointer_type (vector_pair_type_node);
+  if (TREE_TYPE (TREE_TYPE (src_ptr)) != src_type)
+   src_ptr = build1 (VIEW_CONVERT_EXPR, src_type, src_ptr);
+
   tree src = create_tmp_reg_or_ssa_name (TREE_TYPE (src_type));
   gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq);
 
diff --git a/gcc/testsuite/g++.target/powerpc/pr101322.C 
b/gcc/testsuite/g++.target/powerpc/pr101322.C
new file mode 100644
index 000..59e71e8eb89
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr101322.C
@@ -0,0 +1,17 @@
+/* PR target/101322 */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-require-effective-target power10_ok } */
+
+/* Verify we don't ICE on the following test cases.  */
+
+void
+foo (char *resp, char *vpp)
+{
+  __builtin_vsx_disassemble_pair (resp, (__vector_pair *) vpp);
+}
+
+void
+bar (char *resp, char *vpp)
+{
+  __builtin_mma_disassemble_acc (resp, (__vector_quad *)vpp);
+}


[PATCH] rs6000: Allow conversions of MMA pointer types [PR106017]

2022-08-27 Thread Peter Bergner via Gcc-patches
GCC incorrectly disables conversions between MMA pointer types, which
are allowed with clang.  The original intent was to disable conversions
between MMA types and other other types, but pointer conversions should
have been allowed.  The fix is to just remove the MMA pointer conversion
handling code altogether.

This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
Ok for trunk and backports after some burn-in time?

Peter

gcc/
PR target/106017
* config/rs6000/rs6000.cc (rs6000_invalid_conversion): Remove handling
of MMA pointer conversions.

gcc/testsuite/
PR target/106017
* gcc.target/powerpc/pr106017.c: New test.

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index df491bee2ea..2f3146e56f8 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -28188,28 +28188,6 @@ rs6000_invalid_conversion (const_tree fromtype, 
const_tree totype)
   if (tomode == OOmode)
return N_("invalid conversion to type %<__vector_pair%>");
 }
-  else if (POINTER_TYPE_P (fromtype) && POINTER_TYPE_P (totype))
-{
-  /* We really care about the modes of the base types.  */
-  frommode = TYPE_MODE (TREE_TYPE (fromtype));
-  tomode = TYPE_MODE (TREE_TYPE (totype));
-
-  /* Do not allow conversions to/from XOmode and OOmode pointer
-types, except to/from void pointers.  */
-  if (frommode != tomode
- && frommode != VOIDmode
- && tomode != VOIDmode)
-   {
- if (frommode == XOmode)
-   return N_("invalid conversion from type %<__vector_quad *%>");
- if (tomode == XOmode)
-   return N_("invalid conversion to type %<__vector_quad *%>");
- if (frommode == OOmode)
-   return N_("invalid conversion from type %<__vector_pair *%>");
- if (tomode == OOmode)
-   return N_("invalid conversion to type %<__vector_pair *%>");
-   }
-}
 
   /* Conversion allowed.  */
   return NULL;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106017.c 
b/gcc/testsuite/gcc.target/powerpc/pr106017.c
new file mode 100644
index 000..46d6c7a4a33
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106017.c
@@ -0,0 +1,19 @@
+/* PR target/106017 */
+/* { dg-options "-O1 -mdejagnu-cpu=power10" } */
+/* { dg-require-effective-target power10_ok } */
+
+/* Make sure we do not flag any errors on the following test cases.  */
+
+void takeacc(__vector_quad *);
+void
+foo (void)
+{
+  __vector_quad arr[4];
+  takeacc (arr);
+}
+
+unsigned char *
+bar (__vector_quad *a)
+{
+  return (unsigned char *)a;
+}


Re: [PATCH] rs6000: Allow conversions of MMA pointer types [PR106017]

2022-08-27 Thread Peter Bergner via Gcc-patches
On 8/27/22 4:37 PM, Segher Boessenkool wrote:
> Such conversions are explicitly allowed in C, even (6.3.2.3/7).

Yeah, I think I just got a little carried away disabling them originally. :-(


>> The fix is to just remove the MMA pointer conversion
>> handling code altogether.
> 
> Okay for trunk and all backports.  Thanks!

Ok, pushed to trunk.  I'll backport after some burn-in.  Thanks!

Peter



Re: [PATCH] rs6000: Allow conversions of MMA pointer types [PR106017]

2022-08-29 Thread Peter Bergner via Gcc-patches
On 8/27/22 7:47 PM, Peter Bergner via Gcc-patches wrote:
> On 8/27/22 4:37 PM, Segher Boessenkool wrote:
>>> The fix is to just remove the MMA pointer conversion
>>> handling code altogether.
>>
>> Okay for trunk and all backports.  Thanks!
> 
> Ok, pushed to trunk.  I'll backport after some burn-in.  Thanks!

Test results on Bill's autotesters were clean on multiple systems,
so I pushed the backports to GCC 12, 11 and 10, so it's fixed
everywhere.  Thanks!

Peter



[PING][PATCH] rs6000: Don't ICE when we disassemble an MMA variable [PR101322]

2022-08-30 Thread Peter Bergner via Gcc-patches
I'd like to ping the following patch.

Peter


https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600451.html


Message-ID: 

>When we expand an MMA disassemble built-in with C++ using a pointer that
>is casted to a valid MMA type, the type isn't passed down to the expand
>machinery and we end up using the base type of the pointer which leads to
>an ICE.  This patch enforces we always use the correct MMA type regardless
>of the pointer type being used.
>
>This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
>Ok for trunk and backports after some burn-in time?
>
>Peter
>
>gcc/
>   PR target/101322
>   * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin):
>   Enforce the use of a valid MMA pointer type.
>
>gcc/testsuite/
>   PR target/101322
>   * g++.target/powerpc/pr101322.C: New test.
>
>diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
>b/gcc/config/rs6000/rs6000-builtin.cc
>index 12afa86854c..e796e74f072 100644
>--- a/gcc/config/rs6000/rs6000-builtin.cc
>+++ b/gcc/config/rs6000/rs6000-builtin.cc
>@@ -1085,7 +1085,12 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator 
>*gsi,
>   unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
>   tree dst_ptr = gimple_call_arg (stmt, 0);
>   tree src_ptr = gimple_call_arg (stmt, 1);
>-  tree src_type = TREE_TYPE (src_ptr);
>+  tree src_type = (fncode == RS6000_BIF_DISASSEMBLE_ACC)
>+? build_pointer_type (vector_quad_type_node)
>+: build_pointer_type (vector_pair_type_node);
>+  if (TREE_TYPE (TREE_TYPE (src_ptr)) != src_type)
>+  src_ptr = build1 (VIEW_CONVERT_EXPR, src_type, src_ptr);
>+
>   tree src = create_tmp_reg_or_ssa_name (TREE_TYPE (src_type));
>   gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq);
> 
>diff --git a/gcc/testsuite/g++.target/powerpc/pr101322.C 
>b/gcc/testsuite/g++.target/powerpc/pr101322.C
>new file mode 100644
>index 000..59e71e8eb89
>--- /dev/null
>+++ b/gcc/testsuite/g++.target/powerpc/pr101322.C
>@@ -0,0 +1,17 @@
>+/* PR target/101322 */
>+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
>+/* { dg-require-effective-target power10_ok } */
>+
>+/* Verify we don't ICE on the following test cases.  */
>+
>+void
>+foo (char *resp, char *vpp)
>+{
>+  __builtin_vsx_disassemble_pair (resp, (__vector_pair *) vpp);
>+}
>+
>+void
>+bar (char *resp, char *vpp)
>+{
>+  __builtin_mma_disassemble_acc (resp, (__vector_quad *)vpp);
>+}



  1   2   3   >