date:20231107

Re: [PATCH] RISC-V: Fixed failed rvv combine testcases

2023-11-07 Thread Lehua Ding


Hi Robin,

On 2023/11/7 15:57, Robin Dapp wrote:

Thanks, what I was slightly concerned about is that we now have
the implicit assumption that the initial value is 0.  I mean
that's what the vectorizer does for reductions but theoretically,
wouldn't we also combine other values into 0 now?


Sorry, I don't understand what you mean. I think this combine is only safe
if the value is initialized to 0, because the combine actually throws away
the operation of adding 0 (via the mask operand).


--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai

[PATCH] RISC-V regression test: Fix FAIL bb-slp-cond-1.c for RVV

2023-11-07 Thread Juzhe-Zhong

Previously, in this patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635392.html
I used vect64 && vect128 to represent both RVV and AMDGCN. However, that caused 
additional FAILs on ARM SVE.
I don't know why vect64 is true for ARM SVE, since their AdvSIMD uses 128-bit 
vectors and they don't use 64-bit vectors.

So here we reuse the current AMDGCN solution and simply add RISC-V alongside AMDGCN.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-cond-1.c: Add riscv.

---
 gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
index c8024429e9c..4089eb51b2e 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
@@ -47,6 +47,6 @@ int main ()
 }
 
 /* { dg-final { scan-tree-dump {(no need for alias check [^\n]* when VF is 
1|no alias between [^\n]* when [^\n]* is outside \(-16, 16\))} "vect" { target 
vect_element_align } } } */
-/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target { 
vect_element_align && { ! amdgcn-*-* } } } } } */
-/* { dg-final { scan-tree-dump-times "loop vectorized" 2 "vect" { target 
amdgcn-*-* } } } */
+/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target { 
vect_element_align && { ! { amdgcn-*-* riscv*-*-* } } } } } } */
+/* { dg-final { scan-tree-dump-times "loop vectorized" 2 "vect" { target { 
amdgcn-*-* riscv*-*-* } } } } */
 
-- 
2.36.3

[PATCH] testsuite: Change expectation for bb-slp-over-widen-n.c

2023-11-07 Thread Robin Dapp

Hi,

this patch makes sure we check for
  note: Basic block will be vectorized using SLP
instead of
  optimized: basic block
which will also match
  optimized: basic block part
of which there are many more in an RVV dump.
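
To illustrate the substring problem described above, here is a minimal C++
sketch. It assumes that counting plain substring occurrences is a fair
stand-in for what scan-tree-dump-times does with a pattern that has no regex
metacharacters. The helper count_occurrences and the dump excerpt are
hypothetical and not taken from a real RVV run.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of `needle` in `haystack`.
// Assumption: this approximates a scan-tree-dump-times count for a
// pattern without regex metacharacters.
std::size_t count_occurrences(const std::string &haystack,
                              const std::string &needle)
{
  std::size_t count = 0;
  for (std::size_t pos = haystack.find(needle); pos != std::string::npos;
       pos = haystack.find(needle, pos + needle.size()))
    ++count;
  return count;
}

// Hypothetical dump excerpt: two basic blocks are vectorized, but three
// "basic block part" messages are emitted along the way.
const std::string sample_dump =
  "a.c:10:1: note: Basic block will be vectorized using SLP\n"
  "a.c:10:1: optimized: basic block part vectorized using 16 byte vectors\n"
  "a.c:12:1: optimized: basic block part vectorized using 16 byte vectors\n"
  "a.c:20:1: note: Basic block will be vectorized using SLP\n"
  "a.c:20:1: optimized: basic block part vectorized using 16 byte vectors\n";
```

On this made-up excerpt the old pattern counts every "basic block part" line,
while the new pattern counts one line per vectorized basic block.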

Tested on x86 and aarch64 as well as RVV.

Regards
 Robin

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-over-widen-1.c: Change test expectation.
* gcc.dg/vect/bb-slp-over-widen-2.c: Ditto.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c | 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c
index b556a1d6278..7646c4f5a8b 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c
@@ -65,4 +65,4 @@ main (void)
 /* { dg-final { scan-tree-dump "demoting int to signed short" "slp2" { target 
{ ! vect_widen_shift } } } } */
 /* { dg-final { scan-tree-dump "demoting int to unsigned short" "slp2" { 
target { ! vect_widen_shift } } } } */
 /* { dg-final { scan-tree-dump {\.AVG_FLOOR} "slp2" { target vect_avg_qi } } } 
*/
-/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { 
target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "note: Basic block will be vectorized" 2 
"slp2" { target vect_hw_misalign } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c
index d1aa161c3ad..d9838c0917d 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c
@@ -64,4 +64,4 @@ main (void)
 /* { dg-final { scan-tree-dump "demoting int to signed short" "slp2" { target 
{ ! vect_widen_shift } } } } */
 /* { dg-final { scan-tree-dump "demoting int to unsigned short" "slp2" { 
target { ! vect_widen_shift } } } } */
 /* { dg-final { scan-tree-dump {\.AVG_FLOOR} "slp2" { target vect_avg_qi } } } 
*/
-/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { 
target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "note: Basic block will be vectorized" 2 
"slp2" { target vect_hw_misalign } } } */
-- 
2.41.0

Re: [V2 PATCH] Handle bitop with INTEGER_CST in analyze_and_compute_bitop_with_inv_effect.

2023-11-07 Thread Richard Biener

On Tue, Nov 7, 2023 at 7:08 AM liuhongt  wrote:
>
> analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> loop invariant which is not the case when it's INTEGER_CST.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?

So this addresses a missed optimization, right?  It seems to me that
even with two SSA names we are only "lucky" when rhs1 is the invariant
one.  So instead of swapping this way I'd do

  unsigned i;
  for (i = 0; i < 2; ++i)
    if (TREE_CODE (match_op[i]) == SSA_NAME
        && ...)
      break; /* found! */

  if (i == 2)
    return NULL_TREE;
  if (i == 0)
    std::swap (match_op[0], match_op[1]);

to also handle a "swapped" pair of SSA names?
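
The control flow suggested above can be tried out in isolation with a small
C++ sketch. Everything here is a toy: Operand is not GCC's tree, and the
condition elided as "..." in the suggestion is left to a caller-supplied
predicate (extra_check, a hypothetical name) rather than guessed.

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <utility>

// Toy stand-in for a GIMPLE operand; this is not GCC's `tree`.
struct Operand
{
  bool is_ssa_name;
  int id;
};

// Look at both operands for one that qualifies.  Returns false if neither
// does.  Otherwise the qualifying operand ends up in slot 1, being swapped
// there if it was found in slot 0.
bool put_match_second(std::array<Operand, 2> &ops,
                      const std::function<bool(const Operand &)> &extra_check)
{
  unsigned i;
  for (i = 0; i < 2; ++i)
    if (ops[i].is_ssa_name && extra_check(ops[i]))
      break; // found

  if (i == 2)
    return false;
  if (i == 0)
    std::swap(ops[0], ops[1]);
  return true;
}
```

With this shape the later checks can keep assuming a fixed operand order,
regardless of which side the qualifying operand was written on.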

> gcc/ChangeLog:
>
> PR tree-optimization/105735
> PR tree-optimization/111972
> * tree-scalar-evolution.cc
> (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> INTEGER_CST.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr105735-3.c: New test.
> ---
>  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++
>  gcc/tree-scalar-evolution.cc   |  3 +
>  2 files changed, 90 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c 
> b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> new file mode 100644
> index 000..9e268a1a997
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> @@ -0,0 +1,87 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } 
> */
> +
> +unsigned int
> +__attribute__((noipa))
> +foo (unsigned int tmp)
> +{
> +  for (int bit = 0; bit < 64; bit++)
> +tmp &= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo1 (unsigned int tmp)
> +{
> +  for (int bit = 63; bit >= 0; bit -=3)
> +tmp &= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo2 (unsigned int tmp)
> +{
> +  for (int bit = 0; bit < 64; bit++)
> +tmp |= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo3 (unsigned int tmp)
> +{
> +  for (int bit = 63; bit >= 0; bit -=3)
> +tmp |= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo4 (unsigned int tmp)
> +{
> +  for (int bit = 0; bit < 64; bit++)
> +tmp ^= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +foo5 (unsigned int tmp)
> +{
> +  for (int bit = 0; bit < 63; bit++)
> +tmp ^= 11304;
> +  return tmp;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +f (unsigned int tmp, int bit)
> +{
> +  unsigned int res = tmp;
> +  for (int i = 0; i < bit; i++)
> +res &= 11304;
> +  return res;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +f1 (unsigned int tmp, int bit)
> +{
> +  unsigned int res = tmp;
> +  for (int i = 0; i < bit; i++)
> +res |= 11304;
> +  return res;
> +}
> +
> +unsigned int
> +__attribute__((noipa))
> +f2 (unsigned int tmp, int bit)
> +{
> +  unsigned int res = tmp;
> +  for (int i = 0; i < bit; i++)
> +res ^= 11304;
> +  return res;
> +}
> diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> index 70b17c5bca1..f61277c32df 100644
> --- a/gcc/tree-scalar-evolution.cc
> +++ b/gcc/tree-scalar-evolution.cc
> @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class loop* 
> loop, tree phidef,
>match_op[0] = gimple_assign_rhs1 (def);
>match_op[1] = gimple_assign_rhs2 (def);
>
> +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> +std::swap (match_op[0], match_op[1]);
> +
>if (TREE_CODE (match_op[1]) != SSA_NAME
>|| !expr_invariant_in_loop_p (loop, match_op[0])
>|| !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
> --
> 2.31.1
>

ping / [Patch] OpenMP: Add C++ support for 'omp allocate' with stack variables

2023-11-07 Thread Tobias Burnus


Since I know my C++ FE knowledge is lacking, I would very much appreciate it
if someone could have a look, to reduce the chance that I did something
stupid or missed something obvious...

https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633782.html

Tobias

On 20.10.23 18:49, Tobias Burnus wrote:

This patch adds C++ support for OpenMP's 'omp allocate' for
stack/automatic arrays.

Comments and suggestions? I bet there are some, given my limited knowledge
of the C++ FE...

Tobias

PS: I think I should write some additional C++-specific code, I bet some
corner cases are missed. The question is only which ones.
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße
201, 80634 München; Gesellschaft mit beschränkter Haftung;
Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft:
München; Registergericht München, HRB 106955


Re: [PATCH] vect/ifcvt: Add vec_cond fallback and check for vector versioning.

2023-11-07 Thread Richard Biener

On Tue, 7 Nov 2023, Robin Dapp wrote:

> Hi,
> 
> this restricts tree-ifcvt to only create COND_OPs when we versioned the
> loop for vectorization.  Apart from that it re-creates a VEC_COND_EXPR
> in vect_expand_fold_left if we emitted a COND_OP.
> 
> I'm still missing the "bail out" part for vect_expand_fold_left, though?
> 
> Bootstrap, testsuites are unchanged but so they were for the original
> patch, so... :)

Looks good, maybe ...

> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/112361
>   PR target/112359
>   PR middle-end/112406
> 
>   * tree-if-conv.cc (convert_scalar_cond_reduction): Remember if
>   loop was versioned and only then create COND_OPs.
>   (predicate_scalar_phi): Do not create COND_OP when not
>   vectorizing.
>   * tree-vect-loop.cc (vect_expand_fold_left): Re-create
>   VEC_COND_EXPR.
>   (vectorize_fold_left_reduction): Pass mask and whether a cond_op
>   was used to vect_expand_fold_left.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/pr112359.c: New test.
> ---
>  gcc/testsuite/gcc.dg/pr112359.c | 14 +++
>  gcc/tree-if-conv.cc | 41 ++---
>  gcc/tree-vect-loop.cc   | 22 --
>  3 files changed, 62 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr112359.c
> 
> diff --git a/gcc/testsuite/gcc.dg/pr112359.c b/gcc/testsuite/gcc.dg/pr112359.c
> new file mode 100644
> index 000..26c77457399
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr112359.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -mavx512fp16 -ftree-loop-if-convert" } */
> +
> +int i, c;
> +unsigned long long u;
> +
> +void
> +foo (void)
> +{
> +  for (; i; i++)
> +if (c)
> +  u |= i;
> +}
> +
> diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
> index a7a751b668c..0190cf2369e 100644
> --- a/gcc/tree-if-conv.cc
> +++ b/gcc/tree-if-conv.cc
> @@ -1845,12 +1845,15 @@ is_cond_scalar_reduction (gimple *phi, gimple 
> **reduc, tree arg_0, tree arg_1,
>  res_2 = res_13 + _ifc__1;
>Argument SWAP tells that arguments of conditional expression should be
>swapped.
> +  If LOOP_VERSIONED is true if we assume that we versioned the loop for
> +  vectorization.  In that case we can create a COND_OP.
>Returns rhs of resulting PHI assignment.  */
>  
>  static tree
>  convert_scalar_cond_reduction (gimple *reduc, gimple_stmt_iterator *gsi,
>  tree cond, tree op0, tree op1, bool swap,
> -bool has_nop, gimple* nop_reduc)
> +bool has_nop, gimple* nop_reduc,
> +bool loop_versioned)
>  {
>gimple_stmt_iterator stmt_it;
>gimple *new_assign;
> @@ -1874,7 +1877,7 @@ convert_scalar_cond_reduction (gimple *reduc, 
> gimple_stmt_iterator *gsi,
>   The COND_OP will have a neutral_op else value.  */
>internal_fn ifn;
>ifn = get_conditional_internal_fn (reduction_op);
> -  if (ifn != IFN_LAST
> +  if (loop_versioned && ifn != IFN_LAST
>&& vectorized_internal_fn_supported_p (ifn, TREE_TYPE (lhs))
>&& !swap)
>  {
> @@ -2129,11 +2132,13 @@ cmp_arg_entry (const void *p1, const void *p2, void * 
> /* data.  */)
> The generated code is inserted at GSI that points to the top of
> basic block's statement list.
> If PHI node has more than two arguments a chain of conditional
> -   expression is produced.  */
> +   expression is produced.
> +   LOOP_VERSIONED should be true if we know that the loop was versioned for
> +   vectorization. */
>  
>  
>  static void
> -predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
> +predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi, bool 
> loop_versioned)
>  {
>gimple *new_stmt = NULL, *reduc, *nop_reduc;
>tree rhs, res, arg0, arg1, op0, op1, scev;
> @@ -2213,7 +2218,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator 
> *gsi)
> /* Convert reduction stmt into vectorizable form.  */
> rhs = convert_scalar_cond_reduction (reduc, gsi, cond, op0, op1,
>  true_bb != gimple_bb (reduc),
> -has_nop, nop_reduc);
> +has_nop, nop_reduc,
> +loop_versioned);
> redundant_ssa_names.safe_push (std::make_pair (res, rhs));
>   }
>else
> @@ -2311,7 +2317,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator 
> *gsi)
>   {
> /* Convert reduction stmt into vectorizable form.  */
> rhs = convert_scalar_cond_reduction (reduc, gsi, cond, op0, op1,
> -swap, has_nop, nop_reduc);
> +swap, has_nop, nop_reduc,
> +loop_versioned);
> redundant_ssa_names.safe_push (std::make_pair

[PATCH] testsuite/vect: Make check more accurate.

2023-11-07 Thread Robin Dapp

Hi,

similar to before, this modifies a check so that we only match a
vectorization attempt if it succeeded.  On riscv we potentially try
several modes, some of which may fail.
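
The idea of only matching an attempt that runs through to "succeeded" without
crossing "failed" or "Re-trying" can be sketched in C++ with std::regex. This
is an illustration under assumptions: `.` is replaced by `[\s\S]` because
std::regex's `.` does not match newlines, whereas the testsuite's Tcl regex
is assumed to let `.` match them; count_matches and the dump text are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Pattern from the patch, with `.` swapped for `[\s\S]` (assumption: this
// mimics a `.` that also matches newlines).  Each consumed character is
// guarded by the two negative lookaheads, so a match cannot run across
// "failed" or "Re-trying" on its way to "succeeded".
const std::regex succeeded_attempt(
  R"(optimizing condition reduction with FOLD_EXTRACT_LAST(?:(?!failed)(?!Re-trying)[\s\S])*succeeded)");

// Number of non-overlapping matches of the pattern in `dump`.
std::ptrdiff_t count_matches(const std::string &dump)
{
  return std::distance(
    std::sregex_iterator(dump.begin(), dump.end(), succeeded_attempt),
    std::sregex_iterator());
}

// Hypothetical dump: the first attempt fails and is retried with another
// mode, the second attempt succeeds.
const std::string sample_dump =
  "optimizing condition reduction with FOLD_EXTRACT_LAST\n"
  "analysis failed\n"
  "Re-trying analysis with another vector mode\n"
  "optimizing condition reduction with FOLD_EXTRACT_LAST\n"
  "analysis succeeded\n";
```

On this made-up dump the plain message appears twice, but only the second
attempt is counted by the stricter pattern.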

I tested on riscv, aarch64 and x86 but on the cfarm machines
there is no vect_fold_extract_last.  Maybe gcn would work?

Regards
 Robin

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-cond-reduc-4.c: Make check more accurate.
---
 gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c 
b/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c
index 8ea8c538713..c5aa989ec29 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c
@@ -42,7 +42,7 @@ main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "optimizing condition reduction with 
FOLD_EXTRACT_LAST" 2 "vect" { target { vect_fold_extract_last && 
vect_pack_trunc } } } } */
+/* { dg-final { scan-tree-dump-times "optimizing condition reduction with 
FOLD_EXTRACT_LAST(?:(?!failed)(?!Re-trying).)*succeeded" 2 "vect" { target { 
vect_fold_extract_last && vect_pack_trunc } } } } */
 /* { dg-final { scan-tree-dump-times "optimizing condition reduction with 
FOLD_EXTRACT_LAST" 4 "vect" { target { { vect_fold_extract_last } && { ! 
vect_pack_trunc } } } } } */
 /* { dg-final { scan-tree-dump-times "condition expression based on integer 
induction." 2 "vect" { target { ! vect_fold_extract_last } } } } */
 
-- 
2.41.0

Re: [PATCH][GCC13] PR tree-optimization/105834 - Choose better initial values for ranger.

2023-11-07 Thread Richard Biener

On Mon, Nov 6, 2023 at 7:15 PM Andrew MacLeod  wrote:
>
> As requested porting this patch from trunk resolves this PR in GCC 13.
>
> Bootstraps on x86_64-pc-linux-gnu with no regressions.  OK for the gcc
> 13 branch?

The change caused PR110540 on trunk (still unfixed).  I don't think we want to
trade one missed optimization regression for another on the branch.

Thanks,
Richard.

> Andrew
>
>
>

Re: [PATCH] testsuite/vect: Make check more accurate.

2023-11-07 Thread Richard Biener

On Tue, Nov 7, 2023 at 9:22 AM Robin Dapp  wrote:
>
> Hi,
>
> similar to before this modifies a check so we do only match a
> vectorization attempt if it succeeded.  On riscv we potentially try
> several modes of which some may fail.

OK.

> I tested on riscv, aarch64 and x86 but on the cfarm machines
> there is no vect_fold_extract_last.  Maybe gcn would work?
>
> Regards
>  Robin
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/vect/vect-cond-reduc-4.c: Make check more accurate.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c 
> b/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c
> index 8ea8c538713..c5aa989ec29 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-4.c
> @@ -42,7 +42,7 @@ main (void)
>  }
>
>  /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
> -/* { dg-final { scan-tree-dump-times "optimizing condition reduction with 
> FOLD_EXTRACT_LAST" 2 "vect" { target { vect_fold_extract_last && 
> vect_pack_trunc } } } } */
> +/* { dg-final { scan-tree-dump-times "optimizing condition reduction with 
> FOLD_EXTRACT_LAST(?:(?!failed)(?!Re-trying).)*succeeded" 2 "vect" { target { 
> vect_fold_extract_last && vect_pack_trunc } } } } */
>  /* { dg-final { scan-tree-dump-times "optimizing condition reduction with 
> FOLD_EXTRACT_LAST" 4 "vect" { target { { vect_fold_extract_last } && { ! 
> vect_pack_trunc } } } } } */
>  /* { dg-final { scan-tree-dump-times "condition expression based on integer 
> induction." 2 "vect" { target { ! vect_fold_extract_last } } } } */
>
> --
> 2.41.0
>

[PATCH] testsuite/vect: Make check more accurate.

2023-11-07 Thread juzhe.zh...@rivai.ai

Hi, Robin.

/* { dg-final { scan-tree-dump-times "optimizing condition reduction with 
FOLD_EXTRACT_LAST" 4 "vect" { target { { vect_fold_extract_last } && { ! 
vect_pack_trunc } } } } } */
This check should be removed. Previously, I added it since we didn't enable 
vect_pack_trunc test.
But I think we don't need it any more. Your fix looks more reasonable.



juzhe.zh...@rivai.ai

Re: [PATCH] testsuite: Change expectation for bb-slp-over-widen-n.c

2023-11-07 Thread Richard Biener

On Tue, Nov 7, 2023 at 9:07 AM Robin Dapp  wrote:
>
> Hi,
>
> this patch makes sure we check for
>   note: Basic block will be vectorized using SLP
> instead of
>   optimized: basic block
> which will also match
>   optimized: basic block part
> of which there are many more in an RVV dump.
>
> Tested on x86 and aarch64 as well as RVV.

OK

> Regards
>  Robin
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/vect/bb-slp-over-widen-1.c: Change test expectation.
> * gcc.dg/vect/bb-slp-over-widen-2.c: Ditto.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c | 2 +-
>  gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c
> index b556a1d6278..7646c4f5a8b 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-1.c
> @@ -65,4 +65,4 @@ main (void)
>  /* { dg-final { scan-tree-dump "demoting int to signed short" "slp2" { 
> target { ! vect_widen_shift } } } } */
>  /* { dg-final { scan-tree-dump "demoting int to unsigned short" "slp2" { 
> target { ! vect_widen_shift } } } } */
>  /* { dg-final { scan-tree-dump {\.AVG_FLOOR} "slp2" { target vect_avg_qi } } 
> } */
> -/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { 
> target vect_hw_misalign } } } */
> +/* { dg-final { scan-tree-dump-times "note: Basic block will be vectorized" 
> 2 "slp2" { target vect_hw_misalign } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c
> index d1aa161c3ad..d9838c0917d 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-over-widen-2.c
> @@ -64,4 +64,4 @@ main (void)
>  /* { dg-final { scan-tree-dump "demoting int to signed short" "slp2" { 
> target { ! vect_widen_shift } } } } */
>  /* { dg-final { scan-tree-dump "demoting int to unsigned short" "slp2" { 
> target { ! vect_widen_shift } } } } */
>  /* { dg-final { scan-tree-dump {\.AVG_FLOOR} "slp2" { target vect_avg_qi } } 
> } */
> -/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { 
> target vect_hw_misalign } } } */
> +/* { dg-final { scan-tree-dump-times "note: Basic block will be vectorized" 
> 2 "slp2" { target vect_hw_misalign } } } */
> --
> 2.41.0
>

Re: [PATCH] RISC-V regression test: Fix FAIL bb-slp-cond-1.c for RVV

2023-11-07 Thread Richard Biener

On Tue, Nov 7, 2023 at 9:07 AM Juzhe-Zhong  wrote:
>
> Previously, in this patch: 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635392.html
> I use vect64 && vect128 to represent both RVV and AMDGCN. However, it caused 
> additional FAIL on ARM SVE.
> I don't know why ARM SVE vect64 is set as true since their AdvSIMD is 128bit 
> vector and they don't use 64bit vector.
>
> So, here we leverage current AMDGCN solution, just add RISCV like AMDGCN.

OK.

> gcc/testsuite/ChangeLog:
>
> * gcc.dg/vect/bb-slp-cond-1.c: Add riscv.
>
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
> index c8024429e9c..4089eb51b2e 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
> @@ -47,6 +47,6 @@ int main ()
>  }
>
>  /* { dg-final { scan-tree-dump {(no need for alias check [^\n]* when VF is 
> 1|no alias between [^\n]* when [^\n]* is outside \(-16, 16\))} "vect" { 
> target vect_element_align } } } */
> -/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target { 
> vect_element_align && { ! amdgcn-*-* } } } } } */
> -/* { dg-final { scan-tree-dump-times "loop vectorized" 2 "vect" { target 
> amdgcn-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target { 
> vect_element_align && { ! { amdgcn-*-* riscv*-*-* } } } } } } */
> +/* { dg-final { scan-tree-dump-times "loop vectorized" 2 "vect" { target { 
> amdgcn-*-* riscv*-*-* } } } } */
>
> --
> 2.36.3
>

Re: [PATCH] testsuite/vect: Make check more accurate.

2023-11-07 Thread Robin Dapp

Sorry, didn't reply-all:

> /* { dg-final { scan-tree-dump-times "optimizing condition reduction with 
> FOLD_EXTRACT_LAST" 4 "vect" { target { { vect_fold_extract_last } && { ! 
> vect_pack_trunc } } } } } */
> 
> This check should be removed. Previously, I added it since we didn't enable 
> vect_pack_trunc test.
> But I think we don't need it any more. Your fix looks more reasonable.

Ah, good, will remove it before committing.

Regards
 Robin

[PATCH] test: Fix bb-slp-33.c for RVV

2023-11-07 Thread Juzhe-Zhong

As https://godbolt.org/z/hPsqahEa5 shows, RVV fails the dump check because
"vectorizing stmts using SLP" appears 3 times instead of 2.

The root cause is that this code in main:

  if (a[0] != 1
  || a[1] != 2
  || a[2] != 3
  || a[3] != 4
  || a[4] != 7
  || a[5] != 0
  || a[6] != 0
  || a[7] != 0
  || a[8] != 0)
abort ();

is vectorized. So add -fno-tree-vectorize to main to avoid the confusing match.
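
As a rough sketch of the approach, the GCC-specific optimize function
attribute can switch off vectorization for just the checking code while the
kernel under test stays eligible. The functions fill and result_ok are
hypothetical and much smaller than the real test; other compilers may ignore
or warn about the attribute.

```cpp
#include <cassert>

// Kernel that a vectorizer test would want to observe in the dump.
void fill(int *__restrict__ a, const int *__restrict__ b)
{
  for (int i = 0; i < 4; ++i)
    a[i] = b[i] + 1;
}

// GCC-specific: request that this one function is compiled as if
// -fno-tree-vectorize had been given, so its comparisons should not add
// extra vectorizer messages to the dump.
__attribute__((optimize("-fno-tree-vectorize")))
bool result_ok(const int *a)
{
  return a[0] == 1 && a[1] == 2 && a[2] == 3 && a[3] == 4;
}
```

The attribute only changes how the checking function is optimized; the
values it compares are unaffected.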

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-33.c: Add -fno-tree-vectorize to main.

---
 gcc/testsuite/gcc.dg/vect/bb-slp-33.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-33.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-33.c
index bbb13ef798e..f44cbdcfbcf 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-33.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-33.c
@@ -17,7 +17,8 @@ test(int *__restrict__ a, int *__restrict__ b)
   a[8] = 0;
 }
 
-int main()
+int __attribute__((optimize(("-fno-tree-vectorize"))))
+main()
 {
   int a[9];
   int b[4];
-- 
2.36.3

[COMMITTED] ada: Fix internal error on address of element of packed array component

2023-11-07 Thread Marc Poulhiès

From: Eric Botcazou 

This occurs when the component is part of a discriminated type and its
offset depends on a discriminant, the problem being that the front-end
generates an incomplete Bit_Position attribute reference.

gcc/ada/

* exp_pakd.adb (Get_Base_And_Bit_Offset): Use the full component
reference instead of just the selector name for 'Bit_Position.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_pakd.adb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/exp_pakd.adb b/gcc/ada/exp_pakd.adb
index c3908a54538..ad12aec1e23 100644
--- a/gcc/ada/exp_pakd.adb
+++ b/gcc/ada/exp_pakd.adb
@@ -2112,8 +2112,8 @@ package body Exp_Pakd is
 
   --  We build up an expression serially that has the form
 
-  --linear-subscript * component_size   for each array reference
-  --  +  field'Bit_Position for each record field
+  --linear-subscript * component_size for each array component ref
+  --  +  pref.component'Bit_Position  for each record component ref
   --  +  ...
 
   loop
@@ -2135,7 +2135,7 @@ package body Exp_Pakd is
  elsif Nkind (Base) = N_Selected_Component then
 Term :=
   Make_Attribute_Reference (Loc,
-Prefix => Selector_Name (Base),
+Prefix => Base,
 Attribute_Name => Name_Bit_Position);
 
  else
-- 
2.42.0

[COMMITTED] ada: Fix scope of semantic style_check pragmas

2023-11-07 Thread Marc Poulhiès

From: Viljar Indus 

Restore the original state of Style_Check pragmas before analyzing
each compilation unit, to prevent Style_Check pragmas from one unit
affecting the style checks of a different unit.
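
The save/restore idea can be shown with a small C++ analogy. It is not a
translation of the Ada change: style_check_options, Unit and analyze_unit are
hypothetical names, and the options are modelled as a plain string.

```cpp
#include <cassert>
#include <string>

// Hypothetical global state standing in for the active style check options.
std::string style_check_options = "default";

// Toy compilation unit: a name plus the options its pragmas would set
// (empty if the unit carries no style check pragma).
struct Unit
{
  std::string name;
  std::string pragma_options;
};

// Analyze one unit and return the options that were active during analysis.
std::string analyze_unit(const Unit &unit)
{
  const std::string saved = style_check_options; // save before the pragmas
  if (!unit.pragma_options.empty())
    style_check_options = unit.pragma_options;   // pragma takes effect
  const std::string seen = style_check_options;  // analysis would run here
  style_check_options = saved;                   // restore for the next unit
  return seen;
}
```

Because the options are restored at the end, a unit analyzed afterwards
starts from the same state as the first one did.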

gcc/ada/

* sem_ch10.adb: (Analyze_Compilation_Unit): Restore the original
state of style check pragmas at the end of the analysis.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch10.adb | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/ada/sem_ch10.adb b/gcc/ada/sem_ch10.adb
index ba4beae2851..90d2f3c6c74 100644
--- a/gcc/ada/sem_ch10.adb
+++ b/gcc/ada/sem_ch10.adb
@@ -638,6 +638,7 @@ package body Sem_Ch10 is
   Par_Spec_Name : Unit_Name_Type;
   Spec_Id   : Entity_Id;
   Unum  : Unit_Number_Type;
+  Options   : Style_Check_Options;
 
--  Start of processing for Analyze_Compilation_Unit
 
@@ -717,6 +718,11 @@ package body Sem_Ch10 is
  Set_Context_Pending (N);
   end if;
 
+  --  Store the style check options before analyzing context pragmas that
+  --  might change them for this compilation unit.
+
+  Save_Style_Check_Options (Options);
+
   Analyze_Context (N);
 
   Set_Context_Pending (N, False);
@@ -1395,6 +1401,10 @@ package body Sem_Ch10 is
  Pop_Scope;
   end if;
 
+  --  Finally restore all the original style check options
+
+  Set_Style_Check_Options (Options);
+
   --  If No_Elaboration_Code_All was encountered, this is where we do the
   --  transitive test of with'ed units to make sure they have the aspect.
   --  This is delayed till the end of analyzing the compilation unit to
-- 
2.42.0

[COMMITTED] ada: Simplify code for Ignore_Style_Checks_Pragmas

2023-11-07 Thread Marc Poulhiès

From: Viljar Indus 

gcc/ada/

* sem_prag.adb: (Analyze_Pragma): Reduce the number of nested if
statements.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_prag.adb | 26 +++---
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index b7655759d31..c391e2779bf 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -25109,6 +25109,10 @@ package body Sem_Prag is
 else
Check_Arg_Count (1);
 
+   if Ignore_Style_Checks_Pragmas then
+  return;
+   end if;
+
if Nkind (A) = N_String_Literal then
   S := Strval (A);
 
@@ -25129,9 +25133,7 @@ package body Sem_Prag is
 --  them in the parser.
 
 if J = Slen then
-   if not Ignore_Style_Checks_Pragmas then
-  Set_Style_Check_Options (Options);
-   end if;
+   Set_Style_Check_Options (Options);
 
exit;
 end if;
@@ -25142,23 +25144,17 @@ package body Sem_Prag is
 
elsif Nkind (A) = N_Identifier then
   if Chars (A) = Name_All_Checks then
- if not Ignore_Style_Checks_Pragmas then
-if GNAT_Mode then
-   Set_GNAT_Style_Check_Options;
-else
-   Set_Default_Style_Check_Options;
-end if;
+ if GNAT_Mode then
+Set_GNAT_Style_Check_Options;
+ else
+Set_Default_Style_Check_Options;
  end if;
 
   elsif Chars (A) = Name_On then
- if not Ignore_Style_Checks_Pragmas then
-Style_Check := True;
- end if;
+ Style_Check := True;
 
   elsif Chars (A) = Name_Off then
- if not Ignore_Style_Checks_Pragmas then
-Style_Check := False;
- end if;
+ Style_Check := False;
   end if;
end if;
 end if;
-- 
2.42.0

[COMMITTED] ada: Cleanup getting of actual subtypes

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

Avoid potentially unnecessary call to Etype.

gcc/ada/

* sem_util.adb (Get_Actual_Subtype_If_Available): Only call Etype
when necessary.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index da531e53466..d5df05b88e1 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -10218,8 +10218,6 @@ package body Sem_Util is
-
 
function Get_Actual_Subtype_If_Available (N : Node_Id) return Entity_Id is
-  Typ : constant Entity_Id := Etype (N);
-
begin
   --  If what we have is an identifier that references a subprogram
   --  formal, or a variable or constant object, then we get the actual
@@ -10245,7 +10243,7 @@ package body Sem_Util is
   --  Otherwise the Etype of N is returned unchanged
 
   else
- return Typ;
+ return Etype (N);
   end if;
end Get_Actual_Subtype_If_Available;
 
-- 
2.42.0

[COMMITTED] ada: Fix style in declaration of routine for expansion of packed arrays

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

Style cleanup.

gcc/ada/

* exp_pakd.adb (Setup_Inline_Packed_Array_Reference): Remove extra
whitespace from the list of parameters.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_pakd.adb | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/ada/exp_pakd.adb b/gcc/ada/exp_pakd.adb
index ad12aec1e23..2d3abbd349d 100644
--- a/gcc/ada/exp_pakd.adb
+++ b/gcc/ada/exp_pakd.adb
@@ -153,11 +153,11 @@ package body Exp_Pakd is
--  reference the corresponding packed array type.
 
procedure Setup_Inline_Packed_Array_Reference
- (N  : Node_Id;
-  Atyp   : Entity_Id;
-  Obj: in out Node_Id;
-  Cmask  : out Uint;
-  Shift  : out Node_Id);
+ (N : Node_Id;
+  Atyp  : Entity_Id;
+  Obj   : in out Node_Id;
+  Cmask : out Uint;
+  Shift : out Node_Id);
--  This procedure performs common processing on the N_Indexed_Component
--  parameter given as N, whose prefix is a reference to a packed array.
--  This is used for the get and set when the component size is 1, 2, 4,
@@ -2472,11 +2472,11 @@ package body Exp_Pakd is
-
 
procedure Setup_Inline_Packed_Array_Reference
- (N  : Node_Id;
-  Atyp   : Entity_Id;
-  Obj: in out Node_Id;
-  Cmask  : out Uint;
-  Shift  : out Node_Id)
+ (N : Node_Id;
+  Atyp  : Entity_Id;
+  Obj   : in out Node_Id;
+  Cmask : out Uint;
+  Shift : out Node_Id)
is
   Loc  : constant Source_Ptr := Sloc (N);
   PAT  : Entity_Id;
-- 
2.42.0

[COMMITTED] ada: Fix handling of actual subtypes for expanded names

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

gcc/ada/

* sem_util.adb
(Get_Actual_Subtype,Get_Actual_Subtype_If_Available): Fix handling
of expanded names.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 5440c6ae0aa..da531e53466 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -10104,7 +10104,7 @@ package body Sem_Util is
   --  formal, or a variable or constant object, then we get the actual
   --  subtype from the referenced entity if one has been built.
 
-  if Nkind (N) = N_Identifier
+  if Nkind (N) in N_Identifier | N_Expanded_Name
 and then
   (Is_Formal (Entity (N))
 or else Ekind (Entity (N)) = E_Constant
@@ -10225,7 +10225,7 @@ package body Sem_Util is
   --  formal, or a variable or constant object, then we get the actual
   --  subtype from the referenced entity if one has been built.
 
-  if Nkind (N) = N_Identifier
+  if Nkind (N) in N_Identifier | N_Expanded_Name
 and then
   (Is_Formal (Entity (N))
 or else Ekind (Entity (N)) = E_Constant
-- 
2.42.0

[COMMITTED] ada: Error in prefix-notation call

2023-11-07 Thread Marc Poulhiès

From: Bob Duff 

The compiler gives a wrong error for a call of the form X.Y(...)
when Y is inherited indirectly via an interface.

gcc/ada/

* sem_ch4.adb (Is_Private_Overriding): Return True in the case
where a primitive operation is publicly inherited but privately
overridden.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch4.adb | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
index 50ba6c9c847..2f3dfe71590 100644
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -10223,9 +10223,15 @@ package body Sem_Ch4 is
 
elsif not Comes_From_Source (Visible_Op)
  and then Alias (Visible_Op) = Op
- and then not Is_Hidden (Visible_Op)
then
-  return True;
+  --  If Visible_Op or what it overrides is not hidden, then we
+  --  have found what we're looking for.
+
+  if not Is_Hidden (Visible_Op)
+or else not Is_Hidden (Overridden_Operation (Op))
+  then
+ return True;
+  end if;
end if;
 
Visible_Op := Homonym (Visible_Op);
-- 
2.42.0

[COMMITTED] ada: Change local variables to constants in expansion of packed arrays

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

Cleanup; semantics is unaffected.

gcc/ada/

* exp_pakd.adb
(Expand_Bit_Packed_Element_Set): Change local Decl object from
variable to constant.
(Setup_Inline_Packed_Array_Reference): Likewise for Csiz.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_pakd.adb | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/gcc/ada/exp_pakd.adb b/gcc/ada/exp_pakd.adb
index 2d3abbd349d..1641e8a51c2 100644
--- a/gcc/ada/exp_pakd.adb
+++ b/gcc/ada/exp_pakd.adb
@@ -1137,14 +1137,12 @@ package body Exp_Pakd is
 
   if Nkind (Rhs) = N_String_Literal then
  declare
-Decl : Node_Id;
- begin
-Decl :=
+Decl : constant Node_Id :=
   Make_Object_Declaration (Loc,
 Defining_Identifier => Make_Temporary (Loc, 'T', Rhs),
 Object_Definition   => New_Occurrence_Of (Ctyp, Loc),
 Expression  => New_Copy_Tree (Rhs));
-
+ begin
 Insert_Actions (N, New_List (Decl));
 Rhs := New_Occurrence_Of (Defining_Identifier (Decl), Loc);
  end;
@@ -2481,12 +2479,10 @@ package body Exp_Pakd is
   Loc  : constant Source_Ptr := Sloc (N);
   PAT  : Entity_Id;
   Otyp : Entity_Id;
-  Csiz : Uint;
+  Csiz : constant Uint := Component_Size (Atyp);
   Osiz : Uint;
 
begin
-  Csiz := Component_Size (Atyp);
-
   Convert_To_PAT_Type (Obj);
   PAT := Etype (Obj);
 
-- 
2.42.0

[COMMITTED] ada: Avoid extra conversion in expansion of packed array assignments

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

Expansion of assignments to packed array objects with string literals on
the right-hand side created an unnecessary conversion, i.e.:

  ... :=
    component_type
      (declare
         temp : component_type := "string_literal";
       begin
         temp)
Now the expansion omits the outer type conversion.

Cleanup; behavior is unaffected.

gcc/ada/

* exp_pakd.adb (Expand_Bit_Packed_Element_Set): Simplify handling of
assignments with string literals.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_pakd.adb | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/exp_pakd.adb b/gcc/ada/exp_pakd.adb
index ef0ec1e0014..e197211736a 100644
--- a/gcc/ada/exp_pakd.adb
+++ b/gcc/ada/exp_pakd.adb
@@ -1143,9 +1143,10 @@ package body Exp_Pakd is
 Insert_Actions (N, New_List (Decl));
 Rhs := New_Occurrence_Of (Defining_Identifier (Decl), Loc);
  end;
+  else
+ Rhs := Convert_To (Ctyp, Rhs);
   end if;
 
-  Rhs := Convert_To (Ctyp, Rhs);
   Set_Parent (Rhs, N);
 
   --  If we are building the initialization procedure for a packed array,
-- 
2.42.0

[COMMITTED] ada: Fix expansion of type aspects with handling of aspects

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

The new handling of aspects stores the aspect expression as the
Expression_Copy of the aspect and not as the Entity of the aspect
identifier. This has been changed for most of the aspects, but not for
Type_Invariant and Default_Initial_Condition, which have custom
expansion. Apparently this change only affects GNATprove and not GNAT.

gcc/ada/

* exp_util.adb (Add_Own_DIC, Add_Own_Invariants): Store the aspect
expression in Expression_Copy.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_util.adb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
index 3e8d5997949..730889cae3e 100644
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -1893,7 +1893,7 @@ package body Exp_Util is
  --  routines.
 
  if Present (DIC_Asp) then
-Set_Entity (Identifier (DIC_Asp), New_Copy_Tree (Expr));
+Set_Expression_Copy (DIC_Asp, New_Copy_Tree (Expr));
  end if;
 
  --  Once the DIC assertion expression is fully processed, add a check
@@ -3153,7 +3153,7 @@ package body Exp_Util is
--  Check_Aspect_At_xxx routines.
 
if Present (Prag_Asp) then
-  Set_Entity (Identifier (Prag_Asp), New_Copy_Tree (Expr));
+  Set_Expression_Copy (Prag_Asp, New_Copy_Tree (Expr));
end if;
 
Add_Invariant_Check (Prag, Expr, Checks);
-- 
2.42.0

[COMMITTED] ada: Simplify expansion of packed array assignments

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

When expanding an assignment to a packed array object, e.g. a formal
parameter with mode OUT that might have an unconstrained type, we took the
component type and component size from the constrained actual subtype.
It is simpler to take these properties from the nominal type of the
assigned object.

Semantics is unaffected, because constraining the array doesn't change
the type or size of the array components.
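
As a user-level illustration of that last point (not part of the patch; the
type and subtype names below are hypothetical), the Component_Size attribute
denotes the same value for an unconstrained packed array type and for a
constrained subtype of it:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Component_Size_Demo is
   --  Hypothetical unconstrained packed array type with small components
   type Two_Bits is mod 4;
   type Packed is array (Positive range <>) of Two_Bits with Pack;

   --  Constrained subtype of the same array type
   subtype Packed_8 is Packed (1 .. 8);
begin
   --  Both lines print the size in bits of a component of type Packed;
   --  constraining the index range does not affect it.
   Put_Line (Integer'Image (Packed'Component_Size));
   Put_Line (Integer'Image (Packed_8'Component_Size));
end Component_Size_Demo;
```

Both lines of output should show the same number, which is why the nominal
type is enough for computing Ctyp and Csiz.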

gcc/ada/

* exp_pakd.adb (Expand_Bit_Packed_Element_Set): Change Ctyp and Csiz
from variables to constants and compute them using the nominal type
of the assigned array object.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_pakd.adb | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/exp_pakd.adb b/gcc/ada/exp_pakd.adb
index 68f0db3d56d..2b92c467187 100644
--- a/gcc/ada/exp_pakd.adb
+++ b/gcc/ada/exp_pakd.adb
@@ -1059,10 +1059,12 @@ package body Exp_Pakd is
   Obj   : Node_Id;
   Atyp  : Entity_Id;
   PAT   : Entity_Id;
-  Ctyp  : Entity_Id;
-  Csiz  : Int;
   Cmask : Uint;
 
+  Arr_Typ : constant Entity_Id := Etype (Prefix (Lhs));
+  Ctyp: constant Entity_Id := Component_Type (Arr_Typ);
+  Csiz: constant Int := UI_To_Int (Component_Size (Arr_Typ));
+
   Shift : Node_Id;
   --  The expression for the shift value that is required
 
@@ -,8 +1113,6 @@ package body Exp_Pakd is
   Convert_To_Actual_Subtype (Obj);
   Atyp := Etype (Obj);
   PAT  := Packed_Array_Impl_Type (Atyp);
-  Ctyp := Component_Type (Atyp);
-  Csiz := UI_To_Int (Component_Size (Atyp));
 
   --  We remove side effects, in case the rhs modifies the lhs, because we
   --  are about to transform the rhs into an expression that first READS
-- 
2.42.0

[COMMITTED] ada: Simplify handling of known values in expansion of packed arrays

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

If an expression value is not known at compile time, it can be
represented with No_Uint and doesn't require a dedicated flag.

Code cleanup; behavior is unaffected.
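
A self-contained sketch of the idiom, outside the compiler. The names Value
and No_Value are hypothetical stand-ins for Uint and No_Uint, and Needs_And
only mirrors the shape of the tests touched by this patch:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Sentinel_Demo is
   --  Hypothetical stand-ins for Uint and No_Uint; not the compiler's types
   type Value is new Integer;
   No_Value : constant Value := Value'First;

   function Present (V : Value) return Boolean is (V /= No_Value);
   function No (V : Value) return Boolean is (V = No_Value);

   --  Same shape as "No (Rhs_Val) or else Rhs_Val /= Cmask": the step is
   --  needed unless the value is known and equal to the mask.
   function Needs_And (Rhs_Val, Cmask : Value) return Boolean is
     (No (Rhs_Val) or else Rhs_Val /= Cmask);
begin
   Put_Line (Boolean'Image (Needs_And (No_Value, 3)));  --  TRUE
   Put_Line (Boolean'Image (Needs_And (3, 3)));         --  FALSE
   Put_Line (Boolean'Image (Present (Value'(0))));      --  TRUE
end Sentinel_Demo;
```

The sentinel removes the need to keep a separate Boolean in sync with the
value, at the cost of reserving one value as "unknown".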

gcc/ada/

* exp_pakd.adb (Expand_Bit_Packed_Element_Set): Remove Rhs_Val_Known;
represent unknown value by assigning Rhs_Val with No_Uint.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_pakd.adb | 26 ++
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/gcc/ada/exp_pakd.adb b/gcc/ada/exp_pakd.adb
index 1641e8a51c2..ef0ec1e0014 100644
--- a/gcc/ada/exp_pakd.adb
+++ b/gcc/ada/exp_pakd.adb
@@ -1073,12 +1073,9 @@ package body Exp_Pakd is
   New_Lhs : Node_Id;
   New_Rhs : Node_Id;
 
-  Rhs_Val_Known : Boolean;
-  Rhs_Val   : Uint;
+  Rhs_Val : Uint;
   --  If the value of the right hand side as an integer constant is
-  --  known at compile time, Rhs_Val_Known is set True, and Rhs_Val
-  --  contains the value. Otherwise Rhs_Val_Known is set False, and
-  --  the Rhs_Val is undefined.
+  --  known at compile time, Rhs_Val contains the value.
 
   function Get_Shift return Node_Id;
   --  Function used to get the value of Shift, making sure that it
@@ -1230,8 +1227,7 @@ package body Exp_Pakd is
  --  Determine if right side is all 0 bits or all 1 bits
 
  if Compile_Time_Known_Value (Rhs) then
-Rhs_Val   := Expr_Rep_Value (Rhs);
-Rhs_Val_Known := True;
+Rhs_Val := Expr_Rep_Value (Rhs);
 
  --  The following test catches the case of an unchecked conversion of
  --  an integer literal. This results from optimizing aggregates of
@@ -1240,19 +1236,17 @@ package body Exp_Pakd is
  elsif Nkind (Rhs) = N_Unchecked_Type_Conversion
and then Compile_Time_Known_Value (Expression (Rhs))
  then
-Rhs_Val   := Expr_Rep_Value (Expression (Rhs));
-Rhs_Val_Known := True;
+Rhs_Val := Expr_Rep_Value (Expression (Rhs));
 
  else
-Rhs_Val   := No_Uint;
-Rhs_Val_Known := False;
+Rhs_Val := No_Uint;
  end if;
 
  --  Some special checks for the case where the right hand value is
  --  known at compile time. Basically we have to take care of the
  --  implicit conversion to the subtype of the component object.
 
- if Rhs_Val_Known then
+ if Present (Rhs_Val) then
 
 --  If we have a biased component type then we must manually do the
 --  biasing, since we are taking responsibility in this case for
@@ -1289,7 +1283,7 @@ package body Exp_Pakd is
 
  --  First we deal with the "and"
 
- if not Rhs_Val_Known or else Rhs_Val /= Cmask then
+ if No (Rhs_Val) or else Rhs_Val /= Cmask then
 declare
Mask1 : Node_Id;
Lit   : Node_Id;
@@ -1319,7 +1313,7 @@ package body Exp_Pakd is
 
  --  Then deal with the "or"
 
- if not Rhs_Val_Known or else Rhs_Val /= 0 then
+ if No (Rhs_Val) or else Rhs_Val /= 0 then
 declare
Or_Rhs : Node_Id;
 
@@ -1359,7 +1353,7 @@ package body Exp_Pakd is
end Fixup_Rhs;
 
 begin
-   if Rhs_Val_Known
+   if Present (Rhs_Val)
  and then Compile_Time_Known_Value (Get_Shift)
then
   Or_Rhs :=
@@ -1387,7 +1381,7 @@ package body Exp_Pakd is
   --  which will be properly retyped when we analyze and
   --  resolve the expression.
 
-  elsif Rhs_Val_Known then
+  elsif Present (Rhs_Val) then
 
  --  Note that Rhs_Val has already been normalized to
  --  be an unsigned value with the proper number of bits.
-- 
2.42.0

[COMMITTED] ada: Fix incorrect resolution of overloaded function call in instance

2023-11-07 Thread Marc Poulhiès

From: Eric Botcazou 

The problem occurs when the function call is the operand of an equality
operator, the type used to do the comparison is declared outside of the
generic construct but visible inside it, and this generic construct also
declares two functions with the same profile except for the result type,
one result type being the aforementioned type, the other being derived
from this type but not visible inside the generic construct.  When the
second operand is either a literal or also overloaded, the call may be
resolved to the second function instead of the first in instances.

gcc/ada/

* gen_il-fields.ads (Opt_Field_Enum): Add Compare_Type.
* gen_il-gen-gen_nodes.adb (N_Op_Eq): Likewise.
(N_Op_Ge): Likewise.
(N_Op_Gt): Likewise.
(N_Op_Le): Likewise.
(N_Op_Lt): Likewise.
(N_Op_Ne): Likewise.
* sinfo.ads (Compare_Type): Document new field.
* sem_ch4.adb (Analyze_Comparison_Equality_Op): If the entity is
already present, set the Compare_Type on overloaded operands if it
is present on the node.
* sem_ch12.adb (Check_Private_View): Look into the Compare_Type
instead of the Etype for comparison operators.
(Copy_Generic_Node): Remove obsolete code for comparison
operators.
(Save_Global_References.Save_References): Do not walk into the
descendants of N_Implicit_Label_Declaration nodes.
(Save_Global_References.Set_Global_Type): Look into the
Compare_Type instead of the Etype for comparison operators.
* sem_res.adb (Resolve_Comparison_Op): Set Compare_Type.
(Resolve_Equality_Op): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gen_il-fields.ads|  1 +
 gcc/ada/gen_il-gen-gen_nodes.adb | 18 +---
 gcc/ada/sem_ch12.adb | 72 ++--
 gcc/ada/sem_ch4.adb  | 15 +--
 gcc/ada/sem_res.adb  |  2 +
 gcc/ada/sinfo.ads| 20 +
 6 files changed, 87 insertions(+), 41 deletions(-)

diff --git a/gcc/ada/gen_il-fields.ads b/gcc/ada/gen_il-fields.ads
index 1b40cd9472e..a0bfb398ebb 100644
--- a/gcc/ada/gen_il-fields.ads
+++ b/gcc/ada/gen_il-fields.ads
@@ -99,6 +99,7 @@ package Gen_IL.Fields is
   Comes_From_Check_Or_Contract,
   Comes_From_Extended_Return_Statement,
   Comes_From_Iterator,
+  Compare_Type,
   Compile_Time_Known_Aggregate,
   Component_Associations,
   Component_Clauses,
diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb
index fdf928d60a3..996d8d78aea 100644
--- a/gcc/ada/gen_il-gen-gen_nodes.adb
+++ b/gcc/ada/gen_il-gen-gen_nodes.adb
@@ -267,32 +267,38 @@ begin -- Gen_IL.Gen.Gen_Nodes
Cc (N_Op_Eq, N_Op_Compare,
(Sm (Chars, Name_Id),
 Sy (Left_Opnd, Node_Id),
-Sy (Right_Opnd, Node_Id)));
+Sy (Right_Opnd, Node_Id),
+Sm (Compare_Type, Node_Id)));
 
Cc (N_Op_Ge, N_Op_Compare,
(Sm (Chars, Name_Id),
 Sy (Left_Opnd, Node_Id),
-Sy (Right_Opnd, Node_Id)));
+Sy (Right_Opnd, Node_Id),
+Sm (Compare_Type, Node_Id)));
 
Cc (N_Op_Gt, N_Op_Compare,
(Sm (Chars, Name_Id),
 Sy (Left_Opnd, Node_Id),
-Sy (Right_Opnd, Node_Id)));
+Sy (Right_Opnd, Node_Id),
+Sm (Compare_Type, Node_Id)));
 
Cc (N_Op_Le, N_Op_Compare,
(Sm (Chars, Name_Id),
 Sy (Left_Opnd, Node_Id),
-Sy (Right_Opnd, Node_Id)));
+Sy (Right_Opnd, Node_Id),
+Sm (Compare_Type, Node_Id)));
 
Cc (N_Op_Lt, N_Op_Compare,
(Sm (Chars, Name_Id),
 Sy (Left_Opnd, Node_Id),
-Sy (Right_Opnd, Node_Id)));
+Sy (Right_Opnd, Node_Id),
+Sm (Compare_Type, Node_Id)));
 
Cc (N_Op_Ne, N_Op_Compare,
(Sm (Chars, Name_Id),
 Sy (Left_Opnd, Node_Id),
-Sy (Right_Opnd, Node_Id)));
+Sy (Right_Opnd, Node_Id),
+Sm (Compare_Type, Node_Id)));
 
Cc (N_Op_Or, N_Op_Boolean,
(Sm (Chars, Name_Id),
diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
index 582940da74b..f73e1b53b0e 100644
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -7685,7 +7685,9 @@ package body Sem_Ch12 is

 
procedure Check_Private_View (N : Node_Id) is
-  Typ : constant Entity_Id := Etype (N);
+  Comparison : constant Boolean := Nkind (N) in N_Op_Compare;
+  Typ: constant Entity_Id :=
+(if Comparison then Compare_Type (N) else Etype (N));
 
   procedure Check_Private_Type (T : Entity_Id; Private_View : Boolean);
   --  Check that the available view of T matches Private_View and, if not,
@@ -7749,10 +7751,16 @@ package body Sem_Ch12 is
and then (not In_Open_Scopes (Scope (Typ))
   or else Nkind (Parent (N)) = N_Subtype_Declaration)
  then
---  In the generic, only the private declaration wa

[COMMITTED] ada: Fix extra whitespace after END keywords

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

Style cleanup.

gcc/ada/

* exp_pakd.adb, libgnarl/s-osinte__android.ads,
libgnarl/s-osinte__linux.ads, libgnarl/s-osinte__qnx.ads,
libgnarl/s-osinte__rtems.ads, libgnat/s-gearop.adb,
libgnat/s-poosiz.adb, sem_util.adb: Fix style.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_pakd.adb   | 2 +-
 gcc/ada/libgnarl/s-osinte__android.ads | 2 +-
 gcc/ada/libgnarl/s-osinte__linux.ads   | 2 +-
 gcc/ada/libgnarl/s-osinte__qnx.ads | 2 +-
 gcc/ada/libgnarl/s-osinte__rtems.ads   | 2 +-
 gcc/ada/libgnat/s-gearop.adb   | 2 +-
 gcc/ada/libgnat/s-poosiz.adb   | 2 +-
 gcc/ada/sem_util.adb   | 2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/exp_pakd.adb b/gcc/ada/exp_pakd.adb
index e197211736a..68f0db3d56d 100644
--- a/gcc/ada/exp_pakd.adb
+++ b/gcc/ada/exp_pakd.adb
@@ -2203,7 +2203,7 @@ package body Exp_Pakd is
  end loop;
 
  return False;
-  end  In_Partially_Packed_Record;
+  end In_Partially_Packed_Record;
 
--  Start of processing for Known_Aligned_Enough
 
diff --git a/gcc/ada/libgnarl/s-osinte__android.ads b/gcc/ada/libgnarl/s-osinte__android.ads
index fb4310a1a43..04b0a68 100644
--- a/gcc/ada/libgnarl/s-osinte__android.ads
+++ b/gcc/ada/libgnarl/s-osinte__android.ads
@@ -622,7 +622,7 @@ private
 
type pthread_mutexattr_t is record
   Data : char_array (1 .. OS_Constants.PTHREAD_MUTEXATTR_SIZE);
-   end  record;
+   end record;
pragma Convention (C, pthread_mutexattr_t);
for pthread_mutexattr_t'Alignment use Interfaces.C.int'Alignment;
 
diff --git a/gcc/ada/libgnarl/s-osinte__linux.ads b/gcc/ada/libgnarl/s-osinte__linux.ads
index a5e645d334d..adf040e9fc9 100644
--- a/gcc/ada/libgnarl/s-osinte__linux.ads
+++ b/gcc/ada/libgnarl/s-osinte__linux.ads
@@ -652,7 +652,7 @@ private
 
type pthread_mutexattr_t is record
   Data : char_array (1 .. OS_Constants.PTHREAD_MUTEXATTR_SIZE);
-   end  record;
+   end record;
pragma Convention (C, pthread_mutexattr_t);
for pthread_mutexattr_t'Alignment use Interfaces.C.int'Alignment;
 
diff --git a/gcc/ada/libgnarl/s-osinte__qnx.ads b/gcc/ada/libgnarl/s-osinte__qnx.ads
index 3282abe8869..320a71dfece 100644
--- a/gcc/ada/libgnarl/s-osinte__qnx.ads
+++ b/gcc/ada/libgnarl/s-osinte__qnx.ads
@@ -597,7 +597,7 @@ private
 
type pthread_mutexattr_t is record
   Data : char_array (1 .. OS_Constants.PTHREAD_MUTEXATTR_SIZE);
-   end  record;
+   end record;
pragma Convention (C, pthread_mutexattr_t);
for pthread_mutexattr_t'Alignment use Interfaces.C.int'Alignment;
 
diff --git a/gcc/ada/libgnarl/s-osinte__rtems.ads b/gcc/ada/libgnarl/s-osinte__rtems.ads
index 6572bc4e472..43d137f2068 100644
--- a/gcc/ada/libgnarl/s-osinte__rtems.ads
+++ b/gcc/ada/libgnarl/s-osinte__rtems.ads
@@ -617,7 +617,7 @@ private
 
type pthread_mutexattr_t is record
   Data : char_array (1 .. OS_Constants.PTHREAD_MUTEXATTR_SIZE);
-   end  record;
+   end record;
pragma Convention (C, pthread_mutexattr_t);
for pthread_mutexattr_t'Alignment use Interfaces.C.int'Alignment;
 
diff --git a/gcc/ada/libgnat/s-gearop.adb b/gcc/ada/libgnat/s-gearop.adb
index e735bb0036a..000e59ccf69 100644
--- a/gcc/ada/libgnat/s-gearop.adb
+++ b/gcc/ada/libgnat/s-gearop.adb
@@ -901,7 +901,7 @@ is
   (for all KK in R'Range (2) => R (J, KK)'Initialized);
  end loop;
   end return;
-   end  Matrix_Matrix_Product;
+   end Matrix_Matrix_Product;
 

-- Matrix_Vector_Solution --
diff --git a/gcc/ada/libgnat/s-poosiz.adb b/gcc/ada/libgnat/s-poosiz.adb
index e5e6d0ff77b..0b2baec2d5a 100644
--- a/gcc/ada/libgnat/s-poosiz.adb
+++ b/gcc/ada/libgnat/s-poosiz.adb
@@ -408,5 +408,5 @@ package body System.Pool_Size is
  pragma Warnings (On);
   end Size;
 
-   end  Variable_Size_Management;
+   end Variable_Size_Management;
 end System.Pool_Size;
diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index d5df05b88e1..cfd8b88a26e 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -10045,7 +10045,7 @@ package body Sem_Util is
("value of discriminant & is out of range", Discrim_Value, Discrim);
  Report_Errors := True;
  return;
-  end  if;
+  end if;
 
   --  If we have found the corresponding choice, recursively add its
   --  components to the Into list. The nested components are part of
-- 
2.42.0

[COMMITTED] ada: Fix debug info for aliased packed array with unconstrained nominal subtype

2023-11-07 Thread Marc Poulhiès

From: Eric Botcazou 

The front-end now rewrites it as a renaming when it is initialized with a
function call, and the same processing must be applied in the renaming case
as in the regular case for this kind of special object.
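
For readers unfamiliar with the construct, a minimal sketch of the kind of
declaration involved; all names are hypothetical and this is not the test
case behind the fix:

```ada
procedure Aliased_Packed_Demo is
   --  Packed array type with an unconstrained index range
   type Bits is array (Positive range <>) of Boolean with Pack;

   function Make return Bits is ((1 .. 5 => True));

   --  Aliased object whose nominal subtype is the unconstrained array
   --  type, initialized with a function call
   Obj : aliased Bits := Make;
begin
   Obj (1) := False;
end Aliased_Packed_Demo;
```

Obj is the sort of object whose debug info is affected when the program is
built with debugging information enabled.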

gcc/ada/

* gcc-interface/decl.cc (gnat_to_gnu_entity) : Apply the
specific rewriting done for an aliased object with an unconstrained
array nominal subtype in the renaming case too.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gcc-interface/decl.cc | 12 
 1 file changed, 12 insertions(+)

diff --git a/gcc/ada/gcc-interface/decl.cc b/gcc/ada/gcc-interface/decl.cc
index 20ab185d577..95fa508c559 100644
--- a/gcc/ada/gcc-interface/decl.cc
+++ b/gcc/ada/gcc-interface/decl.cc
@@ -1145,6 +1145,18 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
   the entity as indirect reference to the renamed object.  */
if (Materialize_Entity (gnat_entity))
  {
+   /* If this is an aliased object with an unconstrained array
+  nominal subtype, we make its type a thin reference, i.e.
+  the reference counterpart of a thin pointer, exactly as
+  we would have done in the non-renaming case below.  */
+   if (Is_Constr_Subt_For_UN_Aliased (gnat_type)
+   && Is_Array_Type (gnat_und_type)
+   && !type_annotate_only)
+ {
+   tree gnu_array
+ = gnat_to_gnu_type (Base_Type (gnat_type));
+   gnu_type = TYPE_OBJECT_RECORD_TYPE (gnu_array);
+ }
gnu_type = build_reference_type (gnu_type);
const_flag = true;
volatile_flag = false;
-- 
2.42.0

[COMMITTED] ada: Fix documentation of -gnatwc

2023-11-07 Thread Marc Poulhiès

From: Ronan Desplanques 

-gnatwc has been correctly emitting warnings for expressions outside
of tests for a while, but its documentation in the user's guide had
never been updated to reflect that. Also, the documentation used
"conditional expressions" to designate boolean expressions, but
"conditional expressions" has been defined by Ada 2012 to designate
if expressions and case expressions. This patch fixes those issues.
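
A small sketch of the distinction, assuming a hypothetical type Small. The
first use of the expression is outside of any test, the second is the
classic test case. Whether a given expression is actually flagged depends
on what the compiler can determine at compile time:

```ada
procedure Gnatwc_Demo is
   type Small is range 1 .. 10;
   X      : Small := 5;
   Always : Boolean;
begin
   --  Boolean expression used in an assignment rather than in a test;
   --  given the range of Small, X >= 1 cannot be False.
   Always := X >= 1;

   --  The same kind of expression used in a test
   if Always and then X <= 10 then
      X := 6;
   end if;
end Gnatwc_Demo;
```

Compiling this with -gnatwc is the way to check which of the two
expressions is reported.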

gcc/ada/

* doc/gnat_ugn/building_executable_programs_with_gnat.rst: Fix
-gnatwc documentation.
* gnat_ugn.texi: Regenerate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 .../doc/gnat_ugn/building_executable_programs_with_gnat.rst   | 4 ++--
 gcc/ada/gnat_ugn.texi | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
index a708ef4b995..21e277d5916 100644
--- a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
+++ b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
@@ -2942,8 +2942,8 @@ of the pragma in the :title:`GNAT_Reference_manual`).
 
   .. index:: Conditionals, constant
 
-  This switch activates warnings for conditional expressions used in
-  tests that are known to be True or False at compile time. The default
+  This switch activates warnings for boolean expressions that are known to
+  be True or False at compile time. The default
   is that such warnings are not generated.
   Note that this warning does
   not get issued for the use of boolean constants whose
diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
index 1d91f2c13fa..78f8849e379 100644
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -11000,8 +11000,8 @@ of biased representation.
 @geindex Conditionals
 @geindex constant
 
-This switch activates warnings for conditional expressions used in
-tests that are known to be True or False at compile time. The default
+This switch activates warnings for boolean expressions that are known to
+be True or False at compile time. The default
 is that such warnings are not generated.
 Note that this warning does
 not get issued for the use of boolean constants whose
-- 
2.42.0

[COMMITTED] ada: Remove duplicated code for expansion of packed array assignments

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

Expansion of assignments to packed array objects has two cases and
had duplicated code for both these cases.

gcc/ada/

* exp_pakd.adb (Expand_Bit_Packed_Element_Set): Remove code from the
ELSE branch, because it was identical to the code before the IF
statement itself.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_pakd.adb | 4 
 1 file changed, 4 deletions(-)

diff --git a/gcc/ada/exp_pakd.adb b/gcc/ada/exp_pakd.adb
index 2b92c467187..19d158ffad0 100644
--- a/gcc/ada/exp_pakd.adb
+++ b/gcc/ada/exp_pakd.adb
@@ -1432,7 +1432,6 @@ package body Exp_Pakd is
 Bits_nn : constant Entity_Id := RTE (Bits_Id (Csiz));
 Set_nn  : Entity_Id;
 Subscr  : Node_Id;
-Atyp: Entity_Id;
 Rev_SSO : Node_Id;
 
  begin
@@ -1454,9 +1453,6 @@ package body Exp_Pakd is
 
 --  Now generate the set reference
 
-Obj := Relocate_Node (Prefix (Lhs));
-Convert_To_Actual_Subtype (Obj);
-Atyp := Etype (Obj);
 Compute_Linear_Subscript (Atyp, Lhs, Subscr);
 
 --  Set indication of whether the packed array has reverse SSO
-- 
2.42.0

[COMMITTED] ada: Cleanup "not Present" on List_Id

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

gcc/ada/

* exp_ch6.adb, exp_disp.adb, sem_ch13.adb, sem_ch3.adb: Fix newly
detected violations.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch6.adb  | 2 +-
 gcc/ada/exp_disp.adb | 2 +-
 gcc/ada/sem_ch13.adb | 2 +-
 gcc/ada/sem_ch3.adb  | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index c1d5fa3c08b..0e6f950f5d7 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -4630,7 +4630,7 @@ package body Exp_Ch6 is
 --  It may be possible that we are re-expanding an already
 --  expanded call when are are dealing with dispatching ???
 
-if not Present (Parameter_Associations (Call_Node))
+if No (Parameter_Associations (Call_Node))
   or else Nkind (Last (Parameter_Associations (Call_Node)))
 /= N_Parameter_Association
   or else not Is_Accessibility_Actual
diff --git a/gcc/ada/exp_disp.adb b/gcc/ada/exp_disp.adb
index 9e0c87a5095..89b47c042a0 100644
--- a/gcc/ada/exp_disp.adb
+++ b/gcc/ada/exp_disp.adb
@@ -537,7 +537,7 @@ package body Exp_Disp is
 then
Target_List := Priv_Decls;
 
-elsif not Present (Vis_Decls) then
+elsif No (Vis_Decls) then
Target_List := New_List;
Set_Private_Declarations (Spec, Target_List);
 else
diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
index c4699436268..ae97da58da3 100644
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -16400,7 +16400,7 @@ package body Sem_Ch13 is
   if Nkind (N) = N_Aggregate then
  if Present (Component_Associations (N))
 or else Null_Record_Present (N)
-or else not Present (Expressions (N))
+or else No (Expressions (N))
  then
 Error_Msg_N ("bad Stable_Properties aspect specification", N);
 return;
diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
index c1113e4fc42..8583ac05261 100644
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -7394,7 +7394,7 @@ package body Sem_Ch3 is
   Set_Is_Constrained
 (Derived_Type,
  (Is_Constrained (Parent_Type) or else Constraint_Present)
-   and then not Present (Discriminant_Specifications (N)));
+   and then No (Discriminant_Specifications (N)));
 
   if Constraint_Present then
  if not Has_Discriminants (Parent_Type) then
-- 
2.42.0

[COMMITTED] ada: Compiler crash on early alignment clause

2023-11-07 Thread Marc Poulhiès

From: Bob Duff 

This patch fixes a bug: if "for T'Alignment use..." is followed
by "for T use ();" the compiler crashes. A workaround is
to move the alignment clause after the enumeration rep clause.
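
A sketch of the kind of declaration sequence described, with a hypothetical
enumeration type and hypothetical representation values (the original
reproducer is not shown here):

```ada
package Early_Alignment is
   type T is (A, B, C);

   --  Alignment clause placed before the enumeration representation clause
   for T'Alignment use 4;
   for T use (A => 1, B => 2, C => 4);
end Early_Alignment;
```

With an unpatched compiler, the workaround mentioned above amounts to
swapping the two clauses.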

gcc/ada/

* sem_ch13.ads (Set_Enum_Esize): Do not set alignment.
* sem_ch13.adb (Set_Enum_Esize): Do not set alignment. Archaeology
seems to show that this line of code dates from when "Alignment =
0" meant "the Alignment is not known at compile time" and "the
Alignment is not yet known at compile time" as well as "the
Alignment is zero". In any case, it seems to be unnecessary, and
in this case harmful, because gigi would crash. Alignment_Clause
is set (because there is one), so gigi would query the Alignment,
but Alignment was destroyed.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch13.adb | 2 --
 gcc/ada/sem_ch13.ads | 3 +--
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
index 5747ee9c539..302fab74757 100644
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -17381,8 +17381,6 @@ package body Sem_Ch13 is
   Sz : Unat;
 
begin
-  Reinit_Alignment (T);
-
   --  Find the minimum standard size (8,16,32,64,128) that fits
 
   Lo := Enumeration_Rep (Entity (Type_Low_Bound (T)));
diff --git a/gcc/ada/sem_ch13.ads b/gcc/ada/sem_ch13.ads
index 1386096535e..555d302fb10 100644
--- a/gcc/ada/sem_ch13.ads
+++ b/gcc/ada/sem_ch13.ads
@@ -79,8 +79,7 @@ package Sem_Ch13 is
procedure Set_Enum_Esize (T : Entity_Id);
--  This routine sets the Esize field for an enumeration type T, based
--  on the current representation information available for T. Note that
-   --  the setting of the RM_Size field is not affected. This routine also
-   --  initializes the alignment field to zero.
+   --  the setting of the RM_Size field is not affected.
 
Unknown_Minimum_Size : constant Nonzero_Int := -1;
 
-- 
2.42.0

[COMMITTED] ada: Cleanup more "not Present"

2023-11-07 Thread Marc Poulhiès

From: Piotr Trojanek 

We had a GNATcheck rule that suggests replacing "not Present (...)" with
"No (...)", but it only detected calls with a parameter of type Node_Id.
Now this rule also detects parameters of type Elist_Id.

gcc/ada/

* sem_ch3.adb, sem_ch4.adb, sem_eval.adb: Fix newly detected
violations.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch3.adb  | 12 +---
 gcc/ada/sem_ch4.adb  |  2 +-
 gcc/ada/sem_eval.adb |  2 +-
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
index e92b46fa6f6..c1113e4fc42 100644
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -3385,7 +3385,7 @@ package body Sem_Ch3 is
   --  case we bypass this.
 
   if Ekind (T) /= E_Void then
- if not Present (Direct_Primitive_Operations (T)) then
+ if No (Direct_Primitive_Operations (T)) then
 if Etype (T) = T then
Set_Direct_Primitive_Operations (T, New_Elmt_List);
 
@@ -3397,8 +3397,7 @@ package body Sem_Ch3 is
 
 elsif Etype (T) = Base_Type (T) then
 
-   if not Present (Direct_Primitive_Operations (Base_Type (T)))
-   then
+   if No (Direct_Primitive_Operations (Base_Type (T))) then
   Set_Direct_Primitive_Operations
 (Base_Type (T), New_Elmt_List);
end if;
@@ -3416,7 +3415,7 @@ package body Sem_Ch3 is
  --  If T already has a Direct_Primitive_Operations list but its
  --  base type doesn't then set the base type's list to T's list.
 
- elsif not Present (Direct_Primitive_Operations (Base_Type (T))) then
+ elsif No (Direct_Primitive_Operations (Base_Type (T))) then
 Set_Direct_Primitive_Operations
   (Base_Type (T), Direct_Primitive_Operations (T));
  end if;
@@ -10345,7 +10344,7 @@ package body Sem_Ch3 is
   --  If not already set, initialize the derived type's list of primitive
   --  operations to an empty element list.
 
-  if not Present (Direct_Primitive_Operations (Derived_Type)) then
+  if No (Direct_Primitive_Operations (Derived_Type)) then
  Set_Direct_Primitive_Operations (Derived_Type, New_Elmt_List);
 
  --  If Etype of the derived type is the base type (as opposed to
@@ -10355,8 +10354,7 @@ package body Sem_Ch3 is
  --  between the two.
 
  if Etype (Derived_Type) = Base_Type (Derived_Type)
-   and then
- not Present (Direct_Primitive_Operations (Etype (Derived_Type)))
+   and then No (Direct_Primitive_Operations (Etype (Derived_Type)))
  then
 Set_Direct_Primitive_Operations
   (Etype (Derived_Type),
diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
index 2f3dfe71590..e3badc3e19d 100644
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -9911,7 +9911,7 @@ package body Sem_Ch4 is
  if (not Is_Tagged_Type (Obj_Type)
   and then
 (not (Core_Extensions_Allowed or Allow_Extensions)
-  or else not Present (Primitive_Operations (Obj_Type
+  or else No (Primitive_Operations (Obj_Type
or else Is_Incomplete_Type (Obj_Type)
  then
 Obj_Type := Prev_Obj_Type;
diff --git a/gcc/ada/sem_eval.adb b/gcc/ada/sem_eval.adb
index f744ab38be6..f88a00aa380 100644
--- a/gcc/ada/sem_eval.adb
+++ b/gcc/ada/sem_eval.adb
@@ -6812,7 +6812,7 @@ package body Sem_Eval is
 
  --  No constraint on the parent type
 
- or else not Present (Discriminant_Constraint (Etype (Typ)))
+ or else No (Discriminant_Constraint (Etype (Typ)))
  or else Is_Empty_Elmt_List
(Discriminant_Constraint (Etype (Typ)))
 
-- 
2.42.0

Re: [PATCH] binutils: experimental use of libdiagnostics in gas

2023-11-07 Thread Clément Chigot

Hi David,

Thanks for this interesting RFC! I'm fully in favor of such
improvements: having uniform error messages across gcc, gas and
later ld would greatly help the integration of these tools, let alone the
SARIF output format.
However, I'm not sure how you're planning to make the transition.
Currently, it looks like either libdiagnostics is enabled and the
new format is produced, or it is not and we get the legacy
format. I think the transition should be smoother than that: there are
probably thousands of tests, scripts, and other consumers in the wild
expecting the legacy format. Allowing both formats within the same
executable, chosen by a flag, would probably ease the
transition.
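
A minimal sketch of that idea, not taken from the patch: one formatter that emits either style depending on a flag. The `style` parameter and its two values are hypothetical names; the two output shapes follow the warn-1.s examples quoted further down (legacy gas capitalizes the severity, the libdiagnostics text sink prints it in lower case).

```python
def format_diagnostic(path, line, severity, message, style="legacy"):
    """Render one diagnostic line in the requested style.

    `style` is a hypothetical switch standing in for a command-line flag;
    neither gas nor libdiagnostics is known to expose it under this name.
    """
    if style == "legacy":
        # Legacy gas shape: "file:line: Warning: message"
        return f"{path}:{line}: {severity.capitalize()}: {message}"
    if style == "libdiagnostics":
        # GCC-like shape: "file:line: warning: message"
        return f"{path}:{line}: {severity.lower()}: {message}"
    raise ValueError(f"unknown diagnostic style: {style!r}")
```

With such a switch, existing scripts that parse the legacy lines would keep working by default, while new consumers could opt in to the new format.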

Apart from that, just a few remarks on the implementation details, see below.

Thanks,
Clément

On Mon, Nov 6, 2023 at 11:30 PM David Malcolm  wrote:
>
> Here's a patch for gas in binutils that makes it use libdiagnostics
> (with some nasty hardcoded paths to specific places on my hard drive
> to make it easier to develop the API).
>
> For now this hardcodes adding two sinks: a text sink on stderr, and
> also a SARIF output to stderr (which happens after all regular output).
>
> For example, without this patch:
>
>gas testsuite/gas/all/warn-1.s
>
> emits:
> 
> testsuite/gas/all/warn-1.s: Assembler messages:
> testsuite/gas/all/warn-1.s:3: Warning: a warning message
> testsuite/gas/all/warn-1.s:4: Error: .warning argument must be a string
> testsuite/gas/all/warn-1.s:5: Warning: .warning directive invoked in source 
> file
> testsuite/gas/all/warn-1.s:6: Warning: .warning directive invoked in source 
> file
> testsuite/gas/all/warn-1.s:7: Warning:
> 
>
> whereas with this patch:
>   LD_LIBRARY_PATH=/home/david/coding-3/gcc-newgit-canvas-2023/build/gcc 
> ./as-new testsuite/gas/all/warn-1.s
> emits:
>
> 
> testsuite/gas/all/warn-1.s:3: warning: a warning message
> 3 |  .warning "a warning message"   ;# { dg-warning "Warning: a warning 
> message" }
>   |
> testsuite/gas/all/warn-1.s:4: error: .warning argument must be a string
> 4 |  .warning a warning message ;# { dg-error "Error: .warning 
> argument must be a string" }
>   |
> testsuite/gas/all/warn-1.s:5: warning: .warning directive invoked in source 
> file
> 5 |  .warning   ;# { dg-warning "Warning: .warning 
> directive invoked in source file" }
>   |
> testsuite/gas/all/warn-1.s:6: warning: .warning directive invoked in source 
> file
> 6 |  .warning ".warning directive invoked in source file"   ;# { 
> dg-warning "Warning: .warning directive invoked in source file" }
>   |
> testsuite/gas/all/warn-1.s:7: warning:
> 7 |  .warning "";# { dg-warning "Warning: " }
>   |
> {"$schema": 
> "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
>  "version": "2.1.0", "runs": [{"tool": {"driver": {"rules": []}}, 
> "invocations": [{"executionSuccessful": true, "toolExecutionNotifications": 
> []}], "originalUriBaseIds": {"PWD": {"uri": 
> "file:///home/david/coding-3/binutils-gdb/gas/"}}, "artifacts": [{"location": 
> {"uri": "testsuite/gas/all/warn-1.s", "uriBaseId": "PWD"}, "contents": 
> {"text": ";# Test .warning directive.\n;# { dg-do assemble }\n .warning \"a 
> warning message\"\t;# { dg-warning \"Warning: a warning message\" }\n 
> .warning a warning message\t;# { dg-error \"Error: .warning argument must be 
> a string\" }\n .warning\t\t\t;# { dg-warning \"Warning: .warning directive 
> invoked in source file\" }\n .warning \".warning directive invoked in source 
> file\"\t;# { dg-warning \"Warning: .warning directive invoked in source 
> file\" }\n .warning \"\"\t\t\t;# { dg-warning \"Warning: \" }\n"}}], 
> "results": [{"ruleId": "warning", "level": "warning", "message": {"text": "a 
> warning message"}, "locations": [{"physicalLocation": {"artifactLocation": 
> {"uri": "testsuite/gas/all/warn-1.s", "uriBaseId": "PWD"}, "region": 
> {"startLine": 3, "startColumn": 0, "endColumn": 1}, "contextRegion": 
> {"startLine": 3, "snippet": {"text": " .warning \"a warning message\"\t;# { 
> dg-warning \"Warning: a warning message\" }\n"], "relatedLocations": 
> [{"physicalLocation": {"artifactLocation": {"uri": 
> "testsuite/gas/all/warn-1.s", "uriBaseId": "PWD"}, "region": {"startLine": 4, 
> "startColumn": 0, "endColumn": 1}, "contextRegion": {"startLine": 4, 
> "snippet": {"text": " .warning a warning message\t;# { dg-error \"Error: 
> .warning argument must be a string\" }\n"}}}, "message": {"text": ".warning 
> argument must be a string"}}, {"physicalLocation": {"artifactLocation": 
> {"uri": "testsuite/gas/all/warn-1.s", "uriBaseId": "PWD"}, "region": 
> {"startLine": 5, "startColum

[PATCH] test: Fix FAIL of pr65518.c for RVV[PR112420]

2023-11-07 Thread Juzhe-Zhong

PR target/112420

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr65518.c: Fix check for RVV.

---
 gcc/testsuite/gcc.dg/vect/pr65518.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr65518.c 
b/gcc/testsuite/gcc.dg/vect/pr65518.c
index 3e5b986183c..189a65534f6 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65518.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65518.c
@@ -49,4 +49,6 @@ int main ()
sub-optimal and causes memory explosion (even though the cost model
should reject that in the end).  */
 
-/* { dg-final { scan-tree-dump-times "vectorized 0 loops in function" 2 "vect" 
} } */
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops in function" 2 "vect" 
{ target {! riscv*-*-* } } } } */
+/* We end up using gathers for the strided load on RISC-V which would be OK.  
*/
+/* { dg-final { scan-tree-dump "using gather/scatter for strided/grouped 
access" "vect" { target { riscv*-*-* } } } } */
-- 
2.36.3

[COMMITTED] ada: Elide temporary for aliased array with unconstrained nominal subtype

2023-11-07 Thread Marc Poulhiès

From: Eric Botcazou 

When the array is initialized with the result of a call to a function whose
result type is unconstrained, then the result is allocated with its bounds,
so the array can be rewritten as a renaming of the result in this case too.

gcc/ada/

* exp_ch3.adb (Expand_N_Object_Declaration): Fold initialization
expression of Nominal_Subtype_Is_Constrained_Array constant into
the computation of Rewrite_As_Renaming and remove the constant.
Set it to True for an aliased array with unconstrained nominal
subtype if the subtype of the expression is also unconstrained.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch3.adb | 47 +++--
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index 511d4c09b22..f88ac7e6542 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -7348,13 +7348,6 @@ package body Exp_Ch3 is
   Rewrite_As_Renaming : Boolean := False;
   --  Whether to turn the declaration into a renaming at the end
 
-  Nominal_Subtype_Is_Constrained_Array : constant Boolean :=
-Comes_From_Source (Obj_Def)
-and then Is_Array_Type (Typ) and then Is_Constrained (Typ);
-  --  Used to avoid rewriting as a renaming for constrained arrays,
-  --  which is only a problem for source arrays; others have the
-  --  correct bounds (see below).
-
--  Start of processing for Expand_N_Object_Declaration
 
begin
@@ -8098,10 +8091,25 @@ package body Exp_Ch3 is
 
   Is_Entity_Name (Original_Node (Obj_Def))
 
---  Nor if it is effectively an unconstrained declaration
+--  If we have "X : S := ...;", and S is a constrained array
+--  subtype, then we cannot rename, because renamings ignore
+--  the constraints of S, so that would change the semantics
+--  (sliding would not occur on the initial value). This is
+--  only a problem for source objects though, the others have
+--  the correct bounds.
+
+and then not (Comes_From_Source (Obj_Def)
+   and then Is_Array_Type (Typ)
+   and then Is_Constrained (Typ))
+
+--  Moreover, if we have "X : aliased S := "...;" and S is an
+--  unconstrained array type, then we can rename only if the
+--  initialization expression has an unconstrained subtype too,
+--  because the bounds must be present within X.
 
 and then not (Is_Array_Type (Typ)
-   and then Is_Constr_Subt_For_UN_Aliased (Typ))
+   and then Is_Constr_Subt_For_UN_Aliased (Typ)
+   and then Is_Constrained (Etype (Expr_Q)))
 
 --  We may use a renaming if the initialization expression is a
 --  captured function call that meets a few conditions.
@@ -8109,23 +8117,16 @@ package body Exp_Ch3 is
 and then
   (Is_Renamable_Function_Call (Expr_Q)
 
-   --  Or else if it is a variable with OK_To_Rename set
+--  Or else if it is a variable with OK_To_Rename set
 
-   or else (OK_To_Rename_Ref (Expr_Q)
- and then not Special_Ret_Obj)
+or else (OK_To_Rename_Ref (Expr_Q)
+  and then not Special_Ret_Obj)
 
-   --  Or else if it is a slice of such a variable
-
-   or else (Nkind (Expr_Q) = N_Slice
- and then OK_To_Rename_Ref (Prefix (Expr_Q))
- and then not Special_Ret_Obj))
-
---  If we have "X : S := ...;", and S is a constrained array
---  subtype, then we cannot rename, because renamings ignore
---  the constraints of S, so that would change the semantics
---  (sliding would not occur on the initial value).
+--  Or else if it is a slice of such a variable
 
-and then not Nominal_Subtype_Is_Constrained_Array;
+or else (Nkind (Expr_Q) = N_Slice
+  and then OK_To_Rename_Ref (Prefix (Expr_Q))
+  and then not Special_Ret_Obj));
 
 --  If the type needs finalization and is not inherently limited,
 --  then the target is adjusted after the copy and attached to the
-- 
2.42.0

[COMMITTED] ada: Fix Ada.Directories.Modification_Time on Windows

2023-11-07 Thread Marc Poulhiès

From: Ronan Desplanques 

Before this patch, Ada.Directories.Modification_Time called
GetFileAttributesExA under the hood on Windows. That would sometimes
fail to work with files whose names were non-ASCII.

This patch replaces the call to GetFileAttributesExA with a call to
GetFileAttributesEx preceded by an encoding scheme conversion, as is
done in other functions of the run-time library. This fixes the issues
that were observed with the previous implementations.
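
The effect can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the incoming name is assumed to be UTF-8, `cp1252` stands in for whatever ANSI code page is active, and `MAX_PATH_LEN` is a hypothetical stand-in for GNAT_MAX_PATH_LEN. The first helper shows why an ANSI-only call can fail on some names; the second mirrors the convert-then-check-length order used in the patch.

```python
MAX_PATH_LEN = 256  # hypothetical stand-in for GNAT_MAX_PATH_LEN


def fits_ansi_code_page(name, code_page="cp1252"):
    """Return True if every character of name exists in the code page."""
    try:
        name.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


def to_wide_name(raw, limit=MAX_PATH_LEN):
    """Decode a UTF-8 name and reject it if it needs too many UTF-16 units.

    Returns the decoded name, or None when it is longer than the limit,
    loosely mirroring the early 'return LLONG_MIN' in the patch.
    """
    text = raw.decode("utf-8")
    code_units = len(text.encode("utf-16-le")) // 2
    if code_units > limit:
        return None
    return text
```

A name such as "日本語.txt" cannot be expressed in `cp1252` at all, yet converts to wide characters without loss, which is the situation the wide-character call handles.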

gcc/ada/

* adaint.c (__gnat_file_time): Fix Windows version.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/adaint.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index 2a193efc002..bb4ed2607e5 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -1507,7 +1507,16 @@ extern long long __gnat_file_time(char* name)
 long long ll_time;
   } t_write;
 
-  if (!GetFileAttributesExA(name, GetFileExInfoStandard, &fad)) {
+  TCHAR wname [GNAT_MAX_PATH_LEN + 2];
+  int name_len;
+
+  S2WSC (wname, name, GNAT_MAX_PATH_LEN + 2);
+  name_len = _tcslen (wname);
+
+  if (name_len > GNAT_MAX_PATH_LEN)
+return LLONG_MIN;
+
+  if (!GetFileAttributesEx(wname, GetFileExInfoStandard, &fad)) {
 return LLONG_MIN;
   }
 
-- 
2.42.0

RE: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

2023-11-07 Thread Tamar Christina

> 
> You also drop (copysign, x, CST) -> abx (x) when x is not negative - I think 
> that's
> still worthwhile as it has one less argument?
> 
> Keeping that might also need less testsuite adjustments?

Done but still needed the testsuite updates.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/109154
* match.pd: Add new neg+abs rule, remove inverse copysign rule.

gcc/testsuite/ChangeLog:

PR tree-optimization/109154
* gcc.dg/fold-copysign-1.c: Updated.
* gcc.dg/pr55152-2.c: Updated.
* gcc.dg/tree-ssa/abs-4.c: Updated.
* gcc.dg/tree-ssa/backprop-6.c: Updated.
* gcc.dg/tree-ssa/copy-sign-2.c: Updated.
* gcc.dg/tree-ssa/mult-abs-2.c: Updated.
* gcc.target/aarch64/fneg-abs_1.c: New test.
* gcc.target/aarch64/fneg-abs_2.c: New test.
* gcc.target/aarch64/fneg-abs_3.c: New test.
* gcc.target/aarch64/fneg-abs_4.c: New test.
* gcc.target/aarch64/sve/fneg-abs_1.c: New test.
* gcc.target/aarch64/sve/fneg-abs_2.c: New test.
* gcc.target/aarch64/sve/fneg-abs_3.c: New test.
* gcc.target/aarch64/sve/fneg-abs_4.c: New test.
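
As a sanity check of the identity behind the new rule, here is a small Python sketch (not part of the patch) that compares -fabs(x) with copysign(x, -1) bit for bit on IEEE doubles. It also checks the rule that is kept, copysign(x, c) == fabs(x) for a non-negative constant c. The helper names are made up for this illustration.

```python
import math
import struct


def bits(x):
    """Raw IEEE-754 binary64 encoding of x, so that -0.0 and 0.0 differ."""
    return struct.pack("<d", x)


def neg_abs(x):
    """The pattern being matched: fneg (fabs (x))."""
    return -math.fabs(x)


def copysign_minus_one(x):
    """The replacement: copysign (x, -1)."""
    return math.copysign(x, -1.0)


def same_result(x):
    """True if both forms give the same value, including the sign bit."""
    a, b = neg_abs(x), copysign_minus_one(x)
    if math.isnan(x):
        # Compare NaN-ness and sign only; payload bits are not examined.
        return (math.isnan(a) and math.isnan(b)
                and math.copysign(1.0, a) == math.copysign(1.0, b))
    return bits(a) == bits(b)


def kept_rule_holds(x, c=2.0):
    """copysign (x, c) with non-negative c behaves like fabs (x)."""
    return bits(math.copysign(x, c)) == bits(math.fabs(x))


SAMPLES = [3.5, -3.5, 0.0, -0.0, math.inf, -math.inf]
```

Both forms always produce a result with the sign bit set, which is why the fold is valid for signed zeros and infinities as well.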

--- inline copy of the patch ---

diff --git a/gcc/match.pd b/gcc/match.pd
index 
68a1587ea2465a890875448993e140425ef6a68f..5928acbb14e2a18addff38000bf30dd273d4b1a6
 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1118,14 +1118,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(hypots @0 (copysigns @1 @2))
(hypots @0 @1
 
-/* copysign(x, CST) -> [-]abs (x).  */
+/* copysign(x, CST) -> abs (x).  */
 (for copysigns (COPYSIGN_ALL)
  (simplify
   (copysigns @0 REAL_CST@1)
-  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
-   (negate (abs @0))
+  (if (!REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
(abs @0
 
+/* Transform fneg (fabs (X)) -> copysign (X, -1).  */
+(simplify
+ (negate (abs @0))
+ (IFN_COPYSIGN @0 { build_minus_one_cst (type); }))
+
 /* copysign(copysign(x, y), z) -> copysign(x, z).  */
 (for copysigns (COPYSIGN_ALL)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/fold-copysign-1.c 
b/gcc/testsuite/gcc.dg/fold-copysign-1.c
index 
f17d65c24ee4dca9867827d040fe0a404c515e7b..f9cafd14ab05f5e8ab2f6f68e62801d21c2df6a6
 100644
--- a/gcc/testsuite/gcc.dg/fold-copysign-1.c
+++ b/gcc/testsuite/gcc.dg/fold-copysign-1.c
@@ -12,5 +12,5 @@ double bar (double x)
   return __builtin_copysign (x, minuszero);
 }
 
-/* { dg-final { scan-tree-dump-times "= -" 1 "cddce1" } } */
-/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 2 "cddce1" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_copysign" 1 "cddce1" } } */
+/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 1 "cddce1" } } */
diff --git a/gcc/testsuite/gcc.dg/pr55152-2.c b/gcc/testsuite/gcc.dg/pr55152-2.c
index 
54db0f2062da105a829d6690ac8ed9891fe2b588..605f202ed6bc7aa8fe921457b02ff0b88cc63ce6
 100644
--- a/gcc/testsuite/gcc.dg/pr55152-2.c
+++ b/gcc/testsuite/gcc.dg/pr55152-2.c
@@ -10,4 +10,5 @@ int f(int a)
   return (a<-a)?a:-a;
 }
 
-/* { dg-final { scan-tree-dump-times "ABS_EXPR" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\.COPYSIGN" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "ABS_EXPR" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
index 
6197519faf7b55aed7bc162cd0a14dd2145210ca..e1b825f37f69ac3c4666b3a52d733368805ad31d
 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
@@ -9,5 +9,6 @@ long double abs_ld(long double x) { return __builtin_signbit(x) 
? x : -x; }
 
 /* __builtin_signbit(x) ? x : -x. Should be convert into - ABS_EXP */
 /* { dg-final { scan-tree-dump-not "signbit" "optimized"} } */
-/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 3 "optimized"} } */
-/* { dg-final { scan-tree-dump-times "= -" 3 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 1 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "= -" 1 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "= \.COPYSIGN" 2 "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c 
b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
index 
31f05716f1498dc709cac95fa20fb5796642c77e..c3a138642d6ff7be984e91fa1343cb2718db7ae1
 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
@@ -26,5 +26,6 @@ TEST_FUNCTION (float, f)
 TEST_FUNCTION (double, )
 TEST_FUNCTION (long double, l)
 
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 6 "backprop" } } */
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <} 3 
"backprop" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 4 "backprop" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = \.COPYSIGN} 2 
"backprop" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <} 1 
"backprop" } } */
diff --git a/gcc/testsuite/gcc.dg/tre

[COMMITTED] ada: Minor tweaks for comparison operators

2023-11-07 Thread Marc Poulhiès

From: Eric Botcazou 

No functional changes.

gcc/ada/

* gen_il-gen-gen_nodes.adb (N_Op_Boolean): Fix description.
* sem_ch4.adb (Analyze_Comparison_Equality_Op): Tidy up.
* sem_ch12.adb (Copy_Generic_Node): Use N_Op_Compare subtype.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gen_il-gen-gen_nodes.adb | 3 +--
 gcc/ada/sem_ch12.adb | 8 +---
 gcc/ada/sem_ch4.adb  | 9 -
 3 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb
index 2ad6e60dae8..0d2a68ea681 100644
--- a/gcc/ada/gen_il-gen-gen_nodes.adb
+++ b/gcc/ada/gen_il-gen-gen_nodes.adb
@@ -255,8 +255,7 @@ begin -- Gen_IL.Gen.Gen_Nodes
 Sm (Do_Division_Check, Flag)));
 
Ab (N_Op_Boolean, N_Binary_Op);
-   --  Binary operators that take operands of a boolean type, and yield a
-   --  result of a boolean type.
+   --  Binary operators that yield a result of a boolean type
 
Cc (N_Op_And, N_Op_Boolean,
(Sm (Chars, Name_Id),
diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
index c264f2a8283..80b3e16ea75 100644
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -8196,13 +8196,7 @@ package body Sem_Ch12 is
  --  if one of the operands is of a universal type, we need
  --  to manually restore the full view of private types.
 
- if Nkind (N) in N_Op_Eq
-   | N_Op_Ge
-   | N_Op_Gt
-   | N_Op_Le
-   | N_Op_Lt
-   | N_Op_Ne
- then
+ if Nkind (N) in N_Op_Compare then
 if Yields_Universal_Type (Left_Opnd (Assoc)) then
if Present (Etype (Right_Opnd (Assoc)))
  and then
diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
index e3badc3e19d..78249258f55 100644
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -2099,16 +2099,15 @@ package body Sem_Ch4 is
  end loop;
   end if;
 
-  --  If there was no match, and the operator is inequality, this may be
+  --  If there was no match and the operator is inequality, this may be
   --  a case where inequality has not been made explicit, as for tagged
   --  types. Analyze the node as the negation of an equality operation.
-  --  This cannot be done earlier, because before analysis we cannot rule
+  --  This cannot be done earlier because, before analysis, we cannot rule
   --  out the presence of an explicit inequality.
 
-  if Etype (N) = Any_Type
-and then Nkind (N) = N_Op_Ne
-  then
+  if Etype (N) = Any_Type and then Nkind (N) = N_Op_Ne then
  Op_Id := Get_Name_Entity_Id (Name_Op_Eq);
+
  while Present (Op_Id) loop
 if Ekind (Op_Id) = E_Operator then
Find_Comparison_Equality_Types (L, R, Op_Id, N);
-- 
2.42.0

[COMMITTED] ada: Implement Aspects as fields under nodes

2023-11-07 Thread Marc Poulhiès

From: Viljar Indus 

In the previous implementation Aspect Specifications were
stored in a separate table and not directly under each node.
This implementation included a lot of extra code that needed
to be maintained manually.

The new implementation stores Aspect_Specifications as a syntactic
field under each node. This removes the extra code that was needed
to store, traverse and clone aspects for nodes.

gcc/ada/

* aspects.adb (Exchange_Aspects): Removed. This method was
typically called after a Rewrite method. Now since the Rewrite
switches the aspects between the new and the old node it is no
longer needed.
(Has_Aspects): Converted to a utility method that behaves the same
as the previous Has_Aspects field did, meaning it shows whether
a node actually has aspects or not.
(Copy_Aspects): New utility method that performs a deep copy of the
From node's aspects.
(Aspect_Specifications): Removed. No longer needed. Replaced
by the primitive operation for the Aspect_Specification fields.
(Set_Aspect_Specifications): Likewise.
(Aspect_Specifications_Hash_Table): Remove the table and all the
utility methods for storing the old aspects.
* aspects.ads: Likewise.
* atree.adb (Copy_Separate_Tree): Remove custom code for aspects.
(New_Copy): Likewise.
(Replace): Likewise.
(Rewrite): Likewise.
* exp_ch3.adb (Expand_N_Object_Declaration): Keep the aspects from the 
old node.
* exp_ch6.adb (Validate_Subprogram_Calls): Previously aspects were 
ignored
because they were not on the tree. Explicitly ignore them here
when traversing the tree.
* exp_unst.adb (Build_Tables): Likewise
* gen_il-fields.ads: Remove Has_Aspects and add
Aspect_Specifications fields.
* gen_il-gen-gen_nodes.adb: Add Aspect_Specification fields
for all nodes that can have aspects. Additionally add
Expression_Copy for Aspect_Specifications to avoid reusing
the Associated_Node for generic instantiation and aspect
analysis.
* ghost.adb (Remove_Ignored_Ghost_Node): Remove call to Remove_Aspects.
The rewritten node is a Null_Statement that cannot have aspects
and there is not anything to gain from removing them from the
Original_Node of N since it technically is not part of the active
tree.
* inline.adb (Process_Formals_In_Aspects): Simplify code for node 
traversal.
* par-ch13.adb: Avoid setting the parent explicitly for the
Aspect_Specifications list. This is done explicitly in the setter.
* par-ch6.adb: Likewise.
* par_sco.adb (Traverse_Aspects): Handle early return.
* sem_ch10.adb: Simplify code for Analyze_Aspect_Specifications.
* sem_ch11.adb: Likewise.
* sem_ch12.adb (Analyze_Formal_Derived_Interface_Type): Keep the 
aspects from
the original node after rewrite.
(Analyze_Formal_Derived_Type): Likewise.
(Analyze_Formal_Interface_Type): Likewise.
(Analyze_Formal_Object_Declaration): Simplify code for
Analyze_Aspect_Specifications.
(Analyze_Formal_Package_Declaration): Likewise.
(Analyze_Formal_Subprogram_Declaration): Likewise.
(Analyze_Formal_Type_Declaration): Likewise.
(Analyze_Generic_Package_Declaration): Remove Exchange_Aspects.
The new node already has the correct aspects after the rewrite.
Also simplify code for Analyze_Aspect_Specifications.
(Analyze_Generic_Subprogram_Declaration): Likewise.
(Analyze_Package_Instantiation): Simplify code for
Analyze_Aspect_Specifications.
(Build_Instance_Compilation_Unit_Nodes): Remove explicit copy of
aspects that is no longer needed.
(Save_References): Update the traversal code to handle
Aspect_Specifications in the tree.
(Copy_Generic_Node): Remove explicit copy for aspects. New_Copy
took care of that already.
* sem_ch13.adb (Analyze_Aspect_Specifications): Add early return to 
simplify
code for its calls. Avoid reusing the Entity(Associated_Node)
field for storing the original expression. Instead use the
new Expression_Copy field since Entity(Associated_Node) is
also used in generic instantiation.
(Analyze_Aspects_On_Subprogram_Body_Or_Stub): Simplify call
to Analyze_Aspect_Specifications.
(Check_Aspect_At_End_Of_Declarations): Use Expression_Copy
instead of Entity.
(Check_Aspect_At_Freeze_Point): Likewise.
* sem_ch3.adb: Simplify calls to Analyze_Aspect_Specifications.
* sem_ch6.adb (Analyze_Abstract_Subprogram_Declaration): Simplify call 
to
Analyze_Aspect_Specifications.
(Analyze_Expression_Function): Keep the aspects from the
original node after a rewrite.
(Analyze_Gen

[COMMITTED] ada: Rename Is_Limited_View to reflect actual query

2023-11-07 Thread Marc Poulhiès

From: Yannick Moy 

Function Sem_Aux.Is_Limited_View returns whether the type is
"inherently limited" in a slightly different way from the "immutably
limited" definition in Ada 2012. Rename for clarity.

gcc/ada/

* exp_aggr.adb: Apply the renaming.
* exp_ch3.adb: Same.
* exp_ch4.adb: Same.
* exp_ch6.adb: Same.
* exp_ch7.adb: Same.
* exp_util.adb: Same.
* freeze.adb: Same.
* sem_aggr.adb: Same.
* sem_attr.adb: Same.
* sem_aux.adb: Alphabetize Is_Limited_Type. Rename.
* sem_aux.ads: Same.
* sem_ch3.adb: Apply the renaming.
* sem_ch6.adb: Same.
* sem_ch8.adb: Same.
* sem_prag.adb: Same.
* sem_res.adb: Same.
* sem_util.adb: Same.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_aggr.adb |  10 ++--
 gcc/ada/exp_ch3.adb  |   6 +--
 gcc/ada/exp_ch4.adb  |   2 +-
 gcc/ada/exp_ch6.adb  |   4 +-
 gcc/ada/exp_ch7.adb  |   4 +-
 gcc/ada/exp_util.adb |   4 +-
 gcc/ada/freeze.adb   |   5 +-
 gcc/ada/sem_aggr.adb |   2 +-
 gcc/ada/sem_attr.adb |   4 +-
 gcc/ada/sem_aux.adb  | 116 +--
 gcc/ada/sem_aux.ads  |  16 +++---
 gcc/ada/sem_ch3.adb  |   2 +-
 gcc/ada/sem_ch6.adb  |   6 +--
 gcc/ada/sem_ch8.adb  |   2 +-
 gcc/ada/sem_prag.adb |   2 +-
 gcc/ada/sem_res.adb  |   4 +-
 gcc/ada/sem_util.adb |  10 ++--
 17 files changed, 101 insertions(+), 98 deletions(-)

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index 340c8c68465..319254dfd63 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -945,7 +945,7 @@ package body Exp_Aggr is
   --  If component is limited, aggregate must be expanded because each
   --  component assignment must be built in place.
 
-  if Is_Limited_View (Component_Type (Typ)) then
+  if Is_Inherently_Limited_Type (Component_Type (Typ)) then
  return False;
   end if;
 
@@ -3026,7 +3026,7 @@ package body Exp_Aggr is
--  call will be generated by Make_Tag_Ctrl_Assignment).
 
if Needs_Finalization (Init_Typ)
- and then not Is_Limited_View (Init_Typ)
+ and then not Is_Inherently_Limited_Type (Init_Typ)
then
   Set_No_Finalize_Actions (First (Assign));
else
@@ -8166,7 +8166,9 @@ package body Exp_Aggr is
   --  Extension aggregates, aggregates in extended return statements, and
   --  aggregates for C++ imported types must be expanded.
 
-  elsif Ada_Version >= Ada_2005 and then Is_Limited_View (Typ) then
+  elsif Ada_Version >= Ada_2005
+and then Is_Inherently_Limited_Type (Typ)
+  then
  if Nkind (Parent (N)) not in
   N_Component_Association | N_Object_Declaration
  then
@@ -8400,7 +8402,7 @@ package body Exp_Aggr is
   --  of their individual elements will receive an adjustment of its own.
 
   if Finalization_OK
-and then not Is_Limited_View (Comp_Typ)
+and then not Is_Inherently_Limited_Type (Comp_Typ)
 and then not
   (Is_Array_Type (Etype (N))
 and then Is_Array_Type (Comp_Typ)
diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index 0217f8d7eb0..511d4c09b22 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -7255,7 +7255,7 @@ package body Exp_Ch3 is
 
  else pragma Assert (Is_Definite_Subtype (Typ)
or else (Has_Unknown_Discriminants (Typ)
- and then Is_Limited_View (Typ)));
+ and then Is_Inherently_Limited_Type (Typ)));
 
 Alloc_Typ := Typ;
  end if;
@@ -7692,7 +7692,7 @@ package body Exp_Ch3 is
--  and attached to the finalization list.
 
if Needs_Finalization (Typ)
- and then not Is_Limited_View (Typ)
+ and then not Is_Inherently_Limited_Type (Typ)
then
   Adj_Call :=
 Make_Adjust_Call (
@@ -8137,7 +8137,7 @@ package body Exp_Ch3 is
 --  the object declaration into a renaming declaration.
 
 if Needs_Finalization (Typ)
-  and then not Is_Limited_View (Typ)
+  and then not Is_Inherently_Limited_Type (Typ)
   and then Nkind (Expr_Q) /= N_Function_Call
   and then not Rewrite_As_Renaming
 then
diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
index ec95d8b830b..f04ac615be9 100644
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -941,7 +941,7 @@ package body Exp_Ch4 is
 
  if Needs_Finalization (DesigT)
and then Needs_Finalization (T)
-   and then not Is_Limited_View (T)
+   and then not Is_Inherently_Limited_Type (T)
and then not Aggr_In_Place
and then Nkind (Exp) /= N_Function_Call
and then not For_Special_Return_Object (N)
diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp

[COMMITTED] ada: Update the logo in the gnat doc

2023-11-07 Thread Marc Poulhiès

From: Julien Bortolussi 

Update the logo and the background color in the top right corner of the
GNAT User’s Guide for Native Platforms.

gcc/ada/

* doc/share/conf.py: Changed the background color and the logo.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/doc/share/conf.py | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/doc/share/conf.py b/gcc/ada/doc/share/conf.py
index 48f1a96a309..4773ac96e39 100644
--- a/gcc/ada/doc/share/conf.py
+++ b/gcc/ada/doc/share/conf.py
@@ -138,10 +138,13 @@ tags.add(get_gnat_build_type())
 
 # Define figures to be included
 html_theme = 'sphinx_rtd_theme'
-if os.path.isfile('adacore_transparent.png'):
+html_theme_options = {
+"style_nav_header_background": "#12284c",
+}
+if os.path.isfile('adacore-logo-white.png'):
 # split html and pdf logos to avoid 'same name' error in sphinx <5.2+
-html_logo = 'adacore_transparent.png'
-latex_logo = 'adacore_transparent.png'
+html_logo = 'adacore-logo-white.png'
+latex_logo = 'adacore-logo-white.png'
 if os.path.isfile('favicon.ico'):
 html_favicon = 'favicon.ico'
 
-- 
2.42.0

[COMMITTED] ada: Fix spurious -Wstringop-overflow with link time optimization

2023-11-07 Thread Marc Poulhiès

From: Eric Botcazou 

It comes from an incomplete optimization performed by LTO that is caused by
an obsolete transformation done in Gigi, which is redundant with the common
uniquization of constant CONSTRUCTORs now performed during gimplification.

gcc/ada/

* gcc-interface/trans.cc (gnat_gimplify_expr) <CALL_EXPR>: Delete.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gcc-interface/trans.cc | 24 
 1 file changed, 24 deletions(-)

diff --git a/gcc/ada/gcc-interface/trans.cc b/gcc/ada/gcc-interface/trans.cc
index 89f0a07c824..c7d91628f80 100644
--- a/gcc/ada/gcc-interface/trans.cc
+++ b/gcc/ada/gcc-interface/trans.cc
@@ -8950,30 +8950,6 @@ gnat_gimplify_expr (tree *expr_p, gimple_seq *pre_p,
  }
   break;
 
-case CALL_EXPR:
-  /* If we are passing a constant fat pointer CONSTRUCTOR, make sure it is
-put into static memory; this performs a restricted version of constant
-propagation on fat pointers in calls.  But do not do it for strings to
-avoid blocking concatenation in the caller when it is inlined.  */
-  for (int i = 0; i < call_expr_nargs (expr); i++)
-   {
- tree arg = CALL_EXPR_ARG (expr, i);
-
- if (TREE_CODE (arg) == CONSTRUCTOR
- && TREE_CONSTANT (arg)
- && TYPE_IS_FAT_POINTER_P (TREE_TYPE (arg)))
-   {
- tree t = CONSTRUCTOR_ELT (arg, 0)->value;
- if (TREE_CODE (t) == NOP_EXPR)
-   t = TREE_OPERAND (t, 0);
- if (TREE_CODE (t) == ADDR_EXPR)
-   t = TREE_OPERAND (t, 0);
- if (TREE_CODE (t) != STRING_CST)
-   CALL_EXPR_ARG (expr, i) = tree_output_constant_def (arg);
-   }
-   }
-  break;
-
 case DECL_EXPR:
   op = DECL_EXPR_DECL (expr);
 
-- 
2.42.0

RE: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

2023-11-07 Thread Richard Biener

On Tue, 7 Nov 2023, Tamar Christina wrote:

> > 
> > You also drop (copysign, x, CST) -> abs (x) when x is not negative - I 
> > think that's
> > still worthwhile as it has one less argument?
> > 
> > Keeping that might also need less testsuite adjustments?
> 
> Done but still needed the testsuite updates.
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> 
> Ok for master?

OK.

Richard.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/109154
>   * match.pd: Add new neg+abs rule, remove inverse copysign rule.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/109154
>   * gcc.dg/fold-copysign-1.c: Updated.
>   * gcc.dg/pr55152-2.c: Updated.
>   * gcc.dg/tree-ssa/abs-4.c: Updated.
>   * gcc.dg/tree-ssa/backprop-6.c: Updated.
>   * gcc.dg/tree-ssa/copy-sign-2.c: Updated.
>   * gcc.dg/tree-ssa/mult-abs-2.c: Updated.
>   * gcc.target/aarch64/fneg-abs_1.c: New test.
>   * gcc.target/aarch64/fneg-abs_2.c: New test.
>   * gcc.target/aarch64/fneg-abs_3.c: New test.
>   * gcc.target/aarch64/fneg-abs_4.c: New test.
>   * gcc.target/aarch64/sve/fneg-abs_1.c: New test.
>   * gcc.target/aarch64/sve/fneg-abs_2.c: New test.
>   * gcc.target/aarch64/sve/fneg-abs_3.c: New test.
>   * gcc.target/aarch64/sve/fneg-abs_4.c: New test.
> 
> --- inline copy of the patch ---
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 
> 68a1587ea2465a890875448993e140425ef6a68f..5928acbb14e2a18addff38000bf30dd273d4b1a6
>  100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1118,14 +1118,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (hypots @0 (copysigns @1 @2))
> (hypots @0 @1
>  
> -/* copysign(x, CST) -> [-]abs (x).  */
> +/* copysign(x, CST) -> abs (x).  */
>  (for copysigns (COPYSIGN_ALL)
>   (simplify
>(copysigns @0 REAL_CST@1)
> -  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
> -   (negate (abs @0))
> +  (if (!REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
> (abs @0
>  
> +/* Transform fneg (fabs (X)) -> copysign (X, -1).  */
> +(simplify
> + (negate (abs @0))
> + (IFN_COPYSIGN @0 { build_minus_one_cst (type); }))
> +
>  /* copysign(copysign(x, y), z) -> copysign(x, z).  */
>  (for copysigns (COPYSIGN_ALL)
>   (simplify
> diff --git a/gcc/testsuite/gcc.dg/fold-copysign-1.c 
> b/gcc/testsuite/gcc.dg/fold-copysign-1.c
> index 
> f17d65c24ee4dca9867827d040fe0a404c515e7b..f9cafd14ab05f5e8ab2f6f68e62801d21c2df6a6
>  100644
> --- a/gcc/testsuite/gcc.dg/fold-copysign-1.c
> +++ b/gcc/testsuite/gcc.dg/fold-copysign-1.c
> @@ -12,5 +12,5 @@ double bar (double x)
>return __builtin_copysign (x, minuszero);
>  }
>  
> -/* { dg-final { scan-tree-dump-times "= -" 1 "cddce1" } } */
> -/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 2 "cddce1" } } */
> +/* { dg-final { scan-tree-dump-times "__builtin_copysign" 1 "cddce1" } } */
> +/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 1 "cddce1" } } */
> diff --git a/gcc/testsuite/gcc.dg/pr55152-2.c 
> b/gcc/testsuite/gcc.dg/pr55152-2.c
> index 
> 54db0f2062da105a829d6690ac8ed9891fe2b588..605f202ed6bc7aa8fe921457b02ff0b88cc63ce6
>  100644
> --- a/gcc/testsuite/gcc.dg/pr55152-2.c
> +++ b/gcc/testsuite/gcc.dg/pr55152-2.c
> @@ -10,4 +10,5 @@ int f(int a)
>return (a<-a)?a:-a;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "ABS_EXPR" 2 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "\.COPYSIGN" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "ABS_EXPR" 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
> index 
> 6197519faf7b55aed7bc162cd0a14dd2145210ca..e1b825f37f69ac3c4666b3a52d733368805ad31d
>  100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c
> @@ -9,5 +9,6 @@ long double abs_ld(long double x) { return 
> __builtin_signbit(x) ? x : -x; }
>  
>  /* __builtin_signbit(x) ? x : -x. Should be convert into - ABS_EXP */
>  /* { dg-final { scan-tree-dump-not "signbit" "optimized"} } */
> -/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 3 "optimized"} } */
> -/* { dg-final { scan-tree-dump-times "= -" 3 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 1 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "= -" 1 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "= \.COPYSIGN" 2 "optimized"} } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
> index 
> 31f05716f1498dc709cac95fa20fb5796642c77e..c3a138642d6ff7be984e91fa1343cb2718db7ae1
>  100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c
> @@ -26,5 +26,6 @@ TEST_FUNCTION (float, f)
>  TEST_FUNCTION (double, )
>  TEST_FUNCTION (long double, l)
>  
> -/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 6 "backprop" } } */
> -/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <} 3 
> "backprop" } } */
> +/* { dg-f

Re: [PATCH] rs6000, testcase: Add require-effective-target has_arch_ppc64 to pr106550_1.c

2023-11-07 Thread Jiufu Guo



Hi,

"Kewen.Lin"  writes:

> Hi,
>
> on 2023/11/6 15:20, Jiufu Guo wrote:
>> Hi,
>> 
>> With the latest trunk, the test case pr106550_1.c fails on ppc under -m32.
>> However, the case tests 64-bit constant building, so "has_arch_ppc64"
>> is required.
>
> Please also mention that it failed with ICE initially due to PR111971, now
> that PR got fixed so this omission is exposed.
Thanks!
>
>> 
>> Test pass on ppc64{,le}.
>> 
>> BR,
>> Jeff (Jiufu Guo)
>> 
>> gcc/testsuite/ChangeLog:
>> 
>
> There is PR112340, need a corresponding PR marker line.
Yeap! Thanks for pointing out this!

>
> OK for trunk with these nits tweaked, thanks!

Committed via r14-5184.

BR,
Jeff (Jiufu Guo)

>
> BR,
> Kewen
>
>>  * gcc.target/powerpc/pr106550_1.c: Add has_arch_ppc64 target require.
>> 
>> ---
>>  gcc/testsuite/gcc.target/powerpc/pr106550_1.c | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c
>> index 7e709fcf9d8..5ab40d71a56 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c
>> @@ -1,5 +1,6 @@
>>  /* PR target/106550 */
>>  /* { dg-require-effective-target power10_ok } */
>> +/* { dg-require-effective-target has_arch_ppc64 } */
>>  /* { dg-options "-O2 -mdejagnu-cpu=power10 -fdisable-rtl-split1" } */
>>  /* force the constant splitter run after RA: -fdisable-rtl-split1.  */
>>

RE: [PATCH v6 0/21]middle-end: Support early break/return auto-vectorization

2023-11-07 Thread Richard Biener

On Mon, 6 Nov 2023, Tamar Christina wrote:

> > -Original Message-
> > From: Richard Biener 
> > Sent: Monday, November 6, 2023 2:25 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd 
> > Subject: Re: [PATCH v6 0/21]middle-end: Support early break/return auto-
> > vectorization
> > 
> > On Mon, 6 Nov 2023, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > This patch adds initial support for early break vectorization in GCC.
> > > The support is added for any target that implements a vector cbranch
> > > optab, this includes both fully masked and non-masked targets.
> > >
> > > > Depending on the operation, the vectorizer may also require support
> > > > for boolean mask reductions using Inclusive OR.  This is however only
> > > > checked when the comparison would produce multiple statements.
> > > >
> > > > Note: I am currently struggling to get patch 7 correct in all cases and
> > > >   could use some feedback there.
> > >
> > > Concretely the kind of loops supported are of the forms:
> > >
> > > >  for (int i = 0; i < N; i++)
> > > >  {
> > > >    <statements1>
> > > >    if (<condition>)
> > > >      {
> > > >        ...
> > > >        <action>;
> > > >      }
> > > >    <statements2>
> > > >  }
> > > >
> > > > where <action> can be:
> > > >  - break
> > > >  - return
> > > >  - goto
> > > >
> > > > Any number of statements can be used before the <action> occurs.
> > >
> > > Since this is an initial version for GCC 14 it has the following
> > > limitations and
> > > features:
> > >
> > > > - Only fixed sized iterations and buffers are supported.  That is to say
> > > >   any vectors loaded or stored must be to statically allocated arrays
> > > >   with known sizes.  N must also be known.  This limitation is because
> > > >   our primary target for this optimization is SVE.  For VLA SVE we can't
> > > >   easily do cross page iteration checks.  The result is likely to also
> > > >   not be beneficial.  For that reason we punt support for variable
> > > >   buffers till we have First-Faulting support in GCC.

Btw, for this I wonder if you thought about marking memory accesses
required for the early break condition as required to be vector-size
aligned, thus peeling or versioning them for alignment?  That should
ensure they do not fault.
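
To spell out the arithmetic behind that: a vector-size-aligned access cannot
straddle a page boundary as long as the vector size divides the page size, so
it stays within the page that the scalar loop would have touched anyway.  The
sketch below is only an illustration under those assumptions (fixed vector
size, power-of-two sizes); crosses_page and peel_iterations are hypothetical
helpers, not vectorizer functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Does an access of VEC_BYTES bytes starting at ADDR touch two pages?  */
static int
crosses_page (uintptr_t addr, size_t vec_bytes, size_t page_size)
{
  return addr / page_size != (addr + vec_bytes - 1) / page_size;
}

/* Number of scalar iterations to peel so that ADDR becomes VEC_BYTES
   aligned.  Assumes ADDR is already a multiple of ELEM_SIZE.  */
static size_t
peel_iterations (uintptr_t addr, size_t elem_size, size_t vec_bytes)
{
  assert (elem_size != 0 && vec_bytes % elem_size == 0);
  assert (addr % elem_size == 0);
  size_t misalign = addr % vec_bytes;
  return misalign == 0 ? 0 : (vec_bytes - misalign) / elem_size;
}
```

For example, with 4-byte elements and 16-byte vectors, an array starting at
0x1004 needs three peeled scalar iterations; after that no 16-byte load
crosses a 4096-byte page.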

OTOH I somehow remember prologue peeling isn't supported for early
break vectorization?  ..

> > > > - any stores in <statements1> should not be to the same objects as in
> > > >   <condition>.  Loads are fine as long as they don't have the
> > > >   possibility to alias.  More concretely, we block RAW dependencies when
> > > >   the intermediate value can't be separated from the store, or the store
> > > >   itself can't be moved.
> > > > - Prologue peeling, alignment peeling and loop versioning are supported.

.. but here you say it is.  Not sure if peeling for alignment works for
VLA vectors though.  Just to say x86 doesn't support first-faulting
loads.

> > > > - Fully masked loops, unmasked loops and partially masked loops are
> > > >   supported.
> > > > - Any number of loop early exits are supported.
> > > > - No support for epilogue vectorization.  The only epilogue supported is
> > > >   the scalar final one.  Peeling code supports it but the code motion
> > > >   code cannot find instructions to make the move in the epilog.
> > > - Early breaks are only supported for inner loop vectorization.
> > >
> > > I have pushed a branch to refs/users/tnfchris/heads/gcc-14-early-break
> > >
> > > > With the help of IPA and LTO this still gets hit quite often.  During
> > > > bootstrap it hit rather frequently.  Additionally TSVC s332, s481 and
> > > > s482 all pass now since these are tests for support for early exit
> > > > vectorization.
> > >
> > > > This implementation does not support completely handling the early
> > > > break inside the vector loop itself but instead supports adding checks
> > > > such that if we know that we have to exit in the current iteration
> > > > then we branch to scalar code to actually do the final VF iterations,
> > > > which handles all the code in <action>.
> > > >
> > > > For the scalar loop we know that whatever exit you take you have to
> > > > perform at most VF iterations.  For vector code we only care about the
> > > > state of fully performed iterations and reset the scalar code to the
> > > > (partially) remaining loop.
> > > >
> > > > That is to say, the first vector loop executes so long as the early
> > > > exit isn't needed.  Once the exit is taken, the scalar code will
> > > > perform at most VF extra iterations.  The exact number depends on
> > > > peeling, the iteration start and which exit was taken (natural or
> > > > early).  For this scalar loop, all early exits are treated the same.
> > >
> > > > When we vectorize we move any statement not related to the early break
> > > > itself and that would be incorrect to execute before the break (i.e.
> > > > has side effects) to after the break.  If this is not possible we
> > > > decline to vectorize.
> > >
> > > This means that we check at the start of iterations whether we are
> > > going t

[PATCH] RISC-V: Use stdint-gcc.h in rvv testsuite

2023-11-07 Thread Christoph Muellner

From: Christoph Müllner 

stdint.h can be replaced with stdint-gcc.h to resolve some missing
system headers in non-multilib installations.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadmemidx-helpers.h:
Replace stdint.h with stdint-gcc.h.

Signed-off-by: Christoph Müllner 
---
 gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h 
b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
index a97f08c5cc1..9d8ce124a93 100644
--- a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
@@ -1,7 +1,7 @@
 #ifndef XTHEADMEMIDX_HELPERS_H
 #define XTHEADMEMIDX_HELPERS_H
 
-#include <stdint.h>
+#include <stdint-gcc.h>
 
 #define intX_t long
 #define uintX_t unsigned long
-- 
2.41.0

Re: [PATCH 1/21]middle-end testsuite: Add more pragma novector to new tests

2023-11-07 Thread Richard Biener

On Mon, 6 Nov 2023, Tamar Christina wrote:

> Hi All,
> 
> This adds pragma GCC novector to testcases that have shown up
> since the last regression run and due to this series detecting more.
> 
> Is it OK if, when it comes time to commit, I just update any
> new cases before committing?  This seems to be a cat and mouse game.

Yeah, just update.
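
For readers unfamiliar with the pragma: it is placed directly in front of the
loop that should stay scalar, which is why the quoted patch moves it from the
outer checking loop to the inner one.  A minimal sketch of that placement
follows; the function all_eight and the size N are illustrative and not taken
from the testcases, and compilers that do not know the pragma are assumed to
ignore it (possibly with a warning).

```c
#include <assert.h>

#define N 16

/* Return 1 if every element of A equals 8, else 0.  The early return
   in the inner loop is the kind of exit the series can now vectorize,
   so the checking loop is marked to stay scalar.  */
static int
all_eight (int a[N][N])
{
  for (int i = 0; i < N; i++)
    {
#pragma GCC novector
      for (int j = 0; j < N; j++)
	if (a[i][j] != 8)
	  return 0;
    }
  return 1;
}

/* Usage sketch.  */
static void
all_eight_example (void)
{
  static int m[N][N];

  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      m[i][j] = 8;
  assert (all_eight (m));

  m[3][5] = 7;
  assert (!all_eight (m));
}
```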

> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK.

> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/no-scevccp-slp-30.c: Add pragma novector.
>   * gcc.dg/vect/no-scevccp-slp-31.c: Likewise.
>   * gcc.dg/vect/no-section-anchors-vect-69.c: Likewise.
>   * gcc.target/aarch64/vect-xorsign_exec.c: Likewise.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/testsuite/gcc.dg/vect/no-scevccp-slp-30.c 
> b/gcc/testsuite/gcc.dg/vect/no-scevccp-slp-30.c
> index 
> 00d0eca56eeca6aee6f11567629dc955c0924c74..534bee4a1669a7cbd95cf6007f28dafd23bab8da
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/no-scevccp-slp-30.c
> +++ b/gcc/testsuite/gcc.dg/vect/no-scevccp-slp-30.c
> @@ -24,9 +24,9 @@ main1 ()
> }
>  
>/* check results:  */
> -#pragma GCC novector
> for (j = 0; j < N; j++)
> {
> +#pragma GCC novector
>  for (i = 0; i < N; i++)
>{
>  if (out[i*4] != 8
> diff --git a/gcc/testsuite/gcc.dg/vect/no-scevccp-slp-31.c 
> b/gcc/testsuite/gcc.dg/vect/no-scevccp-slp-31.c
> index 
> 48b6a9b0681cf1fe410755c3e639b825b27895b0..22817a57ef81398cc018a78597755397d20e0eb9
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/no-scevccp-slp-31.c
> +++ b/gcc/testsuite/gcc.dg/vect/no-scevccp-slp-31.c
> @@ -27,6 +27,7 @@ main1 ()
>  #pragma GCC novector
>   for (i = 0; i < N; i++)
> {
> +#pragma GCC novector
>  for (j = 0; j < N; j++) 
>{
>  if (a[i][j] != 8)
> diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c 
> b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
> index 
> a0e53d5fef91868dfdbd542dd0a98dff92bd265b..0861d488e134d3f01a2fa83c56eff7174f36ddfb
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
> +++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
> @@ -83,9 +83,9 @@ int main1 ()
>  }
>  
>/* check results:  */
> -#pragma GCC novector
>for (i = 0; i < N; i++)
>  {
> +#pragma GCC novector
>for (j = 0; j < N; j++)
>   {
>if (tmp1[2].e.n[1][i][j] != 8)
> @@ -103,9 +103,9 @@ int main1 ()
>  }
>  
>/* check results:  */
> -#pragma GCC novector
>for (i = 0; i < N - NINTS; i++)
>  {
> +#pragma GCC novector
>for (j = 0; j < N - NINTS; j++)
>   {
>if (tmp2[2].e.n[1][i][j] != 8)
> diff --git a/gcc/testsuite/gcc.target/aarch64/vect-xorsign_exec.c 
> b/gcc/testsuite/gcc.target/aarch64/vect-xorsign_exec.c
> index 
> cfa22115831272cb1d4e1a38512f10c3a1c6ad77..84f33d3f6cce9b0017fd12ab961019041245ffae
>  100644
> --- a/gcc/testsuite/gcc.target/aarch64/vect-xorsign_exec.c
> +++ b/gcc/testsuite/gcc.target/aarch64/vect-xorsign_exec.c
> @@ -33,6 +33,7 @@ main (void)
>  r[i] = a[i] * __builtin_copysignf (1.0f, b[i]);
>  
>/* check results:  */
> +#pragma GCC novector
>for (i = 0; i < N; i++)
>  if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
>abort ();
> @@ -41,6 +42,7 @@ main (void)
>  rd[i] = ad[i] * __builtin_copysign (1.0d, bd[i]);
>  
>/* check results:  */
> +#pragma GCC novector
>for (i = 0; i < N; i++)
>  if (rd[i] != ad[i] * __builtin_copysign (1.0d, bd[i]))
>abort ();
> 
> 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] RISC-V: Add ABI requirement for XTheadFMemIdx tests

2023-11-07 Thread Christoph Müllner

On Tue, Nov 7, 2023 at 2:19 AM Kito Cheng  wrote:
>
> LGTM, and maybe change stdint.h to stdint-gcc.h in
> xtheadmemidx-helpers.h? that could make it more portable on multi-lib
> testing.

Can be found here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635508.html

Thanks!

>
> On Tue, Nov 7, 2023 at 3:44 AM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > The XTheadFMemIdx tests set the required ABI for RV32, but not
> > for RV64, which has the effect that the tests are expected to
> > succeed for RV64/LP64.  Let's set the ABI to LP64D in these
> > tests to clarify the requirements.
> >
> > Signed-off-by: Christoph Müllner 
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/xtheadfmemidx-index-update.c: Add ABI.
> > * gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-index.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-uindex-update.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-uindex.c: Likewise.
> > ---
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c | 2 +-
> >  .../gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c  | 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c   | 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c| 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-update.c| 2 +-
> >  .../gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c | 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c  | 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex.c   | 2 +-
> >  8 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c
> > index 24bbb63d174..cb86b8ad296 100644
> > --- a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile } */
> >  /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
> > -/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx" { target { rv64 
> > } } } */
> > +/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx -mabi=lp64d" { 
> > target { rv64 } } } */
> >  /* { dg-options "-march=rv32imafc_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" 
> > { target { rv32 } } } */
> >
> >  #include "xtheadmemidx-helpers.h"
> > diff --git 
> > a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c
> > index 3b931a4b980..cc3f6219c05 100644
> > --- a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile } */
> >  /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
> > -/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx" { 
> > target { rv64 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx 
> > -mabi=lp64d" { target { rv64 } } } */
> >  /* { dg-options "-march=rv32imafc_xtheadbb_xtheadmemidx_xtheadfmemidx 
> > -mabi=ilp32f" { target { rv32 } } } */
> >
> >  #include "xtheadmemidx-helpers.h"
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c
> > index 48858605c24..8ee98c87469 100644
> > --- a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile } */
> >  /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
> > -/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx" { 
> > target { rv64 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx 
> > -mabi=lp64d" { target { rv64 } } } */
> >  /* { dg-options "-march=rv32imafc_xtheadbb_xtheadmemidx_xtheadfmemidx 
> > -mabi=ilp32f" { target { rv32 } } } */
> >
> >  #include "xtheadmemidx-helpers.h"
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c
> > index 1bb231a9e88..35704063598 100644
> > --- a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile } */
> >  /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
> > -/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx" { target { rv64 
> > } } } */
> > +/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx -mabi=lp6

Re: [PATCH 2/21]middle-end testsuite: Add tests for early break vectorization

2023-11-07 Thread Richard Biener

On Mon, 6 Nov 2023, Tamar Christina wrote:

> Hi All,
> 
> This adds new tests to check all the early break functionality.
> It includes a number of codegen and runtime tests checking the values at
> different needles in the array.
> 
> They also check the values on different array sizes and peeling positions,
> datatypes, VL, ncopies and every other variant I could think of.
> 
> Additionally it also contains reduced cases from issues found running over
> various codebases.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Also regtested with:
>  -march=armv8.3-a+sve
>  -march=armv8.3-a+nosve
>  -march=armv9-a
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * doc/sourcebuild.texi: Document it.

Document what?

> gcc/testsuite/ChangeLog:
> 
>   * lib/target-supports.exp:

?

For all runtime testcases you need to include "tree-vect.h"
and call check_vect () in main so appropriate cpuid checks
can be performed.

In vect/ you shouldn't use { dg-do run }, that's the default
and is overridden by some .exp magic.  If you add dg-do run
that magic doesn't work.

x86 also can do cbranch with SSE4.1, not sure how to
auto-magically add -msse4.1 for the tests though.
There's a sse4 target but that only checks whether you
can use -msse4.1.  Anyway, we can do x86 testsuite adjustments
as followup.


>   * g++.dg/vect/vect-early-break_1.cc: New test.
>   * g++.dg/vect/vect-early-break_2.cc: New test.
>   * g++.dg/vect/vect-early-break_3.cc: New test.
>   * gcc.dg/vect/vect-early-break-run_1.c: New test.
>   * gcc.dg/vect/vect-early-break-run_10.c: New test.
>   * gcc.dg/vect/vect-early-break-run_2.c: New test.
>   * gcc.dg/vect/vect-early-break-run_3.c: New test.
>   * gcc.dg/vect/vect-early-break-run_4.c: New test.
>   * gcc.dg/vect/vect-early-break-run_5.c: New test.
>   * gcc.dg/vect/vect-early-break-run_6.c: New test.
>   * gcc.dg/vect/vect-early-break-run_7.c: New test.
>   * gcc.dg/vect/vect-early-break-run_8.c: New test.
>   * gcc.dg/vect/vect-early-break-run_9.c: New test.
>   * gcc.dg/vect/vect-early-break-template_1.c: New test.
>   * gcc.dg/vect/vect-early-break-template_2.c: New test.
>   * gcc.dg/vect/vect-early-break_1.c: New test.
>   * gcc.dg/vect/vect-early-break_10.c: New test.
>   * gcc.dg/vect/vect-early-break_11.c: New test.
>   * gcc.dg/vect/vect-early-break_12.c: New test.
>   * gcc.dg/vect/vect-early-break_13.c: New test.
>   * gcc.dg/vect/vect-early-break_14.c: New test.
>   * gcc.dg/vect/vect-early-break_15.c: New test.
>   * gcc.dg/vect/vect-early-break_16.c: New test.
>   * gcc.dg/vect/vect-early-break_17.c: New test.
>   * gcc.dg/vect/vect-early-break_18.c: New test.
>   * gcc.dg/vect/vect-early-break_19.c: New test.
>   * gcc.dg/vect/vect-early-break_2.c: New test.
>   * gcc.dg/vect/vect-early-break_20.c: New test.
>   * gcc.dg/vect/vect-early-break_21.c: New test.
>   * gcc.dg/vect/vect-early-break_22.c: New test.
>   * gcc.dg/vect/vect-early-break_23.c: New test.
>   * gcc.dg/vect/vect-early-break_24.c: New test.
>   * gcc.dg/vect/vect-early-break_25.c: New test.
>   * gcc.dg/vect/vect-early-break_26.c: New test.
>   * gcc.dg/vect/vect-early-break_27.c: New test.
>   * gcc.dg/vect/vect-early-break_28.c: New test.
>   * gcc.dg/vect/vect-early-break_29.c: New test.
>   * gcc.dg/vect/vect-early-break_3.c: New test.
>   * gcc.dg/vect/vect-early-break_30.c: New test.
>   * gcc.dg/vect/vect-early-break_31.c: New test.
>   * gcc.dg/vect/vect-early-break_32.c: New test.
>   * gcc.dg/vect/vect-early-break_33.c: New test.
>   * gcc.dg/vect/vect-early-break_34.c: New test.
>   * gcc.dg/vect/vect-early-break_35.c: New test.
>   * gcc.dg/vect/vect-early-break_36.c: New test.
>   * gcc.dg/vect/vect-early-break_37.c: New test.
>   * gcc.dg/vect/vect-early-break_38.c: New test.
>   * gcc.dg/vect/vect-early-break_39.c: New test.
>   * gcc.dg/vect/vect-early-break_4.c: New test.
>   * gcc.dg/vect/vect-early-break_40.c: New test.
>   * gcc.dg/vect/vect-early-break_41.c: New test.
>   * gcc.dg/vect/vect-early-break_42.c: New test.
>   * gcc.dg/vect/vect-early-break_43.c: New test.
>   * gcc.dg/vect/vect-early-break_44.c: New test.
>   * gcc.dg/vect/vect-early-break_45.c: New test.
>   * gcc.dg/vect/vect-early-break_46.c: New test.
>   * gcc.dg/vect/vect-early-break_47.c: New test.
>   * gcc.dg/vect/vect-early-break_48.c: New test.
>   * gcc.dg/vect/vect-early-break_49.c: New test.
>   * gcc.dg/vect/vect-early-break_5.c: New test.
>   * gcc.dg/vect/vect-early-break_50.c: New test.
>   * gcc.dg/vect/vect-early-break_51.c: New test.
>   * gcc.dg/vect/vect-early-break_52.c: New test.
>   * gcc.dg/vect/vect-early-break_53.c: New test.
>   * gcc.dg/vect/vect-early-break_54.c: New test.
>   * gcc.dg/vec

Re: testsuite: introduce hostedlib effective target

2023-11-07 Thread Alexandre Oliva

[adding libstdc++@]

On Nov  5, 2023, Mike Stump  wrote:

> Ick.

Indeed ;-)

> I wish there were fewer changed lines and not 1 per test
> case. It feels like we've painted ourselves into a corner.

The libstdc++ testsuite took a different approach, detecting missing
headers (and libraries?) at error pruning time, and xfailing the tests,
which seems to be more in line with what you are looking for.

That approach, though more expedient, seems more fragile to me, in that
an actual bug that caused headers to go missing would cause tests to be
silently skipped rather than fail.

I expect the set of headers, and thus of affected tests, won't be very
dynamic, so it's kind of a one-shot change.

Of course new tests might be added that rely on such headers, and would
likely go unnoticed until someone tries them on a non-hosted libstdc++.
We could alleviate this if libstdc++ headers that are not installed on
hosted systems issued a warning (conditional on some macro defined by
the testsuite, say -D_GLIBCXX_WARN_HOSTED_ONLY).  For tests aimed
exclusively at hosted libstdc++, we'd then use a dg directive that both
implied this requirement, and changed the macro definition to suppress
the warning.  Then anyone who added a testcase that included hosted
headers without indicating its hostedlib requirement would get a fail
even when testing with a hosted libstdc++.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

Re: [PATCH] binutils: experimental use of libdiagnostics in gas

2023-11-07 Thread Jan Beulich

On 06.11.2023 23:29, David Malcolm wrote:
> Here's a patch for gas in binutils that makes it use libdiagnostics
> (with some nasty hardcoded paths to specific places on my hard drive
> to make it easier to develop the API).
> 
> For now this hardcodes adding two sinks: a text sink on stderr, and
> also a SARIF output to stderr (which happens after all regular output).
> 
> For example, without this patch:
> 
>gas testsuite/gas/all/warn-1.s
> 
> emits:
> 
> testsuite/gas/all/warn-1.s: Assembler messages:
> testsuite/gas/all/warn-1.s:3: Warning: a warning message
> testsuite/gas/all/warn-1.s:4: Error: .warning argument must be a string
> testsuite/gas/all/warn-1.s:5: Warning: .warning directive invoked in source 
> file
> testsuite/gas/all/warn-1.s:6: Warning: .warning directive invoked in source 
> file
> testsuite/gas/all/warn-1.s:7: Warning:
> 
> 
> whereas with this patch:
>   LD_LIBRARY_PATH=/home/david/coding-3/gcc-newgit-canvas-2023/build/gcc 
> ./as-new testsuite/gas/all/warn-1.s
> emits:
> 
> 
> testsuite/gas/all/warn-1.s:3: warning: a warning message
> 3 |  .warning "a warning message"   ;# { dg-warning "Warning: a warning 
> message" }
>   |
> testsuite/gas/all/warn-1.s:4: error: .warning argument must be a string
> 4 |  .warning a warning message ;# { dg-error "Error: .warning 
> argument must be a string" }
>   |
> testsuite/gas/all/warn-1.s:5: warning: .warning directive invoked in source 
> file
> 5 |  .warning   ;# { dg-warning "Warning: .warning 
> directive invoked in source file" }
>   |
> testsuite/gas/all/warn-1.s:6: warning: .warning directive invoked in source 
> file
> 6 |  .warning ".warning directive invoked in source file"   ;# { 
> dg-warning "Warning: .warning directive invoked in source file" }
>   |
> testsuite/gas/all/warn-1.s:7: warning:
> 7 |  .warning "";# { dg-warning "Warning: " }
>   |
>   {"$schema": 
> "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
>  "version": "2.1.0", "runs": [{"tool": {"driver": {"rules": []}}, 
> "invocations": [{"executionSuccessful": true, "toolExecutionNotifications": 
> []}], "originalUriBaseIds": {"PWD": {"uri": 
> "file:///home/david/coding-3/binutils-gdb/gas/"}}, "artifacts": [{"location": 
> {"uri": "testsuite/gas/all/warn-1.s", "uriBaseId": "PWD"}, "contents": 
> {"text": ";# Test .warning directive.\n;# { dg-do assemble }\n .warning \"a 
> warning message\"\t;# { dg-warning \"Warning: a warning message\" }\n 
> .warning a warning message\t;# { dg-error \"Error: .warning argument must be 
> a string\" }\n .warning\t\t\t;# { dg-warning \"Warning: .warning directive 
> invoked in source file\" }\n .warning \".warning directive invoked in source 
> file\"\t;# { dg-warning \"Warning: .warning directive invoked in source 
> file\" }\n .warning \"\"\t\t\t;# { dg-warning \"Warning: \" }\n"}}], 
> "results": [{"ruleId": "warning", "level": "warning", "message": {"text": "a 
> warning message"}, "locations": [{"physicalLocation": {"artifactLocation": 
> {"uri": "testsuite/gas/all/warn-1.s", "uriBaseId": "PWD"}, "region": 
> {"startLine": 3, "startColumn": 0, "endColumn": 1}, "contextRegion": 
> {"startLine": 3, "snippet": {"text": " .warning \"a warning message\"\t;# { 
> dg-warning \"Warning: a warning message\" }\n"], "relatedLocations": 
> [{"physicalLocation": {"artifactLocation": {"uri": 
> "testsuite/gas/all/warn-1.s", "uriBaseId": "PWD"}, "region": {"startLine": 4, 
> "startColumn": 0, "endColumn": 1}, "contextRegion": {"startLine": 4, 
> "snippet": {"text": " .warning a warning message\t;# { dg-error \"Error: 
> .warning argument must be a string\" }\n"}}}, "message": {"text": ".warning 
> argument must be a string"}}, {"physicalLocation": {"artifactLocation": 
> {"uri": "testsuite/gas/all/warn-1.s", "uriBaseId": "PWD"}, "region": 
> {"startLine": 5, "startColumn": 0, "endColumn": 1}, "contextRegion": 
> {"startLine": 5, "snippet": {"text": " .warning\t\t\t;# { dg-warning 
> \"Warning: .warning directive invoked in source file\" }\n"}}}, "message": 
> {"text": ".warning directive invoked in source file"}}, {"physicalLocation": 
> {"artifactLocation": {"uri": "testsuite/gas/all/warn-1.s", "uriBaseId": 
> "PWD"}, "region": {"startLine": 6, "startColumn": 0, "endColumn": 1}, 
> "contextRegion": {"startLine": 6, "snippet": {"text": " .warning \".warning 
> directive invoked in source file\"\t;# { dg-warning \"Warning: .warning 
> directive invoked in source file\" }\n"}}}, "message": {"text": ".warning 
> directive invoked in source file"}}, {"physicalLocation": 
> {"artifactLocation": {"uri": "testsuite/gas/all/warn-1.s", "uriBaseId": 
> "PWD"}, "region": {"startLine": 7, "

Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread Andrew Stubbs


On 07/11/2023 07:44, Juzhe-Zhong wrote:

This test shows vectorizing stmts using SLP 4 times instead of 2 for RVV.
The reason is RVV has 512 bit vector.
Here is a comparison between RVV and ARM SVE:
https://godbolt.org/z/xc5KE5rPs

But I notice AMDGCN also has 512-bit vectors, so it seems this patch could
cause a FAIL on GCN?

I am not sure whether the pattern matches 2 times or 4 times on GCN.


The pattern matches 4 times on GCN.


gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr97428.c: Fix FAIL for RVV.

---
  gcc/testsuite/gcc.dg/vect/pr97428.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr97428.c 
b/gcc/testsuite/gcc.dg/vect/pr97428.c
index ad6416096aa..352c9bf04a7 100644
--- a/gcc/testsuite/gcc.dg/vect/pr97428.c
+++ b/gcc/testsuite/gcc.dg/vect/pr97428.c
@@ -43,5 +43,6 @@ void foo_i2(dcmlx4_t dst[], const dcmlx_t src[], int n)
  /* { dg-final { scan-tree-dump "Detected interleaving store of size 16" 
"vect" } } */
  /* We're not able to peel & apply re-aligning to make accesses well-aligned 
for !vect_hw_misalign,
 but we could by peeling the stores for alignment and applying re-aligning 
loads.  */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
xfail { ! vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
xfail { { ! vect_hw_misalign } || { vect512 } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
xfail { { ! vect_hw_misalign } || { ! vect512 } } } } } */
  /* { dg-final { scan-tree-dump-not "gap of 6 elements" "vect" } } */

Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread juzhe.zh...@rivai.ai

So this patch not only fixes the RVV FAIL, but also fixes GCN?




juzhe.zh...@rivai.ai
 
From: Andrew Stubbs
Date: 2023-11-07 18:09
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
CC: jeffreya...@gmail.com; rguent...@suse.de
Subject: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
On 07/11/2023 07:44, Juzhe-Zhong wrote:
> This test shows "vectorizing stmts using SLP" 4 times instead of 2 for RVV.
> The reason is that RVV has 512-bit vectors.
> Here is a comparison between RVV and ARM SVE:
> https://godbolt.org/z/xc5KE5rPs
> 
> But I notice AMDGCN also has 512-bit vectors, so it seems this patch could
> cause a FAIL on GCN?
> 
> I am not sure whether the pattern matches 2 times or 4 times on GCN.
 
The pattern matches 4 times on GCN.
 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/vect/pr97428.c: Fix FAIL for RVV.
> 
> ---
>   gcc/testsuite/gcc.dg/vect/pr97428.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/pr97428.c 
> b/gcc/testsuite/gcc.dg/vect/pr97428.c
> index ad6416096aa..352c9bf04a7 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr97428.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr97428.c
> @@ -43,5 +43,6 @@ void foo_i2(dcmlx4_t dst[], const dcmlx_t src[], int n)
>   /* { dg-final { scan-tree-dump "Detected interleaving store of size 16" 
> "vect" } } */
>   /* We're not able to peel & apply re-aligning to make accesses well-aligned 
> for !vect_hw_misalign,
>  but we could by peeling the stores for alignment and applying 
> re-aligning loads.  */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { xfail { ! vect_hw_misalign } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { xfail { { ! vect_hw_misalign } || { vect512 } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> { xfail { { ! vect_hw_misalign } || { ! vect512 } } } } } */
>   /* { dg-final { scan-tree-dump-not "gap of 6 elements" "vect" } } */

Re: [PATCH] RISC-V: Use stdint-gcc.h in rvv testsuite

2023-11-07 Thread Kito Cheng

LGTM, but the title is a little bit misleading: it's not really related to rvv.
Changing it to either RISC-V or T-head is fine. Anyway, you can commit without
sending a v2 :)

On Tue, Nov 7, 2023 at 17:45, Christoph Muellner wrote:

> From: Christoph Müllner 
>
> stdint.h can be replaced with stdint-gcc.h to resolve some missing
> system headers in non-multilib installations.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/xtheadmemidx-helpers.h:
> Replace stdint.h with stdint-gcc.h.
>
> Signed-off-by: Christoph Müllner 
> ---
>  gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
> b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
> index a97f08c5cc1..9d8ce124a93 100644
> --- a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
> @@ -1,7 +1,7 @@
>  #ifndef XTHEADMEMIDX_HELPERS_H
>  #define XTHEADMEMIDX_HELPERS_H
>
> -#include <stdint.h>
> +#include <stdint-gcc.h>
>
>  #define intX_t long
>  #define uintX_t unsigned long
> --
> 2.41.0
>
>

Re: testsuite: introduce hostedlib effective target

2023-11-07 Thread Jonathan Wakely

On Tue, 7 Nov 2023 at 10:04, Alexandre Oliva  wrote:
>
> [adding libstdc++@]
>
> On Nov  5, 2023, Mike Stump  wrote:
>
> > Ick.
>
> Indeed ;-)
>
> > I wish there were fewer changed lines and not 1 per test
> > case. It feels like we've painted ourselves into a corner.
>
> The libstdc++ testsuite took a different approach, detecting missing
> headers (and libraries?) at error pruning time, and xfailing the tests,
> which seems to be more in line with what you are looking for.
>
> That approach, though more expedient, seems more fragile to me, in that
> an actual bug that caused headers to go missing would cause tests to be
> silently skipped rather than fail.

I don't think we XFAIL based on missing headers. We XFAIL based on a
specific #error message in certain headers.

If a header goes missing, we'll still XFAIL.

>
> I expect the set of headers, and thus of affected tests, won't be very
> dynamic, so it's kind of a one-shot change.
>
> Of course new tests might be added that rely on such headers, and would
> likely go unnoticed until someone tries them on a non-hosted libstdc++.

Since GCC 13 you don't need to build a non-hosted libstdc++ to test
it, you can just add -ffreestanding to the runtestflags.

> We could alleviate this if libstdc++ headers that are not installed on
> hosted systems issued a warning (conditional on some macro defined by
> the testsuite, say -D_GLIBCXX_WARN_HOSTED_ONLY).

That's exactly what happens (except #error not #warning) when you
compile with -ffreestanding.

>  For tests aimed
> exclusively at hosted libstdc++, we'd then use a dg directive that both
> implied this requirement, and changed the macro definition to suppress
> the warning.  Then anyone who added a testcase that included hosted
> headers without indicating its hostedlib requirement would get a fail
> even when testing with a hosted libstdc++.

I don't think we need to add checks for a new macro and then use that
when testing, you can just test with -ffreestanding instead. This
already works today.
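
The "#error in a header when not hosted" mechanism discussed above can be illustrated with a small C analogue. This is only a sketch: it uses the standard `__STDC_HOSTED__` macro, while libstdc++ relies on its own configuration macros that are not shown in this thread, and the function name `hosted_only_facility` is hypothetical.

```c
/* Sketch of a hosted-only facility guarded by a hard error.
   With GCC, compiling with -ffreestanding makes __STDC_HOSTED__ expand
   to 0, so the #error below fires with a specific, recognizable message
   instead of a "no such header" diagnostic.  */
#if !__STDC_HOSTED__
# error "hosted-only facility used in a freestanding build"
#endif

/* Hypothetical stand-in for something that only exists when hosted.  */
int
hosted_only_facility (void)
{
  return 1;
}
```

A testsuite can then match on that specific message, which is different from pruning on a missing-header error.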

[PATCH v2 0/3] libgfortran: empty array fixes

2023-11-07 Thread Mikael Morin

Hello,

Harald's review of the previous version [1] of these patches spotted a possible
misbehaving case in one patch, and a latent bug in the area of the second
patch.
So here is the second try, bootstrapped and regression tested on 
x86_64-pc-linux-gnu.
OK for master?

Mikael

[1]:
https://gcc.gnu.org/pipermail/fortran/2023-November/059896.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635305.html

Changes from version 1:
 * Add patch 1/3 to the series fixing the unallocated empty result issue.
 * In patch 2/3 (formerly 1/2) clamp negative extent to zero.


Mikael Morin (3):
  libgfortran: Don't skip allocation if size is zero [PR112412]
  libgfortran: Remove early return if extent is zero [PR112371]
  libgfortran: Remove empty array descriptor first dimension overwrite
[PR112371]

 gcc/testsuite/gfortran.dg/allocated_4.f90 | 195 +++
 gcc/testsuite/gfortran.dg/bound_10.f90| 207 
 gcc/testsuite/gfortran.dg/bound_11.f90| 588 ++
 libgfortran/generated/all_l1.c|   9 +-
 libgfortran/generated/all_l16.c   |   9 +-
 libgfortran/generated/all_l2.c|   9 +-
 libgfortran/generated/all_l4.c|   9 +-
 libgfortran/generated/all_l8.c|   9 +-
 libgfortran/generated/any_l1.c|   9 +-
 libgfortran/generated/any_l16.c   |   9 +-
 libgfortran/generated/any_l2.c|   9 +-
 libgfortran/generated/any_l4.c|   9 +-
 libgfortran/generated/any_l8.c|   9 +-
 libgfortran/generated/count_16_l.c|   9 +-
 libgfortran/generated/count_1_l.c |   9 +-
 libgfortran/generated/count_2_l.c |   9 +-
 libgfortran/generated/count_4_l.c |   9 +-
 libgfortran/generated/count_8_l.c |   9 +-
 libgfortran/generated/findloc1_c10.c  |  18 +-
 libgfortran/generated/findloc1_c16.c  |  18 +-
 libgfortran/generated/findloc1_c17.c  |  18 +-
 libgfortran/generated/findloc1_c4.c   |  18 +-
 libgfortran/generated/findloc1_c8.c   |  18 +-
 libgfortran/generated/findloc1_i1.c   |  18 +-
 libgfortran/generated/findloc1_i16.c  |  18 +-
 libgfortran/generated/findloc1_i2.c   |  18 +-
 libgfortran/generated/findloc1_i4.c   |  18 +-
 libgfortran/generated/findloc1_i8.c   |  18 +-
 libgfortran/generated/findloc1_r10.c  |  18 +-
 libgfortran/generated/findloc1_r16.c  |  18 +-
 libgfortran/generated/findloc1_r17.c  |  18 +-
 libgfortran/generated/findloc1_r4.c   |  18 +-
 libgfortran/generated/findloc1_r8.c   |  18 +-
 libgfortran/generated/findloc1_s1.c   |  18 +-
 libgfortran/generated/findloc1_s4.c   |  18 +-
 libgfortran/generated/iall_i1.c   |  30 +-
 libgfortran/generated/iall_i16.c  |  30 +-
 libgfortran/generated/iall_i2.c   |  30 +-
 libgfortran/generated/iall_i4.c   |  30 +-
 libgfortran/generated/iall_i8.c   |  30 +-
 libgfortran/generated/iany_i1.c   |  30 +-
 libgfortran/generated/iany_i16.c  |  30 +-
 libgfortran/generated/iany_i2.c   |  30 +-
 libgfortran/generated/iany_i4.c   |  30 +-
 libgfortran/generated/iany_i8.c   |  30 +-
 libgfortran/generated/iparity_i1.c|  30 +-
 libgfortran/generated/iparity_i16.c   |  30 +-
 libgfortran/generated/iparity_i2.c|  30 +-
 libgfortran/generated/iparity_i4.c|  30 +-
 libgfortran/generated/iparity_i8.c|  30 +-
 libgfortran/generated/maxloc1_16_i1.c |  30 +-
 libgfortran/generated/maxloc1_16_i16.c|  30 +-
 libgfortran/generated/maxloc1_16_i2.c |  30 +-
 libgfortran/generated/maxloc1_16_i4.c |  30 +-
 libgfortran/generated/maxloc1_16_i8.c |  30 +-
 libgfortran/generated/maxloc1_16_r10.c|  30 +-
 libgfortran/generated/maxloc1_16_r16.c|  30 +-
 libgfortran/generated/maxloc1_16_r17.c|  30 +-
 libgfortran/generated/maxloc1_16_r4.c |  30 +-
 libgfortran/generated/maxloc1_16_r8.c |  30 +-
 libgfortran/generated/maxloc1_16_s1.c |  30 +-
 libgfortran/generated/maxloc1_16_s4.c |  30 +-
 libgfortran/generated/maxloc1_4_i1.c  |  30 +-
 libgfortran/generated/maxloc1_4_i16.c |  30 +-
 libgfortran/generated/maxloc1_4_i2.c  |  30 +-
 libgfortran/generated/maxloc1_4_i4.c  |  30 +-
 libgfortran/generated/maxloc1_4_i8.c  |  30 +-
 libgfortran/generated/maxloc1_4_r10.c |  30 +-
 libgfortran/generated/maxloc1_4_r16.c |  30 +-
 libgfortran/generated/maxloc1_4_r17.c |  30 +-
 libgfortran/generated/maxloc1_4_r4.c  |  30 +-
 libgfortran/generated/maxloc1_4_r8.c  |  30 +-
 libgfortran/generated/maxloc1_4_s1.c  |  30 +-
 libgfortran/generated/maxloc1_4_s4.c  |  30 +-
 libgfortran/generated/maxloc1_8_i1.c  |  30 +-
 libgfortran/generated/maxloc1_8_i16.c |  30 +-
 libgfortran/generated/maxloc1_8_i2.c  |  30 +-
 libgfortran/generated/maxloc1_8_i4.c  |  30 +-
 libgfortran/generated/maxloc1_8_i8.c  |  30 +-
 libgfortran/generated/maxloc1_8_r10.c |  30

[PATCH v2 2/3] libgfortran: Remove early return if extent is zero [PR112371]

2023-11-07 Thread Mikael Morin

Remove the early return present in function templates for transformational
functions doing a (masked) reduction of an array along a dimension.
This early return, which triggered if the extent in the reduction dimension
was zero, was wrong because even if the reduction operation degenerates to
a constant value in that case, one has to loop anyway along the other
dimensions to initialize every element of the resulting array with that
constant value.  The case of negative extent (not sure whether it may happen
in practice) which was also early returning, is handled by clamping to zero.

The offending piece of code was present in several places, and this removes
them all.  Namely, the impacted m4 files are ifunction.m4 for regular
functions and types, ifunction-s.m4 for character minloc and maxloc, and
ifunction-s2.m4 for character minval and maxval.
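
The reasoning above can be illustrated with a minimal C sketch. It is not the libgfortran code: the function `sum_along_dim2` and its row-major layout are assumptions made for illustration only. It shows why a zero extent in the reduction dimension must not return early, since every element of the result still has to receive the reduction's constant value.

```c
#include <stddef.h>

/* Sum a rows x cols row-major array along its second dimension,
   writing one result per row.  Hypothetical helper for illustration.  */
void
sum_along_dim2 (const int *src, ptrdiff_t rows, ptrdiff_t cols, int *result)
{
  /* Clamp a negative extent to zero instead of returning early.  */
  if (cols < 0)
    cols = 0;

  /* An early "if (cols == 0) return;" here would leave result[] unset.
     Looping over the other dimension initializes every element.  */
  for (ptrdiff_t i = 0; i < rows; i++)
    {
      int acc = 0;  /* Value of the reduction over an empty row.  */
      for (ptrdiff_t j = 0; j < cols; j++)
        acc += src[i * cols + j];
      result[i] = acc;
    }
}
```

With `cols == 0` the inner loop never runs, so `src` is never read, but each `result[i]` is still set to 0.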

PR fortran/112371

libgfortran/ChangeLog:

* m4/ifunction.m4 (START_MASKED_ARRAY_FUNCTION): Remove early return if
extent is zero or less, and clamp negative value to zero.
* m4/ifunction-s.m4 (START_MASKED_ARRAY_FUNCTION): Ditto.
* m4/ifunction-s2.m4 (START_MASKED_ARRAY_FUNCTION): Ditto.
* generated/iall_i1.c: Regenerate.
* generated/iall_i16.c: Regenerate.
* generated/iall_i2.c: Regenerate.
* generated/iall_i4.c: Regenerate.
* generated/iall_i8.c: Regenerate.
* generated/iany_i1.c: Regenerate.
* generated/iany_i16.c: Regenerate.
* generated/iany_i2.c: Regenerate.
* generated/iany_i4.c: Regenerate.
* generated/iany_i8.c: Regenerate.
* generated/iparity_i1.c: Regenerate.
* generated/iparity_i16.c: Regenerate.
* generated/iparity_i2.c: Regenerate.
* generated/iparity_i4.c: Regenerate.
* generated/iparity_i8.c: Regenerate.
* generated/maxloc1_16_i1.c: Regenerate.
* generated/maxloc1_16_i16.c: Regenerate.
* generated/maxloc1_16_i2.c: Regenerate.
* generated/maxloc1_16_i4.c: Regenerate.
* generated/maxloc1_16_i8.c: Regenerate.
* generated/maxloc1_16_r10.c: Regenerate.
* generated/maxloc1_16_r16.c: Regenerate.
* generated/maxloc1_16_r17.c: Regenerate.
* generated/maxloc1_16_r4.c: Regenerate.
* generated/maxloc1_16_r8.c: Regenerate.
* generated/maxloc1_16_s1.c: Regenerate.
* generated/maxloc1_16_s4.c: Regenerate.
* generated/maxloc1_4_i1.c: Regenerate.
* generated/maxloc1_4_i16.c: Regenerate.
* generated/maxloc1_4_i2.c: Regenerate.
* generated/maxloc1_4_i4.c: Regenerate.
* generated/maxloc1_4_i8.c: Regenerate.
* generated/maxloc1_4_r10.c: Regenerate.
* generated/maxloc1_4_r16.c: Regenerate.
* generated/maxloc1_4_r17.c: Regenerate.
* generated/maxloc1_4_r4.c: Regenerate.
* generated/maxloc1_4_r8.c: Regenerate.
* generated/maxloc1_4_s1.c: Regenerate.
* generated/maxloc1_4_s4.c: Regenerate.
* generated/maxloc1_8_i1.c: Regenerate.
* generated/maxloc1_8_i16.c: Regenerate.
* generated/maxloc1_8_i2.c: Regenerate.
* generated/maxloc1_8_i4.c: Regenerate.
* generated/maxloc1_8_i8.c: Regenerate.
* generated/maxloc1_8_r10.c: Regenerate.
* generated/maxloc1_8_r16.c: Regenerate.
* generated/maxloc1_8_r17.c: Regenerate.
* generated/maxloc1_8_r4.c: Regenerate.
* generated/maxloc1_8_r8.c: Regenerate.
* generated/maxloc1_8_s1.c: Regenerate.
* generated/maxloc1_8_s4.c: Regenerate.
* generated/maxval1_s1.c: Regenerate.
* generated/maxval1_s4.c: Regenerate.
* generated/maxval_i1.c: Regenerate.
* generated/maxval_i16.c: Regenerate.
* generated/maxval_i2.c: Regenerate.
* generated/maxval_i4.c: Regenerate.
* generated/maxval_i8.c: Regenerate.
* generated/maxval_r10.c: Regenerate.
* generated/maxval_r16.c: Regenerate.
* generated/maxval_r17.c: Regenerate.
* generated/maxval_r4.c: Regenerate.
* generated/maxval_r8.c: Regenerate.
* generated/minloc1_16_i1.c: Regenerate.
* generated/minloc1_16_i16.c: Regenerate.
* generated/minloc1_16_i2.c: Regenerate.
* generated/minloc1_16_i4.c: Regenerate.
* generated/minloc1_16_i8.c: Regenerate.
* generated/minloc1_16_r10.c: Regenerate.
* generated/minloc1_16_r16.c: Regenerate.
* generated/minloc1_16_r17.c: Regenerate.
* generated/minloc1_16_r4.c: Regenerate.
* generated/minloc1_16_r8.c: Regenerate.
* generated/minloc1_16_s1.c: Regenerate.
* generated/minloc1_16_s4.c: Regenerate.
* generated/minloc1_4_i1.c: Regenerate.
* generated/minloc1_4_i16.c: Regenerate.
* generated/minloc1_4_i2.c: Regenerate.
* generated/minloc1_4_i4.c: Regenerate.
* generated/minloc1_4_i8.c: Regenerate.
* gener

[PATCH 1/2] libatomic: atomic_16.S: Improve ENTRY, END and ALIAS macro interface

2023-11-07 Thread Victor Do Nascimento

The introduction of further architectural-feature dependent ifuncs
for AArch64 makes hard-coding ifunc `_i' suffixes to functions
cumbersome to work with.  It is awkward to remember which ifunc maps
onto which arch feature and makes the code harder to maintain when new
ifuncs are added and their suffixes possibly altered.

This patch uses pre-processor `#define' statements to map each suffix to
a descriptive feature name macro, for example:

  #define LSE2 _i1

and reconstructs function names with the pre-processor's token
concatenation feature, such that for `MACRO(name)', we would now have
`MACRO(name, feature)' and in the macro definition body we replace
`name` with `name##feature`.
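
As a rough illustration of the token-concatenation scheme described above, here is a small C sketch. The macro names `FN`, `FN1` and `DIRECT` and the function bodies are hypothetical; only `CORE`, `LSE2` and the `_i1` suffix come from the patch. It also shows why a two-level macro is useful: the feature macro has to be expanded before `##` is applied.

```c
/* Feature names map to ifunc suffixes; CORE maps to no suffix.  */
#define CORE
#define LSE2 _i1

/* Two-level form: FN expands its arguments first, FN1 then pastes.  */
#define FN(name, feat)  FN1(name, feat)
#define FN1(name, feat) name##feat

/* One-level form: pastes the unexpanded token LSE2 itself.  */
#define DIRECT(name, feat) name##feat

int FN (load_16, CORE) (void) { return 0; }      /* defines load_16 */
int FN (load_16, LSE2) (void) { return 1; }      /* defines load_16_i1 */
int DIRECT (load_16, LSE2) (void) { return 2; }  /* defines load_16LSE2 */
```

Operands of `##` are not macro-expanded, so the one-level form yields `load_16LSE2` rather than `load_16_i1`.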

libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S (CORE): New macro.
(LSE2): Likewise.
(ENTRY): Modify macro to take in `arch' argument.
(END): Likewise.
(ALIAS): Likewise.
(ENTRY1): New macro.
(END1): Likewise.
(ALIAS1): Likewise.
---
 libatomic/config/linux/aarch64/atomic_16.S | 147 +++--
 1 file changed, 79 insertions(+), 68 deletions(-)

diff --git a/libatomic/config/linux/aarch64/atomic_16.S 
b/libatomic/config/linux/aarch64/atomic_16.S
index 0485c284117..3f6225830e6 100644
--- a/libatomic/config/linux/aarch64/atomic_16.S
+++ b/libatomic/config/linux/aarch64/atomic_16.S
@@ -39,22 +39,34 @@
 
.arch   armv8-a+lse
 
-#define ENTRY(name)\
-   .global name;   \
-   .hidden name;   \
-   .type name,%function;   \
-   .p2align 4; \
-name:  \
-   .cfi_startproc; \
+#define ENTRY(name, feat)  \
+   ENTRY1(name, feat)
+
+#define ENTRY1(name, feat) \
+   .global name##feat; \
+   .hidden name##feat; \
+   .type name##feat,%function; \
+   .p2align 4; \
+name##feat:\
+   .cfi_startproc; \
hint    34  // bti c
 
-#define END(name)  \
-   .cfi_endproc;   \
-   .size name, .-name;
+#define END(name, feat)\
+   END1(name, feat)
 
-#define ALIAS(alias,name)  \
-   .global alias;  \
-   .set alias, name;
+#define END1(name, feat)   \
+   .cfi_endproc;   \
+   .size name##feat, .-name##feat;
+
+#define ALIAS(alias, from, to) \
+   ALIAS1(alias,from,to)
+
+#define ALIAS1(alias, from, to)\
+   .global alias##from;\
+   .set alias##from, alias##to;
+
+#define CORE
+#define LSE2   _i1
 
 #define res0 x0
 #define res1 x1
@@ -89,7 +101,7 @@ name:\
 #define SEQ_CST 5
 
 
-ENTRY (libat_load_16)
+ENTRY (libat_load_16, CORE)
mov x5, x0
cbnz    w1, 2f
 
@@ -104,10 +116,10 @@ ENTRY (libat_load_16)
stxp    w4, res0, res1, [x5]
cbnz    w4, 2b
ret
-END (libat_load_16)
+END (libat_load_16, CORE)
 
 
-ENTRY (libat_load_16_i1)
+ENTRY (libat_load_16, LSE2)
cbnz    w1, 1f
 
/* RELAXED.  */
@@ -127,10 +139,10 @@ ENTRY (libat_load_16_i1)
ldp res0, res1, [x0]
dmb ishld
ret
-END (libat_load_16_i1)
+END (libat_load_16, LSE2)
 
 
-ENTRY (libat_store_16)
+ENTRY (libat_store_16, CORE)
cbnz    w4, 2f
 
/* RELAXED.  */
@@ -144,10 +156,10 @@ ENTRY (libat_store_16)
stlxp   w4, in0, in1, [x0]
cbnz    w4, 2b
ret
-END (libat_store_16)
+END (libat_store_16, CORE)
 
 
-ENTRY (libat_store_16_i1)
+ENTRY (libat_store_16, LSE2)
cbnz    w4, 1f
 
/* RELAXED.  */
@@ -159,10 +171,10 @@ ENTRY (libat_store_16_i1)
stlxp   w4, in0, in1, [x0]
cbnz    w4, 1b
ret
-END (libat_store_16_i1)
+END (libat_store_16, LSE2)
 
 
-ENTRY (libat_exchange_16)
+ENTRY (libat_exchange_16, CORE)
mov x5, x0
cbnz    w4, 2f
 
@@ -186,10 +198,10 @@ ENTRY (libat_exchange_16)
stlxp   w4, in0, in1, [x5]
cbnz    w4, 4b
ret
-END (libat_exchange_16)
+END (libat_exchange_16, CORE)
 
 
-ENTRY (libat_compare_exchange_16)
+ENTRY (libat_compare_exchange_16, CORE)
ldp exp0, exp1, [x1]
cbz w4, 3f
cmp w4, RELEASE
@@ -228,10 +240,10 @@ ENTRY (libat_compare_exchange_16)
cbnz    w4, 4b
mov x0, 1
ret
-END (libat_compare_exchange_16)
+END (libat_compare_exchange_16, CORE)
 
 
-ENTRY (libat_compare_exchange_16_i1)
+ENTRY (libat_compare_exchange_16, LSE2)
ldp exp0, exp1, [x1]
mov tmp0, exp0
mov tmp1, exp1
@@ -264,10 +276,10 @@ ENTRY (libat_compare_exchange_16_i1)
/* ACQ_REL/SEQ_CST.  */
 4: caspal  exp0, exp1, in0, in1, [x0]
b   0b
-END (libat_compare_exchange_16_i1)
+END (libat_compare_exchange_16, LSE2)
 
 
-ENTRY (libat_fetch_add_16)
+ENTRY (libat_fetch_add_16, CORE)

Re: testsuite: introduce hostedlib effective target

2023-11-07 Thread Jonathan Wakely

On Tue, 7 Nov 2023 at 10:18, Jonathan Wakely  wrote:
>
> On Tue, 7 Nov 2023 at 10:04, Alexandre Oliva  wrote:
> >
> > [adding libstdc++@]
> >
> > On Nov  5, 2023, Mike Stump  wrote:
> >
> > > Ick.
> >
> > Indeed ;-)
> >
> > > I wish there were fewer changed lines and not 1 per test
> > > case. It feels like we've painted ourselves into a corner.
> >
> > The libstdc++ testsuite took a different approach, detecting missing
> > headers (and libraries?) at error pruning time, and xfailing the tests,
> > which seems to be more in line with what you are looking for.
> >
> > That approach, though more expedient, seems more fragile to me, in that
> > an actual bug that caused headers to go missing would cause tests to be
> > silently skipped rather than fail.
>
> I don't think we XFAIL based on missing headers. We XFAIL based on a
> specific #error message in certain headers.
>
> If a header goes missing, we'll still XFAIL.
>
> >
> > I expect the set of headers, and thus of affected tests, won't be very
> > dynamic, so it's kind of a one-shot change.
> >
> > Of course new tests might be added that rely on such headers, and would
> > likely go unnoticed until someone tries them on a non-hosted libstdc++.
>
> Since GCC 13 you don't need to build a non-hosted libstdc++ to test
> it, you can just add -ffreestanding to the runtestflags.
>
> > We could alleviate this if libstdc++ headers that are not installed on
> > hosted systems issued a warning (conditional on some macro defined by
> > the testsuite, say -D_GLIBCXX_WARN_HOSTED_ONLY).
>
> That's exactly what happens (except #error not #warning) when you
> compile with -ffreestanding.
>
> >  For tests aimed
> > exclusively at hosted libstdc++, we'd then use a dg directive that both
> > implied this requirement, and changed the macro definition to suppress
> > the warning.  Then anyone who added a testcase that included hosted
> > headers without indicating its hostedlib requirement would get a fail
> > even when testing with a hosted libstdc++.
>
> I don't think we need to add checks for a new macro and then use that
> when testing, you can just test with -ffreestanding instead. This
> already works today.

Ah, reading back in the thread for the context I missed, I see that
you're specifically testing a --disable-hosted-libstdcxx build. In
that case some headers really will be absent, not just
present-with-#error. But I am still not concerned about failing to
notice if a header goes unintentionally missing, because the libstdc++
testsuite will still notice that.

We don't prune based on "no such header" errors, so would still get
FAILs for those tests that depend on headers which are supposed to be
present for freestanding.

[PATCH 2/2] libatomic: Enable LSE128 128-bit atomics for armv9.4-a

2023-11-07 Thread Victor Do Nascimento

The armv9.4-a architectural revision adds three new atomic operations
associated with the LSE128 feature:

  * LDCLRP - Atomic AND NOT (bitclear) of a location with 128-bit
  value held in a pair of registers, with original data loaded into
  the same 2 registers.
  * LDSETP - Atomic OR (bitset) of a location with 128-bit value held
  in a pair of registers, with original data loaded into the same 2
  registers.
  * SWPP - Atomic swap of one 128-bit value with 128-bit value held
  in a pair of registers.
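
The data-flow of the three operations listed above can be modelled in plain C. This is a non-atomic sketch for illustration only: the type `pair128` and the `model_*` functions are hypothetical names, and the real instructions perform each update as a single atomic access.

```c
#include <stdint.h>

/* A 128-bit value held as a pair of 64-bit halves.  */
typedef struct { uint64_t lo, hi; } pair128;

/* LDCLRP model: memory &= ~regs; regs receive the original memory value.  */
void
model_ldclrp (pair128 *mem, pair128 *regs)
{
  pair128 old = *mem;
  mem->lo &= ~regs->lo;
  mem->hi &= ~regs->hi;
  *regs = old;
}

/* LDSETP model: memory |= regs; regs receive the original memory value.  */
void
model_ldsetp (pair128 *mem, pair128 *regs)
{
  pair128 old = *mem;
  mem->lo |= regs->lo;
  mem->hi |= regs->hi;
  *regs = old;
}

/* SWPP model: exchange the memory value with the register pair.  */
void
model_swpp (pair128 *mem, pair128 *regs)
{
  pair128 old = *mem;
  *mem = *regs;
  *regs = old;
}
```

In each case the register pair serves as both the input operand and the destination for the previous memory contents.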

This patch adds the logic required to make use of these when the
architectural feature is present and a suitable assembler available.

In order to do this, the following changes are made:

  1. Add a configure-time check to check for LSE128 support in the
  assembler.
  2. Edit host-config.h so that when N == 16, nifunc = 2.
  3. Where available due to LSE128, implement the second ifunc, making
  use of the novel instructions.
  4. For atomic functions unable to make use of these new
  instructions, define a new alias which causes the _i1 function
  variant to point ahead to the corresponding _i2 implementation.
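
Steps 2 to 4 can be pictured with a simplified C model. This is a sketch under stated assumptions, not libatomic's actual ifunc machinery: `select_impl`, the `op_a_*` and `op_b_*` functions and their return values are all hypothetical. It assumes `_i1` is the LSE128 variant and `_i2` the LSE2 variant, as step 4 suggests.

```c
typedef int (*impl_fn) (void);

/* An operation that has an LSE128-specific implementation.  */
int op_a_core (void) { return 0; }
int op_a_i1 (void) { return 1; }  /* stands in for the LSE128 code */
int op_a_i2 (void) { return 2; }  /* stands in for the LSE2 code */

/* An operation with no LSE128-specific implementation: the _i1 name
   simply refers to the _i2 code, mimicking the alias in step 4.  */
int op_b_core (void) { return 0; }
int op_b_i2 (void) { return 2; }
#define op_b_i1 op_b_i2

/* With two alternates (nifunc = 2), try _i1 first, then _i2, then the
   baseline implementation.  */
impl_fn
select_impl (int has_lse128, int has_lse2,
             impl_fn i1, impl_fn i2, impl_fn core)
{
  if (has_lse128)
    return i1;
  if (has_lse2)
    return i2;
  return core;
}
```

Because of the alias, a CPU with LSE128 still ends up running the LSE2 code for operations that gain nothing from the new instructions.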

libatomic/ChangeLog:

* Makefile.am (AM_CPPFLAGS): Add conditional setting of
-DHAVE_FEAT_LSE128.
* acinclude.m4 (LIBAT_TEST_FEAT_LSE128): New.
* config/linux/aarch64/atomic_16.S (LSE128): New macro
definition.
(libat_exchange_16): New LSE128 variant.
(libat_fetch_or_16): Likewise.
(libat_or_fetch_16): Likewise.
(libat_fetch_and_16): Likewise.
(libat_and_fetch_16): Likewise.
* config/linux/aarch64/host-config.h (IFUNC_COND_2): New.
(IFUNC_NCOND): Add operand size checking.
(has_lse2): Renamed from `ifunc1`.
(has_lse128): New.
(HAS_LSE128): Likewise.
* libatomic/configure.ac: Add call to LIBAT_TEST_FEAT_LSE128.
* configure (ac_subst_vars): Regenerated via autoreconf.
* libatomic/Makefile.in: Likewise.
* libatomic/auto-config.h.in: Likewise.
---
 libatomic/Makefile.am|   3 +
 libatomic/Makefile.in|   1 +
 libatomic/acinclude.m4   |  19 +++
 libatomic/auto-config.h.in   |   3 +
 libatomic/config/linux/aarch64/atomic_16.S   | 170 ++-
 libatomic/config/linux/aarch64/host-config.h |  23 ++-
 libatomic/configure  |  59 ++-
 libatomic/configure.ac   |   1 +
 8 files changed, 271 insertions(+), 8 deletions(-)

diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am
index c0b8dea5037..24e843db67d 100644
--- a/libatomic/Makefile.am
+++ b/libatomic/Makefile.am
@@ -130,6 +130,9 @@ libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix 
_$(s)_.lo,$(SIZEOBJS)))
 ## On a target-specific basis, include alternates to be selected by IFUNC.
 if HAVE_IFUNC
 if ARCH_AARCH64_LINUX
+if ARCH_AARCH64_HAVE_LSE128
+AM_CPPFLAGS = -DHAVE_FEAT_LSE128
+endif
 IFUNC_OPTIONS   = -march=armv8-a+lse
 libatomic_la_LIBADD += $(foreach s,$(SIZES),$(addsuffix 
_$(s)_1_.lo,$(SIZEOBJS)))
 libatomic_la_SOURCES += atomic_16.S
diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index dc2330b91fd..cd48fa21334 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -452,6 +452,7 @@ M_SRC = $(firstword $(filter %/$(M_FILE), $(all_c_files)))
 libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix \
_$(s)_.lo,$(SIZEOBJS))) $(am__append_1) $(am__append_3) \
$(am__append_4) $(am__append_5)
+@ARCH_AARCH64_HAVE_LSE128_TRUE@@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@AM_CPPFLAGS
 = -DHAVE_FEAT_LSE128
 @ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv8-a+lse
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a+fp 
-DHAVE_KERNEL64
 @ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=i586
diff --git a/libatomic/acinclude.m4 b/libatomic/acinclude.m4
index f35ab5b60a5..4197db8f404 100644
--- a/libatomic/acinclude.m4
+++ b/libatomic/acinclude.m4
@@ -83,6 +83,25 @@ AC_DEFUN([LIBAT_TEST_ATOMIC_BUILTIN],[
   ])
 ])
 
+dnl
+dnl Test if the host assembler supports armv9.4-a LSE128 insns.
+dnl
+AC_DEFUN([LIBAT_TEST_FEAT_LSE128],[
+  AC_CACHE_CHECK([for armv9.4-a LSE128 insn support],
+[libat_cv_have_feat_lse128],[
+AC_LANG_CONFTEST([AC_LANG_PROGRAM([],[asm(".arch armv9-a+lse128")])])
+if AC_TRY_EVAL(ac_link); then
+  eval libat_cv_have_feat_lse128=yes
+else
+  eval libat_cv_have_feat_lse128=no
+fi
+rm -f conftest*
+  ])
+  LIBAT_DEFINE_YESNO([HAVE_FEAT_LSE128], [$libat_cv_have_feat_lse128],
+   [Have LSE128 support for 16 byte integers.])
+  AM_CONDITIONAL([ARCH_AARCH64_HAVE_LSE128], [test x$libat_cv_have_feat_lse128 
= xyes])
+])
+
 dnl
 dnl Test if we have __atomic_load and __atomic_store for mode $1, size $2
 dnl
diff --git a/libatomic/auto-config.h.in b/libatomic/auto-config.h.in
index ab3424a759e..7c78933b07d 100644
--- a/libatomic/auto-config

[PATCH 0/2] Libatomic: Add LSE128 atomics support for AArch64

2023-11-07 Thread Victor Do Nascimento

Building upon Wilco Dijkstra's work on AArch64 128-bit atomics for
Libatomic, namely the patches from [1] and [2],  this patch series
extends the library's  capabilities to dynamically select and emit
Armv9.4-a LSE128 implementations of atomic operations via ifuncs at
run-time whenever architectural support is present.

Regression tested on aarch64-linux-gnu target with LSE128-support.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620529.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626358.html

Victor Do Nascimento (2):
  libatomic: atomic_16.S: Improve ENTRY, END and ALIAS macro interface
  libatomic: Enable LSE128 128-bit atomics for armv9.4-a

 libatomic/Makefile.am|   3 +
 libatomic/Makefile.in|   1 +
 libatomic/acinclude.m4   |  19 ++
 libatomic/auto-config.h.in   |   3 +
 libatomic/config/linux/aarch64/atomic_16.S   | 315 ++-
 libatomic/config/linux/aarch64/host-config.h |  23 +-
 libatomic/configure  |  59 +++-
 libatomic/configure.ac   |   1 +
 8 files changed, 349 insertions(+), 75 deletions(-)

-- 
2.41.0

[PATCH 2/5] aarch64: Add support for GCS system registers with the +gcs modifier

2023-11-07 Thread Victor Do Nascimento

Given the introduction of system registers associated with the Guarded
Control Stack extension to Armv9.4-a in Binutils and their reliance on
the `+gcs' modifier, we implement the necessary changes in GCC to
allow for them to be recognized by the compiler.

gcc/ChangeLog:

* config/aarch64/aarch64-option-extensions.def (gcs): New.
* config/aarch64/aarch64.h (AARCH64_ISA_GCS): New.
(TARGET_GCS): Likewise.
* doc/invoke.texi (AArch64 Options): Describe GCS.
---
 gcc/config/aarch64/aarch64-option-extensions.def | 2 ++
 gcc/config/aarch64/aarch64.h | 6 ++
 gcc/doc/invoke.texi  | 2 ++
 3 files changed, 10 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-option-extensions.def 
b/gcc/config/aarch64/aarch64-option-extensions.def
index da31f7c32d1..e72c039b612 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -155,4 +155,6 @@ AARCH64_OPT_EXTENSION("d128", D128, (), (), (), "d128")
 
 AARCH64_OPT_EXTENSION("the", THE, (), (), (), "the")
 
+AARCH64_OPT_EXTENSION("gcs", GCS, (), (), (), "gcs")
+
 #undef AARCH64_OPT_EXTENSION
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 1b3c800ec89..69ef54553d7 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -230,6 +230,7 @@ enum class aarch64_feature : unsigned char {
 #define AARCH64_ISA_CSSC  (aarch64_isa_flags & AARCH64_FL_CSSC)
 #define AARCH64_ISA_D128  (aarch64_isa_flags & AARCH64_FL_D128)
 #define AARCH64_ISA_THE   (aarch64_isa_flags & AARCH64_FL_THE)
+#define AARCH64_ISA_GCS   (aarch64_isa_flags & AARCH64_FL_GCS)
 
 /* AARCH64_FL options necessary for system register implementation.  */
 
@@ -403,6 +404,11 @@ enum class aarch64_feature : unsigned char {
 enabled through +the.  */
 #define TARGET_THE (AARCH64_ISA_THE)
 
+/*  Armv9.4-A Guarded Control Stack extension system registers are
+enabled through +gcs.  */
+#define TARGET_GCS (AARCH64_ISA_GCS)
+
+
 /* Standard register usage.  */
 
 /* 31 64-bit general purpose registers R0-R30:
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 88327ce9681..88ee1fdb524 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -21032,6 +21032,8 @@ Enable the Pointer Authentication Extension.
 Enable the Common Short Sequence Compression instructions.
 @item d128
 Enable support for 128-bit system register read/write instructions.
+@item gcs
+Enable support for Armv9.4-a Guarded Control Stack extension.
 @item the
 Enable support for Armv8.9-a/9.4-a translation hardening extension.
 
-- 
2.41.0

[PATCH 3/5] aarch64: Sync `aarch64-sys-regs.def' with Binutils.

2023-11-07 Thread Victor Do Nascimento

This patch updates `aarch64-sys-regs.def', bringing it into sync with
the Binutils source.

gcc/ChangeLog:

* config/aarch64/aarch64-sys-regs.def (par_el1): New.
(rcwmask_el1): Likewise.
(rcwsmask_el1): Likewise.
(ttbr0_el1): Likewise.
(ttbr0_el12): Likewise.
(ttbr0_el2): Likewise.
(ttbr1_el1): Likewise.
(ttbr1_el12): Likewise.
(ttbr1_el2): Likewise.
(vttbr_el2): Likewise.
(gcspr_el0): Likewise.
(gcspr_el1): Likewise.
(gcspr_el12): Likewise.
(gcspr_el2): Likewise.
(gcspr_el3): Likewise.
(gcscre0_el1): Likewise.
(gcscr_el1): Likewise.
(gcscr_el12): Likewise.
(gcscr_el2): Likewise.
(gcscr_el3): Likewise.
---
 gcc/config/aarch64/aarch64-sys-regs.def | 30 +
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sys-regs.def 
b/gcc/config/aarch64/aarch64-sys-regs.def
index d24a2455503..96bdadb0b0f 100644
--- a/gcc/config/aarch64/aarch64-sys-regs.def
+++ b/gcc/config/aarch64/aarch64-sys-regs.def
@@ -419,6 +419,16 @@
   SYSREG ("fpcr",  CPENC (3,3,4,4,0),  0,  
AARCH64_NO_FEATURES)
   SYSREG ("fpexc32_el2",   CPENC (3,4,5,3,0),  0,  
AARCH64_NO_FEATURES)
   SYSREG ("fpsr",  CPENC (3,3,4,4,1),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("gcspr_el0", CPENC (3,3,2,5,1),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
+  SYSREG ("gcspr_el1", CPENC (3,0,2,5,1),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
+  SYSREG ("gcspr_el2", CPENC (3,4,2,5,1),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
+  SYSREG ("gcspr_el12",CPENC (3,5,2,5,1),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
+  SYSREG ("gcspr_el3", CPENC (3,6,2,5,1),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
+  SYSREG ("gcscre0_el1",   CPENC (3,0,2,5,2),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
+  SYSREG ("gcscr_el1", CPENC (3,0,2,5,0),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
+  SYSREG ("gcscr_el2", CPENC (3,4,2,5,0),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
+  SYSREG ("gcscr_el12",CPENC (3,5,2,5,0),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
+  SYSREG ("gcscr_el3", CPENC (3,6,2,5,0),  F_ARCHEXT,  
AARCH64_FEATURE (GCS))
   SYSREG ("gcr_el1",   CPENC (3,0,1,0,6),  F_ARCHEXT,  
AARCH64_FEATURE (MEMTAG))
   SYSREG ("gmid_el1",  CPENC (3,1,0,0,4),  F_REG_READ|F_ARCHEXT,   
AARCH64_FEATURE (MEMTAG))
   SYSREG ("gpccr_el3", CPENC (3,6,2,1,6),  0,  
AARCH64_NO_FEATURES)
@@ -584,7 +594,7 @@
   SYSREG ("oslar_el1", CPENC (2,0,1,0,4),  F_REG_WRITE,
AARCH64_NO_FEATURES)
   SYSREG ("oslsr_el1", CPENC (2,0,1,1,4),  F_REG_READ, 
AARCH64_NO_FEATURES)
   SYSREG ("pan",   CPENC (3,0,4,2,3),  F_ARCHEXT,  
AARCH64_FEATURE (PAN))
-  SYSREG ("par_el1",   CPENC (3,0,7,4,0),  0,  
AARCH64_NO_FEATURES)
+  SYSREG ("par_el1",   CPENC (3,0,7,4,0),  F_REG_128,  
AARCH64_NO_FEATURES)
   SYSREG ("pmbidr_el1",CPENC (3,0,9,10,7), 
F_REG_READ|F_ARCHEXT,   AARCH64_FEATURE (PROFILE))
   SYSREG ("pmblimitr_el1", CPENC (3,0,9,10,0), F_ARCHEXT,  
AARCH64_FEATURE (PROFILE))
   SYSREG ("pmbptr_el1",CPENC (3,0,9,10,1), F_ARCHEXT,  
AARCH64_FEATURE (PROFILE))
@@ -746,6 +756,8 @@
   SYSREG ("prlar_el2", CPENC (3,4,6,8,1),  F_ARCHEXT,  
AARCH64_FEATURE (V8R))
   SYSREG ("prselr_el1",CPENC (3,0,6,2,1),  F_ARCHEXT,  
AARCH64_FEATURE (V8R))
   SYSREG ("prselr_el2",CPENC (3,4,6,2,1),  F_ARCHEXT,  
AARCH64_FEATURE (V8R))
+  SYSREG ("rcwmask_el1",   CPENC (3,0,13,0,6), F_ARCHEXT|F_REG_128,
AARCH64_FEATURE (THE))
+  SYSREG ("rcwsmask_el1",  CPENC (3,0,13,0,3), F_ARCHEXT|F_REG_128,
AARCH64_FEATURE (THE))
   SYSREG ("revidr_el1",CPENC (3,0,0,0,6),  F_REG_READ, 
AARCH64_NO_FEATURES)
   SYSREG ("rgsr_el1",  CPENC (3,0,1,0,5),  F_ARCHEXT,  
AARCH64_FEATURE (MEMTAG))
   SYSREG ("rmr_el1",   CPENC (3,0,12,0,2), 0,  
AARCH64_NO_FEATURES)
@@ -1034,13 +1046,13 @@
   SYSREG ("trfcr_el1", CPENC (3,0,1,2,1),  F_ARCHEXT,  
AARCH64_FEATURE (V8_4A))
   SYSREG ("trfcr_el12",CPENC (3,5,1,2,1),  F_ARCHEXT,  
AARCH64_FEATURE (V8_4A))
   SYSREG ("trfcr_el2", CPENC (3,4,1,2,1),  F_ARCHEXT,  
AARCH64_FEATURE (V8_4A))
-  SYSREG ("ttbr0_el1", CPENC (3,0,2,0,0),  0,

[PATCH 1/5] aarch64: Add march flags for +the and +d128 arch extensions

2023-11-07 Thread Victor Do Nascimento

Given the introduction of optional 128-bit page table descriptor and
translation hardening extension support with the Armv9.4-a
architecture, this introduces the relevant flags to enable the reading
and writing of 128-bit system registers.

The `+d128' -march modifier enables the use of the following ACLE
builtin functions:

  * __uint128_t __arm_rsr128(const char *special_register);
  * void __arm_wsr128(const char *special_register, __uint128_t value);

and defines the __ARM_FEATURE_SYSREG128 macro to 1.

Finally, the `rcwmask_el1' and `rcwsmask_el1' 128-bit system register
implementations are also reliant on the enablement of the `+the' flag,
which is thus also implemented in this patch.

gcc/ChangeLog:

* config/aarch64/aarch64-arches.def (armv8.9-a): New.
(armv9.4-a): Likewise.
* config/aarch64/aarch64-option-extensions.def (d128): Likewise.
(the): Likewise.
* config/aarch64/aarch64.h (AARCH64_ISA_V9_4A): Likewise.
(AARCH64_ISA_V8_9A): Likewise.
(TARGET_ARMV9_4): Likewise.
(AARCH64_ISA_D128): Likewise.
(AARCH64_ISA_THE): Likewise.
(TARGET_D128): Likewise.
* doc/invoke.texi (AArch64 Options): Document new -march flags
and extensions.
---
 gcc/config/aarch64/aarch64-arches.def|  2 ++
 gcc/config/aarch64/aarch64-c.cc  |  1 +
 gcc/config/aarch64/aarch64-option-extensions.def |  4 
 gcc/config/aarch64/aarch64.h | 15 +++
 gcc/doc/invoke.texi  |  6 ++
 5 files changed, 28 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-arches.def 
b/gcc/config/aarch64/aarch64-arches.def
index 7ae92aa8e98..becccb801d0 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -39,10 +39,12 @@ AARCH64_ARCH("armv8.5-a", generic,   V8_5A, 8,  
(V8_4A, SB, SSBS, PR
 AARCH64_ARCH("armv8.6-a", generic,   V8_6A, 8,  (V8_5A, I8MM, 
BF16))
 AARCH64_ARCH("armv8.7-a", generic,   V8_7A, 8,  (V8_6A, LS64))
 AARCH64_ARCH("armv8.8-a", generic,   V8_8A, 8,  (V8_7A, MOPS))
+AARCH64_ARCH("armv8.9-a", generic,   V8_9A, 8,  (V8_8A))
 AARCH64_ARCH("armv8-r",   generic,   V8R  , 8,  (V8_4A))
 AARCH64_ARCH("armv9-a",   generic,   V9A  , 9,  (V8_5A, SVE2))
 AARCH64_ARCH("armv9.1-a", generic,   V9_1A, 9,  (V8_6A, V9A))
 AARCH64_ARCH("armv9.2-a", generic,   V9_2A, 9,  (V8_7A, V9_1A))
 AARCH64_ARCH("armv9.3-a", generic,   V9_3A, 9,  (V8_8A, V9_2A))
+AARCH64_ARCH("armv9.4-a", generic,   V9_4A, 9,  (V8_9A, V9_3A))
 
 #undef AARCH64_ARCH
diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
index be8b7236cf9..cacf8e8ed25 100644
--- a/gcc/config/aarch64/aarch64-c.cc
+++ b/gcc/config/aarch64/aarch64-c.cc
@@ -206,6 +206,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
   aarch64_def_or_undef (TARGET_LS64,
"__ARM_FEATURE_LS64", pfile);
   aarch64_def_or_undef (AARCH64_ISA_RCPC, "__ARM_FEATURE_RCPC", pfile);
+  aarch64_def_or_undef (TARGET_D128, "__ARM_FEATURE_SYSREG128", pfile);
 
   /* Not for ACLE, but required to keep "float.h" correct if we switch
  target between implementations that do or do not support ARMv8.2-A
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def 
b/gcc/config/aarch64/aarch64-option-extensions.def
index 825f3bf7758..da31f7c32d1 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -151,4 +151,8 @@ AARCH64_OPT_EXTENSION("mops", MOPS, (), (), (), "")
 
 AARCH64_OPT_EXTENSION("cssc", CSSC, (), (), (), "cssc")
 
+AARCH64_OPT_EXTENSION("d128", D128, (), (), (), "d128")
+
+AARCH64_OPT_EXTENSION("the", THE, (), (), (), "the")
+
 #undef AARCH64_OPT_EXTENSION
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 84e6f79ca83..1b3c800ec89 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -219,13 +219,17 @@ enum class aarch64_feature : unsigned char {
 #define AARCH64_ISA_PAUTH (aarch64_isa_flags & AARCH64_FL_PAUTH)
 #define AARCH64_ISA_V8_7A (aarch64_isa_flags & AARCH64_FL_V8_7A)
 #define AARCH64_ISA_V8_8A (aarch64_isa_flags & AARCH64_FL_V8_8A)
+#define AARCH64_ISA_V8_9A (aarch64_isa_flags & AARCH64_FL_V8_9A)
 #define AARCH64_ISA_V9A   (aarch64_isa_flags & AARCH64_FL_V9A)
 #define AARCH64_ISA_V9_1A  (aarch64_isa_flags & AARCH64_FL_V9_1A)
 #define AARCH64_ISA_V9_2A  (aarch64_isa_flags & AARCH64_FL_V9_2A)
 #define AARCH64_ISA_V9_3A  (aarch64_isa_flags & AARCH64_FL_V9_3A)
+#define AARCH64_ISA_V9_4A  (aarch64_isa_flags & AARCH64_FL_V9_4A)
 #define AARCH64_ISA_MOPS  (aarch64_isa_flags & AARCH64_FL_MOPS)
 #define AARCH64_ISA_LS64  (aarch64_isa_flags & AARCH64_FL_LS64)
 #define AARCH64_ISA_CSSC  (aarch64_

[PATCH 0/5] aarch64: Add Armv9.4-a 128-bit system-register read/write support

2023-11-07 Thread Victor Do Nascimento

Given the introduction of optional 128-bit page table descriptor and
translation hardening extension support with the Armv9.4-a
architecture, this patch series introduces the necessary changes to
the aarch64-specific builtin code to enable the reading and writing of
128-bit system registers.  In so doing, the following ACLE builtins and
feature macro are made available to the compiler:

  * __uint128_t __arm_rsr128(const char *special_register);
  * void __arm_wsr128(const char *special_register, __uint128_t value);
  * __ARM_FEATURE_SYSREG128.
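
A usage sketch, under assumptions: the builtins are only reachable when the feature macro is defined (for example with -march=armv9.4-a+d128), so the calls are guarded. The helpers `split128', `join128', `read_par' and `write_par' are hypothetical names introduced here for illustration; reading or writing par_el1 also requires a suitably privileged execution context.

```c
#include <stdint.h>

#ifdef __ARM_FEATURE_SYSREG128
#include <arm_acle.h>
#endif

/* Split a 128-bit value into 64-bit halves.  The expected assembly in
   the rwsr.c test moves the value through the x0/x1 register pair.  */
static void
split128 (unsigned __int128 v, uint64_t *lo, uint64_t *hi)
{
  *lo = (uint64_t) v;
  *hi = (uint64_t) (v >> 64);
}

/* Rebuild a 128-bit value from its two 64-bit halves.  */
static unsigned __int128
join128 (uint64_t lo, uint64_t hi)
{
  return ((unsigned __int128) hi << 64) | lo;
}

#ifdef __ARM_FEATURE_SYSREG128
/* Only compiled when the +d128 extension is enabled.  */
static unsigned __int128
read_par (void)
{
  return __arm_rsr128 ("par_el1");
}

static void
write_par (uint64_t lo, uint64_t hi)
{
  __arm_wsr128 ("par_el1", join128 (lo, hi));
}
#endif
```

Guarding on the macro rather than on the architecture version keeps the code buildable when +d128 is not requested.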

Finally, in order to update the GCC system-register database bringing
it in line with Binutils, and in so doing add the relevant 128-bit
system registers to GCC, this patch also introduces the Guarded
Control Stack (GCS) `+gcs' architecture modifier flag, allowing the
inclusion of the novel GCS system registers which are now supported
and also present in the `aarch64-sys-regs.def' system register
database.

Victor Do Nascimento (5):
  aarch64: Add march flags for +the and +d128 arch extensions
  aarch64: Add support for GCS system registers with the +gcs modifier
  aarch64: Sync `aarch64-sys-regs.def' with Binutils.
  aarch64: Implement 128-bit extension to ACLE sysreg r/w builtins
  aarch64: Add rsr128 and wsr128 ACLE tests

 gcc/config/aarch64/aarch64-arches.def |  2 +
 gcc/config/aarch64/aarch64-builtins.cc| 50 ---
 gcc/config/aarch64/aarch64-c.cc   |  1 +
 .../aarch64/aarch64-option-extensions.def |  6 +++
 gcc/config/aarch64/aarch64-protos.h   |  2 +-
 gcc/config/aarch64/aarch64-sys-regs.def   | 30 +++
 gcc/config/aarch64/aarch64.cc |  6 ++-
 gcc/config/aarch64/aarch64.h  | 21 
 gcc/config/aarch64/aarch64.md | 18 +++
 gcc/config/aarch64/arm_acle.h | 11 
 gcc/doc/invoke.texi   |  8 +++
 gcc/testsuite/gcc.target/aarch64/acle/rwsr.c  | 30 ++-
 12 files changed, 165 insertions(+), 20 deletions(-)

-- 
2.41.0

[PATCH 5/5] aarch64: Add rsr128 and wsr128 ACLE tests

2023-11-07 Thread Victor Do Nascimento

Extend existing unit tests for the ACLE system register manipulation
functions to include 128-bit tests.

gcc/testsuite/ChangeLog:

* gcc/testsuite/gcc.target/aarch64/acle/rwsr.c (get_rsr128): New.
(set_wsr128): Likewise.
---
 gcc/testsuite/gcc.target/aarch64/acle/rwsr.c | 30 +++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/acle/rwsr.c 
b/gcc/testsuite/gcc.target/aarch64/acle/rwsr.c
index 3af4b960306..e7725022316 100644
--- a/gcc/testsuite/gcc.target/aarch64/acle/rwsr.c
+++ b/gcc/testsuite/gcc.target/aarch64/acle/rwsr.c
@@ -1,11 +1,15 @@
 /* Test the __arm_[r,w]sr ACLE intrinsics family.  */
 /* Check that function variants for different data types handle types 
correctly.  */
 /* { dg-do compile } */
-/* { dg-options "-O1 -march=armv8.4-a" } */
+/* { dg-options "-O1 -march=armv9.4-a+d128" } */
 /* { dg-final { check-function-bodies "**" "" } } */
 
 #include <arm_acle.h>
 
+#ifndef __ARM_FEATURE_SYSREG128
+#error "__ARM_FEATURE_SYSREG128 feature macro not defined."
+#endif
+
 /*
 ** get_rsr:
 ** ...
@@ -66,6 +70,17 @@ get_rsrf64 ()
   return __arm_rsrf64("trcseqstr");
 }
 
+/*
+** get_rsr128:
+** mrrs x0, x1, s3_0_c7_c4_0
+** ...
+*/
+__uint128_t
+get_rsr128 ()
+{
+  return __arm_rsr128("par_el1");
+}
+
 /*
 ** set_wsr32:
 ** ...
@@ -129,6 +144,18 @@ set_wsrf64(double a)
   __arm_wsrf64("trcseqstr", a);
 }
 
+/*
+** set_wsr128:
+** ...
+** msrr s3_0_c7_c4_0, x0, x1
+** ...
+*/
+void
+set_wsr128 (__uint128_t c)
+{
+  __arm_wsr128 ("par_el1", c);
+}
+
 /*
 ** set_custom:
 ** ...
@@ -142,3 +169,4 @@ void set_custom()
   __uint64_t b = __arm_rsr64("S1_2_C3_C4_5");
   __arm_wsr64("S1_2_C3_C4_5", b);
 }
+
-- 
2.41.0

[PATCH 4/5] aarch64: Implement 128-bit extension to ACLE sysreg r/w builtins

2023-11-07 Thread Victor Do Nascimento

Implement the ACLE builtins for 128-bit system register manipulation:

  * __uint128_t __arm_rsr128(const char *special_register);
  * void __arm_wsr128(const char *special_register, __uint128_t value);

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc (AARCH64_RSR128): New
`enum aarch64_builtins' value.
(AARCH64_WSR128): Likewise.
(aarch64_init_rwsr_builtins): Init `__builtin_aarch64_rsr128'
and `__builtin_aarch64_wsr128' builtins.
(aarch64_expand_rwsr_builtin): Extend function to handle
`__builtin_aarch64_{rsr|wsr}128'.
* config/aarch64/aarch64-protos.h (aarch64_retrieve_sysreg):
Update function signature.
* config/aarch64/aarch64.cc (F_REG_128): New.
(aarch64_retrieve_sysreg): Add 128-bit register mode check.
* config/aarch64/aarch64.md (UNSPEC_SYSREG_RTI): New.
(UNSPEC_SYSREG_WTI): Likewise.
(aarch64_read_sysregti): Likewise.
(aarch64_write_sysregti): Likewise.
---
 gcc/config/aarch64/aarch64-builtins.cc | 50 +-
 gcc/config/aarch64/aarch64-protos.h|  2 +-
 gcc/config/aarch64/aarch64.cc  |  6 +++-
 gcc/config/aarch64/aarch64.md  | 18 ++
 gcc/config/aarch64/arm_acle.h  | 11 ++
 5 files changed, 77 insertions(+), 10 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
b/gcc/config/aarch64/aarch64-builtins.cc
index c5f20f68bca..40d3788b5e0 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -815,11 +815,13 @@ enum aarch64_builtins
   AARCH64_RSR64,
   AARCH64_RSRF,
   AARCH64_RSRF64,
+  AARCH64_RSR128,
   AARCH64_WSR,
   AARCH64_WSRP,
   AARCH64_WSR64,
   AARCH64_WSRF,
   AARCH64_WSRF64,
+  AARCH64_WSR128,
   AARCH64_BUILTIN_MAX
 };
 
@@ -1842,6 +1844,10 @@ aarch64_init_rwsr_builtins (void)
 = build_function_type_list (double_type_node, const_char_ptr_type, NULL);
   AARCH64_INIT_RWSR_BUILTINS_DECL (RSRF64, rsrf64, fntype);
 
+  fntype
+= build_function_type_list (uint128_type_node, const_char_ptr_type, NULL);
+  AARCH64_INIT_RWSR_BUILTINS_DECL (RSR128, rsr128, fntype);
+
   fntype
 = build_function_type_list (void_type_node, const_char_ptr_type,
uint32_type_node, NULL);
@@ -1867,6 +1873,12 @@ aarch64_init_rwsr_builtins (void)
 = build_function_type_list (void_type_node, const_char_ptr_type,
double_type_node, NULL);
   AARCH64_INIT_RWSR_BUILTINS_DECL (WSRF64, wsrf64, fntype);
+
+  fntype
+= build_function_type_list (void_type_node, const_char_ptr_type,
+   uint128_type_node, NULL);
+  AARCH64_INIT_RWSR_BUILTINS_DECL (WSR128, wsr128, fntype);
+
 }
 
 /* Initialize the memory tagging extension (MTE) builtins.  */
@@ -2710,6 +2722,7 @@ aarch64_expand_rwsr_builtin (tree exp, rtx target, int 
fcode)
   tree arg0, arg1;
   rtx const_str, input_val, subreg;
   enum machine_mode mode;
+  enum insn_code icode;
   class expand_operand ops[2];
 
   arg0 = CALL_EXPR_ARG (exp, 0);
@@ -2718,7 +2731,18 @@ aarch64_expand_rwsr_builtin (tree exp, rtx target, int 
fcode)
   || fcode == AARCH64_WSRP
   || fcode == AARCH64_WSR64
   || fcode == AARCH64_WSRF
-  || fcode == AARCH64_WSRF64);
+  || fcode == AARCH64_WSRF64
+  || fcode == AARCH64_WSR128);
+
+  bool op128 = (fcode == AARCH64_RSR128 || fcode == AARCH64_WSR128);
+  enum machine_mode sysreg_mode = op128 ? TImode : DImode;
+
+  if (op128 && !TARGET_D128)
+{
+  error_at (EXPR_LOCATION (exp), "128-bit system register support 
requires "
+"the +d128 Armv9.4-A extension");
+  return const0_rtx;
+}
 
   /* Argument 0 (system register name) must be a string literal.  */
   gcc_assert (TREE_CODE (arg0) == ADDR_EXPR
@@ -2741,7 +2765,7 @@ aarch64_expand_rwsr_builtin (tree exp, rtx target, int 
fcode)
 sysreg_name[pos] = TOLOWER (sysreg_name[pos]);
 
   const char* name_output = aarch64_retrieve_sysreg ((const char *) 
sysreg_name,
-write_op);
+write_op, op128);
   if (name_output == NULL)
 {
   error_at (EXPR_LOCATION (exp), "invalid system register name provided");
@@ -2760,13 +2784,17 @@ aarch64_expand_rwsr_builtin (tree exp, rtx target, int 
fcode)
   mode = TYPE_MODE (TREE_TYPE (arg1));
   input_val = copy_to_mode_reg (mode, expand_normal (arg1));
 
+  icode = (op128 ? CODE_FOR_aarch64_write_sysregti
+: CODE_FOR_aarch64_write_sysregdi);
+
   switch (fcode)
{
case AARCH64_WSR:
case AARCH64_WSRP:
case AARCH64_WSR64:
case AARCH64_WSRF64:
- subreg = lowpart_subreg (DImode, input_val, mode);
+   case AARCH64_WSR128:
+ subreg = lowpart_subreg (sysreg_mode, input_val, mode);

Re: testsuite: introduce hostedlib effective target

2023-11-07 Thread Jonathan Wakely

On Tue, 7 Nov 2023 at 10:24, Jonathan Wakely  wrote:
>
> On Tue, 7 Nov 2023 at 10:18, Jonathan Wakely  wrote:
> >
> > On Tue, 7 Nov 2023 at 10:04, Alexandre Oliva  wrote:
> > >
> > > [adding libstdc++@]
> > >
> > > On Nov  5, 2023, Mike Stump  wrote:
> > >
> > > > Ick.
> > >
> > > Indeed ;-)
> > >
> > > > I wish there were fewer changed lines and not 1 per test
> > > > case. It feels like we've painted ourselves into a corner.
> > >
> > > The libstdc++ testsuite took a different approach, detecting missing
> > > headers (and libraries?) at error pruning time, and xfailing the tests,
> > > which seems to be more in line with what you are looking for.
> > >
> > > That approach, though more expedient, seems more fragile to me, in that
> > > an actual bug that caused headers to go missing would cause tests to be
> > > silently skipped rather than fail.
> >
> > I don't think we XFAIL based on missing headers. We XFAIL based on a
> > specific #error message in certain headers.
> >
> > If a header goes missing, we'll still XFAIL.
> >
> > >
> > > I expect the set of headers, and thus of affected tests, won't be very
> > > dynamic, so it's kind of a one-shot change.
> > >
> > > Of course new tests might be added that rely on such headers, and would
> > > likely go unnoticed until someone tries them on a non-hosted libstdc++.
> >
> > Since GCC 13 you don't need to build a non-hosted libstdc++ to test
> > it, you can just add -ffreestanding to the runtestflags.
> >
> > > We could alleviate this if libstdc++ headers that are not installed on
> > > hosted systems issued a warning (conditional on some macro defined by
> > > the testsuite, say -D_GLIBCXX_WARN_HOSTED_ONLY).
> >
> > That's exactly what happens (except #error not #warning) when you
> > compile with -ffreestanding.
> >
> > >  For tests aimed
> > > exclusively at hosted libstdc++, we'd then use a dg directive that both
> > > implied this requirement, and changed the macro definition to suppress
> > > the warning.  Then anyone who added a testcase that included hosted
> > > headers without indicating its hostedlib requirement would get a fail
> > > even when testing with a hosted libstdc++.
> >
> > I don't think we need to add checks for a new macro and then use that
> > when testing, you can just test with -ffreestanding instead. This
> > already works today.
>
> Ah, reading back in the thread for  the context I missed, I see that
> you're specifically testing a --disable-hosted-libstdcxx build. In
> that case some headers really will be absent, not just
> present-with-#error. But I am still not concerned about failing to
> notice if a header goes unintentionally missing, because the libstdc++
> testsuite will still notice that.
>
> We don't prune based on "no such header" errors, so would still get
> FAILs for those tests that depend on headers which are supposed to be
> present for freestanding.

An alternative approach for the g++ testsuite would be to provide a
set of dummy headers for the non-freestanding ones, so that all the
hosted-only headers are provided by the testsuite itself, but consist
of a single line:

#error not available in freestanding

Then match on that and XFAIL. So the individual tests themselves
wouldn't need the dg-skip-if added to them, they would just
automatically XFAIL if they use a hosted-only header.

The difficulty would be where to add those dummy headers for
,  etc. so that they're only found when testing a
non-hosted build. Maybe libstdc++ could provide them in the build dir
for the purposes of the testsuite, but not install them?

Re: [PATCH] RISC-V: Use stdint-gcc.h in rvv testsuite

2023-11-07 Thread Christoph Müllner

On Tue, Nov 7, 2023 at 11:16 AM Kito Cheng  wrote:
>
> LGTM, but the title is a little bit misleading: it's not really related to rvv.
> Changing it to either RISC-V or T-head is fine. Anyway, you can commit without
> sending a v2 :)

Fixed and pushed.

Thanks!

>
> On Tue, Nov 7, 2023 at 17:45, Christoph Muellner wrote:
>>
>> From: Christoph Müllner 
>>
>> stdint.h can be replaced with stdint-gcc.h to resolve some missing
>> system headers in non-multilib installations.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/xtheadmemidx-helpers.h:
>> Replace stdint.h with stdint-gcc.h.
>>
>> Signed-off-by: Christoph Müllner 
>> ---
>>  gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h 
>> b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
>> index a97f08c5cc1..9d8ce124a93 100644
>> --- a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
>> +++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
>> @@ -1,7 +1,7 @@
>>  #ifndef XTHEADMEMIDX_HELPERS_H
>>  #define XTHEADMEMIDX_HELPERS_H
>>
>> -#include <stdint.h>
>> +#include <stdint-gcc.h>
>>
>>  #define intX_t long
>>  #define uintX_t unsigned long
>> --
>> 2.41.0
>>

RE: [PATCH v6 0/21]middle-end: Support early break/return auto-vectorization

2023-11-07 Thread Tamar Christina

> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, November 7, 2023 9:43 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: RE: [PATCH v6 0/21]middle-end: Support early break/return auto-
> vectorization
> 
> On Mon, 6 Nov 2023, Tamar Christina wrote:
> 
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Monday, November 6, 2023 2:25 PM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd 
> > > Subject: Re: [PATCH v6 0/21]middle-end: Support early break/return
> > > auto- vectorization
> > >
> > > On Mon, 6 Nov 2023, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > This patch adds initial support for early break vectorization in GCC.
> > > > The support is added for any target that implements a vector
> > > > cbranch optab, this includes both fully masked and non-masked targets.
> > > >
> > > > Depending on the operation, the vectorizer may also require
> > > > support for boolean mask reductions using Inclusive OR.  This is
> > > > > however only checked when the comparison would produce multiple
> statements.
> > > >
> > > > Note: I am currently struggling to get patch 7 correct in all
> > > > cases and could
> > > use
> > > >   some feedback there.
> > > >
> > > > Concretely the kind of loops supported are of the forms:
> > > >
> > > > >  for (int i = 0; i < N; i++)
> > > > >  {
> > > > >    <statements1>
> > > > >    if (<condition>)
> > > > >      {
> > > > >        ...
> > > > >        <action>;
> > > > >      }
> > > > >    <statements2>
> > > > >  }
> > > > >
> > > > > where <action> can be:
> > > > >  - break
> > > > >  - return
> > > > >  - goto
> > > > >
> > > > > Any number of statements can be used before the <action> occurs.
> > > >
> > > > Since this is an initial version for GCC 14 it has the following
> > > > limitations and
> > > > features:
> > > >
> > > > - Only fixed sized iterations and buffers are supported.  That is to 
> > > > say any
> > > >   vectors loaded or stored must be to statically allocated arrays with
> known
> > > >   sizes. N must also be known.  This limitation is because our primary
> target
> > > >   for this optimization is SVE.  For VLA SVE we can't easily do cross 
> > > > page
> > > > >   iteration checks. The result is likely to also not be beneficial. For 
> > > > that
> > > >   reason we punt support for variable buffers till we have 
> > > > First-Faulting
> > > >   support in GCC.
> 
> Btw, for this I wonder if you thought about marking memory accesses required
> for the early break condition as required to be vector-size aligned, thus 
> peeling
> or versioning them for alignment?  That should ensure they do not fault.
> 
> OTOH I somehow remember prologue peeling isn't supported for early break
> vectorization?  ..
> 
> > > > > - any stores in <statements1> should not be to the same objects as in
> > > > >   <condition>.  Loads are fine as long as they don't have the 
> > > > possibility to
> > > >   alias.  More concretely, we block RAW dependencies when the
> > > > intermediate
> > > value
> > > > >   can't be separated from the store, or the store itself can't be 
> > > > > moved.
> > > > > - Prologue peeling, alignment peeling and loop versioning are 
> > > > supported.
> 
> .. but here you say it is.  Not sure if peeling for alignment works for VLA 
> vectors
> though.  Just to say x86 doesn't support first-faulting loads.

For VLA we support it through masking.  i.e. if you need to peel N iterations, 
we
generate a masked copy of the loop vectorized which masks off the first N bits.

This is not typically needed, but we do support it.  But the problem with this
scheme and early break is obviously that the peeled loop needs to be vectorized
so you kinda end up with the same issue again.  So Atm it rejects it for VLA.
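
To make the masking scheme concrete, here is a scalar simulation of it, not the code GCC generates. It assumes a fixed lane count VF of 4 purely for illustration (real VLA code does not know the lane count at compile time), and the function name `add_masked' is hypothetical. Instead of running `start % VF' scalar prologue iterations, the first vector block begins at the block boundary below `start' with the leading lanes inactive.

```c
#include <stddef.h>

enum { VF = 4 };  /* Assumed lane count, for illustration only.  */

/* Add K to a[start..n) by walking VF-wide blocks starting at the
   block boundary at or below START.  Lanes before START are masked
   off in the first block, lanes at or past N in the last block.
   Inactive lanes never touch memory.  */
static void
add_masked (int *a, size_t start, size_t n, int k)
{
  size_t base = start - start % VF;
  for (size_t i = base; i < n; i += VF)
    for (size_t lane = 0; lane < VF; lane++)
      {
        size_t idx = i + lane;
        int active = idx >= start && idx < n;
        if (active)
          a[idx] += k;
      }
}
```

With start = 3 the first block covers indices 0..3 and only lane 3 is active, which corresponds to masking off the first three lanes rather than peeling three scalar iterations.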

Regards,
Tamar

> 
> > > > - Fully masked loops, unmasked loops and partially masked loops
> > > > are supported
> > > > - Any number of loop early exits are supported.
> > > > - No support for epilogue vectorization.  The only epilogue supported is
> the
> > > >   scalar final one.  Peeling code supports it but the code motion code
> cannot
> > > >   find instructions to make the move in the epilog.
> > > > - Early breaks are only supported for inner loop vectorization.
> > > >
> > > > I have pushed a branch to
> > > > refs/users/tnfchris/heads/gcc-14-early-break
> > > >
> > > > With the help of IPA and LTO this still gets hit quite often.
> > > > During bootstrap it hit rather frequently.  Additionally TSVC
> > > > s332, s481 and
> > > > s482 all pass now since these are tests for support for early exit
> > > vectorization.
> > > >
> > > > This implementation does not support completely handling the early
> > > > break inside the vector loop itself but instead supports adding
> > > > checks such that if we know that we have to exit in the current
> > > > iteration then we branch to scalar code to actually do the final
> > > > VF iterations which
> > > > handles all the code in <action>.
> > > >
> > > > For the scalar loop we know that whatever exit you take you have
> > > > to perform

Re: [PATCH 3/21]middle-end: Implement code motion and dependency analysis for early breaks

2023-11-07 Thread Richard Biener

On Mon, 6 Nov 2023, Tamar Christina wrote:

> Hi All,
> 
> When performing early break vectorization we need to be sure that the vector
> operations are safe to perform.  A simple example is e.g.
> 
>  for (int i = 0; i < N; i++)
>  {
>vect_b[i] = x + i;
>if (vect_a[i]*2 != x)
>  break;
>vect_a[i] = x;
>  }
> 
> where the store to vect_b is not allowed to be executed unconditionally since
> if we exit through the early break it wouldn't have been done for the full VF
> iteration.
> 
> Effectively, the code motion determines:
>   - is it safe/possible to vectorize the function
>   - what updates to the VUSES should be performed if we do
>   - Which statements need to be moved
>   - Which statements can't be moved:
> * values that are live must be reachable through all exits
> * values that aren't single use and shared by the use/def chain of the 
> cond
>   - The final insertion point of the instructions.  In the cases we have
> multiple early exit statements this should be the one closest to the loop
> latch itself.
> 
> After motion the loop above is:
> 
>  for (int i = 0; i < N; i++)
>  {
>... y = x + i;
>if (vect_a[i]*2 != x)
>  break;
>vect_b[i] = y;
>vect_a[i] = x;
> 
>  }
> 
> The operation is split into two, during data ref analysis we determine
> validity of the operation and generate a worklist of actions to perform if we
> vectorize.
> 
> After peeling and just before statement transformation we replay this 
> worklist
> which moves the statements and updates book keeping only in the main loop 
> that's
> to be vectorized.  This includes updating of USES in exit blocks.
> 
> At the moment we don't support this for epilog nomasks since the additional
> vectorized epilog's stmt UIDs are not found.

As for UIDs, note that UIDs are used for dominance checking in
vect_stmt_dominates_stmt_p and that at least is used during
transform when scheduling SLP.  Moving stmts around invalidates
this UID order (I don't see you "renumbering" UIDs).
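
A toy model of the concern, with hypothetical types and names that are not GCC's data structures: if order within a block is decided by comparing UIDs, then moving a statement without renumbering leaves the check answering according to the old position.

```c
#include <stddef.h>

/* Hypothetical stand-in for a statement with a position-derived UID.  */
struct stmt
{
  unsigned uid;
  const char *text;
};

/* Same-block order check based on UIDs: A is treated as coming no
   later than B when its UID is not larger.  */
static int
uid_dominates (const struct stmt *a, const struct stmt *b)
{
  return a->uid <= b->uid;
}

/* Move the element at FROM down to position TO (FROM < TO), shifting
   the elements in between up by one.  UIDs are deliberately left
   untouched, which is the situation being pointed out.  */
static void
move_stmt (struct stmt *seq, size_t from, size_t to)
{
  struct stmt tmp = seq[from];
  for (size_t i = from; i < to; i++)
    seq[i] = seq[i + 1];
  seq[to] = tmp;
}

/* Reassign UIDs so they match the current sequence order again.  */
static void
renumber (struct stmt *seq, size_t n)
{
  for (size_t i = 0; i < n; i++)
    seq[i].uid = (unsigned) i + 1;
}
```

After sinking the first statement below the second, `uid_dominates (&seq[0], &seq[1])' is false even though seq[0] now precedes seq[1]; it becomes true again only after `renumber'.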

More comments below.
 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
>
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * tree-vect-data-refs.cc (validate_early_exit_stmts): New.
>   (vect_analyze_early_break_dependences): New.
>   (vect_analyze_data_ref_dependences): Use them.
>   * tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Initialize
>   early_breaks.
>   (move_early_exit_stmts): New.
>   (vect_transform_loop): Use it.
>   * tree-vect-stmts.cc (vect_is_simple_use): Use vect_early_exit_def.
>   * tree-vectorizer.h (enum vect_def_type): Add vect_early_exit_def.
>   (class _loop_vec_info): Add early_breaks, early_break_conflict,
>   early_break_vuses.
>   (LOOP_VINFO_EARLY_BREAKS): New.
>   (LOOP_VINFO_EARLY_BRK_CONFLICT_STMTS): New.
>   (LOOP_VINFO_EARLY_BRK_DEST_BB): New.
>   (LOOP_VINFO_EARLY_BRK_VUSES): New.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> index 
> d5c9c4a11c2e5d8fd287f412bfa86d081c2f8325..0fc4f325980be0474f628c32b9ce7be77f3e1d60
>  100644
> --- a/gcc/tree-vect-data-refs.cc
> +++ b/gcc/tree-vect-data-refs.cc
> @@ -613,6 +613,332 @@ vect_analyze_data_ref_dependence (struct 
> data_dependence_relation *ddr,
>return opt_result::success ();
>  }
>  
> +/* This function tries to validate whether an early break vectorization
> +   is possible for the current instruction sequence. Returns True if
> +   possible, otherwise False.
> +
> +   Requirements:
> + - Any memory access must be to a fixed size buffer.
> + - There must not be any loads and stores to the same object.
> + - Multiple loads are allowed as long as they don't alias.
> +
> +   NOTE:
> + This implementation is very conservative. Any overlapping loads/stores
> + that take place before the early break statement gets rejected aside 
> from
> + WAR dependencies.
> +
> + i.e.:
> +
> + a[i] = 8
> + c = a[i]
> + if (b[i])
> +   ...
> +
> + is not allowed, but
> +
> + c = a[i]
> + a[i] = 8
> + if (b[i])
> +   ...
> +
> + is, which is the common case.
> +
> +   Arguments:
> + - LOOP_VINFO: loop information for the current loop.
> + - CHAIN: Currently detected sequence of instructions that need to be 
> moved
> +   if we are to vectorize this early break.
> + - FIXED: Sequences of SSA_NAMEs that must not be moved, they are 
> reachable from
> +   one or more cond conditions.  If this set overlaps with CHAIN 
> then FIXED
> +   takes precedence.  This deals with non-single use cases.
> + - LOADS: List of all loads found during traversal.
> + - BASES: List of all load data references found during traversal.
> + - GSTMT: Current position to inspect for validity.  The sequence
> +   will be moved upwards from this point.
> + - REACHING_VUSE: The dominating VUSE found

Re: [PATCH] libstdc++/112351 - deal with __gthread_once failure during locale init

2023-11-07 Thread Richard Biener

On Mon, 6 Nov 2023, Richard Biener wrote:

> On Mon, 6 Nov 2023, Jonathan Wakely wrote:
> 
> > On Mon, 6 Nov 2023 at 11:52, Richard Biener  wrote:
> > >
> > > The following makes the C++98 locale init path follow the way the
> > > C++11 performs initialization.  This way we deal with pthread_once
> > > failing, falling back to non-threadsafe initialization which, given we
> > > initialize from the library, should be serialized by the dynamic
> > > loader already.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?
> > > And GCC 13 branch?
> > >
> > > Thanks,
> > > Richard.

[.. old patch ..]

> > The next time we call _S_initialize, __gthread_once will try to run
> > again, and because _S_once was not changed, it might call
> > _S_initialize_once() again, which writes to _S_c_locale again
> > (possibly causing a data race).
> 
> Ah, yeah, so in the C++11 path the check for !_S_classic in
> locale::_S_initialize is redundant.  But good spot.
> 
> > I don't think the slightly different code in src/c++11/locale_init.cc
> > is different in order to handle __gthread_once failing, I think it's
> > different because the effects of locale::facet::_S_initialize_once()
> > and locale::_S_initialize_once() are different. One is safe to call
> > more than once, and the other isn't.
> > 
> > I don't think we need to care about __gthread_once failing at all, do
> > we? There are no error conditions for pthread_once, it always returns
> > 0 (previous POSIX revisions said it could return EINVAL for an
> > uninitialized pthread_once_t but that can't happen here as it's
> > correctly initialized in src/c++11/locale.cc). Is the concern that it
> > can fail for non-posix thread models? (I didn't check if any of them
> > can actually fail)
> 
> The concern is that there are actual products out that break with the
> new I/O initialization in libstdc++ for GCC13+ because they have bugs.
> It's easy enough to work around those by the proposed patch (plus
> correction for the above issue).  I suppose the comment in
> locale::_S_initialize_once holds as well for the C++98 path.
> 
> The failure mode of the product is that it overrides pthread_once
> but does nothing (not even indicate failure) when its pthread_*
> override mechanism isn't initialized yet.  With libstdc++ from GCC13
> we now use pthread_once "too early" and fail to initialize the locale
> object.
> 
> Adjusted patch below.
> 
> OK after another round of testing?

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

> Thanks,
> Richard.
> 
> 
> From 4e3fa2f4426a5a10d189587b63e4d7298c347b01 Mon Sep 17 00:00:00 2001
> From: Richard Biener 
> Date: Mon, 6 Nov 2023 11:31:40 +0100
> Subject: [PATCH] libstdc++/112351 - deal with __gthread_once failure during
>  locale init
> To: gcc-patches@gcc.gnu.org
> 
> The following makes the C++98 locale init path follow the way the
> C++11 performs initialization.  This way we deal with pthread_once
> failing, falling back to non-threadsafe initialization which, given we
> initialize from the library, should be serialized by the dynamic
> loader already.
> 
>   PR libstdc++/112351
> libstdc++-v3/
>   * src/c++98/locale.cc (locale::facet::_S_initialize_once):
>   Check whether _S_c_locale is already initialized.
>   (locale::facet::_S_get_c_locale): Always perform non-threadsafe
>   init when threadsafe init failed.
> ---
>  libstdc++-v3/src/c++98/locale.cc | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/libstdc++-v3/src/c++98/locale.cc b/libstdc++-v3/src/c++98/locale.cc
> index d308140bab7..1ef0c394cd7 100644
> --- a/libstdc++-v3/src/c++98/locale.cc
> +++ b/libstdc++-v3/src/c++98/locale.cc
> @@ -206,6 +206,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>void
>locale::facet::_S_initialize_once()
>{
> +// Need to check this because we could get called once from
> +// _S_get_c_locale() when the program is single-threaded, and then again
> +// (via __gthread_once) when it's multi-threaded.
> +if (_S_c_locale)
> +  return;
> +
>  // Initialize the underlying locale model.
>  _S_create_c_locale(_S_c_locale, _S_c_name);
>}
> @@ -216,12 +222,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  #ifdef __GTHREADS
>  if (__gthread_active_p())
>__gthread_once(&_S_once, _S_initialize_once);
> -else
>  #endif
> -  {
> - if (!_S_c_locale)
> -   _S_initialize_once();
> -  }
> +if (__builtin_expect (!_S_c_locale, 0))
> +  _S_initialize_once();
>  return _S_c_locale;
>}
>  
>
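
The control flow of the adjusted patch can be sketched in plain C. This is a model under stated assumptions, not libstdc++ code: once_t, working_once and broken_once are hypothetical stand-ins for pthread_once_t and pthread_once (broken_once imitates the product described above, which returns success without running the routine), and c_locale stands in for _S_c_locale.

```c
#include <stddef.h>

/* Hypothetical miniature of pthread_once_t, just enough for the sketch.  */
typedef struct { int done; } once_t;
typedef int (*once_fn)(once_t *, void (*)(void));

static const char *c_locale;   /* stands in for _S_c_locale */
static int init_calls;         /* counts real initializations */

/* Mirrors the guard added to locale::facet::_S_initialize_once.  */
void initialize_once(void)
{
  if (c_locale)
    return;                    /* already initialized by the fallback path */
  c_locale = "C";
  ++init_calls;
}

/* Well-behaved stand-in: runs the routine on the first call only.  */
int working_once(once_t *ctl, void (*fn)(void))
{
  if (!ctl->done)
    {
      ctl->done = 1;
      fn();
    }
  return 0;
}

/* Misbehaving stand-in: reports success but never runs the routine.  */
int broken_once(once_t *ctl, void (*fn)(void))
{
  (void) ctl;
  (void) fn;
  return 0;
}

/* Mirrors the patched _S_get_c_locale: try the once mechanism, then fall
   back to direct initialization if the object is still unset.  */
const char *get_c_locale(once_t *ctl, once_fn once)
{
  once(ctl, initialize_once);
  if (!c_locale)
    initialize_once();
  return c_locale;
}

int init_call_count(void)
{
  return init_calls;
}
```

Calling get_c_locale first with broken_once and then with working_once on the same control object initializes the locale exactly once: the second call does re-enter initialize_once, but the early return keeps it from writing c_locale again, which is the data race Jonathan pointed out.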

Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread Andrew Stubbs


On 07/11/2023 10:10, juzhe.zh...@rivai.ai wrote:

So, this patch not only fixes RVV FAIL, but also fixes GCN ?


Before the patch I have:

PASS: gcc.dg/vect/pr97428.c (test for excess errors)
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving load of size 8"
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving store of size 16"

gcc.dg/vect/pr97428.c: pattern found 4 times
XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2

PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap of 6 elements"

With the patch I now get:

PASS: gcc.dg/vect/pr97428.c (test for excess errors)
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving load of size 8"
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving store of size 16"

gcc.dg/vect/pr97428.c: pattern found 4 times
XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2
XPASS: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing stmts using SLP" 4

PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap of 6 elements"

It's different, but not "fixed".

Andrew





juzhe.zh...@rivai.ai

*From:* Andrew Stubbs 
*Date:* 2023-11-07 18:09
*To:* Juzhe-Zhong ;
gcc-patches@gcc.gnu.org 
*CC:* jeffreya...@gmail.com ;
rguent...@suse.de 
*Subject:* Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
On 07/11/2023 07:44, Juzhe-Zhong wrote:
 > This test shows vectorizing stmts using SLP 4 times instead of 2
for RVV.
 > The reason is RVV has 512 bit vector.
 > Here is a comparison between RVV and ARM SVE:
 > https://godbolt.org/z/xc5KE5rPs
 >
 > But I notice AMDGCN also has 512 bit vector, seems this patch
will cause FAIL in GCN ?
 >
 > Not sure whether GCN is 2 times or 4 times ?
The pattern matches 4 times on GCN.
 > gcc/testsuite/ChangeLog:
 >
 > * gcc.dg/vect/pr97428.c: Fix FAIL for RVV.
 >
 > ---
 >   gcc/testsuite/gcc.dg/vect/pr97428.c | 3 ++-
 >   1 file changed, 2 insertions(+), 1 deletion(-)
 >
 > diff --git a/gcc/testsuite/gcc.dg/vect/pr97428.c
b/gcc/testsuite/gcc.dg/vect/pr97428.c
 > index ad6416096aa..352c9bf04a7 100644
 > --- a/gcc/testsuite/gcc.dg/vect/pr97428.c
 > +++ b/gcc/testsuite/gcc.dg/vect/pr97428.c
 > @@ -43,5 +43,6 @@ void foo_i2(dcmlx4_t dst[], const dcmlx_t
src[], int n)
 >   /* { dg-final { scan-tree-dump "Detected interleaving store of
size 16" "vect" } } */
 >   /* We're not able to peel & apply re-aligning to make accesses
well-aligned for !vect_hw_misalign,
 >  but we could by peeling the stores for alignment and
applying re-aligning loads.  */
 > -/* { dg-final { scan-tree-dump-times "vectorizing stmts using
SLP" 2 "vect" { xfail { ! vect_hw_misalign } } } } */
 > +/* { dg-final { scan-tree-dump-times "vectorizing stmts using
SLP" 2 "vect" { xfail { { ! vect_hw_misalign } || { vect512 } } } } } */
 > +/* { dg-final { scan-tree-dump-times "vectorizing stmts using
SLP" 4 "vect" { xfail { { ! vect_hw_misalign } || { ! vect512 } } }
} } */
 >   /* { dg-final { scan-tree-dump-not "gap of 6 elements" "vect" }
} */

Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread juzhe.zh...@rivai.ai

Could you try this ?

/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
xfail { { ! vect_hw_misalign } || { vect512 } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
xfail { ! vect512 } } } } */



juzhe.zh...@rivai.ai
 

Re: [PATCH v1] LoongArch: Add modifiers for lsx and lasx.

2023-11-07 Thread Xi Ruoyao

On Tue, 2023-11-07 at 12:06 +0800, chenxiaolong wrote:
> +__m128i  a,b,c;
> +
> +__asm__ ("vadd.d %w0,%w1,%w2\n\t"
> +   :"=f" (c)
> +   :"f" (a),"f" (b)
> +   :"cc");

This example does not work for me, with the definition of __m128i in
another patch:

typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));

__m128i  a,b,c;

void t(void) {
__asm__ ("vadd.d %w0,%w1,%w2\n\t"
   :"=f" (c)
   :"f" (a),"f" (b)
   :"cc");
}

t1.c: In function ‘t’:
t1.c:6:1: error: inconsistent operand constraints in an ‘asm’
6 | __asm__ ("vadd.d %w0,%w1,%w2\n\t"

Please recheck.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

Re: [PATCH v1] LoongArch: Add modifiers for lsx and lasx.

2023-11-07 Thread Xi Ruoyao

On Tue, 2023-11-07 at 19:10 +0800, Xi Ruoyao wrote:
> On Tue, 2023-11-07 at 12:06 +0800, chenxiaolong wrote:
> > +__m128i  a,b,c;
> > +
> > +__asm__ ("vadd.d %w0,%w1,%w2\n\t"
> > +   :"=f" (c)
> > +   :"f" (a),"f" (b)
> > +   :"cc");
> 
> This example does not work for me, with the definition of __m128i in
> another patch:
> 
> typedef long long __m128i __attribute__ ((__vector_size__ (16), 
> __may_alias__));
> 
> __m128i  a,b,c;
> 
> void t(void) {
> __asm__ ("vadd.d %w0,%w1,%w2\n\t"
>    :"=f" (c)
>    :"f" (a),"f" (b)
>    :"cc");
> }
> 
> t1.c: In function ‘t’:
> t1.c:6:1: error: inconsistent operand constraints in an ‘asm’
>     6 | __asm__ ("vadd.d %w0,%w1,%w2\n\t"
> 
> Please recheck.

Sorry, I didn't add -mlasx :(.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread Andrew Stubbs


On 07/11/2023 11:05, juzhe.zh...@rivai.ai wrote:

Could you try this ?

/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 
"vect" { xfail { { ! vect_hw_misalign } || { vect512 } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 
"vect" { xfail { ! vect512 } } } } */


PASS: gcc.dg/vect/pr97428.c (test for excess errors)
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving load of size 8"
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving store of size 16"

gcc.dg/vect/pr97428.c: pattern found 4 times
XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2
PASS: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing stmts using SLP" 4

PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap of 6 elements"

The passes are all correct (assuming that 4 matches is a valid number), 
but if you have multiple patterns with contradictory expectations then 
you probably want to use "target" rather than "xfail" to avoid the noise 
(and invert the conditions, obviously).


Andrew
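
The difference between "xfail" and "target" discussed in this message can be modelled in a few lines of C. This is a sketch of the general DejaGnu behaviour as used by the GCC testsuite, not testsuite code: the enum and function names are hypothetical, and the GCN values used in the note below (vect512 true, vect_hw_misalign false, pattern found 4 times) are inferred from the logs in this thread.

```c
#include <stdbool.h>

/* Possible results of one dg-final scan directive.  DG_NOT_RUN models a
   "target" selector that does not match: nothing is reported at all.  */
enum outcome { DG_NOT_RUN, DG_PASS, DG_FAIL, DG_XPASS, DG_XFAIL };

/* Classify one directive given its selectors and whether the scan matched.  */
enum outcome dg_outcome(bool target_selected, bool xfail_selected, bool matched)
{
  if (!target_selected)
    return DG_NOT_RUN;
  if (xfail_selected)
    return matched ? DG_XPASS : DG_XFAIL;
  return matched ? DG_PASS : DG_FAIL;
}

/* scan-tree-dump-times matches only when the observed count is exact.  */
enum outcome scan_times(int observed, int expected,
                        bool target_selected, bool xfail_selected)
{
  return dg_outcome(target_selected, xfail_selected, observed == expected);
}
```

With those inferred GCN values, the first patch's count-4 directive has a true xfail condition and a matching count, which gives the XPASS shown earlier. The second attempt turns that into PASS but still reports an XFAIL for the count-2 directive. Selecting by "target" instead (count 2 where vect512 is false, count 4 where it is true, which is one reading of the suggestion above) leaves the count-2 directive unreported on GCN.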




Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread juzhe.zh...@rivai.ai

Do you mean this ?

/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target { { ! vect_hw_misalign } || { vect512 } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
xfail { ! vect512 } } } } */

Could you try again ? If it works for you, I am gonna send V2 patch to Richi.

Thank you so much for help.


juzhe.zh...@rivai.ai
 

Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread juzhe.zh...@rivai.ai

Oh. Sorry maybe it's better like this:

/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target { { ! vect_hw_misalign } || { vect512 } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
target { ! vect512 } } } } */



juzhe.zh...@rivai.ai
 

RE: [PATCH 3/21]middle-end: Implement code motion and dependency analysis for early breaks

2023-11-07 Thread Tamar Christina

> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, November 7, 2023 10:53 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: Re: [PATCH 3/21]middle-end: Implement code motion and
> dependency analysis for early breaks
> 
> On Mon, 6 Nov 2023, Tamar Christina wrote:
> 
> > Hi All,
> >
> > When performing early break vectorization we need to be sure that the
> > vector operations are safe to perform.  A simple example is e.g.
> >
> >  for (int i = 0; i < N; i++)
> >  {
> >vect_b[i] = x + i;
> >if (vect_a[i]*2 != x)
> >  break;
> >vect_a[i] = x;
> >  }
> >
> > where the store to vect_b is not allowed to be executed
> > unconditionally since if we exit through the early break it wouldn't
> > have been done for the full VF iteration.
> >
> > Effectively, the code motion determines:
> >   - is it safe/possible to vectorize the function
> >   - what updates to the VUSES should be performed if we do
> >   - Which statements need to be moved
> >   - Which statements can't be moved:
> > * values that are live must be reachable through all exits
> > * values that aren't single use and shared by the use/def chain of the cond
> >   - The final insertion point of the instructions.  In the cases we have
> > multiple early exit statements this should be the one closest to the loop
> > latch itself.
> >
> > After motion the loop above is:
> >
> >  for (int i = 0; i < N; i++)
> >  {
> >... y = x + i;
> >if (vect_a[i]*2 != x)
> >  break;
> >vect_b[i] = y;
> >vect_a[i] = x;
> >
> >  }
> >
> > The operation is split into two, during data ref analysis we determine
> > validity of the operation and generate a worklist of actions to
> > perform if we vectorize.
> >
> > After peeling and just before statement transformation we replay this
> > worklist which moves the statements and updates book keeping only in
> > the main loop that's to be vectorized.  This includes updating of USES in
> > exit blocks.
> >
> > At the moment we don't support this for epilog nomasks since the
> > additional vectorized epilog's stmt UIDs are not found.
> 
> As of UIDs note that UIDs are used for dominance checking in
> vect_stmt_dominates_stmt_p and that at least is used during transform when
> scheduling SLP.  Moving stmts around invalidates this UID order (I don't see
> you "renumbering" UIDs).
> 

Just some responses to questions while I process the rest.

I see, yeah I didn't encounter it because I punted SLP support.  As you said
for SLP we indeed don't need this.
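
The code motion from the quoted patch description can be checked with a small scalar model. It is a sketch under assumptions, not vectorizer output: N, VF, the helper names and the test data are hypothetical, each group of VF iterations stands in for one vector iteration, and an early exit is modelled as redoing that group with the original scalar loop.

```c
#include <stdbool.h>
#include <string.h>

#define N  8
#define VF 4   /* assumed vectorization factor; N is a multiple of VF */

/* The original loop, started at START.  Returns the index at which the
   loop stopped (N if it never took the early break).  */
static int scalar_from(int *vect_a, int *vect_b, int x, int start)
{
  int i;
  for (i = start; i < N; i++)
    {
      vect_b[i] = x + i;
      if (vect_a[i] * 2 != x)
        break;
      vect_a[i] = x;
    }
  return i;
}

int scalar_loop(int *vect_a, int *vect_b, int x)
{
  return scalar_from(vect_a, vect_b, x, 0);
}

/* Model with the stores sunk below the break check: stores for a group
   happen only once it is known that no lane takes the early exit.  */
int chunked_sunk(int *vect_a, int *vect_b, int x)
{
  for (int base = 0; base + VF <= N; base += VF)
    {
      int y[VF];
      bool any_break = false;
      for (int l = 0; l < VF; l++)
        {
          y[l] = x + (base + l);
          if (vect_a[base + l] * 2 != x)
            any_break = true;
        }
      if (any_break)
        return scalar_from(vect_a, vect_b, x, base);
      for (int l = 0; l < VF; l++)
        {
          vect_b[base + l] = y[l];
          vect_a[base + l] = x;
        }
    }
  return N;
}

/* Model without the motion: vect_b is stored for the whole group before
   the break check, so lanes past the break are written as well.  */
int chunked_unsunk(int *vect_a, int *vect_b, int x)
{
  for (int base = 0; base + VF <= N; base += VF)
    {
      bool any_break = false;
      for (int l = 0; l < VF; l++)
        vect_b[base + l] = x + (base + l);
      for (int l = 0; l < VF; l++)
        if (vect_a[base + l] * 2 != x)
          any_break = true;
      if (any_break)
        return scalar_from(vect_a, vect_b, x, base);
      for (int l = 0; l < VF; l++)
        vect_a[base + l] = x;
    }
  return N;
}

/* Run VARIANT and the scalar loop on identical data (break at i == 5)
   and report whether exit index and both arrays agree.  */
bool same_as_scalar(int (*variant)(int *, int *, int))
{
  int a_ref[N], b_ref[N], a_var[N], b_var[N];
  const int x = 4;
  for (int i = 0; i < N; i++)
    {
      a_ref[i] = a_var[i] = 2;
      b_ref[i] = b_var[i] = -1;
    }
  a_ref[5] = a_var[5] = 3;
  int r_ref = scalar_loop(a_ref, b_ref, x);
  int r_var = variant(a_var, b_var, x);
  return r_ref == r_var
         && memcmp(a_ref, a_var, sizeof a_ref) == 0
         && memcmp(b_ref, b_var, sizeof b_ref) == 0;
}
```

In this model same_as_scalar(chunked_sunk) holds, while same_as_scalar(chunked_unsunk) does not, because the unsunk variant writes vect_b[6] and vect_b[7] even though the scalar loop stops at i == 5.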

> More comments below.
> 
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * tree-vect-data-refs.cc (validate_early_exit_stmts): New.
> > (vect_analyze_early_break_dependences): New.
> > (vect_analyze_data_ref_dependences): Use them.
> > * tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Initialize
> > early_breaks.
> > (move_early_exit_stmts): New.
> > (vect_transform_loop): Use it.
> > * tree-vect-stmts.cc (vect_is_simple_use): Use vect_early_exit_def.
> > * tree-vectorizer.h (enum vect_def_type): Add vect_early_exit_def.
> > (class _loop_vec_info): Add early_breaks, early_break_conflict,
> > early_break_vuses.
> > (LOOP_VINFO_EARLY_BREAKS): New.
> > (LOOP_VINFO_EARLY_BRK_CONFLICT_STMTS): New.
> > (LOOP_VINFO_EARLY_BRK_DEST_BB): New.
> > (LOOP_VINFO_EARLY_BRK_VUSES): New.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> > index d5c9c4a11c2e5d8fd287f412bfa86d081c2f8325..0fc4f325980be0474f628c32b9ce7be77f3e1d60 100644
> > --- a/gcc/tree-vect-data-refs.cc
> > +++ b/gcc/tree-vect-data-refs.cc
> > @@ -613,6 +613,332 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
> >return opt_result::success ();
> >  }
> >
> > +/* This function tries to validate whether an early break vectorization
> > +   is possible for the current instruction sequence. Returns True if
> > +   possible, otherwise False.
> > +
> > +   Requirements:
> > + - Any memory access must be to a fixed size buffer.
> > + - There must not be any loads and stores to the same object.
> > + - Multiple loads are allowed as long as they don't alias.
> > +
> > +   NOTE:
> > + This implementation is very conservative. Any overlapping loads/stores
> > + that take place before the early break statement get rejected aside from
> > + WAR dependencies.
> > +
> > + i.e.:
> > +
> > +   a[i] = 8
> > +   c = a[i]
> > +   if (b[i])
> > + ...
> > +
> > +   is not allowed, but
> > +
> > +   c = a[i]
> > +   a[i] = 8
> > +   if (b[i])
> > + ...
> > +
> > +   is, which is the common case.
> > +
> > +   Arguments:
> > + - LOOP_VINFO: loop information for the current loop.
> > + - CHAIN: Currently detected

Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread Andrew Stubbs


On 07/11/2023 11:24, juzhe.zh...@rivai.ai wrote:

Oh. Sorry maybe it's better like this:

/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 
"vect" { target { { ! vect_hw_misalign } || { vect512 } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 
"vect" { target{ ! vect512 } } } } */


The conditions are backwards; this expects vect512 machines to match 
twice. Also I think there's a space missing.


Andrew




juzhe.zh...@rivai.ai

*From:* juzhe.zh...@rivai.ai 
*Date:* 2023-11-07 19:23
*To:* ams ; gcc-patches

*CC:* jeffreyalaw ; rguenther

*Subject:* Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
Do you mean this ?

/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2
"vect" { target { { ! vect_hw_misalign } || { vect512 } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4
"vect" { xfail { ! vect512 } } } } */

Could you try again ? If it works for you, I am gonna send V2 patch
to Richi.

Thank you so much for help.

juzhe.zh...@rivai.ai

*From:* Andrew Stubbs 
*Date:* 2023-11-07 19:21
*To:* juzhe.zh...@rivai.ai ;
gcc-patches 
*CC:* jeffreyalaw ; rguenther

*Subject:* Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
On 07/11/2023 11:05, juzhe.zh...@rivai.ai wrote:
 > Could you try this ?
 >
 > /* { dg-final { scan-tree-dump-times "vectorizing stmts using
SLP" 2
 > "vect" { xfail { { ! vect_hw_misalign } || { vect512 } } } } } */
 > /* { dg-final { scan-tree-dump-times "vectorizing stmts using
SLP" 4
 > "vect" { xfail { ! vect512 } } } } */
PASS: gcc.dg/vect/pr97428.c (test for excess errors)
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
interleaving
load of size 8"
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
interleaving
store of size 16"
gcc.dg/vect/pr97428.c: pattern found 4 times
XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing
stmts using SLP" 2
PASS: gcc.dg/vect/pr97428.c scan-tree-dump-times vect
"vectorizing stmts
using SLP" 4
PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap of 6
elements"
The passes are all correct (assuming that 4 matches are a valid
number),
but if you have multiple patterns with contradictory
expectations then
you probably want to use "target" rather than "xfail" to avoid
the noise
(and invert the conditions, obviously).
Andrew
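
The difference between "xfail" and "target" on a dg-final directive can be sketched roughly as below. This is a simplified model for illustration only, not DejaGnu code; the names Kind, outcome and replay_amdgcn_log are hypothetical. With "xfail" the scan always runs and the selector only flips the expectation; with "target" the scan is not run at all when the selector is false, which is why it avoids the XFAIL noise.

```cpp
#include <cassert>
#include <string>

// How the selector attached to a dg-final directive is interpreted
// (simplified, hypothetical model).
enum class Kind { Target, Xfail };

// selector_true: whether the selector expression holds on this machine.
// check_ok:      whether the scan itself succeeded (e.g. the count matched).
std::string outcome(Kind kind, bool selector_true, bool check_ok) {
  if (kind == Kind::Target) {
    if (!selector_true)
      return "skipped";  // directive does not apply; no result line
    return check_ok ? "PASS" : "FAIL";
  }
  // Kind::Xfail: the scan always runs; the selector marks an expected failure.
  if (selector_true)
    return check_ok ? "XPASS" : "XFAIL";
  return check_ok ? "PASS" : "FAIL";
}

// Replays the amdgcn log quoted above: the pattern is found 4 times and
// vect512 holds, so the count-2 directive is an expected failure while the
// count-4 directive (xfail { ! vect512 }, selector false) passes.
void replay_amdgcn_log() {
  assert(outcome(Kind::Xfail, /*selector_true=*/true, /*check_ok=*/false) == "XFAIL");
  assert(outcome(Kind::Xfail, /*selector_true=*/false, /*check_ok=*/true) == "PASS");
}
```

Switching both directives to "target" with inverted conditions means only the directive that applies to the machine is run, so the XFAIL line disappears.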
 >

 > juzhe.zh...@rivai.ai
 >
 > *From:* Andrew Stubbs 
 > *Date:* 2023-11-07 18:59
 > *To:* juzhe.zh...@rivai.ai ;
 > gcc-patches 
 > *CC:* jeffreyalaw ; rguenther
 > 
 > *Subject:* Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
 > On 07/11/2023 10:10, juzhe.zh...@rivai.ai wrote:
 >  > So, this patch not only fixes RVV FAIL, but also fixes
GCN ?
 > Before the patch I have:
 > PASS: gcc.dg/vect/pr97428.c (test for excess errors)
 > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
interleaving
 > load of size 8"
 > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
interleaving
 > store of size 16"
 > gcc.dg/vect/pr97428.c: pattern found 4 times
 > XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect
"vectorizing
 > stmts using SLP" 2
 > PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap
of 6 elements"
 > With the patch I now get:
 > PASS: gcc.dg/vect/pr97428.c (test for excess errors)
 > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
interleaving
 > load of size 8"
 > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
interleaving
 > store of size 16"
 > gcc.dg/vect/pr97428.c: pattern found 4 times
 > XFAIL: gcc.dg/vect/pr97428.c scan-tre

Re: [PATCH] libstdc++/112351 - deal with __gthread_once failure during locale init

2023-11-07 Thread Jonathan Wakely

On Mon, 6 Nov 2023 at 12:52, Richard Biener  wrote:
>
> On Mon, 6 Nov 2023, Jonathan Wakely wrote:
>
> > On Mon, 6 Nov 2023 at 11:52, Richard Biener  wrote:
> > >
> > > The following makes the C++98 locale init path follow the way the
> > > C++11 performs initialization.  This way we deal with pthread_once
> > > failing, falling back to non-threadsafe initialization which, given we
> > > initialize from the library, should be serialized by the dynamic
> > > loader already.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?
> > > And GCC 13 branch?
> > >
> > > Thanks,
> > > Richard.
> > >
> > > PR libstdc++/112351
> > > libstdc++-v3/
> > > * src/c++98/locale.cc (locale::facet::_S_get_c_locale):
> > > Always perform non-threadsafe init when threadsafe init
> > > failed.
> > > ---
> > >  libstdc++-v3/src/c++98/locale.cc | 7 ++-
> > >  1 file changed, 2 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/libstdc++-v3/src/c++98/locale.cc 
> > > b/libstdc++-v3/src/c++98/locale.cc
> > > index d308140bab7..e9bec1db3b6 100644
> > > --- a/libstdc++-v3/src/c++98/locale.cc
> > > +++ b/libstdc++-v3/src/c++98/locale.cc
> > > @@ -216,12 +216,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >  #ifdef __GTHREADS
> > >  if (__gthread_active_p())
> > >__gthread_once(&_S_once, _S_initialize_once);
> > > -else
> > >  #endif
> > > -  {
> > > -   if (!_S_c_locale)
> > > - _S_initialize_once();
> > > -  }
> > > +if (__builtin_expect (!_S_c_locale, 0))
> > > +  _S_initialize_once();
> > >  return _S_c_locale;
> > >}
> >
> >
> > I think this has a problem, which is handled correctly in
> > src/c++11/locale_init.cc by checking _S_classic inside the
> > _S_initialize_once function.
>
> We check _S_c_locale here (it's just a pointer) instead of in

It's a __locale_struct* for glibc and int* for BSD. Hopefully it's
always something pointer-ish so that works.

> _S_initialize_once (), so I think the code is equivalent, no?
>
> > If the first call to __gthread_once does fail then _S_once will not be
> > changed. We will fall through to calling _S_initialize_once directly
> > (which is not thread-safe) and set _S_c_locale.
> >
> > The next time we call _S_initialize, __gthread_once will try to run
> > again, and because _S_once was not changed, it might call
> > _S_initialize_once() again, which writes to _S_c_locale again
> > (possibly causing a data race).
>
> Ah, yeah, so in the C++11 path the check for !_S_classic in
> locale::_S_initialize is redundant.  But good spot.

Redundant, but it avoids a call to _S_initialize_once, so probably
worth it for performance.

>
> > I don't think the slightly different code in src/c++11/locale_init.cc
> > is different in order to handle __gthread_once failing, I think it's
> > different because the effects of locale::facet::_S_initialize_once()
> > and locale::_S_initialize_once() are different. One is safe to call
> > more than once, and the other isn't.
> >
> > I don't think we need to care about __gthread_once failing at all, do
> > we? There are no error conditions for pthread_once, it always returns
> > 0 (previous POSIX revisions said it could return EINVAL for an
> > uninitialized pthread_once_t but that can't happen here as it's
> > correctly initialized in src/c++11/locale.cc). Is the concern that it
> > can fail for non-posix thread models? (I didn't check if any of them
> > can actually fail)
>
> The concern is that there are actual products out that break with the
> new I/O initialization in libstdc++ for GCC13+ because they have bugs.
> It's easy enough to work around those by the proposed patch (plus
> correction for the above issue).  I suppose the comment in
> locale::_S_initialize_once holds as well for the C++98 path.
>
> The failure mode of the product is that it overrides pthread_once
> but does nothing (not even indicate failure) when its pthread_*
> override mechanism isn't initialized yet.  With libstdc++ from GCC13
> we now use pthread_once "too early" and fail to initialize the locale
> object.
>
> Adjusted patch below.
>
> OK after another round of testing?

OK, thanks.
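
For readers following the thread, the failure mode and the agreed fix can be sketched as below. This is a minimal model under stated assumptions, not the libstdc++ code: fake_once, initialize_once, get_c_locale and the globals are hypothetical stand-ins for the overridden pthread_once, _S_initialize_once, _S_get_c_locale and _S_c_locale. The point it shows is that the guard inside the init function makes a later, successful "once" call harmless after the non-threadsafe fallback has already run.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these names come from libstdc++.
static bool once_backend_ready = false;  // models the product's override state
static bool once_done = false;           // models the once flag
static int init_calls = 0;               // counts real initializations
static int the_locale = 0;               // dummy object to point at
static int* c_locale = nullptr;          // models _S_c_locale

// Models a pthread_once override that silently does nothing (and still
// reports success) while its backend is not initialized yet.
int fake_once(void (*fn)()) {
  if (!once_backend_ready)
    return 0;  // fn not called, once flag left untouched
  if (!once_done) {
    once_done = true;
    fn();
  }
  return 0;
}

// The guard makes a second call harmless, which matters because the once
// flag was never set when the fallback path below did the initialization.
void initialize_once() {
  if (c_locale)
    return;
  ++init_calls;
  c_locale = &the_locale;
}

int* get_c_locale() {
  fake_once(initialize_once);
  if (!c_locale)        // the once call did not run the init: fall back
    initialize_once();  // (not thread-safe, as noted in the thread)
  assert(c_locale != nullptr);
  return c_locale;
}
```

In this model the first get_c_locale() call initializes through the fallback; once the backend becomes ready, the next call does reach initialize_once through fake_once, but the guard returns early and nothing is written twice.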

>
> Thanks,
> Richard.
>
>
> From 4e3fa2f4426a5a10d189587b63e4d7298c347b01 Mon Sep 17 00:00:00 2001
> From: Richard Biener 
> Date: Mon, 6 Nov 2023 11:31:40 +0100
> Subject: [PATCH] libstdc++/112351 - deal with __gthread_once failure during
>  locale init
> To: gcc-patches@gcc.gnu.org
>
> The following makes the C++98 locale init path follow the way the
> C++11 performs initialization.  This way we deal with pthread_once
> failing, falling back to non-threadsafe initialization which, given we
> initialize from the library, should be serialized by the dynamic
> loader already.
>
> PR libstdc++/112351
> libstdc++-v3/
> * src/c++98/locale.cc (locale::facet::_S_initialize_once):
> Check whether _S_c_locale is already initialized.
> (locale::facet::_S_get_c_locale): Alw

[PATCH] RISC-V: Add RISC-V into vect_cmdline_needed

2023-11-07 Thread Juzhe-Zhong

Like all other targets, we add RISC-V into vect_cmdline_needed.

This patch fixes following FAILs:

FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect "vectorized 0 
loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-11c.c scan-tree-dump-times vect "vectorized 0 
loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "Alignment of 
access forced using peeling" 1
FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect "Alignment of 
access forced using peeling" 1

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add RISC-V.

---
 gcc/testsuite/lib/target-supports.exp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 17a87db0007..285817ef16e 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4036,7 +4036,8 @@ proc check_effective_target_vect_cmdline_needed { } {
 || ([istarget sparc*-*-*] && [check_effective_target_sparc_vis])
 || ([istarget arm*-*-*] && [check_effective_target_arm_neon])
 || [istarget aarch64*-*-*]
- || [istarget amdgcn*-*-*]} {
+|| [istarget amdgcn*-*-*]
+|| [istarget riscv*-*-*]} {
return 0
} else {
return 1
-- 
2.36.3

Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread juzhe.zh...@rivai.ai

Sorry I made a mistake here.

Does it work for you ?

/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target { { vect_hw_misalign } && { ! vect512 } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
target { vect512 } } } } */

Tested on RVV is OK.


juzhe.zh...@rivai.ai
 
From: Andrew Stubbs
Date: 2023-11-07 19:44
To: juzhe.zh...@rivai.ai; gcc-patches
CC: jeffreyalaw; rguenther
Subject: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
On 07/11/2023 11:24, juzhe.zh...@rivai.ai wrote:
> Oh. Sorry maybe it's better like this:
> 
> /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 
> "vect" { target { { ! vect_hw_misalign } || { vect512 } } } } } */
> /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 
> "vect" { target{ ! vect512 } } } } */
 
The conditions are backwards; this expects vect512 machines to match 
twice. Also I think there's a space missing.
 
Andrew
 
> 
> 
> juzhe.zh...@rivai.ai
> 
> *From:* juzhe.zh...@rivai.ai 
> *Date:* 2023-11-07 19:23
> *To:* ams ; gcc-patches
> 
> *CC:* jeffreyalaw ; rguenther
> 
> *Subject:* Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
> Do you mean this ?
> 
> /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2
> "vect" { target { { ! vect_hw_misalign } || { vect512 } } } } } */
> /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4
> "vect" { xfail { ! vect512 } } } } */
> 
> Could you try again ? If it works for you, I am gonna send V2 patch
> to Richi.
> 
> Thank you so much for help.
> 
> juzhe.zh...@rivai.ai
> 
> *From:* Andrew Stubbs 
> *Date:* 2023-11-07 19:21
> *To:* juzhe.zh...@rivai.ai ;
> gcc-patches 
> *CC:* jeffreyalaw ; rguenther
> 
> *Subject:* Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
> On 07/11/2023 11:05, juzhe.zh...@rivai.ai wrote:
>  > Could you try this ?
>  >
>  > /* { dg-final { scan-tree-dump-times "vectorizing stmts using
> SLP" 2
>  > "vect" { xfail { { ! vect_hw_misalign } || { vect512 } } } } } */
>  > /* { dg-final { scan-tree-dump-times "vectorizing stmts using
> SLP" 4
>  > "vect" { xfail { ! vect512 } } } } */
> PASS: gcc.dg/vect/pr97428.c (test for excess errors)
> PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
> interleaving
> load of size 8"
> PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
> interleaving
> store of size 16"
> gcc.dg/vect/pr97428.c: pattern found 4 times
> XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing
> stmts using SLP" 2
> PASS: gcc.dg/vect/pr97428.c scan-tree-dump-times vect
> "vectorizing stmts
> using SLP" 4
> PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap of 6
> elements"
> The passes are all correct (assuming that 4 matches are a valid
> number),
> but if you have multiple patterns with contradictory
> expectations then
> you probably want to use "target" rather than "xfail" to avoid
> the noise
> (and invert the conditions, obviously).
> Andrew
>  >
> 
> 
>  > juzhe.zh...@rivai.ai
>  >
>  > *From:* Andrew Stubbs 
>  > *Date:* 2023-11-07 18:59
>  > *To:* juzhe.zh...@rivai.ai ;
>  > gcc-patches 
>  > *CC:* jeffreyalaw ; rguenther
>  > 
>  > *Subject:* Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
>  > On 07/11/2023 10:10, juzhe.zh...@rivai.ai wrote:
>  >  > So, this patch not only fixes RVV FAIL, but also fixes
> GCN ?
>  > Before the patch I have:
>  > PASS: gcc.dg/vect/pr97428.c (test for excess errors)
>  > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
> interleaving
>  > load of size 8"
>  > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
> interleaving
>  > store of size 16"
>  > gcc.dg/vect/pr97428.c: pattern found 4 ti

Re: [PATCH] RISC-V: Add RISC-V into vect_cmdline_needed

2023-11-07 Thread Robin Dapp

Looks OK but I don't really get the test (e.g. gen-vect-26.c).  It is
only run if target vect_cmdline_needed, otherwise compiled?  Why does
that have an impact on the scan?  Looks weird but well...

Regards
 Robin

Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread Andrew Stubbs


On 07/11/2023 12:03, juzhe.zh...@rivai.ai wrote:

Sorry I made a mistake here.

Does it work for you ?

/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 
"vect" { target { { vect_hw_misalign } && { ! vect512 } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 
"vect" { target { vect512 } } } } */


Tested on RVV is OK.


5 PASS on amdgcn also.

Andrew



juzhe.zh...@rivai.ai

*From:* Andrew Stubbs 
*Date:* 2023-11-07 19:44
*To:* juzhe.zh...@rivai.ai ;
gcc-patches 
*CC:* jeffreyalaw ; rguenther

*Subject:* Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
On 07/11/2023 11:24, juzhe.zh...@rivai.ai wrote:
 > Oh. Sorry maybe it's better like this:
 >
 > /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2
 > "vect" { target { { ! vect_hw_misalign } || { vect512 } } } } } */
 > /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4
 > "vect" { target{ ! vect512 } } } } */
The conditions are backwards; this expects vect512 machines to match
twice. Also I think there's a space missing.
Andrew
 >
 >

 > juzhe.zh...@rivai.ai
 >
 > *From:* juzhe.zh...@rivai.ai 
 > *Date:* 2023-11-07 19:23
 > *To:* ams ; gcc-patches
 > 
 > *CC:* jeffreyalaw ; rguenther
 > 
 > *Subject:* Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
 > Do you mean this ?
 >
 > /* { dg-final { scan-tree-dump-times "vectorizing stmts using
SLP" 2
 > "vect" { target { { ! vect_hw_misalign } || { vect512 } } } }
} */
 > /* { dg-final { scan-tree-dump-times "vectorizing stmts using
SLP" 4
 > "vect" { xfail { ! vect512 } } } } */
 >
 > Could you try again ? If it works for you, I am gonna send V2
patch
 > to Richi.
 >
 > Thank you so much for help.
 >


 > juzhe.zh...@rivai.ai
 >
 > *From:* Andrew Stubbs 
 > *Date:* 2023-11-07 19:21
 > *To:* juzhe.zh...@rivai.ai ;
 > gcc-patches 
 > *CC:* jeffreyalaw ; rguenther
 > 
 > *Subject:* Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
 > On 07/11/2023 11:05, juzhe.zh...@rivai.ai wrote:
 >  > Could you try this ?
 >  >
 >  > /* { dg-final { scan-tree-dump-times "vectorizing
stmts using
 > SLP" 2
 >  > "vect" { xfail { { ! vect_hw_misalign } || { vect512 }
} } } } */
 >  > /* { dg-final { scan-tree-dump-times "vectorizing
stmts using
 > SLP" 4
 >  > "vect" { xfail { ! vect512 } } } } */
 > PASS: gcc.dg/vect/pr97428.c (test for excess errors)
 > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
 > interleaving
 > load of size 8"
 > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
 > interleaving
 > store of size 16"
 > gcc.dg/vect/pr97428.c: pattern found 4 times
 > XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect
"vectorizing
 > stmts using SLP" 2
 > PASS: gcc.dg/vect/pr97428.c scan-tree-dump-times vect
 > "vectorizing stmts
 > using SLP" 4
 > PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap of 6
 > elements"
 > The passes are all correct (assuming that 4 matches are a
valid
 > number),
 > but if you have multiple patterns with contradictory
 > expectations then
 > you probably want to use "target" rather than "xfail" to
avoid
 > the noise
 > (and invert the conditions, obviously).
 > Andrew
 >  >
 >


 >  > juzhe.zh...@rivai.ai
 >  >
 >  > *From:* Andrew Stubbs 
 >  > *Date:* 2023-11-07 18:59
 >  > *To:* juzhe.zh...@rivai.ai
;
 >  > gcc-patches

Re: Re: [PATCH] RISC-V: Add RISC-V into vect_cmdline_needed

2023-11-07 Thread juzhe.zh...@rivai.ai

A command-line option is needed to enable SIMD auto-vectorization (VLS modes in RVV).
If we don't add RISC-V to vect_cmdline_needed, VLS-mode auto-vectorization is 
enabled by default.
So adding it disables VLS-mode vectorization, which fixes the FAILs, as on 
other targets.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-11-07 20:04
To: Juzhe-Zhong; gcc-patches
CC: rdapp.gcc; kito.cheng; kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Add RISC-V into vect_cmdline_needed
Looks OK but I don't really get the test (e.g. gen-vect-26.c).  It is
only run if target vect_cmdline_needed, otherwise compiled?  Why does
that have an impact on the scan?  Looks weird but well...
 
Regards
Robin

Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread juzhe.zh...@rivai.ai

Thanks a lot ! I will send V2 for Richi to review.




juzhe.zh...@rivai.ai
 
From: Andrew Stubbs
Date: 2023-11-07 20:05
To: juzhe.zh...@rivai.ai; gcc-patches
CC: jeffreyalaw; rguenther
Subject: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
On 07/11/2023 12:03, juzhe.zh...@rivai.ai wrote:
> Sorry I made a mistake here.
> 
> Does it work for you ?
> 
> /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 
> "vect" { target { { vect_hw_misalign } && { ! vect512 } } } } } */
> /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 
> "vect" { target { vect512 } } } } */
> 
> Tested on RVV is OK.
 
5 PASS on amdgcn also.
 
Andrew
 
> 
> juzhe.zh...@rivai.ai
> 
> *From:* Andrew Stubbs 
> *Date:* 2023-11-07 19:44
> *To:* juzhe.zh...@rivai.ai ;
> gcc-patches 
> *CC:* jeffreyalaw ; rguenther
> 
> *Subject:* Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
> On 07/11/2023 11:24, juzhe.zh...@rivai.ai wrote:
>  > Oh. Sorry maybe it's better like this:
>  >
>  > /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2
>  > "vect" { target { { ! vect_hw_misalign } || { vect512 } } } } } */
>  > /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4
>  > "vect" { target{ ! vect512 } } } } */
> The conditions are backwards; this expects vect512 machines to match
> twice. Also I think there's a space missing.
> Andrew
>  >
>  >
> 
>  > juzhe.zh...@rivai.ai
>  >
>  > *From:* juzhe.zh...@rivai.ai 
>  > *Date:* 2023-11-07 19:23
>  > *To:* ams ; gcc-patches
>  > 
>  > *CC:* jeffreyalaw ; rguenther
>  > 
>  > *Subject:* Re: Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
>  > Do you mean this ?
>  >
>  > /* { dg-final { scan-tree-dump-times "vectorizing stmts using
> SLP" 2
>  > "vect" { target { { ! vect_hw_misalign } || { vect512 } } } }
> } */
>  > /* { dg-final { scan-tree-dump-times "vectorizing stmts using
> SLP" 4
>  > "vect" { xfail { ! vect512 } } } } */
>  >
>  > Could you try again ? If it works for you, I am gonna send V2
> patch
>  > to Richi.
>  >
>  > Thank you so much for help.
>  >
> 
>  > juzhe.zh...@rivai.ai
>  >
>  > *From:* Andrew Stubbs 
>  > *Date:* 2023-11-07 19:21
>  > *To:* juzhe.zh...@rivai.ai ;
>  > gcc-patches 
>  > *CC:* jeffreyalaw ; rguenther
>  > 
>  > *Subject:* Re: [PATCH] test: Fix FAIL of pr97428.c for RVV
>  > On 07/11/2023 11:05, juzhe.zh...@rivai.ai wrote:
>  >  > Could you try this ?
>  >  >
>  >  > /* { dg-final { scan-tree-dump-times "vectorizing
> stmts using
>  > SLP" 2
>  >  > "vect" { xfail { { ! vect_hw_misalign } || { vect512 }
> } } } } */
>  >  > /* { dg-final { scan-tree-dump-times "vectorizing
> stmts using
>  > SLP" 4
>  >  > "vect" { xfail { ! vect512 } } } } */
>  > PASS: gcc.dg/vect/pr97428.c (test for excess errors)
>  > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
>  > interleaving
>  > load of size 8"
>  > PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected
>  > interleaving
>  > store of size 16"
>  > gcc.dg/vect/pr97428.c: pattern found 4 times
>  > XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect
> "vectorizing
>  > stmts using SLP" 2
>  > PASS: gcc.dg/vect/pr97428.c scan-tree-dump-times vect
>  > "vectorizing stmts
>  > using SLP" 4
>  > PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap of 6
>  > elements"
>  > The passes are all correct (assuming that 4 matches are a
> valid
>  > number),
>  > but if you have multiple patterns with contradictory
>  > expectations then
>  > you probably want to use "target" rather than "xfail" to
> avoid
>  > the noise
>  > (and invert the conditions, obviously).
>  >

Re: [PATCH] RISC-V: Add RISC-V into vect_cmdline_needed

2023-11-07 Thread Robin Dapp

> It need command line to enable SIMD auto-vectorization (VLS mode in RVV).
> It will enable VLS modes auto-vectorization by default if we didn't add RISCV 
> into vect_cmdline.
> So adding it to disable VLS mode vectorization which will fix the FAILs like 
> other targets.

Ah so it's about SIMD despite the name, I see.  Still weird
but adding riscv then makes sense.  So OK.  The test is probably
just very old.

Regards
 Robin

Re: Re: [PATCH] RISC-V: Add RISC-V into vect_cmdline_needed

2023-11-07 Thread juzhe.zh...@rivai.ai

Thanks. Committed.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-11-07 20:10
To: juzhe.zh...@rivai.ai; gcc-patches
CC: rdapp.gcc; kito.cheng; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Add RISC-V into vect_cmdline_needed
> It need command line to enable SIMD auto-vectorization (VLS mode in RVV).
> It will enable VLS modes auto-vectorization by default if we didn't add RISCV 
> into vect_cmdline.
> So adding it to disable VLS mode vectorization which will fix the FAILs like 
> other targets.
 
Ah so it's about SIMD despite the name, I see.  Still weird
but adding riscv then makes sense.  So OK.  The test is probably
just very old.
 
Regards
Robin

[PATCH V2] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread Juzhe-Zhong

This test shows "vectorizing stmts using SLP" 4 times instead of 2 for RVV.
The reason is that RVV has 512-bit vectors.
Here is a comparison between RVV and ARM SVE:
https://godbolt.org/z/xc5KE5rPs

GCN is confirmed to also match SLP 4 times. This patch passes on both GCN and RVV.

Ok for trunk ?

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr97428.c: Adapt for RVV and GCN.

---
 gcc/testsuite/gcc.dg/vect/pr97428.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr97428.c 
b/gcc/testsuite/gcc.dg/vect/pr97428.c
index ad6416096aa..f77adb1be97 100644
--- a/gcc/testsuite/gcc.dg/vect/pr97428.c
+++ b/gcc/testsuite/gcc.dg/vect/pr97428.c
@@ -43,5 +43,6 @@ void foo_i2(dcmlx4_t dst[], const dcmlx_t src[], int n)
 /* { dg-final { scan-tree-dump "Detected interleaving store of size 16" "vect" 
} } */
 /* We're not able to peel & apply re-aligning to make accesses well-aligned 
for !vect_hw_misalign,
but we could by peeling the stores for alignment and applying re-aligning 
loads.  */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
xfail { ! vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target { { vect_hw_misalign } && { ! vect512 } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
target { vect512 } } } } */
 /* { dg-final { scan-tree-dump-not "gap of 6 elements" "vect" } } */
-- 
2.36.3
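
As a reading aid, the two new selectors can be viewed as a small decision function. The sketch below is only a model of the directives in this patch; the struct Target and the function expected_slp_matches are hypothetical names, and the two booleans stand for the vect_hw_misalign and vect512 effective-target checks. It shows that the two selectors can never both hold, so at most one count is expected on any machine.

```cpp
#include <cassert>

// Hypothetical description of a machine in terms of the two effective
// targets used by the selectors.
struct Target {
  bool hw_misalign;  // stands for vect_hw_misalign
  bool vect512;      // stands for vect512
};

// Expected number of "vectorizing stmts using SLP" matches, or 0 when
// neither directive applies and no count is checked.
int expected_slp_matches(Target t) {
  bool want2 = t.hw_misalign && !t.vect512;  // first new directive
  bool want4 = t.vect512;                    // second new directive
  assert(!(want2 && want4));  // the selectors are mutually exclusive
  if (want2)
    return 2;
  if (want4)
    return 4;
  return 0;
}
```

One consequence visible in the model: a machine with neither property gets no count check at all, whereas the directive being removed ran the check there and marked it xfail.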

Re: [PATCH] test: Fix FAIL of pr65518.c for RVV[PR112420]

2023-11-07 Thread Richard Biener

On Tue, 7 Nov 2023, Juzhe-Zhong wrote:

>   PR target/112420

OK

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/pr65518.c: Fix check for RVV.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/pr65518.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/pr65518.c 
> b/gcc/testsuite/gcc.dg/vect/pr65518.c
> index 3e5b986183c..189a65534f6 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr65518.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr65518.c
> @@ -49,4 +49,6 @@ int main ()
> sub-optimal and causes memory explosion (even though the cost model
> should reject that in the end).  */
>  
> -/* { dg-final { scan-tree-dump-times "vectorized 0 loops in function" 2 
> "vect" } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 0 loops in function" 2 
> "vect" { target {! riscv*-*-* } } } } */
> +/* We end up using gathers for the strided load on RISC-V which would be OK. 
>  */
> +/* { dg-final { scan-tree-dump "using gather/scatter for strided/grouped 
> access" "vect" { target { riscv*-*-* } } } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [V2 PATCH] Handle bitop with INTEGER_CST in analyze_and_compute_bitop_with_inv_effect.

2023-11-07 Thread Hongtao Liu

On Tue, Nov 7, 2023 at 4:10 PM Richard Biener
 wrote:
>
> On Tue, Nov 7, 2023 at 7:08 AM liuhongt  wrote:
> >
> > analyze_and_compute_bitop_with_inv_effect assumes the first operand is
> > loop invariant which is not the case when it's INTEGER_CST.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
>
> So this addresses a missed optimization, right?  It seems to me that
> even with two SSA names we are only "lucky" when rhs1 is the invariant
> one.  So instead of swapping this way I'd do
Yes, it's a missed optimization.
And I think expr_invariant_in_loop_p (loop, match_op[1]) should be
enough: if match_op[1] is a loop invariant, it must be false for the
below conditions (there couldn't be any header_phi from its
definition).

>
>  unsigned i;
>  for (i = 0; i < 2; ++i)
>if (TREE_CODE (match_op[i]) == SSA_NAME
>&& ...)
> break; /* found! */
>
>   if (i == 2)
> return NULL_TREE;
>   if (i == 0)
> std::swap (match_op[0], match_op[1]);
>
> to also handle a "swapped" pair of SSA names?
>
> > gcc/ChangeLog:
> >
> > PR tree-optimization/105735
> > PR tree-optimization/111972
> > * tree-scalar-evolution.cc
> > (analyze_and_compute_bitop_with_inv_effect): Handle bitop with
> > INTEGER_CST.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/pr105735-3.c: New test.
> > ---
> >  gcc/testsuite/gcc.target/i386/pr105735-3.c | 87 ++
> >  gcc/tree-scalar-evolution.cc   |  3 +
> >  2 files changed, 90 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-3.c
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-3.c 
> > b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > new file mode 100644
> > index 000..9e268a1a997
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr105735-3.c
> > @@ -0,0 +1,87 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } 
> > } */
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo (unsigned int tmp)
> > +{
> > +  for (int bit = 0; bit < 64; bit++)
> > +tmp &= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo1 (unsigned int tmp)
> > +{
> > +  for (int bit = 63; bit >= 0; bit -=3)
> > +tmp &= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo2 (unsigned int tmp)
> > +{
> > +  for (int bit = 0; bit < 64; bit++)
> > +tmp |= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo3 (unsigned int tmp)
> > +{
> > +  for (int bit = 63; bit >= 0; bit -=3)
> > +tmp |= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo4 (unsigned int tmp)
> > +{
> > +  for (int bit = 0; bit < 64; bit++)
> > +tmp ^= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +foo5 (unsigned int tmp)
> > +{
> > +  for (int bit = 0; bit < 63; bit++)
> > +tmp ^= 11304;
> > +  return tmp;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +f (unsigned int tmp, int bit)
> > +{
> > +  unsigned int res = tmp;
> > +  for (int i = 0; i < bit; i++)
> > +res &= 11304;
> > +  return res;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +f1 (unsigned int tmp, int bit)
> > +{
> > +  unsigned int res = tmp;
> > +  for (int i = 0; i < bit; i++)
> > +res |= 11304;
> > +  return res;
> > +}
> > +
> > +unsigned int
> > +__attribute__((noipa))
> > +f2 (unsigned int tmp, int bit)
> > +{
> > +  unsigned int res = tmp;
> > +  for (int i = 0; i < bit; i++)
> > +res ^= 11304;
> > +  return res;
> > +}
> > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > index 70b17c5bca1..f61277c32df 100644
> > --- a/gcc/tree-scalar-evolution.cc
> > +++ b/gcc/tree-scalar-evolution.cc
> > @@ -3689,6 +3689,9 @@ analyze_and_compute_bitop_with_inv_effect (class 
> > loop* loop, tree phidef,
> >match_op[0] = gimple_assign_rhs1 (def);
> >match_op[1] = gimple_assign_rhs2 (def);
> >
> > +  if (expr_invariant_in_loop_p (loop, match_op[1]))
> > +std::swap (match_op[0], match_op[1]);
> > +
> >if (TREE_CODE (match_op[1]) != SSA_NAME
> >|| !expr_invariant_in_loop_p (loop, match_op[0])
> > >|| !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT 
> > (match_op[1])))
> > --
> > 2.31.1
> >



-- 
BR,
Hongtao
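
The loops in the new test all repeat "tmp OP= constant", and the arithmetic behind replacing them with a final value can be sketched as follows. This is a minimal illustration, not the code in tree-scalar-evolution.cc; run_loop, final_value and check_closed_form are hypothetical names. AND and OR with an invariant operand are idempotent, so one application equals any positive number of applications, while XOR depends only on whether the iteration count is odd.

```cpp
#include <cassert>
#include <cstdint>

using std::uint32_t;

enum class BitOp { And, Or, Xor };

// One iteration of "tmp OP= inv".
static uint32_t step(BitOp op, uint32_t tmp, uint32_t inv) {
  switch (op) {
    case BitOp::And: return tmp & inv;
    case BitOp::Or:  return tmp | inv;
    case BitOp::Xor: return tmp ^ inv;
  }
  return tmp;  // not reached; keeps compilers quiet
}

// Reference semantics: actually execute the loop niters times.
uint32_t run_loop(BitOp op, uint32_t tmp, uint32_t inv, unsigned niters) {
  for (unsigned i = 0; i < niters; ++i)
    tmp = step(op, tmp, inv);
  return tmp;
}

// Closed form: the value after niters iterations without running the loop.
uint32_t final_value(BitOp op, uint32_t tmp, uint32_t inv, unsigned niters) {
  if (niters == 0)
    return tmp;                            // loop body never executes
  if (op == BitOp::Xor)
    return (niters % 2) ? tmp ^ inv : tmp; // pairs of XORs cancel
  return step(op, tmp, inv);               // AND/OR: idempotent
}

// Compare the closed form against the loop for 0..64 iterations.
void check_closed_form(uint32_t tmp, uint32_t inv) {
  const BitOp ops[] = {BitOp::And, BitOp::Or, BitOp::Xor};
  for (BitOp op : ops)
    for (unsigned n = 0; n <= 64; ++n)
      assert(run_loop(op, tmp, inv, n) == final_value(op, tmp, inv, n));
}
```

For example, foo4 above XORs 64 times, so the input comes back unchanged, while foo5 XORs 63 times and ends up equal to a single XOR with 11304.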

RE: [PATCH] test: Fix FAIL of pr65518.c for RVV[PR112420]

2023-11-07 Thread Li, Pan2

Committed, thanks Richard.

Pan

-Original Message-
From: Richard Biener  
Sent: Tuesday, November 7, 2023 8:51 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; jeffreya...@gmail.com
Subject: Re: [PATCH] test: Fix FAIL of pr65518.c for RVV[PR112420]

On Tue, 7 Nov 2023, Juzhe-Zhong wrote:

>   PR target/112420

OK

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/pr65518.c: Fix check for RVV.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/pr65518.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/pr65518.c 
> b/gcc/testsuite/gcc.dg/vect/pr65518.c
> index 3e5b986183c..189a65534f6 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr65518.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr65518.c
> @@ -49,4 +49,6 @@ int main ()
> sub-optimal and causes memory explosion (even though the cost model
> should reject that in the end).  */
>  
> -/* { dg-final { scan-tree-dump-times "vectorized 0 loops in function" 2 
> "vect" } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 0 loops in function" 2 
> "vect" { target {! riscv*-*-* } } } } */
> +/* We end up using gathers for the strided load on RISC-V which would be OK. 
>  */
> +/* { dg-final { scan-tree-dump "using gather/scatter for strided/grouped 
> access" "vect" { target { riscv*-*-* } } } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] test: Fix FAIL of bb-slp-cond-1.c for RVV

2023-11-07 Thread Richard Biener

On Tue, 7 Nov 2023, Juzhe-Zhong wrote:

> This patch fixes:
> FAIL: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "loop vectorized" 1
> FAIL: gcc.dg/vect/bb-slp-cond-1.c scan-tree-dump-times vect "loop vectorized" 
> 1
> 
> For RVV, "loop vectorized" appears 2 times instead of 1. Because:
> optimized: loop vectorized using 16 byte vectors
> optimized: loop vectorized using 8 byte vectors
> 
> As long as targets have both 64-bit and 128-bit vectors, it will occur 2 times.
> Two targets are in this situation: AMDGCN and RVV.
> 
> Replace the amdgcn target check with vect64 && vect128 to make the test more general and 
> easier to maintain.

I think we usually add --param vect-epilogues-nomask=0 instead to avoid
epilogue vectorization.  I wonder why the test is called bb-slp-*
though ...

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/bb-slp-cond-1.c: Fix FAIL.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
> index c8024429e9c..7efb91725df 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
> @@ -47,6 +47,6 @@ int main ()
>  }
>  
>  /* { dg-final { scan-tree-dump {(no need for alias check [^\n]* when VF is 
> 1|no alias between [^\n]* when [^\n]* is outside \(-16, 16\))} "vect" { 
> target vect_element_align } } } */
> -/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target { 
> vect_element_align && { ! amdgcn-*-* } } } } } */
> -/* { dg-final { scan-tree-dump-times "loop vectorized" 2 "vect" { target 
> amdgcn-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target { 
> vect_element_align && { ! { vect64 && vect128 } } } } } } */
> +/* { dg-final { scan-tree-dump-times "loop vectorized" 2 "vect" { target { 
> vect64 && vect128 } } } } */
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] test: Fix FAIL of vect-sdiv-pow2-1.c for RVV

2023-11-07 Thread Richard Biener

On Tue, 7 Nov 2023, Juzhe-Zhong wrote:

> RVV doesn't explicitly enable the DIV_POW2 optab, but we can still vectorize it.
> We should check for pattern recognition instead of checking for the explicit pattern.

But I see

proc check_effective_target_vect_sdiv_pow2_si {} {
    return [expr { ([istarget aarch64*-*-*]
                    && [check_effective_target_aarch64_sve])
                   || ([istarget riscv*-*-*]
                       && [check_effective_target_riscv_v]) }]
}

so if you don't have sdiv_pow2_si then please don't advertise it.

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-sdiv-pow2-1.c: Fix dump check.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c 
> b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> index 49ecbe216f2..8056c2a6748 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> @@ -79,5 +79,5 @@ main (void)
>return 0;
>  }
>  
> -/* { dg-final { scan-tree-dump {\.DIV_POW2} "vect" { target 
> vect_sdiv_pow2_si } } } */
> +/* { dg-final { scan-tree-dump "vect_recog_divmod_pattern: detected" "vect" 
> } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loop" 18 "vect" { target 
> vect_sdiv_pow2_si } } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] test: Fix bb-slp-33.c for RVV

2023-11-07 Thread Richard Biener

On Tue, 7 Nov 2023, Juzhe-Zhong wrote:

> As https://godbolt.org/z/hPsqahEa5 shows, RVV fails the dump check because
> "vectorizing stmts using SLP" appears 3 times instead of 2.
> 
> The root cause is this code in main:
> 
>   if (a[0] != 1
>   || a[1] != 2
>   || a[2] != 3
>   || a[3] != 4
>   || a[4] != 7
>   || a[5] != 0
>   || a[6] != 0
>   || a[7] != 0
>   || a[8] != 0)
> abort ();
> 
> is vectorized. So add -fno-tree-vectorize to main to avoid confusing the check.

Uh, please don't add optimize attributes.  If you see this vectorized
(as reduction?) then please instead rewrite the condition as

 if (a[0] != 1)
   abort ();
 __asm__ volatile ("");
 if (a[1] != 2)
   abort ();
...
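
Spelled out in full, the rewritten check might look like the sketch below.
The expected values are the ones from the quoted condition; the helper names
fill_expected and check_result are hypothetical, and fill_expected is only a
stand-in for test(), whose body is not shown in this thread. The intent, as I
read the suggestion, is that each comparison becomes its own statement, with
an empty volatile asm in between so the checks are not merged into one
vectorized comparison.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for test(): stores the values the check expects.  */
void
fill_expected (int *a)
{
  a[0] = 1; a[1] = 2; a[2] = 3; a[3] = 4; a[4] = 7;
  a[5] = 0; a[6] = 0; a[7] = 0; a[8] = 0;
}

/* One comparison per statement, separated by empty volatile asm
   statements.  Aborts on the first mismatch, returns 1 otherwise.  */
int
check_result (const int *a)
{
  if (a[0] != 1)
    abort ();
  __asm__ volatile ("");
  if (a[1] != 2)
    abort ();
  __asm__ volatile ("");
  if (a[2] != 3)
    abort ();
  __asm__ volatile ("");
  if (a[3] != 4)
    abort ();
  __asm__ volatile ("");
  if (a[4] != 7)
    abort ();
  __asm__ volatile ("");
  if (a[5] != 0)
    abort ();
  __asm__ volatile ("");
  if (a[6] != 0)
    abort ();
  __asm__ volatile ("");
  if (a[7] != 0)
    abort ();
  __asm__ volatile ("");
  if (a[8] != 0)
    abort ();
  return 1;
}
```

This keeps main free of optimize attributes, so the rest of the test is still
compiled with the flags the testsuite passes.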

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/bb-slp-33.c: Add -fno-tree-vectorize to main.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-33.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-33.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-33.c
> index bbb13ef798e..f44cbdcfbcf 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-33.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-33.c
> @@ -17,7 +17,8 @@ test(int *__restrict__ a, int *__restrict__ b)
>a[8] = 0;
>  }
>  
> -int main()
> +int __attribute__((optimize(("-fno-tree-vectorize"))))
> +main()
>  {
>int a[9];
>int b[4];
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH V2] test: Fix FAIL of pr97428.c for RVV

2023-11-07 Thread Richard Biener

On Tue, 7 Nov 2023, Juzhe-Zhong wrote:

> This test shows "vectorizing stmts using SLP" 4 times instead of 2 for RVV.
> The reason is that RVV has 512-bit vectors.
> Here is a comparison between RVV and ARM SVE:
> https://godbolt.org/z/xc5KE5rPs
> 
> I confirmed that GCN also matches it 4 times. This patch passes on both GCN and RVV.
> 
> Ok for trunk ?

Does --param vect-epilogues-nomask=0 help?

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/pr97428.c: Adapt for RVV and GCN.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/pr97428.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/pr97428.c 
> b/gcc/testsuite/gcc.dg/vect/pr97428.c
> index ad6416096aa..f77adb1be97 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr97428.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr97428.c
> @@ -43,5 +43,6 @@ void foo_i2(dcmlx4_t dst[], const dcmlx_t src[], int n)
>  /* { dg-final { scan-tree-dump "Detected interleaving store of size 16" 
> "vect" } } */
>  /* We're not able to peel & apply re-aligning to make accesses well-aligned 
> for !vect_hw_misalign,
> but we could by peeling the stores for alignment and applying re-aligning 
> loads.  */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { xfail { ! vect_hw_misalign } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target { { vect_hw_misalign } && { ! vect512 } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> { target { vect512 } } } } */
>  /* { dg-final { scan-tree-dump-not "gap of 6 elements" "vect" } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
