[PATCH] testsuite/105122 - adjust testcases after memcpy folding changes

2022-04-04 Thread Richard Biener via Gcc-patches
After r12-7931 we again honor MOVE_MAX when folding memcpy to
a load/store pair.  On i?86-*-* without SSE this now exposes the
change done in r12-2666-g29f0e955c97da0 which adjusts MOVE_MAX
from 16 to 4 on those targets.  This makes it necessary to adjust
testcases that assume that we transform memcpy to load/store pairs
on GIMPLE for sizes larger than or equal to 8.
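
For readers unfamiliar with the fold, here is a minimal source-level sketch
of what "memcpy to a load/store pair" amounts to.  The function names are
hypothetical and not taken from the testcases.  Whether GCC actually
performs the fold depends on MOVE_MAX, so with MOVE_MAX at 4 an 8-byte
copy like this one would presumably stay a memcpy call on GIMPLE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy 8 bytes with a library call.  This is the form the testcases
   contain; GCC may fold it into one 8-byte load plus one 8-byte store
   when MOVE_MAX permits.  */
void
copy8_memcpy (char *dst, const char *src)
{
  assert (dst && src);
  memcpy (dst, src, 8);
}

/* Hand-written equivalent of the folded form (illustrative only): one
   64-bit load followed by one 64-bit store.  Going through memcpy into
   a temporary keeps the accesses well defined for unaligned buffers.  */
void
copy8_load_store (char *dst, const char *src)
{
  uint64_t tmp;
  assert (dst && src);
  memcpy (&tmp, src, sizeof tmp);   /* the load */
  memcpy (dst, &tmp, sizeof tmp);   /* the store */
}

/* Return 1 when both variants produce identical bytes.  */
int
copies_agree (void)
{
  const char src[8] = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h' };
  char d1[8] = { 0 };
  char d2[8] = { 0 };

  copy8_memcpy (d1, src);
  copy8_load_store (d2, src);
  return memcmp (d1, d2, sizeof d1) == 0;
}
```

Both variants are observably equivalent; the testcases only differ in
which form they expect to see in the optimized dump.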

Tested on x86_64-unknown-linux-gnu with -m32 -mno-sse.

OK?

2022-04-04  Richard Biener  

* gcc.dg/memcpy-6.c: Adjust.
* gcc.dg/strlenopt-73.c: Likewise.
* gcc.dg/strlenopt-80.c: Likewise.
---
 gcc/testsuite/gcc.dg/memcpy-6.c | 3 ++-
 gcc/testsuite/gcc.dg/strlenopt-73.c | 2 +-
 gcc/testsuite/gcc.dg/strlenopt-80.c | 3 ++-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/memcpy-6.c b/gcc/testsuite/gcc.dg/memcpy-6.c
index 7ff735e94d1..d4df03903c3 100644
--- a/gcc/testsuite/gcc.dg/memcpy-6.c
+++ b/gcc/testsuite/gcc.dg/memcpy-6.c
@@ -6,7 +6,8 @@
of targets where it's known to pass (see PR testsuite/83483).
{ dg-do compile }
{ dg-options "-O0 -Wrestrict -fdump-tree-optimized" }
-   { dg-skip-if "skip non-x86 targets" { ! { i?86-*-* x86_64-*-* } } }  */
+   { dg-skip-if "skip non-x86 targets" { ! { i?86-*-* x86_64-*-* } } }
+   { dg-additional-options "-msse" { target i?86-*-* x86_64-*-* } } */
 
 char a[32];
 
diff --git a/gcc/testsuite/gcc.dg/strlenopt-73.c b/gcc/testsuite/gcc.dg/strlenopt-73.c
index 170b66a21b0..6e15303dc3c 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-73.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-73.c
@@ -69,7 +69,7 @@ void test_copy_cond_equal_length (void)
   T ( 0 ==, 33,  1, (i0 ? a32 : b32) + 32);
 }
 
-#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) \
+#if (defined(__i386__) && defined(__SSE__)) || defined(__x86_64__) || defined(__aarch64__) \
 || defined(__s390__) || defined(__powerpc64__)
 
 /* The following tests assume GCC transforms the memcpy calls into
diff --git a/gcc/testsuite/gcc.dg/strlenopt-80.c b/gcc/testsuite/gcc.dg/strlenopt-80.c
index a853402b5ce..a8adbf1eed5 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-80.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-80.c
@@ -5,7 +5,8 @@
such a store.
{ dg-do compile { target { { aarch64*-*-* i?86-*-* x86_64-*-* } || { { powerpc*-*-* } && lp64 } } } }
 
-   { dg-options "-O2 -Wall -fdump-tree-optimized" } */
+   { dg-options "-O2 -Wall -fdump-tree-optimized" }
+   { dg-additional-options "-msse" { target i?86-*-* x86_64-*-* } } */
 
 #define CHAR_BIT  __CHAR_BIT__
 #define SIZE_MAX  __SIZE_MAX__
-- 
2.34.1


[PATCH] gcc-changelog: ignore one more revision

2022-04-04 Thread Martin Liška

Ignore:

Checking 86d8e0c0652ef5236a460b75c25e4f7093cc0651: FAILED
ERR: line should start with a tab: "This reverts commits r12-7804 and r12-7929."
ERR: could not deduce ChangeLog file

It seems Jason pushed the revision to origin/trunk where the checking script is 
not run.

@Jakub: Can you please re-run the Daily bump script?

Thanks,
Martin

contrib/ChangeLog:

* gcc-changelog/git_update_version.py: Ignore the revision.
---
 contrib/gcc-changelog/git_update_version.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_update_version.py b/contrib/gcc-changelog/git_update_version.py
index 8cd337d5434..2de96882943 100755
--- a/contrib/gcc-changelog/git_update_version.py
+++ b/contrib/gcc-changelog/git_update_version.py
@@ -31,7 +31,8 @@ IGNORED_COMMITS = (
 'c2be82058fb40f3ae891c68d185ff53e07f14f45',
 '04a040d907a83af54e0a98bdba5bfabc0ef4f700',
 '2e96b5f14e4025691b57d2301d71aa6092ed44bc',
-'3ab5c8cd03d92bf4ec41e351820349d92fbc40c4')
+'3ab5c8cd03d92bf4ec41e351820349d92fbc40c4',
+'86d8e0c0652ef5236a460b75c25e4f7093cc0651')
 
 
 def read_timestamp(path):

--
2.35.1



Re: [PATCH] Add condition coverage profiling

2022-04-04 Thread Sebastian Huber

Hello Jørgen,

having support for MC/DC coverage in GCC would be really nice. I tried 
out your latest patch on an arm cross-compiler with Newlib (inhibit_libc 
is defined). Could you please add the following fix to your patch:


diff --git a/libgcc/libgcov-merge.c b/libgcc/libgcov-merge.c
index 89741f637e1..9e3e8ee5657 100644
--- a/libgcc/libgcov-merge.c
+++ b/libgcc/libgcov-merge.c
@@ -33,6 +33,11 @@ void __gcov_merge_add (gcov_type *counters __attribute__ ((unused)),
                        unsigned n_counters __attribute__ ((unused))) {}
 #endif

+#ifdef L_gcov_merge_ior
+void __gcov_merge_ior (gcov_type *counters  __attribute__ ((unused)),
+  unsigned n_counters __attribute__ ((unused))) {}
+#endif
+
 #ifdef L_gcov_merge_topn
 void __gcov_merge_topn (gcov_type *counters  __attribute__ ((unused)),
unsigned n_counters __attribute__ ((unused))) {}

It seems that support for the new GCOV_TAG_CONDS is missing in gcov-tool 
and gcov-dump, see "tag_table" in gcc/gcov-dump.c and libgcc/libgcov-util.c.


On 21/03/2022 12:55, Jørgen Kvalsvik via Gcc-patches wrote:
[...]

Like Wahlen et al this implementation uses bitsets to store conditions,
which gcov later interprets. This is very fast, but introduces a max
limit for the number of terms in a single boolean expression. This limit
is the number of bits in a gcov_unsigned_type (which is typedef'd to
uint64_t), so for most practical purposes this would be acceptable.
This limitation can be relaxed with a more sophisticated way of storing
and updating bitsets (for example length-encoding).


For multi-threaded applications using -fprofile-update=atomic is quite 
important. Unfortunately, not all 32-bit targets support 64-bit atomic 
operations in hardware. There is a target hook to select the size of 
gcov_type. Maybe a dedicated 64-bit type should be used for the bitfield 
using two 32-bit atomic OR if necessary.
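
To make the suggestion concrete, here is a minimal sketch of a 64-bit
condition bitset kept as two 32-bit atomic halves, using C11
<stdatomic.h>.  The type and function names are hypothetical and are not
part of libgcov or of the patch.  It only illustrates that setting bits
needs at most two 32-bit atomic ORs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit bitset stored as two 32-bit atomic halves, for
   targets without 64-bit atomic operations in hardware.  */
struct cond_bits
{
  _Atomic uint32_t lo;
  _Atomic uint32_t hi;
};

/* OR MASK into the bitset.  Each half is updated with its own 32-bit
   atomic OR; a half whose mask part is zero is skipped.  */
void
cond_bits_or (struct cond_bits *b, uint64_t mask)
{
  uint32_t lo = (uint32_t) mask;
  uint32_t hi = (uint32_t) (mask >> 32);

  assert (b);
  if (lo)
    atomic_fetch_or_explicit (&b->lo, lo, memory_order_relaxed);
  if (hi)
    atomic_fetch_or_explicit (&b->hi, hi, memory_order_relaxed);
}

/* Read both halves back and combine them into one 64-bit value.  */
uint64_t
cond_bits_read (struct cond_bits *b)
{
  uint64_t lo;
  uint64_t hi;

  assert (b);
  lo = atomic_load_explicit (&b->lo, memory_order_relaxed);
  hi = atomic_load_explicit (&b->hi, memory_order_relaxed);
  return (hi << 32) | lo;
}
```

The two ORs are not atomic as a pair, so a concurrent reader may see one
half updated before the other.  Since bits are only ever set, no update
is lost, which is probably sufficient for coverage counters that are
read after the threads have finished.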




In action it looks pretty similar to the branch coverage. The -g short
opt carries no significance, but was chosen because it was an available
option with the upper-case free too.

gcov --conditions:

 3:   17:void fn (int a, int b, int c, int d) {
 3:   18:if ((a && (b || c)) && d)
conditions covered 5/8
condition  1 not covered (false)
condition  2 not covered (true)
condition  2 not covered (false)
 1:   19:x = 1;
 -:   20:else
 2:   21:x = 2;
 3:   22:}


I have some trouble understanding the output. Would 8/8 mean that we 
have 100% MC/DC coverage? What does "not covered (false)" or "not 
covered (true)" mean?


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/


[PATCH] aarch64: Fix aarch64-tune.md (re)generation [PR105144]

2022-04-04 Thread Jakub Jelinek via Gcc-patches
Hi!

As I wrote in the PR, our Fedora trunk gcc builds are now failing
(lto1 crashes), likely since the r12-7842 change.
What happens is that when one bootstraps into an empty build directory
(or set of them), mddeps.mk doesn't exist yet and so Makefile doesn't
include it.  When building from an empty dir, that is usually not a big
issue, it is enough when various build directory files depend on just
$(srcdir)/config/aarch64/aarch64.md, those files don't exist and
aarch64.md does, so they are built, so is mddeps.mk.
But because the other dependencies aren't there (in particular
$(srcdir)/config/aarch64/aarch64-tune.md ), the
s-aarch64-tune-md rule isn't invoked to regenerate that file and the
r12-7842 commit reordered aarch64-cores.def entries but didn't commit
regenerated aarch64-tune.md.  Because it is just reordering in
aarch64-tune.md, it actually doesn't matter and bootstrap succeeds.
But then during make install, mddeps.mk exists already in gcc/ directory,
it sees that aarch64-cores.def is newer than aarch64-tune.md (unless
gen_update is used, that just touches aarch64-tune.md to make sure it
is newer) and regenerates it and as it is different, make install rebuilds
a large subset of the *.o files, but this time with the system g++
rather than previous stage one.  And during lto linking of it there
are differences in LTO bytecode between the compilers and we crash.

The following patch fixes that by regenerating aarch64-tune.md
(which was forgotten in r12-7842) and by adding a dependency from
s-mddeps to s-aarch64-tune-md, which makes sure that even when mddeps.mk
doesn't exist yet make sees the dependency and regenerates aarch64-tune.md
if needed.

Tested on aarch64-linux and x86_64-linux (cross there), ok for trunk?

2022-04-04  Jakub Jelinek  

PR target/105144
* config/aarch64/t-aarch64 (s-mddeps): Depend on s-aarch64-tune-md.
* config/aarch64/aarch64-tune.md: Regenerated.

--- gcc/config/aarch64/t-aarch64.jj 2022-01-18 11:58:59.024990028 +0100
+++ gcc/config/aarch64/t-aarch64 2022-04-04 10:14:30.256323070 +0200
@@ -34,6 +34,8 @@ s-aarch64-tune-md: $(srcdir)/config/aarc
$(srcdir)/config/aarch64/aarch64-tune.md
$(STAMP) s-aarch64-tune-md
 
+s-mddeps: s-aarch64-tune-md
+
 aarch64-builtins.o: $(srcdir)/config/aarch64/aarch64-builtins.cc $(CONFIG_H) \
   $(SYSTEM_H) coretypes.h $(TM_H) \
   $(RTL_H) $(TREE_H) expr.h $(TM_P_H) $(RECOG_H) langhooks.h \
--- gcc/config/aarch64/aarch64-tune.md.jj   2022-04-03 23:30:25.710798806 +0200
+++ gcc/config/aarch64/aarch64-tune.md  2022-04-04 10:14:55.668962478 +0200
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-   
"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,ares,neoversen1,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,zeus,neoversev1,neoverse512tvb,saphira,neoversen2,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa710,cortexx2,demeter"
+   
"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,ares,neoversen1,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,zeus,neoversev1,neoverse512tvb,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa710,cortexx2,neoversen2,demeter"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))

Jakub



[PATCH] aarch64: Restrict aarch64-tune.md regeneration to --enable-maintainer-mode [PR105144]

2022-04-04 Thread Jakub Jelinek via Gcc-patches
Hi!

Normally updates to the source directory files are guarded with
--enable-maintainer-mode, e.g. we don't regenerate configure, config.h,
Makefile.in in directories that use automake etc. unless gcc is configured
that way.  Otherwise the source tree can't be e.g. stored on a read-only
filesystem etc.
In gcc/Makefile.in we use @MAINT@ for that but that works because
gcc/Makefile is generated by configure.  In config/*/t-* files we need to
check $(ENABLE_MAINTAINER_RULES):
# The following provides the variable ENABLE_MAINTAINER_RULES that can
# be used in language Make-lang.in makefile fragments to enable
# maintainer rules.  So, ENABLE_MAINTAINER_RULES is 'true' in
# maintainer mode, and '' otherwise.
@MAINT@ ENABLE_MAINTAINER_RULES = true

This incremental patch does that, tested again on aarch64-linux and
x86_64-linux (cross in that case), ok for trunk?

2022-04-04  Jakub Jelinek  

PR target/105144
* config/aarch64/t-aarch64 ($(srcdir)/config/aarch64/aarch64-tune.md,
s-aarch64-tune-md, s-mddeps): Only enable the rules if
$(ENABLE_MAINTAINER_RULES) is non-empty.

--- gcc/config/aarch64/t-aarch64.jj 2022-04-04 10:14:30.256323070 +0200
+++ gcc/config/aarch64/t-aarch64 2022-04-04 10:32:55.591651822 +0200
@@ -24,6 +24,7 @@ OPTIONS_H_EXTRA += $(srcdir)/config/aarc
   $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
   $(srcdir)/config/aarch64/aarch64-tuning-flags.def
 
+ifneq ($(strip $(ENABLE_MAINTAINER_RULES)),)
 $(srcdir)/config/aarch64/aarch64-tune.md: s-aarch64-tune-md; @true
 s-aarch64-tune-md: $(srcdir)/config/aarch64/gentune.sh \
$(srcdir)/config/aarch64/aarch64-cores.def
@@ -35,6 +36,7 @@ s-aarch64-tune-md: $(srcdir)/config/aarc
$(STAMP) s-aarch64-tune-md
 
 s-mddeps: s-aarch64-tune-md
+endif
 
 aarch64-builtins.o: $(srcdir)/config/aarch64/aarch64-builtins.cc $(CONFIG_H) \
   $(SYSTEM_H) coretypes.h $(TM_H) \

Jakub



[committed] d: Compile simd_ctfe.d only on avx_runtime or vect_sizes_16B_8B targets

2022-04-04 Thread Iain Buclaw via Gcc-patches
Hi,

This test makes use of the `__vector(int[4])' type, which is not
supported on all targets, so guard the test with target avx_runtime ||
vect_sizes_16B_8B, fixing PR104740.

Regression tested on x86_64-linux-gnu, committed to mainline.

Regards,
Iain.

---

PR d/104740

gcc/testsuite/ChangeLog:

* gdc.dg/simd_ctfe.d: Compile with target avx_runtime or
vect_sizes_16B_8B.
---
 gcc/testsuite/gdc.dg/simd_ctfe.d | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gdc.dg/simd_ctfe.d b/gcc/testsuite/gdc.dg/simd_ctfe.d
index b254cf312cb..507de17baa2 100644
--- a/gcc/testsuite/gdc.dg/simd_ctfe.d
+++ b/gcc/testsuite/gdc.dg/simd_ctfe.d
@@ -1,4 +1,5 @@
-// { dg-do compile }
+// { dg-additional-options "-mavx" { target avx_runtime } }
+// { dg-do compile { target { avx_runtime || vect_sizes_16B_8B } } }
 import core.simd;
 
 // https://issues.dlang.org/show_bug.cgi?id=19627
-- 
2.32.0



Re: [PATCH] rs6000: Improve .machine

2022-04-04 Thread Sebastian Huber

Hello Segher,

On 15/03/2022 23:29, Segher Boessenkool wrote:

On Tue, Mar 15, 2022 at 03:29:23PM +0100, Sebastian Huber wrote:

now that the PR104829 is fixed could I back port

Segher Boessenkool (2):
   rs6000: Improve .machine
   rs6000: Do not use rs6000_cpu for .machine ppc and ppc64 (PR104829)

to GCC 10 and 11?

I will do it, in a few days though.

Thanks for your enthusiasm :-),


would now be a good time to back port the fixes or do you want to wait 
for the GCC 12 release? It would be nice if the fixes were included in 
the GCC 10.4 release.




[PATCH] middle-end/105140 - fix bogus recursion in fold_convertible_p

2022-04-04 Thread Richard Biener via Gcc-patches
fold_convertible_p expects an operand and a type to convert to
but recurses with two vector component types.  Fixed by allowing
types instead of an operand as well.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2022-04-04  Richard Biener  

PR middle-end/105140
* fold-const.cc (fold_convertible_p): Allow a TYPE_P arg.

* gcc.dg/pr105140.c: New testcase.
---
 gcc/fold-const.cc   |  5 +++--
 gcc/testsuite/gcc.dg/pr105140.c | 17 +
 2 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr105140.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index b647e5305aa..fb08fa1dbc6 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -2379,12 +2379,13 @@ build_zero_vector (tree type)
   return build_vector_from_val (type, t);
 }
 
-/* Returns true, if ARG is convertible to TYPE using a NOP_EXPR.  */
+/* Returns true, if ARG, an operand or a type, is convertible to TYPE
+   using a NOP_EXPR.  */
 
 bool
 fold_convertible_p (const_tree type, const_tree arg)
 {
-  tree orig = TREE_TYPE (arg);
+  const_tree orig = TYPE_P (arg) ? arg : TREE_TYPE (arg);
 
   if (type == orig)
 return true;
diff --git a/gcc/testsuite/gcc.dg/pr105140.c b/gcc/testsuite/gcc.dg/pr105140.c
new file mode 100644
index 000..14bff2f7f9c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr105140.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -w -Wno-psabi" } */
+
+typedef char __attribute__((__vector_size__ (16 * sizeof (char)))) U;
+typedef int __attribute__((__vector_size__ (16 * sizeof (int)))) V;
+
+void bar ();
+
+bar (int i, int j, int k, V v)
+{
+}
+
+void
+foo (void)
+{
+  bar ((V){}, (V){}, (V){}, (U){});
+}
-- 
2.34.1


[PATCH] tree-optimization/105132 - add missing checking in vectorizable_operation

2022-04-04 Thread Richard Biener via Gcc-patches
The following adds missing verification that the input vectors
have the same number of elements for vectorizable_operation.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2022-04-04  Richard Biener  

PR tree-optimization/105132
* tree-vect-stmts.cc (vectorizable_operation): Check that
the input vectors have the same number of elements.

* gcc.dg/torture/pr105132.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr105132.c | 12 
 gcc/tree-vect-stmts.cc  |  6 ++
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr105132.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr105132.c b/gcc/testsuite/gcc.dg/torture/pr105132.c
new file mode 100644
index 000..f8f0b16ec56
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr105132.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* i?86-*-* } } */
+
+short a;
+extern int b[];
+int c;
+void d(long f[][5][5][17], int g[][5][5][17]) {
+  for (short e = 0; e < 17; e++) {
+a = g[19][2][3][e];
+b[e] = c & (f[3][2][3][e] && g[19][2][3][e]);
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index f7449a79d1c..f6fc7e1fcdd 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -6133,6 +6133,9 @@ vectorizable_operation (vec_info *vinfo,
  "use not simple.\n");
  return false;
}
+  if (vectype2
+ && maybe_ne (nunits_out, TYPE_VECTOR_SUBPARTS (vectype2)))
+   return false;
 }
   if (op_type == ternary_op)
 {
@@ -6144,6 +6147,9 @@ vectorizable_operation (vec_info *vinfo,
  "use not simple.\n");
  return false;
}
+  if (vectype3
+ && maybe_ne (nunits_out, TYPE_VECTOR_SUBPARTS (vectype3)))
+   return false;
 }
 
   /* Multiple types in SLP are handled by creating the appropriate number of
-- 
2.34.1


Re: [PATCH] aarch64: Fix aarch64-tune.md (re)generation [PR105144]

2022-04-04 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> Hi!
>
> As I wrote in the PR, our Fedora trunk gcc builds likely after r12-7842
> change are now failing (lto1 crashes).
> What happens is that when one bootstraps into an empty build directory
> (or set of them), mddeps.mk doesn't exist yet and so Makefile doesn't
> include it.  When building from an empty dir, that is usually not a big
> issue, it is enough when various build directory files depend on just
> $(srcdir)/config/aarch64/aarch64.md, those files don't exist and
> aarch64.md does, so they are built, so is mddeps.mk.
> But because the other dependencies aren't there (in particular
> $(srcdir)/config/aarch64/aarch64-tune.md ), the
> s-aarch64-tune-md rule isn't invoked to regenerate that file and the
> r12-7842 commit reordered aarch64-cores.def entries but didn't commit
> regenerated aarch64-tune.md.  Because it is just reordering in
> aarch64-tune.md, it actually doesn't matter and bootstraps succeeds.
> But then during make install, mddeps.mk exists already in gcc/ directory,
> it sees that aarch64-cores.def is newer than aarch64-tune.md (unless
> gen_update is used, that just touches aarch64-tune.md to make sure it
> is newer) and regenerates it and as it is different, make install rebuilds
> a large subset of the *.o files, but this time with the system g++
> rather than previous stage one.  And during lto linking of it there
> are differences in LTO bytecode between the compilers and we crash.
>
> The following patch fixes that by regenerating aarch64-tune.md
> (what was forgotten in r12-7842) and by adding a dependency from
> s-mddeps to s-aarch64-tune-md, which makes sure that even when mddeps.mk
> doesn't exist yet make sees the dependency and regenerates aarch64-tune.md
> if needed.
>
> Tested on aarch64-linux and x86_64-linux (cross there), ok for trunk?
>
> 2022-04-04  Jakub Jelinek  
>
>   PR target/105144
>   * config/aarch64/t-aarch64 (s-mddeps): Depend on s-aarch64-tune-md.
>   * config/aarch64/aarch64-tune.md: Regenerated.

OK, thanks.

Richard

>
> --- gcc/config/aarch64/t-aarch64.jj   2022-01-18 11:58:59.024990028 +0100
> +++ gcc/config/aarch64/t-aarch64  2022-04-04 10:14:30.256323070 +0200
> @@ -34,6 +34,8 @@ s-aarch64-tune-md: $(srcdir)/config/aarc
>   $(srcdir)/config/aarch64/aarch64-tune.md
>   $(STAMP) s-aarch64-tune-md
>  
> +s-mddeps: s-aarch64-tune-md
> +
>  aarch64-builtins.o: $(srcdir)/config/aarch64/aarch64-builtins.cc $(CONFIG_H) 
> \
>$(SYSTEM_H) coretypes.h $(TM_H) \
>$(RTL_H) $(TREE_H) expr.h $(TM_P_H) $(RECOG_H) langhooks.h \
> --- gcc/config/aarch64/aarch64-tune.md.jj 2022-04-03 23:30:25.710798806 
> +0200
> +++ gcc/config/aarch64/aarch64-tune.md2022-04-04 10:14:55.668962478 
> +0200
> @@ -1,5 +1,5 @@
>  ;; -*- buffer-read-only: t -*-
>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>  (define_attr "tune"
> - 
> "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,ares,neoversen1,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,zeus,neoversev1,neoverse512tvb,saphira,neoversen2,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa710,cortexx2,demeter"
> + 
> "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,ares,neoversen1,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,zeus,neoversev1,neoverse512tvb,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa710,cortexx2,neoversen2,demeter"
>   (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
>
>   Jakub


Re: [PATCH] aarch64: Restrict aarch64-tune.md regeneration to --enable-maintainer-mode [PR105144]

2022-04-04 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> Hi!
>
> Normally updates to the source directory files are guarded with
> --enable-maintainer-mode, e.g. we don't regenerate configure, config.h,
> Makefile.in in directories that use automake etc. unless gcc is configured
> that way.  Otherwise the source tree can't be e.g. stored on a read-only
> filesystem etc.
> In gcc/Makefile.in we use @MAINT@ for that but that works because
> gcc/Makefile is generated by configure.  In config/*/t-* files we need to
> check $(ENABLE_MAINTAINER_RULES):
> # The following provides the variable ENABLE_MAINTAINER_RULES that can
> # be used in language Make-lang.in makefile fragments to enable
> # maintainer rules.  So, ENABLE_MAINTAINER_RULES is 'true' in
> # maintainer mode, and '' otherwise.
> @MAINT@ ENABLE_MAINTAINER_RULES = true
>
> This is incremental patch does that, tested again on aarch64-linux and
> x86_64-linux (cross in that case), ok for trunk?
>
> 2022-04-04  Jakub Jelinek  
>
>   PR target/105144
>   * config/aarch64/t-aarch64 ($(srcdir)/config/aarch64/aarch64-tune.md,
>   s-aarch64-tune-md, s-mddeps): Only enable the rules if
>   $(ENABLE_MAINTAINER_RULES) is non-empty.

OK.  But I guess the risk is that it will become even easier to forget
to commit an updated aarch64-tune.md.  Perhaps we should have a
non-maintainer rule to build aarch64-tune.md locally and check it
against the source-directory version, and fail the build if there's
a mismatch.  Or maybe we should just generate aarch64-tune.md in the
build directory and remove the source directory version.

That's all future work though.  The patch is still an improvement
of the status quo.

Thanks,
Richard

>
> --- gcc/config/aarch64/t-aarch64.jj   2022-04-04 10:14:30.256323070 +0200
> +++ gcc/config/aarch64/t-aarch64  2022-04-04 10:32:55.591651822 +0200
> @@ -24,6 +24,7 @@ OPTIONS_H_EXTRA += $(srcdir)/config/aarc
>  $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \
>  $(srcdir)/config/aarch64/aarch64-tuning-flags.def
>  
> +ifneq ($(strip $(ENABLE_MAINTAINER_RULES)),)
>  $(srcdir)/config/aarch64/aarch64-tune.md: s-aarch64-tune-md; @true
>  s-aarch64-tune-md: $(srcdir)/config/aarch64/gentune.sh \
>   $(srcdir)/config/aarch64/aarch64-cores.def
> @@ -35,6 +36,7 @@ s-aarch64-tune-md: $(srcdir)/config/aarc
>   $(STAMP) s-aarch64-tune-md
>  
>  s-mddeps: s-aarch64-tune-md
> +endif
>  
>  aarch64-builtins.o: $(srcdir)/config/aarch64/aarch64-builtins.cc $(CONFIG_H) 
> \
>$(SYSTEM_H) coretypes.h $(TM_H) \
>
>   Jakub


Re: [PATCH] aarch64: Restrict aarch64-tune.md regeneration to --enable-maintainer-mode [PR105144]

2022-04-04 Thread Jakub Jelinek via Gcc-patches
On Mon, Apr 04, 2022 at 11:10:14AM +0100, Richard Sandiford wrote:
> > Normally updates to the source directory files are guarded with
> > --enable-maintainer-mode, e.g. we don't regenerate configure, config.h,
> > Makefile.in in directories that use automake etc. unless gcc is configured
> > that way.  Otherwise the source tree can't be e.g. stored on a read-only
> > filesystem etc.
> > In gcc/Makefile.in we use @MAINT@ for that but that works because
> > gcc/Makefile is generated by configure.  In config/*/t-* files we need to
> > check $(ENABLE_MAINTAINER_RULES):
> > # The following provides the variable ENABLE_MAINTAINER_RULES that can
> > # be used in language Make-lang.in makefile fragments to enable
> > # maintainer rules.  So, ENABLE_MAINTAINER_RULES is 'true' in
> > # maintainer mode, and '' otherwise.
> > @MAINT@ ENABLE_MAINTAINER_RULES = true
> >
> > This is incremental patch does that, tested again on aarch64-linux and
> > x86_64-linux (cross in that case), ok for trunk?
> >
> > 2022-04-04  Jakub Jelinek  
> >
> > PR target/105144
> > * config/aarch64/t-aarch64 ($(srcdir)/config/aarch64/aarch64-tune.md,
> > s-aarch64-tune-md, s-mddeps): Only enable the rules if
> > $(ENABLE_MAINTAINER_RULES) is non-empty.
> 
> OK.  But I guess the risk is that it will become even easier to forget
> to commit an updated aarch64-tune.md.  Perhaps we should have a
> non-maintainer rule to build aarch64-tune.md locally and check it
> against the source-directory version, and fail the build if there's
> a mismatch.  Or maybe we should just generate aarch64-tune.md in the
> build directory and remove the source directory version.

I've checked whether aarch64-tune.md would be read from the build dir, but
it is not.  The gen* programs can use -I options to add additional
directories, but they don't use them.

Here is a variant patch which will complain and fail if there is a change
and --enable-maintainer-mode is not enabled:

2022-04-04  Jakub Jelinek  

PR target/105144
* config/aarch64/t-aarch64 (s-aarch64-tune-md): Do move-if-change
only if configured with --enable-maintainer-mode, otherwise compare
tmp-aarch64-tune.md with $(srcdir)/config/aarch64/aarch64-tune.md and
if they differ, emit a message and fail.

--- gcc/config/aarch64/t-aarch64.jj 2022-04-04 12:09:18.530859281 +0200
+++ gcc/config/aarch64/t-aarch64 2022-04-04 12:44:35.878930189 +0200
@@ -30,8 +30,18 @@ s-aarch64-tune-md: $(srcdir)/config/aarc
$(SHELL) $(srcdir)/config/aarch64/gentune.sh \
$(srcdir)/config/aarch64/aarch64-cores.def > \
tmp-aarch64-tune.md
+ifneq ($(strip $(ENABLE_MAINTAINER_RULES)),)
$(SHELL) $(srcdir)/../move-if-change tmp-aarch64-tune.md \
$(srcdir)/config/aarch64/aarch64-tune.md
+else
+   @if ! cmp -s tmp-aarch64-tune.md \
+ $(srcdir)/config/aarch64/aarch64-tune.md; then \
+ echo "aarch64-tune.md has changed; either"; \
+ echo "configure with --enable-maintainer-mode"; \
+ echo "or copy tmp-aarch64-tune.md to $(srcdir)/config/aarch64/aarch64-tune.md"; \
+ exit 1; \
+   fi
+endif
$(STAMP) s-aarch64-tune-md
 
 s-mddeps: s-aarch64-tune-md


Jakub



[PATCH] libstdc++: Add pretty printer for std::span

2022-04-04 Thread Philipp Fent via Gcc-patches
This improves the debug output for C++20 spans.
Before:
{static extent = 18446744073709551615, _M_ptr = 0x7fffb9a8,
_M_extent = {_M_extent_value = 2}}
Now with StdSpanPrinter:
std::span of length 2 = {1, 2}
---
 libstdc++-v3/python/libstdcxx/v6/printers.py  | 38 +++
 .../libstdc++-prettyprinters/cxx20.cc | 11 ++
 2 files changed, 49 insertions(+)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index f7a7f9961..6d8b765f2 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1654,6 +1654,43 @@ class StdRegexStatePrinter:
 s = "{}, {}={}".format(s, v, self.val['_M_' + v])
 return "{%s}" % (s)
 
+class StdSpanPrinter:
+    "Print a std::span"
+
+    class _iterator(Iterator):
+        def __init__(self, begin, size):
+            self.count = 0
+            self.begin = begin
+            self.size = size
+
+        def __iter__ (self):
+            return self
+
+        def __next__ (self):
+            if self.count == self.size:
+                raise StopIteration
+
+            count = self.count
+            self.count = self.count + 1
+            return '[%d]' % count, (self.begin + count).dereference()
+
+    def __init__(self, typename, val):
+        self.typename = typename
+        self.val = val
+        if val.type.template_argument(1) == gdb.parse_and_eval('static_cast<std::size_t>(-1)'):
+            self.size = val['_M_extent']['_M_extent_value']
+        else:
+            self.size = val.type.template_argument(1)
+
+    def to_string(self):
+        return '%s of length %d' % (self.typename, self.size)
+
+    def children(self):
+        return self._iterator(self.val['_M_ptr'], self.size)
+
+    def display_hint(self):
+        return 'array'
+
 # A "regular expression" printer which conforms to the
 # "SubPrettyPrinter" protocol from gdb.printing.
 class RxPrinter(object):
@@ -2170,6 +2207,7 @@ def build_libstdcxx_dictionary ():
 libstdcxx_printer.add_version('std::', 'partial_ordering', StdCmpCatPrinter)
 libstdcxx_printer.add_version('std::', 'weak_ordering', StdCmpCatPrinter)
 libstdcxx_printer.add_version('std::', 'strong_ordering', StdCmpCatPrinter)
+libstdcxx_printer.add_version('std::', 'span', StdSpanPrinter)
 
 # Extensions.
 libstdcxx_printer.add_version('__gnu_cxx::', 'slist', StdSlistPrinter)
diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc b/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
index b0de25c27..76023df93 100644
--- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
+++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
@@ -18,8 +18,10 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
+#include <array>
 #include <compare>
 #include <iostream>
+#include <span>
 
 struct X
 {
@@ -54,6 +56,15 @@ main()
   auto c10 = 0.0 <=> __builtin_nan("");
 // { dg-final { note-test c10 "std::partial_ordering::unordered" } }
 
+  auto il = {1, 2};
+  auto s1 = std::span(il);
+  static_assert(s1.extent == std::size_t(-1));
+// { dg-final { note-test s1 {std::span of length 2 = {1, 2}} } }
+  auto a = std::array{3, 4};
+  auto s2 = std::span(a);
+  static_assert(s2.extent == std::size_t(2));
+// { dg-final { note-test s2 {std::span of length 2 = {3, 4}} } }
+
   std::cout << "\n";
   return 0;// Mark SPOT
 }
-- 
2.35.1



Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-04 Thread Tom de Vries via Gcc-patches

On 4/1/22 17:57, Tom de Vries wrote:

On 4/1/22 17:38, Jakub Jelinek wrote:

On Fri, Apr 01, 2022 at 05:34:50PM +0200, Tom de Vries wrote:

Do you perhaps have an idea why it's failing?


Because you call on_device_arch_nvptx () outside of
!$omp target region, so unless the host device is NVPTX,
it will not be true.



That bit does work because on_device_arch_nvptx calls on_device_arch, 
which contains the omp target bit:

...
static int
on_device_arch (int d)
{
   int d_cur;
   #pragma omp target map(from:d_cur)
   d_cur = device_arch ();

   return d_cur == d;
}

int
on_device_arch_nvptx ()
{
   return on_device_arch (GOMP_DEVICE_NVIDIA_PTX);
}
...

So I realized that I didn't do a good job of specifying the problem I 
encountered, and went looking at it, at which point I realized the error 
message had changed, and knew how to fix it ... So, my apologies, some 
confusion on my part.


Anyway, attached patch avoids any nvptx-related tcl directives (just for 
one test-case for now).  To me, this seems the most robust solution.


Is this approach acceptable?


I intend to commit this in a few days, unless there are objections.

Thanks,
- Tom

[libgomp/testsuite] Fix libgomp.fortran/examples-4/declare_target-{1,2}.f90

The test-cases libgomp.fortran/examples-4/declare_target-{1,2}.f90 are meant to
set an nvptx-specific limit using offload_target_nvptx, but they also change
behaviour for amd.

That is, there is now a difference in behaviour between:
- a compiler configured for GCN offloading, and
- a compiler configured for both GCN and nvptx offloading.

Fix this by using on_device_arch_nvptx instead.

Tested on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2022-04-04  Tom de Vries  

	* testsuite/libgomp.fortran/examples-4/on_device_arch.c: Copy from
	parent dir.
	* testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Use
	on_device_arch_nvptx instead of offload_target_nvptx.
	* testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.

---
 .../examples-4/declare_target-1.f90| 31 +-
 .../examples-4/declare_target-2.f90| 31 +-
 .../libgomp.fortran/examples-4/on_device_arch.c|  3 +++
 3 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90 b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
index 03c5c53ed67..acded20f756 100644
--- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
+++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
@@ -1,16 +1,6 @@
 ! { dg-do run }
-! { dg-additional-options "-cpp" }
-! Reduced from 25 to 23, otherwise execution runs out of thread stack on
-! Nvidia Titan V.
-! Reduced from 23 to 22, otherwise execution runs out of thread stack on
-! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
-! Reduced from 22 to 20, otherwise execution runs out of thread stack on
-! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
-! { dg-additional-options "-DREC_DEPTH=20" { target { offload_target_nvptx } } } */
-
-#ifndef REC_DEPTH
-#define REC_DEPTH 25
-#endif
+! { dg-additional-sources on_device_arch.c }
+! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
 
 module e_53_1_mod
   integer :: THRESHOLD = 20
@@ -38,6 +28,23 @@ end module
 
 program e_53_1
   use e_53_1_mod, only : fib, fib_wrapper
+  integer :: REC_DEPTH = 25
+
+  interface
+integer function on_device_arch_nvptx() bind(C)
+end function on_device_arch_nvptx
+  end interface
+
+  if (on_device_arch_nvptx () /= 0) then
+ ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+ ! Nvidia Titan V.
+ ! Reduced from 23 to 22, otherwise execution runs out of thread stack on
+ ! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
+ ! Reduced from 22 to 20, otherwise execution runs out of thread stack on
+ ! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
+ REC_DEPTH = 20
+  end if
+
   if (fib (15) /= fib_wrapper (15)) stop 1
   if (fib (REC_DEPTH) /= fib_wrapper (REC_DEPTH)) stop 2
 end program
diff --git a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90 b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
index 0e8bea578a8..27a5cec2e9d 100644
--- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
+++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-2.f90
@@ -1,20 +1,27 @@
 ! { dg-do run }
-! { dg-additional-options "-cpp" }
-! Reduced from 25 to 23, otherwise execution runs out of thread stack on
-! Nvidia Titan V.
-! Reduced from 23 to 22, otherwise execution runs out of thread stack on
-! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
-! Reduced from 22 to 18, otherwise execution runs out of thread stack on
-! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
-! { dg-add

Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-04 Thread Jakub Jelinek via Gcc-patches
On Mon, Apr 04, 2022 at 01:05:12PM +0200, Tom de Vries wrote:
> 2022-04-04  Tom de Vries  
> 
>   * testsuite/libgomp.fortran/examples-4/on_device_arch.c: Copy from
>   parent dir.

Wouldn't just ! { dg-additional-sources ../on_device_arch.c }
work?

Jakub



Re: [PATCH] aarch64: Restrict aarch64-tune.md regeneration to --enable-maintainer-mode [PR105144]

2022-04-04 Thread Richard Earnshaw via Gcc-patches




On 04/04/2022 11:49, Jakub Jelinek via Gcc-patches wrote:

On Mon, Apr 04, 2022 at 11:10:14AM +0100, Richard Sandiford wrote:

Normally updates to the source directory files are guarded with
--enable-maintainer-mode, e.g. we don't regenerate configure, config.h,
Makefile.in in directories that use automake etc. unless gcc is configured
that way.  Otherwise the source tree can't be e.g. stored on a read-only
filesystem etc.
In gcc/Makefile.in we use @MAINT@ for that but that works because
gcc/Makefile is generated by configure.  In config/*/t-* files we need to
check $(ENABLE_MAINTAINER_RULES):
# The following provides the variable ENABLE_MAINTAINER_RULES that can
# be used in language Make-lang.in makefile fragments to enable
# maintainer rules.  So, ENABLE_MAINTAINER_RULES is 'true' in
# maintainer mode, and '' otherwise.
@MAINT@ ENABLE_MAINTAINER_RULES = true

This incremental patch does that, tested again on aarch64-linux and
x86_64-linux (cross in that case), ok for trunk?

2022-04-04  Jakub Jelinek  

PR target/105144
* config/aarch64/t-aarch64 ($(srcdir)/config/aarch64/aarch64-tune.md,
s-aarch64-tune-md, s-mddeps): Only enable the rules if
$(ENABLE_MAINTAINER_RULES) is non-empty.


OK.  But I guess the risk is that it will become even easier to forget
to commit an updated aarch64-tune.md.  Perhaps we should have a
non-maintainer rule to build aarch64-tune.md locally and check it
against the source-directory version, and fail the build if there's
a mismatch.  Or maybe we should just generate aarch64-tune.md in the
build directory and remove the source directory version.


I've checked whether aarch64-tune.md would be read from the build dir, but it
is not.  The gen* programs can take -I options to add additional directories,
but those aren't used.

Here is a variant patch which will complain and fail if there is a change
and --enable-maintainer-mode is not enabled:

2022-04-04  Jakub Jelinek  

PR target/105144
* config/aarch64/t-aarch64 (s-aarch64-tune-md): Do move-if-change
only if configured with --enable-maintainer-mode, otherwise compare
tmp-aarch64-tune.md with $(srcdir)/config/aarch64/aarch64-tune.md and
if they differ, emit a message and fail.

--- gcc/config/aarch64/t-aarch64.jj 2022-04-04 12:09:18.530859281 +0200
+++ gcc/config/aarch64/t-aarch64 2022-04-04 12:44:35.878930189 +0200
@@ -30,8 +30,18 @@ s-aarch64-tune-md: $(srcdir)/config/aarc
$(SHELL) $(srcdir)/config/aarch64/gentune.sh \
$(srcdir)/config/aarch64/aarch64-cores.def > \
tmp-aarch64-tune.md
+ifneq ($(strip $(ENABLE_MAINTAINER_RULES)),)
$(SHELL) $(srcdir)/../move-if-change tmp-aarch64-tune.md \
$(srcdir)/config/aarch64/aarch64-tune.md
+else
+   @if ! cmp -s tmp-aarch64-tune.md \
+ $(srcdir)/config/aarch64/aarch64-tune.md; then \
+ echo "aarch64-tune.md has changed; either"; \
+ echo "configure with --enable-maintainer-mode"; \
+ echo "or copy tmp-aarch64-tune.md to $(srcdir)/config/aarch64/aarch64-tune.md"; \
+ exit 1; \
+   fi
+endif
$(STAMP) s-aarch64-tune-md
  
  s-mddeps: s-aarch64-tune-md



Jakub



OK.

I think we have a similar issue for arm with arm-tune.md and 
arm-tables.opt.  Perhaps we should adopt a similar approach for those as 
well.


R.
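
For readers skimming the thread, the decision made by the variant patch above can be sketched outside of make. This is only an illustrative Python model, not anything from the GCC tree: the function name `update_generated` and its parameters are hypothetical, and the copy stands in for what move-if-change does (the real script moves the temporary file).

```python
import filecmp
import shutil


def update_generated(tmp_path, src_path, maintainer_mode):
    """Model of the s-aarch64-tune-md rule; returns False if the build should fail.

    tmp_path plays the role of tmp-aarch64-tune.md and src_path the role of
    $(srcdir)/config/aarch64/aarch64-tune.md (both names are stand-ins).
    """
    same = filecmp.cmp(tmp_path, src_path, shallow=False)
    if maintainer_mode:
        # Maintainer mode: refresh the source-tree copy only if it changed.
        if not same:
            shutil.copyfile(tmp_path, src_path)
        return True
    if not same:
        # Outside maintainer mode the source tree is never written to;
        # a stale file is reported and the build stops.
        print("aarch64-tune.md has changed; either")
        print("configure with --enable-maintainer-mode")
        print("or copy tmp-aarch64-tune.md to the source directory")
        return False
    return True
```

The point of the design is that a non-maintainer build still notices a forgotten regeneration, but only ever reads from the source directory.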


Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

2022-04-04 Thread Tom de Vries via Gcc-patches

On 4/4/22 13:07, Jakub Jelinek wrote:

On Mon, Apr 04, 2022 at 01:05:12PM +0200, Tom de Vries wrote:

2022-04-04  Tom de Vries  

* testsuite/libgomp.fortran/examples-4/on_device_arch.c: Copy from
parent dir.


Wouldn't just ! { dg-additional-sources ../on_device_arch.c }
work?


It does, pushed with that update.

Thanks,
- Tom



Re: [PATCH] libstdc++: Add pretty printer for std::span

2022-04-04 Thread Jonathan Wakely via Gcc-patches
On Mon, 4 Apr 2022 at 11:54, Philipp Fent via Libstdc++
 wrote:
>
> This improves the debug output for C++20 spans.
> Before:
> {static extent = 18446744073709551615, _M_ptr = 0x7fffb9a8,
> _M_extent = {_M_extent_value = 2}}
> Now with StdSpanPrinter:
> std::span of length 2 = {1, 2}

Nice, thanks. I'll get this committed in time for GCC 12 (and backport
it to release branches too).
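
To make the before/after output quoted above easier to follow, here is a gdb-free sketch of the protocol the printer relies on: a summary string plus an iterator of ('[i]', value) pairs. It is an illustration only. `SpanChildren`, `span_size`, `describe` and `DYNAMIC_EXTENT` are hypothetical names, a plain list stands in for the `_M_ptr` pointer, and the 64-bit value of std::size_t(-1) is an assumption taken from the "Before" output.

```python
# std::size_t(-1) on a 64-bit target (assumption, matches the "Before" output).
DYNAMIC_EXTENT = 2**64 - 1


class SpanChildren:
    """Iterator yielding ('[i]', element) pairs, the shape gdb expects from children()."""

    def __init__(self, elements, size):
        self.elements = elements
        self.size = size
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.count == self.size:
            raise StopIteration
        index = self.count
        self.count += 1
        return '[%d]' % index, self.elements[index]


def span_size(static_extent, runtime_extent):
    """Use the stored extent for dynamic spans, the template argument otherwise."""
    if static_extent == DYNAMIC_EXTENT:
        return runtime_extent
    return static_extent


def describe(typename, static_extent, runtime_extent, elements):
    """Return the summary line and the list of labelled children."""
    size = span_size(static_extent, runtime_extent)
    return '%s of length %d' % (typename, size), list(SpanChildren(elements, size))
```

With the 'array' display hint, gdb renders the children as a brace-enclosed list after the summary, which is where the "= {1, 2}" part of the new output comes from.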


> ---
>  libstdc++-v3/python/libstdcxx/v6/printers.py  | 38 +++
>  .../libstdc++-prettyprinters/cxx20.cc | 11 ++
>  2 files changed, 49 insertions(+)
>
> diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py 
> b/libstdc++-v3/python/libstdcxx/v6/printers.py
> index f7a7f9961..6d8b765f2 100644
> --- a/libstdc++-v3/python/libstdcxx/v6/printers.py
> +++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
> @@ -1654,6 +1654,43 @@ class StdRegexStatePrinter:
>  s = "{}, {}={}".format(s, v, self.val['_M_' + v])
>  return "{%s}" % (s)
>
> +class StdSpanPrinter:
> +    "Print a std::span"
> +
> +    class _iterator(Iterator):
> +        def __init__(self, begin, size):
> +            self.count = 0
> +            self.begin = begin
> +            self.size = size
> +
> +        def __iter__ (self):
> +            return self
> +
> +        def __next__ (self):
> +            if self.count == self.size:
> +                raise StopIteration
> +
> +            count = self.count
> +            self.count = self.count + 1
> +            return '[%d]' % count, (self.begin + count).dereference()
> +
> +    def __init__(self, typename, val):
> +        self.typename = typename
> +        self.val = val
> +        if val.type.template_argument(1) == gdb.parse_and_eval('static_cast<std::size_t>(-1)'):
> +            self.size = val['_M_extent']['_M_extent_value']
> +        else:
> +            self.size = val.type.template_argument(1)
> +
> +    def to_string(self):
> +        return '%s of length %d' % (self.typename, self.size)
> +
> +    def children(self):
> +        return self._iterator(self.val['_M_ptr'], self.size)
> +
> +    def display_hint(self):
> +        return 'array'
> +
>  # A "regular expression" printer which conforms to the
>  # "SubPrettyPrinter" protocol from gdb.printing.
>  class RxPrinter(object):
> @@ -2170,6 +2207,7 @@ def build_libstdcxx_dictionary ():
>      libstdcxx_printer.add_version('std::', 'partial_ordering', StdCmpCatPrinter)
>      libstdcxx_printer.add_version('std::', 'weak_ordering', StdCmpCatPrinter)
>      libstdcxx_printer.add_version('std::', 'strong_ordering', StdCmpCatPrinter)
> +    libstdcxx_printer.add_version('std::', 'span', StdSpanPrinter)
>
>  # Extensions.
>  libstdcxx_printer.add_version('__gnu_cxx::', 'slist', StdSlistPrinter)
> diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc 
> b/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
> index b0de25c27..76023df93 100644
> --- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
> +++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/cxx20.cc
> @@ -18,8 +18,10 @@
>  // with this library; see the file COPYING3.  If not see
>  // <http://www.gnu.org/licenses/>.
>
> +#include <array>
>  #include <compare>
>  #include <iostream>
> +#include <span>
>
>  struct X
>  {
> @@ -54,6 +56,15 @@ main()
>auto c10 = 0.0 <=> __builtin_nan("");
>  // { dg-final { note-test c10 "std::partial_ordering::unordered" } }
>
> +  auto il = {1, 2};
> +  auto s1 = std::span(il);
> +  static_assert(s1.extent == std::size_t(-1));
> +// { dg-final { note-test s1 {std::span of length 2 = {1, 2}} } }
> +  auto a = std::array{3, 4};
> +  auto s2 = std::span(a);
> +  static_assert(s2.extent == std::size_t(2));
> +// { dg-final { note-test s2 {std::span of length 2 = {3, 4}} } }
> +
>std::cout << "\n";
>return 0;// Mark SPOT
>  }
> --
> 2.35.1
>


Re: [PATCH V3] Split vector load from parm_decl to elemental loads to avoid STLF stalls.

2022-04-04 Thread Hongtao Liu via Gcc-patches
On Fri, Apr 1, 2022 at 4:32 PM liuhongt via Gcc-patches
 wrote:
>
> Update in V3:
> 1. Add -param=x86-stlf-window-ninsns= (default 64).
> 2. Exclude call in the window.
>
> Since cfg is freed before machine_reorg, just do a rough calculation
> of the window according to the layout.
> Also according to an experiment on CLX, set window size to 64.
>
> Currently only handle V2DFmode load since it doesn't need any scratch
> registers, and it's sufficient to recover cray performance for -O2
> compared to GCC11.

I'm going to check in the patch.
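
As a reading aid for the patch quoted below, the window scan it performs can be modelled roughly as follows. This is a simplified Python sketch, not GCC code: `Insn`, its fields and `loads_to_split` are hypothetical stand-ins for the RTL insn chain and the checks in ix86_split_stlf_stall_load.

```python
from dataclasses import dataclass


@dataclass
class Insn:
    """Toy instruction: kind is 'load', 'uncond_jump', 'return', 'call', 'debug' or 'other'."""
    kind: str
    mode: str = ''
    from_parm: bool = False


def loads_to_split(insns, window_ninsns=64):
    """Return indices of V2DF loads from parameters inside the leading window."""
    picked = []
    window = 0
    for i, insn in enumerate(insns):
        if insn.kind == 'debug':
            continue  # debug insns do not count towards the window
        window += 1
        if window > window_ninsns:
            break  # past the rough STLF window, stop scanning
        if insn.kind in ('uncond_jump', 'return', 'call'):
            break  # layout-based estimate is no longer meaningful
        if insn.kind == 'load' and insn.mode == 'V2DF' and insn.from_parm:
            picked.append(i)
    return picked
```

Each picked load would then be replaced by a low-half and a high-half load; the sketch only covers the selection step.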
>
> gcc/ChangeLog:
>
> PR target/101908
> * config/i386/i386.cc (ix86_split_stlf_stall_load): New
> function
> (ix86_reorg): Call ix86_split_stlf_stall_load.
> * config/i386/i386.opt (-param=x86-stlf-window-ninsns=): New
> param.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr101908-1.c: New test.
> * gcc.target/i386/pr101908-2.c: New test.
> * gcc.target/i386/pr101908-3.c: New test.
> ---
>  gcc/config/i386/i386.cc| 61 ++
>  gcc/config/i386/i386.opt   |  4 ++
>  gcc/testsuite/gcc.target/i386/pr101908-1.c | 12 +
>  gcc/testsuite/gcc.target/i386/pr101908-2.c | 12 +
>  gcc/testsuite/gcc.target/i386/pr101908-3.c | 14 +
>  5 files changed, 103 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr101908-3.c
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 5a561966eb4..3f8a2c7932d 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -21933,6 +21933,65 @@ ix86_seh_fixup_eh_fallthru (void)
>emit_insn_after (gen_nops (const1_rtx), insn);
>  }
>  }
> +/* Split vector load from parm_decl to elemental loads to avoid STLF
> +   stalls.  */
> +static void
> +ix86_split_stlf_stall_load ()
> +{
> +  rtx_insn* insn, *start = get_insns ();
> +  unsigned window = 0;
> +
> +  for (insn = start; insn; insn = NEXT_INSN (insn))
> +{
> +  if (!NONDEBUG_INSN_P (insn))
> +   continue;
> +  window++;
> +  /* Insert 64 vaddps %xmm18, %xmm19, %xmm20(no dependence between each
> +other, just emulate for pipeline) before stalled load, stlf stall
> +case is as fast as no stall cases on CLX.
> +Since CFG is freed before machine_reorg, just do a rough
> +calculation of the window according to the layout.  */
> +  if (window > (unsigned) x86_stlf_window_ninsns)
> +   return;
> +
> +  if (any_uncondjump_p (insn)
> + || ANY_RETURN_P (PATTERN (insn))
> + || CALL_P (insn))
> +   return;
> +
> +  rtx set = single_set (insn);
> +  if (!set)
> +   continue;
> +  rtx src = SET_SRC (set);
> +  if (!MEM_P (src)
> + /* Only handle V2DFmode load since it doesn't need any scratch
> +register.  */
> + || GET_MODE (src) != E_V2DFmode
> + || !MEM_EXPR (src)
> + || TREE_CODE (get_base_address (MEM_EXPR (src))) != PARM_DECL)
> +   continue;
> +
> +  rtx zero = CONST0_RTX (V2DFmode);
> +  rtx dest = SET_DEST (set);
> +  rtx m = adjust_address (src, DFmode, 0);
> +  rtx loadlpd = gen_sse2_loadlpd (dest, zero, m);
> +  emit_insn_before (loadlpd, insn);
> +  m = adjust_address (src, DFmode, 8);
> +  rtx loadhpd = gen_sse2_loadhpd (dest, dest, m);
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +   {
> + fputs ("Due to potential STLF stall, split instruction:\n",
> +dump_file);
> + print_rtl_single (dump_file, insn);
> + fputs ("To:\n", dump_file);
> + print_rtl_single (dump_file, loadlpd);
> + print_rtl_single (dump_file, loadhpd);
> +   }
> +  PATTERN (insn) = loadhpd;
> +  INSN_CODE (insn) = -1;
> +  gcc_assert (recog_memoized (insn) != -1);
> +}
> +}
>
>  /* Implement machine specific optimizations.  We implement padding of returns
> for K8 CPUs and pass to avoid 4 jumps in the single 16 byte window.  */
> @@ -21948,6 +22007,8 @@ ix86_reorg (void)
>
>if (optimize && optimize_function_for_speed_p (cfun))
>  {
> +  if (TARGET_SSE2)
> +   ix86_split_stlf_stall_load ();
>if (TARGET_PAD_SHORT_FUNCTION)
> ix86_pad_short_function ();
>else if (TARGET_PAD_RETURNS)
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index d8e8656a8ab..a6b0e28f238 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -1210,3 +1210,7 @@ Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, 
> AVX, AVX2, AVX512F and AVX5
>  mdirect-extern-access
>  Target Var(ix86_direct_extern_access) Init(1)
>  Do not use GOT to access external symbols.
> +
> +-param=x86-stlf-window-ninsns=
> +Target Joined UInteger Var(x86_stlf_window_ninsns) Init(64) Param
> +Instructions nu

[PATCH] testsuite: Add -fno-tree-loop-distribute-patterns for s390.

2022-04-04 Thread Robin Dapp via Gcc-patches
Hi,

in gcc.dg/Wuse-after-free-2.c we try to detect a use-after-free.  On
s390 the test's while loop is converted into a rawmemchr builtin making
it impossible to determine that the pointers *p and *q are related.

Therefore, disable the tree loop distribute patterns pass on s390 for
this test.

OK for trunk?

Regards
 Robin

gcc/testsuite/ChangeLog:

* gcc.dg/Wuse-after-free-2.c:
Add -fno-tree-loop-distribute-patterns for s390*.


---
diff --git a/gcc/testsuite/gcc.dg/Wuse-after-free-2.c b/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
index 9f7ed4529f0..3a8f690e9f8 100644
--- a/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
+++ b/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
@@ -1,6 +1,8 @@
 /* PR middle-end/104232 - spurious -Wuse-after-free after conditional free
{ dg-do compile }
-   { dg-options "-O2 -Wall" } */
+   { dg-options "-O2 -Wall" }
+   { dg-additional-options "-fno-tree-loop-distribute-patterns" { target { s390*-*-* } } }
+   */

 void free (void*);

-- 
2.35.1



[PATCH] testsuite/s390: Change nle -> h in ifcvt tests.

2022-04-04 Thread Robin Dapp via Gcc-patches
Hi,

we have been emitting the "higher" variants instead of the "not less or
equal" ones for a while.  Change the test expectations accordingly.

OK for trunk?

Regards
 Robin

gcc/testsuite/ChangeLog:

* gcc.target/s390/ifcvt-two-insns-bool.c: Change nle to h.
* gcc.target/s390/ifcvt-two-insns-int.c: Ditto.
* gcc.target/s390/ifcvt-two-insns-long.c: Ditto.


---
diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
index d2f18f58e45..df0416a71d8 100644
--- a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
+++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
@@ -3,8 +3,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -march=z13 --save-temps" } */

-/* { dg-final { scan-assembler "lochinle\t%r.?,1" } } */
-/* { dg-final { scan-assembler "locrnle\t.*" } } */
+/* { dg-final { scan-assembler "lochih\t%r.?,1" } } */
+/* { dg-final { scan-assembler "locrh\t.*" } } */
 #include 
 #include 
 #include 
diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
index 031cc433f56..181173b91e9 100644
--- a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
+++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
@@ -3,8 +3,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -march=z13 --save-temps" } */

-/* { dg-final { scan-assembler "lochinle\t%r.?,1" } } */
-/* { dg-final { scan-assembler "locrnle\t.*" } } */
+/* { dg-final { scan-assembler "lochih\t%r.?,1" } } */
+/* { dg-final { scan-assembler "locrh\t.*" } } */
 #include 
 #include 
 #include 
diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c
index cd04d2ad33e..c66ef6cfdea 100644
--- a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c
+++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c
@@ -3,8 +3,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -march=z13 --save-temps" } */

-/* { dg-final { scan-assembler "locghinle\t%r.?,1" } } */
-/* { dg-final { scan-assembler "locgrnle\t.*" } } */
+/* { dg-final { scan-assembler "locghih\t%r.?,1" } } */
+/* { dg-final { scan-assembler "locgrh\t.*" } } */
 #include 
 #include 
 #include 
-- 
2.35.1



[PATCH] testsuite/s390: Adapt test expectations.

2022-04-04 Thread Robin Dapp via Gcc-patches
Hi,

some tests expect a convert instruction but nowadays the conversion is
already done at compile time.  This results in a literal-pool load.
Change the tests accordingly.

OK for trunk?

Regards
 Robin

gcc/testsuite/ChangeLog:

* gcc.target/s390/zvector/vec-double-compile.c: Expect vl
  instead of vc*.
* gcc.target/s390/zvector/vec-float-compile.c: Ditto.
* gcc.target/s390/zvector/vec-signed-compile.c: Ditto.
* gcc.target/s390/zvector/vec-unsigned-compile.c: Ditto.


---
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-double-compile.c b/gcc/testsuite/gcc.target/s390/zvector/vec-double-compile.c
index 0a70b095b88..24a49474e38 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/vec-double-compile.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-double-compile.c
@@ -31,6 +31,11 @@ vcdlgb_mem (vector unsigned long long *a)
   return vec_double (*a);
 }

+/* Since r12-4475-g247c407c83f001 the following immediates are being
+   converted and directly stored in the literal pool so no explicit
+   conversion is necessary.   */
+/* { dg-final { scan-assembler-times "vl\t%v\[0-9\]+,\.L\[0-9\]+\-\.L\[0-9\]+\\(%r\[0-9\]+\\)" 2 } } */
+
 vector double
 vcdgb_imm ()
 {
@@ -43,5 +48,5 @@ vcdlgb_imm ()
   return vec_double ((vector unsigned long long){ 1, 2 });
 }

-/* { dg-final { scan-assembler-times "vcdgb\t" 3 } } */
-/* { dg-final { scan-assembler-times "vcdlgb\t" 3 } } */
+/* { dg-final { scan-assembler-times "vcdgb\t" 2 } } */
+/* { dg-final { scan-assembler-times "vcdlgb\t" 2 } } */
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-float-compile.c b/gcc/testsuite/gcc.target/s390/zvector/vec-float-compile.c
index a591e23872e..bf5cebb34f5 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/vec-float-compile.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-float-compile.c
@@ -31,6 +31,11 @@ vcelfb_mem (vector unsigned int *a)
   return vec_float (*a);
 }

+/* Since r12-4475-g247c407c83f001 the following immediates are being
+   converted and directly stored in the literal pool so no explicit
+   conversion is necessary.   */
+/* { dg-final { scan-assembler-times "vl\t%v\[0-9\]+,\.L\[0-9\]+\-\.L\[0-9\]+\\(%r\[0-9\]+\\)" 2 } } */
+
 vector float
 vcefb_imm ()
 {
@@ -43,5 +48,5 @@ vcelfb_imm ()
   return vec_float ((vector unsigned int){ 1, 2 });
 }

-/* { dg-final { scan-assembler-times "vcefb\t" 3 } } */
-/* { dg-final { scan-assembler-times "vcelfb\t" 3 } } */
+/* { dg-final { scan-assembler-times "vcefb\t" 2 } } */
+/* { dg-final { scan-assembler-times "vcelfb\t" 2 } } */
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-signed-compile.c b/gcc/testsuite/gcc.target/s390/zvector/vec-signed-compile.c
index 9814cc5d74d..1d30ba3a9ad 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/vec-signed-compile.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-signed-compile.c
@@ -31,6 +31,11 @@ vcgdb_mem (vector double *a)
   return vec_signed (*a);
 }

+/* Since r12-4475-g247c407c83f001 the following immediates are being
+   converted and directly stored in the literal pool so no explicit
+   conversion is necessary.   */
+/* { dg-final { scan-assembler-times "vl\t%v\[0-9\]+,\.L\[0-9\]+\-\.L\[0-9\]+\\(%r\[0-9\]+\\)" 2 } } */
+
 vector signed int
 vcfeb_imm ()
 {
@@ -43,5 +48,5 @@ vcgdb_imm ()
   return vec_signed ((vector double){ 1.0, 2.0 });
 }

-/* { dg-final { scan-assembler-times "vcfeb\t" 3 } } */
-/* { dg-final { scan-assembler-times "vcgdb\t" 3 } } */
+/* { dg-final { scan-assembler-times "vcfeb\t" 2 } } */
+/* { dg-final { scan-assembler-times "vcgdb\t" 2 } } */
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-unsigned-compile.c b/gcc/testsuite/gcc.target/s390/zvector/vec-unsigned-compile.c
index 1eed284adff..90347e618c1 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/vec-unsigned-compile.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-unsigned-compile.c
@@ -31,6 +31,11 @@ vclgdb_mem (vector double *a)
   return vec_unsigned (*a);
 }

+/* Since r12-4475-g247c407c83f001 the following immediates are being
+   converted and directly stored in the literal pool so no explicit
+   conversion is necessary.   */
+/* { dg-final { scan-assembler-times "vl\t%v\[0-9\]+,\.L\[0-9\]+\-\.L\[0-9\]+\\(%r\[0-9\]+\\)" 2 } } */
+
 vector unsigned int
 vclfeb_imm ()
 {
@@ -43,5 +48,5 @@ vclgdb_imm ()
   return vec_unsigned ((vector double){ 1.0, 2.0 });
 }

-/* { dg-final { scan-assembler-times "vclfeb\t" 3 } } */
-/* { dg-final { scan-assembler-times "vclgdb\t" 3 } } */
+/* { dg-final { scan-assembler-times "vclfeb\t" 2 } } */
+/* { dg-final { scan-assembler-times "vclgdb\t" 2 } } */
-- 
2.35.1



Re: [PATCH] aarch64: Restrict aarch64-tune.md regeneration to --enable-maintainer-mode [PR105144]

2022-04-04 Thread Jakub Jelinek via Gcc-patches
On Mon, Apr 04, 2022 at 12:32:27PM +0100, Richard Earnshaw via Gcc-patches 
wrote:
> OK.

Thanks, now committed.

> I think we have a similar issue for arm with arm-tune.md and arm-tables.opt.
> Perhaps we should adopt a similar approach for those as well.

From what I can see, both arm and c6x suffer from the point 3) in the PR,
i.e. they regenerate files in the source tree regardless of
--enable-maintainer-mode.
As for 2), both arm and c6x are ok, but handle it in a different way from
what I did (s-mddeps dependency addition) - they instead set
MD_INCLUDES = long-list-of-*.md-files
s-config s-conditions s-flags s-codes s-constants s-emit s-recog s-preds \
s-opinit s-extract s-peep s-attr s-attrtab s-output: $(MD_INCLUDES)
The MD_INCLUDES variable is overwritten later if mddeps.mk exists and
is included, so the one in t-arm etc. doesn't need to be accurate and can
just contain the files that are generated.  The MD_INCLUDES approach
has the disadvantage that people will try to add stuff to it even when
it isn't needed.
rs6000 is the only target that uses MD_INCLUDES beyond arm and c6x, but
in that case the rule to regenerate it is commented out (should that be
enabled for maintainer mode?).

Jakub



Re: [PATCH] aarch64: Restrict aarch64-tune.md regeneration to --enable-maintainer-mode [PR105144]

2022-04-04 Thread Richard Earnshaw via Gcc-patches




On 04/04/2022 13:12, Jakub Jelinek via Gcc-patches wrote:

On Mon, Apr 04, 2022 at 12:32:27PM +0100, Richard Earnshaw via Gcc-patches 
wrote:

OK.


Thanks, now committed.


I think we have a similar issue for arm with arm-tune.md and arm-tables.opt.
Perhaps we should adopt a similar approach for those as well.


 From what I can see, both arm and c6x suffer from the point 3) in the PR,
i.e. they regenerate files in the source tree regardless of
--enable-maintainer-mode.


Well the read-only tree issue will simply result in a build failure if 
the file needs to change - I don't see that as a major issue.


The only risk is if multiple builds are running from the same sources at 
the same time and you somehow end up with a race condition.



As for 2), both arm and c6x are ok, but handle it in a different way from
what I did (s-mddeps dependency addition) - they instead set
MD_INCLUDES = long-list-of-*.md-files
s-config s-conditions s-flags s-codes s-constants s-emit s-recog s-preds \
 s-opinit s-extract s-peep s-attr s-attrtab s-output: $(MD_INCLUDES)
The MD_INCLUDES variable is overwritten later if mddeps.mk exists and
is included, so the one in t-arm etc. doesn't need to be accurate and can
just contain the files that are generated.  The MD_INCLUDES approach
has the disadvantage that people will try to add stuff to it even when
it isn't needed.
rs6000 is the only target that uses MD_INCLUDES beyond arm and c6x, but
in that case the rule to regenerate it is commented out (should that be
enabled for maintainer mode?).

Jakub



Really the long-term solution is to fix these so that both the md 
reading machinery and the opt machinery can read generated files from 
the build directories, then we wouldn't need to copy anything other than 
config files back to the source tree.


R.


Re: try multi dest registers in default_zero_call_used_regs

2022-04-04 Thread Richard Sandiford via Gcc-patches
Alexandre Oliva  writes:
> Hello, Richard,
>
> Thanks for the review!
>
> On Mar 31, 2022, Richard Sandiford  wrote:
>
>>> +   /* If the natural mode doesn't work, try some wider mode.  */
>>> +   if (!targetm.hard_regno_mode_ok (regno, mode))
>>> + {
>>> +   for (int nregs = 2;
>>> +regno + nregs <= FIRST_PSEUDO_REGISTER
>>> +  && TEST_HARD_REG_BIT (need_zeroed_hardregs,
>>> +regno + nregs - 1);
>>> +nregs++)
>>> + {
>>> +   mode = choose_hard_reg_mode (regno, nregs, 0);
>
>> I like the idea, but it would be good to avoid the large:
>
>>   FIRST_PSEUDO_REGISTER * FIRST_PSEUDO_REGISTER * NUM_MACHINE_MODES
>
>> constant factor.
>
> Entering the nregs loop, because the register can't be used in its
> natural mode, is supposed to be an unusual case, not worth optimizing
> much under Amdahl's law.  I gather the aggregate trip counts are
> unlikely to hit the theoretical O(n^2) because registers that would take
> the loop are rare and expected to be paired/grouped up.  If that
> assumption doesn't hold, then a cap would indeed be desirable.
>
>> How about if init_reg_modes_target recorded the maximum value of
>> x_hard_regno_nregs?
>
> I had thought of a cap but couldn't find one I was happy with, and in
> the end I thought we didn't need one.  But this is indeed a good one to
> use.  Thanks, I'm implementing it.
>
>> This seems big enough to be worth splitting out into a helper, rather
>> than repeating.
>
> I had considered that, but it didn't seem to me it would bring an
> improvement.  As it turns out, it does.  Thanks.
>
>>> -   rtx zsrc = gen_rtx_REG (mode, src);
>>> +   rtx src_rtx = (mode == GET_MODE (regno_reg_rtx[src])
>>> +  ? regno_reg_rtx[src]
>>> +  : gen_rtx_REG (mode, src));
>
>> Is this needed?  The original gen_rtx_REG (mode, src) seems OK.
>
> No, it's not needed, it's just an attempt to avoid allocating RTL that
> we have handy.  This function could in theory make several attempts at
> allocating rtl for each register in the shrinking pending set.  I
> thought every saved bit could help.

But if that's true, it should happen in gen_rtx_REG.  It already has:

#if 0
  /* If the per-function register table has been set up, try to re-use
 an existing entry in that table to avoid useless generation of RTL.

 This code is disabled for now until we can fix the various backends
 which depend on having non-shared hard registers in some cases.   Long
 term we want to re-enable this code as it can significantly cut down
 on the amount of useless RTL that gets generated.

 We'll also need to fix some code that runs after reload that wants to
 set ORIGINAL_REGNO.  */

  if (cfun
  && cfun->emit
  && regno_reg_rtx
  && regno < FIRST_PSEUDO_REGISTER
  && reg_raw_mode[regno] == mode)
return regno_reg_rtx[regno];
#endif

Having the special case here in targhooks.c would set a precedent that
efficiency-conscious callers should always do the ?: rather than call
gen_rtx_REG directly.  Keeping the code in gen_rtx_REG means that we can
flip the switch when backends have been fixed (maybe they already have).

OK without the introduction of the ?:, thanks.

Richard
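
To illustrate the argument for keeping the reuse logic inside gen_rtx_REG
rather than at each call site, here is a minimal C++ sketch.  It is not GCC
code: RegFactory, Reg, Mode and kNumHardRegs are hypothetical stand-ins for
gen_rtx_REG, rtx, machine_mode and FIRST_PSEUDO_REGISTER.  The only point is
that callers always go through one factory, and the factory alone decides
whether a shared table entry may be returned.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins; none of these names are GCC's.
enum Mode { MODE_SI, MODE_DI };
struct Reg { Mode mode; unsigned regno; };
const unsigned kNumHardRegs = 4;

class RegFactory {
public:
  RegFactory() : reuse_enabled_(false) {
    // Pre-built shared entries, one per hard register in its natural mode.
    for (unsigned i = 0; i < kNumHardRegs; ++i)
      table_.push_back(Reg{MODE_SI, i});
  }

  // The single switch to flip once every user tolerates shared objects.
  void set_reuse(bool on) { reuse_enabled_ = on; }

  // Callers never write the "reuse or allocate" choice themselves.
  const Reg *make(Mode mode, unsigned regno) {
    if (reuse_enabled_ && regno < kNumHardRegs && table_[regno].mode == mode)
      return &table_[regno];
    fresh_.push_back(std::unique_ptr<Reg>(new Reg{mode, regno}));
    return fresh_.back().get();
  }

private:
  bool reuse_enabled_;
  std::vector<Reg> table_;                   // filled once, never resized
  std::vector<std::unique_ptr<Reg>> fresh_;  // owns freshly allocated objects
};

// Usage: with reuse on, a natural-mode request returns the shared entry,
// while a request in another mode still allocates.
inline void reg_factory_example() {
  RegFactory f;
  f.set_reuse(true);
  assert(f.make(MODE_SI, 1) == f.make(MODE_SI, 1));
  assert(f.make(MODE_DI, 1) != f.make(MODE_DI, 1));
}
```

With this shape, enabling reuse later is a one-line change in the factory and
no caller needs the ?: special case.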

>
>
> Here's what I'm regstrapping on x86_64-linux-gnu, after verifying that
> it does the job on the affected arm variant.  Ok to install, assuming no
> surprises in the testing?
>
>
> try multi-reg dest in default_zero_call_used_regs
>
> From: Alexandre Oliva 
>
> When the mode of regno_reg_rtx is not hard_regno_mode_ok for the
> target, try grouping the register with subsequent ones.  This enables
> s16 to s31 and their hidden pairs to be zeroed with the default logic
> on some arm variants.
>
>
> for  gcc/ChangeLog
>
>   * targhooks.c (default_zero_call_used_regs): Attempt to group
>   regs that the target refuses to use in their natural modes.
>   (zcur_select_mode_rtx): New.
>   * regs.h (struct target_regs): Add x_hard_regno_max_nregs.
>   (hard_regno_max_nregs): Define.
>   * reginfo.c (init_reg_modes_target): Set hard_regno_max_nregs.
> ---
>  gcc/reginfo.cc   |9 --
>  gcc/regs.h   |5 +++
>  gcc/targhooks.cc |   86 
> --
>  3 files changed, 89 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/reginfo.cc b/gcc/reginfo.cc
> index 234f72eceeb25..67e30cab42855 100644
> --- a/gcc/reginfo.cc
> +++ b/gcc/reginfo.cc
> @@ -441,10 +441,15 @@ init_reg_modes_target (void)
>  {
>int i, j;
>  
> +  this_target_regs->x_hard_regno_max_nregs = 1;
>for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
>  for (j = 0; j < MAX_MACHINE_MODE; j++)
> -  this_target_regs->x_hard_regno_nregs[i][j]
> - = targetm.hard_regno_nregs (i, (machine_mode) j);
> +  {
> + unsigned char nregs = targetm.hard_regno_nregs (i, (machine_mode) j);
> + this_target_regs-

GCC 12.0.1 Status Report (2022-04-04)

2022-04-04 Thread Richard Biener via Gcc-patches


Status
======

The GCC development branch is in regression and documentation fixing
mode (Stage 4) in preparation for the release of GCC 12.  Re-opening
of general development will happen once we reach zero P1 regressions
which is when we branch for the release.  Time-wise, history projects
that to happen around the end of April 2022.

While we've made progress since mid January in the quest to squash
P1 regressions and other important bugs there's still quite a bit
of work ahead.  Please look after bugs you have assigned to yourself
that are blocking the release and consider helping out with looking
after bugs that are in an area of the compiler where you'd be
considered experienced.

Again this is also the time to look after non-primary/secondary
targets and ensure they build, install and work correctly.  We
will not hold off releasing with late discovered problems there.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               23   -  15
P2              387   +  77
P3               84   - 202
P4              248   +  27
P5               25
--------        ---   -----------------------
Total P1-P3     494   - 140
Total           767   - 113


Previous Report
===============

https://gcc.gnu.org/pipermail/gcc/2022-January/238136.html


Re: [PATCH] testsuite: Add -fno-tree-loop-distribute-patterns for s390.

2022-04-04 Thread Richard Biener via Gcc-patches
On Mon, Apr 4, 2022 at 1:52 PM Robin Dapp via Gcc-patches
 wrote:
>
> Hi,
>
> in gcc.dg/Wuse-after-free-2.c we try to detect a use-after-free.  On
> s390 the test's while loop is converted into a rawmemchr builtin making
> it impossible to determine that the pointers *p and *q are related.
>
> Therefore, disable the tree loop distribute patterns pass on s390 for
> this test.
>
> OK for trunk?
>
> Regards
>  Robin
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/Wuse-after-free-2.c:
> Add -fno-tree-loop-distribute-patterns for s390*.
>
>
> ---
> diff --git a/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
> b/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
> index 9f7ed4529f0..3a8f690e9f8 100644
> --- a/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
> +++ b/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
> @@ -1,6 +1,8 @@
>  /* PR middle-end/104232 - spurious -Wuse-after-free after conditional free
> { dg-do compile }
> -   { dg-options "-O2 -Wall" } */
> +   { dg-options "-O2 -Wall" }
> +   { dg-additional-options "-fno-tree-loop-distribute-patterns" {
> target { s390*-*-* } } }

Please add the option unconditionally and add a comment about rawmemchr.

OK with that change.

> +   */
>
>  void free (void*);
>
> --
> 2.35.1
>
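
For context on the transformation discussed above: the loop-distribution
pattern matcher can replace a search loop that has no length bound with a
rawmemchr-style builtin on targets that provide one, which per Robin's
message is what happens on s390.  A minimal sketch of a loop of that general
shape follows.  scan_for is a hypothetical function, not the code from
Wuse-after-free-2.c, and whether a given compiler and target actually performs
the replacement is an assumption to verify.

```cpp
#include <cassert>

// Hypothetical example: advance p until the byte c is found.  There is no
// length bound, so the caller must guarantee that c occurs in the buffer.
// A loop of this shape is a candidate for replacement by a rawmemchr-style
// builtin; -fno-tree-loop-distribute-patterns asks GCC not to do that.
const char *scan_for(const char *p, char c) {
  while (*p != c)
    ++p;
  return p;
}

// Small self-check: the first ';' in "abc;def" is at offset 3.
inline void scan_for_example() {
  const char buf[] = "abc;def";
  assert(scan_for(buf, ';') == buf + 3);
}
```

Once the loop has become a builtin call, the result is no longer expressed as
pointer arithmetic on the argument, which is the loss of information the
warning test trips over.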


[pushed] c++: alias-tmpl equivalence and default targs [PR103852]

2022-04-04 Thread Jason Merrill via Gcc-patches
The suggested resolution for CWG1286, which we implemented, ignores default
template arguments, but this PR is an example of why that doesn't make
sense: the templates aren't functionally equivalent.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/103852
DR 1286

gcc/cp/ChangeLog:

* pt.cc (get_underlying_template): Compare default template args.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-dr1286a.C: Default args now matter.
* g++.dg/cpp1z/class-deduction-alias1.C: New test.
---
 gcc/cp/pt.cc| 13 +
 gcc/testsuite/g++.dg/cpp0x/alias-decl-dr1286a.C | 16 
 .../g++.dg/cpp1z/class-deduction-alias1.C   | 17 +
 3 files changed, 38 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction-alias1.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 75ed9a34018..1f0231f70e6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -6637,6 +6637,18 @@ get_underlying_template (tree tmpl)
   if (!comp_template_args (TI_ARGS (tinfo), alias_args))
break;
 
+  /* Are any default template arguments equivalent?  */
+  tree aparms = INNERMOST_TEMPLATE_PARMS (DECL_TEMPLATE_PARMS (tmpl));
+  tree uparms = INNERMOST_TEMPLATE_PARMS (DECL_TEMPLATE_PARMS 
(underlying));
+  const int nparms = TREE_VEC_LENGTH (aparms);
+  for (int i = 0; i < nparms; ++i)
+   {
+ tree adefarg = TREE_PURPOSE (TREE_VEC_ELT (aparms, i));
+ tree udefarg = TREE_PURPOSE (TREE_VEC_ELT (uparms, i));
+ if (!template_args_equal (adefarg, udefarg))
+   goto top_break;
+   }
+
   /* If TMPL adds or changes any constraints, it isn't equivalent.  I think
 it's appropriate to treat a less-constrained alias as equivalent.  */
   if (!at_least_as_constrained (underlying, tmpl))
@@ -6645,6 +6657,7 @@ get_underlying_template (tree tmpl)
   /* Alias is equivalent.  Strip it and repeat.  */
   tmpl = underlying;
 }
+  top_break:;
 
   return tmpl;
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-dr1286a.C 
b/gcc/testsuite/g++.dg/cpp0x/alias-decl-dr1286a.C
index 1780c9a47b7..fbd63d891d0 100644
--- a/gcc/testsuite/g++.dg/cpp0x/alias-decl-dr1286a.C
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-dr1286a.C
@@ -11,13 +11,13 @@ template struct A;
 template class> struct X;
 
 // equivalent to A
-template
+template
 using B = A;
 
 same,X> s1;
 
 // not equivalent to A: not all parameters used
-template
+template
 using C = A;
 
 different,X> d1;
@@ -29,32 +29,32 @@ using D = A;
 different,X> d2;
 
 // not equivalent to A: template-arguments in wrong order
-template
+template
 using E = A;
 
 different,X> d3;
 
-// equivalent to A: default arguments not considered
+// NOT equivalent to A: default arguments now considered
 template
 using F = A;
 
-same,X> s2;
+different,X> s2;
 
 // equivalent to A and B
-template
+template
 using G = A;
 
 same,X> s3;
 same,X> s3b;
 
 // equivalent to E
-template
+template
 using H = E;
 
 same,X> s4;
 
 // not equivalent to A: argument not identifier
-template
+template
 using I = A;
 
 different,X> d4;
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction-alias1.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction-alias1.C
new file mode 100644
index 000..1ec90b58e3a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction-alias1.C
@@ -0,0 +1,17 @@
+// PR c++/103852
+// { dg-do compile { target c++17 } }
+
+template  struct b{};
+template >
+struct s
+{
+s(T);
+};
+s c(100);
+template >
+using ss = s;// equivalent under proposed resolution of DR 1286
+ss tt(1);   // OK
+
+template 
+using ss2 = s;   // different default arg makes it non-equivalent
+ss2 tt2(1); // { dg-error "alias template deduction" "" { target c++17_only } }

base-commit: 2f0610acbc056052a108e4a46911fc21d0dca2ab
-- 
2.27.0
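
The claim in the commit message above, that an alias template with a
different default template argument is not functionally equivalent to the
template it aliases, can be seen without any reference to the equivalence
rules themselves.  In the sketch below, A, B and F are illustrative names
chosen to mirror the test's comments; the declarations are an assumption, not
the test's text.

```cpp
#include <type_traits>

template <class T, class U = int> struct A {};

// Same default argument as A: B<int> and A<int> both name A<int, int>.
template <class T, class U = int> using B = A<T, U>;

// Different default argument: F<int> names A<int, char>, not A<int, int>.
template <class T, class U = char> using F = A<T, U>;

static_assert(std::is_same<B<int>, A<int>>::value,
              "B<int> is A<int, int>");
static_assert(!std::is_same<F<int>, A<int>>::value,
              "F<int> is not A<int, int>");
static_assert(std::is_same<F<int>, A<int, char>>::value,
              "F<int> is A<int, char>");
```

Because F<int> and A<int> denote different types, treating F as
interchangeable with A would change the meaning of any use that relies on the
default.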



[pushed] c++: repeated friend template [PR101894]

2022-04-04 Thread Jason Merrill via Gcc-patches
Since olddecl isn't a definition, it doesn't get DECL_FRIEND_CONTEXT, so we
need to copy it from newdecl when we merge the declarations.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/101894

gcc/cp/ChangeLog:

* decl.cc (duplicate_decls): Copy DECL_FRIEND_CONTEXT.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/friend22.C: New test.
---
 gcc/cp/decl.cc | 5 +
 gcc/testsuite/g++.dg/lookup/friend22.C | 7 +++
 2 files changed, 12 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/lookup/friend22.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 69f60a6dc0f..0ff13e99595 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -2344,6 +2344,9 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, 
bool was_hidden)
  for (parm = DECL_ARGUMENTS (old_result); parm;
   parm = DECL_CHAIN (parm))
DECL_CONTEXT (parm) = old_result;
+
+ if (tree fc = DECL_FRIEND_CONTEXT (new_result))
+   SET_DECL_FRIEND_CONTEXT (old_result, fc);
}
}
 
@@ -2667,6 +2670,8 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, 
bool was_hidden)
 otherwise it is a DECL_FRIEND_CONTEXT.  */
  if (DECL_VIRTUAL_P (newdecl))
SET_DECL_THUNKS (newdecl, DECL_THUNKS (olddecl));
+ else if (tree fc = DECL_FRIEND_CONTEXT (newdecl))
+   SET_DECL_FRIEND_CONTEXT (olddecl, fc);
}
   else if (VAR_P (newdecl))
{
diff --git a/gcc/testsuite/g++.dg/lookup/friend22.C 
b/gcc/testsuite/g++.dg/lookup/friend22.C
new file mode 100644
index 000..f52a7d7bad5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/friend22.C
@@ -0,0 +1,7 @@
+// PR c++/101894
+
+struct A
+{
+  template friend void foo();
+  template friend void foo() {}
+};

base-commit: 2f0610acbc056052a108e4a46911fc21d0dca2ab
prerequisite-patch-id: 74ea9d22f5f8e4ebddcda6cbb7a371bcb08f0488
-- 
2.27.0



[RFC] Remove default option -fpie for projects that use -T linker options

2022-04-04 Thread Carlos Bilbao via Gcc-patches
Projects that rely on a linker script usually specify a memory location 
where the executable should be placed. This directly contradicts the 
default option -fpie for position independent executables. In fact, using
PIE generates a GOT, which might be undesirable for developers that need
control over the generated sections.

Would it make sense to assume -fno-pie in these situations?

Signed-off-by: Carlos Bilbao 


Re: [RFC] Remove default option -fpie for projects that use -T linker options

2022-04-04 Thread Koning, Paul via Gcc-patches
I'm not sure if it is valid to assume that a linker script "usually" specifies 
a fixed memory location.  

paul


> On Apr 4, 2022, at 11:06 AM, Carlos Bilbao via Gcc-patches 
>  wrote:
> 
> Projects that rely on a linker script usually specify a memory location 
> where the executable should be placed. This directly contradicts the 
> default option -fpie for position independent executables. In fact, using
> PIE generates a GOT, which might be undesirable for developers that need
> control over the generated sections.
> 
> Would it make sense to assume -fno-pie in these situations?
> 
> Signed-off-by: Carlos Bilbao 



[PATCH][committed] doc: Fix typos in match.pd documentation

2022-04-04 Thread Alex Coplan via Gcc-patches
Hi,

This patch fixes some spelling and grammar issues in the match.pd
documentation.

Pushed as obvious.

Thanks,
Alex

gcc/ChangeLog:

* doc/match-and-simplify.texi: Fix typos.
diff --git a/gcc/doc/match-and-simplify.texi b/gcc/doc/match-and-simplify.texi
index 055a5308e7d..b33d83518a7 100644
--- a/gcc/doc/match-and-simplify.texi
+++ b/gcc/doc/match-and-simplify.texi
@@ -19,7 +19,7 @@ tries to address several issues.
 gimplifying via force_gimple_operand
 @end enumerate
 
-To address these the project introduces a simple domain specific language
+To address these the project introduces a simple domain-specific language
 to write expression simplifications from which code targeting GIMPLE
 and GENERIC is auto-generated.  The GENERIC variant follows the
 fold_buildN API while for the GIMPLE variant and to address 2) new
@@ -40,7 +40,7 @@ APIs are introduced.
 @deftypefnx {GIMPLE function} tree gimple_simplify (enum built_in_function, 
tree, tree, gimple_seq *, tree (*)(tree))
 @deftypefnx {GIMPLE function} tree gimple_simplify (enum built_in_function, 
tree, tree, tree, gimple_seq *, tree (*)(tree))
 @deftypefnx {GIMPLE function} tree gimple_simplify (enum built_in_function, 
tree, tree, tree, tree, gimple_seq *, tree (*)(tree))
-The main GIMPLE API entry to the expression simplifications mimicing
+The main GIMPLE API entry to the expression simplifications mimicking
 that of the GENERIC fold_@{unary,binary,ternary@} functions.
 @end deftypefn
 
@@ -57,7 +57,7 @@ a valueization hook:
 @end deftypefn
 
 
-Ontop of these a @code{fold_buildN}-like API for GIMPLE is introduced:
+On top of these a @code{fold_buildN}-like API for GIMPLE is introduced:
 
 @deftypefn {GIMPLE function} tree gimple_build (gimple_seq *, location_t, enum 
tree_code, tree, tree, tree (*valueize) (tree) = NULL);
 @deftypefnx {GIMPLE function} tree gimple_build (gimple_seq *, location_t, 
enum tree_code, tree, tree, tree, tree (*valueize) (tree) = NULL);
@@ -78,9 +78,9 @@ and simplification is performed using the optional 
valueization hook.
 @section The Language
 @cindex The Language
 
-The language to write expression simplifications in resembles other
-domain-specific languages GCC uses.  Thus it is lispy.  Lets start
-with an example from the match.pd file:
+The language in which to write expression simplifications resembles
+other domain-specific languages GCC uses.  Thus it is lispy.  Let's
+start with an example from the match.pd file:
 
 @smallexample
 (simplify
@@ -100,7 +100,7 @@ function code names in all-caps, like @code{BUILT_IN_SQRT}.
 
 @code{@@n} denotes a so-called capture.  It captures the operand and lets
 you refer to it in other places of the match-and-simplify.  In the
-above example it is refered to in the replacement expression.  Captures
+above example it is referred to in the replacement expression.  Captures
 are @code{@@} followed by a number or an identifier.
 
 @smallexample
@@ -110,10 +110,10 @@ are @code{@@} followed by a number or an identifier.
 @end smallexample
 
 In this example @code{@@0} is mentioned twice which constrains the matched
-expression to have two equal operands.  Usually matches are constraint
-to equal types.  If operands may be constants and conversions are involved
+expression to have two equal operands.  Usually matches are constrained
+to equal types.  If operands may be constants and conversions are involved,
 matching by value might be preferred in which case use @code{0} to
-denote a by value match and the specific operand you want to refer to
+denote a by-value match and the specific operand you want to refer to
 in the result part.  This example also introduces
 operands written in C code.  These can be used in the expression
 replacements and are supposed to evaluate to a tree node which has to
@@ -129,7 +129,7 @@ be a valid GIMPLE operand (so you cannot generate 
expressions in C code).
 Here @code{@@0} captures the first operand of the trunc_mod expression
 which is also predicated with @code{integer_zerop}.  Expression operands
 may be either expressions, predicates or captures.  Captures
-can be unconstrained or capture expresions or predicates.
+can be unconstrained or capture expressions or predicates.
 
 This example introduces an optional operand of simplify,
 the if-expression.  This condition is evaluated after the
@@ -219,9 +219,9 @@ Captures can also be used for capturing results of 
sub-expressions.
 @end smallexample
 
 In the above example, @code{@@2} captures the result of the expression
-@code{(addr @@0)}.  For outermost expression only its type can be captured,
-and the keyword @code{type} is reserved for this purpose.  The above
-example also gives a way to conditionalize patterns to only apply
+@code{(addr @@0)}.  For the outermost expression only its type can be
+captured, and the keyword @code{type} is reserved for this purpose.  The
+above example also gives a way to conditionalize patterns to only apply
 to @code{GIMPLE} or @code{GENERI

[PATCH] PR fortran/105138 - Bogus error when function name does not shadow an intrinsic when RESULT clause is used

2022-04-04 Thread Harald Anlauf via Gcc-patches
Dear all,

Steve's analysis (see PR) showed that we confused the case when a
symbol referred to a recursive procedure which was named the same
as an intrinsic.  The standard allows such recursive references
(see e.g. F2018:19.3.1).

The attached patch is based on Steve's, but handles both functions
and subroutines.  Testcase verified with NAG and Crayftn.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

This bug is a rejects-valid, but could also lead to wrong code,
see e.g. the PR, comment#4.  Would this qualify for a backport
to e.g. the 11-branch?

Thanks,
Harald

From 4c23f78a41fad7cb19ad84c99a73d761fa02 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Mon, 4 Apr 2022 20:42:51 +0200
Subject: [PATCH] Fortran: a RECURSIVE procedure cannot be an INTRINSIC

gcc/fortran/ChangeLog:

	PR fortran/105138
	* intrinsic.cc (gfc_is_intrinsic): When a symbol refers to a
	RECURSIVE procedure, it cannot be an INTRINSIC.

gcc/testsuite/ChangeLog:

	PR fortran/105138
	* gfortran.dg/recursive_reference_3.f90: New test.

Co-authored-by: Steven G. Kargl 
---
 gcc/fortran/intrinsic.cc   |  1 +
 .../gfortran.dg/recursive_reference_3.f90  | 14 ++
 2 files changed, 15 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/recursive_reference_3.f90

diff --git a/gcc/fortran/intrinsic.cc b/gcc/fortran/intrinsic.cc
index 2339d9050ec..e89131f5a71 100644
--- a/gcc/fortran/intrinsic.cc
+++ b/gcc/fortran/intrinsic.cc
@@ -1164,6 +1164,7 @@ gfc_is_intrinsic (gfc_symbol* sym, int subroutine_flag, locus loc)

   /* Check for attributes which prevent the symbol from being INTRINSIC.  */
   if (sym->attr.external || sym->attr.contained
+  || sym->attr.recursive
   || sym->attr.if_source == IFSRC_IFBODY)
 return false;

diff --git a/gcc/testsuite/gfortran.dg/recursive_reference_3.f90 b/gcc/testsuite/gfortran.dg/recursive_reference_3.f90
new file mode 100644
index 000..f4e2963aec2
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/recursive_reference_3.f90
@@ -0,0 +1,14 @@
+! { dg-do compile }
+! { dg-options "-std=f2018" }
+! PR fortran/105138 - recursive procedures and shadowing of intrinsics
+
+RECURSIVE FUNCTION LOG_GAMMA(Z) RESULT(RES)
+  COMPLEX, INTENT(IN) :: Z
+  COMPLEX :: RES
+  RES = LOG_GAMMA(Z)
+END FUNCTION LOG_GAMMA
+
+recursive subroutine date_and_time (z)
+  real :: z
+  if (z > 0) call date_and_time (z-1)
+end subroutine date_and_time
--
2.34.1



*Ping* [PATCH] PR fortran/104210 - [11/12 Regression] ICE in gfc_zero_size_array, at fortran/arith.cc:1685

2022-04-04 Thread Harald Anlauf via Gcc-patches

On 29.03.22 at 23:41, Harald Anlauf via Fortran wrote:

Dear all,

during error recovery on invalid declarations of functions as
coarrays we may hit multiple places with NULL pointer dereferences.
The attached patch provides a minimal and conservative solution.

Regtested on x86_64-pc-linux-gnu.  OK for mainline/11-branch?

Thanks,
Harald





Re: [PATCH] gcc-changelog: ignore one more revision

2022-04-04 Thread Joseph Myers
On Mon, 4 Apr 2022, Martin Liška wrote:

> Ignore:
> 
> Checking 86d8e0c0652ef5236a460b75c25e4f7093cc0651: FAILED
> ERR: line should start with a tab: "This reverts commits r12-7804 and
> r12-7929."
> ERR: could not deduce ChangeLog file
> 
> It seems Jason pushed the revision to origin/trunk where the checking script
> is not run.

I thought I'd fixed that (last August) by making the hooks-bin scripts 
handle master and trunk the same.

-- 
Joseph S. Myers
jos...@codesourcery.com


Ping: [PATCH, V2] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293.

2022-04-04 Thread Michael Meissner via Gcc-patches
Ping patch.

| Date: Tue, 29 Mar 2022 23:25:31 -0400
| From: Michael Meissner 
| Subject: [PATCH, V2] Optimize vec_splats of constant vec_extract for 
V2DI/V2DF, PR target 99293.
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[wwwdocs] Add Ada's changelog entry

2022-04-04 Thread Fernando Oleo Blanco via Gcc-patches
Hi,

This is my first patch to GCC; if there is anything off, please say
so. I have used the default HTML formatting that comes with Emacs. I
have created the patch using the `git format-patch` utility.

One thing that may not be allowed (I am not aware of any rule against
it but still) is the amount of nesting in the lists. The section "Ada
2022 extensions" has lists that nest up to "level 3". I am pointing this
out, since no other section has 3 levels of nesting, all have at most 2.

I have also added a formatted code example. I do not know if this is
allowed. Trying to explain Pattern Matching without a minimal example
is not something that I am capable of.

The patch contains verbatim expressions from
https://blog.adacore.com/going-beyond-ada-2022 but Arnaud has provided
explicit permission to do so; he is also a GCC contributor.

I have not signed any legal requirements to contribute to GCC, however,
this is just a documentation patch. I am not providing any content
that is new or improving a part of GCC, so this should not be necessary.

Regards,
--
Fernando Oleo Blanco
https://irvise.xyz
From a2f402895ab87713882adf2faef6f587d6d01264 Mon Sep 17 00:00:00 2001
From: Fernando Oleo Blanco 
Date: Mon, 4 Apr 2022 23:22:43 +0200
Subject: [PATCH] Add Ada's entry in the v12 changelog

---
 htdocs/gcc-12/changes.html | 79 +-
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 4e1f6b0f..16351697 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -183,7 +183,84 @@ a work-in-progress.
   
 
 
-
+Ada
+
+  Ada 2022
+  
+Added the -gnat2022 flag to indicate strict Ada
+  2022 compliance. The old -gnat2020 flag is now
+  deprecated.
+Support for Big Numbers (Annex G) has seen continuous
+  improvements. It is now cosidered complete. It is also proven to
+  be correct through the use of contracts and SPARK.
+Continuous improvements to the Ada 2022 standard since GCC
+  11. The main missing feature is support for the new
+  parallel keyword. However, some initial support has
+  already been put in place.
+Greatly improved compile time support. More functions can now
+  have the with Static aspect and can be used in more
+  contexts.
+  
+  
+  Ada 2022 extensions. The use of the -gnatX flag is
+necessary to access these features as they are not considered
+stable or standard.
+  
+Fixed lower bound for unconstrained arrays.
+
+  type Matrix is array (Natural range 0 .. <>, Natural
+range 0 .. <>) of Integer; is now valid.
+  Subtypes can also specify a lower bound: subtype
+String_1 is String (1 .. <>);. Boundaries from slices
+will "slide" to the correct lower bound of the subtype.
+
+
+Generalized Object.Operand notation. The follwing
+  code is now valid V.Add_Element(42);,
+  with V being a vector, for example.
+Additional when constructs. Keywords
+  return, goto and raise
+  can now use when in addition to the existing
+  exit when. The following expression is therefore
+  now valid raise Constraint_Error with "Element is null"
+  when Element = null;
+Pattern matching
+
+  The case statement has been extended to cover
+records and arrays as well as finer grained casing on scalar
+types. In the future it is expected to provide more compile
+time guarantees when accessing discriminated fields. Case
+exhaustion is supported for patter matching. An example would
+be 
+type Sign is (Neg, Zero, Pos);
+
+function Multiply (S1, S2 : Sign) return Sign is
+  (case (S1, S2) is
+ when (Neg, Neg) | (Pos, Pos) => Pos,
+ when (Zero, <>) | (<>, Zero) => Zero,
+ when (Neg, Pos) | (Pos, Neg) => Neg);
+
+
+
+  
+  
+  gnatfind and gnatxref, which were
+already deprecated, have been removed.
+  Support for 128bit integers has beed added.
+  Greatly expanded code covered by contracts. Thanks to this work,
+there are now several Ada standard libraries fully proven in SPARK
+which means they have no runtime nor logical errors. They are
+mostly numeric and string handling libraries.
+  Enable return-slot optimization for Pure
+functions.
+  General optimizations, improvements and additions to the
+standard library. Performance, correctness and in some cases
+stability was improved. Memory pools have also seen some minor
+enhancements.
+  Improvements to embedded-RTOS targets such as RTEMS, VxWorks and
+QNX. Older targets were removed or cleaned.
+  Added some https://gcc.gnu.org/onlinedocs/gnat_rm/Security-Hardening-Features.html#Security-Hardening-Features";>hardening features.
+
 
 C family
 
-- 
2.35.1



Re: [wwwdocs] Add Ada's changelog entry

2022-04-04 Thread Eric Botcazou via Gcc-patches
> this is my first patch to GCC, if there is anything off, please, say
> so. I have used the default HTML formatting that comes with Emacs. I
> have created the patch using the `git format-patch` utility.

Thanks for your contribution.  Small nit:

+  Support for 128bit integers has beed added.

The support was already present in GCC 11, the criterion being the use of the 
'Max_Integer_Size attribute in system.ads.

-- 
Eric Botcazou




Re: [wwwdocs] Add Ada's changelog entry

2022-04-04 Thread Fernando Oleo Blanco via Gcc-patches
On Mon, 04 Apr 2022 23:51:24 +0200,
"Eric Botcazou" wrote:

> Thanks for your contribution.  Small nit:
>
> +  Support for 128bit integers has beed added.
>
> The support was already present in GCC 11, the criterion being the
> use of the 'Max_Integer_Size attribute in system.ads.
>
> --
> Eric Botcazou
>
>

Thank you for the feedback. Should I remove it and resupply the patch or
can you/GCC maintainers do the modification before merging?

Regards,
--
Fernando Oleo Blanco
https://irvise.xyz



Re: [PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations

2022-04-04 Thread Yonghong Song via Gcc-patches




On 4/1/22 12:42 PM, David Faust wrote:

Hello,

This patch series is a first attempt at adding support for:

- Two new C-language-level attributes that allow to associate (to "tag")
   particular declarations and types with arbitrary strings. As explained below,
   this is intended to be used to, for example, characterize certain pointer
   types.

- The conveyance of that information in the DWARF output in the form of a new
   DIE: DW_TAG_GNU_annotation.

- The conveyance of that information in the BTF output in the form of two new
   kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.

All of these facilities are being added to the eBPF ecosystem, and support for
them exists in some form in LLVM. However, as we shall see, we have found some
problems implementing them so some discussion is in order.

Purpose
=======

1)  Addition of C-family language constructs (attributes) to specify free-text
 tags on certain language elements, such as struct fields.

 The purpose of these annotations is to provide additional information about
 types, variables, and function parameters of interest to the kernel. A
 driving use case is to tag pointer types within the linux kernel and eBPF
 programs with additional semantic information, such as '__user' or '__rcu'.

 For example, consider the linux kernel function do_execve with the
 following declaration:

   static int do_execve(struct filename *filename,
  const char __user *const __user *__argv,
  const char __user *const __user *__envp);

 Here, __user could be defined with these annotations to record semantic
 information about the pointer parameters (e.g., they are user-provided) in
 DWARF and BTF information. Other kernel facilities such as the eBPF verifier
 can read the tags and make use of the information.

2)  Conveying the tags in the generated DWARF debug info.

 The main motivation for emitting the tags in DWARF is that the Linux kernel
 generates its BTF information via pahole, using DWARF as a source:

     +--------+  BTF                  BTF   +----------+
     | pahole |-------> vmlinux.btf ------->| verifier |
     +--------+                             +----------+
         ^                                        ^
         |                                        |
   DWARF |                                    BTF |
         |                                        |
      vmlinux                              +-------------+
      module1.ko                           | BPF program |
      module2.ko                           +-------------+
        ...

 This is because:

 a)  Unlike GCC, LLVM will only generate BTF for BPF programs.

 b)  GCC can generate BTF for whatever target with -gbtf, but there is no
 support for linking/deduplicating BTF in the linker.

 In the scenario above, the verifier needs access to the pointer tags of
 both the kernel types/declarations (conveyed in the DWARF and translated
 to BTF by pahole) and those of the BPF program (available directly in BTF).

 Another motivation for having the tag information in DWARF, unrelated to
 BPF and BTF, is that the drgn project (another DWARF consumer) also wants
 to benefit from these tags in order to differentiate between different
 kinds of pointers in the kernel.

3)  Conveying the tags in the generated BTF debug info.

 This is easy: the main purpose of having this info in BTF is for the
 compiled eBPF programs. The kernel verifier can then access the tags
 of pointers used by the eBPF programs.
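
As a rough sketch of how the type tag from item 1 might be used in source
code: TAG_USER below is a hypothetical macro name (the kernel's own spelling
is __user), and the feature test is there because support for btf_type_tag
varies by compiler and language mode.  Where the attribute is unavailable the
macro expands to nothing, which is consistent with the attribute having no
effect on code generation.

```cpp
#include <cassert>
#include <cstddef>

// TAG_USER is a hypothetical macro name.  It expands to the btf_type_tag
// attribute only when the compiler reports support for it, and to nothing
// otherwise, so the code builds either way.
#if defined(__has_attribute)
#  if __has_attribute(btf_type_tag)
#    define TAG_USER __attribute__((btf_type_tag("user")))
#  endif
#endif
#ifndef TAG_USER
#  define TAG_USER
#endif

// The tag annotates the pointed-to type.  It is meant only for debug info
// consumers and does not change what the function computes.
std::size_t count_bytes(const char TAG_USER *s) {
  std::size_t n = 0;
  while (s[n] != '\0')
    ++n;
  return n;
}

// Small self-check of the untagged behaviour.
inline void count_bytes_example() {
  assert(count_bytes("abc") == 3);
}
```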


For more information about these tags and the motivation behind them, please
refer to the following linux kernel discussions:

   https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
   https://lore.kernel.org/bpf/20211012164838.3345699-1-...@fb.com/
   https://lore.kernel.org/bpf/2022012604.1504583-1-...@fb.com/


What is in this patch series
============================

This patch series adds support for these annotations in GCC. The implementation
is largely complete. However, in some cases the produced debug info (both DWARF
and BTF) differs significantly from that produced by LLVM. This issue is
discussed in detail below, along with a few specific questions for both GCC and
LLVM. Any input would be much appreciated.


Hi, David, Thanks for the RFC implementation! I will answer your 
questions related to llvm and kernel.





Implementation Overview
=======================

To enable these annotations, two new C language attributes are added:
__attribute__((btf_decl_tag("foo"))) and __attribute__((btf_type_tag("bar"))).
Both attributes accept a single arbitrary string constant argument, which will
be recorded in the generated DWARF and/or BTF debugging information. They have
no effect on code generation.

Note that we are using the same attribute names as LLVM, which include "bt

Re: [PATCH] testsuite/s390: Adapt test expections.

2022-04-04 Thread Mike Stump via Gcc-patches
On Apr 4, 2022, at 4:52 AM, Robin Dapp via Gcc-patches 
 wrote:
> OK for trunk?

> +/* Since r12-4475-g247c407c83f001 the following immediates are being
> +   converted and directly stored in the literal pool so no explicit
> +   conversion is necessary.   */

Not a fan of git revision numbers in the source code.  Also, having historical 
behaviors most of the time isn't useful/interesting, so I'd generally omit such 
descriptions.


Re: [PATCH] PR fortran/105138 - Bogus error when function name does not shadow an intrinsic when RESULT clause is used

2022-04-04 Thread Jerry D via Gcc-patches

On 4/4/22 12:04 PM, Harald Anlauf via Fortran wrote:

Dear all,

Steve's analysis (see PR) showed that we confused the case when a
symbol referred to a recursive procedure which was named the same
as an intrinsic.  The standard allows such recursive references
(see e.g. F2018:19.3.1).

The attached patch is based on Steve's, but handles both functions
and subroutines.  Testcase verified with NAG and Crayftn.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

This bug is a rejects-valid, but could also lead to wrong code,
see e.g. the PR, comment#4.  Would this qualify for a backport
to e.g. the 11-branch?

Thanks,
Harald


Yes, looks good, OK to commit

Regards,

Jerry



Re: *Ping* [PATCH] PR fortran/104210 - [11/12 Regression] ICE in gfc_zero_size_array, at fortran/arith.cc:1685

2022-04-04 Thread Jerry D via Gcc-patches

On 4/4/22 12:09 PM, Harald Anlauf via Fortran wrote:

Am 29.03.22 um 23:41 schrieb Harald Anlauf via Fortran:

Dear all,

during error recovery on invalid declarations of functions as
coarrays we may hit multiple places with NULL pointer dereferences.
The attached patch provides a minimal and conservative solution.

Regtested on x86_64-pc-linux-gnu.  OK for mainline/11-branch?

Thanks,
Harald




Patch looks OK Harald, OK.

Thanks,

Jerry



PING Re: [PATCH] Fortran: Fix clause splitting for OMP masked taskloop directive

2022-04-04 Thread Sandra Loosemore

On 3/25/22 20:02, Sandra Loosemore wrote:
I ran into this bug in the handling of clauses on the combined "masked 
taskloop" OMP directive when I was working on something else.  The fix 
turned out to be a 1-liner.  OK for trunk?


Ping!  This one's borderline obvious and would be good to fix in GCC 12.

https://gcc.gnu.org/pipermail/fortran/2022-March/057705.html

-Sandra


Re: [PATCH] Fortran: Add location info to OpenMP tree nodes

2022-04-04 Thread Sandra Loosemore

On 3/25/22 20:03, Sandra Loosemore wrote:
I've got another patch forthcoming (stage 1 material) that adds some new 
diagnostics for non-rectangular loops during gimplification of OMP 
nodes.  When I was working on that, I discovered that the Fortran front 
end wasn't attaching location information to the tree nodes 
corresponding to the various OMP directives, so the new errors weren't 
coming out with location info either.  I went through trans-openmp.cc 
and fixed all the places where make_node was being called to explicitly 
set the location.


I don't have a test case specifically for this change, but my test cases 
for the new diagnostics in the non-rectangular loops patch do exercise 
it.  Is this OK for trunk now, or for stage 1 when we get there?


Ping!  Even a quick review and "this isn't suitable for GCC 12" answer 
would be helpful.


https://gcc.gnu.org/pipermail/fortran/2022-March/057706.html

The definitely-stage-1 patch that exercises this is here:

https://gcc.gnu.org/pipermail/fortran/2022-March/057707.html

-Sandra


Re: try multi dest registers in default_zero_call_used_regs

2022-04-04 Thread Alexandre Oliva via Gcc-patches
On Apr  4, 2022, Richard Sandiford  wrote:

> But if that's true, it should happen in gen_rtx_REG.

Yeah, I agree, that makes sense.

> OK without the introduction of the ?:, thanks.

Thanks, here's what I'm checking in.


try multi-reg dest in default_zero_call_used_regs

From: Alexandre Oliva 

When the mode of regno_reg_rtx is not hard_regno_mode_ok for the
target, try grouping the register with subsequent ones.  This enables
s16 to s31 and their hidden pairs to be zeroed with the default logic
on some arm variants.
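
The grouping idea can be modelled outside GCC with a toy register file.  This
is only a sketch under made-up assumptions: NUM_REGS, MAX_NREGS, single_ok and
group_ok are hypothetical stand-ins for FIRST_PSEUDO_REGISTER,
hard_regno_max_nregs and the target's mode checks, and a plain bitmask replaces
HARD_REG_SET.

```c
#include <stdbool.h>

#define NUM_REGS  8   /* stand-in for FIRST_PSEUDO_REGISTER */
#define MAX_NREGS 4   /* stand-in for hard_regno_max_nregs */

/* Toy target: registers 0..3 are usable on their own. */
static bool single_ok(unsigned regno) { return regno < 4; }

/* Toy target: other registers are only usable as even-aligned pairs. */
static bool group_ok(unsigned regno, unsigned nregs)
{
    return nregs == 2 && regno % 2 == 0;
}

/* Return how many registers to zero together starting at REGNO, or 0 if
   no usable grouping exists.  TOZERO is a bitmask of registers that
   still need zeroing. */
unsigned select_group(unsigned regno, unsigned tozero)
{
    if (single_ok(regno))
        return 1;

    /* Widen the group one register at a time, stopping as soon as the
       group would run past the register file or include a register
       that is not supposed to be zeroed. */
    for (unsigned nregs = 2;
         nregs <= MAX_NREGS
         && regno + nregs <= NUM_REGS
         && (tozero & (1u << (regno + nregs - 1)));
         nregs++)
        if (group_ok(regno, nregs))
            return nregs;

    return 0;
}
```

In this toy model register 4 can be zeroed as a pair with register 5, while
register 5 on its own finds no valid grouping and would be reported as failed.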


for  gcc/ChangeLog

* targhooks.cc (default_zero_call_used_regs): Attempt to group
regs that the target refuses to use in their natural modes.
(zcur_select_mode_rtx): New.
* regs.h (struct target_regs): Add x_hard_regno_max_nregs.
(hard_regno_max_nregs): Define.
* reginfo.cc (init_reg_modes_target): Set hard_regno_max_nregs.
---
 gcc/reginfo.cc   |9 +-
 gcc/regs.h   |5 +++
 gcc/targhooks.cc |   83 --
 3 files changed, 86 insertions(+), 11 deletions(-)

diff --git a/gcc/reginfo.cc b/gcc/reginfo.cc
index 234f72eceeb25..67e30cab42855 100644
--- a/gcc/reginfo.cc
+++ b/gcc/reginfo.cc
@@ -441,10 +441,15 @@ init_reg_modes_target (void)
 {
   int i, j;
 
+  this_target_regs->x_hard_regno_max_nregs = 1;
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 for (j = 0; j < MAX_MACHINE_MODE; j++)
-  this_target_regs->x_hard_regno_nregs[i][j]
-   = targetm.hard_regno_nregs (i, (machine_mode) j);
+  {
+   unsigned char nregs = targetm.hard_regno_nregs (i, (machine_mode) j);
+   this_target_regs->x_hard_regno_nregs[i][j] = nregs;
+   if (nregs > this_target_regs->x_hard_regno_max_nregs)
+ this_target_regs->x_hard_regno_max_nregs = nregs;
+  }
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 {
diff --git a/gcc/regs.h b/gcc/regs.h
index 74f1f63770322..f72b06fb56508 100644
--- a/gcc/regs.h
+++ b/gcc/regs.h
@@ -202,6 +202,9 @@ struct target_regs {
  registers that a given machine mode occupies.  */
   unsigned char x_hard_regno_nregs[FIRST_PSEUDO_REGISTER][MAX_MACHINE_MODE];
 
+  /* The max value found in x_hard_regno_nregs.  */
+  unsigned char x_hard_regno_max_nregs;
+
   /* For each hard register, the widest mode object that it can contain.
  This will be a MODE_INT mode if the register can hold integers.  Otherwise
  it will be a MODE_FLOAT or a MODE_CC mode, whichever is valid for the
@@ -235,6 +238,8 @@ extern struct target_regs *this_target_regs;
 #else
 #define this_target_regs (&default_target_regs)
 #endif
+#define hard_regno_max_nregs \
+  (this_target_regs->x_hard_regno_max_nregs)
 #define reg_raw_mode \
   (this_target_regs->x_reg_raw_mode)
 #define have_regs_of_mode \
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index fc49235eb38ee..e22bc66a6c896 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -1017,6 +1017,45 @@ default_function_value_regno_p (const unsigned int regno ATTRIBUTE_UNUSED)
 #endif
 }
 
+/* Choose the mode and rtx to use to zero REGNO, storing them in PMODE and
+   PREGNO_RTX and returning TRUE if successful, otherwise returning FALSE.  If
+   the natural mode for REGNO doesn't work, attempt to group it with subsequent
+   adjacent registers set in TOZERO.  */
+
+static inline bool
+zcur_select_mode_rtx (unsigned int regno, machine_mode *pmode,
+ rtx *pregno_rtx, HARD_REG_SET tozero)
+{
+  rtx regno_rtx = regno_reg_rtx[regno];
+  machine_mode mode = GET_MODE (regno_rtx);
+
+  /* If the natural mode doesn't work, try some wider mode.  */
+  if (!targetm.hard_regno_mode_ok (regno, mode))
+{
+  bool found = false;
+  for (int nregs = 2;
+  !found && nregs <= hard_regno_max_nregs
+&& regno + nregs <= FIRST_PSEUDO_REGISTER
+&& TEST_HARD_REG_BIT (tozero,
+  regno + nregs - 1);
+  nregs++)
+   {
+ mode = choose_hard_reg_mode (regno, nregs, 0);
+ if (mode == E_VOIDmode)
+   continue;
+ gcc_checking_assert (targetm.hard_regno_mode_ok (regno, mode));
+ regno_rtx = gen_rtx_REG (mode, regno);
+ found = true;
+   }
+  if (!found)
+   return false;
+}
+
+  *pmode = mode;
+  *pregno_rtx = regno_rtx;
+  return true;
+}
+
 /* The default hook for TARGET_ZERO_CALL_USED_REGS.  */
 
 HARD_REG_SET
@@ -1035,16 +1074,28 @@ default_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
 if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
   {
rtx_insn *last_insn = get_last_insn ();
-   machine_mode mode = GET_MODE (regno_reg_rtx[regno]);
+   rtx regno_rtx;
+   machine_mode mode;
+
+   if (!zcur_select_mode_rtx (regno, &mode, &regno_rtx,
+  need_zeroed_hardregs))
+ {
+   SET_HARD_REG_BIT (failed, regno);
+   continue;
+ }
+
rtx zero = CONST0_RTX (mode);
-  

Re: C++ modules and AAPCS/ARM EABI clash on inline key methods

2022-04-04 Thread Alexandre Oliva via Gcc-patches
On Mar 31, 2022, Alexandre Oliva  wrote:

> g++.dg/modules/virt-2_a.C fails on arm-eabi and many other arm targets
> that use the AAPCS variant.  ARM is the only target that overrides
> TARGET_CXX_KEY_METHOD_MAY_BE_INLINE.  It's not clear to me which way the
> clash between AAPCS and C++ Modules design should be resolved, but
> currently it favors AAPCS and thus the test fails.

> Should we skip the test on ARM, XFAIL it, or put in some kludge like
> the patchlet below?

That kludge doesn't work: subsequent virt tests fail with it, on arm.

Would something like this be acceptable/desirable?  It's overreaching,
in that not all arm platforms are expected to fail, but the result on
them will be an unexpected pass, which is not quite as bad as the
unexpected fail we get on most arm variants now.


diff --git a/gcc/testsuite/g++.dg/modules/virt-2_a.C b/gcc/testsuite/g++.dg/modules/virt-2_a.C
index 9115cc19cc286..0b780645708ba 100644
--- a/gcc/testsuite/g++.dg/modules/virt-2_a.C
+++ b/gcc/testsuite/g++.dg/modules/virt-2_a.C
@@ -22,6 +22,6 @@ export int Visit (Visitor *v)
 }
 
 // Emit here
-// { dg-final { scan-assembler {_ZTV7Visitor:} } }
-// { dg-final { scan-assembler {_ZTI7Visitor:} } }
-// { dg-final { scan-assembler {_ZTS7Visitor:} } }
+// { dg-final { scan-assembler {_ZTV7Visitor:} { xfail arm*-*-* } } }
+// { dg-final { scan-assembler {_ZTI7Visitor:} { xfail arm*-*-* } } }
+// { dg-final { scan-assembler {_ZTS7Visitor:} { xfail arm*-*-* } } }


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


Re: [PATCH] testsuite: Add -fno-tree-loop-distribute-patterns for s390.

2022-04-04 Thread Andreas Krebbel via Gcc-patches
On 4/4/22 13:51, Robin Dapp wrote:
> Hi,
> 
> in gcc.dg/Wuse-after-free-2.c we try to detect a use-after-free.  On
> s390 the test's while loop is converted into a rawmemchr builtin making
> it impossible to determine that the pointers *p and *q are related.
> 
> Therefore, disable the tree loop distribute patterns pass on s390 for
> this test.
> 
> OK for trunk?
> 
> Regards
>  Robin
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/Wuse-after-free-2.c:
>   Add -fno-tree-loop-distribute-patterns for s390*.

Ok. Thanks!

Andreas
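
For readers unfamiliar with the pattern, the kind of loop Robin describes can be
sketched as below.  This is not the code from gcc.dg/Wuse-after-free-2.c; the
function name find_byte is made up, and it only shows the general shape of a
byte-scanning loop that the loop distribution pass may replace with a
rawmemchr builtin on targets that support it.

```c
/* Advance P until it points at the first byte equal to C.  The caller
   must guarantee that such a byte exists, exactly as with rawmemchr. */
const char *find_byte(const char *p, char c)
{
    while (*p != c)
        p++;
    return p;
}
```

Once the loop is a single builtin call, the link between the incoming pointer
and the returned one is harder for later analysis to see, which is why the
test is compiled with -fno-tree-loop-distribute-patterns on s390.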


Re: [PATCH] testsuite/s390: Change nle -> h in ifcvt tests.

2022-04-04 Thread Andreas Krebbel via Gcc-patches
On 4/4/22 13:51, Robin Dapp wrote:
> Hi,
> 
> we have been emitting the "higher" variants instead of the "not less or
> equal" ones for a while.  Change the test expectations accordingly.
> 
> OK for trunk?
> 
> Regards
>  Robin
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/ifcvt-two-insns-bool.c: Change nle to h.
>   * gcc.target/s390/ifcvt-two-insns-int.c: Ditto.
>   * gcc.target/s390/ifcvt-two-insns-long.c: Ditto.

Ok. Thanks!

Andreas


Re: [PATCH] testsuite/s390: Adapt test expections.

2022-04-04 Thread Andreas Krebbel via Gcc-patches
On 4/4/22 13:52, Robin Dapp wrote:
> Hi,
> 
> some tests expect a convert instruction but nowadays the conversion is
> already done at compile time.  This results in a literal-pool load.
> Change the tests accordingly.
> 
> OK for trunk?
> 
> Regards
>  Robin
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/zvector/vec-double-compile.c: Expect vl
> instead of vc*.
>   * gcc.target/s390/zvector/vec-float-compile.c: Ditto.
>   * gcc.target/s390/zvector/vec-signed-compile.c: Ditto.
>   * gcc.target/s390/zvector/vec-unsigned-compile.c: Ditto.

I've seen Mike's comment but I'm not opposed to checking it in that way. These
kinds of comments have probably saved me a few hours of bisecting already. Next
time you might consider moving it to the commit message instead.

Ok. Thanks!

Bye,

Andreas


Re: [wwwdocs] Add Ada's changelog entry

2022-04-04 Thread Arnaud Charlet via Gcc-patches
> Thank you for the feedback. Should I remove it and resupply the patch or
> can you/GCC maintainers do the modification before merging?

Can you please resubmit it?

I'll let others comment on the need to sign a contributor agreement.  My
understanding is that this is unavoidable: whether you're contributing
code or documentation doesn't change this need, AFAIU.

Arno


Re: [PATCH] Fortran: Add location info to OpenMP tree nodes

2022-04-04 Thread Richard Biener via Gcc-patches
On Tue, Apr 5, 2022 at 6:12 AM Sandra Loosemore  wrote:
>
> On 3/25/22 20:03, Sandra Loosemore wrote:
> > I've got another patch forthcoming (stage 1 material) that adds some new
> > diagnostics for non-rectangular loops during gimplification of OMP
> > nodes.  When I was working on that, I discovered that the Fortran front
> > end wasn't attaching location information to the tree nodes
> > corresponding to the various OMP directives, so the new errors weren't
> > coming out with location info either.  I went through trans-openmp.cc
> > and fixed all the places where make_node was being called to explicitly
> > set the location.
> >
> > I don't have a test case specifically for this change, but my test cases
> > for the new diagnostics in the non-rectangular loops patch do exercise
> > it.  Is this OK for trunk now, or for stage 1 when we get there?
>
> Ping!  Even a quick review and "this isn't suitable for GCC 12" answer
> would be helpful.
>
> https://gcc.gnu.org/pipermail/fortran/2022-March/057706.html

OK if nobody objects in 24h.

Richard.

> The definitely-stage-1 patch that exercises this is here:
>
> https://gcc.gnu.org/pipermail/fortran/2022-March/057707.html
>
> -Sandra


Re: [wwwdocs] Add Ada's changelog entry

2022-04-04 Thread Richard Biener via Gcc-patches
On Tue, Apr 5, 2022 at 8:06 AM Arnaud Charlet via Gcc-patches wrote:
>
> > Thank you for the feedback. Should I remove it and resupply the patch or
> > can you/GCC maintainers do the modification before merging?
>
> Can you please resubmit it?
>
> I'll let others comment on the need to sign a contributor agreement.  My
> understanding is that this is unavoidable: whether you're contributing
> code or documentation doesn't change this need, AFAIU.

If you have git write access you should add yourself in the DCO section
in the MAINTAINERS file.  Otherwise it has been said it's enough to
explicitly state in mail that you are contributing this change under
the Developer's Certificate of Origin Version 1.1
(https://gcc.gnu.org/dco.html).

Richard.

>
> Arno