date:20210318

Re: [Patch, fortran] PR99602 - [11 regression] runtime error: pointer actual argument not associated

2021-03-18 Thread Tobias Burnus


Hi Paul, hi all fortran@/gcc-patches@ readers,

it looks as if you replied with your patch submission to the wrong email
address – and your re-submission ended up at https://gcc.gnu.org/PR99602#c17

On 16.03.21 18:08, Tobias Burnus wrote:

On 16.03.21 17:42, Paul Richard Thomas via Gcc-patches wrote:

Fortran: Fix runtime errors for class actual arguments [PR99602].
* trans-array.c (gfc_conv_procedure_call): For class formal
arguments, use the _data field attributes for runtime errors.
* gfortran.dg/pr99602.f90: New test.

Shouldn't there be also a testcase which triggers this run-time error?


Note: The new submission consists of the actual patch and a new testcase
(so there are now two); the new testcase removes 'pointer' from the dummy
argument of prepare_m2_proc/prepare_whizard_m2 and checks via the
-fdump-tree-original dump that run-time check code is now inserted when
passing a (nullified) pointer to a nonpointer dummy argument.

Compared to the previous patch, 'fsym_attr.pointer =
fsym_attr.class_pointer' is new; before, it was 'fsym_attr.pointer =
fsym_attr.pointer'.

Paul Richard Thomas wrote in PR99602:


Good morning all,

I have attached the revised patch and an additional testcase. I had totally
forgotten about the class pointer gotcha.

OK for master?

Paul

Fortran: Fix runtime errors for class actual arguments [PR99602].


LGTM – thanks for the patch.

I am wondering whether the second testcase should be a 'dg-do run' test
instead of 'compile' to ensure that the error is indeed triggered
(currently, it only checks the tree dump that a check is inserted). What
do you think? [If you do so, you need a dg-shouldfail + dg-output, cf.
e.g. pointer_check_5.f90.]

Thanks,

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf

Re: PING: Re: [Patch] Fortran: Fix func decl mismatch [PR93660]

2021-03-18 Thread Tobias Burnus


*PING* of my 11.03.21 18:15 CET patch.

The issue is that the TREE_TYPE of the fndecl does not match its arglist.

In some cases, the middle end looks at the function type – and then it 
goes wrong.


The issue only occurs for -fcoarray=lib as other hidden arguments are 
properly handled.


Solution: Add the missing args to the fndecl (in gfc_get_function_type) 
– and increment hidden_typelist in build_function_decl – the latter is 
used in a gcc_assert which was supposed to check for this mismatch ...


Tobias

On 14.03.21 12:04, Tobias Burnus wrote:

Early ping – and minor post script:

+  hidden_typelist = TREE_CHAIN (hidden_typelist);

This change is to avoid running into the ICE:

  gcc_assert (hidden_typelist == NULL_TREE
  || TREE_VALUE (hidden_typelist) == void_type_node);

The purpose of this assert is to check that the TREE_TYPE (fndecl)
arg list and the one created by
  create_function_arglist (gfc_symbol * sym)
are the same (at least in terms of the number of arguments). Namely:
   typelist = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
...
  hidden_typelist = typelist;
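
The consistency check described above can be modelled outside of GCC. The following is only a sketch under stated assumptions: `struct node` and `typelist_matches_args` are hypothetical stand-ins for a TREE_LIST chain and for the walk that ends in the gcc_assert; they are not GCC's real data structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one TREE_LIST entry of TYPE_ARG_TYPES;
   this is not GCC's real `tree` type.  */
struct node
{
  int is_void_terminator;   /* models TREE_VALUE (n) == void_type_node */
  struct node *chain;       /* models TREE_CHAIN (n) */
};

/* Advance through the declared type list once per actual argument.
   Returns 1 if, after NARGS steps, the list is exhausted or sits on the
   void terminator, i.e. the condition the gcc_assert tests.  */
static int
typelist_matches_args (const struct node *typelist, size_t nargs)
{
  while (nargs > 0)
    {
      if (typelist == NULL || typelist->is_void_terminator)
        return 0;                   /* fewer declared types than arguments */
      typelist = typelist->chain;   /* the step a missing increment skips */
      nargs--;
    }
  return typelist == NULL || typelist->is_void_terminator;
}
```

If one argument is added to the arglist without the matching step through the type list, the walk stops one entry early and the final condition fails, which is the situation the quoted assert guards against.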

Tobias


On 11.03.21 18:15, Tobias Burnus wrote:

This fixes an ICE with OpenMP 'omp declare simd' but is a generic bug.

Namely TREE_TYPE(fndecl) has a mismatch to the arglist chain,
missing some hidden arguments with -fcoarray=lib.

OK for mainline and GCC 10?

Tobias


RE: [PATCH] aarch64: Improve generic SVE tuning defaults

2021-03-18 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: qia...@fujitsu.com 
> Sent: 18 March 2021 01:52
> To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford 
> Subject: RE: [PATCH] aarch64: Improve generic SVE tuning defaults
> 
> Hello Kyrill,
> 
> Sorry for the slow response.
> The performance on a64fx is not impacted with this patch.

Thank you very much for testing Qian.
Glad to see there is no impact on A64FX. I will push the patch to master then.
Kyrill

> 
> Regards,
> Qian
> 
> > -Original Message-
> > From: Kyrylo Tkachov 
> > Sent: Wednesday, March 10, 2021 10:56 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Richard Sandiford ; Qian, Jianhua/钱 建
> 华
> > 
> > Subject: [PATCH] aarch64: Improve generic SVE tuning defaults
> >
> > Hi all,
> >
> > This patch adds the recently-added tweak to split some SVE VL-based scalar
> > operations [1] to the generic tuning used for SVE, as enabled by adding
> +sve to
> > the -march flag, for example -march=armv8.2-a+sve.
> >
> > The recommendation for best performance on a particular CPU remains
> > unchanged:
> > use the -mcpu option for that CPU, where possible. -mcpu=native makes
> this
> > straightforward for native compilation.
> >
> > The tweak to split out SVE VL-based scalar operations is a consistent win
> for
> > the Neoverse V1 CPU and should be neutral for the Fujitsu A64FX. A run of
> > SPEC2017 on A64FX with this tweak on didn't show any non-noise
> differences.
> > It is also expected to be neutral on SVE2 implementations.
> >
> > Therefore, the patch enables the tweak for generic +sve tuning e.g.
> > -march=armv8.2-a+sve. No SVE2 CPUs are expected to benefit from it,
> > therefore the tweak is disabled for generic tuning when +sve2 is in -march
> e.g.
> > -march=armv8.2-a+sve2.
> >
> > The implementation of this approach requires a bit of custom logic in
> > aarch64_override_options_internal to handle these kinds of
> > architecture-dependent decisions, but we do believe the user-facing
> principle
> > here is important to implement.
> >
> > Qian, as you've contributed the A64FX support to GCC, I would be grateful
> for
> > your feedback on this approach and in particular on the performance
> evaluation
> > of this change.
> >
> > In general, for the generic target we're using a decision framework that
> looks
> > like:
> >
> > * If all cores that are known to benefit from an optimization are of
> architecture X,
> > and all other cores that implement X or above are not impacted, or have a
> very
> > slight impact, we will consider it for generic tuning for architecture X.
> > * We will not enable that optimisation for generic tuning for architecture
> X+1 if
> > no known cores of architecture X+1 or above will benefit.
> >
> > This framework allows us to improve generic tuning for CPUs of generation
> X
> > while avoiding accumulating tweaks for future CPUs of generation X+1,
> X+2...
> > that do not need them, and thus avoid even the slight negative effects of
> these
> > optimisations if the user is willing to tell us the desired architecture
> accurately.
> >
> > X above can mean either annual architecture updates (Armv8.2-a, Armv8.3-
> a
> > etc) or optional architecture extensions (like SVE, SVE2).
> >
> > We think that this patch fits that framework, so would like to propose it 
> > for
> the
> > trunk default tunings for SVE.
> >
> > Bootstrapped and tested on aarch64-none-linux-gnu.
> >
> > Thanks,
> > Kyrill
> >
> > [1] http://gcc.gnu.org/g:a65b9ad863c5fc0aea12db58557f4d286a1974d7
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64.c (aarch64_adjust_generic_arch_tuning):
> > Define.
> > (aarch64_override_options_internal): Use it.
> > (generic_tunings): Add
> > AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS to
> > tune_flags.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.target/aarch64/sve/aarch64-sve.exp: Add
> > -moverride=tune=none to
> > sve_flags.
> > * g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
> > * g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
> > * gcc.target/aarch64/sve/aarch64-sve.exp: Likewise.
> > * gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
> > * gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
> >
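
The enable/disable rule from the quoted patch description (tweak on for generic +sve tuning, off again once +sve2 is requested) can be written down as a small predicate. This is a sketch only: the feature bits and the function name are hypothetical and do not correspond to the real AARCH64_FL_* flags or to the code in aarch64_adjust_generic_arch_tuning, which is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature bits for illustration; GCC's real flag values
   and names differ.  */
enum { FEAT_SVE = 1u << 0, FEAT_SVE2 = 1u << 1 };

/* Decide whether generic tuning should split SVE VL-based scalar
   operations for the architecture features taken from -march.
   Assumes that +sve2 implies +sve.  */
static bool
generic_tuning_splits_sve_vl_ops (unsigned arch_features)
{
  bool has_sve = (arch_features & FEAT_SVE) != 0;
  bool has_sve2 = (arch_features & FEAT_SVE2) != 0;

  /* Known beneficiaries are SVE-only cores; no SVE2 core is expected
     to benefit, so the tweak is dropped at that level.  */
  return has_sve && !has_sve2;
}
```

With these assumptions, -march=armv8.2-a+sve corresponds to FEAT_SVE alone and enables the tweak, while -march=armv8.2-a+sve2 corresponds to both bits and leaves it disabled.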

[patch] fix installation of jit headers, usage of $(mkinstalldirs)

2021-03-18 Thread Matthias Klose

The installation of the jit headers can fail because the directory might not
have been created yet, owing to a missing dependency on the installdirs target.

Also the Makefile hardcodes mkdir -p, instead of using $(mkinstalldirs).

Ok for the trunk and the branches?

Matthias
diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in
index f9b0df850bd..663772aba63 100644
--- a/gcc/jit/Make-lang.in
+++ b/gcc/jit/Make-lang.in
@@ -249,7 +249,7 @@ jit.texinfo.install-pdf: doc/libgccjit.pdf
 SPHINX_BUILD_DIR=jit/sphinx-build
 
 jit.sphinx.html:
-	mkdir -p $(SPHINX_BUILD_DIR)
+	$(mkinstalldirs) $(SPHINX_BUILD_DIR)
 	(cd $(srcdir)/jit/docs && \
 	  make html BUILDDIR=$(PWD)/$(SPHINX_BUILD_DIR) )
 
@@ -270,7 +270,7 @@ jit.sphinx.install-html: jit.sphinx.html
 # see https://bugzilla.redhat.com/show_bug.cgi?id=1148845 )
 jit.sphinx.pdf: $(SPHINX_BUILD_DIR)/latex/libgccjit.pdf
 $(SPHINX_BUILD_DIR)/latex/libgccjit.pdf:
-	mkdir -p $(SPHINX_BUILD_DIR)
+	$(mkinstalldirs) $(SPHINX_BUILD_DIR)
 	(cd $(srcdir)/jit/docs && \
 	  make latexpdf BUILDDIR=$(PWD)/$(SPHINX_BUILD_DIR) )
 
@@ -305,7 +305,7 @@ selftest-jit:
 
 #
 # Install hooks:
-jit.install-headers:
+jit.install-headers: installdirs
 	$(INSTALL_DATA) $(srcdir)/jit/libgccjit.h \
 	  $(DESTDIR)$(includedir)/libgccjit.h
 	$(INSTALL_DATA) $(srcdir)/jit/libgccjit++.h \

[PATCH][pushed] coroutines: init struct members to NULL

2021-03-18 Thread Martin Liška


Hello.

The patch is about a missing struct initialization. It's pre-approved by Iain.

Thanks,
Martin

gcc/cp/ChangeLog:

PR c++/99617
* coroutines.cc (struct var_nest_node): Init then_cl and else_cl
to NULL.
---
 gcc/cp/coroutines.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 51984efe2fd..dbd703a67cc 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2805,7 +2805,7 @@ struct var_nest_node
 {
   var_nest_node () = default;
   var_nest_node (tree v, tree i, var_nest_node *p, var_nest_node *n)
-: var(v), init(i), prev(p), next(n)
+: var(v), init(i), prev(p), next(n), then_cl (NULL), else_cl (NULL)
 {
   if (p)
p->next = this;
--
2.30.2

[PATCH] testsuite: Skip c-c++-common/zero-scratch-regs-10.c on arm

2021-03-18 Thread Christophe Lyon via Gcc-patches

As discussed in PR 97680, -fzero-call-used-regs is not supported on
arm.

Skip this test to avoid failure reports.

2021-03-18  Christophe Lyon  

gcc/testsuite/
* c-c++-common/zero-scratch-regs-10.c: Skip on arm
---
 gcc/testsuite/c-c++-common/zero-scratch-regs-10.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c b/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c
index 193db8c..f393a3b 100644
--- a/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c
+++ b/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
+/* { dg-skip-if "not implemented" { arm*-*-* } } */
 /* { dg-options "-O2" } */
 
 #include 
-- 
2.7.4

[PATCH] Aarch64: Prevent use of SIMD fcvtz[su] instruction variant with "nosimd"

2021-03-18 Thread mihailo.stojanovic--- via Gcc-patches

From: Mihailo Stojanovic 

Hi all,

Currently, SF->SI and DF->DI conversions on Aarch64 with the "nosimd"
flag provided sometimes cause the emitting of a vector variant of the
fcvtz[su] instruction (e.g. fcvtzu s0, s0).

This modifies the corresponding pattern to only select the vector
variant of the instruction when generating code with SIMD enabled.

Tested on aarch64-linux-gnu.

gcc/ChangeLog:

* gcc/config/aarch64/aarch64.md
(_trunc2): Set the "arch"
attribute to disambiguate between SIMD and FP variants of the
instruction.

gcc/testsuite/ChangeLog:

* gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c: New test.
---
 gcc/config/aarch64/aarch64.md |  3 ++-
 .../gcc.target/aarch64/fcvt_nosimd.c  | 23 +++
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index b2abb5b5b3c..dd1dc2bd7a8 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5989,7 +5989,8 @@
   "@
fcvtz\t%0, %1
fcvtz\t%0, %1"
-  [(set_attr "type" "neon_fp_to_int_s,f_cvtf2i")]
+  [(set_attr "type" "neon_fp_to_int_s,f_cvtf2i")
+   (set_attr "arch" "simd,fp")]
 )
 
 ;; Convert HF -> SI or DI
diff --git a/gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c b/gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c
new file mode 100644
index 000..7b2ab65e307
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv8-a+nosimd" } */
+
+#include <stdint.h>
+
+uint64_t test_double_to_uint64(double x) {
+  return (uint64_t)x;
+}
+
+int64_t test_double_to_int64(double x) {
+  return (int64_t)x;
+}
+
+uint32_t test_float_to_uint32(float x) {
+  return (uint32_t)x;
+}
+
+int32_t test_float_to_int32(float x) {
+  return (int32_t)x;
+}
+
+/* { dg-final { scan-assembler-not {\tfcvtz[su]\td[0-9]*, d[0-9]*} } } */
+/* { dg-final { scan-assembler-not {\tfcvtz[su]\ts[0-9]*, s[0-9]*} } } */
-- 
2.29.0

RE: [PATCH] aarch64: Improve generic SVE tuning defaults

2021-03-18 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Kyrylo Tkachov
> Sent: 18 March 2021 09:37
> To: 'qia...@fujitsu.com' ; gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford 
> Subject: RE: [PATCH] aarch64: Improve generic SVE tuning defaults
> 
> 
> 
> > -Original Message-
> > From: qia...@fujitsu.com 
> > Sent: 18 March 2021 01:52
> > To: Kyrylo Tkachov ; gcc-
> patc...@gcc.gnu.org
> > Cc: Richard Sandiford 
> > Subject: RE: [PATCH] aarch64: Improve generic SVE tuning defaults
> >
> > Hello Kyrill,
> >
> > Sorry for the slow response.
> > The performance on a64fx is not impacted with this patch.
> 
> Thank you very much for testing Qian.
> Glad to see there is no impact on A64FX. I will push the patch to master then.

I should say, I intend to backport this to GCC 10 as well, as it has the same
effect on that branch (helps Neoverse V1, no effect on anything else).
Will do so after a bit more testing; the patch applies cleanly.
Thanks,
Kyrill

> Kyrill
> 
> >
> > Regards,
> > Qian
> >

[PATCH] RFC: come up with startswith function.

2021-03-18 Thread Martin Liška


Hey.

Recently, I noticed how cumbersome the constructs are that we use to check
whether a string starts with a given prefix (most notably when the prefix
is a string literal).

Commonly used patterns are:
1) strncmp (arg, "--sysroot=", 10) == 0
2) strncmp (name, "not found", sizeof ("not found") - 1) == 0
3) strncmp (varname, "__builtin_", strlen ("__builtin_")) == 0
4) #define STR "-foffload-abi="
   if (strncmp (argv[i], STR, strlen (STR)) == 0)

I see all of these as quite error-prone for the following reasons:
1) one needs to correctly calculate the string length (in one's head)
2) one needs to remember that sizeof ("foo") - 1 == strlen ("foo")
3) one needs to undefine a temporary macro
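
To make the first pitfall concrete, here is a small sketch of pattern 1 with the length miscounted by one. The two function names are made up for this illustration and do not appear in the patch.

```c
#include <assert.h>
#include <string.h>

/* Pattern 1 with the length miscounted: "--sysroot=" has 10 characters,
   so comparing only 9 of them drops the '=' from the check.  */
static int
is_sysroot_opt_miscounted (const char *arg)
{
  return strncmp (arg, "--sysroot=", 9) == 0;
}

/* The same check with the length derived from the literal itself
   (pattern 2), which cannot drift out of sync with the string.  */
static int
is_sysroot_opt (const char *arg)
{
  return strncmp (arg, "--sysroot=", sizeof ("--sysroot=") - 1) == 0;
}
```

The miscounted variant accepts an argument such as "--sysrootX", while the variant that derives the length from the literal rejects it; the compiler gives no diagnostic for either.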

Moreover, there are helper functions that already do the same:

gcc/ada/adadecode.c:
static int
has_prefix (const char *name, const char *prefix)
{
  return strncmp (name, prefix, strlen (prefix)) == 0;
}

gcc/fortran/gfortran.h:
#define gfc_str_startswith(str, pref) \
   (strncmp ((str), (pref), strlen (pref)) == 0)

That said, I'm suggesting a new function 'startswith' in system.h.
I prepared a patch that uses the function in the gcc/ subfolder
(excluding all target code for now). I can prepare a similar mechanical
patch for the rest of the compiler (and run-time libraries).
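
The system.h hunk itself is not visible in this excerpt, so the following is only a sketch of what such a helper could look like, modelled on the existing has_prefix and gfc_str_startswith helpers quoted above. The exact signature and return type are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if STR begins with PREFIX.  The prefix length is computed
   from PREFIX itself, so callers never spell out a length.  */
static inline bool
startswith (const char *str, const char *prefix)
{
  return strncmp (str, prefix, strlen (prefix)) == 0;
}
```

A call such as startswith (varname, "__builtin_") then replaces pattern 3. When the prefix is a string literal, compilers can typically fold the strlen call at compile time, so the helper should not cost more than the hand-written length.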

Thoughts?
Thanks,
Martin

---
 gcc/ada/adadecode.c | 14 +++
 gcc/ada/init.c  |  8 +++
 gcc/analyzer/sm-file.cc |  5 ++--
 gcc/builtins.c  | 10 +++-
 gcc/c-family/c-ada-spec.c   |  2 +-
 gcc/c-family/c-common.c |  2 +-
 gcc/c-family/c-format.c |  2 +-
 gcc/collect2.c  | 36 ++---
 gcc/cp/decl.c   |  6 ++---
 gcc/cp/mangle.c |  2 +-
 gcc/d/dmd/dmangle.c |  2 +-
 gcc/d/dmd/hdrgen.c  |  2 +-
 gcc/d/dmd/identifier.c  |  6 ++---
 gcc/dwarf2out.c | 14 +--
 gcc/fortran/decl.c  |  4 ++--
 gcc/fortran/gfortran.h  |  4 
 gcc/fortran/module.c| 10 
 gcc/fortran/options.c   |  2 +-
 gcc/fortran/primary.c   |  6 ++---
 gcc/fortran/trans-decl.c|  2 +-
 gcc/fortran/trans-expr.c|  2 +-
 gcc/fortran/trans-intrinsic.c   | 22 +-
 gcc/gcc.c   |  4 ++--
 gcc/gencfn-macros.c |  2 +-
 gcc/gengtype.c  |  6 ++---
 gcc/genmatch.c  |  8 +++
 gcc/genoutput.c |  2 +-
 gcc/go/gofrontend/runtime.cc|  2 +-
 gcc/incpath.c   |  2 +-
 gcc/langhooks.c |  8 +++
 gcc/objc/objc-next-runtime-abi-02.c |  2 +-
 gcc/opts-common.c   |  2 +-
 gcc/opts.c  |  2 +-
 gcc/read-rtl-function.c |  2 +-
 gcc/selftest.c  |  3 +--
 gcc/system.h|  8 +++
 gcc/timevar.c   |  2 +-
 gcc/tree.c  |  2 +-
 gcc/ubsan.c |  2 +-
 gcc/varasm.c| 22 +-
 40 files changed, 117 insertions(+), 127 deletions(-)

diff --git a/gcc/ada/adadecode.c b/gcc/ada/adadecode.c
index 43a378f9058..1154e628311 100644
--- a/gcc/ada/adadecode.c
+++ b/gcc/ada/adadecode.c
@@ -29,8 +29,9 @@
  *  *
  /
 
+#include "config.h"

+#include "system.h"
 #include "runtime.h"
-#include 
 #include 
 #include 
 
@@ -47,7 +48,6 @@

 #include "adadecode.h"
 
 static void add_verbose (const char *, char *);

-static int has_prefix (const char *, const char *);
 static int has_suffix (const char *, const char *);
 
 /* This is a safe version of strcpy that can be used with overlapped

@@ -68,14 +68,6 @@ static void add_verbose (const char *text, char *ada_name)
   verbose_info = 1;
 }
 
-/* Returns 1 if NAME starts with PREFIX.  */

-
-static int
-has_prefix (const char *name, const char *prefix)
-{
-  return strncmp (name, prefix, strlen (prefix)) == 0;
-}
-
 /* Returns 1 if NAME ends with SUFFIX.  */
 
 static int

@@ -167,7 +159,7 @@ __gnat_decode (const char *coded_name, char *ada_name, int verbose)
 }
 
   /* Check for library level subprogram.  */

-  else if (has_prefix (coded_name, "_ada_"))
+  else if (startswith (coded_name, "_ada_"))
 {
   strcpy (ada_name, coded_name + 5);
   lib_subprog = 1;
diff --git a/gcc/ada/init.c b/gcc/ada/init.c
index 3ceb1a31b02..24eb67bbf0d 100644
--- a/gcc/ada/init.c
+++ b/gcc/ada/init.c
@@ -2111,10 +2111,10 @@ __gnat_install_handler (void)
   prefix for vxsim when running on Linux and Windows.  */
   {
 char *model = sysModel ();
-if ((strncmp (model, "Linux", 5) == 0)
-|| (strncmp (model, "Windows", 7) == 0)
-|| (strncmp (model, "SIM

[patch] Fix PR middle-end/99641

2021-03-18 Thread Eric Botcazou

Hi,

this is the failure of a couple of tests in the gnat.dg testsuite on 32-bit 
platforms (but on some hosts only):

FAIL: gnat.dg/loop_optimization3.adb (test for excess errors)
FAIL: gnat.dg/opt30.adb (test for excess errors)
FAIL: gnat.dg/opt49.adb (test for excess errors)

caused by a segfault in native_encode_initializer when it is encoding the 
CONSTRUCTOR for an array whose lower bound is negative (it's OK in Ada).
The computation of the current position is done in HOST_WIDE_INT and this does 
not work for arrays whose original range has a negative lower bound and a 
positive upper bound; the computation must be done in sizetype instead so that 
it may wrap around, like in get_inner_reference or get_ref_base_and_extent.
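
The effect of doing the subtraction in a wrapping unsigned type can be shown with plain integers. This is a sketch under an assumption that the mail does not spell out: the bounds are taken to be stored as 32-bit unsigned values, so a lower bound of -2 appears as 0xFFFFFFFE. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Position computed in a wide signed type from the raw unsigned bit
   patterns: a "negative" lower bound is read as a huge positive value,
   so the difference goes negative.  */
static int64_t
offset_in_wide_signed (uint32_t index, uint32_t min_index)
{
  return (int64_t) index - (int64_t) min_index;
}

/* Position computed in the unsigned index type itself: the subtraction
   wraps modulo 2^32 and yields the true element distance.  */
static uint32_t
offset_in_index_type (uint32_t index, uint32_t min_index)
{
  return index - min_index;
}
```

For an array with range -2 .. 1, the element at index 1 is three elements past the start. The wrapping computation returns 3, whereas the wide signed one returns a large negative number that would then be used as a position.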

Tested on x86-64/Linux, OK for the mainline and 10 branch?


2021-03-18  Eric Botcazou  

PR middle-end/99641
* fold-const.c (native_encode_initializer) : For an
array type, do the computation of the current position in sizetype.

-- 
Eric Botcazou
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index e0bdb4b6ba6..55652819d71 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -8051,21 +8051,20 @@ native_encode_initializer (tree init, unsigned char *ptr, int len,
   int o = off == -1 ? 0 : off;
   if (TREE_CODE (type) == ARRAY_TYPE)
 	{
-	  HOST_WIDE_INT min_index;
+	  tree min_index;
 	  unsigned HOST_WIDE_INT cnt;
 	  HOST_WIDE_INT curpos = 0, fieldsize, valueinit = -1;
 	  constructor_elt *ce;
 
-	  if (TYPE_DOMAIN (type) == NULL_TREE
-	  || !tree_fits_shwi_p (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
+	  if (!TYPE_DOMAIN (type))
 	return 0;
+	  min_index = TYPE_MIN_VALUE (TYPE_DOMAIN (type));
 
 	  fieldsize = int_size_in_bytes (TREE_TYPE (type));
 	  if (fieldsize <= 0)
 	return 0;
 
-	  min_index = tree_to_shwi (TYPE_MIN_VALUE (TYPE_DOMAIN (type)));
-	  if (ptr != NULL)
+	  if (ptr)
 	memset (ptr, '\0', MIN (total_bytes - off, len));
 
 	  for (cnt = 0; ; cnt++)
@@ -8084,21 +8083,35 @@ native_encode_initializer (tree init, unsigned char *ptr, int len,
 		break;
 	  else
 		pos = total_bytes;
+
 	  if (index && TREE_CODE (index) == RANGE_EXPR)
 		{
-		  if (!tree_fits_shwi_p (TREE_OPERAND (index, 0))
-		  || !tree_fits_shwi_p (TREE_OPERAND (index, 1)))
+		  tree pos_diff
+		= fold_convert (sizetype,
+fold_build2 (MINUS_EXPR, TREE_TYPE (index),
+		 TREE_OPERAND (index, 0),
+		 min_index));
+		  if (!tree_fits_shwi_p (pos_diff))
+		return 0;
+		  pos = tree_to_shwi (pos_diff) * fieldsize;
+		  tree count_diff
+		= fold_convert (sizetype,
+fold_build2 (MINUS_EXPR, TREE_TYPE (index),
+		 TREE_OPERAND (index, 1),
+		 TREE_OPERAND (index, 0)));
+		  if (!tree_fits_shwi_p (count_diff))
 		return 0;
-		  pos = (tree_to_shwi (TREE_OPERAND (index, 0)) - min_index)
-			* fieldsize;
-		  count = (tree_to_shwi (TREE_OPERAND (index, 1))
-			   - tree_to_shwi (TREE_OPERAND (index, 0)));
+		  count = tree_to_shwi (count_diff);
 		}
 	  else if (index)
 		{
-		  if (!tree_fits_shwi_p (index))
+		  tree pos_diff
+		= fold_convert (sizetype,
+fold_build2 (MINUS_EXPR, TREE_TYPE (index),
+		 index, min_index));
+		  if (!tree_fits_shwi_p (pos_diff))
 		return 0;
-		  pos = (tree_to_shwi (index) - min_index) * fieldsize;
+		  pos = tree_to_shwi (pos_diff) * fieldsize;
 		}
 
 	  if (mask && !CONSTRUCTOR_NO_CLEARING (init) && curpos != pos)

[RFC] Fortran: OpenMP (Coarray?) – handling transfer/mapping of allocatable components, esp. polymorphic ones

2021-03-18 Thread Tobias Burnus


Fortran itself: suggestion is to add a new entry to the vtable
(breaking change) — thus, please also comment if you are not
interested in OpenMP (or coarrays).

For OpenMP: When mapping a derived-type to a non-shared-memory
(accelerator/GPU) device, it gets complicated with (polymorphic)
allocatable components — as OpenMP requires a deep copy of
_allocatable_ components.
[Side note: 'virtual calls' on the device are also permitted,
i.e. the vtable also has to be mapped properly.]

For coarrays: I thought there was the same issue with CO_REDUCE
(arbitrary type w/ user-defined reduction function), but I now think
that either I missed a constraint or J3/WG5 forgot to add one.
See the thread starting with my just-written email (no reply so far):
to J3: https://mailman.j3-fortran.org/pipermail/j3/2021-March/012965.html

[C++: Side note – OpenMP 5.1 now also permits virtual calls;
but the deep copy problem does not seem to exist (except next item?).]

For OpenMP, I think there is a relation between this issue and
how MAPPER might be implemented. — However, I have not looked at
mappers, hence, it could be a completely separate implementation or not.

 * * *

(A) EXAMPLES AS PREREMARK

  type recursive_t
type(recursive_t) :: A   ! recursive types; OpenMP: valid since 5.1
  end type

  type t
  end type t
  type, extends(t) :: t2
   integer, allocatable :: A(:)  ! allocatable component
  end type t2
  type t3
class(t), allocatable :: C  ! allocatable polymorphic component
  end type t3

  type(recursive_t) :: rt, rtc[*]
  class(t), allocatable :: B
  type(t3) :: C[*]
...
  !$omp target enter data map(to:B, rt, C)
...
  call CO_REDUCE (rtc, my_reduct_proc, result_image=1)
  call CO_REDUCE (C, my_reduct_proc3, result_image=1)


And for OpenMP also the following (virtual call on device):

  class(*), intent(in) :: dummy_class
  !$omp target map(to:dummy_class)
select class(dummy_class)
  class is my_cmplx_class
call dummy_class%type_bound_proc(5) ! TBP / virtual call



(B) DESCRIPTION OF THE PROBLEM

Coarrays: While there are some restrictions regarding the use of coarrays,
especially with user-defined reductions, data has to be accessed on the
remote image while only limited information about the details of the
remote image is available on this_image().

OpenMP: While OpenMP 4.5 mostly avoided all pitfalls, 5.0 permitted a lot
more and 5.1 removed additional restrictions. For unified shared memory
or when not using 'target' constructs, there is no issue beyond the normal
Fortran issues (e.g. data-sharing firstprivate with polymorphic variables).
However, when the memory is not shared it becomes harder.


In any case, the information is distributed over several places:

* Run-time library
  libgomp: knows how to transfer the data between the host and the
  device and update pointers
  libcaf: knows how to access remote memory. I think pointer mapping
  (like remote vs. local vtable) is not required, but it looks as
  if the vtab->hash value has to be obtainable for
  same_type_as(var[i], var[j])

* Type and associated: At the location of the type declaration and
  vtable generation, all details about the type are known (except for
  array bounds and the depth of the recursive types, which are both
  only known at run time).

* Code location which calls into the library (OpenMP construct,
  co_reduction call etc.):
  Here, both the need for the data transfer and the declared type is known
  – including which parts have to be handled in a loop form
  (for A%B(:)%recursive, the compiler can generate an outer loop over A%B(i)
  and then an inner loop over A%B(i)%recursive%recursive%...%recursive).

  For the used data ref itself, the compiler can also add code to handle
  the dynamic type of the last partref - that is both vtable and
  obtaining the vtab->size or similar.
  But if the last partref is a polymorphic type, neither
  allocatable nonpolymorphic nor allocatable polymorphic components
  are known at the code location.


(C) CURRENT LIB SUPPORT

(1) For OpenMP

The current code generation does not permit run-time dependent
mapping as everything is folded into a single libgomp mapping call:
   map(A)
may become something like:
   map(to:a.p [len: 64]) map(to:*a.p.data [len: D.3953 * 4]) 
map(always_pointer: a.p.data [pointer assign, bias: 0]) ...
which then calls
   __builtin_GOMP_target_enter_exit_data (-1, 1, &.omp_data_arr.4, 
&.omp_data_sizes.5, &.omp_data_kinds.6, 0, 0B);

Taking the example of the recursive type (valid since OpenMP 5.1), we would need
something like:
   __builtin_GOMP_target_enter_exit_data_begin
→ map 'A'
   prev = A;
   for (ptr = A.rt; ptr != NULL; ptr = ptr->rt, prev = prev->rt)
 map (ptr) map(alwaysptr: ptr → prev%rt)
   __builtin_GOMP_target_enter_exit_data_end
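
The shape of that loop can be tried out on the host alone. The following sketch uses no libgomp interface at all: "mapping" a node is simulated by a malloc'd copy and the always-pointer step by patching the previous copy's link. `struct recursive_t`, `map_chain` and `free_chain` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical C stand-in for the recursive Fortran type.  */
struct recursive_t
{
  int payload;
  struct recursive_t *rt;
};

/* Release a chain produced by map_chain.  */
static void
free_chain (struct recursive_t *head)
{
  while (head != NULL)
    {
      struct recursive_t *next = head->rt;
      free (head);
      head = next;
    }
}

/* Copy HEAD and every node reachable through ->rt, one node per loop
   iteration.  Returns the copied chain, or NULL if HEAD is NULL or an
   allocation fails.  */
static struct recursive_t *
map_chain (const struct recursive_t *head)
{
  struct recursive_t *mapped = NULL;
  struct recursive_t **attach = &mapped;   /* link to patch next */

  for (const struct recursive_t *ptr = head; ptr != NULL; ptr = ptr->rt)
    {
      struct recursive_t *copy = malloc (sizeof *copy);   /* "map (ptr)" */
      if (copy == NULL)
        {
          free_chain (mapped);
          return NULL;
        }
      copy->payload = ptr->payload;
      copy->rt = NULL;
      *attach = copy;       /* "alwaysptr": point the previous copy here */
      attach = &copy->rt;
    }
  return mapped;
}
```

The number of iterations depends on the run-time length of the chain, which is why a single mapping call with a fixed argument array cannot express it.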

(2) Likewise for Coarrays which also only get as argument:
  _gfortran_caf_co_reduce (gfc_descriptor_t *a, ..., int a_len)
which also does not help with allocatable components.

While for simple cases, a loop would do (

Re: znver3 tuning part 2

2021-03-18 Thread Richard Biener via Gcc-patches

On Wed, Mar 17, 2021 at 10:46 PM Jan Hubicka  wrote:
>
> Hi,
> this patch enables gather on zen3 hardware.  For TSVC it get used by 6
> benchmarks with following runtime improvements:
>
> s4114: 1.424 -> 1.209  (84.9017%)
> s4115: 2.021 -> 1.065  (52.6967%)
> s4116: 1.549 -> 0.854  (55.1323%)
> s4117: 1.386 -> 1.193  (86.075%)
> vag: 2.741 -> 1.940  (70.7771%)
>
> and one regression:
>
> s4112: 1.115 -> 1.184  (106.188%)
>
> In s4112 the internal loop is:
>
> for (int i = 0; i < LEN_1D; i++) {
> a[i] += b[ip[i]] * s;
> }
>
> (so a standard accumulate and add with indirect addressing)
>
>   40a400:   c5 fe 6f 24 03          vmovdqu (%rbx,%rax,1),%ymm4
>   40a405:   c5 fc 28 da             vmovaps %ymm2,%ymm3
>   40a409:   48 83 c0 20             add    $0x20,%rax
>   40a40d:   c4 e2 65 92 04 a5 00    vgatherdps %ymm3,0x594100(,%ymm4,4),%ymm0
>   40a414:   41 59 00
>   40a417:   c4 e2 75 a8 80 e0 34    vfmadd213ps 0x5b34e0(%rax),%ymm1,%ymm0
>   40a41e:   5b 00
>   40a420:   c5 fc 29 80 e0 34 5b    vmovaps %ymm0,0x5b34e0(%rax)
>   40a427:   00
>   40a428:   48 3d 00 f4 01 00       cmp    $0x1f400,%rax
>   40a42e:   75 d0                   jne    40a400
>
> compared to:
>
>   40a280:   49 63 14 04             movslq (%r12,%rax,1),%rdx
>   40a284:   48 83 c0 04             add    $0x4,%rax
>   40a288:   c5 fa 10 04 95 00 41    vmovss 0x594100(,%rdx,4),%xmm0
>   40a28f:   59 00
>   40a291:   c4 e2 71 a9 80 fc 34    vfmadd213ss 0x5b34fc(%rax),%xmm1,%xmm0
>   40a298:   5b 00
>   40a29a:   c5 fa 11 80 fc 34 5b    vmovss %xmm0,0x5b34fc(%rax)
>   40a2a1:   00
>   40a2a2:   48 3d 00 f4 01 00       cmp    $0x1f400,%rax
>   40a2a8:   75 d6                   jne    40a280
>
> Looking at instructions latencies
>
>  - fmadd is 4 cycles
>  - vgatherdps is 39
>
> So vgather itself is 4.8 cycles per iteration and the CPU is probably able to
> execute the rest out of order, getting close to 4 cycles per iteration (it can
> do 2 loads in parallel, one store, and the rest fits easily into the execution
> resources). That would explain the 20% slowdown.
>
> gimple internal loop is:
>   _2 = a[i_38];
>   _3 = (long unsigned int) i_38;
>   _4 = _3 * 4;
>   _5 = ip_18 + _4;
>   _6 = *_5;
>   _7 = b[_6];
>   _8 = _7 * s_19;
>   _9 = _2 + _8;
>   a[i_38] = _9;
>   i_28 = i_38 + 1;
>   ivtmp_52 = ivtmp_53 - 1;
>   if (ivtmp_52 != 0)
> goto ; [98.99%]
>   else
> goto ; [1.01%]
>
> 0x25bac30 a[i_38] 1 times scalar_load costs 12 in body
> 0x25bac30 *_5 1 times scalar_load costs 12 in body
> 0x25bac30 b[_6] 1 times scalar_load costs 12 in body
> 0x25bac30 _7 * s_19 1 times scalar_stmt costs 12 in body
> 0x25bac30 _2 + _8 1 times scalar_stmt costs 12 in body
> 0x25bac30 _9 1 times scalar_store costs 16 in body
>
> so 19 cycles estimate of scalar load
>
> 0x2668630 a[i_38] 1 times vector_load costs 12 in body
> 0x2668630 *_5 1 times unaligned_load (misalign -1) costs 12 in body
> 0x2668630 b[_6] 8 times scalar_load costs 96 in body
> 0x2668630 _7 * s_19 1 times scalar_to_vec costs 4 in prologue
> 0x2668630 _7 * s_19 1 times vector_stmt costs 12 in body
> 0x2668630 _2 + _8 1 times vector_stmt costs 12 in body
> 0x2668630 _9 1 times vector_store costs 16 in body
>
> so 40 cycles per 8x vectorized body
>
> tsvc.c:3450:27: note:  operating only on full vectors.
> tsvc.c:3450:27: note:  Cost model analysis:
>   Vector inside of loop cost: 160
>   Vector prologue cost: 4
>   Vector epilogue cost: 0
>   Scalar iteration cost: 76
>   Scalar outside cost: 0
>   Vector outside cost: 4
>   prologue iterations: 0
>   epilogue iterations: 0
>   Calculated minimum iters for profitability: 1
>
> I think this generally suffers from the GIGO principle.
> One problem seems to be that we do not know about fmadd yet and compute it as
> two instructions (6 cycles instead of 4). A more important problem is that we
> do not account for the parallelism at all.  I do not see how to disable the
> vectorization here without bumping gather costs noticeably off reality, and
> thus we can probably try to experiment with this if more similar problems are
> found.

Yep.  Vectorizer costing is really hard w/o modeling the CPU pipeline more
accurately.  Esp. for the scalar side of the code where modern CPUs often
can effectively do two-lane "vectorization" by executing two lanes in parallel.
At the moment we simply assume a single-issue pipeline.  But doing better
requires tracking dependences but the current vectorizer costing API does
not expose dependencies to the target so even rough estimates are hard to
come by (like assuming an issue width of two).  My current plan is not to
revisit this as long as we have both SLP and non-SLP data structures.
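
To make the mismatch concrete, the figures quoted above can be put side by side. This is only a sketch of the arithmetic with illustrative helper names; it assumes a vectorization factor of 8 (eight floats per ymm register) and is not how the vectorizer itself computes profitability:

```c
#include <assert.h>

/* Cost per scalar element when COST covers VF elements at once.  */
static double
per_element (double cost, int vf)
{
  assert (vf > 0);
  return cost / vf;
}

/* Ratio above 1 means the vector loop is expected to be faster,
   below 1 means it is expected to be slower.  */
static double
speedup (double scalar_per_elt, double vector_per_elt)
{
  assert (vector_per_elt > 0);
  return scalar_per_elt / vector_per_elt;
}
```

With the cost-model numbers, `per_element (160, 8)` is 20 against a scalar iteration cost of 76, so `speedup (76, 20)` predicts a 3.8x win. With the latency figures, `per_element (39, 8)` is 4.875 cycles against roughly 4 cycles for the scalar loop, so `speedup (4, 4.875)` is about 0.82, in line with the measured regression.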

> Icc is also using gather in s1115 and s128.
> For s1115 the vectorization does not seem to help and s128 gets slower.
>
> Neither Clang nor aocc uses gathers.
>
> Honza
>
> * x86-tune-costs.h (znver3_cost): Update costs of gat

Re: [patch] Fix PR middle-end/99641

2021-03-18 Thread Richard Biener via Gcc-patches

On Thu, Mar 18, 2021 at 1:04 PM Eric Botcazou  wrote:
>
> Hi,
>
> this is the failure of a couple of tests in the gnat.dg testsuite on 32-bit
> platforms (but on some hosts only):
>
> FAIL: gnat.dg/loop_optimization3.adb (test for excess errors)
> FAIL: gnat.dg/opt30.adb (test for excess errors)
> FAIL: gnat.dg/opt49.adb (test for excess errors)
>
> caused by a segfault in native_encode_initializer when it is encoding the
> CONSTRUCTOR for an array whose lower bound is negative (it's OK in Ada).
> The computation of the current position is done in HOST_WIDE_INT and this does
> not work for arrays whose original range has a negative lower bound and a
> positive upper bound; the computation must be done in sizetype instead so that
> it may wrap around, like in get_inner_reference or get_ref_base_and_extent.
>
> Tested on x86-64/Linux, OK for the mainline and 10 branch?

Can you use wide_ints instead of building trees here please?

Thanks,
Richard.

>
>
> 2021-03-18  Eric Botcazou  
>
> PR middle-end/99641
> * fold-const.c (native_encode_initializer) : For an
> array type, do the computation of the current position in sizetype.
>
> --
> Eric Botcazou
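
One way to picture the wrap-around issue described above is the following sketch. It assumes a 32-bit sizetype modelled as `uint32_t`, in which a lower bound of -1 is stored as 0xffffffff; the function names are illustrative and this is not the code in fold-const.c:

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit sizetype: a lower bound of -1 is stored as 0xffffffff.  */
typedef uint32_t size32;

/* Offset computed in a wider signed type from the zero-extended bounds:
   the "negative" lower bound is seen as a huge positive value.  */
static int64_t
pos_wide (size32 index, size32 low, size32 elt_size)
{
  return ((int64_t) index - (int64_t) low) * (int64_t) elt_size;
}

/* Offset computed in the 32-bit unsigned type itself: the subtraction
   wraps around modulo 2^32 and yields the intended distance.  */
static size32
pos_wrapping (size32 index, size32 low, size32 elt_size)
{
  assert (elt_size > 0);
  return (size32) (index - low) * elt_size;
}
```

For an array indexed from -1 to 1 with 4-byte elements, the element at index 1 sits at byte offset 8. `pos_wrapping` gives that result, while `pos_wide` produces a large negative offset.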

Re: [WIP] 'walk_gimple_seq' backward

2021-03-18 Thread Thomas Schwinge

Hi!

On 2021-03-17T08:54:15+0100, Richard Biener  wrote:
> On Wed, Mar 17, 2021 at 12:35 AM Thomas Schwinge
>  wrote:
>> On 2021-03-17T00:24:55+0100, I wrote:
>> > Now, walking each function backwards (!), [...]
>>
>> > I've now got a simple 'callback_op', which for '!is_lhs' looks at
>> > 'get_base_address ([op])', and if that 'var' is contained in the set of
>> > current candidates (initialized per containing 'bind's, which we enter
>> > first, even if walking a 'gimple_seq' backwards), removes that 'var' as a
>> > candidate for such optimization.  (Plus some "details", of course.)  This
>> > seems to work fine, as far as I can tell.  :-)
>>
>> Is something like the attached "[WIP] 'walk_gimple_seq' backward" OK
>> conceptually?  (For next development stage 1 and with all the TODOs
>> resolved, of course.)
>>
>> The 'backward' flag cannot simply be an argument to 'walk_gimple_seq'
>> etc.: it needs to also be passed to 'walk_gimple_seq_mod' calls triggered
>> from inside 'walk_gimple_stmt'.  Hence, I've put it into the state
>> 'struct walk_stmt_info'.
>
> Can't you simply walk the sequence backward yourself and call
> walk_gimple_stmt on each stmt instead?

That's what I'd intended to convey with the paragraph above -- I can't do
what you suggested: 'walk_gimple_stmt' will recursively call
'walk_gimple_seq_mod' ("If STMT can have statements inside
(e.g. GIMPLE_BIND), walk them.") -- and I need all these to be walked
backwards, too.  (..., and don't want to re-produce that whole
'walk_gimple_stmt' logic -- but could factor that logic out of
'walk_gimple_stmt', if that's cleaner than the 'backward' flag?)


> That said,
>
>if (!wi->removed_stmt)
> -   gsi_next (&gsi);
> +   {
> + if (forward)
> +   gsi_next (&gsi);
> + else //TODO Correct?
> +   gsi_prev (&gsi);
> + //TODO This could do with some unit testing, to make sure all the corner cases (removing first/last, for example) work correctly.
> +   }
>
> if wi->removed_stmt maps to gsi_remove being called then the backwards
> code is incorrect since gsi_remove advances the iterator in the wrong
> direction.  So you always need gsi_prev () here.

Thanks, we shall look into that.
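
For illustration only, here is the corner case on a generic doubly-linked sequence rather than on GCC's gimple iterators. The types and functions below are hypothetical. Note that the sketch uses a different technique from the one suggested above: it saves the predecessor before the callback can remove the current node, so the walk does not depend on where the removal leaves the iterator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statement node in a doubly-linked sequence.  */
struct stmt
{
  int id;
  struct stmt *prev, *next;
};

struct seq
{
  struct stmt *first, *last;
};

/* Append S at the end of SEQ.  */
static void
seq_append (struct seq *seq, struct stmt *s)
{
  assert (seq != NULL && s != NULL);
  s->prev = seq->last;
  s->next = NULL;
  if (seq->last)
    seq->last->next = s;
  else
    seq->first = s;
  seq->last = s;
}

/* Unlink S and return its successor, mimicking a remove operation that
   leaves the iterator on the following element.  */
static struct stmt *
seq_remove (struct seq *seq, struct stmt *s)
{
  struct stmt *next = s->next;
  if (s->prev)
    s->prev->next = s->next;
  else
    seq->first = s->next;
  if (s->next)
    s->next->prev = s->prev;
  else
    seq->last = s->prev;
  s->prev = s->next = NULL;
  return next;
}

/* Example predicate: select statements with an even id.  */
static int
is_even (const struct stmt *s)
{
  return s->id % 2 == 0;
}

/* Walk SEQ from last to first, removing nodes for which PRED is true.
   Returns the number of nodes visited.  */
static int
walk_backward (struct seq *seq, int (*pred) (const struct stmt *))
{
  int visited = 0;
  for (struct stmt *cur = seq->last, *prev; cur != NULL; cur = prev)
    {
      prev = cur->prev;		/* Saved before CUR may be unlinked.  */
      visited++;
      if (pred (cur))
	(void) seq_remove (seq, cur);
    }
  return visited;
}
```

Removing the first or the last node exercises the two `else` branches in `seq_remove`, which are the cases the TODO in the quoted hunk asks to be unit tested.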


> Otherwise sure.

OK, we shall prepare something for next development stage 1.


Grüße
 Thomas

Re: GCC: v850-elf

2021-03-18 Thread Nick Clifton via Gcc-patches


Hi JBG,


These three let it build.  One done.  Thanks for your support!


No worries.  Patch pushed.

Cheers
  Nick

Re: [Patch, fortran] PR99602 - [11 regression] runtime error: pointer actual argument not associated

2021-03-18 Thread Paul Richard Thomas via Gcc-patches

Hi Tobias,

Thanks for the review. I am resisting dg-run for this patch simply because
the testsuite already takes an oppressive amount of time to run. That the
runtime error is present in the code should be sufficient IMHO.

Regards

Paul


On Thu, 18 Mar 2021 at 08:46, Tobias Burnus  wrote:

> Hi Paul, hi all fortran@/gcc-patch@ reader,
>
> it looks as if you replied with your patch submission to the wrong email
> address – and your re-submission ended up at
> https://gcc.gnu.org/PR99602#c17
>
> On 16.03.21 18:08, Tobias Burnus wrote:
> > On 16.03.21 17:42, Paul Richard Thomas via Gcc-patches wrote:
> >> Fortran: Fix runtime errors for class actual arguments [PR99602].
> >> * trans-array.c (gfc_conv_procedure_call): For class formal
> >> arguments, use the _data field attributes for runtime errors.
> >> * gfortran.dg/pr99602.f90: New test.
> > Shouldn't there be also a testcase which triggers this run-time error?
>
> Note: The new submission consists of a new testcase (now two) and the
> actual patch; the new testcase removes 'pointer' from the dummy argument
> of prepare_m2_proc/prepare_whizard_m2 and checks via the
> -ftree-original-dump that there is now run-time check code inserted when
> passing a (nullified) pointer to a nonpointer dummy argument.
>
> Compared to previous patch, 'fsym_attr.pointer =
> fsym_attr.class_pointer' is new, before it was 'fsym_attr.pointer =
> fsym_attr.pointer'.
>
> Paul Richard Thomas wrote in PR99602:
>
> > Good morning all,
> >
> > I have attached the revised patch and an additional testcase. I had totally
> > forgotten about the class pointer gotcha.
> >
> > OK for master?
> >
> > Paul
> >
> > Fortran: Fix runtime errors for class actual arguments [PR99602
> > <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99602>].
>
> LGTM – thanks for the patch.
>
> I am wondering whether the second testcase should be a 'dg-do run' test
> instead of 'compile' to ensure that the error is indeed triggered
> (currently, it only checks the tree dump that a check is inserted). What
> do you think? [If you do so, you need a dg-shouldfail + dg-output, cf.
> e.g. pointer_check_5.f90.]
>
> Thanks,
>
> Tobias
>
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank
> Thürauf
>


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein

Re: [PATCH] testsuite: Skip c-c++-common/zero-scratch-regs-10.c on arm

2021-03-18 Thread Jeff Law via Gcc-patches




On 3/18/2021 4:02 AM, Christophe Lyon via Gcc-patches wrote:

As discussed in PR 97680, -fzero-call-used-regs is not supported on
arm.

Skip this test to avoid failure reports.

2021-03-18  Christophe Lyon  

gcc/testsuite/
* c-c++-common/zero-scratch-regs-10.c: Skip on arm


OK

Jeff

Re: [patch] fix installation of jit headers, usage of $(mkinstalldirs)

2021-03-18 Thread Jeff Law via Gcc-patches




On 3/18/2021 3:40 AM, Matthias Klose wrote:

The installation of the jit headers can fail because the directory might not be
created yet; the install rule is missing a dependency on the installdirs target.

Also the Makefile hardcodes mkdir -p, instead of using $(mkinstalldirs).

Ok for the trunk and the branches?


OK

jeff

[PATCH V5 2/6] dwarf: new dwarf_debuginfo_p predicate

2021-03-18 Thread Jose E. Marchesi via Gcc-patches

This patch introduces a dwarf_debuginfo_p predicate that abstracts and
replaces complex checks on write_symbols.
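
The definition added to opts.c is not shown in this excerpt. Judging only from the open-coded tests that the hunks below replace, the predicate plausibly looks like the following sketch. The enum and the global here are simplified stand-ins for GCC's own declarations, and the body is an assumption:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for GCC's debug_info_type; only the enumerators
   needed here are modelled.  */
enum debug_info_type
{
  NO_DEBUG,
  DWARF2_DEBUG,
  VMS_DEBUG,
  VMS_AND_DWARF2_DEBUG
};

/* Stand-in for the global set by option processing.  */
static enum debug_info_type write_symbols = NO_DEBUG;

/* Assumed body: true when DWARF is among the formats being written.  */
static bool
dwarf_debuginfo_p (void)
{
  return (write_symbols == DWARF2_DEBUG
	  || write_symbols == VMS_AND_DWARF2_DEBUG);
}

/* Check the predicate against every modelled value of write_symbols.  */
static void
check_all_formats (void)
{
  static const bool expected[] = { false, true, false, true };
  for (int i = NO_DEBUG; i <= VMS_AND_DWARF2_DEBUG; i++)
    {
      write_symbols = (enum debug_info_type) i;
      assert (dwarf_debuginfo_p () == expected[i]);
    }
}
```

Call sites then reduce to `if (dwarf_debuginfo_p ())`, which keeps the knowledge of which `write_symbols` values imply DWARF in one place.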

2021-03-18  Indu Bhagat  

gcc/ChangeLog

* flags.h (dwarf_debuginfo_p): New function declaration.
* opts.c (dwarf_debuginfo_p): New function definition.
* config/c6x/c6x.c (c6x_output_file_unwind): Use dwarf_debuginfo_p.
* dwarf2cfi.c (cfi_label_required_p): Likewise.
(dwarf2out_do_frame): Likewise.
* final.c (dwarf2_debug_info_emitted_p): Likewise.
(final_scan_insn_1): Likewise.
* targhooks.c (default_debug_unwind_info): Likewise.
* toplev.c (process_options): Likewise.

gcc/c-family/ChangeLog

* c-lex.c (init_c_lex): Use dwarf_debuginfo_p.
---
 gcc/c-family/c-lex.c |  4 ++--
 gcc/config/c6x/c6x.c |  3 +--
 gcc/dwarf2cfi.c  |  9 -
 gcc/final.c  | 15 ++-
 gcc/flags.h  |  3 +++
 gcc/opts.c   |  8 
 gcc/targhooks.c  |  2 +-
 gcc/toplev.c |  6 ++
 8 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 6374b72ed2d..5174b22c303 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stor-layout.h"
 #include "c-pragma.h"
 #include "debug.h"
+#include "flags.h"
 #include "file-prefix-map.h" /* remap_macro_filename()  */
 #include "langhooks.h"
 #include "attribs.h"
@@ -87,8 +88,7 @@ init_c_lex (void)
 
   /* Set the debug callbacks if we can use them.  */
   if ((debug_info_level == DINFO_LEVEL_VERBOSE
-   && (write_symbols == DWARF2_DEBUG
-  || write_symbols == VMS_AND_DWARF2_DEBUG))
+   && dwarf_debuginfo_p ())
   || flag_dump_go_spec != NULL)
 {
   cb->define = cb_define;
diff --git a/gcc/config/c6x/c6x.c b/gcc/config/c6x/c6x.c
index f9ad1e5f6c5..a10e2f8d662 100644
--- a/gcc/config/c6x/c6x.c
+++ b/gcc/config/c6x/c6x.c
@@ -439,8 +439,7 @@ c6x_output_file_unwind (FILE * f)
 {
   if (flag_unwind_tables || flag_exceptions)
{
- if (write_symbols == DWARF2_DEBUG
- || write_symbols == VMS_AND_DWARF2_DEBUG)
+ if (dwarf_debuginfo_p ())
asm_fprintf (f, "\t.cfi_sections .debug_frame, .c6xabi.exidx\n");
  else
asm_fprintf (f, "\t.cfi_sections .c6xabi.exidx\n");
diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index 2fa9f325360..8a88252edf2 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -39,7 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "expr.h"  /* init_return_column_size */
 #include "output.h"/* asm_out_file */
 #include "debug.h" /* dwarf2out_do_frame, dwarf2out_do_cfi_asm */
-
+#include "flags.h" /* dwarf_debuginfo_p */
 
 /* ??? Poison these here until it can be done generically.  They've been
totally replaced in this file; make sure it stays that way.  */
@@ -2268,8 +2268,7 @@ cfi_label_required_p (dw_cfi_ref cfi)
 
   if (dwarf_version == 2
   && debug_info_level > DINFO_LEVEL_TERSE
-  && (write_symbols == DWARF2_DEBUG
- || write_symbols == VMS_AND_DWARF2_DEBUG))
+  && dwarf_debuginfo_p ())
 {
   switch (cfi->dw_cfi_opc)
{
@@ -3535,9 +3534,9 @@ bool
 dwarf2out_do_frame (void)
 {
   /* We want to emit correct CFA location expressions or lists, so we
- have to return true if we're going to output debug info, even if
+ have to return true if we're going to generate debug info, even if
  we're not going to output frame or unwind info.  */
-  if (write_symbols == DWARF2_DEBUG || write_symbols == VMS_AND_DWARF2_DEBUG)
+  if (dwarf_debuginfo_p ())
 return true;
 
   if (saved_do_cfi_asm > 0)
diff --git a/gcc/final.c b/gcc/final.c
index daae115fef5..cae692062b4 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -1442,7 +1442,8 @@ asm_str_count (const char *templ)
 static bool
 dwarf2_debug_info_emitted_p (tree decl)
 {
-  if (write_symbols != DWARF2_DEBUG && write_symbols != VMS_AND_DWARF2_DEBUG)
+  /* When DWARF2 debug info is not generated internally.  */
+  if (!dwarf_debuginfo_p ())
 return false;
 
   if (DECL_IGNORED_P (decl))
@@ -2330,10 +2331,8 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
  break;
 
case NOTE_INSN_BLOCK_BEG:
- if (debug_info_level == DINFO_LEVEL_NORMAL
- || debug_info_level == DINFO_LEVEL_VERBOSE
- || write_symbols == DWARF2_DEBUG
- || write_symbols == VMS_AND_DWARF2_DEBUG
+ if (debug_info_level >= DINFO_LEVEL_NORMAL
+ || dwarf_debuginfo_p ()
  || write_symbols == VMS_DEBUG)
{
  int n = BLOCK_NUMBER (NOTE_BLOCK (insn));
@@ -2368,10 +2367,8 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
case NOTE_INSN_BLOCK_END:
  maybe_output_next_view (seen);
 
- if (debug_info_

[PATCH V5 0/6] Support for the CTF and BTF debug formats

2021-03-18 Thread Jose E. Marchesi via Gcc-patches

[Changes from V4:
- Rebased to latest master.
- Support for DATASEC in BTF.
- Bug fixes in the CTF support.
- Be more silent: do not inform() the user anymore if -gctf is used
  along with a frontend for which there is no CTF support.  Ignore
  the request instead.
- Got rid of lang_GNU_GIMPLE, which is not needed.
- New preparatory patch that abstracts the tests on write_symbols
  in a predicate.
- New preparatory patch that provides an internal interface to the
  DWARF internal structures.  This makes it possible to not have to
  #include dwarf2ctf.c in dwarf2out.c.  Note that we only added the
  minimum set of functions/data types we need for dwarf2ctf.  Note
  also that we didn't add prefixes to avoid massive renames in
  dwarf2out.c.  We also add a few new accessor functions.  See the
  particular patch description.
- Fixes to allow using -gctf along with -gdwarf.
- More testing:
  + More BTF tests.
  + More CTF tests.
  + Tests for mixing -gctf and -gdwarf.
  + Regression tests in x86_64 and aarch64.
  + LTO testing.
  + 1612 Gentoo packages built with CTF support, no failures.]

Hi people!

Last year we submitted a first patch series introducing support for
the CTF debugging format in GCC [1].  We got a lot of feedback that
prompted us to change the approach used to generate the debug info,
and this patch series is the result of that.

This series also adds support for the BTF debug format, which is needed
by the BPF backend (more on this below.)

This implementation works, but there are several points that need
discussion and agreement with the upstream community, as they impact
the way debugging options work.  We are also proposing a way to add
additional debugging formats in the future.  See below for more
details.

Finally, a patch makes the BPF GCC backend use the DWARF debug
hooks in order to make -gbtf available to it.

[1] https://gcc.gnu.org/legacy-ml/gcc-patches/2019-05/msg01297.html

About CTF
=========

CTF is a debugging format designed in order to express C types in a
very compact way.  The key is compactness and simplicity.  For more
information see:

- CTF specification
  http://www.esperi.org.uk/~oranix/ctf/ctf-spec.pdf

- Compact C-Type support in the GNU toolchain (talk + slides)
  https://linuxplumbersconf.org/event/4/contributions/396/

- On type de-duplication in CTF (talk + slides)
  https://linuxplumbersconf.org/event/7/contributions/725/

About BTF
=========

BTF is a debugging format, similar to CTF, that is used in the Linux
kernel as the debugging format for BPF programs.  From the kernel
documentation:

"BTF (BPF Type Format) is the metadata format which encodes the debug
 info related to BPF program/map. The name BTF was used initially to
 describe data types. The BTF was later extended to include function
 info for defined subroutines, and line info for source/line
 information."

Supporting BTF in GCC is important because compiled BPF programs
(which GCC supports as a target) require the type information in order
to be loaded and run in diverse kernel versions.  This mechanism is
known as CO-RE (compile-once, run-everywhere) and is described in the
"Update of the BPF support in the GNU Toolchain" talk mentioned below.

The BTF is documented in the Linux kernel documentation tree:
- linux/Documentation/bpf/btf.rst

CTF in the GNU Toolchain
========================

During the last year we have been working on adding support for CTF to
several components of the GNU toolchain:

- binutils support is already upstream.  It supports linking objects
  with CTF information with full type de-duplication.

- GDB support is to be sent upstream very shortly.  It makes the
  debugger capable of using the CTF information whenever available.
  This is useful in cases where DWARF has been stripped out but CTF is
  kept.

- GCC support is being discussed and submitted in this series.

Overview of the Implementation
==============================

  dwarf2out.c

The enabled debug formats are hooked in dwarf2out_early_finish.

  dwarf2int.h

Internal interface that exports a few functions and data types
defined in dwarf2out.c.

  dwarf2ctf.c

Code that transforms the internal GCC DWARF DIEs into CTF container
structures.  This file uses the dwarf2int.h interface.

  ctfc.c
  ctfc.h

These two files implement the "CTF container", which is shared
among CTF and BTF, due to the many similarities between both
formats.

  ctfout.c

Code that emits assembler with the .ctf section data, from the CTF
container.

  btfout.c

Code that emits assembler with the .BTF section data, from the CTF
container.

From debug hooks to debug formats
=================================

Our first attempt in adding CTF to GCC used the obvious approach of
adding a new set of debug hooks as defined in gcc/debug.h.

During our first interaction with the upstream community we were told
to _not_ use debug hooks, because these are to be obsoleted at some
point.  We were sug

[PATCH V5 1/6] dwarf: add a dwarf2int.h internal interface

2021-03-18 Thread Jose E. Marchesi via Gcc-patches

This patch introduces a dwarf2int.h header, to be used by code that
needs access to the internal DIE structures and their attributes.

The following functions which were previously defined as static in
dwarf2out.c are now non-static, and extern prototypes for them have
been added to dwarf2int.h:

- get_AT
- get_AT_string
- get_AT_flag
- get_AT_unsigned
- get_AT_ref
- new_die_raw
- lookup_decl_die
- base_type_die
- add_name_attribute

Note how this patch doesn't change the names of these functions to
avoid a massive renaming in dwarf2out.c, but in the future we probably
want these functions to sport a dw_* prefix.

Also, a struct type has been moved from dwarf2out.c to dwarf2int.h:

- dw_attr_node

Finally, three new accessor functions have been added to dwarf2out.c
with prototypes in dwarf2int.h:

- dw_get_die_child
- dw_get_die_sib
- dw_get_die_tag
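
As a rough picture of how a consumer might use accessors of this kind, the sketch below walks a mock DIE tree and counts entries with a given tag. Everything in it is hypothetical: `struct mock_die`, the `mock_*` helpers and the tag enum are stand-ins, and the ring-shaped sibling list (the child link points at the last child, whose sibling link leads back to the first) is an assumption about how children are linked, not something this patch states.

```c
#include <assert.h>
#include <stddef.h>

/* Mock tags and DIE; the real dw_die_ref points to a richer structure.  */
enum mock_tag { TAG_COMPILE_UNIT, TAG_BASE_TYPE, TAG_STRUCT_TYPE, TAG_MEMBER };

struct mock_die
{
  enum mock_tag tag;
  struct mock_die *child;	/* Last child, or NULL.  */
  struct mock_die *sib;		/* Next sibling; children form a ring.  */
};
typedef struct mock_die *die_ref;

/* Accessors in the spirit of the three new functions.  */
static die_ref mock_get_die_child (die_ref d) { return d->child; }
static die_ref mock_get_die_sib (die_ref d) { return d->sib; }
static enum mock_tag mock_get_die_tag (die_ref d) { return d->tag; }

/* Add C as the new last child of PARENT, keeping the ring closed.  */
static void
mock_add_child (die_ref parent, die_ref c)
{
  assert (parent != NULL && c != NULL);
  if (parent->child)
    {
      c->sib = parent->child->sib;	/* New last child points to first.  */
      parent->child->sib = c;
    }
  else
    c->sib = c;
  parent->child = c;
}

/* Count DIEs with tag TAG in the subtree rooted at D, using only the
   accessors to navigate.  */
static int
count_tag (die_ref d, enum mock_tag tag)
{
  int n = mock_get_die_tag (d) == tag;
  die_ref last = mock_get_die_child (d);
  if (last != NULL)
    {
      die_ref c = last;
      do
	{
	  c = mock_get_die_sib (c);
	  n += count_tag (c, tag);
	}
      while (c != last);
    }
  return n;
}
```

Going through accessors rather than struct fields is what lets a separate file navigate DIEs without seeing the full structure definition.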

2021-03-17  Jose E. Marchesi  

* dwarf2int.h: New file.
* dwarf2out.c (get_AT): Function is no longer static.
(get_AT_string): Likewise.
(get_AT_flag): Likewise.
(get_AT_unsigned): Likewise.
(get_AT_ref): Likewise.
(new_die_raw): Likewise.
(lookup_decl_die): Likewise.
(base_type_die): Likewise.
(add_name_attribute): Likewise.
(dw_get_die_tag): New function.
(dw_get_die_child): Likewise.
(dw_get_die_sib): Likewise.
Include dwarf2int.h.
* Makefile.in (GTFILES): Add dwarf2int.h.
---
 gcc/Makefile.in |  1 +
 gcc/dwarf2int.h | 58 
 gcc/dwarf2out.c | 71 ++---
 3 files changed, 96 insertions(+), 34 deletions(-)
 create mode 100644 gcc/dwarf2int.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 8a5fb3fd99c..e464e8c65c5 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2653,6 +2653,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/ipa-modref.h $(srcdir)/ipa-modref.c \
   $(srcdir)/ipa-modref-tree.h \
   $(srcdir)/signop.h \
+  $(srcdir)/dwarf2int.h \
   $(srcdir)/dwarf2out.h \
   $(srcdir)/dwarf2asm.c \
   $(srcdir)/dwarf2cfi.c \
diff --git a/gcc/dwarf2int.h b/gcc/dwarf2int.h
new file mode 100644
index 00000000000..c7a2dbd3325
--- /dev/null
+++ b/gcc/dwarf2int.h
@@ -0,0 +1,58 @@
+/* Prototypes for functions manipulating DWARF2 DIEs.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* This file contains prototypes for functions defined in dwarf2out.c.  It is
+   intended to be included in source files that need some internal knowledge of
+   the GCC dwarf structures.  */
+
+#ifndef GCC_DWARF2INT_H
+#define GCC_DWARF2INT_H 1
+
+/* Each DIE attribute has a field specifying the attribute kind,
+   a link to the next attribute in the chain, and an attribute value.
+   Attributes are typically linked below the DIE they modify.  */
+
+typedef struct GTY(()) dw_attr_struct {
+  enum dwarf_attribute dw_attr;
+  dw_val_node dw_attr_val;
+}
+dw_attr_node;
+
+extern dw_attr_node *get_AT (dw_die_ref, enum dwarf_attribute);
+extern HOST_WIDE_INT AT_int (dw_attr_node *);
+extern unsigned HOST_WIDE_INT AT_unsigned (dw_attr_node *a);
+extern dw_die_ref get_AT_ref (dw_die_ref, enum dwarf_attribute);
+extern const char *get_AT_string (dw_die_ref, enum dwarf_attribute);
+extern enum dw_val_class AT_class (dw_attr_node *);
+extern unsigned HOST_WIDE_INT AT_unsigned (dw_attr_node *);
+extern unsigned get_AT_unsigned (dw_die_ref, enum dwarf_attribute);
+extern int get_AT_flag (dw_die_ref, enum dwarf_attribute);
+
+extern void add_name_attribute (dw_die_ref, const char *);
+
+extern dw_die_ref new_die_raw (enum dwarf_tag);
+extern dw_die_ref base_type_die (tree, bool);
+
+extern dw_die_ref lookup_decl_die (tree);
+
+extern dw_die_ref dw_get_die_child (dw_die_ref);
+extern dw_die_ref dw_get_die_sib (dw_die_ref);
+extern enum dwarf_tag dw_get_die_tag (dw_die_ref);
+
+#endif /* !GCC_DWARF2INT_H */
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index b3ca159c3a8..b3fe41313af 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -80,6 +80,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "expr.h"
 #include "dwarf2out.h"
 #include "dwarf2asm.h"
+#include "dwarf2int.h"
 #include "toplev.h"
 #include "md5.h"
 #include "tree-pretty-print.h"
@@ -3069,17 +3070,6 @@ maybe_reset_location_view (rtx_insn *insn,

[PATCH V5 5/6] CTF/BTF documentation

2021-03-18 Thread Jose E. Marchesi via Gcc-patches

This commit documents the new command line options introduced by the
CTF and BTF debug formats.

2021-02-18  Indu Bhagat  

* doc/invoke.texi: Document the CTF and BTF debug info options.
---
 gcc/doc/invoke.texi | 20 
 1 file changed, 20 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7a368959e5e..79453a53b7b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -462,6 +462,7 @@ Objective-C and Objective-C++ Dialects}.
 @item Debugging Options
 @xref{Debugging Options,,Options for Debugging Your Program}.
 @gccoptlist{-g  -g@var{level}  -gdwarf  -gdwarf-@var{version} @gol
+-gbtf -gctf  -gctf@var{level} @gol
 -ggdb  -grecord-gcc-switches  -gno-record-gcc-switches @gol
 -gstabs  -gstabs+  -gstrict-dwarf  -gno-strict-dwarf @gol
 -gas-loc-support  -gno-as-loc-support @gol
@@ -9665,6 +9666,25 @@ other DWARF-related options such as
 @option{-fno-dwarf2-cfi-asm}) retain a reference to DWARF Version 2
 in their names, but apply to all currently-supported versions of DWARF.
 
+@item -gbtf
+@opindex gbtf
+Request BTF debug information.
+
+@item -gctf
+@itemx -gctf@var{level}
+@opindex gctf
+Request CTF debug information and use level to specify how much CTF debug
+information should be produced.  If -gctf is specified without a value for
+level, the default level of CTF debug information is 2.
+
+Level 0 produces no CTF debug information at all.  Thus, -gctf0 negates -gctf.
+
+Level 1 produces CTF information for tracebacks only.  This includes callsite
+information, but does not include type information.
+
+Level 2 produces type information for entities (functions, data objects etc.)
+at file-scope or global-scope only.
+
 @item -gstabs
 @opindex gstabs
 Produce debugging information in stabs format (if that is supported),
-- 
2.25.0.2.g232378479e
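
The level semantics documented in this patch can be summarised with a small sketch. This is not GCC's option-handling code; `parse_gctf` and `CTF_LEVEL_DEFAULT` are hypothetical names, and only the spellings `-gctf`, `-gctf0`, `-gctf1` and `-gctf2` described above are accepted:

```c
#include <assert.h>
#include <string.h>

#define CTF_LEVEL_DEFAULT 2

/* Return the CTF level requested by ARG, or -1 if ARG is not a -gctf
   option of the documented form.  */
static int
parse_gctf (const char *arg)
{
  assert (arg != NULL);
  static const char prefix[] = "-gctf";
  size_t len = sizeof prefix - 1;
  if (strncmp (arg, prefix, len) != 0)
    return -1;
  const char *level = arg + len;
  if (*level == '\0')
    return CTF_LEVEL_DEFAULT;	/* Plain -gctf means level 2.  */
  if (level[0] >= '0' && level[0] <= '2' && level[1] == '\0')
    return level[0] - '0';
  return -1;
}
```

A result of 0 corresponds to "no CTF at all", which is how `-gctf0` negates an earlier `-gctf`.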

[PATCH V5 6/6] Enable BTF generation in the BPF backend

2021-03-18 Thread Jose E. Marchesi via Gcc-patches

This patch changes the BPF GCC backend in order to use the DWARF debug
hooks and therefore enables the user to generate BTF debugging
information with -gbtf.  Generating BTF is crucial when compiling BPF
programs, since the CO-RE (compile-once, run-everywhere) mechanism
used by the kernel BPF loader relies on it.

Note that since in eBPF it is not possible to unwind frames due to the
restrictive nature of the target architecture, we are disabling the
generation of CFA in this target.

2021-01-22  David Faust 

* config/bpf/bpf.c (bpf_expand_prologue): Do not mark insns as
frame related.
(bpf_expand_epilogue): Likewise.
* config/bpf/bpf.h (DWARF2_FRAME_INFO): Define to 0.
Do not define DBX_DEBUGGING_INFO.
---
 gcc/config/bpf/bpf.c |  4 
 gcc/config/bpf/bpf.h | 12 ++--
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/gcc/config/bpf/bpf.c b/gcc/config/bpf/bpf.c
index 126d4a2798d..e635f9edb40 100644
--- a/gcc/config/bpf/bpf.c
+++ b/gcc/config/bpf/bpf.c
@@ -349,7 +349,6 @@ bpf_expand_prologue (void)
  hard_frame_pointer_rtx,
  fp_offset - 8));
  insn = emit_move_insn (mem, gen_rtx_REG (DImode, regno));
- RTX_FRAME_RELATED_P (insn) = 1;
  fp_offset -= 8;
}
}
@@ -364,7 +363,6 @@ bpf_expand_prologue (void)
 {
   insn = emit_move_insn (stack_pointer_rtx,
 hard_frame_pointer_rtx);
-  RTX_FRAME_RELATED_P (insn) = 1;
 
   if (size > 0)
{
@@ -372,7 +370,6 @@ bpf_expand_prologue (void)
 gen_rtx_PLUS (Pmode,
   stack_pointer_rtx,
   GEN_INT (-size))));
- RTX_FRAME_RELATED_P (insn) = 1;
}
 }
 }
@@ -412,7 +409,6 @@ bpf_expand_epilogue (void)
  hard_frame_pointer_rtx,
  fp_offset - 8));
  insn = emit_move_insn (gen_rtx_REG (DImode, regno), mem);
- RTX_FRAME_RELATED_P (insn) = 1;
  fp_offset -= 8;
}
}
diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h
index 9e2f5260900..ef0123b56a6 100644
--- a/gcc/config/bpf/bpf.h
+++ b/gcc/config/bpf/bpf.h
@@ -235,17 +235,9 @@ enum reg_class
 
 /**** Debugging Info ****/
 
-/* We cannot support DWARF2 because of the limitations of eBPF.  */
+/* In eBPF it is not possible to unwind frames. Disable CFA.  */
 
-/* elfos.h insists in using DWARF.  Undo that here.  */
-#ifdef DWARF2_DEBUGGING_INFO
-# undef DWARF2_DEBUGGING_INFO
-#endif
-#ifdef PREFERRED_DEBUGGING_TYPE
-# undef PREFERRED_DEBUGGING_TYPE
-#endif
-
-#define DBX_DEBUGGING_INFO
+#define DWARF2_FRAME_INFO 0
 
 /*** Stack Layout and Calling Conventions.  */
 
-- 
2.25.0.2.g232378479e

[PATCH V5 4/6] CTF/BTF testsuites

2021-03-18 Thread Jose E. Marchesi via Gcc-patches

This commit adds new testsuites for the CTF and BTF debug formats.

2021-03-18  Indu Bhagat  
David Faust  

gcc/testsuite/

* gcc.dg/debug/btf/btf-1.c: New test.
* gcc.dg/debug/btf/btf-2.c: Likewise.
* gcc.dg/debug/btf/btf-anonymous-struct-1.c: Likewise.
* gcc.dg/debug/btf/btf-anonymous-union-1.c: Likewise.
* gcc.dg/debug/btf/btf-array-1.c: Likewise.
* gcc.dg/debug/btf/btf-bitfields-1.c: Likewise.
* gcc.dg/debug/btf/btf-bitfields-2.c: Likewise.
* gcc.dg/debug/btf/btf-bitfields-3.c: Likewise.
* gcc.dg/debug/btf/btf-cvr-quals-1.c: Likewise.
* gcc.dg/debug/btf/btf-enum-1.c: Likewise.
* gcc.dg/debug/btf/btf-forward-1.c: Likewise.
* gcc.dg/debug/btf/btf-function-1.c: Likewise.
* gcc.dg/debug/btf/btf-function-2.c: Likewise.
* gcc.dg/debug/btf/btf-int-1.c: Likewise.
* gcc.dg/debug/btf/btf-pointers-1.c: Likewise.
* gcc.dg/debug/btf/btf-struct-1.c: Likewise.
* gcc.dg/debug/btf/btf-typedef-1.c: Likewise.
* gcc.dg/debug/btf/btf-union-1.c: Likewise.
* gcc.dg/debug/btf/btf-variables-1.c: Likewise.
* gcc.dg/debug/btf/btf.exp: Likewise.
* gcc.dg/debug/ctf/ctf-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-anonymous-struct-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-anonymous-union-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-attr-mode-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-attr-used-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-complex-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-enum-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-enum-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-file-scope-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-float-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-forward-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-forward-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-func-index-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-function-pointers-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-function-pointers-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-function-pointers-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-functions-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-int-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-objt-index-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-pointers-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-pointers-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-preamble-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-5.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-6.c: Likewise.
* gcc.dg/debug/ctf/ctf-str-table-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-array-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-pointer-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-pointer-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-struct-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-struct-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-struct-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-union-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-variables-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-variables-2.c: Likewise.
* gcc.dg/debug/ctf/ctf.exp: Likewise.
---
 gcc/testsuite/gcc.dg/debug/btf/btf-1.c|  6 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-2.c| 10 +++
 .../gcc.dg/debug/btf/btf-anonymous-struct-1.c | 23 ++
 .../gcc.dg/debug/btf/btf-anonymous-union-1.c  | 23 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-array-1.c  | 31 +++
 .../gcc.dg/debug/btf/btf-bitfields-1.c| 34 
 .../gcc.dg/debug/btf/btf-bitfields-2.c| 26 ++
 .../gcc.dg/debug/btf/btf-bitfields-3.c| 43 ++
 .../gcc.dg/debug/btf/btf-cvr-quals-1.c| 52 
 .../gcc.dg/debug/btf/btf-datasec-1.c  | 45 ++
 gcc/testsuite/gc

[PATCH] testsuite: Fix up strlenopt-73.c on powerpc [PR99626]

2021-03-18 Thread Jakub Jelinek via Gcc-patches

Hi!

As mentioned in the testcase as well as in the PR, this testcase relies on
MOVE_MAX being sufficiently large that the memcpy call is folded early into
load + store.  Some popular targets define MOVE_MAX to 8 or even 16 (e.g.
x86_64 or some options on s390x), but many other targets define it to just 4
(e.g. powerpc 32-bit), or even 2.

The testcase has already one test routine guarded on one particular target
with MOVE_MAX 16 (but does it incorrectly, __i386__ is only defined on
32-bit x86 and __SIZEOF_INT128__ is only defined on 64-bit targets), this
patch fixes that, and guards another test that relies on memcpy (, , 8)
being folded that way (which therefore needs MOVE_MAX >= 8) on a couple of
common targets that are known to have such MOVE_MAX.

Tested on x86_64-linux and powerpc64-linux -m32/-m64, ok for trunk?

2021-03-18  Jakub Jelinek  

PR testsuite/99626
* gcc.dg/strlenopt-73.c: Ifdef out test_copy_cond_unequal_length_i64
on targets other than x86, aarch64, s390 and 64-bit powerpc.  Use
test_copy_cond_unequal_length_i128 for __x86_64__ with int128 support
rather than __i386__.

--- gcc/testsuite/gcc.dg/strlenopt-73.c.jj  2020-01-12 11:54:37.518396737 +0100
+++ gcc/testsuite/gcc.dg/strlenopt-73.c 2021-03-18 15:03:56.313564224 +0100
@@ -69,6 +69,13 @@ void test_copy_cond_equal_length (void)
   T ( 0 ==, 33,  1, (i0 ? a32 : b32) + 32);
 }
 
+#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) \
+|| defined(__s390__) || defined(__powerpc64__)
+
+/* The following tests assume GCC transforms the memcpy calls into
+   long long assignments which it does only on targets that define
+   the MOVE_MAX macro to 8 or higher.  Enable on a set of targets
+   known to do that.  */
 
 const char a4[16] = "0123";
 const char b4[16] = "3210";
@@ -84,12 +91,14 @@ void test_copy_cond_unequal_length_i64 (
   T (0 <, 16, 8, i0 ? a4 + 2 : b4 + 3);
 }
 
+#endif
+
 
-#if __i386__ && __SIZEOF_INT128__ == 16
+#if defined(__x86_64__) && __SIZEOF_INT128__ == 16
 
 /* The following tests assume GCC transforms the memcpy calls into
int128_t assignments which it does only on targets that define
-   the MOVE_MAX macro to 16.  That's only s390 and i386 with
+   the MOVE_MAX macro to 16.  That's only s390 and x86_64 with
int128_t support.  */
 
 const char a8[32] = "01234567";

Jakub
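
The MOVE_MAX dependence described in this message can be illustrated outside the testsuite. The following is only a sketch, not code from strlenopt-73.c: the function names `copy8_call` and `copy8_load_store` are made up for illustration. It shows an 8-byte memcpy next to the equivalent single 64-bit load and store, which is roughly the form the strlen optimization sees once the call has been folded on targets whose MOVE_MAX is at least 8.

```c
#include <stdint.h>
#include <string.h>

/* Copy 8 bytes with a library call, as the test source is written.  */
void
copy8_call (char *dst, const char *src)
{
  memcpy (dst, src, 8);
}

/* The same copy expressed as one 64-bit load followed by one 64-bit
   store.  The accesses go through memcpy into a local so that this
   illustration has no alignment or aliasing problems of its own.  */
void
copy8_load_store (char *dst, const char *src)
{
  uint64_t tmp;
  memcpy (&tmp, src, sizeof tmp);   /* load */
  memcpy (dst, &tmp, sizeof tmp);   /* store */
}
```

Both functions leave the same bytes in the destination; the difference that matters for the test is only which of the two forms the compiler has in hand when the strlen pass runs.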

[PATCH] testsuite: Fix up strlenopt-80.c on powerpc [PR99636]

2021-03-18 Thread Jakub Jelinek via Gcc-patches

Hi!

This is a similar issue to the one in strlenopt-73.c: various spots in this test
rely on MOVE_MAX >= 8.  This time the test uses a target selector to pick a couple
of targets, and all of them except 32-bit powerpc satisfy that; 32-bit powerpc
has a MOVE_MAX of just 4.

Tested on x86_64-linux and powerpc64-linux -m32/-m64, ok for trunk?

2021-03-18  Jakub Jelinek  

PR testsuite/99636
* gcc.dg/strlenopt-80.c: For powerpc*-*-*, only enable for lp64.

--- gcc/testsuite/gcc.dg/strlenopt-80.c.jj  2020-01-12 11:54:37.535396481 +0100
+++ gcc/testsuite/gcc.dg/strlenopt-80.c 2021-03-18 14:58:08.605352191 +0100
@@ -3,7 +3,7 @@
The optimization is only implemented for MEM_REF stores and other
targets than those below may not transform the memcpy call into
such a store.
-   { dg-do compile { target aarch64*-*-* i?86-*-* powerpc*-*-* x86_64-*-* } }
+   { dg-do compile { target { { aarch64*-*-* i?86-*-* x86_64-*-* } || { { powerpc*-*-* } && lp64 } } } }
 
{ dg-options "-O2 -Wall -fdump-tree-optimized" } */
 

Jakub

Re: [PATCH] testsuite: Fix up strlenopt-80.c on powerpc [PR99636]

2021-03-18 Thread Jeff Law via Gcc-patches




On 3/18/2021 8:37 AM, Jakub Jelinek via Gcc-patches wrote:

Hi!

Similar issue as in strlenopt-73.c, various spots in this test rely
on MOVE_MAX >= 8, this time it uses a target selector to pick up a couple
of targets, and all of them but powerpc 32-bit satisfy it, but powerpc
32-bit have MOVE_MAX just 4.

Tested on x86_64-linux and powerpc64-linux -m32/-m64, ok for trunk?

2021-03-18  Jakub Jelinek  

PR testsuite/99636
* gcc.dg/strlenopt-80.c: For powerpc*-*-*, only enable for lp64.


OK.  But it'd sure be nice to be able to do something like force a value 
of MOVE_MAX using a --param to make this kind of hack unnecessary.




Jeff

Re: [PATCH] testsuite: Fix up strlenopt-73.c on powerpc [PR99626]

2021-03-18 Thread Jeff Law via Gcc-patches




On 3/18/2021 8:35 AM, Jakub Jelinek via Gcc-patches wrote:

Hi!

As mentioned in the testcase as well as in the PR, this testcase relies on
MOVE_MAX being sufficiently large that the memcpy call is folded early into
load + store.  Some popular targets define MOVE_MAX to 8 or even 16 (e.g.
x86_64 or some options on s390x), but many other targets define it to just 4
(e.g. powerpc 32-bit), or even 2.

The testcase has already one test routine guarded on one particular target
with MOVE_MAX 16 (but does it incorrectly, __i386__ is only defined on
32-bit x86 and __SIZEOF_INT128__ is only defined on 64-bit targets), this
patch fixes that, and guards another test that relies on memcpy (, , 8)
being folded that way (which therefore needs MOVE_MAX >= 8) on a couple of
common targets that are known to have such MOVE_MAX.

Tested on x86_64-linux and powerpc64-linux -m32/-m64, ok for trunk?

2021-03-18  Jakub Jelinek  

PR testsuite/99626
* gcc.dg/strlenopt-73.c: Ifdef out test_copy_cond_unequal_length_i64
on targets other than x86, aarch64, s390 and 64-bit powerpc.  Use
test_copy_cond_unequal_length_i128 for __x86_64__ with int128 support
rather than __i386__.


OK with same comment as other patch.


Jeff

Re: [PATCH] testsuite: Fix up strlenopt-80.c on powerpc [PR99636]

2021-03-18 Thread Jakub Jelinek via Gcc-patches

On Thu, Mar 18, 2021 at 08:58:20AM -0600, Jeff Law via Gcc-patches wrote:
> 
> On 3/18/2021 8:37 AM, Jakub Jelinek via Gcc-patches wrote:
> > Hi!
> > 
> > Similar issue as in strlenopt-73.c, various spots in this test rely
> > on MOVE_MAX >= 8, this time it uses a target selector to pick up a couple
> > of targets, and all of them but powerpc 32-bit satisfy it, but powerpc
> > 32-bit have MOVE_MAX just 4.
> > 
> > Tested on x86_64-linux and powerpc64-linux -m32/-m64, ok for trunk?
> > 
> > 2021-03-18  Jakub Jelinek  
> > 
> > PR testsuite/99636
> > * gcc.dg/strlenopt-80.c: For powerpc*-*-*, only enable for lp64.
> 
> OK.  But it'd sure be nice to be able to do something like force a value of
> MOVE_MAX using a --param to make this kind of hack unnecessary.

I fear such a param would be quite dangerous, dunno what would happen if
somebody chose a length that can't be backed up by some integral or SIMD
type.  Maybe for the gimple-fold.c case
  tree type = lang_hooks.types.type_for_size (ilen * 8, 1);
  if (type
  && is_a <scalar_int_mode> (TYPE_MODE (type), &mode)
  && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
would fail (so we couldn't handle that way the 16 byte case anyway on all
targets), but there are other parts of the compiler that use MOVE_MAX.

I think maybe better would be to instead improve the optimization so that
it would work even with the non-lowered memcpy calls.  But that would be
a GCC12 thing probably.

Jakub
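
The concern about lengths that no integer type can back can be modelled in plain C. This is a sketch under assumptions, not GCC internals: `has_exact_width_int` is a hypothetical helper that checks the host compiler's own integer types, whereas the condition quoted above queries the target's types and modes inside the compiler.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if a copy of LEN bytes could be
   expressed as a single load/store of some integer type available to
   the compiler building this file.  */
bool
has_exact_width_int (size_t len)
{
  if (len == sizeof (unsigned char)
      || len == sizeof (unsigned short)
      || len == sizeof (unsigned int)
      || len == sizeof (unsigned long)
      || len == sizeof (unsigned long long))
    return true;
#ifdef __SIZEOF_INT128__
  /* A 16-byte integer exists only where the compiler provides it.  */
  if (len == __SIZEOF_INT128__)
    return true;
#endif
  return false;
}
```

A length such as 3 or 5 is rejected by this model, which mirrors why an arbitrary user-chosen MOVE_MAX would not by itself make every memcpy length foldable.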

Re: [PATCH] testsuite: Fix up strlenopt-80.c on powerpc [PR99636]

2021-03-18 Thread Richard Biener via Gcc-patches

On Thu, Mar 18, 2021 at 4:09 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Thu, Mar 18, 2021 at 08:58:20AM -0600, Jeff Law via Gcc-patches wrote:
> >
> > On 3/18/2021 8:37 AM, Jakub Jelinek via Gcc-patches wrote:
> > > Hi!
> > >
> > > Similar issue as in strlenopt-73.c, various spots in this test rely
> > > on MOVE_MAX >= 8, this time it uses a target selector to pick up a couple
> > > of targets, and all of them but powerpc 32-bit satisfy it, but powerpc
> > > 32-bit have MOVE_MAX just 4.
> > >
> > > Tested on x86_64-linux and powerpc64-linux -m32/-m64, ok for trunk?
> > >
> > > 2021-03-18  Jakub Jelinek  
> > >
> > > PR testsuite/99636
> > > * gcc.dg/strlenopt-80.c: For powerpc*-*-*, only enable for lp64.
> >
> > OK.  But it'd sure be nice to be able to do something like force a value of
> > MOVE_MAX using a --param to make this kind of hack unnecessary.
>
> I fear such a param would be quite dangerous, dunno what would happen if
> somebody chose a length that can't be backed up by some integral or SIMD
> type.  Maybe for the gimple-fold.c case
>   tree type = lang_hooks.types.type_for_size (ilen * 8, 1);
>   if (type
>   && is_a <scalar_int_mode> (TYPE_MODE (type), &mode)
>   && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
> would fail (so we couldn't handle that way the 16 byte case anyway on all
> targets), but there are other parts of the compiler that use MOVE_MAX.
>
> I think maybe better would be to instead improve the optimization so that
> it would work even with the non-lowered memcpy calls.  But that would be
> a GCC12 thing probably.

And/or relax the conditions under which we do the transform.

Richard.

>
> Jakub
>

Re: [PATCH 1/2, rs6000] Add const_anchor for rs6000 [PR33699]

2021-03-18 Thread David Edelsohn via Gcc-patches

Hao,

Segher and I do not doubt that the patch can improve the examples and
testcases.  The question is if those examples are representative of
common situations and if the patch truly improves performance overall
-- for real workloads.  Can you test the performance impact of your
patch, not only demonstrating the change in code generation?

Also, I understand about the range of constants that you wish to address, but

TARGET_MIN_ANCHOR_OFFSET (targetm.min_anchor_offset)

and

TARGET_MAX_ANCHOR_OFFSET (targetm.max_anchor_offset)

are parameters for completely different features in GCC.  I realize
that some GCC ports define the values in close proximity in the source
files, but your patch to define TARGET_ANCHOR_CONST should not define
or change the other macros.  The patch only should define
TARGET_ANCHOR_CONST.

And I believe that it should be possible to define TARGET_ANCHOR_CONST
in rs6000.c

#define TARGET_ANCHOR_CONST 0x8000

instead of using targetm.anchor_const, which would be more consistent
with the style for definitions of other values in the rs6000 port.

Thanks, David
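
For readers unfamiliar with constant anchors, the idea can be sketched in a few lines of C. This is a simplified model under assumptions, not the code in cse.c: `ANCHOR_CONST`, `lower_anchor`, `anchor_offset` and `same_anchor` are illustrative names, and the value 0x8000 is simply the one suggested in the message above. A constant is split into a base rounded down to a multiple of the anchor value plus a small offset; a second constant with the same base can then be formed from the first one's base register with a single add-immediate.

```c
#include <stdint.h>

/* Anchor granularity; 0x8000 is the value suggested in the mail above.  */
#define ANCHOR_CONST 0x8000

/* Round N down to a multiple of ANCHOR_CONST.  */
int64_t
lower_anchor (int64_t n)
{
  return n & ~((int64_t) ANCHOR_CONST - 1);
}

/* Distance of N from its anchor; always in [0, ANCHOR_CONST) for
   non-negative N.  */
int64_t
anchor_offset (int64_t n)
{
  return n - lower_anchor (n);
}

/* Two constants can share one base register in this model when they
   round down to the same anchor.  */
int
same_anchor (int64_t a, int64_t b)
{
  return lower_anchor (a) == lower_anchor (b);
}
```

Whether sharing anchors this way is a net win on real workloads is exactly the performance question raised in this message; the sketch only shows the arithmetic.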


On Wed, Mar 17, 2021 at 9:21 PM HAO CHEN GUI  wrote:
>
> David & Segher,
>
> Thanks so much for your explanation. My patch aims to enable the
> constant anchor on rs6000, as TARGET_ANCHOR_CONST or targetm.anchor_const
> is currently undefined. I realized that we have addi and addis instructions, so
> the range of the offset could be a 32-bit constant.
>
> I put a test case at
> https://github.ibm.com/wschmidt/power-gcc/issues/1042#issuecomment-28922825.
> It shows how anchor_const can improve asm output. With anchor_const, the
> second complex constant loading can be eliminated by cse if it is within
> the range of the first one.
>
>Thanks again and looking forward to your advice.
>
> On 18/3/2021 8:57 AM, David Edelsohn wrote:
> > On Wed, Mar 17, 2021 at 8:26 PM Segher Boessenkool
> >  wrote:
> >> Hi!
> >>
> >> On Wed, Mar 17, 2021 at 03:35:30PM -0400, David Edelsohn wrote:
> >>> I disagree with your new definitions and I disagree with the manner in
> >>> which you are trying to change the values.
> >> Yes.
> >>
> >>> Your patch is NOT okay without a lot more explanation and justification.
> >> Which is why I said:
> >>
> > 1) This isn't suitable for stage 4.
> >> You give a lot more reasons to not want it, but that was enough for me.
> >>
> > 2) Please add a test case, which shows what it does, that it is useful.
> >> I meant there is no way we can accept this patch if we aren't shown what
> >> it does, and that that is a good thing.
> >>
> > 3) Does this work on other OSes than Linux?  What about Darwin and AIX?
> >> And here I meant that there is no way we can accept patches that
> >> influence code generation on all platforms when we have no idea what it
> >> does on most platforms.  I did not intend to suggest the patch would be
> >> more acceptable if it was tested on other platforms; I wanted to say it
> >> is not acceptable if it is not.
> >>
> >> The main issue is 2).  We need to understand what problem this patch is
> >> trying to solve.  I'm sure Hao Chen had a reason for doing this patch,
> >> so I'd like to know what it is trying to achieve, what it is trying to
> >> improve!
> > Investigating this with Segher, I believe that there is some confusion
> > about the "ANCHOR" macros.
> >
> > TARGET_MIN_ANCHOR_OFFSET and TARGET_MAX_ANCHOR_OFFSET are not related
> > to TARGET_ANCHOR_CONST.
> >
> > Also, TARGET_ANCHOR_CONST can be defined as a macro to trigger the
> > hook, and doesn't need targetm.anchor_const.
> >
> > Any change to TARGET_ANCHOR_CONST requires extensive performance
> > testing.  Yes, it presumably fixes the testcase, but the impact on
> > overall performance is the critical question.
> >
> > Thanks, David

Re: [PATCH] RFC: come up with startswith function.

2021-03-18 Thread Martin Sebor via Gcc-patches


On 3/18/21 4:46 AM, Martin Liška wrote:

Hey.

Recently, I noticed a cumbersome construct we use for string startswith function
(most notably in a situation when the prefix is a string literal).

Commonly used patterns are:
1) strncmp (arg, "--sysroot=", 10) == 0
2) strncmp (name, "not found", sizeof ("not found") - 1) == 0
3) strncmp (varname, "__builtin_", strlen ("__builtin_")) == 0
4) #define STR "-foffload-abi="
    if (strncmp (argv[i], STR, strlen (STR)) == 0)

I see all these quite error prone for the following reasons:
1) one needs to correctly calculate string length (in their head)
2) sizeof ("foo") - 1 == strlen ("foo")


Right.  They could be alleviated by either developing a new warning
or extending -Wstring-compare to complain when the length of a string
literal argument isn't the same as the bound in these cases.


3) one needs to undefine a temporary macro

Moreover, there are helper functions that already do the same:

gcc/ada/adadecode.c:
static int
has_prefix (const char *name, const char *prefix)
{
   return strncmp (name, prefix, strlen (prefix)) == 0;
}

gcc/fortran/gfortran.h:
#define gfc_str_startswith(str, pref) \
    (strncmp ((str), (pref), strlen (pref)) == 0)

That said, I'm suggesting a new function 'startswith' in system.h.
I prepared a patch that utilizes the function in gcc/ subfolder
(excluding all target code for now). I can prepare similar mechanical
patch for the rest of the compiler (and run-time libraries).

Thoughts?
Thanks,
Martin


I like it.

...

diff --git a/gcc/system.h b/gcc/system.h
index a3f5948aaee..3e384616d3a 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -1291,4 +1291,12 @@ void gcc_stablesort (void *, size_t, size_t,
  #define NULL nullptr
  #endif

+/* Return 1 if STR string starts with PREFIX.  */
+
+static inline int
+startswith (const char *str, const char *prefix)
+{
+  return strncmp (str, prefix, strlen (prefix)) == 0;
+}


The return type of the function should be bool rather than int.

Martin
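
A minimal sketch of the helper with the suggested bool return type follows. It is written as an ordinary function so it can be compiled on its own; in the proposed patch it is a static inline function in system.h. The use of <stdbool.h> is an assumption made for this C illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if the string STR starts with PREFIX.  An empty PREFIX
   matches every string.  */
bool
startswith (const char *str, const char *prefix)
{
  return strncmp (str, prefix, strlen (prefix)) == 0;
}
```

With this helper, pattern 1) from the list above becomes startswith (arg, "--sysroot="), and nobody has to count the ten characters of the literal by hand.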

Re: [PATCH] testsuite: Fix up strlenopt-80.c on powerpc [PR99636]

2021-03-18 Thread Jeff Law via Gcc-patches




On 3/18/2021 9:08 AM, Jakub Jelinek wrote:

On Thu, Mar 18, 2021 at 08:58:20AM -0600, Jeff Law via Gcc-patches wrote:

On 3/18/2021 8:37 AM, Jakub Jelinek via Gcc-patches wrote:

Hi!

Similar issue as in strlenopt-73.c, various spots in this test rely
on MOVE_MAX >= 8, this time it uses a target selector to pick up a couple
of targets, and all of them but powerpc 32-bit satisfy it, but powerpc
32-bit have MOVE_MAX just 4.

Tested on x86_64-linux and powerpc64-linux -m32/-m64, ok for trunk?

2021-03-18  Jakub Jelinek  

PR testsuite/99636
* gcc.dg/strlenopt-80.c: For powerpc*-*-*, only enable for lp64.

OK.  But it'd sure be nice to be able to do something like force a value of
MOVE_MAX using a --param to make this kind of hack unnecessary.

I fear such a param would be quite dangerous, dunno what would happen if
somebody chose a length that can't be backed up by some integral or SIMD
type.  Maybe for the gimple-fold.c case
   tree type = lang_hooks.types.type_for_size (ilen * 8, 1);
   if (type
   && is_a <scalar_int_mode> (TYPE_MODE (type), &mode)
   && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
would fail (so we couldn't handle that way the 16 byte case anyway on all
targets), but there are other parts of the compiler that use MOVE_MAX.

I think maybe better would be to instead improve the optimization so that
it would work even with the non-lowered memcpy calls.  But that would be
a GCC12 thing probably.


In my mind it'd only be for the testsuite and would be documented as 
such.  I wouldn't want users twiddling it.  We could do the same with 
BRANCH_COST to simplify the insanity we have with target selectors in 
various tests related to that.  There's probably other cases where 
testing-only params would be helpful.



Totally agree that improving the optimization & analysis to work with 
both cases is also good and that it wouldn't be appropriate for gcc-11.



Jeff

[PATCH][AArch64] Leveraging the use of STP instruction for vec_duplicate

2021-03-18 Thread Victor Do Nascimento via Gcc-patches

The backend pattern for storing a pair of identical values in 32 and 64-bit 
modes with the machine instruction STP was missing, and multiple instructions 
were needed to reproduce this behavior as a result of failed RTL pattern match 
in combine pass.

For the test case :

typedef long long v2di __attribute__((vector_size (16)));
typedef int v2si __attribute__((vector_size (8)));

void
foo (v2di *x, long long a)
{
v2di tmp = {a, a};
*x = tmp;
}

void
foo2 (v2si *x, int a)
{
v2si tmp = {a, a};
*x = tmp;
}

at -O2 on aarch64 gives:

foo:
stp x1, x1, [x0]
ret
foo2:
stp w1, w1, [x0]
ret

instead of:

foo:
dup v0.2d, x1
str q0, [x0]
ret
foo2:
dup v0.2s, w1
str d0, [x0]
ret

In preparation for the next stage 1 phase of development, I added a new RTL 
template and a unit test, and checked for regressions on bootstrapped 
aarch64-none-linux-gnu.

gcc/ChangeLog

2021-02-04 Victor Do Nascimento 

* config/aarch64/aarch64-simd.md: Implement RTX pattern for
mapping 'vec_duplicate' RTX onto 'STP' ASM insn.
* config/aarch64/iterators.md: Implement ldpstp_vel_sz iterator
to map STP/LDP vector element mode to correct suffix in
attribute type definition of aarch64_simd_stp pattern.

gcc/testsuite/ChangeLog

2021-02-04 Victor Do Nascimento 

* gcc.target/aarch64/stp_vec_dup_32_64-1.c: Added test.

Regards,
Victor

---
 gcc/config/aarch64/aarch64-simd.md| 10 +
 gcc/config/aarch64/iterators.md   |  3 +++
 .../gcc.target/aarch64/stp_vec_dup_32_64-1.c  | 22 +++
 3 files changed, 35 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 71aa77dd010..3d53bab0018 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -205,6 +205,16 @@
   [(set_attr "type" "neon_stp")]
 )
 
+(define_insn "aarch64_simd_stp"
+  [(set (match_operand:VP_2E 0 "aarch64_mem_pair_operand" "=Ump,Ump")
+   (vec_duplicate:VP_2E (match_operand: 1 "register_operand" "w,r")))]
+  "TARGET_SIMD"
+  "@
+   stp\\t%1, %1, %z0
+   stp\\t%1, %1, %z0"
+  [(set_attr "type" "neon_stp, store_")]
+)
+
 (define_insn "load_pair"
   [(set (match_operand:VQ 0 "register_operand" "=w")
(match_operand:VQ 1 "aarch64_mem_pair_operand" "Ump"))
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index fb6e228651e..196055d31e5 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -898,6 +898,9 @@
 ;; Likewise for load/store pair.
 (define_mode_attr ldpstp_sz [(SI "8") (DI "16")])
 
+;; Size of element access for STP/LDP-generated vectors.
+(define_mode_attr ldpstp_vel_sz [(V2SI "8") (V2SF "8") (V2DI "16") (V2DF "16")])
+
 ;; For inequal width int to float conversion
 (define_mode_attr w1 [(HF "w") (SF "w") (DF "x")])
 (define_mode_attr w2 [(HF "x") (SF "x") (DF "w")])
diff --git a/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c b/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c
new file mode 100644
index 000..a37c903dfd4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c
@@ -0,1 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (8)));
+
+void
+foo (v2di *x, long long a)
+{
+  v2di tmp = {a, a};
+  *x = tmp;
+}
+
+void
+foo2 (v2si *x, int a)
+{
+  v2si tmp = {a, a};
+  *x = tmp;
+}
+
+/* { dg-final { scan-assembler-times "stp\t" 2 } } */
+/* { dg-final { scan-assembler-not "dup\t" } } */
-- 
2.17.1

@@ -1,0 +23,0 @@

Re: [PATCH] testsuite: Fix up strlenopt-80.c on powerpc [PR99636]

2021-03-18 Thread Jakub Jelinek via Gcc-patches

On Thu, Mar 18, 2021 at 09:31:03AM -0600, Jeff Law wrote:
> > > OK.  But it'd sure be nice to be able to do something like force a value of
> > > MOVE_MAX using a --param to make this kind of hack unnecessary.
> > I fear such a param would be quite dangerous, dunno what would happen if
> > somebody chose a length that can't be backed up by some integral or SIMD
> > type.  Maybe for the gimple-fold.c case
> >tree type = lang_hooks.types.type_for_size (ilen * 8, 1);
> >if (type
> >&& is_a <scalar_int_mode> (TYPE_MODE (type), &mode)
> >&& GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
> > would fail (so we couldn't handle that way the 16 byte case anyway on all
> > targets), but there are other parts of the compiler that use MOVE_MAX.
> > 
> > I think maybe better would be to instead improve the optimization so that
> > it would work even with the non-lowered memcpy calls.  But that would be
> > a GCC12 thing probably.
> 
> In my mind it'd only be for the testsuite and would be documented as such. 
> I wouldn't want users twiddling it.  We could do the same with BRANCH_COST

For BRANCH_COST we already have -mbranch-cost= option on a couple of targets
and then --param=logical-op-non-short-circuit={0,1} to override the
gimplification behavior.

Jakub

Re: [PATCH] testsuite: Fix up strlenopt-80.c on powerpc [PR99636]

2021-03-18 Thread Martin Sebor via Gcc-patches


On 3/18/21 8:58 AM, Jeff Law via Gcc-patches wrote:


On 3/18/2021 8:37 AM, Jakub Jelinek via Gcc-patches wrote:

Hi!

Similar issue as in strlenopt-73.c, various spots in this test rely
on MOVE_MAX >= 8, this time it uses a target selector to pick up a couple
of targets, and all of them but powerpc 32-bit satisfy it, but powerpc
32-bit have MOVE_MAX just 4.

Tested on x86_64-linux and powerpc64-linux -m32/-m64, ok for trunk?

2021-03-18  Jakub Jelinek  

PR testsuite/99636
* gcc.dg/strlenopt-80.c: For powerpc*-*-*, only enable for lp64.


OK.  But it'd sure be nice to be able to do something like force a value 
of MOVE_MAX using a --param to make this kind of hack unnecessary.


Or have GCC define a __MOVE_MAX__ macro that could then be used in
the test suite to guard the test cases.  (This could be done only
conditionally, in response to some internal command line option.)

Martin
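
To make the suggestion concrete, here is how a test might guard itself if such a macro existed. This is purely a sketch: `__MOVE_MAX__` is the hypothetical macro proposed in this message and is not, as far as this thread shows, defined by GCC; `HAVE_8BYTE_MOVE` and `have_8byte_move` are illustrative names.

```c
/* __MOVE_MAX__ is hypothetical: it is the macro proposed above, not an
   existing predefined macro.  When it is absent, assume the folding
   does not happen and leave the dependent tests disabled.  */
#if defined (__MOVE_MAX__) && __MOVE_MAX__ >= 8
# define HAVE_8BYTE_MOVE 1
#else
# define HAVE_8BYTE_MOVE 0
#endif

/* Report whether tests that need 8-byte memcpy folding are enabled.  */
int
have_8byte_move (void)
{
  return HAVE_8BYTE_MOVE;
}
```

Compared with listing target macros or target selectors by hand, a guard of this shape would track the property the test actually depends on.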

Re: [PATCH 2/2] Bypass BLKmode before try_const_anchors

2021-03-18 Thread Jeff Law via Gcc-patches




On 3/14/2021 9:16 PM, HAO CHEN GUI via Gcc-patches wrote:

Hi,

    This patch fixes an ICE found by enabling const_anchor for rs6000. 
The BLKmode constant rtx is sent to try_const_anchors which causes 
assertion failure in try_const_anchors.


    The attachment are the patch diff and change log file.

    Bootstrapped and tested on powerpc64le with no regressions. Is 
this okay for trunk? Any  recommendations? Thanks a lot.



ChangeLog-2

* cse.c (cse_insn): Add a BLKmode check for const_anchor.


Just so I'm sure I understand what's going on here.  This is something 
you need for patch 1/2 which enables anchors  on the PPC port.  It's 
related to PR33699, which is a regression.  Patch #1 has been rejected 
for stage4 and has bigger questions/issues/objections that need to be 
addressed, right?



With that in mind, I think this should defer until patch #1 in this 
series is basically acceptable to PPC maintainers.



I would ask that you refer to PR33699 in the patch so that we can more 
easily see the linkage to the BZ.



Thanks,

Jeff


It sounds like



patch-2.diff

diff --git a/gcc/cse.c b/gcc/cse.c
index 37c6959abea..223fe8c714d 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -5026,7 +5026,8 @@ cse_insn (rtx_insn *insn)
if (targetm.const_anchor
  && !src_related
  && src_const
- && GET_CODE (src_const) == CONST_INT)
+ && GET_CODE (src_const) == CONST_INT
+ && mode != BLKmode)
{
  src_related = try_const_anchors (src_const, mode);
  src_related_is_const_anchor = src_related != NULL_RTX;

Re: Ping^2: [PATCH v2] rs6000: Convert the vector element register to SImode [PR98914]

2021-03-18 Thread Jakub Jelinek via Gcc-patches

On Thu, Mar 18, 2021 at 09:27:17AM +0800, Xionghu Luo via Gcc-patches wrote:
> gcc/ChangeLog:
> 
> 2021-03-18  Xionghu Luo  
> 
>   PR target/98914
>   * config/rs6000/rs6000.c (rs6000_expand_vector_set_var_p9):
>   Convert idx to DImode.
>   (rs6000_expand_vector_set_var_p8): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-03-18  Xionghu Luo  
> 
>   PR target/98914
>   * gcc.target/powerpc/pr98914.c: New test.

LGTM.  But please give Segher some time to chime in if he disagrees.

Jakub

znver3 tuning part 3

2021-03-18 Thread Jan Hubicka

Hi,
this patch updates costs of integer divides to match actual latencies (the
scheduler model already does the right thing).  It is essentially a no-op, since
we end up expanding idiv for all sensible constants, so it may only end
up disabling vectorization in some cases, but I did not find any such examples.
However, in general it is better to have actual latencies than random numbers.

Bootstrapped/regtested x86_64-linux, committed.

Honza

gcc/ChangeLog:

2021-03-18  Jan Hubicka  

* config/i386/x86-tune-costs.h (struct processor_costs): Fix costs of
integer divides.

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index db03738313e..58b3b81985b 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1741,13 +1741,11 @@ struct processor_costs znver3_cost = {
COSTS_N_INSNS (3)}, /*  other.  */
   0,   /* cost of multiply per each bit
   set.  */
-   /* Depending on parameters, idiv can get faster on ryzen.  This is upper
-  bound.  */
-  {COSTS_N_INSNS (16), /* cost of a divide/mod for QI.  */
-   COSTS_N_INSNS (22), /*  HI.  */
-   COSTS_N_INSNS (30), /*  SI.  */
-   COSTS_N_INSNS (45), /*  DI.  */
-   COSTS_N_INSNS (45)},/*  other.  */
+  {COSTS_N_INSNS (9),  /* cost of a divide/mod for QI.  */
+   COSTS_N_INSNS (10), /*  HI.  */
+   COSTS_N_INSNS (12), /*  SI.  */
+   COSTS_N_INSNS (17), /*  DI.  */
+   COSTS_N_INSNS (17)},/*  other.  */
   COSTS_N_INSNS (1),   /* cost of movsx.  */
   COSTS_N_INSNS (1),   /* cost of movzx.  */
   8,   /* "large" insn.  */

Re: [PATCH, rs6000 V2] Update "prefix" attribute for Power10 [PR99133]

2021-03-18 Thread will schmidt via Gcc-patches

On Wed, 2021-03-17 at 15:49 -0500, Pat Haugen via Gcc-patches wrote:
> Update prefixed attribute for Power10.
> 
> This patch creates a new attribute, prepend_prefixed_insn, which is
> used to mark
> those instructions that are prefixed and need to have a 'p' prepended
> to their
> mnemonic at asm emit time. The existing "prefix" attribute is now
> used to mark
> all instructions that are prefixed form.
> 
> Bootstrap/regtest on powerpc64le (Power10) and powerpc64 (Power8
> 32/64) with no
> new regressions. Ok for trunk?
> 
> -Pat
> 
> 
> 2021-03-17  Pat Haugen  
> 
> gcc/
>   PR target/99133
>   * config/rs6000/altivec.md (xxspltiw_v4si, xxspltiw_v4sf_inst,
>   xxspltidp_v2df_inst, xxsplti32dx_v4si_inst,
> xxsplti32dx_v4sf_inst,
>   xxblend_, xxpermx_inst, xxeval): Mark prefixed.
>   * config/rs6000/mma.md (mma_, mma_,
>   mma_, mma_, mma_, mma_,
>   mma_, mma_, mma_, mma_):
>   Likewise.
>   * config/rs6000/pcrel-opt.md: Adjust attribute name.
>   * config/rs6000/rs6000.c (rs6000_final_prescan_insn): Adjust
> test. 
>   * config/rs6000/rs6000.md (define_attr
> "prepend_prefixed_insn"): New.
>   (define_attr "prefixed"): Update initializer.
>   (*tls_gd_pcrel, *tls_ld_pcrel, tls_dtprel_,
>   tls_tprel_, *tls_got_tprel_pcrel_,
> *pcrel_local_addr,
>   *pcrel_extern_addr, stack_protect_setdi, stack_protect_testdi):
>   Adjust attribute name.
>   * config/rs6000/sync.md (load_quadpti, store_quadpti):
> Likewise.
> 
> 


Changelog matches patch contents.  (ok!) :-)

Per this change:

+;; Whether an insn is a prefixed insn.  A prefixed instruction has a prefix
+;; instruction word that conveys additional information such as a larger
+;; immediate, additional operands, etc., in addition to the normal instruction
+;; word.  The default "length" attribute will also be adjusted by default to
+;; be 12 bytes.
+(define_attr "prefixed" "no,yes"
+  (if_then_else (eq_attr "prepend_prefixed_insn" "yes")
+   (const_string "yes")
+   (const_string "no")))


.. it looks like at least most of the users of the "prefixed" attribute have
been switched over to use "prepend_prefixed_insn" instead.   Are there still
users of the "prefixed" attribute remaining ?  I'm guessing so, given context,
but can't tell for certain.

(Just a question, not a specific request for a change)

lgtm

thanks
-Will

Re: [PATCH] PR target/99314: Fix integer signedness issue for cpymem pattern expansion.

2021-03-18 Thread Kito Cheng via Gcc-patches

No feedback for 2 weeks, and we have already verified this with our
internal CI system for a while, so I went ahead and committed it to trunk.
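
The signedness problem described in the quoted commit message can be shown with a small standalone sketch. It is not the riscv.c code: HOST_WIDE_INT is modelled here by int64_t/uint64_t, the function names `fits_signed` and `fits_unsigned` are hypothetical, and the conversion of an out-of-range value to a signed type is implementation-defined in C (GCC wraps it modulo 2^64, which is the behaviour assumed here).

```c
#include <stdint.h>

/* Buggy shape: the byte count travels in a signed type, so a huge
   unsigned length shows up as a negative number and passes the
   "small enough to expand inline" check.  */
int
fits_signed (int64_t length, int64_t limit)
{
  return length <= limit;
}

/* Fixed shape: the byte count keeps its unsigned type, so a huge
   length is correctly seen as larger than the limit.  */
int
fits_unsigned (uint64_t length, uint64_t limit)
{
  return length <= limit;
}
```

For example, a length of 0x8000000000000000 converted to int64_t is negative under the assumption above, so fits_signed accepts it against a limit of 64 while fits_unsigned rejects it.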

On Fri, Mar 5, 2021 at 12:48 PM Kito Cheng  wrote:
>
> From: Sinan Lin 
>
> The third operand of the cpymem pattern is an unsigned HOST_WIDE_INT, but we
> were interpreting it as a signed HOST_WIDE_INT.  That is not a problem in
> most cases, but when the value is larger than the maximum signed HOST_WIDE_INT,
> things can go wrong, since we use that value to calculate the buffer
> size.
>
> 2021-03-05  Sinan Lin  
> Kito Cheng  
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.c (riscv_block_move_straight): Change type
> to unsigned HOST_WIDE_INT for parameter and local variable with
> HOST_WIDE_INT type.
> (riscv_adjust_block_mem): Ditto.
> (riscv_block_move_loop): Ditto.
> (riscv_expand_block_move): Ditto.
> ---
>  gcc/config/riscv/riscv.c | 24 +---
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> index fffd0814eee..96fc0c0a4a0 100644
> --- a/gcc/config/riscv/riscv.c
> +++ b/gcc/config/riscv/riscv.c
> @@ -3146,9 +3146,9 @@ riscv_legitimize_call_address (rtx addr)
> Assume that the areas do not overlap.  */
>
>  static void
> -riscv_block_move_straight (rtx dest, rtx src, HOST_WIDE_INT length)
> +riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length)
>  {
> -  HOST_WIDE_INT offset, delta;
> +  unsigned HOST_WIDE_INT offset, delta;
>unsigned HOST_WIDE_INT bits;
>int i;
>enum machine_mode mode;
> @@ -3194,8 +3194,8 @@ riscv_block_move_straight (rtx dest, rtx src, HOST_WIDE_INT length)
> register.  Store them in *LOOP_REG and *LOOP_MEM respectively.  */
>
>  static void
> -riscv_adjust_block_mem (rtx mem, HOST_WIDE_INT length,
> -  rtx *loop_reg, rtx *loop_mem)
> +riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT length,
> +   rtx *loop_reg, rtx *loop_mem)
>  {
>*loop_reg = copy_addr_to_reg (XEXP (mem, 0));
>
> @@ -3210,11 +3210,11 @@ riscv_adjust_block_mem (rtx mem, HOST_WIDE_INT length,
> the memory regions do not overlap.  */
>
>  static void
> -riscv_block_move_loop (rtx dest, rtx src, HOST_WIDE_INT length,
> - HOST_WIDE_INT bytes_per_iter)
> +riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
> +  unsigned HOST_WIDE_INT bytes_per_iter)
>  {
>rtx label, src_reg, dest_reg, final_src, test;
> -  HOST_WIDE_INT leftover;
> +  unsigned HOST_WIDE_INT leftover;
>
>leftover = length % bytes_per_iter;
>length -= leftover;
> @@ -3259,18 +3259,19 @@ riscv_block_move_loop (rtx dest, rtx src, HOST_WIDE_INT length,
>  bool
>  riscv_expand_block_move (rtx dest, rtx src, rtx length)
>  {
> +  unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
>if (CONST_INT_P (length))
>  {
> -  HOST_WIDE_INT factor, align;
> +  unsigned HOST_WIDE_INT factor, align;
>
>align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
>factor = BITS_PER_WORD / align;
>
>if (optimize_function_for_size_p (cfun)
> - && INTVAL (length) * factor * UNITS_PER_WORD > MOVE_RATIO (false))
> + && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false))
> return false;
>
> -  if (INTVAL (length) <= RISCV_MAX_MOVE_BYTES_STRAIGHT / factor)
> +  if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
> {
>   riscv_block_move_straight (dest, src, INTVAL (length));
>   return true;
> @@ -3280,7 +3281,8 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length)
>   unsigned min_iter_words
> = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD;
>   unsigned iter_words = min_iter_words;
> - HOST_WIDE_INT bytes = INTVAL (length), words = bytes / UNITS_PER_WORD;
> + unsigned HOST_WIDE_INT bytes = hwi_length;
> + unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD;
>
>   /* Lengthen the loop body if it shortens the tail.  */
>   for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++)
> --
> 2.30.0
>

Re: [PATCH 1/2, rs6000] Add const_anchor for rs6000 [PR33699]

2021-03-18 Thread will schmidt via Gcc-patches

On Thu, 2021-03-18 at 09:21 +0800, HAO CHEN GUI wrote:
> David & Segher,
> 
> Thanks so much for your explanation. My patch aims to enable the 
> constant anchor on rs6000, as TARGET_ANCHOR_CONST or targetm.anchor_const 
> is currently undefined. I realized that we have the addi and addis
> instructions, so the range of the offset could be a 32-bit constant.
> 
> I put a test case at 
> https://github.ibm.com/wschmidt/power-gcc/issues/1042#issuecomment-28922825. 
> It shows how anchor_const can improve asm output. With anchor_const, the 
> load of the second complex constant can be eliminated by CSE if it is within 
> range of the first one.

I think about 99.9% of the community won't be able to reach that link. 
If progress on this issue requires additional eyes on the testcase you
may need to provide the test case here.
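
For readers without access to that link, the general idea in the quoted text
can be sketched as follows. This is not the test case behind the link, and
it is not GCC code: split_on_anchor, shares_anchor and the anchor size used
in the checks are assumptions chosen purely for illustration. A constant is
split into an aligned base plus a small offset, and two constants that land
on the same base can reuse one materialized base value.

```cpp
#include <cassert>
#include <cstdint>

// Result of splitting a constant into an aligned base and a small offset.
struct AnchorSplit {
  std::int64_t base;
  std::int64_t offset;
};

// Split `value` relative to `anchor_size`, which must be a power of two.
AnchorSplit split_on_anchor(std::int64_t value, std::int64_t anchor_size) {
  const std::int64_t base = value & ~(anchor_size - 1);
  return AnchorSplit{base, value - base};
}

// Two constants can share one anchor if they have the same aligned base.
bool shares_anchor(std::int64_t a, std::int64_t b, std::int64_t anchor_size) {
  return split_on_anchor(a, anchor_size).base
         == split_on_anchor(b, anchor_size).base;
}
```

With a hypothetical anchor size of 0x8000, the constants 0x12345678 and
0x12345690 share the base 0x12340000, so only the first needs a full load;
0x1234D678 falls on a different base and does not benefit.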

Thanks
-Will


> 
>Thanks again and looking forward to your advice.
> 
> > On 18/3/2021 8:57 AM, David Edelsohn wrote:
> > On Wed, Mar 17, 2021 at 8:26 PM Segher Boessenkool
> >  wrote:
> > > Hi!
> > > 
> > > On Wed, Mar 17, 2021 at 03:35:30PM -0400, David Edelsohn wrote:
> > > > I disagree with your new definitions and I disagree with the manner in
> > > > which you are trying to change the values.
> > > 
> > > Yes.
> > > 
> > > > Your patch is NOT okay without a lot more explanation and justification.
> > > 
> > > Which is why I said:
> > > 
> > > > > > 1) This isn't suitable for stage 4.
> > > 
> > > You give a lot more reasons to not want it, but that was enough for me.
> > > 
> > > > > > 2) Please add a test case, which shows what it does, that it is 
> > > > > > useful.
> > > 
> > > I meant there is no way we can accept this patch if we aren't shown what
> > > it does, and that that is a good thing.
> > > 
> > > > > > 3) Does this work on other OSes than Linux?  What about Darwin and 
> > > > > > AIX?
> > > 
> > > And here I meant that there is no way we can accept patches that
> > > influence code generation on all platforms when we have no idea what it
> > > does on most platforms.  I did not intend to suggest the patch would be
> > > more acceptable if it was tested on other platforms; I wanted to say it
> > > is not acceptable if it is not.
> > > 
> > > The main issue is 2).  We need to understand what problem this patch is
> > > trying to solve.  I'm sure Hao Chen had a reason for doing this patch,
> > > so I'd like to know what it is trying to achieve, what it is trying to
> > > improve!
> > 
> > Investigating this with Segher, I believe that there is some confusion
> > about the "ANCHOR" macros.
> > 
> > TARGET_MIN_ANCHOR_OFFSET and TARGET_MAX_ANCHOR_OFFSET are not related
> > to TARGET_ANCHOR_CONST.
> > 
> > Also, TARGET_ANCHOR_CONST can be defined as a macro to trigger the
> > hook, and doesn't need targetm.anchor_const.
> > 
> > Any change to TARGET_ANCHOR_CONST requires extensive performance
> > testing.  Yes, it presumably fixes the testcase, but the impact on
> > overall performance is the critical question.
> > 
> > Thanks, David

Re: [patch] Fix PR middle-end/99641

2021-03-18 Thread Eric Botcazou

> Can you use wide_ints instead of building trees here please?

Note that this will reject array types whose lower bound is not fixed, but the 
wide_int version is attached.


PR middle-end/99641
* fold-const.c (native_encode_initializer) : For an
array type, do the computation of the current position in sizetype.
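
The approach can be sketched outside of GCC roughly as below. This is a
simplified stand-in, not the wide_int API: sign_extend and byte_position are
hypothetical helpers, and 64-bit integers replace offset_int. The difference
between the index and the lower bound is wrapped to the precision of the
index type, and the multiplication by the element size is rejected if the
result would not fit in a signed 64-bit value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Sign-extend the low `precision` bits of `value`; precision is in 1..64.
std::int64_t sign_extend(std::uint64_t value, unsigned precision) {
  if (precision < 64) {
    const std::uint64_t sign_bit = std::uint64_t{1} << (precision - 1);
    value &= (std::uint64_t{1} << precision) - 1;
    value = (value ^ sign_bit) - sign_bit;
  }
  return static_cast<std::int64_t>(value);
}

// Compute (index - min_index) * fieldsize, wrapping the difference to
// `precision` bits first.  Returns false if the product does not fit in a
// signed 64-bit integer, in which case the caller is expected to give up.
bool byte_position(std::int64_t index, std::int64_t min_index,
                   std::int64_t fieldsize, unsigned precision,
                   std::int64_t *pos) {
  assert(fieldsize > 0);
  const std::uint64_t raw = static_cast<std::uint64_t>(index)
                            - static_cast<std::uint64_t>(min_index);
  const std::int64_t diff = sign_extend(raw, precision);
  const std::int64_t max = std::numeric_limits<std::int64_t>::max();
  const std::int64_t min = std::numeric_limits<std::int64_t>::min();
  if (diff > max / fieldsize || diff < min / fieldsize)
    return false;
  *pos = diff * fieldsize;
  return true;
}
```

For an array with lower bound 1 and 4-byte elements, index 5 yields byte
position 16; a difference whose product would overflow is reported as a
failure instead of wrapping silently.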

-- 
Eric Botcazou
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index e0bdb4b6ba6..1ebc73d065a 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -8051,21 +8051,21 @@ native_encode_initializer (tree init, unsigned char *ptr, int len,
   int o = off == -1 ? 0 : off;
   if (TREE_CODE (type) == ARRAY_TYPE)
 	{
-	  HOST_WIDE_INT min_index;
+	  tree min_index;
 	  unsigned HOST_WIDE_INT cnt;
 	  HOST_WIDE_INT curpos = 0, fieldsize, valueinit = -1;
 	  constructor_elt *ce;
 
-	  if (TYPE_DOMAIN (type) == NULL_TREE
-	  || !tree_fits_shwi_p (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
+	  if (!TYPE_DOMAIN (type)
+	  || TREE_CODE (TYPE_MIN_VALUE (TYPE_DOMAIN (type))) != INTEGER_CST)
 	return 0;
 
 	  fieldsize = int_size_in_bytes (TREE_TYPE (type));
 	  if (fieldsize <= 0)
 	return 0;
 
-	  min_index = tree_to_shwi (TYPE_MIN_VALUE (TYPE_DOMAIN (type)));
-	  if (ptr != NULL)
+	  min_index = TYPE_MIN_VALUE (TYPE_DOMAIN (type));
+	  if (ptr)
 	memset (ptr, '\0', MIN (total_bytes - off, len));
 
 	  for (cnt = 0; ; cnt++)
@@ -8084,21 +8084,40 @@ native_encode_initializer (tree init, unsigned char *ptr, int len,
 		break;
 	  else
 		pos = total_bytes;
+
 	  if (index && TREE_CODE (index) == RANGE_EXPR)
 		{
-		  if (!tree_fits_shwi_p (TREE_OPERAND (index, 0))
-		  || !tree_fits_shwi_p (TREE_OPERAND (index, 1)))
+		  if (TREE_CODE (TREE_OPERAND (index, 0)) != INTEGER_CST
+		  || TREE_CODE (TREE_OPERAND (index, 1)) != INTEGER_CST)
+		return 0;
+		  offset_int wpos
+		= wi::sext (wi::to_offset (TREE_OPERAND (index, 0))
+- wi::to_offset (min_index),
+TYPE_PRECISION (sizetype));
+		  wpos *= fieldsize;
+		  if (!wi::fits_shwi_p (wpos))
 		return 0;
-		  pos = (tree_to_shwi (TREE_OPERAND (index, 0)) - min_index)
-			* fieldsize;
-		  count = (tree_to_shwi (TREE_OPERAND (index, 1))
-			   - tree_to_shwi (TREE_OPERAND (index, 0)));
+		  pos = wpos.to_shwi ();
+		  offset_int wcount
+		= wi::sext (wi::to_offset (TREE_OPERAND (index, 1))
+- wi::to_offset (TREE_OPERAND (index, 0)),
+TYPE_PRECISION (sizetype));
+		  if (!wi::fits_shwi_p (wcount))
+		return 0;
+		  count = wcount.to_shwi ();
 		}
 	  else if (index)
 		{
-		  if (!tree_fits_shwi_p (index))
+		  if (TREE_CODE (index) != INTEGER_CST)
+		return 0;
+		  offset_int wpos
+		= wi::sext (wi::to_offset (index)
+- wi::to_offset (min_index),
+TYPE_PRECISION (sizetype));
+		  wpos *= fieldsize;
+		  if (!wi::fits_shwi_p (wpos))
 		return 0;
-		  pos = (tree_to_shwi (index) - min_index) * fieldsize;
+		  pos = wpos.to_shwi ();
 		}
 
 	  if (mask && !CONSTRUCTOR_NO_CLEARING (init) && curpos != pos)

[committed] amdgcn: Silence warnings in gcn.c

2021-03-18 Thread Andrew Stubbs



This patch has no functional changes; it merely cleans up some warning 
messages.


Thanks to Jan-Benedict for pointing them out, off-list.

Andrew
amdgcn: Silence warnings in gcn.c

This fixes a few cases of "unquoted identifier or keyword", one "spurious
trailing punctuation sequence", and a "may be used uninitialized".

gcc/ChangeLog:

	* config/gcn/gcn.c (gcn_parse_amdgpu_hsa_kernel_attribute): Add %< and
	  %> quote markers to error messages.
	(gcn_goacc_validate_dims): Likewise.
	(gcn_conditional_register_usage): Remove exclamation mark from error
	message.
	(gcn_vectorize_vec_perm_const): Ensure perm is fully uninitialized.
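
The "may be used uninitialized" part follows a common pattern, sketched
here in isolation. build_perm and kMaxElts are illustrative names, not
identifiers from gcn.c. Only the first nelt entries of a fixed-size array
carry data, so the remaining entries are set explicitly to keep later reads
of the whole array well defined.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxElts = 64;

// Fill the first `nelt` entries from `sel`, masked to the valid index range
// for a two-input permute, and zero the rest so that no element of `perm`
// is left indeterminate.
void build_perm(unsigned (&perm)[kMaxElts], const unsigned *sel,
                unsigned nelt) {
  for (unsigned i = 0; i < nelt; ++i)
    perm[i] = sel[i] & (2 * nelt - 1);
  for (unsigned i = nelt; i < kMaxElts; ++i)
    perm[i] = 0;
}
```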

diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index e8bb0b63756..22da37e2532 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -228,7 +228,7 @@ gcn_parse_amdgpu_hsa_kernel_attribute (struct gcn_kernel_args *args,
   const char *str;
   if (TREE_CODE (TREE_VALUE (list)) != STRING_CST)
 	{
-	  error ("amdgpu_hsa_kernel attribute requires string constant "
+	  error ("%<amdgpu_hsa_kernel%> attribute requires string constant "
 		 "arguments");
 	  break;
 	}
@@ -241,13 +241,14 @@ gcn_parse_amdgpu_hsa_kernel_attribute (struct gcn_kernel_args *args,
 	}
   if (a == GCN_KERNEL_ARG_TYPES)
 	{
-	  error ("unknown specifier %s in amdgpu_hsa_kernel attribute", str);
+	  error ("unknown specifier %qs in %<amdgpu_hsa_kernel%>",
+		 str);
 	  err = true;
 	  break;
 	}
   if (args->requested & (1 << a))
 	{
-	  error ("duplicated parameter specifier %s in amdgpu_hsa_kernel "
+	  error ("duplicated parameter specifier %qs in %<amdgpu_hsa_kernel%> "
 		 "attribute", str);
 	  err = true;
 	  break;
@@ -2102,7 +2103,7 @@ gcn_conditional_register_usage (void)
   /* Requesting a set of args different from the default violates the ABI.  */
   if (!leaf_function_p ())
 warning (0, "A non-default set of initial values has been requested, "
-		"which violates the ABI!");
+		"which violates the ABI");
 
   for (int i = SGPR_REGNO (0); i < SGPR_REGNO (14); i++)
 fixed_regs[i] = 0;
@@ -3983,6 +3984,8 @@ gcn_vectorize_vec_perm_const (machine_mode vmode, rtx dst,
   unsigned int perm[64];
   for (unsigned int i = 0; i < nelt; ++i)
 perm[i] = sel[i] & (2 * nelt - 1);
+  for (unsigned int i = nelt; i < 64; ++i)
+perm[i] = 0;
 
   src0 = force_reg (vmode, src0);
   src1 = force_reg (vmode, src1);
@@ -4882,8 +4885,8 @@ gcn_goacc_validate_dims (tree decl, int dims[], int fn_level,
 	warning_at (decl ? DECL_SOURCE_LOCATION (decl) : UNKNOWN_LOCATION,
 		OPT_Wopenacc_dims,
 		(dims[GOMP_DIM_VECTOR]
-		 ? G_("using vector_length (64), ignoring %d")
-		 : G_("using vector_length (64), "
+		 ? G_("using %<vector_length (64)%>, ignoring %d")
+		 : G_("using %<vector_length (64)%>, "
 			  "ignoring runtime setting")),
 		dims[GOMP_DIM_VECTOR]);
   dims[GOMP_DIM_VECTOR] = 1;
@@ -4895,7 +4898,7 @@ gcn_goacc_validate_dims (tree decl, int dims[], int fn_level,
 {
   warning_at (decl ? DECL_SOURCE_LOCATION (decl) : UNKNOWN_LOCATION,
 		  OPT_Wopenacc_dims,
-		  "using num_workers (%d), ignoring %d",
+		  "using %<num_workers (%d)%>, ignoring %d",
 		  max_workers, dims[GOMP_DIM_WORKER]);
   dims[GOMP_DIM_WORKER] = max_workers;
   changed = true;

Re: arm: Fix bfloat16_scalar_1_1.c test

2021-03-18 Thread Christophe Lyon via Gcc-patches

Please disregard this patch: I'll resubmit it as part of a larger
series, based on similar patches I sent ~1 year ago.

On Wed, 17 Mar 2021 at 19:25, Christophe Lyon
 wrote:
>
> Function stacktest1 in bfloat16_scalar_1_1.c test requires
> -mfloat-abi=hard for the associated check-function-bodies to pass.
>
> This patch adds the corresponding arm_hard_ok effective-target and
> -mfloat-abi=hard dg-add-options.
>
> This avoids a failure with toolchains configured with
> -mfloat-abi=soft/softfp by default.
>
> 2021-03-17  Christophe Lyon  
>
> gcc/testsuite/
> * gcc.target/arm/bfloat16_scalar_1_1.c: Require -mfloat-abi=hard
> support.
>
> diff --git a/gcc/testsuite/gcc.target/arm/bfloat16_scalar_1_1.c
> b/gcc/testsuite/gcc.target/arm/bfloat16_scalar_1_1.c
> index efcc561..7a6c177 100644
> --- a/gcc/testsuite/gcc.target/arm/bfloat16_scalar_1_1.c
> +++ b/gcc/testsuite/gcc.target/arm/bfloat16_scalar_1_1.c
> @@ -1,7 +1,8 @@
>  /* { dg-do assemble { target { arm*-*-* } } } */
> +/* { dg-require-effective-target arm_hard_ok } */
>  /* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
>  /* { dg-add-options arm_v8_2a_bf16_neon }  */
> -/* { dg-additional-options "-O3 --save-temps -std=gnu90" } */
> +/* { dg-additional-options "-O3 --save-temps -std=gnu90 -mfloat-abi=hard" } */
>  /* { dg-final { check-function-bodies "**" "" } } */
>
>  #include

[PATCH, OG10, C++, OpenMP 5.0] Support lambda capturing of pointers and references in target directives

2021-03-18 Thread Chung-Lin Tang


This patch adds proper lambda capturing of pointer and reference variables
as specified in OpenMP 5.0. We map the entire closure object as a to-map,
attach pointers to zero-length array sections, and perform mapping of
references.
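
As a rough idea of the kind of user code this is about, here is a minimal
sketch. It is not one of the new test cases, and sum_on_target is a made-up
name. The lambda captures a pointer by value and a scalar by reference, and
is then called inside a target region. When built without OpenMP support
the pragma is ignored and the function simply runs on the host.

```cpp
#include <cassert>

// Sum `n` integers through a lambda that is invoked inside a target region.
// `data` is captured as a pointer and `total` by reference.
int sum_on_target(int *data, int n) {
  int total = 0;
  auto accumulate = [data, &total, n]() {
    for (int i = 0; i < n; ++i)
      total += data[i];
  };
#pragma omp target map(to: data[0:n]) map(tofrom: total)
  {
    accumulate();
  }
  return total;
}
```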

The implementation mainly works by a tree walk when processing of target
directives is finished. Given that, it seemed most complete to combine this
processing with all of the this[:1] map creation handling. This makes the
patch also a partial rewrite of PR92120, though things seem to look better
in the new form.
(and yes, the submitted PR92120 patch for mainline is in need of a "v3" re-work)

This tree walk is now applied in the non-template case and after/during
template instantiation, so a prior patch that relaxed finish_omp_clauses()
cases to make the this[:1] changes work is no longer needed, and is
reverted in this patch.

Tested without regressions on x86_64-linux with nvptx offloading,
and pushed to devel/omp/gcc-10.

2021-03-18  Chung-Lin Tang  

gcc/cp/ChangeLog:

* cp-tree.h (set_omp_target_this_expr): Delete.
(finish_omp_target_clauses): New prototype.
* lambda.c (lambda_expr_this_capture): Remove call to
set_omp_target_this_expr.
* parser.c (cp_parser_omp_target): Likewise.
* pt.c (tsubst_expr): Add call to finish_omp_target_clauses for target
directives.
* semantics.c (omp_target_this_expr): Delete.
(omp_target_ptr_members_accessed): Delete.
(finish_non_static_data_member): Remove call to
set_omp_target_this_expr. Remove use of omp_target_ptr_members_accessed.
(finish_this_expr): Remove call to set_omp_target_this_expr.
(struct omp_target_walk_data): New struct for walking over
target-directive tree body.
(finish_omp_target_clauses_r): New function for tree walk.
(finish_omp_target_clauses): New function, with code factored out from
finish_omp_target. Add lambda object handling case.
(finish_omp_target): Factor code out and adjust to use
finish_omp_target_clauses.
(finish_omp_clauses): Revert prior "Adjustments to allow '*ptr' and
'ptr->member' cases in map clausess.", since not needed with new
organization of target-directive clause processing.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/target-lambda-1.C: New test.

libgomp/testsuite/ChangeLog:

* libgomp.c++/target-lambda-1.C: New test.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index b77bdc380a0..247a3bb1ec3 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7316,7 +7316,7 @@ extern void finish_lambda_scope   (void);
 extern tree start_lambda_function  (tree fn, tree lambda_expr);
 extern void finish_lambda_function (tree body);
 extern tree finish_omp_target  (location_t, tree, tree, bool);
-extern void set_omp_target_this_expr   (tree);
+extern void finish_omp_target_clauses  (location_t, tree, tree *);
 
 /* in tree.c */
 extern int cp_tree_operand_length  (const_tree);
diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 9ecf0dbed0c..b55c2f85d27 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -842,9 +842,6 @@ lambda_expr_this_capture (tree lambda, int add_capture_p)
 type cast (_expr.cast_ 5.4) to the type of 'this'. [ The cast
 ensures that the transformed expression is an rvalue. ] */
   result = rvalue (result);
-
-  /* Acknowledge to OpenMP target that 'this' was referenced.  */
-  set_omp_target_this_expr (result);
 }
 
   return result;
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 1af233690a2..9fc2a9b05eb 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -40786,7 +40786,6 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok,
  keep_next_level (true);
  tree sb = begin_omp_structured_block (), ret;
  unsigned save = cp_parser_begin_omp_structured_block (parser);
- set_omp_target_this_expr (NULL_TREE);
  switch (ccode)
{
case OMP_TEAMS:
@@ -40881,7 +40880,6 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok,
"#pragma omp target", pragma_tok);
   c_omp_adjust_map_clauses (clauses, true);
   keep_next_level (true);
-  set_omp_target_this_expr (NULL_TREE);
   tree body = cp_parser_omp_structured_block (parser, if_p);
 
   finish_omp_target (pragma_tok->location, clauses, body, false);
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 90cee31bb5a..139d1075986 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18631,6 +18631,11 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
   t = copy_node (t);
   OMP_BODY (t) = stmt;
   OMP_CLAUSES (t) = tmp;
+
+  if (TREE_CODE (t) == OMP_TARGET)
+   finish_omp_target_clauses (EXPR_LOCATION (t), OMP_BODY (t),
+  &OMP_CLAUSES (t));
+

[pushed] c++: Add assert to tsubst.

2021-03-18 Thread Marek Polacek via Gcc-patches

As discussed in the r11-7709 patch, we can now make sure that tsubst
never sees a FLOAT_EXPR, much like its counterpart FIX_TRUNC_EXPR.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* pt.c (tsubst_copy_and_build): Add assert.
---
 gcc/cp/pt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 5e485f10d19..ea530ef36f4 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19770,6 +19770,8 @@ tsubst_copy_and_build (tree t,
complain|decltype_flag));
 
 case FIX_TRUNC_EXPR:
+case FLOAT_EXPR:
+  /* convert_like should have created an IMPLICIT_CONV_EXPR.  */
   gcc_unreachable ();
 
 case ADDR_EXPR:

base-commit: 55308fc26318427c1438cecc60ddd7ba24d5cd33
-- 
2.30.2

Re: [PATCH, rs6000 V2] Update "prefix" attribute for Power10 [PR99133]

2021-03-18 Thread Pat Haugen via Gcc-patches

On 3/18/21 11:33 AM, will schmidt wrote:
> Per this change:
> 
> +;; Whether an insn is a prefixed insn.  A prefixed instruction has a prefix
> +;; instruction word that conveys additional information such as a larger
> +;; immediate, additional operands, etc., in addition to the normal 
> instruction
> +;; word.  The default "length" attribute will also be adjusted by default to
> +;; be 12 bytes.
> +(define_attr "prefixed" "no,yes"
> +  (if_then_else (eq_attr "prepend_prefixed_insn" "yes")
> + (const_string "yes")
> + (const_string "no")))
> 
> 
> .. it looks like at least most of the users of the "prefixed" attribute have
> been switched over to use "prepend_prefixed_insn" instead.   Are there still
> users of the "prefixed" attribute remaining ?  I'm guessing so, given context,
> but can't tell for certain.
> 
> (Just a question, not a specific request for a change)

Yes, there are still a couple uses of get_attr_prefixed() in rs6000.c, plus the 
Power10 scheduling description makes use of it.

-Pat

[committed] [PR99422] LRA: Use lookup_constraint only for a single constraint in process_address_1

2021-03-18 Thread Vladimir Makarov via Gcc-patches


This is an additional patch for PR99422:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99422

The patch was successfully bootstrapped and tested on x86-64, ppc64le, 
and aarch64.


commit a4670f58ebff805e35268542aac35f9791980954
Author: Vladimir N. Makarov 
Date:   Thu Mar 18 15:58:26 2021 -0400

[PR99422] LRA: Use lookup_constraint only for a single constraint in process_address_1.

This is an additional patch for PR99422.  In process_address_1 we
look only at the first constraint in the first alternative
and ignore all other possibilities.  As we don't know which
alternative and constraint will be used at this stage, we can be sure
only when there is a single constraint with one alternative, and should
use the unknown constraint in all other cases.

gcc/ChangeLog:

PR target/99422
* lra-constraints.c (process_address_1): Use lookup_constraint
only for a single constraint.
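
The check can be pictured with a small standalone sketch. The assumptions
are stated up front: every constraint is taken to be one character long,
whereas the real CONSTRAINT_LEN is target-dependent, an empty string is
treated as a single constraint, and is_single_constraint is a made-up
name. A constraint string counts as "single" only if nothing but
modifiers follows its first constraint.

```cpp
#include <cassert>

// Skip leading constraint modifier characters.
const char *skip_modifiers(const char *str) {
  for (;; ++str)
    switch (*str) {
      case '+': case '&': case '=': case '*': case ' ': case '\t':
      case '$': case '^': case '%': case '?': case '!':
        break;  // modifier: leave the switch and keep scanning
      default:
        return str;
    }
}

// True if the string holds exactly one constraint in one alternative.
// Assumption: each constraint is a single character.
bool is_single_constraint(const char *constraint) {
  constraint = skip_modifiers(constraint);
  if (*constraint == '\0')
    return true;  // assumption: empty means a single (empty) constraint
  return *skip_modifiers(constraint + 1) == '\0';
}
```

Under these assumptions "m" and "=&m" are single constraints, while "rm"
(two constraints) and "m,r" (two alternatives) are not and would map to
the unknown constraint.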

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 9205826960c..64801b6fcce 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -3459,7 +3459,12 @@ process_address_1 (int nop, bool check_only_p,
   constraint
 	= skip_contraint_modifiers (curr_static_id->operand[dup].constraint);
 }
-  cn = lookup_constraint (*constraint == '\0' ? "X" : constraint);
+  if (*skip_contraint_modifiers (constraint
+ + CONSTRAINT_LEN (constraint[0],
+		   constraint)) != '\0')
+cn = CONSTRAINT__UNKNOWN;
+  else
+cn = lookup_constraint (*constraint == '\0' ? "X" : constraint);
   if (insn_extra_address_constraint (cn)
   /* When we find an asm operand with an address constraint that
 	 doesn't satisfy address_operand to begin with, we clear

Re: [committed] amdgcn: Silence warnings in gcn.c

2021-03-18 Thread Jan-Benedict Glaw

Hi Andrew,

On Thu, 2021-03-18 17:41:59 +, Andrew Stubbs  wrote:
> gcc/ChangeLog:
> 
>   (gcn_vectorize_vec_perm_const): Ensure perm is fully uninitialized.

I guess "initialized" is meant here?

MfG, JBG

-- 


signature.asc
Description: PGP signature

Re: [pushed] c++: Add assert to tsubst.

2021-03-18 Thread H.J. Lu via Gcc-patches

On Thu, Mar 18, 2021 at 11:35 AM Marek Polacek via Gcc-patches
 wrote:
>
> As discussed in the r11-7709 patch, we can now make sure that tsubst
> never sees a FLOAT_EXPR, much like its counterpart FIX_TRUNC_EXPR.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.
>
> gcc/cp/ChangeLog:
>
> * pt.c (tsubst_copy_and_build): Add assert.
> ---
>  gcc/cp/pt.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index 5e485f10d19..ea530ef36f4 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -19770,6 +19770,8 @@ tsubst_copy_and_build (tree t,
> complain|decltype_flag));
>
>  case FIX_TRUNC_EXPR:
> +case FLOAT_EXPR:
> +  /* convert_like should have created an IMPLICIT_CONV_EXPR.  */
>gcc_unreachable ();
>
>  case ADDR_EXPR:
>
> base-commit: 55308fc26318427c1438cecc60ddd7ba24d5cd33
> --
> 2.30.2
>

This may have caused:

https://gcc.gnu.org/pipermail/gcc-regression/2021-March/074461.html

FAIL: g++.dg/torture/pr85013.C   -O0  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O0  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O0  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O0   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O0   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O0   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O0  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O0  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O0  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O1  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O1  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O1  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O1   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O1   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O1   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O1  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O1  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O1  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O3 -g  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O3 -g  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O3 -g  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O3 -g   (test for e

Re: [PATCH] c++: Private parent access check for using decls [PR19377]

2021-03-18 Thread Jason Merrill via Gcc-patches


On 3/10/21 4:14 PM, Anthony Sharp wrote:

Hiya


That's because none of the names are overloaded within a single base
class.


Ah, thanks. Thought there must be something I wasn't thinking of.


Also, you can use == instead of cp_tree_equal for comparing FUNCTION_DECLs.


Changed it.

Latest patch is attached. Compiles fine and no regressions.


Great!  You may have already noticed that I applied the patch with a 
little simplification: we can use ovl_iterator for non-overloaded decls 
as well.


Thanks,
Jason

Re: [pushed] c++: Add assert to tsubst.

2021-03-18 Thread Marek Polacek via Gcc-patches

On Thu, Mar 18, 2021 at 02:04:59PM -0700, H.J. Lu wrote:
> On Thu, Mar 18, 2021 at 11:35 AM Marek Polacek via Gcc-patches
>  wrote:
> >
> > As discussed in the r11-7709 patch, we can now make sure that tsubst
> > never sees a FLOAT_EXPR, much like its counterpart FIX_TRUNC_EXPR.
> >
> > Tested x86_64-pc-linux-gnu, applying to trunk.
> >
> > gcc/cp/ChangeLog:
> >
> > * pt.c (tsubst_copy_and_build): Add assert.
> > ---
> >  gcc/cp/pt.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > index 5e485f10d19..ea530ef36f4 100644
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -19770,6 +19770,8 @@ tsubst_copy_and_build (tree t,
> > complain|decltype_flag));
> >
> >  case FIX_TRUNC_EXPR:
> > +case FLOAT_EXPR:
> > +  /* convert_like should have created an IMPLICIT_CONV_EXPR.  */
> >gcc_unreachable ();
> >
> >  case ADDR_EXPR:
> >
> > base-commit: 55308fc26318427c1438cecc60ddd7ba24d5cd33
> > --
> > 2.30.2
> >
> 
> This may have caused:
> 
> https://gcc.gnu.org/pipermail/gcc-regression/2021-March/074461.html
> 
> FAIL: g++.dg/torture/pr85013.C   -O0  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O0  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O0  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O0   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O0   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O0   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O0  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O0  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O0  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O1  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O1  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O1  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O1   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O1   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O1   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O1  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O1  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O1  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2  (test for excess err

Re: [PATCH] c++: Only reject reinterpret casts from pointers to integers for manifestly_const_eval evaluation [PR99456]

2021-03-18 Thread Jason Merrill via Gcc-patches


On 3/9/21 10:31 AM, Jakub Jelinek wrote:

Hi!

My PR82304/PR95307 fix moved reinterpret cast from pointer to integer
diagnostics from cxx_eval_outermost_constant_expr where it caught
invalid code only at the outermost level down into
cxx_eval_constant_expression.
Unfortunately, it regressed following testcase, we emit worse code
including dynamic initialization of some vars.
While the initializers are not constant expressions due to the
reinterpret_cast in there, there is no reason not to fold them as an
optimization.

I've tried to make this dependent on !ctx->quiet, but that regressed
two further tests, so this patch bases that on manifestly_const_eval.
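
To illustrate the situation described above, here is a minimal sketch (not the testcase from the patch; the names `held` and `holds_address_of_desc` are made up for illustration). The initializer is not a constant expression because of the reinterpret_cast, yet nothing prevents a compiler from folding it and emitting a statically initialized object:

```cpp
#include <cstdint>

unsigned desc = 42;

struct Container {
  std::uintptr_t m;
};

// The reinterpret_cast makes this initializer a non-constant expression,
// so `constexpr Container held{...}` would be rejected.  A plain variable
// is still valid, and the compiler is free (as an optimization) to emit it
// as a statically initialized object instead of running code at startup.
Container held{reinterpret_cast<std::uintptr_t>(&desc)};

// Whichever initialization strategy the compiler picked, the stored value
// must be the address of `desc`.
bool holds_address_of_desc() {
  return held.m == reinterpret_cast<std::uintptr_t>(&desc);
}
```

Whether the object ends up statically or dynamically initialized is only visible in the generated assembly, which is what the scan-assembler-not directives in the new test check.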


Did you try using ctx->strict?

Though perhaps for GCC 12 the strict flag should be dropped entirely in 
favor of manifestly_const_eval.



The new testcase is now optimized as much as it used to be in GCC 10
and the only regression it causes is an extra -Wnarrowing warning
on vla22.C test on invalid code (which the patch adjusts).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-09  Jakub Jelinek  

PR c++/99456
* constexpr.c (cxx_eval_constant_expression): For CONVERT_EXPR from
INDIRECT_TYPE_P to ARITHMETIC_TYPE_P, when !ctx->manifestly_const_eval
don't diagnose it, set *non_constant_p nor return t.

* g++.dg/opt/pr99456.C: New test.
* g++.dg/ext/vla22.C: Expect a -Wnarrowing warning for c++11 and
later.

--- gcc/cp/constexpr.c.jj   2021-03-08 23:40:28.334509562 +0100
+++ gcc/cp/constexpr.c  2021-03-09 11:50:08.721716460 +0100
@@ -6656,7 +6656,8 @@ cxx_eval_constant_expression (const cons
  
  	if (TREE_CODE (t) == CONVERT_EXPR

&& ARITHMETIC_TYPE_P (type)
-   && INDIRECT_TYPE_P (TREE_TYPE (op)))
+   && INDIRECT_TYPE_P (TREE_TYPE (op))
+   && ctx->manifestly_const_eval)
  {
if (!ctx->quiet)
  error_at (loc,
--- gcc/testsuite/g++.dg/opt/pr99456.C.jj   2021-03-09 11:43:56.452862770 +0100
+++ gcc/testsuite/g++.dg/opt/pr99456.C  2021-03-09 11:43:56.452862770 +0100
@@ -0,0 +1,33 @@
+// PR c++/99456
+// { dg-do compile { target c++17 } }
+// { dg-options "-g0" }
+// { dg-final { scan-assembler-not "PR99456Var0\[1234]" } }
+// { dg-final { scan-assembler-not "__static_initialization_and_destruction" } }
+// { dg-final { scan-assembler-not "_GLOBAL__sub_I" } }
+// { dg-final { scan-assembler-not "_ZGV12PR99456Var1\[1234]" } }
+
+typedef __UINTPTR_TYPE__ uintptr_t;
+
+class Container
+{
+public:
+  uintptr_t m;
+};
+
+extern unsigned desc;
+static constexpr unsigned &descRef = desc;
+
+inline Container PR99456Var01 {reinterpret_cast<uintptr_t> (&descRef)};
+inline Container PR99456Var02 {reinterpret_cast<uintptr_t> (&desc)};
+inline uintptr_t PR99456Var03 {reinterpret_cast<uintptr_t> (&descRef)};
+inline uintptr_t PR99456Var04 {reinterpret_cast<uintptr_t> (&desc)};
+
+inline Container PR99456Var11 {reinterpret_cast<uintptr_t> (&descRef)};
+inline Container PR99456Var12 {reinterpret_cast<uintptr_t> (&desc)};
+inline uintptr_t PR99456Var13 {reinterpret_cast<uintptr_t> (&descRef)};
+inline uintptr_t PR99456Var14 {reinterpret_cast<uintptr_t> (&desc)};
+
+auto *PR99456Ref11 = &PR99456Var11;
+auto *PR99456Ref12 = &PR99456Var12;
+auto *PR99456Ref13 = &PR99456Var13;
+auto *PR99456Ref14 = &PR99456Var14;
--- gcc/testsuite/g++.dg/ext/vla22.C.jj 2020-02-27 09:28:46.396956140 +0100
+++ gcc/testsuite/g++.dg/ext/vla22.C    2021-03-09 12:00:58.275482884 +0100
@@ -6,4 +6,4 @@ void
  f ()
  {
const int tbl[(long) "h"] = { 12 }; // { dg-error "size of array .tbl. is not an integral constant-expression" }
-}
+}// { dg-warning "narrowing conversion" "" { target c++11 } .-1 }

Jakub

[r11-7723 Regression] FAIL: g++.dg/torture/pr85013.C -Os (test for excess errors) on Linux/x86_64

2021-03-18 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

c5e55673b486533c4d6d19ac903460f70b48f11a is the first bad commit
commit c5e55673b486533c4d6d19ac903460f70b48f11a
Author: Marek Polacek 
Date:   Wed Mar 17 19:39:10 2021 -0400

c++: Add assert to tsubst.

caused

FAIL: g++.dg/torture/pr85013.C   -O0  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O0   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O0  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O1  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O1   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O1  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O2  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O2   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O2  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -O3 -g  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -O3 -g   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -O3 -g  (test for excess errors)
FAIL: g++.dg/torture/pr85013.C   -Os  (internal compiler error)
FAIL: g++.dg/torture/pr85013.C   -Os   (test for errors, line 3)
FAIL: g++.dg/torture/pr85013.C   -Os  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-7723/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr85013.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr85013.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr85013.C 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr85013.C 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me 
at skpgkp2 at gmail dot com)

Re: [r11-7723 Regression] FAIL: g++.dg/torture/pr85013.C -Os (test for excess errors) on Linux/x86_64

2021-03-18 Thread Marek Polacek via Gcc-patches

Fixed now.

On Thu, Mar 18, 2021 at 02:40:19PM -0700, sunil.k.pandey via Gcc-patches wrote:
> On Linux/x86_64,
> 
> c5e55673b486533c4d6d19ac903460f70b48f11a is the first bad commit
> commit c5e55673b486533c4d6d19ac903460f70b48f11a
> Author: Marek Polacek 
> Date:   Wed Mar 17 19:39:10 2021 -0400
> 
> c++: Add assert to tsubst.
> 
> caused
> 
> FAIL: g++.dg/torture/pr85013.C   -O0  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O0   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O0  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O1  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O1   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O1  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O2  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O2   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O2  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -O3 -g  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -O3 -g   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -O3 -g  (test for excess errors)
> FAIL: g++.dg/torture/pr85013.C   -Os  (internal compiler error)
> FAIL: g++.dg/torture/pr85013.C   -Os   (test for errors, line 3)
> FAIL: g++.dg/torture/pr85013.C   -Os  (test for excess errors)
> 
> with GCC configured with
> 
> ../../gcc/configure 
> --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-7723/usr
>  --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
> --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
> --enable-libmpx x86_64-linux --disable-bootstrap
> 
> To reproduce:
> 
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr85013.C 
> --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr85013.C 
> --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr85013.C 
> --target_board='unix{-m64}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr85013.C 
> --target_board='unix{-m64\ -march=cascadelake}'"
> 
> (Please do not reply to this email. For questions about this report, contact 
> me at skpgkp2 at gmail dot com)
> 

Marek

[PING][PATCH] adjust "partly out of bounds" warning (PR 98503)

2021-03-18 Thread Martin Sebor via Gcc-patches


Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564483.html

The review of this patch digressed into a design discussion of a new,
more capable implementation of -Wstrict-aliasing, but the proposed
patch turning just this one instance of -Warray-bounds into
-Wstrict-aliasing and making it subject to -fstrict-aliasing wasn't
decided.  PR 98503 was raised by someone working with the kernel
which uses -fno-strict-aliasing, and so to them the warning isn't
useful.  But since the warning does find potential bugs when strict
aliasing is in effect, I'd still like to consider this patch for
GCC 11 so that the kernel (and other such projects) doesn't have
to deal with the false positives.

If/when we add a new, dedicated solution for -Wstrict-aliasing I'll
move this instance from gimple-array-bounds.cc there.

Martin

Re: [PATCH] Fix ICE: in function_and_variable_visibility, at ipa-visibility.c:795 (PR99466)

2021-03-18 Thread Jeff Law via Gcc-patches




On 3/14/2021 8:03 AM, Iain Buclaw via Gcc-patches wrote:

Excerpts from Iain Sandoe's message of March 13, 2021 6:09 pm:

Hi Iain,

Iain Buclaw via Gcc-patches  wrote:


This patch fixes an ICE caused by emutls routines generating a weak,
non-public symbol for storing the initializer of a weak TLS variable.

In get_emutls_init_templ_addr, only declarations that were DECL_ONE_ONLY
would get a public initializer symbol, ignoring variables that were
declared with __attribute__((weak)).

Because DECL_VISIBILITY is also copied to the emutls initializer, a
second test is included which checks that the expected visibility is
emitted too.
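
For readers unfamiliar with the construct, the kind of declaration under discussion looks roughly like the sketch below. This is not the committed testcase; the names `tls_counter` and `bump_tls_counter` are invented, and the GNU `__attribute__((weak))` and `__thread` spellings assume GCC or a compatible compiler:

```cpp
// A weak, thread-local variable with an initializer.  On targets that use
// emulated TLS, the compiler has to emit a separate object holding the
// initial value (3 here); per the description above, that initializer
// object was not being made public for weak declarations.
__attribute__((weak)) __thread int tls_counter = 3;

// Each thread starts from the initializer value and increments its own copy.
int bump_tls_counter() {
  return ++tls_counter;
}
```

On targets with native TLS this compiles without involving the emutls path at all, so it only illustrates the source-level shape of the problem.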

Tested on x86_64-apple-darwin10, OK for mainline?

The oldest version of gcc I've checked is 7.5.0, and the ICE is present
there too.  Is this OK for backporting, and if so which versions should
it be backported to?

Regards,
Iain.

---
gcc/ChangeLog:

PR ipa/99466
* tree-emutls.c (get_emutls_init_templ_addr): Mark initializer of weak
TLS declarations as public.

gcc/testsuite/ChangeLog:

* gcc.dg/tls/pr98607-1.c: New test.
* gcc.dg/tls/pr98607-2.c: New test.

^^^ s/98607/99466/ ?


Oops, good catch.  I must have copied the number from the wrong tab.
Mechanically updated the PR numbers and trying again...

---
gcc/ChangeLog:

PR ipa/99466
* tree-emutls.c (get_emutls_init_templ_addr): Mark initializer of weak
TLS declarations as public.

gcc/testsuite/ChangeLog:

PR ipa/99466
* gcc.dg/tls/pr99466-1.c: New test.
* gcc.dg/tls/pr99466-2.c: New test.

OK for the trunk.  I'd probably go back to gcc-9 and gcc-10.
Jeff

[PATCH] x86: Issue error for return/argument only with function body

2021-03-18 Thread H.J. Lu via Gcc-patches

If we never generate a function body, we shouldn't issue errors for its
return value or arguments.  Add init_cumulative_args_called to the i386
machine_function to avoid issuing such errors when there is no function body.
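
The gating logic can be sketched outside of GCC as follows. This is a simplified model only: `MachineFunction`, `Diagnostics`, `report_sse_disabled` and `errors_reported` are hypothetical stand-ins, and the real code keeps its "already issued" flags as static variables inside construct_container rather than in a struct:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-function machine state.
struct MachineFunction {
  bool init_cumulative_args_called = false;
};

// Hypothetical stand-in for the diagnostic machinery and its one-shot flags.
struct Diagnostics {
  std::vector<std::string> errors;
  bool issued_sse_ret_error = false;
  bool issued_sse_arg_error = false;
};

// Report "SSE disabled" errors at most once per kind, and only for
// functions whose body is actually being generated.
void report_sse_disabled(const MachineFunction& fn, bool in_return,
                         Diagnostics& d) {
  if (!fn.init_cumulative_args_called)
    return;  // No function body: stay silent.
  if (in_return) {
    if (!d.issued_sse_ret_error) {
      d.errors.push_back("SSE register return with SSE disabled");
      d.issued_sse_ret_error = true;
    }
  } else if (!d.issued_sse_arg_error) {
    d.errors.push_back("SSE register argument with SSE disabled");
    d.issued_sse_arg_error = true;
  }
}

// Small driver: query the return path twice and the argument path once.
std::size_t errors_reported(bool body_generated) {
  MachineFunction fn;
  fn.init_cumulative_args_called = body_generated;
  Diagnostics d;
  report_sse_disabled(fn, /*in_return=*/true, d);
  report_sse_disabled(fn, /*in_return=*/true, d);  // Duplicate is suppressed.
  report_sse_disabled(fn, /*in_return=*/false, d);
  return d.errors.size();
}
```

With a body, one error per kind is recorded; without one, nothing is reported, which matches the intent described above.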

gcc/

PR target/99652
* config/i386/i386.c (init_cumulative_args): Set
init_cumulative_args_called to true.
(construct_container): Issue error for return and argument only
if init_cumulative_args_called is true.
* config/i386/i386.h (machine_function): Add
init_cumulative_args_called.

gcc/testsuite/

PR target/99652
* gcc.dg/torture/pr99652-1.c: New test.
* gcc.dg/torture/pr99652-2.c: Likewise.
* gcc.target/i386/pr57655.c: Adjusted.
* gcc.target/i386/pr59794-6.c: Likewise.
* gcc.target/i386/pr70738-1.c: Likewise.
* gcc.target/i386/pr96744-1.c: Likewise.
---
 gcc/config/i386/i386.c| 26 ++-
 gcc/config/i386/i386.h|  3 +++
 gcc/testsuite/gcc.dg/torture/pr99652-1.c  |  8 +++
 gcc/testsuite/gcc.dg/torture/pr99652-2.c  |  8 +++
 gcc/testsuite/gcc.target/i386/pr57655.c   |  4 ++--
 gcc/testsuite/gcc.target/i386/pr59794-6.c |  4 ++--
 gcc/testsuite/gcc.target/i386/pr70738-1.c |  4 ++--
 gcc/testsuite/gcc.target/i386/pr96744-1.c |  4 ++--
 8 files changed, 43 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr99652-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr99652-2.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 540d4f44517..4a0b8c73bef 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -1705,6 +1705,8 @@ init_cumulative_args (CUMULATIVE_ARGS *cum,  /* Argument info to initialize */
   struct cgraph_node *local_info_node = NULL;
   struct cgraph_node *target = NULL;
 
+  cfun->machine->init_cumulative_args_called = true;
+
   memset (cum, 0, sizeof (*cum));
 
   if (fndecl)
@@ -2534,18 +2536,21 @@ construct_container (machine_mode mode, machine_mode orig_mode,
  some less clueful developer tries to use floating-point anyway.  */
   if (needed_sseregs && !TARGET_SSE)
 {
-  if (in_return)
+  if (cfun->machine->init_cumulative_args_called)
{
- if (!issued_sse_ret_error)
+ if (in_return)
{
- error ("SSE register return with SSE disabled");
- issued_sse_ret_error = true;
+ if (!issued_sse_ret_error)
+   {
+ error ("SSE register return with SSE disabled");
+ issued_sse_ret_error = true;
+   }
+   }
+ else if (!issued_sse_arg_error)
+   {
+ error ("SSE register argument with SSE disabled");
+ issued_sse_arg_error = true;
}
-   }
-  else if (!issued_sse_arg_error)
-   {
- error ("SSE register argument with SSE disabled");
- issued_sse_arg_error = true;
}
   return NULL;
 }
@@ -2558,7 +2563,8 @@ construct_container (machine_mode mode, machine_mode orig_mode,
  || regclass[i] == X86_64_X87UP_CLASS
  || regclass[i] == X86_64_COMPLEX_X87_CLASS)
{
- if (!issued_x87_ret_error)
+ if (cfun->machine->init_cumulative_args_called
+ && !issued_x87_ret_error)
{
  error ("x87 register return with x87 disabled");
  issued_x87_ret_error = true;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 48749104b24..ad908c010b3 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2945,6 +2945,9 @@ struct GTY(()) machine_function {
  function.  */
   BOOL_BITFIELD has_explicit_vzeroupper : 1;
 
+  /* True if init_cumulative_args has been called.  */
+  BOOL_BITFIELD init_cumulative_args_called: 1;
+
   /* The largest alignment, in bytes, of stack slot actually used.  */
   unsigned int max_used_stack_alignment;
 
diff --git a/gcc/testsuite/gcc.dg/torture/pr99652-1.c b/gcc/testsuite/gcc.dg/torture/pr99652-1.c
new file mode 100644
index 000..c2395ff4ed8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr99652-1.c
@@ -0,0 +1,8 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-mgeneral-regs-only" } */
+
+inline double
+foo (void)
+{
+  return 1.0;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr99652-2.c b/gcc/testsuite/gcc.dg/torture/pr99652-2.c
new file mode 100644
index 000..beefad8bfee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr99652-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-mno-80387" } */
+
+inline double
+foo (void)
+{
+  return 1.0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr57655.c b/gcc/testsuite/gcc.target/i386/pr57655.c
index 33a59d3a263..649cdef832d 100644
--- a/gcc/testsuite/gcc.target/i386/pr57655.c
+++ b/gcc/testsuite/gcc.target/i386/pr57655.c
@@ -2,7 +2,7 @@
 /* { dg-options "-mavx -m

Re: [PATCH] c++: Fix error-recovery with requires expression [PR99500]

2021-03-18 Thread Jason Merrill via Gcc-patches


On 3/9/21 10:22 PM, Marek Polacek wrote:

This fixes an ICE on invalid code where one of the parameters was
error_mark_node and thus resetting its DECL_CONTEXT crashed.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.

Jason


gcc/cp/ChangeLog:

PR c++/99500
* parser.c (cp_parser_requirement_parameter_list): Handle
error_mark_node.

gcc/testsuite/ChangeLog:

PR c++/99500
* g++.dg/cpp2a/concepts-err3.C: New test.
---
  gcc/cp/parser.c| 7 +--
  gcc/testsuite/g++.dg/cpp2a/concepts-err3.C | 4 
  2 files changed, 9 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-err3.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 0a7d18af98b..dbc44ed0765 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -28835,8 +28835,11 @@ cp_parser_requirement_parameter_list (cp_parser *parser)
if (parm == void_list_node || parm == explicit_void_list_node)
break;
tree decl = TREE_VALUE (parm);
-  DECL_CONTEXT (decl) = NULL_TREE;
-  CONSTRAINT_VAR_P (decl) = true;
+  if (decl != error_mark_node)
+   {
+ DECL_CONTEXT (decl) = NULL_TREE;
+ CONSTRAINT_VAR_P (decl) = true;
+   }
  }
  
return parms;

diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-err3.C b/gcc/testsuite/g++.dg/cpp2a/concepts-err3.C
new file mode 100644
index 000..9427fd5c5a6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-err3.C
@@ -0,0 +1,4 @@
+// PR c++/99500
+// { dg-do compile { target c++20 } }
+
+bool b = requires (bool a, int a) { requires true; }; // { dg-error "conflicting declaration" }

base-commit: 8dc225d311ed87633fa970164bdda19bf228b8a3

[PATCH] rs6000: Fix some unexpected empty split conditions

2021-03-18 Thread Kewen.Lin via Gcc-patches

Hi,

As Segher and Mike pointed out, a define_insn_and_split should avoid
using an empty split condition if the condition for the define_insn isn't
empty; otherwise it can sometimes lead to unexpected consequences.

This patch fixes several such places.
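
The difference between "" and "&& 1" can be modelled with a small sketch. This is only a string-level illustration of the documented rule that a split condition starting with "&&" is combined with the insn condition; `effective_split_condition` is a hypothetical helper, not a GCC function:

```cpp
#include <string>

// Model of how the split condition of a define_insn_and_split is resolved.
// A split condition that starts with "&&" is and-ed with the insn
// condition; any other split condition is used as written, and an empty
// one means "always true", independent of the insn condition.
std::string effective_split_condition(const std::string& insn_cond,
                                      const std::string& split_cond) {
  if (split_cond.rfind("&&", 0) == 0) {
    std::string rest = split_cond.substr(2);
    rest.erase(0, rest.find_first_not_of(" \t"));  // Drop leading blanks.
    if (insn_cond.empty())
      return rest;
    return "(" + insn_cond + ") && (" + rest + ")";
  }
  return split_cond;
}
```

Under this model, an insn guarded by "TARGET_P9_MINMAX" with split condition "" yields a splitter with no guard at all, while "&& 1" yields "(TARGET_P9_MINMAX) && (1)", i.e. the splitter inherits the insn's guard.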

Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.

Since this is very low risk and can avoid some unexpected errors,
is it OK for trunk, or does it have to wait for GCC 12?

BR,
Kewen
--
gcc/ChangeLog:

* config/rs6000/rs6000.md (*rotldi3_insert_sf,
*movcc_p9, floatsi2_lfiwax,
floatsi2_lfiwax_mem, floatunssi2_lfiwzx,
floatunssi2_lfiwzx_mem, *floatsidf2_internal,
*floatunssidf2_internal, fix_truncsi2_stfiwx,
fix_truncsi2_internal, fixuns_truncsi2_stfiwx,
*round322_fprs, *roundu322_fprs,
*fix_truncsi2_internal): Fix empty split condition.
* config/rs6000/vsx.md (*vsx_le_undo_permute_,
vsx_reduc__v2df, vsx_reduc__v4sf,
*vsx_reduc__v2df_scalar,
*vsx_reduc__v4sf_scalar): Likewise.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 592ac90aa44..6ab71979566 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4281,7 +4281,7 @@
   (clobber (match_scratch:V4SF 4))]
   "TARGET_POWERPC64 && INTVAL (operands[2]) == "
   "#"
-  ""
+  "&& 1"
   [(parallel [(set (match_dup 5)
   (zero_extend:DI (unspec:QHSI [(match_dup 3)] UNSPEC_SI_FROM_SF)))
 (clobber (match_dup 4))])
@@ -5327,7 +5327,7 @@
(clobber (match_scratch:V2DI 6 "=0,&wa"))]
   "TARGET_P9_MINMAX"
   "#"
-  ""
+  "&& 1"
   [(set (match_dup 6)
(if_then_else:V2DI (match_dup 1)
   (match_dup 7)
@@ -5436,7 +5436,7 @@
   "TARGET_HARD_FLOAT && TARGET_LFIWAX
&&  && can_create_pseudo_p ()"
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx dest = operands[0];
@@ -5476,7 +5476,7 @@
(clobber (match_scratch:DI 2 "=d,wa"))]
   "TARGET_HARD_FLOAT && TARGET_LFIWAX && "
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   operands[1] = rs6000_force_indexed_or_indirect_mem (operands[1]);
@@ -5533,7 +5533,7 @@
(clobber (match_scratch:DI 2 "=d,wa"))]
   "TARGET_HARD_FLOAT && TARGET_LFIWZX && "
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx dest = operands[0];
@@ -5573,7 +5573,7 @@
(clobber (match_scratch:DI 2 "=d,wa"))]
   "TARGET_HARD_FLOAT && TARGET_LFIWZX && "
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   operands[1] = rs6000_force_indexed_or_indirect_mem (operands[1]);
@@ -5638,7 +5638,7 @@
(clobber (match_operand:SI 6 "gpc_reg_operand" "=&r"))]
   "!TARGET_FCFID && TARGET_HARD_FLOAT"
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx lowword, highword;
@@ -5728,7 +5728,7 @@
   "!TARGET_FCFIDU && TARGET_HARD_FLOAT
&& !(TARGET_FCFID && TARGET_POWERPC64)"
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx lowword, highword;
@@ -5884,7 +5884,7 @@
   "TARGET_HARD_FLOAT && TARGET_STFIWX && can_create_pseudo_p ()
&& !(TARGET_P8_VECTOR && TARGET_DIRECT_MOVE)"
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx dest = operands[0];
@@ -5926,7 +5926,7 @@
   "TARGET_HARD_FLOAT
&& !(TARGET_P8_VECTOR && TARGET_DIRECT_MOVE)"
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx lowword;
@@ -6032,7 +6032,7 @@
&& TARGET_STFIWX && can_create_pseudo_p ()
&& !TARGET_P8_VECTOR"
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx dest = operands[0];
@@ -6252,7 +6252,7 @@
&&  && TARGET_LFIWAX && TARGET_STFIWX && TARGET_FCFID
&& !TARGET_DIRECT_MOVE && can_create_pseudo_p ()"
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx dest = operands[0];
@@ -6285,7 +6285,7 @@
&& TARGET_LFIWZX && TARGET_STFIWX && TARGET_FCFIDU && !TARGET_DIRECT_MOVE
&& can_create_pseudo_p ()"
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx dest = operands[0];
@@ -8268,7 +8268,7 @@
(clobber (match_operand:DI 5 "offsettable_mem_operand" "=o"))]
   "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128"
   "#"
-  ""
+  "&& 1"
   [(pc)]
 {
   rtx lowword;
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index ad673968584..d1053ff6746 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -972,7 +972,7 @@
   "@
#
xxlor %x0,%x1"
-  ""
+  "&& 1"
   [(set (match_dup 0) (match_dup 1))]
 {
   if (reload_completed && REGNO (operands[0]) == REGNO (operands[1]))
@@ -4649,7 +4649,7 @@
(clobber (match_scratch:V2DF 2 "=0,&wa"))]
   "VECTOR_UNIT_VSX_P (V2DFmode)"
   "#"
-  ""
+  "&& 1"
   [(const_int 0)]
 {
   rtx tmp = (GET_CODE (operands[2]) == SCRATCH)
@@ -4671,7 +4671,7 @@
(clobber (match_scratch:V4SF 3 "=&wa"))]
   "VECTOR_UNIT_VSX_P (V4SFmode)"
   "#"
-  ""
+  "&& 1"
   [(const_int 0)]
 {
   rtx op0 = operands[0];
@@ -4719,7 +4719,7 @@
(clobber (match_scratch:DF 2 "=0,&wa"))]
   "BYTES_BIG_ENDIAN && VECTOR_UNIT_VSX_P (V2DFmode)"
   "#"
-  ""
+  "&& 1"
   [(const_int 0)]
 {
   rtx hi = gen_highpart (DFmode, operands[1]);
@@ -4746,7 +4746,7 @@
(clobber (match_scratch:V4SF 4 "=0"))]
   "BYTES_BIG_ENDIAN && VECTOR_UNIT_VSX_P (V4SFmode)"
   "#"
-  ""
+  "&& 1"
   [

Re: [patch] substitute @tie{} with a space for the man pages

2021-03-18 Thread Jeff Law via Gcc-patches




On 3/10/2021 5:21 AM, Matthias Klose wrote:

The gcc man page currently contains untranslated @tie{} patterns.
Just replace these with a space.  Ok for the trunk and branches?
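
As an illustration of what the substitution does to a line of Texinfo input, here is a rough C++ equivalent of the Perl one-liner in the patch below (`replace_tie` is a made-up helper name; the actual fix lives in contrib/texi2pod.pl):

```cpp
#include <regex>
#include <string>

// Replace every literal "@tie{}" with a single space, mirroring the
// Perl substitution s/\@tie\{\}/ /g.
std::string replace_tie(const std::string& line) {
  static const std::regex tie(R"(@tie\{\})");
  return std::regex_replace(line, tie, " ");
}
```

Lines without the pattern pass through unchanged.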

Matthias

--- a/contrib/texi2pod.pl
+++ b/contrib/texi2pod.pl
@@ -210,6 +210,7 @@ while(<$inf>) {
  s/\@TeX\{\}/TeX/g;
  s/\@pounds\{\}/\#/g;
  s/\@minus(?:\{\})?/-/g;
+s/\@tie\{\}/ /g;
  s/\\,/,/g;

  # Now the ones that have to be replaced by special escapes


OK

jeff

fix ssse3_pshufbv8qi3 post-reload const pool load

2021-03-18 Thread Alexandre Oliva via Gcc-patches



The split in ssse3_pshufbv8qi3 forces a const vector into the constant
pool, and loads from it.  That runs after reload, so if the load
requires any reloading, we're out of luck.  Indeed, if the load
address is not legitimate, e.g. -mcmodel=large, the insn is no longer
recognized.

This patch turns the constant into an input operand, introduces an
expander to generate the constant unconditionally, and arranges for
this input operand to be retained as an unused immediate in the
alternatives that don't undergo splitting, and for it to be loaded
into the scratch register for those that do.

It is now the register allocator that arranges to load the const
vector into a register, so it deals with whatever legitimizing steps
are needed for the target configuration.
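
The message does not spell out why the constant is 0xf7f7f7f7. My reading, based on the instruction semantics rather than on anything stated here, is that the 8-byte pshufb selects with index bits 2:0 while the 16-byte form selects with bits 3:0, so bit 3 of every index byte has to be cleared (and bit 7, the "zero this byte" flag, preserved) before the wide instruction is used. A scalar model, with all function names invented for illustration:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V8 = std::array<std::uint8_t, 8>;
using V16 = std::array<std::uint8_t, 16>;

// 64-bit pshufb: bit 7 of the index zeroes the byte, bits 2:0 select.
V8 pshufb8(const V8& src, const V8& idx) {
  V8 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = (idx[i] & 0x80) ? 0 : src[idx[i] & 0x07];
  return out;
}

// 128-bit pshufb: bit 7 of the index zeroes the byte, bits 3:0 select.
V16 pshufb16(const V16& src, const V16& idx) {
  V16 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = (idx[i] & 0x80) ? 0 : src[idx[i] & 0x0f];
  return out;
}

// Emulate the 8-byte shuffle with the 16-byte one by and-ing every index
// byte with 0xf7, which clears bit 3 and keeps bit 7.
V8 pshufb8_via_16(const V8& src, const V8& idx) {
  V16 wide_src;
  wide_src.fill(0xEE);  // Upper half holds junk the result must not pick up.
  V16 wide_idx{};
  for (std::size_t i = 0; i < src.size(); ++i) {
    wide_src[i] = src[i];
    wide_idx[i] = idx[i] & 0xf7;
  }
  const V16 wide = pshufb16(wide_src, wide_idx);
  V8 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = wide[i];
  return out;
}

// Check the emulation against the reference for every possible index byte.
bool emulation_matches_for_all_indices() {
  const V8 src{10, 20, 30, 40, 50, 60, 70, 80};
  for (int v = 0; v < 256; ++v) {
    V8 idx;
    idx.fill(static_cast<std::uint8_t>(v));
    if (pshufb8(src, idx) != pshufb8_via_16(src, idx))
      return false;
  }
  return true;
}
```

Without the mask, an index such as 0x0b would read byte 11 of the wide register, which lies outside the 8-byte source.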

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* config/i386/predicates.md (register_or_const_vector_operand):
New.
* config/i386/sse.md (ssse3_pshufbv8qi3): Add an expander for
the now *-prefixed insn_and_split, turn the splitter const vec
into an input for the insn, making it an ignored immediate for
non-split cases, and loaded into the scratch register
otherwise.
---
 gcc/config/i386/predicates.md |6 ++
 gcc/config/i386/sse.md|   26 +++---
 2 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index b6dd5e9d3b243..f1da005c95cf3 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1153,6 +1153,12 @@ (define_predicate "nonimmediate_or_const_vector_operand"
   (ior (match_operand 0 "nonimmediate_operand")
(match_code "const_vector")))
 
+;; Return true when OP is either register operand, or any
+;; CONST_VECTOR.
+(define_predicate "register_or_const_vector_operand"
+  (ior (match_operand 0 "register_operand")
+   (match_code "const_vector")))
+
 ;; Return true when OP is nonimmediate or standard SSE constant.
 (define_predicate "nonimmediate_or_sse_const_operand"
   (ior (match_operand 0 "nonimmediate_operand")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 43e4d57ec6a3d..b693864e62d1b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -17159,10 +17159,26 @@ (define_insn "_pshufb3"
(set_attr "btver2_decode" "vector")
(set_attr "mode" "")])
 
-(define_insn_and_split "ssse3_pshufbv8qi3"
+(define_expand "ssse3_pshufbv8qi3"
+  [(parallel
+[(set (match_operand:V8QI 0 "register_operand" "=")
+ (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "")
+   (match_operand:V8QI 2 "register_mmxmem_operand" "")
+   (const_vector:V4SI [(match_dup 3) (match_dup 3)
+   (match_dup 3) (match_dup 3)])]
+  UNSPEC_PSHUFB))
+ (clobber (match_scratch:V4SI 4 "="))])]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
+{
+  operands[3] = gen_int_mode (0xf7f7f7f7, SImode);
+})
+
+(define_insn_and_split "*ssse3_pshufbv8qi3"
   [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv")
(unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv")
- (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")]
+ (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")
+ (match_operand:V4SI 4 "register_or_const_vector_operand"
+ "i,3,3")]
 UNSPEC_PSHUFB))
(clobber (match_scratch:V4SI 3 "=X,&x,&Yv"))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
@@ -17172,8 +17188,7 @@ (define_insn_and_split "ssse3_pshufbv8qi3"
#"
   "TARGET_SSSE3 && reload_completed
&& SSE_REGNO_P (REGNO (operands[0]))"
-  [(set (match_dup 3) (match_dup 5))
-   (set (match_dup 3)
+  [(set (match_dup 3)
(and:V4SI (match_dup 3) (match_dup 2)))
(set (match_dup 0)
(unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))]
@@ -17188,9 +17203,6 @@ (define_insn_and_split "ssse3_pshufbv8qi3"
GET_MODE (operands[2]));
   operands[4] = lowpart_subreg (V16QImode, operands[3],
GET_MODE (operands[3]));
-  rtx vec_const = ix86_build_const_vector (V4SImode, true,
-  gen_int_mode (0xf7f7f7f7, SImode));
-  operands[5] = force_const_mem (V4SImode, vec_const);
 }
   [(set_attr "mmx_isa" "native,sse_noavx,avx")
(set_attr "prefix_extra" "1")


-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar
