Re: [PATCH] handle MEM_REF with void* arguments (PR c++/95768)

2020-06-23 Thread Richard Biener via Gcc-patches
On Tue, Jun 23, 2020 at 12:22 AM Martin Sebor via Gcc-patches
 wrote:
>
> On 6/22/20 12:55 PM, Jason Merrill wrote:
> > On 6/22/20 1:25 PM, Martin Sebor wrote:
> >> The attached fix parallels the one for the equivalent C bug 95580
> >> where the pretty printers don't correctly handle MEM_REF arguments
> >> with type void* or other pointers to an incomplete type.
> >>
> >> The incorrect handling was exposed by the recent change to
> >> -Wuninitialized which includes such expressions in diagnostics.
> >
> >> +if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
> >> +  if (!integer_onep (size))
> >> +{
> >> +  pp_cxx_left_paren (pp);
> >> +  dump_type (pp, ptr_type_node, flags);
> >> +  pp_cxx_right_paren (pp);
> >> +}
> >
> > Don't we want to print the cast if the pointer target type is incomplete?
>
> I suppose, yes, although after some more testing I think what should
> be output is the type of the access.  The target pointer type isn't
> meaningful (at least not in this case).
>
> Here's what the warning looks like in C for the test case in
> gcc.dg/pr95580.c:
>
>     warning: ‘*((void *)(p)+1)’ may be used uninitialized
>
> and like this in C++:
>
>     warning: ‘*(p +1)’ may be used uninitialized
>
> The +1 is a byte offset, which is correct given that incrementing
> a void* in GCC is the same as adding 1 to the byte address, but
> dereferencing a void* doesn't correspond to what's going on in
> the source.
>
> Even for a complete type (with size greater than 1), printing
> the type of the argument plus a byte offset is wrong.  It ends
> up with this for the C++ test case from 95768:
>
>     warning: ‘*((int*) +4)’ is used uninitialized
>
> when the access is actually ‘*((int*) +1)’
>
> So it seems to me for MEM_REF, to make the output meaningful,
> it's the type of the access (i.e., the MEM_REF type) that should
> be printed here, and the offset should either be in elements of
> the accessed type, i.e.,
>
>     warning: ‘*((int*) +1)’ is used uninitialized
>
> or, if the access is misaligned, the argument should first be
> cast to char*, the offset added, and the result then cast to
> the access type, like this:
>
>     warning: ‘*(T*)((char*) +1)’ is used uninitialized
>
> The attached revised and less than fully tested patch implements
> this for C++ only for now.  If we agree on this approach I'll see
> about making the corresponding change in C.

Note that there is no C/C++ way of fully expressing MEM_REF
semantics.  __MEM  ((T *)p + 1) is not actually
*(int *)((char *)p + 1) because that does not reflect that the
effective type of the lvalue when TBAA is concerned is 'T'
rather than 'int'.  Note for MEM_REF the offset is always
a constant byte offset but it indeed does not have to be a
multiple of the MEM_REF type size.
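
To make the two output forms under discussion concrete, here is a minimal sketch of the formatting rule Martin proposes: count the offset in elements when the byte offset is a multiple of the access size, and go through char* otherwise. The function format_access and its parameters are hypothetical and are not part of GCC's pretty-printers. As noted above, neither spelling captures the TBAA effective type of a real MEM_REF.

```cpp
#include <string>

// Hypothetical helper (not GCC code): render an access of type ACCESS_TYPE,
// ACCESS_SIZE bytes wide, at a constant BYTE_OFFSET from pointer PTR.
std::string format_access(const std::string& access_type,
                          unsigned long access_size,
                          unsigned long byte_offset,
                          const std::string& ptr = "p")
{
  if (byte_offset == 0)
    return "*(" + access_type + "*)" + ptr;

  if (access_size != 0 && byte_offset % access_size == 0)
    // The offset is a whole number of elements: print it in elements.
    return "*((" + access_type + "*)" + ptr + " + "
           + std::to_string(byte_offset / access_size) + ")";

  // Misaligned access: keep the offset in bytes by going through char*.
  return "*(" + access_type + "*)((char*)" + ptr + " + "
         + std::to_string(byte_offset) + ")";
}
```

With a 4-byte int, a byte offset of 4 prints as an element offset of 1, while a byte offset of 1 falls back to the char* form.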

I wonder whether printing the MEM_REF in full provides
any real diagnostic value in the more "obfuscated" cases.

I'd also not print  but .

Richard.

> Martin


[committed] libstdc++: Regenerate makefiles

2020-06-23 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* doc/Makefile.in: Regenerate.
* include/Makefile.in: Regenerate.
* libsupc++/Makefile.in: Regenerate.
* po/Makefile.in: Regenerate.
* python/Makefile.in: Regenerate.
* src/Makefile.in: Regenerate.
* src/c++11/Makefile.in: Regenerate.
* src/c++17/Makefile.in: Regenerate.
* src/c++98/Makefile.in: Regenerate.
* src/filesystem/Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.

Tested x86_64-linux, committed to master.

(No patch, because it's just mechanical changes to generated files).




Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-23 Thread Richard Biener
On Mon, 22 Jun 2020, Alexandre Oliva wrote:

> On Jun 22, 2020, Tobias Burnus  wrote:
> 
> > On 6/22/20 8:08 AM, Alexandre Oliva wrote:
> >>> I additionally did run the test case manually → files.log for the
> >>> produced files.
> >> This is with -save-temps, right?
> 
> > Yes. Without, there are no files left under /tmp and only
> >   nvptx-merged-loop.xnvptx-none.mkoffload.309r.mach
> >   nvptx-merged-loop.exe
> > in the current directory.
> > (As in the testsuite, -foffload=-fdump-rtl-mach was used.)
> 
> >> Interesting, in my test run (native only) I didn't trigger that problem.
> >> +++ b/gcc/testsuite/lib/scanoffload.exp
> >> +if [info set offload_target] {
> >> The 'set' above should be 'exists'.
> 
> > UNSUPPORTED:
> > libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c
> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0
> > PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c
> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2
> > (test for excess errors)
> > PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c
> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2
> > execution test
> > PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c
> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2
> > scan-offload-rtl-dump mach "Merging loop .* into "
> > UNSUPPORTED:
> > libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c
> > -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2
> 
> > Hence, it looks fine now – given:
> 
> Yay, thanks.
> 
> Here's a consolidated patch, that I've tested myself on x86_64-linux-gnu
> (before consolidation, retesting now) with a toolchain build with all of
> the offload targets built and enabled, but without cuda or ROCm runtimes
> or hardware, only the intelmicemul runtime provided by liboffloadmic.
> 
> Ok to install?

OK.

Thanks,
Richard.

> 
> handle dumpbase in offloading, adjust testsuite
> 
> From: Alexandre Oliva 
> 
> Pass dumpbase on to mkoffloads and their offload-target compiler runs,
> using different suffixes for different offloading targets.
> Obey -save-temps in naming temporary files while at that.
> 
> Adjust the testsuite offload dump scanning machinery to look for dump
> files named under the new conventions, iterating internally over all
> configured offload targets, or recognizing libgomp's testsuite's own
> iteration.
> 
> 
> for  gcc/ChangeLog
> 
>   * collect-utils.h (dumppfx): New.
>   * collect-utils.c (dumppfx): Likewise.
>   * lto-wrapper.c (run_gcc): Set global dumppfx.
>   (compile_offload_image): Pass a -dumpbase on to mkoffload.
>   * config/nvptx/mkoffload.c (ptx_dumpbase): New.
>   (main): Handle incoming -dumpbase.  Set ptx_dumpbase.  Obey
>   save_temps.
>   (compile_native): Pass -dumpbase et al to compiler.
>   * config/gcn/mkoffload.c (gcn_dumpbase): New.
>   (main): Handle incoming -dumpbase.  Set gcn_dumpbase.  Obey
>   save_temps.  Pass -dumpbase et al to offload target compiler.
>   (compile_native): Pass -dumpbase et al to compiler.
> 
> for  gcc/testsuite/ChangeLog
> 
>   * lib/scanoffload.exp: New.
>   * lib/scanoffloadrtl.exp: Load it.  Replace ".o" with ""
>   globally, and use scanoffload's scoff wrapper to fill it in.
>   * lib/scanoffloadtree.exp: Likewise.
> 
> for libgomp/testsuite/ChangeLog
> 
>   * lib/libgomp.exp: Load gcc lib scanoffload.exp.
>   * lib/libgomp-dg.exp: Drop now-obsolete -save-temps.
> ---
>  gcc/collect-utils.c   |1 +
>  gcc/collect-utils.h   |1 +
>  gcc/config/gcn/mkoffload.c|   51 ++---
>  gcc/config/nvptx/mkoffload.c  |   31 +++-
>  gcc/lto-wrapper.c |   13 +++-
>  gcc/testsuite/lib/scanoffload.exp |   45 +
>  gcc/testsuite/lib/scanoffloadrtl.exp  |   49 
>  gcc/testsuite/lib/scanoffloadtree.exp |   51 +
>  libgomp/testsuite/lib/libgomp-dg.exp  |8 -
>  libgomp/testsuite/lib/libgomp.exp |1 +
>  10 files changed, 185 insertions(+), 66 deletions(-)
>  create mode 100644 gcc/testsuite/lib/scanoffload.exp
> 
> diff --git a/gcc/collect-utils.c b/gcc/collect-utils.c
> index e85843bc..d4fa2c3 100644
> --- a/gcc/collect-utils.c
> +++ b/gcc/collect-utils.c
> @@ -34,6 +34,7 @@ static char *response_file;
>  bool debug;
>  bool verbose;
>  bool save_temps;
> +const char *dumppfx;
>  
>  
>  /* Notify user of a non-error.  */
> diff --git a/gcc/collect-utils.h b/gcc/collect-utils.h
> index e7c955f..6ff7d9d9 100644
> --- a/gcc/collect-utils.h
> +++ b/gcc/collect-utils.h
> @@ -37,6 +37,7 @@ extern void utils_cleanup (bool);
>  extern bool debug;
>  extern bool verbose;
>  extern bool save_temps;
> +extern const char *du

[committed] libstdc++: Implement P1972R2 changes to std::variant (PR 95832)

2020-06-23 Thread Jonathan Wakely via Gcc-patches
G++ implements P1972R2 since r11-1597-0ca22d027ecc and so we no longer
need the P0608R3 special case to prevent narrowing conversions to bool.

Since non-GNU compilers don't necessarily implement P1972R2 yet, this
may cause a regression for those compilers. There is no feature-test
macro we can use to detect it though, so we'll have to live with it.

libstdc++-v3/ChangeLog:

PR libstdc++/95832
* include/std/variant (__detail::__variant::_Build_FUN): Remove
partial specialization to prevent narrowing conversions to bool.
* testsuite/20_util/variant/compile.cc: Test non-narrowing
conversions to bool.
* testsuite/20_util/variant/run.cc: Likewise.

Tested powerpc64le-linux, committed to master.

Not possible to backport to gcc-10, because the front end support
isn't there. That unfortunately means std::variant construction works
differently in each of gcc-9, gcc-10 and master.
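
The diff below relies on an array-wrapper trick to ask whether a conversion is free of narrowing. A minimal standalone sketch of that idea follows. The names Arr and converts_without_narrowing are illustrative stand-ins, not the library's internal names, and the checks shown are limited to cases whose result should not depend on whether the compiler implements the pointer-to-bool narrowing change.

```cpp
#include <type_traits>
#include <utility>

// Wrapping Ti in a one-element array forces list-initialization of the
// element, which rejects narrowing conversions.
template<typename Ti> struct Arr { Ti x[1]; };

// Primary template: assume the conversion is not allowed.
template<typename Ti, typename T, typename = void>
struct converts_without_narrowing : std::false_type {};

// Selected only when Arr<Ti>{{ T }} is well-formed, i.e. T converts to Ti
// without narrowing.
template<typename Ti, typename T>
struct converts_without_narrowing<Ti, T,
    std::void_t<decltype(Arr<Ti>{{std::declval<T>()}})>>
  : std::true_type {};

// Same shape as the type added to the tests in this commit.
struct ConvertibleToBool
{
  operator bool() const { return true; }
};

static_assert(converts_without_narrowing<long long, int>::value,
              "widening an integer is not narrowing");
static_assert(!converts_without_narrowing<int, double>::value,
              "double to int is narrowing");
static_assert(converts_without_narrowing<bool, ConvertibleToBool>::value,
              "a user-defined conversion to bool is not narrowing");
```

The last assertion mirrors the new test: a class with operator bool converts to bool without narrowing, so no bool-specific special case is needed to accept it.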

commit c98fc4eb3afeda6ad8220d0d79bc1247a92c7c65
Author: Jonathan Wakely 
Date:   Tue Jun 23 10:24:49 2020 +0100

libstdc++: Implement P1972R2 changes to std::variant (PR 95832)

G++ implements P1972R2 since r11-1597-0ca22d027ecc and so we no longer
need the P0608R3 special case to prevent narrowing conversions to bool.

Since non-GNU compilers don't necessarily implement P1972R2 yet, this
may cause a regression for those compilers. There is no feature-test
macro we can use to detect it though, so we'll have to live with it.

libstdc++-v3/ChangeLog:

PR libstdc++/95832
* include/std/variant (__detail::__variant::_Build_FUN): Remove
partial specialization to prevent narrowing conversions to bool.
* testsuite/20_util/variant/compile.cc: Test non-narrowing
conversions to bool.
* testsuite/20_util/variant/run.cc: Likewise.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 258a5fb18bd..6eeb3c80ec2 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -164,7 +164,7 @@ namespace __detail
 {
 namespace __variant
 {
-  // Returns the first appearence of _Tp in _Types.
+  // Returns the first appearance of _Tp in _Types.
   // Returns sizeof...(_Types) if _Tp is not in _Types.
   template
 struct __index_of : std::integral_constant {};
@@ -733,10 +733,8 @@ namespace __variant
   // Helper used to check for valid conversions that don't involve narrowing.
   template struct _Arr { _Ti _M_x[1]; };
 
-  // Build an imaginary function FUN(Ti) for each alternative type Ti
-  template, bool>,
-  typename = void>
+  // "Build an imaginary function FUN(Ti) for each alternative type Ti"
+  template
 struct _Build_FUN
 {
   // This function means 'using _Build_FUN::_S_fun;' is valid,
@@ -744,24 +742,15 @@ namespace __variant
   void _S_fun();
 };
 
-  // ... for which Ti x[] = {std::forward(t)}; is well-formed,
+  // "... for which Ti x[] = {std::forward(t)}; is well-formed."
   template
-struct _Build_FUN<_Ind, _Tp, _Ti, false,
+struct _Build_FUN<_Ind, _Tp, _Ti,
  void_t{{std::declval<_Tp>()}})>>
 {
   // This is the FUN function for type _Ti, with index _Ind
   static integral_constant _S_fun(_Ti);
 };
 
-  // ... and if Ti is cv bool, remove_cvref_t is bool.
-  template
-struct _Build_FUN<_Ind, _Tp, _Ti, true,
- enable_if_t, bool>>>
-{
-  // This is the FUN function for when _Ti is cv bool, with index _Ind
-  static integral_constant _S_fun(_Ti);
-};
-
   template>>
 struct _Build_FUNs;
diff --git a/libstdc++-v3/testsuite/20_util/variant/compile.cc b/libstdc++-v3/testsuite/20_util/variant/compile.cc
index a53071c8867..b2b60d1cf10 100644
--- a/libstdc++-v3/testsuite/20_util/variant/compile.cc
+++ b/libstdc++-v3/testsuite/20_util/variant/compile.cc
@@ -155,6 +155,14 @@ void arbitrary_ctor()
   static_assert(!is_constructible_v, unsigned>);
   static_assert(!is_constructible_v, int>);
   static_assert(!is_constructible_v, void*>);
+
+  // P1957R2 Converting from T* to bool should be considered narrowing
+  struct ConvertibleToBool
+  {
+operator bool() const { return true; }
+  };
+  static_assert(is_constructible_v, ConvertibleToBool>);
+  static_assert(is_constructible_v, ConvertibleToBool>);
 }
 
 struct none { none() = delete; };
diff --git a/libstdc++-v3/testsuite/20_util/variant/run.cc b/libstdc++-v3/testsuite/20_util/variant/run.cc
index 3e2228a4666..0ac5de25289 100644
--- a/libstdc++-v3/testsuite/20_util/variant/run.cc
+++ b/libstdc++-v3/testsuite/20_util/variant/run.cc
@@ -139,6 +139,20 @@ void arbitrary_ctor()
 variant v3 = 0;
 VERIFY(v3.index() == 1);
   }
+
+  {
+// P1957R2 Converting from T* to bool should be considered narrowing
+struct ConvertibleToBool
+{
+  operator bool() const { return true; }
+};
+variant v1 = ConvertibleToBool();
+VERIFY(std::get<0>

Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-23 Thread Alexandre Oliva
Hello, Thomas,

On Jun  9, 2020, Thomas Schwinge  wrote:

> We're trying to scan 'variables.hsail.brig.*', but for input file name
> 'variables.hsail.brig', we're now creating:

I understand this was fixed by Martin Jambor's last week's patch for
brig.exp; can you please confirm whether the problem is fixed?

-- 
Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically


Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-23 Thread Alexandre Oliva
On Jun  9, 2020, Thomas Schwinge  wrote:

> Previously, for '-foffload=nvptx-none -foffload=-fdump-rtl-mach
> -save-temps -o ./nvptx-merged-loop.exe', GCC produced the expected
> 'nvptx-merged-loop.o.307r.mach'.

I believe the patch I've just installed fixes the UNRESOLVED results
caused by not finding dump files.

> Consider 'libgomp.oacc-c-c++-common/pr85381-2.c':

> /* { dg-additional-options "-save-temps" } */

> /* { dg-final { scan-assembler-times "bar.sync" 2 } } */

> This expects to scan the PTX offloading compilation assembler code (not
> host code!), expecting that nvptx offloading code assembly is produced
> after the host code, and thus overwrites the latter file.  (Yes, that's
> certainly ugly/fragile...)

I'm afraid this will need further adjusting in the testsuite, as we'll
store the nvptx asm saved aux output in a separate file.
scan-assembler-times will no longer work for this purpose, we'll need
something that knows how to find the offloaded asm.

-- 
Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically


Re: [PATCH 1/7 v5] ifn/optabs: Support vector load/store with length

2020-06-23 Thread Richard Sandiford
Things have moved on due to the IRC conversation, but…

"Kewen.Lin"  writes:
> on 2020/6/23 3:59 AM, Richard Sandiford wrote:
>> "Kewen.Lin"  writes:
>>> @@ -5167,6 +5167,24 @@ mode @var{n}.
>>>  
>>>  This pattern is not allowed to @code{FAIL}.
>>>  
>>> +@cindex @code{lenload@var{m}} instruction pattern
>>> +@item @samp{lenload@var{m}}
>>> +Perform a vector load with length from memory operand 1 of mode @var{m}
>>> +into register operand 0.  Length is provided in register operand 2 with
>>> +appropriate mode which should afford the maximal required precision of
>>> +any available lengths.
>> 
>> I think we need to say in more detail what “load with length” actually
>> means.  How about:
>> 
>>   Load the number of bytes specified by operand 2 from memory operand 1
>>   into register operand 0, setting the other bytes of operand 0 to
>>   undefined values.  Operands 0 and 1 have mode @var{m}.  Operand 2 has
>>   whichever integer mode the target prefers.
>> 
>
> Thanks for nice wordings!  Updated, for "... to undefined values" I changed it
> to "... to undefined values or zeros" as Segher's comments to match the behavior
> on Power.

“set … to undefined values” means that the values are not defined by
the optab interface.  In other words, the target can set the bytes
to whatever it wants, and gimple code can't make any assumptions about
what the values of the bytes are.

So setting the bytes to zero (as Power does) would conform to the
interface.  So would leaving the bytes in operand 0 untouched.
So would using an instruction that really does leave the other
bytes with undefined values, etc.

So I think we should keep it as just “… to undefined values”.

The alternative would be to define the interface so that targets *must*
ensure that the other bytes are zeros.  But at the moment, the only
intended use of the optabs and ifns is for autovectorisation, and the
vectoriser won't care about the values of “inactive” bytes/lanes.
Forcing the target to set them to a specific value like zero would be
unnecessarily restrictive.
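
As a rough illustration of the point about conformance, the sketch below models two hypothetical targets in plain C++: one zeroes the bytes beyond the length and one leaves the old register contents in place. All names here (Vec, len_load_zeroing, len_load_keep, sum_active) are invented for the example, and the code assumes the length never exceeds the vector size. A consumer that reads only the first len bytes cannot tell the two apart, which is what “undefined values” permits.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

constexpr std::size_t kVecBytes = 16;            // assumed vector size
using Vec = std::array<unsigned char, kVecBytes>;

// Hypothetical target A: bytes beyond LEN are set to zero.
Vec len_load_zeroing(const unsigned char* mem, std::size_t len)
{
  Vec v{};                                       // all bytes start as zero
  std::memcpy(v.data(), mem, len);               // requires len <= kVecBytes
  return v;
}

// Hypothetical target B: bytes beyond LEN keep the old register contents.
Vec len_load_keep(const unsigned char* mem, std::size_t len, Vec old)
{
  std::memcpy(old.data(), mem, len);             // requires len <= kVecBytes
  return old;
}

// A consumer that respects the interface: it reads only the active bytes.
unsigned sum_active(const Vec& v, std::size_t len)
{
  unsigned sum = 0;
  for (std::size_t i = 0; i < len; ++i)
    sum += v[i];
  return sum;
}
```

Both loads give the same sum over the active bytes even though the full vectors differ, so either behaviour would satisfy an interface that leaves the remaining bytes undefined.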

Thanks,
Richard


[committed] libstdc++: Adjust std::from_chars negative tests

2020-06-23 Thread Jonathan Wakely via Gcc-patches
Also test with an enumeration type. Move the dg-error directives outside
the #if block, because DejaGnu would process them whether or not wchar_t
support is present.

libstdc++-v3/ChangeLog:

* testsuite/20_util/from_chars/1_c++20_neg.cc: Check enumeration
type.
* testsuite/20_util/from_chars/1_neg.cc: Likewise. Move dg-error
directives outside preprocessor condition.

Tested x86_64-linux, committed to master.
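
For readers who want to see the property these negative tests check without running DejaGnu, here is a small compile-time sketch. The trait name from_chars_accepts is made up for this example. It assumes the integer overloads of std::from_chars drop out of overload resolution cleanly for unsupported types, which is what the "no matching" diagnostics in the tests suggest for libstdc++.

```cpp
#include <charconv>
#include <type_traits>
#include <utility>

// Illustrative trait: true when std::from_chars(first, last, T&) is callable.
template<typename T, typename = void>
struct from_chars_accepts : std::false_type {};

template<typename T>
struct from_chars_accepts<T,
    std::void_t<decltype(std::from_chars(std::declval<const char*>(),
                                         std::declval<const char*>(),
                                         std::declval<T&>()))>>
  : std::true_type {};

// An empty unscoped enumeration, as in the new test lines.
enum E { };

static_assert(from_chars_accepts<int>::value,
              "int is a supported integer type");
static_assert(!from_chars_accepts<E>::value,
              "enumeration types are rejected");
```

Unlike the dg-error checks, this form does not depend on the wording of the diagnostic, only on whether the call is viable.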

commit b81d4f1e3d6a519afa20d84304f12ee639fde944
Author: Jonathan Wakely 
Date:   Tue Jun 23 12:20:26 2020 +0100

libstdc++: Adjust std::from_chars negative tests

Also test with an enumeration type. Move the dg-error directives outside
the #if block, because DejaGnu would process them whether or not wchar_t
support is present.

libstdc++-v3/ChangeLog:

* testsuite/20_util/from_chars/1_c++20_neg.cc: Check enumeration
type.
* testsuite/20_util/from_chars/1_neg.cc: Likewise. Move dg-error
directives outside preprocessor condition.

diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc b/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
index 8a73b2de72d..8454b304d13 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1_c++20_neg.cc
@@ -35,6 +35,9 @@ test01(const char* first, const char* last)
   char32_t c32;
   std::from_chars(first, last, c32); // { dg-error "no matching" }
   std::from_chars(first, last, c32, 10); // { dg-error "no matching" }
+  enum E { } e;
+  std::from_chars(first, last, e); // { dg-error "no matching" }
+  std::from_chars(first, last, e, 10); // { dg-error "no matching" }
 }
 
 // { dg-prune-output "enable_if" }
diff --git a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
index 3f46b7e7b95..12b5e597b9f 100644
--- a/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/from_chars/1_neg.cc
@@ -25,9 +25,11 @@ test01(const char* first, const char* last)
 {
 #if _GLIBCXX_USE_WCHAR_T
   wchar_t wc;
+#else
+  enum W { } wc;
+#endif
   std::from_chars(first, last, wc); // { dg-error "no matching" }
   std::from_chars(first, last, wc, 10); // { dg-error "no matching" }
-#endif
 
   char16_t c16;
   std::from_chars(first, last, c16); // { dg-error "no matching" }
@@ -35,6 +37,10 @@ test01(const char* first, const char* last)
   char32_t c32;
   std::from_chars(first, last, c32); // { dg-error "no matching" }
   std::from_chars(first, last, c32, 10); // { dg-error "no matching" }
+
+  enum E { } e;
+  std::from_chars(first, last, e); // { dg-error "no matching" }
+  std::from_chars(first, last, e, 10); // { dg-error "no matching" }
 }
 
 // { dg-prune-output "enable_if" }


Re: [pushed] c++: -fsanitize=vptr and -fstrong-eval-order. [PR95221]

2020-06-23 Thread Thomas Schwinge
Hi Jason!

On 2020-05-22T17:03:01-0400, Jason Merrill via Gcc-patches wrote:
> [...]
>
> This issue suggests that we should be running the ubsan tests in multiple
> standard modes like the rest of the G++ testsuite, so I've made that change
> as well.

> --- a/gcc/testsuite/g++.dg/ubsan/ubsan.exp
> +++ b/gcc/testsuite/g++.dg/ubsan/ubsan.exp
> @@ -26,7 +26,7 @@ ubsan_init
>
>  # Main loop.
>  if [check_effective_target_fsanitize_undefined] {
> -  gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C $srcdir/c-c++-common/ubsan/*.c]] "" ""
> +  g++-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C $srcdir/c-c++-common/ubsan/*.c]] "" ""
>  }

Hmm, but that means that testing is now no longer running the
"optimization options torture testing":

Running [...]/source-gcc/gcc/testsuite/g++.dg/ubsan/ubsan.exp ...
-PASS: c-c++-common/ubsan/align-1.c   -O0  (test for excess errors)
-PASS: c-c++-common/ubsan/align-1.c   -O0  execution test
-PASS: c-c++-common/ubsan/align-1.c   -O1  (test for excess errors)
-PASS: c-c++-common/ubsan/align-1.c   -O1  execution test
-PASS: c-c++-common/ubsan/align-1.c   -O2  (test for excess errors)
-PASS: c-c++-common/ubsan/align-1.c   -O2  execution test
-PASS: c-c++-common/ubsan/align-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
-PASS: c-c++-common/ubsan/align-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
-PASS: c-c++-common/ubsan/align-1.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
-PASS: c-c++-common/ubsan/align-1.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
-PASS: c-c++-common/ubsan/align-1.c   -O3 -g  (test for excess errors)
-PASS: c-c++-common/ubsan/align-1.c   -O3 -g  execution test
-PASS: c-c++-common/ubsan/align-1.c   -Os  (test for excess errors)
-PASS: c-c++-common/ubsan/align-1.c   -Os  execution test
+PASS: c-c++-common/ubsan/align-1.c  -std=gnu++14 (test for excess errors)
+PASS: c-c++-common/ubsan/align-1.c  -std=gnu++14 execution test
+PASS: c-c++-common/ubsan/align-1.c  -std=gnu++17 (test for excess errors)
+PASS: c-c++-common/ubsan/align-1.c  -std=gnu++17 execution test
+PASS: c-c++-common/ubsan/align-1.c  -std=gnu++2a (test for excess errors)
+PASS: c-c++-common/ubsan/align-1.c  -std=gnu++2a execution test
+PASS: c-c++-common/ubsan/align-1.c  -std=gnu++98 (test for excess errors)
+PASS: c-c++-common/ubsan/align-1.c  -std=gnu++98 execution test

Etc.

Not sure if that was intentional?  I suppose this removed way more
testsuite coverage compared to what different C++ '-std=[...]' add?


> The testcase changes are all to accommodate that.

>  /* { dg-options "-fsanitize=bounds -Wno-array-bounds" } */
> +/* { dg-options "-fsanitize=bounds -Wno-array-bounds -Wno-volatile" { target c++ } } */

Simpler would've been (untested):

+/* { dg-additional-options "-Wno-volatile" { target c++ } } */

Etc.

;-)


Regards
 Thomas


Re: [PATCH 1/7 v5] ifn/optabs: Support vector load/store with length

2020-06-23 Thread Richard Biener via Gcc-patches
On Tue, Jun 23, 2020 at 11:53 AM Richard Sandiford
 wrote:
>
> Things have moved on due to the IRC conversation, but…
>
> "Kewen.Lin"  writes:
> > on 2020/6/23 3:59 AM, Richard Sandiford wrote:
> >> "Kewen.Lin"  writes:
> >>> @@ -5167,6 +5167,24 @@ mode @var{n}.
> >>>
> >>>  This pattern is not allowed to @code{FAIL}.
> >>>
> >>> +@cindex @code{lenload@var{m}} instruction pattern
> >>> +@item @samp{lenload@var{m}}
> >>> +Perform a vector load with length from memory operand 1 of mode @var{m}
> >>> +into register operand 0.  Length is provided in register operand 2 with
> >>> +appropriate mode which should afford the maximal required precision of
> >>> +any available lengths.
> >>
> >> I think we need to say in more detail what “load with length” actually
> >> means.  How about:
> >>
> >>   Load the number of bytes specified by operand 2 from memory operand 1
> >>   into register operand 0, setting the other bytes of operand 0 to
> >>   undefined values.  Operands 0 and 1 have mode @var{m}.  Operand 2 has
> >>   whichever integer mode the target prefers.
> >>
> >
> > Thanks for nice wordings!  Updated, for "... to undefined values" I changed it
> > to "... to undefined values or zeros" as Segher's comments to match the behavior
> > on Power.
>
> “set … to undefined values” means that the values are not defined by
> the optab interface.  In other words, the target can set the bytes
> to whatever it wants, and gimple code can't make any assumptions about
> what the values of the bytes are.
>
> So setting the bytes to zero (as Power does) would conform to the
> interface.  So would leaving the bytes in operand 0 untouched.
> So would using an instruction that really does leave the other
> bytes with undefined values, etc.
>
> So I think we should keep it as just “… to undefined values”.
>
> The alternative would be to define the interface so that targets *must*
> ensure that the other bytes are zeros.  But at the moment, the only
> intended use of the optabs and ifns is for autovectorisation, and the
> vectoriser won't care about the values of “inactive” bytes/lanes.
> Forcing the target to set them to a specific value like zero would be
> unnecessarily restrictive.

Actually it _does_ care.  This is supposed to be used for fully masked
loops and 'unspecified values' would require us to explicitly zero
them for any FP op because of possible sNaN representations.  It
also precludes us from bitwise ORing in an appropriately masked
vector of 1s to make integer division happy (OK, no vector ISA supports
integer division).

So unless we have evidence that there exists an ISA that does _not_
zero the excess bits I'd rather specify it does.

Richard.

>
> Thanks,
> Richard


Re: [PATCH wwwdocs] gcc-11/changes: Document TSAN changes

2020-06-23 Thread Marco Elver via Gcc-patches
On Fri, 19 Jun 2020 at 14:25, Marco Elver  wrote:
>
> Document TSAN changes to support alternative runtimes, such as KCSAN.

Is this one good to go, or any objections?

Thanks,
-- Marco



> ---
>  htdocs/gcc-11/changes.html | 15 +++
>  1 file changed, 15 insertions(+)
>
> diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
> index 9dba1e14..dc22f216 100644
> --- a/htdocs/gcc-11/changes.html
> +++ b/htdocs/gcc-11/changes.html
> @@ -46,6 +46,21 @@ a work-in-progress.
>  
>  General Improvements
>
> +
> +  
> + href="https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual";>
> +ThreadSanitizer improvements to support alternative runtimes and
> +environments. The  href="https://www.kernel.org/doc/html/latest/dev-tools/kcsan.html";>
> +Linux Kernel Concurrency Sanitizer (KCSAN) is now supported.
> +
> +  Add --param tsan-distinguish-volatile to optionally emit
> +  instrumentation distinguishing volatile accesses.
> +  Add --param tsan-instrument-func-entry-exit to optionally
> +  control if function entries and exits should be instrumented.
> +
> +  
> +
> +
>  
>  New Languages and Language specific improvements
>
> --
> 2.27.0.111.gc72c7da667-goog
>


Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-23 Thread Martin Jambor
Hi,

On Tue, Jun 23 2020, Alexandre Oliva wrote:
> Hello, Thomas,
>
> On Jun  9, 2020, Thomas Schwinge  wrote:
>
>> We're trying to scan 'variables.hsail.brig.*', but for input file name
>> 'variables.hsail.brig', we're now creating:
>
> I understand this was fixed by Martin Jambor's last week's patch for
> brig.exp; can you please confirm whether the problem is fixed?
>

I see all but one brig tests passing (I checked out yesterday's master
commit f4670347f10).  The one failure is PR 86948.

Martin


RE: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved registers with CMSE

2020-06-23 Thread Kyrylo Tkachov


> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 22 June 2020 09:52
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved
> registers with CMSE
> 
> Hi,
> 
> As reported in bugzilla, when the -mcmse option is used while compiling
> for size (-Os) with a Thumb-1 target, the generated code will clear the
> registers r7-r10. These however are callee saved and should be preserved
> across ABI boundaries. The reason this happens is that these
> registers are made "fixed" when optimising for size with Thumb-1 in a
> way to make sure they are not used, as pushing and popping hi-registers
> requires extra moves to and from LO_REGS.
> 
> To fix this, this patch uses 'callee_saved_reg_p', which accounts for
> this optimisation, instead of 'call_used_or_fixed_reg_p'. Be aware of
> 'callee_saved_reg_p''s definition, as it does still take call used
> registers into account, which aren't callee saved in my opinion, so the
> name is rather a misnomer. It works to our advantage here, though, as it
> does exactly what we need.
> 
> Regression tested on arm-none-eabi.
> 
> Is this OK for trunk? (Will eventually backport to previous versions if
> stable.)

Ok.
Thanks,
Kyrill

> 
> Cheers,
> Andre
> 
> gcc/ChangeLog:
> 2020-06-22  Andre Vieira  
> 
>      PR target/95646
>      * config/arm/arm.c (cmse_nonsecure_entry_clear_before_return):
> Use 'callee_saved_reg_p' instead of
>      'call_used_or_fixed_reg_p'.
> 
> gcc/testsuite/ChangeLog:
> 2020-06-22  Andre Vieira  
> 
>      PR target/95646
>      * gcc.target/arm/pr95646.c: New test.



Re: [PATCH 1/7 v5] ifn/optabs: Support vector load/store with length

2020-06-23 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Jun 23, 2020 at 11:53 AM Richard Sandiford
>  wrote:
>>
>> Things have moved on due to the IRC conversation, but…
>>
>> "Kewen.Lin"  writes:
>> > on 2020/6/23 3:59 AM, Richard Sandiford wrote:
>> >> "Kewen.Lin"  writes:
>> >>> @@ -5167,6 +5167,24 @@ mode @var{n}.
>> >>>
>> >>>  This pattern is not allowed to @code{FAIL}.
>> >>>
>> >>> +@cindex @code{lenload@var{m}} instruction pattern
>> >>> +@item @samp{lenload@var{m}}
>> >>> +Perform a vector load with length from memory operand 1 of mode @var{m}
>> >>> +into register operand 0.  Length is provided in register operand 2 with
>> >>> +appropriate mode which should afford the maximal required precision of
>> >>> +any available lengths.
>> >>
>> >> I think we need to say in more detail what “load with length” actually
>> >> means.  How about:
>> >>
>> >>   Load the number of bytes specified by operand 2 from memory operand 1
>> >>   into register operand 0, setting the other bytes of operand 0 to
>> >>   undefined values.  Operands 0 and 1 have mode @var{m}.  Operand 2 has
>> >>   whichever integer mode the target prefers.
>> >>
>> >
>> > Thanks for nice wordings!  Updated, for "... to undefined values" I changed it
>> > to "... to undefined values or zeros" as Segher's comments to match the behavior
>> > on Power.
>>
>> “set … to undefined values” means that the values are not defined by
>> the optab interface.  In other words, the target can set the bytes
>> to whatever it wants, and gimple code can't make any assumptions about
>> what the values of the bytes are.
>>
>> So setting the bytes to zero (as Power does) would conform to the
>> interface.  So would leaving the bytes in operand 0 untouched.
>> So would using an instruction that really does leave the other
>> bytes with undefined values, etc.
>>
>> So I think we should keep it as just “… to undefined values”.
>>
>> The alternative would be to define the interface so that targets *must*
>> ensure that the other bytes are zeros.  But at the moment, the only
>> intended use of the optabs and ifns is for autovectorisation, and the
>> vectoriser won't care about the values of “inactive” bytes/lanes.
>> Forcing the target to set them to a specific value like zero would be
>> unnecessarily restrictive.
>
> Actually it _does_ care.

I'd argue it doesn't, but for essentially the same reasons :-)

> This is supposed to be used for fully masked
> loops and 'unspecified values' would require us to explicitly zero
> them for any FP op because of possible sNaN representations.  It
> also precludes us from bitwise ORing in an appropriately masked
> vector of 1s to make integer division happy (OK, no vector ISA supports
> integer division).

Zeros would be a problem for FP division too.  And even if we require
loads to set inactive lanes to zero, we couldn't infer from that that
any given FP addition (say) won't raise an exception.  E.g. the inputs
could be the result of converting integers and adding them could trigger
an inexact exception.  Or the values could be the result of simple
bitcasts, giving arbitrary FP values.  (AIUI, current bfloat code
works this way.)

The vectoriser currently only allows potentially-trapping FP operations
on partial vectors if the target provides an appropriate IFN_COND_*
function.  (That's one of the main use cases for those functions.)
In other cases it requires the loop to operate on full vectors.
This should be relaxed in future to support inbranch partial
vectorisation of simd calls.

This means that the current patch series will/should simply punt
for “length”-based loop control if the loop contains FP operations
that (as far as gimple is concerned) might trap.

If we're thinking about how to relax that, then IMO it will need
to be done either at the level of each FP operation or by some
kind of “global” vectorisation subpass that introduces known-safe
values for inactive lanes.  The first would be easier, the second
would be more optimal.

I don't think that's specific to “length” vectorisation though.
The same concerns apply to if-converted loops that operate on full
vectors.  I think the approach would be essentially the same for both.

In that scenario, removing zeroing of an IFN_LEN_LOAD would “just” be
an optimisation, and could potentially be left to RTL code if necessary.
(But see my main point below.)
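
To make the “undefined values” contract a bit more concrete, here is a
rough C model.  It is only a sketch, not GCC code: len_load_model,
sum_active_bytes, VEC_BYTES and the fill byte are all made up for
illustration.  The point is that a consumer which only reads the first
LEN bytes gets the same answer whether the target zeros the remaining
bytes or puts something else there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VEC_BYTES 16  /* Hypothetical vector size, for illustration only.  */

/* Model of a length-based load: copy LEN bytes from MEM into DST.
   The remaining bytes of DST are "undefined" as far as the caller is
   concerned; this model fills them with FILL to stand in for whatever
   a given target happens to do (FILL == 0 models zeroing).  */
static void
len_load_model (unsigned char *dst, const unsigned char *mem,
                size_t len, unsigned char fill)
{
  assert (len <= VEC_BYTES);
  memcpy (dst, mem, len);
  memset (dst + len, fill, VEC_BYTES - len);
}

/* A consumer that is valid under the "undefined values" contract:
   it only ever looks at the first LEN bytes.  */
static unsigned int
sum_active_bytes (const unsigned char *vec, size_t len)
{
  unsigned int sum = 0;
  for (size_t i = 0; i < len; i++)
    sum += vec[i];
  return sum;
}
```

Code that reads bytes at or beyond LEN would be relying on something the
interface does not promise.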

SVE supports integer division btw. :-)

> So unless we have evidence that there exists an ISA that does _not_
> zero the excess bits I'd rather specify it does.

I think the known architectures that might use this are:

- MVE
- Power
- RVV

MVE and Power both set inactive lanes to zero.  But I'm not sure about RVV.
AIUI, for RVV the approach instead would be to reduce the effective vector
length for the final iteration of the vector loop, and I'm not sure
whether in that situation it makes sense to say that the other elements
still exist and are guaranteed to be zero.

I'm the last person who should be speculating on that 

Re: [PATCHv6] Handle TYPE_PACK_EXPANSION in cxx_incomplete_type_diagnostic

2020-06-23 Thread Marek Polacek via Gcc-patches
On Mon, Jun 22, 2020 at 11:11:48PM -0400, Nicholas Krause wrote:
> 
> 
> On 6/22/20 10:01 PM, Marek Polacek wrote:
> > On Mon, Jun 22, 2020 at 09:42:51PM -0400, Nicholas Krause via Gcc-patches wrote:
> > > From: Nicholas Krause 
> > > 
> > > This fixes PR95672 by adding the missing TYPE_PACK_EXPANSION case in
> > > cxx_incomplete_type_diagnostic in order to avoid ICEs on diagnosing
> > > incomplete template pack expansion cases. In v2, add the missing required
> > > test case for all new patches. v3 fixes both the test case to compile in
> > > C++11 mode and the message to print out only the type. v4 fixes the
> > > testcase to only target C++11. v5 and v6 fix the test case properly.
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > >   * typeck2.c (cxx_incomplete_type_diagnostic): Add missing
> > >   TYPE_EXPANSION_PACK check for diagnosticing incomplete types in
> > >   cxx_incomplete_type_diagnostic.
> > 
> > It's already been pointed out to you that it's "diagnosing".
> > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   * g++.dg/template/PR95672.C: New test.
> > > 
> > > Signed-off-by: Nicholas Krause 
> > > ---
> > >   gcc/cp/typeck2.c| 6 ++
> > >   gcc/testsuite/g++.dg/template/PR95672.C | 3 +++
> > >   2 files changed, 9 insertions(+)
> > >   create mode 100644 gcc/testsuite/g++.dg/template/PR95672.C
> > > 
> > > diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
> > > index 5fd3b82fa89..28b32fe0b5a 100644
> > > --- a/gcc/cp/typeck2.c
> > > +++ b/gcc/cp/typeck2.c
> > > @@ -552,6 +552,12 @@ cxx_incomplete_type_diagnostic (location_t loc, const_tree value,
> > >  TYPE_NAME (type));
> > > break;
> > > +case TYPE_PACK_EXPANSION:
> > > + emit_diagnostic (diag_kind, loc, 0,
> > 
> > Bad indenting.
> 
> Sorry seems Jason didn't catch that.
> > 
> > > +  "invalid use of pack expansion %qT",
> > > +   type);
> > > +  break;
> > > +
> > >   case TYPENAME_TYPE:
> > >   case DECLTYPE_TYPE:
> > > emit_diagnostic (diag_kind, loc, 0,
> > > diff --git a/gcc/testsuite/g++.dg/template/PR95672.C b/gcc/testsuite/g++.dg/template/PR95672.C
> > > new file mode 100644
> > > index 000..fcc3da0a132
> > > --- /dev/null
> > > +++ b/gcc/testsuite/g++.dg/template/PR95672.C
> > > @@ -0,0 +1,3 @@
> > > +// PR c++/96572
> > > +// { dg-do compile}
> > > +// { dg-options "-std=c++11" }
> > > +struct g_class : decltype  (auto) ... {  }; // { dg-error "invalid use 
> > > of pack expansion" }
> > 
> > No, this is completely broken.  It passes only because the patch file is
> > malformed and the last line will be lost when applying the patch, so the
> > test is an empty file.
> > 
> > Marek
> > 
> 
> Yes, but now that I look at it, it seems the issue is that the error message
> was changed. My concern is that the test will have to be updated every time
> the message changes, as with the current:
> error: expected primary-expression before ‘auto’
> 1 | struct g_class : decltype  (auto) ... {  };
> 
> Is this something I should care about, as it seems emit_diagnostic and friends
> sometimes do this?

decltype(auto) is a C++14 feature so won't work with -std=c++11.

Marek



Re: [PATCH] coroutines: Add a cleanup expression for g-r-o when needed [PR95477].

2020-06-23 Thread Nathan Sidwell

On 6/12/20 4:06 PM, Iain Sandoe wrote:

Iain Sandoe  wrote:


Nathan Sidwell  wrote:


On 6/8/20 5:17 AM, Iain Sandoe wrote:



+  r = gro_is_void_p ? integer_zero_node : rvalue (gro);
+  /* The return object is constructed, even if the gro is void.  */


Would error_mark_node work here?  I presume we've already diagnosed the
problem, so will result in no cascade of errors.


We have not diagnosed the problem - it’s valid to have a void g-r-o and a
non-void function return.

I am not sure the circumstance has been identified specifically where this
(valid) permutation is found together with a specialization of the traits
that has a non-class return type. (I spotted this as a corner-case while
working on the code).

Perhaps I should construct a test-case and see how the other compilers
handle it.


I did this anyway …

To answer your original question, we need to diagnose this specifically.
clang does so; I’ve now matched the diagnostic for GCC.
We can then propagate an error_mark_node onwards.

OK now?
Iain


ok, thanks!



===

The PR reports that we fail to destroy the object initially created from
the get-return-object call.  Fixed by adding a cleanup when the DTOR is
non-trivial.  In addition, to meet the specific wording that the call to
get_return_object creates the glvalue for the return, we must construct
that in-place in the return object to avoid a second copy/move CTOR.

gcc/cp/ChangeLog:

PR c++/95477
* coroutines.cc (morph_fn_to_coro): Apply a cleanup to
the get return object when the DTOR is non-trivial.

gcc/testsuite/ChangeLog:

PR c++/95477
* g++.dg/coroutines/pr95477.C: New test.
* g++.dg/coroutines/void-gro-non-class-coro.C: New test.
---
  gcc/cp/coroutines.cc  | 69 ---
  gcc/testsuite/g++.dg/coroutines/pr95477.C | 37 ++
  .../coroutines/void-gro-non-class-coro.C  | 59 
  3 files changed, 155 insertions(+), 10 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95477.C
  create mode 100644 gcc/testsuite/g++.dg/coroutines/void-gro-non-class-coro.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 11fca9954ac..d4f582e91e2 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4280,12 +4280,34 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
  
tree gro = NULL_TREE;

tree gro_bind_vars = NULL_TREE;
+  tree gro_cleanup_stmt = NULL_TREE;
/* We have to sequence the call to get_return_object before initial
   suspend.  */
if (gro_is_void_p)
-finish_expr_stmt (get_ro);
+r = get_ro;
+  else if (same_type_p (gro_type, fn_return_type))
+{
+ /* [dcl.fct.def.coroutine] / 7
+   The expression promise.get_return_object() is used to initialize the
+   glvalue result or... (see below)
+   Construct the return result directly.  */
+  if (TYPE_NEEDS_CONSTRUCTING (gro_type))
+   {
+ vec<tree, va_gc> *arg = make_tree_vector_single (get_ro);
+ r = build_special_member_call (DECL_RESULT (orig),
+complete_ctor_identifier,
+&arg, gro_type, LOOKUP_NORMAL,
+tf_warning_or_error);
+ release_tree_vector (arg);
+   }
+  else
+   r = build2_loc (fn_start, INIT_EXPR, gro_type,
+   DECL_RESULT (orig), get_ro);
+}
else
  {
+  /* ... or ... Construct an object that will be used as the single
+   param to the CTOR for the return object.  */
gro = build_lang_decl (VAR_DECL, get_identifier ("coro.gro"), gro_type);
DECL_CONTEXT (gro) = current_scope ();
DECL_ARTIFICIAL (gro) = true;
@@ -4302,8 +4324,21 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
}
else
r = build2_loc (fn_start, INIT_EXPR, gro_type, gro, get_ro);
-  finish_expr_stmt (r);
+  /* The constructed object might require a cleanup.  */
+  if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (gro_type))
+   {
+ tree cleanup
+   = build_special_member_call (gro, complete_dtor_identifier,
+NULL, gro_type, LOOKUP_NORMAL,
+tf_warning_or_error);
+ gro_cleanup_stmt = build_stmt (input_location, CLEANUP_STMT, NULL,
+cleanup, gro);
+   }
  }
+  finish_expr_stmt (r);
+
+  if (gro_cleanup_stmt)
+CLEANUP_BODY (gro_cleanup_stmt) = push_stmt_list ();
  
/* Initialize the resume_idx_name to 0, meaning "not started".  */

tree resume_idx_m
@@ -4345,16 +4380,15 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
   promise was constructed.  We now supply a reference to that var,
   either as the return value (if it's the same type) or to the CTOR
   for an object of the return type.  */
-  if (gro_is_void_p)
-   

Re: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved registers with CMSE

2020-06-23 Thread Andre Vieira (lists)

On 23/06/2020 13:10, Kyrylo Tkachov wrote:



-Original Message-
From: Andre Vieira (lists) 
Sent: 22 June 2020 09:52
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov 
Subject: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved
registers with CMSE

Hi,

As reported in bugzilla when the -mcmse option is used while compiling
for size (-Os) with a thumb-1 target the generated code will clear the
registers r7-r10. These however are callee saved and should be preserved
across ABI boundaries. The reason this happens is because these
registers are made "fixed" when optimising for size with Thumb-1 in a
way to make sure they are not used, as pushing and popping hi-registers
requires extra moves to and from LO_REGS.

To fix this, this patch uses 'callee_saved_reg_p', which accounts for
this optimisation, instead of 'call_used_or_fixed_reg_p'. Be aware of
'callee_saved_reg_p''s definition, as it does still take call used
registers into account, which aren't callee_saved in my opinion, so it
is rather a misnomer; it works to our advantage here though, as it does
exactly what we need.
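
To illustrate the difference between the two predicates, here is a toy
model in C.  It is only a sketch under stated assumptions: the tables,
the toy_* names and the choice of r8-r10 as the "fixed for size"
registers are stand-ins for illustration, not the definitions in arm.c.

```c
#include <stdbool.h>

/* Toy register file.  All names and tables below are illustrative
   stand-ins, not the definitions in arm.c.  */
enum { NUM_REGS = 16, TOY_FIRST_HI = 8, TOY_LAST_HI = 10 };

/* r0-r3 and r12 (ip) are call-clobbered in this model.  */
static const bool toy_call_used[NUM_REGS]
  = { [0] = true, [1] = true, [2] = true, [3] = true, [12] = true };

/* Set for the hi registers when optimising Thumb-1 for size.  */
static bool toy_fixed[NUM_REGS];

static void
toy_fix_hi_regs_for_size (void)
{
  for (int regno = TOY_FIRST_HI; regno <= TOY_LAST_HI; regno++)
    toy_fixed[regno] = true;
}

static bool
toy_call_used_or_fixed_p (int regno)
{
  return toy_call_used[regno] || toy_fixed[regno];
}

/* Assumption: hi registers that were only made "fixed" to discourage
   their use are still treated as callee-saved.  */
static bool
toy_callee_saved_p (int regno, bool thumb1_size)
{
  if (!toy_call_used_or_fixed_p (regno))
    return true;
  return thumb1_size && regno >= TOY_FIRST_HI && regno <= TOY_LAST_HI;
}

/* Registers r0-r11 that the old check would clear before returning.  */
static unsigned int
toy_old_clear_mask (void)
{
  unsigned int mask = 0;
  for (int regno = 0; regno < 12; regno++)
    if (toy_call_used_or_fixed_p (regno))
      mask |= 1u << regno;
  return mask;
}

/* Registers r0-r11 that the new check would clear before returning.  */
static unsigned int
toy_new_clear_mask (bool thumb1_size)
{
  unsigned int mask = 0;
  for (int regno = 0; regno < 12; regno++)
    if (!toy_callee_saved_p (regno, thumb1_size))
      mask |= 1u << regno;
  return mask;
}
```

In this model the old mask includes r9 once the hi registers are fixed,
while the new mask does not; r0-r3 are cleared either way.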

Regression tested on arm-none-eabi.

Is this OK for trunk? (Will eventually backport to previous versions if
stable.)

Ok.
Thanks,
Kyrill
As I was getting ready to push this I noticed I didn't add any skip-ifs 
to prevent this failing with specific target options. So here's a new 
version with those.


Still OK?

Cheers,
Andre



Cheers,
Andre

gcc/ChangeLog:
2020-06-22  Andre Vieira  

      PR target/95646
      * config/arm/arm.c: (cmse_nonsecure_entry_clear_before_return):
Use 'callee_saved_reg_p' instead of
      'call_used_or_fixed_reg_p'.

gcc/testsuite/ChangeLog:
2020-06-22  Andre Vieira  

      PR target/95646
      * gcc.target/arm/pr95646.c: New test.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6b7ca829f1c8cbe3d427da474b079882dc522e1a..dac9a6fb5c41ce42cd7a278b417eab25239a043c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -26960,7 +26960,7 @@ cmse_nonsecure_entry_clear_before_return (void)
continue;
   if (IN_RANGE (regno, IP_REGNUM, PC_REGNUM))
continue;
-  if (call_used_or_fixed_reg_p (regno)
+  if (!callee_saved_reg_p (regno)
  && (!IN_RANGE (regno, FIRST_VFP_REGNUM, LAST_VFP_REGNUM)
  || TARGET_HARD_FLOAT))
bitmap_set_bit (to_clear_bitmap, regno);
diff --git a/gcc/testsuite/gcc.target/arm/pr95646.c b/gcc/testsuite/gcc.target/arm/pr95646.c
new file mode 100644
index 
..12d06a0c8c1ed7de1f8d4d15130432259e613a32
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr95646.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv8-m.base" } } */
+/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m23" } } */
+/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-mfpu=*" } { } } */
+/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
+/* { dg-options "-mcpu=cortex-m23 -mcmse" } */
+/* { dg-additional-options "-Os" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+int __attribute__ ((cmse_nonsecure_entry))
+foo (void)
+{
+  return 1;
+}
+/* { { dg-final { scan-assembler-not "mov\tr9, r0" } } */
+
+/*
+** __acle_se_bar:
+** mov (r[0-3]), r9
+** push {\1}
+** ...
+** pop {(r[0-3])}
+** mov r9, \2
+** ...
+** bxns lr
+*/
+int __attribute__ ((cmse_nonsecure_entry))
+bar (void)
+{
+  asm ("": : : "r9");
+  return 1;
+}


RE: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved registers with CMSE

2020-06-23 Thread Kyrylo Tkachov


> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 23 June 2020 14:28
> To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved
> registers with CMSE
> 
> On 23/06/2020 13:10, Kyrylo Tkachov wrote:
> >
> >> -Original Message-
> >> From: Andre Vieira (lists) 
> >> Sent: 22 June 2020 09:52
> >> To: gcc-patches@gcc.gnu.org
> >> Cc: Kyrylo Tkachov 
> >> Subject: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved
> >> registers with CMSE
> >>
> >> Hi,
> >>
> >> As reported in bugzilla when the -mcmse option is used while compiling
> >> for size (-Os) with a thumb-1 target the generated code will clear the
> >> registers r7-r10. These however are callee saved and should be preserved
> >> across ABI boundaries. The reason this happens is because these
> >> registers are made "fixed" when optimising for size with Thumb-1 in a
> >> way to make sure they are not used, as pushing and popping hi-registers
> >> requires extra moves to and from LO_REGS.
> >>
> >> To fix this, this patch uses 'callee_saved_reg_p', which accounts for
> >> this optimisation, instead of 'call_used_or_fixed_reg_p'. Be aware of
> >> 'callee_saved_reg_p''s definition, as it does still take call used
> >> registers into account, which aren't callee_saved in my opinion, so it
> >> is rather a misnomer; it works to our advantage here though, as it does
> >> exactly what we need.
> >>
> >> Regression tested on arm-none-eabi.
> >>
> >> Is this OK for trunk? (Will eventually backport to previous versions if
> >> stable.)
> > Ok.
> > Thanks,
> > Kyrill
> As I was getting ready to push this I noticed I didn't add any skip-ifs
> to prevent this failing with specific target options. So here's a new
> version with those.
> 
> Still OK?
> 

Ok.
Thanks,
Kyrill

> Cheers,
> Andre
> >
> >> Cheers,
> >> Andre
> >>
> >> gcc/ChangeLog:
> >> 2020-06-22  Andre Vieira  
> >>
> >>       PR target/95646
> >>       * config/arm/arm.c: (cmse_nonsecure_entry_clear_before_return):
> >> Use 'callee_saved_reg_p' instead of
> >>       'call_used_or_fixed_reg_p'.
> >>
> >> gcc/testsuite/ChangeLog:
> >> 2020-06-22  Andre Vieira  
> >>
> >>       PR target/95646
> >>       * gcc.target/arm/pr95646.c: New test.


Re: [PR95416] outputs.exp: skip lto tests when not using linker plugin

2020-06-23 Thread Alexandre Oliva
On Jun  8, 2020, Alexandre Oliva  wrote:

>   * outputs.exp (skip_lto): Set when missing the linker plugin.

I withdraw the above work-around patch in favor of the fix proper below.

It's also supposed to fix the FAILs caused by .dSYM directories on
platforms that create them; I'd appreciate confirmation that it does.

Tested on x86_64-linux-gnu, both with and without LTO plugin.  Ok to
install?


outputs.exp: conditionals for split-dwarf and lto plugin

From: Alexandre Oliva 

This patch introduces support for conditionals (and expr) expansions
to file lists in proc outest in outputs.exp.

The conditionals machinery is now used to guard files that are only
created by the LTO plugin, or when not using the LTO plugin.

It is also used to avoid special-casing .dwo files: the condition of
when they're expected is now encoded in the list.

Furthermore, the -g flag, that used to be specified along with
$gsplit_dwarf, is now moved into $gsplit_dwarf, so that we don't
compile with -g if -gsplit-dwarf is not needed.  This avoids having to
deal with .dSYM directories.

Further removing special cases, $aout is now dealt with in a more
general way, using expr to perform variable/string expansion.


for  gcc/testsuite/ChangeLog

PR testsuite/95416
PR testsuite/95577
* outputs.exp (gsplit_dwarf): Move -g into it.
(outest): Introduce conditionals and string/variable/expr
expansion.  Drop special-casing of $aout and .dwo.
(gspd): New conditional.  Guard all .dwo files with it.
(ltop): New conditional.  Guard files created by the LTO
plugin with it.  Guard files created by fat LTO compilation
with its negation.  Add a few -fno-use-linker-plugin tests
guarded by it.
---
 gcc/testsuite/gcc.misc-tests/outputs.exp |  641 --
 1 file changed, 338 insertions(+), 303 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/outputs.exp b/gcc/testsuite/gcc.misc-tests/outputs.exp
index 06a32db..469d94c 100644
--- a/gcc/testsuite/gcc.misc-tests/outputs.exp
+++ b/gcc/testsuite/gcc.misc-tests/outputs.exp
@@ -30,19 +30,25 @@ if {![gcc_parallel_test_run_p $b] || [is_remote host]} {
 }
 gcc_parallel_test_enable 0
 
-# Check for -gsplit-dwarf support.  The outest proc will check that
-# gsplit_dwarf is empty if a .dwo file is missing before deciding
-# that's a fail.
-set gsplit_dwarf "-gsplit-dwarf"
+# Check for -gsplit-dwarf support.  We don't check for .dwo files
+# without it.  Test it along with -g, so that we don't even bother
+# with debug info if -gsplit-dwarf is not supported.  This avoids
+# having to deal with .dSYM directories, as long as -gsplit-dwarf is
+# not supported on platforms that use .dSYM directories.
+set gsplit_dwarf "-g -gsplit-dwarf"
 if ![check_no_compiler_messages gsplitdwarf object {
 void foo (void) { }
 } "$gsplit_dwarf"] {
 set gsplit_dwarf ""
 }
+set gspd [expr { "$gsplit_dwarf" != "" }]
 
 # Check for -flto support.  We explicitly test the result to skip
 # tests that use -flto.
 set skip_lto ![check_effective_target_lto]
+if !$skip_lto {
+set ltop [check_linker_plugin_available]
+}
 
 # Prepare additional options to be used for linking.
 # We do not compile to an executable, because that requires naming an output.
@@ -76,21 +82,33 @@ if {[board_info $dest exists output_format]} {
 # DIRS is a list of output directories, children before parent, and
 # for each element of DIRS, there should be a corresponding sublist in
 # OUTPUTS.  If OUTPUTS has an additional trailing sublist, that's the
-# output list for the current directory.  Each element of the sublists
-# in OUTPUT is a file name or glob pattern to be checked for; a name
-# starting with a dash or a period is taken as a suffix for $b; with a
-# double dash, or a dash followed by a period, the first dash is
-# replaced with $b-$b; names starting with "a--" or "a-." have "$b"
-# inserted after the first dash.  The glob pattern may expand to more
-# than one file, but then the test will pass for any number of
-# matches, i.e., it would be safe to use for a.{out,exe} (if it
-# weren't for https://core.tcl-lang.org/tcl/tktview?name=5bbd044812),
-# but .{i,s,o} and .[iso] will pass even if only the .o is present.
+# output list for the current directory.
+
+# Each element of the sublists in OUTPUT is a condition, a file name
+# or glob pattern to be checked for.  A condition must start with !,
+# and it applies until the next condition, or through to the end of
+# the sublist.  (Use "!0" to disable an earlier condition, use "!!$X"
+# to test for a nonzero X.)  If the condition evaluates to false,
+# files covered by it are ignored.  Conditions may reference global
+# variables that are imported as such in the block that handles
+# conditions: look for "global gspd ltop" below.
+
+# A name starting with '"' or '$' undergoes expr expansion.  If it
+# expands to an empty string, it is ignored.  Global variables
+# referenced in such ex

[PATCH] libiberty, include: add bsearch_r

2020-06-23 Thread Nick Alcock via Gcc-patches
libctf wants a bsearch that takes a void * arg pointer to avoid a
nonportable use of __thread.

bsearch_r is required, not optional, at this point because as far as I
can see this obvious-sounding function is not implemented by anyone's
libc.  We can easily move it to AC_LIBOBJ later if it proves necessary
to do so.
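
For anyone unfamiliar with the shape of the interface, here is a minimal
sketch of what such a function looks like and how the extra argument gets
used.  This is not the code in the patch: sketch_bsearch_r, struct
cmp_ctx and cmp_by_weight are hypothetical names for illustration, and
the argument order is assumed to follow the prototype added to
libiberty.h below.

```c
#include <stddef.h>

/* Binary search over NMEMB sorted elements of SIZE bytes starting at
   BASE, passing ARG through to COMPAR on every comparison.  */
static void *
sketch_bsearch_r (const void *key, const void *base, size_t nmemb,
                  size_t size,
                  int (*compar) (const void *, const void *, void *),
                  void *arg)
{
  const char *lo = (const char *) base;

  while (nmemb > 0)
    {
      const char *mid = lo + (nmemb / 2) * size;
      int cmp = compar (key, mid, arg);

      if (cmp == 0)
        return (void *) mid;
      if (cmp > 0)
        {
          /* Key is in the upper half, excluding MID itself.  */
          lo = mid + size;
          nmemb -= nmemb / 2 + 1;
        }
      else
        /* Key is in the lower half.  */
        nmemb /= 2;
    }
  return NULL;
}

/* Example state that would otherwise need a global or __thread
   variable: a side table and a comparison counter.  */
struct cmp_ctx
{
  const int *weights;
  unsigned int calls;
};

/* The key is a weight; each array element is an index into
   CTX->weights.  */
static int
cmp_by_weight (const void *key, const void *elt, void *arg)
{
  struct cmp_ctx *ctx = (struct cmp_ctx *) arg;
  int kw = *(const int *) key;
  int ew = ctx->weights[*(const int *) elt];

  ctx->calls++;
  return (kw > ew) - (kw < ew);
}
```

The comparator reaches its side table through ARG rather than through
any static state, which is the whole reason for wanting the _r variant.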

include/
* libiberty.h (bsearch_r): New.
libiberty/
* bsearch_r.c: New file.
* Makefile.in (CFILES): Add bsearch_r.c.
(REQUIRED_OFILES): Add bsearch_r.o.
* functions.texi: Regenerate.
---
 include/libiberty.h   |  7 
 libiberty/Makefile.in | 12 +-
 libiberty/bsearch_r.c | 93 +++
 3 files changed, 111 insertions(+), 1 deletion(-)
 create mode 100644 libiberty/bsearch_r.c

Acked six months ago (!) by Jeff Law here:


Rebased against master (commit efc16503ca10bc0e934e0bace5777500e4dc757a)
and retested building this morning.

Sorry for the delay.

diff --git a/include/libiberty.h b/include/libiberty.h
index 141cb886a85..0bb5b81d4ac 100644
--- a/include/libiberty.h
+++ b/include/libiberty.h
@@ -641,6 +641,13 @@ extern int pexecute (const char *, char * const *, const char *,
 
 extern int pwait (int, int *, int);
 
+/* Like bsearch, but takes and passes on an argument like qsort_r.  */
+
+extern void *bsearch_r (register const void *, const void *,
+   size_t, register size_t,
+   register int (*)(const void *, const void *, void *),
+   void *);
+
 #if defined(HAVE_DECL_ASPRINTF) && !HAVE_DECL_ASPRINTF
 /* Like sprintf but provides a pointer to malloc'd storage, which must
be freed by the caller.  */
diff --git a/libiberty/Makefile.in b/libiberty/Makefile.in
index d6b302e02fd..895f701bcd0 100644
--- a/libiberty/Makefile.in
+++ b/libiberty/Makefile.in
@@ -124,7 +124,7 @@ COMPILE.c = $(CC) -c @DEFS@ $(CFLAGS) $(CPPFLAGS) -I. -I$(INCDIR) \
 # CONFIGURED_OFILES and funcs in configure.ac.  Also run "make maint-deps"
 # to build the new rules.
 CFILES = alloca.c argv.c asprintf.c atexit.c   \
-   basename.c bcmp.c bcopy.c bsearch.c bzero.c \
+   basename.c bcmp.c bcopy.c bsearch.c bsearch_r.c bzero.c \
calloc.c choose-temp.c clock.c concat.c cp-demangle.c   \
 cp-demint.c cplus-dem.c crc32.c\
d-demangle.c dwarfnames.c dyn-string.c  \
@@ -168,6 +168,7 @@ REQUIRED_OFILES = \
./regex.$(objext) ./cplus-dem.$(objext) ./cp-demangle.$(objext) \
./md5.$(objext) ./sha1.$(objext) ./alloca.$(objext) \
./argv.$(objext)\
+   ./bsearch_r.$(objext)   \
./choose-temp.$(objext) ./concat.$(objext)  \
./cp-demint.$(objext) ./crc32.$(objext) ./d-demangle.$(objext)  \
./dwarfnames.$(objext) ./dyn-string.$(objext)   \
@@ -601,6 +602,15 @@ $(CONFIGURED_OFILES): stamp-picdir stamp-noasandir
else true; fi
$(COMPILE.c) $(srcdir)/bsearch.c $(OUTPUT_OPTION)
 
+./bsearch_r.$(objext): $(srcdir)/bsearch_r.c config.h $(INCDIR)/ansidecl.h
+   if [ x"$(PICFLAG)" != x ]; then \
+ $(COMPILE.c) $(PICFLAG) $(srcdir)/bsearch_r.c -o pic/$@; \
+   else true; fi
+   if [ x"$(NOASANFLAG)" != x ]; then \
+ $(COMPILE.c) $(PICFLAG) $(NOASANFLAG) $(srcdir)/bsearch_r.c -o noasan/$@; \
+   else true; fi
+   $(COMPILE.c) $(srcdir)/bsearch_r.c $(OUTPUT_OPTION)
+
 ./bzero.$(objext): $(srcdir)/bzero.c
if [ x"$(PICFLAG)" != x ]; then \
  $(COMPILE.c) $(PICFLAG) $(srcdir)/bzero.c -o pic/$@; \
diff --git a/libiberty/bsearch_r.c b/libiberty/bsearch_r.c
new file mode 100644
index 000..79ebae9b0be
--- /dev/null
+++ b/libiberty/bsearch_r.c
@@ -0,0 +1,93 @@
+/*
+ * Copyright (c) 1990 Regents of the University of California.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. [rescinded 22 July 1999]
+ * 4. Neither the name of the University nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR I

[Patch v2 3/3] aarch64: Mitigate SLS for BLR instruction

2020-06-23 Thread Matthew Malcomson
This patch introduces the mitigation for Straight Line Speculation past
the BLR instruction.

This mitigation replaces BLR instructions with a BL to a stub which uses
a BR to jump to the original value.  These function stubs are then
appended with a speculation barrier to ensure no straight line
speculation happens after these jumps.

When optimising for speed we use a set of stubs for each function since
this should help the branch predictor make more accurate predictions
about where a stub should branch.

When optimising for size we use one set of stubs for all functions.
This set of stubs can have human readable names, and we are using
`__call_indirect_x` for register x.

When BTI branch protection is enabled the BLR instruction can jump to a
`BTI c` instruction using any register, while the BR instruction can
only jump to a `BTI c` instruction using the x16 or x17 registers.
Hence, in order to ensure this transformation is safe we mov the value
of the original register into x16 and use x16 for the BR.

As an example when optimising for size:
a
    BLR x0
instruction would get transformed to something like
    BL __call_indirect_x0
where __call_indirect_x0 labels a thunk that contains
__call_indirect_x0:
    MOV X16, X0
    BR X16
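
For context, the kind of source that leads to a BLR is an ordinary call
through a function pointer.  A tiny example follows; the function names
are made up, and the exact assembly depends on the options and compiler
version, so treat it as a sketch rather than a guaranteed reproducer.

```c
/* Hypothetical example: calling through FN is the kind of indirect
   call that would typically be emitted as a BLR on AArch64.  */
static int
add_one (int x)
{
  return x + 1;
}

int
call_indirect (int (*fn) (int), int x)
{
  return fn (x);
}
```

With the mitigation enabled one would expect the call in call_indirect to
go through a stub like the one shown above rather than use BLR directly.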



The first version of this patch used local symbols specific to a
compilation unit to try and avoid relocations.
This was mistaken since functions coming from the same compilation unit
can still be in different sections, and the assembler will insert
relocations at jumps between sections.

On any relocation the linker is permitted to emit a veneer to handle
jumps between symbols that are very far apart.  The registers x16 and
x17 may be clobbered by these veneers.
Hence the function stubs cannot rely on the values of x16 and x17 being
the same as just before the function stub is called.

The same can be said for the hot/cold partitioning of single functions,
so function-local stubs have the same restriction.

This updated version of the patch never emits function stubs for x16 and
x17, and instead forces other registers to be used.


Given the above, there is now no benefit to local symbols (since they
are not enough to avoid dealing with linker intricacies).  This patch
now uses global symbols with hidden visibility each stored in their own
COMDAT section.  This means stubs can be shared between compilation
units while still avoiding the PLT indirection.


This patch also removes the `__call_indirect_x30` stub (and
function-local equivalent) which would simply jump back to the original
location.


The function-local stubs are emitted to the assembly output file in one
chunk, which means we need not add the speculation barrier directly
after each one.
This is because we know for certain that the instructions directly after
the BR in all but the last function stub will be from another one of
these stubs and hence will not contain a speculation gadget.
Instead we add a speculation barrier at the end of the sequence of
stubs.

The global stubs are emitted in COMDAT/.linkonce sections by
themselves so that the linker can remove duplicates from multiple object
files.  This means they are not emitted in one chunk, and each one must
include the speculation barrier.

Another difference is that since the global stubs are shared across
compilation units we do not know that all functions will be targeting an
architecture supporting the SB instruction.
Rather than provide multiple stubs for each architecture, we provide a
stub that will work for all architectures -- using the DSB+ISB barrier.


This mitigation does not apply for BLR instructions in the following
places:
- Some accesses to thread-local variables use a code sequence with a BLR
  instruction.  This code sequence is part of the binary interface between
  compiler and linker. If this BLR instruction needs to be mitigated, it'd
  probably be best to do so in the linker. It seems that the code sequence
  for thread-local variable access is unlikely to lead to a Spectre Revelation
  Gadget.
- PLT stubs are produced by the linker and each contain a BLR instruction.
  It seems that at most only after the last PLT stub a Spectre Revelation
  Gadget might appear.

Testing:
  Bootstrap and regtest on AArch64
(with BOOT_CFLAGS="-mharden-sls=retbr,blr")
  Used a temporary hack(1) in gcc-dg.exp to use these options on every
  test in the testsuite, a slight modification to emit the speculation
  barrier after every function stub, and a script to check that the
  output never emitted a BLR, or unmitigated BR or RET instruction.
  Similar on an aarch64-none-elf cross-compiler.

1) Temporary hack emitted a speculation barrier at the end of every stub
function, and used a script to ensure that:
  a) Every RET or BR is immediately followed by a speculation barrier.
  b) No BLR instruction is emitted by compiler.


gcc/ChangeLog:

2020-06-23  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_indirect_call_asm):
 

[Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-06-23 Thread Tobias Burnus

If the offloading code is (only) in a library, one can come up
with the idea to build those parts as shared library – and link
it to the nonoffloading code.(*)

Currently, this fails as the mkoffload calls the nonoffloading
compiler without the -fpic/-fPIC flags, even though the compiler
was originally invoked with those options. – And at some point,
the linker then complains.

This patch simply adds -fpic/-fPIC to the calls to the nonoffloading
("host") compiler, invoked from mkoffload, if they were present before.

For the testcase at hand, this works with both AMDGCN and nvptx
with the attached patch.
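
In isolation, the idea is roughly the following.  This is a simplified
sketch, not the patch itself: forward_pic_flags is a hypothetical helper,
and the real code grows an obstack instead of filling a fixed array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scan ARGV for -fPIC/-fpic and append whichever
   was seen to OUT (capacity OUT_CAP), so that it can be passed on to
   the host compiler.  Returns the number of entries added.  */
static size_t
forward_pic_flags (int argc, char **argv, const char **out, size_t out_cap)
{
  bool fPIC = false;
  bool fpic = false;

  for (int i = 1; i < argc; i++)
    {
      if (strcmp (argv[i], "-fPIC") == 0)
        fPIC = true;
      else if (strcmp (argv[i], "-fpic") == 0)
        fpic = true;
    }

  size_t n = 0;
  if (fPIC && n < out_cap)
    out[n++] = "-fPIC";
  if (fpic && n < out_cap)
    out[n++] = "-fpic";
  return n;
}
```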

OK for the trunk?

Tobias

PS: I think as a mid-/long-term project it would be nice to test this
in the testsuite, but that's unfortunately a larger task.

(*) Thomas mentioned that this is supposed to work also in more
complex cases than the one I outlined, although that is probably
currently the most common one.

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

gcc/ChangeLog:

	* config/gcn/mkoffload.c (compile_native, main): Pass -fPIC/-fpic
	on to the native compiler, if used.
	* config/nvptx/mkoffload.c (compile_native, main): Likewise.

 gcc/config/gcn/mkoffload.c   | 15 +++++++++++++--
 gcc/config/nvptx/mkoffload.c | 15 +++++++++++++--
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/gcc/config/gcn/mkoffload.c b/gcc/config/gcn/mkoffload.c
index 14f422e..0415d94 100644
--- a/gcc/config/gcn/mkoffload.c
+++ b/gcc/config/gcn/mkoffload.c
@@ -483,7 +483,8 @@ process_obj (FILE *in, FILE *cfile)
 /* Compile a C file using the host compiler.  */
 
 static void
-compile_native (const char *infile, const char *outfile, const char *compiler)
+compile_native (const char *infile, const char *outfile, const char *compiler,
+		bool fPIC, bool fpic)
 {
   const char *collect_gcc_options = getenv ("COLLECT_GCC_OPTIONS");
   if (!collect_gcc_options)
@@ -493,6 +494,10 @@ compile_native (const char *infile, const char *outfile, const char *compiler)
   struct obstack argv_obstack;
   obstack_init (&argv_obstack);
   obstack_ptr_grow (&argv_obstack, compiler);
+  if (fPIC)
+    obstack_ptr_grow (&argv_obstack, "-fPIC");
+  if (fpic)
+    obstack_ptr_grow (&argv_obstack, "-fpic");
   if (save_temps)
     obstack_ptr_grow (&argv_obstack, "-save-temps");
   if (verbose)
@@ -596,6 +601,8 @@ main (int argc, char **argv)
   /* Scan the argument vector.  */
   bool fopenmp = false;
   bool fopenacc = false;
+  bool fPIC = false;
+  bool fpic = false;
   for (int i = 1; i < argc; i++)
     {
 #define STR "-foffload-abi="
@@ -614,6 +621,10 @@ main (int argc, char **argv)
 	fopenmp = true;
       else if (strcmp (argv[i], "-fopenacc") == 0)
 	fopenacc = true;
+      else if (strcmp (argv[i], "-fPIC") == 0)
+	fPIC = true;
+      else if (strcmp (argv[i], "-fpic") == 0)
+	fpic = true;
       else if (strcmp (argv[i], "-save-temps") == 0)
 	save_temps = true;
       else if (strcmp (argv[i], "-v") == 0)
@@ -766,7 +777,7 @@ main (int argc, char **argv)
   xputenv (concat ("COMPILER_PATH=", cpath, NULL));
   xputenv (concat ("LIBRARY_PATH=", lpath, NULL));
 
-  compile_native (gcn_cfile_name, outname, collect_gcc);
+  compile_native (gcn_cfile_name, outname, collect_gcc, fPIC, fpic);
 
   return 0;
 }
diff --git a/gcc/config/nvptx/mkoffload.c b/gcc/config/nvptx/mkoffload.c
index efdf9b9..4fecb2b 100644
--- a/gcc/config/nvptx/mkoffload.c
+++ b/gcc/config/nvptx/mkoffload.c
@@ -356,7 +356,8 @@ process (FILE *in, FILE *out)
 }
 
 static void
-compile_native (const char *infile, const char *outfile, const char *compiler)
+compile_native (const char *infile, const char *outfile, const char *compiler,
+		bool fPIC, bool fpic)
 {
   const char *collect_gcc_options = getenv ("COLLECT_GCC_OPTIONS");
   if (!collect_gcc_options)
@@ -366,6 +367,10 @@ compile_native (const char *infile, const char *outfile, const char *compiler)
   struct obstack argv_obstack;
   obstack_init (&argv_obstack);
   obstack_ptr_grow (&argv_obstack, compiler);
+  if (fPIC)
+    obstack_ptr_grow (&argv_obstack, "-fPIC");
+  if (fpic)
+    obstack_ptr_grow (&argv_obstack, "-fpic");
   if (save_temps)
     obstack_ptr_grow (&argv_obstack, "-save-temps");
   if (verbose)
@@ -471,6 +476,8 @@ main (int argc, char **argv)
   /* Scan the argument vector.  */
   bool fopenmp = false;
   bool fopenacc = false;
+  bool fPIC = false;
+  bool fpic = false;
   for (int i = 1; i < argc; i++)
     {
 #define STR "-foffload-abi="
@@ -489,6 +496,10 @@ main (int argc, char **argv)
 	fopenmp = true;
       else if (strcmp (argv[i], "-fopenacc") == 0)
 	fopenacc = true;
+      else if (strcmp (argv[i], "-fPIC") == 0)
+	fPIC = true;
+      else if (strcmp (argv[i], "-fpic") == 0)
+	fpic = true;
       else if (strcmp (argv[i], "-save-temps") == 0)
 	save_temps = true;
       else if (strcmp (argv[i], "-v") == 0)
@@ -587,7 +598,7 @@ main (int a

[PATCH] Add TARGET_UPDATE_DECL_ALIGNMENT [PR95237]

2020-06-23 Thread Sunil K Pandey via Gcc-patches
From: Sunil K Pandey 

The default for this hook is a no-op.  For x86 in 32-bit mode, the hook
sets the alignment of long long on the stack to 32 bits if the preferred
stack boundary is 32 bits.
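
Reduced to plain data, the decision the x86 hook makes can be sketched as
below. This is a simplified model for illustration only: struct decl_info
and updated_align_bits are hypothetical names, and each field merely stands
in for the GCC predicate named in its comment.

```c
#include <stdbool.h>

/* Hypothetical plain-data stand-in for the properties the hook inspects.  */
struct decl_info
{
  bool in_function;     /* stands in for cfun != NULL */
  bool target_64bit;    /* stands in for TARGET_64BIT */
  bool is_global;       /* stands in for is_global_var (decl) */
  bool is_dimode;       /* the decl or its type has mode DImode */
  bool user_align;      /* TYPE_USER_ALIGN or DECL_USER_ALIGN is set */
  unsigned align_bits;  /* stands in for DECL_ALIGN (decl) */
};

/* Return the alignment in bits the decl should end up with: 32 for a
   64-bit-aligned local DImode object in 32-bit mode when the preferred
   stack boundary is below 64 bits and the user did not request an
   alignment; otherwise the alignment is left unchanged.  */
unsigned
updated_align_bits (const struct decl_info *d,
		    unsigned preferred_stack_boundary_bits)
{
  if (d->in_function
      && !d->target_64bit
      && d->align_bits == 64
      && preferred_stack_boundary_bits < 64
      && !d->is_global
      && d->is_dimode
      && !d->user_align)
    return 32;
  return d->align_bits;
}
```

User-specified alignment always wins in this model, which matches the
TYPE_USER_ALIGN/DECL_USER_ALIGN guards in the patch.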

 - This patch fixes
gcc.target/i386/pr69454-2.c
gcc.target/i386/stackalign/longlong-1.c
 - Regression tested on x86-64; no new failures introduced.

Tested on x86-64.

gcc/ChangeLog:

PR target/95237
* config/i386/i386.c (ix86_update_decl_alignment): New
function.
(TARGET_UPDATE_DECL_ALIGNMENT): Define.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_UPDATE_DECL_ALIGNMENT): New hook.
* stor-layout.c (do_type_align): Call target hook to update
decl alignment.
* target.def (update_decl_alignment): New hook.

gcc/testsuite/ChangeLog:

PR target/95237
* gcc.target/i386/pr95237-1.c: New test.
* gcc.target/i386/pr95237-2.c: New test.
* gcc.target/i386/pr95237-3.c: New test.
* gcc.target/i386/pr95237-4.c: New test.
* gcc.target/i386/pr95237-5.c: New test.
---
 gcc/config/i386/i386.c                    | 22 ++++++++++++++++++++++
 gcc/doc/tm.texi                           |  5 +++++
 gcc/doc/tm.texi.in                        |  2 ++
 gcc/stor-layout.c                         |  2 ++
 gcc/target.def                            |  7 +++++++
 gcc/testsuite/gcc.target/i386/pr95237-1.c | 16 ++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr95237-2.c | 10 ++++++++++
 gcc/testsuite/gcc.target/i386/pr95237-3.c | 10 ++++++++++
 gcc/testsuite/gcc.target/i386/pr95237-4.c | 10 ++++++++++
 gcc/testsuite/gcc.target/i386/pr95237-5.c | 16 ++++++++++++++++
 10 files changed, 100 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95237-5.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 37aaa49996d..bcd9abd5303 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -16917,6 +16917,25 @@ ix86_minimum_alignment (tree exp, machine_mode mode,
 
   return align;
 }
+
+/* Implement TARGET_UPDATE_DECL_ALIGNMENT.  */
+
+static void
+ix86_update_decl_alignment (tree decl)
+{
+  tree type = TREE_TYPE (decl);
+
+  if (cfun != NULL
+      && !TARGET_64BIT
+      && DECL_ALIGN (decl) == 64
+      && ix86_preferred_stack_boundary < 64
+      && !is_global_var (decl)
+      && (DECL_MODE (decl) == E_DImode
+	  || (type && TYPE_MODE (type) == E_DImode))
+      && (!type || !TYPE_USER_ALIGN (type))
+      && (!decl || !DECL_USER_ALIGN (decl)))
+    SET_DECL_ALIGN (decl, 32);
+}
 
 /* Find a location for the static chain incoming to a nested function.
This is a register, unless all free registers are used by arguments.  */
@@ -23519,6 +23538,9 @@ ix86_run_selftests (void)
 #undef TARGET_CAN_CHANGE_MODE_CLASS
 #define TARGET_CAN_CHANGE_MODE_CLASS ix86_can_change_mode_class
 
+#undef TARGET_UPDATE_DECL_ALIGNMENT
+#define TARGET_UPDATE_DECL_ALIGNMENT ix86_update_decl_alignment
+
 #undef TARGET_STATIC_RTX_ALIGNMENT
 #define TARGET_STATIC_RTX_ALIGNMENT ix86_static_rtx_alignment
 #undef TARGET_CONSTANT_ALIGNMENT
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 6e7d9dc54a9..c11ef5dca89 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1086,6 +1086,11 @@ On 32-bit ELF the largest supported section alignment in bits is
 @samp{(0x8000 * 8)}, but this is not representable on 32-bit hosts.
 @end defmac
 
+@deftypefn {Target Hook} void TARGET_UPDATE_DECL_ALIGNMENT (tree @var{decl})
+Define this hook to update alignment of decl
+@samp{(@var{decl}}.
+@end deftypefn
+
 @deftypefn {Target Hook} HOST_WIDE_INT TARGET_STATIC_RTX_ALIGNMENT (machine_mode @var{mode})
 This hook returns the preferred alignment in bits for a
 statically-allocated rtx, such as a constant pool entry.  @var{mode}
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 3be984bbd5c..618acd73a1e 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1036,6 +1036,8 @@ On 32-bit ELF the largest supported section alignment in bits is
 @samp{(0x8000 * 8)}, but this is not representable on 32-bit hosts.
 @end defmac
 
+@hook TARGET_UPDATE_DECL_ALIGNMENT
+
 @hook TARGET_STATIC_RTX_ALIGNMENT
 
 @defmac DATA_ALIGNMENT (@var{type}, @var{basic-align})
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index bde6fa22b58..0687a68ba29 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -605,6 +605,8 @@ do_type_align (tree type, tree decl)
   if (TYPE_ALIGN (type) > DECL_ALIGN (decl))
     {
       SET_DECL_ALIGN (decl, TYPE_ALIGN (type));
+      /* Update decl alignment */
+      targetm.update_decl_alignment (decl);
       if (TREE_CODE (decl) == FIELD_DECL)
 	DECL_USER_ALIGN (decl) = TYPE_USER_ALIGN (type);
     }
diff --git a/gcc/target.def b/gcc/target.def
ind

Re: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-06-23 Thread Andrew Stubbs

On 23/06/2020 16:21, Tobias Burnus wrote:

If the offloading code is (only) in a library, one can come up
with the idea to build those parts as shared library – and link
it to the nonoffloading code.(*)

Currently, this fails as the mkoffload calls the nonoffloading
compiler without the -fpic/-fPIC flags, even though the compiler
was originally invoked with those options. – And at some point,
the linker then complains.

This patch simply adds -fpic/-fPIC to the calls to the nonoffloading
("host") compiler, invoked from mkoffload, if they were present before.

For the testcase at hand, this works with both AMDGCN and nvptx
with the attached patch.

OK for the trunk?


The GCN bit is OK.

Andrew


[PATCH] build: Change conditional include and empty.mk to -include in Makefiles

2020-06-23 Thread David Edelsohn via Gcc-patches
This patch removes ifneq from Makefile fragments in gcc/Makefile.in
and empty.mk in libgcc/Makefile.in.

GNU Make supports the "-include" keyword to prevent warnings and errors due to
inclusion of non-existent files.  This patch changes gcc/ and libgcc/ to use
"-include" in place of the historical conditional inclusion and use of empty.mk
work-arounds.  This makes the GCC build machinery more consistent
across languages and target libraries.

Bootstrapped on powerpc64le-gnu-linux and powerpc-ibm-aix7.2.0.0

Okay?

Thanks, David

gcc/ChangeLog

* Makefile.in (tmake_file): Use -include.
(xmake_file): Same.
(LANG_MAKEFRAGS): Same.

libgcc/ChangeLog

* Makefile.in: Remove uses of empty.mk. Use -include.
* config/avr/t-avr: Use -include.
* empty.mk: Delete.


0001-build-Change-conditional-include-and-empty.mk-to-inc.patch
Description: Binary data


Re: [PATCH] build: Change conditional include and empty.mk to -include in Makefiles

2020-06-23 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 23, 2020 at 11:35:21AM -0400, David Edelsohn wrote:
> This patch removes ifneq from Makefile fragments in gcc/Makefile.in
> and empty.mk in libgcc/Makefile.in.
> 
> GNU Make supports the "-include" keyword to prevent warnings and errors due to
> inclusion of non-existent files.  This patch changes gcc/ and libgcc/ to use
> "-include" in place of the historical conditional inclusion and use of empty.mk
> work-arounds.  This makes the GCC build machinery more consistent
> across languages and target libraries.
> 
> Bootstrapped on powerpc64le-gnu-linux and powerpc-ibm-aix7.2.0.0
> 
> Okay?

Another option is sinclude instead of -include, but no preference between
those.
Ok, thanks.

> Thanks, David
> 
> gcc/ChangeLog
> 
> * Makefile.in (tmake_file): Use -include.
> (xmake_file): Same.
> (LANG_MAKEFRAGS): Same.
> 
> libgcc/ChangeLog
> 
> * Makefile.in: Remove uses of empty.mk. Use -include.
> * config/avr/t-avr: Use -include.
> * empty.mk: Delete.


Jakub



Re: [Patch 1/3] aarch64: New Straight Line Speculation (SLS) mitigation flags

2020-06-23 Thread Richard Sandiford
Matthew Malcomson  writes:
> @@ -14466,6 +14466,81 @@ aarch64_validate_mcpu (const char *str, const struct processor **res,
>return false;
>  }
>  
> +

Should just be one blank line here.

> +/* Straight line speculation indicators.  */
> +enum aarch64_sls_hardening_type
> +{
> +SLS_NONE = 0,
> +SLS_RETBR = 1,
> +SLS_BLR = 2,
> +SLS_ALL = 3,

Just indent by two spaces rather than four.

> +};
> +static enum aarch64_sls_hardening_type aarch64_sls_hardening;

Maybe easier to read with a line break here.

> +/* Return whether we should mitigatate Straight Line Speculation for the RET
> +   and BR instructions.  */
> +bool
> +aarch64_harden_sls_retbr_p (void)
> +{
> +  return aarch64_sls_hardening & SLS_RETBR;
> +}

…and here.

> +/* Return whether we should mitigatate Straight Line Speculation for the RET
> +   and BR instructions.  */
> +bool
> +aarch64_harden_sls_blr_p (void)
> +{
> +  return aarch64_sls_hardening & SLS_BLR;
> +}

Pasto in the comment: this one returns true for BLR speculation, not for
RET + BR.

> +
> +/* As of yet we only allow setting these options globally, in the future we may
> +   allow setting them per function.  */
> +static void
> +aarch64_validate_sls_mitigation (const char *const_str)
> +{
> +  char *str_root = xstrdup (const_str);
> +  char *token_save = NULL;
> +  char *str = NULL;
> +  int temp = SLS_NONE;
> +
> +  aarch64_sls_hardening = SLS_NONE;
> +  if (strcmp (str_root, "none") == 0)
> +goto finish;

In Clang I think this would override any previous option, so should
we set aarch64_sls_hardening to 0?

> +  if (strcmp (str_root, "all") == 0)
> +{
> +  aarch64_sls_hardening = SLS_ALL;
> +  goto finish;
> +}
> +
> +  str = strtok_r (str_root, ",", &token_save);
> +  if (!str)
> +{
> +  error ("invalid argument given to %<-mharden-sls=%>");
> +  goto finish;
> +}

I'm not particularly anti-goto, but in this case it looks simpler
to do the full-string comparisons on const_str and only duplicate
the string before the strtok_r.
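
A minimal sketch of that restructuring, as a standalone function rather
than the actual GCC code, might look like the following. The name
parse_harden_sls, the int return convention and the use of malloc/free in
place of xstrdup and error () are assumptions made so that the example
compiles on its own.

```c
#define _POSIX_C_SOURCE 200809L  /* for strtok_r */
#include <stdlib.h>
#include <string.h>

enum sls_hardening
{
  SLS_NONE = 0,
  SLS_RETBR = 1,
  SLS_BLR = 2,
  SLS_ALL = 3
};

/* Parse ARG (the text after -mharden-sls=) into *RESULT.
   Return 0 on success and -1 for an invalid argument, in which case
   *RESULT is left untouched.  */
int
parse_harden_sls (const char *arg, enum sls_hardening *result)
{
  /* The whole-string cases are compared on the const string, so no copy
     and no goto are needed for them.  */
  if (strcmp (arg, "none") == 0)
    {
      *result = SLS_NONE;
      return 0;
    }
  if (strcmp (arg, "all") == 0)
    {
      *result = SLS_ALL;
      return 0;
    }

  /* strtok_r modifies its input, so duplicate the string only now.  */
  char *copy = malloc (strlen (arg) + 1);
  if (!copy)
    return -1;
  strcpy (copy, arg);

  int bits = SLS_NONE;
  int status = 0;
  char *save = NULL;
  char *tok = strtok_r (copy, ",", &save);
  if (!tok)
    status = -1;  /* Empty argument.  */
  for (; tok && status == 0; tok = strtok_r (NULL, ",", &save))
    {
      if (strcmp (tok, "blr") == 0)
	bits |= SLS_BLR;
      else if (strcmp (tok, "retbr") == 0)
	bits |= SLS_RETBR;
      else
	status = -1;  /* Unknown token, or "none"/"all" inside a list.  */
    }

  free (copy);
  if (status == 0)
    *result = (enum sls_hardening) bits;
  return status;
}
```

With this shape there is a single free () on the only path that allocates,
which is the simplification the comment above is after.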

> +  while (str)
> +{
> +  if (strcmp (str, "blr") == 0)
> + temp |= SLS_BLR;
> +  else if (strcmp (str, "retbr") == 0)
> + temp |= SLS_RETBR;
> +  else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0)
> + {
> +   error ("%<%s%> must be by itself for %<-mharden-sls=%>", str);
> +   break;
> + }
> +  else
> + {
> +   error ("invalid argument %<%s%> for %<-mharden-sls=%>", str);
> +   break;
> + }
> +  str = strtok_r (NULL, ",", &token_save);
> +}
> +  aarch64_sls_hardening = (aarch64_sls_hardening_type) temp;
> +finish:
> +  free (str_root);
> +  return;
> +}

Think it's more usual in gcc not to have explicit end-of-function void
returns.

>  /* Parses CONST_STR for branch protection features specified in
> aarch64_branch_protect_types, and set any global variables required.  Returns
> the parsing result and assigns LAST_STR to the last processed token from
> @@ -14710,6 +14785,9 @@ aarch64_override_options (void)
>selected_arch = NULL;
>selected_tune = NULL;
>  
> +  if (aarch64_harden_sls_string)
> +  aarch64_validate_sls_mitigation (aarch64_harden_sls_string);

Last line is indented two spaces too many.

> +
>if (aarch64_branch_protection_string)
>  aarch64_validate_mbranch_protection (aarch64_branch_protection_string);
>  
> diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
> index d99d14c137d8774d3c8dab860d475f68c01a2817..5170361fd5e5721e044d1664e522b2718f654b8e 100644
> --- a/gcc/config/aarch64/aarch64.opt
> +++ b/gcc/config/aarch64/aarch64.opt
> @@ -71,6 +71,10 @@ mgeneral-regs-only
>  Target Report RejectNegative Mask(GENERAL_REGS_ONLY) Save
>  Generate code which uses only the general registers.
>  
> +mharden-sls=
> +Target RejectNegative Joined Var(aarch64_harden_sls_string)
> +Generate code to mitigate against straight line speculation.
> +
>  mfix-cortex-a53-835769
>  Target Report Var(aarch64_fix_a53_err835769) Init(2) Save
>  Workaround for ARM Cortex-A53 Erratum number 835769.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 35e8242af5fa4c52744fd2c3e2cfee0a617e22bb..8a3fab2964c9bb06c820766d284768751d63ac9a 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -696,6 +696,7 @@ Objective-C and Objective-C++ Dialects}.
>  -msign-return-address=@var{scope} @gol
>  -mbranch-protection=@var{none}|@var{standard}|@var{pac-ret}[+@var{leaf}
>  +@var{b-key}]|@var{bti} @gol
> +-mharden-sls=@var{none}|@var{all}|@var{retbr}|@var{blr} @gol
>  -march=@var{name}  -mcpu=@var{name}  -mtune=@var{name}  @gol
>  -moverride=@var{string}  -mverbose-cost-dump @gol
>  -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{sysreg} @gol
> @@ -17045,6 +17046,15 @@ functions.  The optional argument @samp{b-key} can be used to sign the functions
>  with the B-key instead of the A-key.
>  @samp{bti} turns on branch target identificati

RE: [PATCH][RFC] __builtin_shuffle sometimes should produce zip1 rather than TBL (PR82199)

2020-06-23 Thread Tamar Christina
Adding AArch64 maintainers.

> -Original Message-
> From: Gcc-patches  On Behalf Of Dmitrij
> Pochepko
> Sent: Thursday, June 11, 2020 12:22 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH][RFC] __builtin_shuffle sometimes should produce zip1
> rather than TBL (PR82199)
> 
> The following patch enables vector permutations optimization by using
> another vector element size when applicable.
> It allows usage of simpler instructions in applicable cases.
> 
> example:
> #define vector __attribute__((vector_size(16) ))
> 
> vector float f(vector float a, vector float b) {
>   return __builtin_shuffle  (a, b, (vector int){0, 1, 4,5}); }
> 
> was compiled into:
> ...
>   adrp    x0, .LC0
>   ldr     q2, [x0, #:lo12:.LC0]
>   tbl     v0.16b, {v0.16b - v1.16b}, v2.16b
> ...
> 
> and after patch:
> ...
>   zip1    v0.2d, v0.2d, v1.2d
> ...
> 
> bootstrapped and tested on aarch64-linux-gnu with no regressions
> 
> 
> This patch was initially introduced by Andrew Pinski,
> with me being involved later.
> 
> (I have no write access to repo)
> 
> Thanks,
> Dmitrij
> 
> gcc/ChangeLog:
> 
> 2020-06-11  Andrew Pinski
> 
>   PR gcc/82199
> 
>   * gcc/config/aarch64/aarch64.c (aarch64_evpc_reencode): New
> function
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-06-11  Andrew Pinski   
> 
>   PR gcc/82199
> 
>   * gcc.target/aarch64/vdup_n_3.c: New test
>   * gcc.target/aarch64/vzip_1.c: New test
>   * gcc.target/aarch64/vzip_2.c: New test
>   * gcc.target/aarch64/vzip_3.c: New test
>   * gcc.target/aarch64/vzip_4.c: New test
> 
> Co-Authored-By: Dmitrij Pochepko
> 
> 
> 
> Thanks,
> Dmitrij


RE: [PATCH][RFC] vector creation from two parts of two vectors produces TBL rather than ins (PR93720)

2020-06-23 Thread Tamar Christina
Adding AArch64 maintainers.
Also, Dmitrij, the patch lacks a changelog.

> -Original Message-
> From: Gcc-patches  On Behalf Of Dmitrij
> Pochepko
> Sent: Wednesday, June 17, 2020 10:09 AM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH][RFC] vector creation from two parts of two vectors
> produces TBL rather than ins (PR93720)
> 
> The following patch enables vector permutations optimization by trying to
> use ins instruction instead of slow and generic tbl.
> 
> example:
> #define vector __attribute__((vector_size(4*sizeof(float))))
> 
> vector float f0(vector float a, vector float b) {
>   return __builtin_shuffle (a, a, (vector int){3, 1, 2, 3}); }
> 
> 
> was compiled into:
> ...
> adrp    x0, .LC0
> ldr     q1, [x0, #:lo12:.LC0]
> tbl     v0.16b, {v0.16b}, v1.16b
> ...
> 
> and after patch:
> ...
> ins v0.s[0], v0.s[3]
> ...
> 
> bootstrapped and tested on aarch64-linux-gnu with no regressions
> 
> 
> This patch was initially introduced by me, with Andrew Pinski
> being involved later.
> 
> Please note that test in this patch depends on another commit (PR82199),
> which I sent not long ago.
> 
> (I have no write access to repo)
> 
> Thanks,
> Dmitrij


[PATCH v2] libiberty, include: add bsearch_r

2020-06-23 Thread Nick Alcock via Gcc-patches
libctf wants a bsearch that takes a void * arg pointer to avoid a
nonportable use of __thread.

bsearch_r is required, not optional, at this point because as far as I
can see this obvious-sounding function is not implemented by anyone's
libc.  We can easily move it to AC_LIBOBJ later if it proves necessary
to do so.
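
For readers unfamiliar with the idea, here is a minimal sketch of a binary
search whose comparator receives an extra argument, in the spirit of
qsort_r. It is not the libiberty implementation (that one is derived from
the BSD bsearch); sketch_bsearch_r and compare_ints_counting are
hypothetical names, and only the parameter order follows the prototype
added to libiberty.h.

```c
#include <stddef.h>

/* Search the sorted array BASE of NMEMB elements, each SIZE bytes, for an
   element matching KEY.  COMPAR is called as compar (key, element, arg)
   and returns <0, 0 or >0.  Returns a pointer to a matching element, or
   NULL if there is none.  */
void *
sketch_bsearch_r (const void *key, const void *base, size_t nmemb,
		  size_t size,
		  int (*compar) (const void *, const void *, void *),
		  void *arg)
{
  const char *lo = base;
  while (nmemb > 0)
    {
      size_t half = nmemb / 2;
      const char *mid = lo + half * size;
      int cmp = compar (key, mid, arg);
      if (cmp == 0)
	return (void *) mid;
      if (cmp > 0)
	{
	  /* KEY sorts after MID: drop MID and everything below it.  */
	  lo = mid + size;
	  nmemb -= half + 1;
	}
      else
	/* KEY sorts before MID: keep only the elements below it.  */
	nmemb = half;
    }
  return NULL;
}

/* Example comparator for ints.  ARG points to a call counter, i.e. the
   kind of per-call state that a plain bsearch comparator could only reach
   through a global or thread-local variable.  */
int
compare_ints_counting (const void *a, const void *b, void *arg)
{
  int x = *(const int *) a;
  int y = *(const int *) b;
  ++*(unsigned *) arg;
  return (x > y) - (x < y);
}
```

Passing the state through ARG is what lets a caller such as libctf avoid
__thread.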

include/
* libiberty.h (bsearch_r): New.
libiberty/
* bsearch_r.c: New file.
* Makefile.in (CFILES): Add bsearch_r.c.
(REQUIRED_OFILES): Add bsearch_r.o.
* functions.texi: Regenerate.
---
 include/libiberty.h  |  7 +++
 libiberty/Makefile.in| 12 +-
 libiberty/bsearch_r.c| 93 
 libiberty/functions.texi | 21 -
 4 files changed, 130 insertions(+), 3 deletions(-)
 create mode 100644 libiberty/bsearch_r.c

v2: actually regenerate functions.texi.

Acked six months ago (!) by Jeff Law here:


Rebased against master (commit efc16503ca10bc0e934e0bace5777500e4dc757a)
and retested building this morning.

Sorry for the delay.

diff --git a/include/libiberty.h b/include/libiberty.h
index 141cb886a85..0bb5b81d4ac 100644
--- a/include/libiberty.h
+++ b/include/libiberty.h
@@ -641,6 +641,13 @@ extern int pexecute (const char *, char * const *, const char *,
 
 extern int pwait (int, int *, int);
 
+/* Like bsearch, but takes and passes on an argument like qsort_r.  */
+
+extern void *bsearch_r (register const void *, const void *,
+   size_t, register size_t,
+   register int (*)(const void *, const void *, void *),
+   void *);
+
 #if defined(HAVE_DECL_ASPRINTF) && !HAVE_DECL_ASPRINTF
 /* Like sprintf but provides a pointer to malloc'd storage, which must
be freed by the caller.  */
diff --git a/libiberty/Makefile.in b/libiberty/Makefile.in
index d6b302e02fd..895f701bcd0 100644
--- a/libiberty/Makefile.in
+++ b/libiberty/Makefile.in
@@ -124,7 +124,7 @@ COMPILE.c = $(CC) -c @DEFS@ $(CFLAGS) $(CPPFLAGS) -I. -I$(INCDIR) \
 # CONFIGURED_OFILES and funcs in configure.ac.  Also run "make maint-deps"
 # to build the new rules.
 CFILES = alloca.c argv.c asprintf.c atexit.c   \
-   basename.c bcmp.c bcopy.c bsearch.c bzero.c \
+   basename.c bcmp.c bcopy.c bsearch.c bsearch_r.c bzero.c \
calloc.c choose-temp.c clock.c concat.c cp-demangle.c   \
 cp-demint.c cplus-dem.c crc32.c\
d-demangle.c dwarfnames.c dyn-string.c  \
@@ -168,6 +168,7 @@ REQUIRED_OFILES = \
./regex.$(objext) ./cplus-dem.$(objext) ./cp-demangle.$(objext) \
./md5.$(objext) ./sha1.$(objext) ./alloca.$(objext) \
./argv.$(objext)\
+   ./bsearch_r.$(objext)   \
./choose-temp.$(objext) ./concat.$(objext)  \
./cp-demint.$(objext) ./crc32.$(objext) ./d-demangle.$(objext)  \
./dwarfnames.$(objext) ./dyn-string.$(objext)   \
@@ -601,6 +602,15 @@ $(CONFIGURED_OFILES): stamp-picdir stamp-noasandir
else true; fi
$(COMPILE.c) $(srcdir)/bsearch.c $(OUTPUT_OPTION)
 
+./bsearch_r.$(objext): $(srcdir)/bsearch_r.c config.h $(INCDIR)/ansidecl.h
+   if [ x"$(PICFLAG)" != x ]; then \
+ $(COMPILE.c) $(PICFLAG) $(srcdir)/bsearch_r.c -o pic/$@; \
+   else true; fi
+   if [ x"$(NOASANFLAG)" != x ]; then \
+ $(COMPILE.c) $(PICFLAG) $(NOASANFLAG) $(srcdir)/bsearch_r.c -o noasan/$@; \
+   else true; fi
+   $(COMPILE.c) $(srcdir)/bsearch_r.c $(OUTPUT_OPTION)
+
 ./bzero.$(objext): $(srcdir)/bzero.c
if [ x"$(PICFLAG)" != x ]; then \
  $(COMPILE.c) $(PICFLAG) $(srcdir)/bzero.c -o pic/$@; \
diff --git a/libiberty/bsearch_r.c b/libiberty/bsearch_r.c
new file mode 100644
index 00000000000..79ebae9b0be
--- /dev/null
+++ b/libiberty/bsearch_r.c
@@ -0,0 +1,93 @@
+/*
+ * Copyright (c) 1990 Regents of the University of California.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. [rescinded 22 July 1999]
+ * 4. Neither the name of the University nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ *
+ * THIS

Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-06-23 Thread Richard Sandiford
Matthew Malcomson  writes:
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -780,6 +780,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
>  
>  tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
>  
> +const char * aarch64_sls_barrier (int);

Should be no space after the “*”.

>  extern bool aarch64_harden_sls_retbr_p (void);
>  extern bool aarch64_harden_sls_blr_p (void);
>  
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 24767c747bab0d711627c5c646937c42f210d70b..5da3d94e335fc315e1d90e6a674f2f09cf1a4529 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -281,6 +281,7 @@ extern unsigned aarch64_architecture_version;
>  #define AARCH64_ISA_F32MM   (aarch64_isa_flags & AARCH64_FL_F32MM)
>  #define AARCH64_ISA_F64MM   (aarch64_isa_flags & AARCH64_FL_F64MM)
>  #define AARCH64_ISA_BF16(aarch64_isa_flags & AARCH64_FL_BF16)
> +#define AARCH64_ISA_SB  (aarch64_isa_flags & AARCH64_FL_SB)
>  
>  /* Crypto is an optional extension to AdvSIMD.  */
>  #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
> @@ -378,6 +379,9 @@ extern unsigned aarch64_architecture_version;
>  #define TARGET_FIX_ERR_A53_835769_DEFAULT 1
>  #endif
>  
> +/* SB instruction is enabled through +sb.  */
> +#define TARGET_SB (AARCH64_ISA_SB)
> +
>  /* Apply the workaround for Cortex-A53 erratum 835769.  */
>  #define TARGET_FIX_ERR_A53_835769\
>((aarch64_fix_a53_err835769 == 2)  \
> @@ -1058,8 +1062,11 @@ typedef struct
>  
>  #define RETURN_ADDR_RTX aarch64_return_addr
>  
> -/* BTI c + 3 insns + 2 pointer-sized entries.  */
> -#define TRAMPOLINE_SIZE  (TARGET_ILP32 ? 24 : 32)
> +/* BTI c + 3 insns
> +   + sls barrier of DSB + ISB.
> +   + 2 pointer-sized entries.  */
> +#define TRAMPOLINE_SIZE  (24 \
> +  + (TARGET_ILP32 ? 8 : 16))

Personal taste, sorry, but IMO this is easier to read on one line.
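
For reference, the arithmetic behind the new TRAMPOLINE_SIZE value can be
spelled out as below. This is only a worked check of the numbers in the
quoted comment, assuming 4-byte A64 instructions and 4-byte (ILP32) or
8-byte (LP64) pointers; trampoline_size is a hypothetical helper, not
something in the patch.

```c
#include <stdbool.h>

enum { A64_INSN_BYTES = 4 };  /* every A64 instruction is 4 bytes */

/* Size in bytes of the trampoline described in the quoted comment:
   BTI c + 3 insns, a DSB + ISB barrier, and two pointer-sized entries.  */
unsigned
trampoline_size (bool ilp32)
{
  unsigned code_insns = 1 /* BTI c */ + 3;
  unsigned barrier_insns = 2;  /* DSB + ISB */
  unsigned pointer_bytes = ilp32 ? 4 : 8;
  return ((code_insns + barrier_insns) * A64_INSN_BYTES
	  + 2 * pointer_bytes);
}
```

That gives 24 + 8 = 32 bytes for ILP32 and 24 + 16 = 40 bytes otherwise,
matching the macro.
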

>  
>  /* Trampolines contain dwords, so must be dword aligned.  */
>  #define TRAMPOLINE_ALIGNMENT 64
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 775f49991e5f599a843d3ef490b8cd044acfe78f..9356937fe266c68196392a1589b3cf96607de104 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -10822,8 +10822,8 @@ aarch64_return_addr (int count, rtx frame ATTRIBUTE_UNUSED)
>  static void
>  aarch64_asm_trampoline_template (FILE *f)
>  {
> -  int offset1 = 16;
> -  int offset2 = 20;
> +  int offset1 = 24;
> +  int offset2 = 28;

Huh, the offset handling in this function is a bit twisty, but that's
not your fault :-)

> […]
> @@ -11054,6 +11065,7 @@ aarch64_output_casesi (rtx *operands)
>output_asm_insn (buf, operands);
>output_asm_insn (patterns[index][1], operands);
>output_asm_insn ("br\t%3", operands);
> +  output_asm_insn (aarch64_sls_barrier (aarch64_harden_sls_retbr_p ()), 
> operands);

Long line.

> […]
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index ff15505d45546124868d2531b7f4e5b0f1f5bebc..75ef87a3b4674cc73cb42cc82cfb8e782acf77f6 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -447,8 +447,15 @@
>  (define_insn "indirect_jump"
>[(set (pc) (match_operand:DI 0 "register_operand" "r"))]
>""
> -  "br\\t%0"
> -  [(set_attr "type" "branch")]
> +  {
> +output_asm_insn ("br\\t%0", operands);
> +return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
> +  }
> +  [(set_attr "type" "branch")
> +   (set (attr "length")
> + (cond [(match_test "!aarch64_harden_sls_retbr_p ()") (const_int 4)
> +(match_test "TARGET_SB") (const_int 8)]
> +   (const_int 12)))]

Rather than duplicating this several times, I think it would be better
to add a new attribute like “sls_mitigation”, set that attribute in the
define_insns, and then use “sls_mitigation” in the default “length”
calculation.  See e.g. what rth did with “movprfx”.

> […]
> diff --git a/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-retbr-pacret.c b/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-retbr-pacret.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..11f614b4ef2eb0fa3707cb46a55583d6685b89d0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-retbr-pacret.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-mharden-sls=retbr -mbranch-protection=pac-ret -march=armv8.3-a" } */
> +
> +/* Testing the do_return pattern for retaa and retab.  */
> +long retbr_subcall(void);
> +long retbr_do_return_retaa(void)
> +{
> +return retbr_subcall()+1;
> +}
> +__attribute__((target("branch-protection=pac-ret+b-key")))
> +long retbr_do_return_retab(void)
> +{
> +return retbr_subcall()+1;
> +}
> +
> +/* Ensure there are no BR or RET instructions which are not directly followed

Re: [Patch v2 3/3] aarch64: Mitigate SLS for BLR instruction

2020-06-23 Thread Richard Sandiford
Matthew Malcomson  writes:
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index f996472d6990b7709602ae93f7a2cb7daa0e84b0..9795c929b8733f89722d3660456f5e7d6405d902 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -643,6 +643,16 @@ extern unsigned aarch64_architecture_version;
>  #define GP_REGNUM_P(REGNO)   \
>(((unsigned) (REGNO - R0_REGNUM)) <= (R30_REGNUM - R0_REGNUM))
>  
> +/* Registers known to be preserved over a BL instruction.  This consists of 
> the
> +   GENERAL_REGS without x16, x17, and x30.  The x30 register is changed by 
> the BL

Long line.

> +   instruction itself, while the x16 and x17 registers may be used by veneers
> +   which can be inserted by the linker.  */
> +#define STUB_REGNUM_P(REGNO) \
> +  (GP_REGNUM_P (REGNO) \
> +   && ((unsigned) (REGNO - R0_REGNUM)) != (R16_REGNUM - R0_REGNUM) \
> +   && ((unsigned) (REGNO - R0_REGNUM)) != (R17_REGNUM - R0_REGNUM) \
> +   && ((unsigned) (REGNO - R0_REGNUM)) != (R30_REGNUM - R0_REGNUM)) \

Sorry, I should have noticed this before, but we can just compare
(REGNO) directly with R16_REGNUM etc, without subtracting R0_REGNUM from
both sides.  The R0_REGNUM stuff is only needed for range comparisons,
where the idea is to avoid reevaluating REGNO.
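
A sketch of the simplified macro, under the assumption that the general
registers are numbered 0 to 30 (the enum values below are placeholders for
this example, not copied from aarch64.md), could read as follows.
stub_regnum_p is a hypothetical wrapper added only so the macro can be
exercised.

```c
#include <stdbool.h>

/* Register numbers assumed for the purposes of this sketch.  */
enum
{
  R0_REGNUM = 0,
  R16_REGNUM = 16,
  R17_REGNUM = 17,
  R30_REGNUM = 30
};

/* Range check: the subtraction plus unsigned cast lets one comparison
   cover both bounds while evaluating REGNO only once.  */
#define GP_REGNUM_P(REGNO) \
  (((unsigned) ((REGNO) - R0_REGNUM)) <= (unsigned) (R30_REGNUM - R0_REGNUM))

/* Simplified form: the exclusions are plain equality tests, so REGNO is
   compared with the register numbers directly.  */
#define STUB_REGNUM_P(REGNO) \
  (GP_REGNUM_P (REGNO) \
   && (REGNO) != R16_REGNUM \
   && (REGNO) != R17_REGNUM \
   && (REGNO) != R30_REGNUM)

/* Hypothetical function wrapper around the macro.  */
bool
stub_regnum_p (int regno)
{
  return STUB_REGNUM_P (regno);
}
```

Subtracting R0_REGNUM from both sides of an equality test changes nothing,
which is why it can be dropped there but not in the range check.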

> […]
> @@ -10869,7 +10872,7 @@ aarch64_asm_trampoline_template (FILE *f)
>   specific attributes to choose between hardening against straight line
>   speculation or not, but such function specific attributes are likely to
>   happen in the future.  */
> -  output_asm_insn ("dsb\tsy\n\tisb", NULL);
> +  asm_fprintf (f, "\tdsb\tsy\n\tisb\n");

Looks like this should be part of 2/3.

> […]
> +rtx
> +aarch64_sls_create_blr_label (int regnum)
> +{
> +  gcc_assert (regnum < 30 && regnum != 16 && regnum != 17);

Can just use STUB_REGNUM_P here.

> […]
> +/* Emit shared BLR stubs for the current compilation unit.
> +   Over the course of compiling this unit we may have converted some BLR
> +   instructions to a BL to a shared stub function.  This is where we emit 
> those
> +   stub functions.
> +   This function is for the stubs shared between different functions in this
> +   compilation unit.  We share when optimising for size instead of speed.

optimizing (alas).

> […]
> +/* { dg-final { scan-assembler "\tbr\tx\[0-9\]\[0-9\]?" } } */

Probably easier to read with {…} quoting rather than "…" quoting,
so that no backslashes are needed for [ and ].

OK with those changes, thanks.

Richard


[Ada] Streamline implementation of renaming in gigi

2020-06-23 Thread Eric Botcazou
The main changes are 1) the bulk of the implementation is put back entirely in 
gnat_to_gnu_entity and 2) the handling of lvalues is unified, i.e. no longer 
depends on the Materialize_Entity flag being present on the entity.

Tested on x86-64/Linux, applied on the mainline.


2020-06-23  Eric Botcazou  

* gcc-interface/ada-tree.h (DECL_RENAMED_OBJECT): Delete.
* gcc-interface/decl.c (gnat_to_gnu_entity) : Always use
the stabilized reference directly for a renaming and create a variable
pointing to it separately if requested.
* gcc-interface/misc.c (gnat_print_decl): Adjust for deletion.
* gcc-interface/trans.c (Identifier_to_gnu): Likewise.
(gnat_to_gnu) :
Do not deal with side-effects here.
: Likewise.

-- 
Eric Botcazou

diff --git a/gcc/ada/gcc-interface/ada-tree.h b/gcc/ada/gcc-interface/ada-tree.h
index 11bfc37ea3d..461fa2b598c 100644
--- a/gcc/ada/gcc-interface/ada-tree.h
+++ b/gcc/ada/gcc-interface/ada-tree.h
@@ -525,13 +525,6 @@ do {		   \
 #define SET_DECL_INDUCTION_VAR(NODE, X) \
   SET_DECL_LANG_SPECIFIC (VAR_DECL_CHECK (NODE), X)
 
-/* In a VAR_DECL without the DECL_LOOP_PARM_P flag set and that is a renaming
-   pointer, points to the object being renamed, if any.  */
-#define DECL_RENAMED_OBJECT(NODE) \
-  GET_DECL_LANG_SPECIFIC (VAR_DECL_CHECK (NODE))
-#define SET_DECL_RENAMED_OBJECT(NODE, X) \
-  SET_DECL_LANG_SPECIFIC (VAR_DECL_CHECK (NODE), X)
-
 /* In a TYPE_DECL, points to the parallel type if any, otherwise 0.  */
 #define DECL_PARALLEL_TYPE(NODE) \
   GET_DECL_LANG_SPECIFIC (TYPE_DECL_CHECK (NODE))
diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
index 63118bee930..270710b11d5 100644
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -714,7 +714,6 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	bool mutable_p = false;
 	bool used_by_ref = false;
 	tree gnu_ext_name = NULL_TREE;
-	tree gnu_renamed_obj = NULL_TREE;
 	tree gnu_ada_size = NULL_TREE;
 
 	/* We need to translate the renamed object even though we are only
@@ -1041,13 +1040,13 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	else if (type_is_padding_self_referential (TREE_TYPE (gnu_expr)))
 	  gnu_type = TREE_TYPE (gnu_expr);
 
-	/* Case 1: if this is a constant renaming stemming from a function
-	   call, treat it as a normal object whose initial value is what
-	   is being renamed.  RM 3.3 says that the result of evaluating a
-	   function call is a constant object.  Therefore, it can be the
-	   inner object of a constant renaming and the renaming must be
-	   fully instantiated, i.e. it cannot be a reference to (part of)
-	   an existing object.  And treat other rvalues the same way.  */
+	/* If this is a constant renaming stemming from a function call,
+	   treat it as a normal object whose initial value is what is being
+	   renamed.  RM 3.3 says that the result of evaluating a function
+	   call is a constant object.  Therefore, it can be the inner
+	   object of a constant renaming and the renaming must be fully
+	   instantiated, i.e. it cannot be a reference to (part of) an
+	   existing object.  And treat other rvalues the same way.  */
 	tree inner = gnu_expr;
 	while (handled_component_p (inner) || CONVERT_EXPR_P (inner))
 	  inner = TREE_OPERAND (inner, 0);
@@ -1089,92 +1088,75 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 		&& DECL_RETURN_VALUE_P (inner)))
 	  ;
 
-	/* Case 2: if the renaming entity need not be materialized, use
-	   the elaborated renamed expression for the renaming.  But this
-	   means that the caller is responsible for evaluating the address
-	   of the renaming in the correct place for the definition case to
-	   instantiate the SAVE_EXPRs.  But we cannot use this mechanism if
-	   the renamed object is an N_Expression_With_Actions because this
-	   would fail the assertion below.  */
-	else if (!Materialize_Entity (gnat_entity)
-		 && Nkind (gnat_renamed_obj) != N_Expression_With_Actions)
+	/* Otherwise, this is an lvalue being renamed, so it needs to be
+	   elaborated as a reference and substituted for the entity.  But
+	   this means that we must evaluate the address of the renaming
+	   in the definition case to instantiate the SAVE_EXPRs.  */
+	else
 	  {
-		tree init = NULL_TREE;
+		tree gnu_init = NULL_TREE;
 
-		gnu_decl
-		  = elaborate_reference (gnu_expr, gnat_entity, definition,
-	 &init);
+		if (type_annotate_only && TREE_CODE (gnu_expr) == ERROR_MARK)
+		  break;
 
-		/* We cannot evaluate the first arm of a COMPOUND_EXPR in the
-		   correct place for this case.  */
-		gcc_assert (!init);
+		gnu_expr
+		  = elaborate_reference (gnu_expr, gnat_entity, definition,
+	 &gnu_init);
 
-		/* No DECL_EXPR will be creat

[Ada] Emit user subtypes with -fgnat-encodings=minimal

2020-06-23 Thread Eric Botcazou
This changes the compiler to emit debug info for user-defined subtypes even 
with -fgnat-encodings=minimal, as they might be needed by the debugger.

Tested on x86-64/Linux, applied on the mainline.


2020-06-23  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity) : Set
the debug type to the base type and only if the subtype is artificial.

-- 
Eric Botcazou

diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
index 33d59d556a2..589154ba392 100644
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -3507,18 +3507,6 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 
 	  gnu_type = make_node (RECORD_TYPE);
 	  TYPE_NAME (gnu_type) = gnu_entity_name;
-	  if (gnat_encodings == DWARF_GNAT_ENCODINGS_MINIMAL)
-		{
-		  /* Use the ultimate base record type as the debug type.
-		 Subtypes and derived types bring no useful
-		 information.  */
-		  Entity_Id gnat_debug_type = gnat_entity;
-		  while (Etype (gnat_debug_type) != gnat_debug_type)
-		gnat_debug_type = Etype (gnat_debug_type);
-		  tree gnu_debug_type
-		= TYPE_MAIN_VARIANT (gnat_to_gnu_type (gnat_debug_type));
-		  SET_TYPE_DEBUG_TYPE (gnu_type, gnu_debug_type);
-		}
 	  TYPE_PACKED (gnu_type) = TYPE_PACKED (gnu_base_type);
 	  TYPE_REVERSE_STORAGE_ORDER (gnu_type)
 		= Reverse_Storage_Order (gnat_entity);
@@ -3580,6 +3568,13 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	 true, debug_info_p,
 	 NULL, gnat_entity);
 		}
+
+	  /* Or else, if the subtype is artificial and encodings are not
+		 used, use the base record type as the debug type.  */
+	  else if (debug_info_p
+		   && artificial_p
+		   && gnat_encodings == DWARF_GNAT_ENCODINGS_MINIMAL)
+		SET_TYPE_DEBUG_TYPE (gnu_type, gnu_unpad_base_type);
 	}
 
 	  /* Otherwise, go down all the components in the new type and make


[Ada] Emit debug info for integral variables first

2020-06-23 Thread Eric Botcazou
This makes it possible for global dynamic types to reference the DIE of these 
integral variables.

Tested on x86-64/Linux, applied on the mainline.


2020-06-23  Eric Botcazou  

* gcc-interface/utils.c (gnat_write_global_declarations): Output the
integral global variables first and the imported functions later.

-- 
Eric Botcazou

diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index 7adc3131a41..a96fde668be 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -5880,7 +5880,16 @@ gnat_write_global_declarations (void)
 	  }
 }
 
-  /* Output debug information for all global type declarations first.  This
+  /* First output the integral global variables, so that they can be referenced
+ as bounds by the global dynamic types.  Skip external variables, unless we
+ really need to emit debug info for them:, e.g. imported variables.  */
+  FOR_EACH_VEC_SAFE_ELT (global_decls, i, iter)
+if (TREE_CODE (iter) == VAR_DECL
+	&& INTEGRAL_TYPE_P (TREE_TYPE (iter))
+	&& (!DECL_EXTERNAL (iter) || !DECL_IGNORED_P (iter)))
+  rest_of_decl_compilation (iter, true, 0);
+
+  /* Now output debug information for the global type declarations.  This
  ensures that global types whose compilation hasn't been finalized yet,
  for example pointers to Taft amendment types, have their compilation
  finalized in the right context.  */
@@ -5888,30 +5897,29 @@ gnat_write_global_declarations (void)
 if (TREE_CODE (iter) == TYPE_DECL && !DECL_IGNORED_P (iter))
   debug_hooks->type_decl (iter, false);
 
-  /* Output imported functions.  */
+  /* Then output the other global variables.  We need to do that after the
+ information for global types is emitted so that they are finalized.  */
   FOR_EACH_VEC_SAFE_ELT (global_decls, i, iter)
-if (TREE_CODE (iter) == FUNCTION_DECL
-	&& DECL_EXTERNAL (iter)
-	&& DECL_INITIAL (iter) == NULL
-	&& !DECL_IGNORED_P (iter)
-	&& DECL_FUNCTION_IS_DEF (iter))
-  debug_hooks->early_global_decl (iter);
+if (TREE_CODE (iter) == VAR_DECL
+	&& !INTEGRAL_TYPE_P (TREE_TYPE (iter))
+	&& (!DECL_EXTERNAL (iter) || !DECL_IGNORED_P (iter)))
+  rest_of_decl_compilation (iter, true, 0);
 
-  /* Output global constants.  */
+  /* Output debug information for the global constants.  */
   FOR_EACH_VEC_SAFE_ELT (global_decls, i, iter)
 if (TREE_CODE (iter) == CONST_DECL && !DECL_IGNORED_P (iter))
   debug_hooks->early_global_decl (iter);
 
-  /* Then output the global variables.  We need to do that after the debug
- information for global types is emitted so that they are finalized.  Skip
- external global variables, unless we need to emit debug info for them:
- this is useful for imported variables, for instance.  */
+  /* Output it for the imported functions.  */
   FOR_EACH_VEC_SAFE_ELT (global_decls, i, iter)
-if (TREE_CODE (iter) == VAR_DECL
-	&& (!DECL_EXTERNAL (iter) || !DECL_IGNORED_P (iter)))
-  rest_of_decl_compilation (iter, true, 0);
+if (TREE_CODE (iter) == FUNCTION_DECL
+	&& DECL_EXTERNAL (iter)
+	&& DECL_INITIAL (iter) == NULL
+	&& !DECL_IGNORED_P (iter)
+	&& DECL_FUNCTION_IS_DEF (iter))
+  debug_hooks->early_global_decl (iter);
 
-  /* Output the imported modules/declarations.  In GNAT, these are only
+  /* Output it for the imported modules/declarations.  In GNAT, these are only
  materializing subprogram.  */
   FOR_EACH_VEC_SAFE_ELT (global_decls, i, iter)
if (TREE_CODE (iter) == IMPORTED_DECL && !DECL_IGNORED_P (iter))


[Ada] Fix memory corruption with vector and variant record

2020-06-23 Thread Eric Botcazou
The problem is that Has_Constrained_Partial_View must be tested on the base 
type of the designated type of an allocator.

Tested on x86-64/Linux, applied on all active branches.


2020-06-23  Eric Botcazou  

* gcc-interface/trans.c (gnat_to_gnu) : Minor tweaks.
Call Has_Constrained_Partial_View on base type of the designated type.

-- 
Eric Botcazou

diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index c32bdb96a5e..f74e0e728c9 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -7154,9 +7154,8 @@ gnat_to_gnu (Node_Id gnat_node)
 
 case N_Allocator:
   {
-	tree gnu_init = NULL_TREE;
-	tree gnu_type;
-	bool ignore_init_type = false;
+	tree gnu_type, gnu_init;
+	bool ignore_init_type;
 
 	gnat_temp = Expression (gnat_node);
 
@@ -7165,15 +7164,22 @@ gnat_to_gnu (Node_Id gnat_node)
 	   contains both the type and an initial value for the object.  */
 	if (Nkind (gnat_temp) == N_Identifier
 	|| Nkind (gnat_temp) == N_Expanded_Name)
-	  gnu_type = gnat_to_gnu_type (Entity (gnat_temp));
+	  {
+	ignore_init_type = false;
+	gnu_init = NULL_TREE;
+	gnu_type = gnat_to_gnu_type (Entity (gnat_temp));
+	  }
+
 	else if (Nkind (gnat_temp) == N_Qualified_Expression)
 	  {
 	const Entity_Id gnat_desig_type
 	  = Designated_Type (Underlying_Type (Etype (gnat_node)));
 
-	ignore_init_type = Has_Constrained_Partial_View (gnat_desig_type);
-	gnu_init = gnat_to_gnu (Expression (gnat_temp));
+	/* The flag is effectively only set on the base types.  */
+	ignore_init_type
+	  = Has_Constrained_Partial_View (Base_Type (gnat_desig_type));
 
+	gnu_init = gnat_to_gnu (Expression (gnat_temp));
 	gnu_init = maybe_unconstrained_array (gnu_init);
 
 	gigi_checking_assert (!Do_Range_Check (Expression (gnat_temp)));


Re: [PATCH v2] libiberty, include: add bsearch_r

2020-06-23 Thread Jose E. Marchesi via Gcc-patches


Hi Nick.

libctf wants a bsearch that takes a void * arg pointer to avoid a
nonportable use of __thread.

bsearch_r is required, not optional, at this point because as far as I
can see this obvious-sounding function is not implemented by anyone's
libc.  We can easily move it to AC_LIBOBJ later if it proves necessary
to do so.
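
For readers unfamiliar with the idea, here is a minimal sketch of a binary search that forwards a caller-supplied context pointer to the comparison callback. It follows the argument order of the declaration added to libiberty.h below, but it is not the file from the patch: the names sketch_bsearch_r, sketch_cmp_ctx and sketch_cmp_int are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Binary search over NMEMB sorted elements of SIZE bytes starting at BASE.
   ARG is passed through unchanged to every COMPAR call, so the callback
   needs no global or thread-local state.  */
void *
sketch_bsearch_r (const void *key, const void *base, size_t nmemb, size_t size,
                  int (*compar) (const void *, const void *, void *),
                  void *arg)
{
  assert (compar != NULL);

  const char *lo = base;
  while (nmemb > 0)
    {
      const char *mid = lo + (nmemb / 2) * size;
      int cmp = compar (key, mid, arg);
      if (cmp == 0)
        return (void *) mid;
      if (cmp > 0)
        {
          /* Key is above MID: continue with the elements after it.  */
          lo = mid + size;
          nmemb -= nmemb / 2 + 1;
        }
      else
        /* Key is below MID: continue with the elements before it.  */
        nmemb /= 2;
    }
  return NULL;
}

/* Example context: counts how many comparisons were made.  */
struct sketch_cmp_ctx { int calls; };

int
sketch_cmp_int (const void *key, const void *elt, void *arg)
{
  struct sketch_cmp_ctx *ctx = arg;
  ctx->calls++;
  int a = *(const int *) key;
  int b = *(const int *) elt;
  return (a > b) - (a < b);
}
```

A caller would pass the address of a sketch_cmp_ctx as the last argument and read the counter back afterwards.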

include/
* libiberty.h (bsearch_r): New.
libiberty/
* bsearch_r.c: New file.
* Makefile.in (CFILES): Add bsearch_r.c.
(REQUIRED_OFILES): Add bsearch_r.o.
* functions.texi: Regenerate.
---
 include/libiberty.h  |  7 +++
 libiberty/Makefile.in| 12 +-
 libiberty/bsearch_r.c| 93 
 libiberty/functions.texi | 21 -
 4 files changed, 130 insertions(+), 3 deletions(-)
 create mode 100644 libiberty/bsearch_r.c

v2: actually regenerate functions.texi.

Acked six months ago (!) by Jeff Law here:


Rebased against master (commit efc16503ca10bc0e934e0bace5777500e4dc757a)
and retested building this morning.

I just applied the patch in master, on your behalf.
Salud!


Sorry for the delay.

diff --git a/include/libiberty.h b/include/libiberty.h
index 141cb886a85..0bb5b81d4ac 100644
--- a/include/libiberty.h
+++ b/include/libiberty.h
@@ -641,6 +641,13 @@ extern int pexecute (const char *, char * const *, const char *,
 
 extern int pwait (int, int *, int);
 
+/* Like bsearch, but takes and passes on an argument like qsort_r.  */
+
+extern void *bsearch_r (register const void *, const void *,
+   size_t, register size_t,
+   register int (*)(const void *, const void *, void *),
+   void *);
+
 #if defined(HAVE_DECL_ASPRINTF) && !HAVE_DECL_ASPRINTF
 /* Like sprintf but provides a pointer to malloc'd storage, which must
be freed by the caller.  */
diff --git a/libiberty/Makefile.in b/libiberty/Makefile.in
index d6b302e02fd..895f701bcd0 100644
--- a/libiberty/Makefile.in
+++ b/libiberty/Makefile.in
@@ -124,7 +124,7 @@ COMPILE.c = $(CC) -c @DEFS@ $(CFLAGS) $(CPPFLAGS) -I. -I$(INCDIR) \
 # CONFIGURED_OFILES and funcs in configure.ac.  Also run "make maint-deps"
 # to build the new rules.
 CFILES = alloca.c argv.c asprintf.c atexit.c   \
-   basename.c bcmp.c bcopy.c bsearch.c bzero.c \
+   basename.c bcmp.c bcopy.c bsearch.c bsearch_r.c bzero.c \
calloc.c choose-temp.c clock.c concat.c cp-demangle.c   \
 cp-demint.c cplus-dem.c crc32.c\
d-demangle.c dwarfnames.c dyn-string.c  \
@@ -168,6 +168,7 @@ REQUIRED_OFILES =   \
./regex.$(objext) ./cplus-dem.$(objext) ./cp-demangle.$(objext) \
./md5.$(objext) ./sha1.$(objext) ./alloca.$(objext) \
./argv.$(objext)\
+   ./bsearch_r.$(objext)   \
./choose-temp.$(objext) ./concat.$(objext)  \
./cp-demint.$(objext) ./crc32.$(objext) ./d-demangle.$(objext)  \
./dwarfnames.$(objext) ./dyn-string.$(objext)   \
@@ -601,6 +602,15 @@ $(CONFIGURED_OFILES): stamp-picdir stamp-noasandir
else true; fi
$(COMPILE.c) $(srcdir)/bsearch.c $(OUTPUT_OPTION)
 
+./bsearch_r.$(objext): $(srcdir)/bsearch_r.c config.h $(INCDIR)/ansidecl.h
+   if [ x"$(PICFLAG)" != x ]; then \
+ $(COMPILE.c) $(PICFLAG) $(srcdir)/bsearch_r.c -o pic/$@; \
+   else true; fi
+   if [ x"$(NOASANFLAG)" != x ]; then \
+ $(COMPILE.c) $(PICFLAG) $(NOASANFLAG) $(srcdir)/bsearch_r.c -o noasan/$@; \
+   else true; fi
+   $(COMPILE.c) $(srcdir)/bsearch_r.c $(OUTPUT_OPTION)
+
 ./bzero.$(objext): $(srcdir)/bzero.c
if [ x"$(PICFLAG)" != x ]; then \
  $(COMPILE.c) $(PICFLAG) $(srcdir)/bzero.c -o pic/$@; \
diff --git a/libiberty/bsearch_r.c b/libiberty/bsearch_r.c
new file mode 100644
index 000..79ebae9b0be
--- /dev/null
+++ b/libiberty/bsearch_r.c
@@ -0,0 +1,93 @@
+/*
+ * Copyright (c) 1990 Regents of the University of California.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this 

Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-06-23 Thread Thomas Schwinge
Hi!

On 2020-06-15T21:28:12+0100, Kwok Cheung Yeung  wrote:
> This patch adds support on nvptx for __sync_val_compare_and_swap operations on
> 1- and 2-byte values.

Is this a thorough review that these are the only functions missing, or
did you just implement what you found missing for some test case you've
been looking into?  Other architectures' similar libgcc files seem to be
defining more of such related functions.

Per the PTX 3.1 manual that I looked into, I see for CAS it supports:
'.b32', '.b64'.  We've got:

$ build-gcc-offload-nvptx-none/gcc/xgcc -Bbuild-gcc-offload-nvptx-none/gcc -dM -E -x c /dev/null | grep -i compare.and.swap
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 1

..., so that would match the PTX 3.1 manual.  GCC also seems to know
about 16-byte ("'.b128'", which doesn't exist in PTX).  A quick run of
your testcase with 'GENERATE_TEST(__int128)' seems to just work, but I
haven't verified why/how.

> The implementation is a straight copy of the version for
> AMD GCN.

(Should thus be generalized?  That can be done later, as far as I'm
concerned -- there already seems to be quite some code duplication in
libgcc.)

I have not verified the algorithm.

It seems a bit unfortunate to have such a thing outlined in a separate
function, given we're talking about performance-critical code here?  Even
more so for GCN, where there's no JIT compiler that can inline it later,
as it's the case for nvptx?

You have verified that GCC itself shouldn't/can't synthesize such
replacement code, inline?

The GCN/nvptx libgcc code actually seems simple enough so that GCC could
synthesize that internally -- or is there a reason not to?  (Just
curious.)

I see 'gcc/doc/generic.texi', "OMP_ATOMIC" state:

The gimplifier tries
three alternative code generation strategies.  Whenever possible,
an atomic update built-in is used.  If that fails, a
compare-and-swap loop is attempted.  If that also fails, a
regular critical section around the expression is used.

..., and I see 'gcc/omp-expand.c:expand_omp_atomic_pipeline' synthesize
that "compare-and-swap loop" code, which looks vaguely similar to your
libgcc implementation.

..., or maybe even 'gcc/builtins.c:expand_builtin_compare_and_swap'
etc. could be doing such things?

..., and 'gcc/optabs.c:expand_compare_and_swap_loop' etc. also look
similar/related?
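
For reference, the general shape of such a subword compare-and-swap loop can be sketched as follows. This is an illustration only, assuming GCC or Clang builtins and a 4-byte compare-and-swap on the target; it is not the nvptx or GCN libgcc code, and the name sketch_cas_u8 is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compare-and-swap one byte by looping on a 4-byte
   compare-and-swap of the aligned word that contains it.  Returns the
   byte's previous value, like __sync_val_compare_and_swap.  */
uint8_t
sketch_cas_u8 (uint8_t *ptr, uint8_t expected, uint8_t desired)
{
  assert (ptr != 0);

  uintptr_t addr = (uintptr_t) ptr;
  uint32_t *word = (uint32_t *) (addr & ~(uintptr_t) 3);
  unsigned int byte = (unsigned int) (addr & 3);
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  unsigned int shift = (3 - byte) * 8;
#else
  unsigned int shift = byte * 8;
#endif
  uint32_t mask = (uint32_t) 0xff << shift;

  for (;;)
    {
      uint32_t old_word = __atomic_load_n (word, __ATOMIC_SEQ_CST);
      uint8_t current = (uint8_t) ((old_word & mask) >> shift);

      /* The byte does not hold the expected value: report what it holds.  */
      if (current != expected)
        return current;

      /* Splice the new byte into the word and try to publish it.  */
      uint32_t new_word = (old_word & ~mask) | ((uint32_t) desired << shift);
      if (__sync_val_compare_and_swap (word, old_word, new_word) == old_word)
        return expected;

      /* Another thread changed the word in the meantime; reload and retry.  */
    }
}
```

The retry is needed because the word-sized operation can fail when a neighbouring byte changes, even though the target byte still matches.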

I clearly don't know the history behind all of that.  :-|

> I have added a new libgomp test that exercises the new operation.

Ideally we should also have nvptx target compile-time test, and scan the
PTX assembler code generated -- but we don't have that for a lot of
things, so this is probably OK without, too?

> I have also
> verified that the new code does not cause any regressions on the nvptx
> offloading tests, and that the new test passes with both nvptx and amdgcn as
> offload targets.
>
> Okay for master and OG10?

Given that this solves an actual problem, and we have precedent with the
GCN target doing the same thing, this seems OK (but I can't formally
approve).


Grüße
 Thomas


> commit 7c3a9c23ba9f5b8fe953aa5492ae75617f2444a3
> Author: Kwok Cheung Yeung 
> Date:   Mon Jun 15 12:34:55 2020 -0700
>
> nvptx: Add support for subword compare-and-swap
>
> 2020-06-15  Kwok Cheung Yeung  
>
>   libgcc/
>   * config/nvptx/atomic.c: New.
>   * config/nvptx/t-nvptx (LIB2ADD): Add atomic.c.
>
>   libgomp/
>   * testsuite/libgomp.c-c++-common/reduction-16.c: New.
>
> diff --git a/libgcc/config/nvptx/atomic.c b/libgcc/config/nvptx/atomic.c
> new file mode 100644
> index 000..4becbd2
> --- /dev/null
> +++ b/libgcc/config/nvptx/atomic.c
> @@ -0,0 +1,59 @@
> +/* NVPTX atomic operations
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   Contributed by Mentor Graphics.
> +
> +   This file is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by the
> +   Free Software Foundation; either version 3, or (at your option) any
> +   later version.
> +
> +   This file is distributed in the hope that it will be useful, but
> +   WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   General Public License for more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   .  */
> +
> +#include 
> +
> +#define __SYNC_SUBWORD_COMPARE_AND_SWAP(TYPE, SIZE)   \
> +   

Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-06-23 Thread Matthew Malcomson

On 23/06/2020 17:17, Richard Sandiford wrote:

Matthew Malcomson  writes:

--- a/gcc/config/aarch64/aarch64-protos.h
+/* Ensure there are no BR or RET instructions which are not directly followed
+   by a speculation barrier.  */
+/* { dg-final { scan-assembler-not "\t(br|ret|retaa|retab)\tx\[0-9\]\[0-9\]?\n\t(?!dsb\tsy\n\tisb|sb)" } } */


Isn't the “sb” alternative invalid given the -march option?

Probably slightly easier to read if the regexp is quoted using {…}
rather than "…".  Same for the other tests.



Just to check before I respin:  Using {} instead of "" means I need to 
replace \t with a literal tab -- do you still prefer it?


Re: [PATCH] nvptx: Add support for subword compare-and-swap

2020-06-23 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 23, 2020 at 06:44:26PM +0200, Thomas Schwinge wrote:
> On 2020-06-15T21:28:12+0100, Kwok Cheung Yeung  wrote:
> > This patch adds support on nvptx for __sync_val_compare_and_swap operations on
> > 1- and 2-byte values.
> 
> Is this a thorough review that these are the only functions missing, or
> did you just implement what you found missing for some test case you've
> been looking into?  Other architectures' similar libgcc files seem to be
> defining more of such related functions.
> It seems a bit unfortunate to have such a thing outlined in a separate
> function, given we're talking about performance-critical code here?  Even
> more so for GCN, where there's no JIT compiler that can inline it later,
> as it's the case for nvptx?

I think this should really be handled by the backend inline, like many other
targets do it when they only support 32-bit+ and not 8/16-bit atomics.
See e.g. sparc backend.

Jakub



Re: [PATCH][RFC] __builtin_shuffle sometimes should produce zip1 rather than TBL (PR82199)

2020-06-23 Thread Richard Sandiford
Sorry for the slow review.

Dmitrij Pochepko  writes:
> @@ -20074,6 +20076,83 @@ aarch64_evpc_trn (struct expand_vec_perm_d *d)
>return true;
>  }
>  
> +/* Try to re-encode the PERM constant so it use the bigger size up.
> +   This rewrites constants such as {0, 1, 4, 5}/V4SF to {0, 2}/V2DI.
> +   We retry with this new constant with the full suite of patterns.  */
> +static bool
> +aarch64_evpc_reencode (struct expand_vec_perm_d *d)
> +{
> +  expand_vec_perm_d newd;
> +  unsigned HOST_WIDE_INT nelt;
> +
> +  if (d->vec_flags != VEC_ADVSIMD)
> +return false;
> +
> +  unsigned int encoded_nelts = d->perm.encoding ().encoded_nelts ();
> +  for (unsigned int i = 0; i < encoded_nelts; ++i)
> +if (!d->perm[i].is_constant ())
> +  return false;
> +
> +  /* to_constant is safe since this routine is specific to Advanced SIMD
> + vectors.  */
> +  nelt = d->perm.length ().to_constant ();
> +
> +  /* Get the new mode.  Always twice the size of the inner
> + and half the elements.  */
> +  machine_mode new_mode;
> +  switch (d->vmode)
> +{
> +/* 128bit vectors.  */
> +case E_V4SFmode:
> +case E_V4SImode:
> +  new_mode = V2DImode;
> +  break;
> +case E_V8BFmode:
> +case E_V8HFmode:
> +case E_V8HImode:
> +  new_mode = V4SImode;
> +  break;
> +case E_V16QImode:
> +  new_mode = V8HImode;
> +  break;
> +/* 64bit vectors.  */
> +case E_V4BFmode:
> +case E_V4HFmode:
> +case E_V4HImode:
> +  new_mode = V2SImode;
> +  break;
> +case E_V8QImode:
> +  new_mode = V4HImode;
> +  break;
> +default:
> +  return false;
> +}
> +
> +  newd.vmode = new_mode;
> +  newd.vec_flags = VEC_ADVSIMD;
> +  newd.target = d->target ? gen_lowpart (new_mode, d->target) : NULL;
> +  newd.op0 = d->op0 ? gen_lowpart (new_mode, d->op0) : NULL;
> +  newd.op1 = d->op1 ? gen_lowpart (new_mode, d->op1) : NULL;
> +  newd.testing_p = d->testing_p;
> +  newd.one_vector_p = d->one_vector_p;
> +  vec_perm_builder newpermconst;
> +  newpermconst.new_vector (nelt / 2, nelt / 2, 1);
> +
> +  /* Convert the perm constant if we can.  Require even, odd as the pairs.  */
> +  for (unsigned int i = 0; i < nelt; i += 2)
> +{
> +  unsigned int elt0 = d->perm[i].to_constant ();
> +  unsigned int elt1 = d->perm[i+1].to_constant ();
> +  if ((elt0 & 1) != 0 || elt0 + 1 != elt1)
> + return false;
> +  newpermconst.quick_push (elt0 / 2);
> +}
> +  newpermconst.finalize ();

I think it would be simpler to do it in this order:

  - check for Advanced SIMD, bail out if not
  - get the new mode, bail out if none
  - calculate the permutation vector, bail out if not suitable
  - set up the rest of “newd”

There would then only be one walk over d->perm rather than two,
and we'd only create the gen_lowparts when there's something to test.

The new mode can be calculated with something like:

  poly_uint64 vec_bits = GET_MODE_BITSIZE (d->vmode);
  unsigned int new_elt_bits = GET_MODE_UNIT_BITSIZE (d->vmode) * 2;
  auto new_elt_mode = int_mode_for_size (new_elt_bits, false).require ();
  machine_mode new_mode = aarch64_simd_container_mode (new_elt_mode, vec_bits);

“new_mode” will be “word_mode” on failure.
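
As a rough stand-alone illustration of the "calculate the permutation vector, bail out if not suitable" step, the pair check from the quoted loop can be sketched on plain arrays. The function name and the array-based interface are assumptions made for this example; the real code works on vec_perm_builder and poly_int indices.

```c
#include <assert.h>
#include <stdbool.h>

/* PERM holds NELT (even) constant indices.  On success, write NELT / 2
   indices for elements of twice the width to OUT and return true.
   Bail out as soon as one pair is unsuitable, so PERM is walked once.  */
bool
sketch_reencode_perm (const unsigned int *perm, unsigned int nelt,
                      unsigned int *out)
{
  assert (nelt % 2 == 0);

  for (unsigned int i = 0; i < nelt; i += 2)
    {
      unsigned int elt0 = perm[i];
      unsigned int elt1 = perm[i + 1];

      /* Each pair must start on an even index and be consecutive.  */
      if ((elt0 & 1) != 0 || elt0 + 1 != elt1)
        return false;

      out[i / 2] = elt0 / 2;
    }
  return true;
}
```

With this, {0, 1, 4, 5} becomes {0, 2}, matching the V4SF to V2DI example in the patch comment, while a selector such as {1, 2, 4, 5} is rejected before any other state is set up.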

> diff --git a/gcc/testsuite/gcc.target/aarch64/vdup_n_3.c b/gcc/testsuite/gcc.target/aarch64/vdup_n_3.c
> new file mode 100644
> index 000..289604d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vdup_n_3.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +#define vector __attribute__((vector_size(4*sizeof(float))))
> +
> +/* These are both dups. */
> +vector float f(vector float a, vector float b)
> +{
> +  return __builtin_shuffle (a, a, (vector int){0, 1, 0, 1});
> +}
> +vector float f1(vector float a, vector float b)
> +{
> +  return __builtin_shuffle (a, a, (vector int){2, 3, 2, 3});
> +}
> +
> +/* { dg-final { scan-assembler-times "\[ \t\]*dup\[ \t\]+v\[0-9\]+\.2d" 2 } } */

The regexp would be easier to read if quoted using {…}, which requires
fewer backslashes.  Same for the other tests.

Thanks,
Richard


Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-06-23 Thread Richard Sandiford
Matthew Malcomson  writes:
> On 23/06/2020 17:17, Richard Sandiford wrote:
>> Matthew Malcomson  writes:
>>> --- a/gcc/config/aarch64/aarch64-protos.h
>>> +/* Ensure there are no BR or RET instructions which are not directly followed
>>> +   by a speculation barrier.  */
>>> +/* { dg-final { scan-assembler-not "\t(br|ret|retaa|retab)\tx\[0-9\]\[0-9\]?\n\t(?!dsb\tsy\n\tisb|sb)" } } */
>> 
>> Isn't the “sb” alternative invalid given the -march option?
>> 
>> Probably slightly easier to read if the regexp is quoted using {…}
>> rather than "…".  Same for the other tests.
>> 
>
> Just to check before I respin:  Using {} instead of "" means I need to 
> replace \t with a literal tab -- do you still prefer it?

Are you sure?  We've been using tests like:

/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */

for SVE without problems.  Using {…} means that backslash quoting
is applied by the regexp parser rather than the Tcl string parser,
but both should work for things like \t.

Richard


Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-06-23 Thread Matthew Malcomson

On 23/06/2020 17:56, Richard Sandiford wrote:

Matthew Malcomson  writes:

On 23/06/2020 17:17, Richard Sandiford wrote:

Matthew Malcomson  writes:

--- a/gcc/config/aarch64/aarch64-protos.h
+/* Ensure there are no BR or RET instructions which are not directly followed
+   by a speculation barrier.  */
+/* { dg-final { scan-assembler-not "\t(br|ret|retaa|retab)\tx\[0-9\]\[0-9\]?\n\t(?!dsb\tsy\n\tisb|sb)" } } */


Isn't the “sb” alternative invalid given the -march option?

Probably slightly easier to read if the regexp is quoted using {…}
rather than "…".  Same for the other tests.



Just to check before I respin:  Using {} instead of "" means I need to
replace \t with a literal tab -- do you still prefer it?


Are you sure?  We've been using tests like:

/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */

for SVE without problems.  Using {…} means that backslash quoting
is applied by the regexp parser rather than the Tcl string parser,
but both should work for things like \t.

Richard



Ah -- my mistake -- I was just checking with `string compare` while 
making the change and didn't think too hard when I saw a -1.


Re: [Patch 1/3] aarch64: New Straight Line Speculation (SLS) mitigation flags

2020-06-23 Thread Matthew Malcomson

On 23/06/2020 16:48, Richard Sandiford wrote:

Matthew Malcomson  writes:

@@ -14466,6 +14466,81 @@ aarch64_validate_mcpu (const char *str, const struct processor **res,
return false;
  mfix-cortex-a53-835769
  Target Report Var(aarch64_fix_a53_err835769) Init(2) Save
  Workaround for ARM Cortex-A53 Erratum number 835769.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 35e8242af5fa4c52744fd2c3e2cfee0a617e22bb..8a3fab2964c9bb06c820766d284768751d63ac9a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -696,6 +696,7 @@ Objective-C and Objective-C++ Dialects}.
  -msign-return-address=@var{scope} @gol
  -mbranch-protection=@var{none}|@var{standard}|@var{pac-ret}[+@var{leaf}
  +@var{b-key}]|@var{bti} @gol
+-mharden-sls=@var{none}|@var{all}|@var{retbr}|@var{blr} @gol
  -march=@var{name}  -mcpu=@var{name}  -mtune=@var{name}  @gol
  -moverride=@var{string}  -mverbose-cost-dump @gol
  -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{sysreg} @gol
@@ -17045,6 +17046,15 @@ functions.  The optional argument @samp{b-key} can be used to sign the functions
  with the B-key instead of the A-key.
  @samp{bti} turns on branch target identification mechanism.
  
+@item -mharden-sls=@var{none}|@var{all}|@var{retbr}|@var{blr}

+@opindex mharden-sls
+Enable compiler hardening against straight line speculation (SLS).
+There are two options for hardening against straight line speculation.
+@samp{retbr} allows inserting speculation barriers after every
+@samp{br} and @samp{ret} instruction.  While @samp{blr} enables replacing
+@samp{blr} instructions with a @samp{bl} to a function stub.
+@samp{all} enables all SLS hardening, while @samp{none} does not enable any.


OK, so this is even more picky, sorry, but the syntax and description
imply to me that you can choose only one of the four options.  I think
it would be more accurate to say something like:

@item -mharden-sls=@var{opts}
@opindex mharden-sls
Enable compiler hardening against straight line speculation (SLS).
@var{opts} is a comma-separated list of the following options:
@table @samp
@item retbr
…
@item blr
…
@end table
In addition, @samp{-mharden-sls=all} enables all SLS hardening
while @samp{-mharden-sls=none} disables all SLS hardening.

(assuming the above behaviour change for “none”)
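
To make the proposed syntax concrete, here is a minimal sketch of how a comma-separated @var{opts} value could be parsed. It is not the option handling from the patch: the function name, the flag values, and the choice that "none" and "all" are only accepted on their own are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits for the two individual mitigations.  */
enum
{
  SKETCH_SLS_NONE = 0,
  SKETCH_SLS_RETBR = 1,
  SKETCH_SLS_BLR = 2,
  SKETCH_SLS_ALL = SKETCH_SLS_RETBR | SKETCH_SLS_BLR
};

/* Parse OPTS as "none", "all", or a comma-separated list of "retbr" and
   "blr".  Return the resulting flag set, or -1 for an invalid string.  */
int
sketch_parse_harden_sls (const char *opts)
{
  assert (opts != NULL);

  if (strcmp (opts, "none") == 0)
    return SKETCH_SLS_NONE;
  if (strcmp (opts, "all") == 0)
    return SKETCH_SLS_ALL;

  int flags = 0;
  const char *p = opts;
  for (;;)
    {
      /* Length of the current list item, up to the next comma or the end.  */
      size_t len = strcspn (p, ",");

      if (len == 5 && strncmp (p, "retbr", 5) == 0)
        flags |= SKETCH_SLS_RETBR;
      else if (len == 3 && strncmp (p, "blr", 3) == 0)
        flags |= SKETCH_SLS_BLR;
      else
        return -1;

      if (p[len] == '\0')
        return flags;
      p += len + 1;
    }
}
```

Under these assumptions "retbr,blr" yields the same flags as "all", and empty items (as in "retbr,") are rejected.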

Thanks,
Richard



Another "just to check": the same change should be made in the short 
form right? (i.e. the hunk above is now `-mharden-sls=@var{opts}`)


Re: [PATCH][RFC] vector creation from two parts of two vectors produces TBL rather than ins (PR93720)

2020-06-23 Thread Richard Sandiford
Dmitrij Pochepko  writes:
> +  unsigned int encoded_nelts = d->perm.encoding ().encoded_nelts ();
> +  for (unsigned int i = 0; i < encoded_nelts; ++i)
> +if (!d->perm[i].is_constant ())
> +  return false;

I think it would be better to test this as part of the loop below.
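For illustration, folding the constancy test into the scanning loop could look roughly like the standalone sketch below.  It is not GCC code: struct perm_elt and find_single_insert_lane are hypothetical stand-ins for the permutation indices and for the two loops in the quoted patch, and BASE is 0 when matching against op0 and nelt when matching against op1.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one permutation index: a value that may or
   may not be known at compile time.  */
struct perm_elt
{
  bool is_constant;
  long value;
};

/* Return the single lane I for which PERM[I] != BASE + I.  Return -1 if
   no lane differs, if more than one lane differs, or if any index is
   not constant.  The constancy test lives in the same loop as the
   comparison, so no separate pre-pass is needed.  */
static long
find_single_insert_lane (const struct perm_elt *perm, size_t nelt, long base)
{
  long idx = -1;
  for (size_t i = 0; i < nelt; i++)
    {
      if (!perm[i].is_constant)
        return -1;
      if (perm[i].value == base + (long) i)
        continue;
      if (idx != -1)
        return -1;      /* A second differing lane: not a single insert.  */
      idx = (long) i;
    }
  return idx;
}
```

With the permutation { 0, 1, 6, 3 } and BASE 0 this returns lane 2; the identity permutation returns -1.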

> +
> +  /* to_constant is safe since this routine is specific to Advanced SIMD
> + vectors.  */
> +  nelt = d->perm.length ().to_constant ();
> +  rtx insv = d->op0;
> +
> +  HOST_WIDE_INT idx = -1;
> +
> +  for (unsigned HOST_WIDE_INT i = 0; i < nelt; i ++)
> +{
> +  if (d->perm[i].to_constant () == (HOST_WIDE_INT) i)
> + continue;
> +  if (idx != -1)
> + {
> +   idx = -1;
> +   break;
> + }
> +  idx = i;
> +}
> +
> +  if (idx == -1)
> +{
> +  insv = d->op1;
> +  for (unsigned HOST_WIDE_INT i = 0; i < nelt; i ++)
> + {
> +   if (d->perm[i].to_constant () == (HOST_WIDE_INT) (i + nelt))
> + continue;
> +   if (idx != -1)
> + return false;
> +   idx = i;
> + }
> +
> +  if (idx == -1)
> + return false;
> +}
> +
> +  if (d->testing_p)
> +return true;
> +
> +  gcc_assert (idx != -1);
> +
> +  unsigned extractindex = d->perm[idx].to_constant ();
> +  rtx extractv = d->op0;
> +  if (extractindex >= nelt)
> +{
> +  extractv = d->op1;
> +  extractindex -= nelt;
> +}
> +  gcc_assert (extractindex < nelt);
> +
> +  machine_mode inner_mode = GET_MODE_INNER (mode);
> +
> +  enum insn_code inscode = optab_handler (vec_set_optab, mode);
> +  gcc_assert (inscode != CODE_FOR_nothing);
> +  enum insn_code iextcode = convert_optab_handler (vec_extract_optab, mode,
> +inner_mode);
> +  gcc_assert (iextcode != CODE_FOR_nothing);
> +  rtx tempinner = gen_reg_rtx (inner_mode);
> +  emit_insn (GEN_FCN (iextcode) (tempinner, extractv, GEN_INT 
> (extractindex)));
> +
> +  rtx temp = gen_reg_rtx (mode);
> +  emit_move_insn (temp, insv);
> +  emit_insn (GEN_FCN (inscode) (temp, tempinner, GEN_INT (idx)));
> +
> +  emit_move_insn (d->target, temp);

I think it'd be better to generate the target instruction directly.
We can do that by replacing:

(define_insn "*aarch64_simd_vec_copy_lane"

with:

(define_insn "@aarch64_simd_vec_copy_lane"

then using the expand_insn interface to create an instance of
code_for_aarch64_simd_vec_copy_lane (mode).

> +/* { dg-final { scan-assembler-times "\[ \t\]*ins\[ \t\]+v\[0-9\]+\.s" 4 } } 
> */

Same comment as the other patch about using {…} regexp quoting.

Thanks,
Richard


Re: [Patch 1/3] aarch64: New Straight Line Speculation (SLS) mitigation flags

2020-06-23 Thread Richard Sandiford
Matthew Malcomson  writes:
> On 23/06/2020 16:48, Richard Sandiford wrote:
>> Matthew Malcomson  writes:
>>> @@ -14466,6 +14466,81 @@ aarch64_validate_mcpu (const char *str, const 
>>> struct processor **res,
>>> return false;
>>>   mfix-cortex-a53-835769
>>>   Target Report Var(aarch64_fix_a53_err835769) Init(2) Save
>>>   Workaround for ARM Cortex-A53 Erratum number 835769.
>>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>>> index 
>>> 35e8242af5fa4c52744fd2c3e2cfee0a617e22bb..8a3fab2964c9bb06c820766d284768751d63ac9a
>>>  100644
>>> --- a/gcc/doc/invoke.texi
>>> +++ b/gcc/doc/invoke.texi
>>> @@ -696,6 +696,7 @@ Objective-C and Objective-C++ Dialects}.
>>>   -msign-return-address=@var{scope} @gol
>>>   -mbranch-protection=@var{none}|@var{standard}|@var{pac-ret}[+@var{leaf}
>>>   +@var{b-key}]|@var{bti} @gol
>>> +-mharden-sls=@var{none}|@var{all}|@var{retbr}|@var{blr} @gol
>>>   -march=@var{name}  -mcpu=@var{name}  -mtune=@var{name}  @gol
>>>   -moverride=@var{string}  -mverbose-cost-dump @gol
>>>   -mstack-protector-guard=@var{guard} 
>>> -mstack-protector-guard-reg=@var{sysreg} @gol
>>> @@ -17045,6 +17046,15 @@ functions.  The optional argument @samp{b-key} can 
>>> be used to sign the functions
>>>   with the B-key instead of the A-key.
>>>   @samp{bti} turns on branch target identification mechanism.
>>>   
>>> +@item -mharden-sls=@var{none}|@var{all}|@var{retbr}|@var{blr}
>>> +@opindex mharden-sls
>>> +Enable compiler hardening against straight line speculation (SLS).
>>> +There are two options for hardening against straight line speculation.
>>> +@samp{retbr} allows inserting speculation barriers after every
>>> +@samp{br} and @samp{ret} instruction.  While @samp{blr} enables replacing
>>> +@samp{blr} instructions with a @samp{bl} to a function stub.
>>> +@samp{all} enables all SLS hardening, while @samp{none} does not enable 
>>> any.
>> 
>> OK, so this is even more picky, sorry, but the syntax and description
>> imply to me that you can choose only one of the four options.  I think
>> it would be more accurate to say something like:
>> 
>> @item -mharden-sls=@var{opts}
>> @opindex mharden-sls
>> Enable compiler hardening against straight line speculation (SLS).
>> @var{opts} is a comma-separated list of the following options:
>> @table @samp
>> @item retbr
>> …
>> @item blr
>> …
>> @end table
>> In addition, @samp{-mharden-sls=all} enables all SLS hardening
>> while @samp{-mharden-sls=none} disables all SLS hardening.
>> 
>> (assuming the above behaviour change for “none”)
>> 
>> Thanks,
>> Richard
>> 
>
> Another "just to check": the same change should be made in the short 
> form right? (i.e. the hunk above is now `-mharden-sls=@var{opts}`)

Yeah.
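
To make the comma-separated form concrete, here is a minimal standalone sketch of how such an argument could be parsed.  It is not the code from the patch: parse_harden_sls and the SLS_* flags are hypothetical names, and the sketch assumes that "all" and "none" are only accepted on their own.

```c
#include <string.h>

/* Hypothetical bit flags; the names are illustrative, not GCC's.  */
enum { SLS_NONE = 0, SLS_RETBR = 1, SLS_BLR = 2, SLS_ALL = 3 };

/* Parse the argument of -mharden-sls=OPTS.  Return the set of enabled
   hardening kinds, or -1 if OPTS is malformed.  Assumption: "all" and
   "none" are only valid when they are the whole argument.  */
static int
parse_harden_sls (const char *opts)
{
  if (strcmp (opts, "none") == 0)
    return SLS_NONE;
  if (strcmp (opts, "all") == 0)
    return SLS_ALL;

  int flags = 0;
  const char *p = opts;
  for (;;)
    {
      /* Length of the current item, up to the next comma or the end.  */
      size_t len = strcspn (p, ",");
      if (len == 5 && strncmp (p, "retbr", 5) == 0)
        flags |= SLS_RETBR;
      else if (len == 3 && strncmp (p, "blr", 3) == 0)
        flags |= SLS_BLR;
      else
        return -1;      /* Empty or unknown item.  */
      if (p[len] == '\0')
        return flags;
      p += len + 1;     /* Skip the comma.  */
    }
}
```

Under these assumptions "retbr,blr" yields the same set as "all", while "retbr,none" and a trailing comma are rejected.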


Re: [PATCHv6] Handle TYPE_PACK_EXPANSION in cxx_incomplete_type_diagnostic

2020-06-23 Thread Jason Merrill via Gcc-patches

On 6/23/20 8:27 AM, Marek Polacek wrote:

On Mon, Jun 22, 2020 at 11:11:48PM -0400, Nicholas Krause wrote:



On 6/22/20 10:01 PM, Marek Polacek wrote:

On Mon, Jun 22, 2020 at 09:42:51PM -0400, Nicholas Krause via Gcc-patches wrote:

From: Nicholas Krause 

This fixs the PR95672 by adding the missing TYPE_PACK_EXPANSION case in
cxx_incomplete_type_diagnostic in order to avoid ICES on diagnosing
incomplete template pack expansion cases. In v2, add the missing required
test case for all new patches. v3 Fixes both the test case to compile in
C++11 mode and the message to print out only the type. v4 fixes the testcase
to only target C++11. v5 and v6 fix the test case properly.

gcc/cp/ChangeLog:

* typeck2.c (cxx_incomplete_type_diagnostic): Add missing 
TYPE_EXPANSION_PACK
  check for diagnosticing incomplete types in 
cxx_incomplete_type_diagnostic.


It's already been pointed out to you that it's "diagnosing".


gcc/testsuite/ChangeLog:

* g++.dg/template/PR95672.C: New test.

Signed-off-by: Nicholas Krause 
---
   gcc/cp/typeck2.c| 6 ++
   gcc/testsuite/g++.dg/template/PR95672.C | 3 +++
   2 files changed, 9 insertions(+)
   create mode 100644 gcc/testsuite/g++.dg/template/PR95672.C

diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 5fd3b82fa89..28b32fe0b5a 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -552,6 +552,12 @@ cxx_incomplete_type_diagnostic (location_t loc, const_tree 
value,
   TYPE_NAME (type));
 break;
+case TYPE_PACK_EXPANSION:
+ emit_diagnostic (diag_kind, loc, 0,


Bad indenting.


Sorry, it seems Jason didn't catch that.



+"invalid use of pack expansion %qT",
+ type);
+  break;
+
   case TYPENAME_TYPE:
   case DECLTYPE_TYPE:
 emit_diagnostic (diag_kind, loc, 0,
diff --git a/gcc/testsuite/g++.dg/template/PR95672.C 
b/gcc/testsuite/g++.dg/template/PR95672.C
new file mode 100644
index 000..fcc3da0a132
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/PR95672.C
@@ -0,0 +1,3 @@
+// PR c++/96572
+// { dg-do compile}
+// { dg-options "-std=c++11" }
+struct g_class : decltype  (auto) ... {  }; // { dg-error "invalid use of pack 
expansion" }


No, this is completely broken.  It passes only because the patch file is
malformed and the last line will be lost when applying the patch, so the
test is an empty file.

Marek



Yes, but now that I look at it, it seems the issue is that the error message
was changed. My concern is that the test will have to be updated every time
the message changes, as with the current one:
error: expected primary-expression before ‘auto’
 1 | struct g_class : decltype  (auto) ... {  };

Is this something I should care about? It seems emit_diagnostic and friends
sometimes do this.


decltype(auto) is a C++14 feature so won't work with -std=c++11.


Yes, and so in C++11 mode you get a different error.

Better, as in your v5 patch, to use


+// { dg-do compile { target c++11 } }


because that means "run for C++11 and above", whereas the dg-options 
line makes it run only in C++11 mode.


But since, as Marek points out, you're testing a C++14 feature, it's probably 
better to use


// { dg-do compile { target c++14 } }

unless you really want to test for different errors in different modes.

Jason



[PATCH] rs6000: Allow --with-cpu=power10

2020-06-23 Thread Aaron Sawdey via Gcc-patches
Update config.gcc so that we can use --with-cpu=power10.

I've tested that this does do the expected thing 
with --with-cpu=power10 and also that it still builds and
bootstraps correctly using --with-cpu=power9 on power9. If there isn't
any other testing I need to do for this, ok for trunk?

Thanks!
   Aaron

* config.gcc: Identify power10 as a 64-bit processor and as valid
for --with-cpu and --with-tune.
---
 gcc/config.gcc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 365263a0f46..829b6f757f2 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -514,7 +514,7 @@ powerpc*-*-*)
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h 
si2vmx.h"
extra_headers="${extra_headers} amo.h"
case x$with_cpu in
-   
xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500|xfuture)
+   
xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower10|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500|xfuture)
cpu_is_64bit=yes
;;
esac
@@ -4912,7 +4912,7 @@ case "${target}" in
eval "with_$which=405"
;;
"" | common | native \
-   | power[3456789] | power5+ | power6x \
+   | power[3456789] | power10 | power5+ | power6x \
| powerpc | powerpc64 | powerpc64le \
| rs64 \
| 401 | 403 | 405 | 405fp | 440 | 440fp | 464 | 464fp \
-- 
2.17.1



Re: [PATCH] rs6000: Allow --with-cpu=power10

2020-06-23 Thread Segher Boessenkool
Hi!

On Tue, Jun 23, 2020 at 01:25:42PM -0500, Aaron Sawdey via Gcc-patches wrote:
> Update config.gcc so that we can use --with-cpu=power10.

>   * config.gcc: Identify power10 as a 64-bit processor and as valid
>   for --with-cpu and --with-tune.

> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 365263a0f46..829b6f757f2 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -514,7 +514,7 @@ powerpc*-*-*)
>   extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h 
> si2vmx.h"
>   extra_headers="${extra_headers} amo.h"
>   case x$with_cpu in
> - 
> xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500|xfuture)
> + 
> xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower10|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500|xfuture)

Ah, "xfuture", that's why I missed this.  Ugh.

Please remove that "xfuture" entry?

> @@ -4912,7 +4912,7 @@ case "${target}" in
>   eval "with_$which=405"
>   ;;
>   "" | common | native \
> - | power[3456789] | power5+ | power6x \
> + | power[3456789] | power10 | power5+ | power6x \

And this one never had a future.  Good.

Okay for trunk with that change.  Thanks!

(I'll do the backport to 10, fold it in with the rest).


Segher


[PATCHv7] Handle TYPE_PACK_EXPANSION in cxx_incomplete_type_diagnostic

2020-06-23 Thread Nicholas Krause via Gcc-patches
From: Nicholas Krause 

This fixs the PR95672 by adding the missing TYPE_PACK_EXPANSION case in
cxx_incomplete_type_diagnostic in order to avoid ICES on diagnosing
incomplete template pack expansion cases. In v2, add the missing required
test case for all new patches. v3 Fixes both the test case to compile in
C++11 mode and the message to print out only the type. v4 fixes the testcase
to only target C++11. v5 and v6 fix the test case properly. v7 fixes running

gcc/cp/ChangeLog:

* typeck2.c (cxx_incomplete_type_diagnostic):Add missing 
  TYPE_EXPANSION_PACK check for diagnosicing incomplete 
  types in cxx_incomplete_type_diagnostic.

gcc/testsuite/ChangeLog:

* g++.dg/template/PR95672.C: New test.

Signed-off-by: Nicholas Krause 
---
 gcc/cp/typeck2.c| 5 +
 gcc/testsuite/g++.dg/template/PR95672.C | 2 ++
 2 files changed, 7 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/template/PR95672.C

diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 5fd3b82fa89..dac135a2e11 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -552,6 +552,11 @@ cxx_incomplete_type_diagnostic (location_t loc, const_tree 
value,
   TYPE_NAME (type));
   break;
 
+case TYPE_PACK_EXPANSION:
+  emit_diagnostic (diag_kind, loc, 0,
+  "invalid use of pack expansion %qT", type);
+  break;
+
 case TYPENAME_TYPE:
 case DECLTYPE_TYPE:
   emit_diagnostic (diag_kind, loc, 0,
diff --git a/gcc/testsuite/g++.dg/template/PR95672.C 
b/gcc/testsuite/g++.dg/template/PR95672.C
new file mode 100644
index 000..104e125287f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/PR95672.C
@@ -0,0 +1,2 @@
+// { dg-do compile { target c++14 } } 
+struct g_class : decltype  (auto) ... {  }; // { dg-error "invalid use of pack 
expansion" } 
-- 
2.20.1



Re: [PATCHv7] Handle TYPE_PACK_EXPANSION in cxx_incomplete_type_diagnostic

2020-06-23 Thread Marek Polacek via Gcc-patches
The correct subject should be more like
c++: Handle TYPE_PACK_EXPANSION in cxx_incomplete_type_diagnostic [PR95672]

See git log --oneline for examples.

On Tue, Jun 23, 2020 at 03:11:44PM -0400, Nicholas Krause via Gcc-patches wrote:
> From: Nicholas Krause 
> 
> This fixs the PR95672 by adding the missing TYPE_PACK_EXPANSION case in

"fixes"

> cxx_incomplete_type_diagnostic in order to avoid ICES on diagnosing
> incomplete template pack expansion cases. In v2, add the missing required
> test case for all new patches. v3 Fixes both the test case to compile in
> C++11 mode and the message to print out only the type. v4 fixes the testcase
> to only target C++11. v5 and v6 fix the test case properly. v7 fixes running

...running what?

> gcc/cp/ChangeLog:
> 
>   * typeck2.c (cxx_incomplete_type_diagnostic):Add missing 
>   TYPE_EXPANSION_PACK check for diagnosicing incomplete 
> types in cxx_incomplete_type_diagnostic.

This still won't pass git gcc-verify and we've already told you twice
that the correct term is "diagnosing".  Missing space after ':'.  The
ChangeLog entry also misses "PR c++/95672".

You never mentioned how the patch was tested.

Marek



Re: [PATCH wwwdocs] gcc-11/changes: Document TSAN changes

2020-06-23 Thread Martin Liška

On 6/23/20 1:27 PM, Marco Elver wrote:

Is this one good to go, or any objections?


Hello.

Thanks for it, it's fine. Please install it.

Martin


Re: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-06-23 Thread Thomas Schwinge
Hi!

On 2020-06-23T17:21:06+0200, Tobias Burnus  wrote:
> If the offloading code is (only) in a library, one can come up
> with the idea to build those parts as shared library – and link
> it to the nonoffloading code.(*)

> (*) Thomas mentioned that this is supposed to work also in more
> complex cases than the one I outlined, although, that is probably
> currently the most common one.

Static linking is another such case that we've seen in the wild -- and
that supposedly does work, mostly.

The "more complex cases" would include dynamically loading/registering
offloading code, and unregistering it again, and such things.  The
interfaces are there (implemented), but testsuite coverage isn't -- so
I'm not going to claim that it actually works.  ;-)

> Currently, this fails as the mkoffload calls the nonoffloading
> compiler without the -fpic/-fPIC flags, even though the compiler
> was originally invoked with those options. – And at some point,
> the linker then complains.
>
> This patch simply adds -fpic/-fPIC to the calls to the nonoffloading
> ("host") compiler, invoked from mkoffload, if they were present before.
>
> For the testcase at hand, this works with both AMDGCN and nvptx
> with the attached patch.
>
> OK for the trunk?

I don't think I can approve, but seems fine if this works (as you've
confirmed) -- it's one incremental step forward!

Or, should this instead be handled in the LTO wrapper (?) options merging
etc. machinery?  I'd have to dig in further.  (Jakub?)

Eventually (not now...), instead of special-casing more and more options
(I somehow doubt that '-fpic', '-fPIC' are the only ones?), shouldn't we
solve this in some more generic way, like re-invoking the host compiler
exactly as invoked before (if that makes sense?), or -- freaky ;-) --
instead of doing the "LTO wrapper thing", pause the host compiler,
generate offload code, inject that code into the host compiler, and
directly proceed there?  (That is, do offload code generation as part of
the host compilation, instead of at link time.)  (Just thinking aloud,
and without any "depth" -- and any such work will be a bigger, separate
task, obviously.)

> PS: I think as mid-/longterm project it would be nice to test this
> in the testsuite, but that's unfortunately a larger task.

ACK.

>  gcc/config/gcn/mkoffload.c   | 15 +--
>  gcc/config/nvptx/mkoffload.c | 15 +--
>  2 files changed, 26 insertions(+), 4 deletions(-)

What about 'gcc/config/i386/intelmic-mkoffload.c'?  I see that one
unconditionally passes '-fPIC' to some things -- is that doing the right
thing for your case, too?


Grüße
 Thomas


> diff --git a/gcc/config/gcn/mkoffload.c b/gcc/config/gcn/mkoffload.c
> index 14f422e..0415d94 100644
> --- a/gcc/config/gcn/mkoffload.c
> +++ b/gcc/config/gcn/mkoffload.c
> @@ -483,7 +483,8 @@ process_obj (FILE *in, FILE *cfile)
>  /* Compile a C file using the host compiler.  */
>
>  static void
> -compile_native (const char *infile, const char *outfile, const char 
> *compiler)
> +compile_native (const char *infile, const char *outfile, const char 
> *compiler,
> + bool fPIC, bool fpic)
>  {
>const char *collect_gcc_options = getenv ("COLLECT_GCC_OPTIONS");
>if (!collect_gcc_options)
> @@ -493,6 +494,10 @@ compile_native (const char *infile, const char *outfile, 
> const char *compiler)
>struct obstack argv_obstack;
>obstack_init (&argv_obstack);
>obstack_ptr_grow (&argv_obstack, compiler);
> +  if (fPIC)
> +obstack_ptr_grow (&argv_obstack, "-fPIC");
> +  if (fpic)
> +obstack_ptr_grow (&argv_obstack, "-fpic");
>if (save_temps)
>  obstack_ptr_grow (&argv_obstack, "-save-temps");
>if (verbose)
> @@ -596,6 +601,8 @@ main (int argc, char **argv)
>/* Scan the argument vector.  */
>bool fopenmp = false;
>bool fopenacc = false;
> +  bool fPIC = false;
> +  bool fpic = false;
>for (int i = 1; i < argc; i++)
>  {
>  #define STR "-foffload-abi="
> @@ -614,6 +621,10 @@ main (int argc, char **argv)
>   fopenmp = true;
>else if (strcmp (argv[i], "-fopenacc") == 0)
>   fopenacc = true;
> +  else if (strcmp (argv[i], "-fPIC") == 0)
> + fPIC = true;
> +  else if (strcmp (argv[i], "-fpic") == 0)
> + fpic = true;
>else if (strcmp (argv[i], "-save-temps") == 0)
>   save_temps = true;
>else if (strcmp (argv[i], "-v") == 0)
> @@ -766,7 +777,7 @@ main (int argc, char **argv)
>xputenv (concat ("COMPILER_PATH=", cpath, NULL));
>xputenv (concat ("LIBRARY_PATH=", lpath, NULL));
>
> -  compile_native (gcn_cfile_name, outname, collect_gcc);
> +  compile_native (gcn_cfile_name, outname, collect_gcc, fPIC, fpic);
>
>return 0;
>  }
> diff --git a/gcc/config/nvptx/mkoffload.c b/gcc/config/nvptx/mkoffload.c
> index efdf9b9..4fecb2b 100644
> --- a/gcc/config/nvptx/mkoffload.c
> +++ b/gcc/config/nvptx/mkoffload.c
> @@ -356,7 +356,8 @@ process (FILE *in, FILE *out)
>  }
>
>  static void
> -compile_native (const char 

[PATCHv8] c++:Handle TYPE_PACK_EXPANSION in cxx_incomplete_type_diagnostic[PR96752]

2020-06-23 Thread Nicholas Krause via Gcc-patches
From: Nicholas Krause 

This fixes the PR95672 by adding the missing TYPE_PACK_EXPANSION case in
cxx_incomplete_type_diagnostic in order to avoid ICES on diagnosing
incomplete template pack expansion cases. In v2, add the missing required
test case for all new patches. v3 Fixes both the test case to compile in
C++11 mode and the message to print out only the type. v4 fixes the testcase
to only target C++11. v5 and v6 fix the test case properly. v7 fixes running
the testcase. v8 fixes grammar errors. Tested on  powerpc64le-unknown-linux-gnu.

gcc/cp/ChangeLog:

* typeck2.c (cxx_incomplete_type_diagnostic):
  Add missing TYPE_EXPANSION_PACK check for 
  diagnosing incomplete types in 
  cxx_incomplete_type_diagnostic to fix
  c++/PR95672.

gcc/testsuite/ChangeLog:

* g++.dg/template/PR95672.C: New test.

Signed-off-by: Nicholas Krause 
---
 gcc/cp/typeck2.c| 5 +
 gcc/testsuite/g++.dg/template/PR95672.C | 2 ++
 2 files changed, 7 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/template/PR95672.C

diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 5fd3b82fa89..dac135a2e11 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -552,6 +552,11 @@ cxx_incomplete_type_diagnostic (location_t loc, const_tree 
value,
   TYPE_NAME (type));
   break;
 
+case TYPE_PACK_EXPANSION:
+  emit_diagnostic (diag_kind, loc, 0,
+  "invalid use of pack expansion %qT", type);
+  break;
+
 case TYPENAME_TYPE:
 case DECLTYPE_TYPE:
   emit_diagnostic (diag_kind, loc, 0,
diff --git a/gcc/testsuite/g++.dg/template/PR95672.C 
b/gcc/testsuite/g++.dg/template/PR95672.C
new file mode 100644
index 000..104e125287f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/PR95672.C
@@ -0,0 +1,2 @@
+// { dg-do compile { target c++14 } } 
+struct g_class : decltype  (auto) ... {  }; // { dg-error "invalid use of pack 
expansion" } 
-- 
2.20.1



Re: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-06-23 Thread Andrew Stubbs

On 23/06/2020 20:36, Thomas Schwinge wrote:

Eventually (not now...), instead of special-casing more and more options
(I somehow doubt that '-fpic', '-fPIC' are the only ones?), shouldn't we
solve this in some more generic way, like re-invoking the host compiler
exactly as invoked before (if that makes sense?), or -- freaky ;-) --
instead of doing the "LTO wrapper thing", pause the host compiler,
generate offload code, inject that code into the host compiler, and
directly proceed there?  (That is, do offload code generation as part of
the host compilation, instead of at link time.)  (Just thinking aloud,
and without any "depth" -- and any such work will be a bigger, separate
task, obviously.)


I thought the same, but it's better to fix one problem now than probably 
not getting around to fixing two problems later. In any case, we 
definitely want a white list, and explicit coding is easy to read.
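
As a rough illustration of the white-list idea, a table-driven variant could look like the sketch below.  This is not what the patch does (it tracks two booleans); forwarded_opts and collect_forwarded are hypothetical names, and only -fPIC and -fpic are taken from the thread.

```c
#include <stddef.h>
#include <string.h>

/* Options passed through to the host compiler.  Only -fPIC and -fpic
   come from the patch under discussion.  */
static const char *const forwarded_opts[] = { "-fPIC", "-fpic" };

/* Copy into OUT (capacity OUT_MAX) every entry of ARGV[1..ARGC-1] that
   appears in the white list, preserving order.  Return the number of
   entries stored.  */
static size_t
collect_forwarded (int argc, char **argv, const char **out, size_t out_max)
{
  size_t n = 0;
  for (int i = 1; i < argc; i++)
    for (size_t j = 0;
         j < sizeof forwarded_opts / sizeof forwarded_opts[0]; j++)
      if (strcmp (argv[i], forwarded_opts[j]) == 0 && n < out_max)
        {
          out[n++] = argv[i];
          break;
        }
  return n;
}
```

Adding another option then means adding one string to the table, at the cost of being slightly less explicit than the per-option booleans.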


Andrew


Re: [PATCH 00/28] rs6000: Auto-generate builtins from descriptions

2020-06-23 Thread Bill Schmidt via Gcc-patches

On 6/18/20 5:48 PM, will schmidt wrote:

On Thu, 2020-06-18 at 17:01 -0500, Bill Schmidt wrote:

Thanks for the review, Will!  Responses below...

On 6/18/20 11:08 AM, will schmidt wrote:

On Wed, 2020-06-17 at 14:46 -0500, Bill Schmidt wrote:

I posted a version of these patches back in stage 4 (February),
but we agreed that holding off until stage 1 was a better idea.
Since then I've made more progress and reorganized the patches
accordingly.  This group of patches lays groundwork, but does not
actually change GCC's behavior yet, other than to generate the
new initialization information and ignore it.

The current built-in support in the rs6000 back end requires at least
a master's degree in spelunking to comprehend.  It's full of cruft,
redundancy, and unused bits of code, and long overdue for a
replacement.  This is the first part of my project to do that.

My intent is to make adding new built-in functions as simple as adding
a few lines to a couple of files, and automatically generating as much
of the initialization, overload resolution, and expansion logic as
possible.  This patch series establishes the format of the input files
and creates a new program (rs6000-gen-builtins) to:

   * Parse the input files into an internal representation;
   * Generate a file of #defines (rs6000-vecdefines.h) for eventual
     inclusion into altivec.h; and
   * Generate an initialization file to create and initialize tables of
     built-in functions and overloads.

Patches 1, 3-7, and 9-19 contain the logic for rs6000-gen-builtins.
Patch 8 provides balanced tree search support for parsing scalability.
Patches 2 and 21-27 provide a first cut at the input files.
Patch 20 incorporates the new code into the GCC build.
Patch 28 adds comments to some existing files that will help
during the transition from the previous builtin mechanism.

The patch series is constructed so that any prefix set of the patches
can be upstreamed without breaking anything, so we can take the
reviews slowly.  There's still plenty of work left, but I think it
will be helpful to get this big chunk of patches upstream to make
further progress easier.

Thanks in advance for your reviews!


I've read through the series.  Nothing significant, just a few cosmetic
nits; I've called them out below here, versus replying to the
individual emails.

generally lgtm.
thanks
-Will



Bill Schmidt (28):
rs6000: Initial create of rs6000-gen-builtins.c

ok

rs6000: Add initial input files

Whitespace/tabs in "Legal values of " blurb.
otherwise ok

Urk.  Will fix.

rs6000: Add file support and functions for diagnostic support

ok

rs6000: Add helper functions for parsing

ok

rs6000: Add functions for matching types, part 1 of 3

ok


rs6000: Add functions for matching types, part 2 of 3

ok


rs6000: Add functions for matching types, part 3 of 3

ok


rs6000: Red-black tree implementation for balanced tree search

ok


rs6000: Main function with stubs for parsing and output

ok


rs6000: Parsing built-in input file, part 1 of 3

ok

rs6000: Parsing built-in input file, part 2 of 3

ok

rs6000: Parsing built-in input file, part 3 of 3

ok

rs6000: Parsing of overload input file

use enums or consts instead of hardcoding values ?

Is this specifically about MAXOVLDSTANZAS, MAXOVLDS, or something else?
If the former, I guess I can define these in const decls instead of
using #define if that's preferred.

No issue with those.  I was noting the constants used as the return
values in the parse_ovld_entry() function.

You have them clearly documented there.

+/* Parse one two-line entry in the overload file.  Return 0 for EOF, 1 for
+   success, 2 for end-of-stanza, and 6 for a parsing failure.  */

So just a suggestion to use other defined values for that.

I didn't notice those numbers used in the other patches, so maybe this
is already fixed up elsewhere.


I see, thanks for the explanation.  I agree, it's not just in this 
patch; the return values used in parsing are a mess.  I'll create an 
enum and use it consistently.
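
For what it's worth, such an enum might look something like the sketch below.  The names parse_status, PARSE_* and parse_status_name are hypothetical; the numeric values are the ones documented in the quoted comment for parse_ovld_entry, kept here on the assumption that existing callers should not have to change.

```c
/* Hypothetical names; the values mirror the quoted comment: 0 for EOF,
   1 for success, 2 for end-of-stanza, and 6 for a parsing failure.  */
enum parse_status
{
  PARSE_EOF = 0,
  PARSE_OK = 1,
  PARSE_END_STANZA = 2,
  PARSE_INVALID = 6
};

/* Return a printable name for STATUS, e.g. for diagnostics.  */
static const char *
parse_status_name (enum parse_status status)
{
  switch (status)
    {
    case PARSE_EOF:
      return "end of file";
    case PARSE_OK:
      return "success";
    case PARSE_END_STANZA:
      return "end of stanza";
    case PARSE_INVALID:
      return "parsing failure";
    }
  return "unknown";
}
```

Callers can then compare against PARSE_INVALID and friends instead of the bare numbers.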


Thanks!
Bill





Thanks
-Will





[PATCH] Make contrib/download_prerequisites work on AIX and OpenBSD

2020-06-23 Thread Ilya Leoshkevich via Gcc-patches
Hello,

I needed to test [1] on AIX and OpenBSD and noticed
download_prerequisites doesn't work there. The attached patch fixes
it.

OK for master?

Best regards,
Ilya

[1] https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548182.html

---

contrib/ChangeLog:

2020-06-11  Ilya Leoshkevich  

* download_prerequisites: Support AIX and OpenBSD unames.
Pipe `{gzip,bzip2} -d` to `tar -xf -`.
---
 contrib/download_prerequisites | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites
index aa0356e6266..da19913f9ab 100755
--- a/contrib/download_prerequisites
+++ b/contrib/download_prerequisites
@@ -47,9 +47,12 @@ force=0
 OS=$(uname)
 
 case $OS in
-  "Darwin"|"FreeBSD"|"DragonFly")
+  "Darwin"|"FreeBSD"|"DragonFly"|"AIX")
 chksum='shasum -a 512 --check'
   ;;
+  "OpenBSD")
+chksum='sha512 -c'
+  ;;
   *)
 chksum='sha512sum -c'
   ;;
@@ -242,8 +245,19 @@ for ar in $(echo_archives)
 do
 package="${ar%.tar*}"
 if [ ${force} -gt 0 ]; then rm -rf "${directory}/${package}"; fi
+case $ar in
+*.gz)
+   uncompress='gzip -d'
+   ;;
+*.bz2)
+   uncompress='bzip2 -d'
+   ;;
+*)
+   uncompress='cat'
+   ;;
+esac
 [ -e "${directory}/${package}" ]  \
-|| ( cd "${directory}" && tar -xf "${ar}" )   \
+|| ( cd "${directory}" && $uncompress <"${ar}" | tar -xf - )  \
 || die "Cannot extract package from ${ar}"
 unset package
 done
-- 
2.25.4



Re: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved registers with CMSE

2020-06-23 Thread Christophe Lyon via Gcc-patches
On Tue, 23 Jun 2020 at 15:28, Andre Vieira (lists)
 wrote:
>
> On 23/06/2020 13:10, Kyrylo Tkachov wrote:
> >
> >> -Original Message-
> >> From: Andre Vieira (lists) 
> >> Sent: 22 June 2020 09:52
> >> To: gcc-patches@gcc.gnu.org
> >> Cc: Kyrylo Tkachov 
> >> Subject: [PATCH][GCC][Arm] PR target/95646: Do not clobber callee saved
> >> registers with CMSE
> >>
> >> Hi,
> >>
> >> As reported in bugzilla when the -mcmse option is used while compiling
> >> for size (-Os) with a thumb-1 target the generated code will clear the
> >> registers r7-r10. These however are callee saved and should be preserved
> >> across ABI boundaries. The reason this happens is because these
> >> registers are made "fixed" when optimising for size with Thumb-1 in a
> >> way to make sure they are not used, as pushing and popping hi-registers
> >> requires extra moves to and from LO_REGS.
> >>
> >> To fix this, this patch uses 'callee_saved_reg_p', which accounts for
> >> this optimisation, instead of 'call_used_or_fixed_reg_p'. Be aware of
> >> the definition of 'callee_saved_reg_p': it still takes call-used
> >> registers into account, which aren't callee-saved in my opinion, so the
> >> name is rather a misnomer. That works to our advantage here, though, as
> >> it does exactly what we need.
> >>
> >> Regression tested on arm-none-eabi.
> >>
> >> Is this OK for trunk? (Will eventually backport to previous versions if
> >> stable.)
> > Ok.
> > Thanks,
> > Kyrill
> As I was getting ready to push this I noticed I didn't add any skip-ifs
> to prevent this failing with specific target options. So here's a new
> version with those.
>
> Still OK?
>

Hi,

This is not sufficient to skip arm-linux-gnueabi* configs built with
non-default cpu/fpu.

For instance, with arm-linux-gnueabihf --with-cpu=cortex-a9
--with-fpu=neon-fp16 --with-float=hard
I see:
FAIL: gcc.target/arm/pr95646.c (test for excess errors)
Excess errors:
cc1: error: ARMv8-M Security Extensions incompatible with selected FPU
cc1: error: target CPU does not support ARM mode

and the testcase is compiled with -mcpu=cortex-m23 -mcmse -Os

Christophe

> Cheers,
> Andre
> >
> >> Cheers,
> >> Andre
> >>
> >> gcc/ChangeLog:
> >> 2020-06-22  Andre Vieira  
> >>
> >>   PR target/95646
> >>   * config/arm/arm.c: (cmse_nonsecure_entry_clear_before_return):
> >> Use 'callee_saved_reg_p' instead of
> >>   'calL_used_or_fixed_reg_p'.
> >>
> >> gcc/testsuite/ChangeLog:
> >> 2020-06-22  Andre Vieira  
> >>
> >>   PR target/95646
> >>   * gcc.target/arm/pr95646.c: New test.


[PATCH] PR fortran/95826 - Buffer overflows with PDTs and long symbols

2020-06-23 Thread Harald Anlauf
Dear all,

here's another fix for a buffer overflow with long symbols.

OK for master / backports?

Regtested on x86_64-pc-linux-gnu.

Thanks,
Harald


PR fortran/95826 - Buffer overflows with PDTs and long symbols

With PDTs (parameterized derived types), name mangling results in variably
long internal symbols.  Use a dynamic buffer instead of a fixed-size one.

gcc/fortran/
PR fortran/95826
* decl.c (gfc_match_decl_type_spec): Replace a fixed size
buffer by a pointer and reallocate if necessary.
diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index c27cfacf2e4..ac1f63f66e0 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -4095,7 +4095,7 @@ match
 gfc_match_decl_type_spec (gfc_typespec *ts, int implicit_flag)
 {
   /* Provide sufficient space to hold "pdtsymbol".  */
-  char name[GFC_MAX_SYMBOL_LEN + 1 + 3];
+  char *name = XALLOCAVEC (char, GFC_MAX_SYMBOL_LEN + 1);
   gfc_symbol *sym, *dt_sym;
   match m;
   char c;
@@ -4286,8 +4286,10 @@ gfc_match_decl_type_spec (gfc_typespec *ts, int implicit_flag)
 	  gcc_assert (!sym->attr.pdt_template && sym->attr.pdt_type);
 	  ts->u.derived = sym;
 	  const char* lower = gfc_dt_lower_string (sym->name);
-	  size_t len = strnlen (lower, sizeof (name));
-	  gcc_assert (len < sizeof (name));
+	  size_t len = strlen (lower);
+	  /* Reallocate with sufficient size.  */
+	  if (len > GFC_MAX_SYMBOL_LEN)
+	name = XALLOCAVEC (char, len + 1);
 	  memcpy (name, lower, len);
 	  name[len] = '\0';
 	}
diff --git a/gcc/testsuite/gfortran.dg/pr95826.f90 b/gcc/testsuite/gfortran.dg/pr95826.f90
new file mode 100644
index 000..8de04e65df0
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95826.f90
@@ -0,0 +1,20 @@
+! { dg-do compile }
+! { dg-options "-fsecond-underscore" }
+! PR fortran/95826 - ICE in gfc_match_decl_type_spec, at fortran/decl.c:4290
+
+program p
+  type t2345678901234567890123456789012345678901234567890123456789_123 &
+  (a2345678901234567890123456789012345678901234567890123456789_123, &
+   b2345678901234567890123456789012345678901234567890123456789_123)
+ integer, kind :: &
+   a2345678901234567890123456789012345678901234567890123456789_123
+ integer, len :: &
+   b2345678901234567890123456789012345678901234567890123456789_123
+  end type
+  integer, parameter :: &
+   n2345678901234567890123456789012345678901234567890123456789_123 = 16
+  type(t2345678901234567890123456789012345678901234567890123456789_123 &
+  (n2345678901234567890123456789012345678901234567890123456789_123,:)), &
+   allocatable :: &
+   x2345678901234567890123456789012345678901234567890123456789_123
+end
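
For readers following the fix, here is a minimal standalone sketch of the idea in the patch above: size the buffer from the actual string length rather than from a compile-time maximum.  This is an illustration, not the gfortran code.  MAX_SYMBOL_LEN and copy_symbol_name are hypothetical names, and the sketch uses malloc where the patch uses XALLOCAVEC (stack allocation).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for GFC_MAX_SYMBOL_LEN; the value is illustrative.  */
#define MAX_SYMBOL_LEN 63

/* Return a heap copy of SRC.  The buffer holds at least MAX_SYMBOL_LEN
   characters, and grows when SRC is longer than that, so the copy can
   never overflow.  The caller frees the result.  Returns NULL if the
   allocation fails.  */
char *
copy_symbol_name (const char *src)
{
  assert (src != NULL);
  size_t len = strlen (src);
  size_t cap = len > MAX_SYMBOL_LEN ? len : MAX_SYMBOL_LEN;
  char *name = malloc (cap + 1);
  if (name == NULL)
    return NULL;
  memcpy (name, src, len);
  name[len] = '\0';
  return name;
}
```

A caller would write something like `char *n = copy_symbol_name (lower);` and free the result when done; the point is only that the capacity follows the input length.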


[PATCH] PR fortran/95827 - Buffer overflows with PDTs and long symbols

2020-06-23 Thread Harald Anlauf
Dear all,

here's another case with a buffer that did overflow.

Regtested on x86_64-pc-linux-gnu.

OK for master / backports?

Thanks,
Harald


PR fortran/95827 - Buffer overflows with PDTs and long symbols

With submodules and coarrays, name mangling results in long internal
symbols.  Enlarge internal buffer.

gcc/fortran/
PR fortran/95827
* iresolve.c (gfc_get_string): Enlarge internal buffer used in
generating the mangled name.
diff --git a/gcc/fortran/iresolve.c b/gcc/fortran/iresolve.c
index aa9bb328a0f..73769615c20 100644
--- a/gcc/fortran/iresolve.c
+++ b/gcc/fortran/iresolve.c
@@ -47,8 +47,8 @@ along with GCC; see the file COPYING3.  If not see
 const char *
 gfc_get_string (const char *format, ...)
 {
-  /* Provide sufficient space to hold "_F.symbol.symbol_MOD_symbol".  */
-  char temp_name[4 + 2*GFC_MAX_SYMBOL_LEN + 5 + GFC_MAX_SYMBOL_LEN + 1];
+  /* Provide sufficient space for "_F.caf_token__symbol.symbol_MOD_symbol".  */
+  char temp_name[15 + 2*GFC_MAX_SYMBOL_LEN + 5 + GFC_MAX_SYMBOL_LEN + 1];
   const char *str;
   va_list ap;
   tree ident;
diff --git a/gcc/testsuite/gfortran.dg/pr95827.f90 b/gcc/testsuite/gfortran.dg/pr95827.f90
new file mode 100644
index 000..545e344c46d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95827.f90
@@ -0,0 +1,14 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=lib -fsecond-underscore" }
+! PR fortran/95827 - ICE in gfc_get_string, at fortran/iresolve.c:70
+
+module m2345678901234567890123456789012345678901234567890123456789_123
+  interface
+ module subroutine s2345678901234567890123456789012345678901234567890123456789_123
+ end
+   end interface
+end
+submodule(m2345678901234567890123456789012345678901234567890123456789_123) &
+  n2345678901234567890123456789012345678901234567890123456789_123
+  integer :: x2345678901234567890123456789012345678901234567890123456789_123[*]
+end
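
As a worked illustration of the buffer arithmetic in the patch above, the sketch below computes the enlarged size and shows one way to detect truncation when formatting into a fixed buffer.  It is not the gfc_get_string implementation.  MAX_SYMBOL_LEN (assumed to be 63 here), TEMP_NAME_SIZE and format_checked are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Assumed value standing in for GFC_MAX_SYMBOL_LEN.  */
#define MAX_SYMBOL_LEN 63

/* Same expression as the enlarged buffer in the patch:
   15 + 2*63 + 5 + 63 + 1 = 210 bytes under the assumption above.  */
#define TEMP_NAME_SIZE (15 + 2 * MAX_SYMBOL_LEN + 5 + MAX_SYMBOL_LEN + 1)

/* Format into BUF of SIZE bytes.  Return 1 if the whole string fit,
   0 if it was truncated or formatting failed.  */
int
format_checked (char *buf, size_t size, const char *format, ...)
{
  va_list ap;
  assert (buf != NULL && size > 0);
  va_start (ap, format);
  int n = vsnprintf (buf, size, format, ap);
  va_end (ap);
  return n >= 0 && (size_t) n < size;
}
```

Checking the vsnprintf return value against the buffer size is a generic C idiom for catching a mangled name that is still too long; whether gfortran should assert, truncate or reallocate in that case is a separate design question.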


Re: [PATCH v2, RS6000 PR target/94954] Fix wrong codegen for vec_pack_to_short_fp32() builtin

2020-06-23 Thread Segher Boessenkool
Hi!

On Tue, Jun 16, 2020 at 10:53:07AM -0500, will schmidt wrote:
>   target pr/94954

PR target/94954

> * config/rs6000/altivec.h (vec_pack_to_short_fp32) Update.

Colon?

> * config/rs6000/rs6000-call.c  (P9V_BUILTIN_VEC_CONVERT_4F32_8F16): New
> overloaded builtin entry.

Only one space before (.

> -/* { dg-do run { target { powerpc*-*-linux* && { lp64 && p9vector_hw } } } } */
> +/* { dg-do run { target { powerpc*-*-linux* && { p9vector_hw } } } } */

You shouldn't need those inner {} anymore, then.

Okay for trunk.  Thanks!  Also okay for backport to 10, if you want that.


Segher


Fortran: Fix character-kind=4 substring resolution (PR95837)

2020-06-23 Thread Tobias Burnus

Found when looking at another issue …

OK for the trunk?

Tobias

PS: Without the patch, it fails to compile with:
Error: Character ‘\U0001F600’ in string at (1) cannot be converted into character kind 1
Error: Operands of comparison operator ‘/=’ at (1) are CHARACTER(3)/CHARACTER(3,4)

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
Fortran: Fix character-kind=4 substring resolution (PR95837)

gcc/fortran/ChangeLog:

	PR fortran/95837
	* resolve.c (gfc_resolve_substring_charlen): Fix char-kind setting.

gcc/testsuite/ChangeLog:

	PR fortran/95837
	* gfortran.dg/char4-subscript.f90: New test.

 gcc/fortran/resolve.c |  7 ++-
 gcc/testsuite/gfortran.dg/char4-subscript.f90 | 30 +++
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index c53b312f7ed..6d844dd2310 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -5141,7 +5141,12 @@ gfc_resolve_substring_charlen (gfc_expr *e)
 }
 
   e->ts.type = BT_CHARACTER;
-  e->ts.kind = gfc_default_character_kind;
+  if (ts)
+e->kind = ts->kind;
+  else if (e->symtree->n.sym->ts.type == BT_CHARACTER)
+e->kind = ts->kind;
+  else
+e->kind = gfc_default_character_kind;
 
   if (!e->ts.u.cl)
 e->ts.u.cl = gfc_new_charlen (gfc_current_ns, NULL);
diff --git a/gcc/testsuite/gfortran.dg/char4-subscript.f90 b/gcc/testsuite/gfortran.dg/char4-subscript.f90
new file mode 100644
index 000..f1f915c7af9
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/char4-subscript.f90
@@ -0,0 +1,30 @@
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/95837
+!
+type t
+  character(len=:, kind=4), pointer :: str2
+end type t
+type(t) :: var
+
+allocate(character(len=5, kind=4) :: var%str2)
+
+var%str2(1:1) = 4_"d"
+var%str2(2:3) = 4_"ef"
+var%str2(4:4) = achar(int(Z'1F600'), kind=4)
+var%str2(5:5) = achar(int(Z'1F608'), kind=4)
+
+if (var%str2(1:3) /= 4_"def") stop 1
+if (ichar(var%str2(4:4)) /= int(Z'1F600')) stop 2
+if (ichar(var%str2(5:5)) /= int(Z'1F608')) stop 2
+
+deallocate(var%str2)
+end
+
+! Note: the last '\x00' is regarded as string terminator, hence, the tailing \0 byte is not in the dump
+
+! { dg-final { scan-tree-dump "  \\(\\*var\\.str2\\)\\\[1\\\]{lb: 1 sz: 4} = .dx00x00.\\\[1\\\]{lb: 1 sz: 4};" "original" } }
+! { dg-final { scan-tree-dump "  __builtin_memmove \\(\\(void \\*\\) &\\(\\*var.str2\\)\\\[2\\\]{lb: 1 sz: 4}, \\(void \\*\\) &.ex00x00x00fx00x00.\\\[1\\\]{lb: 1 sz: 4}, 8\\);" "original" } }
+! { dg-final { scan-tree-dump "  \\(\\*var.str2\\)\\\[4\\\]{lb: 1 sz: 4} = .x00xf6x01.\\\[1\\\]{lb: 1 sz: 4};" "original" } }
+! { dg-final { scan-tree-dump "  \\(\\*var.str2\\)\\\[5\\\]{lb: 1 sz: 4} = .bxf6x01.\\\[1\\\]{lb: 1 sz: 4};" "original" } }


Re: [PATCH] PR fortran/95827 - Buffer overflows with PDTs and long symbols

2020-06-23 Thread Jerry DeLisle via Gcc-patches

OK, and thanks for the patch.

On 6/23/20 2:08 PM, Harald Anlauf wrote:

Dear all,

here's another case with a buffer that did overflow.

Regtested on x86_64-pc-linux-gnu.

OK for master / backports?

Thanks,
Harald


PR fortran/95827 - Buffer overflows with PDTs and long symbols

With submodules and coarrays, name mangling results in long internal
symbols.  Enlarge internal buffer.

gcc/fortran/
PR fortran/95827
* iresolve.c (gfc_get_string): Enlarge internal buffer used in
generating the mangled name.




Re: [PATCH v2, RS6000 PR target/94954] Fix wrong codegen for vec_pack_to_short_fp32() builtin

2020-06-23 Thread Bill Schmidt via Gcc-patches

On 6/23/20 4:36 PM, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Jun 16, 2020 at 10:53:07AM -0500, will schmidt wrote:
> >   target pr/94954
>
> PR target/94954
>
> > * config/rs6000/altivec.h (vec_pack_to_short_fp32) Update.
>
> Colon?
>
> > * config/rs6000/rs6000-call.c  (P9V_BUILTIN_VEC_CONVERT_4F32_8F16): New
> > overloaded builtin entry.
>
> Only one space before (.
>
> > -/* { dg-do run { target { powerpc*-*-linux* && { lp64 && p9vector_hw } } } } */
> > +/* { dg-do run { target { powerpc*-*-linux* && { p9vector_hw } } } } */
>
> You shouldn't need those inner {} anymore, then.
>
> Okay for trunk.  Thanks!  Also okay for backport to 10, if you want that.

IMO, should backport to 9 and 8 also.  It's a wrong-code problem.

Bill

> Segher


[pushed] c++: Improve CTAD for aggregates [PR93976]

2020-06-23 Thread Jason Merrill via Gcc-patches
P2082R1 adjusted the rules for class template argument deduction for an
aggregate to better handle arrays and pack expansions.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/93976
Implement C++20 P2082R1, Fixing CTAD for aggregates.
* cp-tree.h (TPARMS_PRIMARY_TEMPLATE): Split out from...
(DECL_PRIMARY_TEMPLATE): ...here.
(builtin_guide_p): Declare.
* decl.c (reshape_init_class): Handle bases of a template.
(reshape_init_r): An array with dependent bound takes a single
initializer.
* pt.c (tsubst_default_argument): Shortcut {}.
(unify_pack_expansion): Allow omitted arguments to trailing pack.
(builtin_guide_p): New.
(collect_ctor_idx_types): Give a trailing pack a {} default
argument.  Handle arrays better.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-aggr3.C: New test.
* g++.dg/cpp2a/class-deduction-aggr4.C: New test.
---
 gcc/cp/cp-tree.h  |  5 +-
 gcc/cp/decl.c | 55 +-
 gcc/cp/pt.c   | 73 ---
 .../g++.dg/cpp2a/class-deduction-aggr3.C  | 24 ++
 .../g++.dg/cpp2a/class-deduction-aggr4.C  | 29 
 5 files changed, 171 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr4.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index c1396686968..78e8ca4150a 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4815,8 +4815,10 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter)
templates are primary, too.  */
 
 /* Returns the primary template corresponding to these parameters.  */
+#define TPARMS_PRIMARY_TEMPLATE(NODE) (TREE_TYPE (NODE))
+
 #define DECL_PRIMARY_TEMPLATE(NODE) \
-  (TREE_TYPE (DECL_INNERMOST_TEMPLATE_PARMS (NODE)))
+  (TPARMS_PRIMARY_TEMPLATE (DECL_INNERMOST_TEMPLATE_PARMS (NODE)))
 
 /* Returns nonzero if NODE is a primary template.  */
 #define PRIMARY_TEMPLATE_P(NODE) (DECL_PRIMARY_TEMPLATE (NODE) == (NODE))
@@ -7024,6 +7026,7 @@ extern bool dguide_name_p (tree);
 extern bool deduction_guide_p  (const_tree);
 extern bool copy_guide_p   (const_tree);
 extern bool template_guide_p   (const_tree);
+extern bool builtin_guide_p(const_tree);
 extern void store_explicit_specifier   (tree, tree);
 extern tree add_outermost_template_args(tree, tree);
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 1d960be1ee6..3afad5ca805 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6153,7 +6153,22 @@ reshape_init_class (tree type, reshape_iter *d, bool first_initializer_p,
 
   /* The initializer for a class is always a CONSTRUCTOR.  */
   new_init = build_constructor (init_list_type_node, NULL);
-  field = next_initializable_field (TYPE_FIELDS (type));
+
+  int binfo_idx = -1;
+  tree binfo = TYPE_BINFO (type);
+  tree base_binfo = NULL_TREE;
+  if (cxx_dialect >= cxx17 && uses_template_parms (type))
+{
+  /* We get here from maybe_aggr_guide for C++20 class template argument
+deduction.  In this case we need to look through the binfo because a
+template doesn't have base fields.  */
+  binfo_idx = 0;
+  BINFO_BASE_ITERATE (binfo, binfo_idx, base_binfo);
+}
+  if (base_binfo)
+field = base_binfo;
+  else
+field = next_initializable_field (TYPE_FIELDS (type));
 
   if (!field)
 {
@@ -6171,6 +6186,9 @@ reshape_init_class (tree type, reshape_iter *d, bool first_initializer_p,
   return new_init;
 }
 
+  /* For C++20 CTAD, handle pack expansions in the base list.  */
+  tree last_was_pack_expansion = NULL_TREE;
+
   /* Loop through the initializable fields, gathering initializers.  */
   while (d->cur != d->end)
 {
@@ -6218,6 +6236,13 @@ reshape_init_class (tree type, reshape_iter *d, bool first_initializer_p,
   if (!field)
break;
 
+  last_was_pack_expansion = (PACK_EXPANSION_P (TREE_TYPE (field))
+? field : NULL_TREE);
+  if (last_was_pack_expansion)
+   /* Each non-trailing aggregate element that is a pack expansion is
+  assumed to correspond to no elements of the initializer list.  */
+   goto continue_;
+
   field_init = reshape_init_r (TREE_TYPE (field), d,
   /*first_initializer_p=*/NULL_TREE,
   complain);
@@ -6243,7 +6268,27 @@ reshape_init_class (tree type, reshape_iter *d, bool first_initializer_p,
   if (TREE_CODE (type) == UNION_TYPE)
break;
 
-  field = next_initializable_field (DECL_CHAIN (field));
+continue_:
+  if (base_binfo)
+   {
+ BINFO_BASE_ITERATE (binfo, ++binfo_idx, base_binfo);
+ if (base_binfo)
+   

Re: [PATCH 1/6 ver 2] rs6000, Update support for vec_extract

2020-06-23 Thread Segher Boessenkool
Hi!

On Mon, Jun 15, 2020 at 04:37:47PM -0700, Carl Love wrote:
> * config/rs6000/altivec.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)

No colon here.

>   (vextractl, vextractr)
> (vextractl_internal, vextractr_internal)
>   (VI2): Move to gcc/config/rs6000/vsx.md.

Will explained how you can easily write a changelog entry for moving
code to another file.

If you write  in a changelog left side of a colon, it usually
helps to say what iterator that is: " for VI2" for example.  This
is of course more important in more complex cases, say where you have
the exact same names just for different iterators, or when you have
 for example (the name "3" is really not very
enlightening :-) )

>   * config/rs6000/vsx.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)

No colon.

> (vextractl, vextractr)
> (vextractl_internal, vextractr_internal)
>   (VI2): Code was moved from config/rs6000/altivec.md.
>   * gcc/doc/extend.texi: Update documentation for vec_extractl.
>   Replace builtin name vec_extractr with vec_extracth.  Update description
>   of vec_extracth.

The indent is weird in places, I guess that is just a mail issue.

Okay for trunk with those trivialities, and the things Will found, fixed
up.  Thanks!  Just writing out the full instruction names is the easiest
for everyone btw, unless that then needs to be a huge list, that isn't
very helpful to anyone.


Segher


Re: [PATCH] PR fortran/95707 - ICE in finish_equivalences, at fortran/trans-common.c:1319

2020-06-23 Thread Jerry DeLisle via Gcc-patches

OK, and once again, thanks.
Jerry

On 6/17/20 12:27 PM, Harald Anlauf wrote:

Another corner case of buffer overflows during name mangling found by
Gerhard.  We now check that the new buffer sizes suffice.

The patch is on top of the patches for PRs 95687, 95688, 95689.

Regtested on x86_64-pc-linux-gnu.

OK for master / backports?

Thanks,
Harald


PR fortran/95707 - ICE in finish_equivalences, at fortran/trans-common.c:1319

With submodules and equivalence declarations, name mangling may result in
long internal symbols overflowing internal buffers.  We now check that
we do not exceed the enlarged buffer sizes.

gcc/fortran/
PR fortran/95707
* gfortran.h (gfc_common_head): Enlarge buffer.
* trans-common.c (gfc_sym_mangled_common_id): Enlarge temporary
buffers, and add check on length on mangled name to prevent
overflow.




Re: Fortran: Fix character-kind=4 substring resolution (PR95837)

2020-06-23 Thread Tobias Burnus

Ups – old patch :-(

Correct one attached.

Tobias

On 6/23/20 11:41 PM, Tobias Burnus wrote:

Found when looking at another issue …

OK for the trunk?

Tobias

PS: Without the patch, it fails to compile with:
Error: Character ‘\U0001F600’ in string at (1) cannot be converted
into character kind 1
Error: Operands of comparison operator ‘/=’ at (1) are
CHARACTER(3)/CHARACTER(3,4)


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
Fortran: Fix character-kind=4 substring resolution (PR95837)

gcc/fortran/ChangeLog:

	PR fortran/95837
	* resolve.c (gfc_resolve_substring_charlen): Fix char-kind setting.

gcc/testsuite/ChangeLog:

	PR fortran/95837
	* gfortran.dg/char4-subscript.f90: New test.

 gcc/fortran/resolve.c |  5 -
 gcc/testsuite/gfortran.dg/char4-subscript.f90 | 30 +++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index c53b312f7ed..e07ae6d1096 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -5140,8 +5140,11 @@ gfc_resolve_substring_charlen (gfc_expr *e)
 	return;
 }
 
+  if (ts)
+e->ts.kind = ts->kind;
+  else if (e->ts.type != BT_CHARACTER)
+e->ts.kind = gfc_default_character_kind;
   e->ts.type = BT_CHARACTER;
-  e->ts.kind = gfc_default_character_kind;
 
   if (!e->ts.u.cl)
 e->ts.u.cl = gfc_new_charlen (gfc_current_ns, NULL);
diff --git a/gcc/testsuite/gfortran.dg/char4-subscript.f90 b/gcc/testsuite/gfortran.dg/char4-subscript.f90
new file mode 100644
index 000..f1f915c7af9
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/char4-subscript.f90
@@ -0,0 +1,30 @@
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/95837
+!
+type t
+  character(len=:, kind=4), pointer :: str2
+end type t
+type(t) :: var
+
+allocate(character(len=5, kind=4) :: var%str2)
+
+var%str2(1:1) = 4_"d"
+var%str2(2:3) = 4_"ef"
+var%str2(4:4) = achar(int(Z'1F600'), kind=4)
+var%str2(5:5) = achar(int(Z'1F608'), kind=4)
+
+if (var%str2(1:3) /= 4_"def") stop 1
+if (ichar(var%str2(4:4)) /= int(Z'1F600')) stop 2
+if (ichar(var%str2(5:5)) /= int(Z'1F608')) stop 2
+
+deallocate(var%str2)
+end
+
+! Note: the last '\x00' is regarded as string terminator, hence, the tailing \0 byte is not in the dump
+
+! { dg-final { scan-tree-dump "  \\(\\*var\\.str2\\)\\\[1\\\]{lb: 1 sz: 4} = .dx00x00.\\\[1\\\]{lb: 1 sz: 4};" "original" } }
+! { dg-final { scan-tree-dump "  __builtin_memmove \\(\\(void \\*\\) &\\(\\*var.str2\\)\\\[2\\\]{lb: 1 sz: 4}, \\(void \\*\\) &.ex00x00x00fx00x00.\\\[1\\\]{lb: 1 sz: 4}, 8\\);" "original" } }
+! { dg-final { scan-tree-dump "  \\(\\*var.str2\\)\\\[4\\\]{lb: 1 sz: 4} = .x00xf6x01.\\\[1\\\]{lb: 1 sz: 4};" "original" } }
+! { dg-final { scan-tree-dump "  \\(\\*var.str2\\)\\\[5\\\]{lb: 1 sz: 4} = .bxf6x01.\\\[1\\\]{lb: 1 sz: 4};" "original" } }
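
The kind selection in the corrected resolve.c hunk above can be read as a small decision rule.  The sketch below restates it with simplified, hypothetical types (struct typespec, set_substring_kind and DEFAULT_CHARACTER_KIND are stand-ins, not gfortran's gfc_typespec or its real API), assuming the rule is: take the kind from the referenced typespec when there is one, fall back to the default kind only when the expression is not already CHARACTER, and otherwise keep the existing kind.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the gfortran types.  */
enum basic_type { BT_UNKNOWN, BT_CHARACTER };
struct typespec { enum basic_type type; int kind; };

/* Illustrative default character kind.  */
#define DEFAULT_CHARACTER_KIND 1

/* Set the character kind of EXPR for a substring reference.
   REF, when non-NULL, is the typespec of the referenced object.  */
void
set_substring_kind (struct typespec *expr, const struct typespec *ref)
{
  assert (expr != NULL);
  if (ref != NULL)
    expr->kind = ref->kind;               /* Inherit, e.g. kind=4.  */
  else if (expr->type != BT_CHARACTER)
    expr->kind = DEFAULT_CHARACTER_KIND;  /* Nothing better is known.  */
  /* Otherwise EXPR is already CHARACTER and its kind is kept.  */
  expr->type = BT_CHARACTER;
}
```

The behaviour that matters for PR95837 is the first branch: a substring of a kind=4 string keeps kind 4 instead of being forced to the default kind.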


[PATCH] x86: Fold arch_names_table into processor_alias_table

2020-06-23 Thread H.J. Lu via Gcc-patches
In i386-builtins.c, arch_names_table is used to map architecture name
string to internal model.  A switch statement is used to map internal
processor name to architecture name string and internal priority.

model and priority are added to processor_alias_table so that a single
entry contains architecture name string, internal processor name,
internal model and internal priority.  6 entries are appended for
i386-builtins.c, which have special architecture name strings: amd,
amdfam10h, amdfam15h, amdfam17h, shanghai and istanbul, and pta_size is
adjusted to exclude them.  Entries which are not used by i386-builtins.c
have internal model 0.  P_PROC_DYNAMIC is added to internal priority to
mark entries with dynamic architecture name string or priority.

PR target/95842
* common/config/i386/i386-common.c (processor_alias_table): Add
processor model and priority to each entry.
(pta_size): Updated with -6.
(num_arch_names): New.
* common/config/i386/i386-cpuinfo.h: New file.
* config/i386/i386-builtins.c (feature_priority): Removed.
(processor_model): Likewise.
(_arch_names_table): Likewise.
(arch_names_table): Likewise.
(get_builtin_code_for_version): Use processor_alias_table.
(fold_builtin_cpu): Replace arch_names_table with
processor_alias_table.
* config/i386/i386.h: Include "common/config/i386/i386-cpuinfo.h".
(pta): Add model and priority.
(num_arch_names): New.
---
 gcc/common/config/i386/i386-common.c  | 239 ---
 gcc/common/config/i386/i386-cpuinfo.h | 134 +++
 gcc/config/i386/i386-builtins.c   | 329 +-
 gcc/config/i386/i386.h|   5 +
 4 files changed, 345 insertions(+), 362 deletions(-)
 create mode 100644 gcc/common/config/i386/i386-cpuinfo.h

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 42d14c29f16..d3c9a3bff9a 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1619,164 +1619,202 @@ STATIC_ASSERT (ARRAY_SIZE (processor_names) == PROCESSOR_max);
 
 const pta processor_alias_table[] =
 {
-  {"i386", PROCESSOR_I386, CPU_NONE, 0},
-  {"i486", PROCESSOR_I486, CPU_NONE, 0},
-  {"i586", PROCESSOR_PENTIUM, CPU_PENTIUM, 0},
-  {"pentium", PROCESSOR_PENTIUM, CPU_PENTIUM, 0},
-  {"lakemont", PROCESSOR_LAKEMONT, CPU_PENTIUM, PTA_NO_80387},
-  {"pentium-mmx", PROCESSOR_PENTIUM, CPU_PENTIUM, PTA_MMX},
-  {"winchip-c6", PROCESSOR_I486, CPU_NONE, PTA_MMX},
-  {"winchip2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW},
-  {"c3", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW},
-  {"samuel-2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW},
+  {"i386", PROCESSOR_I386, CPU_NONE, 0, 0, P_ZERO},
+  {"i486", PROCESSOR_I486, CPU_NONE, 0, 0, P_ZERO},
+  {"i586", PROCESSOR_PENTIUM, CPU_PENTIUM, 0, 0, P_ZERO},
+  {"pentium", PROCESSOR_PENTIUM, CPU_PENTIUM, 0, 0, P_ZERO},
+  {"lakemont", PROCESSOR_LAKEMONT, CPU_PENTIUM, PTA_NO_80387,
+0, P_ZERO},
+  {"pentium-mmx", PROCESSOR_PENTIUM, CPU_PENTIUM, PTA_MMX, 0, P_ZERO},
+  {"winchip-c6", PROCESSOR_I486, CPU_NONE, PTA_MMX, 0, P_ZERO},
+  {"winchip2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW,
+0, P_ZERO},
+  {"c3", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW, 0, P_ZERO},
+  {"samuel-2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW,
+0, P_ZERO},
   {"c3-2", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO,
-PTA_MMX | PTA_SSE | PTA_FXSR},
+PTA_MMX | PTA_SSE | PTA_FXSR, 0, P_ZERO},
   {"nehemiah", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO,
-PTA_MMX | PTA_SSE | PTA_FXSR},
+PTA_MMX | PTA_SSE | PTA_FXSR, 0, P_ZERO},
   {"c7", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO,
-PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR},
+PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR, 0, P_ZERO},
   {"esther", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO,
-PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR},
-  {"i686", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, 0},
-  {"pentiumpro", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, 0},
-  {"pentium2", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, PTA_MMX | PTA_FXSR},
+PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR, 0, P_ZERO},
+  {"i686", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, 0, 0, P_ZERO},
+  {"pentiumpro", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, 0, 0, P_ZERO},
+  {"pentium2", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, PTA_MMX | PTA_FXSR,
+0, P_ZERO},
   {"pentium3", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO,
-PTA_MMX | PTA_SSE | PTA_FXSR},
+PTA_MMX | PTA_SSE | PTA_FXSR, 0, P_ZERO},
   {"pentium3m", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO,
-PTA_MMX | PTA_SSE | PTA_FXSR},
+PTA_MMX | PTA_SSE | PTA_FXSR, 0, P_ZERO},
   {"pentium-m", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO,
-PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_FXSR},
+PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_FXSR, 0, P_ZERO},
   {"pentium4", PROCESSOR_PENTIUM4, CPU_NONE,
-PTA_MMX | PTA_SSE | PTA_SSE2 | P
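
To make the table folding described in this patch concrete, here is a small sketch of a single table whose entries carry the name, processor, model and priority together, with a lookup by name.  All identifiers and values below (struct alias_entry, alias_table, find_alias, "exampleA", "exampleB") are hypothetical and much simpler than the real struct pta and processor_alias_table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified table entry: one row holds everything that
   previously lived in two separate tables plus a switch statement.  */
struct alias_entry
{
  const char *name;   /* Architecture name string.  */
  int processor;      /* Internal processor id.  */
  unsigned model;     /* Internal model; 0 when unused by the builtins.  */
  unsigned priority;  /* Internal priority.  */
};

/* Illustrative rows only; the values are made up.  */
static const struct alias_entry alias_table[] = {
  { "i386", 1, 0, 0 },
  { "exampleA", 2, 7, 3 },
  { "exampleB", 3, 9, 5 },
};

/* Return the entry for NAME, or NULL when there is none.  */
const struct alias_entry *
find_alias (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof alias_table / sizeof alias_table[0]; i++)
    if (strcmp (alias_table[i].name, name) == 0)
      return &alias_table[i];
  return NULL;
}
```

With one row per name, a caller that needs the model or the priority for a given architecture string does a single lookup instead of consulting a second table and a switch.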

Re: [PATCH v2, RS6000 PR target/94954] Fix wrong codegen for vec_pack_to_short_fp32() builtin

2020-06-23 Thread Segher Boessenkool
On Tue, Jun 23, 2020 at 04:47:42PM -0500, Bill Schmidt wrote:
> >Okay for trunk.  Thanks!  Also okay for backport to 10, if you want that.
> 
> IMO, should backport to 9 and 8 also.  It's a wrong-code problem.

Oh yes, certainly, for some reason I thought this wasn't there before
GCC 10 :-)


Segher


Re: [PATCH] recog: Use parameter packs for operator()

2020-06-23 Thread Gerald Pfeifer
On Mon, 22 Jun 2020, Richard Sandiford wrote:
> OK, I've applied the below as (hopefully) obvious after testing
> on aarch64-linux-gnu.
> 
>> I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95805 and included
>> the relevant part of the build log.
> 
> But I forgot about the PR, sorry, so didn't add it to the commit message.

No worries.  And thank you, I am happy to report those testers are
back in bootstrap land!

Gerald


[PATCH] c++: Fix CTAD for aggregates in template [PR95568]

2020-06-23 Thread Marek Polacek via Gcc-patches
95568 complains that CTAD for aggregates doesn't work within
requires-clause and it turned out that it doesn't work when we try
the deduction in a template.  The reason is that maybe_aggr_guide
creates a guide that can look like this

  template X(decltype (X::x))-> X

where the parameter is a decltype, which is a non-deduced context.  So
the subsequent build_new_function_call fails because unify_one_argument
can't deduce anything from it ([temp.deduct.type]: "If a template
parameter is used only in non-deduced contexts and is not explicitly
specified, template argument deduction fails.")

In such a case we probably want to return PTYPE and try again later like
we do when trying to deduce from a type-dependent expression.  Instead
of playing with build_new_function_call with only tf_decltype or with
uses_deducible_template_parms I'm only checking processing_template_decl
because it's cheaper and because when we are in a template it seems
pretty much guaranteed that the guide will have a DECLTYPE_TYPE
parameter.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/95568
* pt.c (do_class_deduction): When attempting CTAD for aggregates
in a template, return PTYPE.

gcc/testsuite/ChangeLog:

PR c++/95568
* g++.dg/cpp2a/class-deduction-aggr5.C: New test.
---
 gcc/cp/pt.c   |  9 -
 .../g++.dg/cpp2a/class-deduction-aggr5.C  | 20 +++
 2 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 53a64c3a15e..1f6ad8289bd 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -28830,7 +28830,14 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
 }
 
   if (tree guide = maybe_aggr_guide (tmpl, init, args))
-cands = lookup_add (guide, cands);
+{
+  /* In a template, GUIDE's type can have DECLTYPE_TYPE parameters
+and those are non-deduced contexts.  Wait until they have been
+substituted.  */
+  if (processing_template_decl)
+   return ptype;
+  cands = lookup_add (guide, cands);
+}
 
   tree call = error_mark_node;
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C
new file mode 100644
index 000..01253f42006
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C
@@ -0,0 +1,20 @@
+// PR c++/95568
+// { dg-do compile { target c++20 } }
+
+template struct X { T x; };
+template struct X2 { T x; U y; };
+template concept Y = requires { X{0}; };
+
+template
+void g()
+{
+  X{0};
+  X2{1, 2.2};
+  Y auto y = X{1};
+}
+
+void
+fn ()
+{
+  g();
+}

base-commit: 6b161257f9f8c7a26b7d119ebc32cbbc54d2e508
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



[PATCH] libgomp: added simple functions and tests for OMPD

2020-06-23 Thread y2s1982 via Gcc-patches
This patch adds some unit tests for omp-tools.h header. It also adds some simple
functions related to OMPD API versions. It also partially defines the OMPD
initialization function.

2020-06-23  Tony Sim  

libgomp/ChangeLog:

* Makefile.am (toolexeclib_LTLIBRARIES and related): Add libgompd.la.
* Makefile.in: Regenerate.
* config/darwin/plugin-suffix.h (SONAME_SUFFIX): Removed ().
* config/hpux/plugin-suffix.h (SONAME_SUFFIX): Removed ().
* config/posix/plugin-suffix.h (SONAME_SUFFIX): Removed ().
* testsuite/Makefile.in: Regenerate.
* libgompd.c: New file.
* libgompd.h: New file.
* testsuite/libgomp.ompd/header-1.c: New test.
* testsuite/libgomp.ompd/header-order-1.c: New test.
* testsuite/libgomp.ompd/header-order-2.c: New test.
* testsuite/libgomp.ompd/ompd.exp: New test.

---
 libgomp/Makefile.am   |  8 +++-
 libgomp/Makefile.in   | 20 ++--
 libgomp/config/darwin/plugin-suffix.h |  2 +-
 libgomp/config/hpux/plugin-suffix.h   |  2 +-
 libgomp/config/posix/plugin-suffix.h  |  2 +-
 libgomp/libgompd.c| 46 +++
 libgomp/libgompd.h|  7 +++
 libgomp/testsuite/Makefile.in |  1 +
 libgomp/testsuite/libgomp.ompd/header-1.c |  6 +++
 .../testsuite/libgomp.ompd/header-order-1.c   |  7 +++
 .../testsuite/libgomp.ompd/header-order-2.c   |  7 +++
 libgomp/testsuite/libgomp.ompd/ompd.exp   | 38 +++
 12 files changed, 139 insertions(+), 7 deletions(-)
 create mode 100644 libgomp/libgompd.c
 create mode 100644 libgomp/libgompd.h
 create mode 100644 libgomp/testsuite/libgomp.ompd/header-1.c
 create mode 100644 libgomp/testsuite/libgomp.ompd/header-order-1.c
 create mode 100644 libgomp/testsuite/libgomp.ompd/header-order-2.c
 create mode 100644 libgomp/testsuite/libgomp.ompd/ompd.exp

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index 4d31f4cef46..c26c26c59a4 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -20,7 +20,7 @@ AM_CPPFLAGS = $(addprefix -I, $(search_path))
 AM_CFLAGS = $(XCFLAGS)
 AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
 
-toolexeclib_LTLIBRARIES = libgomp.la
+toolexeclib_LTLIBRARIES = libgomp.la libgompd.la
 nodist_toolexeclib_HEADERS = libgomp.spec
 
 if LIBGOMP_BUILD_VERSIONED_SHLIB
@@ -67,6 +67,12 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c error.c \
oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
affinity-fmt.c teams.c allocator.c oacc-profiling.c oacc-target.c
 
+libgompd_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) \
+$(lt_host_flags)
+libgompd_la_DEPENDENCIES = $(libgomp_version_dep)
+libgompd_la_LINK = $(LINK) $(libgomp_la_LDFLAGS)
+libgompd_la_SOURCES = libgompd.c
+
 include $(top_srcdir)/plugin/Makefrag.am
 
 if USE_FORTRAN
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 3ca1be0d73e..c4c9e437e94 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -234,6 +234,9 @@ am_libgomp_la_OBJECTS = alloc.lo atomic.lo barrier.lo critical.lo \
teams.lo allocator.lo oacc-profiling.lo oacc-target.lo \
$(am__objects_1)
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
+libgompd_la_LIBADD =
+am_libgompd_la_OBJECTS = libgompd.lo
+libgompd_la_OBJECTS = $(am_libgompd_la_OBJECTS)
 AM_V_P = $(am__v_P_@AM_V@)
 am__v_P_ = $(am__v_P_@AM_DEFAULT_V@)
 am__v_P_0 = false
@@ -282,7 +285,8 @@ am__v_FCLD_0 = @echo "  FCLD" $@;
 am__v_FCLD_1 = 
 SOURCES = $(libgomp_plugin_gcn_la_SOURCES) \
$(libgomp_plugin_hsa_la_SOURCES) \
-   $(libgomp_plugin_nvptx_la_SOURCES) $(libgomp_la_SOURCES)
+   $(libgomp_plugin_nvptx_la_SOURCES) $(libgomp_la_SOURCES) \
+   $(libgompd_la_SOURCES)
 AM_V_DVIPS = $(am__v_DVIPS_@AM_V@)
 am__v_DVIPS_ = $(am__v_DVIPS_@AM_DEFAULT_V@)
 am__v_DVIPS_0 = @echo "  DVIPS   " $@;
@@ -548,8 +552,8 @@ libsubincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include
 AM_CPPFLAGS = $(addprefix -I, $(search_path))
 AM_CFLAGS = $(XCFLAGS)
 AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
-toolexeclib_LTLIBRARIES = libgomp.la $(am__append_1) $(am__append_2) \
-   $(am__append_3)
+toolexeclib_LTLIBRARIES = libgomp.la libgompd.la $(am__append_1) \
+   $(am__append_2) $(am__append_3)
 nodist_toolexeclib_HEADERS = libgomp.spec
 
 # -Wc is only a libtool option.
@@ -576,6 +580,12 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c \
oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
affinity-fmt.c teams.c allocator.c oacc-profiling.c \
oacc-target.c $(am__append_4)
+libgompd_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) \
+$(lt_host_flags)
+
+libgompd_la_DEPENDENCIES = $(libgomp_version_dep)
+libgompd_la_LINK = $(LINK) $(libgomp_la_LDFLAGS)
+libgompd_la_SOURCES = libgompd.c

[PATCH] reassoc: Propagate PHI_LOOP_BIAS along single uses

2020-06-23 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested x86_64-redhat-linux, ppc64le-redhat-linux and
s390x-redhat-linux.  I also ran SPEC 2006 and 2017 on these platforms,
and the only measurable regression was 3% in 520.omnetpp_r on ppc, which
went away after inserting a single nop at the beginning of
cDynamicExpression::evaluate.

OK for master?

---

PR tree-optimization/49749 introduced code that shortens dependency
chains containing loop accumulators by placing them last on operand
lists of associative operations.

The 456.hmmer benchmark on s390 could benefit from this.  However, the code
that needs it modifies the loop accumulator before using it, and since only
so-called loop-carried phis are treated as loop accumulators, the
code in its present form doesn't really help.  According to Bill
Schmidt, the original author, such a conservative approach was chosen
so as to avoid unnecessarily swapping operands, which might cause
unpredictable effects.  However, giving special treatment to forms of
loop accumulators is acceptable.

The definition of loop-carried phi is: it's a single-use phi, which is
used in the same innermost loop it's defined in, at least one argument
of which is defined in the same innermost loop as the phi itself.
Given this, it seems natural to treat single uses of such phis as phis
themselves.
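
To make the distinction concrete, here is a minimal C sketch.  The
function names are hypothetical and the code is not taken from the patch
or from reassoc.h; it only illustrates the two shapes described above.
In sum_plain the accumulator feeds the addition chain directly, while in
sum_modified its only use inside the loop is a modification (a shift),
and the addition chain sees the modified value instead.

```c
#include <assert.h>
#include <stddef.h>

/* Plain accumulator: `acc` itself is an operand of the addition chain.
   Placing it last, as in (a[i] + b[i]) + acc, keeps a[i] + b[i] off the
   loop-carried dependency chain.  */
unsigned
sum_plain (const unsigned *a, const unsigned *b, size_t n)
{
  assert (a != NULL && b != NULL);
  unsigned acc = 0;
  for (size_t i = 0; i < n; i++)
    acc = acc + a[i] + b[i];
  return acc;
}

/* Modified accumulator: the only use of `acc` inside the loop is the
   shift, so the addition chain has `t` as an operand rather than the
   accumulator itself.  */
unsigned
sum_modified (const unsigned *a, const unsigned *b, size_t n)
{
  assert (a != NULL && b != NULL);
  unsigned acc = 0;
  for (size_t i = 0; i < n; i++)
    {
      unsigned t = acc >> 1;   /* single use of the accumulator */
      acc = t + a[i] + b[i];
    }
  return acc;
}
```

Whether a given compiler version reorders either chain depends on the
pass pipeline and target; the sketch only shows why `t` in the second
function is a natural candidate for the same treatment as `acc` in the
first.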

gcc/ChangeLog:

2020-05-06  Ilya Leoshkevich  

* passes.def (pass_reassoc): Rename parameter to early_p.
* tree-ssa-reassoc.c (reassoc_bias_loop_carried_phi_ranks_p):
New variable.
(phi_rank): Don't bias loop-carried phi ranks
before vectorization pass.
(loop_carried_phi): Remove (superseded by
operand_rank::biased_p).
(propagate_rank): Propagate bias along single uses.
(get_rank): Pass stmt to propagate_rank.
(execute_reassoc): Add bias_loop_carried_phi_ranks_p parameter.
(pass_reassoc::pass_reassoc): Add bias_loop_carried_phi_ranks_p
initializer.
(pass_reassoc::set_param): Set bias_loop_carried_phi_ranks_p
value.
(pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p to
execute_reassoc.
(pass_reassoc::bias_loop_carried_phi_ranks_p): New member.

gcc/testsuite/ChangeLog:

2020-05-06  Ilya Leoshkevich  

* gcc.target/s390/reassoc-1.c: New test.
* gcc.target/s390/reassoc-2.c: New test.
* gcc.target/s390/reassoc-3.c: New test.
* gcc.target/s390/reassoc.h: New test.
---
 gcc/passes.def|  4 +-
 gcc/testsuite/gcc.target/s390/reassoc-1.c |  6 ++
 gcc/testsuite/gcc.target/s390/reassoc-2.c |  7 ++
 gcc/testsuite/gcc.target/s390/reassoc-3.c |  8 ++
 gcc/testsuite/gcc.target/s390/reassoc.h   | 22 +
 gcc/tree-ssa-reassoc.c| 97 ++-
 6 files changed, 105 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/reassoc-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/reassoc.h

diff --git a/gcc/passes.def b/gcc/passes.def
index 2b1e09fdda3..6864f583f20 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -235,7 +235,7 @@ along with GCC; see the file COPYING3.  If not see
 program and isolate those paths.  */
   NEXT_PASS (pass_isolate_erroneous_paths);
   NEXT_PASS (pass_dse);
-  NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
+  NEXT_PASS (pass_reassoc, true /* early_p */);
   NEXT_PASS (pass_dce);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt, false /* early_p */);
@@ -312,7 +312,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_lower_vector_ssa);
   NEXT_PASS (pass_lower_switch);
   NEXT_PASS (pass_cse_reciprocals);
-  NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
+  NEXT_PASS (pass_reassoc, false /* early_p */);
   NEXT_PASS (pass_strength_reduction);
   NEXT_PASS (pass_split_paths);
   NEXT_PASS (pass_tracer);
diff --git a/gcc/testsuite/gcc.target/s390/reassoc-1.c 
b/gcc/testsuite/gcc.target/s390/reassoc-1.c
new file mode 100644
index 000..8343f1cd4b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/reassoc-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "reassoc.h"
+
+/* { dg-final { scan-assembler 
{(?n)\n\tl\t(%r\d+),.+(\n.*)*\n\ta\t(\1),.+(\n.*)*\n\tar\t(%r\d+),(\1)} } } */
diff --git a/gcc/testsuite/gcc.target/s390/reassoc-2.c 
b/gcc/testsuite/gcc.target/s390/reassoc-2.c
new file mode 100644
index 000..5e393ed4937
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/reassoc-2.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#define MODIFY
+#include "reassoc.h"
+
+/* { dg-final { scan-assembler 
{(?n)\n\tl\t(%r\d+),.+(\n.*)*\n\ta\t(\1),.+(\n.*)*\n\tar\t(%r\d+),(\1)} } } */
diff --git a/gcc/testsuite/gcc.target/s390/reassoc-3.c 
b/gcc/testsuite/gcc.t

Re: [PATCH 1/7 v5] ifn/optabs: Support vector load/store with length

2020-06-23 Thread Jim Wilson
On Tue, Jun 23, 2020 at 5:21 AM Richard Sandiford
 wrote:
> MVE and Power both set inactive lanes to zero.  But I'm not sure about RVV.
> AIUI, for RVV the approach instead would be to reduce the effective vector
> length for the final iteration of the vector loop, and I'm not sure
> whether in that situation it makes sense to say that the other elements
> still exist and are guaranteed to be zero.
>
> I'm the last person who should be speculating on that though.  Let's see
> whether Jim has any comments.

The RVV spec supports two policies for tail elements, i.e. elements
beyond the current vector length.  They can be undisturbed or
agnostic.  In the undisturbed case, the tail elements retain their
old values.  In the agnostic case, the implementation can choose to
either retain their old values, or set them to all ones, and this
choice can be different from lane to lane.  The latter case is useful
because registers may be wider than the execution unit, and current
vector length may not be a multiple of the width of the execution
unit.  So for instance if the vector registers can hold 8 elements,
and the execution unit works on 4 elements at a time, and the current
vector length is 2, then it might make sense to leave the last four
elements unmodified to avoid an iteration across the registers, but
the third and fourth elements might be set to all ones because you
have to write to them anyways.  The choice is left up to the
implementation because we have multiple parties designing vector
units; some target the low-cost embedded market, some target
high performance, and they couldn't agree on a single best
way to implement this.  Software is expected to choose agnostic
only if it doesn't care about what happens to tail elements, and
undisturbed if it wants to preserve them.  The value of all ones was
chosen to discourage software developers from trying to use the values
in tail elements.  The choice of undisturbed or agnostic can be
changed every time you set the current vector length and type.
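
The two policies can be sketched with a small software model in C.  This
is not RVV intrinsics code; the names model_vadd, tail_policy and
tail_value_allowed are made up for illustration, and the model always
picks "all ones" for agnostic tails even though real hardware may keep
the old value on a per-element basis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum tail_policy { TAIL_UNDISTURBED, TAIL_AGNOSTIC };

/* Model an element-wise add on a register holding VLMAX 32-bit elements,
   of which only the first VL are active.  */
void
model_vadd (uint32_t *dst, const uint32_t *a, const uint32_t *b,
            size_t vl, size_t vlmax, enum tail_policy policy)
{
  assert (vl <= vlmax);
  for (size_t i = 0; i < vl; i++)
    dst[i] = a[i] + b[i];

  /* Tail elements: undisturbed leaves them alone.  For agnostic, this
     model always writes all ones; an implementation could equally keep
     the old value, and could choose differently for each element.  */
  if (policy == TAIL_AGNOSTIC)
    for (size_t i = vl; i < vlmax; i++)
      dst[i] = UINT32_MAX;
}

/* What software may assume about one tail element after the operation:
   undisturbed guarantees the old value, agnostic allows either the old
   value or all ones.  */
int
tail_value_allowed (uint32_t old_val, uint32_t new_val,
                    enum tail_policy policy)
{
  if (policy == TAIL_UNDISTURBED)
    return new_val == old_val;
  return new_val == old_val || new_val == UINT32_MAX;
}
```

With a destination of {9, 9, 9, 9}, vl = 2 and vlmax = 4, the undisturbed
policy leaves the last two elements at 9, while this model's agnostic
policy sets them to all ones.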

In most cases, I think RVV programs will use agnostic for tail
elements, since we can change the vector length at will, and it will
be rare that we will care about elements beyond the current vector
length.

Tail elements can't cause exceptions so there is no need to worry
about whether those elements hold valid values.

Jim


[RFC/PATCH] IFN: Fix mask_{load,store} optab support macros

2020-06-23 Thread Kewen.Lin via Gcc-patches
Hi,

While working on IFNs for vectors with length, I noticed that the
current optab support query for mask_load/mask_store looks unexpected.
mask_load/mask_store require two modes for the convert_optab query,
but the macros direct_mask_{load,store}_optab_supported_p use
direct_optab_supported_p, which asserts that the type pair has the same mode.

I'm not sure whether there is some special reason for this or it is just a
typo.  Everything works today because the mask_{load,store} optab check is
mainly handled by can_vec_mask_load_store_p.

But if we have code like the following (e.g. a single check covering all IFNs)

  tree_pair types = direct_internal_fn_types (ifn, call);
  if (direct_internal_fn_supported_p (ifn, types, OPTIMIZE_FOR_SPEED)) ...

it will cause an ICE.
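
The mismatch can be illustrated with a toy model in C.  Every name below
(toy_mode, toy_type_pair, toy_handler_exists and the two query functions)
is a hypothetical stand-in invented for this sketch, not GCC's actual
definitions; the only point is that a query keyed on one mode cannot
answer a question about an operation keyed on two.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy modes: one for the data vector, one for the mask.  */
enum toy_mode { TOY_V4SI, TOY_V4BI };

struct toy_type_pair { enum toy_mode first, second; };

/* Stand-in for a handler table keyed on (data mode, mask mode).  */
bool
toy_handler_exists (enum toy_mode data, enum toy_mode mask)
{
  return data == TOY_V4SI && mask == TOY_V4BI;
}

/* Precondition of the direct-style query: both types share one mode.  */
bool
toy_direct_precondition_p (struct toy_type_pair t)
{
  return t.first == t.second;
}

/* Direct-style query: asserts the precondition, then looks up a single
   mode.  A masked load/store pair violates the assertion.  */
bool
toy_direct_supported_p (struct toy_type_pair t)
{
  assert (toy_direct_precondition_p (t));
  return toy_handler_exists (t.first, t.first);
}

/* Convert-style query: looks up both modes, no same-mode assumption.  */
bool
toy_convert_supported_p (struct toy_type_pair t)
{
  return toy_handler_exists (t.first, t.second);
}
```

For the pair {TOY_V4SI, TOY_V4BI} the direct-style precondition does not
hold, so calling toy_direct_supported_p would trip its assertion, whereas
toy_convert_supported_p simply reports that a handler exists.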

Does it make sense to fix it?

Thanks in advance!

BR,
Kewen
-
gcc/ChangeLog:

* internal-fn.c (direct_mask_load_optab_supported_p): Use
convert_optab_supported_p instead of direct_optab_supported_p.
(direct_mask_store_optab_supported_p): Likewise.

-
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index f9e851069a5..1e53ced60eb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3142,12 +3142,12 @@ multi_vector_optab_supported_p (convert_optab optab, 
tree_pair types,
 #define direct_cond_unary_optab_supported_p direct_optab_supported_p
 #define direct_cond_binary_optab_supported_p direct_optab_supported_p
 #define direct_cond_ternary_optab_supported_p direct_optab_supported_p
-#define direct_mask_load_optab_supported_p direct_optab_supported_p
+#define direct_mask_load_optab_supported_p convert_optab_supported_p
 #define direct_load_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_load_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_gather_load_optab_supported_p convert_optab_supported_p
 #define direct_len_load_optab_supported_p direct_optab_supported_p
-#define direct_mask_store_optab_supported_p direct_optab_supported_p
+#define direct_mask_store_optab_supported_p convert_optab_supported_p
 #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_store_lanes_optab_supported_p 
multi_vector_optab_supported_p
 #define direct_scatter_store_optab_supported_p convert_optab_supported_p


Re: PING^2: V5 [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

2020-06-23 Thread Martin Liška

On 6/16/20 7:11 PM, H.J. Lu wrote:

On Tue, Jun 9, 2020 at 9:35 AM H.J. Lu  wrote:


On Tue, May 26, 2020 at 6:27 AM Martin Liška  wrote:


On 5/26/20 1:59 PM, H.J. Lu wrote:

On Tue, May 26, 2020 at 2:30 AM Martin Liška  wrote:


On 5/25/20 7:42 PM, H.J. Lu wrote:

Here is the updated patch.  OK for master?


Thank you for the updated patch.

I still have a few nits:

1) I would make all the:


+  has_sse3 = has_feature (FEATURE_SSE3);


a macro. The local variable seems to be superfluous.


Done.


Thanks!




2) can we automatically deduce option name:


+  ISA_NAMES_TABLE_ENTRY("rdpid", FEATURE_RDPID, P_ZERO, "-mrdpid")
+  ISA_NAMES_TABLE_ENTRY("rdrnd", FEATURE_RDRND, P_ZERO, "-mrdrnd")


I mean "-m" + "rdrnd" == "-mrdrnd" ?


The new option field serves 2 purposes:

1. Not all features have a corresponding command-line option

ISA_NAMES_TABLE_ENTRY("cmov", FEATURE_CMOV, P_ZERO, NULL)

   for (i = 0; i < ARRAY_SIZE (isa_names_table); i++)
  if (isa_names_table[i].option)

2. Some features have a different name in the command-line option.

ISA_NAMES_TABLE_ENTRY("fxsave", FEATURE_FXSAVE, P_ZERO, "-mfxsr")


I noticed that.  One could theoretically use "" for an option that does not
have a flag, and NULL for those whose option equals "-m" + name.
Anyway, that's a nit.

I support the patch!
Martin



Here is the updated patch.   OK for master?

Thanks.





PING:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546522.html



PING.



@Honza: Can you please take a look at the patch?

Martin