Re: [PATCH] C++: target attribute - local decl

2021-03-08 Thread Martin Liška

On 3/4/21 9:54 PM, Jason Merrill wrote:

On 3/4/21 10:52 AM, Martin Liška wrote:

On 3/4/21 4:45 PM, Jason Merrill wrote:


Sure, I guess you do need to set that flag for the local decls, but that should 
be all.

Jason


Doing that also fails :/
This time likely due to how we set RECORD argument of maybe_version_functions 
function:

gcc/cp/decl.c:    && maybe_version_functions (newdecl, olddecl,
gcc/cp/decl.c-    (!DECL_FUNCTION_VERSIONED 
(newdecl)
gcc/cp/decl.c- || !DECL_FUNCTION_VERSIONED 
(olddecl


That is odd.

The other problem is that DECL_LOCAL_DECL_ALIAS isn't always set before we get 
to maybe_version_functions; it's set further down in do_pushdecl.  We need it 
to be set or things break.


Oh, I see.



This seems to work for me, what do you think?


Seems good to me, please apply the patch.



BTW, your patch was corrupted by the mailer, so I had to apply it by hand.


I know, I made a quick dirty copy&paste ... sorry.

Thanks for help,
Martin



[GCC 10][committed] aarch64: Add internal tune flag to minimise VL-based scalar ops

2021-03-08 Thread Kyrylo Tkachov via Gcc-patches
This is a backport of the cse_sve_vl_constants tuning param to GCC 10.
It's had a couple of trivial merge conflicts to resolve.

Bootstrapped and tested on the branch on aarch64-none-linux-gnu.
Pushing to the branch.

Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-tuning-flags.def (cse_sve_vl_constants):
Define.
* config/aarch64/aarch64.md (add3): Force CONST_POLY_INT 
immediates
into a register when the above is enabled.
* config/aarch64/aarch64.c (neoversev1_tunings):
AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS.
(aarch64_rtx_costs): Use AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS.

gcc/testsuite/

* gcc.target/aarch64/sve/cse_sve_vl_constants_1.c: New test.


cse-vl-10.patch
Description: cse-vl-10.patch


Re: [PATCH v2] fix Ada bootstrap on Cygwin64 (PR bootstrap/94918)

2021-03-08 Thread Richard Biener via Gcc-patches
On Sun, Jan 10, 2021 at 2:05 PM Mikael Pettersson via Gcc-patches
 wrote:
>
> On Sun, Jan 10, 2021 at 11:57 AM Arnaud Charlet  wrote:
> >
> > > This fixes a compilation error preventing bootstrap with Ada on
> > > x86_64-pc-cygwin. See PR bootstrap/94918 for details.
> > >
> > > Compared to the initial patch sent in May 2020, this v2 patch places
> > > the fix in Ada's raise-gcc.c instead of the shared unwind-generic.h,
> > > which should hopefully simplify getting it applied.
> >
> > Not sure why. Applying it there looks incomplete and kludgy, don't you
> > agree?
>
> Ok, then here's v3 which places the fix in unwind-generic.h. The fix
> matches what Ada's mingw32.h does.
>
> Tested by bootstrapping this and the preliminary workaround for
> PR98590 on x86_64-pc-cygwin.
>
> Ok for master and branches?

I wonder why we include  from this file at all,
and why it is not included from {t,}system.h instead which
is where system header specific fixups should be made
(and this one could be avoided because system headers
are included _before_ poisoning anything).

The include was added by Tristan, well - probably just merged by him.

Eric?

> libgcc/
>
> 2021-01-10  Mikael Pettersson  
>
> PR bootstrap/94918
> * unwind-generic.h (__SEH__): Prevent windows.h from including
> x86intrin.h or emmintrin.h on Cygwin64.
>
> --- gcc-11-20210103/libgcc/unwind-generic.h.~1~ 2021-01-03
> 23:32:20.0 +0100
> +++ gcc-11-20210103/libgcc/unwind-generic.h 2021-01-09
> 14:51:16.262378715 +0100
> @@ -30,6 +30,12 @@
>
>  #if defined (__SEH__) && !defined (__USING_SJLJ_EXCEPTIONS__)
>  /* Only for _GCC_specific_handler.  */
> +#if defined (__CYGWIN__) && !defined (__CYGWIN32__) && !defined (IN_RTS)
> +/* Note: windows.h on cygwin-64 includes x86intrin.h which uses malloc.
> +   That fails to compile, if malloc is poisoned, i.e. if !IN_RTS.  */
> +#define _X86INTRIN_H_INCLUDED
> +#define _EMMINTRIN_H_INCLUDED
> +#endif
>  #include 
>  #endif


[PATCH] aarch64: Fix SVE ACLE builtins with LTO [PR99216]

2021-03-08 Thread Alex Coplan via Gcc-patches
Hi all,

As discussed in the PR, we currently have two different numbering
schemes for SVE builtins: one for C, and one for C++. This is
problematic for LTO, where we end up getting confused about which
intrinsic we're talking about. This patch inserts placeholders into the
registered_functions vector to ensure that there is a consistent
numbering scheme for both C and C++.

Initially I tried using a null decl for these placeholders, but this
trips up some validation when the builtin trees are streamed into lto1,
so I ended up building dummy FUNCTION_DECLs as placeholders.

To get better test coverage here, I'm wondering if we could make some of
the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
For now I've just added the specific testcase from the PR.

Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

---

gcc/ChangeLog:

PR target/99216
* config/aarch64/aarch64-sve-builtins.cc
(function_builder::add_function): Add placeholder_p argument, build
placeholder decls if this is set.
(function_builder::add_unique_function): Instead of conditionally adding
direct overloads, unconditionally add either a direct overload or a
placeholder.
(function_builder::add_overloaded_function): Set placeholder_p if we're
using C++ overloads.
(function_builder::add_overloaded_functions): Don't return early for
m_direct_overloads: we need to add placeholders.
* config/aarch64/aarch64-sve-builtins.h
(function_builder::add_function): Add placeholder_p argument.

gcc/testsuite/ChangeLog:

PR target/99216
* g++.target/aarch64/sve/pr99216.C: New test.
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 25612d2ea52..c0ba352599c 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -999,12 +999,25 @@ registered_function &
 function_builder::add_function (const function_instance &instance,
const char *name, tree fntype, tree attrs,
uint64_t required_extensions,
-   bool overloaded_p)
+   bool overloaded_p,
+   bool placeholder_p)
 {
   unsigned int code = vec_safe_length (registered_functions);
   code = (code << AARCH64_BUILTIN_SHIFT) | AARCH64_BUILTIN_SVE;
-  tree decl = simulate_builtin_function_decl (input_location, name, fntype,
- code, NULL, attrs);
+
+  /* We need to be able to generate placeholders to enusre that we have a
+ consistent numbering scheme for function codes between the C and C++
+ frontends, so that everything ties up in LTO.
+
+ The placeholder needs to be non-NULL and some node other than
+ error_mark_node to pass validation when the builtin is streamed in to 
lto1.
+
+ It's also convenient if the placeholder can store the name, so build a
+ FUNCTION_DECL, but don't register it with the frontend.  */
+  tree decl = placeholder_p
+? build_decl (input_location, FUNCTION_DECL, get_identifier (name), fntype)
+: simulate_builtin_function_decl (input_location, name, fntype,
+ code, NULL, attrs);
 
   registered_function &rfn = *ggc_alloc  ();
   rfn.instance = instance;
@@ -1036,7 +1049,7 @@ function_builder::add_unique_function (const 
function_instance &instance,
   argument_types.address ());
   tree attrs = get_attributes (instance);
   registered_function &rfn = add_function (instance, name, fntype, attrs,
-  required_extensions, false);
+  required_extensions, false, false);
 
   /* Enter the function into the hash table.  */
   hashval_t hash = instance.hash ();
@@ -1047,16 +1060,14 @@ function_builder::add_unique_function (const 
function_instance &instance,
 
   /* Also add the function under its overloaded alias, if we want
  a separate decl for each instance of an overloaded function.  */
-  if (m_direct_overloads || force_direct_overloads)
+  char *overload_name = get_name (instance, true);
+  if (strcmp (name, overload_name) != 0)
 {
-  char *overload_name = get_name (instance, true);
-  if (strcmp (name, overload_name) != 0)
-   {
- /* Attribute lists shouldn't be shared.  */
- tree attrs = get_attributes (instance);
- add_function (instance, overload_name, fntype, attrs,
-   required_extensions, false);
-   }
+  /* Attribute lists shouldn't be shared.  */
+  tree attrs = get_attributes (instance);
+  bool placeholder_p = !(m_direct_overloads || force_direct_overloads);
+  add_function (instance, overload_name, fntype, attrs,
+   required_extensions, false, p

Re: [PATCH] skip testing time before epoch on mips

2021-03-08 Thread Jonathan Wakely via Gcc-patches

On 08/03/21 14:35 +0800, Chen Li wrote:


When execute libstdc++ testcases on mips, I notice that last_write_time
alawys failed, and the failed VERIFY is  "VERIFY(
approx_equal(last_write_time(f.path), time) );" where testing time before
than epoch.

Below is the minimal case:

```
// gcc a.c
int main()
{
   struct timespec times[2] = {{1, UTIME_OMIT}, {-1201, 98500}};
   utimensat(AT_FDCWD, "test", times, 0);

}
```

$ touch test && gcc a.c && ./a.out && stat test
 File: test
 Size: 0   Blocks: 0  IO Block: 4096   regular empty file
Device: 805h/2053d  Inode: 1056841 Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  deepin)   Gid: ( 1000/  deepin)
Access: 2021-03-08 13:52:55.966354501 +0800
Modify: 2106-02-07 14:08:15.98500 +0800
Change: 2021-03-08 13:52:56.907782193 +0800
Birth: -

Undoubtedly, mtime's type is unsigned somewhere on mips.

After debuging kernel, it turns out that mtime is always -1201 in
ext4_setattr, cp_new_stat, newlstat and etc, so the problem should not
occur in kernel space.

go back to user space via copy_to_user, I finally found mips used
"unsigned int st_mtime_sec;" in struct kernel_stat, which is used to
receive -1201 from kernel.


I can't reproduce this on a mips64 machine:

jwakely@erpro8-fsf2:~$ uname -a
Linux erpro8-fsf2 4.1.4 #1 SMP PREEMPT Mon Aug 3 14:22:54 PDT 2015 mips64 
GNU/Linux
jwakely@erpro8-fsf2:~$ apt list libc6
Listing... Done
libc6/oldstable,now 2.19-18+deb8u10 mipsel [installed]
jwakely@erpro8-fsf2:~$ touch test && ./a.out && TZ= stat test
  File: ‘test’
  Size: 0   Blocks: 0  IO Block: 131072 regular empty file
Device: 21h/33d Inode: 36596524Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1049/ jwakely)   Gid: ( 1049/ jwakely)
Access: 2021-03-08 10:52:06.855991946 +
Modify: 1969-12-31 23:39:59.98500 +
Change: 2021-03-08 10:52:06.859992051 +
 Birth: -

This is an ext4 filesystem.

Do I need to compile a 64-bit executable? 


In any case, shouldn't this be fixed in glibc to return EINVAL instead
of setting a bogus time? That would make the std::filesystem library
report an error.



Maybe sparc also suffers from this problem, but I have no machine to
verify.


Your test case works correctly for me on sparc-linux (both 32-bit and
64-bit on ext4 and xfs filesystems):

$ uname -a
Linux gcc202 5.10.0-4-sparc64-smp #1 SMP Debian 5.10.19-1 (2021-03-02) sparc64 
GNU/Linux
$ apt list libc6
Listing... Done
libc6/unstable,now 2.31-4 sparc64 [installed]
$ ./a.out
$ TZ= stat test
  File: test
  Size: 0   Blocks: 0  IO Block: 8192   regular empty file
Device: fd01h/64769dInode: 657291843   Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1049/ jwakely)   Gid: ( 1049/ jwakely)
Access: 2021-03-08 10:25:43.593819539 +
Modify: 1969-12-31 23:39:59.98500 +
Change: 2021-03-08 10:34:13.711287778 +
 Birth: 2021-03-08 10:24:27.106598699 +



[PATCH] target/98856 - split vpinsrq with new peephole2

2021-03-08 Thread Richard Biener
This reduces the latency of a V2DImode construction from two
GPRs by avoiding the dependence on the GPR->XMM move with the
used vpinsrq instruction and instead allow the two GPR->XMM moves
to be concurrently executed and scheduled, performing the insert
using vpunpcklqdq.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK for trunk or do we want to defer this to GCC 12, maybe
unless we can also solve the spilling in PR98856 which would
then fix the performance regression?

2021-03-05  Richard Biener  

PR target/98856
* config/i386/sse.md (vpinsrq peephole): New peephole2
splitting vpinsrq to a vmovq and vpunpcklqdq.

* gcc.target/i386/pr98856.c: New testcase.
* gcc.target/i386/avx512dq-concatv2di-1.c: Adjust.
* gcc.target/i386/avx512vl-concatv2di-1.c: Likewise.
---
 gcc/config/i386/sse.md| 15 +++
 .../gcc.target/i386/avx512dq-concatv2di-1.c   |  4 +--
 .../gcc.target/i386/avx512vl-concatv2di-1.c   |  2 +-
 gcc/testsuite/gcc.target/i386/pr98856.c   | 25 +++
 4 files changed, 43 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr98856.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index ca4372d4164..7c9be80540b 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1427,6 +1427,21 @@
   DONE;
 })
 
+;; Further split pinsrq variants of vec_concatv2di to hide the latency
+;; the GPR->XMM transition(s).
+(define_peephole2
+  [(match_scratch:DI 3 "Yv")
+   (set (match_operand:V2DI 0 "sse_reg_operand")
+   (vec_concat:V2DI (match_operand:DI 1 "sse_reg_operand")
+(match_operand:DI 2 "nonimmediate_gr_operand")))]
+  "TARGET_64BIT && TARGET_SSE4_1
+   && !optimize_insn_for_size_p ()"
+  [(set (match_dup 3)
+(match_dup 2))
+   (set (match_dup 0)
+   (vec_concat:V2DI (match_dup 1)
+(match_dup 3)))])
+
 ;; Merge movsd/movhpd to movupd for TARGET_SSE_UNALIGNED_LOAD_OPTIMAL targets.
 (define_peephole2
   [(set (match_operand:V2DF 0 "sse_reg_operand")
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-concatv2di-1.c 
b/gcc/testsuite/gcc.target/i386/avx512dq-concatv2di-1.c
index 82cb402575b..ac652bb1382 100644
--- a/gcc/testsuite/gcc.target/i386/avx512dq-concatv2di-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512dq-concatv2di-1.c
@@ -14,7 +14,7 @@ f1 (long long x, long long y)
   asm volatile ("" : "+v" (c));
 }
 
-/* { dg-final { scan-assembler 
"vpinsrq\[^\n\r]*\\\$1\[^\n\r]*%rsi\[^\n\r]*%xmm16\[^\n\r]*%xmm17" } } */
+/* { dg-final { scan-assembler "vpunpcklqdq\[^\n\r]*%xmm16\[^\n\r]*%xmm17" } } 
*/
 
 void
 f2 (long long x, long long *y)
@@ -27,7 +27,7 @@ f2 (long long x, long long *y)
   asm volatile ("" : "+v" (c));
 }
 
-/* { dg-final { scan-assembler 
"vpinsrq\[^\n\r]*\\\$1\[^\n\r]*%\[re]si\[^\n\r]*%xmm18\[^\n\r]*%xmm19" } } */
+/* { dg-final { scan-assembler "vpunpcklqdq\[^\n\r]*%xmm18\[^\n\r]*%xmm19" } } 
*/
 
 void
 f3 (long long x)
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-concatv2di-1.c 
b/gcc/testsuite/gcc.target/i386/avx512vl-concatv2di-1.c
index 8e637071aa2..b8300371a21 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-concatv2di-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-concatv2di-1.c
@@ -28,7 +28,7 @@ f2 (long long x, long long *y)
   asm volatile ("" : "+v" (c));
 }
 
-/* { dg-final { scan-assembler 
"vmovhps\[^\n\r]*%\[re]si\[^\n\r]*%xmm18\[^\n\r]*%xmm19" } } */
+/* { dg-final { scan-assembler "vpunpcklqdq\[^\n\r]*%xmm18\[^\n\r]*%xmm19" } } 
*/
 
 void
 f3 (long long x)
diff --git a/gcc/testsuite/gcc.target/i386/pr98856.c 
b/gcc/testsuite/gcc.target/i386/pr98856.c
new file mode 100644
index 000..1ea24d0f1fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98856.c
@@ -0,0 +1,25 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O3 -march=znver2" } */
+
+typedef __UINT64_TYPE__ uint64_t;
+void poly_double_le2 (unsigned char *out, const unsigned char *in)
+{
+  uint64_t W[2];
+
+  __builtin_memcpy (&W, in, 16);
+  uint64_t carry = (W[1] >> 63) * 135;
+  W[1] = (W[1] << 1) ^ (W[0] >> 63);
+  W[0] = (W[0] << 1) ^ carry;
+  __builtin_memcpy (out, &W[0], 8);
+  __builtin_memcpy (out + 8, &W[1], 8);
+}
+
+/* We should split 
+ vpinsrq $1, %rax, %xmm0, %xmm0
+   to
+ vmovq %rax, %xmm1
+ vpunpcklqdq %xmm0, %xmm1, %xmm0
+   to better hide the latency of the GPR->XMM transitions.  */
+
+/* { dg-final { scan-assembler-not "pinsrq" } } */
+/* { dg-final { scan-assembler-times "punpcklqdq" 1 } } */
-- 
2.26.2


Re: [PATCH v2] fix Ada bootstrap on Cygwin64 (PR bootstrap/94918)

2021-03-08 Thread Eric Botcazou
> I wonder why we include  from this file at all,
> and why it is not included from {t,}system.h instead which
> is where system header specific fixups should be made
> (and this one could be avoided because system headers
> are included _before_ poisoning anything).

 is the Mother of All Things on the platform so you don't want to 
include it liberally (although it can be tamed e.g. with WIN32_LEAN_AND_MEAN).
Therefore including it from tsystem.h might be worse than the actual disease.

Mikael, can you work around the problem by adding

#ifdef __CYGWIN__
#include "mingw32.h"
#endif

at the appropriate spot in raise-gcc.c instead?

-- 
Eric Botcazou




Re: [PATCH] skip testing time before epoch on mips

2021-03-08 Thread YunQiang Su via Gcc-patches
Jonathan Wakely via Gcc-patches  于2021年3月8日周一 下午6:57写道:
>
> On 08/03/21 14:35 +0800, Chen Li wrote:
> >
> >When execute libstdc++ testcases on mips, I notice that last_write_time
> >alawys failed, and the failed VERIFY is  "VERIFY(
> >approx_equal(last_write_time(f.path), time) );" where testing time before
> >than epoch.
> >
> >Below is the minimal case:
> >
> >```
> >// gcc a.c
> >int main()
> >{
> >struct timespec times[2] = {{1, UTIME_OMIT}, {-1201, 98500}};
> >utimensat(AT_FDCWD, "test", times, 0);
> >
> >}
> >```
> >
> >$ touch test && gcc a.c && ./a.out && stat test
> >  File: test
> >  Size: 0   Blocks: 0  IO Block: 4096   regular empty 
> > file
> >Device: 805h/2053d  Inode: 1056841 Links: 1
> >Access: (0644/-rw-r--r--)  Uid: ( 1000/  deepin)   Gid: ( 1000/  deepin)
> >Access: 2021-03-08 13:52:55.966354501 +0800
> >Modify: 2106-02-07 14:08:15.98500 +0800
> >Change: 2021-03-08 13:52:56.907782193 +0800
> > Birth: -
> >
> >Undoubtedly, mtime's type is unsigned somewhere on mips.
> >
> >After debuging kernel, it turns out that mtime is always -1201 in
> >ext4_setattr, cp_new_stat, newlstat and etc, so the problem should not
> >occur in kernel space.
> >
> >go back to user space via copy_to_user, I finally found mips used
> >"unsigned int st_mtime_sec;" in struct kernel_stat, which is used to
> >receive -1201 from kernel.
>
> I can't reproduce this on a mips64 machine:
>
> jwakely@erpro8-fsf2:~$ uname -a
> Linux erpro8-fsf2 4.1.4 #1 SMP PREEMPT Mon Aug 3 14:22:54 PDT 2015 mips64 
> GNU/Linux
> jwakely@erpro8-fsf2:~$ apt list libc6
> Listing... Done
> libc6/oldstable,now 2.19-18+deb8u10 mipsel [installed]

it is mips32, not mips64. Here mips64 we mean userland, not the kernel.

> jwakely@erpro8-fsf2:~$ touch test && ./a.out && TZ= stat test
>File: ‘test’
>Size: 0   Blocks: 0  IO Block: 131072 regular empty 
> file
> Device: 21h/33d Inode: 36596524Links: 1
> Access: (0644/-rw-r--r--)  Uid: ( 1049/ jwakely)   Gid: ( 1049/ jwakely)
> Access: 2021-03-08 10:52:06.855991946 +
> Modify: 1969-12-31 23:39:59.98500 +
> Change: 2021-03-08 10:52:06.859992051 +
>   Birth: -
>
> This is an ext4 filesystem.
>
> Do I need to compile a 64-bit executable?
>

Yes. you need 64bit userland.

> In any case, shouldn't this be fixed in glibc to return EINVAL instead
> of setting a bogus time? That would make the std::filesystem library
> report an error.
>

Since kernel has introduced statx(2), so we can use statx to implemente stat(3).

>
> >Maybe sparc also suffers from this problem, but I have no machine to
> >verify.
>
> Your test case works correctly for me on sparc-linux (both 32-bit and
> 64-bit on ext4 and xfs filesystems):
>
> $ uname -a
> Linux gcc202 5.10.0-4-sparc64-smp #1 SMP Debian 5.10.19-1 (2021-03-02) 
> sparc64 GNU/Linux
> $ apt list libc6
> Listing... Done
> libc6/unstable,now 2.31-4 sparc64 [installed]
> $ ./a.out
> $ TZ= stat test
>File: test
>Size: 0   Blocks: 0  IO Block: 8192   regular empty 
> file
> Device: fd01h/64769dInode: 657291843   Links: 1
> Access: (0644/-rw-r--r--)  Uid: ( 1049/ jwakely)   Gid: ( 1049/ jwakely)
> Access: 2021-03-08 10:25:43.593819539 +
> Modify: 1969-12-31 23:39:59.98500 +
> Change: 2021-03-08 10:34:13.711287778 +
>   Birth: 2021-03-08 10:24:27.106598699 +
>


-- 
YunQiang Su


Re: [PATCH] target/98856 - split vpinsrq with new peephole2

2021-03-08 Thread Jakub Jelinek via Gcc-patches
On Mon, Mar 08, 2021 at 12:04:22PM +0100, Richard Biener wrote:
> +;; Further split pinsrq variants of vec_concatv2di to hide the latency
> +;; the GPR->XMM transition(s).
> +(define_peephole2
> +  [(match_scratch:DI 3 "Yv")
> +   (set (match_operand:V2DI 0 "sse_reg_operand")
> + (vec_concat:V2DI (match_operand:DI 1 "sse_reg_operand")
> +  (match_operand:DI 2 "nonimmediate_gr_operand")))]
> +  "TARGET_64BIT && TARGET_SSE4_1
> +   && !optimize_insn_for_size_p ()"
> +  [(set (match_dup 3)
> +(match_dup 2))
> +   (set (match_dup 0)
> + (vec_concat:V2DI (match_dup 1)
> +  (match_dup 3)))])

Do we really want to do it for all vpinsrqs and not just those where
operands[1] is set from a nonimmediate_gr_operand a few instructions
earlier (or perhaps e.g. other insertions from GPRs)?
I mean, whether this is a win should depend on the latency of the
operands[1] setter if it is not too far from the vec_concat, if it has low
latency, this will only grow code without benefit, if it has high latency
it indeed could perform the GRP -> XMM move concurrently with the previous
operation.
Hardcoding the operands[1] setter in the peephole2 would mean we couldn't
match some small number of unrelated insns in between, but perhaps the
peephole2 condition could just call a function that walks the IL backward a
little and checks where the setter is and what latency it has?

Jakub



Re: [Patch] tree-nested: Update assert for Fortran module vars [PR97927]

2021-03-08 Thread Tobias Burnus

On 08.03.21 08:45, Richard Biener wrote:

On Fri, 5 Mar 2021, Tobias Burnus wrote:

Nested functions are permitted for C but not C++ as extension.
They are also permitted for Fortran, which generates DECL_CONTEXT
== NAMESPACE_DECL for module variables.

That causes the gcc_assert (decl_function_context (decl) == info->context)
to fail in tree-nested.c's lookup_field_for_decl. [...]

I think the bug is elsewhere.  We're not expecting non-local
(non-auto) variables to be queried with lookup_field_for_decl. [...]


Now changed by doing the
  'decl_function_context (decl) == info->context'
check in the caller's condition, which matches the rest of the file.

Thanks for the suggestion.

OK for mainline? GCC 10?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
tree-nested: Update assert for Fortran module vars [PR97927]
gcc/ChangeLog:

	PR fortran/97927
	* tree-nested.c (convert_local_reference_stmt): Avoid calling
	lookup_field_for_decl for Fortran module (= namespace context).

gcc/testsuite/ChangeLog:

	PR fortran/97927
	* gfortran.dg/module_variable_3.f90: New test.

 gcc/testsuite/gfortran.dg/module_variable_3.f90 | 37 +
 gcc/tree-nested.c   |  1 +
 2 files changed, 38 insertions(+)

diff --git a/gcc/testsuite/gfortran.dg/module_variable_3.f90 b/gcc/testsuite/gfortran.dg/module_variable_3.f90
new file mode 100644
index 000..0dae6d5bdd5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/module_variable_3.f90
@@ -0,0 +1,37 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/97927
+!
+! Did ICE due to the in tree-nested.c due to {clobber}
+!
+
+module mpi2
+ interface
+   subroutine MPI_Allreduce(i)
+ implicit none
+ INTEGER, OPTIONAL, INTENT(OUT) :: i
+   end subroutine MPI_Allreduce
+ end interface
+end module
+
+module modmpi
+  implicit none
+  integer ierror  ! module variable = context NAMESPACE_DECL
+end module
+
+subroutine exxengy
+  use modmpi
+  use mpi2, only: mpi_allreduce
+  implicit none
+
+  ! intent(out) implies: ierror = {clobber}
+  call mpi_allreduce(ierror)
+
+contains
+  subroutine zrho2
+return
+  end subroutine
+end subroutine
+
+! { dg-final { scan-tree-dump "ierror = {CLOBBER};" "original" } }
diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index addd6eef9ab..cea917a4d58 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -2526,6 +2526,7 @@ convert_local_reference_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p,
 	{
 	  tree lhs = gimple_assign_lhs (stmt);
 	  if (DECL_P (lhs)
+	  && decl_function_context (lhs) == info->context
 	  && !use_pointer_in_frame (lhs)
 	  && lookup_field_for_decl (info, lhs, NO_INSERT))
 	{



[PATCH] i386: Enable UINTR and HRESET for -march that supports it

2021-03-08 Thread Martin Liška

Hello.

The patch fixes missing features for -march targets that support
PTA_UINTR and PTA_HRESET.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR target/99463
* config/i386/i386-options.c (ix86_option_override_internal):
Enable UINTR and HRESET for -march that supports it.

gcc/testsuite/ChangeLog:

PR target/99463
* gcc.target/i386/pr99463-2.c: New test.
* gcc.target/i386/pr99463.c: New test.
---
 gcc/config/i386/i386-options.c| 7 +++
 gcc/testsuite/gcc.target/i386/pr99463-2.c | 5 +
 gcc/testsuite/gcc.target/i386/pr99463.c   | 5 +
 3 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99463-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99463.c

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index cdeabbfca4b..410fa0cc436 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2354,6 +2354,13 @@ ix86_option_override_internal (bool main_args_p,
if (((processor_alias_table[i].flags & PTA_PKU) != 0)
&& !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_PKU))
  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_PKU;
+   if (((processor_alias_table[i].flags & PTA_UINTR) != 0)
+   && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_UINTR))
+ opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_UINTR;
+   if (((processor_alias_table[i].flags & PTA_HRESET) != 0)
+   && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_HRESET))
+ opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_HRESET;
+
 
 	/* Don't enable x87 instructions if only general registers are

   allowed by target("general-regs-only") function attribute or
diff --git a/gcc/testsuite/gcc.target/i386/pr99463-2.c 
b/gcc/testsuite/gcc.target/i386/pr99463-2.c
new file mode 100644
index 000..017ca959510
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99463-2.c
@@ -0,0 +1,5 @@
+/* PR target/99463 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -march=sapphirerapids" } */
+
+#include "uintr-1.c"
diff --git a/gcc/testsuite/gcc.target/i386/pr99463.c 
b/gcc/testsuite/gcc.target/i386/pr99463.c
new file mode 100644
index 000..0b290924118
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99463.c
@@ -0,0 +1,5 @@
+/* PR target/99463 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=alderlake" } */
+
+#include "hreset-1.c"
--
2.30.1



[Committed] IBM Z: Fix vcond-shift.c testcase.

2021-03-08 Thread Andreas Krebbel via Gcc-patches
Due to a common code change the comparison in the testcase is emitted
via vec_cmp instead of vcond.  The testcase checks for an optimization
currently only available via vcond.

Fixed by implementing the same optimization also in
s390_expand_vec_compare.

Bootstrapped and regression tested on s390x with -march=z15

This fixes the following testsuite fails:

< FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-not vzero\\t*
< FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times 
vesrab\\t%v.?,%v.?,7 6
< FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times 
vesraf\\t%v.?,%v.?,31 6
< FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times 
vesrah\\t%v.?,%v.?,15 6

gcc/ChangeLog:

* config/s390/s390.c (s390_expand_vec_compare): Implement <0
comparison with arithmetic right shift.
(s390_expand_vcond): No need for a force_reg anymore.
s390_vec_compare will do it.
* config/s390/vector.md ("vec_cmp"): Accept also
immediate operands.
---
 gcc/config/s390/s390.c| 20 +++-
 gcc/config/s390/vector.md |  2 +-
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index f3d0d1ba596..c9aea21fe40 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6569,6 +6569,7 @@ s390_expand_vec_compare (rtx target, enum rtx_code cond,
 
   if (GET_MODE_CLASS (GET_MODE (cmp_op1)) == MODE_VECTOR_FLOAT)
 {
+  cmp_op2 = force_reg (GET_MODE (cmp_op1), cmp_op2);
   switch (cond)
{
  /* NE a != b -> !(a == b) */
@@ -6607,6 +6608,19 @@ s390_expand_vec_compare (rtx target, enum rtx_code cond,
 }
   else
 {
+  /* Turn x < 0 into x >> (bits per element - 1)  */
+  if (cond == LT && cmp_op2 == CONST0_RTX (mode))
+   {
+ int shift = GET_MODE_BITSIZE (GET_MODE_INNER (mode)) - 1;
+ rtx res = expand_simple_binop (mode, ASHIFTRT, cmp_op1,
+GEN_INT (shift), target,
+0, OPTAB_DIRECT);
+ if (res != target)
+   emit_move_insn (target, res);
+ return;
+   }
+  cmp_op2 = force_reg (GET_MODE (cmp_op1), cmp_op2);
+
   switch (cond)
{
  /* NE: a != b -> !(a == b) */
@@ -6824,11 +6838,7 @@ s390_expand_vcond (rtx target, rtx then, rtx els,
   if (!REG_P (cmp_op1))
 cmp_op1 = force_reg (GET_MODE (cmp_op1), cmp_op1);
 
-  if (!REG_P (cmp_op2))
-cmp_op2 = force_reg (GET_MODE (cmp_op2), cmp_op2);
-
-  s390_expand_vec_compare (result_target, cond,
-  cmp_op1, cmp_op2);
+  s390_expand_vec_compare (result_target, cond, cmp_op1, cmp_op2);
 
   /* If the results are supposed to be either -1 or 0 we are done
  since this is what our compare instructions generate anyway.  */
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index bc52211c55e..c80d582a300 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -1589,7 +1589,7 @@ (define_expand "vec_cmp"
   [(set (match_operand:  0 "register_operand" "")
(match_operator: 1 "vcond_comparison_operator"
  [(match_operand:V_HW 2 "register_operand" "")
-  (match_operand:V_HW 3 "register_operand" "")]))]
+  (match_operand:V_HW 3 "nonmemory_operand" "")]))]
   "TARGET_VX"
 {
   s390_expand_vec_compare (operands[0], GET_CODE(operands[1]), operands[2], 
operands[3]);
-- 
2.29.2



Re: [PATCH] target/98856 - split vpinsrq with new peephole2

2021-03-08 Thread Richard Biener
On Mon, 8 Mar 2021, Jakub Jelinek wrote:

> On Mon, Mar 08, 2021 at 12:04:22PM +0100, Richard Biener wrote:
> > +;; Further split pinsrq variants of vec_concatv2di to hide the latency
> > +;; the GPR->XMM transition(s).
> > +(define_peephole2
> > +  [(match_scratch:DI 3 "Yv")
> > +   (set (match_operand:V2DI 0 "sse_reg_operand")
> > +   (vec_concat:V2DI (match_operand:DI 1 "sse_reg_operand")
> > +(match_operand:DI 2 "nonimmediate_gr_operand")))]
> > +  "TARGET_64BIT && TARGET_SSE4_1
> > +   && !optimize_insn_for_size_p ()"
> > +  [(set (match_dup 3)
> > +(match_dup 2))
> > +   (set (match_dup 0)
> > +   (vec_concat:V2DI (match_dup 1)
> > +(match_dup 3)))])
> 
> Do we really want to do it for all vpinsrqs and not just those where
> operands[1] is set from a nonimmediate_gr_operand a few instructions
> earlier (or perhaps e.g. other insertions from GPRs)?
> I mean, whether this is a win should depend on the latency of the
> operands[1] setter if it is not too far from the vec_concat, if it has low
> latency, this will only grow code without benefit, if it has high latency
> it indeed could perform the GRP -> XMM move concurrently with the previous
> operation.

In principle even a single high-latency op (GPR or MEM) is worth splitting
if scheduling can move it enough.  What "enough" is of course depends
on the CPU and its OOO issue capabilities.  It's not exactly clear
what the uops are but I suspect it's MEM load or GPR->xmm move, which
would mean the dependency on the first slow op should be mitigated by
OOO issueing as well.

OK, so I threw the following at IACA:

.L3:
movl $111, %ebx
.byte 0x64, 0x67, 0x90
#movq%rdi, %xmm0
#   vpinsrq $1, %rax, %xmm0, %xmm0
#movq%rdi, %xmm1
#   vpinsrq $1, %rax, %xmm1, %xmm1
movq%rdi, %xmm0
movq%rax, %xmm3
vpunpcklqdq %xmm3, %xmm0, %xmm0
movq%rdi, %xmm1
movq%rax, %xmm4
vpunpcklqdq %xmm4, %xmm1, %xmm1
vpaddq  %xmm0, %xmm2, %xmm2
vpaddq  %xmm1, %xmm2, %xmm2
addl$1, %edx
cmpl%edx, %eax
movl $222, %ebx
.byte 0x64, 0x67, 0x90
jne .L3

and the commented vs. the uncommented sequences make no difference in its
analysis of the loop throughput (but there's no work to be issued
between the slow moves and the unpack) - IACA also doesn't seem to
OOO execute here (the first vpaddq is only issued after the 2nd unpck).

Going back to the Botan case there's no speed difference between the
following which has the spilling removed and the vpinsr removed:

vmovdqu (%rsi), %xmm4
movq8(%rsi), %rdx
shrq$63, %rdx
imulq   $135, %rdx, %rdi
movq(%rsi), %rdx
vmovq   %rdi, %xmm0
vpsllq  $1, %xmm4, %xmm1
shrq$63, %rdx
vmovq   %rdx, %xmm5
vpunpcklqdq %xmm0, %xmm5, %xmm0
vpxor   %xmm1, %xmm0, %xmm0
vmovdqu %xmm0, (%rax)

to the variant with just the spilling removed:

vmovdqu (%rsi), %xmm4
movq8(%rsi), %rdx
shrq$63, %rdx
imulq   $135, %rdx, %rdi
movq(%rsi), %rdx
vmovq   %rdi, %xmm0
vpsllq  $1, %xmm4, %xmm1
shrq$63, %rdx
vpinsrq $1, %rdx, %xmm0, %xmm0
vpxor   %xmm1, %xmm0, %xmm0
vmovdqu %xmm0, (%rax)

so it looks I was barking at the wrong tree here and OOO issueing
works fine here (the patch might still make a difference on in-order
micro architectures or in situations not tested).

Thus I'll drop it for now.

That said, the spilling issue is way harder to address I fear.  For
reference, this was the original:

vmovdqu (%rsi), %xmm4
vmovdqa %xmm4, 16(%rsp)
movq24(%rsp), %rdx
vmovdqa 16(%rsp), %xmm5
shrq$63, %rdx
imulq   $135, %rdx, %rdi
movq16(%rsp), %rdx
vmovq   %rdi, %xmm0
vpsllq  $1, %xmm5, %xmm1
shrq$63, %rdx
vpinsrq $1, %rdx, %xmm0, %xmm0
vpxor   %xmm1, %xmm0, %xmm0
vmovdqu %xmm0, (%rax)
jmp .L53

and the important part is to do the loads to %rdx/%rdi via
(%rsi) and not the spill slot.  Performing more "clever"
reloads from %xmm4 like via

vmovdqu (%rsi), %xmm4
vmovhlps %xmm4, %xmm4, %xmm5
movq %xmm5, %rdx
vmovdqa 16(%rsp), %xmm5
shrq$63, %rdx
imulq   $135, %rdx, %rdi
movq%xmm4, %rdx
vmovq   %rdi, %xmm0
vpsllq  $1, %xmm4, %xmm1
shrq$63, %rdx
vpinsrq $1, %rdx, %xmm0, %xmm0
vpxor   %xmm1, %xmm0, %xmm0
vmovdqu %xmm0, (%rax)
jmp .L53

only improves things marginally.

Richard.

> Hardcoding the operands[1] setter in the peephole2 would mean we couldn't
> match some small number of unrelated insns in between, but perhaps the
> peephole2 condition could just call a function that walks the IL backward a
> litt

Re: [Patch] tree-nested: Update assert for Fortran module vars [PR97927]

2021-03-08 Thread Richard Biener
On Mon, 8 Mar 2021, Tobias Burnus wrote:

> On 08.03.21 08:45, Richard Biener wrote:
> > On Fri, 5 Mar 2021, Tobias Burnus wrote:
> >> Nested functions are permitted for C but not C++ as extension.
> >> They are also permitted for Fortran, which generates DECL_CONTEXT
> >> == NAMESPACE_DECL for module variables.
> >>
> >> That causes the gcc_assert (decl_function_context (decl) == info->context)
> >> to fail in tree-nested.c's lookup_field_for_decl. [...]
> > I think the bug is elsewhere.  We're not expecting non-local
> > (non-auto) variables to be queried with lookup_field_for_decl. [...]
> 
> Now changed by doing the
>   'decl_function_context (decl) == info->context'
> check in the caller's condition, which matches the rest of the file.
> 
> Thanks for the suggestion.
> 
> OK for mainline? GCC 10?

OK for trunk and GCC 10 after a while.

Richard.

> Tobias
> 
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank
> Thürauf
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[PATCH] i386: Properly set ix86_isa_flags

2021-03-08 Thread Martin Liška

Hi.

The bug is about usage of ix86_isa_flags instead of opts->x_ix86_isa_flags.
I'm planning a bigger refactoring regarding the flags & PTA_PCLMUL conditions.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR target/99464
* config/i386/i386-options.c (ix86_option_override_internal):
Set isa_flags for OPTS argument and not for the global
global_options.

gcc/testsuite/ChangeLog:

PR target/99464
* gcc.target/i386/pr99464.c: New test.
---
 gcc/config/i386/i386-options.c  |  8 
 gcc/testsuite/gcc.target/i386/pr99464.c | 15 +++
 2 files changed, 19 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99464.c

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index cdeabbfca4b..982c335e1fa 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2159,11 +2159,11 @@ ix86_option_override_internal (bool main_args_p,
&& !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_MOVBE))
  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MOVBE;
if (((processor_alias_table[i].flags & PTA_AES) != 0)
-   && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_AES))
- ix86_isa_flags |= OPTION_MASK_ISA_AES;
+   && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_AES))
+ opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AES;
if (((processor_alias_table[i].flags & PTA_SHA) != 0)
-   && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_SHA))
- ix86_isa_flags |= OPTION_MASK_ISA_SHA;
+   && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_SHA))
+ opts->x_ix86_isa_flags |= OPTION_MASK_ISA_SHA;
if (((processor_alias_table[i].flags & PTA_PCLMUL) != 0)
&& !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_PCLMUL))
  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_PCLMUL;
diff --git a/gcc/testsuite/gcc.target/i386/pr99464.c 
b/gcc/testsuite/gcc.target/i386/pr99464.c
new file mode 100644
index 000..98dd938973e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99464.c
@@ -0,0 +1,15 @@
+/* PR target/99464 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#pragma GCC target("arch=cannonlake")
+
+#include 
+
+volatile __m128i x;
+
+void extern
+sha_test (void)
+{
+  x = _mm_sha1msg1_epu32 (x, x);
+}
--
2.30.1



Re: [PATCH] i386: Enable UINTR and HRESET for -march that supports it

2021-03-08 Thread Uros Bizjak via Gcc-patches
On Mon, Mar 8, 2021 at 1:45 PM Martin Liška  wrote:
>
> Hello.
>
> The patch fixes missing features for -march targets that support
> PTA_UINTR and PTA_HRESET.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> PR target/99463
> * config/i386/i386-options.c (ix86_option_override_internal):
> Enable UINTR and HRESET for -march that supports it.
>
> gcc/testsuite/ChangeLog:
>
> PR target/99463
> * gcc.target/i386/pr99463-2.c: New test.
> * gcc.target/i386/pr99463.c: New test.

OK, but looking at i386.h, similar handing for several other PTA_*
flags is missing.

Thanks,
Uros.

> ---
>   gcc/config/i386/i386-options.c| 7 +++
>   gcc/testsuite/gcc.target/i386/pr99463-2.c | 5 +
>   gcc/testsuite/gcc.target/i386/pr99463.c   | 5 +
>   3 files changed, 17 insertions(+)
>   create mode 100644 gcc/testsuite/gcc.target/i386/pr99463-2.c
>   create mode 100644 gcc/testsuite/gcc.target/i386/pr99463.c
>
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index cdeabbfca4b..410fa0cc436 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -2354,6 +2354,13 @@ ix86_option_override_internal (bool main_args_p,
> if (((processor_alias_table[i].flags & PTA_PKU) != 0)
> && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_PKU))
>   opts->x_ix86_isa_flags |= OPTION_MASK_ISA_PKU;
> +   if (((processor_alias_table[i].flags & PTA_UINTR) != 0)
> +   && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_UINTR))
> + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_UINTR;
> +   if (((processor_alias_table[i].flags & PTA_HRESET) != 0)
> +   && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_HRESET))
> + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_HRESET;
> +
>
> /* Don't enable x87 instructions if only general registers are
>allowed by target("general-regs-only") function attribute or
> diff --git a/gcc/testsuite/gcc.target/i386/pr99463-2.c 
> b/gcc/testsuite/gcc.target/i386/pr99463-2.c
> new file mode 100644
> index 000..017ca959510
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr99463-2.c
> @@ -0,0 +1,5 @@
> +/* PR target/99463 */
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2 -march=sapphirerapids" } */
> +
> +#include "uintr-1.c"
> diff --git a/gcc/testsuite/gcc.target/i386/pr99463.c 
> b/gcc/testsuite/gcc.target/i386/pr99463.c
> new file mode 100644
> index 000..0b290924118
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr99463.c
> @@ -0,0 +1,5 @@
> +/* PR target/99463 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=alderlake" } */
> +
> +#include "hreset-1.c"
> --
> 2.30.1
>


Re: [PATCH] i386: Enable UINTR and HRESET for -march that supports it

2021-03-08 Thread Martin Liška

On 3/8/21 2:15 PM, Uros Bizjak wrote:

OK, but looking at i386.h, similar handing for several other PTA_*
flags is missing.


Thanks.

Well, I think all other should be covered. I wrote a comparison script and
difference is following after my patch:

{'PTA_PREFETCH_SSE', 'PTA_64BIT', 'PTA_NO_80387', 'PTA_NO_TUNE'}

Which should be fine.
Martin


[PATCH, OG10, OpenMP, committed] Support A->B expressions in map clause (C front-end)

2021-03-08 Thread Chung-Lin Tang

This patch is a merge of parts from:
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562467.html

and devel/omp/gcc-10 commit 36a1eb, which was a modified merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558975.html

to provide the equivalent front-end patches for support "map(A->B)"
clauses for the C front-end (only the C++ front-end received such
changes before). Some associated middle-end changes are also in
this patch.

Tested without regressions, and pushed to devel/omp/gcc-10.

Chung-Lin
From 08caada8efd8f35db634647bbda6091fb667b00d Mon Sep 17 00:00:00 2001
From: Chung-Lin Tang 
Date: Mon, 8 Mar 2021 15:56:52 +0800
Subject: [PATCH] Arrow operator handling for C front-end in OpenMP map clauses

This patch merges some of the equivalent changes already done for the C++
front-end to the C parts.

2021-03-08  Chung-Lin Tang  

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_clause_map): Set 'allow_deref' argument in
call to c_parser_omp_variable_list to 'true'.
* c-typeck.c (handle_omp_array_sections_1): Add strip of MEM_REF in
array base handling.
(c_finish_omp_clauses): Handle 'A->member' case in map clauses.

gcc/ChangeLog:

* gimplify.c (gimplify_scan_omp_clauses): Add MEM_REF case when
handling component_ref_p case. Add unshare_expr and gimplification
when created GOMP_MAP_STRUCT is not a DECL. Add code to add
firstprivate pointer for *pointer-to-struct case.

gcc/testsuite/ChangeLog:

* gcc.dg/gomp/target-3.c: New test.
---
 gcc/c/c-parser.c |  3 +-
 gcc/c/c-typeck.c | 22 +++
 gcc/gimplify.c   | 41 ++--
 gcc/testsuite/gcc.dg/gomp/target-3.c | 16 +++
 4 files changed, 79 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/gomp/target-3.c

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index fae597128e9..0a6aee439f6 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -15700,7 +15700,8 @@ c_parser_omp_clause_map (c_parser *parser, tree list)
}
 }
 
-  nl = c_parser_omp_variable_list (parser, clause_loc, OMP_CLAUSE_MAP, list);
+  nl = c_parser_omp_variable_list (parser, clause_loc, OMP_CLAUSE_MAP, list,
+  C_ORT_OMP, true);
 
   for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
 OMP_CLAUSE_SET_MAP_KIND (c, kind);
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 6af19766324..7c887a80ce9 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -12917,6 +12917,12 @@ handle_omp_array_sections_1 (tree c, tree t, vec 
&types,
  return error_mark_node;
}
  t = TREE_OPERAND (t, 0);
+ if ((ort == C_ORT_ACC || ort == C_ORT_OMP)
+ && TREE_CODE (t) == MEM_REF)
+   {
+ t = TREE_OPERAND (t, 0);
+ STRIP_NOPS (t);
+   }
  if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
{
  if (maybe_ne (mem_ref_offset (t), 0))
@@ -13778,6 +13784,7 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
   tree ordered_clause = NULL_TREE;
   tree schedule_clause = NULL_TREE;
   bool oacc_async = false;
+  bool indir_component_ref_p = false;
   tree last_iterators = NULL_TREE;
   bool last_iterators_remove = false;
   tree *nogroup_seen = NULL;
@@ -14505,6 +14512,11 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
{
  while (TREE_CODE (t) == COMPONENT_REF)
t = TREE_OPERAND (t, 0);
+ if (TREE_CODE (t) == MEM_REF)
+   {
+ t = TREE_OPERAND (t, 0);
+ STRIP_NOPS (t);
+   }
  if (bitmap_bit_p (&map_field_head, DECL_UID (t)))
break;
  if (bitmap_bit_p (&map_head, DECL_UID (t)))
@@ -14561,6 +14573,15 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
   bias) to zero here, so it is not set erroneously to the pointer
   size later on in gimplify.c.  */
OMP_CLAUSE_SIZE (c) = size_zero_node;
+ indir_component_ref_p = false;
+ if ((ort == C_ORT_ACC || ort == C_ORT_OMP)
+ && TREE_CODE (t) == COMPONENT_REF
+ && TREE_CODE (TREE_OPERAND (t, 0)) == MEM_REF)
+   {
+ t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
+ indir_component_ref_p = true;
+ STRIP_NOPS (t);
+   }
  if (TREE_CODE (t) == COMPONENT_REF
  && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_)
{
@@ -14633,6 +14654,7 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
  else if ((OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP
|| (OMP_CLAUSE_MAP_KIND (c)
!= GOM

Re: [PATCH] aarch64: Fix SVE ACLE builtins with LTO [PR99216]

2021-03-08 Thread Richard Biener via Gcc-patches
On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches
 wrote:
>
> Hi all,
>
> As discussed in the PR, we currently have two different numbering
> schemes for SVE builtins: one for C, and one for C++. This is
> problematic for LTO, where we end up getting confused about which
> intrinsic we're talking about. This patch inserts placeholders into the
> registered_functions vector to ensure that there is a consistent
> numbering scheme for both C and C++.
>
> Initially I tried using a null decl for these placeholders, but this
> trips up some validation when the builtin trees are streamed into lto1,

Care to elaborate?  Note we stream builtins by their function code
and thus expect to be able to materialize them later - materializing
to NULL obviously fails but then the issue is the proper decl isn't there
in that case.  Materializing a "dummy" decl just will delay the inevitable
breakage.

> so I ended up building dummy FUNCTION_DECLs as placeholders.
>
> To get better test coverage here, I'm wondering if we could make some of
> the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> For now I've just added the specific testcase from the PR.
>
> Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> ---
>
> gcc/ChangeLog:
>
> PR target/99216
> * config/aarch64/aarch64-sve-builtins.cc
> (function_builder::add_function): Add placeholder_p argument, build
> placeholder decls if this is set.
> (function_builder::add_unique_function): Instead of conditionally 
> adding
> direct overloads, unconditionally add either a direct overload or a
> placeholder.
> (function_builder::add_overloaded_function): Set placeholder_p if 
> we're
> using C++ overloads.
> (function_builder::add_overloaded_functions): Don't return early for
> m_direct_overloads: we need to add placeholders.
> * config/aarch64/aarch64-sve-builtins.h
> (function_builder::add_function): Add placeholder_p argument.
>
> gcc/testsuite/ChangeLog:
>
> PR target/99216
> * g++.target/aarch64/sve/pr99216.C: New test.


Re: [PATCH] aarch64: Fix SVE ACLE builtins with LTO [PR99216]

2021-03-08 Thread Richard Sandiford via Gcc-patches
Alex Coplan  writes:
> Hi all,
>
> As discussed in the PR, we currently have two different numbering
> schemes for SVE builtins: one for C, and one for C++. This is
> problematic for LTO, where we end up getting confused about which
> intrinsic we're talking about. This patch inserts placeholders into the
> registered_functions vector to ensure that there is a consistent
> numbering scheme for both C and C++.
>
> Initially I tried using a null decl for these placeholders, but this
> trips up some validation when the builtin trees are streamed into lto1,
> so I ended up building dummy FUNCTION_DECLs as placeholders.
>
> To get better test coverage here, I'm wondering if we could make some of
> the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> For now I've just added the specific testcase from the PR.
>
> Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?

OK, thanks.  I think we should backport this to GCC 10, but as Richi says,
that would need a bump in lto-streamer.h:LTO_minor_version.

Thanks,
Richard

>
> Thanks,
> Alex
>
> ---
>
> gcc/ChangeLog:
>
>   PR target/99216
>   * config/aarch64/aarch64-sve-builtins.cc
>   (function_builder::add_function): Add placeholder_p argument, build
>   placeholder decls if this is set.
>   (function_builder::add_unique_function): Instead of conditionally adding
>   direct overloads, unconditionally add either a direct overload or a
>   placeholder.
>   (function_builder::add_overloaded_function): Set placeholder_p if we're
>   using C++ overloads.
>   (function_builder::add_overloaded_functions): Don't return early for
>   m_direct_overloads: we need to add placeholders.
>   * config/aarch64/aarch64-sve-builtins.h
>   (function_builder::add_function): Add placeholder_p argument.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/99216
>   * g++.target/aarch64/sve/pr99216.C: New test.
>
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
> b/gcc/config/aarch64/aarch64-sve-builtins.cc
> index 25612d2ea52..c0ba352599c 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> @@ -999,12 +999,25 @@ registered_function &
>  function_builder::add_function (const function_instance &instance,
>   const char *name, tree fntype, tree attrs,
>   uint64_t required_extensions,
> - bool overloaded_p)
> + bool overloaded_p,
> + bool placeholder_p)
>  {
>unsigned int code = vec_safe_length (registered_functions);
>code = (code << AARCH64_BUILTIN_SHIFT) | AARCH64_BUILTIN_SVE;
> -  tree decl = simulate_builtin_function_decl (input_location, name, fntype,
> -   code, NULL, attrs);
> +
> +  /* We need to be able to generate placeholders to enusre that we have a
> + consistent numbering scheme for function codes between the C and C++
> + frontends, so that everything ties up in LTO.
> +
> + The placeholder needs to be non-NULL and some node other than
> + error_mark_node to pass validation when the builtin is streamed in to 
> lto1.
> +
> + It's also convenient if the placeholder can store the name, so build a
> + FUNCTION_DECL, but don't register it with the frontend.  */
> +  tree decl = placeholder_p
> +? build_decl (input_location, FUNCTION_DECL, get_identifier (name), 
> fntype)
> +: simulate_builtin_function_decl (input_location, name, fntype,
> +   code, NULL, attrs);
>  
>registered_function &rfn = *ggc_alloc  ();
>rfn.instance = instance;
> @@ -1036,7 +1049,7 @@ function_builder::add_unique_function (const 
> function_instance &instance,
>  argument_types.address ());
>tree attrs = get_attributes (instance);
>registered_function &rfn = add_function (instance, name, fntype, attrs,
> -required_extensions, false);
> +required_extensions, false, false);
>  
>/* Enter the function into the hash table.  */
>hashval_t hash = instance.hash ();
> @@ -1047,16 +1060,14 @@ function_builder::add_unique_function (const 
> function_instance &instance,
>  
>/* Also add the function under its overloaded alias, if we want
>   a separate decl for each instance of an overloaded function.  */
> -  if (m_direct_overloads || force_direct_overloads)
> +  char *overload_name = get_name (instance, true);
> +  if (strcmp (name, overload_name) != 0)
>  {
> -  char *overload_name = get_name (instance, true);
> -  if (strcmp (name, overload_name) != 0)
> - {
> -   /* Attribute lists shouldn't be shared.  */
> -   tree attrs = get_attributes (instance);
> -   add_function (instance, overload_name, fntype, attrs,
> - requ

[committed] [PR99422] LRA: Skip modifiers when processing memory address.

2021-03-08 Thread Vladimir Makarov via Gcc-patches

The following patch solves

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99422

The patch was successfully bootstrapped and tested on ppc64le, x86-64 
and arm64.


commit 04b4828c6dd215385fde6964a5e13da8a01a78ba (HEAD -> master)
Author: Vladimir N. Makarov 
Date:   Mon Mar 8 09:24:57 2021 -0500

[PR99422] LRA: Skip modifiers when processing memory address.

  Function process_address_1 can wrongly look at constraint modifiers
instead of the 1st constraint itself.  The patch solves the problem.

gcc/ChangeLog:

PR target/99422
* lra-constraints.c (skip_contraint_modifiers): New function.
(process_address_1): Use it before lookup_constraint call.

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 9253690561a..76e3ff7efe6 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -3392,6 +3392,21 @@ equiv_address_substitution (struct address_info *ad)
   return change_p;
 }
 
+/* Skip all modifiers and whitespaces in constraint STR and return the
+   result.  */
+static const char *
+skip_contraint_modifiers (const char *str)
+{
+  for (;;str++)
+switch (*str)
+  {
+  case '+' : case '&' : case '=': case '*': case ' ': case '\t':
+  case '$': case '^' : case '%': case '?': case '!':
+	break;
+  default: return str;
+  }
+}
+
 /* Major function to make reloads for an address in operand NOP or
check its correctness (If CHECK_ONLY_P is true). The supported
cases are:
@@ -3426,8 +3441,8 @@ process_address_1 (int nop, bool check_only_p,
   HOST_WIDE_INT scale;
   rtx op = *curr_id->operand_loc[nop];
   rtx mem = extract_mem_from_operand (op);
-  const char *constraint = curr_static_id->operand[nop].constraint;
-  enum constraint_num cn = lookup_constraint (constraint);
+  const char *constraint;
+  enum constraint_num cn;
   bool change_p = false;
 
   if (MEM_P (mem)
@@ -3435,6 +3450,9 @@ process_address_1 (int nop, bool check_only_p,
   && GET_CODE (XEXP (mem, 0)) == SCRATCH)
 return false;
 
+  constraint
+= skip_contraint_modifiers (curr_static_id->operand[nop].constraint);
+  cn = lookup_constraint (constraint);
   if (insn_extra_address_constraint (cn)
   /* When we find an asm operand with an address constraint that
 	 doesn't satisfy address_operand to begin with, we clear


[PATCH][pushed] libsanitizer: cherry-pick ad294e572bc5c16f9dc420cc994322de6ca3fbfb

2021-03-08 Thread Martin Liška

I'm going to push one libsanitizer commit. And I'm adding a test-case for it.

Martin

libsanitizer/ChangeLog:

PR sanitizer/98920
* asan/asan_interceptors.cpp (COMMON_INTERCEPT_FUNCTION_VER):
Cherry pick.
(COMMON_INTERCEPT_FUNCTION_VER_UNVERSIONED_FALLBACK): Likewise.
* asan/asan_interceptors.h 
(ASAN_INTERCEPT_FUNC_VER_UNVERSIONED_FALLBACK): Likewise.
* sanitizer_common/sanitizer_common_interceptors.inc
(COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN): Likewise.
(INIT_REGEX): Likewise.
* tsan/tsan_interceptors_posix.cpp 
(COMMON_INTERCEPT_FUNCTION_VER_UNVERSIONED_FALLBACK):
Likewise.

gcc/testsuite/ChangeLog:

PR sanitizer/98920
* c-c++-common/asan/pr98920.c: New test.
---
 gcc/testsuite/c-c++-common/asan/pr98920.c | 24 +++
 libsanitizer/asan/asan_interceptors.cpp   |  5 +++-
 libsanitizer/asan/asan_interceptors.h |  7 ++
 .../sanitizer_common_interceptors.inc | 19 ++-
 libsanitizer/tsan/tsan_interceptors_posix.cpp |  2 ++
 5 files changed, 55 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/pr98920.c

diff --git a/gcc/testsuite/c-c++-common/asan/pr98920.c 
b/gcc/testsuite/c-c++-common/asan/pr98920.c
new file mode 100644
index 000..881d3d4901e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/pr98920.c
@@ -0,0 +1,24 @@
+/* PR sanitizer/98920 */
+/* { dg-do run } */
+
+#include 
+#include 
+#include 
+
+int main(void)
+{
+regex_t r;
+const char s[] = "ban\0ana";
+regmatch_t pmatch[10];
+pmatch[0].rm_so = 0;
+pmatch[0].rm_eo = sizeof(s);
+if (regcomp(&r, "ana", 0))
+return 2;
+if (regexec(&r, s, sizeof(pmatch)/sizeof(pmatch[0]), pmatch, 
REG_STARTEND)) {
+fprintf(stderr, "failed to match\n");
+regfree(&r);
+return 3;
+}
+regfree(&r);
+return 0;
+}
diff --git a/libsanitizer/asan/asan_interceptors.cpp 
b/libsanitizer/asan/asan_interceptors.cpp
index b19cf25c7cd..4e68b3b0b47 100644
--- a/libsanitizer/asan/asan_interceptors.cpp
+++ b/libsanitizer/asan/asan_interceptors.cpp
@@ -90,8 +90,10 @@ DECLARE_REAL_AND_INTERCEPTOR(void, free, void *)
   (void) ctx;  
\
 
 #define COMMON_INTERCEPT_FUNCTION(name) ASAN_INTERCEPT_FUNC(name)

-#define COMMON_INTERCEPT_FUNCTION_VER(name, ver)  \
+#define COMMON_INTERCEPT_FUNCTION_VER(name, ver) \
   ASAN_INTERCEPT_FUNC_VER(name, ver)
+#define COMMON_INTERCEPT_FUNCTION_VER_UNVERSIONED_FALLBACK(name, ver) \
+  ASAN_INTERCEPT_FUNC_VER_UNVERSIONED_FALLBACK(name, ver)
 #define COMMON_INTERCEPTOR_WRITE_RANGE(ctx, ptr, size) \
   ASAN_WRITE_RANGE(ctx, ptr, size)
 #define COMMON_INTERCEPTOR_READ_RANGE(ctx, ptr, size) \
@@ -687,6 +689,7 @@ void InitializeAsanInterceptors() {
 
   // Intercept threading-related functions

 #if ASAN_INTERCEPT_PTHREAD_CREATE
+// TODO: this should probably have an unversioned fallback for newer arches?
 #if defined(ASAN_PTHREAD_CREATE_VERSION)
   ASAN_INTERCEPT_FUNC_VER(pthread_create, ASAN_PTHREAD_CREATE_VERSION);
 #else
diff --git a/libsanitizer/asan/asan_interceptors.h 
b/libsanitizer/asan/asan_interceptors.h
index 43cb4e3bb4f..56dc34b7d93 100644
--- a/libsanitizer/asan/asan_interceptors.h
+++ b/libsanitizer/asan/asan_interceptors.h
@@ -150,6 +150,13 @@ DECLARE_REAL(char*, strstr, const char *s1, const char *s2)
   VReport(1, "AddressSanitizer: failed to intercept '%s@@%s'\n", #name, \
   #ver);\
   } while (0)
+#define ASAN_INTERCEPT_FUNC_VER_UNVERSIONED_FALLBACK(name, ver)  \
+  do {   \
+if (!INTERCEPT_FUNCTION_VER(name, ver) && !INTERCEPT_FUNCTION(name)) \
+  VReport(1, "AddressSanitizer: failed to intercept '%s@@%s' or '%s'\n", \
+  #name, #ver, #name);   \
+  } while (0)
+
 #else
 // OS X interceptors don't need to be initialized with INTERCEPT_FUNCTION.
 #define ASAN_INTERCEPT_FUNC(name)
diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 
b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index 729eead43c0..2f2787e283a 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -239,6 +239,23 @@ extern const short *_tolower_tab_;
 COMMON_INTERCEPT_FUNCTION(fn)
 #endif
 
+#ifdef __GLIBC__

+// If we could not find the versioned symbol, fall back to an unversioned
+// lookup. This is needed to work around a GLibc bug that causes dlsym
+// with RTLD_NEXT to return the oldest versioned symbol.
+// See https://sourceware.org/bugzilla/show_bug.cgi?id=14932.
+// For certain symbols (e.g. regexec) we have to perform a versioned lookup,
+// but that versioned symbol wi

[PATCH] aarch64: Fix PR99437 - tighten shift predicates for narrowing shift patterns

2021-03-08 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

In this bug combine forms the (R)SHRN(2) instructions with an invalid shift 
amount.
The intrinsic expanders for these patterns validate the right shift amount but 
if the
final patterns end up being matched by combine (or other RTL passes I suppose) 
they
still let the wrong const_vector through.

This patch tightens up the predicates for the instructions involved by using 
predicates
for the right shift amount const_vectors.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

PR target/99437
* config/aarch64/predicates.md (aarch64_simd_shift_imm_vec_qi): Define.
(aarch64_simd_shift_imm_vec_hi): Likewise.
(aarch64_simd_shift_imm_vec_si): Likewise.
(aarch64_simd_shift_imm_vec_di): Likewise.
* config/aarch64/aarch64-simd.md (aarch64_shrn_insn_le): Use
predicate from above.
(aarch64_shrn_insn_be): Likewise.
(aarch64_rshrn_insn_le): Likewise.
(aarch64_rshrn_insn_be): Likewise.
(aarch64_shrn2_insn_le): Likewise.
(aarch64_shrn2_insn_be): Likewise.
(aarch64_rshrn2_insn_le): Likewise.
(aarch64_rshrn2_insn_be): Likewise.

gcc/testsuite/ChangeLog:

PR target/99437
* gcc.target/aarch64/simd/pr99437.c: New test.


shrn-preds.patch
Description: shrn-preds.patch


Re: [PATCH] aarch64: Fix SVE ACLE builtins with LTO [PR99216]

2021-03-08 Thread Alex Coplan via Gcc-patches
On 08/03/2021 14:57, Richard Biener wrote:
> On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches
>  wrote:
> >
> > Hi all,
> >
> > As discussed in the PR, we currently have two different numbering
> > schemes for SVE builtins: one for C, and one for C++. This is
> > problematic for LTO, where we end up getting confused about which
> > intrinsic we're talking about. This patch inserts placeholders into the
> > registered_functions vector to ensure that there is a consistent
> > numbering scheme for both C and C++.
> >
> > Initially I tried using a null decl for these placeholders, but this
> > trips up some validation when the builtin trees are streamed into lto1,
> 
> Care to elaborate?  Note we stream builtins by their function code
> and thus expect to be able to materialize them later - materializing
> to NULL obviously fails but then the issue is the proper decl isn't there
> in that case.  Materializing a "dummy" decl just will delay the inevitable
> breakage.

So the validation I was referring to is this part of
tree-streamer-in.c:unpack_ts_function_decl_value_fields:

else if (cl == BUILT_IN_MD)
{
  tree result = targetm.builtin_decl (fcode, true);
  if (!result || result == error_mark_node)
fatal_error (input_location, "target specific builtin not available");
}

Because we stream the original decl and reconstruct it, I was assuming
that we would be using this (real) node everywhere and we would never
need to map from function code to tree node. Indeed, grepping for
"targetm.builtin_decl", the only results are this use, plus a few uses
in the avr backend, so it seems that right now this doesn't matter for
AArch64 (unless I'm missing some other path).

However, it seems that returning a subtly incorrect result for this
taget hook is asking for trouble, and we should return the real node
here.

Richard, do you have a feeling for how we should do this? Perhaps
instantiate real nodes (instead of placeholders) if in_lto_p? Or
unconditionally?

Thanks,
Alex

> 
> > so I ended up building dummy FUNCTION_DECLs as placeholders.
> >
> > To get better test coverage here, I'm wondering if we could make some of
> > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> > For now I've just added the specific testcase from the PR.
> >
> > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
> >
> > Thanks,
> > Alex
> >
> > ---
> >
> > gcc/ChangeLog:
> >
> > PR target/99216
> > * config/aarch64/aarch64-sve-builtins.cc
> > (function_builder::add_function): Add placeholder_p argument, build
> > placeholder decls if this is set.
> > (function_builder::add_unique_function): Instead of conditionally 
> > adding
> > direct overloads, unconditionally add either a direct overload or a
> > placeholder.
> > (function_builder::add_overloaded_function): Set placeholder_p if 
> > we're
> > using C++ overloads.
> > (function_builder::add_overloaded_functions): Don't return early for
> > m_direct_overloads: we need to add placeholders.
> > * config/aarch64/aarch64-sve-builtins.h
> > (function_builder::add_function): Add placeholder_p argument.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/99216
> > * g++.target/aarch64/sve/pr99216.C: New test.


Re: [PATCH] aarch64: Fix SVE ACLE builtins with LTO [PR99216]

2021-03-08 Thread Richard Biener via Gcc-patches
On Mon, Mar 8, 2021 at 4:14 PM Alex Coplan  wrote:
>
> On 08/03/2021 14:57, Richard Biener wrote:
> > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches
> >  wrote:
> > >
> > > Hi all,
> > >
> > > As discussed in the PR, we currently have two different numbering
> > > schemes for SVE builtins: one for C, and one for C++. This is
> > > problematic for LTO, where we end up getting confused about which
> > > intrinsic we're talking about. This patch inserts placeholders into the
> > > registered_functions vector to ensure that there is a consistent
> > > numbering scheme for both C and C++.
> > >
> > > Initially I tried using a null decl for these placeholders, but this
> > > trips up some validation when the builtin trees are streamed into lto1,
> >
> > Care to elaborate?  Note we stream builtins by their function code
> > and thus expect to be able to materialize them later - materializing
> > to NULL obviously fails but then the issue is the proper decl isn't there
> > in that case.  Materializing a "dummy" decl just will delay the inevitable
> > breakage.
>
> So the validation I was referring to is this part of
> tree-streamer-in.c:unpack_ts_function_decl_value_fields:
>
> else if (cl == BUILT_IN_MD)
> {
>   tree result = targetm.builtin_decl (fcode, true);
>   if (!result || result == error_mark_node)
> fatal_error (input_location, "target specific builtin not available");
> }
>
> Because we stream the original decl and reconstruct it, I was assuming
> that we would be using this (real) node everywhere and we would never
> need to map from function code to tree node. Indeed, grepping for
> "targetm.builtin_decl", the only results are this use, plus a few uses
> in the avr backend, so it seems that right now this doesn't matter for
> AArch64 (unless I'm missing some other path).

Hmm, I thought we were streaming those by reference - indeed
this probably got changed when we stopped doing that for
the middle-end builtins which we did because those are often
"misrecognized".

> However, it seems that returning a subtly incorrect result for this
> taget hook is asking for trouble, and we should return the real node
> here.

Yeah, the intent of this target hook is exactly that.  There's now of course
the option to simply remove it if it just exists for the checking done above
(need to check the avr backend, of course).  I suppose the cost in the
LTO IR stream isn't so big thus we can safely ditch the optimization
forever?

> Richard, do you have a feeling for how we should do this? Perhaps
> instantiate real nodes (instead of placeholders) if in_lto_p? Or
> unconditionally?
>
> Thanks,
> Alex
>
> >
> > > so I ended up building dummy FUNCTION_DECLs as placeholders.
> > >
> > > To get better test coverage here, I'm wondering if we could make some of
> > > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> > > For now I've just added the specific testcase from the PR.
> > >
> > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
> > >
> > > Thanks,
> > > Alex
> > >
> > > ---
> > >
> > > gcc/ChangeLog:
> > >
> > > PR target/99216
> > > * config/aarch64/aarch64-sve-builtins.cc
> > > (function_builder::add_function): Add placeholder_p argument, 
> > > build
> > > placeholder decls if this is set.
> > > (function_builder::add_unique_function): Instead of conditionally 
> > > adding
> > > direct overloads, unconditionally add either a direct overload or 
> > > a
> > > placeholder.
> > > (function_builder::add_overloaded_function): Set placeholder_p if 
> > > we're
> > > using C++ overloads.
> > > (function_builder::add_overloaded_functions): Don't return early 
> > > for
> > > m_direct_overloads: we need to add placeholders.
> > > * config/aarch64/aarch64-sve-builtins.h
> > > (function_builder::add_function): Add placeholder_p argument.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR target/99216
> > > * g++.target/aarch64/sve/pr99216.C: New test.


Re: [PATCH] aarch64: Fix SVE ACLE builtins with LTO [PR99216]

2021-03-08 Thread Alex Coplan via Gcc-patches
On 08/03/2021 16:21, Richard Biener wrote:
> On Mon, Mar 8, 2021 at 4:14 PM Alex Coplan  wrote:
> >
> > On 08/03/2021 14:57, Richard Biener wrote:
> > > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches
> > >  wrote:
> > > >
> > > > Hi all,
> > > >
> > > > As discussed in the PR, we currently have two different numbering
> > > > schemes for SVE builtins: one for C, and one for C++. This is
> > > > problematic for LTO, where we end up getting confused about which
> > > > intrinsic we're talking about. This patch inserts placeholders into the
> > > > registered_functions vector to ensure that there is a consistent
> > > > numbering scheme for both C and C++.
> > > >
> > > > Initially I tried using a null decl for these placeholders, but this
> > > > trips up some validation when the builtin trees are streamed into lto1,
> > >
> > > Care to elaborate?  Note we stream builtins by their function code
> > > and thus expect to be able to materialize them later - materializing
> > > to NULL obviously fails but then the issue is the proper decl isn't there
> > > in that case.  Materializing a "dummy" decl just will delay the inevitable
> > > breakage.
> >
> > So the validation I was referring to is this part of
> > tree-streamer-in.c:unpack_ts_function_decl_value_fields:
> >
> > else if (cl == BUILT_IN_MD)
> > {
> >   tree result = targetm.builtin_decl (fcode, true);
> >   if (!result || result == error_mark_node)
> > fatal_error (input_location, "target specific builtin not 
> > available");
> > }
> >
> > Because we stream the original decl and reconstruct it, I was assuming
> > that we would be using this (real) node everywhere and we would never
> > need to map from function code to tree node. Indeed, grepping for
> > "targetm.builtin_decl", the only results are this use, plus a few uses
> > in the avr backend, so it seems that right now this doesn't matter for
> > AArch64 (unless I'm missing some other path).
> 
> Hmm, I thought we were streaming those by reference - indeed
> this probably got changed when we stopped doing that for
> the middle-end builtins which we did because those are often
> "misrecognized".
> 
> > However, it seems that returning a subtly incorrect result for this
> > taget hook is asking for trouble, and we should return the real node
> > here.
> 
> Yeah, the intent of this target hook is exactly that.  There's now of course
> the option to simply remove it if it just exists for the checking done above
> (need to check the avr backend, of course).  I suppose the cost in the
> LTO IR stream isn't so big thus we can safely ditch the optimization
> forever?

The option of just removing the target hook seems appealing, if that
would be acceptable. It looks like avr could easily be changed to just
call its internal implementation (avr_builtin_decl) instead of the
target hook.

> 
> > Richard, do you have a feeling for how we should do this? Perhaps
> > instantiate real nodes (instead of placeholders) if in_lto_p? Or
> > unconditionally?
> >
> > Thanks,
> > Alex
> >
> > >
> > > > so I ended up building dummy FUNCTION_DECLs as placeholders.
> > > >
> > > > To get better test coverage here, I'm wondering if we could make some of
> > > > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> > > > For now I've just added the specific testcase from the PR.
> > > >
> > > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
> > > >
> > > > Thanks,
> > > > Alex
> > > >
> > > > ---
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > PR target/99216
> > > > * config/aarch64/aarch64-sve-builtins.cc
> > > > (function_builder::add_function): Add placeholder_p argument, 
> > > > build
> > > > placeholder decls if this is set.
> > > > (function_builder::add_unique_function): Instead of 
> > > > conditionally adding
> > > > direct overloads, unconditionally add either a direct overload 
> > > > or a
> > > > placeholder.
> > > > (function_builder::add_overloaded_function): Set placeholder_p 
> > > > if we're
> > > > using C++ overloads.
> > > > (function_builder::add_overloaded_functions): Don't return 
> > > > early for
> > > > m_direct_overloads: we need to add placeholders.
> > > > * config/aarch64/aarch64-sve-builtins.h
> > > > (function_builder::add_function): Add placeholder_p argument.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > PR target/99216
> > > > * g++.target/aarch64/sve/pr99216.C: New test.


[patch] Fix PR C++/90448

2021-03-08 Thread Eric Botcazou
Hi,

this is a regression present on the mainline and 10 branch for architectures
that pass all structure types by reference, e.g. 32-bit PowerPC or SPARC.

Jakub posted a detailed analysis in the audit trail and this boils down to
the RTL expander trying to take the address of a DECL whose RTX is a register.

Bootstrapped/regtested on x86-64/Linux, PowerPC64/Linux and SPARC/Solaris,
OK for the mainline and 10 branch?


2021-03-08  Eric Botcazou  

PR C++/90448
* calls.c (initialize_argument_information): When the argument
is passed by reference, do not make a copy in a thunk only if
the argument is already in memory.  Remove redundant test for
the case of callee copy.

-- 
Eric Botcazoudiff --git a/gcc/calls.c b/gcc/calls.c
index 1fea022ad8a..ff606204772 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -2388,19 +2388,17 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
   function_arg_info arg (type, argpos < n_named_args);
   if (pass_by_reference (args_so_far_pnt, arg))
 	{
-	  bool callee_copies;
-	  tree base = NULL_TREE;
-
-	  callee_copies = reference_callee_copied (args_so_far_pnt, arg);
-
-	  /* If we're compiling a thunk, pass through invisible references
-	 instead of making a copy.  */
-	  if (call_from_thunk_p
-	  || (callee_copies
-		  && !TREE_ADDRESSABLE (type)
-		  && (base = get_base_address (args[i].tree_value))
-		  && TREE_CODE (base) != SSA_NAME
-		  && (!DECL_P (base) || MEM_P (DECL_RTL (base)
+	  const bool callee_copies
+	= reference_callee_copied (args_so_far_pnt, arg);
+	  tree base;
+
+	  /* If we're compiling a thunk, pass directly the address of an object
+	 already in memory, instead of making a copy.  Likewise if we want
+	 to make the copy in the callee instead of the caller.  */
+	  if ((call_from_thunk_p || callee_copies)
+	  && (base = get_base_address (args[i].tree_value))
+	  && TREE_CODE (base) != SSA_NAME
+	  && (!DECL_P (base) || MEM_P (DECL_RTL (base
 	{
 	  /* We may have turned the parameter value into an SSA name.
 		 Go back to the original parameter so we can take the


[patch] Fix PR fortran/96983

2021-03-08 Thread Eric Botcazou
Hi,

AFAICS the code in build_round_expr implicitly assumes that __float128 exists, 
which is *not* the common case among 64-bit architectures since "long double" 
is generally already 128-bit for them.

Tested on x86-64/Linux and SPARC64/Linux, OK for the mainline?


2021-03-08  Eric Botcazou  

PR fortran/96983
* trans-intrinsic.c (build_round_expr): Do not implicitly assume
that __float128 is the 128-bit floating-point type.

-- 
Eric Botcazoudiff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 5c9258c65c3..ae359a07973 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -407,7 +407,10 @@ build_round_expr (tree arg, tree restype)
   if (kind < 0)
 	gfc_internal_error ("Could not find real kind with at least %d bits",
 			resprec);
-  arg = fold_convert (gfc_float128_type_node, arg);
+  if (gfc_real16_is_float128)
+	arg = fold_convert (gfc_float128_type_node, arg);
+  else
+	arg = fold_convert (long_double_type_node, arg);
   fn = gfc_builtin_decl_for_float_kind (BUILT_IN_ROUND, kind);
 }
   else


[PATCH,rs6000] Tighten predicates for p10 ld/cmpi fusion

2021-03-08 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

PR99070 is caused by a fusion pattern matching that the individual
instructions do not match when it is split later. In this case the
ld+cmpi patterns were allowing a d-form load address, which the split
condition would rightly split, however that left us with something that
could not be matched by a ds-form ld instruction, hence the ICE. This
only happened if the target cpu was not power10 -- if we were targeting
power10 then a prefixed pld instruction would get generated because that
can handle d-form. However this is not optimal code either.

So the solution is a new predicate (ds_form_mem_operand) that only
accepts what we can take as for a ds-form load. Then a small
modification of the genfusion.pl script changes the relevant
ld+cmpi patterns to use the new predicate.

OK for trunk if bootstrap/regtest passes?

gcc/ChangeLog

PR target/99070
* config/rs6000/predicates.md (ds_form_mem_operand) New
predicate.
* config/rs6000/genfusion.pl (gen_ld_cmpi_p10) Use
ds_form_mem_operand in ld/lwa patterns.
* config/rs6000/fusion.md: Regenerate file.
---
 gcc/config/rs6000/fusion.md | 177 
 gcc/config/rs6000/genfusion.pl  |   7 +-
 gcc/config/rs6000/predicates.md |  20 
 3 files changed, 113 insertions(+), 91 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index 737a6da385f..56478fcae1d 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -1,7 +1,6 @@
-;; -*- buffer-read-only: t -*-
 ;; Generated automatically by genfusion.pl
 
-;; Copyright (C) 2020 Free Software Foundation, Inc.
+;; Copyright (C) 2020,2021 Free Software Foundation, Inc.
 ;;
 ;; This file is part of GCC.
 ;;
@@ -23,18 +22,18 @@
 ;; load mode is DI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-(compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
- (match_operand:DI 3 "const_m1_to_1_operand" "n")))
+(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+(match_operand:DI 3 "const_m1_to_1_operand" "n")))
(clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
-  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
+  "ld%X1 %0,%1\;cmpdi %2,%0,%3"
   "&& reload_completed
&& (cc_reg_not_cr0_operand (operands[2], CCmode)
-   || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, 
NON_PREFIXED_DS))"
+   || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
+  DImode, NON_PREFIXED_DS))"
   [(set (match_dup 0) (match_dup 1))
(set (match_dup 2)
-(compare:CC (match_dup 0)
-   (match_dup 3)))]
+(compare:CC (match_dup 0) (match_dup 3)))]
   ""
   [(set_attr "type" "load")
(set_attr "cost" "8")
@@ -44,18 +43,18 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
 ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-(compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
- (match_operand:DI 3 "const_0_to_1_operand" "n")))
+(compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+   (match_operand:DI 3 "const_0_to_1_operand" "n")))
(clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
-  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
+  "ld%X1 %0,%1\;cmpldi %2,%0,%3"
   "&& reload_completed
&& (cc_reg_not_cr0_operand (operands[2], CCmode)
-   || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, 
NON_PREFIXED_DS))"
+   || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
+  DImode, NON_PREFIXED_DS))"
   [(set (match_dup 0) (match_dup 1))
(set (match_dup 2)
-(compare:CCUNS (match_dup 0)
-   (match_dup 3)))]
+(compare:CCUNS (match_dup 0) (match_dup 3)))]
   ""
   [(set_attr "type" "load")
(set_attr "cost" "8")
@@ -65,18 +64,18 @@ (define_insn_and_split 
"*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
 ;; load mode is DI result mode is DI compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-(compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
- (match_operand:DI 3 "const_m1_to_1_operand" "n")))
+(compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+(match_operand:DI 3 "const_m1_to_1_operand" "n")))
(set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
-  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
+  "ld%X1 %0,%1\;cmpdi %2,%0,%3"
   "&& reload_completed
&& (cc

Re: [PATCH,rs6000] Tighten predicates for p10 ld/cmpi fusion

2021-03-08 Thread Segher Boessenkool
Hi!

On Mon, Mar 08, 2021 at 10:32:16AM -0600, acsaw...@linux.ibm.com wrote:
> PR99070 is caused by a fusion pattern matching that the individual
> instructions do not match when it is split later. In this case the
> ld+cmpi patterns were allowing a d-form load address, which the split
> condition would rightly split, however that left us with something that
> could not be matched by a ds-form ld instruction, hence the ICE. This
> only happened if the target cpu was not power10 -- if we were targeting
> power10 then a prefixed pld instruction would get generated because that
> can handle d-form. However this is not optimal code either.
> 
> So the solution is a new predicate (ds_form_mem_operand) that only
> accepts what we can take as for a ds-form load. Then a small
> modification of the genfusion.pl script changes the relevant
> ld+cmpi patterns to use the new predicate.
> 
> OK for trunk if bootstrap/regtest passes?

Okay.  Thanks!

Some comments:

>   * config/rs6000/predicates.md (ds_form_mem_operand) New
>   predicate.
>   * config/rs6000/genfusion.pl (gen_ld_cmpi_p10) Use
>   ds_form_mem_operand in ld/lwa patterns.

You are missing the colon in thse two lines.  Supposedly the serverside
scripts will not allow to push then?

> index bd6ef1e56a5..1556514263a 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -992,6 +992,26 @@ (define_predicate "lwa_operand"
>return INTVAL (offset) % 4 == 0;
>  })
>  
> +;; Return 1 if the operand is a memory operand that has a valid address for
> +;; a DS-form instruction. I.e. the address has to be either just a register,
> +;; or register + const where the two low order bits of const are zero.
> +(define_predicate "ds_form_mem_operand"
> +  (match_code "subreg,mem")
> +{
> +  rtx inner, addr, offset;
> +
> +  inner = op;
> +  if (reload_completed && SUBREG_P (inner))
> +inner = SUBREG_REG (inner);

You just copied this from lwa_operand, but, what is that
reload_completed for?  It is not clear to me, it could use a comment.
Do you know?

Should it support subregs *at all* here, since you do not allow reg
(unlike lwa_operand, which is a bit confusingly named because of that).

(The "inner" variable can just be replaced by "op" everywhere, here and
in lwa_operand, "op" is never used again).


Segher


[PATCH v2] Extract a common logger the analyzer framework

2021-03-08 Thread Philip Herron
In development of the Rust FE logging is useful in debugging. Instead of
rolling our own logger it became clear the loggers part of analyzer and jit
projects could be extracted and made generic so they could be reused in
other projects.

To test this patch make check-jit was invoked, for analyzer the following
flags were used -fanalyzer -fdump-analyzer -fanalyzer-verbosity=4.

gcc/ChangeLog:

2021-03-8  Philip Herron 

* logging.h: added new generic logger based off analyzer's logger
* logging.c: Likewise

gcc/analyzer/ChangeLog:

2021-03-8  Philip Herron 

* analyzer-logging.h: has been extract out to gcc/logging.h
* analyzer-logging.c: Likewise
---
 gcc/Makefile.in   |   3 +-
 gcc/analyzer/analysis-plan.cc |   2 +-
 gcc/analyzer/analysis-plan.h  |   2 +-
 gcc/analyzer/analyzer.h   |   7 +-
 gcc/analyzer/checker-path.cc  |   2 +-
 gcc/analyzer/complexity.cc|   2 +-
 gcc/analyzer/diagnostic-manager.cc|   2 +-
 gcc/analyzer/diagnostic-manager.h |   2 +-
 gcc/analyzer/engine.cc|  10 +-
 gcc/analyzer/exploded-graph.h |   4 +-
 gcc/analyzer/pending-diagnostic.cc|   2 +-
 gcc/analyzer/program-point.cc |   2 +-
 gcc/analyzer/program-state.cc |   2 +-
 gcc/analyzer/region-model-impl-calls.cc   |   2 +-
 gcc/analyzer/region-model-manager.cc  |   2 +-
 gcc/analyzer/region-model-reachability.cc |   2 +-
 gcc/analyzer/region-model.cc  |   2 +-
 gcc/analyzer/region.cc|   2 +-
 gcc/analyzer/sm-file.cc   |   2 +-
 gcc/analyzer/sm-malloc.cc |   2 +-
 gcc/analyzer/sm-pattern-test.cc   |   2 +-
 gcc/analyzer/sm-sensitive.cc  |   2 +-
 gcc/analyzer/sm-signal.cc |   2 +-
 gcc/analyzer/sm-taint.cc  |   2 +-
 gcc/analyzer/sm.cc|   2 +-
 gcc/analyzer/sm.h |   2 +-
 gcc/analyzer/state-purge.cc   |   4 +-
 gcc/analyzer/state-purge.h|   2 +-
 gcc/analyzer/store.cc |   2 +-
 gcc/analyzer/supergraph.cc|   2 +-
 gcc/analyzer/supergraph.h |   8 +-
 gcc/analyzer/svalue.cc|   2 +-
 .../analyzer-logging.cc => logging.c} |  37 +++
 .../analyzer-logging.h => logging.h}  | 100 +-
 34 files changed, 111 insertions(+), 114 deletions(-)
 rename gcc/{analyzer/analyzer-logging.cc => logging.c} (87%)
 rename gcc/{analyzer/analyzer-logging.h => logging.h} (71%)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a63c5d9cab6..408ef6e3f0b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1244,7 +1244,6 @@ C_COMMON_OBJS = c-family/c-common.o 
c-family/c-cppbuiltin.o c-family/c-dump.o \
 ANALYZER_OBJS = \
analyzer/analysis-plan.o \
analyzer/analyzer.o \
-   analyzer/analyzer-logging.o \
analyzer/analyzer-pass.o \
analyzer/analyzer-selftests.o \
analyzer/bar-chart.o \
@@ -1710,7 +1709,7 @@ OBJS-libcommon = diagnostic.o diagnostic-color.o 
diagnostic-show-locus.o \
pretty-print.o intl.o \
sbitmap.o \
vec.o input.o version.o hash-table.o ggc-none.o memory-block.o \
-   selftest.o selftest-diagnostic.o sort.o
+   selftest.o selftest-diagnostic.o sort.o logging.o
 
 # Objects in libcommon-target.a, used by drivers and by the core
 # compiler and containing target-dependent code.
diff --git a/gcc/analyzer/analysis-plan.cc b/gcc/analyzer/analysis-plan.cc
index 7dfc48e9c3e..40d325976ca 100644
--- a/gcc/analyzer/analysis-plan.cc
+++ b/gcc/analyzer/analysis-plan.cc
@@ -30,7 +30,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "json.h"
 #include "analyzer/analyzer.h"
 #include "diagnostic-core.h"
-#include "analyzer/analyzer-logging.h"
+#include "logging.h"
 #include "analyzer/analysis-plan.h"
 #include "ordered-hash-map.h"
 #include "options.h"
diff --git a/gcc/analyzer/analysis-plan.h b/gcc/analyzer/analysis-plan.h
index 1d10b772833..51fc5f6d460 100644
--- a/gcc/analyzer/analysis-plan.h
+++ b/gcc/analyzer/analysis-plan.h
@@ -31,7 +31,7 @@ namespace ana {
- which callgraph edges should use call summaries
TODO: the above is a work-in-progress.  */
 
-class analysis_plan : public log_user
+class analysis_plan : public gcc::log_user
 {
 public:
   analysis_plan (const supergraph &sg, logger *logger);
diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index f50ac662516..2c2982e8adb 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -23,6 +23,12 @@ along with GCC; see the file COPYING3.  If not see
 
 class graphviz_out;
 
+namespace gcc {
+  class logger;
+}
+
+using gcc::logger;
+
 namespace ana {
 
 /* 

c++: Incorrect specialization hash table [PR 99285]

2021-03-08 Thread Nathan Sidwell


Class template partial specializations need to be in the
specialization hash, but not all of them.  This defers adding
streamed-in entities to the hash table, in the same way I deferred
adding the instantiation and specialization lists for 99170.

PR c++/99285
gcc/cp/
* cp-tree.h (match_mergeable_specialization)
(add_mergeable_specialization): Adjust parms.
* module.cc (trees_in::decl_value): Adjust
add_mergeable_specialization calls.
(trees_out::key_mergeable): Adjust match_mergeable_specialization
calls.
(specialization_add): Likewise.
* pt.c (match_mergeable_specialization): Do not insert.
(add_mergeable_specialization): Add to hash table here.
gcc/testsuite/
* g++.dg/modules/pr99285_a.H: New.
* g++.dg/modules/pr99285_b.H: New.

--
Nathan Sidwell
diff --git c/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 39e2ad83abd..81ff375f8a5 100644
--- c/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -7239,11 +7239,10 @@ extern void walk_specializations		(bool,
 		 void (*)(bool, spec_entry *,
 			  void *),
 		 void *);
-extern tree match_mergeable_specialization	(bool is_decl, spec_entry *,
-		 bool insert = true);
+extern tree match_mergeable_specialization	(bool is_decl, spec_entry *);
 extern unsigned get_mergeable_specialization_flags (tree tmpl, tree spec);
-extern void add_mergeable_specialization(tree tmpl, tree args,
-		 tree spec, unsigned);
+extern void add_mergeable_specialization(bool is_decl, spec_entry *,
+		 tree outer, unsigned);
 extern tree add_outermost_template_args		(tree, tree);
 extern tree add_extra_args			(tree, tree);
 extern tree build_extra_args			(tree, tree, tsubst_flags_t);
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index 48862dd9bbc..2518d73c220 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -8059,9 +8059,14 @@ trees_in::decl_value ()
 	set_constraints (decl, spec.spec);
   if (mk & MK_template_mask
 	  || mk == MK_partial)
-	/* Add to specialization tables now that constraints etc are
-	   added.  */
-	add_mergeable_specialization (spec.tmpl, spec.args, decl, spec_flags);
+	{
+	  /* Add to specialization tables now that constraints etc are
+	 added.  */
+	  bool is_decl = (mk & MK_template_mask) && (mk & MK_tmpl_decl_mask);
+
+	  spec.spec = is_decl ? inner : type;
+	  add_mergeable_specialization (is_decl, &spec, decl, spec_flags);
+	}
 
   if (TREE_CODE (decl) == INTEGER_CST && !TREE_OVERFLOW (decl))
 	{
@@ -8154,7 +8159,10 @@ trees_in::decl_value ()
 {
   tree e = match_mergeable_specialization (true, &spec);
   if (!e)
-	add_mergeable_specialization (spec.tmpl, spec.args, decl, spec_flags);
+	{
+	  spec.spec = inner;
+	  add_mergeable_specialization (true, &spec, decl, spec_flags);
+	}
   else if (e != existing)
 	set_overrun ();
 }
@@ -10344,14 +10352,14 @@ trees_out::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 	{
 	  /* Make sure we can locate the decl.  */
 	  tree existing = match_mergeable_specialization
-	(bool (mk & MK_tmpl_decl_mask), entry, false);
+	(bool (mk & MK_tmpl_decl_mask), entry);
 
 	  gcc_assert (existing);
 	  if (mk & MK_tmpl_decl_mask)
 	{
 	  if (mk & MK_tmpl_alias_mask)
 		/* It should be in both tables.  */
-		gcc_assert (match_mergeable_specialization (false, entry, false)
+		gcc_assert (match_mergeable_specialization (false, entry)
 			== TREE_TYPE (existing));
 	  else if (mk & MK_tmpl_tmpl_mask)
 		if (tree ti = DECL_TEMPLATE_INFO (existing))
@@ -10659,6 +10667,7 @@ trees_in::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 
   if (mk & MK_template_mask)
 {
+  // FIXME: We could stream the specialization hash?
   spec_entry spec;
   spec.tmpl = tree_node ();
   spec.args = tree_node ();
@@ -12869,7 +12878,7 @@ specialization_add (bool decl_p, spec_entry *entry, void *data_)
/* Only alias templates can appear in both tables (and
 	  if they're in the type table they must also be in the decl table).  */
gcc_checking_assert
-	 (!match_mergeable_specialization (true, entry, false)
+	 (!match_mergeable_specialization (true, entry)
 	  == (decl_p || !DECL_ALIAS_TEMPLATE_P (entry->tmpl)));
 }
   else if (VAR_OR_FUNCTION_DECL_P (entry->spec))
diff --git c/gcc/cp/pt.c w/gcc/cp/pt.c
index 81df8c88ccd..5e485f10d19 100644
--- c/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -29955,28 +29955,19 @@ walk_specializations (bool decls_p,
 }
 
 /* Lookup the specialization of *ELT, in the decl or type
-   specialization table.  Return the SPEC that's already there (NULL if
-   nothing).  If INSERT is true, and there was nothing, add the new
-   spec.  */
+   specialization table.  Return the SPEC that's already there, or
+   NULL if nothing.  */
 
 tree
-match_mergeable_specialization (bool decl_p, spec_entry *elt, bool insert)
+match_mergeable_specialization (bool decl_p, spec_entry *elt)
 {
   

c++: Poor diagnostic in header-unit [PR 99468]

2021-03-08 Thread Nathan Sidwell


We didn't specifically check for a module-decl inside a header unit.
 That leads to a confusing diagostic.  Fixed thusly.

gcc/cp/
* lex.c (module_token_filter::resume): Ignore module-decls inside
header-unit.
* parser.c (cp_parser_module_declaration): Reject in header-unit.
gcc/testsuite/
* g++.dg/modules/pr99468.H: New.
--
Nathan Sidwell
diff --git c/gcc/cp/lex.c w/gcc/cp/lex.c
index c83346b617d..73e14b8394c 100644
--- c/gcc/cp/lex.c
+++ w/gcc/cp/lex.c
@@ -515,7 +515,7 @@ struct module_token_filter
 	  {
 	  module_end:;
 	/* End of the directive, handle the name.  */
-	if (import)
+	if (import && (is_import || !flag_header_unit))
 	  if (module_state *m
 		  = preprocess_module (import, token_loc, module != NULL,
    is_import, got_export, reader))
diff --git c/gcc/cp/parser.c w/gcc/cp/parser.c
index 378e4572f8b..f636bb746d4 100644
--- c/gcc/cp/parser.c
+++ w/gcc/cp/parser.c
@@ -13745,7 +13745,13 @@ cp_parser_module_declaration (cp_parser *parser, module_parse mp_state,
   parser->lexer->in_pragma = true;
   cp_token *token = cp_lexer_consume_token (parser->lexer);
 
-  if (mp_state == MP_FIRST && !exporting
+  if (flag_header_unit)
+{
+  error_at (token->location,
+		"module-declaration not permitted in header-unit");
+  goto skip_eol;
+}
+  else if (mp_state == MP_FIRST && !exporting
   && cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
 {
   /* Start global module fragment.  */
diff --git c/gcc/testsuite/g++.dg/modules/pr99468.H w/gcc/testsuite/g++.dg/modules/pr99468.H
new file mode 100644
index 000..b6be0c349d5
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99468.H
@@ -0,0 +1,7 @@
+// PR 99468, stupidly worded diagnostic
+// { dg-additional-options -fmodule-header }
+
+module M; // { dg-error "module-declaration not permitted" }
+
+int i;
+// { dg-prune-output "not writing" }


Re: [PATCH] i386: Properly set ix86_isa_flags

2021-03-08 Thread Uros Bizjak via Gcc-patches
On Mon, Mar 8, 2021 at 2:52 PM Martin Liška  wrote:
>
> Hi.
>
> The bug is about usage of ix86_isa_flags instead of opts->x_ix86_isa_flags.
> I'm planning a bigger refactoring regarding the flags & PTA_PCLMUL conditions.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> PR target/99464
> * config/i386/i386-options.c (ix86_option_override_internal):
> Set isa_flags for OPTS argument and not for the global
> global_options.
>
> gcc/testsuite/ChangeLog:
>
> PR target/99464
> * gcc.target/i386/pr99464.c: New test.

OK.

Thanks,
Uros.

> ---
>   gcc/config/i386/i386-options.c  |  8 
>   gcc/testsuite/gcc.target/i386/pr99464.c | 15 +++
>   2 files changed, 19 insertions(+), 4 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/i386/pr99464.c
>
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index cdeabbfca4b..982c335e1fa 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -2159,11 +2159,11 @@ ix86_option_override_internal (bool main_args_p,
> && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_MOVBE))
>   opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_MOVBE;
> if (((processor_alias_table[i].flags & PTA_AES) != 0)
> -   && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_AES))
> - ix86_isa_flags |= OPTION_MASK_ISA_AES;
> +   && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_AES))
> + opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AES;
> if (((processor_alias_table[i].flags & PTA_SHA) != 0)
> -   && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_SHA))
> - ix86_isa_flags |= OPTION_MASK_ISA_SHA;
> +   && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_SHA))
> + opts->x_ix86_isa_flags |= OPTION_MASK_ISA_SHA;
> if (((processor_alias_table[i].flags & PTA_PCLMUL) != 0)
> && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_PCLMUL))
>   opts->x_ix86_isa_flags |= OPTION_MASK_ISA_PCLMUL;
> diff --git a/gcc/testsuite/gcc.target/i386/pr99464.c 
> b/gcc/testsuite/gcc.target/i386/pr99464.c
> new file mode 100644
> index 000..98dd938973e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr99464.c
> @@ -0,0 +1,15 @@
> +/* PR target/99464 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +#pragma GCC target("arch=cannonlake")
> +
> +#include 
> +
> +volatile __m128i x;
> +
> +void extern
> +sha_test (void)
> +{
> +  x = _mm_sha1msg1_epu32 (x, x);
> +}
> --
> 2.30.1
>


C++: Enable c++2b module mode [PR 99436]

2021-03-08 Thread Nathan Sidwell


This adds support for c++23 mode to modules, and enables such testing.

PR c++/99436
gcc/cp/
* name-lookup.c (get_cxx_dialect_name): Add cxx23.
gcc/testsuite/
* g++.dg/modules/modules.exp (MOD_STD_LIST): Add 2b.

--
Nathan Sidwell
diff --git c/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index 092fa6b8768..28271ba0485 100644
--- c/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -6963,6 +6963,8 @@ get_cxx_dialect_name (enum cxx_dialect dialect)
   return "C++17";
 case cxx20:
   return "C++20";
+case cxx23:
+  return "C++23";
 }
 }
 
diff --git c/gcc/testsuite/g++.dg/modules/modules.exp w/gcc/testsuite/g++.dg/modules/modules.exp
index 38654caf7b9..da7afc2a226 100644
--- c/gcc/testsuite/g++.dg/modules/modules.exp
+++ w/gcc/testsuite/g++.dg/modules/modules.exp
@@ -35,7 +35,7 @@ if ![info exists DEFAULT_CXXFLAGS] then {
 set DEFAULT_CXXFLAGS " -pedantic-errors -Wno-long-long"
 }
 set DEFAULT_MODFLAGS $DEFAULT_CXXFLAGS
-set MOD_STD_LIST { 17 2a }
+set MOD_STD_LIST { 17 2a 2b }
 
 dg-init
 


[pushed] c++: Add test for PR96268.

2021-03-08 Thread Marek Polacek via Gcc-patches
This works since the recent r11-7102, but we didn't have a test for
a template-argument context.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/testsuite/ChangeLog:

PR c++/96268
* g++.dg/cpp2a/nontype-class41.C: New test.
---
 gcc/testsuite/g++.dg/cpp2a/nontype-class41.C | 13 +
 1 file changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class41.C

diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class41.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class41.C
new file mode 100644
index 000..2be88437072
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class41.C
@@ -0,0 +1,13 @@
+// PR c++/96268
+// { dg-do compile { target c++20 } }
+
+template 
+struct static_string { char chars[N]; /* operator<=> */ };
+
+template 
+static_string(char const(&)[N]) -> static_string;
+
+static_string hi = {"hi"};
+
+template  struct name {};
+using Hi = name<{"hi"}>;

base-commit: bc56d27de97ecea813279ce5ba45b278dcccfe21
-- 
2.29.2



Re: [PATCH V3 0/5] Support for the CTF and BTF debug formats

2021-03-08 Thread Jose E. Marchesi via Gcc-patches


ping

> [Changes from V2:
> - Rebased to latest master.
> - Several bug fixes in the CTF support.
> - Several bug fixes in the BTF support.
> - New tests for BTF.
> - Fix a couple of tests in the CTF testsuite.
> - More testing: we are building Gentoo with -gctf activated by
>   default and fixing problems as we find them.  More than 160 packages
>   built so far.]
>
> Hi people!
>
> Last year we submitted a first patch series introducing support for
> the CTF debugging format in GCC [1].  We got a lot of feedback that
> prompted us to change the approach used to generate the debug info,
> and this patch series is the result of that.
>
> This series also add support for the BTF debug format, which is needed
> by the BPF backend (more on this below.)
>
> This implementation works, but there are several points that need
> discussion and agreement with the upstream community, as they impact
> the way debugging options work.  We are also proposing a way to add
> additional debugging formats in the future.  See below for more
> details.
>
> Finally, a patch makes the BPF GCC backend to use the DWARF debug
> hooks in order to make -gbtf available to it.
>
> [1] https://gcc.gnu.org/legacy-ml/gcc-patches/2019-05/msg01297.html
>
> About CTF
> =
>
> CTF is a debugging format designed in order to express C types in a
> very compact way.  The key is compactness and simplicity.  For more
> information see:
>
> - CTF specification
>   http://www.esperi.org.uk/~oranix/ctf/ctf-spec.pdf
>
> - Compact C-Type support in the GNU toolchain (talk + slides)
>   https://linuxplumbersconf.org/event/4/contributions/396/
>
> - On type de-duplication in CTF (talk + slides)
>   https://linuxplumbersconf.org/event/7/contributions/725/
>
> About BTF
> =
>
> BTF is a debugging format, similar to CTF, that is used in the Linux
> kernel as the debugging format for BPF programs.  From the kernel
> documentation:
>
> "BTF (BPF Type Format) is the metadata format which encodes the debug
>  info related to BPF program/map. The name BTF was used initially to
>  describe data types. The BTF was later extended to include function
>  info for defined subroutines, and line info for source/line
>  information."
>
> Supporting BTF in GCC is important because compiled BPF programs
> (which GCC supports as a target) require the type information in order
> to be loaded and run in diverse kernel versions.  This mechanism is
> known as CO-RE (compile-once, run-everywhere) and is described in the
> "Update of the BPF support in the GNU Toolchain" talk mentioned below.
>
> The BTF is documented in the Linux kernel documentation tree:
> - linux/Documentation/bpf/btf.rst
>
> CTF in the GNU Toolchain
> 
>
> During the last year we have been working in adding support for CTF to
> several components of the GNU toolchain:
>
> - binutils support is already upstream.  It supports linking objects
>   with CTF information with full type de-duplication.
>
> - GDB support is to be sent upstream very shortly.  It makes the
>   debugger capable to use the CTF information whenever available.
>   This is useful in cases where DWARF has been stripped out but CTF is
>   kept.
>
> - GCC support is being discussed and submitted in this series.
>
> Overview of the Implementation
> ==
>
>   dwarf2out.c
>
> The enabled debug formats are hooked in dwarf2out_early_finish.
>
>   dwarf2ctf.c
>
> Code that tranform the internal GCC DWARF DIEs into CTF container
> structures.  This file is included in dwarf2out.c.
>
>   ctfc.c
>   ctfc.h
>
> These two files implement the "CTF container", which is shared
> among CTF and BTF, due to the many similarities between both
> formats.
>
>   ctfout.c
>
> Code that emits assembler with the .ctf section data, from the CTF
> container.
>
>   btfout.c
>
> Code that emits assembler with the .BTF section data, from the CTF
> container.
>
> From debug hooks to debug formats
> =
>
> Our first attempt in adding CTF to GCC used the obvious approach of
> adding a new set of debug hooks as defined in gcc/debug.h.
>
> During our first interaction with the upstream community we were told
> to _not_ use debug hooks, because these are to be obsoleted at some
> point.  We were suggested to instead hook our handlers (which
> processed type TREE nodes producing CTF types from them) somewhere
> else.  So we did.
>
> However at the time we were also facing the need to support BTF, which
> is another type-related debug format needed by the BPF GCC backend.
> Hooking here and there doesn't sound like such a good idea when it
> comes to support several debug formats.
>
> Therefore we thought about how to make GCC support diverse debugging
> formats in a better way.  This led to a proposal we tried to discuss
> at the GNU Tools Track in LPC2020:
>
> - Update of the BPF support in the GNU Toolchain
>   https://linuxplumbersconf.org

[PATCH] Fix __gnu_debug::basic_string

2021-03-08 Thread François Dumont via Gcc-patches
__gnu_debug::basic_string is not extensively used nor extensively tested 
so not actively maintained.


So here is a first patch to fix some problems and increase test 
coverage. I plan another patch cause I think it is missing some C++17 
methods.


Tested under Linux x86_64, ok to commit ?

    libstdc++: Fix __gnu_debug::basic_string<> and adapt tests

    libstdc++-v3/ChangeLog:

    * include/debug/string
    (basic_string(const basic_string&, const _Alloc&)): Define 
even if !_GLIBCXX_USE_CXX11_ABI.
    (basic_string(basic_string&&, const _Alloc&)): Likewise and 
add noexcept qualification.

    (basic_string<>::erase): Adapt to take __const_iterator.
    * 
testsuite/21_strings/basic_string/requirements/citerators.cc: Adapt to 
test __gnu_debug::string

    when _GLIBCXX_DEBUG is defined.
    * 
testsuite/21_strings/basic_string/requirements/dr438/constructor.cc: 
Likewise.
    * 
testsuite/21_strings/basic_string/requirements/exception/basic.cc: Likewise.
    * 
testsuite/21_strings/basic_string/requirements/exception/generation_prohibited.cc: 
Likewise.
    * 
testsuite/21_strings/basic_string/requirements/exception/propagation_consistent.cc: 
Likewise.
    * 
testsuite/21_strings/basic_string/requirements/explicit_instantiation/char/1.cc: 
Likewise.
    * 
testsuite/21_strings/basic_string/requirements/explicit_instantiation/char16_t/1.cc: 
Likewise.
    * 
testsuite/21_strings/basic_string/requirements/explicit_instantiation/char32_t/1.cc: 
Likewise.
    * 
testsuite/21_strings/basic_string/requirements/explicit_instantiation/char8_t/1.cc: 
Likewise.
    * 
testsuite/21_strings/basic_string/requirements/explicit_instantiation/wchar_t/1.cc: 
Likewise.
    * 
testsuite/21_strings/basic_string/requirements/typedefs.cc: Likewise.

    * testsuite/util/exception/safety.h
    * testsuite/util/testsuite_container_traits.h

François

diff --git a/libstdc++-v3/include/debug/string b/libstdc++-v3/include/debug/string
index 172179530aa..d6eb5280ade 100644
--- a/libstdc++-v3/include/debug/string
+++ b/libstdc++-v3/include/debug/string
@@ -123,21 +123,21 @@ namespace __gnu_debug
 
   using _Base::npos;
 
-  basic_string()
-	_GLIBCXX_NOEXCEPT_IF(std::is_nothrow_default_constructible<_Base>::value)
-	: _Base() { }
-
   // 21.3.1 construct/copy/destroy:
+
   explicit
   basic_string(const _Allocator& __a) _GLIBCXX_NOEXCEPT
   : _Base(__a) { }
 
 #if __cplusplus < 201103L
+  basic_string() : _Base() { }
+
   basic_string(const basic_string& __str)
   : _Base(__str) { }
 
   ~basic_string() { }
 #else
+  basic_string() = default;
   basic_string(const basic_string&) = default;
   basic_string(basic_string&&) = default;
 
@@ -146,13 +146,15 @@ namespace __gnu_debug
   : _Base(__l, __a)
   { }
 
-#if _GLIBCXX_USE_CXX11_ABI
   basic_string(const basic_string& __s, const _Allocator& __a)
   : _Base(__s, __a) { }
 
   basic_string(basic_string&& __s, const _Allocator& __a)
-  : _Base(std::move(__s), __a) { }
-#endif
+  noexcept( noexcept(
+	_Base(std::declval<_Base>()), std::declval()) )
+  : _Safe(std::move(__s._M_safe()), __a),
+	_Base(std::move(__s._M_base()), __a)
+  { }
 
   ~basic_string() = default;
 
@@ -676,7 +678,7 @@ namespace __gnu_debug
   }
 
   iterator
-  erase(iterator __position)
+  erase(__const_iterator __position)
   {
 	__glibcxx_check_erase(__position);
 	typename _Base::iterator __res = _Base::erase(__position.base());
@@ -685,7 +687,7 @@ namespace __gnu_debug
   }
 
   iterator
-  erase(iterator __first, iterator __last)
+  erase(__const_iterator __first, __const_iterator __last)
   {
 	// _GLIBCXX_RESOLVE_LIB_DEFECTS
 	// 151. can't currently clear() empty container
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/requirements/citerators.cc b/libstdc++-v3/testsuite/21_strings/basic_string/requirements/citerators.cc
index 99bf5726dcc..69d4a8d0e51 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/requirements/citerators.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/requirements/citerators.cc
@@ -18,14 +18,21 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-#include 
 #include 
 
+#ifdef _GLIBCXX_DEBUG
+# include 
+using namespace __gnu_debug;
+#else
+# include 
+using namespace std;
+#endif
+
 int main()
 {
-  __gnu_test::citerator test1;
+  __gnu_test::citerator test1;
 #ifdef _GLIBCXX_USE_WCHAR_T
-  __gnu_test::citerator test2;
+  __gnu_test::citerator test2;
 #endif
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/requirements/dr438/constructor.cc b/libstdc++-v3/testsuite/21_strings/basic_string/requirements/dr438/constructor.cc
index 82d42ebb6a6..104fd653642 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_st

Re: [PATCH v2] fix Ada bootstrap on Cygwin64 (PR bootstrap/94918)

2021-03-08 Thread Mikael Pettersson via Gcc-patches
On Mon, Mar 8, 2021 at 12:07 PM Eric Botcazou  wrote:
>
> > I wonder why we include  from this file at all,
> > and why it is not included from {t,}system.h instead which
> > is where system header specific fixups should be made
> > (and this one could be avoided because system headers
> > are included _before_ poisoning anything).
>
>  is the Mother of All Things on the platform so you don't want to
> include it liberally (although it can be tamed e.g. with WIN32_LEAN_AND_MEAN).
> Therefore including it from tsystem.h might be worse than the actual disease.
>
> Mikael, can you work around the problem by adding
>
> #ifdef __CYGWIN__
> #include "mingw32.h"
> #endif
>
> at the appropriate spot in raise-gcc.c instead?

This one worked. Is that what you had in mind?

* raise-gcc.c: On Cygwin include mingw32.h to prevent
windows.h from including x86intrin.h or emmintrin.h.
---
 gcc/ada/raise-gcc.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/ada/raise-gcc.c b/gcc/ada/raise-gcc.c
index 1446bfaaeb7..b096eba1b75 100644
--- a/gcc/ada/raise-gcc.c
+++ b/gcc/ada/raise-gcc.c
@@ -79,6 +79,12 @@ typedef char bool;
(SJLJ or DWARF). We need a consistently named interface to import from
a-except, so wrappers are defined here.  */

+#ifdef __CYGWIN__
+/* Prevent compile error due to unwind-generic.h including ,
+   see comment above #include  in mingw32.h.  */
+#include "mingw32.h"
+#endif
+
 #ifndef IN_RTS
   /* For gnat1/gnatbind compilation: cannot use unwind.h, as it is for the
  target. So mimic configure...
From 44a276e7900a506ee4b6f85d25ae5d96a11bd91e Mon Sep 17 00:00:00 2001
From: Mikael Pettersson 
Date: Mon, 8 Mar 2021 22:31:16 +0100
Subject: [PATCH] 	PR bootstrap/94918

	* raise-gcc.c: On Cygwin include mingw32.h to prevent
	windows.h from including x86intrin.h or emmintrin.h.
---
 gcc/ada/raise-gcc.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/ada/raise-gcc.c b/gcc/ada/raise-gcc.c
index 1446bfaaeb7..b096eba1b75 100644
--- a/gcc/ada/raise-gcc.c
+++ b/gcc/ada/raise-gcc.c
@@ -79,6 +79,12 @@ typedef char bool;
(SJLJ or DWARF). We need a consistently named interface to import from
a-except, so wrappers are defined here.  */
 
+#ifdef __CYGWIN__
+/* Prevent compile error due to unwind-generic.h including ,
+   see comment above #include  in mingw32.h.  */
+#include "mingw32.h"
+#endif
+
 #ifndef IN_RTS
   /* For gnat1/gnatbind compilation: cannot use unwind.h, as it is for the
  target. So mimic configure...
-- 
2.26.2



Re: [PATCH v2] fix Ada bootstrap on Cygwin64 (PR bootstrap/94918)

2021-03-08 Thread Eric Botcazou
> This one worked. Is that what you had in mind?
> 
> * raise-gcc.c: On Cygwin include mingw32.h to prevent
> windows.h from including x86intrin.h or emmintrin.h.

Yep, exactly, thanks, you may put it on whichever branch you need.

-- 
Eric Botcazou




Leaving Red Hat, updating email address

2021-03-08 Thread Jeff Law via Gcc-patches
As some of you already know, Friday is my last day with Red Hat.  This
adjusts my email address in the maintainers file to a personal one. 
I'll update it for my new employer once I'm on-boarded and we've got all
the usual copyright stuff put in place.

I'll still be working with GNU tools, though my areas of focus will
almost certainly be changing.

Jeff
commit 75897e3d780085ac871c9b3ac7a82d403975b95a
Author: Jeff Law 
Date:   Mon Mar 8 15:32:54 2021 -0700

Adjust my email address to a personal one

* MAINTAINERS: Update my email address

diff --git a/MAINTAINERS b/MAINTAINERS
index f72b649b6ce..fd9eb81d395 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -24,7 +24,7 @@ Richard Earnshaw  

 Richard Biener 
 Jakub Jelinek  
 Richard Kenner 
-Jeff Law   
+Jeff Law   
 Michael Meissner   
 Jason Merrill  
 David S. Miller
@@ -68,8 +68,8 @@ fr30 port Nick Clifton

 frv port   Nick Clifton
 frv port   Alexandre Oliva 
 ft32 port  James Bowman
-h8 portJeff Law
-hppa port  Jeff Law
+h8 portJeff Law
+hppa port  Jeff Law
 hppa port  John David Anglin   
 i386 port  Jan Hubicka 
 i386 port  Uros Bizjak 
@@ -77,14 +77,14 @@ i386 vector ISA extns   Kirill Yukhin   

 iq2000 portNick Clifton
 lm32 port  Sebastien Bourdeauducq  
 m32r port  Nick Clifton
-m68k port (?)  Jeff Law
+m68k port (?)  Jeff Law
 m68k port  Andreas Schwab  
 m68k-motorola-sysv portPhilippe De Muyter  
 mcore port Nick Clifton
 microblaze Michael Eager   
 mips port  Matthew Fortune 
 mmix port  Hans-Peter Nilsson  
-mn10300 port   Jeff Law
+mn10300 port   Jeff Law
 mn10300 port   Alexandre Oliva 
 moxie port Anthony Green   
 msp430 portNick Clifton
@@ -183,11 +183,11 @@ line map  Dodji Seketeli  

 soft-fpJoseph Myers

 scheduler (+ haifa)Jim Wilson  
 scheduler (+ haifa)Michael Meissner
-scheduler (+ haifa)Jeff Law
+scheduler (+ haifa)Jeff Law
 scheduler (+ haifa)Vladimir Makarov
 modulo-scheduler   Roman Zhuykov   
-reorg  Jeff Law
-caller-save.c  Jeff Law
+reorg  Jeff Law
+caller-save.c  Jeff Law
 callgraph  Jan Hubicka 
 debugging code Jim Wilson  
 dwarf debugging code   Jason Merrill   
@@ -201,8 +201,8 @@ fixincludes Bruce Korb  
 *gimpl*Jakub Jelinek   
 *gimpl*Aldy Hernandez  
 *gimpl*Jason Merrill   
-gcse.c Jeff Law
-global opt framework   Jeff Law
+gcse.c Jeff Law
+global opt framework   Jeff Law
 hsaMartin Jambor   
 jump.c David S. Miller 
 web pages  Gerald Pfeifer  
@@ -225,7 +225,7 @@ gcovJan Hubicka 

 gcov   Martin Liska
 gcov   Nathan Sidwell  
 option handlingJoseph Myers

-middle-end Jeff Law
+middle-end Jeff Law
 middle-end Ian Lance Taylor
 middle-end Richard Biener  
 tree-ssa   Andrew MacLeod  


[r11-7560 Regression] FAIL: g++.dg/modules/builtin-3_a.C -std=c++2b scan-lang-dump module "Wrote GMF:-[0-9]* type_decl:'::__builtin_va_list'@builtins" on Linux/x86_64

2021-03-08 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

bc56d27de97ecea813279ce5ba45b278dcccfe21 is the first bad commit
commit bc56d27de97ecea813279ce5ba45b278dcccfe21
Author: Nathan Sidwell 
Date:   Mon Mar 8 11:55:26 2021 -0800

C++: Enable c++2b module mode [PR 99436]

caused

FAIL: g++.dg/modules/builtin-3_a.C -std=c++17  scan-lang-dump module 
"Writing:-[0-9]*'s named merge key \\(decl\\) type_decl:'::__builtin_va_list'"
FAIL: g++.dg/modules/builtin-3_a.C -std=c++17  scan-lang-dump module "Wrote 
GMF:-[0-9]* type_decl:'::__builtin_va_list'@builtins"
FAIL: g++.dg/modules/builtin-3_a.C -std=c++2a  scan-lang-dump module 
"Writing:-[0-9]*'s named merge key \\(decl\\) type_decl:'::__builtin_va_list'"
FAIL: g++.dg/modules/builtin-3_a.C -std=c++2a  scan-lang-dump module "Wrote 
GMF:-[0-9]* type_decl:'::__builtin_va_list'@builtins"
FAIL: g++.dg/modules/builtin-3_a.C -std=c++2b  scan-lang-dump module 
"Writing:-[0-9]*'s named merge key \\(decl\\) type_decl:'::__builtin_va_list'"
FAIL: g++.dg/modules/builtin-3_a.C -std=c++2b  scan-lang-dump module "Wrote 
GMF:-[0-9]* type_decl:'::__builtin_va_list'@builtins"

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-7560/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/builtin-3_a.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/builtin-3_a.C 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)


libgo patch committed: Cast SIGSTKSZ to uintptr

2021-03-08 Thread Ian Lance Taylor via Gcc-patches
This libgo patch casts SIGSTKSZ to uintptr before comparing it to a
uintptr value.  This fixes build failures with newer versions of glibc
in which SIGSTKSZ has changed such that the type is now long.  The
build failure is a signed-unsigned comparison warning that turns into
an error due to -Werror.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline and GCC 9 and 10 branches.

Ian
d03d1688cebc0985aac189bcf6f46089044056ad
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 5c9fc7db4e1..5b45f03a26e 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-2c5188b5ad6143e791f2ba42f02a4ea7887d87b6
+93380a9126e76b71fa208e62c31c7914084c0e37
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/runtime/proc.c b/libgo/runtime/proc.c
index c037df645b9..38bf7a6b255 100644
--- a/libgo/runtime/proc.c
+++ b/libgo/runtime/proc.c
@@ -802,8 +802,8 @@ runtime_malg(bool allocatestack, bool signalstack, byte** 
ret_stack, uintptr* re
if(signalstack) {
stacksize = 32 * 1024; // OS X wants >= 8K, GNU/Linux 
>= 2K
 #ifdef SIGSTKSZ
-   if(stacksize < SIGSTKSZ)
-   stacksize = SIGSTKSZ;
+   if(stacksize < (uintptr)(SIGSTKSZ))
+   stacksize = (uintptr)(SIGSTKSZ);
 #endif
}
 


Re: [PATCH v2] c++: -Wconversion vs value-dependent expressions [PR99331]

2021-03-08 Thread Marek Polacek via Gcc-patches
On Fri, Mar 05, 2021 at 05:03:45PM -0500, Jason Merrill wrote:
> On 3/4/21 9:37 PM, Marek Polacek wrote:
> > This PR complains that we issue a -Wconversion warning in
> > 
> >template  struct X {};
> >template  X foo();
> > 
> > saying "conversion from 'long unsigned int' to 'int' may change value".
> > While it's not technically wrong, I suspect -Wconversion warnings aren't
> > all that useful for value-dependent expressions.  So this patch disables
> > them, though I'm open to other ideas.
> 
> How about suppressing -Wconversion in
> build_converted_constant_expr_internal?  If the size_t value ended up being
> too large for the int parameter, we would give an error about overflow in a
> constant expression, not just a warning.

That works for me too.  As you say, if we convert to a type that cannot
represent all the values of the original type, we give a pedantic-error
warning in check_narrowing.

I've slightly enhanced the test, too.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?

-- >8 --
This PR complains that we issue a -Wconversion warning in

  template  struct X {};
  template  X foo();

saying "conversion from 'long unsigned int' to 'int' may change value".
While it's not technically wrong, I suspect -Wconversion warnings aren't
all that useful for value-dependent expressions.  So this patch disables
them.  This is a regression that started with r241425:

@@ -7278,7 +7306,7 @@ convert_template_argument (tree parm,
  val = error_mark_node;
}
}
-  else if (!dependent_template_arg_p (orig_arg)
+  else if (!type_dependent_expression_p (orig_arg)
   && !uses_template_parms (t))
/* We used to call digest_init here.  However, digest_init
   will report errors, which we don't want when complain

Here orig_arg is SIZEOF_EXPR; dependent_template_arg_p (orig_arg) was
true, but type_dependent_expression_p (orig_arg) is false so we warn in
convert_nontype_argument.

gcc/cp/ChangeLog:

PR c++/99331
* call.c (build_converted_constant_expr_internal): Don't emit
-Wconversion warnings.

gcc/testsuite/ChangeLog:

PR c++/99331
* g++.dg/warn/Wconversion5.C: New test.
---
 gcc/cp/call.c|  3 +++
 gcc/testsuite/g++.dg/warn/Wconversion5.C | 19 +++
 2 files changed, 22 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wconversion5.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 7d12fea60f2..55d7e71c0c9 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4484,6 +4484,9 @@ build_converted_constant_expr_internal (tree type, tree 
expr,
  && processing_template_decl)
conv = next_conversion (conv);
 
+  /* Issuing conversion warnings for value-dependent expressions is
+likely too noisy.  */
+  warning_sentinel w (warn_conversion);
   conv->check_narrowing = true;
   conv->check_narrowing_const_only = true;
   expr = convert_like (conv, expr, complain);
diff --git a/gcc/testsuite/g++.dg/warn/Wconversion5.C 
b/gcc/testsuite/g++.dg/warn/Wconversion5.C
new file mode 100644
index 000..f5ae6312bc5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wconversion5.C
@@ -0,0 +1,19 @@
+// PR c++/99331
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wconversion" }
+// Don't issue -Wconversion warnings for value-dependent expressions.
+
+template  struct X {};
+template  struct Y {};
+template  X foo();
+template  X foo2();
+template Y foo3();
+template Y<1024> foo4(); // { dg-error "narrowing conversion" }
+template Y<1u> foo5();
+template X<__INT_MAX__ + 1U> foo6(); // { dg-error "narrowing conversion" 
}
+
+template 
+struct S {
+  using t = X;
+  using u = X;
+};

base-commit: bd85b4dd2dd7b00b6342ed1e33fb48035a3dcb61
-- 
2.29.2



Re: [PATCH v2] c++: ICE with real-to-int conversion in template [PR97973]

2021-03-08 Thread Marek Polacek via Gcc-patches
On Fri, Mar 05, 2021 at 05:15:49PM -0500, Jason Merrill via Gcc-patches wrote:
> On 3/3/21 7:55 PM, Marek Polacek wrote:
> > In this test we are building a call in a template, but since neither
> > the function nor any of its arguments are dependent, we go down the
> > normal path in finish_call_expr.  convert_arguments sees that we're
> > binding a reference to int to double and therein convert_to_integer
> > creates a FIX_TRUNC_EXPR.  Later, we call check_function_arguments
> > which folds the arguments, and, in a template, fold_for_warn calls
> > fold_non_dependent_expr.  But tsubst_copy_and_build should not see
> > a FIX_TRUNC_EXPR (see the patch discussed in
> > )
> > or we crash.
> 
> This again, sigh.  Let me take a step back.

:-(
 
> So basically, the output of convert_like_real in a template is a mix of
> template and non-template trees, and thus unsuitable for consumption by
> anything other than grabbing its type and throwing it away, as most callers
> do.
> 
> The problem here is that cp_build_function_call_vec calls
> check_function_arguments with these trees.  build_over_call, however, does
> not call check_function_arguments in a template.  Preventing that call in a
> template also fixes the testcase, though it regresses diagnostic location in
> Wnonnull5.C (which it shouldn't, that's a separate bug).

Yeah, that does strike me as wrong.  So I've tried

--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -4038,8 +4038,10 @@ cp_build_function_call_vec (tree function, vec **params,

   /* Check for errors in format strings and inappropriately
  null parameters.  */
-  bool warned_p = check_function_arguments (input_location, fndecl, fntype,
-   nargs, argarray, NULL);
+  bool warned_p
+= (!processing_template_decl
+   && check_function_arguments (input_location, fndecl, fntype,
+   nargs, argarray, NULL));

   ret = build_cxx_call (function, nargs, argarray, complain, orig_fndecl);

and saw no failures with it (with/out my patch).  In fact, I'd like to
apply both patches.
 
> IMPLICIT_CONV_EXPR is a way to represent the conversion so that the result
> is still a template tree, and therefore suitable for fold_for_warn, which
> allows us to warn when parsing the template, which is generally desirable.

That sounds like a fair assessment.
 
> I think the approach of expanding IMPLICIT_CONV_EXPR is probably the right
> choice, but perhaps we should expand it to all non-trivial conversions, not
> just those that would use problematic tree codes.

Yeah; it's worked pretty well for classes after we've dealt with the
initial fallout.  I've factored the check into a new function, and also
extended it with the case where we'd create a FLOAT_EXPR.  I don't know
if that covers all non-trivial conversions.

Do we want to assert that FLOAT_EXPR never makes its way into tsubst now?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?

-- >8 --
In this test we are building a call in a template, but since neither
the function nor any of its arguments are dependent, we go down the
normal path in finish_call_expr.  convert_arguments sees that we're
binding a reference to int to double and therein convert_to_integer
creates a FIX_TRUNC_EXPR.  Later, we call check_function_arguments
which folds the arguments, and, in a template, fold_for_warn calls
fold_non_dependent_expr.  But tsubst_copy_and_build should not see
a FIX_TRUNC_EXPR (see the patch discussed in
)
or we crash.

So let's not create a FIX_TRUNC_EXPR in a template in the first place
and instead use IMPLICIT_CONV_EXPR.

gcc/cp/ChangeLog:

PR c++/97973
* call.c (conv_unsafe_in_template_p): New.
(convert_like): Use it.

gcc/testsuite/ChangeLog:

PR c++/97973
* g++.dg/conversion/real-to-int1.C: New test.
---
 gcc/cp/call.c | 23 ++-
 .../g++.dg/conversion/real-to-int1.C  | 17 ++
 2 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/conversion/real-to-int1.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 7d12fea60f2..f450691d3f6 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8048,6 +8048,27 @@ convert_like_internal (conversion *convs, tree expr, 
tree fn, int argnum,
   return expr;
 }
 
+/* Return true if converting FROM to TO is unsafe in a template.  */
+
+static bool
+conv_unsafe_in_template_p (tree to, tree from)
+{
+  /* Converting classes involves TARGET_EXPR.  */
+  if (CLASS_TYPE_P (to) || CLASS_TYPE_P (from))
+return true;
+
+  /* Converting real to integer produces FIX_TRUNC_EXPR which tsubst
+ doesn't handle.  */
+  if (SCALAR_FLOAT_TYPE_P (from) && INTEGRAL_OR_ENUMERATION_TYPE_P (to))
+return true;
+
+  /* Converting integer to real isn't a trivial conversion, either.  */
+  if (INTEGRAL_OR_ENUMERATION

Re: [PATCH] c++: -Wconversion vs value-dependent expressions [PR99331]

2021-03-08 Thread Martin Sebor via Gcc-patches

On 3/5/21 3:03 PM, Jason Merrill via Gcc-patches wrote:

On 3/4/21 9:37 PM, Marek Polacek wrote:

This PR complains that we issue a -Wconversion warning in

   template  struct X {};
   template  X foo();

saying "conversion from 'long unsigned int' to 'int' may change value".
While it's not technically wrong, I suspect -Wconversion warnings aren't
all that useful for value-dependent expressions.  So this patch disables
them, though I'm open to other ideas.


How about suppressing -Wconversion in 
build_converted_constant_expr_internal?  If the size_t value ended up 
being too large for the int parameter, we would give an error about 
overflow in a constant expression, not just a warning.


I was going to suggest to continue to issue some warning in contexts
where the result of the conversion cannot be represented in the type
but not otherwise.  As in:

  template  struct X { };
  template  X foo();

  void bar ()
  {
foo();
  }

Does your suggestion have the same effect?

Martin


Australasian Oil & Gas Exhibition & Conference - Verified attendees

2021-03-08 Thread Emma Kinsley via Gcc-patches
Good Day,



We are following up to know your interest in obtaining the Potential mailing 
list.

Australasian Oil & Gas Exhibition & Conference

10 - 11 Mar 2021

Perth Convention and Exhibition Centre, Perth, Australia

Visitors count: 6,723



FLAT 50% Discount on the list



Please let us know your thoughts so that we can send you the cost and 
additional information.



Kind Regards,

Emma Kinsley - Marketing team


Re: [PATCH] c++: -Wconversion vs value-dependent expressions [PR99331]

2021-03-08 Thread Marek Polacek via Gcc-patches
On Mon, Mar 08, 2021 at 06:10:05PM -0700, Martin Sebor wrote:
> On 3/5/21 3:03 PM, Jason Merrill via Gcc-patches wrote:
> > On 3/4/21 9:37 PM, Marek Polacek wrote:
> > > This PR complains that we issue a -Wconversion warning in
> > > 
> > >    template  struct X {};
> > >    template  X foo();
> > > 
> > > saying "conversion from 'long unsigned int' to 'int' may change value".
> > > While it's not technically wrong, I suspect -Wconversion warnings aren't
> > > all that useful for value-dependent expressions.  So this patch disables
> > > them, though I'm open to other ideas.
> > 
> > How about suppressing -Wconversion in
> > build_converted_constant_expr_internal?  If the size_t value ended up
> > being too large for the int parameter, we would give an error about
> > overflow in a constant expression, not just a warning.
> 
> I was going to suggest to continue to issue some warning in contexts
> where the result of the conversion cannot be represented in the type
> but not otherwise.  As in:
> 
>   template  struct X { };
>   template  X foo();
> 
>   void bar ()
>   {
> foo();
>   }
> 
> Does your suggestion have the same effect?

Either of the patches I've posted suppresses the -Wconversion warning
in this test.  But that should be ok -- rather, we should probably
reject the conversion in check_narrowing.  Which we currently don't.
It looks like we just fold the SIZEOF_EXPR to 0 and never detect the
overflow.

Marek



Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)

2021-03-08 Thread Martin Sebor via Gcc-patches

Attached is a revised patch with three changes:

1) use wi::to_offset (max_object_size ()) instead of tree_to_shwi()
   as requested,
2) avoid warning for accesses to elements of arrays of empty types
   (PR 99475 that I noticed while testing the original patch),
3) include the size of the destination even for declared objects to
   make warnings for nonempty accesses to elements of arrays of empty
   structs slightly clearer.

Accesses to zero-length arrays continue to be diagnosed (except for
trailing arrays of unknown objects), as are nonempty accesses to empty
types.

The warning message for (3) remains unchanged, i.e., for the following:

  struct S { } a[3];

  void g (int n)
  {
((int*)a)[0] = 0;
  }

it's:

  warning: array subscript 0 is outside array bounds of ‘struct S[3]’ 
[-Warray-bounds]


This could be improved by mentioning the type of the access when
it's not the same as that of the accessed object as is already done
for partially out of bounds accesses:

  warning: subscript int[0] is outside array bounds of ‘struct S[3]’ 
[-Warray-bounds]


but making this change at the same time feels like feature creep and
out of scope for stage 4.

Retested on x86_64-linux.

On 2/18/21 1:55 PM, Martin Sebor wrote:

On 2/18/21 11:03 AM, Jakub Jelinek wrote:

On Thu, Feb 18, 2021 at 07:00:52PM +0100, Jakub Jelinek wrote:

The size of the VLA is zero regardless of its bound and accessing
it is invalid so the warning is expected.


Yes, some warning, but not the one you are giving, that is nonsensical.
Array subscript 0 is not outside of array bounds of struct S[n], 
a[anything]

will still be zero sized and will not be problematic.


The warning is designed for ordinary arrays of nonzero size.  There's
no point in putting an effort into coming up with a special warning
just for those because they serve no purpose in these contexts (as
complete objects).



Scalar objects with zero size will always have that zero size,
similarly arrays thereof (constant or variable sized).
So the warning should be simply if eltsize == 0,
check if the access is before or after the object and complain
that a memory access is done before or after a zero sized object %qD.

Jakub


No, I don't think making this exception would be helpful.  Zero length
arrays are a non-standard extension meant to be used as struct members,
before flexible array members were added to C.  In other contexts, they
are almost certainly unintended and so likely bugs.  There's no valid
use case for such arrays, and diagnosing accesses to them helps find
such bugs.

That said, I also don't think a fix for the ICE should be held up
because we disagree on this vanishingly unimportant corner case.
The ICE effectively prevents using such arrays (VLAs) and since no
bug reports have been raised for it since it was introduced in GCC
9 it's unlikely that any code relies on it.  (I suspect the bug
itself was the result of fuzzing.)

Martin


PR tree-optimization/99121 - ICE in -Warray-bounds on a VLA of zero-length array
PR tree-optimization/99475 - bogus -Warray-bounds accessing an array element of empty structs

gcc/ChangeLog:

	PR tree-optimization/99121
	PR tree-optimization/99475
	* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
	Avoid assuming array element size is constant.  Handle zero-length
	arrays of VLAs.

gcc/testsuite/ChangeLog:

	PR tree-optimization/99121
	PR tree-optimization/99475
	* c-c++-common/Warray-bounds-10.c: New test.
	* c-c++-common/Warray-bounds-9.c: New test.
	* gcc.dg/Warray-bounds-71.c: New test.
	* gcc.dg/Warray-bounds-72.c: New test.
	* gcc.dg/Warray-bounds-73.c: New test.

diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 54f32051199..bd6caee7d5a 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -419,7 +419,7 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
   tree cstoff = TREE_OPERAND (ref, 1);
   tree varoff = NULL_TREE;
 
-  const offset_int maxobjsize = tree_to_shwi (max_object_size ());
+  const offset_int maxobjsize = wi::to_offset (max_object_size ());
 
   /* The zero-based array or string constant bounds in bytes.  Initially
  set to [-MAXOBJSIZE - 1, MAXOBJSIZE]  until a tighter bound is
@@ -553,7 +553,12 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
 	offrange[1] = arrbounds[1];
 }
 
+  /* Type of the access.  */
+  tree axstype = TREE_TYPE (ref);
+  /* TYpe of the referenced object if it can be determined or array
+ of unsigned char for allocated objects.  */
   tree reftype = NULL_TREE;
+  /* The byte size of the referenced array element.  */
   offset_int eltsize = -1;
   if (arrbounds[0] >= 0)
 {
@@ -602,34 +607,44 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
 	  || (DECL_EXTERNAL (arg) && array_at_struct_end_p (ref
 	return false;
 
-  /* FIXME: Should this be 1 for Fortran?  */
   arrbounds[0] = 0;
 
   if (TREE_CODE (reftype) ==

Re: add powerpc_vsx_ok requirement to undef-bool tests

2021-03-08 Thread Alexandre Oliva
On Feb 26, 2021, David Edelsohn  wrote:

> This patch is okay.

Thanks, I've finally checked it in.

>> From: Joel Brobecker 
>> gcc/testsuite/ChangeLog:
>> 
>> * gcc.target/powerpc/undef-bool-2.c: Add
>> dg-require-effective-target powerpc_vsx_ok directive.
>> * g++.dg/ext/undef-bool-1.C: Add dg-require-effective-target
>> powerpc_vsx_ok directive.

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar