date:20201114

[PATCH] lto: Fix typo in comment of gcc/lto/lto-symtab.c

2020-11-14 Thread Jerry Clcanny via Gcc-patches

Hi, thanks for reviewing this patch. It fixes a typo in a comment in
gcc/lto/lto-symtab.c. The original comment reads "because after
removing one of duplicate decls the hash is not correcly updated to the
ohter dupliate."; I changed "ohter" to "other". Since only a comment
changes, I did not run any tests and have no reports to provide.


fix-typo-in-comment-of-lto-symtab-c.patch
Description: Binary data

Re: [PATCH] Add MODE_OPAQUE

2020-11-14 Thread Richard Sandiford via Gcc-patches

acsaw...@linux.ibm.com writes:
> From: Aaron Sawdey 
>
> After discussion with Richard Sandiford on IRC, he suggested adding a
> new mode class MODE_OPAQUE to deal with the problems (PR 96791) we had
> been having with POImode/PXImode in powerpc target. This patch is the
> accumulation of changes I needed to make to add this and make it useable
> for the purposes of what power10 MMA needed.
>
> MODE_OPAQUE modes allow you to have modes for which you can just
> define loads and stores. By design, optimization does not expect to
> know how to do arithmetic or subregs on these modes. This allows us to
> have modes for multi-register vector operations where we don't want to
> open Pandora's Box and define general arithmetic operations.
>
> This patch will be followed by a target specific patch to change the
> powerpc power10 MMA builtins to use opaque modes, and will also let us use
> the vector pair loads/stores defined with that in the inline expansion
> of memcpy/memmove, allowing me to fix PR 96791.
>
> Regstrap in progress on ppc64le and x86_64, ok for trunk if successful?

Thanks for doing this.  Mostly LGTM, but:

> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 3706f0a03fd..44a3b660bd0 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -6268,6 +6268,9 @@ init_emit_once (void)
>  mode <= MAX_MODE_PARTIAL_INT;
>  mode = (machine_mode)((int)(mode) + 1))
>   const_tiny_rtx[i][(int) mode] = GEN_INT (i);
> +
> +  FOR_EACH_MODE_IN_CLASS (mode, MODE_OPAQUE)
> + const_tiny_rtx[i][(int) mode] = GEN_INT (i);

I'm not sure about reusing CONST_INT & co. for something that isn't
a scalar_int_mode.  The only reason routines like trunc_int_for_mode
allow general scalar_modes is that, when the new mode wrappers were
added, (const_int 0) was used for the MPX pointer bounds modes.
We should probably go back and tighten it now that MPX has been removed.

So I guess the question is: how integer-like do the modes need to be?
Do we need to be able to represent any constant bitpattern as a constant
rtx?

If we do need to be able to represent any value as a constant, does treating
opaque modes as SCALAR_INT_MODE_P reintroduce the original problem?  If not,
I think doing that would make things more self-consistent.

If we don't need to be able to represent any value as a constant,
which values do we need?  Is all-zeroes enough?  If so, maybe it
would be better to have a new CONST_… rtx code.

> diff --git a/gcc/ira.c b/gcc/ira.c
> index 050405f1833..d7a0482d121 100644
> --- a/gcc/ira.c
> +++ b/gcc/ira.c
> @@ -4666,7 +4666,8 @@ find_moveable_pseudos (void)
>   || !DF_REF_INSN_INFO (def)
>   || HARD_REGISTER_NUM_P (regno)
>   || DF_REG_EQ_USE_COUNT (regno) > 0
> - || (!INTEGRAL_MODE_P (mode) && !FLOAT_MODE_P (mode)))
> + || (!INTEGRAL_MODE_P (mode) && !FLOAT_MODE_P (mode)
> + && !OPAQUE_MODE_P (mode)))

Nit: should be one && per line now that it doesn't fit on a single line.

> diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
> index 6ac97fe39c4..1fb21635a3b 100644
> --- a/gcc/tree-ssanames.c
> +++ b/gcc/tree-ssanames.c
> @@ -516,6 +516,9 @@ get_nonzero_bits (const_tree name)
>if (TREE_CODE (name) == INTEGER_CST)
>  return wi::to_wide (name);
>  
> +  if (OPAQUE_TYPE_P (TREE_TYPE (name)))
> +return wi::shwi (-1, 1);
> +
>/* Use element_precision instead of TYPE_PRECISION so complex and
>   vector types get a non-zero precision.  */
>unsigned int precision = element_precision (TREE_TYPE (name));

Using a precision of 1 looks odd here.  What goes wrong if we use
the precision calculated by element_precision?

Does OPAQUE_TYPE have a valid TYPE_PRECISION (equal to the mode's
GET_MODE_BITSIZE)?  If not, I guess it probably should.

The new type needs to be documented in doc/generic.texi.

Thanks,
Richard

Re: [PATCH][RFC] Make mingw-w64 printf/scanf attribute alias to ms_printf/ms_scanf only for C89

2020-11-14 Thread Liu Hao via Gcc-patches

This is the third revision of my patch:

1. Two typos in the commit message have been fixed.
2. Support for `%a` and `%A` has been added. Documentation can be
   found on the same page in the commit message.
3. GCC will no longer warn about 'ISO C does not support the ‘L’
   ms_printf length modifier'. This was caused by mistaken array
   indices in `TARGET_OVERRIDES_FORMAT_INIT`.
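
For readers who have not hit the `ll` warnings described in the commit
message below, here is a minimal sketch of the kind of code affected: a
wrapper declared with GCC's `format(printf, ...)` attribute whose callers
use `%lld`. The names `format_message` and `show_count` are made up for
illustration and are not part of the patch. On non-mingw targets this
compiles without format warnings either way, so it shows the pattern only,
not the warning itself.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical wrapper.  The attribute asks GCC to check the format
// string (argument 1) against the variadic arguments (from argument 2).
__attribute__((format(printf, 1, 2)))
std::string format_message(const char *fmt, ...)
{
  char buf[128];
  va_list ap;
  va_start(ap, fmt);
  std::vsnprintf(buf, sizeof buf, fmt, ap);
  va_end(ap);
  return buf;
}

// `ll` is the length modifier this patch teaches ms_printf to accept.
std::string show_count(long long n)
{
  return format_message("count=%lld", n);
}
```

Calling `show_count(42)` formats the value through the checked wrapper.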


-- 
Best regards,
LH_Mouse
From f73d5dcdc61a5da1a4265b144739067b4ec40ec2 Mon Sep 17 00:00:00 2001
From: Liu Hao 
Date: Thu, 12 Nov 2020 22:20:29 +0800
Subject: [PATCH] gcc: Add `ll` and `L` length modifiers for `ms_printf`

Previous code abused `FMT_LEN_L` for the `I` modifier. As `L` is a
valid modifier for `f`, `e`, `g`, etc. and `I` has the same semantics
as the C99 `z` modifier, `FMT_LEN_z` is now used instead.

First, in the Microsoft ABI, type `long double` has the same layout as
type `double`, so `%Lg` behaves identically to `%g`. Users should pass
in `double`s instead of `long double`s, as GCC uses the 10-byte format.

Second, with a CRT that is recent enough (MSVCRT since Vista, MSVCR80,
UCRT, or mingw-w64 8.0), `printf`-family functions can handle the `ll`
length modifier correctly. This ability is assumed to be available
universally. A lot of libraries (such as libgomp) that use the
`format(printf, ...)` attribute used to suffer from warnings about
unknown format specifiers.

Reference: https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2008/tcxf1dw6(v=vs.90)
Reference: https://docs.microsoft.com/en-us/cpp/porting/visual-cpp-what-s-new-2003-through-2015#new-crt-features
Signed-off-by: Liu Hao 

gcc/:
* config/i386/msformat-c.c: Add more length modifiers
---
 gcc/config/i386/msformat-c.c  | 53 ++-
 gcc/testsuite/gcc.dg/format/ms_c99-printf-3.c | 22 +++-
 2 files changed, 49 insertions(+), 26 deletions(-)

diff --git a/gcc/config/i386/msformat-c.c b/gcc/config/i386/msformat-c.c
index 4ceec633a6e..085ac88789a 100644
--- a/gcc/config/i386/msformat-c.c
+++ b/gcc/config/i386/msformat-c.c
@@ -32,10 +32,11 @@ along with GCC; see the file COPYING3.  If not see
 static format_length_info ms_printf_length_specs[] =
 {
   { "h", FMT_LEN_h, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
-  { "l", FMT_LEN_l, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 },
+  { "l", FMT_LEN_l, STD_C89, "ll", FMT_LEN_ll, STD_C89, 0 },
+  { "L", FMT_LEN_L, STD_C89, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I32", FMT_LEN_l, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { "I64", FMT_LEN_ll, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
-  { "I", FMT_LEN_L, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
+  { "I", FMT_LEN_z, STD_EXT, NULL, FMT_LEN_none, STD_C89, 1 },
   { NULL, FMT_LEN_none, STD_C89, NULL, FMT_LEN_none, STD_C89, 0 }
 };
 
@@ -90,33 +91,35 @@ static const format_flag_pair ms_strftime_flag_pairs[] =
 static const format_char_info ms_print_char_table[] =
 {
   /* C89 conversion specifiers.  */
-  { "di",  0, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  T99_SST, BADLEN, BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "-wp0 +'",  "i",  NULL },
-  { "oxX", 0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, T99_ST, BADLEN, BADLEN, BADLEN, BADLEN,  BADLEN,  BADLEN }, "-wp0#", "i",  NULL },
-  { "u",   0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, T99_ST, BADLEN, BADLEN, BADLEN, BADLEN,  BADLEN,  BADLEN }, "-wp0'","i",  NULL },
-  { "fgG", 0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN }, "-wp0 +#'", "",   NULL },
-  { "eE",  0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN }, "-wp0 +#",  "",   NULL },
-  { "c",   0, STD_C89, { T89_I,   BADLEN,  T89_S,  T94_WI,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w","",   NULL },
-  { "s",   1, STD_C89, { T89_C,   BADLEN,  T89_S,  T94_W,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp",   "cR", NULL },
-  { "p",   1, STD_C89, { T89_V,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-w","c",  NULL },
-  { "n",   1, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  BADLEN,  BADLEN, BADLEN,  T99_IM,  BADLEN,  BADLEN,  BADLEN }, "",  "W",  NULL },
+  { "di",  0, STD_C89, { T89_I,   BADLEN,  T89_S,   T89_L,   T9L_LL,  BADLEN, T99_SST, BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0 +'",  "i",  NULL },
+  { "oxX", 0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, BADLEN, T99_ST,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0#","i",  NULL },
+  { "u",   0, STD_C89, { T89_UI,  BADLEN,  T89_US,  T89_UL,  T9L_ULL, BADLEN, T99_ST,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN }, "-wp0'","i",  NULL },
+  { "fgG", 0, STD_C89, { T89_D,   BADLEN,  BADLEN,  T99_D,   BADLEN,  T89_D,  BADLEN,  BADLEN,

Re: [PATCH] [libiberty] Fix write buffer overflow in cplus_demangle

2020-11-14 Thread Tim Rühsen


Hey,

On 13.11.20 05:45, Jeff Law wrote:
> On 11/29/19 12:15 PM, Tim Rühsen wrote:
> > * cplus-dem.c (ada_demangle): Correctly calculate the demangled
> >    size by using two passes.
>
> So I'm not sure why, but I can't get this patch to apply.  What's even
> more interesting is ada_demangle doesn't seem to have changed since 2010
> and even if I checkout a Nov 2019 trunk, I still can't apply the patch.
>
> I can see what you're doing with your patch (it's primarily introducing
> a loop where you count on the first pass and allocate on the second and
> re-indent all the necessary code), I'd prefer not to muck it up trying
> to apply by hand.
>
> Any chance you could update the patch so that it applies to the trunk?
> The review is done, so it should be able to go straight in.  If you have
> commit privs (I don't recall if you do or not), you can go ahead and
> commit it yourself.

Hm, sorry, I am a bit out of the loop currently. It would be awesome if
someone with more project knowledge could apply the patch.

From what I can see here, the patch was made on top of binutils-gdb
commit 3d18c3354209bd42361cb26ec611455cdf8b401b. Hope this helps.

> Sorry for the insane delays here.

That is how life goes ;-)
A delay is better than never.

Regards, Tim
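
For anyone skimming the thread, the "count on the first pass, allocate on
the second" idea mentioned above can be sketched roughly as follows. This
is not the libiberty code: the expansion rule (every '_' becomes "::") and
the names `expand` and `expand_two_pass` are invented purely for
illustration. The point is that one routine both measures and writes, so
the buffer size can never disagree with what is written into it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy expansion rule (hypothetical): every '_' in IN becomes "::", and
// everything else is copied.  When OUT is null only the length is
// computed; otherwise the expansion is written and NUL-terminated.
static std::size_t expand(const char *in, char *out)
{
  std::size_t n = 0;
  for (; *in != '\0'; ++in)
    {
      if (*in == '_')
        {
          if (out)
            {
              out[n] = ':';
              out[n + 1] = ':';
            }
          n += 2;
        }
      else
        {
          if (out)
            out[n] = *in;
          n += 1;
        }
    }
  if (out)
    out[n] = '\0';
  return n;
}

std::string expand_two_pass(const char *in)
{
  const std::size_t len = expand(in, nullptr);  // pass 1: count only
  std::vector<char> buf(len + 1);               // exact size, plus NUL
  expand(in, buf.data());                       // pass 2: write
  return std::string(buf.data(), len);
}
```

Because the size comes from the same loop that does the writing, adding a
new expansion case cannot silently outgrow a hand-computed bound.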


Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling

2020-11-14 Thread Mike Crowe via Gcc-patches

On Saturday 14 November 2020 at 00:17:22 +, Jonathan Wakely wrote:
> On 13/11/20 22:45 +, Jonathan Wakely wrote:
> > On 13/11/20 21:12 +, Jonathan Wakely wrote:
> > > On 13/11/20 20:29 +, Mike Crowe via Libstdc++ wrote:
> > > > On Friday 13 November 2020 at 17:25:22 +, Jonathan Wakely wrote:
> > > > > +  // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
> > > > > +  // as a timespec.
> > > > > +  struct timespec
> > > > > +  relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
> > > > > + time_t now_s, long now_ns)
> > > > > + time_t now_s, long now_ns)
> > > > > +  {
> > > > > +struct timespec rt;
> > > > > +
> > > > > +// Did we already time out?
> > > > > +if (now_s > abs_s.count())
> > > > > +  {
> > > > > + rt.tv_sec = -1;
> > > > > + return rt;
> > > > > +  }
> > > > > +
> > > > > +auto rel_s = abs_s.count() - now_s;
> > > > > +
> > > > > +// Avoid overflows
> > > > > +if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
> > > > > +  rel_s = __gnu_cxx::__int_traits<time_t>::__max;
> > > > > +else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
> > > > > +  rel_s = __gnu_cxx::__int_traits<time_t>::__min;
> > > > 
> > > > I may be missing something, but if the line above executes...
> > > > 
> > > > > +
> > > > > +// Convert the absolute timeout value to a relative timeout
> > > > > +rt.tv_sec = rel_s;
> > > > > +rt.tv_nsec = abs_ns.count() - now_ns;
> > > > > +if (rt.tv_nsec < 0)
> > > > > +  {
> > > > > + rt.tv_nsec += 1000000000;
> > > > > + --rt.tv_sec;
> > > > 
> > > > ...and so does this line above, then I think that we'll end up
> > > > underflowing. (Presumably rt.tv_sec will wrap round to being some time in
> > > > 2038 on most 32-bit targets.)
> > > 
> > > Ugh.
> > > 
> > > > I'm currently trying to persuade myself that this can actually happen and
> > > > if so work out how to come up with a test case for it.
> > > 
> > > Maybe something like:
> > > 
> > > auto d = chrono::floor<seconds>(system_clock::now().time_since_epoch()
> > >                                 - seconds(INT_MAX + 2LL));
> > > fut.wait_until(system_clock::time_point(d));
> > > 
> > > This will create a sys_time with a value that is slightly more than
> > > INT_MAX seconds before the current time, with a zero nanoseconds
> > 
> > Ah, but such a time will never reach the overflow because the first
> > thing that the new relative_timespec function does is:
> > 
> > if (now_s > abs_s.count())
> >   {
> > rt.tv_sec = -1;
> > return rt;
> >   }
> > 
> > So in fact we can never have a negative rel_s anyway.
> 
> Here's what I've pushed now, after testing on x86_64-linux.
> 
> 

> commit b8d36dcc917e8a06d8c20b9f5ecc920ed2b9e947
> Author: Jonathan Wakely 
> Date:   Fri Nov 13 20:57:15 2020
> 
> libstdc++: Remove redundant overflow check for futex timeout [PR 93456]
> 
> The relative_timespec function already checks for the case where the
> specified timeout is in the past, so the difference can never be
> negative. That means we don't need to check if it's more negative than
> the minimum time_t value.
> 
> libstdc++-v3/ChangeLog:
> 
> PR libstdc++/93456
> * src/c++11/futex.cc (relative_timespec): Remove redundant check
> for negative values.
> * testsuite/30_threads/future/members/wait_until_overflow.cc: 
> Moved to...
> * testsuite/30_threads/future/members/93456.cc: ...here.
> 
> diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
> index 15959cebee57..c008798318c9 100644
> --- a/libstdc++-v3/src/c++11/futex.cc
> +++ b/libstdc++-v3/src/c++11/futex.cc
> @@ -73,21 +73,23 @@ namespace
>   return rt;
>}
>  
> -auto rel_s = abs_s.count() - now_s;
> +const auto rel_s = abs_s.count() - now_s;
>  
> -// Avoid overflows
> +// Convert the absolute timeout to a relative timeout, without overflow.
>  if (rel_s > __int_traits<time_t>::__max) [[unlikely]]
> -  rel_s = __int_traits<time_t>::__max;
> -else if (rel_s < __int_traits<time_t>::__min) [[unlikely]]
> -  rel_s = __int_traits<time_t>::__min;
> -
> -// Convert the absolute timeout value to a relative timeout
> -rt.tv_sec = rel_s;
> -rt.tv_nsec = abs_ns.count() - now_ns;
> -if (rt.tv_nsec < 0)
>{
> - rt.tv_nsec += 1000000000;
> - --rt.tv_sec;
> + rt.tv_sec = __int_traits<time_t>::__max;
> + rt.tv_nsec = 999999999;
> +  }
> +else
> +  {
> + rt.tv_sec = rel_s;
> + rt.tv_nsec = abs_ns.count() - now_ns;
> + if (rt.tv_nsec < 0)
> +   {
> + rt.tv_nsec += 1000000000;
> + --rt.tv_sec;
> +   }
>}
>  
>  return rt;

LGTM. Thanks.

Mike.
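
For reference, the shape of the committed logic can be sketched in
standalone form as below. This is only an approximation under stated
assumptions: `rel_timespec` and `relative_timespec_sketch` are hypothetical
names, tv_sec is modelled as a 32-bit field to make the clamp visible, and
plain integers stand in for the chrono types used in futex.cc.

```cpp
#include <cstdint>
#include <limits>

// Stand-in for the kernel-facing timespec; the field widths are an
// assumption made for this sketch.
struct rel_timespec
{
  std::int32_t tv_sec;
  long tv_nsec;
};

// Relative time from (now_s, now_ns) to (abs_s, abs_ns).  A negative
// tv_sec means the deadline has already passed.  Assumes abs_s - now_s
// fits in 64 bits.
rel_timespec relative_timespec_sketch(std::int64_t abs_s, long abs_ns,
                                      std::int64_t now_s, long now_ns)
{
  rel_timespec rt{};
  if (now_s > abs_s)  // did we already time out?
    {
      rt.tv_sec = -1;
      return rt;
    }

  const std::int64_t rel_s = abs_s - now_s;  // never negative from here on
  const std::int64_t max_s = std::numeric_limits<std::int32_t>::max();
  if (rel_s > max_s)
    {
      // Saturate instead of overflowing the narrower tv_sec.
      rt.tv_sec = std::numeric_limits<std::int32_t>::max();
      rt.tv_nsec = 999999999;
    }
  else
    {
      rt.tv_sec = static_cast<std::int32_t>(rel_s);
      rt.tv_nsec = abs_ns - now_ns;
      if (rt.tv_nsec < 0)  // borrow one second
        {
          rt.tv_nsec += 1000000000;
          --rt.tv_sec;
        }
    }
  return rt;
}
```

Since the early return guarantees rel_s >= 0, only the upper clamp is
needed, which is the observation that made the lower-bound check redundant.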

Re: [committed] libstdc++: Use custom timespec in system calls [PR 93421]

2020-11-14 Thread Mike Crowe via Gcc-patches

On Saturday 14 November 2020 at 00:17:59 +, Jonathan Wakely via Libstdc++ 
wrote:
> On 32-bit targets where userspace has switched to 64-bit time_t, we
> cannot pass struct timespec to SYS_futex or SYS_clock_gettime, because
> the userspace definition of struct timespec will not match what the
> kernel expects.
> 
> We use the existence of the SYS_futex_time64 or SYS_clock_gettime_time64
> macros to imply that userspace *might* have switched to the new timespec
> definition.  This is a conservative assumption. It's possible that the
> new syscall numbers are defined in the libc headers but that timespec
> hasn't been updated yet (as is the case for glibc currently). But using
> the alternative struct with two longs is still OK, it's just redundant
> if userspace timespec still uses a 32-bit time_t.
> 
> We also check that SYS_futex_time64 != SYS_futex so that we don't try
> to use a 32-bit tv_sec on modern targets that only support the 64-bit
> system calls and define the old macro to the same value as the new one.
> 
> We could possibly check #ifdef __USE_TIME_BITS64 to see whether
> userspace has actually been updated, but it's not clear if user code
> is meant to inspect that or if it's only for libc internal use.
> 
> libstdc++-v3/ChangeLog:
> 
>   PR libstdc++/93421
>   * src/c++11/chrono.cc [_GLIBCXX_USE_CLOCK_GETTIME_SYSCALL]
>   (syscall_timespec): Define a type suitable for SYS_clock_gettime
>   calls.
>   (system_clock::now(), steady_clock::now()): Use syscall_timespec
>   instead of timespec.
>   * src/c++11/futex.cc (syscall_timespec): Define a type suitable
>   for SYS_futex and SYS_clock_gettime calls.
>   (relative_timespec): Use syscall_timespec instead of timespec.
>   (__atomic_futex_unsigned_base::_M_futex_wait_until): Likewise.
>   (__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
>   Likewise.
> 
> Tested x86_64-linux, -m32 too. Committed to trunk.
> 

> commit 4d039cb9a1d0ffc6910fe09b726c3561b64527dc
> Author: Jonathan Wakely 
> Date:   Thu Sep 24 17:35:52 2020
> 
> libstdc++: Use custom timespec in system calls [PR 93421]
> 
> On 32-bit targets where userspace has switched to 64-bit time_t, we
> cannot pass struct timespec to SYS_futex or SYS_clock_gettime, because
> the userspace definition of struct timespec will not match what the
> kernel expects.
> 
> We use the existence of the SYS_futex_time64 or SYS_clock_gettime_time64
> macros to imply that userspace *might* have switched to the new timespec
> definition.  This is a conservative assumption. It's possible that the
> new syscall numbers are defined in the libc headers but that timespec
> hasn't been updated yet (as is the case for glibc currently). But using
> the alternative struct with two longs is still OK, it's just redundant
> if userspace timespec still uses a 32-bit time_t.
> 
> We also check that SYS_futex_time64 != SYS_futex so that we don't try
> to use a 32-bit tv_sec on modern targets that only support the 64-bit
> system calls and define the old macro to the same value as the new one.
> 
> We could possibly check #ifdef __USE_TIME_BITS64 to see whether
> userspace has actually been updated, but it's not clear if user code
> is meant to inspect that or if it's only for libc internal use.

Presumably this change is only good for the short term? We really want
to be calling the 64-bit time_t versions of SYS_futex and SYS_clock_gettime
passing 64-bit struct timespec so that this code continues to work
correctly after 2038 (for CLOCK_REALTIME) or if someone is unlucky enough
to have a system uptime of over 68 years (for CLOCK_MONOTONIC.) Perhaps
that's part of the post-GCC11 work that you plan to do.
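
As a back-of-the-envelope check of the 2038 and 68-year figures, assuming a
signed 32-bit count of seconds and an average Gregorian year of 365.2425
days (the names below are only for this sketch):

```cpp
#include <cstdint>

// Seconds in an average Gregorian year (365.2425 days).
constexpr double seconds_per_year = 365.2425 * 24 * 60 * 60;

// Whole years that fit in a signed 32-bit count of seconds.
constexpr int years_in_int32_seconds()
{
  return static_cast<int>(INT32_MAX / seconds_per_year);
}

// 2147483647 s / 31556952 s per year is roughly 68.05 years.
static_assert(years_in_int32_seconds() == 68,
              "a 32-bit tv_sec covers about 68 years");
```

Counting from the 1970 epoch gives 2038 for CLOCK_REALTIME, and the same
span measured from boot gives the uptime limit for CLOCK_MONOTONIC.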

(There's another comment on the patch itself below.)

> libstdc++-v3/ChangeLog:
> 
> PR libstdc++/93421
> * src/c++11/chrono.cc [_GLIBCXX_USE_CLOCK_GETTIME_SYSCALL]
> (syscall_timespec): Define a type suitable for SYS_clock_gettime
> calls.
> (system_clock::now(), steady_clock::now()): Use syscall_timespec
> instead of timespec.
> * src/c++11/futex.cc (syscall_timespec): Define a type suitable
> for SYS_futex and SYS_clock_gettime calls.
> (relative_timespec): Use syscall_timespec instead of timespec.
> (__atomic_futex_unsigned_base::_M_futex_wait_until): Likewise.
> (__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
> Likewise.
> 
> diff --git a/libstdc++-v3/src/c++11/chrono.cc 
> b/libstdc++-v3/src/c++11/chrono.cc
> index 723f3002d11a..f10be7d8c073 100644
> --- a/libstdc++-v3/src/c++11/chrono.cc
> +++ b/libstdc++-v3/src/c++11/chrono.cc
> @@ -35,6 +35,17 @@
>  #ifdef _GLIBCXX_USE_CLOCK_GETTIME_SYSCALL
>  #include 
>  #include 
> +
> +# if defined(SYS_clock_gettime_time64) \
> +  && SYS_clock_gettime_time64 != SY

Re: [committed] libstdc++: Use custom timespec in system calls [PR 93421]

2020-11-14 Thread Jonathan Wakely via Gcc-patches

On Sat, 14 Nov 2020, 13:30 Mike Crowe via Libstdc++, 
wrote:

> On Saturday 14 November 2020 at 00:17:59 +, Jonathan Wakely via
> Libstdc++ wrote:
> > On 32-bit targets where userspace has switched to 64-bit time_t, we
> > cannot pass struct timespec to SYS_futex or SYS_clock_gettime, because
> > the userspace definition of struct timespec will not match what the
> > kernel expects.
> >
> > We use the existence of the SYS_futex_time64 or SYS_clock_gettime_time64
> > macros to imply that userspace *might* have switched to the new timespec
> > definition.  This is a conservative assumption. It's possible that the
> > new syscall numbers are defined in the libc headers but that timespec
> > hasn't been updated yet (as is the case for glibc currently). But using
> > the alternative struct with two longs is still OK, it's just redundant
> > if userspace timespec still uses a 32-bit time_t.
> >
> > We also check that SYS_futex_time64 != SYS_futex so that we don't try
> > to use a 32-bit tv_sec on modern targets that only support the 64-bit
> > system calls and define the old macro to the same value as the new one.
> >
> > We could possibly check #ifdef __USE_TIME_BITS64 to see whether
> > userspace has actually been updated, but it's not clear if user code
> > is meant to inspect that or if it's only for libc internal use.
> >
> > libstdc++-v3/ChangeLog:
> >
> >   PR libstdc++/93421
> >   * src/c++11/chrono.cc [_GLIBCXX_USE_CLOCK_GETTIME_SYSCALL]
> >   (syscall_timespec): Define a type suitable for SYS_clock_gettime
> >   calls.
> >   (system_clock::now(), steady_clock::now()): Use syscall_timespec
> >   instead of timespec.
> >   * src/c++11/futex.cc (syscall_timespec): Define a type suitable
> >   for SYS_futex and SYS_clock_gettime calls.
> >   (relative_timespec): Use syscall_timespec instead of timespec.
> >   (__atomic_futex_unsigned_base::_M_futex_wait_until): Likewise.
> >   (__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
> >   Likewise.
> >
> > Tested x86_64-linux, -m32 too. Committed to trunk.
> >
>
> > commit 4d039cb9a1d0ffc6910fe09b726c3561b64527dc
> > Author: Jonathan Wakely 
> > Date:   Thu Sep 24 17:35:52 2020
> >
> > libstdc++: Use custom timespec in system calls [PR 93421]
> >
> > On 32-bit targets where userspace has switched to 64-bit time_t, we
> > cannot pass struct timespec to SYS_futex or SYS_clock_gettime, because
> > the userspace definition of struct timespec will not match what the
> > kernel expects.
> >
> > We use the existence of the SYS_futex_time64 or SYS_clock_gettime_time64
> > macros to imply that userspace *might* have switched to the new timespec
> > definition.  This is a conservative assumption. It's possible that the
> > new syscall numbers are defined in the libc headers but that timespec
> > hasn't been updated yet (as is the case for glibc currently). But using
> > the alternative struct with two longs is still OK, it's just redundant
> > if userspace timespec still uses a 32-bit time_t.
> >
> > We also check that SYS_futex_time64 != SYS_futex so that we don't try
> > to use a 32-bit tv_sec on modern targets that only support the 64-bit
> > system calls and define the old macro to the same value as the new one.
> >
> > We could possibly check #ifdef __USE_TIME_BITS64 to see whether
> > userspace has actually been updated, but it's not clear if user code
> > is meant to inspect that or if it's only for libc internal use.
>
> Presumably this change is only good for the short term? We really want
> to be calling the 64-bit time_t versions of SYS_futex and SYS_clock_gettime
> passing 64-bit struct timespec so that this code continues to work
> correctly after 2038 (for CLOCK_REALTIME) or if someone is unlucky enough
> to have a system uptime of over 68 years (for CLOCK_MONOTONIC.) Perhaps
> that's part of the post-GCC11 work that you plan to do.
>

Right. This is definitely a short term solution, but I ran out of time to
do something better for GCC 11.
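
For readers without the patch to hand, the type selection described in the
commit message can be sketched roughly like this. It is an approximation
written from the description only, not the committed code: the names
`syscall_timespec_sketch` and `make_syscall_timespec` are hypothetical, and
the `__has_include` guard exists only so the sketch also compiles where
<sys/syscall.h> is missing.

```cpp
#include <time.h>
#if defined(__has_include)
# if __has_include(<sys/syscall.h>)
#  include <sys/syscall.h>
# endif
#endif

#if defined(SYS_futex_time64) && SYS_futex_time64 != SYS_futex
// libc knows about the time64 syscall, so userspace timespec *might* have
// a 64-bit tv_sec, while the legacy SYS_futex still expects two longs.
struct syscall_timespec_sketch
{
  long tv_sec;
  long tv_nsec;
};
#else
// No separate time64 syscall number: plain timespec matches the kernel.
using syscall_timespec_sketch = ::timespec;
#endif

// Build the kernel-facing value from a seconds/nanoseconds pair.
syscall_timespec_sketch make_syscall_timespec(long s, long ns)
{
  syscall_timespec_sketch ts{};
  ts.tv_sec = s;
  ts.tv_nsec = ns;
  return ts;
}
```

On x86_64 SYS_futex_time64 is normally not defined, so the alias branch is
taken and nothing changes for that target.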


> (There's another comment on the patch itself below.)
>
> > libstdc++-v3/ChangeLog:
> >
> > PR libstdc++/93421
> > * src/c++11/chrono.cc [_GLIBCXX_USE_CLOCK_GETTIME_SYSCALL]
> > (syscall_timespec): Define a type suitable for SYS_clock_gettime
> > calls.
> > (system_clock::now(), steady_clock::now()): Use syscall_timespec
> > instead of timespec.
> > * src/c++11/futex.cc (syscall_timespec): Define a type suitable
> > for SYS_futex and SYS_clock_gettime calls.
> > (relative_timespec): Use syscall_timespec instead of timespec.
> > (__atomic_futex_unsigned_base::_M_futex_wait_until): Likewise.
> > (__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
> > Likewise.
> >
> > diff --git a/libstdc++-v3/src/c++11/chrono.c

[PATCH 1/2] middle-end : Initial scaffolding and definitions for SLP pattern matches

2020-11-14 Thread Tamar Christina via Gcc-patches

Hi All,

This patch adds the prerequisites and general scaffolding for supporting
SLP pattern matching.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* tree-vect-loop.c (vect_dissolve_slp_only_patterns): New.
(vect_dissolve_slp_only_groups): Call here.
* tree-vect-slp.c (vect_free_slp_tree, vect_create_new_slp_node): Export
from file.
(vect_build_slp_tree_2): Set vectype for externals.
(vect_print_slp_tree): Print SLP only patterns.
(optimize_load_redistribution_1, optimize_load_redistribution,
vect_match_slp_patterns_2, vect_match_slp_patterns): New.
(vect_analyze_slp): Call matcher.
* tree-vectorizer.c (vec_info::add_pattern_stmt): Save relevancy.
* tree-vectorizer.h (STMT_VINFO_SAVED_RELEVANT, vect_pop_relevancy,
vect_dissolve_pattern_relevancy, vect_save_relevancy,
vect_push_relevancy, vect_free_slp_tree, enum _complex_operation,
class vect_pattern): New.
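
The recursive walk in vect_dissolve_slp_only_patterns relies on a visited
set so that shared SLP nodes are processed only once (GCC's hash_set::add
returns true when the element was already present). A standalone sketch of
that idiom, using std::unordered_set in place of hash_set and a made-up
`node` type in place of slp_tree, might look like this:

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SLP node: just a list of children.
struct node
{
  std::vector<node *> children;
};

// Visit every node reachable from ROOT exactly once, even when children
// are shared between parents.  Returns how many nodes this call visited.
int walk_once(node *root, std::unordered_set<node *> &visited)
{
  // insert().second is false when ROOT was already in the set.
  if (!root || !visited.insert(root).second)
    return 0;

  int count = 1;  // per-node work would go here
  for (node *child : root->children)
    count += walk_once(child, visited);
  return count;
}

// Build a diamond (a -> b, a -> c, b -> d, c -> d) and count the visits.
int count_diamond_visits()
{
  node d, b, c, a;
  b.children = {&d};
  c.children = {&d};
  a.children = {&b, &c};
  std::unordered_set<node *> visited;
  return walk_once(&a, visited);
}
```

In the diamond, node d is reachable through both b and c but is counted
once, which is the property the dissolve step needs so that relevancy is
not popped twice for a shared node.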

--- inline copy of patch --

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 39b7319e8253c351a4f6fbdd8c154330f08f2b1b..791d9c6cb0649862a84fd3c80efc89fefedbb085 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -1979,6 +1979,61 @@ vect_get_datarefs_in_loop (loop_p loop, basic_block *bbs,
   return opt_result::success ();
 }
 
+/* For every SLP only pattern created by the pattern matched rooted in ROOT
+   restore the relevancy of the original statements over those of the pattern
+   and destroy the pattern relationship.  This restores the SLP tree to a state
+   where it can be used when SLP build is cancelled or re-tried.  */
+
+static void
+vect_dissolve_slp_only_patterns (loop_vec_info loop_vinfo,
+hash_set<slp_tree> *visited, slp_tree root)
+{
+  if (!root || visited->add (root))
+return;
+
+  unsigned int i;
+  slp_tree node;
+  stmt_vec_info related_stmt_info;
+  stmt_vec_info stmt_info = SLP_TREE_REPRESENTATIVE (root);
+
+  if (stmt_info && STMT_VINFO_SLP_VECT_ONLY (stmt_info))
+{
+  vect_pop_relevancy (stmt_info);
+  if ((related_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info)) != NULL)
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_NOTE, vect_location,
+"dissolving relevancy of %G",
+STMT_VINFO_STMT (stmt_info));
+ vect_dissolve_pattern_relevancy (related_stmt_info);
+   }
+}
+
+  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (root), i, node)
+vect_dissolve_slp_only_patterns (loop_vinfo, visited, node);
+}
+
+/* Lookup any SLP Only Pattern statements created by the SLP pattern matcher in
+   all slp_instances in LOOP_VINFO and undo the relevancy of statements such
+   that the original SLP tree before the pattern matching is used.  */
+
+static void
+vect_dissolve_slp_only_patterns (loop_vec_info loop_vinfo)
+{
+
+  unsigned int i;
+  hash_set<slp_tree> visited;
+
+  DUMP_VECT_SCOPE ("vect_dissolve_slp_only_patterns");
+
+  /* Unmark any SLP only patterns as relevant and restore the STMT_INFO of the
+ related instruction.  */
+  slp_instance instance;
+  FOR_EACH_VEC_ELT (LOOP_VINFO_SLP_INSTANCES (loop_vinfo), i, instance)
+vect_dissolve_slp_only_patterns (loop_vinfo, &visited,
+SLP_INSTANCE_TREE (instance));
+}
+
 /* Look for SLP-only access groups and turn each individual access into its own
group.  */
 static void
@@ -2583,6 +2638,9 @@ again:
   /* Ensure that "ok" is false (with an opt_problem if dumping is enabled).  */
   gcc_assert (!ok);
 
+  /* Dissolve any SLP patterns created by the SLP pattern matcher.  */
+  vect_dissolve_slp_only_patterns (loop_vinfo);
+
   /* Try again with SLP forced off but if we didn't do any SLP there is
  no point in re-trying.  */
   if (!slp)
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 0c065e835ad13ad32d222e2590e05ef56849c411..3be565a2e566e09a9e42d6c77ba402b9499b06b6 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -105,7 +105,7 @@ _slp_tree::~_slp_tree ()
 
 /* Recursively free the memory allocated for the SLP tree rooted at NODE.  */
 
-static void
+void
 vect_free_slp_tree (slp_tree node)
 {
   int i;
@@ -148,7 +148,7 @@ vect_free_slp_instance (slp_instance instance)
 
 /* Create an SLP node for SCALAR_STMTS.  */
 
-slp_tree
+static slp_tree
 vect_create_new_slp_node (slp_tree node,
  vec scalar_stmts, unsigned nops)
 {
@@ -165,7 +165,7 @@ vect_create_new_slp_node (slp_tree node,
 
 /* Create an SLP node for SCALAR_STMTS.  */
 
-static slp_tree
+slp_tree
 vect_create_new_slp_node (vec scalar_stmts, unsigned nops)
 {
   return vect_create_new_slp_node (new _slp_tree, scalar_stmts, nops);
@@ -1646,6 +1646,7 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
{
  slp_tree invnode = vect_create_new_slp_node (oprnd_info->ops);
  SLP_TREE

RE: [PATCH v2 14/16]Arm: Add NEON RTL patterns for Complex Addition, Multiply and FMA.

2020-11-14 Thread Tamar Christina via Gcc-patches

ping

> -Original Message-
> From: Gcc-patches  On Behalf Of Tamar
> Christina
> Sent: Friday, September 25, 2020 3:31 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; nd ;
> Ramana Radhakrishnan 
> Subject: [PATCH v2 14/16]Arm: Add NEON RTL patterns for Complex Addition,
> Multiply and FMA.
> 
> Hi All,
> 
> This adds implementation for the optabs for complex additions.  With this the
> following C code:
> 
>   void f90 (float complex a[restrict N], float complex b[restrict N],
>   float complex c[restrict N])
>   {
> for (int i=0; i < N; i++)
>   c[i] = a[i] + (b[i] * I);
>   }
> 
> generates
> 
>   f90:
> add r3, r2, #1600
>   .L2:
> vld1.32 {q8}, [r0]!
> vld1.32 {q9}, [r1]!
> vcadd.f32   q8, q8, q9, #90
> vst1.32 {q8}, [r2]!
> cmp r3, r2
> bne .L2
> bx  lr
> 
> 
> instead of
> 
>   f90:
> add r3, r2, #1600
>   .L2:
> vld2.32 {d24-d27}, [r0]!
> vld2.32 {d20-d23}, [r1]!
> vsub.f32  q8, q12, q11
> vadd.f32  q9, q13, q10
> vst2.32 {d16-d19}, [r2]!
> cmp r3, r2
> bne .L2
> bx  lr
> 
> 
> Bootstrapped Regtested on arm-none-linux-gnueabihf and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/arm/iterators.md (rot): Add UNSPEC_VCMLS,
> UNSPEC_VCMUL and
>   UNSPEC_VCMUL180.
>   (rot_op, rotsplit1, rotsplit2, fcmac1, VCMLA_OP, VCMUL_OP): New.
>   * config/arm/neon.md (cadd3,
> cml4,
>   cmul3): New.
>   * config/arm/unspecs.md (UNSPEC_VCMUL, UNSPEC_VCMUL180,
> UNSPEC_VCMLS,
>   UNSPEC_VCMLS180): New.
> 
> --
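
For readers unfamiliar with the #90 rotation in the quoted example:
multiplying b by I rotates it by 90 degrees, so a + b*I has real part
re(a) - im(b) and imaginary part im(a) + re(b). That is the pair of
operations the vsub/vadd listing performs on de-interleaved data and that
vcadd performs in one instruction. A small scalar sketch, with function
names invented for illustration:

```cpp
#include <complex>

// a + b*i computed with ordinary std::complex arithmetic.
std::complex<float> add_rot90_reference(std::complex<float> a,
                                        std::complex<float> b)
{
  return a + b * std::complex<float>(0.0f, 1.0f);
}

// The same value written per component:
// (re, im) of b*i is (-im(b), re(b)).
std::complex<float> add_rot90_split(std::complex<float> a,
                                    std::complex<float> b)
{
  return std::complex<float>(a.real() - b.imag(), a.imag() + b.real());
}

// True when both forms agree for the given components.
bool rot90_forms_agree(float ar, float ai, float br, float bi)
{
  const std::complex<float> a(ar, ai), b(br, bi);
  return add_rot90_reference(a, b) == add_rot90_split(a, b);
}
```

For a = 1 + 2i and b = 3 + 4i both forms give -3 + 5i.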

RE: [PATCH v2 15/16]Arm: Add MVE RTL patterns for Complex Addition, Multiply and FMA.

2020-11-14 Thread Tamar Christina via Gcc-patches

ping

> -Original Message-
> From: Gcc-patches  On Behalf Of Tamar
> Christina
> Sent: Friday, September 25, 2020 3:32 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; nd ;
> Ramana Radhakrishnan 
> Subject: [PATCH v2 15/16]Arm: Add MVE RTL patterns for Complex Addition,
> Multiply and FMA.
> 
> Hi All,
> 
> This adds implementation for the optabs for complex operations.  With this
> the following C code:
> 
>   void f90 (int _Complex a[restrict N], int _Complex b[restrict N],
>   int _Complex c[restrict N])
>   {
> for (int i=0; i < N; i++)
>   c[i] = a[i] + (b[i] * I);
>   }
> 
> generates
> 
>   .L3:
> mov r3, r0
> vldrw.32  q2, [r3]
> mov r3, r1
> vldrw.32  q1, [r3]
> mov r3, r2
> vcadd.i32   q3, q2, q1, #90
> adds r0, r0, #16
> vstrw.32  q3, [r3]
> adds r1, r1, #16
> adds r2, r2, #16
> le  lr, .L3
> pop {r4, r5, r6, r7, r8, pc}
> 
> which is not ideal due to register allocation and addressing mode issues with
> MVE in general.  However -frename-registers cleans up the register allocation:
> 
>   .L3:
> mov r5, r0
> mov r6, r1
> vldrw.32  q2, [r5]
> vldrw.32  q1, [r6]
> mov r7, r2
> vcadd.i32   q3, q2, q1, #90
> adds r0, r0, #16
> vstrw.32  q3, [r7]
> adds r1, r1, #16
> adds r2, r2, #16
> le  lr, .L3
> pop {r4, r5, r6, r7, r8, pc}
> 
> but leaves the addressing mode problems.
> 
> Before this patch it generated a scalar loop
> 
>   .L2:
> ldr r7, [r0, r3, lsl #2]
> ldr r5, [r6, r3, lsl #2]
> ldr r4, [r1, r3, lsl #2]
> subs r5, r7, r5
> ldr r7, [lr, r3, lsl #2]
> add r4, r4, r7
> str r5, [r2, r3, lsl #2]
> str r4, [ip, r3, lsl #2]
> adds r3, r3, #2
> cmp r3, #200
> bne .L2
> pop {r4, r5, r6, r7, pc}
> 
> 
> 
> Bootstrapped Regtested on arm-none-linux-gnueabihf and no issues.
> Cross compiled arm-none-eabi and ran with -march=armv8.1-
> m.main+mve.fp -mfloat-abi=hard -mfpu=auto and regression is on-going.
> 
> Unfortunately MVE does not currently implement auto-vectorization of
> floating point values.  As such I cannot test this directly.  But since they
> share 90% of the code with NEON these should just work whenever support is
> added so I would still like to commit these.
> 
> To support this I had to refactor the MVE bits a bit.  This now uses the same
> unspecs for both NEON and MVE and removes the unneeded different
> signed and unsigned unspecs since they both point to the signed instruction.
> 
> I have tried multiple approaches to cleaning this up but I think this is the
> nicest it can get given the slight ISA differences.
> 
> Ok for master if no issues?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm_mve.h (__arm_vcaddq_rot90_u8,
> __arm_vcaddq_rot270_u8,
>   , __arm_vcaddq_rot90_s8, __arm_vcaddq_rot270_s8,
>   __arm_vcaddq_rot90_u16, __arm_vcaddq_rot270_u16,
> __arm_vcaddq_rot90_s16,
>   __arm_vcaddq_rot270_s16, __arm_vcaddq_rot90_u32,
>   __arm_vcaddq_rot270_u32, __arm_vcaddq_rot90_s32,
>   __arm_vcaddq_rot270_s32, __arm_vcmulq_rot90_f16,
>   __arm_vcmulq_rot270_f16, __arm_vcmulq_rot180_f16,
>   __arm_vcmulq_f16, __arm_vcaddq_rot90_f16,
> __arm_vcaddq_rot270_f16,
>   __arm_vcmulq_rot90_f32, __arm_vcmulq_rot270_f32,
>   __arm_vcmulq_rot180_f32, __arm_vcmulq_f32,
> __arm_vcaddq_rot90_f32,
>   __arm_vcaddq_rot270_f32, __arm_vcmlaq_f16,
> __arm_vcmlaq_rot180_f16,
>   __arm_vcmlaq_rot270_f16, __arm_vcmlaq_rot90_f16,
> __arm_vcmlaq_f32,
>   __arm_vcmlaq_rot180_f32, __arm_vcmlaq_rot270_f32,
>   __arm_vcmlaq_rot90_f32): Update builtin calls.
>   * config/arm/arm_mve_builtins.def (vcaddq_rot90_u,
> vcaddq_rot270_u,
>   vcaddq_rot90_s, vcaddq_rot270_s, vcaddq_rot90_f,
> vcaddq_rot270_f,
>   vcmulq_f, vcmulq_rot90_f, vcmulq_rot180_f, vcmulq_rot270_f,
>   vcmlaq_f, vcmlaq_rot90_f, vcmlaq_rot180_f, vcmlaq_rot270_f):
> Removed.
>   (vcaddq_rot90, vcaddq_rot270, vcmulq, vcmulq_rot90,
> vcmulq_rot180,
>   vcmulq_rot270, vcmlaq, vcmlaq_rot90, vcmlaq_rot180,
> vcmlaq_rot270):
>   New.
>   * config/arm/constraints.md (Dz): Include MVE.
>   * config/arm/iterators.md (mve_rotsplit1, mve_rotsplit2): New.
>   * config/arm/mve.md (VCADDQ_ROT270_S, VCADDQ_ROT90_S,
> VCADDQ_ROT270_U,
>   VCADDQ_ROT90_U, VCADDQ_ROT270_F, VCADDQ_ROT90_F,
> VCMULQ_F,
>   VCMULQ_ROT180_F, VCMULQ_ROT270_F, VCMULQ_ROT90_F,
> VCMLAQ_F,
>   VCMLAQ_ROT180_F, VCMLAQ_ROT90_F, VCMLAQ_ROT270_F,
> VCADDQ_ROT270_S,
>   VCADDQ_ROT270, VCADDQ_ROT90): Removed.
>   (mve_rot, VCMUL): New.
>   (mve_vcaddq_rot270_ mve_vcaddq_rot90_,
>   mve_vcaddq_rot270_f, mve_vc

RE: [PATCH v2 10/16]AArch64: Add NEON RTL patterns for Complex Addition, Multiply and FMA.

2020-11-14 Thread Tamar Christina via Gcc-patches

ping

> -Original Message-
> From: Gcc-patches  On Behalf Of Tamar
> Christina
> Sent: Friday, September 25, 2020 3:30 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; nd ;
> Marcus Shawcroft 
> Subject: [PATCH v2 10/16]AArch64: Add NEON RTL patterns for Complex
> Addition, Multiply and FMA.
> 
> Hi All,
> 
> This adds implementation for the optabs for complex operations.  With this
> the following C code:
> 
>   void f90 (float complex a[restrict N], float complex b[restrict N],
>   float complex c[restrict N])
>   {
> for (int i=0; i < N; i++)
>   c[i] = a[i] + (b[i] * I);
>   }
> 
> generates
> 
>   f90:
> mov x3, 0
> .p2align 3,,7
>   .L2:
> ldr q0, [x0, x3]
> ldr q1, [x1, x3]
> fcadd   v0.4s, v0.4s, v1.4s, #90
> str q0, [x2, x3]
> add x3, x3, 16
> cmp x3, 1600
> bne .L2
> ret
> 
> instead of
> 
>   f90:
> add x3, x1, 1600
> .p2align 3,,7
>   .L2:
> ld2 {v4.4s - v5.4s}, [x0], 32
> ld2 {v2.4s - v3.4s}, [x1], 32
> fsub v0.4s, v4.4s, v3.4s
> fadd v1.4s, v5.4s, v2.4s
> st2 {v0.4s - v1.4s}, [x2], 32
> cmp x3, x1
> bne .L2
> ret
> 
> It defines a new iterator VALL_ARITH which contains types for which we can
> do general arithmetic (excludes bfloat16).
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-simd.md (cadd3,
>   cml4, cmul3): New.
>   * config/aarch64/iterators.md (VALL_ARITH, UNSPEC_FCMUL,
>   UNSPEC_FCMUL180, UNSPEC_FCMLS, UNSPEC_FCMLS180,
> UNSPEC_CMLS,
>   UNSPEC_CMLS180, UNSPEC_CMUL, UNSPEC_CMUL180, FCMLA_OP,
> FCMUL_OP, rot_op,
>   rotsplit1, rotsplit2, fcmac1): New.
>   (rot): Add UNSPEC_FCMLS, UNSPEC_FCMUL, UNSPEC_FCMUL180.
> 
> --

RE: [PATCH v2 12/16]AArch64: Add SVE2 Integer RTL patterns for Complex Addition, Multiply and FMA.

2020-11-14 Thread Tamar Christina via Gcc-patches

ping

> -Original Message-
> From: Gcc-patches  On Behalf Of Tamar
> Christina
> Sent: Friday, September 25, 2020 3:31 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; nd ;
> Marcus Shawcroft 
> Subject: [PATCH v2 12/16]AArch64: Add SVE2 Integer RTL patterns for
> Complex Addition, Multiply and FMA.
> 
> Hi All,
> 
> This adds implementation for the optabs for complex operations.  With this
> the following C code:
> 
>   void f90 (int _Complex a[restrict N], int _Complex b[restrict N],
>   int _Complex c[restrict N])
>   {
> for (int i=0; i < N; i++)
>   c[i] = a[i] + (b[i] * I);
>   }
> 
> generates
> 
>   f90:
> mov x3, 0
> mov x4, 200
> whilelo p0.s, xzr, x4
> .p2align 3,,7
>   .L2:
> ld1w z0.s, p0/z, [x0, x3, lsl 2]
> ld1w z1.s, p0/z, [x1, x3, lsl 2]
> cadd z0.s, z0.s, z1.s, #90
> st1w z0.s, p0, [x2, x3, lsl 2]
> incw x3
> whilelo p0.s, x3, x4
> b.any   .L2
> ret
> 
> instead of
> 
>   f90:
> mov x3, 0
> mov x4, 0
> mov w5, 100
> whilelo p0.s, wzr, w5
> .p2align 3,,7
>   .L2:
> ld2w {z4.s - z5.s}, p0/z, [x0, x3, lsl 2]
> ld2w {z2.s - z3.s}, p0/z, [x1, x3, lsl 2]
> sub z0.s, z4.s, z3.s
> add z1.s, z5.s, z2.s
> st2w {z0.s - z1.s}, p0, [x2, x3, lsl 2]
> incw x4
> inch x3
> whilelo p0.s, w4, w5
> b.any   .L2
> ret
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-sve2.md (cadd3,
>   cml4, cmul3): New.
>   * config/aarch64/iterators.md (SVE2_INT_CMLA_OP,
> SVE2_INT_CMUL_OP,
>   SVE2_INT_CADD_OP): New.
> 
> --

RE: [PATCH v2 11/16]AArch64: Add SVE RTL patterns for Complex Addition, Multiply and FMA.

2020-11-14 Thread Tamar Christina via Gcc-patches

ping

> -Original Message-
> From: Gcc-patches  On Behalf Of Tamar
> Christina
> Sent: Friday, September 25, 2020 3:30 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; nd ;
> Marcus Shawcroft 
> Subject: [PATCH v2 11/16]AArch64: Add SVE RTL patterns for Complex
> Addition, Multiply and FMA.
> 
> Hi All,
> 
> This adds implementation for the optabs for complex operations.  With this
> the following C code:
> 
>   void f90 (float complex a[restrict N], float complex b[restrict N],
>   float complex c[restrict N])
>   {
> for (int i=0; i < N; i++)
>   c[i] = a[i] + (b[i] * I);
>   }
> 
> generates
> 
>   f90:
> mov x3, 0
> mov x4, 400
> ptrue   p1.b, all
> whilelo p0.s, xzr, x4
> .p2align 3,,7
>   .L2:
> ld1w z0.s, p0/z, [x0, x3, lsl 2]
> ld1w z1.s, p0/z, [x1, x3, lsl 2]
> fcadd   z0.s, p1/m, z0.s, z1.s, #90
> st1w z0.s, p0, [x2, x3, lsl 2]
> incw x3
> whilelo p0.s, x3, x4
> b.any   .L2
> ret
> 
> instead of
> 
>   f90:
> mov x3, 0
> mov x4, 0
> mov w5, 200
> whilelo p0.s, wzr, w5
> .p2align 3,,7
>   .L2:
> ld2w {z4.s - z5.s}, p0/z, [x0, x3, lsl 2]
> ld2w {z2.s - z3.s}, p0/z, [x1, x3, lsl 2]
> fsub z0.s, z4.s, z3.s
> fadd z1.s, z2.s, z5.s
> st2w {z0.s - z1.s}, p0, [x2, x3, lsl 2]
> incw x4
> inch x3
> whilelo p0.s, w4, w5
> b.any   .L2
> ret
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-sve.md (cadd3,
>   cml4, cmul3): New.
>   * config/aarch64/iterators.md (sve_rot1, sve_rot2): New.
> 
> --

Re: [PATCH v5 2/8] libstdc++ futex: Use FUTEX_CLOCK_REALTIME for wait

2020-11-14 Thread Mike Crowe via Gcc-patches

On Friday 13 November 2020 at 21:58:25 +, Mike Crowe via Libstdc++ wrote:
> On Thursday 12 November 2020 at 23:07:47 +, Jonathan Wakely wrote:
> > On 29/05/20 07:17 +0100, Mike Crowe via Libstdc++ wrote:
> > > The futex system call supports waiting for an absolute time if
> > > FUTEX_WAIT_BITSET is used rather than FUTEX_WAIT.  Doing so provides two
> > > benefits:
> > > 
> > > 1. The call to gettimeofday is not required in order to calculate a
> > >   relative timeout.
> > > 
> > > 2. If someone changes the system clock during the wait then the futex
> > >   timeout will correctly expire earlier or later.  Currently that only
> > >   happens if the clock is changed prior to the call to gettimeofday.
> > > 
> > > According to futex(2), support for FUTEX_CLOCK_REALTIME was added in the
> > > v2.6.28 Linux kernel and FUTEX_WAIT_BITSET was added in v2.6.25.  To ensure
> > > that the code still works correctly with earlier kernel versions, an ENOSYS
> > > error from futex[1] results in the futex_clock_realtime_unavailable flag
> > > being set.  This flag is used to avoid the unnecessary unsupported futex
> > > call in the future and to fall back to the previous gettimeofday and
> > > relative time implementation.
> > > 
> > > glibc applied an equivalent switch in pthread_cond_timedwait to use
> > > FUTEX_CLOCK_REALTIME and FUTEX_WAIT_BITSET rather than FUTEX_WAIT for
> > > glibc-2.10 back in 2009.  See
> > > glibc:cbd8aeb836c8061c23a5e00419e0fb25a34abee7
> > > 
> > > The futex_clock_realtime_unavailable flag is accessed using
> > > std::memory_order_relaxed to stop it becoming a bottleneck.  If the first
> > > two calls to _M_futex_wait_until happen simultaneously then the
> > > only consequence is that both will try to use FUTEX_CLOCK_REALTIME, both
> > > risk discovering that it doesn't work and, if so, both set the flag.
> > > 
> > > [1] This is how glibc's nptl-init.c determines whether these flags are
> > >supported.
> > > 
> > >   * libstdc++-v3/src/c++11/futex.cc: Add new constants for required
> > >   futex flags.  Add futex_clock_realtime_unavailable flag to store
> > >   result of trying to use FUTEX_CLOCK_REALTIME.
> > >   (__atomic_futex_unsigned_base::_M_futex_wait_until):
> > >   Try to use FUTEX_WAIT_BITSET with FUTEX_CLOCK_REALTIME and only
> > >   fall back to using gettimeofday and FUTEX_WAIT if that's not
> > >   supported.
> > 
> > Mike,
> > 
> > I've been doing some performance comparisons and this patch seems to
> > make quite a big difference to code that polls a future by calling
> > fut.wait_until(t) using any t < now() as the timeout. For example,
> > fut.wait_until(chrono::system_clock::time_point{}) to wait until the
> > UNIX epoch.
> > 
> > With GCC 10 (or with the if (!futex_clock_realtime_unavailable.load(...)
> > > commented out) I see that polling takes < 100ns. With the change, it
> > takes 3000ns or more.
> > 
> > Now this is still far better than polling using fut.wait_for(0s) which
> > takes around 5ns due to the clock_gettime call, but I'm about to
> > fix that.
> > 
> > I'm not sure how important it is for wait_until(past) to be fast, but
> > the difference from 100ns to 3000ns seems significant. Do you see the
> > same kind of numbers? Is this just a property of the futex wait with
> > an absolute time?
> > 
> > N.B. using wait_until(system_clock::time_point::min()) or any other
> > time before the epoch doesn't work. The futex syscall returns EINVAL
> > which we don't check for. I'm about to fix that too.
> 
> I see similar behaviour. I suppose this is because the
> gettimeofday/clock_gettime system calls are in the VDSO and therefore
> usually much cheaper to call than the real system call SYS_futex.
> 
> If rather than bailing out early when the relative timeout is negative, I
> call the relative SYS_futex with rt.tv_sec = rt.tv_nsec = 0 then the
> wait_until call takes about ten times longer than when using the absolute
> SYS_futex. I can't really explain that.
> 
> Calling these functions with a time in the past is probably quite common if
> you calculate a single timeout for several operations in sequence. What's
> less clear is whether the performance matters that much when the return
> value indicates a timeout anyway.
> 
> If gettimeofday/clock_gettime are cheap enough then I suppose we can call
> them even in the absolute timeout case (losing benefit 1 above, which
> appears to not really exist) to get the improved performance for timeouts
> in the past whilst retaining the correct behaviour if the clock is warped
> that this patch addressed (benefit 2 above.)
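
As a rough illustration of the scheme discussed above, the following C sketch waits on a futex word until an absolute CLOCK_REALTIME time, remembers an ENOSYS result in a relaxed atomic flag, and falls back to a relative FUTEX_WAIT. It is not the libstdc++ code: the function name futex_wait_until is hypothetical, the fallback uses clock_gettime where the patch description mentions gettimeofday, and it is Linux-only.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Set once the kernel has rejected FUTEX_CLOCK_REALTIME with ENOSYS.
   Relaxed ordering is enough: a racing caller merely repeats the probe.  */
static atomic_bool futex_clock_realtime_unavailable;

/* Wait while *ADDR == VAL until the absolute CLOCK_REALTIME time ABS.
   Returns 0 if woken, otherwise an errno value such as ETIMEDOUT.  */
int
futex_wait_until (unsigned *addr, unsigned val, const struct timespec *abs)
{
  if (!atomic_load_explicit (&futex_clock_realtime_unavailable,
                             memory_order_relaxed))
    {
      if (syscall (SYS_futex, addr,
                   FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME, val, abs,
                   NULL, FUTEX_BITSET_MATCH_ANY) == 0)
        return 0;
      if (errno != ENOSYS)
        return errno;
      atomic_store_explicit (&futex_clock_realtime_unavailable, true,
                             memory_order_relaxed);
    }

  /* Fallback for old kernels: convert to a relative timeout.  */
  struct timespec now, rel;
  clock_gettime (CLOCK_REALTIME, &now);
  rel.tv_sec = abs->tv_sec - now.tv_sec;
  rel.tv_nsec = abs->tv_nsec - now.tv_nsec;
  if (rel.tv_nsec < 0)
    {
      rel.tv_nsec += 1000000000L;
      rel.tv_sec--;
    }
  if (rel.tv_sec < 0)
    return ETIMEDOUT;   /* Deadline already passed; skip the syscall.  */
  if (syscall (SYS_futex, addr, FUTEX_WAIT, val, &rel, NULL, 0) == 0)
    return 0;
  return errno;
}
```

Calling it with the UNIX epoch as the deadline reproduces the "wait until a time in the past" case being measured: the absolute path always enters the kernel, whereas the fallback path can return after a cheap clock read.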

I wrote the attached standalone program to measure the relative performance
of wait operations in the past (or with zero timeout in the relative case)
and ran it on a variety of machines. The results below are in nanoseconds:

|+-+--+---+--+-|
||  Kernel |futex | futex |

[r11-5029 Regression] FAIL: gcc.dg/guality/pr59776.c -Os -DPREVENT_OPTIMIZATION line pr59776.c:20 s2.f == 5.0 on Linux/x86_64

2020-11-14 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

520d5ad337eaa15860a5a964daf7ca46cf31c029 is the first bad commit
commit 520d5ad337eaa15860a5a964daf7ca46cf31c029
Author: Jan Hubicka 
Date:   Sat Nov 14 13:52:36 2020 +0100

Detect EAF flags in ipa-modref

caused

FAIL: gcc.dg/guality/pr59776.c   -O1  -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O1  -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O1  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O1  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O1  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s2.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O2  -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O2  -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O2  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O2  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O2  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s2.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line pr59776.c:17 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line pr59776.c:17 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line pr59776.c:20 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line pr59776.c:20 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line pr59776.c:20 s2.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line pr59776.c:17 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line pr59776.c:17 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line pr59776.c:20 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line pr59776.c:20 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line pr59776.c:20 s2.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O3 -g  -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O3 -g  -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O3 -g  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -O3 -g  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -O3 -g  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s2.f == 5.0
FAIL: gcc.dg/guality/pr59776.c  -Og -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c  -Og -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c  -Og -DPREVENT_OPTIMIZATION  line pr59776.c:17 s2.f == 0.0
FAIL: gcc.dg/guality/pr59776.c  -Og -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c  -Og -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c  -Og -DPREVENT_OPTIMIZATION  line pr59776.c:20 s2.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -Os  -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -Os  -DPREVENT_OPTIMIZATION  line pr59776.c:17 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -Os  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.f == 5.0
FAIL: gcc.dg/guality/pr59776.c   -Os  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s1.g == 6.0
FAIL: gcc.dg/guality/pr59776.c   -Os  -DPREVENT_OPTIMIZATION  line pr59776.c:20 s2.f == 5.0

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-5029/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr59776.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr59776.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr59776.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr59776.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at

[PATCH/RFC] Add GCC_EXTRA_DIAGNOSTIC_OUTPUT environment variable for fix-it hints

2020-11-14 Thread David Malcolm via Gcc-patches

GCC has had the ability to emit fix-it hints in machine-readable form
since GCC 7 via -fdiagnostics-parseable-fixits and
-fdiagnostics-generate-patch.

The former emits additional specially-formatted lines to stderr; the
option and its format were directly taken from a pre-existing option
in clang.

Ideally this could be used by IDEs so that the user can select specific
fix-it hints and have the IDE apply them to the user's source code
(perhaps turning them into clickable elements, perhaps with an
"Apply All" option, etc). Eclipse CDT has supported this option in
this way for a few years:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=497670

As a user of Emacs I would like Emacs to support such a feature.
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=25987 tracks supporting
GCC fix-it output in Emacs. The discussion there identifies some
issues with the existing option:

(a) columns in the output are specified as byte-offsets within the
line (for exact compatibility with the option in clang), whereas emacs
would prefer to consume them as what GCC 11 calls "display columns".
https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Message-Formatting-Options.html#index-fdiagnostics-column-unit

(b) injecting a command-line option into the build is a fiddly manual
step, varying between build systems. It's far easier for the
user if Emacs simply sets an environment variable when compiling,
GCC uses this to enable the option if it recognizes the value, and
the emacs compilation buffer decodes the additional lines of output
and adds appropriate widgets. In some ways it is a workaround for
not having a language server. Doing it this way means that for the
various combinations of older and newer GCC and older and newer Emacs
that a sufficiently modern combination of both can automatically
support the rich fix-it UI, whereas other combinations will either
not provide the envvar, or silently ignore it, gracefully doing
nothing extra.
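
To make the distinction in (a) concrete, here is a small C sketch that converts a 1-based byte column into a 1-based display column. It is a simplification under stated assumptions, not GCC's implementation: the function name byte_to_display_column is hypothetical, the input is assumed to be valid UTF-8, and every code point is assumed to be one column wide (real terminals also have wide and zero-width characters).

```c
/* Convert the 1-based byte column BYTE_COL in LINE to a 1-based
   display column.  Tabs advance to the next multiple of TABSTOP;
   each UTF-8 code point is assumed to occupy exactly one column.  */
int
byte_to_display_column (const char *line, int byte_col, int tabstop)
{
  int display = 0;   /* Columns consumed before BYTE_COL.  */
  for (int i = 0; i < byte_col - 1 && line[i] != '\0'; i++)
    {
      unsigned char ch = (unsigned char) line[i];
      if (ch == '\t')
        display += tabstop - display % tabstop;
      else if ((ch & 0xC0) != 0x80)   /* Skip UTF-8 continuation bytes.  */
        display++;
    }
  return display + 1;
}
```

For a line starting with a two-byte character followed by " = 3;", the '=' sits at byte column 4 but display column 3, which is the kind of mismatch an editor would otherwise have to correct for itself.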

Hence this patch adds a new GCC_EXTRA_DIAGNOSTIC_OUTPUT environment
variable to GCC which enables output of machine-parseable fix-it hints.

GCC_EXTRA_DIAGNOSTIC_OUTPUT=fixits-v1 is equivalent to the existing
-fdiagnostics-parseable-fixits option.

GCC_EXTRA_DIAGNOSTIC_OUTPUT=fixits-v2 is the same, but changes the
column output mode to "display columns" rather than bytes, as
required by Emacs.
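
A minimal C sketch of how such an environment variable could be decoded follows. The enum and function names are illustrative only (the ChangeLog below names enum diagnostics_extra_output_kind, but its enumerators are not shown here, so the ones used in this sketch are assumptions). Unrecognized values map to "none", matching the "silently ignore it" behaviour described in (b).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical enumerators; only the variable name and the two
   values "fixits-v1" and "fixits-v2" come from the description.  */
enum extra_output_kind
{
  EXTRA_OUTPUT_NONE,
  EXTRA_OUTPUT_FIXITS_V1,   /* Byte-offset columns.  */
  EXTRA_OUTPUT_FIXITS_V2    /* Display columns.  */
};

/* Decode VALUE, which may be NULL when the variable is unset.  */
enum extra_output_kind
parse_extra_output (const char *value)
{
  if (value == NULL)
    return EXTRA_OUTPUT_NONE;
  if (strcmp (value, "fixits-v1") == 0)
    return EXTRA_OUTPUT_FIXITS_V1;
  if (strcmp (value, "fixits-v2") == 0)
    return EXTRA_OUTPUT_FIXITS_V2;
  return EXTRA_OUTPUT_NONE;   /* Unknown values are ignored.  */
}

/* Read the setting from the process environment.  */
enum extra_output_kind
extra_output_from_environment (void)
{
  return parse_extra_output (getenv ("GCC_EXTRA_DIAGNOSTIC_OUTPUT"));
}
```

Keeping the parser separate from the getenv call makes the mapping easy to test without touching the environment.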

One remaining issue raised in that Emacs bug is the encoding of these
lines, and, indeed, the encoding of GCC's stderr in general:
currently we emit a mixture of bytes and UTF-8; I believe we emit
filenames as bytes, diagnostic messages as UTF-8, and quote source code
in the original encoding (PR other/93067 covers converting it to UTF-8 on
output). Currently this patch prints octal-escaped bytes for bytes
within filenames and replacement text that aren't printable (which
is what -fdiagnostics-parseable-fixits also does). Doing so at least
allows the common case of ASCII-encoded sources and filenames to work,
whilst allowing for future formats that address the encoding issues.
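
The octal escaping mentioned above can be sketched in C as follows. This is a simplified stand-in rather than the function in diagnostic.c: the name escape_for_fixit is hypothetical, "printable" is taken to mean the ASCII range 0x20 to 0x7e, and only backslash and double quote get special treatment besides the octal escapes.

```c
#include <stdio.h>
#include <string.h>

/* Copy TEXT into OUT (capacity CAP, including the terminating NUL),
   escaping backslash and double quote and writing any non-printable
   byte as a three-digit octal escape.  Returns the length written,
   or -1 if OUT is too small.  */
int
escape_for_fixit (const char *text, char *out, size_t cap)
{
  size_t used = 0;
  if (cap == 0)
    return -1;
  for (const unsigned char *p = (const unsigned char *) text; *p; p++)
    {
      char piece[5];   /* Longest piece is "\377" plus NUL.  */
      if (*p == '\\' || *p == '"')
        snprintf (piece, sizeof piece, "\\%c", *p);
      else if (*p >= 0x20 && *p <= 0x7e)
        snprintf (piece, sizeof piece, "%c", *p);
      else
        snprintf (piece, sizeof piece, "\\%03o", (unsigned) *p);
      size_t len = strlen (piece);
      if (used + len + 1 > cap)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }
  out[used] = '\0';
  return (int) used;
}
```

With this scheme pure-ASCII filenames pass through unchanged, while a UTF-8 encoded character shows up as one octal escape per byte.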

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Thoughts?
Dave

gcc/ChangeLog:
* diagnostic.c (diagnostic_initialize): Eliminate
parseable_fixits_p in favor of initializing extra_output_kind from
GCC_EXTRA_DIAGNOSTIC_OUTPUT.
(convert_column_unit): New function, split out from...
(diagnostic_converted_column): ...this.
(print_parseable_fixits): Add "column_unit" and "tabstop" params.
Use them to call convert_column_unit on the column values.
(diagnostic_report_diagnostic): Eliminate conditional on
parseable_fixits_p in favor of a switch statement on
extra_output_kind, passing the appropriate values to the new
params of print_parseable_fixits.
(selftest::test_print_parseable_fixits_none): Update for new
params of print_parseable_fixits.
(selftest::test_print_parseable_fixits_insert): Likewise.
(selftest::test_print_parseable_fixits_remove): Likewise.
(selftest::test_print_parseable_fixits_replace): Likewise.
(selftest::test_print_parseable_fixits_bytes_vs_display_columns):
New.
(selftest::diagnostic_c_tests): Call it.
* diagnostic.h (enum diagnostics_extra_output_kind): New.
(diagnostic_context::parseable_fixits_p): Delete field in favor
of...
(diagnostic_context::extra_output_kind): ...this new field.
* doc/invoke.texi (Environment Variables): Add
GCC_EXTRA_DIAGNOSTIC_OUTPUT.
* opts.c (common_handle_option): Update handling of
OPT_fdiagnostics_parseable_fixits for change to diagnostic_context
fields.

gcc/testsuite/ChangeLog:
*
gcc.dg/plugin/diagnostic-test-show-locus-GCC_EXTRA_DIAGNOSTIC_OUTPUT-fixits-v1.c:
New file.
*
gcc.dg/plugin/diagnostic-test-show-locus-GCC_EXTRA_DIAGNOSTIC_OUTPUT-fixits-v2.c:
New file.
* gcc.dg/plugin/plugin.exp (plugin_tes

Re: [PATCH] Remove vr_values::extract_range_builtin.

2020-11-14 Thread Aldy Hernandez via Gcc-patches

Any news on the latest snapshot? Can we remove the duplicate range built-in
code?

Aldy

On Thu, Nov 5, 2020, 22:43 Jeff Law  wrote:

>
> On 11/5/20 2:40 PM, Aldy Hernandez wrote:
> > I'll wait for the 11/01 snapshot to finish then.
>
> I'm worried that the 11/01 snapshot is going to generate so many
> failures that it may not be useful.  I'm not sure what's going on, but
> I'm getting a ton of what appear to be codegen correctness issues.
>
>
> jeff
>
>
>

Re: Ping: [PATCH] Ensure colorization doesn't corrupt multibyte sequences in diagnostics

2020-11-14 Thread Lewis Hyatt via Gcc-patches

On Fri, Nov 13, 2020 at 5:27 PM Jeff Law  wrote:
>
>
> On 1/14/20 5:05 PM, Lewis Hyatt wrote:
> > Hello-
> >
> > I thought I might ping this short patch please, just in case it may
> > make sense to include in GCC 10 along with the other UTF-8-related
> > fixes to diagnostics. Thanks!
> >
> > https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00915.html
>
> This is fine for the trunk.  Note that the changes to handle
> tabs/control bytes will require this patch to be updated.  It may be as
> simple as moving the c = dw.next_byte() statement up.
>
>
> Go ahead and do the necessary update and retest & repost the patch for
> archival purposes.  If you have commit privs, go ahead and commit the
> updated patch, else indicate in the patch repost that someone needs to
> apply it for you.
>
>
> Thanks for your patience,
>
> Jeff
>
>
> >> #1. diagnostic_show_locus() should be sure it will not corrupt output in
> >> this way, regardless of what ranges it is given to work with.
>
> Yes.
>
>
> >>
> >> #2. libcpp should probably generate a range that includes the whole UTF-8
> >> character. Actually in other ways the range seems not ideal, for example
> >> if an invalid character appears in the middle of the identifier, the
> >> diagnostic still points to the first byte of the identifier.
>
> Probably.  We haven't traditionally worried a lot about multibyte
> sequences, so I'm not surprised we're not handling them particularly well.
>
>
> >>
> >> The attached patch fixes #1. It's essentially a one-line change, plus a
> >> new selftest. Would you please have a look at it sometime? bootstrap
> >> and testsuite were done on linux x86-64.
> >>
> >> Other questions that I have:
> >>
> >> - I am not quite clear when a selftest is preferred vs a dejagnu test. In
> >>   this case I stuck with the selftest because color diagnostics don't seem
> >>   to work well with dg-error etc, and it didn't seem worth creating a new
> >>   plugin-based test like g++.dg/plugin just for this. (I also considered
> >>   using the existing g++.dg plugin, but it seems this test should run for
> >>   gcc as well.)
>
> It varies and there's cases that are fine in either and I suspect there
> are many tests in the dejagnu suite that would be better as selftests --
> selftests are a fairly new concept.
>
>
> The guidance I would give is the more a particular test is tied to the
> internals of the code, the more likely a selftest is the right
> approach.  The more the test needs an end-to-end run through passes of
> the compiler, the more it belongs in the dejagnu suite.
>
>
>
> >>
> >> - I wasn't sure if I should create a PR for an issue such as this, if
> >>   there is already a patch readily available. And if I did create a PR,
> >>   not sure if it's preferred to post the patch to gcc-patches, or as an
> >>   attachment to the PR.
>
> We still prefer patches to go to gcc-patches -- I personally don't troll
> BZ looking for attached patches.
>
>
> >>
> >> - Does it seem worth me looking into #2? I think the patch to address #1 is
> >>   appropriate in any case, because it handles generically all potential
> >>   cases where this may arise, but still perhaps the ranges coming out of
> >>   libcpp could be improved?
>
> I don't think it can hurt to look into the difficulty in addressing #2.
>
>
> jeff
>

Thanks very much for the detailed comments, that's all very useful to
me. This particular patch was subsumed by r11-2092, which added the
support for tab expansion, since this whole function was redone and
now handles multibyte correctly. Sorry I probably should have updated
the thread for this old patch in addition to mentioning in the new
one, to save you some time. I will try to take a look sometime at the
ranges that libcpp outputs too. Thanks again!

-Lewis
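
As an illustration of point #1 in the quoted discussion, the C sketch below widens a byte range so that it never starts or ends inside a UTF-8 multibyte sequence, which is the property a colorizing printer needs before inserting escape codes at the range boundaries. It is not the patch under review (that was subsumed by r11-2092, as noted above); the function name snap_to_utf8_boundaries and the half-open range convention are assumptions made for the example.

```c
#include <stddef.h>

/* True if B is a UTF-8 continuation byte (bit pattern 10xxxxxx).  */
static int
is_continuation (unsigned char b)
{
  return (b & 0xC0) == 0x80;
}

/* Widen the half-open byte range [*START, *END) within LINE (LEN bytes)
   so that neither boundary falls inside a multibyte sequence.  */
void
snap_to_utf8_boundaries (const char *line, size_t len,
                         size_t *start, size_t *end)
{
  /* Move the start back to the lead byte of its character.  */
  while (*start > 0 && *start < len
         && is_continuation ((unsigned char) line[*start]))
    --*start;
  /* Move the end forward past any trailing continuation bytes.  */
  while (*end < len && is_continuation ((unsigned char) line[*end]))
    ++*end;
}
```

A range covering only the first byte of a two-byte character is widened to cover both bytes, so color codes emitted at the boundaries cannot split the character.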

[pushed] testsuite, Objective-C : Amend PR23214 for Darwin11.

2020-11-14 Thread Iain Sandoe


Hi,

The test needs to use Object rather than NSObject on this and earlier
OS versions. Although the PR reports against the GNU runtime, we run
this on NeXT as well.

tested on x86_64-darwin11 and x86_64-darwin16
pushed to master,
thanks
Iain

gcc/testsuite/ChangeLog:

* objc.dg/pr23214.m: Use Object as the root object before
Darwin12 (and NSObject after).
---
 gcc/testsuite/objc.dg/pr23214.m | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/objc.dg/pr23214.m b/gcc/testsuite/objc.dg/pr23214.m
index 341a2837da5..56cdc025161 100644
--- a/gcc/testsuite/objc.dg/pr23214.m
+++ b/gcc/testsuite/objc.dg/pr23214.m
@@ -7,7 +7,7 @@

 #if defined (__NEXT_RUNTIME__) && defined(__OBJC2__) \
 && defined(__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__) \
-&& __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 1070
+&& __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 1080
 #include 
 #define OBJECT NSObject
 #else
--
2.24.1

[PATCH, v1] PR fortran/48958 - Add runtime diagnostics for SIZE intrinsic function

2020-11-14 Thread Harald Anlauf

Dear all,

here is a first version to check the status of ALLOCATABLE and POINTER
arguments to the SIZE intrinsic at runtime.

What it does not yet cover is situations like

  complex, allocatable :: z(:)
  print *, size (z% re)

Feedback, such as comments for improvement, is welcome.

As is, the patch regtests cleanly on x86_64-pc-linux-gnu.

Thanks,
Harald


PR fortran/48958 - Add runtime diagnostics for SIZE intrinsic function

Add code for runtime checking of status of ALLOCATABLE and POINTER
arguments to the SIZE intrinsic when -fcheck=pointer is specified.

gcc/fortran/ChangeLog:

* trans-intrinsic.c (gfc_conv_intrinsic_size): Generate runtime
checking code for status of argument.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr48958.f90: New test.

diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index e0afc10d105..d17b623924c 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -7929,6 +7929,35 @@ gfc_conv_intrinsic_size (gfc_se * se, gfc_expr * expr)
   && strcmp (e->ref->u.c.component->name, "_data") == 0)
 sym = e->symtree->n.sym;

+  if ((gfc_option.rtcheck & GFC_RTCHECK_POINTER)
+  && e
+  && (e->expr_type == EXPR_VARIABLE || e->expr_type == EXPR_FUNCTION))
+{
+  symbol_attribute attr;
+  char *msg;
+
+  attr = gfc_expr_attr (e);
+  if (attr.allocatable)
+	msg = xasprintf ("Allocatable argument '%s' is not allocated",
+			 e->symtree->n.sym->name);
+  else if (attr.pointer)
+	msg = xasprintf ("Pointer argument '%s' is not associated",
+			 e->symtree->n.sym->name);
+  else
+	goto end_arg_check;
+
+  argse.descriptor_only = 1;
+  gfc_conv_expr_descriptor (&argse, actual->expr);
+  tree temp = gfc_conv_descriptor_data_get (argse.expr);
+  tree cond = fold_build2_loc (input_location, EQ_EXPR,
+   logical_type_node, temp,
+   fold_convert (TREE_TYPE (temp),
+		 null_pointer_node));
+  gfc_trans_runtime_check (true, false, cond, &argse.pre, &e->where, msg);
+  free (msg);
+}
+ end_arg_check:
+
   argse.data_not_needed = 1;
   if (gfc_is_class_array_function (e))
 {
diff --git a/gcc/testsuite/gfortran.dg/pr48958.f90 b/gcc/testsuite/gfortran.dg/pr48958.f90
new file mode 100644
index 00000000000..2b109374f40
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr48958.f90
@@ -0,0 +1,25 @@
+! { dg-do run }
+! { dg-options "-fcheck=pointer -fdump-tree-original" }
+! { dg-shouldfail "Fortran runtime error: Allocatable argument 'a' is not allocated" }
+! { dg-output "At line 13 .*" }
+! PR48958 - Add runtime diagnostics for SIZE intrinsic function
+
+program p
+  integer :: n
+  integer,  allocatable :: a(:)
+  integer,  pointer :: b(:)
+  class(*), allocatable :: c(:)
+  integer   :: d(10)
+  print *, size (a)
+  print *, size (b)
+  print *, size (c)
+  print *, size (d)
+  print *, size (f(n))
+contains
+  function f (n)
+integer, intent(in) :: n
+real, allocatable   :: f(:)
+  end function f
+end
+
+! { dg-final { scan-tree-dump-times "_gfortran_runtime_error_at" 4 "original" } }

[PATCH] introduce --param max-object-size

2020-11-14 Thread Martin Sebor via Gcc-patches


GCC considers PTRDIFF_MAX - 1 to be the size of the largest object
so that the difference between a pointer to the byte just past its
end and the first one is no more than PTRDIFF_MAX.  This is too
liberal in LP64 on most systems because the size of the address
space is constrained to much less than that, both by the width
of the address bus for physical memory and by the practical
limitations of disk sizes for swap files.

I've been meaning to add a parameter to specify a lower size limit
to help detect more bugs due to excessive sizes in various function
calls (malloc, memcpy, etc.), and also to help better verify that
warnings use the limit correctly.

Attached is a patch that adds this parameter.  Testing it exposed
a few minor bugs in GCC, both in the AWK script that processes
parameter and option files, as well as in the warning code (and
in tests that exercise it).

Martin
Add new parameter max-object-size.

gcc/ChangeLog:

	* builtins.c (warn_string_no_nul): Print parameter when it's set.
	(maybe_warn_for_bound): Same.
	(compute_objsize_r): Use new parameter.
	* calls.c (alloc_max_size): Same.
	* gimple-ssa-sprintf.c (get_string_length): Same.
	* gimple-ssa-warn-restrict.c (maybe_diag_access_bounds):
	* opt-functions.awk (function var_type_struct): Test for Host_Wide_Int.
	* params.opt (max-object-size): New parameter.
	* tree-ssa-strlen.c (maybe_set_strlen_range): Use new parameter.
	(get_len_or_size): Same.
	* tree.c (max_object_size): Same.

gcc/testsuite/ChangeLog:

	* c-c++-common/Warray-bounds-2.c: Adjust to new parameter value.
	* c-c++-common/Warray-bounds-3.c: Same.
	* c-c++-common/Wrestrict.c: Same.
	* g++.dg/warn/Wplacement-new-size-5.C: Same.
	* gcc.dg/Wstringop-overflow-41.c: Same.
	* gcc.dg/Wstringop-overflow-50.c: Same.
	* gcc.dg/Wstringop-overflow-54.c: Same.
	* gcc.dg/Wstringop-overflow-62.c: Same.
	* gcc.dg/Wstringop-overread-2.c: Same.
	* gcc.dg/attr-nonstring-2.c: Same.
	* gcc.dg/attr-nonstring-3.c: Same.
	* gcc.dg/attr-nonstring-4.c: Same.
	* gcc.dg/strlenopt-40.c: Same.
	* gcc.dg/Wstringop-overflow-64.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index aad99da01c2..78ffad5ccab 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1127,6 +1122,7 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
 		 (unsigned long long) bndrng[1].to_uhwi ());
 }
 
+  bool max_exceed = false;
   const tree maxobjsize = max_object_size ();
   const wide_int maxsiz = wi::to_wide (maxobjsize);
   if (expr)
@@ -1134,7 +1130,8 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
   tree func = get_callee_fndecl (expr);
   if (bndrng)
 	{
-	  if (wi::ltu_p (maxsiz, bndrng[0]))
+	  max_exceed = wi::ltu_p (maxsiz, bndrng[0]);
+	  if (max_exceed)
 	warned = warning_at (loc, OPT_Wstringop_overread,
  "%K%qD specified bound %s exceeds "
  "maximum object size %E",
@@ -1165,7 +1162,8 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
 {
   if (bndrng)
 	{
-	  if (wi::ltu_p (maxsiz, bndrng[0]))
+	  max_exceed = wi::ltu_p (maxsiz, bndrng[0]);
+	  if (max_exceed)
 	warned = warning_at (loc, OPT_Wstringop_overread,
  "%qs specified bound %s exceeds "
  "maximum object size %E",
@@ -1195,8 +1193,16 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
 
   if (warned)
 {
-  inform (DECL_SOURCE_LOCATION (decl),
-	  "referenced argument declared here");
+  if (max_exceed)
+	{
+	  if (param_max_object_size < HOST_WIDE_INT_MAX)
+	inform (loc,
+		"set by %<%s %llu%>",
+		"--param max-object-size", param_max_object_size);
+	}
+  else
+	inform (DECL_SOURCE_LOCATION (decl),
+		"referenced argument declared here");
   TREE_NO_WARNING (arg) = 1;
   if (expr)
 	TREE_NO_WARNING (expr) = 1;
@@ -3955,7 +3961,9 @@ maybe_warn_for_bound (int opt, location_t loc, tree exp, tree func,
 {
   bool maybe = pad && pad->src.phi ();
 
-  if (tree_int_cst_lt (maxobjsize, bndrng[0]))
+  /* Set when the maximum object size has been exceeded.  */
+  const bool max_exceed = tree_int_cst_lt (maxobjsize, bndrng[0]);
+  if (max_exceed)
 	{
 	  if (bndrng[0] == bndrng[1])
 	warned = (func
@@ -4027,7 +4035,14 @@ maybe_warn_for_bound (int opt, location_t loc, tree exp, tree func,
 exp, bndrng[0], bndrng[1], size));
   if (warned)
 	{
-	  if (pad && pad->src.ref)
+	  if (max_exceed)
+	{
+	  if (param_max_object_size < HOST_WIDE_INT_MAX)
+		inform (loc,
+			"set by %<%s=%llu%>",
+			"--param max-object-size", param_max_object_size);
+	}
+	  else if (pad && pad->src.ref)
 	{
 	  if (DECL_P (pad->src.ref))
 		inform (DECL_SOURCE_LOCATION (pad->src.ref),
@@ -4043,7 +4058,10 @@ maybe_warn_for_bound (int opt, location_t loc, tree exp, tree func,
 }
 
   bool maybe = pad && pad->dst.phi ();
-  if (tree_int_cst_lt (maxobjsize, bndrng[0]))
+
+  /* Set when the maximum object size has been exceeded.  */
+  const bool max_exceed = tree_int

Re: [PATCH] Simplify testing symbol sections

2020-11-14 Thread David Edelsohn via Gcc-patches

> Jeffrey Law wrote:

> I worry a bit about the less common native targets -- aix, hpux and the
> like.  But testing them is too painful to contemplate these days.  I'm
> sure those with access to suitable hardware will chime in if something
> is amiss.

All of these testcases now fail on AIX with "no section detected".
One cannot XFAIL the scan, one must skip the entire test because of
the manner in which scan-assembler-symbol-section works.

And, Jeff, the "too painful to contemplate" snide comment is playing
the victim.  There are multiple AIX systems in the GNU Compile Farm
with instructions on how to bootstrap GCC on the system.  Jonathan
Wakely and others don't have a problem testing patches on AIX.

Other people are able to more thoroughly test patches.  As one of the
original GCC developers and a member of the GCC SC, this sets a poor
example of patch development and testing.  This patch should not have
been pushed and should be reverted until it can gracefully fail on
non-ELF targets.

Thanks, David

Re: [PATCH] Simplify testing symbol sections

2020-11-14 Thread Jeff Law via Gcc-patches



On 11/14/20 6:35 PM, David Edelsohn wrote:
>> Jeffrey Law wrote:
>> I worry a bit about the less common native targets -- aix, hpux and the
>> like.  But testing them is too painful to contemplate these days.  I'm
>> sure those with access to suitable hardware will chime in if something
>> is amiss.
> All of these testcases now fail on AIX with "no section detected".
> One cannot XFAIL the scan, one must skip the entire test because of
> the manner in which scan-assembler-symbol-section works.
>
> And, Jeff, the "too painful to contemplate" snide comment is playing
> the victim.  There are multiple AIX systems in the GNU Compile Farm
> with instructions on how to bootstrap GCC on the system.  Jonathan
> Wakely and others don't have a problem testing patches on AIX.
>
> Other people are able to more thoroughly test patches.  As one of the
> original GCC developers and a member of the GCC SC, this sets a poor
> example of patch development and testing.  This patch should not have
> been pushed and should be reverted until it can gracefully fail on
> non-ELF targets.

I've tried repeatedly through the years to use the compile farm to build
aix without success; at some point it's just no longer worth my time.  I
tested far more targets than is required by our policies and procedures
and I made a decision to move forward.


Let's just xfail the tests for aix and get on with our lives.


jeff



>
> Thanks, David
>

Re: [PATCH] Remove vr_values::extract_range_builtin.

2020-11-14 Thread Jeff Law via Gcc-patches



On 11/14/20 1:05 PM, Aldy Hernandez wrote:
> Any news on the latest snapshot? Can we remove the duplicate range
> built-in code?

11-08 looks real good, best we've had since mid-sept.


jeff

Re: [PATCH] Simplify testing symbol sections

2020-11-14 Thread David Edelsohn via Gcc-patches

On Sat, Nov 14, 2020 at 8:58 PM Jeff Law  wrote:
>
>
> On 11/14/20 6:35 PM, David Edelsohn wrote:
> >> Jeffrey Law wrote:
> >> I worry a bit about the less common native targets -- aix, hpux and the
> >> like.  But testing them is too painful to contemplate these days.  I'm
> >> sure those with access to suitable hardware will chime in if something
> >> is amiss.
> > All of these testcases now fail on AIX with "no section detected".
> > One cannot XFAIL the scan, one must skip the entire test because of
> > the manner in which scan-assembler-symbol-section works.
> >
> > And, Jeff, the "too painful to contemplate" snide comment is playing
> > the victim.  There are multiple AIX systems in the GNU Compile Farm
> > with instructions on how to bootstrap GCC on the system.  Jonathan
> > Wakely and others don't have a problem testing patches on AIX.
> >
> > Other people are able to more thoroughly test patches.  As one of the
> > original GCC developers and a member of the GCC SC, this sets a poor
> > example of patch development and testing.  This patch should not have
> > been pushed and should be reverted until it can gracefully fail on
> > non-ELF targets.
>
> I've tried repeatedly through the years to use the compile farm to build
> aix without success; at some point it's just no longer worth my time.  I
> tested far more targets than is required by our policies and procedures
> and I made a decision to move forward.
>
>
> Let's just xfail the tests for aix and get on with our lives.

As I wrote, XFAIL doesn't work with scanasm.

I actually have expanded scan-assembler-symbol-section to recognize
AIX CSECT sections and I am updating the tests to check for the
appropriate strings generated in AIX XCOFF.  It all just works.

It would have been nice if you and Matthew had given me and other less
common targets a heads up to test the patch.

Thanks, David

