Re: [PATCH][middle-end][i386][version 5]Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-gpr-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-29 Thread Uros Bizjak via Gcc-patches
On Thu, Oct 29, 2020 at 12:55 AM Qing Zhao  wrote:
>
> Hi,
>
> This is the 5th version of the implementation of patch -fzero-call-used-regs.
>
> The major changes compared to the previous version (4th version) are:
>
> 1. Documentation change per Richard’s suggestion;
> 2. Use namespace for zero_regs_code;
> 3. Add more general testing cases per Richard’s suggestion;
> 4. I386 part, ST/MM register sets clearing per Uros’s suggestion.
> 5. Add more i386 testing cases for ST/MM clearing per Uros’s suggestion.
> 6. Some minor style fixes.
>
> I have tested this new GCC on both x86 and arm64, no regression.
>
> Please let me know whether it’s ready for stage 1 gcc11?
>
> Thanks.
>
> Qing
>
> **The documentation (gcc.info):
> 'zero_call_used_regs ("CHOICE")'
>
>  The 'zero_call_used_regs' attribute causes the compiler to zero a
>  subset of all call-used registers at function return according to
>  CHOICE.  This is used to increase the program security by either
>  mitigating Return-Oriented Programming (ROP) or preventing
>  information leak through registers.
>
>  A "call-used" register is a register whose contents can be changed
>  by a function call; therefore, a caller cannot assume that the
>  register has the same contents on return from the function as it
>  had before calling the function.  Such registers are also called
>  "call-clobbered", "caller-saved", or "volatile".
>
>  In order to satisfy users with different security needs and control
>  the run-time overhead at the same time, GCC provides a flexible way
>  to choose the subset of the call-used registers to be zeroed.
>
>  The three basic values of CHOICE are:
>
> * 'skip' doesn't zero any call-used registers.
>
> * 'used' only zeros call-used registers that are used in the
>   function.  A "used" register is one whose content has been set
>   or referenced in the function.
>
> * 'all' zeros all call-used registers.
>
>  In addition to these three basic choices, it is possible to modify
>  'used' or 'all' as follows:
>
> * Adding '-gpr' restricts the zeroing to general-purpose
>   registers.
>
> * Adding '-arg' restricts the zeroing to registers that can
>   sometimes be used to pass function arguments.  This includes
>   all argument registers defined by the platform's calling
>   convention, regardless of whether the function uses those
>   registers for function arguments or not.
>
>  The modifiers can be used individually or together.  If they are
>  used together, they must appear in the order above.
>
>  The full list of CHOICEs is therefore:
>
> * 'skip' doesn't zero any call-used register.
>
> * 'used' only zeros call-used registers that are used in the
>   function.
>
> * 'all' zeros all call-used registers.
>
> * 'used-arg' only zeros used call-used registers that pass
>   arguments.
>
> * 'used-gpr' only zeros used call-used general purpose
>   registers.
>
> * 'used-gpr-arg' only zeros used call-used general purpose
>   registers that pass arguments.
>
> * 'all-gpr-arg' zeros all call-used general purpose registers
>   that pass arguments.
>
> * 'all-arg' zeros all call-used registers that pass arguments.
>
> * 'all-gpr' zeros all call-used general purpose registers.
>
>  Among this list, 'used-gpr-arg', 'used-arg', 'all-gpr-arg', and
>  'all-arg' are mainly used for ROP mitigation.
>
>  The default for the attribute is controlled by
>  '-fzero-call-used-regs'.
>
> '-fzero-call-used-regs=CHOICE'
>  Zero call-used registers at function return to increase the program
>  security by either mitigating Return-Oriented Programming (ROP) or
>  preventing information leak through registers.
>
>  The possible values of CHOICE are the same as for the
>  'zero_call_used_regs' attribute (*note Function Attributes::).  The
>  default is 'skip'.
>
>  You can control this behavior for a specific function by using the
>  function attribute 'zero_call_used_regs' (*note Function
>  Attributes::).
>
> **The changelog:
>
> gcc/ChangeLog:
>
> 2020-10-28  Qing Zhao  
> H.J.Lu  
>
> * common.opt: Add new option -fzero-call-used-regs
> * config/i386/i386.c (zero_call_used_regno_p): New function.
> (zero_call_used_regno_mode): Likewise.
> (zero_all_vector_registers): Likewise.
> (zero_all_st_registers): Likewise.
> (zero_all_mm_registers): Likewise.
> (ix86_zero_call_used_regs): Likewise.
> (TARGET_ZERO_CALL_USED_REGS): Define.
> * df-scan.c (df_epilogue_uses_p): New function.
> (df_get_exit_block_use_set): Replace EPILOGUE_USES with
> df_epilogue_uses_p.
> * df.h (df_epilogue_uses_p): Declare.
> * doc/ext
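
A minimal sketch of how the attribute documented above might be applied,
assuming a compiler that implements `zero_call_used_regs` (the patch targets
GCC 11). The names `ZERO_REGS` and `bytes_equal` are hypothetical and chosen
only for illustration; they are not part of the patch. The macro expands to
nothing where the attribute is unavailable.

```c
#include <stddef.h>

/* Fall back gracefully on compilers without __has_attribute. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* ZERO_REGS is a hypothetical helper macro, not something the patch adds. */
#if __has_attribute(zero_call_used_regs)
# define ZERO_REGS(choice) __attribute__((zero_call_used_regs(choice)))
#else
# define ZERO_REGS(choice)
#endif

/* Compare two buffers that may hold sensitive data.  With "used-gpr" the
   compiler is asked to zero the call-used general-purpose registers that
   this function used before it returns.  */
ZERO_REGS("used-gpr")
int bytes_equal(const unsigned char *a, const unsigned char *b, size_t n)
{
  unsigned char diff = 0;
  for (size_t i = 0; i < n; i++)
    diff |= (unsigned char)(a[i] ^ b[i]);
  return diff == 0;
}
```

According to the documentation quoted above, building the translation unit
with -fzero-call-used-regs=used-gpr would select the same choice as the
default for functions that carry no explicit attribute.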

Fwd: libstdc++: Attempt to resolve PR83562

2020-10-29 Thread Liu Hao via Gcc-patches
I forward it here for comments.

Based on the behavior of both GCC and Clang, `__cxa_thread_atexit` is used to
register the destructor of thread_local objects directly, suggesting the first
parameter should have the `__thiscall` convention.

libstdc++ used the default `__cdecl` convention, which caused crashes on
i686-w64-mingw32 (see PR83562). To my surprise, libcxxabi uses `__cdecl` too
[1], but I haven't heard of any relevant reports so far.

Original patch is attached in case you can't find it in gcc-patches.


[1]
https://github.com/llvm/llvm-project/blob/97b351a827677ebbedc10bfbce8ef8844c246553/libcxxabi/src/cxa_thread_atexit.cpp#L22





Forwarded message
Subject: Re: libstdc++: Attempt to resolve PR83562
Date: Tue, 27 Oct 2020 22:38:29 +0800
From: Liu Hao
To: Jason Merrill, GCC Patches

On 2020/10/8 22:56, Jason Merrill wrote:
> 
> Hmm, why isn't the mingw implementation used for all programs?  That would 
> avoid the bug.
> 

There was a little further discussion about this [1].

TL;DR: The mingw-w64 function is linked statically and is subject to
order-of-destruction issues.

Recently mingw-w64 has got its own `__cxa_thread_atexit()`, so libstdc++ no
longer exposes it. This patch for libstdc++ fixes the calling conventions for
destructors on i686 so that they match the mingw-w64 ones.


[1] https://github.com/msys2/MINGW-packages/issues/7096

[2] Below is a direct quote from #mingw-w64 on OFTC:
(lh_ideapad is me and wbs is Martin Storsjö.)

```
[14:29:32]  wbs, what was the rationale for the `__thiscall` 
convention for
`__cxa_thread_atexit()` and
`__cxa_atexit()` in our CRT? I suspect it is correct, but it is not specified 
anywhere in Itanium ABI.
[14:30:41]  In case of evidence for that, the GCC prototype (with 
default __cdecl)
should be wrong.
[14:31:10]  See this:  
https://github.com/msys2/MINGW-packages/issues/7096
[14:52:05]  lh_ideapad: itanium ABI doesn't really talk about windows 
things, but, the function
that is passed to
__cxa_thread_atexit is the object's destructor function, and when calling the 
destructor, which is
technically a member
function, it's done with the thiscall calling convention
[14:52:31]  lh_ideapad: example: https://godbolt.org/z/qbfWT1 (only clang 
as there's no
gcc-mingw there, but if you try
the same there you'll see the same thing)
[14:52:35]  Title: Compiler Explorer (at godbolt.org)
[14:52:58]  lh_ideapad: the destruct function shows that when calling 
__ZN7MyClassD1Ev, the
destructor, it passes the
object pointer in ecx, i.e. thiscall
[14:53:42]  lh_ideapad: and when adding the object to the cleanup list, 
the __ZN7MyClassD1Ev
function is passed
directly to ___cxa_thread_atexit, which then will need to call the function 
using the thiscall
convention
[14:59:54]  lh_ideapad: so yes, I would agree with your patch changing 
libsupc++ to use thiscall
[15:13:01]  gcc is doing the same thing with a wrong calling 
convention , leaving a
garbage value on
i686-w64-mingw32.
[15:13:38]  yup, so definite +1 on your libsupc++ patch for that
[15:14:00]  then how about `__cxa_atexit`?
[15:14:26]  I would say it should work the same, but gcc doesn't normally 
use that one, right?
[15:14:29]  it's not used by GCC (the standard `atexit()` is used).
[15:15:26]  clang has a flag -fuse-cxa-atexit, which makes it use 
cxa_atexit instead of atexit
[15:15:40]  I was a bit dubious on it.
[15:18:59]  GCC has `-fuse-cxa-atexit` too .  Let me check.
[15:18:59]  (I tested it), clang does use __cxa_atexit if the 
-fuse-cxa-atexit flag is used,
and then the dtor
(thiscall) is passed directly to __cxa_atexit, so that's +1 datapoint to that 
it should have
thiscall. gcc doesn't use
__cxa_atexit for i686 windows despite -fuse-cxa-atexit, so that's no points in 
either direction
[15:19:28]  both clang and gcc use a wrapper function that fixes the 
calling convention, when
using atexit at least
[15:20:22]  `-fuse-cxa-atexit` seems to have no effect on 
`i686-w64-mingw32-g++`.
[15:20:46]  exactly. so in practice it doesn't matter for gcc, but I think 
libsupc++ should
handle it the same
[15:23:11]  ok I will compose a new patch for both functions later 
today.
[15:23:13]  :)
[15:23:25]  \o/
[15:24:40]  then for the other issue that the user was posting about; I 
remember testing and
noticing that the
mingw-w64-crt implementation of __cxa_thread_atexit doesn't work with emutls, 
but in all of my
tests, it has been a
non-issue as it has ended up using the libsupc++ code instead
[15:50:50]  probably static linking is broken, so one must link 
against shared libstdc++.
[15:52:20]  it doesn't matter whether it is the CRT or libsupc++ 
implementation that is
linked statically.
[15:53:13]  it seems like current msys builds of libstdc++ doesn't include 
__cxa_thread_atexit
in libstdc++ at all. I'm
pretty sure I tested this back when I made the mingw version, but I'll 
investigate and try to
pinpoint what changed (did gcc
at some point start noticing that mingw-w64-crt contains it and stop providing 
the

Re: PING [PATCH] Enable GCC support for Intel Key Locker extension

2020-10-29 Thread Uros Bizjak via Gcc-patches
On Thu, Oct 29, 2020 at 7:52 AM Hongyu Wang  wrote:
>
> Hi Uros,
>
> > is there a reason to introduce all these (with corresponding changes)?
> > SSE options live in ISA bitmap, so it is kind of strange you need to
> > handle them in ISA2 bitmap. Option handling is not exactly my area,
> > please ask HJ to comment and review this part.
>
> As Hongtao said, this part is needed for new non-avx512 ISAs in ISA2 bitmap.
> Keylocker is the first one, and we do similar thing in AVX-VNNI.

Thanks for the explanation, LGTM for this part.

> > +  pat = gen_rtx_EQ (QImode, gen_rtx_REG (CCZmode, FLAGS_REG),
> > +const0_rtx);
> > +  emit_move_insn (target, pat);
> >
> > emit_insn (gen_rtx_SET (target, pat));
> >
>
> Changed.
>
> > +op1 = copy_to_suggested_reg (op1,
> > + gen_rtx_REG (V2DImode,
> > +  GET_SSE_REGNO (0)),
> > + V2DImode);
> > +
> > +xmm_regs[0] = op1;
> >
> > this is no better than:
> >
> > reg = gen_rtx_REG (V2DImode, GET_SSE_REGNO (0));
> > emit_move_insn (reg, op1)
>
> Changed.
>
> > +xmm_regs[0] = op1;
> > +for (i = 1; i < 3; i++)
> > +  xmm_regs[i] = gen_rtx_REG (V2DImode, GET_SSE_REGNO (i));
> >
> > The first line is dead code, copy_to_suggested reg generated (reg
> > xmm0) RTX for op1. Just use:
> >
> > for (i = 0; i < 3; i++)
> >   xmm_regs[i] = gen_rtx_REG (V2DImode, GET_SSE_REGNO (i));
> >
> > Similar comments for:
>
> All changed.
>
> > +  for (i = 0; i < 4; i++)
> > +{
> > +  tmp_unspec
> > += gen_rtx_UNSPEC_VOLATILE (V2DImode,
> > +   gen_rtvec (1, const0_rtx),
> > +   UNSPECV_ENCODEKEY256U32);
> >
> > Please move the above out of the loop.
> Sorry for such code. Changed.
>
> > Here lies the reason to use hard registers. "sse_reg_operand" is not
> > enough for register allocator. We have no constraint for first SSE
> > reg, so again
> > use hard registers here instead of operands 2 and 3: (reg:V2DI
> > XMM0_REG) and (reg:V2DI XMM1_REG).
>
> Changed for encodekey128u32, encodekey256u32 expander and
> patterns, adjust corresponding function call in i386-expand.c.
>
> Thanks for all the helpful comments. Updated patch.

The patch is OK for mainline.

(There are some cleanup opportunities, I'll commit them as a
no-functional-change patch in stage3.)

Thanks,
Uros.


Re: stdbool.h: Update true and false expansions for C2x

2020-10-29 Thread Richard Biener via Gcc-patches
On Thu, Oct 29, 2020 at 12:26 AM Joseph Myers  wrote:
>
> C2x has changed the expansions of the true and false macros in
> <stdbool.h> so that they have type _Bool (including in #if conditions,
> i.e. an unsigned type in that context).  Use the new expansions in
> GCC's <stdbool.h> for C2x.
>
> See bug 82272 for related discussion (but this patch does *not*
> implement the warning discussed there).
>
> Note that it's possible there may be a further change to make bool,
> true and false keywords (there was support in principle for that at
> the April WG14 meeting).  But currently these expansions of type _Bool
> are what C2x requires and there isn't actually a paper before WG14 at
> present that would introduce the new keywords.
>
> Bootstrapped with no regressions on x86_64-pc-linux-gnu.  OK to
> commit?

OK

> gcc/
> 2020-10-28  Joseph Myers  
>
> * ginclude/stdbool.h [__STDC_VERSION__ > 201710L] (true, false):
> Define with type _Bool.
>
> gcc/testsuite/
> 2020-10-28  Joseph Myers  
>
> * gcc.dg/c11-bool-1.c, gcc.dg/c2x-bool-1.c, gcc.dg/c99-bool-4.c:
> New tests.
>
> diff --git a/gcc/ginclude/stdbool.h b/gcc/ginclude/stdbool.h
> index 1b56498d96f..23554223d67 100644
> --- a/gcc/ginclude/stdbool.h
> +++ b/gcc/ginclude/stdbool.h
> @@ -31,8 +31,13 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  #ifndef __cplusplus
>
>  #define bool   _Bool
> +#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
> +#define true   ((_Bool)+1u)
> +#define false  ((_Bool)+0u)
> +#else
>  #define true   1
>  #define false  0
> +#endif
>
>  #else /* __cplusplus */
>
> diff --git a/gcc/testsuite/gcc.dg/c11-bool-1.c 
> b/gcc/testsuite/gcc.dg/c11-bool-1.c
> new file mode 100644
> index 000..0412624a706
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/c11-bool-1.c
> @@ -0,0 +1,50 @@
> +/* Test macro expansions in <stdbool.h> in C11.  */
> +/* { dg-do run } */
> +/* { dg-options "-std=c11 -pedantic-errors" } */
> +
> +#include <stdbool.h>
> +
> +#define str(x) xstr(x)
> +#define xstr(x) #x
> +
> +extern void abort (void);
> +extern void exit (int);
> +extern int strcmp (const char *, const char *);
> +
> +#if false - 1 >= 0
> +#error "false unsigned in #if"
> +#endif
> +
> +#if false != 0
> +#error "false not 0 in #if"
> +#endif
> +
> +#if true - 2 >= 0
> +#error "true unsigned in #if"
> +#endif
> +
> +#if true != 1
> +#error "true not 1 in #if"
> +#endif
> +
> +int
> +main (void)
> +{
> +  if (strcmp (str (bool), "_Bool") != 0)
> +abort ();
> +  if (_Generic (true, int : 1) != 1)
> +abort ();
> +  if (true != 1)
> +abort ();
> +  if (strcmp (str (true), "1") != 0)
> +abort ();
> +  if (_Generic (false, int : 1) != 1)
> +abort ();
> +  if (false != 0)
> +abort ();
> +  if (strcmp (str (false), "0") != 0)
> +abort ();
> +  if (strcmp (str (__bool_true_false_are_defined), "1") != 0)
> +abort ();
> +  exit (0);
> +}
> diff --git a/gcc/testsuite/gcc.dg/c2x-bool-1.c 
> b/gcc/testsuite/gcc.dg/c2x-bool-1.c
> new file mode 100644
> index 000..b64da1f7b43
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/c2x-bool-1.c
> @@ -0,0 +1,50 @@
> +/* Test macro expansions in <stdbool.h> in C2x.  */
> +/* { dg-do run } */
> +/* { dg-options "-std=c2x -pedantic-errors" } */
> +
> +#include <stdbool.h>
> +
> +#define str(x) xstr(x)
> +#define xstr(x) #x
> +
> +extern void abort (void);
> +extern void exit (int);
> +extern int strcmp (const char *, const char *);
> +
> +#if false - 1 < 0
> +#error "false signed in #if"
> +#endif
> +
> +#if false != 0
> +#error "false not 0 in #if"
> +#endif
> +
> +#if true - 2 < 0
> +#error "true signed in #if"
> +#endif
> +
> +#if true != 1
> +#error "true not 1 in #if"
> +#endif
> +
> +int
> +main (void)
> +{
> +  if (strcmp (str (bool), "_Bool") != 0)
> +abort ();
> +  if (_Generic (true, _Bool : 1) != 1)
> +abort ();
> +  if (true != 1)
> +abort ();
> +  if (strcmp (str (true), "((_Bool)+1u)") != 0)
> +abort ();
> +  if (_Generic (false, _Bool : 1) != 1)
> +abort ();
> +  if (false != 0)
> +abort ();
> +  if (strcmp (str (false), "((_Bool)+0u)") != 0)
> +abort ();
> +  if (strcmp (str (__bool_true_false_are_defined), "1") != 0)
> +abort ();
> +  exit (0);
> +}
> diff --git a/gcc/testsuite/gcc.dg/c99-bool-4.c 
> b/gcc/testsuite/gcc.dg/c99-bool-4.c
> new file mode 100644
> index 000..5cae18ad0ce
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/c99-bool-4.c
> @@ -0,0 +1,46 @@
> +/* Test macro expansions in <stdbool.h> in C99.  */
> +/* { dg-do run } */
> +/* { dg-options "-std=c99 -pedantic-errors" } */
> +
> +#include <stdbool.h>
> +
> +#define str(x) xstr(x)
> +#define xstr(x) #x
> +
> +extern void abort (void);
> +extern void exit (int);
> +extern int strcmp (const char *, const char *);
> +
> +#if false - 1 >= 0
> +#error "false unsigned in #if"
> +#endif
> +
> +#if false != 0
> +#error "false not 0 in #if"
> +#endif
> +
> +#if true - 2 >= 0
> +#error "true unsigned in #if"
> +#endif
> +
> +#if true != 1
> +#error "true not 1 in #if"
> +#endif
> +
> +int
> +main 
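
The difference between the two sets of expansions can be sketched without
relying on any particular <stdbool.h>. The macro names OLD_TRUE and NEW_TRUE
and both helper functions below are hypothetical stand-ins for illustration;
the expansions themselves are copied from the diff above. The sketch assumes
a C11 compiler, since it uses _Generic.

```c
/* Stand-ins for the pre-C2x and C2x expansions of "true" shown in the diff. */
#define OLD_TRUE 1
#define NEW_TRUE ((_Bool)+1u)

/* In #if, every identifier left after macro expansion (here _Bool) is
   replaced by 0, so NEW_TRUE becomes ((0)+1u), an unsigned value.
   Unsigned arithmetic never yields a negative result.  */
#if NEW_TRUE - 2 < 0
#error "NEW_TRUE is signed in #if"
#endif

/* The old expansion is a plain signed 1, so 1 - 2 is negative.  */
#if OLD_TRUE - 2 >= 0
#error "OLD_TRUE is unsigned in #if"
#endif

/* Outside the preprocessor the cast gives the new expansion type _Bool.  */
int new_true_is_bool(void)
{
  return _Generic(NEW_TRUE, _Bool: 1, default: 0);
}

/* The old expansion has type int.  */
int old_true_is_int(void)
{
  return _Generic(OLD_TRUE, int: 1, default: 0);
}
```

The unary plus and the `u` suffix are what keep the expansion usable in #if:
there the cast cannot be interpreted as a cast, and the expression degrades to
the unsigned constant arithmetic described in the comments.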

Re: PING [PATCH] Enable GCC support for Intel Key Locker extension

2020-10-29 Thread Hongyu Wang via Gcc-patches
Thanks for your review! I'll ask Hongtao to check-in the patch for me.

Uros Bizjak wrote on Thu, Oct 29, 2020 at 4:08 PM:
>
> On Thu, Oct 29, 2020 at 7:52 AM Hongyu Wang  wrote:
> >
> > Hi Uros,
> >
> > > is there a reason to introduce all these (with corresponding changes)?
> > > SSE options live in ISA bitmap, so it is kind of strange you need to
> > > handle them in ISA2 bitmap. Option handling is not exactly my area,
> > > please ask HJ to comment and review this part.
> >
> > As Hongtao said, this part is needed for new non-avx512 ISAs in ISA2 bitmap.
> > Keylocker is the first one, and we do similar thing in AVX-VNNI.
>
> Thanks for the explanation, LGTM for this part.
>
> > > +  pat = gen_rtx_EQ (QImode, gen_rtx_REG (CCZmode, FLAGS_REG),
> > > +const0_rtx);
> > > +  emit_move_insn (target, pat);
> > >
> > > emit_insn (gen_rtx_SET (target, pat));
> > >
> >
> > Changed.
> >
> > > +op1 = copy_to_suggested_reg (op1,
> > > + gen_rtx_REG (V2DImode,
> > > +  GET_SSE_REGNO (0)),
> > > + V2DImode);
> > > +
> > > +xmm_regs[0] = op1;
> > >
> > > this is no better than:
> > >
> > > reg = gen_rtx_REG (V2DImode, GET_SSE_REGNO (0));
> > > emit_move_insn (reg, op1)
> >
> > Changed.
> >
> > > +xmm_regs[0] = op1;
> > > +for (i = 1; i < 3; i++)
> > > +  xmm_regs[i] = gen_rtx_REG (V2DImode, GET_SSE_REGNO (i));
> > >
> > > The first line is dead code, copy_to_suggested reg generated (reg
> > > xmm0) RTX for op1. Just use:
> > >
> > > for (i = 0; i < 3; i++)
> > >   xmm_regs[i] = gen_rtx_REG (V2DImode, GET_SSE_REGNO (i));
> > >
> > > Similar comments for:
> >
> > All changed.
> >
> > > +  for (i = 0; i < 4; i++)
> > > +{
> > > +  tmp_unspec
> > > += gen_rtx_UNSPEC_VOLATILE (V2DImode,
> > > +   gen_rtvec (1, const0_rtx),
> > > +   UNSPECV_ENCODEKEY256U32);
> > >
> > > Please move the above out of the loop.
> > Sorry for such code. Changed.
> >
> > > Here lies the reason to use hard registers. "sse_reg_operand" is not
> > > enough for register allocator. We have no constraint for first SSE
> > > reg, so again
> > > use hard registers here instead of operands 2 and 3: (reg:V2DI
> > > XMM0_REG) and (reg:V2DI XMM1_REG).
> >
> > Changed for encodekey128u32, encodekey256u32 expander and
> > patterns, adjust corresponding function call in i386-expand.c.
> >
> > Thanks for all the helpful comments. Updated patch.
>
> The patch is OK for mainline.
>
> (There are some cleanup opportunities, I'll commit them as a
> no-functional-change patch in stage3.)
>
> Thanks,
> Uros.

-- 
Regards,

Hongyu, Wang


[PATCH] More BB vectorization tweaks

2020-10-29 Thread Richard Biener
This tweaks the op build from splats to allow loads marked as not
vectorizable.  It also amends some dump prints with the address of
the SLP node or the instance to better be able to debug things.

Bootstrapped & tested on x86_64-unknown-linux-gnu, pushed.

2020-10-29  Richard Biener  

* tree-vect-slp.c (vect_build_slp_tree_2): Allow splatting
not vectorizable loads.
(vect_build_slp_instance): Amend dumping with address.
(vect_slp_convert_to_external): Likewise.

* gcc.dg/vect/bb-slp-pr65935.c: Adjust.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c |  5 +++--
 gcc/tree-vect-slp.c| 10 ++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
index ea37e4e614c..c262d731150 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
@@ -60,6 +60,7 @@ int main()
 /* We should also be able to use 2-lane SLP to initialize the real and
imaginary components in the first loop of main.  */
 /* { dg-final { scan-tree-dump-times "optimized: basic block" 10 "slp1" } } */
-/* We should see the s->phase[dir] operand and only that operand built
+/* We should see the s->phase[dir] operand splatted and no other operand built
from scalars.  See PR97334.  */
-/* { dg-final { scan-tree-dump-times "Building vector operands from scalars" 1 
"slp1" } } */
+/* { dg-final { scan-tree-dump-times "Using a splat" 1 "slp1" } } */
+/* { dg-final { scan-tree-dump-times "Building vector operands from scalars" 0 
"slp1" } } */
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index ff3a0c2fd8e..0a7b8e61632 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1627,8 +1627,10 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
  break;
  if (j == group_size
  /* But avoid doing this for loads where we may be
-able to CSE things.  */
- && !gimple_vuse (first_def->stmt))
+able to CSE things, unless the stmt is not
+vectorizable.  */
+ && (!STMT_VINFO_VECTORIZABLE (first_def)
+ || !gimple_vuse (first_def->stmt)))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
@@ -2379,7 +2381,7 @@ vect_build_slp_instance (vec_info *vinfo,
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location,
-  "Final SLP tree for instance:\n");
+  "Final SLP tree for instance %p:\n", 
new_instance);
  vect_print_slp_graph (MSG_NOTE, vect_location,
SLP_INSTANCE_TREE (new_instance));
}
@@ -3402,7 +3404,7 @@ vect_slp_convert_to_external (vec_info *vinfo, slp_tree 
node,
 
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location,
-"Building vector operands from scalars instead\n");
+"Building vector operands of %p from scalars instead\n", 
node);
 
   /* Don't remove and free the child nodes here, since they could be
  referenced by other structures.  The analysis and scheduling phases
-- 
2.26.2
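
As a rough illustration of the kind of code this concerns, the sketch below
shows a two-lane basic block in which both lanes use the same loaded scalar.
The function `scale_pair` is hypothetical and is not taken from the patch or
the testcase. It assumes that a volatile load is one example of a statement
the vectorizer treats as not vectorizable; whether GCC actually vectorizes
this block with a splat depends on the version, target, and flags.

```c
/* Both stores multiply by the same value f, so an SLP vectorizer could
   build that operand as a splat {f, f} rather than from two separate
   scalars.  The load of f goes through a volatile pointer, which is
   assumed here to make the load itself not vectorizable.  */
void scale_pair(double *restrict out, const double *restrict in,
                const volatile double *factor)
{
  double f = *factor;
  out[0] = in[0] * f;
  out[1] = in[1] * f;
}
```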


[committed] libstdc++: Rename _UniformRandomNumberGenerator parameters

2020-10-29 Thread Jonathan Wakely via Gcc-patches
The paper P0346R1 renamed uniform random number generators to
uniform random bit generators, to describe their purpose more
accurately. This makes that same change in one of the relevant
files (but not the others).

libstdc++-v3/ChangeLog:

* include/bits/uniform_int_dist.h (uniform_int_distribution):
Rename _UniformRandomNumberGenerator template parameters to
_UniformRandomBitGenerator, as per P0346R1.

Tested x86_64-linux. Committed to trunk.

commit 68990ed13dc36cb98f94afa84e9dadc39e955e8c
Author: Jonathan Wakely 
Date:   Thu Oct 29 09:09:44 2020

libstdc++: Rename _UniformRandomNumberGenerator parameters

The paper P0346R1 renamed uniform random number generators to
uniform random bit generators, to describe their purpose more
accurately. This makes that same change in one of the relevant
files (but not the others).

libstdc++-v3/ChangeLog:

* include/bits/uniform_int_dist.h (uniform_int_distribution):
Rename _UniformRandomNumberGenerator template parameters to
_UniformRandomBitGenerator, as per P0346R1.

diff --git a/libstdc++-v3/include/bits/uniform_int_dist.h 
b/libstdc++-v3/include/bits/uniform_int_dist.h
index ecb8574864a..cf6ba35c675 100644
--- a/libstdc++-v3/include/bits/uniform_int_dist.h
+++ b/libstdc++-v3/include/bits/uniform_int_dist.h
@@ -184,35 +184,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /**
* @brief Generating functions.
*/
-  template<typename _UniformRandomNumberGenerator>
+  template<typename _UniformRandomBitGenerator>
result_type
-   operator()(_UniformRandomNumberGenerator& __urng)
+   operator()(_UniformRandomBitGenerator& __urng)
 { return this->operator()(__urng, _M_param); }
 
-  template<typename _UniformRandomNumberGenerator>
+  template<typename _UniformRandomBitGenerator>
result_type
-   operator()(_UniformRandomNumberGenerator& __urng,
+   operator()(_UniformRandomBitGenerator& __urng,
   const param_type& __p);
 
   template<typename _ForwardIterator,
-  typename _UniformRandomNumberGenerator>
+  typename _UniformRandomBitGenerator>
void
__generate(_ForwardIterator __f, _ForwardIterator __t,
-  _UniformRandomNumberGenerator& __urng)
+  _UniformRandomBitGenerator& __urng)
{ this->__generate(__f, __t, __urng, _M_param); }
 
   template<typename _ForwardIterator,
-  typename _UniformRandomNumberGenerator>
+  typename _UniformRandomBitGenerator>
void
__generate(_ForwardIterator __f, _ForwardIterator __t,
-  _UniformRandomNumberGenerator& __urng,
+  _UniformRandomBitGenerator& __urng,
   const param_type& __p)
{ this->__generate_impl(__f, __t, __urng, __p); }
 
-  template<typename _UniformRandomNumberGenerator>
+  template<typename _UniformRandomBitGenerator>
void
__generate(result_type* __f, result_type* __t,
-  _UniformRandomNumberGenerator& __urng,
+  _UniformRandomBitGenerator& __urng,
   const param_type& __p)
{ this->__generate_impl(__f, __t, __urng, __p); }
 
@@ -227,10 +227,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 private:
   template<typename _ForwardIterator,
-  typename _UniformRandomNumberGenerator>
+  typename _UniformRandomBitGenerator>
void
__generate_impl(_ForwardIterator __f, _ForwardIterator __t,
-   _UniformRandomNumberGenerator& __urng,
+   _UniformRandomBitGenerator& __urng,
const param_type& __p);
 
   param_type _M_param;
@@ -265,17 +265,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   template<typename _IntType>
-template<typename _UniformRandomNumberGenerator>
+template<typename _UniformRandomBitGenerator>
   typename uniform_int_distribution<_IntType>::result_type
   uniform_int_distribution<_IntType>::
-  operator()(_UniformRandomNumberGenerator& __urng,
+  operator()(_UniformRandomBitGenerator& __urng,
 const param_type& __param)
   {
-   typedef typename _UniformRandomNumberGenerator::result_type
- _Gresult_type;
-   typedef typename std::make_unsigned<result_type>::type __utype;
-   typedef typename std::common_type<_Gresult_type, __utype>::type
- __uctype;
+   typedef typename _UniformRandomBitGenerator::result_type _Gresult_type;
+   typedef typename make_unsigned<result_type>::type __utype;
+   typedef typename common_type<_Gresult_type, __utype>::type __uctype;
 
const __uctype __urngmin = __urng.min();
const __uctype __urngmax = __urng.max();
@@ -351,19 +349,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template<typename _IntType>
 template<typename _ForwardIterator,
-typename _UniformRandomNumberGenerator>
+typename _UniformRandomBitGenerator>
   void
   uniform_int_distribution<_IntType>::
   __generate_impl(_ForwardIterator __f, _ForwardIterator __t,
- _UniformRandomNumberGenerator& __urng,
+ _UniformRandomBitGenerator& __urng,
  const param_type& __param)
   {
__glibcxx_function_requires(_ForwardIteratorConcept<_ForwardIterator>)
-   typedef typename _UniformRandomNumberGenerator::result_type
- _Gresult_type;
-   typedef typename std::make_unsigned<result_type>::type __utype;
-   typedef typename std::common_type<_Gresult_type, __utype>::type
-  

Re: [PATCH] openmp: Implicit 'declare target' for C++ static initializers

2020-10-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Oct 28, 2020 at 02:20:29PM +, Kwok Cheung Yeung wrote:
> OpenMP 5.0 has a new feature for implicitly marking variables and functions
> that are referenced in the initializers of static variables and functions
> that are already marked 'declare target'. Support was added in the commit
> 'openmp: Implement discovery of implicit declare target to clauses'
> (dc703151d4f4560e647649506d5b4ceb0ee11e90). However, this does not work with
> non-constant C++ initializers, where the initializers can contain references
> to other (non-constant) variables and function calls.
> 
> The C++ front-end stores the initialization information in the
> static_aggregates list (with the variable decl in the TREE_VALUE of an entry
> and the initialization in TREE_PURPOSE) rather than in
> TREE_INITIAL(var_decl). I have added an extra function in omp-offload.c to
> walk the variable initialiser trees in static_aggregates, and added a call
> to it from the FE shortly before the initializations are emitted. I have
> also added a testcase to ensure that the implicitly marked
> variables/functions can be referenced in offloaded code.

I'm actually not sure how this can work correctly.
Let's say we have 
int foo () { return 1; }
int bar () { return 2; }
int baz () { return 3; }
int qux () { return 4; }
int a = foo ();
int b = bar ();
int c = baz ();
int *d = &c;
int e = qux ();
int f = e + 1;
int *g = &f;
#pragma omp declare target to (b, d, g)
So, for the implicit declare target discovery, a is not declare target to,
nor is foo, and everything else is; b, d, g explicitly, c because it is
referenced in initializer of d, f because it is mentioned in initializer of
g and e because it is mentioned in initializer of f.
Haven't checked if the new function you've added is called before or after
analyze_function calls omp_discover_implicit_declare_target, but I don't
really see how it can work when it is not inside of that function, so that
discovery of new static vars that are implicitly declare target to doesn't
result in marking of its dynamic initializers too.  Perhaps we need a
langhook for that.  But if it is a separate function, either it is called
before the other discovery and will ignore static initializers for vars
that will only be marked as implicit declare target to later, or it is done
afterwards, but then it would really need to duplicate everything what the
other function does, otherwise it wouldn't discover everything.

Anyway, that is one thing, the other is even if the implicit declare target
discovery handles those correctly, the question is what should we do
afterwards.  Because the C++ FE normally creates a single function that
performs the dynamic initialization of the TUs variables.  But that function
shouldn't be really declare target to, it initializes not only (explicit or
implicit) declare target to variables, but also host only variables.
So we'll probably need to create next to that host only TU constructor
also a device only constructor function that will only initialize the
declare target to variables.

Jakub



[PATCH][pushed] opts: Sanity check for param names.

2020-10-29 Thread Martin Liška

Hello.

This provides new sanity check:

options.c:1:2: error: #error Parameter option name '-param-ipa-jump-function-lookups=' must start with '-param='
1 | #error Parameter option name '-param-ipa-jump-function-lookups=' must start with '-param='
  |  ^

Apart from that, I fix the affected option.
I'm going to install the patch.
Martin

gcc/ChangeLog:

* optc-gen.awk: Check that params start with -param=.
* params.opt: Fix ipa-jump-function-lookups.
---
 gcc/optc-gen.awk | 3 +++
 gcc/params.opt   | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/optc-gen.awk b/gcc/optc-gen.awk
index 73a96bac9e7..9e7e9970395 100644
--- a/gcc/optc-gen.awk
+++ b/gcc/optc-gen.awk
@@ -104,6 +104,9 @@ for (i = 0; i < n_opts; i++) {
enabledby_negarg = nth_arg(3, enabledby_arg);
 lang_enabled_by(enabledby_langs, enabledby_name, enabledby_posarg, 
enabledby_negarg);
 }
+
+if (flag_set_p("Param", flags[i]) && !(opts[i] ~ "^-param="))
+  print "#error Parameter option name '" opts[i] "' must start with '-param='"
 }
 
 
diff --git a/gcc/params.opt b/gcc/params.opt

index 563c67c11f2..7bac39a9d58 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -253,7 +253,7 @@ The size of translation unit that IPA-CP pass considers 
large.
 Common Joined UInteger Var(param_ipa_cp_value_list_size) Init(8) Param 
Optimization
 Maximum size of a list of values associated with each parameter for 
interprocedural constant propagation.
 
--param-ipa-jump-function-lookups=
+-param=ipa-jump-function-lookups=
 Common Joined UInteger Var(param_ipa_jump_function_lookups) Init(8) Param 
Optimization
 Maximum number of statements visited during jump function offset discovery.
 
--

2.29.0
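
For illustration only, the rule enforced by the new check can be restated as a small standalone predicate.  This is a hedged sketch in C++, not the awk code from the patch; is_valid_param_option_name and example_param_names are hypothetical names introduced here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of GCC): returns true when an option
// name taken from a .opt record flagged as Param begins with "-param=",
// mirroring the awk test !(opts[i] ~ "^-param=") in the patch.
bool is_valid_param_option_name(const std::string &name)
{
  const std::string prefix = "-param=";
  return name.compare(0, prefix.size(), prefix) == 0;
}

// Usage sketch: the old spelling is rejected, the fixed one is accepted.
void example_param_names()
{
  assert(!is_valid_param_option_name("-param-ipa-jump-function-lookups="));
  assert(is_valid_param_option_name("-param=ipa-jump-function-lookups="));
}
```

The sketch only models the prefix test; the real check is emitted as a #error into the generated options.c, so a bad name stops the build.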



Re: [PATCH][PR target/97540] Don't extract memory from operand for normal memory constraint.

2020-10-29 Thread Jakub Jelinek via Gcc-patches
On Tue, Oct 27, 2020 at 11:13:21AM +, Richard Sandiford via Gcc-patches 
wrote:
> Sorry to stick my oar in, but I think we should reconsider the
> bcst_mem_operand approach.  It seems like these patches (and the
> previous one) are fighting against the principle that operands
> cannot be arbitrary expressions.

Many operands already are fairly complex expressions, so it is unclear how
this changes that.
And LRA etc. already handles SUBREGs of MEM which is kind of similar to
this.

> This kind of thing was attempted long ago (even before my time!)
> for SIGN_EXTEND on MIPS.  It ended up causing more problems than
> it solved and in the end it had to be taken out.  I'm worried that
> we might end up going through the same cycle again.
> 
> Also, this LRA code is extremely performance-sensitive in terms
> of compile time: it's often at the top or near the top of the profile.
> So adding calls to new functions like extract_mem_from_operand for
> a fairly niche case probably isn't a good trade-off.

It can be just an inline function that looks through just the target
selected rtxes rather than arbitrary ones (derived from *.md properties or
something).

> I think we should instead find a nice(?) syntax for generating separate
> patterns for the two bcst_vector_operand alternatives from a single
> .md pattern.  That would fit the existing model much more closely.

That would result in thousands of new patterns, I'm not sure it is a good
idea.  Pretty much all AVX512* instructions allow those.

Jakub



Re: [PATCH][middle-end][i386][version 5]Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-gpr-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-29 Thread Richard Sandiford via Gcc-patches
Qing Zhao via Gcc-patches  writes:
> +/* Handle a "zero_call_used_regs" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_zero_call_used_regs_attribute (tree *node, tree name, tree args,
> +   int ARG_UNUSED (flags),
> +   bool *no_add_attrs)
> +{
> +  tree decl = *node;
> +  tree id = TREE_VALUE (args);
> +
> +  if (TREE_CODE (decl) != FUNCTION_DECL)
> +{
> +  error_at (DECL_SOURCE_LOCATION (decl),
> + "%qE attribute applies only to functions", name);
> +  *no_add_attrs = true;
> +}
> +
> +  if (TREE_CODE (id) != STRING_CST)
> +{
> +  error_at (DECL_SOURCE_LOCATION (decl),
> + "attribute %qE arguments not a string", name);

The existing message for this seems to be:

  "%qE argument not a string"

(which seems a bit terse, but hey)

> +  *no_add_attrs = true;
> +}
> +
> +  bool found = false;
> +  for (unsigned int i = 0; zero_call_used_regs_opts[i].name != NULL; ++i)
> +if (strcmp (TREE_STRING_POINTER (id),
> + zero_call_used_regs_opts[i].name) == 0)
> +  {
> + found = true;
> + break;
> +  }
> +
> +  if (!found)
> +{
> +  error_at (DECL_SOURCE_LOCATION (decl),
> + "unrecognized zero_call_used_regs attribute: %qs",
> + TREE_STRING_POINTER (id));

The attribute name needs to be quoted, and it would be good if it
wasn't hard-coded into the string:

  error_at (DECL_SOURCE_LOCATION (decl),
"unrecognized %qE attribute argument %qs", name,
TREE_STRING_POINTER (id));

> @@ -228,6 +228,10 @@ unsigned int flag_sanitize_coverage
>  Variable
>  bool dump_base_name_prefixed = false
>  
> +; What subset of registers should be zeroed

Think it would be useful to add “ on function return.”.

> +Variable
> +unsigned int flag_zero_call_used_regs
> +
>  ###
>  Driver
>  
> diff --git a/gcc/df.h b/gcc/df.h
> index 8b6ca8c..0f098d7 100644
> --- a/gcc/df.h
> +++ b/gcc/df.h
> @@ -1085,6 +1085,7 @@ extern void df_update_entry_exit_and_calls (void);
>  extern bool df_hard_reg_used_p (unsigned int);
>  extern unsigned int df_hard_reg_used_count (unsigned int);
>  extern bool df_regs_ever_live_p (unsigned int);
> +extern bool df_epilogue_uses_p (unsigned int);
>  extern void df_set_regs_ever_live (unsigned int, bool);
>  extern void df_compute_regs_ever_live (bool);
>  extern void df_scan_verify (void);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index c9f7299..b011c17 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -3992,6 +3992,96 @@ performing a link with relocatable output (i.e.@: 
> @code{ld -r}) on them.
>  A declaration to which @code{weakref} is attached and that is associated
>  with a named @code{target} must be @code{static}.
>  
> +@item zero_call_used_regs ("@var{choice}")
> +@cindex @code{zero_call_used_regs} function attribute
> +
> +The @code{zero_call_used_regs} attribute causes the compiler to zero
> +a subset of all call-used registers at function return according to
> +@var{choice}.

Suggest dropping “according to @var{choice}” here, since it's now
disconnected with the part that talks about what @var{choice} is.

> +This is used to increase the program security by either mitigating

s/the program security/program security/

> +Return-Oriented Programming (ROP) or preventing information leak

leakage

(FWIW, I'm not sure “mitigating ROP” is really correct usage, but I don't
have any better suggestions.)

> +through registers.
> +
> +A ``call-used'' register is a register whose contents can be changed by
> +a function call; therefore, a caller cannot assume that the register has
> +the same contents on return from the function as it had before calling
> +the function.  Such registers are also called ``call-clobbered'',
> +``caller-saved'', or ``volatile''.

Reading it back, perhaps it would be better to put this whole paragraph
in a footnote immediately after the first use of “call-used registers”,
i.e.

…call-used registers@footnote{A ``call-used'' register…}…

It obviously breaks the flow when reading the raw .texi, but I think it
reads better in the final version.

> +In order to satisfy users with different security needs and control the
> +run-time overhead at the same time, GCC provides a flexible way to choose
> +the subset of the call-used registers to be zeroed.

Maybe s/GCC/the @var{choice} parameter/.

> +
> +The three basic values of @var{choice} are:

After which, I think this should be part of the previous paragraph.

> +@itemize @bullet
> +@item
> +@samp{skip} doesn't zero any call-used registers.
> +
> +@item
> +@samp{used} only zeros call-used registers that are used in the function.
> +A ``used'' register is one whose content has been set or referenced in
> +the function.
> +
> +@item
> +@samp{all} zeros all call-used registers.
> +@end itemize
> +
> +In addition to these three basic choices, it 

[PATCH] Consistently pass the vector type for scalar SLP cost compute

2020-10-29 Thread Richard Biener
This avoids randomly (based on whether the stmt is
SLP_TREE_REPRESENTATIVE and not a pattern stmt) passing a vector
type or NULL to the add_stmt_cost hook for scalar code cost
compute.  For example the x86 backend uses only the vector type to
decide on the scalar computation mode which makes costing off.

So the following explicitly passes the vector type and uses
SLP_TREE_VECTYPE for this purpose.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

2020-10-29  Richard Biener  

* tree-vect-slp.c (vect_bb_slp_scalar_cost): Pass
SLP_TREE_VECTYPE to record_stmt_cost.
---
 gcc/tree-vect-slp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 0a7b8e61632..7a08908cde8 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3937,7 +3937,8 @@ vect_bb_slp_scalar_cost (vec_info *vinfo,
continue;
   else
kind = scalar_stmt;
-  record_stmt_cost (cost_vec, 1, kind, orig_stmt_info, 0, vect_body);
+  record_stmt_cost (cost_vec, 1, kind, orig_stmt_info,
+   SLP_TREE_VECTYPE (node), 0, vect_body);
 }
 
   auto_vec subtree_life;
-- 
2.26.2


Re: testsuite: Adjust target requirements for sad-vectorize and signbit

2020-10-29 Thread Alan Modra via Gcc-patches
Fixes
FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-not stxvd2x
FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times mfvsrd 3
FAIL: gcc.target/powerpc/signbit-1.c scan-assembler-times srdi 3
FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times ld 1
FAIL: gcc.target/powerpc/signbit-2.c scan-assembler-times srdi 1
on powerpc-linux (or powerpc64-linux biarch -m32).

signbit-1.c is quite obviously a 64-bit only testcase given the
scan-assembler directives, and the purpose of the testcase is to verify
the 64-bit only UNSPEC_SIGNBIT patterns.  It could be made to pass for
-m32 by adding -mpowerpc64, but that option isn't very effective
when bi-arch testing and results in errors on rs6000-aix.  And it is
pointless to match -m32 stores to the stack followed by loads, which
is what we do at the moment.

signbit-2.c on the other hand has more reasonable 32-bit output.

Regression tested powerpc64-linux biarch.

* gcc.target/powerpc/signbit-1.c: Reinstate lp64 condition.
* gcc.target/powerpc/signbit-2.c: Match 32-bit output too.

diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-1.c b/gcc/testsuite/gcc.target/powerpc/signbit-1.c
index eb4f53e397d..1642bf46d7a 100644
--- a/gcc/testsuite/gcc.target/powerpc/signbit-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/signbit-1.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
 /* { dg-require-effective-target ppc_float128_sw } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power8 -O2 -mfloat128" } */
diff --git a/gcc/testsuite/gcc.target/powerpc/signbit-2.c b/gcc/testsuite/gcc.target/powerpc/signbit-2.c
index ff6af963dda..1b792916eba 100644
--- a/gcc/testsuite/gcc.target/powerpc/signbit-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/signbit-2.c
@@ -13,5 +13,7 @@ int do_signbit_kf (__float128 *a) { return __builtin_signbit (*a); }
 /* { dg-final { scan-assembler-not   "lxvw4x"   } } */
 /* { dg-final { scan-assembler-not   "lxsd" } } */
 /* { dg-final { scan-assembler-not   "lxsdx"} } */
-/* { dg-final { scan-assembler-times "ld" 1 } } */
-/* { dg-final { scan-assembler-times "srdi"   1 } } */
+/* { dg-final { scan-assembler-times "ld" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "srdi"   1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lwz"1 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "rlwinm" 1 { target ilp32 } } } */

-- 
Alan Modra
Australia Development Lab, IBM


[committed] libstdc++: Fix some warnings in headers

2020-10-29 Thread Jonathan Wakely via Gcc-patches
These are usually suppressed without -Wsystem-headers.

libstdc++-v3/ChangeLog:

* include/bits/hashtable_policy.h (_Local_iterator_base): Cast
value to avoid -Wsign-compare warnings.
* include/bits/regex.h (sub_match::_M_str): Avoid narrowing
conversion.
* include/bits/regex_compiler.tcc (_Compiler::_M_quantifier):
Initialize variable to avoid -Wmaybe-uninitialized warning.
* include/bits/shared_ptr_base.h (_Sp_counted_deleter::_Impl):
Reorder mem-initializer-list to avoid -Wreorder warning.
* include/bits/stl_tree.h (_Rb_tree_impl): Explicitly
initialize base class in copy constructor.
* include/debug/safe_iterator.h (_Safe_iterator): Likewise.
* include/ext/debug_allocator.h: Reorder mem-initializer-list
to avoid -Wreorder warning.
* include/ext/throw_allocator.h (throw_allocator_limit)
(throw_allocator_random): Add user-declared assignment operators
to avoid -Wdeprecated-copy warnings.

Tested x86_64-linux. Committed to trunk.
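
As a rough illustration of the -Wsign-compare item above, here is a minimal sketch of the pattern.  It assumes a member that uses the all-ones value as a "not initialised" sentinel; LocalIteratorModel and its members are hypothetical names, not the libstdc++ code.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for a class whose unsigned member uses the
// all-ones value as a sentinel.
struct LocalIteratorModel
{
  std::size_t bucket_count = std::size_t(-1);

  // Comparing against size_t(-1) keeps both operands unsigned.  The
  // result is the same as comparing against a plain -1, which would be
  // converted to the same value implicitly, but the intent is explicit.
  bool is_initialised() const { return bucket_count != std::size_t(-1); }
};

// Usage sketch: a default-constructed model holds the sentinel.
void example_sentinel()
{
  LocalIteratorModel m;
  assert(!m.is_initialised());
  m.bucket_count = 8;
  assert(m.is_initialised());
  assert(std::size_t(-1) == std::numeric_limits<std::size_t>::max());
}
```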

commit eb6b71b83c9f099808bc50c6a467a0caf4002e50
Author: Jonathan Wakely 
Date:   Thu Oct 29 11:43:55 2020

libstdc++: Fix some warnings in headers

These are usually suppressed without -Wsystem-headers.

libstdc++-v3/ChangeLog:

* include/bits/hashtable_policy.h (_Local_iterator_base): Cast
value to avoid -Wsign-compare warnings.
* include/bits/regex.h (sub_match::_M_str): Avoid narrowing
conversion.
* include/bits/regex_compiler.tcc (_Compiler::_M_quantifier):
Initialize variable to avoid -Wmaybe-uninitialized warning.
* include/bits/shared_ptr_base.h (_Sp_counted_deleter::_Impl):
Reorder mem-initializer-list to avoid -Wreorder warning.
* include/bits/stl_tree.h (_Rb_tree_impl): Explicitly
initialize base class in copy constructor.
* include/debug/safe_iterator.h (_Safe_iterator): Likewise.
* include/ext/debug_allocator.h: Reorder mem-initializer-list
to avoid -Wreorder warning.
* include/ext/throw_allocator.h (throw_allocator_limit)
(throw_allocator_random): Add user-declared assignment operators
to avoid -Wdeprecated-copy warnings.

diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index f5ce7209957..cea5e549d25 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -1368,7 +1368,7 @@ namespace __detail
 
   ~_Local_iterator_base()
   {
-   if (_M_bucket_count != -1)
+   if (_M_bucket_count != size_t(-1))
  _M_destroy();
   }
 
@@ -1376,7 +1376,7 @@ namespace __detail
   : __node_iter_base(__iter._M_cur), _M_bucket(__iter._M_bucket)
   , _M_bucket_count(__iter._M_bucket_count)
   {
-   if (_M_bucket_count != -1)
+   if (_M_bucket_count != size_t(-1))
  _M_init(*__iter._M_h());
   }
 
diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index 15e4289bf95..3cbd0d5913e 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -994,7 +994,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
_M_str() const noexcept
{
  if (this->matched)
-   if (auto __len = this->second - this->first)
+   if (size_t __len = this->second - this->first)
  return { std::__addressof(*this->first), __len };
  return {};
}
diff --git a/libstdc++-v3/include/bits/regex_compiler.tcc b/libstdc++-v3/include/bits/regex_compiler.tcc
index 2ae4af02c89..c26b28a6965 100644
--- a/libstdc++-v3/include/bits/regex_compiler.tcc
+++ b/libstdc++-v3/include/bits/regex_compiler.tcc
@@ -233,16 +233,16 @@ namespace __detail
  _StateSeqT __e(*_M_nfa, _M_nfa->_M_insert_dummy());
  long __min_rep = _M_cur_int_value(10);
  bool __infi = false;
- long __n;
+ long __n = 0;
 
  // {3
  if (_M_match_token(_ScannerT::_S_token_comma))
-   if (_M_match_token(_ScannerT::_S_token_dup_count)) // {3,7}
- __n = _M_cur_int_value(10) - __min_rep;
-   else
- __infi = true;
- else
-   __n = 0;
+   {
+ if (_M_match_token(_ScannerT::_S_token_dup_count)) // {3,7}
+   __n = _M_cur_int_value(10) - __min_rep;
+ else
+   __infi = true;
+   }
  if (!_M_match_token(_ScannerT::_S_token_interval_end))
__throw_regex_error(regex_constants::error_brace,
"Unexpected end of brace expression.");
diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 543783ba034..368b2d7379a 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/

Re: [PATCH, 1/3, OpenMP] Target mapping changes for OpenMP 5.0, front-end parts

2020-10-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Oct 28, 2020 at 06:31:22PM +0800, Chung-Lin Tang wrote:
> > > +  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> > > +if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> > > + && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER
> > > + && TREE_CODE (TREE_TYPE (OMP_CLAUSE_DECL (c))) != ARRAY_TYPE)
> > > +  {
> > > + tree ptr = OMP_CLAUSE_DECL (c);
> > > + bool ptr_mapped = false;
> > > + if (is_target)
> > > +   {
> > > + for (tree m = clauses; m; m = OMP_CLAUSE_CHAIN (m))
> > Isn't this O(n^2) in number of clauses?  I mean, e.g. for the equality
> > comparisons (but see below) it could be dealt with e.g. using some bitmap
> > with DECL_UIDs.
> 
> At this stage, we really don't assume any ordering of the clauses, nor try to
> modify its ordering yet, so the base-pointer map (if it exists) could be any
> where in the list (building some "visited set" isn't really suitable here).
> I don't think this is really that much an issue of concern though.

Many functions try hard to avoid O(n^2) issues, see e.g. all the bitmap
handling in *finish_omp_clauses etc.
One can have tens of thousands of clauses and then the quadraticness will
hit hard.  This does a mere OMP_CLAUSE_DECL (c) == ptr comparison, so it
is only about the decls and decls can be very easily handled through
DECL_UID (bitmaps, hash sets/maps/tables).

> +extern void c_omp_adjust_clauses (tree, bool);

So, can you please rename the function to either
c_omp_adjust_target_clauses or c_omp_adjust_mapping_clauses or
c_omp_adjust_map_clauses?

> --- a/gcc/c-family/c-omp.c
> +++ b/gcc/c-family/c-omp.c
> @@ -2579,3 +2579,50 @@ c_omp_map_clause_name (tree clause, bool oacc)
>  }
>return omp_clause_code_name[OMP_CLAUSE_CODE (clause)];
>  }
> +
> +/* Adjust map clauses after normal clause parsing, mainly to turn specific
> +   base-pointer map cases into attach/detach and mark them addressable.  */
> +void
> +c_omp_adjust_clauses (tree clauses, bool is_target)
> +{
> +  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> + && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER

If this is only meant to handle decls, perhaps there should be
&& DECL_P (OMP_CLAUSE_DECL (c))
?

> + && TREE_CODE (TREE_TYPE (OMP_CLAUSE_DECL (c))) != ARRAY_TYPE)
> +  {
> + tree ptr = OMP_CLAUSE_DECL (c);
> + bool ptr_mapped = false;
> + if (is_target)
> +   {
> + for (tree m = clauses; m; m = OMP_CLAUSE_CHAIN (m))
> +   if (OMP_CLAUSE_CODE (m) == OMP_CLAUSE_MAP
> +   && OMP_CLAUSE_DECL (m) == ptr
> +   && (OMP_CLAUSE_MAP_KIND (m) == GOMP_MAP_ALLOC
> +   || OMP_CLAUSE_MAP_KIND (m) == GOMP_MAP_TO
> +   || OMP_CLAUSE_MAP_KIND (m) == GOMP_MAP_FROM
> +   || OMP_CLAUSE_MAP_KIND (m) == GOMP_MAP_TOFROM
> +   || OMP_CLAUSE_MAP_KIND (m) == GOMP_MAP_ALWAYS_TO
> +   || OMP_CLAUSE_MAP_KIND (m) == GOMP_MAP_ALWAYS_FROM
> +   || OMP_CLAUSE_MAP_KIND (m) == GOMP_MAP_ALWAYS_TOFROM))
> + {
> +   ptr_mapped = true;
> +   break;
> + }

What you could e.g. do is have this loop at the start of function, with
&& DECL_P (OMP_CLAUSE_DECL (m))
instead of the == ptr check, and perhaps && POINTER_TYPE_P (TREE_TYPE
(OMP_CLAUSE_DECL (m))) check and set a bit in a bitmap for each such decl,
then in the GOMP_MAP_FIRSTPRIVATE_POINTER loop just check the bitmap.
Or, keep it in the loop like it is above, but populate the bitmap
lazily (upon seeing the first GOMP_MAP_FIRSTPRIVATE_POINTER) and for further
ones just use it.
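
The first variant above (populate a set up front, then only do lookups) could look roughly like the following.  This is a simplified standalone sketch in C++, not GCC code: Clause, MapKind and find_mapped_base_pointers are hypothetical stand-ins for the clause chain, and std::unordered_set takes the place of a bitmap keyed on DECL_UID.  The list of map kinds is abbreviated (the always-* kinds are left out).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Simplified stand-ins for GCC's clause chain; all names here are
// hypothetical and only model the fields the discussion relies on.
enum class MapKind { Alloc, To, From, ToFrom, FirstprivatePointer, Other };

struct Clause
{
  unsigned decl_uid;  // plays the role of DECL_UID (OMP_CLAUSE_DECL (c))
  MapKind kind;       // plays the role of OMP_CLAUSE_MAP_KIND (c)
  bool is_pointer;    // plays the role of POINTER_TYPE_P on the decl type
};

// Returns the indices of firstprivate-pointer clauses whose decl is also
// mapped by alloc/to/from/tofrom.  Two linear passes, no nested loop.
std::vector<std::size_t>
find_mapped_base_pointers(const std::vector<Clause> &clauses)
{
  // Pass 1: record every pointer decl that has an ordinary data mapping.
  std::unordered_set<unsigned> mapped;
  for (const Clause &m : clauses)
    if (m.is_pointer
        && (m.kind == MapKind::Alloc || m.kind == MapKind::To
            || m.kind == MapKind::From || m.kind == MapKind::ToFrom))
      mapped.insert(m.decl_uid);

  // Pass 2: each firstprivate-pointer clause needs only one set lookup.
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < clauses.size(); ++i)
    if (clauses[i].kind == MapKind::FirstprivatePointer
        && mapped.count(clauses[i].decl_uid) != 0)
      result.push_back(i);
  return result;
}

// Usage sketch: only the first clause has a matching data mapping.
void example_mapped_base_pointers()
{
  const std::vector<Clause> clauses = {
    {1, MapKind::FirstprivatePointer, true},  // p, also mapped below
    {2, MapKind::FirstprivatePointer, true},  // q, never mapped
    {1, MapKind::ToFrom, true},               // map(tofrom: p)
  };
  const std::vector<std::size_t> hits = find_mapped_base_pointers(clauses);
  assert(hits.size() == 1 && hits[0] == 0);
}
```

With n clauses this does O(n) expected work instead of O(n^2), at the cost of one extra pass even when no GOMP_MAP_FIRSTPRIVATE_POINTER clause exists; the lazy variant avoids that cost.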

Jakub



Re: [PATCH, 2/3, OpenMP] Target mapping changes for OpenMP 5.0, middle-end parts and compiler testcases

2020-10-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Oct 28, 2020 at 06:32:21PM +0800, Chung-Lin Tang wrote:
> > > @@ -8958,25 +9083,20 @@ gimplify_scan_omp_clauses (tree *list_p, 
> > > gimple_seq *pre_p,
> > > /* An "attach/detach" operation on an update directive 
> > > should
> > >behave as a GOMP_MAP_ALWAYS_POINTER.  Beware that
> > >unlike attach or detach map kinds, 
> > > GOMP_MAP_ALWAYS_POINTER
> > >depends on the previous mapping.  */
> > > if (code == OACC_UPDATE
> > > && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH_DETACH)
> > >   OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_ALWAYS_POINTER);
> > > -   if (gimplify_expr (pd, pre_p, NULL, is_gimple_lvalue, fb_lvalue)
> > > -   == GS_ERROR)
> > > - {
> > > -   remove = true;
> > > -   break;
> > > - }
> > So what gimplifies those now?
> 
> They're gimplified somewhere during omp-low now.
> (some gimplify scan testcases were adjusted to accommodate this change)
> 
> I don't remember the exact case I encountered, but there were some issues
> with gimplified expressions inside the map clauses making some later
> checking more difficult. I haven't seen any negative effect of this
> modification so far.

I don't like that, it goes against many principles, gimplification really
shouldn't leave around non-GIMPLE IL.
If you need to compare the same expression or the same expression bases
later, perhaps detect the equalities during gimplification before actually
gimplifying the clauses and ensure they are gimplified to the same
expression or are using the same base (e.g. by adding SAVE_EXPRs or
TARGET_EXPRs before the gimplification).

Jakub



Re: [PATCH] vect: Fix load costs for SLP permutes

2020-10-29 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Wed, Oct 28, 2020 at 2:39 PM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> For the following test case (compiled with load/store lanes
>> disabled locally):
>>
>>   void
>>   f (uint32_t *restrict x, uint8_t *restrict y, int n)
>>   {
>> for (int i = 0; i < n; ++i)
>>   {
>> x[i * 2] = x[i * 2] + y[i * 2];
>> x[i * 2 + 1] = x[i * 2 + 1] + y[i * 2];
>>   }
>>   }
>>
>> we have a redundant no-op permute on the x[] load node:
>>
>>node 0x4472350 (max_nunits=8, refcnt=2)
>>   stmt 0 _5 = *_4;
>>   stmt 1 _13 = *_12;
>>   load permutation { 0 1 }
>>
>> Then, when costing it, we pick a cost of 1, even though we need 4 copies
>> of the x[] load to match a single y[] load:
>>
>>==> examining statement: _5 = *_4;
>>Vectorizing an unaligned access.
>>vect_model_load_cost: unaligned supported by hardware.
>>vect_model_load_cost: inside_cost = 1, prologue_cost = 0 .
>>
>> The problem is that the code only considers the permutation for
>> the first scalar iteration, rather than for all VF iterations.
>>
>> This patch tries to fix that by using similar logic to
>> vect_transform_slp_perm_load.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> I wonder if we can instead do this counting in vect_transform_slp_load
> where we already count the number of permutes.  That would avoid
> the duplication of the "logic".

Very scathing quotes :-)

But yeah, agree that's better.  How about this version?  Tested as before.
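
As a rough standalone illustration of the counting idea (a sketch only, not the code in the patch): mark every input lane that the load permutation reads across all VF scalar iterations, then count the vectors that contain at least one marked lane.  count_vector_loads and its parameters are hypothetical names; the numbers in the usage example assume 128-bit vectors, i.e. VF = 8 and 4 uint32_t lanes per vector, which reproduces the "4 copies" mentioned in the quoted description.

```cpp
#include <cassert>
#include <vector>

// Hypothetical standalone model (not the GCC code): count how many input
// vectors of NUNITS lanes contain at least one lane that the load
// permutation reads across all VF scalar iterations.
unsigned count_vector_loads(unsigned vf, unsigned group_size, unsigned nunits,
                            const std::vector<unsigned> &load_permutation)
{
  assert(nunits > 0);
  const unsigned in_nlanes = vf * group_size;
  std::vector<bool> used_in_lanes(in_nlanes, false);

  // Mark the lanes read by every scalar iteration, not just the first.
  for (unsigned iter = 0; iter < vf; ++iter)
    for (unsigned perm : load_permutation)
      {
        assert(perm < group_size);
        used_in_lanes[iter * group_size + perm] = true;
      }

  // One vector load is needed per NUNITS-lane chunk with a used lane.
  unsigned n_loads = 0;
  for (unsigned start = 0; start < in_nlanes; start += nunits)
    {
      bool load_seen = false;
      for (unsigned i = start; i < start + nunits && i < in_nlanes; ++i)
        load_seen = load_seen || used_in_lanes[i];
      if (load_seen)
        ++n_loads;
    }
  return n_loads;
}

// Usage sketch: the x[] node from the test case under the assumed sizes.
void example_count_loads()
{
  assert(count_vector_loads(8, 2, 4, {0, 1}) == 4);
}
```

Looking only at the first scalar iteration would mark lanes 0 and 1 and report a single load, which is the undercount described above.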

Richard


gcc/
* tree-vectorizer.h (vect_transform_slp_perm_load): Take an
optional extra parameter.
* tree-vect-stmts.c (vect_transform_slp_perm_load): Calculate
the number of loads as well as the number of permutes, taking
the counting loop from...
(vect_model_load_cost): ...here.  Use the value computed by
vect_transform_slp_perm_load for ncopies.
---
 gcc/tree-vect-slp.c   | 39 +--
 gcc/tree-vect-stmts.c | 32 
 gcc/tree-vectorizer.h |  3 ++-
 3 files changed, 43 insertions(+), 31 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index ff3a0c2fd8e..2b959af4c56 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -4827,13 +4827,16 @@ vect_get_slp_defs (vec_info *,
 
 /* Generate vector permute statements from a list of loads in DR_CHAIN.
If ANALYZE_ONLY is TRUE, only check that it is possible to create valid
-   permute statements for the SLP node NODE.  */
+   permute statements for the SLP node NODE.  Store the number of vector
+   permute instructions in *N_PERMS and the number of vector load
+   instructions in *N_LOADS.  */
 
 bool
 vect_transform_slp_perm_load (vec_info *vinfo,
  slp_tree node, vec dr_chain,
  gimple_stmt_iterator *gsi, poly_uint64 vf,
- bool analyze_only, unsigned *n_perms)
+ bool analyze_only, unsigned *n_perms,
+ unsigned int *n_loads)
 {
   stmt_vec_info stmt_info = SLP_TREE_SCALAR_STMTS (node)[0];
   int vec_index = 0;
@@ -4885,6 +4888,7 @@ vect_transform_slp_perm_load (vec_info *vinfo,
   vec_perm_builder mask;
   unsigned int nelts_to_build;
   unsigned int nvectors_per_build;
+  unsigned int in_nlanes;
   bool repeating_p = (group_size == DR_GROUP_SIZE (stmt_info)
  && multiple_p (nunits, group_size));
   if (repeating_p)
@@ -4895,6 +4899,7 @@ vect_transform_slp_perm_load (vec_info *vinfo,
   mask.new_vector (nunits, group_size, 3);
   nelts_to_build = mask.encoded_nelts ();
   nvectors_per_build = SLP_TREE_VEC_STMTS (node).length ();
+  in_nlanes = DR_GROUP_SIZE (stmt_info) * 3;
 }
   else
 {
@@ -4906,7 +4911,10 @@ vect_transform_slp_perm_load (vec_info *vinfo,
   mask.new_vector (const_nunits, const_nunits, 1);
   nelts_to_build = const_vf * group_size;
   nvectors_per_build = 1;
+  in_nlanes = const_vf * DR_GROUP_SIZE (stmt_info);
 }
+  auto_sbitmap used_in_lanes (in_nlanes);
+  bitmap_clear (used_in_lanes);
 
   unsigned int count = mask.encoded_nelts ();
   mask.quick_grow (count);
@@ -4918,6 +4926,7 @@ vect_transform_slp_perm_load (vec_info *vinfo,
   unsigned int stmt_num = j % group_size;
   unsigned int i = (iter_num * DR_GROUP_SIZE (stmt_info)
+ SLP_TREE_LOAD_PERMUTATION (node)[stmt_num]);
+  bitmap_set_bit (used_in_lanes, i);
   if (repeating_p)
{
  first_vec_index = 0;
@@ -5031,6 +5040,32 @@ vect_transform_slp_perm_load (vec_info *vinfo,
}
 }
 
+  if (n_loads)
+{
+  if (repeating_p)
+   *n_loads = SLP_TREE_NUMBER_OF_VEC_STMTS (node);
+  else
+   {
+ /* Enforced above when !repeating_p.  */
+ unsigned int const_nunits = nunits.to_constant ();
+ *n_loads = 0;
+ bool load

Re: [PATCH] vect: Fix load costs for SLP permutes

2020-10-29 Thread Richard Biener via Gcc-patches
On Thu, Oct 29, 2020 at 12:52 PM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Wed, Oct 28, 2020 at 2:39 PM Richard Sandiford via Gcc-patches
> >  wrote:
> >>
> >> For the following test case (compiled with load/store lanes
> >> disabled locally):
> >>
> >>   void
> >>   f (uint32_t *restrict x, uint8_t *restrict y, int n)
> >>   {
> >> for (int i = 0; i < n; ++i)
> >>   {
> >> x[i * 2] = x[i * 2] + y[i * 2];
> >> x[i * 2 + 1] = x[i * 2 + 1] + y[i * 2];
> >>   }
> >>   }
> >>
> >> we have a redundant no-op permute on the x[] load node:
> >>
> >>node 0x4472350 (max_nunits=8, refcnt=2)
> >>   stmt 0 _5 = *_4;
> >>   stmt 1 _13 = *_12;
> >>   load permutation { 0 1 }
> >>
> >> Then, when costing it, we pick a cost of 1, even though we need 4 copies
> >> of the x[] load to match a single y[] load:
> >>
> >>==> examining statement: _5 = *_4;
> >>Vectorizing an unaligned access.
> >>vect_model_load_cost: unaligned supported by hardware.
> >>vect_model_load_cost: inside_cost = 1, prologue_cost = 0 .
> >>
> >> The problem is that the code only considers the permutation for
> >> the first scalar iteration, rather than for all VF iterations.
> >>
> >> This patch tries to fix that by using similar logic to
> >> vect_transform_slp_perm_load.
> >>
> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >
> > I wonder if we can instead do this counting in vect_transform_slp_load
> > where we already count the number of permutes.  That would avoid
> > the duplication of the "logic".
>
> Very scathing quotes :-)
>
> But yeah, agree that's better.  How about this version?  Tested as before.

OK.

Thanks,
Richard.

> Richard
>
>
> gcc/
> * tree-vectorizer.h (vect_transform_slp_perm_load): Take an
> optional extra parameter.
> * tree-vect-stmts.c (vect_transform_slp_perm_load): Calculate
> the number of loads as well as the number of permutes, taking
> the counting loop from...
> (vect_model_load_cost): ...here.  Use the value computed by
> vect_transform_slp_perm_load for ncopies.
> ---
>  gcc/tree-vect-slp.c   | 39 +--
>  gcc/tree-vect-stmts.c | 32 
>  gcc/tree-vectorizer.h |  3 ++-
>  3 files changed, 43 insertions(+), 31 deletions(-)
>
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index ff3a0c2fd8e..2b959af4c56 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -4827,13 +4827,16 @@ vect_get_slp_defs (vec_info *,
>
>  /* Generate vector permute statements from a list of loads in DR_CHAIN.
> If ANALYZE_ONLY is TRUE, only check that it is possible to create valid
> -   permute statements for the SLP node NODE.  */
> +   permute statements for the SLP node NODE.  Store the number of vector
> +   permute instructions in *N_PERMS and the number of vector load
> +   instructions in *N_LOADS.  */
>
>  bool
>  vect_transform_slp_perm_load (vec_info *vinfo,
>   slp_tree node, vec dr_chain,
>   gimple_stmt_iterator *gsi, poly_uint64 vf,
> - bool analyze_only, unsigned *n_perms)
> + bool analyze_only, unsigned *n_perms,
> + unsigned int *n_loads)
>  {
>stmt_vec_info stmt_info = SLP_TREE_SCALAR_STMTS (node)[0];
>int vec_index = 0;
> @@ -4885,6 +4888,7 @@ vect_transform_slp_perm_load (vec_info *vinfo,
>vec_perm_builder mask;
>unsigned int nelts_to_build;
>unsigned int nvectors_per_build;
> +  unsigned int in_nlanes;
>bool repeating_p = (group_size == DR_GROUP_SIZE (stmt_info)
>   && multiple_p (nunits, group_size));
>if (repeating_p)
> @@ -4895,6 +4899,7 @@ vect_transform_slp_perm_load (vec_info *vinfo,
>mask.new_vector (nunits, group_size, 3);
>nelts_to_build = mask.encoded_nelts ();
>nvectors_per_build = SLP_TREE_VEC_STMTS (node).length ();
> +  in_nlanes = DR_GROUP_SIZE (stmt_info) * 3;
>  }
>else
>  {
> @@ -4906,7 +4911,10 @@ vect_transform_slp_perm_load (vec_info *vinfo,
>mask.new_vector (const_nunits, const_nunits, 1);
>nelts_to_build = const_vf * group_size;
>nvectors_per_build = 1;
> +  in_nlanes = const_vf * DR_GROUP_SIZE (stmt_info);
>  }
> +  auto_sbitmap used_in_lanes (in_nlanes);
> +  bitmap_clear (used_in_lanes);
>
>unsigned int count = mask.encoded_nelts ();
>mask.quick_grow (count);
> @@ -4918,6 +4926,7 @@ vect_transform_slp_perm_load (vec_info *vinfo,
>unsigned int stmt_num = j % group_size;
>unsigned int i = (iter_num * DR_GROUP_SIZE (stmt_info)
> + SLP_TREE_LOAD_PERMUTATION (node)[stmt_num]);
> +  bitmap_set_bit (used_in_lanes, i);
>if (repeating_p)
> {
>   first_vec_index = 0;
> @@ -5031,6 +5040,32 @@ vect_transform_slp_perm_loa

[PATCH][AArch64] ACLE intrinsics: convert from BFloat16 to Float32

2020-10-29 Thread Dennis Zhang via Gcc-patches
Hi all,

This patch enables intrinsics to convert BFloat16 scalar and vector operands
to Float32 modes.
The intrinsics are implemented by shifting each BFloat16 item 16 bits to the
left using shl/shll/shll2 instructions.
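
To illustrate why a plain shift is enough, here is a scalar model of the conversion.  It is a hedged sketch in portable C++, not the intrinsic or the generated code; bf16_bits_to_float is a hypothetical helper and the sketch assumes float is IEEE binary32.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical scalar model: a BFloat16 value has the same layout as the
// upper 16 bits of an IEEE binary32 value, so widening is a 16-bit left
// shift of the raw bits with the low mantissa bits filled with zeros.
float bf16_bits_to_float(std::uint16_t bf16_bits)
{
  const std::uint32_t f32_bits = static_cast<std::uint32_t>(bf16_bits) << 16;
  float result;
  std::memcpy(&result, &f32_bits, sizeof result);  // bit copy, no aliasing UB
  return result;
}

// Usage sketch: 0x3F80 is the BFloat16 encoding of 1.0.
void example_bf16_widening()
{
  assert(bf16_bits_to_float(0x3F80) == 1.0f);
  assert(bf16_bits_to_float(0xC000) == -2.0f);
}
```

The conversion is exact in this direction because every BFloat16 value is representable as a Float32 value; no rounding is involved.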

Intrinsics are documented at 
https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics
ISA is documented at https://developer.arm.com/docs/ddi0596/latest

Regtested and bootstrapped.

Is it OK for trunk please?

Thanks
Dennis

gcc/ChangeLog:

2020-10-29  Dennis Zhang  

* config/aarch64/aarch64-simd-builtins.def (vbfcvt): New entry.
(vbfcvt_high, bfcvt): Likewise.
* config/aarch64/aarch64-simd.md (aarch64_vbfcvt): New entry.
(aarch64_vbfcvt_highv8bf, aarch64_bfcvtsf): Likewise.
* config/aarch64/arm_bf16.h (vcvtah_f32_bf16): New intrinsic.
* config/aarch64/arm_neon.h (vcvt_f32_bf16): Likewise.
(vcvtq_low_f32_bf16, vcvtq_high_f32_bf16): Likewise.

gcc/testsuite/ChangeLog

2020-10-29  Dennis Zhang  

* gcc.target/aarch64/advsimd-intrinsics/bfcvt-compile.c
(test_vcvt_f32_bf16, test_vcvtq_low_f32_bf16): New tests.
(test_vcvtq_high_f32_bf16, test_vcvth_f32_bf16): Likewise.

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 5bc596dbffc..b68c3ca7f4b 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -732,3 +732,8 @@
   VAR1 (UNOP, bfcvtn_q, 0, ALL, v8bf)
   VAR1 (BINOP, bfcvtn2, 0, ALL, v8bf)
   VAR1 (UNOP, bfcvt, 0, ALL, bf)
+
+  /* Implemented by aarch64_{v}bfcvt{_high}.  */
+  VAR2 (UNOP, vbfcvt, 0, ALL, v4bf, v8bf)
+  VAR1 (UNOP, vbfcvt_high, 0, ALL, v8bf)
+  VAR1 (UNOP, bfcvt, 0, ALL, sf)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 381a702eba0..5ae79d67981 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -7238,3 +7238,31 @@
   "bfcvt\\t%h0, %s1"
   [(set_attr "type" "f_cvt")]
 )
+
+;; Use shl/shll/shll2 to convert BF scalar/vector modes to SF modes.
+(define_insn "aarch64_vbfcvt"
+  [(set (match_operand:V4SF 0 "register_operand" "=w")
+	(unspec:V4SF [(match_operand:VBF 1 "register_operand" "w")]
+		  UNSPEC_BFCVTN))]
+  "TARGET_BF16_SIMD"
+  "shll\\t%0.4s, %1.4h, #16"
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_vbfcvt_highv8bf"
+  [(set (match_operand:V4SF 0 "register_operand" "=w")
+	(unspec:V4SF [(match_operand:V8BF 1 "register_operand" "w")]
+		  UNSPEC_BFCVTN2))]
+  "TARGET_BF16_SIMD"
+  "shll2\\t%0.4s, %1.8h, #16"
+  [(set_attr "type" "neon_shift_imm_long")]
+)
+
+(define_insn "aarch64_bfcvtsf"
+  [(set (match_operand:SF 0 "register_operand" "=w")
+	(unspec:SF [(match_operand:BF 1 "register_operand" "w")]
+		UNSPEC_BFCVT))]
+  "TARGET_BF16_FP"
+  "shl\\t%d0, %d1, #16"
+  [(set_attr "type" "neon_shift_reg")]
+)
diff --git a/gcc/config/aarch64/arm_bf16.h b/gcc/config/aarch64/arm_bf16.h
index 984875dcc01..881615498d3 100644
--- a/gcc/config/aarch64/arm_bf16.h
+++ b/gcc/config/aarch64/arm_bf16.h
@@ -40,6 +40,13 @@ vcvth_bf16_f32 (float32_t __a)
   return __builtin_aarch64_bfcvtbf (__a);
 }
 
+__extension__ extern __inline float32_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvtah_f32_bf16 (bfloat16_t __a)
+{
+  return __builtin_aarch64_bfcvtsf (__a);
+}
+
 #pragma GCC pop_options
 
 #endif
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 85c0d62ca12..9c0386ed7b1 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -35716,6 +35716,27 @@ vcvtq_high_bf16_f32 (bfloat16x8_t __inactive, float32x4_t __a)
   return __builtin_aarch64_bfcvtn2v8bf (__inactive, __a);
 }
 
+__extension__ extern __inline float32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvt_f32_bf16 (bfloat16x4_t __a)
+{
+  return __builtin_aarch64_vbfcvtv4bf (__a);
+}
+
+__extension__ extern __inline float32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvtq_low_f32_bf16 (bfloat16x8_t __a)
+{
+  return __builtin_aarch64_vbfcvtv8bf (__a);
+}
+
+__extension__ extern __inline float32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vcvtq_high_f32_bf16 (bfloat16x8_t __a)
+{
+  return __builtin_aarch64_vbfcvt_highv8bf (__a);
+}
+
 #pragma GCC pop_options
 
 /* AdvSIMD 8-bit Integer Matrix Multiply (I8MM) intrinsics.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfcvt-compile.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfcvt-compile.c
index bbea630b182..47af7c494d9 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfcvt-compile.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfcvt-compile.c
@@ -46,3 +46,43 @@ bfloat16_t test_bfcvt (float32_t a)
 {
   return vcvth_bf16_f32 (a);
 }
+
+/*
+**test_vcvt_f32_bf16:
+** shll	v0.4s, v0.4h, #16
+** ret
+*/
+float32

[PATCH][AArch64] ACLE intrinsics: get low/high half from BFloat16 vector

2020-10-29 Thread Dennis Zhang via Gcc-patches
Hi all,

This patch implements the ACLE intrinsics vget_low_bf16 and vget_high_bf16,
which extract the lower or higher half of a bfloat16x8_t vector.
vget_high_bf16 is implemented with a 'dup' instruction. vget_low_bf16 could be
implemented with a 'dup' or 'mov', but it is mostly optimized out by simply
using the lower half of the vector register.
The test for vget_low_bf16 only checks that the interface compiles; no
instruction is checked because none is generated in the test case.
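
As a rough, portable illustration of what the two intrinsics return, the sketch below models the vectors as plain arrays. The type and function names (Bf16x8, Bf16x4, get_half, get_low_half, get_high_half) are hypothetical stand-ins, not ACLE or GCC names, and lane order is taken as little-endian. The lane selection follows the expander in the patch, which picks four consecutive lanes starting at hbase * 4.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Plain-array stand-ins for bfloat16x8_t and bfloat16x4_t (hypothetical names).
// Each lane is kept as its raw 16-bit pattern.
using Bf16x8 = std::array<std::uint16_t, 8>;
using Bf16x4 = std::array<std::uint16_t, 4>;

// Copy four consecutive lanes starting at hbase * 4, where hbase is 0 or 1.
Bf16x4 get_half(const Bf16x8 &v, int hbase)
{
  assert(hbase == 0 || hbase == 1);
  Bf16x4 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = v[static_cast<std::size_t>(hbase) * 4 + i];
  return out;
}

// Model of vget_low_bf16: lanes 0..3.
Bf16x4 get_low_half(const Bf16x8 &v) { return get_half(v, 0); }

// Model of vget_high_bf16: lanes 4..7.
Bf16x4 get_high_half(const Bf16x8 &v) { return get_half(v, 1); }
```

On real hardware the low half needs no data movement when the value already sits in the low 64 bits of the register, which is why the test case sees no instruction for it.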

Arm ACLE document at 
https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

Regtested and bootstrapped.

Is it OK for trunk please?

Thanks
Dennis

gcc/ChangeLog:

2020-10-29  Dennis Zhang  

* config/aarch64/aarch64-simd-builtins.def (vget_half): New entry.
* config/aarch64/aarch64-simd.md (aarch64_vget_halfv8bf): New entry.
* config/aarch64/arm_neon.h (vget_low_bf16): New intrinsic.
(vget_high_bf16): Likewise.
* config/aarch64/predicates.md (aarch64_zero_or_1): New predicate
for zero or one immediate to indicate the lower or higher half.

gcc/testsuite/ChangeLog

2020-10-29  Dennis Zhang  

* gcc.target/aarch64/advsimd-intrinsics/bf16_dup.c
(test_vget_low_bf16, test_vget_high_bf16): New tests.

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 332a0b6b1ea..39ebb776d1d 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -719,6 +719,9 @@
   VAR1 (QUADOP_LANE, bfmlalb_lane_q, 0, ALL, v4sf)
   VAR1 (QUADOP_LANE, bfmlalt_lane_q, 0, ALL, v4sf)
 
+  /* Implemented by aarch64_vget_halfv8bf.  */
+  VAR1 (GETREG, vget_half, 0, ALL, v8bf)
+
   /* Implemented by aarch64_simd_mmlav16qi.  */
   VAR1 (TERNOP, simd_smmla, 0, NONE, v16qi)
   VAR1 (TERNOPU, simd_ummla, 0, NONE, v16qi)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9f0e2bd1e6f..f62c52ca327 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -7159,6 +7159,19 @@
   [(set_attr "type" "neon_dot")]
 )
 
+;; vget_low/high_bf16
+(define_expand "aarch64_vget_halfv8bf"
+  [(match_operand:V4BF 0 "register_operand")
+   (match_operand:V8BF 1 "register_operand")
+   (match_operand:SI 2 "aarch64_zero_or_1")]
+  "TARGET_BF16_SIMD"
+{
+  int hbase = INTVAL (operands[2]);
+  rtx sel = aarch64_gen_stepped_int_parallel (4, hbase * 4, 1);
+  emit_insn (gen_aarch64_get_halfv8bf (operands[0], operands[1], sel));
+  DONE;
+})
+
 ;; bfmmla
 (define_insn "aarch64_bfmmlaqv4sf"
   [(set (match_operand:V4SF 0 "register_operand" "=w")
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 50f8b23bc17..c6ac0b8dd17 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -35530,6 +35530,20 @@ vbfmlaltq_laneq_f32 (float32x4_t __r, bfloat16x8_t __a, bfloat16x8_t __b,
   return __builtin_aarch64_bfmlalt_lane_qv4sf (__r, __a, __b, __index);
 }
 
+__extension__ extern __inline bfloat16x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vget_low_bf16 (bfloat16x8_t __a)
+{
+  return __builtin_aarch64_vget_halfv8bf (__a, 0);
+}
+
+__extension__ extern __inline bfloat16x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vget_high_bf16 (bfloat16x8_t __a)
+{
+  return __builtin_aarch64_vget_halfv8bf (__a, 1);
+}
+
 __extension__ extern __inline bfloat16x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vcvt_bf16_f32 (float32x4_t __a)
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 215fcec5955..0c8bc2b0c73 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -84,6 +84,10 @@
 		 (ior (match_test "op == constm1_rtx")
 		  (match_test "op == const1_rtx"))
 
+(define_predicate "aarch64_zero_or_1"
+  (and (match_code "const_int")
+   (match_test "op == const0_rtx || op == const1_rtx")))
+
 (define_predicate "aarch64_reg_or_orr_imm"
(ior (match_operand 0 "register_operand")
 	(and (match_code "const_vector")
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bf16_dup.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bf16_dup.c
index c42c7acbbe9..35f4cb864f2 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bf16_dup.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bf16_dup.c
@@ -83,3 +83,14 @@ bfloat16_t test_vduph_laneq_bf16 (bfloat16x8_t a)
   return vduph_laneq_bf16 (a, 7);
 }
 /* { dg-final { scan-assembler-times "dup\\th\[0-9\]+, v\[0-9\]+\.h\\\[7\\\]" 2 } } */
+
+bfloat16x4_t test_vget_low_bf16 (bfloat16x8_t a)
+{
+  return vget_low_bf16 (a);
+}
+
+bfloat16x4_t test_vget_high_bf16 (bfloat16x8_t a)
+{
+  return vget_high_bf16 (a);
+}
+/* { dg-final { scan-assembler-times "dup\\td\[0-9\]+, v\[0-9\]+\.d\\\[1\\\]" 1 } } */


c++: Stop (most) function-scope entities having a template header

2020-10-29 Thread Nathan Sidwell


Currently push_template_decl (mostly) decides whether to add a
template header to an entity by seeing if it has DECL_LANG_SPECIFIC.
That might have been a useful predicate at one time, but basing
semantic implications on how we've decided to represent decls is bound
to be brittle.  And indeed it is, as more decls grow a use for
lang-specific.  In particular I discovered that function-scope
VAR_DECLs could grow lang-specific, and thereby get a template
header.  There's no need for that, and it breaks an invariant modules
was expecting.

This patch changes that, and bases the decision on the properties of
the decl.  In particular the only function-scope decl that gets a
template header is an implicit-typedef.

I also cleaned up the behaviour of it building a template-info only to
ignore it.
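
For readers less familiar with the terminology, here is a small user-level sketch of the function-scope entities being discussed. The names (twice, Doubler, copy) are made up for illustration. It assumes that the "implicit typedef" case corresponds to the type name introduced by a local class declaration. Whether a given entity gets a template header is internal to the compiler and cannot be observed from this code.

```cpp
#include <cassert>

template <typename T>
T twice(T value)
{
  // A function-scope variable: a local entity of the kind that,
  // per the description above, has no need for its own template header.
  T copy = value;

  // A function-scope class.  Declaring it also introduces the class name
  // as a type; this is assumed to be the implicit-typedef case.
  struct Doubler
  {
    T apply(T x) const { return x + x; }
  };

  return Doubler{}.apply(copy);
}
```

Both local entities depend on the enclosing template's parameter T, but neither is itself a template.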

gcc/cp/
* pt.c (push_template_decl): Do not give function-scope entities
other than implicit typedefs a template header. Do not re-add
template info to a redeclared template.


pushing to trunk

nathan
--
Nathan Sidwell
diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index fdeaa02c887..861e4e7020a 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -5682,11 +5682,6 @@ template_parm_outer_level (tree t, void *data)
 tree
 push_template_decl (tree decl, bool is_friend)
 {
-  int new_template_p = 0;
-  /* True if the template is a member template, in the sense of
- [temp.mem].  */
-  bool member_template_p = false;
-
   if (decl == error_mark_node || !current_template_parms)
 return error_mark_node;
 
@@ -5739,12 +5734,17 @@ push_template_decl (tree decl, bool is_friend)
   else
 is_primary = template_parm_scope_p ();
 
+  /* True if the template is a member template, in the sense of
+ [temp.mem].  */
+  bool member_template_p = false;
+
   if (is_primary)
 {
   warning (OPT_Wtemplates, "template %qD declared", decl);
 
   if (DECL_CLASS_SCOPE_P (decl))
 	member_template_p = true;
+
   if (TREE_CODE (decl) == TYPE_DECL
 	  && IDENTIFIER_ANON_P (DECL_NAME (decl)))
 	{
@@ -5812,11 +5812,16 @@ push_template_decl (tree decl, bool is_friend)
 	}
 }
 
+  bool local_p = (!DECL_IMPLICIT_TYPEDEF_P (decl)
+		  && ((ctx && TREE_CODE (ctx) == FUNCTION_DECL)
+		  || (VAR_OR_FUNCTION_DECL_P (decl)
+			  && DECL_LOCAL_DECL_P (decl))));
+
   /* Check to see that the rules regarding the use of default
  arguments are not being violated.  We check args for a friend
  functions when we know whether it's a definition, introducing
  declaration or re-declaration.  */
-  if (!is_friend || TREE_CODE (decl) != FUNCTION_DECL)
+  if (!local_p && (!is_friend || TREE_CODE (decl) != FUNCTION_DECL))
 check_default_tmpl_args (decl, current_template_parms,
 			 is_primary, is_partial, is_friend);
 
@@ -5869,14 +5874,20 @@ push_template_decl (tree decl, bool is_friend)
 return process_partial_specialization (decl);
 
   tree args = current_template_args ();
-
-  tree tmpl;
-  if (!ctx
-  || TREE_CODE (ctx) == FUNCTION_DECL
-  || (CLASS_TYPE_P (ctx) && TYPE_BEING_DEFINED (ctx))
-  || (TREE_CODE (decl) == TYPE_DECL && LAMBDA_TYPE_P (TREE_TYPE (decl)))
-  || (is_friend && !(DECL_LANG_SPECIFIC (decl)
-			 && DECL_TEMPLATE_INFO (decl))))
+  tree tmpl = NULL_TREE;
+  bool new_template_p = false;
+  if (local_p)
+{
+  /* Does not get a template head.  */
+  tmpl = NULL_TREE;
+  gcc_checking_assert (!is_primary);
+}
+  else if (!ctx
+	   || TREE_CODE (ctx) == FUNCTION_DECL
+	   || (CLASS_TYPE_P (ctx) && TYPE_BEING_DEFINED (ctx))
+	   || (TREE_CODE (decl) == TYPE_DECL && LAMBDA_TYPE_P (TREE_TYPE (decl)))
+	   || (is_friend && !(DECL_LANG_SPECIFIC (decl)
+			  && DECL_TEMPLATE_INFO (decl))))
 {
   if (DECL_LANG_SPECIFIC (decl)
 	  && DECL_TEMPLATE_INFO (decl)
@@ -5903,7 +5914,7 @@ push_template_decl (tree decl, bool is_friend)
 	{
 	  tmpl = build_template_decl (decl, current_template_parms,
   member_template_p);
-	  new_template_p = 1;
+	  new_template_p = true;
 
 	  if (DECL_LANG_SPECIFIC (decl)
 	  && DECL_TEMPLATE_SPECIALIZATION (decl))
@@ -5935,8 +5946,6 @@ push_template_decl (tree decl, bool is_friend)
 	  && DECL_TEMPLATE_SPECIALIZATION (decl)
 	  && DECL_MEMBER_TEMPLATE_P (tmpl))
 	{
-	  tree new_tmpl;
-
 	  /* The declaration is a specialization of a member
 	 template, declared outside the class.  Therefore, the
 	 innermost template arguments will be NULL, so we
@@ -5944,7 +5953,7 @@ push_template_decl (tree decl, bool is_friend)
 	 earlier call to check_explicit_specialization.  */
 	  args = DECL_TI_ARGS (decl);
 
-	  new_tmpl
+	  tree new_tmpl
 	= build_template_decl (decl, current_template_parms,
    member_template_p);
 	  DECL_TI_TEMPLATE (decl) = new_tmpl;
@@ -6016,7 +6025,7 @@ push_template_decl (tree decl, bool is_friend)
 	}
 }
 
-  gcc_checking_assert (DECL_TEMPLATE_RESULT (tmpl) == decl);
+  gcc_checking_assert (!tmpl || DECL_TEMPLATE_RESULT (tmpl) == decl);
 
   if (new_templa

Re: [PATCH] LTO: get_section: add new argument

2020-10-29 Thread Jan Hubicka


> From 33c58cab6bc0d779b11e7ffb36bfb485d73d6816 Mon Sep 17 00:00:00 2001
> From: Martin Liska 
> Date: Wed, 21 Oct 2020 11:11:03 +0200
> Subject: [PATCH] LTO: get_section: add new argument
> 
> gcc/ChangeLog:
> 
>   PR lto/97508
>   * langhooks.c (lhd_begin_section): Call get_section with
>   not_existing = true.
>   * output.h (get_section): Add new argument.
>   * varasm.c (get_section): Fail when NOT_EXISTING is true
>   and a section already exists.
>   * ipa-cp.c (ipcp_write_summary): Remove.
>   (ipcp_read_summary): Likewise.
>   * ipa-fnsummary.c (ipa_fn_summary_read): Always read jump
>   functions summary.
>   (ipa_fn_summary_write): Always stream it.

OK with ...
> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index ea0b59cf44a..207c9b077d1 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -277,10 +277,12 @@ get_noswitch_section (unsigned int flags, 
> noswitch_section_callback callback)
>  }
>  
>  /* Return the named section structure associated with NAME.  Create
> -   a new section with the given fields if no such structure exists.  */
> +   a new section with the given fields if no such structure exists.
> +   When NOT_EXISTING, then fail if the section already exists.  */
>  
>  section *
> -get_section (const char *name, unsigned int flags, tree decl)
> +get_section (const char *name, unsigned int flags, tree decl,
> +  bool not_existing)
>  {
>section *sect, **slot;
>  
> @@ -297,6 +299,12 @@ get_section (const char *name, unsigned int flags, tree 
> decl)
>  }
>else
>  {
> +  if (not_existing)
> + {
> +   error ("Section already exists: %qs", name);
> +   gcc_unreachable ();
> + }

internal_error here?
OK, I see that you do the checking in get_section, which is not used only for
LTO streaming.  I guess in that case you also want to do the same checking
in the place where we produce the .o file directly (during WPA->ltrans streaming).
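
The behaviour under review can be sketched outside GCC as a lookup-or-create table with a "must be new" flag. This is only an illustrative model: SectionTable, Section and the exception are hypothetical, the default value of the flag is an assumption, and the real get_section also takes a decl and reports the failure through GCC's diagnostics rather than by throwing.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for GCC's section structure.
struct Section
{
  std::string name;
  unsigned flags;
};

class SectionTable
{
public:
  // Return the section called NAME, creating it if it does not exist yet.
  // When NOT_EXISTING is true, finding an existing section is an error.
  Section &get_section(const std::string &name, unsigned flags,
                       bool not_existing = false)
  {
    auto it = sections_.find(name);
    if (it == sections_.end())
      return sections_.emplace(name, Section{name, flags}).first->second;
    if (not_existing)
      throw std::logic_error("section already exists: " + name);
    return it->second;
  }

private:
  std::map<std::string, Section> sections_;
};
```

A caller that expects to be the only writer of a section passes true and so finds out immediately when the same section name is opened twice.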

Honza
> +
>sect = *slot;
>/* It is fine if one of the sections has SECTION_NOTYPE as long as
>   the other has none of the contrary flags (see the logic at the end
> -- 
> 2.28.0
> 



[PATCH 2/7] C-SKY: Delete LO_REGS and HI_REGS, use HILO_REGS instead.

2020-10-29 Thread gengqi via Gcc-patches
gcc/ChangeLog:

* config/csky/constraints.md ("l", "h"): Delete.
* config/csky/csky.h (reg_class, REG_CLASS_NAMES,
REG_CLASS_CONTENTS):  Delete LO_REGS and HI_REGS.
* config/csky/csky.c (regno_reg_class,
csky_secondary_reload, csky_register_move_cost):
Use HILO_REGS instead of LO_REGS and HI_REGS.
---
 gcc/config/csky/constraints.md | 2 --
 gcc/config/csky/csky.c | 7 +++
 gcc/config/csky/csky.h | 8 
 3 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/gcc/config/csky/constraints.md b/gcc/config/csky/constraints.md
index 1f0bed2a..3f4d5df 100644
--- a/gcc/config/csky/constraints.md
+++ b/gcc/config/csky/constraints.md
@@ -24,8 +24,6 @@
 (define_register_constraint "b" "LOW_REGS"  "r0 - r15")
 (define_register_constraint "c" "C_REGS" "C register")
 (define_register_constraint "y" "HILO_REGS" "HI and LO registers")
-(define_register_constraint "l" "LO_REGS" "LO register")
-(define_register_constraint "h" "HI_REGS" "HI register")
 (define_register_constraint "v" "V_REGS" "vector registers")
 (define_register_constraint "z" "SP_REGS" "SP register")
 
diff --git a/gcc/config/csky/csky.c b/gcc/config/csky/csky.c
index c957443..24560b1 100644
--- a/gcc/config/csky/csky.c
+++ b/gcc/config/csky/csky.c
@@ -112,7 +112,7 @@ enum reg_class regno_reg_class[FIRST_PSEUDO_REGISTER] =
   /* Reserved.  */
   RESERVE_REGS,
   /* CC,HI,LO registers.  */
-  C_REGS,  HI_REGS, LO_REGS,
+  C_REGS,  HILO_REGS, HILO_REGS,
   /* Reserved.  */
   RESERVE_REGS, RESERVE_REGS, RESERVE_REGS, RESERVE_REGS,
   RESERVE_REGS, RESERVE_REGS, RESERVE_REGS, RESERVE_REGS,
@@ -2477,8 +2477,7 @@ csky_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
   /* We always require a general register when copying anything to
  HI/LO_REGNUM, except when copying an SImode value from HI/LO_REGNUM
  to a general register, or when copying from register 0.  */
-  if ((rclass == HILO_REGS || rclass == LO_REGS || rclass == HI_REGS)
-  && !CSKY_GENERAL_REGNO_P (regno))
+  if (rclass == HILO_REGS && !CSKY_GENERAL_REGNO_P (regno))
 return GENERAL_REGS;
 
   if (rclass == V_REGS && !CSKY_GENERAL_REGNO_P (regno))
@@ -6549,7 +6548,7 @@ csky_register_move_cost (machine_mode mode 
ATTRIBUTE_UNUSED,
|| (CLASS) == LOW_REGS)
 
 #define HILO_REG_CLASS_P(CLASS) \
-  ((CLASS) == HI_REGS || (CLASS) == LO_REGS || (CLASS) == HILO_REGS)
+  ((CLASS) == HILO_REGS)
 
 #define V_REG_CLASS_P(CLASS) \
   ((CLASS) == V_REGS)
diff --git a/gcc/config/csky/csky.h b/gcc/config/csky/csky.h
index 0906e86..0246906 100644
--- a/gcc/config/csky/csky.h
+++ b/gcc/config/csky/csky.h
@@ -685,8 +685,6 @@ enum reg_class
   LOW_REGS,
   GENERAL_REGS,
   C_REGS,
-  HI_REGS,
-  LO_REGS,
   HILO_REGS,
   V_REGS,
   OTHER_REGS,
@@ -706,8 +704,6 @@ enum reg_class
   "LOW_REGS",  \
   "GENERAL_REGS",  \
   "C_REGS",\
-  "HI_REGS",   \
-  "LO_REGS",   \
   "HILO_REGS", \
   "V_REGS",\
   "OTHER_REGS",\
@@ -731,10 +727,6 @@ enum reg_class
0x, 0x, 0x},/* GENERAL_REGS 
 */   \
   {0x, 0x0002, 0x, 0x,   \
0x, 0x, 0x},/* C_REGS   
 */   \
-  {0x, 0x0004, 0x, 0x,   \
-   0x, 0x, 0x},/* HI_REG   
 */   \
-  {0x, 0x0008, 0x, 0x,   \
-   0x, 0x, 0x},/* LO_REG   
 */   \
   {0x, 0x000c, 0x, 0x,   \
0x, 0x, 0x},/* HILO_REGS
 */   \
   {0x, 0xFFF0, 0x007FFF8F, 0x,   \
-- 
2.7.4



[PATCH 6/7] C-SKY: Cases for csky fpuv3 instructions.

2020-10-29 Thread gengqi via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc/testsuite/gcc.target/csky/fpuv3/fpuv3.exp: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_div.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fadd.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fdtos.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fftoi_rm.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fftoi_rz.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fhtos.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fitof.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fmov.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fmovi.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fmula.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fmuls.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fneg.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fnmula.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fnmuls.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fstod.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fstoh.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fsub.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fxtof.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_h.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_hs.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_hsz.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_hz.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_ls.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_lsz.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_lt.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_ltz.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_max.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_min.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_mul.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_mula.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_muls.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_ne.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_nez.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_recip.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_sqrt.c: New.
* gcc/testsuite/gcc.target/csky/fpuv3/fpv3_unordered.c: New.
---
 gcc/testsuite/gcc.target/csky/fpuv3/fpuv3.exp  | 50 +++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_div.c | 15 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fadd.c| 23 ++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fdtos.c   | 11 +++
 .../gcc.target/csky/fpuv3/fpv3_fftoi_rm.c  | 55 +
 .../gcc.target/csky/fpuv3/fpv3_fftoi_rz.c  | 41 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fhtos.c   | 11 +++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fitof.c   | 72 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fmov.c| 96 ++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fmovi.c   | 31 +++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fmula.c   | 23 ++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fmuls.c   | 23 ++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fneg.c| 22 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fnmula.c  | 14 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fnmuls.c  | 14 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fstod.c   | 11 +++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fstoh.c   | 11 +++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fsub.c| 23 ++
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_fxtof.c   | 76 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_h.c   | 20 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_hs.c  | 19 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_hsz.c | 21 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_hz.c  | 20 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_ls.c  | 19 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_lsz.c | 20 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_lt.c  | 19 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_ltz.c | 20 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_max.c | 16 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_min.c | 16 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_mul.c | 15 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_mula.c| 16 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_muls.c| 16 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_ne.c  | 19 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_nez.c | 21 +
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_recip.c   | 14 
 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_sqrt.c| 16 
 .../gcc.target/csky/fpuv3/fpv3_unordered.c | 29 +++
 37 files changed, 958 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/csky/fpuv3/fpuv3.exp
 create mode 100644 gcc/testsuite/gcc.target/csky/fpuv3/fpv3_div.c
 create mode 100644 gcc/testsuite/gcc.target/csk

[PATCH 3/7] C-SKY: Bug fix for bad setting of TARGET_DSP and TARGET_DIV.

2020-10-29 Thread gengqi via Gcc-patches
gcc/ChangeLog:

* config/csky/csky.c (csky_option_override):
Initialize csky_arch_isa_features[] earlier, so that TARGET_DSP
and TARGET_DIV are set correctly.
---
 gcc/config/csky/csky.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/config/csky/csky.c b/gcc/config/csky/csky.c
index 24560b1..18835f4 100644
--- a/gcc/config/csky/csky.c
+++ b/gcc/config/csky/csky.c
@@ -2680,6 +2680,18 @@ csky_option_override (void)
   TARGET_FDIVDU = 0;
 }
 
+  /* Initialize boolean versions of the architectural flags, for use
+ in the .md file.  */
+
+#undef CSKY_ISA
+#define CSKY_ISA(IDENT, DESC)\
+  {  \
+csky_arch_isa_features[CSKY_ISA_FEATURE_GET (IDENT)] =\
+  bitmap_bit_p (csky_active_target.isa, CSKY_ISA_FEATURE_GET (IDENT)); \
+  }
+#include "csky_isa.def"
+#undef CSKY_ISA
+
   /* Extended LRW instructions are enabled by default on CK801, disabled
  otherwise.  */
   if (TARGET_ELRW == -1)
@@ -2752,18 +2764,6 @@ csky_option_override (void)
   TARGET_MULTIPLE_STLD = 0;
 }
 
-  /* Initialize boolean versions of the architectural flags, for use
- in the .md file.  */
-
-#undef CSKY_ISA
-#define CSKY_ISA(IDENT, DESC)\
-  {  \
-csky_arch_isa_features[CSKY_ISA_FEATURE_GET (IDENT)] =\
-  bitmap_bit_p (csky_active_target.isa, CSKY_ISA_FEATURE_GET (IDENT)); \
-  }
-#include "csky_isa.def"
-#undef CSKY_ISA
-
   /* TODO  */
 
   /* Resynchronize the saved target options.  */
-- 
2.7.4
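
The ordering issue addressed by the patch above can be sketched with a small X-macro model. Everything here is hypothetical: DEMO_ISA_LIST stands in for csky_isa.def, DemoTarget for the target state, and resolve_defaults for whatever later code consults the feature array. The sketch assumes that such later code reads the array, which is why the array has to be filled first.

```cpp
#include <cassert>

// Hypothetical feature list standing in for csky_isa.def.
#define DEMO_ISA_LIST(X) \
  X(dsp)                 \
  X(div)

enum DemoFeature
{
#define X(IDENT) DEMO_FEATURE_##IDENT,
  DEMO_ISA_LIST(X)
#undef X
  DEMO_FEATURE_count
};

struct DemoTarget
{
  bool active_isa[DEMO_FEATURE_count];    // models csky_active_target.isa
  int arch_features[DEMO_FEATURE_count];  // models csky_arch_isa_features
  int target_dsp;                         // -1 means "not set by the user"
};

// Expand one assignment per feature, as the CSKY_ISA macro does.
void init_arch_features(DemoTarget &t)
{
#define X(IDENT) \
  t.arch_features[DEMO_FEATURE_##IDENT] = t.active_isa[DEMO_FEATURE_##IDENT];
  DEMO_ISA_LIST(X)
#undef X
}

// Later option handling that depends on the feature array (assumed).
void resolve_defaults(DemoTarget &t)
{
  if (t.target_dsp == -1)
    t.target_dsp = t.arch_features[DEMO_FEATURE_dsp];
}
```

Calling resolve_defaults before init_arch_features would read a still-zeroed array, which is the kind of wrong setting the reordering is meant to avoid.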



[PATCH 5/7] C-SKY: Add insn "ldbs".

2020-10-29 Thread gengqi via Gcc-patches
gcc/ChangeLog:

* config/csky/csky.md (cskyv2_sextend_ldbs): New insn.
---
 gcc/config/csky/csky.md | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/config/csky/csky.md b/gcc/config/csky/csky.md
index 62875bf..ce9c252 100644
--- a/gcc/config/csky/csky.md
+++ b/gcc/config/csky/csky.md
@@ -1533,6 +1533,7 @@
   }"
 )
 
+;; hi -> si
 (define_insn "extendhisi2"
   [(set (match_operand:SI0 "register_operand" "=r")
(sign_extend:SI (match_operand:HI 1 "register_operand" "r")))]
@@ -1557,6 +1558,15 @@
   "sextb  %0, %1"
 )
 
+(define_insn "*cskyv2_sextend_ldbs"
+  [(set (match_operand:SI0 "register_operand" "=r")
+(sign_extend:SI (match_operand:QI 1 "csky_simple_mem_operand" "m")))]
+  "CSKY_ISA_FEATURE (E2)"
+  "ld.bs\t%0, %1"
+  [(set_attr "length" "4")
+   (set_attr "type" "load")]
+)
+
 ;; qi -> hi
 (define_insn "extendqihi2"
   [(set (match_operand:HI0 "register_operand" "=r")
-- 
2.7.4
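
For context, the new pattern above matches a sign-extending load of a byte from memory into a 32-bit register. A source-level operation of that shape looks like the sketch below. The function name is made up, and whether a C-SKY compiler actually emits ld.bs for it depends on the target ISA and optimization options.

```cpp
#include <cassert>
#include <cstdint>

// Load one signed byte from memory and widen it to 32 bits.
// In RTL terms this is a sign_extend:SI of a QImode memory operand.
std::int32_t load_sign_extended_byte(const std::int8_t *p)
{
  return *p;
}
```

Without a combined load-and-extend instruction, the same source would need a plain byte load followed by a separate sextb.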



[PATCH 4/7] C-SKY: Separate FRAME_POINTER_REGNUM into FRAME_POINTER_REGNUM and HARD_FRAME_POINTER_REGNUM.

2020-10-29 Thread gengqi via Gcc-patches
gcc/ChangeLog:

* config/csky/csky.h
(FRAME_POINTER_REGNUM): Use HARD_FRAME_POINTER_REGNUM and
FRAME_POINTER_REGNUM instead of the single definition. The
single definition may not work well at simplify_subreg_regno().
(ELIMINABLE_REGS): Add for HARD_FRAME_POINTER_REGNUM.
* config/csky/csky.c (get_csky_live_regs, csky_can_eliminate,
csky_initial_elimination_offset, csky_expand_prologue,
csky_expand_epilogue): Add for HARD_FRAME_POINTER_REGNUM.
---
 gcc/config/csky/csky.c | 15 +--
 gcc/config/csky/csky.h |  7 +--
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/gcc/config/csky/csky.c b/gcc/config/csky/csky.c
index 18835f4..6382e89 100644
--- a/gcc/config/csky/csky.c
+++ b/gcc/config/csky/csky.c
@@ -1751,12 +1751,12 @@ get_csky_live_regs (int *count)
save = true;
 
   /* Frame pointer marked used.  */
-  else if (frame_pointer_needed && reg == FRAME_POINTER_REGNUM)
+  else if (frame_pointer_needed && reg == HARD_FRAME_POINTER_REGNUM)
save = true;
 
   /* This is required for CK801/802 where FP is a fixed reg, otherwise
 we end up with no FP value available to the DWARF-2 unwinder.  */
-  else if (crtl->calls_eh_return && reg == FRAME_POINTER_REGNUM)
+  else if (crtl->calls_eh_return && reg == HARD_FRAME_POINTER_REGNUM)
save = true;
 
   /* CK801/802 also need special handling for LR because it's clobbered
@@ -1832,6 +1832,8 @@ csky_layout_stack_frame (void)
 static bool
 csky_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
 {
+  if (to == FRAME_POINTER_REGNUM)
+return from != ARG_POINTER_REGNUM;
   if (to == STACK_POINTER_REGNUM)
 return !frame_pointer_needed;
   return true;
@@ -1852,6 +1854,7 @@ csky_initial_elimination_offset (int from, int to)
   switch (from)
 {
 case FRAME_POINTER_REGNUM:
+case HARD_FRAME_POINTER_REGNUM:
   offset = cfun->machine->reg_offset;
   break;
 
@@ -1866,7 +1869,7 @@ csky_initial_elimination_offset (int from, int to)
   /* If we are asked for the offset to the frame pointer instead,
  then subtract the difference between the frame pointer and stack
  pointer.  */
-  if (to == FRAME_POINTER_REGNUM)
+  if (to == FRAME_POINTER_REGNUM || to == HARD_FRAME_POINTER_REGNUM)
 offset -= cfun->machine->reg_offset;
   return offset;
 }
@@ -5789,7 +5792,7 @@ csky_expand_prologue (void)
  of the register save area.  */
   if (frame_pointer_needed)
 {
-  insn = emit_insn (gen_movsi (frame_pointer_rtx, stack_pointer_rtx));
+  insn = emit_insn (gen_movsi (hard_frame_pointer_rtx, stack_pointer_rtx));
   RTX_FRAME_RELATED_P (insn) = 1;
 }
 
@@ -5852,7 +5855,7 @@ csky_expand_epilogue (void)
   /* Restore the SP to the base of the register save area.  */
   if (frame_pointer_needed)
 {
-  insn = emit_move_insn (stack_pointer_rtx, frame_pointer_rtx);
+  insn = emit_move_insn (stack_pointer_rtx, hard_frame_pointer_rtx);
   RTX_FRAME_RELATED_P (insn) = 1;
 }
   else
@@ -6008,7 +6011,7 @@ csky_set_eh_return_address (rtx source, rtx scratch)
 
   if (frame_pointer_needed)
{
- basereg = frame_pointer_rtx;
+ basereg = hard_frame_pointer_rtx;
  delta = 0;
}
   else
diff --git a/gcc/config/csky/csky.h b/gcc/config/csky/csky.h
index 0246906..589c320 100644
--- a/gcc/config/csky/csky.h
+++ b/gcc/config/csky/csky.h
@@ -342,7 +342,8 @@ extern int csky_arch_isa_features[];
 #define STACK_POINTER_REGNUM  CSKY_SP_REGNUM
 
 /* Base register for access to local variables of the function.  */
-#define FRAME_POINTER_REGNUM  8
+#define FRAME_POINTER_REGNUM  36
+#define HARD_FRAME_POINTER_REGNUM  8
 
 /* Base register for access to arguments of the function.  This is a fake
register that is always eliminated.  */
@@ -370,7 +371,9 @@ extern int csky_arch_isa_features[];
 #define ELIMINABLE_REGS  \
 {{ ARG_POINTER_REGNUM,   STACK_POINTER_REGNUM},\
  { ARG_POINTER_REGNUM,   FRAME_POINTER_REGNUM},\
- { FRAME_POINTER_REGNUM,  STACK_POINTER_REGNUM   }}
+ { ARG_POINTER_REGNUM,   HARD_FRAME_POINTER_REGNUM   },\
+ { FRAME_POINTER_REGNUM,  STACK_POINTER_REGNUM   },\
+ { FRAME_POINTER_REGNUM,  HARD_FRAME_POINTER_REGNUM  }}
 
 /* Define the offset between two registers, one to be eliminated, and the
other its replacement, at the start of a routine.  */
-- 
2.7.4
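
The elimination rules introduced by the patch above can be modelled in a few lines. This is a simplified sketch, not GCC code: the register numbers 36 and 8 come from the patch, while the argument-pointer and stack-pointer numbers are placeholders, and choose_target assumes that the first permitted entry in table order is the one used.

```cpp
#include <cassert>

// 36 and 8 are taken from the patch; the other two values are
// placeholders chosen only for this sketch.
constexpr int kFramePointer = 36;     // FRAME_POINTER_REGNUM (soft)
constexpr int kHardFramePointer = 8;  // HARD_FRAME_POINTER_REGNUM
constexpr int kArgPointer = 100;      // hypothetical ARG_POINTER_REGNUM
constexpr int kStackPointer = 101;    // hypothetical STACK_POINTER_REGNUM

// Mirrors the logic of csky_can_eliminate after the patch.
bool can_eliminate(int from, int to, bool frame_pointer_needed)
{
  if (to == kFramePointer)
    return from != kArgPointer;
  if (to == kStackPointer)
    return !frame_pointer_needed;
  return true;
}

struct Elimination
{
  int from;
  int to;
};

// Same pairs, in the same order, as ELIMINABLE_REGS after the patch.
constexpr Elimination kEliminable[] = {
  {kArgPointer, kStackPointer},
  {kArgPointer, kFramePointer},
  {kArgPointer, kHardFramePointer},
  {kFramePointer, kStackPointer},
  {kFramePointer, kHardFramePointer},
};

// Return the first permitted replacement for FROM, or -1 if none applies.
int choose_target(int from, bool frame_pointer_needed)
{
  for (const Elimination &e : kEliminable)
    if (e.from == from && can_eliminate(e.from, e.to, frame_pointer_needed))
      return e.to;
  return -1;
}
```

Under these assumptions both the argument pointer and the soft frame pointer resolve to the stack pointer when no frame pointer is needed, and to the hard frame pointer (register 8) otherwise.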



Re: [PATCH] LTO: get_section: add new argument

2020-10-29 Thread Martin Liška

On 10/29/20 1:51 PM, Jan Hubicka wrote:



 From 33c58cab6bc0d779b11e7ffb36bfb485d73d6816 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 21 Oct 2020 11:11:03 +0200
Subject: [PATCH] LTO: get_section: add new argument

gcc/ChangeLog:

PR lto/97508
* langhooks.c (lhd_begin_section): Call get_section with
not_existing = true.
* output.h (get_section): Add new argument.
* varasm.c (get_section): Fail when NOT_EXISTING is true
and a section already exists.
* ipa-cp.c (ipcp_write_summary): Remove.
(ipcp_read_summary): Likewise.
* ipa-fnsummary.c (ipa_fn_summary_read): Always read jump
functions summary.
(ipa_fn_summary_write): Always stream it.


OK with ...

diff --git a/gcc/varasm.c b/gcc/varasm.c
index ea0b59cf44a..207c9b077d1 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -277,10 +277,12 @@ get_noswitch_section (unsigned int flags, 
noswitch_section_callback callback)
  }
  
  /* Return the named section structure associated with NAME.  Create

-   a new section with the given fields if no such structure exists.  */
+   a new section with the given fields if no such structure exists.
+   When NOT_EXISTING, then fail if the section already exists.  */
  
  section *

-get_section (const char *name, unsigned int flags, tree decl)
+get_section (const char *name, unsigned int flags, tree decl,
+bool not_existing)
  {
section *sect, **slot;
  
@@ -297,6 +299,12 @@ get_section (const char *name, unsigned int flags, tree decl)

  }
else
  {
+  if (not_existing)
+   {
+ error ("Section already exists: %qs", name);
+ gcc_unreachable ();
+   }


internal_error here?


Yep! Thanks for review.


OK, I see that you do checking in the get_section that is not lto
streaming only.  I guess in that case you also want to do same checking
in the place we produce .o file directly (during WPA->ltrans streaming).


You are right, it's the following call stack:

#0  simple_object_write_create_section (sobj=0x305e930, name=0x305c200 
".gnu.lto_.ipa_modref.38127cbcd426", align=3, errmsg=0x7fffd928, 
err=0x7fffd924) at /home/marxin/Programming/gcc/libiberty/simple-object.c:462
#1  0x0096e044 in lto_obj_begin_section (name=0x305c200 
".gnu.lto_.ipa_modref.38127cbcd426") at 
/home/marxin/Programming/gcc/gcc/lto/lto-object.c:333
#2  0x00dbc70e in lto_begin_section (name=0x305c200 
".gnu.lto_.ipa_modref.38127cbcd426", compress=false) at 
/home/marxin/Programming/gcc/gcc/lto-section-out.c:68
#3  0x00db153d in produce_asm (ob=0x307a3d0, fn=0x0) at 
/home/marxin/Programming/gcc/gcc/lto-streamer-out.c:2206
#4  0x00c78275 in (anonymous namespace)::modref_write () at 
/home/marxin/Programming/gcc/gcc/ipa-modref.c:1372
#5  0x00ea4ae0 in ipa_write_optimization_summaries_1 (pass=0x30715d0, 
state=0x305ed40) at /home/marxin/Programming/gcc/gcc/passes.c:2785
#6  0x00ea4be8 in ipa_write_optimization_summaries (encoder=0x305bce0) 
at /home/marxin/Programming/gcc/gcc/passes.c:2818
#7  0x0096996f in stream_out (temp_filename=0x305b280 "./a.ltrans0.o", 
encoder=0x305bce0, part=0) at /home/marxin/Programming/gcc/gcc/lto/lto.c:172
#8  0x00969af0 in stream_out_partitions_1 (temp_filename=0x305b280 
"./a.ltrans0.o", blen=10, min=0, max=1) at 
/home/marxin/Programming/gcc/gcc/lto/lto.c:218
#9  0x00969b4d in stream_out_partitions (temp_filename=0x305b280 
"./a.ltrans0.o", blen=10, min=0, max=1, last=true) at 
/home/marxin/Programming/gcc/gcc/lto/lto.c:235
#10 0x0096a274 in lto_wpa_write_files () at 
/home/marxin/Programming/gcc/gcc/lto/lto.c:394
#11 0x0096a75d in do_whole_program_analysis () at 
/home/marxin/Programming/gcc/gcc/lto/lto.c:537
#12 0x0096a86b in lto_main () at 
/home/marxin/Programming/gcc/gcc/lto/lto.c:637
#13 0x0102bc6c in compile_file () at 
/home/marxin/Programming/gcc/gcc/toplev.c:460
#14 0x0102eff4 in do_compile () at 
/home/marxin/Programming/gcc/gcc/toplev.c:2321
#15 0x0102f2eb in toplev::main (this=0x7fffdd3e, argc=17, 
argv=0x30496d0) at /home/marxin/Programming/gcc/gcc/toplev.c:2460
#16 0x0216ff50 in main (argc=17, argv=0x7fffde48) at 
/home/marxin/Programming/gcc/gcc/main.c:39

I'm going to prepare another patch for it, and I'm going to install
this patch.

Martin



Honza

+
sect = *slot;
/* It is fine if one of the sections has SECTION_NOTYPE as long as
   the other has none of the contrary flags (see the logic at the end
--
2.28.0
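
[Editorial sketch, not part of the patch.] The NOT_EXISTING behaviour discussed above can be illustrated with a small stand-alone lookup-or-create routine. Everything here is an assumption made for illustration: `section_info`, `section_table` and `get_section_sketch` are hypothetical stand-ins for GCC's section structure and hash table, and a thrown exception stands in for the compiler's own internal-error reporting.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for GCC's section structure.
struct section_info
{
  std::string name;
  unsigned int flags;
};

// Hypothetical stand-in for the real section hash table.
static std::map<std::string, section_info> section_table;

// Return the section called NAME, creating it if it does not exist.
// When NOT_EXISTING is true, an already existing section is treated
// as an internal error instead of being returned.
section_info *
get_section_sketch (const std::string &name, unsigned int flags,
                    bool not_existing = false)
{
  auto it = section_table.find (name);
  if (it == section_table.end ())
    {
      // No such section yet: create it with the given fields.
      it = section_table.emplace (name, section_info{name, flags}).first;
      return &it->second;
    }

  // The section is already there.  A caller that required a fresh
  // section has found a bug, so report it (modelled by an exception).
  if (not_existing)
    throw std::logic_error ("section already exists: " + name);

  return &it->second;
}
```

A second request for the same name succeeds by default and fails only when the caller insists on a fresh section, which is the distinction the review comment about using an internal error is concerned with.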







[PATCH 1/7] C-SKY: Add fpuv3 instructions and CK860 arch

2020-10-29 Thread gengqi via Gcc-patches
gcc/ChangeLog:

* config/csky/constraints.md ("W"): New constraint for mem operand
with base reg, index register.
("Q"): Renamed and modified "csky_valid_fpuv2_mem_operand" to
"csky_valid_mem_constraint_operand" to deal with both "Q" and "W"
constraint.
("Dv"): New constraint for const double value that can be used at
fmovi instruction.
* config/csky/csky-modes.def (HFmode): New mode.
* config/csky/csky-protos.h (csky_valid_fpuv2_mem_operand): Rename
to "csky_valid_mem_constraint_operand" and new support for constraint
"W".
(get_output_csky_movedouble_length): New.
(fpuv3_output_move): New.
(fpuv3_const_double): New.
* config/csky/csky.c (csky_option_override): New arch CK860 with fpv3.
(decompose_csky_address): Robustness adjust.
(csky_print_operand): New "CONST_DOUBLE" operand.
(csky_output_move): New support for fpv3 instructions.
(get_output_csky_movedouble_length): New.
(fpuv3_output_move): New.
(fpuv3_const_double): New.
(csky_emit_compare): New cover for float comparison.
(csky_emit_compare_float): Refine.
(csky_valid_fpuv2_mem_operand): Rename to
"csky_valid_mem_constraint_operand" and new support for constraint "W".
(ck860_rtx_costs): New.
(csky_rtx_costs): New subcall for CK860.
(regno_reg_class): New vregs for fpuv3.
(csky_dbx_regno): Likewise.
(csky_cpu_cpp_builtins): New builtin macro for fpuv3.
(csky_conditional_register_usage): New support for fpuv3.
(csky_dwarf_register_span): New support for fpuv3.
(csky_init_builtins, csky_mangle_type): New support for "__fp16" type.
(ck810_legitimate_index_p): New support for fp16.
* gcc/config/csky/csky.h (TARGET_TLS): Add CK860.
(CSKY_VREG_P, CSKY_VREG_LO_P, CSKY_VREG_HI_P): New support for fpuv3.
(TARGET_SINGLE_FPU): New support for fpuv3.
(TARGET_SUPPORT_FPV3): New macro.
(FIRST_PSEUDO_REGISTER): Value change, since the new fpuv3 regs.
(FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS, REGISTER_NAMES,
 REG_CLASS_CONTENTS): Support for fpuv3.
* gcc/config/csky/csky.md (movsf): Move to csky_insn_fpu.md and adjust.
(csky_movsf_fpv2): Likewise.
(ck801_movsf): Likewise.
(csky_movsf): Likewise.
(movdf): Likewise.
(csky_movdf_fpv2): Likewise.
(ck801_movdf): Likewise.
(csky_movdf): Likewise.
(movsicc): Refine. Use "comparison_operator" instead of
"ordered_comparison_operator".
(addsicc): Likewise.
(CSKY_FIRST_VFP3_REGNUM, CSKY_LAST_VFP3_REGNUM): New constant.
(call_value_internal_vh): New insn.
* config/csky/csky_cores.def (CK860): New arch and cpu.
(fpv3): New 4 fpus: fpv3_hf, fpv3_hsf, fpv3_sdf and fpv3.
* config/csky/csky_insn_fpu.md (mov): Move the float mov
patterns from csky.md here.
(fpuv2 instructions): Refactor. Separate all float patterns into
emit-patterns and match-patterns, remain the emit-patterns here, and
move the match-patterns to csky_insn_fpuv2.md.
(fpuv3 instructions): Add patterns and fuse part of them with the
fpuv2's.
* config/csky/csky_insn_fpuv2.md: New file for fpuv2 instructions.
* config/csky/csky_insn_fpuv3.md: New file and new patterns for fpuv3
instructions.
* config/csky/csky_isa.def (fcr): New.
(fpv3): New 4 isa sets: fpv3_hi, fpv3_hf, fpv3_sf and fpv3_df.
(CK860): New definition for ck860.
* gcc/config/csky/csky_tables.opt (ck860): New processors ck860,
ck860f. And new arch ck860.
(fpv3): New 4 fpus: fpv3_hf, fpv3_hsf, fpv3_sdf and fpv3.
* config/csky/predicates.md (csky_float_comparsion_operator): Delete
"geu", "gtu", "leu", "ltu", which will never appear at float comparison.
* config/csky/t-csky-elf, config/csky/t-csky-linux: New for ck860.
* doc/md.texi: Add "Q" and "W" constraints for C-SKY.
---
 gcc/config/csky/constraints.md |  13 +-
 gcc/config/csky/csky-modes.def |   2 +
 gcc/config/csky/csky-protos.h  |   7 +-
 gcc/config/csky/csky.c | 650 ++
 gcc/config/csky/csky.h | 162 ++--
 gcc/config/csky/csky.md| 127 ++
 gcc/config/csky/csky_cores.def |  13 +
 gcc/config/csky/csky_insn_fpu.md   | 798 +++--
 gcc/config/csky/csky_insn_fpuv2.md | 470 ++
 gcc/config/csky/csky_insn_fpuv3.md | 506 +++
 gcc/config/csky/csky_isa.def   |  15 +
 gcc/config/csky/csky_tables.opt|  21 +
 gcc/config/csky/predicates.md  |   3 +-
 gcc/config/csky/t-csky-elf |   9 +-
 gcc/config/csky/t-csky-linux   |  11 +-
 gcc/doc/md.texi|   8 +
 16 files changed

[patch] vxworks: Introduce support for vxworks7r2 on x86 and x86_64

2020-10-29 Thread Olivier Hainque
Hello,

This change extends the VxWorks support on intel CPUs to
VxWorks7r2, with a "mcmodel=large" additional multilib for
the 64bit configuration.

The support for fPIC is not functional yet for this model,
so we just don't add the corresponding multilib.

Following the trend already visible on powerpc, the base system
environment is getting extremely close to the standard Linux one,
so fPIC and usual kernel vs RTP distinction apart, the new port 
essentially just leverages the base ELF definitions.

We extend the range of CPU families handled by TARGET_OS_CPP_BUILTINS,
accounting for the fact that archs older than PENTIUM4 are
not supported (any more) by VxWorks 7. As we did for powerpc, we
leverage VX_CPU_PREFIX to emit different forms of definitions for
different families of VxWorks as the system headers' expectations
have evolved between Vx 5, 6 and 7.

With libquadmath disabled (compile time confusion from an
imprecise definition of “howmany” in some variants of the system
headers), we get decent gcc/g++/libstdc++ testsuite results with
this change on a gcc-9 source base, for both 32 and 64bit environments,
for both kernel and RTP modes.

I was able to get good gcc and g++ results for RTPs with this change
applied on a gcc-10 source base and sanity checked that a build with
mainline sources proceeds to completion.

Olivier

2020-10-27  Olivier Hainque  

gcc/
* config.gcc: Adjust the ix86/x86_64-wrs-vxworks filters
to apply to VxWorks 7 as well.
* config/i386/t-vxworks (MULTILIB_OPTIONS, MULTILIB_DIRNAMES):
Remove the fPIC multilib and add one for the large code model
on x86_64.
* config/i386/vxworks.h: Separate sections for TARGET_VXWORKS7,
other variants and common bits.
(TARGET_OS_CPP_BUILTINS): Augment to support a range of CPU
families. Leverage VX_CPU_PREFIX.
(CC1_SPEC): Add definition.
(STACK_CHECK_PROTECT): Use conditional expression instead of
heavier to read conditioned macro definitions.

libgcc/
* config.host: Adjust the ix86/x86_64-wrs-vxworks filters
to apply to VxWorks 7 as well.

Co-authored-by:  Douglas Rupp
Co-authored-by:  Pat Bernardi

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d14a1a3e8124..b169f2fc3aad 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2050,7 +2050,7 @@ i[34567]86-*-solaris2* | x86_64-*-solaris2*)
esac
fi
;;
-i[4567]86-wrs-vxworks|i[4567]86-wrs-vxworksae|i[4567]86-wrs-vxworks7|x86_64-wrs-vxworks7)
+i[4567]86-wrs-vxworks*|x86_64-wrs-vxworks7*)
tm_file="${tm_file} i386/unix.h i386/att.h elfos.h"
case ${target} in
  x86_64-*)
diff --git a/gcc/config/i386/t-vxworks b/gcc/config/i386/t-vxworks
index c440b1f90317..8f5e8c73b71e 100644
--- a/gcc/config/i386/t-vxworks
+++ b/gcc/config/i386/t-vxworks
@@ -1,8 +1,19 @@
 # Multilibs for VxWorks.
 
-# Build multilibs for normal, -mrtp, and -mrtp -fPIC.
-MULTILIB_OPTIONS = mrtp fPIC
-MULTILIB_DIRNAMES =
+# The common variant across the board is for -mrtp
+MULTILIB_OPTIONS = mrtp
+MULTILIB_DIRNAMES = mrtp
+
+# Then variants for the "large" code model on x86_64, or fPIC on x86,
+# RTP only. -fPIC -mrtp -mcmodel=large is not functional yet.
+ifneq (,$(findstring x86_64, $(target)))
+MULTILIB_OPTIONS += mcmodel=large
+MULTILIB_DIRNAMES += large
+else
+MULTILIB_OPTIONS += fPIC
+MULTILIB_DIRNAMES += fPIC
 MULTILIB_MATCHES = fPIC=fpic
-MULTILIB_EXCEPTIONS = fPIC
 
+# -fPIC is only supported in combination with -mrtp
+MULTILIB_EXCEPTIONS = fPIC
+endif
diff --git a/gcc/config/i386/vxworks.h b/gcc/config/i386/vxworks.h
index ad9404b40ccd..891b4ff04b5f 100644
--- a/gcc/config/i386/vxworks.h
+++ b/gcc/config/i386/vxworks.h
@@ -18,12 +18,21 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+/* VxWorks after 7 SR0600 use the ELF ABI and the system environment is llvm
+   based.  Earlier versions have GNU based environment components and use the
+   same ABI as Solaris 2.  */
+
+#if TARGET_VXWORKS7
+
+#undef VXWORKS_PERSONALITY
+#define VXWORKS_PERSONALITY "llvm"
+
+#else
+
 #undef ASM_OUTPUT_ALIGNED_BSS
 #define ASM_OUTPUT_ALIGNED_BSS(FILE, DECL, NAME, SIZE, ALIGN) \
   asm_output_aligned_bss (FILE, DECL, NAME, SIZE, ALIGN)
 
-/* VxWorks uses the same ABI as Solaris 2, so use i386/sol2.h version.  */
-
 #undef TARGET_SUBTARGET_DEFAULT
 #define TARGET_SUBTARGET_DEFAULT \
(MASK_80387 | MASK_IEEE_FP | MASK_FLOAT_RETURNS | MASK_VECT8_RETURNS)
@@ -41,43 +50,73 @@ along with GCC; see the file COPYING3.  If not see
 #undef SIZE_TYPE
 #define SIZE_TYPE (TARGET_LP64 ? "long unsigned int" : "unsigned int")
 
+/* We cannot use PC-relative accesses for VxWorks PIC because there is no
+   fixed gap between segments.  */
+#undef ASM_PREFERRED_EH_DATA_FORMAT
+
 #if TARGET_64BIT_DEFAULT
 #undef VXWORKS_SYSCALL_LIBS_RTP
 #define VXWORKS_SYS

Re: [PATCH] Selectively trap if ranger and vr-values disagree on range builtins.

2020-10-29 Thread Andrew MacLeod via Gcc-patches

On 10/27/20 11:29 AM, Aldy Hernandez wrote:

The UBSAN builtins degrade into PLUS/MINUS/MULT and call
extract_range_from_binary_expr, which as the PR shows, can special
case some symbolics which the ranger doesn't currently handle.

Looking at vr_values::extract_range_builtin(), I see that every single
place where we ask for a range, we bail on non-integers (symbolics,
etc).  That is, with the exception of the UBSAN builtins.

Since this seems to be particular to UBSAN, we could still go with the
original plan of removing the duplication in ranger vs vr-values, but
leave in the UBSAN builtin handling.  This isn't ideal, as we'd like
to remove all the common code, but I'd be willing to put up with UBSAN
duplication for the time being.

This patch disables the assert on the UBSAN builtins, while still
trapping if any other differences are found between the vr_values and
the ranger versions of builtin range handling.

As a follow-up, once Fedora can test this approach, I'll remove all
the builtin code from extract_range_builtin, with the exception of the
UBSAN stuff (renaming it to extract_range_ubsan_builtin).

Since the builtin code has proven fickle across architectures, I've
tested this with {-m32,-m64,-fsanitize=signed-integer-overflow} on
x86, ppc64le, and aarch64.  I think this should be enough.  If it
isn't, we can revert the patch, and leave the duplicate code until
the next release cycle when hopefully vr_values, evrp, and friends
will all be overhauled.

Andrew, do you have any thoughts on this?



OK.

I think we want to remove as much duplication as possible, which will
then give us confidence that the ranger versions are correct.
The UBSAN versions will have to get tighter ranger integration with
relationals when they are available in order to handle things like this.


I don't suppose you can create a testcase for this?   Otherwise we'll
have to tag it somehow so we don't forget to come back to it when we
start handling these ubsan builtins differently.


Andrew

PS. and might as well create and test the follow up patch so it's ready
to go.  I'd leave this as-is with the asserts for a week or so.
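
[Editorial sketch, not code from the patch.] The approach described in this thread, trapping when the two range implementations disagree except for the UBSAN builtins, might be modelled roughly as follows. All names (`int_range`, `builtin_kind`, `must_trap`) are hypothetical and do not correspond to GCC's actual classes or functions.

```cpp
#include <cassert>

// Hypothetical closed integer range [lo, hi].
struct int_range
{
  long lo;
  long hi;
};

// Hypothetical classification of the builtin being analysed.
enum class builtin_kind
{
  ubsan_check,  // e.g. an overflow-checking UBSAN builtin
  other
};

static bool
same_range (const int_range &a, const int_range &b)
{
  return a.lo == b.lo && a.hi == b.hi;
}

// Return true if the cross-check should trap: the two implementations
// disagree and the builtin is not one of the exempted UBSAN ones.
bool
must_trap (builtin_kind kind, const int_range &from_vr_values,
           const int_range &from_ranger)
{
  if (kind == builtin_kind::ubsan_check)
    return false;  // known divergence on symbolics, tolerated for now
  return !same_range (from_vr_values, from_ranger);
}
```

The exemption is deliberately narrow: any disagreement outside the UBSAN builtins still trips the check, so the duplicated code keeps being validated until it can be removed.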







Re: [RS6000] float128-type-2.c unsupported

2020-10-29 Thread David Edelsohn via Gcc-patches
On Thu, Oct 29, 2020 at 12:53 AM Alan Modra  wrote:
>
> On Wed, Oct 28, 2020 at 11:35:07PM -0400, David Edelsohn wrote:
> > Alan,
> >
> > It is disrespectful for you to ignore the review of a maintainer and
> > your colleague.  You may not pick and choose amongst maintainers.  And
> > Segher should not be so disrespectful as to contradict his colleague
> > and co-maintainer.
>
> I'm sorry you see this as a matter of respect.  I didn't see it that
> way at all.  Segher disagreed with your review, and gave sufficient
> technical reason for me to commit the patch.

Alan,

This isn't how patch review works, and you know that.  Segher also
knows that.  Ignoring my decision while accepting Segher's clearly is
a measure of respect.  You don't get to decide whose justification or
reasons are appropriate -- that is a flimsy rationalization for your
unprofessional behavior.

If you or Segher disagree with my review, we can discuss it like
mature adults and reach a mutually-acceptable outcome.  I'm a
reasonable person.

Collaboratively developing software is not a game of playing
maintainers off one another like children gaming parents until they
receive the answer that they want.

Both you and Segher are adults and have leadership roles in the GNU
Toolchain. Everyone is under additional stress because of the global
health and economic situations, as well as technical projects, but
that is not an excuse. Both of you are capable of better behavior and
I wish to see it demonstrated.

Thank You,
David


[patch] vxworks: Predefine __ppc and __ppc__ for VxWorks 7

2020-10-29 Thread Olivier Hainque

Unfortunately, some VxWorks 7r2 system headers rely on a
couple more variations of the predefined macros expected
to characterize a “powerpc” target that we discussed recently.

setjmp.h, for example, relies on __ppc and the absence of
a definition results in “gcc” dejagnu test failures from all
the tests #including that header, which stumble on:

#error "_JBLEN not set!"

The other case is __ppc__ expected by yvals.h, key to libstdc++.

This change adjusts the VxWorks 7 section of our configuration
to honor those expectations.

Olivier

2020-10-29  Olivier Hainque

gcc/
* config/rs6000/vxworks.h (TARGET_OS_CPP_BUILTINS): Also
builtin_define __ppc and __ppc__ for VxWorks 7.

diff --git a/gcc/config/rs6000/vxworks.h b/gcc/config/rs6000/vxworks.h
index 9dabdab323ab..51a3250f5dcc 100644
--- a/gcc/config/rs6000/vxworks.h
+++ b/gcc/config/rs6000/vxworks.h
@@ -70,6 +70,12 @@ along with GCC; see the file COPYING3.  If not see
  builtin_define ("__PPC"); \
  builtin_define ("__powerpc"); \
}   \
+   \
+ /* __ppc isn't emitted by the system compiler \
+any more but a few system headers still depend \
+on it, as well as on __ppc__.  */  \
+ builtin_define ("__ppc"); \
+ builtin_define ("__ppc__");   \
}   \
\
   /* Asserts for #cpu and #machine.  */\
-- 
2.17.1
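
[Editorial sketch, not taken from the VxWorks headers.] The failure mode described above, where a header refuses to compile because no recognised target macro is defined, can be reproduced in miniature. The `#define` of `__ppc` below only emulates what the patched compiler would predefine, and `DEMO_JBLEN` with its value 64 is a hypothetical placeholder, not the real `_JBLEN`.

```cpp
#include <cassert>

// Assumption: emulate the predefine that the patch adds.  On a real
// VxWorks 7 powerpc toolchain the compiler would supply this itself.
#ifndef __ppc
#define __ppc 1
#endif

// Hypothetical header logic in the style of the failing setjmp.h:
// the buffer length is only set for target macros the header knows.
#if defined (__ppc)
#define DEMO_JBLEN 64  /* placeholder value for illustration */
#endif

#ifndef DEMO_JBLEN
#error "_JBLEN not set!"
#endif

// Expose the selected value so it can be checked at run time.
int
demo_jblen ()
{
  return DEMO_JBLEN;
}
```

Removing the emulated predefine on a target that does not define `__ppc` makes the `#error` fire, which mirrors the dejagnu failures reported for tests that include setjmp.h.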



C++ patch ping

2020-10-29 Thread Jakub Jelinek via Gcc-patches
Hi!

I'd like to ping 2 patches:

- https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556370.html
  PR95808 - diagnose constexpr delete [] new int; and delete new int[N];

- https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556548.html
  PR97388 - fix up constexpr evaluation of arguments passed by invisible 
reference

Thanks

Jakub



[committed] libstdc++: Fix memory issue in ranges::lexicographical_compare testcase

2020-10-29 Thread Patrick Palka via Gcc-patches
libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/lexicographical_compare/constrained.cc:
(test03): Fix initializing the vector vy with the array y of size 4.
---
 .../25_algorithms/lexicographical_compare/constrained.cc| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/constrained.cc b/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/constrained.cc
index b82c872..2019bbc75e4 100644
--- a/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/constrained.cc
+++ b/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/constrained.cc
@@ -136,7 +136,7 @@ test03()
   VERIFY( !ranges::lexicographical_compare(cy.begin(), cy.end(),
   cz.begin(), cz.end()) );
 
-  std::vector vx(x, x+5), vy(y, y+5);
+  std::vector vx(x, x+5), vy(y, y+4);
   VERIFY( ranges::lexicographical_compare(vx, vy) );
   VERIFY( !ranges::lexicographical_compare(vx, vy, ranges::greater{}) );
   VERIFY( !ranges::lexicographical_compare(vy, vx) );
-- 
2.29.0.rc0
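
[Editorial sketch, not part of the committed change.] The bug fixed above is an end pointer written by hand (`y+5`) that runs past a 4-element array. One way to avoid that class of error is to let the array type supply its own bounds. The array contents below are assumptions chosen for illustration (the thread only states that y has 4 elements), and `std::lexicographical_compare` is used instead of the ranges version so the example does not require C++20.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical data: only the size of y (4) comes from the thread.
static const int x[] = {1, 2, 3, 4, 5};
static const int y[] = {1, 2, 3, 5};

// Number of elements copied out of y when the bounds come from the
// array type rather than from a hand-written offset.
std::size_t
copied_from_y ()
{
  std::vector<int> vy (std::begin (y), std::end (y));
  return vy.size ();
}

// Compare the two sequences after copying them into vectors.
bool
x_less_than_y ()
{
  // std::begin/std::end derive the bounds from the array type, so the
  // end iterator cannot drift past the array as y+5 did.
  std::vector<int> vx (std::begin (x), std::end (x));
  std::vector<int> vy (std::begin (y), std::end (y));
  return std::lexicographical_compare (vx.begin (), vx.end (),
                                       vy.begin (), vy.end ());
}
```

With these assumed values the comparison is decided at the fourth element (4 versus 5), so no element beyond the end of y is ever needed.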



[PATCH] d: Add dragonflybsd support for D compiler and runtime

2020-10-29 Thread Iain Buclaw via Gcc-patches
Hi,

This patch adds the necessary version conditions and configure rules in
place to allow building the D compiler on DragonFlyBSD.

Running the testsuite, all core tests pass, with a couple failures
relating to CTFE math support which are not blocking the library from
being usable, and will be fixed in a follow-up.

OK for mainline?

Regards
Iain

---
gcc/ChangeLog:

* config.gcc (*-*-dragonfly*): Add dragonfly-d.o and t-dragonfly.
* config/dragonfly-d.c: New file.
* config/t-dragonfly: New file.

libphobos/ChangeLog:

* configure.tgt: Add *-*-dragonfly* as a supported target.
* configure: Regenerate.
* m4/druntime/os.m4 (DRUNTIME_OS_SOURCES): Add dragonfly* as a posix
target.
---
 gcc/config.gcc  |  3 +++
 gcc/config/dragonfly-d.c| 37 +
 gcc/config/t-dragonfly  | 21 +
 libphobos/configure |  2 +-
 libphobos/configure.tgt |  3 +++
 libphobos/m4/druntime/os.m4 |  2 +-
 6 files changed, 66 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/dragonfly-d.c
 create mode 100644 gcc/config/t-dragonfly

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d14a1a3e812..8fff8da1dd0 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -731,6 +731,9 @@ case ${target} in
   extra_options="$extra_options rpath.opt dragonfly.opt"
   default_use_cxa_atexit=yes
   use_gcc_stdint=wrap
+  d_target_objs="${d_target_objs} dragonfly-d.o"
+  tmake_file="${tmake_file} t-dragonfly"
+  target_has_targetdm=yes
   ;;
 *-*-freebsd*)
   # This is the generic ELF configuration of FreeBSD.  Later
diff --git a/gcc/config/dragonfly-d.c b/gcc/config/dragonfly-d.c
new file mode 100644
index 000..70ec820b75d
--- /dev/null
+++ b/gcc/config/dragonfly-d.c
@@ -0,0 +1,37 @@
+/* DragonFly support needed only by D front-end.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm_d.h"
+#include "d/d-target.h"
+#include "d/d-target-def.h"
+
+/* Implement TARGET_D_OS_VERSIONS for DragonFly targets.  */
+
+static void
+dragonfly_d_os_builtins (void)
+{
+  d_add_builtin_version ("DragonFlyBSD");
+  d_add_builtin_version ("Posix");
+}
+
+#undef TARGET_D_OS_VERSIONS
+#define TARGET_D_OS_VERSIONS dragonfly_d_os_builtins
+
+struct gcc_targetdm targetdm = TARGETDM_INITIALIZER;
diff --git a/gcc/config/t-dragonfly b/gcc/config/t-dragonfly
new file mode 100644
index 000..764ced9cd91
--- /dev/null
+++ b/gcc/config/t-dragonfly
@@ -0,0 +1,21 @@
+# Copyright (C) 2020 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+dragonfly-d.o: $(srcdir)/config/dragonfly-d.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)
diff --git a/libphobos/configure b/libphobos/configure
index 4c1116d6f80..455f338a9e8 100755
--- a/libphobos/configure
+++ b/libphobos/configure
@@ -14283,7 +14283,7 @@ fi
 
   druntime_target_posix="no"
   case "$druntime_cv_target_os" in
-aix*|*bsd*|cygwin*|darwin*|gnu*|linux*|skyos*|*solaris*|sysv*)
+aix*|*bsd*|cygwin*|darwin*|dragonfly*|gnu*|linux*|skyos*|*solaris*|sysv*)
   druntime_target_posix="yes"
   ;;
   esac
diff --git a/libphobos/configure.tgt b/libphobos/configure.tgt
index 94e42bf5509..1ea9e0c804c 100644
--- a/libphobos/configure.tgt
+++ b/libphobos/configure.tgt
@@ -24,6 +24,9 @@
 LIBPHOBOS_SUPPORTED=no
 LIBDRUNTIME_ONLY=auto
 case "${target}" in
+  *-*-dragonfly*)
+   LIBPHOBOS_SUPPORTED=yes
+   ;;
   aarch64*-*-linux*)
LIBPHOBOS_SUPPORTED=yes
;;
diff --git a/libphobos/m4/druntime/os.m4 b/libphobos/m4/druntime/os.m4
index 47d4c6a6c80..ed93e30f1e9 100644
--- a/libphobos/m4/druntime/os.m4
+++ b/libphobos/m4/druntime/os.m4
@@ -112,7 
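
[Editorial sketch, not part of the patch.] The libphobos/configure hunk above adds `dragonfly*` to the list of POSIX-like systems because the existing `*bsd*` pattern does not match a target OS string that starts with "dragonfly". This can be checked with POSIX `fnmatch`, whose glob rules are close to those of a shell `case` statement. The OS strings used in the test ("dragonfly5.8", "freebsd12.1") are hypothetical examples, and the code assumes a POSIX host that provides `<fnmatch.h>`.

```cpp
#include <cassert>
#include <cstddef>
#include <fnmatch.h>

// Pattern list before the patch (from the removed line of the hunk).
static const char *const old_patterns[] = {
  "aix*", "*bsd*", "cygwin*", "darwin*", "gnu*",
  "linux*", "skyos*", "*solaris*", "sysv*"
};

// Pattern list after the patch (from the added line of the hunk).
static const char *const new_patterns[] = {
  "aix*", "*bsd*", "cygwin*", "darwin*", "dragonfly*", "gnu*",
  "linux*", "skyos*", "*solaris*", "sysv*"
};

// Return true if OS matches at least one glob pattern in PATTERNS.
template <std::size_t N>
bool
matches_any (const char *const (&patterns)[N], const char *os)
{
  for (const char *pattern : patterns)
    if (fnmatch (pattern, os, 0) == 0)  // 0 means "matched"
      return true;
  return false;
}
```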

Re: [PING] [PATCH] S/390: Do not turn maybe-uninitialized warnings into errors

2020-10-29 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches  writes:
> On 10/28/20 11:29 AM, Stefan Schulze Frielinghaus wrote:
>> On Wed, Oct 28, 2020 at 08:39:41AM -0600, Jeff Law wrote:
>>> On 10/28/20 3:38 AM, Stefan Schulze Frielinghaus via Gcc-patches wrote:
>>>> On Mon, Oct 05, 2020 at 02:02:57PM +0200, Stefan Schulze Frielinghaus via
>>>> Gcc-patches wrote:
>>>>> On Tue, Sep 22, 2020 at 02:59:30PM +0200, Andreas Krebbel wrote:
>>>>>> On 15.09.20 17:02, Stefan Schulze Frielinghaus wrote:
>>>>>>> Over the last couple of months quite a few warnings about uninitialized
>>>>>>> variables were raised while building GCC.  A reason why these warnings
>>>>>>> show up on S/390 only is due to the aggressive inlining settings here.
>>>>>>> Some of these warnings (2c832ffedf0, b776bdca932, 2786c0221b6,
>>>>>>> 1657178f59b) could be fixed or in case of a false positive silenced by
>>>>>>> initializing the corresponding variable.  Since the latter reoccurs and
>>>>>>> while bootstrapping such warnings are turned into errors bootstrapping
>>>>>>> fails on S/390 consistently.  Therefore, for the moment do not turn
>>>>>>> those warnings into errors.
>>>>>>>
>>>>>>> config/ChangeLog:
>>>>>>>
>>>>>>> * warnings.m4: Do not turn maybe-uninitialized warnings into
>>>>>>> errors
>>>>>>> on S/390.
>>>>>>>
>>>>>>> fixincludes/ChangeLog:
>>>>>>>
>>>>>>> * configure: Regenerate.
>>>>>>>
>>>>>>> gcc/ChangeLog:
>>>>>>>
>>>>>>> * configure: Regenerate.
>>>>>>>
>>>>>>> libcc1/ChangeLog:
>>>>>>>
>>>>>>> * configure: Regenerate.
>>>>>>>
>>>>>>> libcpp/ChangeLog:
>>>>>>>
>>>>>>> * configure: Regenerate.
>>>>>>>
>>>>>>> libdecnumber/ChangeLog:
>>>>>>>
>>>>>>> * configure: Regenerate.
>>>>>> That change looks good to me. Could a global reviewer please comment!
>>>>> Ping
>>>> Ping
>>> I think this would be a huge mistake to install.
>> The root cause why those false positives show up on S/390 only seems to
>> be of more aggressive inlining w.r.t. other architectures.  Because of
>> bigger caches and a rather huge function call overhead we greatly
>> benefit from those inlining parameters. Thus:
>>
>> 1) Reverting those parameters would have a negative performance impact.
>>
>> 2) Fixing the maybe-uninitialized warnings analysis itself seems not to
>>happen in the near future (assuming that it is fixable at all).
>>
>> 3) Silencing the warning by initialising the variable itself also seems
>>to be undesired and feels like a fight against windmills ;-)
>>
>> 4) Not lifting maybe-uninitialized warnings to errors on S/390 only.
>>
>> Option (4) has the least intrusive effect to me.  At least then it is
>> not necessary to bootstrap with --disable-werror and we would still
>> treat all other warnings as errors.  All maybe-uninitialized warnings
>> which are triggered in common code with non-aggressive inlining are
>> still caught by other architectures.  Therefore, I'm wondering why this
>> should be a huge mistake?  What would you propose instead?
>
> I'm aware of all that.  What I think it all argues is that y'all need to
> address the issues because of how you've changed the tuning on the s390
> port.  Simply disabling things like you've suggested is, IMHO, horribly
> wrong.
>
>
> Improve the analysis, dummy initializers, pragmas all seem viable.  But
> again, it feels like it's something the s390 maintainers will have to
> take the lead on because of how you've retuned the port.
>
>
> And note that this isn't just an issue with uninitialized warnings, the
> changes in inlining heuristics can impact all the middle end warnings.

To play devil's advocate: it seems like a reasonable workaround to me.
(I didn't want to approve a potentially controversial patch for
“another port” so was staying silent. :-))

Isn't this just the known downside of using maybe-used-uninitialised
warnings?  AIUI, it's accepted that the option has false positives
that vary based on the amount of optimisation that previous passes
have or haven't done.  So I don't think it's an issue of “fixing”
the analysis: the current implementation doesn't seem like it is
going to be (and perhaps it isn't meant to be) predictable from a
user's perspective.  I got the impression this was a deliberate
trade-off we'd made in order to let fewer false negatives slip by.

We already disable maybe-used-uninitialized warnings when bootstrapping
with anything other than the default -O2 -g configuration.  (I remember
when looking at -Og a while back that there were a large number of
unsuppressed false positives when bootstrapping with that.)  ISTM that
s390 is effectively using non-standard bootstrap options, in a similar
way to --with-build-config=, and so turning these errors back into
warnings is reasonable here too.

Thanks,
Richard
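
[Editorial sketch, not from the thread's patches.] The "dummy initializer" option mentioned in this discussion usually targets code of the following shape, where a variable is assigned and read under the same condition. Whether a given compiler actually warns here depends on its inlining and optimisation decisions, which is the point being debated; the function name and values are hypothetical.

```cpp
#include <cassert>

// The value is only computed, and only read, when HAVE_VALUE is true.
// After heavy inlining a compiler may lose track of that correlation
// and report the read as possibly uninitialized.
int
scaled_or_zero (bool have_value, int input)
{
  int value = 0;  // dummy initializer: never read unless overwritten below

  if (have_value)
    value = input * 2;

  if (have_value)
    return value;

  // An alternative to the initializer is to suppress the diagnostic
  // locally with "#pragma GCC diagnostic ignored" around the function.
  return 0;
}
```

The initializer costs nothing observable at run time in this case, but it also hides the warning if a later edit introduces a genuine use of the placeholder value, which is why it is sometimes seen as an undesirable fix.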


Re: [PATCH] Selectively trap if ranger and vr-values disagree on range builtins.

2020-10-29 Thread Aldy Hernandez via Gcc-patches




On 10/29/20 2:53 PM, Andrew MacLeod wrote:

On 10/27/20 11:29 AM, Aldy Hernandez wrote:

The UBSAN builtins degrade into PLUS/MINUS/MULT and call
extract_range_from_binary_expr, which as the PR shows, can special
case some symbolics which the ranger doesn't currently handle.

Looking at vr_values::extract_range_builtin(), I see that every single
place where we ask for a range, we bail on non-integers (symbolics,
etc).  That is, with the exception of the UBSAN builtins.

Since this seems to be particular to UBSAN, we could still go with the
original plan of removing the duplication in ranger vs vr-values, but
leave in the UBSAN builtin handling.  This isn't ideal, as we'd like
to remove all the common code, but I'd be willing to put up with UBSAN
duplication for the time being.

This patch disables the assert on the UBSAN builtins, while still
trapping if any other differences are found between the vr_values and
the ranger versions of builtin range handling.

As a follow-up, once Fedora can test this approach, I'll remove all
the builtin code from extract_range_builtin, with the exception of the
UBSAN stuff (renaming it to extract_range_ubsan_builtin).

Since the builtin code has proven fickle across architectures, I've
tested this with {-m32,-m64,-fsanitize=signed-integer-overflow} on
x86, ppc64le, and aarch64.  I think this should be enough.  If it
isn't, we can revert the patch, and leave the duplicate code until
the next release cycle when hopefully vr_values, evrp, and friends
will all be overhauled.

Andrew, do you have any thoughts on this?



OK.

I think we want to remove as much duplication as possible, which will
then give us confidence that the ranger versions are correct.
The UBSAN versions will have to get tighter ranger integration with
relationals when they are available in order to handle things like this.


I don't suppose you can create a testcase for this?   Otherwise we'll
have to tag it somehow so we don't forget to come back to it when we
start handling these ubsan builtins differently.


Ughh...not easily.  It's deep in a Fortran testcase, and AFAICT the 
range that is determined (~[0,0]) does not affect the code generated.


I'll post as is, while I ponder this.  Perhaps I can hand craft a gimple 
FE test that will trigger different code out of evrp that we can somehow 
test.  If/when I do, I'll push the test.


Aldy


Andrew

PS. and might as well create and test the follow up patch so it's ready
to go.  I'd leave this as-is with the asserts for a week or so.









Re: [PATCH][middle-end][i386][version 5]Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-gpr-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-29 Thread Qing Zhao via Gcc-patches



> On Oct 29, 2020, at 6:09 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao via Gcc-patches  writes:
>> +/* Handle a "zero_call_used_regs" attribute; arguments as in
>> +   struct attribute_spec.handler.  */
>> +
>> +static tree
>> +handle_zero_call_used_regs_attribute (tree *node, tree name, tree args,
>> +  int ARG_UNUSED (flags),
>> +  bool *no_add_attrs)
>> +{
>> +  tree decl = *node;
>> +  tree id = TREE_VALUE (args);
>> +
>> +  if (TREE_CODE (decl) != FUNCTION_DECL)
>> +{
>> +  error_at (DECL_SOURCE_LOCATION (decl),
>> +"%qE attribute applies only to functions", name);
>> +  *no_add_attrs = true;
>> +}
>> +
>> +  if (TREE_CODE (id) != STRING_CST)
>> +{
>> +  error_at (DECL_SOURCE_LOCATION (decl),
>> +"attribute %qE arguments not a string", name);
> 
> The existing message for this seems to be:
> 
>  "%qE argument not a string"
> 
> (which seems a bit terse, but hey)

Okay.
> 
>> +  *no_add_attrs = true;
>> +}
>> +
>> +  bool found = false;
>> +  for (unsigned int i = 0; zero_call_used_regs_opts[i].name != NULL; ++i)
>> +if (strcmp (TREE_STRING_POINTER (id),
>> +zero_call_used_regs_opts[i].name) == 0)
>> +  {
>> +found = true;
>> +break;
>> +  }
>> +
>> +  if (!found)
>> +{
>> +  error_at (DECL_SOURCE_LOCATION (decl),
>> +"unrecognized zero_call_used_regs attribute: %qs",
>> +TREE_STRING_POINTER (id));
> 
> The attribute name needs to be quoted, and it would be good if it
> wasn't hard-coded into the string:
> 
>  error_at (DECL_SOURCE_LOCATION (decl),
>   "unrecognized %qE attribute argument %qs", name,
>   TREE_STRING_POINTER (id));
Okay.
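
[Editorial sketch, not the patch's code.] The lookup loop reviewed above walks a NULL-terminated table of option names and compares each entry with the attribute's string argument. A stand-alone version might look like this. The choice names are taken from the subject line of this thread; `choice_entry`, the `flag` values and `find_choice` are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical table entry: a choice name and a placeholder flag value.
struct choice_entry
{
  const char *name;
  unsigned int flag;
};

// Choice names as listed in the subject line; the flag values are
// arbitrary placeholders.  A null name terminates the table.
static const choice_entry choices[] = {
  {"skip", 1},        {"used-gpr-arg", 2}, {"used-arg", 3},
  {"all-gpr-arg", 4}, {"all-arg", 5},      {"used-gpr", 6},
  {"all-gpr", 7},     {"used", 8},         {"all", 9},
  {nullptr, 0}
};

// Return the table entry whose name equals ID, or nullptr if ID is not
// a recognised choice (the case that triggers the diagnostic).
const choice_entry *
find_choice (const char *id)
{
  for (unsigned int i = 0; choices[i].name != nullptr; ++i)
    if (std::strcmp (id, choices[i].name) == 0)
      return &choices[i];
  return nullptr;
}
```

Because the comparison is an exact `strcmp`, "used" and "used-gpr" are distinct entries and a prefix such as "use" matches nothing, so it would be reported as an unrecognized argument.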
> 
>> @@ -228,6 +228,10 @@ unsigned int flag_sanitize_coverage
>> Variable
>> bool dump_base_name_prefixed = false
>> 
>> +; What subset of registers should be zeroed
> 
> Think it would be useful to add “ on function return.”.
Okay.
> 
>> +Variable
>> +unsigned int flag_zero_call_used_regs
>> +
>> ###
>> Driver
>> 
>> diff --git a/gcc/df.h b/gcc/df.h
>> index 8b6ca8c..0f098d7 100644
>> --- a/gcc/df.h
>> +++ b/gcc/df.h
>> @@ -1085,6 +1085,7 @@ extern void df_update_entry_exit_and_calls (void);
>> extern bool df_hard_reg_used_p (unsigned int);
>> extern unsigned int df_hard_reg_used_count (unsigned int);
>> extern bool df_regs_ever_live_p (unsigned int);
>> +extern bool df_epilogue_uses_p (unsigned int);
>> extern void df_set_regs_ever_live (unsigned int, bool);
>> extern void df_compute_regs_ever_live (bool);
>> extern void df_scan_verify (void);
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index c9f7299..b011c17 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -3992,6 +3992,96 @@ performing a link with relocatable output (i.e.@: 
>> @code{ld -r}) on them.
>> A declaration to which @code{weakref} is attached and that is associated
>> with a named @code{target} must be @code{static}.
>> 
>> +@item zero_call_used_regs ("@var{choice}")
>> +@cindex @code{zero_call_used_regs} function attribute
>> +
>> +The @code{zero_call_used_regs} attribute causes the compiler to zero
>> +a subset of all call-used registers at function return according to
>> +@var{choice}.
> 
> Suggest dropping “according to @var{choice}” here, since it's now
> disconnected with the part that talks about what @var{choice} is.
Okay
> 
>> +This is used to increase the program security by either mitigating
> 
> s/the program security/program security/
Okay
> 
>> +Return-Oriented Programming (ROP) or preventing information leak
> 
> leakage
> 
> (FWIW, I'm not sure “mitigating ROP” is really correct usage, but I don't
> have any better suggestions.)

Do you mean whether “mitigating ROP” is one of the major purposes of this new 
feature?

The initial main motivation for the new feature is mitigating ROP. And the 
reason for zeroing only the argument subset of the registers is also to
mitigate ROP.

> 
>> +through registers.
>> +
>> +A ``call-used'' register is a register whose contents can be changed by
>> +a function call; therefore, a caller cannot assume that the register has
>> +the same contents on return from the function as it had before calling
>> +the function.  Such registers are also called ``call-clobbered'',
>> +``caller-saved'', or ``volatile''.
> 
> Reading it back, perhaps it would be better to put this whole paragraph
> in a footnote immediately after the first use of “call-used registers”,
> i.e.
> 
> …call-used registers@footnote{A ``call-used'' register…}…
> 
> It obviously breaks the flow when reading the raw .texi, but I think it
> reads better in the final version.

Okay.
> 
>> +In order to satisfy users with different security needs and control the
>> +run-time overhead at the same time, GCC provides a flexible way to choose
>> +the subset of the call-used registers to be zeroed.
> 
> Maybe s/GCC/the @var{choice} parameter/.
Okay.
> 
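
As a usage illustration of the attribute documented above, here is a minimal
sketch.  It assumes a compiler that implements zero_call_used_regs and falls
back to an empty macro otherwise; ZERO_USED_GPR and pins_match are
hypothetical names, and "used-gpr" is one of the CHOICE values from the
subject line.

```cpp
#include <cstddef>

// Use the attribute only when the compiler reports support for it.
#if defined(__has_attribute)
# if __has_attribute(zero_call_used_regs)
#  define ZERO_USED_GPR __attribute__((zero_call_used_regs("used-gpr")))
# endif
#endif
#ifndef ZERO_USED_GPR
# define ZERO_USED_GPR
#endif

// Hypothetical function handling sensitive data.  With the attribute in
// effect, the call-used general-purpose registers this function used are
// zeroed on return, so leftover bytes of A and B are not left in them.
ZERO_USED_GPR
bool
pins_match (const unsigned char* a, const unsigned char* b, std::size_t n)
{
  unsigned char diff = 0;
  for (std::size_t i = 0; i < n; ++i)
    diff = static_cast<unsigned char> (diff | (a[i] ^ b[i]));
  return diff == 0;
}
```

The attribute only changes the code emitted at function return, so the
function's observable result is the same with or without it.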

[committed] libstdc++: Make std::function work better with -fno-rtti

2020-10-29 Thread Jonathan Wakely via Gcc-patches
This change allows std::function::target() to work even without RTTI,
using the same approach as std::any. Because we know what the manager
function would be for a given type, we can check if the stored pointer
has the expected address. If it does, we don't need to use RTTI. If it
isn't equal, we still need to do the RTTI check (when RTTI is enabled)
to handle the case where the same function has different addresses in
different shared objects.
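
For context, a minimal sketch of the kind of user code that calls
std::function::target().  The names twice and holds_twice are hypothetical;
the snippet relies only on the standard interface and makes no claim about
how the library implements the check.

```cpp
#include <functional>

int twice (int x) { return 2 * x; }

// Return true if F currently stores a plain function pointer to twice.
// target<T>() yields a null pointer when the stored target is not a T.
bool
holds_twice (const std::function<int (int)>& f)
{
  using fn_ptr = int (*) (int);
  const fn_ptr* p = f.target<fn_ptr> ();
  return p != nullptr && *p == &twice;
}
```

Code of this shape is what the change above is meant to keep working when
it is built with -fno-rtti.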

This also changes the implementation of the manager function to return a
null pointer result when asked for the type_info of the target object.
This not only avoids a warning with -Wswitch -Wsystem-headers, but also
prevents std::function::target_type() from dereferencing an
uninitialized pointer when the linker keeps an instantiation of the
manager function that was compiled without RTTI.

Finally, this fixes a bug in the non-const overload of function::target
where calling it with a function type F was ill-formed, due to
attempting to use const_cast<F*>(ptr). The standard only allows
const_cast<T*> when T is an object type.  The solution is to use
*const_cast<F**>(&ptr) instead, because F* is an object type even if F
isn't. I've also used _GLIBCXX17_CONSTEXPR in function::target so that
it doesn't bother instantiating anything for types that can never be a
valid target.
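
The const_cast point can be illustrated in isolation.  This is a sketch under
my own naming (strip_const and twice_fn are hypothetical), not the libstdc++
code: instead of casting the pointer itself, it casts a pointer to the
pointer, which is always a pointer to an object type.

```cpp
// Return P with the const on the pointee removed.  When T is a function
// type, the const in "const T*" is ignored and const_cast<T*>(p) would be
// ill-formed, so cast the address of the pointer instead and read the
// pointer back through it.
template<typename T>
T*
strip_const (const T* p)
{
  return *const_cast<T**> (&p);
}

int twice_fn (int x) { return 2 * x; }
```

The same helper works for an object type such as int and for a function type
such as int(int), which is the case the paragraph above describes.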

libstdc++-v3/ChangeLog:

* include/bits/std_function.h (_Function_handler):
Define explicit specialization used for invalid target types.
(_Base_manager::_M_manager) [!__cpp_rtti]: Return null.
(function::target_type()): Check for null pointer.
(function::target()): Define unconditionally. Fix bug with
const_cast of function pointer type.
(function::target() const): Define unconditionally, but
only use RTTI if enabled.
* testsuite/20_util/function/target_no_rtti.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

commit 3c9b99ef7115fa88ef4d744cc2afc424bd5c3ef2
Author: Jonathan Wakely 
Date:   Thu Oct 29 14:47:17 2020

libstdc++: Make std::function work better with -fno-rtti

This change allows std::function::target() to work even without RTTI,
using the same approach as std::any. Because we know what the manager
function would be for a given type, we can check if the stored pointer
has the expected address. If it does, we don't need to use RTTI. If it
isn't equal, we still need to do the RTTI check (when RTTI is enabled)
to handle the case where the same function has different addresses in
different shared objects.

This also changes the implementation of the manager function to return a
null pointer result when asked for the type_info of the target object.
This not only avoids a warning with -Wswitch -Wsystem-headers, but also
prevents std::function::target_type() from dereferencing an
uninitialized pointer when the linker keeps an instantiation of the
manager function that was compiled without RTTI.

Finally, this fixes a bug in the non-const overload of function::target
where calling it with a function type F was ill-formed, due to
attempting to use const_cast<F*>(ptr). The standard only allows
const_cast<T*> when T is an object type.  The solution is to use
*const_cast<F**>(&ptr) instead, because F* is an object type even if F
isn't. I've also used _GLIBCXX17_CONSTEXPR in function::target so that
it doesn't bother instantiating anything for types that can never be a
valid target.

libstdc++-v3/ChangeLog:

* include/bits/std_function.h (_Function_handler):
Define explicit specialization used for invalid target types.
(_Base_manager::_M_manager) [!__cpp_rtti]: Return null.
(function::target_type()): Check for null pointer.
(function::target()): Define unconditionally. Fix bug with
const_cast of function pointer type.
(function::target() const): Define unconditionally, but
only use RTTI if enabled.
* testsuite/20_util/function/target_no_rtti.cc: New test.

diff --git a/libstdc++-v3/include/bits/std_function.h 
b/libstdc++-v3/include/bits/std_function.h
index fa65885d1de..054d9cbbf02 100644
--- a/libstdc++-v3/include/bits/std_function.h
+++ b/libstdc++-v3/include/bits/std_function.h
@@ -183,11 +183,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
  switch (__op)
{
-#if __cpp_rtti
case __get_type_info:
+#if __cpp_rtti
  __dest._M_access<const type_info*>() = &typeid(_Functor);
- break;
+#else
+ __dest._M_access<const type_info*>() = nullptr;
 #endif
+ break;
case __get_functor_ptr:
  __dest._M_access<_Functor*>() = _M_get_pointer(__source);
  break;
@@ -293,6 +295,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 };
 
+  // Specialization for invalid types
+  template<>
+class _Function_handler<void, void>
+{
+public:
+  st

[committed] libstdc++: Make std::function work better with -fno-rtti

2020-10-29 Thread Jonathan Wakely via Gcc-patches
This change allows std::function::target() to work even without RTTI,
using the same approach as std::any. Because we know what the manager
function would be for a given type, we can check if the stored pointer
has the expected address. If it does, we don't need to use RTTI. If it
isn't equal, we still need to do the RTTI check (when RTTI is enabled)
to handle the case where the same function has different addresses in
different shared objects.

This also changes the implementation of the manager function to return a
null pointer result when asked for the type_info of the target object.
This not only avoids a warning with -Wswitch -Wsystem-headers, but also
prevents std::function::target_type() from dereferencing an
uninitialized pointer when the linker keeps an instantiation of the
manager function that was compiled without RTTI.

Finally, this fixes a bug in the non-const overload of function::target
where calling it with a function type F was ill-formed, due to
attempting to use const_cast<F*>(ptr). The standard only allows
const_cast<T*> when T is an object type.  The solution is to use
*const_cast<F**>(&ptr) instead, because F* is an object type even if F
isn't. I've also used _GLIBCXX17_CONSTEXPR in function::target so that
it doesn't bother instantiating anything for types that can never be a
valid target.

libstdc++-v3/ChangeLog:

* include/bits/std_function.h (_Function_handler):
Define explicit specialization used for invalid target types.
(_Base_manager::_M_manager) [!__cpp_rtti]: Return null.
(function::target_type()): Check for null pointer.
(function::target()): Define unconditionally. Fix bug with
const_cast of function pointer type.
(function::target() const): Define unconditionally, but
only use RTTI if enabled.
* testsuite/20_util/function/target_no_rtti.cc: New test.

Tested x86_64-linux. Committed to trunk.

commit 3c9b99ef7115fa88ef4d744cc2afc424bd5c3ef2
Author: Jonathan Wakely 
Date:   Thu Oct 29 14:47:17 2020

libstdc++: Make std::function work better with -fno-rtti

This change allows std::function::target() to work even without RTTI,
using the same approach as std::any. Because we know what the manager
function would be for a given type, we can check if the stored pointer
has the expected address. If it does, we don't need to use RTTI. If it
isn't equal, we still need to do the RTTI check (when RTTI is enabled)
to handle the case where the same function has different addresses in
different shared objects.

This also changes the implementation of the manager function to return a
null pointer result when asked for the type_info of the target object.
This not only avoids a warning with -Wswitch -Wsystem-headers, but also
prevents std::function::target_type() from dereferencing an
uninitialized pointer when the linker keeps an instantiation of the
manager function that was compiled without RTTI.

Finally, this fixes a bug in the non-const overload of function::target
where calling it with a function type F was ill-formed, due to
attempting to use const_cast<F*>(ptr). The standard only allows
const_cast<T*> when T is an object type.  The solution is to use
*const_cast<F**>(&ptr) instead, because F* is an object type even if F
isn't. I've also used _GLIBCXX17_CONSTEXPR in function::target so that
it doesn't bother instantiating anything for types that can never be a
valid target.

libstdc++-v3/ChangeLog:

* include/bits/std_function.h (_Function_handler):
Define explicit specialization used for invalid target types.
(_Base_manager::_M_manager) [!__cpp_rtti]: Return null.
(function::target_type()): Check for null pointer.
(function::target()): Define unconditionally. Fix bug with
const_cast of function pointer type.
(function::target() const): Define unconditionally, but
only use RTTI if enabled.
* testsuite/20_util/function/target_no_rtti.cc: New test.

diff --git a/libstdc++-v3/include/bits/std_function.h 
b/libstdc++-v3/include/bits/std_function.h
index fa65885d1de..054d9cbbf02 100644
--- a/libstdc++-v3/include/bits/std_function.h
+++ b/libstdc++-v3/include/bits/std_function.h
@@ -183,11 +183,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
  switch (__op)
{
-#if __cpp_rtti
case __get_type_info:
+#if __cpp_rtti
  __dest._M_access<const type_info*>() = &typeid(_Functor);
- break;
+#else
+ __dest._M_access<const type_info*>() = nullptr;
 #endif
+ break;
case __get_functor_ptr:
  __dest._M_access<_Functor*>() = _M_get_pointer(__source);
  break;
@@ -293,6 +295,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 };
 
+  // Specialization for invalid types
+  template<>
+class _Function_handler<void, void>
+{
+public:
+  static bool
+

[committed] libstdc++: Do not use volatile for __gnu_cxx::rope reference counting

2020-10-29 Thread Jonathan Wakely via Gcc-patches
The rope extension uses a volatile variable for its reference count.
This is not only unnecessary for correctness (volatile provides neither
atomicity nor memory visibility, and the variable is only modified while
a lock is held) but it now causes deprecated warnings with
-Wsystem-headers due to the use of ++ and -- operators.

It would be possible to use __gnu_cxx::__exchange_and_add in _M_incr and
_M_decr when __atomic_is_lock_free(sizeof(_RC_t), &_M_ref_count) is
true, rather than locking a mutex. That would probably be a significant
improvement for multi-threaded and single-threaded code (because
__exchange_and_add will use non-atomic ops when possible, and even in MT
code it should be faster than the mutex lock/unlock pair). However,
mixing objects compiled with the old and new code would result in
inconsistent synchronization being used for the reference count.

libstdc++-v3/ChangeLog:

* include/ext/rope (_Refcount_Base::_M_ref_count): Remove
volatile qualifier.
(_Refcount_Base::_M_decr()): Likewise.

Tested x86_64-linux. Committed to trunk.
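
To illustrate why volatile adds nothing here, a minimal sketch of a
lock-protected reference count with a plain integer member.  The class
locked_refcount is hypothetical and uses std::mutex rather than the gthread
API that the rope header uses; the mutex, not any qualifier on the counter,
is what provides mutual exclusion and visibility between threads.

```cpp
#include <cstddef>
#include <mutex>

// Hypothetical reference count whose every access is guarded by a mutex.
class locked_refcount
{
  std::size_t count_;   // plain member, no volatile needed
  std::mutex lock_;

public:
  explicit locked_refcount (std::size_t n) : count_ (n) { }

  void
  incr ()
  {
    std::lock_guard<std::mutex> guard (lock_);
    ++count_;
  }

  // Decrement and return the new value, read while the lock is held.
  std::size_t
  decr ()
  {
    std::lock_guard<std::mutex> guard (lock_);
    return --count_;
  }
};
```

Because the counter is no longer volatile, the ++ and -- operators do not
trigger the deprecation warning mentioned above.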

commit d067bd729367e947e919fc869143539ae023
Author: Jonathan Wakely 
Date:   Thu Oct 29 14:47:17 2020

libstdc++: Do not use volatile for __gnu_cxx::rope reference counting

The rope extension uses a volatile variable for its reference count.
This is not only unnecessary for correctness (volatile provides neither
atomicity nor memory visibility, and the variable is only modified while
a lock is held) but it now causes deprecated warnings with
-Wsystem-headers due to the use of ++ and -- operators.

It would be possible to use __gnu_cxx::__exchange_and_add in _M_incr and
_M_decr when __atomic_is_lock_free(sizeof(_RC_t), &_M_ref_count) is
true, rather than locking a mutex. That would probably be a significant
improvement for multi-threaded and single-threaded code (because
__exchange_and_add will use non-atomic ops when possible, and even in MT
code it should be faster than the mutex lock/unlock pair). However,
mixing objects compiled with the old and new code would result in
inconsistent synchronization being used for the reference count.

libstdc++-v3/ChangeLog:

* include/ext/rope (_Refcount_Base::_M_ref_count): Remove
volatile qualifier.
(_Refcount_Base::_M_decr()): Likewise.

diff --git a/libstdc++-v3/include/ext/rope b/libstdc++-v3/include/ext/rope
index fb7bdb0d6f4..08e510bb0dc 100644
--- a/libstdc++-v3/include/ext/rope
+++ b/libstdc++-v3/include/ext/rope
@@ -452,7 +452,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 typedef std::size_t _RC_t;
 
 // The data member _M_ref_count
-volatile _RC_t _M_ref_count;
+_RC_t _M_ref_count;
 
 // Constructor
 #ifdef __GTHREAD_MUTEX_INIT
@@ -489,7 +489,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _M_decr()
 {
   __gthread_mutex_lock(&_M_ref_count_lock);
-  volatile _RC_t __tmp = --_M_ref_count;
+  _RC_t __tmp = --_M_ref_count;
   __gthread_mutex_unlock(&_M_ref_count_lock);
   return __tmp;
 }


Re: [RS6000] float128-type-2.c unsupported

2020-10-29 Thread Segher Boessenkool
Hi David, all,

On Thu, Oct 29, 2020 at 10:12:31AM -0400, David Edelsohn wrote:
> > I'm sorry you see this as a matter of respect.  I didn't see it that
> > way at all.  Segher disagreed with your review, and gave sufficient
> > technical reason for me to commit the patch.

> If you or Segher disagree with my review, we can discuss it like
> mature adults and reach a mutually-acceptable outcome.  I'm a
> reasonable person.

Like I said, I do *not* disagree with what you said.  I just thought you
missed the point of the patch, which was to fix another (much more
trivial) problem, not the bigger problem that you want to see fixed (and
I agree with that, as I said).

But we cannot ask contributors to solve unrelated problems.  We can try
to interest them in solving things, but not much more.

I thought it would be obvious to you as well that Alan's patch just was
for the one simple problem (like some related patches I acked at about
the same time), which is why I didn't talk to you first.  I wasn't
overriding your review, your points are well taken.

Take care,


Segher


[committed] libstdc++: Allow Lemire's algorithm to be used in more cases

2020-10-29 Thread Jonathan Wakely via Gcc-patches
This extends the fast path to also work when the URBG's range of
possible values is not the entire range of its result_type. Previously,
the slow path would be used for engines with a uint_fast32_t result type
if that type is actually a typedef for uint64_t rather than uint32_t.
After this change, the generator's result_type is not important, only
the range of possible values that the generator can produce. If the
generator's range is exactly UINT64_MAX then the calculation will be
done using 128-bit and 64-bit integers, and if the range is UINT32_MAX
it will be done using 64-bit and 32-bit integers.

In practice, this benefits most of the engines and engine adaptors
defined in [rand.predef] on x86_64-linux and other 64-bit targets. This
is because std::minstd_rand0 and std::mt19937 and others use
uint_fast32_t, which is a typedef for uint64_t.

The code now makes use of the recently-clarified requirement that the
generator's min() and max() functions are usable in constant
expressions (see LWG 2154).

libstdc++-v3/ChangeLog:

* include/bits/uniform_int_dist.h (_Power_of_two): Add
constexpr.
(uniform_int_distribution::_S_nd): Add static_assert to ensure
the wider type is twice as wide as the result type.
(uniform_int_distribution::__generate_impl): Add static_assert
and declare variables as constexpr where appropriate.
(uniform_int_distribution::operator()): Likewise. Only consider
the uniform random bit generator's range of possible results
when deciding whether _S_nd can be used, not the __uctype type.

Tested x86_64-linux. Committed to trunk.
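
For readers unfamiliar with the fast path, here is a rough sketch of Lemire's
method for a generator whose outputs span exactly [0, UINT32_MAX], using a
64-bit product.  It is written from the published algorithm, not copied from
_S_nd; bounded_rand is a hypothetical name, and the static_asserts show how
constexpr min() and max() let the decision depend on the generator's range
rather than on its result_type.

```cpp
#include <cstdint>
#include <random>

// Return a value uniformly distributed in [0, range); RANGE must be nonzero.
template<typename Urbg>
std::uint32_t
bounded_rand (Urbg& g, std::uint32_t range)
{
  // Only the range of values matters, not how wide result_type is.
  static_assert (Urbg::min () == 0, "generator must start at 0");
  static_assert (Urbg::max () == UINT32_MAX, "generator must span 32 bits");

  std::uint64_t product = std::uint64_t (g ()) * std::uint64_t (range);
  std::uint32_t low = std::uint32_t (product);
  if (low < range)
    {
      // 2^32 mod range, computed with 32-bit unsigned wraparound.
      const std::uint32_t threshold = (0u - range) % range;
      while (low < threshold)
        {
          // Reject and redraw to remove the bias.
          product = std::uint64_t (g ()) * std::uint64_t (range);
          low = std::uint32_t (product);
        }
    }
  // The high half of the product is the result.
  return std::uint32_t (product >> 32);
}
```

With std::mt19937 both static_asserts hold even on targets where
uint_fast32_t is 64 bits wide, which is the situation described above.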

commit 822c1d21a3c710831af65a6e3bc83f558fb39044
Author: Jonathan Wakely 
Date:   Thu Oct 29 14:47:18 2020

libstdc++: Allow Lemire's algorithm to be used in more cases

This extends the fast path to also work when the URBG's range of
possible values is not the entire range of its result_type. Previously,
the slow path would be used for engines with a uint_fast32_t result type
if that type is actually a typedef for uint64_t rather than uint32_t.
After this change, the generator's result_type is not important, only
the range of possible values that the generator can produce. If the
generator's range is exactly UINT64_MAX then the calculation will be
done using 128-bit and 64-bit integers, and if the range is UINT32_MAX
it will be done using 64-bit and 32-bit integers.

In practice, this benefits most of the engines and engine adaptors
defined in [rand.predef] on x86_64-linux and other 64-bit targets. This
is because std::minstd_rand0 and std::mt19937 and others use
uint_fast32_t, which is a typedef for uint64_t.

The code now makes use of the recently-clarified requirement that the
generator's min() and max() functions are usable in constant
expressions (see LWG 2154).

libstdc++-v3/ChangeLog:

* include/bits/uniform_int_dist.h (_Power_of_two): Add
constexpr.
(uniform_int_distribution::_S_nd): Add static_assert to ensure
the wider type is twice as wide as the result type.
(uniform_int_distribution::__generate_impl): Add static_assert
and declare variables as constexpr where appropriate.
(uniform_int_distribution::operator()): Likewise. Only consider
the uniform random bit generator's range of possible results
when deciding whether _S_nd can be used, not the __uctype type.

diff --git a/libstdc++-v3/include/bits/uniform_int_dist.h 
b/libstdc++-v3/include/bits/uniform_int_dist.h
index cf6ba35c675..524593bb984 100644
--- a/libstdc++-v3/include/bits/uniform_int_dist.h
+++ b/libstdc++-v3/include/bits/uniform_int_dist.h
@@ -58,7 +58,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 /* Determine whether number is a power of 2.  */
 template<typename _Tp>
-  inline bool
+  constexpr bool
   _Power_of_2(_Tp __x)
   {
return ((__x - 1) & __x) == 0;
@@ -242,9 +242,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
static _Up
_S_nd(_Urbg& __g, _Up __range)
{
- using __gnu_cxx::__int_traits;
- static_assert(!__int_traits<_Up>::__is_signed, "U must be unsigned");
- static_assert(!__int_traits<_Wp>::__is_signed, "W must be unsigned");
+ using _Up_traits = __gnu_cxx::__int_traits<_Up>;
+ using _Wp_traits = __gnu_cxx::__int_traits<_Wp>;
+ static_assert(!_Up_traits::__is_signed, "U must be unsigned");
+ static_assert(!_Wp_traits::__is_signed, "W must be unsigned");
+ static_assert(_Wp_traits::__digits == (2 * _Up_traits::__digits),
+   "W must be twice as wide as U");
 
  // reference: Fast Random Integer Generation in an Interval
  // ACM Transactions on Modeling and Computer Simulation 29 (1), 2019
@@ -260,7 +263,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __low = _Up(__product);
 

[committed] libstdc++: Improve tests for constexpr algorithms

2020-10-29 Thread Jonathan Wakely via Gcc-patches
These tests just return true without checking the results of the
algorithms. Although it should be safe to assume that the algorithms
behave the same at compile-time as at run-time, we can use these tests
to verify it.

This replaces each 'return true' statement with a condition that depends
on the basic functionality of the algorithm, such as returning an
iterator to the right position.

libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/all_of/constexpr.cc: Check result of
the algorithm.
* testsuite/25_algorithms/any_of/constexpr.cc: Likewise.
* testsuite/25_algorithms/binary_search/constexpr.cc: Likewise.
* testsuite/25_algorithms/copy_backward/constexpr.cc: Likewise.
* testsuite/25_algorithms/count/constexpr.cc: Likewise.
* testsuite/25_algorithms/equal/constexpr.cc: Likewise.
* testsuite/25_algorithms/equal_range/constexpr.cc: Likewise.
* testsuite/25_algorithms/fill/constexpr.cc: Likewise.
* testsuite/25_algorithms/find_end/constexpr.cc: Likewise.
* testsuite/25_algorithms/find_if/constexpr.cc: Likewise.
* testsuite/25_algorithms/is_partitioned/constexpr.cc: Likewise.
* testsuite/25_algorithms/is_permutation/constexpr.cc: Likewise.
* testsuite/25_algorithms/is_sorted_until/constexpr.cc:
Likewise.
* testsuite/25_algorithms/lexicographical_compare/constexpr.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/constexpr.cc: Likewise.
* testsuite/25_algorithms/merge/constexpr.cc: Likewise.
* testsuite/25_algorithms/mismatch/constexpr.cc: Likewise.
* testsuite/25_algorithms/none_of/constexpr.cc: Likewise.
* testsuite/25_algorithms/partition_copy/constexpr.cc: Likewise.
* testsuite/25_algorithms/remove_copy/constexpr.cc: Likewise.
* testsuite/25_algorithms/remove_copy_if/constexpr.cc: Likewise.
* testsuite/25_algorithms/remove_if/constexpr.cc: Likewise.
* testsuite/25_algorithms/replace_if/constexpr.cc: Likewise.
* testsuite/25_algorithms/reverse/constexpr.cc: Likewise.
* testsuite/25_algorithms/reverse_copy/constexpr.cc: Likewise.
* testsuite/25_algorithms/rotate_copy/constexpr.cc: Likewise.
* testsuite/25_algorithms/search/constexpr.cc: Likewise.
* testsuite/25_algorithms/set_difference/constexpr.cc: Likewise.
* testsuite/25_algorithms/set_intersection/constexpr.cc:
Likewise.
* testsuite/25_algorithms/set_symmetric_difference/constexpr.cc:
Likewise.
* testsuite/25_algorithms/set_union/constexpr.cc: Likewise.
* testsuite/25_algorithms/unique_copy/constexpr.cc: Likewise.
* testsuite/25_algorithms/upper_bound/constexpr.cc: Likewise.

Tested x86_64-linux. Committed to trunk.
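
A sketch of the pattern, assuming nothing about the contents of the actual
test files: the test function returns a condition derived from the
algorithm's result, and a static_assert evaluates it at compile time when the
library declares the algorithms constexpr.  TEST_CONSTEXPR and
test_lower_bound are hypothetical names.

```cpp
#include <algorithm>
#include <array>

// Fall back to a run-time-only function when the library's algorithms
// are not declared constexpr.
#if defined(__cpp_lib_constexpr_algorithms) \
    && __cpp_lib_constexpr_algorithms >= 201806L
# define TEST_CONSTEXPR constexpr
#else
# define TEST_CONSTEXPR inline
#endif

TEST_CONSTEXPR bool
test_lower_bound ()
{
  std::array<int, 6> a{{1, 3, 5, 7, 9, 11}};
  const auto it = std::lower_bound (a.begin (), a.end (), 6);
  // Instead of 'return true', check the iterator position: the first
  // element not less than 6 is 7, at index 3.
  return it == a.begin () + 3;
}

#if defined(__cpp_lib_constexpr_algorithms) \
    && __cpp_lib_constexpr_algorithms >= 201806L
static_assert (test_lower_bound (), "lower_bound result at compile time");
#endif
```

The same function can also be called at run time, so one test body checks
both evaluation modes.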

commit 8c84486bba104399b7e544cb1ba343646d39ea0a
Author: Jonathan Wakely 
Date:   Thu Oct 29 14:47:18 2020

libstdc++: Improve tests for constexpr algorithms

These tests just return true without checking the results of the
algorithms. Although it should be safe to assume that the algorithms
behave the same at compile-time as at run-time, we can use these tests
to verify it.

This replaces each 'return true' statement with a condition that depends
on the basic functionality of the algorithm, such as returning an
iterator to the right position.

libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/all_of/constexpr.cc: Check result of
the algorithm.
* testsuite/25_algorithms/any_of/constexpr.cc: Likewise.
* testsuite/25_algorithms/binary_search/constexpr.cc: Likewise.
* testsuite/25_algorithms/copy_backward/constexpr.cc: Likewise.
* testsuite/25_algorithms/count/constexpr.cc: Likewise.
* testsuite/25_algorithms/equal/constexpr.cc: Likewise.
* testsuite/25_algorithms/equal_range/constexpr.cc: Likewise.
* testsuite/25_algorithms/fill/constexpr.cc: Likewise.
* testsuite/25_algorithms/find_end/constexpr.cc: Likewise.
* testsuite/25_algorithms/find_if/constexpr.cc: Likewise.
* testsuite/25_algorithms/is_partitioned/constexpr.cc: Likewise.
* testsuite/25_algorithms/is_permutation/constexpr.cc: Likewise.
* testsuite/25_algorithms/is_sorted_until/constexpr.cc:
Likewise.
* testsuite/25_algorithms/lexicographical_compare/constexpr.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/constexpr.cc: Likewise.
* testsuite/25_algorithms/merge/constexpr.cc: Likewise.
* testsuite/25_algorithms/mismatch/constexpr.cc: Likewise.
* testsuite/25_algorithms/none_of/constexpr.cc: Likewise.
* testsuite/25_algorithms/partition_copy/constexpr.cc: Likewise.
* testsuite/25_algorithms/remove_copy/constexpr.cc: Likewise.
* testsuite

Re: [committed] libstdc++: Fix memory issue in ranges::lexicographical_compare testcase

2020-10-29 Thread Patrick Palka via Gcc-patches
On Thu, 29 Oct 2020, Patrick Palka wrote:

> libstdc++-v3/ChangeLog:
> 
>   * testsuite/25_algorithms/lexicographical_compare/constrained.cc:
>   (test03): Fix initializing the vector vy with the array y of size 4.

Now committed to the 10 branch as well.

> ---
>  .../25_algorithms/lexicographical_compare/constrained.cc| 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git 
> a/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/constrained.cc 
> b/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/constrained.cc
> index b82c872..2019bbc75e4 100644
> --- 
> a/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/constrained.cc
> +++ 
> b/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/constrained.cc
> @@ -136,7 +136,7 @@ test03()
>VERIFY( !ranges::lexicographical_compare(cy.begin(), cy.end(),
>  cz.begin(), cz.end()) );
>  
> -  std::vector vx(x, x+5), vy(y, y+5);
> +  std::vector vx(x, x+5), vy(y, y+4);
>VERIFY( ranges::lexicographical_compare(vx, vy) );
>VERIFY( !ranges::lexicographical_compare(vx, vy, ranges::greater{}) );
>VERIFY( !ranges::lexicographical_compare(vy, vx) );
> -- 
> 2.29.0.rc0
> 
> 
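
As an aside, one way to avoid this class of mistake is to let the array type
supply its own bounds.  A small sketch follows; the array contents are
hypothetical and only the lengths (5 and 4) come from the thread.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical data; x has 5 elements and y has 4.
const int x[] = {1, 2, 3, 4, 5};
const int y[] = {1, 2, 3, 5};

// std::begin/std::end derive the range from the array type, so the
// vector can never read past the end of y.
inline std::vector<int>
make_vx ()
{
  return std::vector<int> (std::begin (x), std::end (x));
}

inline std::vector<int>
make_vy ()
{
  return std::vector<int> (std::begin (y), std::end (y));
}
```

With hard-coded offsets such as y+5 the length has to be kept in sync by
hand, which is how the original out-of-bounds read crept in.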



[patch] i386 tests: Add dg-require-profiling to i386 tests using -pg

2020-10-29 Thread Olivier Hainque
Hello,

This patch is a proposal to add 

  /* { dg-require-profiling "-pg" } */

to a few tests in gcc.target/i386 that use -pg explicitly.

This matches what other tests checking profiling related
options do and prevents these specific ones from failing
during runs for VxWorks targets.

Ok to commit ?

Thanks in advance!

Best Regards,

Olivier

2020-10-29  Olivier Hainque  

gcc/testsuite/
* gcc.target/i386/fentryname1.c: Add dg-require-profiling.
* gcc.target/i386/fentryname2.c: Likewise.
* gcc.target/i386/fentryname3.c: Likewise.
* gcc.target/i386/returninst1.c: Likewise.
* gcc.target/i386/returninst2.c: Likewise.
* gcc.target/i386/returninst3.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/i386/fentryname1.c 
b/gcc/testsuite/gcc.target/i386/fentryname1.c
index 1265342b954f..a9d1c727e86d 100644
--- a/gcc/testsuite/gcc.target/i386/fentryname1.c
+++ b/gcc/testsuite/gcc.target/i386/fentryname1.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target mfentry } */
+/* { dg-require-profiling "-pg" } */
 /* { dg-options "-pg -mfentry -mfentry-name=foo" } */
 /* { dg-final { scan-assembler "call.*foo" } } */
 /* { dg-final { scan-assembler "call.*bar" } } */
diff --git a/gcc/testsuite/gcc.target/i386/fentryname2.c 
b/gcc/testsuite/gcc.target/i386/fentryname2.c
index c51c5d1ff716..13a43ec27e5c 100644
--- a/gcc/testsuite/gcc.target/i386/fentryname2.c
+++ b/gcc/testsuite/gcc.target/i386/fentryname2.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target mfentry } */
+/* { dg-require-profiling "-pg" } */
 /* { dg-options "-pg -mfentry -mrecord-mcount -mfentry-section=foo" } */
 /* { dg-final { scan-assembler "section.*foo" } } */
 /* { dg-final { scan-assembler "section.*bar" } } */
diff --git a/gcc/testsuite/gcc.target/i386/fentryname3.c 
b/gcc/testsuite/gcc.target/i386/fentryname3.c
index 56881090a9c7..bd7c997c178f 100644
--- a/gcc/testsuite/gcc.target/i386/fentryname3.c
+++ b/gcc/testsuite/gcc.target/i386/fentryname3.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target mfentry } */
+/* { dg-require-profiling "-pg" } */
 /* { dg-options "-pg -mfentry"  } */
 /* { dg-final { scan-assembler "section.*__entry_loc" } } */
 /* { dg-final { scan-assembler "0x0f, 0x1f, 0x44, 0x00, 0x00" } } */
diff --git a/gcc/testsuite/gcc.target/i386/returninst1.c 
b/gcc/testsuite/gcc.target/i386/returninst1.c
index 133fdeef5aa1..74d10c925c3a 100644
--- a/gcc/testsuite/gcc.target/i386/returninst1.c
+++ b/gcc/testsuite/gcc.target/i386/returninst1.c
@@ -1,5 +1,6 @@
 /* { dg-do compile { target { ! ia32 } } } */
 /* { dg-require-effective-target mfentry } */
+/* { dg-require-profiling "-pg" } */
 /* { dg-options "-pg -mfentry -minstrument-return=call -mrecord-return" } */
 /* { dg-final { scan-assembler "call.*__return__" } } */
 /* { dg-final { scan-assembler "section.*return_loc" } } */
diff --git a/gcc/testsuite/gcc.target/i386/returninst2.c 
b/gcc/testsuite/gcc.target/i386/returninst2.c
index 3629310a59a7..e19f0d01f84c 100644
--- a/gcc/testsuite/gcc.target/i386/returninst2.c
+++ b/gcc/testsuite/gcc.target/i386/returninst2.c
@@ -1,5 +1,6 @@
 /* { dg-do compile { target { ! ia32 } } } */
 /* { dg-require-effective-target mfentry } */
+/* { dg-require-profiling "-pg" } */
 /* { dg-options "-pg -mfentry -minstrument-return=nop5 -mrecord-return" } */
 /* { dg-final { scan-assembler-times "0x0f, 0x1f, 0x44, 0x00, 0x00" 3 } } */
 /* { dg-final { scan-assembler "section.*return_loc" } } */
diff --git a/gcc/testsuite/gcc.target/i386/returninst3.c 
b/gcc/testsuite/gcc.target/i386/returninst3.c
index b84cc77e12bc..acb8984d38ff 100644
--- a/gcc/testsuite/gcc.target/i386/returninst3.c
+++ b/gcc/testsuite/gcc.target/i386/returninst3.c
@@ -1,5 +1,6 @@
 /* { dg-do compile { target { ! ia32 } } } */
 /* { dg-require-effective-target mfentry } */
+/* { dg-require-profiling "-pg" } */
 /* { dg-options "-pg -mfentry -minstrument-return=call" } */
 /* { dg-final { scan-assembler-not "call.*__return__" } } */
 
-- 
2.17.1



Re: [PATCH] c++: Fix up constexpr evaluation of arguments passed by invisible reference [PR97388]

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/20/20 3:33 AM, Jakub Jelinek wrote:

Hi!

For arguments passed by invisible reference, in the IL until genericization
we have the source types on the callee side and while on the caller side
we already pass references to the actual argument slot in the caller, we
undo that in cxx_bind_parameters_in_call's
   if (TREE_ADDRESSABLE (type))
 /* Undo convert_for_arg_passing work here.  */
 x = convert_from_reference (x);
This works fine most of the time, except when the type also has constexpr
destructor; in that case the destructor is invoked in the caller and thus
the unsharing we do to make sure that the callee doesn't modify caller's
values is in that case undesirable, it prevents the changes done in the
callee propagating to the caller which should see them for the constexpr
dtor evaluation.

The following patch fixes that.  While it could be perhaps done for all
TREE_ADDRESSABLE types, I don't see the need to change the behavior
if there is no constexpr non-trivial dtor.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


I think this isn't enough; if bar calls foo twice, the second call will 
find the value in the hash table and not change the temporary, so the 
destructor will throw.  I think we also need to set non_constant_args if 
the argument type has a non-trivial destructor, so we don't try to 
memoize the call.
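
To illustrate the semantics at stake, here is a run-time sketch (not the constexpr testcase from the patch; the names S, foo, bar and dtor_saw_modified are made up for illustration). The callee modifies its by-value parameter, and the parameter's destructor has to observe that modification on every call, including a second, otherwise identical one:

```cpp
#include <cassert>

// Counts how many destructor runs observed the callee's modification.
int dtor_saw_modified = 0;

struct S {
  int m;
  S () : m (1) {}
  S (const S &o) : m (o.m) {}
  // Non-trivial destructor: records whether the callee's store is visible.
  ~S () { if (m == 2) ++dtor_saw_modified; }
};

// The callee modifies its by-value parameter.
bool foo (S v) { v.m = 2; return true; }

// Two equivalent calls; each parameter object is destroyed separately,
// so the counter is incremented once per call.
int bar ()
{
  dtor_saw_modified = 0;
  S s;
  foo (s);
  foo (s);
  return dtor_saw_modified;
}
```

If the second call were answered from a cache without running the callee's store again, the destructor of the second parameter object would still see m == 1, which is the situation the constant evaluator has to avoid.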


Then setting arg back to orig_arg isn't needed, because we don't do the 
first unsharing for the hash table.


And then I think that the second unsharing is unnecessary for an 
argument in a non-memoized call, because we will have already unshared 
it when making the copy in the caller.


 

How does this look to you?
commit 70f568bf6a30bc57ae6bc04ed4b4ac6335f01bae
Author: Jakub Jelinek 
Date:   Tue Oct 20 09:33:20 2020 +0200

c++: Fix constexpr dtors vs invisible ref [PR97388]

For arguments passed by invisible reference, in the IL until genericization
we have the source types on the callee side and while on the caller side
we already pass references to the actual argument slot in the caller, we
undo that in cxx_bind_parameters_in_call's
  if (TREE_ADDRESSABLE (type))
/* Undo convert_for_arg_passing work here.  */
x = convert_from_reference (x);
This works fine most of the time, except when the type also has constexpr
destructor; in that case the destructor is invoked in the caller and thus
the unsharing we do to make sure that the callee doesn't modify caller's
values is in that case undesirable, it prevents the changes done in the
callee propagating to the caller which should see them for the constexpr
dtor evaluation.

The following patch fixes that.  While it could be perhaps done for all
TREE_ADDRESSABLE types, I don't see the need to change the behavior
if there is no constexpr non-trivial dtor.

Jason: And we need to avoid memoizing the call, because a later equivalent
call also needs to modify its argument.  And we don't need to unshare
constructors when we aren't memoizing the call, because we already unshared
them when evaluating the TARGET_EXPR representing the copy-initialization of
the argument.

2020-10-20  Jakub Jelinek  
Jason Merrill  

PR c++/97388
* constexpr.c (cxx_bind_parameters_in_call): Set non_constant_args
if the parameter type has a non-trivial destructor.
(cxx_eval_call_expression): Only unshare arguments if we're
memoizing this evaluation.

* g++.dg/cpp2a/constexpr-dtor5.C: New test.
* g++.dg/cpp2a/constexpr-dtor6.C: New test.
* g++.dg/cpp2a/constexpr-dtor7.C: New test.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 7ebdd308dcd..524ce9384cf 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1602,6 +1602,11 @@ cxx_bind_parameters_in_call (const constexpr_ctx *ctx, tree t,
 	arg = adjust_temp_type (type, arg);
 	  if (!TREE_CONSTANT (arg))
 	*non_constant_args = true;
+	  else if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type))
+	/* The destructor needs to see any modifications the callee makes
+	   to the argument.  */
+	*non_constant_args = true;
+
 	  /* For virtual calls, adjust the this argument, so that it is
 	 the object on which the method is called, rather than
 	 one of its bases.  */
@@ -2586,14 +2591,14 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 		 problems with verify_gimple.  */
 		  arg = unshare_expr_without_location (arg);
 		  TREE_VEC_ELT (bound, i) = arg;
+
+		  /* And then unshare again so the callee doesn't change the
+		 argument values in the hash table. XXX Could we unshare
+		 lazily in cxx_eval_store_expression?  */
+		  arg = unshare_constructor (arg);
+		  if (TREE_CODE (arg) == CONSTRUCTOR)
+		vec_safe_push (ctors, arg);
 		}
-	  /* Don't share a CONSTRUCTOR 

Re: New modref/ipa_modref optimization passes

2020-10-29 Thread Jan Hubicka
> Hi,
> this is patch I am using to fix the assumed_alias_type.f90 failure by
> simply arranging alias set 0 for the problematic array descriptor.
> 
> I am not sure this is the best option, but I suppose it is better than
> setting all array descriptors to have the same canonical type (as done by
> LTO)?
> 
Hi,
here is an updated patch which uses TYPELESS_STORAGE instead of alias set
0, so it is LTO safe.  Unfortunately I also had to enable it for all
array descriptors, otherwise I still get misoptimizations with modref
extended to handle builtins, for example:

FAIL: gfortran.dg/class_array_20.f03   -Os  execution test
FAIL: gfortran.dg/coindexed_1.f90   -O2  execution test
FAIL: gfortran.dg/coindexed_1.f90   -O3 -fomit-frame-pointer
FAIL: gfortran.dg/coindexed_1.f90   -O3 -g  execution test

This is not a perfect solution (we really want to track array
descriptors), but it fixes wrong code and would let me move forward.
Is it OK for mainline?

With extended modref I still get an infinite loop on the pdt_14 testcase.
ipa-modref only performs disambiguation on
__vtab_link_module_Pdtlink_8._deallocate; this global variable is
readonly (and is detected as such with LTO), so it must just be
uncovering some latent problem there.  I am, however, not familiar enough
with Fortran to tell what is wrong there.

The testcase fails in a different way with -flto for me.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* trans-types.c: Include alias.h
(gfc_get_array_type_bounds): Set typeless storage.
diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
index b15ea667411..b7129dcbe6d 100644
--- a/gcc/fortran/trans-types.c
+++ b/gcc/fortran/trans-types.c
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "trans-array.h"
 #include "dwarf2out.h" /* For struct array_descr_info.  */
 #include "attribs.h"
+#include "alias.h"
 
 
 #if (GFC_MAX_DIMENSIONS < 10)
@@ -1903,6 +1904,10 @@ gfc_get_array_type_bounds (tree etype, int dimen, int codimen, tree * lbound,
   base_type = gfc_get_array_descriptor_base (dimen, codimen, false);
   TYPE_CANONICAL (fat_type) = base_type;
   TYPE_STUB_DECL (fat_type) = TYPE_STUB_DECL (base_type);
+  /* Arrays of unknown type must alias with all array descriptors.  */
+  TYPE_TYPELESS_STORAGE (base_type) = 1;
+  TYPE_TYPELESS_STORAGE (fat_type) = 1;
+  gcc_checking_assert (!get_alias_set (base_type) && !get_alias_set (fat_type));
 
   tmp = TYPE_NAME (etype);
   if (tmp && TREE_CODE (tmp) == TYPE_DECL)


Re: [PATCH] c++: Diagnose constexpr delete [] new int; and delete new int[N]; [PR95808]

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/16/20 5:42 AM, Jakub Jelinek wrote:

Hi!

This patch diagnoses delete [] new int; and delete new int[1]; in constexpr
contexts by remembering
IDENTIFIER_OVL_OP_FLAGS (DECL_NAME (fun)) & OVL_OP_FLAG_VEC
from the operator new and checking it at operator delete time.
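
As a rough illustration of what is and is not accepted (a sketch only; DEMO_CONSTEXPR, sum_array and read_scalar are made-up names, and constexpr is applied only when the compiler advertises C++20 constexpr allocation through __cpp_constexpr_dynamic_alloc), the matched forms below evaluate fine, while the mismatched forms appear only in comments:

```cpp
#include <cassert>

// Use constexpr only where constexpr allocation is available (C++20).
#if defined(__cpp_constexpr_dynamic_alloc) && __cpp_constexpr_dynamic_alloc >= 201907L
# define DEMO_CONSTEXPR constexpr
#else
# define DEMO_CONSTEXPR inline
#endif

// Array new paired with array delete: accepted.
DEMO_CONSTEXPR int sum_array ()
{
  int *p = new int[3]{1, 2, 3};
  int s = p[0] + p[1] + p[2];
  delete[] p;
  return s;
}

// Non-array new paired with non-array delete: accepted.
DEMO_CONSTEXPR int read_scalar ()
{
  int *p = new int (42);
  int v = *p;
  delete p;
  return v;
}

// The mismatched forms the patch diagnoses during constant evaluation:
//   delete [] new int;     // array deallocation of a non-array allocation
//   delete new int[1];     // non-array deallocation of an array allocation

#if defined(__cpp_constexpr_dynamic_alloc) && __cpp_constexpr_dynamic_alloc >= 201907L
static_assert (sum_array () == 6, "matched array forms are constant");
static_assert (read_scalar () == 42, "matched non-array forms are constant");
#endif
```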

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-10-16  Jakub Jelinek  

PR c++/95808
* cp-tree.h (enum cp_tree_index): Add CPTI_HEAP_VEC_UNINIT_IDENTIFIER
and CPTI_HEAP_VEC_IDENTIFIER.
(heap_vec_uninit_identifier, heap_vec_identifier): Define.
* decl.c (initialize_predefined_identifiers): Initialize those
identifiers.
* constexpr.c (cxx_eval_call_expression): Reject array allocations
deallocated with non-array deallocation or non-array allocations
deallocated with array deallocation.
(non_const_var_error): Handle heap_vec_uninit_identifier and
heap_vec_identifier too.
(cxx_eval_constant_expression): Handle also heap_vec_uninit_identifier
and in that case during initialization replace it with
heap_vec_identifier.
(find_heap_var_refs): Handle heap_vec_uninit_identifier and
heap_vec_identifier too.

* g++.dg/cpp2a/constexpr-new15.C: New test.

--- gcc/cp/cp-tree.h.jj 2020-10-14 22:05:19.274858485 +0200
+++ gcc/cp/cp-tree.h	2020-10-15 16:29:12.136899207 +0200
@@ -178,6 +178,8 @@ enum cp_tree_index
  CPTI_HEAP_UNINIT_IDENTIFIER,
  CPTI_HEAP_IDENTIFIER,
  CPTI_HEAP_DELETED_IDENTIFIER,
+CPTI_HEAP_VEC_UNINIT_IDENTIFIER,
+CPTI_HEAP_VEC_IDENTIFIER,
  
  CPTI_LANG_NAME_C,

  CPTI_LANG_NAME_CPLUSPLUS,
@@ -322,6 +324,8 @@ extern GTY(()) tree cp_global_trees[CPTI
  #define heap_uninit_identifier cp_global_trees[CPTI_HEAP_UNINIT_IDENTIFIER]
  #define heap_identifier cp_global_trees[CPTI_HEAP_IDENTIFIER]
  #define heap_deleted_identifier cp_global_trees[CPTI_HEAP_DELETED_IDENTIFIER]
+#define heap_vec_uninit_identifier cp_global_trees[CPTI_HEAP_VEC_UNINIT_IDENTIFIER]
+#define heap_vec_identifier cp_global_trees[CPTI_HEAP_VEC_IDENTIFIER]
  #define lang_name_c   cp_global_trees[CPTI_LANG_NAME_C]
  #define lang_name_cplusplus cp_global_trees[CPTI_LANG_NAME_CPLUSPLUS]
  
--- gcc/cp/decl.c.jj	2020-10-14 22:05:19.293858210 +0200
+++ gcc/cp/decl.c	2020-10-15 16:30:05.690125490 +0200
@@ -4242,6 +4242,8 @@ initialize_predefined_identifiers (void)
  {"heap uninit", &heap_uninit_identifier, cik_normal},
  {"heap ", &heap_identifier, cik_normal},
  {"heap deleted", &heap_deleted_identifier, cik_normal},
+{"heap [] uninit", &heap_vec_uninit_identifier, cik_normal},
+{"heap []", &heap_vec_identifier, cik_normal},
  {NULL, NULL, cik_normal}
};
  
--- gcc/cp/constexpr.c.jj	2020-10-01 11:16:36.390959542 +0200
+++ gcc/cp/constexpr.c	2020-10-15 17:02:31.036021476 +0200
@@ -2288,7 +2288,11 @@ cxx_eval_call_expression (const constexp
{
  tree type = build_array_type_nelts (char_type_node,
  tree_to_uhwi (arg0));
- tree var = build_decl (loc, VAR_DECL, heap_uninit_identifier,
+ tree var = build_decl (loc, VAR_DECL,
+(IDENTIFIER_OVL_OP_FLAGS (DECL_NAME (fun))
+ & OVL_OP_FLAG_VEC)
+? heap_vec_uninit_identifier
+: heap_uninit_identifier,
 type);
  DECL_ARTIFICIAL (var) = 1;
  TREE_STATIC (var) = 1;
@@ -2306,6 +2310,42 @@ cxx_eval_call_expression (const constexp
  if (DECL_NAME (var) == heap_uninit_identifier
  || DECL_NAME (var) == heap_identifier)
{
+ if (IDENTIFIER_OVL_OP_FLAGS (DECL_NAME (fun))
+ & OVL_OP_FLAG_VEC)
+   {
+ if (!ctx->quiet)
+   {
+ error_at (loc, "array deallocation of object "
+"allocated with non-array "
+"allocation");
+ inform (DECL_SOURCE_LOCATION (var),
+ "allocation performed here");
+   }
+ *non_constant_p = true;
+ return t;
+   }
+ DECL_NAME (var) = heap_deleted_identifier;
+ ctx->global->values.remove (var);
+ ctx->global->heap_dealloc_count++;
+ return void_node;
+   }
+ else if (DECL_NAME (var) == heap_vec_uninit_identifier
+  || DECL_NAME (var) == heap_vec_identifie

Re: [PATCH v2] c++: Prevent warnings for value-dependent exprs [PR96742]

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/28/20 10:45 PM, Marek Polacek wrote:

On Wed, Oct 28, 2020 at 05:48:08PM -0400, Jason Merrill wrote:

On 10/28/20 5:29 PM, Marek Polacek wrote:

On Wed, Oct 28, 2020 at 02:46:36PM -0400, Jason Merrill wrote:

On 10/28/20 2:00 PM, Marek Polacek wrote:

On Tue, Oct 27, 2020 at 01:36:30PM -0400, Jason Merrill wrote:

On 10/24/20 6:52 PM, Marek Polacek wrote:

Here, in r11-155, I changed the call to uses_template_parms to
type_dependent_expression_p_push to avoid a crash in C++98 in
value_dependent_expression_p on a non-constant expression.  But that
prompted a host of complaints that we now warn for value-dependent
expressions in templates.  Those warnings are technically valid, but
people still don't want them because they're awkward to avoid.  So let's
partially revert my earlier fix and make sure that we don't ICE in
value_dependent_expression_p by checking potential_constant_expression
first.
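
For readers wondering what kind of code triggers the complaints, a small sketch (count_below is a made-up example, not one of the new tests): the loop condition is value-dependent on N, and for N == 0 it is an unsigned comparison that is always false, so a -Wtype-limits warning on that instantiation would be technically valid but awkward to avoid.

```cpp
#include <cassert>
#include <cstddef>

// `i < N` is value-dependent: whether it can ever be true depends on the
// template argument.  For N == 0 it is an unsigned `i < 0`, always false.
template <std::size_t N>
std::size_t count_below (std::size_t limit)
{
  std::size_t c = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (i < limit)
      ++c;
  return c;
}
```

Both count_below<0> and count_below<4> are reasonable instantiations of the same template, which is why users would rather not see the warning only for the former.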

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?

gcc/cp/ChangeLog:

PR c++/96675
PR c++/96742
* pt.c (tsubst_copy_and_build): Call uses_template_parms instead of
type_dependent_expression_p_push.  Only call uses_template_parms
for expressions that are potential_constant_expression.

gcc/testsuite/ChangeLog:

PR c++/96675
PR c++/96742
* g++.dg/warn/Wdiv-by-zero-3.C: Turn dg-warning into dg-bogus.
* g++.dg/warn/Wtautological-compare3.C: New test.
* g++.dg/warn/Wtype-limits5.C: New test.
* g++.old-deja/g++.pt/crash10.C: Remove dg-warning.
---
 gcc/cp/pt.c|  6 --
 gcc/testsuite/g++.dg/warn/Wdiv-by-zero-3.C |  6 --
 gcc/testsuite/g++.dg/warn/Wtautological-compare3.C | 11 +++
 gcc/testsuite/g++.dg/warn/Wtype-limits5.C  | 11 +++
 gcc/testsuite/g++.old-deja/g++.pt/crash10.C|  1 -
 5 files changed, 30 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wtautological-compare3.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wtype-limits5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index dc664ec3798..8aa0bc2c0d8 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19618,8 +19618,10 @@ tsubst_copy_and_build (tree t,
   {
/* If T was type-dependent, suppress warnings that depend on the range
   of the types involved.  */
-   bool was_dep = type_dependent_expression_p_push (t);
-
+   ++processing_template_decl;
+   const bool was_dep = (!potential_constant_expression (t)
+ || uses_template_parms (t));


We don't want to suppress warnings for a non-constant expression that uses
no template parms.  So maybe


Fair enough.


potential_c_e ? value_d : type_d


That works for all the cases I have.


?  Or perhaps instantiation_dependent_expression_p.


i_d_e_p would still crash in C++98 :(.


Perhaps we should protect the value_d call in i_d_e_p with potential_c_e?


Yeah, probably.  But then we should also guard the call to value_d in
uses_template_parms.  I can apply such a patch if it tests fine, if you
want.


Or change uses_template_parms to use i_d.


Experimenting with this revealed a curious issue: when we have
__PRETTY_FUNCTION__ in a template function, we set its DECL_VALUE_EXPR
to error_mark_node (cp_make_fname_decl), so potential_c_e returns false
when it gets it, but value_dependent_expression_p handles it specially
and says true.  So this patch

--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -27277,7 +27277,8 @@ bool
  instantiation_dependent_expression_p (tree expression)
  {
return (instantiation_dependent_uneval_expression_p (expression)
- || value_dependent_expression_p (expression));
+ || (potential_constant_expression (expression)
+ && value_dependent_expression_p (expression)));
  }
  
  /* Like type_dependent_expression_p, but it also works while not processing


breaks lambda-generic-pretty1.C.  ISTM that potential_c_e should return
true for a DECL_PRETTY_FUNCTION_P with error DECL_VALUE_EXPR.


Agreed.

Jason



Re: [PATCH v2] c++: Implement -Wvexing-parse [PR25814]

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/28/20 7:40 PM, Marek Polacek wrote:

On Wed, Oct 28, 2020 at 03:09:08PM -0400, Jason Merrill wrote:

On 10/28/20 1:58 PM, Marek Polacek wrote:

On Wed, Oct 28, 2020 at 01:26:53AM -0400, Jason Merrill via Gcc-patches wrote:

On 10/24/20 7:40 PM, Marek Polacek wrote:

On Fri, Oct 23, 2020 at 09:33:38PM -0400, Jason Merrill via Gcc-patches wrote:

On 10/23/20 3:01 PM, Marek Polacek wrote:

This patch implements the -Wvexing-parse warning to warn about the
sneaky most vexing parse rule in C++: the cases when a declaration
looks like a variable definition, but the C++ language requires it
to be interpreted as a function declaration.  This warning is on by
default (like clang++).  From the docs:

  void f(double a) {
int i();// extern int i (void);
int n(int(a));  // extern int n (int);
  }

  Another example:

  struct S { S(int); };
  void f(double a) {
S x(int(a));   // extern struct S x (int);
S y(int());// extern struct S y (int (*) (void));
S z(); // extern struct S z (void);
  }

You can find more on this in [dcl.ambig.res].

I spent a fair amount of time on fix-it hints so that GCC can recommend
various ways to resolve such an ambiguity.  Sometimes that's tricky.
E.g., suggesting default-initialization when the class doesn't have
a default constructor would not be optimal.  Suggesting {}-init is also
not trivial because it can use an initializer-list constructor if no
default constructor is available (which ()-init wouldn't do).  And of
course, pre-C++11, we shouldn't be recommending {}-init at all.
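
A small sketch of the ambiguity and of some ways around it (the struct S and the function vexing_demo here are made up for illustration and differ from the doc example above; the two vexing declarations are expected to draw the new warning but still compile):

```cpp
#include <cassert>
#include <type_traits>

struct S {
  int v;
  S () : v (7) {}
  explicit S (int x) : v (x) {}
};

int vexing_demo (double a)
{
  S z ();          // vexing parse: declares a function returning S
  static_assert (std::is_function<decltype (z)>::value,
		 "z is a function declaration, not an object");

  S x (int (a));   // vexing parse: declares a function taking an int
  static_assert (std::is_function<decltype (x)>::value,
		 "x is a function declaration, not an object");

  S z2{};          // object: {}-initialization (C++11 and later)
  S x2 ((int (a)));  // object: extra parentheses force an expression
  S x3{int (a)};   // object: braces cannot start a parameter list

  return z2.v + x2.v + x3.v;
}
```

Which of these alternatives a fix-it hint should suggest depends on the class, as described above: {}-init is not available before C++11 and may pick an initializer-list constructor.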


What do you think of, instead of passing the type down into the declarator
parse, adding the paren locations to cp_declarator::function and giving the
diagnostic from cp_parser_init_declarator instead?


Oops, now I see there's already cp_declarator::parenthesized; might as well
reuse that.  And maybe change it to a range, while we're at it.


I'm afraid I can't reuse it because grokdeclarator uses it to warn about
"unnecessary parentheses in declaration".  So when we have:

int (x());

declarator->parenthesized points to the outer parens (if any), whereas
declarator->u.function.parens_loc should point to the inner ones.  We also
have declarator->id_loc but I think we should only use it for declarator-ids.


Makes sense.


(We should still adjust ->parenthesized to be a range to generate a better
diagnostic; I shall send a patch soon.)


Hmm, I wonder why we have the parenthesized_p parameter to some of these
functions, since we can look at the declarator to find that information...


That would be a nice cleanup.


Interesting idea.  I suppose it's better, and makes the implementation
more localized.  The approach here is that if the .function.parens_loc
is UNKNOWN_LOCATION, we've not seen a vexing parse.


I'd rather always set the parens location, and then analyze the
cp_declarator in warn_about_ambiguous_parse to see if it's a vexing parse;
we should have all the information we need.


I could always set .parens_loc, but then I'd still need another flag telling
me whether we had an ambiguity.  Otherwise I don't know how I would tell
apart e.g. "int f()" (warn) v. "int f(void)" (don't warn), etc.


Ah, I was thinking that we still had the parameter declarators, but now I
see that cp_parser_parameter_declaration_list groks them and returns a
TREE_LIST.  We could set a TREE_LANG_FLAG on each TREE_LIST if its parameter
declarator was parenthesized?


I think so, looks like we have a bunch of free TREE_LANG_FLAG slots on
a TREE_LIST.  But cp_parser_parameter_declaration_clause can return
a void_list_node, so I assume I'd have to copy_node it before setting
some new flag in it.  Do you think that'd be fine?


There's no declarator in a void_list_node, so we shouldn't need to set a 
"declarator is parenthesized" flag on it.


Jason



Re: [PATCH] c++: Deprecate arithmetic convs on different enums [PR97573]

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/28/20 10:46 PM, Marek Polacek wrote:

On Wed, Oct 28, 2020 at 02:43:30PM -0400, Jason Merrill wrote:

On 10/28/20 2:01 PM, Marek Polacek wrote:

I noticed that C++20 P1120R0 deprecated certain arithmetic conversions
as outlined in [depr.arith.conv.enum], but we don't warn about them.  In
particular, "If one operand is of enumeration type and the other operand
is of a different enumeration type or a floating-point type, this
behavior is deprecated."  These will likely become ill-formed in C++23,
so we should warn by default in C++20.  To this effect, this patch adds
two new warnings (like clang++): -Wdeprecated-enum-enum-conversion and
-Wdeprecated-enum-float-conversion.  They are enabled by default in
C++20.  In older dialects, to enable these warnings you can now use
-Wenum-conversion which I made available in C++ too.  Note that unlike
C, in C++ it is not enabled by -Wextra, because that breaks bootstrap.

We already warn about comparisons of two different enumeration types via
-Wenum-compare, the rest is handled in this patch: we're performing the
usual arithmetic conversions in these contexts:
- an arithmetic operation,
- a bitwise operation,
- a comparison,
- a conditional operator,
- a compound assign operator.
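
A brief sketch of the affected operations and of a spelling that avoids them (E1, E2, sum_explicit and scale_explicit are made-up names; the deprecated forms are kept in comments so that the snippet builds without diagnostics in any dialect):

```cpp
#include <cassert>

enum E1 { e1 = 1 };
enum E2 { e2 = 2 };

// Deprecated in C++20 per [depr.arith.conv.enum]:
//   int    i = e1 + e2;        // enum and a different enum
//   double d = e1 * 2.5;       // enum and a floating-point type
//   int    c = b ? e1 : e2;    // conditional operator on different enums

// Explicit casts state the intended conversion and are not deprecated.
int sum_explicit ()
{
  return static_cast<int> (e1) + static_cast<int> (e2);
}

double scale_explicit (double d)
{
  return static_cast<int> (e1) * d;
}
```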

Using the spaceship operator as enum <=> real_type is ill-formed but we
don't reject it yet.


Hmm, oops.  Will you fix that as well?  It should be simple to fix in the
SPACESHIP_EXPR block that starts just at the end of this patch.


Sure.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK, thanks.


 From 8ae2e45f2dd35510aed3be1ab249b8612e33f00d Mon Sep 17 00:00:00 2001
From: Marek Polacek 
Date: Wed, 28 Oct 2020 19:02:29 -0400
Subject: [PATCH] c++: Reject float <=> enum.

As [depr.arith.conv.enum] says, these are ill-formed.

gcc/cp/ChangeLog:

* typeck.c (do_warn_enum_conversions): Don't warn for SPACESHIP_EXPR.
(cp_build_binary_op): Reject float <=> enum or enum <=> float.  Use
CP_INTEGRAL_TYPE_P instead of INTEGRAL_OR_ENUMERATION_TYPE_P.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/enum-conv1.C: Remove unused code.
* g++.dg/cpp2a/spaceship-err5.C: New test.
---
  gcc/cp/typeck.c | 13 ++--
  gcc/testsuite/g++.dg/cpp2a/enum-conv1.C |  3 ---
  gcc/testsuite/g++.dg/cpp2a/spaceship-err5.C | 23 +
  3 files changed, 34 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-err5.C

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 7305310ecbe..d3b701610cf 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -4512,6 +4512,9 @@ do_warn_enum_conversions (location_t loc, enum tree_code code, tree type0,
"with enumeration type %qT is deprecated",
type0, type1);
  return;
+   case SPACESHIP_EXPR:
+ /* This is invalid, don't warn.  */
+ return;
default:
  if (enum_first_p)
warning_at (loc, opt, "arithmetic between enumeration type %qT "
@@ -5584,6 +5587,12 @@ cp_build_binary_op (const op_location_t &location,
   arithmetic conversions are applied to the operands."  So we don't do
   arithmetic conversions if the operands both have enumeral type.  */
result_type = NULL_TREE;
+  else if ((orig_code0 == ENUMERAL_TYPE && orig_code1 == REAL_TYPE)
+  || (orig_code0 == REAL_TYPE && orig_code1 == ENUMERAL_TYPE))
+   /* [depr.arith.conv.enum]: Three-way comparisons between such operands
+  [where one is of enumeration type and the other is of a different
+  enumeration type or a floating-point type] are ill-formed.  */
+   result_type = NULL_TREE;
  
if (result_type)

{
@@ -5598,12 +5607,12 @@ cp_build_binary_op (const op_location_t &location,
 type to a floating point type, the program is ill-formed.  */
  bool ok = true;
  if (TREE_CODE (result_type) == REAL_TYPE
- && INTEGRAL_OR_ENUMERATION_TYPE_P (TREE_TYPE (orig_op0)))
+ && CP_INTEGRAL_TYPE_P (orig_type0))
/* OK */;
  else if (!check_narrowing (result_type, orig_op0, complain))
ok = false;
  if (TREE_CODE (result_type) == REAL_TYPE
- && INTEGRAL_OR_ENUMERATION_TYPE_P (TREE_TYPE (orig_op1)))
+ && CP_INTEGRAL_TYPE_P (orig_type1))
/* OK */;
  else if (!check_narrowing (result_type, orig_op1, complain))
ok = false;
diff --git a/gcc/testsuite/g++.dg/cpp2a/enum-conv1.C b/gcc/testsuite/g++.dg/cpp2a/enum-conv1.C
index d4960f334dd..4571b5e8968 100644
--- a/gcc/testsuite/g++.dg/cpp2a/enum-conv1.C
+++ b/gcc/testsuite/g++.dg/cpp2a/enum-conv1.C
@@ -110,9 +110,6 @@ enum_float (bool b)
r += b ? d : u1; // { dg-warning "conditional expression between" "" { target c++20 } }
r += b ? u1 : d; // { dg-warning "conditional expression between" "" { target c++20 } }

Re: [PATCH 1/2] c++: Tolerate empty initial targs during normalization [PR97412]

2020-10-29 Thread Patrick Palka via Gcc-patches
On Mon, 19 Oct 2020, Patrick Palka wrote:

> When normalizing the constraint-expression of a nested-requirement, we
> pass NULL_TREE as the initial template arguments for normalization, but
> tsubst_argument_pack is not prepared to handle a NULL_TREE targ vector.
> This causes us to ICE when normalizing a variadic concept as part of a
> nested-requirement.
> 
> This patch fixes the ICE by guarding the call to tsubst_template_args in
> normalize_concept_check appropriately.  This will also enables us to
> simplify many of the normalization routines to pass NULL_TREE instead of
> a set of generic template arguments as the initial template arguments,
> which will be done in a subsequent patch.

Ping.  For some reason I confusingly referred to 'targs' in the commit
message when the variable in question is actually 'args'.  So I've
adjusted the commit message below:

-- >8 --

Subject: [PATCH] c++: Tolerate empty initial args during normalization
 [PR97412]

When normalizing the constraint-expression of a nested-requirement, we
pass NULL_TREE as the initial template arguments for normalization, but
tsubst_argument_pack is not prepared to handle a NULL_TREE args vector.
This causes us to ICE when normalizing a variadic concept as part of a
nested-requirement.

This patch fixes the ICE by guarding the call to tsubst_template_args in
normalize_concept_check appropriately.  This will also enable us to
simplify many of the normalization routines to just pass NULL_TREE
(instead of a set of generic template arguments) as the initial template
arguments.

gcc/cp/ChangeLog:

PR c++/97412
* constraint.cc (normalize_concept_check): Don't call
tsubst_template_args when 'args' is NULL.

gcc/testsuite/ChangeLog:

PR c++/97412
* g++.dg/cpp2a/concepts-variadic2.C: New test.
---
 gcc/cp/constraint.cc|  3 ++-
 gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C | 12 
 2 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index f4f5174eff3..75457a2dd60 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -686,7 +686,8 @@ normalize_concept_check (tree check, tree args, norm_info info)
 }
 
   /* Substitute through the arguments of the concept check. */
-  targs = tsubst_template_args (targs, args, info.complain, info.in_decl);
+  if (args)
+targs = tsubst_template_args (targs, args, info.complain, info.in_decl);
   if (targs == error_mark_node)
 return error_mark_node;
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C b/gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C
new file mode 100644
index 000..ce61aef5481
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C
@@ -0,0 +1,12 @@
+// PR c++/97412
+// { dg-do compile { target c++20 } }
+
+template 
+concept call_bar_with = requires(T t, TArgs... args) {
+  t.bar(args...);
+};
+
+template 
+concept foo = requires {
+  requires call_bar_with;
+};
-- 
2.29.0.rc0



Re: [PATCH v2] c++: Implement CWG 625: Use of auto as template-arg [PR97479]

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/28/20 10:56 PM, Marek Polacek wrote:

On Wed, Oct 28, 2020 at 02:34:15PM -0400, Jason Merrill via Gcc-patches wrote:

On 10/28/20 2:02 PM, Marek Polacek wrote:

This patch implements CWG 625 which prohibits using auto in a template
argument.  A few tests used this construction.  We could perhaps only
give an error in C++20, but not in C++17 with -fconcepts.


We should not give an error with -fconcepts-ts, this was allowed by the
Concepts TS.


Ah, I see.  Presumably we should only get the errors on { target c++20 }.


...which won't happen in c++17_only tests, so no need to change auto[134].C.


Does just changing !flag_concepts to !flag_concepts_ts work?


Almost: one issue is that it would regress the error message for
something like

   using T = auto;

for which we have dedicated code in grokdeclarator, which is nicer than just
the terse "invalid use."  So I think if flag_concepts is on, we should still
check in_template_argument_list_p.  Since the logic has gotten a bit tricky,
I've introduced a new variable rather than playing hard-to-read games with
?: in the if.

Thanks,

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch implements CWG 625 which prohibits using auto in a template
argument.  A few tests used this construction.  Since this usage was
allowed by the Concepts TS, we only give an error in C++20.

gcc/cp/ChangeLog:

DR 625
PR c++/97479
* parser.c (cp_parser_type_id_1): Reject using auto as
a template-argument in C++20.

gcc/testsuite/ChangeLog:

DR 625
PR c++/97479
* g++.dg/concepts/auto1.C: Add dg-error.
* g++.dg/concepts/auto3.C: Likewise.
* g++.dg/concepts/auto4.C: Likewise.
* g++.dg/cpp0x/auto3.C: Update dg-error.
* g++.dg/cpp0x/auto9.C: Likewise.
* g++.dg/cpp2a/concepts-pr84979-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr84979-3.C: Likewise.
* g++.dg/cpp2a/concepts-pr84979.C: Likewise.
* g++.dg/DRs/dr625.C: New test.
---
  gcc/cp/parser.c | 15 +--
  gcc/testsuite/g++.dg/DRs/dr625.C| 15 +++
  gcc/testsuite/g++.dg/concepts/auto1.C   |  4 ++--
  gcc/testsuite/g++.dg/concepts/auto3.C   |  6 +++---
  gcc/testsuite/g++.dg/concepts/auto4.C   |  2 +-
  gcc/testsuite/g++.dg/cpp0x/auto3.C  |  2 +-
  gcc/testsuite/g++.dg/cpp0x/auto9.C  |  2 +-
  gcc/testsuite/g++.dg/cpp2a/concepts-pr84979-2.C | 12 ++--
  gcc/testsuite/g++.dg/cpp2a/concepts-pr84979-3.C | 12 ++--
  gcc/testsuite/g++.dg/cpp2a/concepts-pr84979.C   |  2 +-
  10 files changed, 49 insertions(+), 23 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/DRs/dr625.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 234079559b9..6570b0af889 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -22417,9 +22417,17 @@ cp_parser_type_id_1 (cp_parser *parser, cp_parser_flags flags,
if (!cp_parser_parse_definitely (parser))
  abstract_declarator = NULL;
  
+  bool auto_typeid_ok = false;

+  /* The concepts TS allows 'auto' as a type-id.  */
+  if (flag_concepts_ts)
+auto_typeid_ok = !parser->in_type_id_in_expr_p;
+  /* DR 625 prohibits use of auto as a template-argument.  */


In this comment, please mention that we're only allowing it here for the 
better diagnostic.  OK with this and dropping the unnecessary testsuite 
changes.



+  else if (flag_concepts)
+auto_typeid_ok = (!parser->in_type_id_in_expr_p
+ && !parser->in_template_argument_list_p);
+
if (type_specifier_seq.type
-  /* The concepts TS allows 'auto' as a type-id.  */
-  && (!flag_concepts || parser->in_type_id_in_expr_p)
+  && !auto_typeid_ok
/* None of the valid uses of 'auto' in C++14 involve the type-id
 nonterminal, but it is valid in a trailing-return-type.  */
&& !(cxx_dialect >= cxx14 && is_trailing_return))
@@ -22446,6 +22454,9 @@ cp_parser_type_id_1 (cp_parser *parser, cp_parser_flags flags,
inform (DECL_SOURCE_LOCATION (tmpl), "%qD declared here",
tmpl);
  }
+   else if (parser->in_template_argument_list_p)
+ error_at (loc, "%qT not permitted in template argument",
+   auto_node);
else
  error_at (loc, "invalid use of %qT", auto_node);
return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/DRs/dr625.C b/gcc/testsuite/g++.dg/DRs/dr625.C
new file mode 100644
index 000..ce30a9258e6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/DRs/dr625.C
@@ -0,0 +1,15 @@
+// DR 625 - Use of auto as a template-argument
+// PR c++/97479
+// { dg-do compile { target c++14 } }
+
+template
+struct A { };
+
+void f(int);
+
+int main()
+{
+  A x = A(); // { dg-error "not permitted|invalid|cannot 
convert" }
+  A a = A(); // { dg-error "not permitted|invalid|cannot convert" }
+  void (*p)(auto

Re: [PATCH] c++: Fix up constexpr evaluation of arguments passed by invisible reference [PR97388]

2020-10-29 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 29, 2020 at 11:09:05AM -0400, Jason Merrill wrote:
> I think this isn't enough; if bar calls foo twice, the second call will find
> the value in the hash table and not change the temporary, so the destructor
> will throw.  I think we also need to set non_constant_args if the argument
> type has a non-trivial destructor, so we don't try to memoize the call.

For the testcases with constexpr new in there it wouldn't be memoized
already, but you are right that for other cases making it non_constant_args
is desirable.

> Then setting arg back to orig_arg isn't needed, because we don't do the
> first unsharing for the hash table.

Yes.

> And then I think that the second unsharing is unnecessary for an argument in
> a non-memoized call, because we will have already unshared it when making
> the copy in the caller.

I'm not sure about this one, but if it works, I'm not against it; the less
unsharing the better for compile-time memory, unless it breaks stuff.

I'll bootstrap/regtest your patchset (or do you want to do that)?

Jakub



Avoid char[] array in tree_def

2020-10-29 Thread Jan Hubicka
Hi,
this patch removes the second char array from the tree_def union and makes it
!TYPELESS_STORAGE.  Now accesses to anything in a tree no longer have alias
set 0, but they all have alias set 1 :)
This is because of the way we handle unions.  However, it still increases TBAA
effectiveness by about 12%.  From:

Alias oracle query stats:
  refs_may_alias_p: 65066258 disambiguations, 74846942 queries
  ref_maybe_used_by_call_p: 152444 disambiguations, 65966862 queries
  call_may_clobber_ref_p: 22546 disambiguations, 28559 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 36816 queries
  nonoverlapping_refs_since_match_p: 27230 disambiguations, 58300 must overlaps, 86300 queries
  aliasing_component_refs_p: 66090 disambiguations, 2048800 queries
  TBAA oracle: 25578632 disambiguations 59483650 queries
   12219919 are in alias set 0
   10534575 queries asked about the same object
   125 queries asked about the same alias set
   0 access volatile
   9491563 are dependent in the DAG
   1658836 are aritificially in conflict with void *

Modref stats:
  modref use: 14421 disambiguations, 48129 queries
  modref clobber: 1528229 disambiguations, 1926907 queries
  3881547 tbaa queries (2.014392 per modref query)
  565057 base compares (0.293246 per modref query)

PTA query stats:
  pt_solution_includes: 947491 disambiguations, 13119151 queries
  pt_solutions_intersect: 1043695 disambiguations, 13221495 queries

To:

Alias oracle query stats:
  refs_may_alias_p: 66455561 disambiguations, 75202803 queries
  ref_maybe_used_by_call_p: 155301 disambiguations, 67370278 queries
  call_may_clobber_ref_p: 22550 disambiguations, 28587 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 37058 queries
  nonoverlapping_refs_since_match_p: 28126 disambiguations, 59906 must overlaps, 88990 queries
  aliasing_component_refs_p: 66375 disambiguations, 2440039 queries
  TBAA oracle: 28800751 disambiguations 64328055 queries
   8053661 are in alias set 0
   11181983 queries asked about the same object
   125 queries asked about the same alias set
   0 access volatile
   13905691 are dependent in the DAG
   2385844 are aritificially in conflict with void *

Modref stats:
  modref use: 16781 disambiguations, 52031 queries
  modref clobber: 1745589 disambiguations, 2149518 queries
  4192266 tbaa queries (1.950328 per modref query)
  559148 base compares (0.260127 per modref query)

PTA query stats:
  pt_solution_includes: 906487 disambiguations, 13105994 queries
  pt_solutions_intersect: 1041144 disambiguations, 13659726 queries

Bootstrapped/regtested x86_64-linux, OK?

* tree.c (build_string): Update.
* tree-core.h (tree_fixed_cst): Avoid typeless storage.

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index c9280a8d3b1..63dbb5b8eab 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1401,7 +1401,8 @@ struct GTY(()) tree_fixed_cst {
 struct GTY(()) tree_string {
   struct tree_typed typed;
   int length;
-  char str[1];
+  /* Avoid char array that would make whole type to be typeless storage.  */
+  struct {char c;} str[1];
 };
 
 struct GTY(()) tree_complex {
diff --git a/gcc/tree.c b/gcc/tree.c
index 81f867ddded..84115630184 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -2273,7 +2273,7 @@ build_string (unsigned len, const char *str /*= NULL */)
 memcpy (s->string.str, str, len);
   else
 memset (s->string.str, 0, len);
-  s->string.str[len] = '\0';
+  s->string.str[len].c = '\0';
 
   return s;
 }


Re: Avoid char[] array in tree_def

2020-10-29 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 29, 2020 at 04:50:54PM +0100, Jan Hubicka wrote:
>   * tree.c (build_string): Update.
>   * tree-core.h (tree_fixed_cst): Avoid typeless storage.

Is it valid then to
#define TREE_STRING_POINTER(NODE) \
  ((const char *)(STRING_CST_CHECK (NODE)->string.str))
and strcpy etc. it around though?
Maybe yes, because stores through char can alias anything.
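
A minimal sketch of the access pattern in question may help. The type
string_node and the helpers string_pointer and set_string are hypothetical
stand-ins, not GCC's tree_string, TREE_STRING_POINTER or build_string, and
the array is given a fixed size so the sketch does not depend on the
trailing-array idiom. It only illustrates that the wrapped member can still
be filled byte-wise and read back through a const char *:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for tree_string; not the real GCC type.
struct string_node {
  int length;
  // Wrapped char, as in the patch, so the member is not a plain char array.
  struct { char c; } str[16];
};

// Assumption: the wrapper is one byte wide, which holds on common ABIs.
static_assert(sizeof(string_node{}.str[0]) == 1, "wrapper should be one byte");

// Analogue of the macro: view the storage through a const char *.
inline const char *string_pointer(const string_node &n) {
  return reinterpret_cast<const char *>(n.str);
}

// Analogue of the build_string tail: byte-wise copy, then a typed store.
inline void set_string(string_node &n, const char *src) {
  std::size_t len = std::strlen(src);
  assert(len < sizeof n.str);
  std::memcpy(n.str, src, len);
  n.str[len].c = '\0';
  n.length = static_cast<int>(len);
}
```

Whether the compiler's alias analysis treats such reads as intended is
exactly what the thread is discussing; the sketch shows only the source-level
pattern.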

> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index c9280a8d3b1..63dbb5b8eab 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -1401,7 +1401,8 @@ struct GTY(()) tree_fixed_cst {
>  struct GTY(()) tree_string {
>struct tree_typed typed;
>int length;
> -  char str[1];
> +  /* Avoid char array that would make whole type to be typeless storage.  */
> +  struct {char c;} str[1];
>  };
>  
>  struct GTY(()) tree_complex {
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 81f867ddded..84115630184 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -2273,7 +2273,7 @@ build_string (unsigned len, const char *str /*= NULL */)
>  memcpy (s->string.str, str, len);
>else
>  memset (s->string.str, 0, len);
> -  s->string.str[len] = '\0';
> +  s->string.str[len].c = '\0';
>  
>return s;
>  }

Jakub



Re: [committed] libstdc++: Make std::function work better with -fno-rtti

2020-10-29 Thread Jonathan Wakely via Gcc-patches

On 29/10/20 14:49 +, Jonathan Wakely wrote:

This change allows std::function::target() to work even without RTTI,
using the same approach as std::any. Because we know what the manager
function would be for a given type, we can check if the stored pointer
has the expected address. If it does, we don't need to use RTTI. If it
isn't equal, we still need to do the RTTI check (when RTTI is enabled)
to handle the case where the same function has different addresses in
different shared objects.

This also changes the implementation of the manager function to return a
null pointer result when asked for the type_info of the target object.
This not only avoids a warning with -Wswitch -Wsystem-headers, but also
prevents std::function::target_type() from dereferencing an
uninitialized pointer when the linker keeps an instantiation of the
manager function that was compiled without RTTI.

Finally, this fixes a bug in the non-const overload of function::target
where calling it with a function type F was ill-formed, due to
attempting to use const_cast<F*>(ptr). The standard only allows
const_cast<T*> when T is an object type.  The solution is to use
*const_cast<F**>(&ptr) instead, because F* is an object type even if F
isn't. I've also used _GLIBCXX17_CONSTEXPR in function::target so that
it doesn't bother instantiating anything for types that can never be a
valid target.
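
For readers unfamiliar with function::target, here is a small usage sketch.
The names twice and holds_twice are made up for illustration; only
std::function and its target<T>() member come from the standard library. It
assumes a build where target() is available (RTTI enabled, or a libstdc++
that includes the change described above):

```cpp
#include <cassert>
#include <functional>

inline int twice(int x) { return 2 * x; }

// Returns true if f currently stores exactly the function pointer &twice.
inline bool holds_twice(const std::function<int(int)> &f) {
  using fn_ptr = int (*)(int);
  // target<T>() yields a pointer to the stored callable if its type is T,
  // and a null pointer otherwise.
  const fn_ptr *p = f.target<fn_ptr>();
  return p != nullptr && *p == &twice;
}
```

A std::function holding a lambda makes holds_twice return false, because the
stored object is then of the closure type rather than int (*)(int).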

libstdc++-v3/ChangeLog:

* include/bits/std_function.h (_Function_handler):
Define explicit specialization used for invalid target types.
(_Base_manager::_M_manager) [!__cpp_rtti]: Return null.
(function::target_type()): Check for null pointer.
(function::target()): Define unconditionally. Fix bug with
const_cast of function pointer type.
(function::target() const): Define unconditionally, but
only use RTTI if enabled.
* testsuite/20_util/function/target_no_rtti.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

2
1
0


Oops, sorry for the duplicate mail about this. My mailer ate my
attempt to script something.



Re: Avoid char[] array in tree_def

2020-10-29 Thread Jan Hubicka
> On Thu, Oct 29, 2020 at 04:50:54PM +0100, Jan Hubicka wrote:
> > * tree.c (build_string): Update.
> > * tree-core.h (tree_fixed_cst): Avoid typeless storage.
> 
> Is it valid then to
> #define TREE_STRING_POINTER(NODE) \
>   ((const char *)(STRING_CST_CHECK (NODE)->string.str))
> and strcpy etc. it around though?
> Maybe yes, because stores through char can alias anything.

Yep, I think it should be valid for that reason.  The whole thing is not
terribly pretty (the wide-int change was better), but I do not know of a
better solution and it affects our core data structure.  Typeless storage
is a really complicated concept.  Fortunately it seems that no more hacks
like this are needed: both tree and gimple now get a non-zero alias set.

Honza
> 
> > diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> > index c9280a8d3b1..63dbb5b8eab 100644
> > --- a/gcc/tree-core.h
> > +++ b/gcc/tree-core.h
> > @@ -1401,7 +1401,8 @@ struct GTY(()) tree_fixed_cst {
> >  struct GTY(()) tree_string {
> >struct tree_typed typed;
> >int length;
> > -  char str[1];
> > +  /* Avoid char array that would make whole type to be typeless storage.  */
> > +  struct {char c;} str[1];
> >  };
> >  
> >  struct GTY(()) tree_complex {
> > diff --git a/gcc/tree.c b/gcc/tree.c
> > index 81f867ddded..84115630184 100644
> > --- a/gcc/tree.c
> > +++ b/gcc/tree.c
> > @@ -2273,7 +2273,7 @@ build_string (unsigned len, const char *str /*= NULL */)
> >  memcpy (s->string.str, str, len);
> >else
> >  memset (s->string.str, 0, len);
> > -  s->string.str[len] = '\0';
> > +  s->string.str[len].c = '\0';
> >  
> >return s;
> >  }
> 
>   Jakub
> 


Re: Avoid char[] array in tree_def

2020-10-29 Thread Richard Biener
On Thu, 29 Oct 2020, Jan Hubicka wrote:

> Hi,
> this patch removes second char array from tree_def union and makes it
> !TYPELESS_STORAGE.  Now all accesses to anything in tree no longer have alias
> set 0, but they all have alias set 1 :)
> This is because the way we handle unions. However it still increases TBAA
> effectivity by about 12%. From:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 65066258 disambiguations, 74846942 queries
>   ref_maybe_used_by_call_p: 152444 disambiguations, 65966862 queries
>   call_may_clobber_ref_p: 22546 disambiguations, 28559 queries
>   nonoverlapping_component_refs_p: 0 disambiguations, 36816 queries
>   nonoverlapping_refs_since_match_p: 27230 disambiguations, 58300 must 
> overlaps, 86300 queries
>   aliasing_component_refs_p: 66090 disambiguations, 2048800 queries
>   TBAA oracle: 25578632 disambiguations 59483650 queries
>12219919 are in alias set 0
>10534575 queries asked about the same object
>125 queries asked about the same alias set
>0 access volatile
>9491563 are dependent in the DAG
>1658836 are aritificially in conflict with void *
> 
> Modref stats:
>   modref use: 14421 disambiguations, 48129 queries
>   modref clobber: 1528229 disambiguations, 1926907 queries
>   3881547 tbaa queries (2.014392 per modref query)
>   565057 base compares (0.293246 per modref query)
> 
> PTA query stats:
>   pt_solution_includes: 947491 disambiguations, 13119151 queries
>   pt_solutions_intersect: 1043695 disambiguations, 13221495 queries
> 
> To:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 66455561 disambiguations, 75202803 queries
>   ref_maybe_used_by_call_p: 155301 disambiguations, 67370278 queries
>   call_may_clobber_ref_p: 22550 disambiguations, 28587 queries
>   nonoverlapping_component_refs_p: 0 disambiguations, 37058 queries
>   nonoverlapping_refs_since_match_p: 28126 disambiguations, 59906 must 
> overlaps, 88990 queries
>   aliasing_component_refs_p: 66375 disambiguations, 2440039 queries
>   TBAA oracle: 28800751 disambiguations 64328055 queries
>8053661 are in alias set 0
>11181983 queries asked about the same object
>125 queries asked about the same alias set
>0 access volatile
>13905691 are dependent in the DAG
>2385844 are aritificially in conflict with void *
> 
> Modref stats:
>   modref use: 16781 disambiguations, 52031 queries
>   modref clobber: 1745589 disambiguations, 2149518 queries
>   4192266 tbaa queries (1.950328 per modref query)
>   559148 base compares (0.260127 per modref query)
> 
> PTA query stats:
>   pt_solution_includes: 906487 disambiguations, 13105994 queries
>   pt_solutions_intersect: 1041144 disambiguations, 13659726 queries
> 
> Bootstrapped/regtested x86_64-linux, OK?
> 
>   * tree.c (build_string): Update.
>   * tree-core.h (tree_fixed_cst): Avoid typeless storage.
> 
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index c9280a8d3b1..63dbb5b8eab 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -1401,7 +1401,8 @@ struct GTY(()) tree_fixed_cst {
>  struct GTY(()) tree_string {
>struct tree_typed typed;
>int length;
> -  char str[1];
> +  /* Avoid char array that would make whole type to be typeless storage.  */
> +  struct {char c;} str[1];

That's ugly and will for sure defeat warning / access code
when we access this as char[], no?  I mean, we could
as well use 'int str[1];' here?

Maybe we can invent some C++ attribute for this?

[[gnu::string]]

or so that marks it as actual char and not typeless storage?

Richard.

>  };
>  
>  struct GTY(()) tree_complex {
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 81f867ddded..84115630184 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -2273,7 +2273,7 @@ build_string (unsigned len, const char *str /*= NULL */)
>  memcpy (s->string.str, str, len);
>else
>  memset (s->string.str, 0, len);
> -  s->string.str[len] = '\0';
> +  s->string.str[len].c = '\0';
>  
>return s;
>  }
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


[Patch, fortran] PR83118 - [8/9/10/11 Regression] Bad intrinsic assignment of class(*) array component of derived type

2020-10-29 Thread Paul Richard Thomas via Gcc-patches
Hi Everyone,

I am afraid that this is a rather long sad story, mainly due to my efforts
with gfortran being interrupted by daytime work. I posted the first version
of the patch nearly a year ago but this was derailed by Tobias's question
at: https://gcc.gnu.org/legacy-ml/fortran/2019-11/msg00098.html

(i) The attached fixes the original problem and is tested by
gfortran.dg/unlimited_polymorphic_32.f03.
(ii) In fixing the original problem, a fair amount of effort was required
to get the element length correct for class temporaries produced by
dependencies in class assignment (see footnote). This is reflected in the
changes to trans-array.c(gfc_alloc_allocatable_for_assignment).
(iii) Tobias's testcase in the above posting to the list didn't address
itself to class arrays of the original problem. However, it revealed that
reallocation was not occurring at all for scalar assignments.  This is fixed
by the large chunk in trans-expr.c(trans_class_assignment). The array case
is 'fixed' by testing for unequal element sizes between lhs and rhs before
reallocation in gfc_alloc_allocatable_for_assignment. This is difficult to
test for since, in most cases, the system returns the same address after
reallocation.
(iv) dependency_57.f90 segfaulted at runtime. The other work in
trans_class_assignment was required to fix this.
(v) A number of minor tidy ups were done including the new function
gfc_resize_class_size_with_len to eliminate some repeated code.

This all bootstraps and regtests on FC31/x86_64 - OK for master?

Cheers

Paul

This patch fixes PR83118 and fixes one or two other niggles in handling
class objects - most importantly class array temporaries required, where
dependences occur in class assignment, and a correct implementation of
reallocation on assignment.

2020-10-29  Paul Thomas  

gcc/fortran
PR fortran/83118
* resolve.c (resolve_ordinary_assign): Generate a vtable if
necessary for scalar non-polymorphic rhs's to unlimited lhs's.
* trans-array.c (gfc_trans_allocate_array_storage): Defer
obtaining class element type until all sources of class exprs.
are tried. Use class API rather than TREE_OPERAND. Look for
class expressions in ss->info. After this, obtain the element
size for class payloads. Cast the data as character(len=size)
to overcome unlimited polymorphic problems.
(structure_alloc_comps): Replace code that replicates the new
function gfc_resize_class_size_with_len.
(gfc_alloc_allocatable_for_assignment): Obtain element size
for lhs in cases of deferred characters and class entities.
Move code for the element size of rhs to start of block. Clean
up extraction of class parameters throughout this function.
After the shape check test whether or not the lhs and rhs
element sizes are the same. Use earlier evaluation of
'cond_null'. Reallocation of lhs only to happen if size changes
or element size changes.
* trans-expr.c (gfc_resize_class_size_with_len): New function.
(gfc_conv_procedure_call): Ensure the vtable is present for
passing a non-class actual to an unlimited formal.
(trans_class_vptr_len_assignment): For expressions of type
BT_CLASS, extract the class expression if necessary. Use a
statement block outside the loop body. Ensure that 'rhs' is
of the correct type. Obtain rhs vptr in all circumstances.
(gfc_trans_assignment_1): Simplify some of the logic with
'realloc_flag'. Set 'vptr_copy' for all array assignments to
unlimited polymorphic lhs.
* trans.c (gfc_build_array_ref): Call
gfc_resize_class_size_with_len to correct span for unlimited
polymorphic decls.
* trans.h : Add prototype for gfc_resize_class_size_with_len.

gcc/testsuite/
PR fortran/83118
* gfortran.dg/dependency_57.f90: Change to dg-run and test
for correct result.
* gfortran.dg/unlimited_polymorphic_32.f03: New test.

Footnote: I have come to the conclusion that
gfc_trans_allocate_array_storage is the last place that we should be
dealing with class array temporaries, or directly at least. I will give
some thought as to how to do it better. Also, chunks of code are coming
within scalarization loops that should be outside:
  x->_vptr = (struct __vtype__STAR * {ref-all}) &__vtab_INTEGER_4_;
  x->_len = 0;
  D.3977 = x->_vptr->_size;
  D.3978 = x->_len;
  D.3979 = D.3978 > 0 ? D.3977 * D.3978 : D.3977;
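
The last dumped statement is the element-size computation for an unlimited
polymorphic payload. As a sketch only, with a made-up helper name and
parameter names (the real logic is generated by the compiler, not written
like this), and assuming _len is nonzero only when the payload needs a
length multiplier:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the dumped expression
//   D.3979 = D.3978 > 0 ? D.3977 * D.3978 : D.3977;
// size corresponds to x->_vptr->_size and len to x->_len.
inline std::size_t payload_size(std::size_t size, long len) {
  return len > 0 ? size * static_cast<std::size_t>(len) : size;
}
```

With the values from the dump (_len set to 0 for the INTEGER(4) vtable), the
result is just the element size.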


Change2.Logs
Description: Binary data


unlimited_polymorphic_32.f03
Description: Binary data


Re: Avoid char[] array in tree_def

2020-10-29 Thread Jan Hubicka
> 
> That's ugly and will for sure defeat warning / access code
> when we access this as char[], no?  I mean, we could
> as well use 'int str[1];' here?

Well, we always get the char pointer via a macro, which is IMO OK, but I am
also not very much in love with this.
> 
> Maybe we can invent some C++ attribute for this?
> 
> [[gnu::string]]
> 
> or so that marks it as actual char and not typeless storage?

An attribute would probably make sense.  Not sure if gnu::string is the best
name given that it can also be meaningful for arrays of small integers
(such as in wide_int).

Honza
> 
> Richard.
> 
> >  };
> >  
> >  struct GTY(()) tree_complex {
> > diff --git a/gcc/tree.c b/gcc/tree.c
> > index 81f867ddded..84115630184 100644
> > --- a/gcc/tree.c
> > +++ b/gcc/tree.c
> > @@ -2273,7 +2273,7 @@ build_string (unsigned len, const char *str /*= NULL */)
> >  memcpy (s->string.str, str, len);
> >else
> >  memset (s->string.str, 0, len);
> > -  s->string.str[len] = '\0';
> > +  s->string.str[len].c = '\0';
> >  
> >return s;
> >  }
> > 
> 
> -- 
> Richard Biener 
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imend


[PATCH] Fix some memleaks

2020-10-29 Thread Richard Biener
This fixes some memleaks, one older, one recently introduced.

Bootstrap / regtest in progress on x86_64-unknown-linux-gnu.

2020-10-29  Richard Biener  

* tree-ssa-pre.c (compute_avail): Free operands consistently.
* tree-vect-loop.c (vectorizable_phi): Make sure all operand
defs vectors are released.
---
 gcc/tree-ssa-pre.c   | 5 -
 gcc/tree-vect-loop.c | 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 63f3a81e94c..bcef9720095 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -3953,7 +3953,10 @@ compute_avail (void)
 adding the reference to EXP_GEN.  */
  if (BB_MAY_NOTRETURN (block)
  && vn_reference_may_trap (ref))
-   continue;
+   {
+ operands.release ();
+ continue;
+   }
 
  /* If the value of the reference is not invalidated in
 this block until it is computed, add the expression
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 75b731407ba..5ab125d15c6 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -7570,7 +7570,6 @@ vectorizable_phi (vec_info *,
   tree scalar_dest = gimple_phi_result (stmt_info->stmt);
   basic_block bb = gimple_bb (stmt_info->stmt);
   tree vec_dest = vect_create_destination_var (scalar_dest, vectype);
-  auto_vec vec_oprnds;
   auto_vec new_phis;
   for (unsigned i = 0; i < gimple_phi_num_args (stmt_info->stmt); ++i)
 {
@@ -7581,6 +7580,7 @@ vectorizable_phi (vec_info *,
  && SLP_TREE_VEC_STMTS (child).is_empty ())
continue;
 
+  auto_vec vec_oprnds;
   vect_get_slp_defs (SLP_TREE_CHILDREN (slp_node)[i], &vec_oprnds);
   if (!new_phis.exists ())
{
-- 
2.26.2
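
The second hunk relies on scoping: a vector declared inside the loop body is
destroyed on every iteration, including ones that leave early. A generic
sketch of that idea follows. The type tracked_buf and the function process
are hypothetical and unrelated to GCC's vec or auto_vec; the counter exists
only to make leaks observable:

```cpp
#include <cassert>

// Counts buffers that are currently alive; for illustration only.
int live_buffers = 0;

// Hypothetical stand-in for a heap-backed vector with RAII cleanup.
struct tracked_buf {
  tracked_buf() { ++live_buffers; }
  ~tracked_buf() { --live_buffers; }
  tracked_buf(const tracked_buf &) = delete;
  tracked_buf &operator=(const tracked_buf &) = delete;
};

// Declaring the buffer inside the loop body means each iteration,
// including those that `continue` early, runs the destructor.
int process(int n) {
  int handled = 0;
  for (int i = 0; i < n; ++i) {
    tracked_buf operands;  // acquired per iteration
    if (i % 2 == 0)
      continue;            // early exit still releases the buffer
    ++handled;
  }
  return handled;
}
```

The first hunk takes the other route and releases the operands explicitly
before the early continue, which is needed when the container has no
destructor doing it.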


Re: Avoid char[] array in tree_def

2020-10-29 Thread Richard Biener
On Thu, 29 Oct 2020, Jan Hubicka wrote:

> > 
> > That's ugly and will for sure defeat warning / access code
> > when we access this as char[], no?  I mean, we could
> > as well use 'int str[1];' here?
> 
> Well, we always get char pointer via macro that is IMO OK, but I am also
> not very much in love with this.
> > 
> > Maybe we can invent some C++ attribute for this?
> > 
> > [[gnu::string]]
> > 
> > or so that marks it as actual char and not typeless storage?
> 
> Attribute would probably make sense.  Not sure if gnu::string is best
> name given that it can be also meaningful for array of small integers
> (such as in wide_int).

OK, maybe [[gnu::strictly_typed]] then?

> Honza
> > 
> > Richard.
> > 
> > >  };
> > >  
> > >  struct GTY(()) tree_complex {
> > > diff --git a/gcc/tree.c b/gcc/tree.c
> > > index 81f867ddded..84115630184 100644
> > > --- a/gcc/tree.c
> > > +++ b/gcc/tree.c
> > > > @@ -2273,7 +2273,7 @@ build_string (unsigned len, const char *str /*= NULL */)
> > >  memcpy (s->string.str, str, len);
> > >else
> > >  memset (s->string.str, 0, len);
> > > -  s->string.str[len] = '\0';
> > > +  s->string.str[len].c = '\0';
> > >  
> > >return s;
> > >  }
> > > 
> > 
> > -- 
> > Richard Biener 
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imend
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: Avoid char[] array in tree_def

2020-10-29 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 29, 2020 at 05:00:40PM +0100, Jan Hubicka wrote:
> > 
> > That's ugly and will for sure defeat warning / access code
> > when we access this as char[], no?  I mean, we could
> > as well use 'int str[1];' here?
> 
> Well, we always get char pointer via macro that is IMO OK, but I am also
> not very much in love with this.

Do we treat signed char [...]; as typeless storage too, or just
what the C++ standard requires (i.e. char, unsigned char and std::byte
where the last one is enum type with unsigned char underlying type)?
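
As background for the types named here, a short C++17 sketch of why they are
special: any object's representation may be examined through them. The
function nonzero_bytes is a made-up example, not GCC code, and it says
nothing about how GCC's alias sets classify signed char:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Count nonzero bytes in the object representation of v.
// Reading an object through std::byte (or char / unsigned char) lvalues
// is permitted regardless of the object's own type.
inline int nonzero_bytes(const std::uint32_t &v) {
  const std::byte *p = reinterpret_cast<const std::byte *>(&v);
  int n = 0;
  for (std::size_t i = 0; i < sizeof v; ++i)
    if (p[i] != std::byte{0})
      ++n;
  return n;
}
```

The count does not depend on byte order, since it looks only at how many
bytes are nonzero.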

Jakub



Re: [PATCH] c++: Fix up constexpr evaluation of arguments passed by invisible reference [PR97388]

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/29/20 11:40 AM, Jakub Jelinek wrote:
> On Thu, Oct 29, 2020 at 11:09:05AM -0400, Jason Merrill wrote:
>> I think this isn't enough; if bar calls foo twice, the second call will find
>> the value in the hash table and not change the temporary, so the destructor
>> will throw.  I think we also need to set non_constant_args if the argument
>> type has a non-trivial destructor, so we don't try to memoize the call.
>
> For the testcases with constexpr new in there it wouldn't be memoized
> already, but you are right that for other cases making it non_constant_args
> is desirable.
>
>> Then setting arg back to orig_arg isn't needed, because we don't do the
>> first unsharing for the hash table.
>
> Yes.
>
>> And then I think that the second unsharing is unnecessary for an argument in
>> a non-memoized call, because we will have already unshared it when making
>> the copy in the caller.
>
> I'm not sure about this one, but if it works, I'm not against that, the less
> unsharing the better for compile time memory unless it breaks stuff.
>
> I'll bootstrap/regtest your patchset (or do you want to do that)?

I already did, thanks.

Jason



Re: [PATCH 1/2] c++: Tolerate empty initial targs during normalization [PR97412]

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/29/20 11:21 AM, Patrick Palka wrote:

On Mon, 19 Oct 2020, Patrick Palka wrote:


When normalizing the constraint-expression of a nested-requirement, we
pass NULL_TREE as the initial template arguments for normalization, but
tsubst_argument_pack is not prepared to handle a NULL_TREE targ vector.
This causes us to ICE when normalizing a variadic concept as part of a
nested-requirement.

This patch fixes the ICE by guarding the call to tsubst_template_args in
normalize_concept_check appropriately.  This will also enable us to
simplify many of the normalization routines to pass NULL_TREE instead of
a set of generic template arguments as the initial template arguments,
which will be done in a subsequent patch.


Ping.  For some reason I confusingly referred to 'targs' in the commit
message when the variable in question is actually 'args'.  So I've
adjusted the commit message below:


OK.


-- >8 --

Subject: [PATCH] c++: Tolerate empty initial args during normalization
  [PR97412]

When normalizing the constraint-expression of a nested-requirement, we
pass NULL_TREE as the initial template arguments for normalization, but
tsubst_argument_pack is not prepared to handle a NULL_TREE args vector.
This causes us to ICE when normalizing a variadic concept as part of a
nested-requirement.

This patch fixes the ICE by guarding the call to tsubst_template_args in
normalize_concept_check appropriately.  This will also enable us to
simplify many of the normalization routines to just pass NULL_TREE
(instead of a set of generic template arguments) as the initial template
arguments.

gcc/cp/ChangeLog:

PR c++/97412
* constraint.cc (normalize_concept_check): Don't call
tsubst_template_args when 'args' is NULL.

gcc/testsuite/ChangeLog:

PR c++/97412
* g++.dg/cpp2a/concepts-variadic2.C: New test.
---
  gcc/cp/constraint.cc|  3 ++-
  gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C | 12 
  2 files changed, 14 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index f4f5174eff3..75457a2dd60 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -686,7 +686,8 @@ normalize_concept_check (tree check, tree args, norm_info info)
  }
  
/* Substitute through the arguments of the concept check. */

-  targs = tsubst_template_args (targs, args, info.complain, info.in_decl);
+  if (args)
+targs = tsubst_template_args (targs, args, info.complain, info.in_decl);
if (targs == error_mark_node)
  return error_mark_node;
  
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C b/gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C
new file mode 100644
index 000..ce61aef5481
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-variadic2.C
@@ -0,0 +1,12 @@
+// PR c++/97412
+// { dg-do compile { target c++20 } }
+
+template 
+concept call_bar_with = requires(T t, TArgs... args) {
+  t.bar(args...);
+};
+
+template 
+concept foo = requires {
+  requires call_bar_with;
+};





Re: [PATCH 2/2] c++: Clean up constraint normalization routines

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/19/20 6:08 PM, Patrick Palka wrote:

Many of the high-level constraint normalization routines allow the
caller to supply the initial template arguments for normalization, but
in practice all of the callers ultimately supply either NULL_TREE or a
set of generic template arguments (*).  Since the previous patch made
NULL_TREE act like a set of generic template arguments during
normalization, we can just make get_normalized_constraints pass
NULL_TREE to normalize_expression and remove the 'args' parameter from
the routines that wrap it.

(*): Except one of the overloads of normalize_constraint_expression uses
the template arguments of a concept check for normalizing the concept
check, which doesn't seem right, since if a substitution failure happens
during normalization then it will become a hard error instead of a
SFINAE error.  This patch does away with this overload.

Bootstrapped and regtested on x86_64-pc-linux-gnu and tested on the
cmcstl2 testsuite.  Also verified that the concepts diagnostics remain
unchanged across our testsuite.  Does this look OK to commit?


OK.


gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints): Remove 'args'
parameter.  Pass NULL_TREE as the initial template arguments to
normalize_expression.
(get_normalized_constraints_from_info): Remove 'args' parameter
and adjust the call to get_normalized_constraints.
(get_normalized_constraints_from_decl): Remove 'args' local
variable and adjust call to get_normalized_constraints_from_info.
(normalize_concept_definition): Remove 'args' local variable
and adjust call to get_normalized_constraints.
(normalize_constraint_expression): Remove the two-argument
overload.  Remove 'args' parameter from the three-argument
overload and update function comment accordingly.  Remove
default argument from 'diag' parameter. Adjust call to
get_normalized_constraints accordingly.
(finish_nested_requirement): Adjust call to
normalize_constraint_expression accordingly.
(strictly_subsumes): Remove 'args' parameter.  Adjust call to
get_normalized_constraints_from_info accordingly.
(weakly_subsumes): Likewise.
* cp-tree.h (strictly_subsumes): Remove 'args' parameter.
(weakly_subsumes): Likewise.
* pt.c (process_partial_specialization): Adjust call to
strictly_subsumes accordingly.
(is_compatible_template_arg): Adjust call to weakly_subsumes
accordingly.
---
  gcc/cp/constraint.cc | 69 +---
  gcc/cp/cp-tree.h |  4 +--
  gcc/cp/pt.c  |  5 ++--
  3 files changed, 24 insertions(+), 54 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 75457a2dd60..d6354edbe6f 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -759,20 +759,18 @@ normalize_expression (tree t, tree args, norm_info info)
  static GTY((deletable)) hash_map *normalized_map;
  
  static tree

-get_normalized_constraints (tree t, tree args, norm_info info)
+get_normalized_constraints (tree t, norm_info info)
  {
auto_timevar time (TV_CONSTRAINT_NORM);
-  return normalize_expression (t, args, info);
+  return normalize_expression (t, NULL_TREE, info);
  }
  
  /* Returns the normalized constraints from a constraint-info object

-   or NULL_TREE if the constraints are null. ARGS provide the initial
-   arguments for normalization and IN_DECL provides the declaration
-   to which the constraints belong.  */
+   or NULL_TREE if the constraints are null. IN_DECL provides the
+   declaration to which the constraints belong.  */
  
  static tree

-get_normalized_constraints_from_info (tree ci, tree args, tree in_decl,
- bool diag = false)
+get_normalized_constraints_from_info (tree ci, tree in_decl, bool diag = false)
  {
if (ci == NULL_TREE)
  return NULL_TREE;
@@ -780,8 +778,7 @@ get_normalized_constraints_from_info (tree ci, tree args, tree in_decl,
/* Substitution errors during normalization are fatal.  */
++processing_template_decl;
norm_info info (in_decl, diag ? tf_norm : tf_none);
-  tree t = get_normalized_constraints (CI_ASSOCIATED_CONSTRAINTS (ci),
-  args, info);
+  tree t = get_normalized_constraints (CI_ASSOCIATED_CONSTRAINTS (ci), info);
--processing_template_decl;
  
return t;

@@ -843,9 +840,8 @@ get_normalized_constraints_from_decl (tree d, bool diag = false)
  
push_nested_class_guard pncs (DECL_CONTEXT (d));
  
-  tree args = generic_targs_for (tmpl);

tree ci = get_constraints (decl);
-  tree norm = get_normalized_constraints_from_info (ci, args, tmpl, diag);
+  tree norm = get_normalized_constraints_from_info (ci, tmpl, diag);
  
if (!diag)

  hash_map_safe_put (normalized_map, tmpl, norm);
@@ -866,11 +862,10 @@ normalize_concept_definition (tree tmpl, bool diag = 

Re: [PATCH] libstdc++: Add c++2a <syncstream>

2020-10-29 Thread Jonathan Wakely via Gcc-patches

On 21/10/20 09:53 -0700, Thomas Rodgers wrote:

From: Thomas Rodgers 

libstdc++/Changelog:
libstdc++-v3/doc/doxygen/user.cfg.in (INPUT): Add new header.
libstdc++-v3/include/Makefile.am (std_headers): Add new header.
libstdc++-v3/include/Makefile.in: Regenerate.
libstdc++-v3/include/precompiled/stdc++.h: Include new header.
libstdc++-v3/include/std/streambuf
   (__detail::__streambuf_core_access): Define.
   (basic_streambuf): Befriend __detail::__streambuf_core_access.
libstdc++-v3/include/std/syncstream: New header.
libstdc++-v3/include/std/version: Add __cpp_lib_syncbuf.
libstdc++-v3/testsuite/27_io/basic_syncbuf/1.cc: New test.
libstdc++-v3/testsuite/27_io/basic_syncbuf/2.cc: Likewise.
libstdc++-v3/testsuite/27_io/basic_syncbuf/basic_ops/1.cc:
   Likewise.
libstdc++-v3/testsuite/27_io/basic_syncbuf/requirements/types.cc:
   Likewise.
libstdc++-v3/testsuite/27_io/basic_syncbuf/sync_ops/1.cc:
   Likewise.
libstdc++-v3/testsuite/27_io/basic_syncstream/1.cc: Likewise.
libstdc++-v3/testsuite/27_io/basic_syncstream/2.cc: Likewise.
libstdc++-v3/testsuite/27_io/basic_syncstream/basic_ops/1.cc:
   Likewise.
libstdc++-v3/testsuite/27_io/basic_syncstream/requirements/types.cc:
   Likewise.

---
libstdc++-v3/doc/doxygen/user.cfg.in  |   1 +
libstdc++-v3/include/Makefile.am  |   1 +
libstdc++-v3/include/Makefile.in  |   1 +
libstdc++-v3/include/precompiled/stdc++.h |   2 +-
libstdc++-v3/include/std/syncstream   | 279 ++
libstdc++-v3/include/std/version  |   4 +
.../testsuite/27_io/basic_syncbuf/1.cc|  28 ++
.../testsuite/27_io/basic_syncbuf/2.cc|  27 ++
.../27_io/basic_syncbuf/basic_ops/1.cc| 138 +
.../27_io/basic_syncbuf/requirements/types.cc |  42 +++
.../27_io/basic_syncbuf/sync_ops/1.cc | 130 
.../testsuite/27_io/basic_syncstream/1.cc |  28 ++
.../testsuite/27_io/basic_syncstream/2.cc |  27 ++
.../27_io/basic_syncstream/basic_ops/1.cc | 135 +
.../basic_syncstream/requirements/types.cc|  43 +++
15 files changed, 885 insertions(+), 1 deletion(-)
create mode 100644 libstdc++-v3/include/std/syncstream
create mode 100644 libstdc++-v3/testsuite/27_io/basic_syncbuf/1.cc
create mode 100644 libstdc++-v3/testsuite/27_io/basic_syncbuf/2.cc
create mode 100644 libstdc++-v3/testsuite/27_io/basic_syncbuf/basic_ops/1.cc
create mode 100644 libstdc++-v3/testsuite/27_io/basic_syncbuf/requirements/types.cc
create mode 100644 libstdc++-v3/testsuite/27_io/basic_syncbuf/sync_ops/1.cc
create mode 100644 libstdc++-v3/testsuite/27_io/basic_syncstream/1.cc
create mode 100644 libstdc++-v3/testsuite/27_io/basic_syncstream/2.cc
create mode 100644 libstdc++-v3/testsuite/27_io/basic_syncstream/basic_ops/1.cc
create mode 100644 libstdc++-v3/testsuite/27_io/basic_syncstream/requirements/types.cc

diff --git a/libstdc++-v3/doc/doxygen/user.cfg.in b/libstdc++-v3/doc/doxygen/user.cfg.in
index 9b49a15d31b..320f6dea688 100644
--- a/libstdc++-v3/doc/doxygen/user.cfg.in
+++ b/libstdc++-v3/doc/doxygen/user.cfg.in
@@ -897,6 +897,7 @@ INPUT  = @srcdir@/doc/doxygen/doxygroups.cc \
 include/streambuf \
 include/string \
 include/string_view \
+ include/syncstream \
 include/system_error \
 include/thread \
 include/tuple \
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 28d273924ee..61aaff7a2f4 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -73,6 +73,7 @@ std_headers = \
${std_srcdir}/shared_mutex \
${std_srcdir}/span \
${std_srcdir}/sstream \
+   ${std_srcdir}/syncstream \
${std_srcdir}/stack \
${std_srcdir}/stdexcept \
${std_srcdir}/stop_token \
diff --git a/libstdc++-v3/include/precompiled/stdc++.h b/libstdc++-v3/include/precompiled/stdc++.h
index 7518a98c25a..8899c323a28 100644
--- a/libstdc++-v3/include/precompiled/stdc++.h
+++ b/libstdc++-v3/include/precompiled/stdc++.h
@@ -141,6 +141,6 @@
#include 
#include 
#include 
-// #include 
+#include 
#include 
#endif
diff --git a/libstdc++-v3/include/std/syncstream b/libstdc++-v3/include/std/syncstream
new file mode 100644
index 000..3f78cef1d8d
--- /dev/null
+++ b/libstdc++-v3/include/std/syncstream
@@ -0,0 +1,279 @@
+//  -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// T

Re: [PATCH][middle-end][i386][version 5]Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-gpr-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-29 Thread Qing Zhao via Gcc-patches
Hi, Richard,


The documentation (gcc.info) now reads as follows; let me know if you see any
issues:

thanks.

Qing
==

'zero_call_used_regs ("CHOICE")'

 The 'zero_call_used_regs' attribute causes the compiler to zero a
 subset of all call-used registers(1) at function return.  This is
 used to increase program security by either mitigating
 Return-Oriented Programming (ROP) or preventing information leakage
 through registers.

 In order to satisfy users with different security needs and control
 the run-time overhead at the same time, CHOICE parameter provides a
 flexible way to choose the subset of the call-used registers to be
 zeroed.

 The three basic values of CHOICE are:

* 'skip' doesn't zero any call-used registers.

* 'used' only zeros call-used registers that are used in the
  function.  A "used" register is one whose content has been set
  or referenced in the function.

* 'all' zeros all call-used registers.

 In addition to these three basic choices, it is possible to modify
 'used' or 'all' as follows:

* Adding '-gpr' restricts the zeroing to general-purpose
  registers.

* Adding '-arg' restricts the zeroing to registers that can
  sometimes be used to pass function arguments.  This includes
  all argument registers defined by the platform's calling
  convention, regardless of whether the function uses those
  registers for function arguments or not.

 The modifiers can be used individually or together.  If they are
 used together, they must appear in the order above.

 The full list of CHOICEs is therefore:

 'skip'
  doesn't zero any call-used register.

 'used'
  only zeros call-used registers that are used in the function.

 'used-gpr'
  only zeros call-used general purpose registers that are used
  in the function.

 'used-arg'
  only zeros call-used registers that are used in the function
  and pass arguments.

 'used-gpr-arg'
  only zeros call-used general purpose registers that are used
  in the function and pass arguments.

 'all'
  zeros all call-used registers.

 'all-gpr'
  zeros all call-used general purpose registers.

 'all-arg'
  zeros all call-used registers that pass arguments.

 'all-gpr-arg'
  zeros all call-used general purpose registers that pass
  arguments.
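
As an illustration of the naming scheme above, here is a minimal C sketch
that checks whether a string is one of the nine CHOICE values, using the rule
"base name, then an optional '-gpr', then an optional '-arg'".  The function
name is_valid_choice is hypothetical; this is not the parser used in GCC.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not GCC code: return true if S is one of the
   nine CHOICE values listed above.  */
static bool
is_valid_choice (const char *s)
{
  if (strcmp (s, "skip") == 0)
    return true;

  /* The base must be 'used' or 'all'.  */
  if (strncmp (s, "used", 4) == 0)
    s += 4;
  else if (strncmp (s, "all", 3) == 0)
    s += 3;
  else
    return false;

  /* Optional modifiers, in the documented order: '-gpr', then '-arg'.  */
  if (strncmp (s, "-gpr", 4) == 0)
    s += 4;
  if (strncmp (s, "-arg", 4) == 0)
    s += 4;

  /* Anything left over (including modifiers in the wrong order) is
     rejected.  */
  return *s == '\0';
}
```

With this rule, "used-gpr-arg" is accepted while "used-arg-gpr" is rejected,
which matches the ordering requirement stated above.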

 Of this list, 'used-arg', 'used-gpr-arg', 'all-arg', and
 'all-gpr-arg' are mainly used for ROP mitigation.

 The default for the attribute is controlled by
 '-fzero-call-used-regs'.

   -- Footnotes --

   (1) A "call-used" register is a register whose contents can be
changed by a function call; therefore, a caller cannot assume that the
register has the same contents on return from the function as it had
before calling the function.  Such registers are also called
"call-clobbered", "caller-saved", or "volatile".


'-fzero-call-used-regs=CHOICE'
 Zero call-used registers at function return to increase program
 security by either mitigating Return-Oriented Programming (ROP) or
 preventing information leakage through registers.

 The possible values of CHOICE are the same as for the
 'zero_call_used_regs' attribute (*note Function Attributes::).  The
 default is 'skip'.

 You can control this behavior for a specific function by using the
 function attribute 'zero_call_used_regs' (*note Function
 Attributes::).
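
For illustration, a minimal sketch of how the attribute described above might
be applied to a single function.  The function check_pin and the wrapper macro
ZERO_REGS are hypothetical names; the macro expands to nothing on compilers
that do not know the attribute, so the code still builds there.

```c
#include <string.h>

/* Fall back to "no attribute" on compilers that lack it.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute (zero_call_used_regs)
# define ZERO_REGS(choice) __attribute__ ((zero_call_used_regs (choice)))
#else
# define ZERO_REGS(choice)
#endif

/* Hypothetical example: ask for the call-used general-purpose registers
   that this function touched to be zeroed before it returns.  The
   return-value register is not affected.  */
ZERO_REGS ("used-gpr")
int
check_pin (const char *pin)
{
  return strcmp (pin, "1234") == 0;
}
```

The same policy could be requested for a whole translation unit with
-fzero-call-used-regs=used-gpr, leaving the attribute for per-function
overrides.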



Re: PowerPC: Add __float128 conversions to/from Decimal

2020-10-29 Thread Michael Meissner via Gcc-patches
On Wed, Oct 28, 2020 at 07:04:31PM -0500, Segher Boessenkool wrote:
> On Thu, Oct 22, 2020 at 06:06:03PM -0400, Michael Meissner wrote:
> > This patch adds the various decimal to/from IEEE 128-bit conversions.  I
> > had to make some changes to the infrastructure, since that infrastructure
> > assumed that there is a sprintf/scanf format modifier to convert floating
> > point.  Instead, I used the str* conversion functions.
> 
> > --- /dev/null
> > +++ b/libgcc/config/rs6000/_dd_to_kf.c
> 
> > +/* Decimal64 -> _Float128 conversion.  */
> > +#define FINE_GRAINED_LIBRARIES 1
> 
> This isn't defined in any other source file (instead, it is put in the
> Makefile).  Why should it be different here?

I'll check it out.

> > +# Force the TF mode to/from decimal functions to be compiled with IBM long
> > +# double.  Add building the KF mode to/from decimal conversions with 
> > explict
> 
> (typo, "explicit")
> 
> > +#if HAVE_KF_MODE
> > +  strfromf128 (buf, BUFMAX, BFP_FMT, (BFP_VIA_TYPE) x);
> > +#else
> >sprintf (buf, BFP_FMT, (BFP_VIA_TYPE) x);
> > +#endif
> 
> Does strfromf128 exist everywhere we build this?  It isn't a standard
> function.

Yes, it is in ISO/IEC TS 18661-3, which is the document that describes most of
the *f128 functions.

We have to use str* instead of sprintf or scanf, because I don't believe there
is a float128 format specifier.
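
A rough sketch of the difference, outside libgcc.  It assumes a glibc that
provides strfromf128 (declared in <stdlib.h> when
__STDC_WANT_IEC_60559_TYPES_EXT__ is defined) and uses glibc's
__HAVE_FLOAT128 macro as the availability test; on other systems it falls
back to snprintf with a double.  The function name format_value and the
format string are made up for the example.

```c
#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFMAX 128

/* Hypothetical helper: format the value 1.5 into BUF.  With glibc's
   _Float128 support this goes through strfromf128, which takes the
   value directly and therefore needs no printf length modifier.  */
static int
format_value (char *buf, size_t n)
{
#if defined (__GLIBC__) && defined (__HAVE_FLOAT128) && __HAVE_FLOAT128
  _Float128 x = 1.5;
  return strfromf128 (buf, n, "%.3e", x);
#else
  /* Assumption: no _Float128 support here, so use a plain double.  */
  return snprintf (buf, n, "%.3e", 1.5);
#endif
}
```

Note that the strfrom* functions accept only a single conversion of the form
%[.precision] followed by one of a, e, f, g (or the uppercase variants).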

> > +/* Support PowerPC KF mode, which is __float128 when long double is
> > +   IBM extended double.  */
> > +#if defined (L_sd_to_kf) || defined (L_dd_to_kf) || defined (L_td_to_kf) \
> > + || defined (L_kf_to_sd) || defined (L_kf_to_dd) || defined (L_kf_to_td)
> > +#define HAVE_KF_MODE 1
> > +#endif
> 
> This might want a better name, other targets can have a KFmode as well,
> for some completely different purpose, since it is not a standard mode.

Given everything else uses *F, including XF on the x86, I figured it was easier
than creating a new name.

> (Some libgcc maintainer needs to approve the generic parts, not all of
> it can obviously only trigger for us.)
> 
> 
> Segher

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: PowerPC: Update IEEE 128-bit built-ins for long double is IEEE 128-bit.

2020-10-29 Thread Michael Meissner via Gcc-patches
On Tue, Oct 27, 2020 at 09:38:20AM -0500, will schmidt wrote:
> > @@ -2420,6 +2423,8 @@ BU_P9V_64BIT_VSX_2 (VSIEDPF,  "scalar_insert_exp_dp", 
> > CONST,  xsiexpdpf)
> > 
> >  BU_FLOAT128_HW_VSX_2 (VSIEQP,  "scalar_insert_exp_q",  CONST,  
> > xsiexpqp_kf)
> >  BU_FLOAT128_HW_VSX_2 (VSIEQPF, "scalar_insert_exp_qp", CONST,  
> > xsiexpqpf_kf)
> > +BU_FLOAT128_HW_VSX_2 (VSIETF,  "scalar_insert_exp_tf", CONST,  
> > xsiexpqp_tf)
> > +BU_FLOAT128_HW_VSX_2 (VSIETFF, "scalar_insert_exp_tfp", CONST, 
> > xsiexpqpf_tf)
> 
> Ok if its ok, but the pattern catches my eye.  Should that be VSIETFP ?
> (or named "scalar_insert_exp_tff")?

That is the existing function in the library.  All I'm doing is adding TF
versions of the existing functions.



Re: PowerPC: Update IEEE 128-bit built-ins for long double is IEEE 128-bit.

2020-10-29 Thread Michael Meissner via Gcc-patches
On Wed, Oct 28, 2020 at 06:27:42PM -0500, Segher Boessenkool wrote:
> On Thu, Oct 22, 2020 at 06:09:38PM -0400, Michael Meissner wrote:
> > This patch adds long double variants of the power10 __float128 built-in
> > functions.  This is needed when long double uses IEEE 128-bit because
> > __float128 uses TFmode in this case instead of KFmode.  If this patch is not
> > applied, these built-in functions can't be used when long double is IEEE
> > 128-bit.
> 
> But now they still cannot, you need new builtins, instead.
> 
> TFmode is an implementation detail at this level (functions use types,
> not modes), so you do not need new builtins at all afaics?  Just define
> the existing ones with TFmode as well (if that is the same as KFmode)?

In order to add new overloaded built-ins, you have to add a new built-in with a
new name.  Hence I have to add TF variants for these functions when __float128
is the same as long double.

Maybe when Bill finally reorganizes the built-in functions, we can do away
with having to create new named functions.  But for now, in order to add them,
you need a name.



Re: PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.

2020-10-29 Thread Michael Meissner via Gcc-patches
On Wed, Oct 28, 2020 at 04:58:46PM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Thu, Oct 22, 2020 at 06:10:37PM -0400, Michael Meissner wrote:
> > PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.
> 
> This title makes no sense, and thankfully is not what the patch does :-)

Thanks, every so often I accidentally type __ieee128 instead of __ibm128.

> > This patch changes the __ibm128 emulator to use __builtin_pack_ieee128
> > instead of __builtin_pack_longdouble if long double is IEEE 128-bit, and
> > we need to use the __ibm128 type.  The code will run without this patch,
> > but this patch slightly optimizes it better.
> 
> It uses __builtin_pack_ibm128, instead?

Yes.

> > libgcc/
> > 2020-10-22  Michael Meissner  
> > 
> > * config/rs6000/ibm-ldouble.c (pack_ldouble): Use
> > __builtin_pack_ieee128 if long double is IEEE 128-bit.
> 
> Here, too.
> 
> > ---
> >  libgcc/config/rs6000/ibm-ldouble.c | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/libgcc/config/rs6000/ibm-ldouble.c 
> > b/libgcc/config/rs6000/ibm-ldouble.c
> > index dd2a02373f2..767fdd72683 100644
> > --- a/libgcc/config/rs6000/ibm-ldouble.c
> > +++ b/libgcc/config/rs6000/ibm-ldouble.c
> > @@ -102,9 +102,17 @@ __asm__ (".symver __gcc_qadd,_xlqadd@GCC_3.4\n\t"
> >  static inline IBM128_TYPE
> >  pack_ldouble (double dh, double dl)
> >  {
> > +  /* If we are building on a non-VSX system, the __ibm128 type is not 
> > defined.
> 
> "Building on" does not matter in the least.  The compiler should
> generate the same code, no matter what it runs on.  Target matters, not
> host (and build not at all).

Yes.

> > + This means we can't always use __builtin_pack_ibm128.  Instead, we use
> > + __builtin_pack_longdouble if long double uses the IBM extended double
> > + 128-bit format, and use the explicit __builtin_pack_ibm128 if long 
> > double
> > + is IEEE 128-bit.  */
> 
> And this comment is about the *next* case?
> 
> >  #if defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IBM128__)  
> > \
> >  && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> >return __builtin_pack_longdouble (dh, dl);
> > +#elif defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IEEE128__) \
> > +&& !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> > +  return __builtin_pack_ibm128 (dh, dl);
> 
> Given the above, _SOFT_FLOAT etc. are wrong.
> 
> Just use some more portable thing to repack?  Is __builtin_pack_ibm128
> not defined always here anyway?

That is the problem.  If you build a big endian PowerPC compiler where VSX is
not default, the __ibm128 stuff is not defined.  It is only defined when
__float128 is a possibility.  Hence __builtin_pack_ibm128 and
__builtin_unpack_ibm128 are not defined.
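
For readers unfamiliar with the macros tested in the quoted hunk, a small
sketch (not libgcc code) of how a program could report which long double
format the target uses.  __LONG_DOUBLE_IBM128__ and __LONG_DOUBLE_IEEE128__
are the predefined macros from the hunk above; the function names are
hypothetical.  On targets that define neither macro the sketch just reports
"other".

```c
#include <float.h>

/* Hypothetical helper: name the long double format selected at compile
   time, based on the PowerPC predefined macros.  */
static const char *
long_double_format (void)
{
#if defined (__LONG_DOUBLE_IBM128__)
  return "IBM extended double";
#elif defined (__LONG_DOUBLE_IEEE128__)
  return "IEEE 128-bit";
#else
  return "other";
#endif
}

/* Number of significand bits in long double: 106 for the IBM
   double-double format, 113 for IEEE binary128.  */
static int
long_double_mantissa_bits (void)
{
  return LDBL_MANT_DIG;
}
```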

> /* 128-bit __ibm128 floating point builtins (use -mfloat128 to indicate that
>__ibm128 is available).  */
> #define BU_IBM128_2(ENUM, NAME, ATTR, ICODE)\
>   RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
> "__builtin_" NAME,  /* NAME */  \
> (RS6000_BTM_HARD_FLOAT  /* MASK */  \
>  | RS6000_BTM_FLOAT128),\
> (RS6000_BTC_ ## ATTR/* ATTR */  \
>  | RS6000_BTC_BINARY),  \
> CODE_FOR_ ## ICODE) /* ICODE */
> 
> (so just HARD_FLOAT and FLOAT128 are needed)
> 
> What am I missing?

As I said, the __ibm128 keyword is not enabled on non-VSX systems.



Re: PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.

2020-10-29 Thread Michael Meissner via Gcc-patches
On Tue, Oct 27, 2020 at 09:30:03AM -0500, will schmidt wrote:
> On Thu, 2020-10-22 at 18:10 -0400, Michael Meissner via Gcc-patches wrote:
> > PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.
> > 
> > I have split all of these patches into separate patches to hopefully get 
> > them
> > into the tree.
> > 
> > This patch changes the __ibm128 emulator to use __builtin_pack_ieee128
> > instead of __builtin_pack_longdouble if long double is IEEE 128-bit, and
> > we need to use the __ibm128 type.  The code will run without this patch,
> > but this patch slightly optimizes it better.
> > 
> > I have tested this patch with bootstrap builds on a little endian power9 
> > system
> > running Linux.  With the other patches, I have built two full bootstrap 
> > builds
> > using this patch and the patches after this patch.  One build used the 
> > current
> > default for long double (IBM extended double) and the other build switched 
> > the
> > default to IEEE 128-bit.  I used the Advance Toolchain AT 14.0 compiler as 
> > the
> > library used by this compiler.  There are no regressions between the tests.
> > There are 3 fortran benchmarks (ieee/large_2.f90, default_format_2.f90, and
> > default_format_denormal_2.f90) that now pass.
> 
> good. :-)  A quick search of gcc bugzilla shows there is an existing
> PR 67531 that includes ieee rounding support for powerpc long double. 
> Does this (partially?) address that? 
>   
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67531

In theory, once the full system uses IEEE 128-bit floating point for long
double, all of the various rounding issues will be fixed.

However, we have to get to that step, and this is just one of a long line of
intermediate steps to get to that goal.



Re: Avoid char[] array in tree_def

2020-10-29 Thread Jan Hubicka
> On Thu, Oct 29, 2020 at 05:00:40PM +0100, Jan Hubicka wrote:
> > > 
> > > That's ugly and will for sure defeat warning / access code
> > > when we access this as char[], no?  I mean, we could
> > > as well use 'int str[1];' here?
> > 
> > Well, we always get char pointer via macro that is IMO OK, but I am also
> > not very much in love with this.
> 
> Do we treat signed char [...]; as typeless storage too, or just
> what the C++ standard requires (i.e. char, unsigned char and std::byte
> where the last one is enum type with unsigned char underlying type)?
struct a {signed char b[10];int d;} c;
void
test ()
{
  c.d=1;
}

still leads to alias set 0 access, so perhaps this can be improved.
Where does the standard specify this? (Also, coincidentally, I have no idea
where C++ sets typeless storage to 1 :)
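
For context, a minimal C sketch of the language rule behind the question.  As
far as I know, ISO C lets the bytes of any object be read through an lvalue of
character type, while C++ (per Jakub's message) limits this to char, unsigned
char and std::byte.  The helper name byte_sum is made up for the example.

```c
#include <stddef.h>

/* Sum the bytes of an int by reading its object representation through
   an unsigned char pointer, which the aliasing rules permit.  */
static unsigned
byte_sum (const int *p)
{
  const unsigned char *bytes = (const unsigned char *) p;
  unsigned sum = 0;

  for (size_t i = 0; i < sizeof *p; i++)
    sum += bytes[i];
  return sum;
}
```

Because any store through such a pointer may modify any object, a char array
member forces the conservative alias set 0 treatment that the test case above
shows.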

Honza
> 
>   Jakub
> 


Re: [PATCH][PR target/97540] Don't extract memory from operand for normal memory constraint.

2020-10-29 Thread Richard Sandiford via Gcc-patches
Hongtao Liu via Gcc-patches  writes:
> On Thu, Oct 29, 2020 at 2:46 AM Richard Sandiford
>  wrote:
>>
>> Hongtao Liu  writes:
>> > On Tue, Oct 27, 2020 at 7:13 PM Richard Sandiford
>> >  wrote:
>> >>
>> >> Hongtao Liu via Gcc-patches  writes:
>> >> > Hi:
>> >> >   For inline asm, there could be an operand like (not (mem:)), it's
>> >> > not a valid operand for normal memory constraint.
>> >> >   Bootstrap is ok, regression test is ok for make check
>> >> > RUNTESTFLAGS="--target_board='unix{-m32,}'"
>> >> >
>> >> > gcc/ChangeLog
>> >> > PR target/97540
>> >> > * ira.c: (ira_setup_alts): Extract memory from operand only
>> >> > for special memory constraint.
>> >> > * recog.c (asm_operand_ok): Ditto.
>> >> > * lra-constraints.c (process_alt_operands): MEM_P is
>> >> > required for normal memory constraint.
>> >> >
>> >> > gcc/testsuite/ChangeLog
>> >> > * gcc.target/i386/pr97540.c: New test.
>> >>
>> >> Sorry to stick my oar in, but I think we should reconsider the
>> >> bcst_mem_operand approach.  It seems like these patches (and the
>> >> previous one) are fighting against the principle that operands
>> >> cannot be arbitrary expressions.
>> >>
>> >> This kind of thing was attempted long ago (even before my time!)
>> >> for SIGN_EXTEND on MIPS.  It ended up causing more problems than
>> >> it solved and in the end it had to be taken out.  I'm worried that
>> >> we might end up going through the same cycle again.
>> >>
>> >> Also, this LRA code is extremely performance-sensitive in terms
>> >> of compile time: it's often at the top or near the top of the profile.
>> >> So adding calls to new functions like extract_mem_from_operand for
>> >> a fairly niche case probably isn't a good trade-off.
>> >>
>> >> I think we should instead find a nice(?) syntax for generating separate
>> >> patterns for the two bcst_vector_operand alternatives from a single
>> >> .md pattern.  That would fit the existing model much more closely.
>> >>
>> >
>> > We have define_subst for RTL template transformations, but it's not
>> > suitable for this case(too many define_subst templates need to
>> > be added, and it doesn't show any advantage compared to adding
>> > separate bcst patterns.). I don't find other workable existing syntax for 
>> > it.
>>
>> Yeah, I think it would need to be new syntax.  I was wondering if it
>> would help if we had something like (strawman suggestion):
>>
>>   (one_of 0
>> [(match_operand:VI_AVX2 1 "vector_operand" "...")
>>  (vec_duplicate:VI_AVX2
>>(match_operand:<...> 1 "..." "..."))]
>>
>> where all instances of (one_of N ...) for a given N are required
>> to have the same number of alternatives.
>>
>> This could be handled in a similar way to define_subst, with the
>> one_of being expanded before the main generator routines see it.
>>
>> But maybe it wouldn't help that much.  E.g. for:
>>
>> (define_insn "*3"
>>   [(set (match_operand:VI_AVX2 0 "register_operand" "=x,v")
>> (plusminus:VI_AVX2
>>   (match_operand:VI_AVX2 1 "bcst_vector_operand" "0,v")
>>   (match_operand:VI_AVX2 2 "bcst_vector_operand" "xBm,vmBr")))]
>>
>> the vec_duplicate version should only really have one alternative.
> It would be guaranteed by its attribute (set_attr "isa" "noavx,avx"),
> since bcst_mem_operand implies avx512f, which of course implies avx,
> therefore the first alternative would never be enabled under avx.

Ah, OK.

>> I guess we could handle that by using a:
>>
>>   (one_of 0
>> [(set_attr "enabled" "*,*")
>>  (set_attr "enabled" "0,*")])
>>
>> or some variant of that that uses a derived attribute.  But it feels
>> a bit clunky…
>>
>> Without that, I guess the only pattern that would benefit directly is:
>>
>> (define_insn "avx512dq_mul3"
>>   [(set (match_operand:VI8_AVX512VL 0 "register_operand" "=v")
>> (mult:VI8_AVX512VL
>>   (match_operand:VI8_AVX512VL 1 "bcst_vector_operand" "%v")
>>   (match_operand:VI8_AVX512VL 2 "bcst_vector_operand" "vmBr")))]
>>
>> > So suppose I should revert my former 2 patches and add separate bcst 
>> > patterns.
>>
>> Are there going to be more patterns that need bcst_vector_operand,
>
> Almost all AVX512 instructions need corresponding bcst patterns except
> for those with 8-bit/16-bit data elements.

OK, that changes things.  Sorry, not knowing the architecture,
I wasn't sure how far this was from being complete.

>> or is the current set complete?
>>
>> I definitely think we should have a better way of handling this in the
>> .md files, and I'd be happy to hack something up on the generator side
>> (given that I'm being the awkward one here).  But I guess the answer to
>> the question above will decide whether it makes things better or not.
>>
>> FWIW, I think having separate patterns (whether they're produced from
>> one .md construct or from several) might give better optimisation results.
>
> With proper extending for sp

Re: Avoid char[] array in tree_def

2020-10-29 Thread Jan Hubicka
> On Thu, 29 Oct 2020, Jan Hubicka wrote:
> 
> > > 
> > > That's ugly and will for sure defeat warning / access code
> > > when we access this as char[], no?  I mean, we could
> > > as well use 'int str[1];' here?
> > 
> > Well, we always get char pointer via macro that is IMO OK, but I am also
> > not very much in love with this.
> > > 
> > > Maybe we can invent some C++ attribute for this?
> > > 
> > > [[gnu::string]]
> > > 
> > > or so that marks it as actual char and not typeless storage?
> > 
> > Attribute would probably make sense.  Not sure if gnu::string is best
> > name given that it can be also meaningful for array of small integers
> > (such as in wide_int).
> 
> OK, maybe [[gnu::strictly_typed]] then?

This looks like good idea to me (and probably also making difference
with signed char as Jakub suggests).

I am adding Jason to CC, since he may know better.
Honza


Re: [PATCH][PR target/97540] Don't extract memory from operand for normal memory constraint.

2020-10-29 Thread Richard Sandiford via Gcc-patches
Hongtao Liu via Gcc-patches  writes:
> On Tue, Oct 27, 2020 at 7:13 PM Richard Sandiford
>  wrote:
>>
>> Hongtao Liu via Gcc-patches  writes:
>> > Hi:
>> >   For inline asm, there could be an operand like (not (mem:)), it's
>> > not a valid operand for normal memory constraint.
>> >   Bootstrap is ok, regression test is ok for make check
>> > RUNTESTFLAGS="--target_board='unix{-m32,}'"
>> >
>> > gcc/ChangeLog
>> > PR target/97540
>> > * ira.c: (ira_setup_alts): Extract memory from operand only
>> > for special memory constraint.
>> > * recog.c (asm_operand_ok): Ditto.
>> > * lra-constraints.c (process_alt_operands): MEM_P is
>> > required for normal memory constraint.
>> >
>> > gcc/testsuite/ChangeLog
>> > * gcc.target/i386/pr97540.c: New test.
>>
>> Sorry to stick my oar in, but I think we should reconsider the
>> bcst_mem_operand approach.  It seems like these patches (and the
>> previous one) are fighting against the principle that operands
>> cannot be arbitrary expressions.
>>
>> This kind of thing was attempted long ago (even before my time!)
>> for SIGN_EXTEND on MIPS.  It ended up causing more problems than
>> it solved and in the end it had to be taken out.  I'm worried that
>> we might end up going through the same cycle again.
>>
>
> Could you provide the thread link for the issue of SIGN_EXTEND on
> MIPS, then I can take a look to see if it's exactly the same issue as
> mine.

I couldn't find anything, sorry.  The patch that finally removed
the MIPS handling was:

  https://gcc.gnu.org/pipermail/gcc-patches/2002-October/088178.html

I know there was some discussion about the problems around then,
but some of it might have been private rather than on-list.
I can't remember the details now.

Thanks,
Richard


Re: PowerPC: Update __float128 and __ibm128 error messages.

2020-10-29 Thread Michael Meissner via Gcc-patches
On Tue, Oct 27, 2020 at 06:27:22PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Oct 22, 2020 at 06:11:35PM -0400, Michael Meissner wrote:
> > This patch attempts to make the error messages for intermixing IEEE 128-bit
> > floating point with IBM 128-bit extended double types to be clearer if the 
> > long
> > double type uses the IEEE 128-bit format.
> 
> > We have gotten some requests to back port these changes to GCC 10.x.  At the
> > moment, I am not planning to do the back port, but I may need to in the 
> > future.
> 
> Ping the patches if/when that happens?

Certainly.

> > +/* { dg-do compile { target { powerpc*-*-linux* } } } */
> 
> Use *-*-linux* instead?  (In all relevant tests.)

Ok.

> Is there any reason these tests should only run on Linux?  If not, it
> should not restrict itself like this; and if so, you may want another
> selector (something ieee128 perhaps), or at the very least add a
> comment why you do this.

Right now the float128 emulation is only built on Linux, because it needs the
support in GLIBC.  If/when other systems add support for float128 in their
C/C++ libraries, we can widen the tests.

> >  /* { dg-do compile { target { powerpc*-*-linux* } } } */
> > -/* { dg-require-effective-target powerpc_vsx_ok } */
> > -/* { dg-options "-O2 -mvsx" } */
> > +/* { dg-require-effective-target ppc_float128_sw } */
> 
> Removing powerpc_vsx_ok is wrong, you still use -mvsx.  That the only
> current soft float QP stuff requires VSX is irrelevant.
> 
> Please fix those everywhere.  Okay for trunk with that.  Thanks!

IIRC, these tests were added very early in the float128 cycle, before we had
the target supports for float128.



Improve vec::copy mem stat annotations

2020-10-29 Thread Jan Hubicka
Hi,
this patch annotates vec::copy so it shows better in stats.  I still do
not see how auto vecs get miscounted.

Bootstrapped/regtested x86_64-linux, committed.

Honza

* vec.h (vec::copy): Pass mem stat info.
diff --git a/gcc/vec.h b/gcc/vec.h
index 3ad26972a62..14d77e87342 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -1731,7 +1731,7 @@ vec::copy (ALONE_MEM_STAT_DECL) const
 {
   vec new_vec = vNULL;
   if (length ())
-new_vec.m_vec = m_vec->copy ();
+new_vec.m_vec = m_vec->copy (ALONE_PASS_MEM_STAT);
   return new_vec;
 }
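
To show why forwarding matters, a simplified C sketch of the same idea: the
wrapper passes the caller's file and line on to the allocator, so the
allocation is attributed to the call site instead of to the wrapper itself.
All names here (alloc_at, copy_ints_at, copy_ints, last_origin) are
hypothetical and are not GCC's statistics machinery.

```c
#include <stdlib.h>
#include <string.h>

/* Where the most recent allocation was requested from.  */
struct origin { const char *file; int line; };
static struct origin last_origin;

/* Allocator that records the origin it is given.  */
static void *
alloc_at (size_t n, const char *file, int line)
{
  last_origin.file = file;
  last_origin.line = line;
  return malloc (n);
}

/* Copy COUNT ints.  The caller's FILE and LINE are forwarded to the
   allocator; using __FILE__ and __LINE__ here instead would attribute
   every copy to this function.  */
static int *
copy_ints_at (const int *src, size_t count, const char *file, int line)
{
  int *dst = alloc_at (count * sizeof *dst, file, line);

  if (dst)
    memcpy (dst, src, count * sizeof *dst);
  return dst;
}

/* Capture the call site at the point of use.  */
#define copy_ints(src, count) \
  copy_ints_at ((src), (count), __FILE__, __LINE__)
```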
 


Re: PowerPC: Allow C/C++ to change long double type on GLIBC 2.32.

2020-10-29 Thread Michael Meissner via Gcc-patches
On Mon, Oct 26, 2020 at 05:48:48PM -0500, will schmidt wrote:
> On Thu, 2020-10-22 at 18:15 -0400, Michael Meissner via Gcc-patches wrote:
> > PowerPC: Allow C/C++ to change long double type on GLIBC 2.32.
> > 
> > This is a new patch.  It turns off the warning about switching the long 
> > double
> > type via compile line if the GLIBC is 2.32 or newer.  It only does this if 
> > the
> > languages are C or C++, since those language libraries support switching the
> > long double type.  Other languages like Fortran don't have any current 
> > support
> > to provide both sets of interfaces to the library.
> > 
> > 2020-10-21  Michael Meissner  
> > 
> > * config/rs6000/rs6000.c (rs6000_option_override_internal): Allow
> > long double type to be changed for C/C++ if glibc 2.32 or newer.
> > ---
> >  gcc/config/rs6000/rs6000.c | 10 --
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index 50039c0a53d..940c15f3265 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -4158,10 +4158,16 @@ rs6000_option_override_internal (bool global_init_p)
> > 
> >if (rs6000_ieeequad != TARGET_IEEEQUAD_DEFAULT && 
> > TARGET_LONG_DOUBLE_128)
> > {
> > + /* Determine if the user can change the default long double type at
> > +compilation time.  Only C and C++ support this, and you need GLIBC
> > +2.32 or newer.  Only issue one warning.  */
> 
> >   static bool warned_change_long_double;
> > - if (!warned_change_long_double)
> > +
> > + if (!warned_change_long_double
> > + && (!OPTION_GLIBC
> > + || (!lang_GNU_C () && !lang_GNU_CXX ())
> > + || ((TARGET_GLIBC_MAJOR * 1000) + TARGET_GLIBC_MINOR) < 2032))
> > {
> > - warned_change_long_double = true;
> 
> Does this need to be added back elsewhere? 

At the present time, we are not contemplating adding the full support to enable
configuring GCC to use IEEE 128-bit long double in GCC 10 or earlier.  This may
change depending on customer demands.
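
As an aside on the version test in the quoted hunk: it folds the major and
minor numbers into one integer (major * 1000 + minor) so that a single
comparison against 2032 means "glibc 2.32 or newer".  A small sketch with
hypothetical helper names:

```c
#include <stdbool.h>

/* Fold a glibc version into one comparable integer, e.g. 2.32 -> 2032.
   This assumes the minor number stays below 1000.  */
static int
glibc_version_code (int major, int minor)
{
  return major * 1000 + minor;
}

/* True if the version is at least 2.32, the threshold used in the
   hunk above.  */
static bool
glibc_at_least_2_32 (int major, int minor)
{
  return glibc_version_code (major, minor) >= 2032;
}
```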

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[Patch] Fortran: Update omp atomic for OpenMP 5

2020-10-29 Thread Tobias Burnus

The parser partially anticipates the upcoming OpenMP 5.1 changes, which
add some more clauses, but it is otherwise not yet updated for OpenMP 5.1.
In particular, the "omp end atomic" for capture is still required and
the memory-order-clause restrictions still apply.

I am a bit unsure about how to handle 'capture' (= update capture) and
the internal 'swap' in the internal representation; the current one is
not ideal, but others did not seem to be ideal, either.

OK?

Tobias

PS:
* On the C/C++ side, 'capture' (or 'update capture') restrictions are
  not checked (they are the same as for 'update'; both are gone with
  OpenMP 5.1, which also permits ACQ_REL for read/write).
* On the C/C++ side, OpenACC's atomic piggybacks on OpenMP's which accepts
  too much.
* In Fortran, as in C/C++, hint(hint-expr) is parsed but not actually used.
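
For comparison with the C/C++ side mentioned above, a minimal C sketch of an
atomic capture with a memory-order clause.  The patch itself is for Fortran;
this is only an illustration, and the function name fetch_and_add is made up.
The pragma is guarded so the code also builds (as a plain, non-atomic update)
without -fopenmp.

```c
/* Return the old value of *COUNTER and add AMOUNT to it.  With OpenMP
   enabled, the read and the update form one atomic capture with
   sequentially consistent ordering.  */
static int
fetch_and_add (int *counter, int amount)
{
  int old;

#ifdef _OPENMP
# pragma omp atomic capture seq_cst
#endif
  {
    old = *counter;
    *counter += amount;
  }
  return old;
}
```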

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
Fortran: Update omp atomic for OpenMP 5

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_clauses): Handle atomic clauses.
	(show_omp_node): Call it for atomic.
	* gfortran.h (enum gfc_omp_atomic_op): Add GFC_OMP_ATOMIC_UNSET,
	remove GFC_OMP_ATOMIC_SEQ_CST and GFC_OMP_ATOMIC_ACQ_REL.
	(enum gfc_omp_memorder): Replace OMP_MEMORDER_LAST by
	OMP_MEMORDER_UNSET, add OMP_MEMORDER_SEQ_CST/OMP_MEMORDER_RELAXED.
	(gfc_omp_clauses): Add capture and atomic_op.
	(gfc_code): remove omp_atomic.
	* openmp.c (enum omp_mask1): Add atomic, capture, memorder clauses.
	(gfc_match_omp_clauses): Match them.
	(OMP_ATOMIC_CLAUSES): Add.
	(gfc_match_omp_flush): Update for 'last' to 'unset' change.
	(gfc_match_omp_oacc_atomic): Removed and placed content ..
	(gfc_match_omp_atomic): ... here. Update for OpenMP 5 clauses.
	(gfc_match_oacc_atomic): Match directly here.
	(resolve_omp_atomic, gfc_resolve_omp_directive): Update.
	* parse.c (parse_omp_oacc_atomic): Update for struct gfc_code changes.
	* resolve.c (gfc_resolve_blocks): Update assert.
	* st.c (gfc_free_statement): Also call for EXEC_O{ACC,MP}_ATOMIC.
	* trans-openmp.c (gfc_trans_omp_atomic): Update.
	(gfc_trans_omp_flush): Update for 'last' to 'unset' change.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/atomic-2.f90: New test.
	* gfortran.dg/gomp/atomic.f90: New test.

 gcc/fortran/dump-parse-tree.c               |  34
 gcc/fortran/gfortran.h                      |  30 ++--
 gcc/fortran/openmp.c                        | 250 +---
 gcc/fortran/parse.c                         |   9 +-
 gcc/fortran/resolve.c                       |   7 +-
 gcc/fortran/st.c                            |   4 +-
 gcc/fortran/trans-openmp.c                  |  41 ++---
 gcc/testsuite/gfortran.dg/gomp/atomic-2.f90 |  33
 gcc/testsuite/gfortran.dg/gomp/atomic.f90   | 111
 9 files changed, 409 insertions(+), 110 deletions(-)

diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 6e265f4520d..43b97ba26ff 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -1715,6 +1715,36 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses)
 }
   if (omp_clauses->depend_source)
 fputs (" DEPEND(source)", dumpfile);
+  if (omp_clauses->capture)
+fputs (" CAPTURE", dumpfile);
+  if (omp_clauses->atomic_op != GFC_OMP_ATOMIC_UNSET)
+{
+  const char *atomic_op;
+  switch (omp_clauses->atomic_op)
+	{
+	case GFC_OMP_ATOMIC_READ: atomic_op = "READ"; break;
+	case GFC_OMP_ATOMIC_WRITE: atomic_op = "WRITE"; break;
+	case GFC_OMP_ATOMIC_UPDATE: atomic_op = "UPDATE"; break;
+	default: gcc_unreachable ();
+	}
+  fputc (' ', dumpfile);
+  fputs (atomic_op, dumpfile);
+}
+  if (omp_clauses->memorder != OMP_MEMORDER_UNSET)
+{
+  const char *memorder;
+  switch (omp_clauses->memorder)
+	{
+	case OMP_MEMORDER_ACQ_REL: memorder = "ACQ_REL"; break;
+	case OMP_MEMORDER_ACQUIRE: memorder = "ACQUIRE"; break;
+	case OMP_MEMORDER_RELAXED: memorder = "RELAXED"; break;
+	case OMP_MEMORDER_RELEASE: memorder = "RELEASE"; break;
+	case OMP_MEMORDER_SEQ_CST: memorder = "SEQ_CST"; break;
+	default: gcc_unreachable ();
+	}
+  fputc (' ', dumpfile);
+  fputs (memorder, dumpfile);
+}
 }
 
 /* Show a single OpenMP or OpenACC directive node and everything underneath it
@@ -1880,6 +1910,10 @@ show_omp_node (int level, gfc_code *c)
 case EXEC_OMP_TASKWAIT:
 case EXEC_OMP_TASKYIELD:
   return;
+case EXEC_OACC_ATOMIC:
+case EXEC_OMP_ATOMIC:
+  omp_clauses = c->block ? c->block->ext.omp_clauses : NULL;
+  break;
 default:
   break;
 }
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 73b6ffd870c..9500032f0e3 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1343,6 +1343,16 @@ enum gfc_omp_if_kind
   OMP_IF_LAST
 };
 
+enum gfc_omp_atomic_op
+{
+  GFC_OMP_ATOMIC_UNSET = 0,
+  GFC_OMP_ATOMIC_UPDATE = 1,
+  GFC_OMP_ATOMIC

Re: [PATCH][PR target/97540] Don't extract memory from operand for normal memory constraint.

2020-10-29 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> On Tue, Oct 27, 2020 at 11:13:21AM +, Richard Sandiford via Gcc-patches 
> wrote:
>> Sorry to stick my oar in, but I think we should reconsider the
>> bcst_mem_operand approach.  It seems like these patches (and the
>> previous one) are fighting against the principle that operands
>> cannot be arbitrary expressions.
>
> Many operands already are fairly complex expressions, so it is unclear how
> this changes that.

But the things subject to constraint matching currently have to be
SCRATCHes, SUBREGs, REGs, MEMs or constants.  The address inside
a MEM can be complex, but even that has certain limits (so that LRA
knows what to do with addresses that need reloading).

Matching something like VEC_DUPLICATE in a constraint is new in
that the thing being constrained isn't conceptually an object
(only the operand of the VEC_DUPLICATE is).

> And LRA etc. already handles SUBREGs of MEM which is kind of similar to
> this.

Yeah, but SUBREGs of MEMs are a bit of a legacy feature :-)
It would be great to remove them at some point…

>> This kind of thing was attempted long ago (even before my time!)
>> for SIGN_EXTEND on MIPS.  It ended up causing more problems than
>> it solved and in the end it had to be taken out.  I'm worried that
>> we might end up going through the same cycle again.
>> 
>> Also, this LRA code is extremely performance-sensitive in terms
>> of compile time: it's often at the top or near the top of the profile.
>> So adding calls to new functions like extract_mem_from_operand for
>> a fairly niche case probably isn't a good trade-off.
>
> It can be just an inline function that looks through just the target
> selected rtxes rather than arbitrary ones (derived from *.md properties or
> something).

Having something in the .md file sounds good.  The more information the
generators have, the more chance they have to do something efficient.

>> I think we should instead find a nice(?) syntax for generating separate
>> patterns for the two bcst_vector_operand alternatives from a single
>> .md pattern.  That would fit the existing model much more closely.
>
> That would result in thousands of new patterns, I'm not sure it is a good
> idea.  Pretty much all AVX512* instructions allow those.

Yeah, I hadn't realised that.

Thanks,
Richard


[PATCH] aarch64: Add backend support for expanding __builtin_memset

2020-10-29 Thread Sudakshina Das via Gcc-patches
Hi

This patch implements aarch64 backend expansion for __builtin_memset. Most of
the implementation is based on the expansion of __builtin_memcpy. We change the
values of CLEAR_RATIO and SET_RATIO for cases where we do not have to strictly
align and where we can benefit from NEON instructions in the backend.

So for a test case like:

void foo (void* p) { __builtin_memset (p, 1, 7); }

instead of generating:
mov w3, 16843009
mov w2, 257
mov w1, 1
str w3, [x0]
strh w2, [x0, 4]
strb w1, [x0, 6]
ret
we now generate
movi v0.16b, 0x1
str s0, [x0]
str s0, [x0, 3]
ret
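
The new sequence covers 7 bytes with two 4-byte stores that overlap by one
byte (offsets 0 and 3). A minimal C++ sketch of that idea follows; it is only
an illustration of the overlapping-store trick, not the expansion code from
the patch, and set7_overlapping / check_set7 are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical illustration: set 7 bytes using two overlapping 4-byte
// stores, mirroring "str s0, [x0]" followed by "str s0, [x0, 3]".
void set7_overlapping (void *p, unsigned char c)
{
  // Duplicate the byte across a 32-bit value (the movi instruction does
  // the same across a full vector register).
  std::uint32_t word = 0x01010101u * c;
  unsigned char *dst = static_cast<unsigned char *> (p);
  std::memcpy (dst, &word, 4);      // bytes 0..3
  std::memcpy (dst + 3, &word, 4);  // bytes 3..6; byte 3 is written twice
}

// Compare against a plain memset and check that byte 7 is left untouched.
bool check_set7 ()
{
  unsigned char buf[8] = {0, 0, 0, 0, 0, 0, 0, 0xAA};
  unsigned char expect[8];
  set7_overlapping (buf, 1);
  std::memset (expect, 1, 7);
  expect[7] = 0xAA;
  return std::memcmp (buf, expect, 8) == 0;
}
```

Writing byte 3 twice is harmless because both stores carry the same value,
which is why the overlap only works for memset-style fills.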

Bootstrapped and regression tested on aarch64-none-linux-gnu.
With this patch I have seen an overall improvement of 0.27% in Spec2017 Int
and 0.19% in Spec2017 FP benchmarks on Neoverse N1.

Is this ok for trunk?

gcc/ChangeLog:

2020-xx-xx  Sudakshina Das  

* config/aarch64/aarch64-protos.h (aarch64_expand_setmem): New
declaration.
* config/aarch64/aarch64.c (aarch64_gen_store_pair): Add case for
E_V16QImode.
(aarch64_set_one_block_and_progress_pointer): New helper for
aarch64_expand_setmem.
(aarch64_expand_setmem): Define the expansion for memset.
* config/aarch64/aarch64.h (CLEAR_RATIO): Tweak to favor
aarch64_expand_setmem when allowed and profitable.
(SET_RATIO): Likewise.
* config/aarch64/aarch64.md: Define pattern for setmemdi.

gcc/testsuite/ChangeLog:

2020-xx-xx  Sudakshina Das  

* g++.dg/tree-ssa/pr90883.C: Remove xfail for aarch64.
* gcc.dg/tree-prof/stringop-2.c: Add xfail for aarch64.
* gcc.target/aarch64/memset-corner-cases.c: New test.
* gcc.target/aarch64/memset-q-reg.c: New test.

Thanks
Sudi

### Attachment also inlined for ease of reply###

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 7a34c841355bad88365381912b163c61c5a35811..2aa3f1fddaafae58f0bfb26e5b33fe6a94e85e06 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -510,6 +510,7 @@ bool aarch64_emit_approx_div (rtx, rtx, rtx);
 bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
 void aarch64_expand_call (rtx, rtx, rtx, bool);
 bool aarch64_expand_cpymem (rtx *);
+bool aarch64_expand_setmem (rtx *);
 bool aarch64_float_const_zero_rtx_p (rtx);
 bool aarch64_float_const_rtx_p (rtx);
 bool aarch64_function_arg_regno_p (unsigned);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 00b5f8438863bb52c348cfafd5d4db478fe248a7..bcb654809c9662db0f51fc1368e37e42969efd29 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -1024,16 +1024,18 @@ typedef struct
 #define MOVE_RATIO(speed) \
   (!STRICT_ALIGNMENT ? 2 : (((speed) ? 15 : AARCH64_CALL_RATIO) / 2))
 
-/* For CLEAR_RATIO, when optimizing for size, give a better estimate
-   of the length of a memset call, but use the default otherwise.  */
+/* Like MOVE_RATIO, without -mstrict-align, make decisions in "setmem" when
+   we would use more than 3 scalar instructions.
+   Otherwise follow a sensible default: when optimizing for size, give a better
+   estimate of the length of a memset call, but use the default otherwise.  */
 #define CLEAR_RATIO(speed) \
-  ((speed) ? 15 : AARCH64_CALL_RATIO)
+  (!STRICT_ALIGNMENT ? 4 : (speed) ? 15 : AARCH64_CALL_RATIO)
 
 /* SET_RATIO is similar to CLEAR_RATIO, but for a non-zero constant, so when
optimizing for size adjust the ratio to account for the overhead of loading
the constant.  */
 #define SET_RATIO(speed) \
-  ((speed) ? 15 : AARCH64_CALL_RATIO - 2)
+  (!STRICT_ALIGNMENT ? 0 : (speed) ? 15 : AARCH64_CALL_RATIO - 2)
 
 /* Disable auto-increment in move_by_pieces et al.  Use of auto-increment is
rarely a good idea in straight-line code since it adds an extra address
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a8cc545c37044345c3f1d3bf09151c8a9578a032..16ac0c076adcc82627af43473a938e78d3a7ecdc 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7058,6 +7058,9 @@ aarch64_gen_store_pair (machine_mode mode, rtx mem1, rtx reg1, rtx mem2,
 case E_V4SImode:
   return gen_vec_store_pairv4siv4si (mem1, reg1, mem2, reg2);
 
+case E_V16QImode:
+  return gen_vec_store_pairv16qiv16qi (mem1, reg1, mem2, reg2);
+
 default:
   gcc_unreachable ();
 }
@@ -21373,6 +21376,134 @@ aarch64_expand_cpymem (rtx *operands)
   return true;
 }
 
+/* Like aarch64_copy_one_block_and_progress_pointers, except for memset where
+   *src is a register we have created with the duplicated value to be set.  */
+static void
+aarch64_set_one_block_and_progress_pointer (rtx *src, rtx *dst,
+   machine_mode mode)
+{
+  /* If we are copying 128bits or 256bits, we can do that s

Re: [PATCH] libstdc++: remove an ignored qualifier on function return type

2020-10-29 Thread Jonathan Wakely via Gcc-patches
On Fri, 28 Aug 2020 at 07:56, Krystian Kuźniarek via Libstdc++
 wrote:
>
> > So then you need to produce a changelog entry by hand.
> I had this problem on some old Ubuntu 18.04. Anyway, here's new ChangeLog:
>
> libstdc++-v3/ChangeLog:
>
> * include/std/variant: Fix -Wignored-qualifiers
> in system headers.
>
>
> >That doesn't test this header at all.
> It does but indirectly. What I meant by manual test was:
> ${GCC_GIT} -E contains_only_stdcpp_include.cpp > preprocessed.cpp
> ${GCC_GIT} -Wall -Wextra -pedantic -fsyntax-only preprocessed.cpp
> By manipulating GCC_GIT variable to trunk GCC and patched GCC, I checked if
> the warning is gone.
>
> >What about the libstdc++ testsuite?
> I hope you mean calling make bootstrap and make check. If that's ok, I
> confirm it works on Manjaro and Ubuntu 18.04 with gcc10 and gcc8
> respectively.
>
> >I don't remember exactly why I put it there, but I seem to recall it
> >was necessary.
> I don't know your reasons but I can only tell that this patch seems to
> compile and work just fine.

I see new test failures with that change:

include/variant:1039: error: invalid conversion from
'std::enable_if_t (*)(test02()::Visitor&&,
std::variant&)' {aka 'void (*)(test02()::Visitor&&,
std::variant&)'} to
'std::__detail::__variant::_Multi_array&)>::__untag_result&)>::element_type' {aka 'const void
(*)(test02()::Visitor&&, std::variant&)'} [-fpermissive]
UNRESOLVED: 20_util/variant/visit_r.cc compilation failed to produce executable


So I still think it's there for a reason.


Re: Avoid char[] array in tree_def

2020-10-29 Thread Richard Biener
On Thu, 29 Oct 2020, Jakub Jelinek wrote:

> On Thu, Oct 29, 2020 at 05:00:40PM +0100, Jan Hubicka wrote:
> > > 
> > > That's ugly and will for sure defeat warning / access code
> > > when we access this as char[], no?  I mean, we could
> > > as well use 'int str[1];' here?
> > 
> > Well, we always get char pointer via macro that is IMO OK, but I am also
> > not very much in love with this.
> 
> Do we treat signed char [...]; as typeless storage too, or just
> what the C++ standard requires (i.e. char, unsigned char and std::byte
> where the last one is enum type with unsigned char underlying type)?

All that is covered by is_byte_access_type which includes all
character types (including char16_t and wchar it seems) and std::byte.

Richard.

>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: [PATCH][AArch64] ACLE intrinsics: convert from BFloat16 to Float32

2020-10-29 Thread Richard Sandiford via Gcc-patches
Dennis Zhang  writes:
> diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
> b/gcc/config/aarch64/aarch64-simd-builtins.def
> index 5bc596dbffc..b68c3ca7f4b 100644
> --- a/gcc/config/aarch64/aarch64-simd-builtins.def
> +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
> @@ -732,3 +732,8 @@
>VAR1 (UNOP, bfcvtn_q, 0, ALL, v8bf)
>VAR1 (BINOP, bfcvtn2, 0, ALL, v8bf)
>VAR1 (UNOP, bfcvt, 0, ALL, bf)
> +
> +  /* Implemented by aarch64_{v}bfcvt{_high}.  */
> +  VAR2 (UNOP, vbfcvt, 0, ALL, v4bf, v8bf)
> +  VAR1 (UNOP, vbfcvt_high, 0, ALL, v8bf)
> +  VAR1 (UNOP, bfcvt, 0, ALL, sf)

New intrinsics should use something more specific than “ALL”.
Since these functions are pure non-trapping integer operations,
I think they should use “AUTO_FP” instead.  (On reflection,
we should probably change the name.)

> +(define_insn "aarch64_bfcvtsf"
> +  [(set (match_operand:SF 0 "register_operand" "=w")
> + (unspec:SF [(match_operand:BF 1 "register_operand" "w")]
> + UNSPEC_BFCVT))]
> +  "TARGET_BF16_FP"
> +  "shl\\t%d0, %d1, #16"
> +  [(set_attr "type" "neon_shift_reg")]

I think this should be neon_shift_imm instead.
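
For context on why a plain left shift by 16 is enough here: a BFloat16 value
is the upper half of an IEEE binary32 value, so widening is a pure integer
operation. A minimal C++ sketch, assuming a 32-bit IEEE float;
bf16_bits_to_float is a hypothetical helper, not one of the ACLE intrinsics:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert (sizeof (float) == 4, "sketch assumes a 32-bit float");

// Hypothetical helper: widen a BFloat16 bit pattern to float by placing
// it in the upper 16 bits of a binary32 value (the same effect as the
// "shl ..., #16" in the pattern).
float bf16_bits_to_float (std::uint16_t bits)
{
  std::uint32_t wide = static_cast<std::uint32_t> (bits) << 16;
  float result;
  std::memcpy (&result, &wide, sizeof result);
  return result;
}
```

For example, the BFloat16 pattern 0x3F80 widens to 0x3F800000, which is 1.0f.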

OK with those changes, thanks.

Richard


Re: [PATCH][middle-end][i386][version 5]Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-gpr-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-29 Thread Richard Sandiford via Gcc-patches
Qing Zhao via Gcc-patches  writes:
>>> +Return-Oriented Programming (ROP) or preventing information leak
>> 
>> leakage
>> 
>> (FWIW, I'm not sure “mitigating ROP” is really correct usage, but I don't
>> have any better suggestions.)
>
> Do you mean whether “mitigating ROP” is one of the major purposes of this new
> feature?

No, I meant just the English usage.  E.g., I think you mitigate the
damage caused by earthquakes rather than mitigate earthquakes themselves.
But I could be wrong.  It's not a word I use very often ;-)

>>> +In order to satisfy users with different security needs and control the
>>> +run-time overhead at the same time, GCC provides a flexible way to choose
>>> +the subset of the call-used registers to be zeroed.
>> 
>> Maybe s/GCC/the @var{choice} parameter/.
> Okay.
>> 
>>> +
>>> +The three basic values of @var{choice} are:
>> 
>> After which, I think this should be part of the previous paragraph.
>
> Don’t understand here, could you explain a little bit more?

I meant:

In order to satisfy users with different security needs and control the
run-time overhead at the same time, @var{choice} provides a flexible way
to choose the subset of the call-used registers to be zeroed.  The three
basic values of @var{choice} are:

>>> +  /* If gpr_only is true, only zero call-used registers that are
>>> + general-purpose registers; if used_only is true, only zero
>>> + call-used registers that are used in the current function;
>>> + if arg_only is true, only zero call-used registers that pass
>>> + parameters defined by the platform's calling convention.  */
>>> +
>>> +  gpr_only = crtl->zero_call_used_regs & ONLY_GPR;
>>> +  used_only = crtl->zero_call_used_regs & ONLY_USED;
>>> +  arg_only = crtl->zero_call_used_regs & ONLY_ARG;
>> 
>> Guess it would be nice to be consistent about which side the “only”
>> goes on.  FWIW, I don't mind which way: GPR_ONLY etc. would be
>> OK with me if you prefer that.
> The current names are okay for me.

OK.  But I think one of them should change to match the other.
E.g. either the local variable should be “only_gpr” or the
flag should be “GPR_ONLY”.

Thanks,
Richard


Re: [PATCH][middle-end][i386][version 5]Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-gpr-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-29 Thread Richard Sandiford via Gcc-patches
Qing Zhao  writes:
> Now, the documentation (gcc.info) is like following, let me know any issue 
> there:

Yeah, looks good apart from merging

>  In order to satisfy users with different security needs and control
>  the run-time overhead at the same time, CHOICE parameter provides a
>  flexible way to choose the subset of the call-used registers to be
>  zeroed.
>
>  The three basic values of CHOICE are:

this into a single paragraph.

Thanks,
Richard


Re: PowerPC: Add __float128 conversions to/from Decimal

2020-10-29 Thread Segher Boessenkool
Hi!

On Thu, Oct 29, 2020 at 12:45:15PM -0400, Michael Meissner wrote:
> On Wed, Oct 28, 2020 at 07:04:31PM -0500, Segher Boessenkool wrote:
> > > +#if HAVE_KF_MODE
> > > +  strfromf128 (buf, BUFMAX, BFP_FMT, (BFP_VIA_TYPE) x);
> > > +#else
> > >sprintf (buf, BFP_FMT, (BFP_VIA_TYPE) x);
> > > +#endif
> > 
> > Does strfromf128 exist everywhere we build this?  It isn't a standard
> > function.
> 
> Yes, it is in ISO/IEC TS 18661-3, which is the document that describes most of
> the *f128 functions.

But this means it does *not* exist most places we build this?  Not the
whole world is Linux (and even then, it is probably a too recent
addition).

Does it need something in libiberty maybe?  At least _doprnt handles
long double (whatever type it uses for that, but that can be fixed :-) )
(_doprnt is used by all the libiberty versions of the printf family,
and it handles %lf etc. for long double.)

> We have to use str* instead of sprintf or scanf, because I don't believe there
> is a float128 format specifier.

No standard one at least, yes.

> > > +/* Support PowerPC KF mode, which is __float128 when long double is
> > > +   IBM extended double.  */
> > > > +#if defined (L_sd_to_kf) || defined (L_dd_to_kf) || defined (L_td_to_kf) \
> > > + || defined (L_kf_to_sd) || defined (L_kf_to_dd) || defined (L_kf_to_td)
> > > +#define HAVE_KF_MODE 1
> > > +#endif
> > 
> > This might want a better name, other targets can have a KFmode as well,
> > for some completely different purpose, since it is not a standard mode.
> 
> Given everything else uses *F, including XF on the x86, I figured it was
> easier than creating a new name.

I mean the name for the macro.  "HAVE_KF_MODE" is not great.

Anyway, some libgcc maintainer needs to review this, you may be lucky
with this ;-)


Segher


Re: [PATCH v2] c++: Implement -Wvexing-parse [PR25814]

2020-10-29 Thread Marek Polacek via Gcc-patches
On Thu, Oct 29, 2020 at 11:17:37AM -0400, Jason Merrill via Gcc-patches wrote:
> On 10/28/20 7:40 PM, Marek Polacek wrote:
> > On Wed, Oct 28, 2020 at 03:09:08PM -0400, Jason Merrill wrote:
> > > On 10/28/20 1:58 PM, Marek Polacek wrote:
> > > > > On Wed, Oct 28, 2020 at 01:26:53AM -0400, Jason Merrill via Gcc-patches wrote:
> > > > > > On 10/24/20 7:40 PM, Marek Polacek wrote:
> > > > > > > On Fri, Oct 23, 2020 at 09:33:38PM -0400, Jason Merrill via Gcc-patches wrote:
> > > > > > > > On 10/23/20 3:01 PM, Marek Polacek wrote:
> > > > > > > > > This patch implements the -Wvexing-parse warning to warn about the
> > > > > > > > > sneaky most vexing parse rule in C++: the cases when a declaration
> > > > > > > > > looks like a variable definition, but the C++ language requires it
> > > > > > > > > to be interpreted as a function declaration.  This warning is on by
> > > > > > > > > default (like clang++).  From the docs:
> > > > > > > > 
> > > > > > > > >   void f(double a) {
> > > > > > > > >     int i();        // extern int i (void);
> > > > > > > > >     int n(int(a));  // extern int n (int);
> > > > > > > > >   }
> > > > > > > > > 
> > > > > > > > >   Another example:
> > > > > > > > > 
> > > > > > > > >   struct S { S(int); };
> > > > > > > > >   void f(double a) {
> > > > > > > > >     S x(int(a));   // extern struct S x (int);
> > > > > > > > >     S y(int());    // extern struct S y (int (*) (void));
> > > > > > > > >     S z();         // extern struct S z (void);
> > > > > > > > >   }
> > > > > > > > 
> > > > > > > > You can find more on this in [dcl.ambig.res].
> > > > > > > > 
> > > > > > > > > I spent a fair amount of time on fix-it hints so that GCC can recommend
> > > > > > > > > various ways to resolve such an ambiguity.  Sometimes that's tricky.
> > > > > > > > > E.g., suggesting default-initialization when the class doesn't have
> > > > > > > > > a default constructor would not be optimal.  Suggesting {}-init is also
> > > > > > > > > not trivial because it can use an initializer-list constructor if no
> > > > > > > > > default constructor is available (which ()-init wouldn't do).  And of
> > > > > > > > > course, pre-C++11, we shouldn't be recommending {}-init at all.
> > > > > > > > 
> > > > > > > > What do you think of, instead of passing the type down into the declarator
> > > > > > > > parse, adding the paren locations to cp_declarator::function and giving the
> > > > > > > > diagnostic from cp_parser_init_declarator instead?
> > > > > 
> > > > > > Oops, now I see there's already cp_declarator::parenthesized; might as well
> > > > > > reuse that.  And maybe change it to a range, while we're at it.
> > > > > 
> > > > > I'm afraid I can't reuse it because grokdeclarator uses it to warn about
> > > > > "unnecessary parentheses in declaration".  So when we have:
> > > > > 
> > > > > int (x());
> > > > > 
> > > > > declarator->parenthesized points to the outer parens (if any), whereas
> > > > > declarator->u.function.parens_loc should point to the inner ones.  We also
> > > > > have declarator->id_loc but I think we should only use it for declarator-ids.
> > > > 
> > > > Makes sense.
> > > > 
> > > > > (We should still adjust ->parenthesized to be a range to generate a better
> > > > > diagnostic; I shall send a patch soon.)
> > > > > 
> > > > > > Hmm, I wonder why we have the parenthesized_p parameter to some of these
> > > > > > functions, since we can look at the declarator to find that information...
> > > > > 
> > > > > That would be a nice cleanup.
> > > > > 
> > > > > > > Interesting idea.  I suppose it's better, and makes the implementation
> > > > > > > more localized.  The approach here is that if the .function.parens_loc
> > > > > > > is UNKNOWN_LOCATION, we've not seen a vexing parse.
> > > > > > 
> > > > > > I'd rather always set the parens location, and then analyze the
> > > > > > cp_declarator in warn_about_ambiguous_parse to see if it's a vexing parse;
> > > > > > we should have all the information we need.
> > > > > 
> > > > > I could always set .parens_loc, but then I'd still need another flag telling
> > > > > me whether we had an ambiguity.  Otherwise I don't know how I would tell
> > > > > apart e.g. "int f()" (warn) v. "int f(void)" (don't warn), etc.
> > > > 
> > > > Ah, I was thinking that we still had the parameter declarators, but now I
> > > > see that cp_parser_parameter_declaration_list groks them and returns a
> > > > TREE_LIST.  We could set a TREE_LANG_FLAG on each TREE_LIST if its parameter
> > > > declarator was parenthesized?
> > 
> > I think so, looks like we have a bunch of free TREE_LANG_FLAG slots on
> > a TREE_LIST.  But cp_parser_parameter_declaration_clause can return
> > a void_list_node, so I assume I'd have to

[PATCH] LTO: get_section: add new argument

2020-10-29 Thread Martin Liška

One more backport I've just tested:

gcc/ChangeLog:

PR lto/97508
* langhooks.c (lhd_begin_section): Call get_section with
not_existing = true.
* output.h (get_section): Add new argument.
* varasm.c (get_section): Fail when NOT_EXISTING is true
and a section already exists.
* ipa-cp.c (ipcp_write_summary): Remove.
(ipcp_read_summary): Likewise.
* ipa-fnsummary.c (ipa_fn_summary_read): Always read jump
functions summary.
(ipa_fn_summary_write): Always stream it.

(cherry picked from commit 568de14d2e74cfdd600b8995ff6ac08c98ddef48)
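
To make the intent of the new argument concrete, here is a minimal standalone
C++ sketch of "create, but fail if it already exists". It is not the varasm.c
code: section_stub, section_table, get_section_sketch and second_create_fails
are hypothetical names, and the sketch throws an exception where the actual
patch calls internal_error.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a section record; not GCC's 'section' type.
struct section_stub { unsigned int flags; };

// Hypothetical stand-in for the named-section table.
static std::map<std::string, section_stub> section_table;

// Return the entry for NAME, creating it if needed.  When NOT_EXISTING
// is true, finding an existing entry is treated as an error.  The
// default of false keeps existing callers unchanged.
section_stub *
get_section_sketch (const std::string &name, unsigned int flags,
                    bool not_existing = false)
{
  auto it = section_table.find (name);
  if (it == section_table.end ())
    it = section_table.emplace (name, section_stub{flags}).first;
  else if (not_existing)
    throw std::logic_error ("section already exists: " + name);
  return &it->second;
}

// Creating the same section twice with NOT_EXISTING set must fail.
bool second_create_fails (const std::string &name)
{
  get_section_sketch (name, 0, true);
  try
    {
      get_section_sketch (name, 0, true);
    }
  catch (const std::logic_error &)
    {
      return true;
    }
  return false;
}
```

The defaulted parameter mirrors the output.h change, where only
lhd_begin_section passes true.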
---
 gcc/ipa-cp.c        | 20 ++--
 gcc/ipa-fnsummary.c |  6 ++
 gcc/langhooks.c     |  2 +-
 gcc/output.h        |  3 ++-
 gcc/varasm.c        |  9 +++--
 5 files changed, 14 insertions(+), 26 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index c7867dbed9b..b1f0881bd70 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -5946,22 +5946,6 @@ ipcp_generate_summary (void)
 ipa_analyze_node (node);
 }
 
-/* Write ipcp summary for nodes in SET.  */
-
-static void
-ipcp_write_summary (void)
-{
-  ipa_prop_write_jump_functions ();
-}
-
-/* Read ipcp summary.  */
-
-static void
-ipcp_read_summary (void)
-{
-  ipa_prop_read_jump_functions ();
-}
-
 namespace {
 
 const pass_data pass_data_ipa_cp =
@@ -5983,8 +5967,8 @@ public:
   pass_ipa_cp (gcc::context *ctxt)
 : ipa_opt_pass_d (pass_data_ipa_cp, ctxt,
  ipcp_generate_summary, /* generate_summary */
- ipcp_write_summary, /* write_summary */
- ipcp_read_summary, /* read_summary */
+ NULL, /* write_summary */
+ NULL, /* read_summary */
  ipcp_write_transformation_summaries, /*
  write_optimization_summary */
  ipcp_read_transformation_summaries, /*
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 55a0b272a96..e07c9b3bba0 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -4346,6 +4346,7 @@ ipa_fn_summary_read (void)
   struct lto_file_decl_data *file_data;
   unsigned int j = 0;
 
+  ipa_prop_read_jump_functions ();
   ipa_fn_summary_alloc ();
 
   while ((file_data = file_data_vec[j++]))
@@ -4364,8 +4365,6 @@ ipa_fn_summary_read (void)
 "ipa inline summary is missing in input file");
 }
   ipa_register_cgraph_hooks ();
-  if (!flag_ipa_cp)
-ipa_prop_read_jump_functions ();
 
   gcc_assert (ipa_fn_summaries);
   ipa_fn_summaries->enable_insertion_hook ();
@@ -4500,8 +4499,7 @@ ipa_fn_summary_write (void)
   produce_asm (ob, NULL);
   destroy_output_block (ob);
 
-  if (!flag_ipa_cp)
-ipa_prop_write_jump_functions ();
+  ipa_prop_write_jump_functions ();
 }
 
 
diff --git a/gcc/langhooks.c b/gcc/langhooks.c
index 5e3216da631..70a554c4447 100644
--- a/gcc/langhooks.c
+++ b/gcc/langhooks.c
@@ -777,7 +777,7 @@ lhd_begin_section (const char *name)
 saved_section = text_section;
 
   /* Create a new section and switch to it.  */
-  section = get_section (name, SECTION_DEBUG | SECTION_EXCLUDE, NULL);
+  section = get_section (name, SECTION_DEBUG | SECTION_EXCLUDE, NULL, true);
   switch_to_section (section);
 }
 
diff --git a/gcc/output.h b/gcc/output.h
index eb253c50329..2f2f1697fd8 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -523,7 +523,8 @@ extern GTY(()) bool in_cold_section_p;
 
 extern section *get_unnamed_section (unsigned int, void (*) (const void *),
 const void *);
-extern section *get_section (const char *, unsigned int, tree);
+extern section *get_section (const char *, unsigned int, tree,
+bool not_existing = false);
 extern section *get_named_section (tree, const char *, int);
 extern section *get_variable_section (tree, bool);
 extern void place_block_symbol (rtx);
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 5bf4e96a773..0e7531926f8 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -276,10 +276,12 @@ get_noswitch_section (unsigned int flags, noswitch_section_callback callback)
 }
 
 /* Return the named section structure associated with NAME.  Create
-   a new section with the given fields if no such structure exists.  */
+   a new section with the given fields if no such structure exists.
+   When NOT_EXISTING, then fail if the section already exists.  */
 
 section *
-get_section (const char *name, unsigned int flags, tree decl)
+get_section (const char *name, unsigned int flags, tree decl,
+bool not_existing)
 {
   section *sect, **slot;
 
@@ -296,6 +298,9 @@ get_section (const char *name, unsigned int flags, tree decl)
 }
   else
 {
+  if (not_existing)
+   internal_error ("Section already exists: %qs", name);
+
   sect = *slot;
   /* It is fine if one of the sections has SECTION_NOTYPE as long as
  the other has none of the contrary flags (see the logic at the end

Re: [PATCH v2] c++: Implement -Wvexing-parse [PR25814]

2020-10-29 Thread Jason Merrill via Gcc-patches

On 10/29/20 2:11 PM, Marek Polacek wrote:

On Thu, Oct 29, 2020 at 11:17:37AM -0400, Jason Merrill via Gcc-patches wrote:

On 10/28/20 7:40 PM, Marek Polacek wrote:

On Wed, Oct 28, 2020 at 03:09:08PM -0400, Jason Merrill wrote:

On 10/28/20 1:58 PM, Marek Polacek wrote:

On Wed, Oct 28, 2020 at 01:26:53AM -0400, Jason Merrill via Gcc-patches wrote:

On 10/24/20 7:40 PM, Marek Polacek wrote:

On Fri, Oct 23, 2020 at 09:33:38PM -0400, Jason Merrill via Gcc-patches wrote:

On 10/23/20 3:01 PM, Marek Polacek wrote:

This patch implements the -Wvexing-parse warning to warn about the
sneaky most vexing parse rule in C++: the cases when a declaration
looks like a variable definition, but the C++ language requires it
to be interpreted as a function declaration.  This warning is on by
default (like clang++).  From the docs:

   void f(double a) {
     int i();        // extern int i (void);
     int n(int(a));  // extern int n (int);
   }

   Another example:

   struct S { S(int); };
   void f(double a) {
     S x(int(a));   // extern struct S x (int);
     S y(int());    // extern struct S y (int (*) (void));
     S z();         // extern struct S z (void);
   }

You can find more on this in [dcl.ambig.res].
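
As a quick illustration of the rule (a minimal sketch, not taken from the
patch or its testsuite; the struct and vexing_parse_demo are made up for the
example), a type trait can confirm which declarations name a function and
which name an object:

```cpp
#include <cassert>
#include <type_traits>

struct S { S () {} S (int) {} };

// Hypothetical demo: returns true if 'z' is a function while 'w' and
// 'v' are objects.
bool vexing_parse_demo ()
{
  S z ();   // most vexing parse: declares a function z returning S
  S w {};   // defines an object (C++11 and later)
  S v (1);  // defines an object; the argument removes the ambiguity
  (void) w;
  (void) v;
  return std::is_function<decltype (z)>::value
	 && !std::is_function<decltype (w)>::value
	 && !std::is_function<decltype (v)>::value;
}
```

The "S z ();" line is the kind of declaration the new warning is aimed at.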

I spent a fair amount of time on fix-it hints so that GCC can recommend
various ways to resolve such an ambiguity.  Sometimes that's tricky.
E.g., suggesting default-initialization when the class doesn't have
a default constructor would not be optimal.  Suggesting {}-init is also
not trivial because it can use an initializer-list constructor if no
default constructor is available (which ()-init wouldn't do).  And of
course, pre-C++11, we shouldn't be recommending {}-init at all.


What do you think of, instead of passing the type down into the declarator
parse, adding the paren locations to cp_declarator::function and giving the
diagnostic from cp_parser_init_declarator instead?


Oops, now I see there's already cp_declarator::parenthesized; might as well
reuse that.  And maybe change it to a range, while we're at it.


I'm afraid I can't reuse it because grokdeclarator uses it to warn about
"unnecessary parentheses in declaration".  So when we have:

 int (x());

declarator->parenthesized points to the outer parens (if any), whereas
declarator->u.function.parens_loc should point to the inner ones.  We also
have declarator->id_loc but I think we should only use it for declarator-ids.


Makes sense.


(We should still adjust ->parenthesized to be a range to generate a better
diagnostic; I shall send a patch soon.)


Hmm, I wonder why we have the parenthesized_p parameter to some of these
functions, since we can look at the declarator to find that information...


That would be a nice cleanup.


Interesting idea.  I suppose it's better, and makes the implementation
more localized.  The approach here is that if the .function.parens_loc
is UNKNOWN_LOCATION, we've not seen a vexing parse.


I'd rather always set the parens location, and then analyze the
cp_declarator in warn_about_ambiguous_parse to see if it's a vexing parse;
we should have all the information we need.


I could always set .parens_loc, but then I'd still need another flag telling
me whether we had an ambiguity.  Otherwise I don't know how I would tell
apart e.g. "int f()" (warn) v. "int f(void)" (don't warn), etc.


Ah, I was thinking that we still had the parameter declarators, but now I
see that cp_parser_parameter_declaration_list groks them and returns a
TREE_LIST.  We could set a TREE_LANG_FLAG on each TREE_LIST if its parameter
declarator was parenthesized?


I think so, looks like we have a bunch of free TREE_LANG_FLAG slots on
a TREE_LIST.  But cp_parser_parameter_declaration_clause can return
a void_list_node, so I assume I'd have to copy_node it before setting
some new flag in it.  Do you think that'd be fine?


There's no declarator in a void_list_node, so we shouldn't need to set a
"declarator is parenthesized" flag on it.


I guess I'm still not clear on how I would distinguish between
int f() and int f(void).  When I look at the cdk_function declarator,
all I can see is the .parameters TREE_LIST, which for both cases will
be the same void_list_node, but we should only warn for the former.

What am I missing?


I'm just being dense.  You're right that we would need to distinguish 
those two.  Perhaps an explicit_void_parms_node or something like that 
for during parsing; it looks like grokparms will turn it into 
void_list_node as other code expects.
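
The distinction being discussed can be pictured with a toy model.  This is only a sketch of the idea: ParamClause and both helper functions are hypothetical names and do not correspond to GCC's actual data structures or to the eventual patch.

```cpp
#include <cassert>

// Hypothetical model of what the parser saw between the parentheses.
enum class ParamClause {
  Empty,         // "int f()"
  ExplicitVoid,  // "int f(void)"
  List           // "int f(int)"
};

// After parsing, Empty and ExplicitVoid both mean "takes no parameters",
// which is why later code can treat them identically.
bool takes_no_parameters(ParamClause c) {
  return c != ParamClause::List;
}

// During parsing, only the empty form can be mistaken for an object
// definition, so only it is a candidate for the warning in this model.
// (Parenthesized parameter declarators are deliberately not modeled.)
bool may_warn_vexing_parse(ParamClause c) {
  return c == ParamClause::Empty;
}
```

The point of a separate marker during parsing is that the two predicates disagree only on ExplicitVoid; once the warning decision has been made, the marker can be collapsed into the shared representation.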


Jason



Re: PowerPC: Update IEEE 128-bit built-ins for long double is IEEE 128-bit.

2020-10-29 Thread Segher Boessenkool
On Thu, Oct 29, 2020 at 12:47:20PM -0400, Michael Meissner wrote:
> On Tue, Oct 27, 2020 at 09:38:20AM -0500, will schmidt wrote:
> > > @@ -2420,6 +2423,8 @@ BU_P9V_64BIT_VSX_2 (VSIEDPF, "scalar_insert_exp_dp", CONST,  xsiexpdpf)
> > > 
> > >  BU_FLOAT128_HW_VSX_2 (VSIEQP,    "scalar_insert_exp_q",  CONST,  xsiexpqp_kf)
> > >  BU_FLOAT128_HW_VSX_2 (VSIEQPF,   "scalar_insert_exp_qp", CONST,  xsiexpqpf_kf)
> > > +BU_FLOAT128_HW_VSX_2 (VSIETF,    "scalar_insert_exp_tf", CONST,  xsiexpqp_tf)
> > > +BU_FLOAT128_HW_VSX_2 (VSIETFF,   "scalar_insert_exp_tfp", CONST, xsiexpqpf_tf)
> > 
> > Ok if it's ok, but the pattern catches my eye.  Should that be VSIETFP?
> > (or named "scalar_insert_exp_tff")?
> 
> That is the existing function in the library.  All I'm doing is adding TF
> versions of the existing functions.

Sure, but logically the macro for scalar_insert_exp_tfp would be VSIETFP
(instead of VSIETFF) (and that is a new macro name fwiw).  So please fix
that?


Segher


Re: PowerPC: Add __float128 conversions to/from Decimal

2020-10-29 Thread Joseph Myers
On Thu, 29 Oct 2020, Segher Boessenkool wrote:

> Hi!
> 
> On Thu, Oct 29, 2020 at 12:45:15PM -0400, Michael Meissner wrote:
> > On Wed, Oct 28, 2020 at 07:04:31PM -0500, Segher Boessenkool wrote:
> > > > +#if HAVE_KF_MODE
> > > > +  strfromf128 (buf, BUFMAX, BFP_FMT, (BFP_VIA_TYPE) x);
> > > > +#else
> > > >sprintf (buf, BFP_FMT, (BFP_VIA_TYPE) x);
> > > > +#endif
> > > 
> > > Does strfromf128 exist everywhere we build this?  It isn't a standard
> > > function.
> > 
> > Yes, it is in ISO/IEC TS 18661-3, which is the document that describes most
> > of the *f128 functions.
> 
> But this means it does *not* exist in most places we build this?  Not the
> whole world is Linux (and even then, it is probably too recent an
> addition).

strfromf128 and strtof128 were added for powerpc64le-linux-gnu in glibc 
2.26.  (The variants that are namespace-clean in the absence of 18661-3, 
which may be relevant when being used for long double, __strfromieee128 
and __strtoieee128, were added in 2.32.)

Doing these conversions accurately is nontrivial.  Converting via strings 
is the simple approach (i.e. the one that moves the complexity somewhere 
else).  There are more complicated but more efficient approaches that can 
achieve correct conversions with smaller bounds on resource usage (and 
there are various papers published in this area), but those involve a lot 
more code (and precomputed data, with a speed/space trade-off in how much 
you precompute; the BID code in libgcc has several MB of precomputed data 
for that purpose).
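
As a rough illustration of the "convert via strings" approach, here is a minimal sketch.  It is not the libgcc code: it uses plain double with the standard snprintf and strtod instead of _Float128 with strfromf128, and the helper names are made up.  It only shows how the hard part is delegated to the C library's formatter and parser.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Format a binary floating-point value as a decimal string.
// 17 significant digits are enough to identify any IEEE binary64 value.
std::string to_decimal_string(double x) {
  char buf[64];
  std::snprintf(buf, sizeof buf, "%.17g", x);
  return std::string(buf);
}

// Parse a decimal string back into a floating-point value.  In a real
// binary-to-decimal conversion, this step would target the other format.
double from_decimal_string(const std::string &s) {
  return std::strtod(s.c_str(), nullptr);
}

// Going through the string and back should reproduce the input when the
// library's formatting and parsing are correctly rounded.
double round_trip(double x) {
  return from_decimal_string(to_decimal_string(x));
}
```

The accuracy of such a conversion is only as good as the library routines it calls, which is the sense in which the complexity is moved somewhere else.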

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PowerPC: Update IEEE 128-bit built-ins for long double is IEEE 128-bit.

2020-10-29 Thread Segher Boessenkool
On Thu, Oct 29, 2020 at 12:50:10PM -0400, Michael Meissner wrote:
> On Wed, Oct 28, 2020 at 06:27:42PM -0500, Segher Boessenkool wrote:
> > On Thu, Oct 22, 2020 at 06:09:38PM -0400, Michael Meissner wrote:
> > > This patch adds long double variants of the power10 __float128 built-in
> > > functions.  This is needed when long double uses IEEE 128-bit because
> > > __float128 uses TFmode in this case instead of KFmode.  If this patch is
> > > not applied, these built-in functions can't be used when long double is
> > > IEEE 128-bit.
> > 
> > But now they still cannot, you need new builtins, instead.
> > 
> > TFmode is an implementation detail at this level (functions use types,
> > not modes), so you do not need new builtins at all afaics?  Just define
> > the existing ones with TFmode as well (if that is the same as KFmode)?
> 
> In order to add new overloaded built-ins, you have to add a new built-in with
> a new name.

I do not follow?  Just delete the old non-overloaded one and add the
overloaded one with that same old name at the same time.

TF is a nasty name, it means a different thing externally (in the libgcc
function names, say: always IFmode) and internally (it varies what it
means).

> Maybe when Bill finally reorganizes the built-in functions, we can do away
> with having to create new named functions.  But for now, in order to add them,
> you need a name.

Of course.  And there already is a name :-)


Segher


Re: PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.

2020-10-29 Thread Segher Boessenkool
On Thu, Oct 29, 2020 at 12:56:03PM -0400, Michael Meissner wrote:
> On Wed, Oct 28, 2020 at 04:58:46PM -0500, Segher Boessenkool wrote:
> > >  #if defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IBM128__) \
> > >  && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> > >    return __builtin_pack_longdouble (dh, dl);
> > > +#elif defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IEEE128__) \
> > > +&& !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> > > +  return __builtin_pack_ibm128 (dh, dl);
> > 
> > Given the above, _SOFT_FLOAT etc. are wrong.
> > 
> > Just use some more portable thing to repack?  Is __builtin_pack_ibm128
> > not defined always here anyway?
> 
> That is the problem.  If you build a big endian PowerPC compiler where VSX is
> not default, the __ibm128 stuff is not defined.  It is only defined when
> __float128 is a possibility.  Hence __builtin_pack_ibm128 and
> __builtin_unpack_ibm128 are not defined.

So fix that?  When ibm128 is the only thing supported there is no reason
why __builtin_{un,}pack_ibm128 should not be supported (the ieee128
functions of course not, but there is no reason to not define the normal
names for the one supported thing).

> > /* 128-bit __ibm128 floating point builtins (use -mfloat128 to indicate that
> >__ibm128 is available).  */
> > #define BU_IBM128_2(ENUM, NAME, ATTR, ICODE)\
> >   RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
> > "__builtin_" NAME,  /* NAME */  \
> > (RS6000_BTM_HARD_FLOAT  /* MASK */  \
> >  | RS6000_BTM_FLOAT128),\
> > (RS6000_BTC_ ## ATTR/* ATTR */  \
> >  | RS6000_BTC_BINARY),  \
> > CODE_FOR_ ## ICODE) /* ICODE */
> > 
> > (so just HARD_FLOAT and FLOAT128 are needed)
> > 
> > What am I missing?
> 
> As I said, the __ibm128 keyword is not enabled on non-VSX systems.

So fix that?  It can easily be supported everywhere, after all.


Segher


[PATCH 2/3] Binutils: Pass --plugin to AR and RANLIB

2020-10-29 Thread H.J. Lu via Gcc-patches
Detect GCC LTO plugin.  Pass --plugin to AR and RANLIB to support LTO
build.

bfd/

* configure: Regenerated.

binutils/

* configure: Regenerated.

gas/

* configure: Regenerated.

gprof/

* configure: Regenerated.

ld/

* configure: Regenerated.

libctf/

* configure: Regenerated.

opcodes/

* configure: Regenerated.
---
 bfd/configure  | 27 +--
 binutils/configure | 27 +--
 gas/configure  | 27 +--
 gprof/configure| 27 +--
 ld/configure   | 27 +--
 libctf/configure   | 27 +--
 opcodes/configure  | 27 +--
 7 files changed, 175 insertions(+), 14 deletions(-)

diff --git a/bfd/configure b/bfd/configure
index 864e78851c..c518d9e5be 100755
--- a/bfd/configure
+++ b/bfd/configure
@@ -6824,6 +6824,19 @@ test -z "$deplibs_check_method" && deplibs_check_method=unknown
 
 
 
+plugin_option=
+plugin_names="liblto_plugin.so liblto_plugin-0.dll cyglto_plugin-0.dll"
+for plugin in $plugin_names; do
+  plugin_so=`${CC} ${CFLAGS} --print-prog-name $plugin`
+  if test x$plugin_so = x$plugin; then
+plugin_so=`${CC} ${CFLAGS} --print-file-name $plugin`
+  fi
+  if test x$plugin_so != x$plugin; then
+plugin_option="--plugin $plugin_so"
+break
+  fi
+done
+
 if test -n "$ac_tool_prefix"; then
   # Extract the first word of "${ac_tool_prefix}ar", so it can be a program name with args.
 set dummy ${ac_tool_prefix}ar; ac_word=$2
@@ -6917,6 +6930,11 @@ else
 fi
 
 test -z "$AR" && AR=ar
+if test -n "$plugin_option"; then
+  if $AR --help 2>&1 | grep -q "\--plugin"; then
+AR="$AR $plugin_option"
+  fi
+fi
 test -z "$AR_FLAGS" && AR_FLAGS=cru
 
 
@@ -7121,6 +7139,11 @@ else
 fi
 
 test -z "$RANLIB" && RANLIB=:
+if test -n "$plugin_option" && test "$RANLIB" != ":"; then
+  if $RANLIB --help 2>&1 | grep -q "\--plugin"; then
+RANLIB="$RANLIB $plugin_option"
+  fi
+fi
 
 
 
@@ -11729,7 +11752,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11732 "configure"
+#line 11755 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11835,7 +11858,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11838 "configure"
+#line 11861 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/binutils/configure b/binutils/configure
index 7c3113c6af..c4d19e406e 100755
--- a/binutils/configure
+++ b/binutils/configure
@@ -6616,6 +6616,19 @@ test -z "$deplibs_check_method" && deplibs_check_method=unknown
 
 
 
+plugin_option=
+plugin_names="liblto_plugin.so liblto_plugin-0.dll cyglto_plugin-0.dll"
+for plugin in $plugin_names; do
+  plugin_so=`${CC} ${CFLAGS} --print-prog-name $plugin`
+  if test x$plugin_so = x$plugin; then
+plugin_so=`${CC} ${CFLAGS} --print-file-name $plugin`
+  fi
+  if test x$plugin_so != x$plugin; then
+plugin_option="--plugin $plugin_so"
+break
+  fi
+done
+
 if test -n "$ac_tool_prefix"; then
   # Extract the first word of "${ac_tool_prefix}ar", so it can be a program name with args.
 set dummy ${ac_tool_prefix}ar; ac_word=$2
@@ -6709,6 +6722,11 @@ else
 fi
 
 test -z "$AR" && AR=ar
+if test -n "$plugin_option"; then
+  if $AR --help 2>&1 | grep -q "\--plugin"; then
+AR="$AR $plugin_option"
+  fi
+fi
 test -z "$AR_FLAGS" && AR_FLAGS=cru
 
 
@@ -6913,6 +6931,11 @@ else
 fi
 
 test -z "$RANLIB" && RANLIB=:
+if test -n "$plugin_option" && test "$RANLIB" != ":"; then
+  if $RANLIB --help 2>&1 | grep -q "\--plugin"; then
+RANLIB="$RANLIB $plugin_option"
+  fi
+fi
 
 
 
@@ -11552,7 +11575,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11555 "configure"
+#line 11578 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11658,7 +11681,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11661 "configure"
+#line 11684 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gas/configure b/gas/configure
index c1fff579c6..6b87cc2401 100755
--- a/gas/configure
+++ b/gas/configure
@@ -6408,6 +6408,19 @@ test -z "$deplibs_check_method" && deplibs_check_method=unknown
 
 
 
+plugin_option=
+plugin_names="liblto_plugin.so liblto_plugin-0.dll cyglto_plugin-0.dll"
+for plugin in $plugin_names; do
+  plugin_so=`${CC} ${CFLAGS} --print-prog-name $plugin`
+  if test x$plugin_so = x$plugin; then
+plugin_so=`${CC} ${CFLAGS} --print-file-name $plugin`
+  fi
+  if test x$plugin_so != x$plugin; then
+plugin_option="--plugin $plugin_so"
+break
+  fi
+done
+
 if test -n "$ac_tool_prefix"; then
   # Extract the first word of "${ac_tool_prefix}ar", so it can be a program name with args.
 set dummy 
