Re: [PATCH] [PING] Asan changes for RISC-V.

2020-11-06 Thread Kito Cheng via Gcc-patches
LGTM.

Verified with a Fedora rawhide image running on qemu, kernel version 5.5.0.
I got slightly different gcc testsuite results, but after reviewing all
failed cases, they should not be a blocker for this patch.

These look like environment issues, minor things such as stack
unwinding or the library search path, e.g.:
- ld complains that it cannot find -latomic
- The stack unwinding result does not show main in the asan report.

=== gcc Summary ===

# of expected passes            2672
# of unexpected failures        25
# of unresolved testcases       14
# of unsupported tests          224

=== g++ Summary ===

# of expected passes            1967
# of unexpected failures        20
# of unresolved testcases       15
# of unsupported tests          175




On Thu, Nov 5, 2020 at 4:11 AM Jim Wilson  wrote:
>
> On Wed, Oct 28, 2020 at 4:59 PM Jim Wilson  wrote:
>
> > We have only riscv64 asan support, there is no riscv32 support as yet.
> > So I need to be able to conditionally enable asan support for the riscv
> > target.  I implemented this by returning zero from the
> > asan_shadow_offset function.  This requires a change to toplev.c and
> > docs in target.def.
> >
> > The asan support works on a 5.5 kernel, but does not work on a 4.15 kernel.
> > The problem is that the asan high memory region is a small wedge below
> > 0x40.  The new kernel puts shared libraries at 0x3f and going down
> > which works.  But the old kernel puts shared libraries at 0x20 and
> > going up which does not work, as it isn't in any recognized memory
> > region.  This might be fixable with more asan work, but we don't really
> > need support for old kernel versions.
> >
> > The asan port is curious in that it uses 1<<29 for the shadow offset, but
> > all other 64-bit targets use a number larger than 1<<32.  But what we
> > have is working OK for now.
> >
> > I did a make check RUNTESTFLAGS="asan.exp" on Fedora rawhide image running
> > on qemu and the results look reasonable.
> >
> > === gcc Summary ===
> >
> > # of expected passes            1905
> > # of unexpected failures        11
> > # of unsupported tests          224
> >
> > === g++ Summary ===
> >
> > # of expected passes            2002
> > # of unexpected failures        6
> > # of unresolved testcases       1
> > # of unsupported tests          175
> >
> > OK?
> >
> > Jim
> >
> > 2020-10-28  Jim Wilson  
> >
> > gcc/
> > * config/riscv/riscv.c (riscv_asan_shadow_offset): New.
> > (TARGET_ASAN_SHADOW_OFFSET): New.
> > * doc/tm.texi: Regenerated.
> > * target.def (asan_shadow_offset): Mention that it can return zero.
> > * toplev.c (process_options): Check for and handle zero return from
> > targetm.asan_shadow_offset call.
> >
> > Co-Authored-By: cooper.joshua 
> > ---
> >  gcc/config/riscv/riscv.c | 16 
> >  gcc/doc/tm.texi  |  3 ++-
> >  gcc/target.def   |  3 ++-
> >  gcc/toplev.c |  3 ++-
> >  4 files changed, 22 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> > index 989a9f15250..6909e200de1 100644
> > --- a/gcc/config/riscv/riscv.c
> > +++ b/gcc/config/riscv/riscv.c
> > @@ -5299,6 +5299,19 @@ riscv_gpr_save_operation_p (rtx op)
> >return true;
> >  }
> >
> > +/* Implement TARGET_ASAN_SHADOW_OFFSET.  */
> > +
> > +static unsigned HOST_WIDE_INT
> > +riscv_asan_shadow_offset (void)
> > +{
> > +  /* We only have libsanitizer support for RV64 at present.
> > +
> > + This number must match kRiscv*_ShadowOffset* in the file
> > + libsanitizer/asan/asan_mapping.h which is currently 1<<29 for rv64,
> > + even though 1<<36 makes more sense.  */
> > +  return TARGET_64BIT ? (HOST_WIDE_INT_1 << 29) : 0;
> > +}
> > +
> >  /* Initialize the GCC target structure.  */
> >  #undef TARGET_ASM_ALIGNED_HI_OP
> >  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
> > @@ -5482,6 +5495,9 @@ riscv_gpr_save_operation_p (rtx op)
> >  #undef TARGET_NEW_ADDRESS_PROFITABLE_P
> >  #define TARGET_NEW_ADDRESS_PROFITABLE_P riscv_new_address_profitable_p
> >
> > +#undef TARGET_ASAN_SHADOW_OFFSET
> > +#define TARGET_ASAN_SHADOW_OFFSET riscv_asan_shadow_offset
> > +
> >  struct gcc_target targetm = TARGET_INITIALIZER;
> >
> >  #include "gt-riscv.h"
> > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > index 24c37f655c8..39c596b647a 100644
> > --- a/gcc/doc/tm.texi
> > +++ b/gcc/doc/tm.texi
> > @@ -12078,7 +12078,8 @@ is zero, which disables this optimization.
> >  @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_ASAN_SHADOW_OFFSET (void)
> >  Return the offset bitwise ored into shifted address to get corresponding
> >  Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not
> > -supported by the target.
> > +supported by the target.  May return 0 if Address Sanitizer is not supported
> > +by a subtarget.
> >  @end deftypefn
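
As a rough illustration of the shadow offset discussion above, here is a minimal C sketch of the usual AddressSanitizer mapping, shadow = (addr >> 3) + offset, together with the "zero means unsupported" convention that the patch introduces. The names sketch_asan_shadow_offset and sketch_shadow_address and the scale constant are assumptions made for this example; they are not GCC or libsanitizer functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constant: 8 application bytes map to 1 shadow byte.
   The authoritative values live in libsanitizer/asan/asan_mapping.h.  */
#define SKETCH_SHADOW_SCALE 3

/* Mirrors the hook in the patch: 1<<29 for RV64, 0 for RV32.  */
static uint64_t
sketch_asan_shadow_offset (int target_64bit)
{
  return target_64bit ? (UINT64_C (1) << 29) : 0;
}

/* Compute the shadow address for ADDR.  Returns 0 and leaves *SHADOW
   untouched when the subtarget reports "unsupported" via a zero offset.  */
static int
sketch_shadow_address (uint64_t addr, int target_64bit, uint64_t *shadow)
{
  assert (shadow != NULL);
  uint64_t offset = sketch_asan_shadow_offset (target_64bit);
  if (offset == 0)
    return 0;
  *shadow = (addr >> SKETCH_SHADOW_SCALE) + offset;
  return 1;
}
```

With these assumptions, address 0x1000 on RV64 maps to shadow address 0x20000200, while the RV32 case is rejected before any mapping is attempted.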

Re: Fwd: libstdc++: Attempt to resolve PR83562

2020-11-06 Thread Liu Hao via Gcc-patches
Ping?






On 2020/10/29 at 3:56 PM, Liu Hao wrote:
> I forward it here for comments.
> 
> Based on the behavior of both GCC and Clang, `__cxa_thread_atexit` is used
> to register the destructor of thread_local objects directly, suggesting the
> first parameter should have the `__thiscall` convention.
> 
> libstdc++ used the default `__cdecl` convention and caused crashes on
> i686-w64-mingw32 (see PR83562). To my surprise, libcxxabi uses `__cdecl`
> too [1], but I haven't heard of any relevant reports so far.
> 
> Original patch is attached in case you can't find it in gcc-patches.
> 
> 
> [1]
> https://github.com/llvm/llvm-project/blob/97b351a827677ebbedc10bfbce8ef8844c246553/libcxxabi/src/cxa_thread_atexit.cpp#L22
> 
> 



-- 
Best regards,
LH_Mouse





Re: Fwd: libstdc++: Attempt to resolve PR83562

2020-11-06 Thread Martin Storsjö

On Fri, 6 Nov 2020, Liu Hao via Gcc-patches wrote:

> On 2020/10/29 at 3:56 PM, Liu Hao wrote:
> > I forward it here for comments.
> >
> > Based on the behavior of both GCC and Clang, `__cxa_thread_atexit` is used
> > to register the destructor of thread_local objects directly, suggesting the
> > first parameter should have the `__thiscall` convention.
> >
> > libstdc++ used the default `__cdecl` convention and caused crashes on
> > i686-w64-mingw32 (see PR83562). To my surprise, libcxxabi uses `__cdecl`
> > too [1], but I haven't heard of any relevant reports so far.
> >
> > Original patch is attached in case you can't find it in gcc-patches.



FWIW, this patch looks good and correct to me, from a mingw perspective.

// Martin


Re: [PATCH] dumpfile.c: use prefixes other that 'note: ' for MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION}

2020-11-06 Thread Thomas Schwinge
Hi!

On 2018-09-25T16:00:14-0400, David Malcolm  wrote:
> As noted at Cauldron, dumpfile.c currently emits "note: " for all kinds
> of dump message, so that (after filtering) there's no distinction between
> MSG_OPTIMIZED_LOCATIONS vs MSG_NOTE vs MSG_MISSED_OPTIMIZATION in the
> textual output.
>
> This patch changes dumpfile.c so that the "note: " varies to show
> which MSG_* was used, with the string prefix matching that used for
> filtering in -fopt-info, hence e.g.
>   directive_unroll_3.f90:24:0: optimized: loop unrolled 7 times
> and:
>   pr19210-1.c:24:3: missed: missed loop optimization: niters analysis ends up with assumptions.
>
> The patch adds "dg-optimized" and "dg-missed" directives for use
> in the testsuite for matching these (with -fopt-info on stderr; they
> don't help for dumpfile output).

Thanks, this is very useful.


I just ran into a problem regarding these two:

> --- a/gcc/testsuite/lib/gcc-dg.exp
> +++ b/gcc/testsuite/lib/gcc-dg.exp

> +# Handle output from -fopt-info for MSG_OPTIMIZED_LOCATIONS:
> +# a successful optimization.
> +
> +proc dg-optimized { args } {
> +# Make this variable available here and to the saved proc.
> +upvar dg-messages dg-messages
> +
> +process-message saved-dg-error "optimized: " "$args"
> +}
> +
> +# Handle output from -fopt-info for MSG_MISSED_OPTIMIZATION:
> +# a missed optimization.
> +
> +proc dg-missed { args } {
> +# Make this variable available here and to the saved proc.
> +upvar dg-messages dg-messages
> +
> +process-message saved-dg-error "missed: " "$args"
> +}

If, in addition to the usual line location checking, you'd like to do
column location checking ("[column]: " prefix before the actual
diagnostic), and the actual diagnostic doesn't begin with whitespace,
then this currently fails.  To address this, OK to push the attached
patch "[testsuite] Enable column location checking for 'dg-optimized',
'dg-missed'" -- with or without the demonstrator
'gcc.dg/vect/nodump-vect-opt-info-1.c',
'gcc.dg/vect/nodump-vect-opt-info-2.c' changes, your call?  (I still have
to run this through regression testing.)
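
To make the failure mode concrete, here is a small C sketch using POSIX regex.h. It assumes, as a simplification, that the search expression is assembled as COLUMN, ": ", the message prefix, "[^\n]*", and then whatever remains of the directive text after "COLUMN:" is stripped (so the remainder starts with a space). The helper sketch_column_match and the sample diagnostic line are hypothetical; the real logic is Tcl code in gcc-dg.exp.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: build "COLUMN: PREFIX[^\n]*REST" and report whether
   it matches LINE.  REST is the directive text with "COLUMN:" removed, so
   it begins with a space.  Returns 1 on a match, 0 otherwise.  */
static int
sketch_column_match (const char *line, const char *column,
                     const char *prefix, const char *rest)
{
  char pattern[256];
  regex_t re;
  int n = snprintf (pattern, sizeof pattern, "%s: %s[^\n]*%s",
                    column, prefix, rest);
  assert (n > 0 && (size_t) n < sizeof pattern);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  int matched = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

Under these assumptions, the prefix "optimized: " (with trailing space) demands two spaces before "loop vectorized" and so fails against a line such as "x.c:11:21: optimized: loop vectorized", whereas the prefix "optimized:" matches it.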


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
From f3046b8bea6a2a6489dd10d72cb038b92aa4fc38 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 6 Nov 2020 09:18:06 +0100
Subject: [PATCH] [testsuite] Enable column location checking for
 'dg-optimized', 'dg-missed'

'process-message' would like the 'msgprefix' argument without trailing space.

This is a bug-fix for commit ed2d9d3720adef3a260b8a55e17e744352a901fc
"dumpfile.c: use prefixes other than 'note: ' for
MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION}", which added 'dg-optimized',
'dg-missed'.

	gcc/testsuite/
	* lib/gcc-dg.exp (dg-optimized, dg-missed): Fix 'process-message'
	call.
	* gcc.dg/vect/nodump-vect-opt-info-1.c: Demonstrate.
	* gcc.dg/vect/nodump-vect-opt-info-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-1.c | 4 ++--
 gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-2.c | 4 ++--
 gcc/testsuite/lib/gcc-dg.exp   | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-1.c b/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-1.c
index 3bfe498ef0a2..6834b9a9d0b9 100644
--- a/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-1.c
+++ b/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-1.c
@@ -5,8 +5,8 @@ void
 vadd (int *dst, int *op1, int *op2, int count)
 {
 /* { dg-prune-output " version\[^\n\r]* alignment" } */
-/* { dg-optimized "loop vectorized" "" { target *-*-* } .+2 } */
-/* { dg-optimized "loop versioned for vectorization because of possible aliasing" "" { target *-*-* } .+1 } */
+/* { dg-optimized "21: loop vectorized" "" { target *-*-* } .+2 } */
+/* { dg-optimized "21: loop versioned for vectorization because of possible aliasing" "" { target *-*-* } .+1 } */
   for (int i = 0; i < count; ++i)
 dst[i] = op1[i] + op2[i];
 }
diff --git a/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-2.c b/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-2.c
index 94c55a92bb4f..23a3b39fbb32 100644
--- a/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-2.c
+++ b/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-2.c
@@ -6,7 +6,7 @@ extern void accumulate (int x, int *a);
 int test_missing_function_defn (int *arr, int n) /* { dg-message "vectorized 0 loops in function" } */
 {
   int sum = 0;
-  for (int i = 0; i < n; ++i) /* { dg-missed "couldn't vectorize loop" } */
-accumulate (arr[i], &sum); /* { dg-missed "statement clobbers memory: accumulate \\(.*\\);" } */
+  for (int i = 0; i < n; ++i) /* { dg-missed "21: couldn't vectorize loop" } */
+accumulate (arr[i], &sum); /* { dg-missed "5: statement clobbers memory: accumulate \\(.*\\);" } */
   return sum;
 }
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsu

[PATCH] tree-optimization/97732 - fix init of SLP induction vectorization

2020-11-06 Thread Richard Biener
This PR exposes two issues - one that the vector builder treats
&x as eligible for VECTOR_CST elements and one that SLP induction
vectorization forgets to convert init elements to the vector
component type which makes a difference for pointer vs. integer.
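
The distinction between the two predicates can be sketched with a toy model in C. Nothing below is GCC's real tree API: the toy_ names are invented for this example, and the model only captures the idea that an address such as &x can carry the "constant" flag without being a literal constant node, which is what a VECTOR_CST element needs to be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GCC tree nodes; not GCC's real API.  */
enum toy_code { TOY_INTEGER_CST, TOY_ADDR_EXPR, TOY_SSA_NAME };

struct toy_tree
{
  enum toy_code code;
  bool constant_flag;   /* models TREE_CONSTANT: value is invariant */
};

/* Models CONSTANT_CLASS_P: true only for literal constant nodes.  */
static bool
toy_constant_class_p (const struct toy_tree *t)
{
  assert (t != NULL);
  return t->code == TOY_INTEGER_CST;
}

/* Models the fixed check in gimple_build_vector: the elements may form a
   VECTOR_CST only if every one of them is a literal constant.  */
static bool
toy_all_elts_ok_for_vector_cst (const struct toy_tree *elts, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    if (!toy_constant_class_p (&elts[i]))
      return false;
  return true;
}
```

In this model a vector containing an ADDR_EXPR element is rejected even though that element has its constant flag set, which is the behavior the patch switches to.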

Bootstrap & regtest pending on x86_64-unknown-linux-gnu.

2020-11-06  Richard Biener  

PR tree-optimization/97732
* tree-vect-loop.c (vectorizable_induction): Convert the
init elements to the vector component type.
* gimple-fold.c (gimple_build_vector): Use CONSTANT_CLASS_P
rather than TREE_CONSTANT to determine if elements are
eligible for VECTOR_CSTs.

* gcc.dg/vect/bb-slp-pr97732.c: New testcase.
---
 gcc/gimple-fold.c  |  2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-pr97732.c | 11 +++
 gcc/tree-vect-loop.c   |  4 
 3 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr97732.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index c3fa4cb7cc1..ca38a31c3c2 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -7855,7 +7855,7 @@ gimple_build_vector (gimple_seq *seq, location_t loc,
   gcc_assert (builder->nelts_per_pattern () <= 2);
   unsigned int encoded_nelts = builder->encoded_nelts ();
   for (unsigned int i = 0; i < encoded_nelts; ++i)
-if (!TREE_CONSTANT ((*builder)[i]))
+if (!CONSTANT_CLASS_P ((*builder)[i]))
   {
tree type = builder->type ();
unsigned int nelts = TYPE_VECTOR_SUBPARTS (type).to_constant ();
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr97732.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr97732.c
new file mode 100644
index 000..5187090797d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr97732.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+
+struct S { int a, b; } *e;
+int d;
+
+void
+foo (struct S *x)
+{
+  for (e = x; d; d++, e++)
+e->a = e->b = (int) (__UINTPTR_TYPE__) e;
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index ef2ea3d0fb0..0ba37540d5d 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -8001,6 +8001,10 @@ vectorizable_induction (loop_vec_info loop_vinfo,
{
  /* The scalar inits of the IVs if not vectorized.  */
  elt = inits[(ivn*const_nunits + eltn) % group_size];
+ if (!useless_type_conversion_p (TREE_TYPE (vectype),
+ TREE_TYPE (elt)))
+   elt = gimple_build (&init_stmts, VIEW_CONVERT_EXPR,
+   TREE_TYPE (vectype), elt);
  init_elts.quick_push (elt);
}
  /* The number of steps to add to the initial values.  */
-- 
2.26.2


Re: [PATCH] SLP: Move load/store-lanes check till late

2020-11-06 Thread Christophe Lyon via Gcc-patches
On Thu, 5 Nov 2020 at 11:21, Tamar Christina via Gcc-patches
 wrote:
>
> > -Original Message-
> > From: rguent...@c653.arch.suse.de  On
> > Behalf Of Richard Biener
> > Sent: Thursday, November 5, 2020 10:17 AM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> >
> > On Wed, 4 Nov 2020, Tamar Christina wrote:
> >
> > > Hi Richi,
> > >
> > > > -Original Message-
> > > > From: rguent...@c653.arch.suse.de  On
> > > > Behalf Of Richard Biener
> > > > Sent: Wednesday, November 4, 2020 8:07 AM
> > > > To: Tamar Christina 
> > > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > > >
> > > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > > >
> > > > > Hi Richi,
> > > > >
> > > > > We decided to take the regression in any code-gen this could give
> > > > > and fix it properly next stage-1.  As such here's a new patch
> > > > > based on your previous feedback.
> > > > >
> > > > > Ok for master?
> > > >
> > > > Looks good so far but be aware that you elide the
> > > >
> > > > - && vect_store_lanes_supported
> > > > -  (STMT_VINFO_VECTYPE (scalar_stmts[0]), group_size,
> > > > false))
> > > >
> > > > part of the check - that is, you don't verify the store part of the
> > > > instance can use store-lanes.  Btw, this means the original code
> > > > cancelled an instance only when the SLP graph entry is a store-lane
> > > > capable store but your variant would also cancel in case there's a
> > > > load-lane capable reduction.
> > > >
> > >
> > > I do still have it,
> > >
> > >   if (loads_permuted
> > >   && vect_store_lanes_supported (vectype, group_size, false))
> > >
> > > I just grab the type from the SLP_TREE_VECTYPE (slp_root); which
> > > should be the store if one exists.
> > >
> > > > I think that you eventually want to re-instantiate the store-lane
> > > > check but treat it the same as any of the load checks (thus not
> > > > require all instances to be stores for the cancellation).
> > > > But at least when a store cannot use store-lanes we probably
> > > > shouldn't cancel the SLP.
> > >
> > > I did however elide the kind check, that was added as part of the
> > > rebase, it looked like kind wasn't being stored inside the SLP instance
> > > and I'd have to redo the analysis to find it.
> > >
> > > Does it sound reasonable to include kind as a field in the SLP instance?
> > >
> > > >
> > > > Anyway, the patch is OK for master.  The store-lane check part can
> > > > be re- added as followup.
> > > >
> > >
> > > Thanks! Will do.
> >
> > Btw, the patch regressed
> >
> > FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump vect "Built SLP cancelled: can use load/store-lanes"
> > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump vect "LOAD_LANES"
> > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump vect "STORE_LANES"
> > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP cancelled: can use load/store-lanes"
> > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
> > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES"
> >
> > on x86_64.  The slp-11b.c testcase is interesting since there extract_muldiv
> > folding makes the group of four stores not matching so we split into a size
> > of 3 and one remaining store.
> > This causes us to arrive at
> >
> > note:   node 0x441a940 (max_nunits=4, refcnt=2)
> > note:   stmt 0 _2 = in[_1];
> > note:   stmt 1 _6 = in[_5];
> > note:   stmt 2 _10 = in[_8];
> > note:   load permutation { 0 2 1 }
> >
> > which on x86_64 we in the end cannot handle (without SSE4 I think) so it
> > fails to SLP there.  Guess arm can do the permute but not the load-lane here.
> >
> > For gcc.dg/vect/slp-perm-6.c the XFAILs shouldn't be done
> > for !vect_load_lanes targets.  Not sure if that's possible easily, like
> > with a { target vect_load_lanes } { xfail vect_load_lanes } combo ...?  I
> > suggest to make it xfail everywhere instead and add a comment as to we're
> > expecting those only for vect_load_lanes targets.
>
> Yes just fixed these, the change in gcc.dg/vect/slp-11b.c shouldn't be there
> and I updated the target selector properly :)
>

Hi Tamar,

This patch (r11-4728) introduced a regression on arm:
FAIL:gcc.dg/vect/slp-reduc-9.c -flto -ffat-lto-objects
scan-tree-dump vect "vectorized 1 loops"
FAIL:gcc.dg/vect/slp-reduc-9.c -flto -ffat-lto-objects
scan-tree-dump vect "vectorizing stmts using SLP"
FAIL:gcc.dg/vect/slp-reduc-9.c scan-tree-dump vect "vectorized 1 loops"
FAIL:gcc.dg/vect/slp-r

Re: [PATCH] dumpfile.c: use prefixes other that 'note: ' for MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION}

2020-11-06 Thread Thomas Schwinge
Hi!

On 2018-09-25T16:00:14-0400, David Malcolm  wrote:
> The patch adds "dg-optimized" and "dg-missed" directives

Another small thing I just noticed:

> --- a/gcc/testsuite/lib/gcc-dg.exp
> +++ b/gcc/testsuite/lib/gcc-dg.exp

> +# Handle output from -fopt-info for MSG_OPTIMIZED_LOCATIONS:
> +# a successful optimization.
> +
> +proc dg-optimized { args } {
> +# Make this variable available here and to the saved proc.
> +upvar dg-messages dg-messages
> +
> +process-message saved-dg-error "optimized: " "$args"
> +}
> +
> +# Handle output from -fopt-info for MSG_MISSED_OPTIMIZATION:
> +# a missed optimization.
> +
> +proc dg-missed { args } {
> +# Make this variable available here and to the saved proc.
> +upvar dg-messages dg-messages
> +
> +process-message saved-dg-error "missed: " "$args"
> +}

These currently print "(test for *errors*, line [...])".  However, these
diagnostics are not actually error diagnostics (fatal, meaning: causes
compilation to fail) but rather warning diagnostics (non-fatal, doesn't
cause compilation to fail).  Thus, same as 'dg-message', these should use
'saved-dg-warning' instead of 'saved-dg-error', which will print: "(test
for *warnings*, line [...])".  OK to change that after regression
testing?


Grüße
 Thomas


Re: [r11-4733 Regression] FAIL: gcc.dg/guality/pr54519-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects -DPREVENT_OPTIMIZATION line 17 y == 25 on Linux/x86_64

2020-11-06 Thread Richard Biener
On Thu, 5 Nov 2020, sunil.k.pandey wrote:

> On Linux/x86_64,
> 
> 1436ef2a57e79b6b8ce5b03e32a38dd64f46c97c is the first bad commit
> commit 1436ef2a57e79b6b8ce5b03e32a38dd64f46c97c
> Author: Richard Biener 
> Date:   Thu Nov 5 09:27:28 2020 +0100
> 
> debug/97718 - fix abstract origin references after last change
> 
> caused
> 
> FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  -DPREVENT_OPTIMIZATION line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  -DPREVENT_OPTIMIZATION line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  -DPREVENT_OPTIMIZATION line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  -DPREVENT_OPTIMIZATION line 23 z == 8
> FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 20 y == 25
> FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 20 z == 6
> FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 23 y == 117
> FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 23 z == 8
> FAIL: gcc.dg/guality/pr54519-4.c   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none  -DPREVENT_OPTIMIZATION line 17 y == 25
> FAIL: gcc.dg/guality/pr54519-4.c   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 17 y == 25

Note this returns us to the FAIL state before
104ca9cfa60aa1d5ddd3574bed012d394e8c.

The sequence of changes might help to debug the FAILs, though, which may
also be consumer issues.  The interesting thing is that
104ca9cfa60aa1d5ddd3574bed012d394e8c, despite producing "broken" abstract
origin references, avoided some of the FAILs.  But debug of partial inlining
is an odd beast with issues.

Richard.


[committed] common: Remove DEBUG_FUNCTION from verify_sequence_points

2020-11-06 Thread Jakub Jelinek via Gcc-patches
Hi!

While perhaps the function name might suggest that it is a
verification/debugging only routine, it is actually the implementation of
the -Wsequence-point warning and so doesn't need the DEBUG_FUNCTION macro
on it.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk as
obvious.

2020-11-06  Jakub Jelinek  

* c-common.c (verify_sequence_points): Remove DEBUG_FUNCTION.

--- gcc/c-family/c-common.c.jj  2020-11-03 11:15:07.169681012 +0100
+++ gcc/c-family/c-common.c 2020-11-04 19:26:59.974555602 +0100
@@ -2045,7 +2045,7 @@ verify_tree (tree x, struct tlist **pbef
 /* Try to warn for undefined behavior in EXPR due to missing sequence
points.  */
 
-DEBUG_FUNCTION void
+void
 verify_sequence_points (tree expr)
 {
   struct tlist *before_sp = 0, *after_sp = 0;

Jakub



[PATCH] c++: Propagate attributes to clones in duplicate_decls [PR67453]

2020-11-06 Thread Jakub Jelinek via Gcc-patches
Hi!

On the following testcase where the cdtor attributes aren't on the
in-class declaration but on an out-of-class definition, the cdtors
have their clones created from the in-class declaration, and later on
duplicate_decls updates attributes on the abstract cdtors, but nothing
propagates them to the clones.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2020-11-06  Jakub Jelinek  

PR c++/67453
* decl.c (duplicate_decls): Propagate DECL_ATTRIBUTES and
DECL_PRESERVE_P from olddecl to its clones if any.

* g++.dg/ext/attr-used-2.C: New test.

--- gcc/cp/decl.c.jj    2020-11-03 21:42:00.536043737 +0100
+++ gcc/cp/decl.c   2020-11-05 17:33:40.064072970 +0100
@@ -2921,6 +2921,16 @@ duplicate_decls (tree newdecl, tree oldd
snode->remove ();
 }
 
+  if (TREE_CODE (olddecl) == FUNCTION_DECL)
+{
+  tree clone;
+  FOR_EACH_CLONE (clone, olddecl)
+   {
+ DECL_ATTRIBUTES (clone) = DECL_ATTRIBUTES (olddecl);
+ DECL_PRESERVE_P (clone) |= DECL_PRESERVE_P (olddecl);
+   }
+}
+
   /* Remove the associated constraints for newdecl, if any, before
  reclaiming memory. */
   if (flag_concepts)
--- gcc/testsuite/g++.dg/ext/attr-used-2.C.jj   2020-11-05 17:42:49.895949119 +0100
+++ gcc/testsuite/g++.dg/ext/attr-used-2.C  2020-11-05 17:42:07.934416482 +0100
@@ -0,0 +1,15 @@
+// PR c++/67453
+// { dg-do compile }
+// { dg-final { scan-assembler "_ZN1SC\[12]Ev" } }
+// { dg-final { scan-assembler "_ZN1SD\[12]Ev" } }
+// { dg-final { scan-assembler "_ZN1SC\[12]ERKS_" } }
+
+struct S {
+S();
+~S();
+S(const S&);
+};
+
+__attribute__((used)) inline S::S()  { }
+__attribute__((used)) inline S::~S() { }
+__attribute__((used)) inline S::S(const S&) { }

Jakub



Add 'dg-note' next to 'dg-optimized', 'dg-missed' (was: [PATCH] dumpfile.c: use prefixes other that 'note: ' for MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION})

2020-11-06 Thread Thomas Schwinge
Hi, again!

On 2018-09-25T16:00:14-0400, David Malcolm  wrote:
> As noted at Cauldron, dumpfile.c currently emits "note: " for all kinds
> of dump message, so that (after filtering) there's no distinction between
> MSG_OPTIMIZED_LOCATIONS vs MSG_NOTE vs MSG_MISSED_OPTIMIZATION in the
> textual output.
>
> This patch changes dumpfile.c so that the "note: " varies to show
> which MSG_* was used, with the string prefix matching that used for
> filtering in -fopt-info, hence e.g.
>   directive_unroll_3.f90:24:0: optimized: loop unrolled 7 times
> and:
>   pr19210-1.c:24:3: missed: missed loop optimization: niters analysis ends up with assumptions.

(However, 'MSG_NOTE'/'note: ' also still remains used for "general
optimization info".)

> The patch adds "dg-optimized" and "dg-missed" directives

> --- a/gcc/testsuite/lib/gcc-dg.exp
> +++ b/gcc/testsuite/lib/gcc-dg.exp

> +# Handle output from -fopt-info for MSG_OPTIMIZED_LOCATIONS:
> +# a successful optimization.
> +
> +proc dg-optimized { args } {
> +# Make this variable available here and to the saved proc.
> +upvar dg-messages dg-messages
> +
> +process-message saved-dg-error "optimized: " "$args"
> +}
> +
> +# Handle output from -fopt-info for MSG_MISSED_OPTIMIZATION:
> +# a missed optimization.
> +
> +proc dg-missed { args } {
> +# Make this variable available here and to the saved proc.
> +upvar dg-messages dg-messages
> +
> +process-message saved-dg-error "missed: " "$args"
> +}
> +

Next to these, I'm proposing to add 'dg-note', see attached "[WIP] Add
'dg-note' next to 'dg-optimized'", which may be used instead of generic
'dg-message' (which in current uses in testcases often doesn't scan for
the 'note: ' prefix, by the way).

The proposed 'dg-note' has the additional property that "if dg-note is
used once, [notes] must *all* be handled explicitly".  The rationale is
that either you're not interested in notes at all (default behavior of
pruning all notes), but often, when you're interested in one note, you're
in fact interested in all notes, and especially interested if
*additional* notes appear over time, as GCC evolves.  It seemed somewhat
useful, but I'm not insisting on coupling the disabling of notes pruning
on 'dg-note' usage, so if anyone feels strongly about that, please speak
up.

TODO document (also 'dg-optimized', 'dg-missed')

TODO 'gcc/testsuite/lib/lto.exp' change necessary/desirable?

The latter got added in commit 824721f0905478ebc39e6a295cc8e95c22fa9d17
"lto, testsuite: Fix ICE in -Wodr (PR lto/83121)".  David, do you happen
to have an opinion on that one?


Grüße
 Thomas


From bb293fff7580025a3b78fc1619d8bf0d8f8b8a1a Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 6 Nov 2020 09:01:26 +0100
Subject: [PATCH] [WIP] Add 'dg-note' next to 'dg-optimized', 'dg-missed'

TODO document (also 'dg-optimized', 'dg-missed')

TODO 'gcc/testsuite/lib/lto.exp' change necessary/desirable?
---
 .../gcc.dg/vect/nodump-vect-opt-info-2.c  |  4 ++-
 gcc/testsuite/lib/gcc-dg.exp  | 26 +++
 gcc/testsuite/lib/lto.exp |  7 +++--
 gcc/testsuite/lib/prune.exp   |  7 +++--
 4 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-2.c b/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-2.c
index 23a3b39fbb32..bcdf7f076715 100644
--- a/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-2.c
+++ b/gcc/testsuite/gcc.dg/vect/nodump-vect-opt-info-2.c
@@ -3,7 +3,9 @@
 
 extern void accumulate (int x, int *a);
 
-int test_missing_function_defn (int *arr, int n) /* { dg-message "vectorized 0 loops in function" } */
+int test_missing_function_defn (int *arr, int n) /* { dg-note "5: vectorized 0 loops in function" } */
+/* { dg-prune-output "note: " } as we're not interested in matching any further
+   notes.  */
 {
   int sum = 0;
   for (int i = 0; i < n; ++i) /* { dg-missed "21: couldn't vectorize loop" } */
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 700529afbe25..c6ff07ab1376 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -1012,6 +1012,8 @@ if { [info procs saved-dg-test] == [list] } {
 	}
 	unset save_linenr_varnames
 	}
+
+	initialize_prune_notes
 }
 
 proc dg-test { args } {
@@ -1245,6 +1247,30 @@ proc dg-missed { args } {
 process-message saved-dg-warning "missed:" "$args"
 }
 
+# Handle output from -fopt-info for MSG_NOTE:
+# a general optimization info.
+# By default, such notes are pruned, but if dg-note is used once, they must all
+# be handled explicitly.
+
+variable prune_notes
+
+proc initialize_prune_notes { } {
+global prune_notes
+set prune_notes 1
+}
+
+initialize_prune_notes
+
+proc dg-note { args } {
+# Make this variabl

[PATCH v3] Include checking of 0 cost dependency due to bypass in rank_for_schedule

2020-11-06 Thread Jojo R
Insn seqs before sched:

.L1:
a5 = insn-1 (a0)
a6 = insn-2 (a1)
a7 = insn-3 (a7, a5)
a8 = insn-4 (a8, a6)
Jmp .L1

insn-3 and insn-4 have REG_DEP_TRUE dependencies on insn-1 and insn-2,
so insn-3 and insn-4 end up last in the ready list.
This patch also puts a 0 cost dependency due to a bypass into the
highest numbered class, for targets that have a forwarding
feature between DEP_PRO and DEP_CON.

If the insns are in the same cost class under -fsched-last-insn-heuristic,
we fall through to "prefer the insn which has more later insns that depend
on it".  There the return value of dep_list_size() is not suitable, because
it counts all dependencies of the insn.
We need to ignore the ones that have a 0 cost dependency due to a bypass.
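
The counting rule described above can be sketched as a toy model in C. The struct toy_dep and its fields are hypothetical stand-ins for what haifa-sched.c derives from a dep_t; only the shape of the logic (return -1 when no bypass applies, skip debug insns, skip zero-cost bypasses) follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy dependence record; the fields are hypothetical stand-ins.  */
struct toy_dep
{
  bool involves_debug_insn;  /* producer or consumer is a debug insn */
  bool has_bypass;           /* a define_bypass applies to this pair */
  int cost;                  /* latency of the dependence */
};

/* Models dep_cost_bypass: the cost when a bypass applies, otherwise -1.  */
static int
toy_dep_cost_bypass (const struct toy_dep *dep)
{
  if (dep == NULL || !dep->has_bypass)
    return -1;
  return dep->cost;
}

/* Models dep_list_costs: count nondebug deps, skipping zero-cost bypasses.  */
static int
toy_dep_list_costs (const struct toy_dep *deps, size_t n)
{
  assert (deps != NULL || n == 0);
  int costs = 0;
  for (size_t i = 0; i < n; ++i)
    if (!deps[i].involves_debug_insn && toy_dep_cost_bypass (&deps[i]) != 0)
      costs++;
  return costs;
}
```

In this model a consumer reached through a zero-latency bypass no longer adds to the producer's count, so it does not influence the "more later insns depend on it" tie-break.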

With this patch and pipeline description as below:

(define_bypass 0 "insn-1, insn-2" "insn-3, insn-4")

We can get better insn seqs after sched:

.L1:
a5 = insn-1 (a0)
a7 = insn-3 (a7, a5)
a6 = insn-2 (a1)
a8 = insn-4 (a8, a6)
Jmp .L1

I have tested on ck860 of C-SKY arch and C960 of T-Head based on RISCV arch

gcc/
* haifa-sched.c (dep_list_costs): New.
(rank_for_schedule): Replace dep_list_size with dep_list_costs.
Add 0 cost dependency due to bypass on -fsched-last-insn-heuristic.

---
 gcc/haifa-sched.c | 49 +++
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 350178c82b8..51c6d23d3a5 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -1584,6 +1584,44 @@ dep_list_size (rtx_insn *insn, sd_list_types_def list)
   return nodbgcount;
 }
 
+/* Get the bypass cost of dependence DEP.  */
+
+HAIFA_INLINE static int
+dep_cost_bypass(dep_t dep)
+{
+  if (dep == NULL)
+    return -1;
+
+  if (INSN_CODE (DEP_PRO (dep)) >= 0
+      && bypass_p (DEP_PRO (dep))
+      && recog_memoized (DEP_CON (dep)) >= 0)
+    return dep_cost (dep);
+
+  return -1;
+}
+
+/* Compute the costs of nondebug deps in list LIST for INSN.  */
+
+static int
+dep_list_costs (rtx_insn *insn, sd_list_types_def list)
+{
+  sd_iterator_def sd_it;
+  dep_t dep;
+  int costs = 0;
+
+  FOR_EACH_DEP (insn, list, sd_it, dep)
+    {
+      if (!DEBUG_INSN_P (DEP_CON (dep))
+          && !DEBUG_INSN_P (DEP_PRO (dep)))
+        {
+          if (dep_cost_bypass (dep) != 0)
+            costs++;
+        }
+    }
+
+  return costs;
+}
+
 bool sched_fusion;
 
 /* Compute the priority number for INSN.  */
@@ -2758,10 +2796,12 @@ rank_for_schedule (const void *x, const void *y)
  1) Data dependent on last schedule insn.
  2) Anti/Output dependent on last scheduled insn.
  3) Independent of last scheduled insn, or has latency of one.
+ 4) bypass of last scheduled insn, and has latency of zero.
  Choose the insn from the highest numbered class if different.  */
   dep1 = sd_find_dep_between (last, tmp, true);
 
-  if (dep1 == NULL || dep_cost (dep1) == 1)
+  if (dep1 == NULL || dep_cost (dep1) == 1
+ || (dep_cost_bypass (dep1) == 0))
tmp_class = 3;
   else if (/* Data dependence.  */
   DEP_TYPE (dep1) == REG_DEP_TRUE)
@@ -2771,7 +2811,8 @@ rank_for_schedule (const void *x, const void *y)
 
   dep2 = sd_find_dep_between (last, tmp2, true);
 
-  if (dep2 == NULL || dep_cost (dep2)  == 1)
+  if (dep2 == NULL || dep_cost (dep2)  == 1
+ || (dep_cost_bypass (dep2) == 0))
tmp2_class = 3;
   else if (/* Data dependence.  */
   DEP_TYPE (dep2) == REG_DEP_TRUE)
@@ -2795,8 +2836,8 @@ rank_for_schedule (const void *x, const void *y)
  This gives the scheduler more freedom when scheduling later
  instructions at the expense of added register pressure.  */
 
-  val = (dep_list_size (tmp2, SD_LIST_FORW)
-- dep_list_size (tmp, SD_LIST_FORW));
+  val = (dep_list_costs (tmp2, SD_LIST_FORW)
+- dep_list_costs (tmp, SD_LIST_FORW));
 
   if (flag_sched_dep_count_heuristic && val != 0)
 return rfs_result (RFS_DEP_COUNT, val, tmp, tmp2);
-- 
2.24.3 (Apple Git-128)



Re: [PATCH v2] Replace dep_list_size with dep_list_costs for better scheduling

2020-11-06 Thread Jojo R


Jojo
On Nov 6, 2020, 11:18 AM +0800, Jeff Law wrote:

On 11/5/20 7:50 PM, Jim Wilson wrote:
On Thu, Nov 5, 2020 at 6:03 PM Jojo R  wrote:
> >         gcc/
> >         * haifa-sched.c (dep_list_costs): New.
> >         (rank_for_schedule): Use dep_list_costs.
>
> When you post a patch, you should explain what the patch is doing and why 
> this is better than the code that was there before.  It is helpful if you can 
> show results that demonstrate that it is better, e.g. give a small example 
> and show some scheduler or assembly output to show what it does.
>
> You should also consider that when you modify target independent code then 
> you are affecting every target.  This change may work well for your target, 
> but does it also work for x86, arm, ppc, etc?  This probably requires some 
> testing to see if it works for other targets.  If not, then maybe it needs to 
> be conditional on a target hook.
>
> The patch does seem to make some sense though.  When choosing the instruction 
> that has the most dependent instructions to schedule next, you want to ignore 
> the ones that have a 0 cost dependency due to a bypass.
Agreed. It looks pretty reasonable, but a bit more background would be helpful.

Ok & Thanks,

It’s fixed in patch v3.

jeff



Re: [PATCH v2] Add bypass_p cost check in flag_sched_last_insn_heuristic

2020-11-06 Thread Jojo R


Jojo
On Nov 6, 2020, 11:18 AM +0800, Jeff Law wrote:

On 11/5/20 7:52 PM, Jim Wilson wrote:
On Thu, Nov 5, 2020 at 6:10 PM Jojo R  wrote:
> >         gcc/
> >         * haifa-sched.c (rank_for_schedule): Add bypass_p
> >         cost check in flag_sched_last_insn_heuristic.
> >
> > +         || (INSN_CODE (DEP_PRO (dep1)) >= 0 && bypass_p (DEP_PRO (dep1))
> > +             && recog_memoized (DEP_CON (dep1)) >= 0
> > +             && !dep_cost (dep1)))
>
> This is using the same idiom as the previous patch.  Do the two patches 
> depend on each other?  It isn't clear.  Since this idiom is used 3 times 
> across the 2 patches, maybe it should be a macro or an inline function.
FWIW, I'd just let the inliner make the decision.

>
> As with the other patch, some explanation would be nice, and some testing on 
> multiple targets too.
Agreed.
Ok & Thanks,

It’s fixed in patch v3.
jeff



Re: [Patch] x86: Enable GCC support for Intel AVX-VNNI extension

2020-11-06 Thread Uros Bizjak via Gcc-patches
> This patch is about to support Intel AVX-VNNI instructions.
>
> AVX-VNNI is equivalent to AVX512-VNNI with VEX encoding. The instructions
> are the same, but take an extra {vex} prefix to distinguish them from
> AVX512-VNNI instructions in the assembler.
>
> For more details, please refer to
> https://software.intel.com/content/dam/develop/external/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
>
> Bootstrap ok, regression test on i386/x86 backend is ok.
>
> OK for master?
>
> 2020-10-13  Hongtao Liu  
> Hongyu Wang  
>
> gcc/
> * common/config/i386/cpuinfo.h (get_available_features):
> Detect AVXVNNI.
> * common/config/i386/i386-common.c
> (OPTION_MASK_ISA2_AVXVNNI_SET,
> OPTION_MASK_ISA2_AVXVNNI_UNSET, OPTION_MASK_ISA2_AVX2_UNSET):
> New.
> (ix86_handle_option): Handle -mavxvnni, unset avxvnni when
> avx2 is disabled.
> * common/config/i386/i386-cpuinfo.h (enum processor_features):
> Add FEATURE_AVXVNNI.
> * common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY
> for avxvnni.
> * config.gcc: Add avxvnniintrin.h.
> * config/i386/avx512vnniintrin.h: Remove 128/256 bit non-mask
> intrinsics.
> * config/i386/avxvnniintrin.h: New header file.
> * config/i386/cpuid.h (bit_AVXVNNI): New.
> * config/i386/i386-builtins.c (def_builtin): Handle AVXVNNI mask
> for unified builtin.
> * config/i386/i386-builtin.def (BDESC): Adjust AVX512VNNI
> builtins for AVXVNNI.
> * config/i386/i386-c.c (ix86_target_macros_internal): Define
> __AVXVNNI__.
> * config/i386/i386-expand.c (ix86_expand_builtin): Handle bisa
> for AVXVNNI to support unified intrinsic name, since there is no
> dependency between AVX512VNNI and AVXVNNI.
> * config/i386/i386-options.c (isa2_opts): Add -mavxvnni.
> (ix86_valid_target_attribute_inner_p): Handle avxvnni.
> (ix86_valid_target_attribute_inner_p): Ditto.
> * config/i386/i386.h (TARGET_AVXVNNI, TARGET_AVXVNNI_P,
> TARGET_AVXVNNI_P, PTA_AVXVNNI): New.
> (PTA_SAPPHIRERAPIDS): Add AVX_VNNI.
> (PTA_ALDERLAKE): Likewise.
> * config/i386/i386.md ("isa"): Add avxvnni, avx512vnnivl.
> ("enabled"): Adjust for avxvnni and avx512vnnivl.
> * config/i386/i386.opt: Add option -mavxvnni.
> * config/i386/immintrin.h: Include avxvnniintrin.h.
> * config/i386/sse.md (vpdpbusd_): Adjust for AVXVNNI.
> (vpdpbusds_): Likewise.
> (vpdpwssd_): Likewise.
> (vpdpwssds_): Likewise.
> (vpdpbusd_v16si): New.
> (vpdpbusds_v16si): Likewise.
> (vpdpwssd_v16si): Likewise.
> (vpdpwssds_v16si): Likewise.
> * doc/invoke.texi: Document -mavxvnni.
> * doc/extend.texi: Document avxvnni.
> * doc/sourcebuild.texi: Document target avxvnni.
>
> gcc/testsuite/
>
> * gcc.target/i386/avx512vl-vnni-1.c: Rename..
> * gcc.target/i386/avx512vl-vnni-1a.c: To This.
> * gcc.target/i386/avx512vl-vnni-1b.c: New test.
> * gcc.target/i386/avx512vl-vnni-2.c: Ditto.
> * gcc.target/i386/avx512vl-vnni-3.c: Ditto.
> * gcc.target/i386/avx-vnni-1.c: Ditto.
> * gcc.target/i386/avx-vnni-2.c: Ditto.
> * gcc.target/i386/avx-vnni-3.c: Ditto.
> * gcc.target/i386/avx-vnni-4.c: Ditto.
> * gcc.target/i386/avx-vnni-5.c: Ditto.
> * gcc.target/i386/avx-vnni-6.c: Ditto.
> * gcc.target/i386/avx-vpdpbusd-2.c: Ditto.
> * gcc.target/i386/avx-vpdpbusds-2.c: Ditto.
> * gcc.target/i386/avx-vpdpwssd-2.c: Ditto.
> * gcc.target/i386/avx-vpdpwssds-2.c: Ditto.
> * gcc.target/i386/vnni_inline_error.c: Ditto.
> * gcc.target/i386/avx512vnnivl-builtin.c: Ditto.
> * gcc.target/i386/avxvnni-builtin.c: Ditto.
> * gcc.target/i386/funcspec-56.inc: Add new target attribute.
> * gcc.target/i386/pr83488-3.c: Adjust.
> * gcc.target/i386/sse-12.c: Add -mavxvnni.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-14.c: Ditto.
> * gcc.target/i386/sse-22.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
> * g++.dg/other/i386-2.C: Ditto.
> * g++.dg/other/i386-3.C: Ditto.
> * lib/target-supports.exp (check_effective_target_avxvnni):
> New proc.
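
For readers who do not have the instruction semantics at hand, here is a small scalar C++ sketch of what one 32-bit lane of the non-saturating vpdpbusd operation computes, as I understand the Intel description: four unsigned bytes are multiplied by four signed bytes and the products are added to a 32-bit accumulator. The function name dpbusd_lane is hypothetical and this is not one of the builtins or intrinsics from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one 32-bit lane: acc + sum of a[i] * b[i] for i in 0..3,
// where a[] holds unsigned bytes and b[] holds signed bytes.
std::int32_t
dpbusd_lane (std::int32_t acc, const std::uint8_t a[4], const std::int8_t b[4])
{
  assert (a != nullptr && b != nullptr);
  // Accumulate in unsigned arithmetic so that overflow wraps around
  // instead of being undefined behaviour.
  std::uint32_t sum = static_cast<std::uint32_t> (acc);
  for (int i = 0; i < 4; i++)
    {
      std::int32_t prod = static_cast<std::int32_t> (a[i])
                          * static_cast<std::int32_t> (b[i]);
      sum += static_cast<std::uint32_t> (prod);
    }
  // Conversion back to signed is modular on GCC (and by rule since C++20).
  return static_cast<std::int32_t> (sum);
}
```

The saturating variant (vpdpbusds) would clamp the final sum rather than wrap; that is not modelled here.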

+  /* Support unified builtin.  */
+  || (mask2 == OPTION_MASK_ISA2_AVXVNNI)

I don't think we gain anything with unified builtins. Better, just
introduce separate builtins, e.g for

-BDESC (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL, 0,
CODE_FOR_vpdpbusd_v8si, "__builtin_ia32_vpdpbusd_v8si",
IX86_BUILTIN_VPDPBUSDV8SI, UNKNOWN, (int) V8SI_FTYPE_V8SI_V8SI_V8SI)
+BDESC (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL,
OPTION_MASK_ISA2_AVXVNNI, CODE_FOR_vpdpbusd_v8si,
"__builtin_ia32_vpdpbusd_v8si", IX86_BUILTIN_VPDPBUSDV8SI, UNKNOWN,
(int) V8SI_FTYPE_V8SI_V8SI_V8SI)

add __builtin_ia32_vpdbusd_avx_v8si with the same CODE_FOR.

This will remove the need for:

+  if bisa & (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL))
+== (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_

Re: Use EAF_RETURN_ARG in tree-ssa-ccp.c

2020-11-06 Thread Richard Biener
On Thu, 5 Nov 2020, Jan Hubicka wrote:

> > 
> > On 10/27/20 3:01 AM, Richard Biener wrote:
> > > On Tue, 27 Oct 2020, Jan Hubicka wrote:
> > >
> > >>> On Mon, 26 Oct 2020, Jan Hubicka wrote:
> > >>>
> > >>>> Hi,
> > >>>> while looking for special cases of builtins I noticed that tree-ssa-ccp
> > >>>> can use EAF_RETURNS_ARG.  I wonder if the same should be done by value
> > >>>> numbering and other propagators
> > >>> The issue is that changing
> > >>>
> > >>>   q = memcpy (p, r);
> > >>>   .. use q ...
> > >>>
> > >>> to
> > >>>
> > >>>   memcpy (p, r);
> > >>>   .. use p ..
> > >>>
> > >>> is bad for RA so we generally do not want to copy-propagate
> > >>> EAF_RETURNS_ARG.  We eventually do want to optimize a following
> > >>>
> > >>>
> > >>>   if (q == p)
> > >>>
> > >>> of course.  And we eventually want to do the _reverse_ transform,
> > >>> replacing
> > >>>
> > >>>   memcpy (p, r)
> > >>>   .. use p ..
> > >>>
> > >>> with
> > >>>
> > >>>   tem = memcpy (p, r)
> > >>>   .. use tem ..
> > >>>
> > >>> ISTR playing with patches doing all of the above, would need to dig
> > >>> them out again.  There's also a PR about this I think.
> > >>>
> > >>> Bernd added some code to RTL call expansion, not sure exactly
> > >>> what it does...
> > >> It adds a copy instruction to call fusage, so the RTL backend now knows
> > >> about the equivalence.
> > >> void *
> > >> test(void *a, void *b, int l)
> > >> {
> > >>   __builtin_memcpy (a,b,l);
> > >>   return a;
> > >> }
> > >> eliminates the extra copy. So I would say that we should not be afraid
> > >> to propagate in gimple world. It is a minor thing I guess though.
> > >> (my interest is mostly to get rid of unnecessary special casing of
> > >> builtins, as these special cases are clearly not well maintained
> > >> because almost no one knows about them:)
> > > The complication is when this appears in a loop like
> > >
> > >  for (; n; --n)
> > >{
> > >  p = memcpy (p, s, k);
> > >  p += j;
> > >}
> > >
> > > then I assume IVOPTs can do a better job knowing the equivalence
> > > (guess we'd still need to teach SCEV about this then ...) and
> > > when it's not present explicitly in the SSA chain any SSA based
> > > analysis has difficulties seeing it.
> > >
> > > ISTR I saw regressions when doing a patch propagating those
> > > equivalences.
> > 
> > Similarly.  I don't remember the details, but definitely remember being
> > surprised that the propagation caused regressions and then chasing it
> > down to a bad interaction with the register allocator.
> 
> I wonder if it was before or after the code in calls.c adding
> CALL_FUSAGE was added.  It is probably not that important, but given
> that we have all infrastructure in place it seems a pity to not use it.

The CALL_FUSAGE was a band-aid I think - RA still doesn't know how
to re-materialize regs from the return value in general.

Btw, below is some WIP patch teaching FRE about this but instead
of propagating out the return value (which would be simpler in the
patch) it tries to do the reverse and eliminate to the most
downstream call.  It has some issues with valueization at -O1
though so not all simplification opportunities are realized.

As said, propagating out the return value is easy but that loses
the LHS of the calls in the end which is probably why the
CALL_FUSAGE thing doesn't help.  If there's no return value
there's nothing to use.

Richard.
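
To make the two forms discussed in this thread concrete, here is a small standalone C++ sketch. The function names copy_use_arg and copy_use_ret are made up for illustration; the only library fact relied on is that memcpy returns its destination argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Form 1: ignore the call's return value and keep using the argument.
void *
copy_use_arg (void *dst, const void *src, std::size_t n)
{
  assert (dst != nullptr && src != nullptr);
  std::memcpy (dst, src, n);
  return dst;
}

// Form 2: use the value returned by the call instead of the argument.
void *
copy_use_ret (void *dst, const void *src, std::size_t n)
{
  assert (dst != nullptr && src != nullptr);
  void *ret = std::memcpy (dst, src, n);
  return ret;
}
```

Both functions return the same pointer; which form is kept in the IL only changes whether the pointer has to stay live across the call or can be taken from the call's result.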

From 7e96e97c33a533b74dd9d16634bb7a8acce6ca19 Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Fri, 6 Nov 2020 10:47:00 +0100
Subject: [PATCH] teach FRE about ERF_RETURNS_ARG
To: gcc-patches@gcc.gnu.org

This teaches value-numbering about calls returning one of their
arguments.  Elimination makes sure to use the return value
downstream of calls.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-89.c | 16 ++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-90.c | 15 +
 gcc/tree-ssa-sccvn.c   | 37 +-
 3 files changed, 60 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-89.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-90.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-89.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-89.c
new file mode 100644
index 000..6d62510ed9d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-89.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-fre1" } */
+
+void *x, *y;
+int foo (void *q, void *p)
+{
+  void *r = __builtin_memcpy (q, p, 17);
+  x = r;
+  y = q;
+  return r != q;
+}
+
+/* { dg-final { scan-tree-dump "x = r" "fre1" } } */
+/* { dg-final { scan-tree-dump "y = r" "fre1" } } */
+/* We fail simplifying r != q for the lattice.  */
+/* { dg-final { scan-tree-dump "return 0;" "fre1" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-90.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-90.c
new file mode 100644
index 000..1913de9c8ba
--- /dev/null
+++ b/gcc/testsuite/

Re: [PATCH, 1/3, OpenMP] Target mapping changes for OpenMP 5.0, front-end parts

2020-11-06 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 04, 2020 at 02:02:25AM +0800, Chung-Lin Tang wrote:
>   gcc/c-family/
>   * c-common.h (c_omp_adjust_map_clauses): New declaration.
>   * c-omp.c (c_omp_adjust_map_clauses): New function.
> 
>   gcc/c/
>   * c-parser.c (c_parser_omp_target_data): Add use of
>   new c_omp_adjust_map_clauses function. Add GOMP_MAP_ATTACH_DETACH as
>   handled map clause kind.
>   (c_parser_omp_target_enter_data): Likewise.
>   (c_parser_omp_target_exit_data): Likewise.
>   (c_parser_omp_target): Likewise.
>   * c-typeck.c (handle_omp_array_sections): Adjust COMPONENT_REF case to
>   use GOMP_MAP_ATTACH_DETACH map kind for C_ORT_OMP region type.
>   (c_finish_omp_clauses): Adjust bitmap checks to allow struct decl and
>   same struct field access to co-exist on OpenMP construct.
> 
>   gcc/cp/
>   * parser.c (cp_parser_omp_target_data): Add use of
>   new c_omp_adjust_map_clauses function. Add GOMP_MAP_ATTACH_DETACH as
>   handled map clause kind.
>   (cp_parser_omp_target_enter_data): Likewise.
>   (cp_parser_omp_target_exit_data): Likewise.
>   (cp_parser_omp_target): Likewise.
>   * semantics.c (handle_omp_array_sections): Adjust COMPONENT_REF case to
>   use GOMP_MAP_ATTACH_DETACH map kind for C_ORT_OMP region type. Fix
>   interaction between reference case and attach/detach.
>   (finish_omp_clauses): Adjust bitmap checks to allow struct decl and
>   same struct field access to co-exist on OpenMP construct.

Ok, thanks.

Jakub



Re: [PATCH, 2/3, OpenMP] Target mapping changes for OpenMP 5.0, middle-end parts and compiler testcases

2020-11-06 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 04, 2020 at 02:02:56AM +0800, Chung-Lin Tang wrote:
>   gcc/
>   * gimplify.c (is_or_contains_p): New static helper function.
>   (omp_target_reorder_clauses): New function.
>   (gimplify_scan_omp_clauses): Add use of omp_target_reorder_clauses to
>   reorder clause list according to OpenMP 5.0 rules. Add handling of
>   GOMP_MAP_ATTACH_DETACH for OpenMP cases.
>   * omp-low.c (is_omp_target): New static helper function.
>   (scan_sharing_clauses): Add scan phase handling of
>   GOMP_MAP_ATTACH/DETACH for OpenMP cases.
>   (lower_omp_target): Add lowering handling of GOMP_MAP_ATTACH/DETACH for
>   OpenMP cases.
> 
>   gcc/testsuite/
>   * c-c++-common/gomp/clauses-2.c: Remove dg-error cases now valid.
>   * gfortran.dg/gomp/map-2.f90: Likewise.
>   * c-c++-common/gomp/map-5.c: New testcase.

Ok, thanks.

Jakub



Re: [PATCH] Clean up loop-closed PHIs at loopdone pass

2020-11-06 Thread Richard Biener
On Fri, 6 Nov 2020, Jiufu Guo wrote:

> On 2020-11-05 21:43, Richard Biener wrote:
> 
> Hi Richard,
> 
> Thanks for your comments and suggestions!
> 
> > On Thu, Nov 5, 2020 at 2:19 PM guojiufu via Gcc-patches
> >  wrote:
> >> 
> >> In PR87473, there are discussions about loop-closed PHIs which
> >> are generated for loop optimization passes.  It would be helpful
> >> to clean them up after loop optimization is done, then this may
> >> simplify some jobs of following passes.
> >> This patch introduces a cheaper way to propagate them out in
> >> pass_tree_loop_done.
> >> 
> >> This patch passes bootstrap and regtest on ppc64le.  Is this ok for trunk?
> > 
> > Huh, I think this is somewhat useless work, the PHIs won't survive for long
> > and you certainly cannot expect degenerate PHIs to not occur anyway.
> 
> After `loopdone` pass, those loop-closed-PHIs will still live ~10 passes
> (veclower, switchlower, slsr...) till the next `copyprop` pass.
> It would be helpful to those passes if we can eliminate those degenerated PHIs
> in a cheaper way.  As you mentioned in
> https://gcc.gnu.org/legacy-ml/gcc-patches/2018-10/msg00834.html
> 
> We know vrp/dom may generate some degenerate PHIs, which is why `copyprop`
> was added after each vrp/dom pair to propagate out those PHIs.  Likewise, I
> think for loop-closed PHIs, we may also eliminate them once they are not
> needed.
> 
> 
> > You probably can replace propagate_rhs_into_lhs by the
> > existing replace_uses_by function.  You're walking loop exits
> 
> Yes, replace_uses_by + remove_phi_node would be a good implementation of
> propagate_rhs_into_lhs.
> 
> 
> Thanks!
> 
> > after loop_optimizer_finalize () - that's wasting work.  If you want to
> > avoid inconsistent state and we really want to go with this I suggest
> > to instead add a flag to loop_optimizer_finalize () as to whether to
> > propagate out LC PHI nodes or not and do this from within there.
> 
> Thank you for the suggestion!
> You mean adding a flag in loop_optimizer_finalize, and adding code like:
> ```
> if (flag_propagate_loop_closed_phi_when_loop_done)
> {
>   loops_state_clear (fn, LOOP_CLOSED_SSA);
>   clean_up_loop_closed_phis(fn);
> }
> ```
> 
> Does this align with your suggestion?

Yeah.

> One concern: function loop_optimizer_finalize is called in a lot of places,
> while we just need to clean up loop-closed PHIs at GIMPLE loopdone pass.

There are quite a few other passes rewriting into LC SSA outside of
the loop pipeline.  [E]VRP for example but also invariant motion.

To avoid touching too many places you can default the new argument
to false for example.

Richard.
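
The "default the new argument to false" suggestion can be sketched in plain C++ as follows. This is only an illustration of the calling convention: toy_function and toy_loop_optimizer_finalize are hypothetical stand-ins, not the real loop_optimizer_finalize interface.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-function loop state.
struct toy_function
{
  bool loops_initialized;
  bool loop_closed_ssa;
  int lc_phi_count;
};

// Existing callers keep calling with one argument and get the old behaviour;
// only the loopdone-style caller passes true to also drop loop-closed PHIs.
void
toy_loop_optimizer_finalize (toy_function *fn,
                             bool clean_loop_closed_phi = false)
{
  assert (fn != nullptr);
  if (clean_loop_closed_phi)
    {
      fn->loop_closed_ssa = false;
      fn->lc_phi_count = 0;
    }
  fn->loops_initialized = false;
}
```

A caller that wants the cleanup writes toy_loop_optimizer_finalize (fn, true); every other call site stays untouched.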

> Thanks again,
> 
> Jiufu Guo.
> 
> > 
> > Thanks,
> > Richard.
> > 
> >> gcc/ChangeLog
> >> 2020-10-05  Jiufu Guo   
> >> 
> >> * tree-ssa-loop.h (clean_up_loop_closed_phi): New declaration.
> >> * tree-ssa-loop.c (tree_ssa_loop_done): Call
> >> clean_up_loop_closed_phi.
> >> * tree-ssa-propagate.c (propagate_rhs_into_lhs): New function.
> >> 
> >> gcc/testsuite/ChangeLog
> >> 2020-10-05  Jiufu Guo   
> >> 
> >> * gcc.dg/tree-ssa/loopclosedphi.c: New test.
> >> ---
> >>  gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c |  21 +++
> >>  gcc/tree-ssa-loop.c   |   1 +
> >>  gcc/tree-ssa-loop.h   |   1 +
> >>  gcc/tree-ssa-propagate.c  | 120 
> >> ++
> >>  4 files changed, 143 insertions(+)
> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c
> >> 
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c
> >> b/gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c
> >> new file mode 100644
> >> index 000..d71b757fbca
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c
> >> @@ -0,0 +1,21 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O3 -fno-tree-ch -w -fdump-tree-loopdone-details" } */
> >> +
> >> +void
> >> +t6 (int qz, int wh)
> >> +{
> >> +  int jl = wh;
> >> +
> >> +  while (1.0 * qz / wh < 1)
> >> +{
> >> +  qz = wh * (wh + 2);
> >> +
> >> +  while (wh < 1)
> >> +jl = 0;
> >> +}
> >> +
> >> +  while (qz < 1)
> >> +qz = jl * wh;
> >> +}
> >> +
> >> +/* { dg-final { scan-tree-dump-times "Replacing" 2 "loopdone"} } */
> >> diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
> >> index 5e8365d4e83..7d680b2f5d2 100644
> >> --- a/gcc/tree-ssa-loop.c
> >> +++ b/gcc/tree-ssa-loop.c
> >> @@ -530,6 +530,7 @@ tree_ssa_loop_done (void)
> >>free_numbers_of_iterations_estimates (cfun);
> >>scev_finalize ();
> >>loop_optimizer_finalize ();
> >> +  clean_up_loop_closed_phi (cfun);
> >>return 0;
> >>  }
> >> 
> >> diff --git a/gcc/tree-ssa-loop.h b/gcc/tree-ssa-loop.h
> >> index 9e35125e6e8..baa940b9d1e 100644
> >> --- a/gcc/tree-ssa-loop.h
> >> +++ b/gcc/tree-ssa-loop.h
> >> @@ -67,6 +67,7 @@ public:
> >> extern bool for_each_index (tree *, bool (*) (tree, tree *, void *), void
> >> *);
> >> extern char *get_lsm_tmp_na

Re: [PATCH, 3/3, OpenMP] Target mapping changes for OpenMP 5.0, libgomp parts [resend]

2020-11-06 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 04, 2020 at 02:03:27AM +0800, Chung-Lin Tang wrote:
> >      libgomp/
> >      * libgomp.h (enum gomp_map_vars_kind): Adjust enum values to be
> >      bit-flag usable.
> >      * oacc-mem.c (acc_map_data): Adjust gomp_map_vars argument flags to
> >      'GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_ENTER_DATA'.
> >      (goacc_enter_datum): Likewise for call to gomp_map_vars_async.
> >      (goacc_enter_data_internal): Likewise.
> > 
> >      * target.c (gomp_map_vars_internal): Change checks of
> >      GOMP_MAP_VARS_ENTER_DATA to use bit-and (&). Adjust use of
> >      gomp_attach_pointer for OpenMP cases.
> >      (gomp_exit_data): Add handling of GOMP_MAP_DETACH.
> >      (GOMP_target_enter_exit_data): Add handling of GOMP_MAP_ATTACH.
> >      * testsuite/libgomp.c-c++-common/ptr-attach-1.c: New testcase.

Ok, with two nits fixed.

> @@ -572,7 +573,8 @@ goacc_enter_datum (void **hostaddrs, size_t *sizes, void 
> *kinds, int async)
>  
>struct target_mem_desc *tgt
>   = gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs, NULL, sizes,
> -kinds, true, GOMP_MAP_VARS_ENTER_DATA);
> +kinds, true,
> +GOMP_MAP_VARS_OPENACC | 
> GOMP_MAP_VARS_ENTER_DATA);

This line is too long.

>assert (tgt);
>assert (tgt->list_count == 1);
>n = tgt->list[0].key;
> @@ -1202,7 +1204,7 @@ goacc_enter_data_internal (struct gomp_device_descr 
> *acc_dev, size_t mapnum,
> struct target_mem_desc *tgt
>   = gomp_map_vars_async (acc_dev, aq, groupnum, &hostaddrs[i], NULL,
>  &sizes[i], &kinds[i], true,
> -GOMP_MAP_VARS_ENTER_DATA);
> +GOMP_MAP_VARS_OPENACC | 
> GOMP_MAP_VARS_ENTER_DATA);

And this one too.
Please wrap them.

Jakub
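
As background for the "bit-flag usable" enum change and the switch to bit-and checks in this ChangeLog, a minimal C++ sketch of the pattern follows. The enumerator names and values are made up for illustration (toy_ prefix) and are not claimed to match libgomp.h.

```cpp
#include <cassert>

// Hypothetical flag values: each kind occupies its own bit so that
// kinds can be OR'ed together.
enum toy_map_vars_kind : unsigned
{
  TOY_MAP_VARS_OPENACC = 1,
  TOY_MAP_VARS_TARGET = 2,
  TOY_MAP_VARS_DATA = 4,
  TOY_MAP_VARS_ENTER_DATA = 8
};

// Test with bit-and; an equality test would stop matching as soon as
// another flag such as TOY_MAP_VARS_OPENACC is OR'ed in.
bool
is_enter_data (unsigned kind)
{
  assert (kind < 16u);  // only the four flags above are defined
  return (kind & TOY_MAP_VARS_ENTER_DATA) != 0;
}
```

So a combined value like TOY_MAP_VARS_OPENACC | TOY_MAP_VARS_ENTER_DATA is still recognised as enter-data, which is the property the OpenACC call sites in the patch appear to rely on.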



Re: [PATCH v9] genemit.c (main): split insn-emit.c for compiling parallelly

2020-11-06 Thread Richard Sandiford via Gcc-patches
Sorry, going over it again before committing, I spotted a couple
of things…

Jojo R  writes:
> gcc/ChangeLog:
>
>   * genemit.c (main): Print 'split line'.
>   * Makefile.in (insn-emit.c): Define split count and file
>
> ---
>  gcc/Makefile.in |  35 +++-
>  gcc/genemit.c   | 104 +---
>  2 files changed, 90 insertions(+), 49 deletions(-)
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 978a08f7b04..de846c0fcd4 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1154,6 +1154,15 @@ export STRIP_FOR_TARGET
>  export RANLIB_FOR_TARGET
>  export libsubdir
>  
> +number_series0:=1 2 3 4 5 6 7 8 9
> +number_series1:=0 $(number_series0)
> +number_series2:=$(foreach i,$(number_series0),$(addprefix 
> $(i),$(number_series1)))
> +number_series3:=$(addprefix 0,$(number_series1)) $(number_series2)
> +number_series4:=$(foreach i,$(number_series0),$(addprefix 
> $(i),$(number_series3)))
> +number_series5:=$(addprefix 0,$(number_series3)) $(number_series4)
> +number_series6:=$(foreach i,$(number_series0),$(addprefix 
> $(i),$(number_series5)))
> +number_series:=$(number_series0) $(number_series2) $(number_series4) 
> $(number_series6)
> +
>  FLAGS_TO_PASS = \
>   "ADA_CFLAGS=$(ADA_CFLAGS)" \
>   "BISON=$(BISON)" \
> @@ -1259,6 +1268,18 @@ ANALYZER_OBJS = \
>  # We put the *-match.o and insn-*.o files first so that a parallel make
>  # will build them sooner, because they are large and otherwise tend to be
>  # the last objects to finish building.
> +
> +# target overrides
> +-include $(tmake_file)
> +
> +INSN-GENERATED-SPLIT-NUM ?= 0
> +
> +insn-generated-split-num = $(wordlist 1,$(shell expr 
> $(INSN-GENERATED-SPLIT-NUM) + 1),$(number_series))

I think it would be better to make SPLIT-NUM 1-based rather than
0-based and not have the shell expr.

> +
> +insn-emit-split-c := $(foreach o, $(insn-generated-split-num), 
> insn-emit$(o).c)
> +insn-emit-split-obj = $(patsubst %.c,%.o, $(insn-emit-split-c))
> +$(insn-emit-split-c): insn-emit.c
> +
>  OBJS = \
>   gimple-match.o \
>   generic-match.o \
> @@ -1266,6 +1287,7 @@ OBJS = \
>   insn-automata.o \
>   insn-dfatab.o \
>   insn-emit.o \
> + $(insn-emit-split-obj) \
>   insn-extract.o \
>   insn-latencytab.o \
>   insn-modes.o \
> @@ -2376,6 +2398,9 @@ $(simple_generated_c:insn-%.c=s-%): s-%: 
> build/gen%$(build_exeext)
>   $(RUN_GEN) build/gen$*$(build_exeext) $(md_file) \
> $(filter insn-conditions.md,$^) > tmp-$*.c
>   $(SHELL) $(srcdir)/../move-if-change tmp-$*.c insn-$*.c
> + $*v=$$(echo $$(csplit insn-$*.c /parallel\ compilation/ -k -s 
> {$(INSN-GENERATED-SPLIT-NUM)} -f insn-$* -b "%d.c" 2>&1));\
> + [ ! "$$$*v" ] || grep "match not found" <<< $$$*v
> + [ -s insn-$*0.c ] || (for i in $(insn-generated-split-num); do touch 
> insn-$*$$i.c; done && echo "" > insn-$*.c)

This still uses bashisms.  Also, I don't think we can require csplit.

genemit itself is quite fast, so it would probably be simpler to run
it multiple times, passing a new command-line parameter to say which
output file we want.

In other words, rather than generate a monolithic insn-emit.c and
then split it later, we should just generate the separate insn-emit.cs
directly.

Thanks,
Richard
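
One way to picture the "generate the separate files directly" approach is that each generator run is told which part it is producing and emits only the patterns in that part. The helper below is a hypothetical C++ sketch of that partitioning with a 1-based part number; it is not genemit's actual interface or option handling.

```cpp
#include <cassert>
#include <utility>

// Return the half-open range [first, second) of pattern indices that
// belong to PART (1-based) when TOTAL patterns are spread over PARTS files.
// The first TOTAL % PARTS parts receive one extra pattern each.
std::pair<int, int>
patterns_for_part (int total, int parts, int part)
{
  assert (total >= 0 && parts >= 1 && part >= 1 && part <= parts);
  int base = total / parts;
  int extra = total % parts;
  int before = part - 1;  // number of parts preceding this one
  int begin = before * base + (before < extra ? before : extra);
  int size = base + (part <= extra ? 1 : 0);
  return std::make_pair (begin, begin + size);
}
```

Because every run computes its own range from the same three numbers, no post-processing step such as csplit is needed, and a part may legitimately be empty when there are fewer patterns than files.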


RE: [PATCH] SLP: Move load/store-lanes check till late

2020-11-06 Thread Tamar Christina via Gcc-patches


> -Original Message-
> From: Christophe Lyon 
> Sent: Friday, November 6, 2020 8:42 AM
> To: Tamar Christina 
> Cc: Richard Biener ; nd ; gcc-
> patc...@gcc.gnu.org; o...@ucw.cz
> Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> 
> On Thu, 5 Nov 2020 at 11:21, Tamar Christina via Gcc-patches  patc...@gcc.gnu.org> wrote:
> >
> > > -Original Message-
> > > From: rguent...@c653.arch.suse.de  On
> > > Behalf Of Richard Biener
> > > Sent: Thursday, November 5, 2020 10:17 AM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > >
> > > On Wed, 4 Nov 2020, Tamar Christina wrote:
> > >
> > > > Hi Richi,
> > > >
> > > > > -Original Message-
> > > > > From: rguent...@c653.arch.suse.de 
> > > > > On Behalf Of Richard Biener
> > > > > Sent: Wednesday, November 4, 2020 8:07 AM
> > > > > To: Tamar Christina 
> > > > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > > > >
> > > > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > > > >
> > > > > > Hi Richi,
> > > > > >
> > > > > > We decided to take the regression in any code-gen this could
> > > > > > give and fix it properly next stage-1.  As such here's a new
> > > > > > patch based on your previous feedback.
> > > > > >
> > > > > > Ok for master?
> > > > >
> > > > > > Looks good so far but be aware that you elide the
> > > > >
> > > > > - && vect_store_lanes_supported
> > > > > -  (STMT_VINFO_VECTYPE (scalar_stmts[0]), group_size,
> > > > > false))
> > > > >
> > > > > part of the check - that is, you don't verify the store part of
> > > > > the instance can use store-lanes.  Btw, this means the original
> > > > > code cancelled an instance only when the SLP graph entry is a
> > > > > store-lane capable store but your variant would also cancel in
> > > > > > case there's a load-lane capable reduction.
> > > > >
> > > >
> > > > I do still have it,
> > > >
> > > >   if (loads_permuted
> > > >   && vect_store_lanes_supported (vectype, group_size,
> > > > false))
> > > >
> > > > I just grab the type from the SLP_TREE_VECTYPE (slp_root); which
> > > > should be the store if one exists.
> > > >
> > > > > I think that you eventually want to re-instantiate the
> > > > > store-lane check but treat it the same as any of the load checks
> > > > > (thus not require all instances to be stores for the cancellation).
> > > > > But at least when a store cannot use store-lanes we probably
> > > > > shouldn't cancel the SLP.
> > > >
> > > > > I did however elide the kind check that was added as part of the
> > > > > rebase; it looked like kind wasn't being stored inside the SLP
> > > > > instance and I'd have to redo the analysis to find it.
> > > > >
> > > > > Does it seem reasonable to include kind as a field in the SLP instance?
> > > >
> > > > >
> > > > > Anyway, the patch is OK for master.  The store-lane check part
> > > > > can be re- added as followup.
> > > > >
> > > >
> > > > Thanks! Will do.
> > >
> > > Btw, the patch regressed
> > >
> > > FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects
> > > scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > > FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing
> > > stmts using SLP" 1
> > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > scan-tree-dump vect "Built SLP cancelled: can use load/store-lanes"
> > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > scan-tree-dump vect "LOAD_LANES"
> > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > scan-tree-dump vect "STORE_LANES"
> > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP cancelled:
> > > can use load/store-lanes"
> > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
> > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES"
> > >
> > > on x86_64.  The slp-11b.c testcase is interesting since there
> > > extract_muldiv folding makes the group of four stores not matching
> > > so we split into a size of
> > > 3 and one remaining store.
> > > This causes us to arrive at
> > >
> > > note:   node 0x441a940 (max_nunits=4, refcnt=2)
> > > note:   stmt 0 _2 = in[_1];
> > > note:   stmt 1 _6 = in[_5];
> > > note:   stmt 2 _10 = in[_8];
> > > note:   load permutation { 0 2 1 }
> > >
> > > > which on x86_64 we in the end cannot handle (without SSE4 I think)
> > > > so it fails to SLP there.  Guess arm can do the permute but not the
> > > > load-lane here.
> > >
> > > For gcc.dg/vect/slp-perm-6.c the XFAILs shouldn't be done for
> > > !vect_load_lanes targets.  Not sure if that's possible easily, like
> > > with a { target vect_load_lanes } { xfail vect_load_lanes } combo
> > > ...?  I suggest to make it xfail everywhere instead and add a
> > > comment as to we're expecting those only for vect_load_lanes targets.
> >
> > Yes jus

Re: [PATCH] SLP: Move load/store-lanes check till late

2020-11-06 Thread Christophe Lyon via Gcc-patches
On Fri, 6 Nov 2020 at 11:18, Tamar Christina  wrote:
>
>
>
> > -Original Message-
> > From: Christophe Lyon 
> > Sent: Friday, November 6, 2020 8:42 AM
> > To: Tamar Christina 
> > Cc: Richard Biener ; nd ; gcc-
> > patc...@gcc.gnu.org; o...@ucw.cz
> > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> >
> > On Thu, 5 Nov 2020 at 11:21, Tamar Christina via Gcc-patches  > patc...@gcc.gnu.org> wrote:
> > >
> > > > -Original Message-
> > > > From: rguent...@c653.arch.suse.de  On
> > > > Behalf Of Richard Biener
> > > > Sent: Thursday, November 5, 2020 10:17 AM
> > > > To: Tamar Christina 
> > > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > > >
> > > > On Wed, 4 Nov 2020, Tamar Christina wrote:
> > > >
> > > > > Hi Richi,
> > > > >
> > > > > > -Original Message-
> > > > > > From: rguent...@c653.arch.suse.de 
> > > > > > On Behalf Of Richard Biener
> > > > > > Sent: Wednesday, November 4, 2020 8:07 AM
> > > > > > To: Tamar Christina 
> > > > > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > > > > >
> > > > > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > > > > >
> > > > > > > Hi Richi,
> > > > > > >
> > > > > > > We decided to take the regression in any code-gen this could
> > > > > > > give and fix it properly next stage-1.  As such here's a new
> > > > > > > patch based on your previous feedback.
> > > > > > >
> > > > > > > Ok for master?
> > > > > >
> > > > > > Looks good sofar but be aware that you elide the
> > > > > >
> > > > > > - && vect_store_lanes_supported
> > > > > > -  (STMT_VINFO_VECTYPE (scalar_stmts[0]), 
> > > > > > group_size,
> > > > > > false))
> > > > > >
> > > > > > part of the check - that is, you don't verify the store part of
> > > > > > the instance can use store-lanes.  Btw, this means the original
> > > > > > code cancelled an instance only when the SLP graph entry is a
> > > > > > store-lane capable store but your variant would also cancel in
> > > > > > case there's a load-
> > > > lane capable reduction.
> > > > > >
> > > > >
> > > > > I do still have it,
> > > > >
> > > > >   if (loads_permuted
> > > > >   && vect_store_lanes_supported (vectype, group_size,
> > > > > false))
> > > > >
> > > > > I just grab the type from the SLP_TREE_VECTYPE (slp_root); which
> > > > > should be the store if one exists.
> > > > >
> > > > > > I think that you eventually want to re-instantiate the
> > > > > > store-lane check but treat it the same as any of the load checks
> > > > > > (thus not require all instances to be stores for the cancellation).
> > > > > > But at least when a store cannot use store-lanes we probably
> > > > > > shouldn't cancel the SLP.
> > > > >
> > > > > I did however elide the kind check, that was added as part of the
> > > > > rebase, it looked like kind wasn't Being stored inside the SLP
> > > > > instance and
> > > > I'd have to redo the analysis to find it.
> > > > >
> > > > > Does it does reasonable to include kind as a field in the SLP 
> > > > > instance?
> > > > >
> > > > > >
> > > > > > Anyway, the patch is OK for master.  The store-lane check part
> > > > > > can be re- added as followup.
> > > > > >
> > > > >
> > > > > Thanks! Will do.
> > > >
> > > > Btw, the patch regressed
> > > >
> > > > FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects
> > > > scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > > > FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing
> > > > stmts using SLP" 1
> > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > scan-tree-dump vect "Built SLP cancelled: can use load/store-lanes"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > scan-tree-dump vect "LOAD_LANES"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > scan-tree-dump vect "STORE_LANES"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP cancelled:
> > > > can use load/store-lanes"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
> > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES"
> > > >
> > > > on x86_64.  The slp-11b.c testcase is interesting since there
> > > > extract_muldiv folding makes the group of four stores not matching
> > > > so we split into a size of
> > > > 3 and one remaining store.
> > > > This causes us to arrive at
> > > >
> > > > note:   node 0x441a940 (max_nunits=4, refcnt=2)
> > > > note:   stmt 0 _2 = in[_1];
> > > > note:   stmt 1 _6 = in[_5];
> > > > note:   stmt 2 _10 = in[_8];
> > > > note:   load permutation { 0 2 1 }
> > > >
> > > > which on x86_64 we in the end cannot handle (without SSE4 I think)
> > > > so it fails to SLP there.  Guess arm can do the permute but not the 
> > > > load-
> > lane here.
> > > >
> > > > For gcc.dg/vect/slp-perm-6.c the XFAILs sh

Re: [Patch] OpenACC (C/C++): Fix 'acc atomic' parsing

2020-11-06 Thread Tobias Burnus

On 05.11.20 13:13, Jakub Jelinek wrote:


Wouldn't it be much simpler and more readable to do:
else if (!strcmp (p, "capture"))
  new_code = OMP_ATOMIC_CAPTURE_NEW;
+   else if (openacc)
+ {
+   p = NULL;
+   error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
+   "or %<capture%> clause");
+ }


Thanks for looking through the patch – and the suggestion.
It is simpler – and also avoids issues when OpenMP adds more clauses.


Otherwise LGTM, but I have no idea what OpenACC actually says...


I have now installed it as commit 
r11-4774-ga2c11935b010ee55f7ccd14d27f62c6fbed3745e.

Regarding OpenACC, permitted are (since 2.5):
#pragma acc atomic [atomic-clause] new-line
#pragma acc atomic update capture new-line
  Where atomic-clause is one of read, write, update, or capture.

I did note that I misread the spec – 'update capture' is not permitted for
Fortran, only for C/C++. (This is now OpenACC spec Issue #333, which
also asks to better specify what 'update capture' means.)
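
For illustration, the two C/C++ spellings might be used as below. This is
only a sketch: the counter and the function names are made up, and it
assumes that 'update capture' behaves like plain 'capture', which is how the
patch treats it (the spec itself is unclear on this). Without -fopenacc the
pragmas are ignored and this is plain sequential C.

```c
/* Hypothetical shared counter, used only for this illustration.  */
static int counter;

/* Plain 'capture': fetch the old value and increment in one atomic step.  */
int
fetch_then_add_capture (void)
{
  int old;
#pragma acc atomic capture
  old = counter++;
  return old;
}

/* 'update capture' spelling, C/C++ only.  Assumed here to mean the same
   as 'capture': update the counter, then capture the new value.  */
int
add_then_fetch_update_capture (void)
{
  int v;
#pragma acc atomic update capture
  {
    counter += 1;
    v = counter;
  }
  return v;
}
```

The Fortran front end rejects the second spelling.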

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
commit a2c11935b010ee55f7ccd14d27f62c6fbed3745e
Author: Tobias Burnus 
Date:   Fri Nov 6 11:13:47 2020 +0100

OpenACC (C/C++): Fix 'acc atomic' parsing

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_atomic): Add openacc parameter and update
OpenACC matching.
(c_parser_omp_construct): Update call.

gcc/cp/ChangeLog:

* parser.c (cp_parser_omp_atomic): Add openacc parameter and update
OpenACC matching.
(cp_parser_omp_construct): Update call.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc-gomp/atomic.c: New test.
* c-c++-common/goacc/atomic.c: New test.
---
 gcc/c/c-parser.c   | 24 +++---
 gcc/cp/parser.c| 23 +++---
 gcc/testsuite/c-c++-common/goacc-gomp/atomic.c | 43 ++
 gcc/testsuite/c-c++-common/goacc/atomic.c  | 30 ++
 4 files changed, 110 insertions(+), 10 deletions(-)

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index fc97aa3f95f..dedfb8472d0 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -17304,7 +17304,7 @@ c_parser_oacc_wait (location_t loc, c_parser *parser, char *p_name)
   LOC is the location of the #pragma token.  */
 
 static void
-c_parser_omp_atomic (location_t loc, c_parser *parser)
+c_parser_omp_atomic (location_t loc, c_parser *parser, bool openacc)
 {
   tree lhs = NULL_TREE, rhs = NULL_TREE, v = NULL_TREE;
   tree lhs1 = NULL_TREE, rhs1 = NULL_TREE;
@@ -17343,6 +17343,12 @@ c_parser_omp_atomic (location_t loc, c_parser *parser)
 	new_code = OMP_ATOMIC;
 	  else if (!strcmp (p, "capture"))
 	new_code = OMP_ATOMIC_CAPTURE_NEW;
+	  else if (openacc)
+	{
+	  p = NULL;
+	  error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
+			  "or %<capture%> clause");
+	}
 	  else if (!strcmp (p, "seq_cst"))
 	new_memory_order = OMP_MEMORY_ORDER_SEQ_CST;
 	  else if (!strcmp (p, "acq_rel"))
@@ -17370,7 +17376,12 @@ c_parser_omp_atomic (location_t loc, c_parser *parser)
 	{
 	  if (new_code != ERROR_MARK)
 		{
-		  if (code != ERROR_MARK)
+		  /* OpenACC permits 'update capture'.  */
+		  if (openacc
+		  && code == OMP_ATOMIC
+		  && new_code == OMP_ATOMIC_CAPTURE_NEW)
+		code = new_code;
+		  else if (code != ERROR_MARK)
 		error_at (cloc, "too many atomic clauses");
 		  else
 		code = new_code;
@@ -17392,7 +17403,9 @@ c_parser_omp_atomic (location_t loc, c_parser *parser)
 
   if (code == ERROR_MARK)
 code = OMP_ATOMIC;
-  if (memory_order == OMP_MEMORY_ORDER_UNSPECIFIED)
+  if (openacc)
+memory_order = OMP_MEMORY_ORDER_RELAXED;
+  else if (memory_order == OMP_MEMORY_ORDER_UNSPECIFIED)
 {
   omp_requires_mask
 	= (enum omp_requires) (omp_requires_mask
@@ -17448,6 +17461,7 @@ c_parser_omp_atomic (location_t loc, c_parser *parser)
 	  }
 	break;
   case OMP_ATOMIC:
+ /* case OMP_ATOMIC_CAPTURE_NEW: - or update to OpenMP 5.1 */
 	if (memory_order == OMP_MEMORY_ORDER_ACQ_REL
 	|| memory_order == OMP_MEMORY_ORDER_ACQUIRE)
 	  {
@@ -21489,7 +21503,7 @@ c_parser_omp_construct (c_parser *parser, bool *if_p)
   switch (p_kind)
 {
 case PRAGMA_OACC_ATOMIC:
-  c_parser_omp_atomic (loc, parser);
+  c_parser_omp_atomic (loc, parser, true);
   return;
 case PRAGMA_OACC_CACHE:
   strcpy (p_name, "#pragma acc");
@@ -21516,7 +21530,7 @@ c_parser_omp_construct (c_parser *parser, bool *if_p)
   stmt = c_parser_oacc_wait (loc, parser, p_name);
   break;
 case PRAGMA_OMP_ATOMIC:
-  c_parser_omp_atomic (loc, parser);
+  c_parser_omp_atomic (loc, parser, false);
   return;
 case PRAGMA_OMP_CRITICAL:
   stmt = c_parser_omp_crit

Re: [committed 1/2] libstdc++: Export basic_stringbuf constructor [PR 97729]

2020-11-06 Thread Rainer Orth
Hi Jonathan,

> libstdc++-v3/ChangeLog:
>
>   PR libstdc++/97729
>   * config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Add exports.
>   * src/c++20/sstream-inst.cc (basic_stringbuf): Instantiate
>   private constructor taking __xfer_bufptrs.
>
> Tested powerpc64le-linux. Committed to trunk.

unfortunately, this broke Solaris bootstrap again:

ld: fatal: libstdc++-symbols.ver-sun: 7314: symbol 
'_ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEEC1EOS4_RKS3_ONS4_14__xfer_bufptrsE':
 symbol version conflict
ld: fatal: libstdc++-symbols.ver-sun: 7315: symbol 
'_ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEEC2EOS4_RKS3_ONS4_14__xfer_bufptrsE':
 symbol version conflict
ld: fatal: libstdc++-symbols.ver-sun: 7316: symbol 
'_ZNSt7__cxx1115basic_stringbufIwSt11char_traitsIwESaIwEEC1EOS4_RKS3_ONS4_14__xfer_bufptrsE':
 symbol version conflict
ld: fatal: libstdc++-symbols.ver-sun: 7317: symbol 
'_ZNSt7__cxx1115basic_stringbufIwSt11char_traitsIwESaIwEEC2EOS4_RKS3_ONS4_14__xfer_bufptrsE':
 symbol version conflict

Those are matched by both


##_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]EOS4_RKS3_ONS4_14__xfer_bufptrsE
 (glob)

but also by the previous

##_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]*__xfer_bufptrs* 
(glob)

I do have a hacky patch to avoid this, but I guess I'd best leave it to
you to decide how to tighten the previous pattern.
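
The overlap can be reproduced outside the linker with a small check like
the one below. It is only a sketch: it uses POSIX fnmatch() as a stand-in
for the linker's glob matching (an assumption, the two need not agree in
every corner case), and the helper name is made up.

```c
#include <fnmatch.h>

/* The two version-script globs quoted above, without the leading "##".  */
static const char *const new_pattern
  = "_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE"
    "C[12]EOS4_RKS3_ONS4_14__xfer_bufptrsE";
static const char *const old_pattern
  = "_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]*__xfer_bufptrs*";

/* Return how many of the two patterns match SYMBOL (0, 1 or 2).
   A result of 2 corresponds to the "symbol version conflict" case.  */
int
count_matching_patterns (const char *symbol)
{
  int n = 0;
  if (fnmatch (new_pattern, symbol, 0) == 0)
    n++;
  if (fnmatch (old_pattern, symbol, 0) == 0)
    n++;
  return n;
}
```

Calling count_matching_patterns() on any of the four symbols from the ld
output should report that both patterns match.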

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Move pass_oacc_device_lower after pass_graphite

2020-11-06 Thread Frederik Harwath

Hi Richard,

Richard Biener  writes:

> On Tue, Nov 3, 2020 at 4:31 PM Frederik Harwath

> What's on my TODO list (or on the list of things to explore) is to make
> the dump file names/suffixes explicit in passes.def like via
>
>   NEXT_PASS (pass_ccp, true /* nonzero_p */, "oacc")
>
> and we'd get a dump named .ccp_oacc or so.

That would be very helpful for avoiding the drudgery of adapting those
pass numbers!

> Now, what does oacc_device_lower actually do that you need to
> re-run complex lowering?  What does cunrolli do at this point that
> the complete_unroll pass later does not do?
>

Good spot, "cunrolli" seems to be unnecessary.  The complex lowering is
necessary to handle the code that gets created by the OpenACC reduction
lowering during oaccdevlow.  I have attached a test case (a reduced
version of
libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt.c) which
shows that the complex instructions are created by
pass_oacc_device_lower and which leads to an ICE if compiled without the
new complex lowering instance ("-foffload=-fdisable-tree-cplxlower2").
The problem is an unlowered addition. This is from a diff of the dump of
the pass following oaccdevlow1 (ccp4) with disabled and with enabled
tree-cplxlower2:

<   _91 = VIEW_CONVERT_EXPR(_1);
<   _92 = reduction_var_2 + _91;
---
>   _104 = REALPART_EXPR (_1)>;
>   _105 = IMAGPART_EXPR (_1)>;
>   _91 = COMPLEX_EXPR <_104, _105>;
>   _106 = reduction_var$real_100 + _104;
>   _107 = reduction_var$imag_101 + _105;
>   _92 = COMPLEX_EXPR <_106, _107>;
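
In source terms, the lowered form on the ">" side corresponds roughly to
the following. This is a hand-written sketch of the effect, not what the
pass emits; the function name is made up, and CMPLXF needs C11.

```c
#include <complex.h>

/* Add VAL to ACC component by component, mirroring the lowered dump:
   split VAL into real and imaginary parts, add each part to the matching
   part of ACC, and rebuild a complex value from the two sums.  */
float _Complex
add_lowered (float _Complex acc, float _Complex val)
{
  float val_re = crealf (val);            /* like _104 */
  float val_im = cimagf (val);            /* like _105 */
  float sum_re = crealf (acc) + val_re;   /* like _106 */
  float sum_im = cimagf (acc) + val_im;   /* like _107 */
  return CMPLXF (sum_re, sum_im);         /* like _92 */
}
```

Without the extra lowering instance only the single complex "+" from the
"<" side remains, which is what later leads to the ICE.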

> What's special about oacc_device lower that doesn't also apply
> to omp_device_lower?

The passes do different things. The goal is to optimize OpenACC
loops using Graphite. The relevant lowering of the internal OpenACC
function calls happens in pass_oacc_device_lower.

> Is all this targeted at code compiled exclusively for the offload
> target?  Thus we're in lto1 here?

The OpenACC outlined functions also get compiled for the host.

> Does it make eventually more sense to have a completely custom pass
> pipeline for the  offload compilation?  Maybe even per offload target?
> See how we have a custom pipeline for -Og (pass_all_optimizations_g).

What would be the main benefits of a separate pipeline? Avoiding
(re-)running passes unnecessarily, fewer unwanted interactions
in the test suite (but your suggestion above regarding the fixed
pass names would also solve this)?

>> Ok to include the patch in master?

Best regards,
Frederik

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-lowering.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-lowering.c
new file mode 100644
index 000..6879e5aaf25
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-lowering.c
@@ -0,0 +1,50 @@
+/* { dg-additional-options "-foffload=-fdump-tree-cplxlower2" } */
+/* { dg-additional-options "-foffload=-fdump-tree-oaccdevlow1" } */
+/* { dg-do link } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } {""} } */
+
+#include <stdio.h>
+#if !defined(__hppa__) || !defined(__hpux__)
+#include <complex.h>
+#endif
+
+#define N 100
+
+static float _Complex __attribute__ ((noinline))
+sum (float _Complex ary[N])
+{
+  float _Complex reduction_var = 0;
+#pragma acc parallel loop gang reduction(+:reduction_var)
+  for (int ix = 0; ix < N; ix++)
+reduction_var += ary[ix];
+
+ return reduction_var;
+}
+
+int main (void)
+{
+  float _Complex ary[N];
+  float _Complex result;
+
+  for (int ix = 0; ix < N;  ix++)
+{
+  float frac = ix * (1.0f / 1024) + 1.0f;
+  ary[ix] = frac + frac * 2.0j - 1.0j;
+}
+
+  result = sum (ary);
+  printf("%.1f%+.1fi\n", creal(result), cimag(result));
+  return 0;
+}
+
+/* { dg-final { scan-offload-tree-dump-times "COMPLEX_EXPR" 1 "oaccdevlow1" } }
+
+ There is just one COMPLEX_EXPR right before oaccdevlow1 ...*/
+
+/* { dg-final { scan-offload-tree-dump-times "GOACC_REDUCTION .*?reduction_var.*?;" 4 "oaccdevlow1" } }
+
+  ... but several IFN_GOACC_REDUCTION calls for the reduction variable which are subsequently lowered ... */
+
+/* { dg-final { scan-offload-tree-dump-times "COMPLEX_EXPR " 4  "cplxlower2" } }
+
+ ... which introduces new COMPLEX_EXPRs. */


[Patch, committed] OpenACC/Fortran: Reject '!$acc atomic update capture'

2020-11-06 Thread Tobias Burnus

This patch adds some OpenACC atomic testcases, based on my just
committed C/C++ patch.
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558288.html

And it fixes an issue I introduced when updating the Fortran atomic
handling (for OpenMP and a bit OpenACC) at
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557734.html

Namely: For C/C++, OpenACC 2.5 and later has:

#pragma acc atomic [atomic-clause] new-line
#pragma acc atomic update capture new-line
  Where atomic-clause is one of read, write, update, or capture.

But for Fortran, the spec only shows "!$acc atomic" followed by
one of those four clauses. Hence,
   !$acc atomic update capture
is not permitted.

This patch now removes the support for the latter and adds
the testcases.

I have also opened the OpenACC spec Issue #333 regarding the
C/C++ vs. Fortran difference and the under-specification
of 'update capture'.

Committed as r11-4775-gc2e9f586fde57e64dc20e5528870d06cde894785

Tobias

commit c2e9f586fde57e64dc20e5528870d06cde894785
Author: Tobias Burnus 
Date:   Fri Nov 6 12:30:20 2020 +0100

OpenACC/Fortran: Reject '!$acc atomic update capture'

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_oacc_atomic): No longer accept 'update capture'.

gcc/testsuite/ChangeLog:

* gfortran.dg/goacc-gomp/goacc-gomp.exp: New.
* gfortran.dg/goacc-gomp/atomic.f90: New test.
* gfortran.dg/goacc/atomic.f90: New test.
---
 gcc/fortran/openmp.c   |  7 +---
 gcc/testsuite/gfortran.dg/goacc-gomp/atomic.f90| 48 ++
 .../gfortran.dg/goacc-gomp/goacc-gomp.exp  | 37 +
 gcc/testsuite/gfortran.dg/goacc/atomic.f90 | 35 
 4 files changed, 122 insertions(+), 5 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 6cb4f2862ab..1891ac5591b 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -4181,8 +4181,7 @@ gfc_match_omp_atomic (void)
 }
 
 
-/* acc atomic [ read | write | update | capture]
-   acc atomic update capture.  */
+/* acc atomic [ read | write | update | capture]  */
 
 match
 gfc_match_oacc_atomic (void)
@@ -4191,9 +4190,7 @@ gfc_match_oacc_atomic (void)
   c->atomic_op = GFC_OMP_ATOMIC_UPDATE;
   c->memorder = OMP_MEMORDER_RELAXED;
   gfc_gobble_whitespace ();
-  if (gfc_match ("update capture") == MATCH_YES)
-c->capture = true;
-  else if (gfc_match ("update") == MATCH_YES)
+  if (gfc_match ("update") == MATCH_YES)
 ;
   else if (gfc_match ("read") == MATCH_YES)
 c->atomic_op = GFC_OMP_ATOMIC_READ;
diff --git a/gcc/testsuite/gfortran.dg/goacc-gomp/atomic.f90 b/gcc/testsuite/gfortran.dg/goacc-gomp/atomic.f90
new file mode 100644
index 000..59186a20982
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc-gomp/atomic.f90
@@ -0,0 +1,48 @@
+! { dg-do compile } */
+! { dg-additional-options "-fdump-tree-original" } */
+
+subroutine foo
+  !$omp requires atomic_default_mem_order(acq_rel)
+  integer :: i, v
+
+  !$omp atomic read
+  i = v
+
+  !$acc atomic read
+  i = v
+
+  !$omp atomic write
+  i = v
+
+  !$acc atomic write
+  i = v
+
+  !$omp atomic update
+  i = i + 1
+
+  !$acc atomic update
+  i = i + 1
+
+  !$omp atomic capture
+i = i + 1
+v = i
+  !$omp end atomic
+
+  !$acc atomic capture
+i = i + 1
+v = i
+  !$acc end atomic
+
+  ! Valid in C/C++ since OpenACC 2.5 but not in Fortran:
+  ! !$acc atomic update capture
+  !   i = i + 1
+  !   v = i
+  ! !$acc end atomic
+end
+
+! { dg-final { scan-tree-dump-times "i = #pragma omp atomic read acquire" 1 "original" } }
+! { dg-final { scan-tree-dump-times "i = #pragma omp atomic read relaxed" 1 "original" } }
+! { dg-final { scan-tree-dump-times "#pragma omp atomic release" 2 "original" } }
+! { dg-final { scan-tree-dump-times "#pragma omp atomic relaxed" 2 "original" } }
+! { dg-final { scan-tree-dump-times "v = #pragma omp atomic capture acq_rel" 1  "original" } }
+! { dg-final { scan-tree-dump-times "v = #pragma omp atomic capture relaxed" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc-gomp/goacc-gomp.exp b/gcc/testsuite/gfortran.dg/goacc-gomp/goacc-gomp.exp
new file mode 100644
index 000..6073fb3978a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc-gomp/goacc-gomp.exp
@@ -0,0 +1,37 @@
+# Copyright (C) 2005-2020 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERC

RE: [PATCH] SLP: Move load/store-lanes check till late

2020-11-06 Thread Tamar Christina via Gcc-patches


> -Original Message-
> From: Christophe Lyon 
> Sent: Friday, November 6, 2020 10:27 AM
> To: Tamar Christina 
> Cc: Richard Biener ; nd ; gcc-
> patc...@gcc.gnu.org; o...@ucw.cz
> Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> 
> On Fri, 6 Nov 2020 at 11:18, Tamar Christina 
> wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Christophe Lyon 
> > > Sent: Friday, November 6, 2020 8:42 AM
> > > To: Tamar Christina 
> > > Cc: Richard Biener ; nd ; gcc-
> > > patc...@gcc.gnu.org; o...@ucw.cz
> > > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> > >
> > > On Thu, 5 Nov 2020 at 11:21, Tamar Christina via Gcc-patches  > > patc...@gcc.gnu.org> wrote:
> > > >
> > > > > -Original Message-
> > > > > From: rguent...@c653.arch.suse.de 
> > > > > On Behalf Of Richard Biener
> > > > > Sent: Thursday, November 5, 2020 10:17 AM
> > > > > To: Tamar Christina 
> > > > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > > > >
> > > > > On Wed, 4 Nov 2020, Tamar Christina wrote:
> > > > >
> > > > > > Hi Richi,
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: rguent...@c653.arch.suse.de
> > > > > > >  On Behalf Of Richard Biener
> > > > > > > Sent: Wednesday, November 4, 2020 8:07 AM
> > > > > > > To: Tamar Christina 
> > > > > > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till
> > > > > > > late
> > > > > > >
> > > > > > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > > > > > >
> > > > > > > > Hi Richi,
> > > > > > > >
> > > > > > > > We decided to take the regression in any code-gen this
> > > > > > > > could give and fix it properly next stage-1.  As such
> > > > > > > > here's a new patch based on your previous feedback.
> > > > > > > >
> > > > > > > > Ok for master?
> > > > > > >
> > > > > > > Looks good sofar but be aware that you elide the
> > > > > > >
> > > > > > > - && vect_store_lanes_supported
> > > > > > > -  (STMT_VINFO_VECTYPE (scalar_stmts[0]), 
> > > > > > > group_size,
> > > > > > > false))
> > > > > > >
> > > > > > > part of the check - that is, you don't verify the store part
> > > > > > > of the instance can use store-lanes.  Btw, this means the
> > > > > > > original code cancelled an instance only when the SLP graph
> > > > > > > entry is a store-lane capable store but your variant would
> > > > > > > also cancel in case there's a load-
> > > > > lane capable reduction.
> > > > > > >
> > > > > >
> > > > > > I do still have it,
> > > > > >
> > > > > >   if (loads_permuted
> > > > > >   && vect_store_lanes_supported (vectype, group_size,
> > > > > > false))
> > > > > >
> > > > > > I just grab the type from the SLP_TREE_VECTYPE (slp_root);
> > > > > > which should be the store if one exists.
> > > > > >
> > > > > > > I think that you eventually want to re-instantiate the
> > > > > > > store-lane check but treat it the same as any of the load
> > > > > > > checks (thus not require all instances to be stores for the
> cancellation).
> > > > > > > But at least when a store cannot use store-lanes we probably
> > > > > > > shouldn't cancel the SLP.
> > > > > >
> > > > > > I did however elide the kind check, that was added as part of
> > > > > > the rebase, it looked like kind wasn't Being stored inside the
> > > > > > SLP instance and
> > > > > I'd have to redo the analysis to find it.
> > > > > >
> > > > > > Does it does reasonable to include kind as a field in the SLP 
> > > > > > instance?
> > > > > >
> > > > > > >
> > > > > > > Anyway, the patch is OK for master.  The store-lane check
> > > > > > > part can be re- added as followup.
> > > > > > >
> > > > > >
> > > > > > Thanks! Will do.
> > > > >
> > > > > Btw, the patch regressed
> > > > >
> > > > > FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects
> > > > > scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > > > > FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect
> > > > > "vectorizing stmts using SLP" 1
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > > scan-tree-dump vect "Built SLP cancelled: can use load/store-lanes"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > > scan-tree-dump vect "LOAD_LANES"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > > scan-tree-dump vect "STORE_LANES"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP
> cancelled:
> > > > > can use load/store-lanes"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
> > > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES"
> > > > >
> > > > > on x86_64.  The slp-11b.c testcase is interesting since there
> > > > > extract_muldiv folding makes the group of four stores not
> > > > > matching so we split into a size of
> > > > > 3 and one remaining store.
> > > > > This 

Re: [PATCH] SLP: Move load/store-lanes check till late

2020-11-06 Thread Christophe Lyon via Gcc-patches
On Fri, 6 Nov 2020 at 12:33, Tamar Christina  wrote:
>
>
>
> > -Original Message-
> > From: Christophe Lyon 
> > Sent: Friday, November 6, 2020 10:27 AM
> > To: Tamar Christina 
> > Cc: Richard Biener ; nd ; gcc-
> > patc...@gcc.gnu.org; o...@ucw.cz
> > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> >
> > On Fri, 6 Nov 2020 at 11:18, Tamar Christina 
> > wrote:
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: Christophe Lyon 
> > > > Sent: Friday, November 6, 2020 8:42 AM
> > > > To: Tamar Christina 
> > > > Cc: Richard Biener ; nd ; gcc-
> > > > patc...@gcc.gnu.org; o...@ucw.cz
> > > > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> > > >
> > > > On Thu, 5 Nov 2020 at 11:21, Tamar Christina via Gcc-patches  > > > patc...@gcc.gnu.org> wrote:
> > > > >
> > > > > > -Original Message-
> > > > > > From: rguent...@c653.arch.suse.de 
> > > > > > On Behalf Of Richard Biener
> > > > > > Sent: Thursday, November 5, 2020 10:17 AM
> > > > > > To: Tamar Christina 
> > > > > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > > > > >
> > > > > > On Wed, 4 Nov 2020, Tamar Christina wrote:
> > > > > >
> > > > > > > Hi Richi,
> > > > > > >
> > > > > > > > -Original Message-
> > > > > > > > From: rguent...@c653.arch.suse.de
> > > > > > > >  On Behalf Of Richard Biener
> > > > > > > > Sent: Wednesday, November 4, 2020 8:07 AM
> > > > > > > > To: Tamar Christina 
> > > > > > > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > > > > > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till
> > > > > > > > late
> > > > > > > >
> > > > > > > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > > > > > > >
> > > > > > > > > Hi Richi,
> > > > > > > > >
> > > > > > > > > We decided to take the regression in any code-gen this
> > > > > > > > > could give and fix it properly next stage-1.  As such
> > > > > > > > > here's a new patch based on your previous feedback.
> > > > > > > > >
> > > > > > > > > Ok for master?
> > > > > > > >
> > > > > > > > Looks good sofar but be aware that you elide the
> > > > > > > >
> > > > > > > > - && vect_store_lanes_supported
> > > > > > > > -  (STMT_VINFO_VECTYPE (scalar_stmts[0]), 
> > > > > > > > group_size,
> > > > > > > > false))
> > > > > > > >
> > > > > > > > part of the check - that is, you don't verify the store part
> > > > > > > > of the instance can use store-lanes.  Btw, this means the
> > > > > > > > original code cancelled an instance only when the SLP graph
> > > > > > > > entry is a store-lane capable store but your variant would
> > > > > > > > also cancel in case there's a load-
> > > > > > lane capable reduction.
> > > > > > > >
> > > > > > >
> > > > > > > I do still have it,
> > > > > > >
> > > > > > >   if (loads_permuted
> > > > > > >   && vect_store_lanes_supported (vectype, group_size,
> > > > > > > false))
> > > > > > >
> > > > > > > I just grab the type from the SLP_TREE_VECTYPE (slp_root);
> > > > > > > which should be the store if one exists.
> > > > > > >
> > > > > > > > I think that you eventually want to re-instantiate the
> > > > > > > > store-lane check but treat it the same as any of the load
> > > > > > > > checks (thus not require all instances to be stores for the
> > cancellation).
> > > > > > > > But at least when a store cannot use store-lanes we probably
> > > > > > > > shouldn't cancel the SLP.
> > > > > > >
> > > > > > > I did however elide the kind check, that was added as part of
> > > > > > > the rebase, it looked like kind wasn't Being stored inside the
> > > > > > > SLP instance and
> > > > > > I'd have to redo the analysis to find it.
> > > > > > >
> > > > > > > Does it does reasonable to include kind as a field in the SLP 
> > > > > > > instance?
> > > > > > >
> > > > > > > >
> > > > > > > > Anyway, the patch is OK for master.  The store-lane check
> > > > > > > > part can be re- added as followup.
> > > > > > > >
> > > > > > >
> > > > > > > Thanks! Will do.
> > > > > >
> > > > > > Btw, the patch regressed
> > > > > >
> > > > > > FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects
> > > > > > scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > > > > > FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect
> > > > > > "vectorizing stmts using SLP" 1
> > > > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > > > scan-tree-dump vect "Built SLP cancelled: can use load/store-lanes"
> > > > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > > > scan-tree-dump vect "LOAD_LANES"
> > > > > > FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects
> > > > > > scan-tree-dump vect "STORE_LANES"
> > > > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP
> > cancelled:
> > > > > > can use load/store-lanes"
> > > > > > FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
> > > > > > FAIL: gcc.dg/vect/slp

[PATCH] tree-optimization/97706 - part one, refactor vect_determine_mask_precision

2020-11-06 Thread Richard Biener
This computes vect_determine_mask_precision in a RPO forward walk
rather than in a backward walk and using a worklist.  It will make
fixing PR97706 easier but for bisecting I wanted it to be separate.
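
As a rough picture of the forward walk, consider the toy model below. It is
a sketch only and not the code in the patch: the struct, its fields and the
function name are made up, statements are assumed to be stored so that every
definition comes before its uses, and cycles are ignored.

```c
#include <stddef.h>

#define NO_PRECISION (~0U)   /* same start value as in the real function */

/* Hypothetical stand-in for a statement with boolean operands.  */
struct toy_stmt
{
  int ops[2];                 /* index of each operand's defining statement,
                                 or -1 for external/constant operands */
  unsigned fixed_precision;   /* nonzero if the statement dictates its own
                                 precision, e.g. a comparison */
  unsigned mask_precision;    /* result, filled in by the walk */
};

/* Visit statements in definition-before-use order.  Each statement takes
   the narrowest precision among its already-processed operands, so no
   worklist is needed.  */
void
toy_determine_mask_precisions (struct toy_stmt *stmts, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      unsigned precision = NO_PRECISION;
      if (stmts[i].fixed_precision)
        precision = stmts[i].fixed_precision;
      else
        for (int k = 0; k < 2; ++k)
          {
            int op = stmts[i].ops[k];
            if (op < 0)
              continue;   /* external operands do not influence the choice */
            if (stmts[op].mask_precision < precision)
              precision = stmts[op].mask_precision;
          }
      stmts[i].mask_precision = precision;
    }
}
```

For the "a = b & c" case from the function's comment, with b at 16 bits and
c at 8 bits, this model picks 8 for a.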

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-11-06  Richard Biener  

PR tree-optimization/97706
* tree-vect-patterns.c (vect_determine_mask_precision):
Remove worklist operation.
(vect_determine_stmt_precisions): Do not call
vect_determine_mask_precision here.
(vect_determine_precisions): Compute mask precision
in a forward walk.
---
 gcc/tree-vect-patterns.c | 161 ---
 1 file changed, 81 insertions(+), 80 deletions(-)

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index ac56acebe01..47d9fce594f 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -5017,104 +5017,88 @@ possible_vector_mask_operation_p (stmt_vec_info 
stmt_info)
 static void
 vect_determine_mask_precision (vec_info *vinfo, stmt_vec_info stmt_info)
 {
-  if (!possible_vector_mask_operation_p (stmt_info)
-  || stmt_info->mask_precision)
+  if (!possible_vector_mask_operation_p (stmt_info))
 return;
 
-  auto_vec worklist;
-  worklist.quick_push (stmt_info);
-  while (!worklist.is_empty ())
-{
-  stmt_info = worklist.last ();
-  unsigned int orig_length = worklist.length ();
-
-  /* If at least one boolean input uses a vector mask type,
-pick the mask type with the narrowest elements.
-
-??? This is the traditional behavior.  It should always produce
-the smallest number of operations, but isn't necessarily the
-optimal choice.  For example, if we have:
+  /* If at least one boolean input uses a vector mask type,
+ pick the mask type with the narrowest elements.
 
-  a = b & c
+ ??? This is the traditional behavior.  It should always produce
+ the smallest number of operations, but isn't necessarily the
+ optimal choice.  For example, if we have:
 
-where:
+   a = b & c
 
-- the user of a wants it to have a mask type for 16-bit elements (M16)
-- b also uses M16
-- c uses a mask type for 8-bit elements (M8)
+ where:
 
-then picking M8 gives:
+   - the user of a wants it to have a mask type for 16-bit elements (M16)
+   - b also uses M16
+   - c uses a mask type for 8-bit elements (M8)
 
-- 1 M16->M8 pack for b
-- 1 M8 AND for a
-- 2 M8->M16 unpacks for the user of a
+ then picking M8 gives:
 
-whereas picking M16 would have given:
+   - 1 M16->M8 pack for b
+   - 1 M8 AND for a
+   - 2 M8->M16 unpacks for the user of a
 
-- 2 M8->M16 unpacks for c
-- 2 M16 ANDs for a
-
-The number of operations are equal, but M16 would have given
-a shorter dependency chain and allowed more ILP.  */
-  unsigned int precision = ~0U;
-  gassign *assign = as_a <gassign *> (stmt_info->stmt);
-  unsigned int nops = gimple_num_ops (assign);
-  for (unsigned int i = 1; i < nops; ++i)
-   {
- tree rhs = gimple_op (assign, i);
- if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs)))
-   continue;
+ whereas picking M16 would have given:
 
- stmt_vec_info def_stmt_info = vinfo->lookup_def (rhs);
- if (!def_stmt_info)
-   /* Don't let external or constant operands influence the choice.
-  We can convert them to whichever vector type we pick.  */
-   continue;
+   - 2 M8->M16 unpacks for c
+   - 2 M16 ANDs for a
 
- if (def_stmt_info->mask_precision)
-   {
- if (precision > def_stmt_info->mask_precision)
-   precision = def_stmt_info->mask_precision;
-   }
- else if (possible_vector_mask_operation_p (def_stmt_info))
-   worklist.safe_push (def_stmt_info);
-   }
+ The number of operations are equal, but M16 would have given
+ a shorter dependency chain and allowed more ILP.  */
+  unsigned int precision = ~0U;
+  gassign *assign = as_a <gassign *> (stmt_info->stmt);
+  unsigned int nops = gimple_num_ops (assign);
+  for (unsigned int i = 1; i < nops; ++i)
+{
+  tree rhs = gimple_op (assign, i);
+  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs)))
+   continue;
 
-  /* Defer the choice if we need to visit operands first.  */
-  if (orig_length != worklist.length ())
+  stmt_vec_info def_stmt_info = vinfo->lookup_def (rhs);
+  if (!def_stmt_info)
+   /* Don't let external or constant operands influence the choice.
+  We can convert them to whichever vector type we pick.  */
continue;
 
-  /* If the statement compares two values that shouldn't use vector masks,
-try comparing the values as normal scalars instead.  */
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (precision == ~0

Re: [RFC PATCH] phiopt: Optimize x ? 1024 : 0 to (int) x << 10 [PR97690]

2020-11-06 Thread Christophe Lyon via Gcc-patches
On Wed, 4 Nov 2020 at 10:59, Richard Biener  wrote:
>
> On Wed, 4 Nov 2020, Jakub Jelinek wrote:
>
> > Hi!
> >
> > The following patch generalizes the x ? 1 : 0 -> (int) x optimization
> > to handle also left shifts by constant.
> >
> > During x86_64-linux and i686-linux bootstraps + regtests it triggered
> > in 1514 unique non-LTO -m64 cases (sort -u on log mentioning
> > filename, function name and shift count) and 1866 -m32 cases.
> >
> > Unfortunately, the patch regresses:
> > +FAIL: gcc.dg/tree-ssa/ssa-ccp-11.c scan-tree-dump-times optimized "if " 0
> > +FAIL: gcc.dg/vect/bb-slp-pattern-2.c -flto -ffat-lto-objects  
> > scan-tree-dump-times slp1 "optimized: basic block" 1
> > +FAIL: gcc.dg/vect/bb-slp-pattern-2.c scan-tree-dump-times slp1 "optimized: 
> > basic block" 1
> > and in both cases it actually results in worse code.
> >
> > In ssa-ccp-11.c since phiopt2 it results in smaller IL due to the
> > optimization, e.g.
> > -  if (_1 != 0)
> > -goto ; [21.72%]
> > -  else
> > -goto ; [78.28%]
> > -
> > -   [local count: 233216728]:
> > -
> > -   [local count: 1073741824]:
> > -  # _4 = PHI <2(5), 0(4)>
> > -  return _4;
> > +  _7 = (int) _1;
> > +  _8 = _7 << 1;
> > +  return _8;
> > but dom2 actually manages to optimize it only without this optimization:
> > -  # a_7 = PHI <0(3), 1(2)>
> > -  # b_8 = PHI <1(3), 0(2)>
> > -  _9 = a_7 & b_8;
> > -  return 0;
> > +  # a_2 = PHI <1(2), 0(3)>
> > +  # b_3 = PHI <0(2), 1(3)>
> > +  _1 = a_2 & b_3;
> > +  _7 = (int) _1;
> > +  _8 = _7 << 1;
> > +  return _8;
> > We'd need some optimization that would go through all PHI edges and
> > compute if some use of the phi results don't actually compute a constant
> > across all the PHI edges - 1 & 0 and 0 & 1 is always 0.
>
> PRE should do this, IMHO only optimizing it at -O2 is fine.  Can you
> check?
>
> >  Similarly in the
> > other function
> > +  # a_1 = PHI <3(2), 2(3)>
> > +  # b_2 = PHI <2(2), 3(3)>
> > +  c_5 = a_1 + b_2;
> > is always c_5 = 5;
> > Similarly, in the slp vectorization test there is:
> >  a[0] = b[0] ? 1 : 7;
>
> note this, carefully avoiding the already "optimized" b[0] ? 1 : 0 ...
>
> >  a[1] = b[1] ? 2 : 0;
> >  a[2] = b[2] ? 3 : 0;
> >  a[3] = b[3] ? 4 : 0;
> >  a[4] = b[4] ? 5 : 0;
> >  a[5] = b[5] ? 6 : 0;
> >  a[6] = b[6] ? 7 : 0;
> >  a[7] = b[7] ? 8 : 0;
> > and obviously if the ? 2 : 0 and ? 4 : 0 and ? 8 : 0 are optimized
> > into shifts, it doesn't match anymore.
>
> So the option is to put : 7 in the 2, 4 and 8 case as well.  The testcase
> wasn't added for any real-world case but is artificial I guess for
> COND_EXPR handling of invariants.
>
> > So, I wonder if we e.g. shouldn't perform this optimization only in the last
> > phiopt pass (i.e. change the bool early argument to int late where it would
> > be 0 (early), 1 (late) and 2 (very late) and perform this only if very late).
>
> Well, we always have the issue that a more "complex" expression might
> be more easily canonical.  But removing control flow is important
> and if we decide that we want to preserve it in a more "canonical"
> (general) form then we should consider replacing
>
>   if (_1 != 0)
>
>   # _2 = PHI <0, 1>
>
> with
>
>   _2 = _1 ? 0 : 1;
>
> in general and doing fancy expansion late.  But we're already doing
> the other thing so ...
>
> But yeah, for things like SLP it means we eventually have to
> implement reverse transforms for all of this to make the lanes
> matching.  But that's true anyway for things like x + 1 vs. x + 0
> or x / 3 vs. x / 2 or other simplifications we do.
>
> > Thoughts on this?
>
> OK with the FAILing testcases adjusted (use -O2 / different constants).
>
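
As a source-level illustration of the transformation discussed above (a
sketch only, not the phiopt implementation): for a boolean condition and a
power-of-two constant C, `cond ? C : 0` is the same value as
`((int) cond) << log2 (C)`.  All helper names below are made up for the
example.

```cpp
// Returns true if c is a positive power of two.
bool is_pow2(int c) { return c > 0 && (c & (c - 1)) == 0; }

// Returns log2(c); the caller must pass a positive power of two.
int log2_pow2(int c) {
  int shift = 0;
  while ((c >> shift) != 1)
    ++shift;
  return shift;
}

// Branchy form: cond ? c : 0.
int select_form(bool cond, int c) { return cond ? c : 0; }

// Branch-free form: ((int) cond) << log2(c).
// Only equivalent to select_form when c is a power of two.
int shift_form(bool cond, int c) {
  return static_cast<int>(cond) << log2_pow2(c);
}
```

For C = 1024 the shift count is 10, which is the case named in the subject
line.  A constant such as 7 is not a power of two, so it is not a candidate
for the shift form.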

The patch introduced a regression on aarch64:
FAIL:gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve
scan-assembler-times \\tmov\\tz[0-9]+\\.[hsd], p[0-7]/z, #-32768\\n 3
FAIL:gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve
scan-assembler-times \\tmov\\tz[0-9]+\\.[hsd], p[0-7]/z, #256\\n 3
FAIL:gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve
scan-assembler-times \\tmov\\tz[0-9]+\\.[hsd], p[0-7]/z, #2\\n 3
FAIL:gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve
scan-assembler-times \\tmov\\tz[0-9]+\\.b, p[0-7]/z, #-128\\n 1
FAIL:gcc.target/aarch64/sve/vcond_3.c -march=armv8.2-a+sve
scan-assembler-times \\tmov\\tz[0-9]+\\.b, p[0-7]/z, #2\\n 1

Christophe

> Thanks,
> Richard.
>
> > 2020-11-03  Jakub Jelinek  
> >
> >   PR tree-optimization/97690
> >   * tree-ssa-phiopt.c (conditional_replacement): Also optimize
> >   cond ? pow2p_cst : 0 as ((type) cond) << cst.
> >
> >   * gcc.dg/tree-ssa/phi-opt-22.c: New test.
> >
> > --- gcc/tree-ssa-phiopt.c.jj  2020-10-22 09:36:25.602484491 +0200
> > +++ gcc/tree-ssa-phiopt.c 2020-11-03 17:59:18.133662581 +0100
> > @@ -752,7 +752,9 @@ conditional_replacement (basic_block con
> >gimple_stmt_iterator gsi;
> >edge true_edge, false_edge;
> >tree new_var, new_var2;
> > -  bool neg;
> > +  bo

Move ipa-refs from GC to heap

2020-11-06 Thread Jan Hubicka
Hi,
this patch moves ipa-refs from GGC to the heap.  They are half in the heap
anyway (so they cannot go to PCH), so finishing the move is easy and also
makes the code a bit cleaner.  While refs point to statements, all those
statements are also reachable from function bodies.

Bootstrapped/regtested x86_64-linux.
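
The shape of the change can be sketched outside of GCC as follows.  This is
an illustration under assumptions, not the patch itself: std::vector stands
in for GCC's vec, and Ref and RefList are hypothetical simplified types.
The point is that a list which owns its vector directly has no pointer that
may be null, and that removal can overwrite the removed slot with the last
element.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical simplified reference record.
struct Ref {
  int target;
};

// The list owns its references in a heap vector, so an empty list is
// simply one with length 0 and callers need no null check.
struct RefList {
  std::vector<Ref> references;

  std::size_t num_references() const { return references.size(); }

  // Return the first reference, or nullptr if the list is empty.
  Ref *first_reference() {
    return references.empty() ? nullptr : &references[0];
  }

  // Remove the i-th reference in O(1) by overwriting it with the last
  // element and then dropping the last element.  Order is not preserved.
  void remove_reference(std::size_t i) {
    references[i] = references.back();
    references.pop_back();
  }
};
```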

gcc/ChangeLog:

2020-11-06  Jan Hubicka  

* ipa-ref.h (enum ipa_ref_use): Remove GTY marker.
(struct ipa_ref): Remove GTY marker; reorder for better packing.
(struct ipa_ref_list): Remove GTY marker; turn references
and referring to va_heap, vl_ptr vectors; update accessors.
* cgraph.h (symtab_node::iterate_reference): Update.
* ipa-ref.c (ipa_ref::remove_reference): Update.
* symtab.c (symtab_node::create_reference): Update.
(symtab_node::remove_all_references): Update.
(symtab_node::resolve_alias): Update.

gcc/cp/ChangeLog:

2020-11-06  Jan Hubicka  

* tree.c (cp_fix_function_decl_p): Do not access ipa_ref_list directly.

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index c87180f1e96..73c37d8807d 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -221,7 +221,7 @@ public:
   /* Get number of references for this node.  */
   inline unsigned num_references (void)
   {
-return ref_list.references ? ref_list.references->length () : 0;
+return ref_list.references.length ();
   }
 
   /* Iterates I-th reference in the list, REF is also set.  */
@@ -604,7 +604,7 @@ public:
   symtab_node *same_comdat_group;
 
   /* Vectors of referring and referenced entities.  */
-  ipa_ref_list ref_list;
+  ipa_ref_list GTY((skip)) ref_list;
 
   /* Alias target. May be either DECL pointer or ASSEMBLER_NAME pointer
  depending to what was known to frontend on the creation time.
@@ -2676,7 +2676,7 @@ symtab_node::next_defined_symbol (void)
 inline ipa_ref *
 symtab_node::iterate_reference (unsigned i, ipa_ref *&ref)
 {
-  vec_safe_iterate (ref_list.references, i, &ref);
+  ref_list.references.iterate (i, &ref);
 
   return ref;
 }
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 63ce9acd7a6..28e591086b3 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -5719,8 +5719,7 @@ cp_fix_function_decl_p (tree decl)
 
   /* Don't fix same_body aliases.  Although they don't have their own
 CFG, they share it with what they alias to.  */
-  if (!node || !node->alias
- || !vec_safe_length (node->ref_list.references))
+  if (!node || !node->alias || !node->num_references ())
return true;
 }
 
diff --git a/gcc/ipa-ref.c b/gcc/ipa-ref.c
index 241828ee973..b7217c427f2 100644
--- a/gcc/ipa-ref.c
+++ b/gcc/ipa-ref.c
@@ -32,7 +32,6 @@ ipa_ref::remove_reference ()
 {
   struct ipa_ref_list *list = referred_ref_list ();
   struct ipa_ref_list *list2 = referring_ref_list ();
-  vec *old_references = list2->references;
   struct ipa_ref *last;
 
   gcc_assert (list->referring[referred_index] == this);
@@ -66,7 +65,7 @@ ipa_ref::remove_reference ()
 }
   list->referring.pop ();
 
-  last = &list2->references->last ();
+  last = &list2->references.last ();
 
   struct ipa_ref *ref = this;
 
@@ -75,8 +74,7 @@ ipa_ref::remove_reference ()
   *ref = *last;
   ref->referred_ref_list ()->referring[referred_index] = ref;
 }
-  list2->references->pop ();
-  gcc_assert (list2->references == old_references);
+  list2->references.pop ();
 }
 
 /* Return true when execution of reference can lead to return from
diff --git a/gcc/ipa-ref.h b/gcc/ipa-ref.h
index 1de5bd34b82..3ea3f665c3b 100644
--- a/gcc/ipa-ref.h
+++ b/gcc/ipa-ref.h
@@ -27,7 +27,7 @@ struct symtab_node;
 
 
 /* How the reference is done.  */
-enum GTY(()) ipa_ref_use
+enum ipa_ref_use
 {
   IPA_REF_LOAD,
   IPA_REF_STORE,
@@ -36,7 +36,7 @@ enum GTY(()) ipa_ref_use
 };
 
 /* Record of reference in callgraph or varpool.  */
-struct GTY(()) ipa_ref
+struct ipa_ref
 {
 public:
   /* Remove reference.  */
@@ -59,28 +59,27 @@ public:
   symtab_node *referred;
   gimple *stmt;
   unsigned int lto_stmt_uid;
+  unsigned int referred_index;
   /* speculative id is used to link direct calls with their corresponding
  IPA_REF_ADDR references when representing speculative calls.  */
   unsigned int speculative_id : 16;
-  unsigned int referred_index;
   ENUM_BITFIELD (ipa_ref_use) use:3;
   unsigned int speculative:1;
 };
 
 typedef struct ipa_ref ipa_ref_t;
-typedef struct ipa_ref *ipa_ref_ptr;
 
 
 /* List of references.  This is stored in both callgraph and varpool nodes.  */
-struct GTY(()) ipa_ref_list
+struct ipa_ref_list
 {
 public:
   /* Return first reference in list or NULL if empty.  */
   struct ipa_ref *first_reference (void)
   {
-if (!vec_safe_length (references))
+if (!references.length ())
   return NULL;
-return &(*references)[0];
+return &references[0];
   }
 
   /* Return first referring ref in list or NULL if empty.  */
@@ -121,20 +120,20 @@ public:
   void clear (void)
   {
 referring.create (0);
-references = NULL;
+refer

Re: [PING] [PATCH] S/390: Do not turn maybe-uninitialized warnings into errors

2020-11-06 Thread Andreas Krebbel via Gcc-patches
On 06.11.20 04:52, Jeff Law via Gcc-patches wrote:
> 
> On 10/30/20 7:01 AM, Richard Biener wrote:
>>
>> It's not that more / different inlining inherently exposes _more_
>> false positives in the middle-end warnings.  They simply expose
>> others and the GCC codebase is cleansed (by those who change
>> inliner heuristics / tunings) from those by either fixing the analysis
>> or modifying the code (like putting in initializers).
> 
> Right.  The change in heuristics inherently perturb the middle end
> warnings.  It has been and continues to be a source of significant
> headaches in Fedora.

Stefan did some measurements and in fact we see only a few benchmarks
improving with our aggressive settings. However, in these cases the
performance benefits are significant. We will continue looking into these
cases. Perhaps more selective ways can be found to achieve the same.

I've just committed a patch to switch back to the default values. With that
patch bootstrapping on Z works fine again even without --disable-werror.
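
For context, the removed lines used the "set a target default only if the
user did not choose a value" pattern.  A minimal model of that behaviour is
sketched below.  Param and set_if_unset are hypothetical names; the real
SET_OPTION_IF_UNSET macro takes different arguments and is not reproduced
here.

```cpp
// Hypothetical model of one tunable parameter.
struct Param {
  int value;         // current value
  bool set_by_user;  // true if given explicitly on the command line
};

// Apply a target-specific default only when the user left the parameter
// alone; an explicit user choice always wins.
void set_if_unset(Param &p, int target_default) {
  if (!p.set_by_user)
    p.value = target_default;
}
```

With the override removed, the parameters keep the generic defaults unless
the user sets them.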

Andreas

gcc/ChangeLog:

* config/s390/s390.c (s390_option_override_internal): Remove
override of inline params.
---
 gcc/config/s390/s390.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index b8961a315aa..847cedde674 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -15469,13 +15469,6 @@ s390_option_override_internal (struct gcc_options *opts,
   SET_OPTION_IF_UNSET (opts, opts_set, param_sched_pressure_algorithm, 2);
   SET_OPTION_IF_UNSET (opts, opts_set, param_min_vect_loop_bound, 2);

-  /* Use aggressive inlining parameters.  */
-  if (opts->x_s390_tune >= PROCESSOR_2964_Z13)
-{
-  SET_OPTION_IF_UNSET (opts, opts_set, param_inline_min_speedup, 2);
-  SET_OPTION_IF_UNSET (opts, opts_set, param_max_inline_insns_auto, 80);
-}
-
   /* Set the default alignment.  */
   s390_default_align (opts);


[committed] ipa-modref: Fix comment typos

2020-11-06 Thread Jakub Jelinek via Gcc-patches
Hi!

I've done this mainly to test whether the git hooks logging of sent mails
hack I've added to help debugging the missing mails problem works.

2020-11-06  Jakub Jelinek  

* ipa-modref-tree.h: Fix comment typos.
* ipa-modref.c: Likewise.

--- gcc/ipa-modref-tree.h
+++ gcc/ipa-modref-tree.h
@@ -23,17 +23,17 @@ along with GCC; see the file COPYING3.  If not see
call.  For every function we collect two trees, one for loads and other
for stores.  Tree consist of following levels:
 
-   1) Base: this level represent base alias set of the acecess and refers
+   1) Base: this level represent base alias set of the access and refers
   to sons (ref nodes). Flag all_refs means that all possible references
   are aliasing.
 
-  Because for LTO streaming we need to stream types rahter than alias sets
+  Because for LTO streaming we need to stream types rather than alias sets
   modref_base_node is implemented as a template.
-   2) Ref: this level represent ref alias set and links to acesses unless
-  all_refs flag is et.
+   2) Ref: this level represent ref alias set and links to accesses unless
+  all_refs flag is set.
   Again ref is an template to allow LTO streaming.
3) Access: this level represent info about individual accesses.  Presently
-  we record whether access is trhough a dereference of a function parameter
+  we record whether access is through a dereference of a function parameter
 */
 
 #ifndef GCC_MODREF_TREE_H
@@ -50,7 +50,7 @@ struct GTY(()) modref_access_node
   poly_int64 size;
   poly_int64 max_size;
 
-  /* Offset from parmeter pointer to the base of the access (in bytes).  */
+  /* Offset from parameter pointer to the base of the access (in bytes).  */
   poly_int64 parm_offset;
 
   /* Index of parameter which specifies the base of access. -1 if base is not
@@ -240,7 +240,7 @@ struct modref_parm_map
 {
   /* Index of parameter we translate to.
  -1 indicates that parameter is unknown
- -2 indicates that parmaeter points to local memory and access can be
+ -2 indicates that parameter points to local memory and access can be
discarded.  */
   int parm_index;
   bool parm_offset_known;
@@ -333,7 +333,7 @@ struct GTY((user)) modref_tree
 /* If we failed to insert ref, just see if there is a cleanup possible.  */
 if (!ref_node)
   {
-   /* No useful ref information and no useful base; collapse everyting.  */
+   /* No useful ref information and no useful base; collapse everything.  */
if (!base && base_node->every_ref)
  {
collapse ();
@@ -367,7 +367,7 @@ struct GTY((user)) modref_tree
 return changed;
   }
 
- /* Remove tree branches that are not useful (i.e. they will allways pass).  */
+ /* Remove tree branches that are not useful (i.e. they will always pass).  */
 
  void cleanup ()
  {
--- gcc/ipa-modref.c
+++ gcc/ipa-modref.c
@@ -24,7 +24,7 @@ along with GCC; see the file COPYING3.  If not see
described in ipa-modref-tree.h.
 
This file contains a tree pass and an IPA pass.  Both performs the same
-   analys however tree pass is executed during early and late optimization
+   analysis however tree pass is executed during early and late optimization
passes to propagate info downwards in the compilation order.  IPA pass
propagates across the callgraph and is able to handle recursion and works on
whole program during link-time analysis.
@@ -152,7 +152,7 @@ public:
 static GTY(()) fast_function_summary 
 *summaries;
 
-/* Global variable holding all modref optimizaiton summaries
+/* Global variable holding all modref optimization summaries
(from IPA propagation time or used by local optimization pass).  */
 
 static GTY(()) fast_function_summary 
@@ -924,7 +924,7 @@ analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
   return true;
 }
 
-/* Support analyzis in non-lto and lto mode in parallel.  */
+/* Support analysis in non-lto and lto mode in parallel.  */
 
 struct summary_ptrs
 {
@@ -995,10 +995,10 @@ static bool
 analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
  gimple *stmt, bool ipa, vec  *recursive_calls)
 {
-  /* In general we can not ignore clobbers because they are barries for code
- motion, however after inlining it is safe to do becuase local optimization
+  /* In general we can not ignore clobbers because they are barriers for code
+ motion, however after inlining it is safe to do because local optimization
  passes do not consider clobbers from other functions.
- Similar logic is in ipa-pure-consts.  */
+ Similar logic is in ipa-pure-const.c.  */
   if ((ipa || cfun->after_inlining) && gimple_clobber_p (stmt))
 return true;
 
@@ -1121,7 +1121,7 @@ analyze_function (function *f, bool ipa)
   summary = optimization_summaries->get_create (cgraph_node::get (f->decl));
   gcc_checking_assert (nolto && !lto);
 }

Re: [PATCH v2] Add if-chain to switch conversion pass.

2020-11-06 Thread Richard Biener via Gcc-patches
On Fri, Oct 16, 2020 at 4:04 PM Martin Liška  wrote:
>
> Hello.
>
> There's another version of the patch that should be based on what
> I discussed with Richi and Jakub:
>
> - the first patch introduces a new option -fbit-tests that is analogous to
>   -fjump-tables and will control the new if-to-switch conversion pass
>
> - the second patch adds the pass
> - I share code with tree-ssa-reassoc.c (range_entry and init_range_entry)
> - a local discovery phase is run first
> - later these local BBs are chained into a candidate list for the
>   conversion
>
> I'm also sending transformed chains for 'make all-host' (620 transformations).
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
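
To make the goal of the pass concrete, here is a source-level sketch of the
kind of rewrite being discussed.  The functions and constants are invented
for illustration; the pass itself works on GIMPLE, not on source code.

```cpp
// An if-else chain that compares one variable against constants...
int classify_chain(int x) {
  if (x == 1)
    return 10;
  else if (x == 2)
    return 20;
  else if (x == 3 || x == 4)
    return 30;
  else
    return 0;
}

// ...and the switch that an if-to-switch conversion would aim to produce.
// Once in switch form, the existing switch lowering can choose between
// jump tables, bit tests or a decision tree.
int classify_switch(int x) {
  switch (x) {
  case 1:
    return 10;
  case 2:
    return 20;
  case 3:
  case 4:
    return 30;
  default:
    return 0;
  }
}
```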

-static bool
+bool
 no_side_effect_bb (basic_block bb)
 {

exporting this with this name is dangerous I think because the function
seems to allow side-effects in the last stmt - not sure exactly what
it tries to allow - there's no comment to that :/

+  free (rpo);
+  free_dominance_info (CDI_DOMINATORS);
+
+  if (!all_candidates.is_empty ())
+mark_virtual_operands_for_renaming (fun);

please avoid freeing dominance info when there was no change done
(move it to the !all_candidates.is_empty () block).

+  basic_block bb;
+  FOR_EACH_BB_FN (bb, fun)
+find_conditions (bb, &conditions_in_bbs);
+

if we didn't find any conditions (or found just one?) we can elide the
rest of the function, no?

+ if_chain *chain = new if_chain ();
+ chain->m_entries.safe_push (info);
+ /* Try to find a chain starting in this BB.  */
+ while (true)
+   {
+ if (!single_pred_p (gimple_bb (info->m_cond)))
+   break;
+ edge e = single_pred_edge (gimple_bb (info->m_cond));
+ condition_info *info2 = conditions_in_bbs.get (e->src);
+ if (!info2 || info->m_ranges[0].exp != info2->m_ranges[0].exp)
+   break;
+
+ chain->m_entries.safe_push (info2);
+ bitmap_set_bit (seen_bbs, e->src->index);
+ info = info2;
+   }

so while we now record conditions per BB the above doesn't really
allow matching a binary tree.  What I was thinking of is to record
if_chain * per BB as well and look at successors, thus (pseudo-code)

   if (block ends in cond)
     if (if_chain on true edge && if_chain on false edge)
       try merge
     else if (if_chain on true edge && this-cond tests same var)
       try merge
     else if (if_chain on false edge && ...)
       try merge
     record if_chain for block

where merging would eventually detach the if_chains from the successors.
For now we'd just handle the true (and maybe false) edge combos to handle
linear chains.  Walking reverse RPO (I'm not 100% sure reverse RPO is what
we want here, but guess it will work fine for now) will gather chains
accordingly.
When merging from a successor to a BB fails we push the successor chain
to the candidate list.
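
A rough, self-contained sketch of the linear-chain case follows.  It makes
several simplifying assumptions that are not in the patch: Block and
find_chains are hypothetical, only the false edge is followed, successors
are assumed to have higher indices than their predecessors, and the
candidates are collected at the end of the walk rather than at the point
where a merge fails.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical basic block: `var` is the variable its final condition
// tests (empty if the block does not end in a condition), `false_succ`
// is the index of the block on the false edge, or -1 if there is none.
struct Block {
  std::string var;
  int false_succ;
};

// Walk blocks from last to first so successors are seen before their
// predecessors.  Returns all chains (lists of block indices) that contain
// at least two conditions on the same variable.
std::vector<std::vector<int>> find_chains(const std::vector<Block> &blocks) {
  const int n = static_cast<int>(blocks.size());
  std::vector<std::vector<int>> chain_of(blocks.size());
  for (int i = n - 1; i >= 0; --i) {
    const Block &b = blocks[i];
    if (b.var.empty())
      continue;  // block does not end in a condition
    chain_of[i].push_back(i);
    const int s = b.false_succ;
    if (s > i && s < n && !chain_of[s].empty() && blocks[s].var == b.var) {
      // Merge: take over the successor's chain and detach it from there.
      chain_of[i].insert(chain_of[i].end(), chain_of[s].begin(),
                         chain_of[s].end());
      chain_of[s].clear();
    }
  }
  // Chains still attached to a block were not merged into a predecessor.
  std::vector<std::vector<int>> candidates;
  for (const auto &c : chain_of)
    if (c.size() >= 2)
      candidates.push_back(c);
  return candidates;
}
```

Handling the true edge, or both edges to match a binary tree of
conditions, would extend the merge step in the same way.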

+/* Algorithm of the pass runs in the following steps:
+   a) We walk basic blocks in DOMINATOR order so that we first reach
+  a first condition of a future switch.
+   b) We follow false edges of a if-else-chain and we record chain
+  of GIMPLE conditions.  These blocks are only used for comparison
+  of a common SSA_NAME and we do not allow any side effect.
+   c) We remove all basic blocks (except first) of such chain and
+  GIMPLE switch replaces the condition in the first basic block.
+   d) We move all GIMPLE statements in the removed blocks into the
+  first one.  */

the overall comment is now a bit out-of-date?

Please remove the PHI mapping as I outlined in earlier review.

The 0001-Add-fbit-tests-option.patch is OK for trunk.

Thanks,
Richard.


> Thoughts?
> Thanks,
> Martin


[PATCH] refactor SLP analysis

2020-11-06 Thread Richard Biener
This passes the graph entry kind down to vect_analyze_slp_instance,
which simplifies it and makes it a shallow wrapper around
vect_build_slp_instance.

Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.

2020-11-06  Richard Biener  

* tree-vect-slp.c (vect_analyze_slp): Pass down the
SLP graph entry kind.
(vect_analyze_slp_instance): Simplify.
(vect_build_slp_instance): Adjust.
(vect_slp_check_for_constructors): Perform more
eligibility checks here.
---
 gcc/tree-vect-slp.c | 105 +++-
 1 file changed, 45 insertions(+), 60 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 4d1f17bd3fa..88e637e30dc 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2174,7 +2174,8 @@ calculate_unrolling_factor (poly_uint64 nunits, unsigned int group_size)
 static bool
 vect_analyze_slp_instance (vec_info *vinfo,
   scalar_stmts_to_slp_tree_map_t *bst_map,
-  stmt_vec_info stmt_info, unsigned max_tree_size);
+  stmt_vec_info stmt_info, slp_instance_kind kind,
+  unsigned max_tree_size);
 
 /* Analyze an SLP instance starting from SCALAR_STMTS which are a group
of KIND.  Return true if successful.  */
@@ -2375,7 +2376,7 @@ vect_build_slp_instance (vec_info *vinfo,
  stmt_vec_info rest = vect_split_slp_store_group (stmt_info,
   group1_size);
  bool res = vect_analyze_slp_instance (vinfo, bst_map, stmt_info,
-   max_tree_size);
+   kind, max_tree_size);
  /* Split the rest at the failure point and possibly
 re-analyze the remaining matching part if it has
 at least two lanes.  */
@@ -2386,14 +2387,14 @@ vect_build_slp_instance (vec_info *vinfo,
  stmt_vec_info rest2 = rest;
  rest = vect_split_slp_store_group (rest, i - group1_size);
  if (i - group1_size > 1)
-   res |= vect_analyze_slp_instance (vinfo, bst_map,
- rest2, max_tree_size);
+   res |= vect_analyze_slp_instance (vinfo, bst_map, rest2,
+ kind, max_tree_size);
}
  /* Re-analyze the non-matching tail if it has at least
 two lanes.  */
  if (i + 1 < group_size)
res |= vect_analyze_slp_instance (vinfo, bst_map,
- rest, max_tree_size);
+ rest, kind, max_tree_size);
  return res;
}
}
@@ -2418,10 +2419,10 @@ vect_build_slp_instance (vec_info *vinfo,
  DR_GROUP_GAP (stmt_info) = 0;
 
  bool res = vect_analyze_slp_instance (vinfo, bst_map, stmt_info,
-   max_tree_size);
+   kind, max_tree_size);
  if (i + 1 < group_size)
res |= vect_analyze_slp_instance (vinfo, bst_map,
- rest, max_tree_size);
+ rest, kind, max_tree_size);
 
  return res;
}
@@ -2444,59 +2445,34 @@ vect_build_slp_instance (vec_info *vinfo,
 static bool
 vect_analyze_slp_instance (vec_info *vinfo,
   scalar_stmts_to_slp_tree_map_t *bst_map,
-  stmt_vec_info stmt_info, unsigned max_tree_size)
+  stmt_vec_info stmt_info,
+  slp_instance_kind kind,
+  unsigned max_tree_size)
 {
-  unsigned int group_size;
   unsigned int i;
-  struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
   vec scalar_stmts;
-  slp_instance_kind kind;
 
   if (is_a  (vinfo))
 vect_location = stmt_info->stmt;
-  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
-{
-  kind = slp_inst_kind_store;
-  group_size = DR_GROUP_SIZE (stmt_info);
-}
-  else if (!dr && REDUC_GROUP_FIRST_ELEMENT (stmt_info))
-{
-  kind = slp_inst_kind_reduc_chain;
-  gcc_assert (is_a  (vinfo));
-  group_size = REDUC_GROUP_SIZE (stmt_info);
-}
-  else if (is_gimple_assign (stmt_info->stmt)
-   && gimple_assign_rhs_code (stmt_info->stmt) == CONSTRUCTOR)
-{
-  kind = slp_inst_kind_ctor;
-  group_size = CONSTRUCTOR_NELTS (gimple_assign_rhs1 (stmt_info->stmt));
-}
-  else
-{
-  kind = slp_inst_kind_reduc_group;
-  gcc_assert (is_a  (vinfo));
-  group_size = as_a  (vinfo)->reductions.length ();
-}
 
-  /* Create a node (a root of the SLP tree) for the packed grouped stores.  */
-  scalar_stmts.create (group_size);
   stm

Re: Move pass_oacc_device_lower after pass_graphite

2020-11-06 Thread Richard Biener via Gcc-patches
On Fri, Nov 6, 2020 at 12:18 PM Frederik Harwath
 wrote:
>
>
> Hi Richard,
>
> Richard Biener  writes:
>
> > On Tue, Nov 3, 2020 at 4:31 PM Frederik Harwath
>
> > What's on my TODO list (or on the list of things to explore) is to make
> > the dump file names/suffixes explicit in passes.def like via
> >
> >   NEXT_PASS (pass_ccp, true /* nonzero_p */, "oacc")
> >
> > and we'd get a dump named .ccp_oacc or so.
>
> That would be very helpful for avoiding the drudgery of adapting those
> pass numbers!
>
> > Now, what does oacc_device_lower actually do that you need to
> > re-run complex lowering?  What does cunrolli do at this point that
> > the complete_unroll pass later does not do?
> >
>
> Good spot, "cunrolli" seems to be unnecessary.  The complex lowering is
> necessary to handle the code that gets created by the OpenACC reduction
> lowering during oaccdevlow.  I have attached a test case (a reduced
> version of
> libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt.c) which
> shows that the complex instructions are created by
> pass_oacc_device_lower and which leads to an ICE if compiled without the
> new complex lowering instance ("-foffload=-fdisable-tree-cplxlower2").
> The problem is an unlowered addition. This is from a diff of the dump of
> the pass following oaccdevlow1 (ccp4) with disabled and with enabled
> tree-cplxlower2:
>
> <   _91 = VIEW_CONVERT_EXPR(_1);
> <   _92 = reduction_var_2 + _91;
> ---
> >   _104 = REALPART_EXPR (_1)>;
> >   _105 = IMAGPART_EXPR (_1)>;
> >   _91 = COMPLEX_EXPR <_104, _105>;
> >   _106 = reduction_var$real_100 + _104;
> >   _107 = reduction_var$imag_101 + _105;
> >   _92 = COMPLEX_EXPR <_106, _107>;

I wonder if oacc device lowering could handle this itself rather than
requiring another cplxlower pass for presumably just complex add?
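
For reference, what lowering a complex addition amounts to can be written
out as below.  This is only a sketch of the arithmetic shown in the dump
above: lowered_add is a hypothetical name, and the float element type is
assumed from the name of the reduced test case.

```cpp
#include <complex>

// Split both operands into real and imaginary parts, add them
// component-wise, and rebuild the complex value.  The comments name the
// tree codes that appear in the dump with cplxlower2 enabled.
std::complex<float> lowered_add(std::complex<float> a,
                                std::complex<float> b) {
  float re = a.real() + b.real();      // REALPART_EXPR + REALPART_EXPR
  float im = a.imag() + b.imag();      // IMAGPART_EXPR + IMAGPART_EXPR
  return std::complex<float>(re, im);  // COMPLEX_EXPR <re, im>
}
```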

> > What's special about oacc_device lower that doesn't also apply
> > to omp_device_lower?
>
> The passes do different things. The goal is to optimize OpenACC
> loops using Graphite. The relevant lowering of the internal OpenACC
> function calls happens in pass_oacc_device_lower.
>
> > Is all this targeted at code compiled exclusively for the offload
> > target?  Thus we're in lto1 here?
>
> The OpenACC outlined functions also get compiled for the host.
>
> > Does it make eventually more sense to have a completely custom pass
> > pipeline for the  offload compilation?  Maybe even per offload target?
> > See how we have a custom pipeline for -Og (pass_all_optimizations_g).
>
> What would be the main benefits of a separate pipeline? Avoiding
> (re-)running passes unneccessarily, less unwanted interactions
> in the test suite (but your suggestion above regarding the fixed
> pass names would also solve this)?

Mainly to avoid (re-)running passes unnecessarily and more
easily tuning towards offload targets without affecting non-offload
code too much.

Can I somehow make you work on that dump-file idea? ;)

Richard.

> >> Ok to include the patch in master?
>
> Best regards,
> Frederik
>
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, 
> Alexander Walter


Re: [patch] Add dg-require-effective-target fpic to an aarch64 specific test in gcc.dg

2020-11-06 Thread Olivier Hainque



> On 4 Nov 2020, at 20:16, Richard Sandiford  wrote:
> 
> Olivier Hainque  writes:
>> Hello,
>> 
>> This patch adds dg-require-effective-target fpic
>> to an aarch64 specific gcc.dg test using -fPIC,
>> which helps circumvent a failure we observed while
>> testing the aarch64 port for VxWorks.
>> 
>> ok to commit ?
> 
> OK, thanks.  Also OK for any other current or future aarch64 test that
> has -fpic or -fPIC in the options and forgets to do this.

Great, thanks Richard!

This echoes what you had actually already told me some
months ago at

  https://gcc.gnu.org/pipermail/gcc-patches/2019-December/536909.html

which I didn't remember before sending this patch.

For the avoidance of doubt, that would hold for arm
tests as well, right ?
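
For illustration, the header of such a test would look roughly like this.
The directive is the one the patch adds; the -O2 option and the test body
are made up here and are not taken from the actual test.

```cpp
/* { dg-do compile } */
/* { dg-require-effective-target fpic } */
/* { dg-options "-O2 -fPIC" } */

/* Hypothetical test body; any code that needs -fPIC would go here.  */
int global_counter;

int bump (void)
{
  return ++global_counter;
}
```

On a target where the fpic effective-target check fails, the test is
reported as unsupported instead of failing.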

(I'll be careful not to introduce obvious redundancy with e.g.
os=linux, as pointed out by Jakub earlier this week for another
set on i386).

Cheers,

Olivier



Re: [patch] i386 tests: Add dg-require-profiling to i386 tests using -pg

2020-11-06 Thread Olivier Hainque



> On 6 Nov 2020, at 04:56, Jeff Law  wrote:
> 
> OK

Thanks for your prompt feedback Jeff!



Re: Split up "gfortran.dg/goacc/loop-2.f95"

2020-11-06 Thread Thomas Schwinge
Hi!

On 2018-12-09T13:56:40+0100, I wrote:
> Committed to trunk in r266921:

> Split up "gfortran.dg/goacc/loop-2.f95"
>
> gcc/testsuite/
> * gfortran.dg/goacc/loop-2.f95: Split into...
> * gfortran.dg/goacc/loop-2-kernels-nested.f95: ... this new
> file...
> * gfortran.dg/goacc/loop-2-kernels-tile.f95: ..., and this new
> file...
> * gfortran.dg/goacc/loop-2-kernels.f95: ..., and this new file...
> * gfortran.dg/goacc/loop-2-parallel-3.f95: ..., and this new
> file...
> * gfortran.dg/goacc/loop-2-parallel-nested.f95: ..., and this new
> file...
> * gfortran.dg/goacc/loop-2-parallel-tile.f95: ..., and this new
> file...
> * gfortran.dg/goacc/loop-2-parallel.f95: ..., and this new file.

I recently noticed that we've got some duplication of testing here.  As
the above is in a more well-structured form, I've pushed "Remove
'gfortran.dg/goacc/loop-5.f95'" and "Remove
'gfortran.dg/goacc/loop-6.f95'" to master branch in commit
4dfa1789ab6560a69de22afe7982f372f598c5b8 and commit
52b74462176e4741ce1248c055e6bb1cb902c025, and backported to
releases/gcc-10 branch in commit 1288da82c0f239e81cc8474d320edb517a5754d1
and commit 594672c89dd4279fcf3b5a824d69b206ebf4b700, see attached.


Grüße
 Thomas


From 4dfa1789ab6560a69de22afe7982f372f598c5b8 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 27 Oct 2020 10:16:29 +0100
Subject: [PATCH 1/2] Remove 'gfortran.dg/goacc/loop-5.f95'

What it's testing is adequately covered in other
'gfortran.dg/goacc/loop-2-*-tile.f95' testcases.

	gcc/testsuite/
	* gfortran.dg/goacc/loop-5.f95: Remove.
---
 gcc/testsuite/gfortran.dg/goacc/loop-5.f95 | 357 -
 1 file changed, 357 deletions(-)
 delete mode 100644 gcc/testsuite/gfortran.dg/goacc/loop-5.f95

diff --git a/gcc/testsuite/gfortran.dg/goacc/loop-5.f95 b/gcc/testsuite/gfortran.dg/goacc/loop-5.f95
deleted file mode 100644
index d059cf7f377..000
--- a/gcc/testsuite/gfortran.dg/goacc/loop-5.f95
+++ /dev/null
@@ -1,357 +0,0 @@
-program test
-  implicit none
-  integer :: i, j
-
-  !$acc kernels
-!$acc loop auto
-DO i = 1,10
-ENDDO
-!$acc loop gang
-DO i = 1,10
-ENDDO
-!$acc loop gang(5)
-DO i = 1,10
-ENDDO
-!$acc loop gang(num:5)
-DO i = 1,10
-ENDDO
-!$acc loop gang(static:5)
-DO i = 1,10
-ENDDO
-!$acc loop gang(static:*)
-DO i = 1,10
-ENDDO
-!$acc loop gang
-DO i = 1,10
-  !$acc loop vector
-  DO j = 1,10
-  ENDDO
-  !$acc loop worker
-  DO j = 1,10
-  ENDDO
-ENDDO
-
-!$acc loop worker
-DO i = 1,10
-ENDDO
-!$acc loop worker(5)
-DO i = 1,10
-ENDDO
-!$acc loop worker(num:5)
-DO i = 1,10
-ENDDO
-!$acc loop worker
-DO i = 1,10
-  !$acc loop vector
-  DO j = 1,10
-  ENDDO
-ENDDO
-!$acc loop gang worker
-DO i = 1,10
-ENDDO
-
-!$acc loop vector
-DO i = 1,10
-ENDDO
-!$acc loop vector(5)
-DO i = 1,10
-ENDDO
-!$acc loop vector(length:5)
-DO i = 1,10
-ENDDO
-!$acc loop vector
-DO i = 1,10
-ENDDO
-!$acc loop gang vector
-DO i = 1,10
-ENDDO
-!$acc loop worker vector
-DO i = 1,10
-ENDDO
-
-!$acc loop auto
-DO i = 1,10
-ENDDO
-
-!$acc loop tile(1)
-DO i = 1,10
-ENDDO
-!$acc loop tile(2)
-DO i = 1,10
-ENDDO
-!$acc loop tile(6-2)
-DO i = 1,10
-ENDDO
-!$acc loop tile(6+2)
-DO i = 1,10
-ENDDO
-!$acc loop tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop tile(*, 1)
-DO i = 1,10
-  DO j = 1,10
-  ENDDO
-ENDDO
-!$acc loop tile(-1) ! { dg-warning "must be positive" }
-do i = 1,10
-enddo
-!$acc loop vector tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop worker tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop gang tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop vector gang tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop vector worker tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop gang worker tile(*)
-DO i = 1,10
-ENDDO
-  !$acc end kernels
-
-
-  !$acc parallel
-!$acc loop tile(1)
-DO i = 1,10
-ENDDO
-!$acc loop tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop tile(2)
-DO i = 1,10
-  DO j = 1,10
-  ENDDO
-ENDDO
-!$acc loop tile(-1) ! { dg-warning "must be positive" }
-do i = 1,10
-enddo
-!$acc loop vector tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop worker tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop gang tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop vector gang tile(*)
-DO i = 1,10
-ENDDO
-!$acc loop vector worker tile(*)

Re: Use existing middle end checking for Fortran OpenACC loop clauses

2020-11-06 Thread Thomas Schwinge
Hi!

On 2018-12-09T13:58:51+0100, I wrote:
> Committed to trunk in r266922:

> Use existing middle end checking for Fortran OpenACC loop clauses
>
> Don't duplicate in the Fortran front end what's generically being checked 
> in
> the middle end.
>
> gcc/fortran/
> * openmp.c (resolve_oacc_loop_blocks): Remove checking of OpenACC
> loop clauses.
> gcc/testsuite/
> * gfortran.dg/goacc/loop-2-kernels.f95: Update.
> * gfortran.dg/goacc/loop-2-parallel.f95: Likewise.
> * gfortran.dg/goacc/nested-parallelism.f90: Likewise.

Similar to that, I've noticed inconsistent diagnostics in C/C++ vs.
Fortran for OpenACC 'loop' clauses with arguments only allowed inside
OpenACC 'kernels' regions, so I pushed "[Fortran] Remove OpenACC 'loop'
inside 'parallel' special-case code" to master branch in commit
4c27f900950ed0ecb2897a8931c5cc348b1980be, and backported to
releases/gcc-10 in commit f41ca73aa11f28ad7d847ac5bf7e07f8bc763721, see
attached.


Grüße
 Thomas


From 4c27f900950ed0ecb2897a8931c5cc348b1980be Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 27 Oct 2020 10:43:27 +0100
Subject: [PATCH] [Fortran] Remove OpenACC 'loop' inside 'parallel'
 special-case code

Instead, use the generic middle-end code, like already used for Fortran OpenACC
'loop' inside other compute constructs, orphaned 'loop' constructs, and C, C++
generally.

	gcc/fortran/
	* openmp.c (oacc_is_parallel, resolve_oacc_params_in_parallel):
	Remove.
	(resolve_oacc_loop_blocks): Don't call the former.
	gcc/testsuite/
	* gfortran.dg/goacc/loop-2-parallel-3.f95: Adjust.
---
 gcc/fortran/openmp.c  | 37 ---
 .../gfortran.dg/goacc/loop-2-parallel-3.f95   | 24 ++--
 2 files changed, 12 insertions(+), 49 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 1891ac5591b..2270c858f39 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -6403,11 +6403,6 @@ resolve_omp_do (gfc_code *code)
 }
 }
 
-static bool
-oacc_is_parallel (gfc_code *code)
-{
-  return code->op == EXEC_OACC_PARALLEL || code->op == EXEC_OACC_PARALLEL_LOOP;
-}
 
 static gfc_statement
 omp_code_to_statement (gfc_code *code)
@@ -6666,26 +6661,6 @@ resolve_oacc_nested_loops (gfc_code *code, gfc_code* do_code, int collapse,
 }
 
 
-static void
-resolve_oacc_params_in_parallel (gfc_code *code, const char *clause,
- const char *arg)
-{
-  fortran_omp_context *c;
-
-  if (oacc_is_parallel (code))
-gfc_error ("!$ACC LOOP %s in PARALLEL region doesn't allow "
-	   "%s arguments at %L", clause, arg, &code->loc);
-  for (c = omp_current_ctx; c; c = c->previous)
-{
-  if (oacc_is_loop (c->code))
-	break;
-  if (oacc_is_parallel (c->code))
-	gfc_error ("!$ACC LOOP %s in PARALLEL region doesn't allow "
-		   "%s arguments at %L", clause, arg, &code->loc);
-}
-}
-
-
 static void
 resolve_oacc_loop_blocks (gfc_code *code)
 {
@@ -6697,18 +6672,6 @@ resolve_oacc_loop_blocks (gfc_code *code)
 gfc_error ("Tiled loop cannot be parallelized across gangs, workers and "
 	   "vectors at the same time at %L", &code->loc);
 
-  if (code->ext.omp_clauses->gang
-  && code->ext.omp_clauses->gang_num_expr)
-resolve_oacc_params_in_parallel (code, "GANG", "num");
-
-  if (code->ext.omp_clauses->worker
-  && code->ext.omp_clauses->worker_expr)
-resolve_oacc_params_in_parallel (code, "WORKER", "num");
-
-  if (code->ext.omp_clauses->vector
-  && code->ext.omp_clauses->vector_expr)
-resolve_oacc_params_in_parallel (code, "VECTOR", "length");
-
   if (code->ext.omp_clauses->tile_list)
 {
   gfc_expr_list *el;
diff --git a/gcc/testsuite/gfortran.dg/goacc/loop-2-parallel-3.f95 b/gcc/testsuite/gfortran.dg/goacc/loop-2-parallel-3.f95
index 03cae74c022..5379fba16ed 100644
--- a/gcc/testsuite/gfortran.dg/goacc/loop-2-parallel-3.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/loop-2-parallel-3.f95
@@ -5,52 +5,52 @@ program test
   integer :: i
 
   !$acc parallel
-!$acc loop gang(5) ! { dg-error "num arguments" }
+!$acc loop gang(5) ! { dg-error "argument not permitted" }
 DO i = 1,10
 ENDDO
 
-!$acc loop gang(num:5) ! { dg-error "num arguments" }
+!$acc loop gang(num:5) ! { dg-error "argument not permitted" }
 DO i = 1,10
 ENDDO
 
-!$acc loop worker(5) ! { dg-error "num arguments" }
+!$acc loop worker(5) ! { dg-error "argument not permitted" }
 DO i = 1,10
 ENDDO
 
-!$acc loop worker(num:5) ! { dg-error "num arguments" }
+!$acc loop worker(num:5) ! { dg-error "argument not permitted" }
 DO i = 1,10
 ENDDO
 
-!$acc loop vector(5) ! { dg-error "length arguments" }
+!$acc loop vector(5) ! { dg-error "argument not permitted" }
 DO i = 1

[PATCH][pushed] testsuite: fix malloc alignment in test

2020-11-06 Thread Martin Liška

Hi.

The patch fixes the testcase on ppc64. I'm going to push the commit.

Martin

gcc/testsuite/ChangeLog:

PR gcov-profile/97461
* gcc.dg/tree-prof/pr97461.c: Return aligned memory.
---
 gcc/testsuite/gcc.dg/tree-prof/pr97461.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-prof/pr97461.c 
b/gcc/testsuite/gcc.dg/tree-prof/pr97461.c
index 8d21a3ef421..213fac9af04 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/pr97461.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/pr97461.c
@@ -20,7 +20,13 @@ static const fun_t funs[2] = { f1, f2, };
 
 static void * malloc_impl(size_t size) {

 void * r = &memory[memory_p];
-memory_p += size;
+/* The malloc() and calloc() functions return a pointer to the allocated
+ * memory, which is suitably aligned for any built-in type.  Use 16
+ * bytes here as the basic alignment requirement for user-defined malloc
+ * and calloc.  See PR97594 for the details.  */
+#define ROUND_UP_FOR_16B_ALIGNMENT(x) ((x + 15) & (-16))
+
+memory_p += ROUND_UP_FOR_16B_ALIGNMENT(size);
 
 // force TOPN profile

 funs[size % 2]();
--
2.29.2
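
As a side note on the rounding used in the testcase: the sketch below
shows the same round-up-to-16 idea as a small function with a bump
allocator around it.  It is only an illustration, assuming C11
(_Alignas); the names round_up and bump_alloc and the 1024-byte pool
are invented here and are not part of pr97461.c.

```c
#include <assert.h>
#include <stddef.h>

/* Round SIZE up to the next multiple of ALIGN.
   ALIGN must be a non-zero power of two.  */
static size_t
round_up (size_t size, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (size + align - 1) & ~(align - 1);
}

/* Hypothetical static pool, itself aligned to 16 bytes.  */
static _Alignas (16) unsigned char memory[1024];
static size_t memory_p;

/* Bump allocator: every returned pointer is 16-byte aligned because
   the pool is aligned and each request is rounded up to 16.  */
static void *
bump_alloc (size_t size)
{
  size_t need = round_up (size, 16);
  /* Reject requests that overflowed while rounding or do not fit.  */
  if (need < size || need > sizeof memory - memory_p)
    return NULL;
  void *r = &memory[memory_p];
  memory_p += need;
  return r;
}
```

Writing the rounding as a function rather than a macro avoids the
usual precedence trap when the argument is an expression.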



c++: Parser tweaks

2020-11-06 Thread Nathan Sidwell


We need to adjust the wording for 'export'.  Between c++11 and c++20
it is deprecated.  Outside those ranges it is unsupported (at the
moment).  While here, there's also an unneeded setting of a bool --
it's inside an if block that just checked it was true.

gcc/cp/
* parser.c (cp_parser_template_declaration): Adjust 'export' 
warning.

(cp_parser_explicit_specialization): Remove unneeded bool setting.

pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/cp/parser.c w/gcc/cp/parser.c
index e7bfbf649a5..c948dc9d050 100644
--- i/gcc/cp/parser.c
+++ w/gcc/cp/parser.c
@@ -16031,8 +16031,13 @@ cp_parser_template_declaration (cp_parser* parser, bool member_p)
 {
   /* Consume the `export' token.  */
   cp_lexer_consume_token (parser->lexer);
-  /* Warn that we do not support `export'.  */
-  warning (0, "keyword %<export%> not implemented, and will be ignored");
+  /* Warn that this use of export is deprecated.  */
+  if (cxx_dialect < cxx11)
+	warning (0, "keyword %<export%> not implemented, and will be ignored");
+  else if (cxx_dialect < cxx20)
+	warning (0, "keyword %<export%> is deprecated, and is ignored");
+  else
+	warning (0, "keyword %<export%> not implemented, and will be ignored");
 }
 
   cp_parser_template_declaration_after_export (parser, member_p);
@@ -17753,7 +17758,6 @@ cp_parser_explicit_specialization (cp_parser* parser)
   /* Give it C++ linkage to avoid confusing other parts of the
 	 front end.  */
   push_lang_context (lang_name_cplusplus);
-  need_lang_pop = true;
 }
 
   /* Let the front end know that we are beginning a specialization.  */


[PATCH] tree-optimization/97706 - handle PHIs in pattern recog mask precision

2020-11-06 Thread Richard Biener
This adds handling of PHIs to mask precision compute which is
eventually needed to detect a bool pattern when the def chain
contains such a PHI node.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-11-06  Richard Biener  

PR tree-optimization/97706
* tree-vect-patterns.c (possible_vector_mask_operation_p):
PHIs are possible mask operations.
(vect_determine_mask_precision): Handle PHIs.
(vect_determine_precisions): Walk PHIs in BB analysis.

* gcc.dg/vect/bb-slp-pr97706.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr97706.c |  61 
 gcc/tree-vect-patterns.c   | 109 ++---
 2 files changed, 135 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr97706.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr97706.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr97706.c
new file mode 100644
index 000..228ae700e8c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr97706.c
@@ -0,0 +1,61 @@
+/* { dg-do compile } */
+
+_Bool arr[16];
+void bar();
+void foo(int n, char *p)
+{
+  _Bool b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, b10, b11, b12, b13, b14, b15;
+  do
+{
+  b0 = p[0] != 0;
+  b1 = p[1] != 0;
+  b2 = p[2] != 0;
+  b3 = p[3] != 0;
+  b4 = p[4] != 0;
+  b5 = p[5] != 0;
+  b6 = p[6] != 0;
+  b7 = p[7] != 0;
+  b8 = p[8] != 0;
+  b9 = p[9] != 0;
+  b10 = p[10] != 0;
+  b11 = p[11] != 0;
+  b12 = p[12] != 0;
+  b13 = p[13] != 0;
+  b14 = p[14] != 0;
+  b15 = p[15] != 0;
+  arr[0] = b0;
+  arr[1] = b1;
+  arr[2] = b2;
+  arr[3] = b3;
+  arr[4] = b4;
+  arr[5] = b5;
+  arr[6] = b6;
+  arr[7] = b7;
+  arr[8] = b8;
+  arr[9] = b9;
+  arr[10] = b10;
+  arr[11] = b11;
+  arr[12] = b12;
+  arr[13] = b13;
+  arr[14] = b14;
+  arr[15] = b15;
+  bar ();
+}
+  while (--n);
+  arr[0] = b0;
+  arr[1] = b1;
+  arr[2] = b2;
+  arr[3] = b3;
+  arr[4] = b4;
+  arr[5] = b5;
+  arr[6] = b6;
+  arr[7] = b7;
+  arr[8] = b8;
+  arr[9] = b9;
+  arr[10] = b10;
+  arr[11] = b11;
+  arr[12] = b12;
+  arr[13] = b13;
+  arr[14] = b14;
+  arr[15] = b15;
+}
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 47d9fce594f..eefa7cf6799 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -5007,6 +5007,8 @@ possible_vector_mask_operation_p (stmt_vec_info stmt_info)
  return TREE_CODE_CLASS (rhs_code) == tcc_comparison;
}
 }
+  else if (is_a <gphi *> (stmt_info->stmt))
+return true;
   return false;
 }
 
@@ -5049,41 +5051,63 @@ vect_determine_mask_precision (vec_info *vinfo, 
stmt_vec_info stmt_info)
  The number of operations are equal, but M16 would have given
  a shorter dependency chain and allowed more ILP.  */
   unsigned int precision = ~0U;
-  gassign *assign = as_a <gassign *> (stmt_info->stmt);
-  unsigned int nops = gimple_num_ops (assign);
-  for (unsigned int i = 1; i < nops; ++i)
+  if (gassign *assign = dyn_cast <gassign *> (stmt_info->stmt))
 {
-  tree rhs = gimple_op (assign, i);
-  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs)))
-   continue;
+  unsigned int nops = gimple_num_ops (assign);
+  for (unsigned int i = 1; i < nops; ++i)
+   {
+ tree rhs = gimple_op (assign, i);
+ if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs)))
+   continue;
 
-  stmt_vec_info def_stmt_info = vinfo->lookup_def (rhs);
-  if (!def_stmt_info)
-   /* Don't let external or constant operands influence the choice.
-  We can convert them to whichever vector type we pick.  */
-   continue;
+ stmt_vec_info def_stmt_info = vinfo->lookup_def (rhs);
+ if (!def_stmt_info)
+   /* Don't let external or constant operands influence the choice.
+  We can convert them to whichever vector type we pick.  */
+   continue;
+
+ if (def_stmt_info->mask_precision)
+   {
+ if (precision > def_stmt_info->mask_precision)
+   precision = def_stmt_info->mask_precision;
+   }
+   }
 
-  if (def_stmt_info->mask_precision)
+  /* If the statement compares two values that shouldn't use vector masks,
+try comparing the values as normal scalars instead.  */
+  tree_code rhs_code = gimple_assign_rhs_code (assign);
+  if (precision == ~0U
+ && TREE_CODE_CLASS (rhs_code) == tcc_comparison)
{
- if (precision > def_stmt_info->mask_precision)
-   precision = def_stmt_info->mask_precision;
+ tree rhs1_type = TREE_TYPE (gimple_assign_rhs1 (assign));
+ scalar_mode mode;
+ tree vectype, mask_type;
+ if (is_a <scalar_mode> (TYPE_MODE (rhs1_type), &mode)
+ && (vectype = get_vectype_for_scalar_type (vinfo, rhs1_type))
+ && (mask_type = get_mask_type_for_scalar_type (vinfo, rhs1_type))
+ && exp

[PATCH V2] arm: [testcase] Better narrow some bfloat16 testcase

2020-11-06 Thread Andrea Corallo via Gcc-patches
Christophe Lyon  writes:

> On Thu, 5 Nov 2020 at 15:30, Andrea Corallo  wrote:
>>
>> Christophe Lyon  writes:
>>
>> > On Thu, 5 Nov 2020 at 12:11, Andrea Corallo  wrote:
>> >>
>> >> Christophe Lyon  writes:
>> >>
>> >> [...]
>> >>
>> >> >> I think you need to add -mfloat-abi=hard to the dg-additional-options
>> >> >> otherwise vld1_lane_bf16_1.c
>> >> >> fails on targets with a soft float-abi default (eg arm-linux-gnueabi).
>> >> >>
>> >> >> See bf16_vldn_1.c.
>> >> >
>> >> > Actually that's not sufficient because in turn we get:
>> >> > /sysroot-arm-none-linux-gnueabi/usr/include/gnu/stubs.h:10:11: fatal
>> >> > error: gnu/stubs-hard.h: No such file or directory
>> >> >
>> >> > So you should check that -mfloat-abi=hard is supported.
>> >> >
>> >> > Ditto for the vst tests.
>> >>
>> >> Hi Christophe,
>> >>
>> >> this patch should implement your suggestions.
>> >>
>> >> On my arm-none-linux-gnueabi setup the tests were already skipped
>> >> as unsupported so if you could test and confirm this fixes the
>> >> issue you see would be great.
>> >
>> > Do you know why they are unsupported in your setup?
>>
>> We probably have a different GCC configuration.  Could you share how
>> yours is configured?
>>
> Sure, for instance:
> --target=arm-none-linux-gnueabi --with-float=soft --with-mode=arm
> --with-cpu=cortex-a9

Thanks, I see now what was going on, my gas has no bf16 support so the
test was marked as unsupported.  Dunno why I assumed
check_no_compiler_messages_nocache wasn't testing the whole compilation
process.

>> >> diff --git a/gcc/testsuite/lib/target-supports.exp 
>> >> b/gcc/testsuite/lib/target-supports.exp
>> >> index 15f0649f8ae..2ab7e39756d 100644
>> >> --- a/gcc/testsuite/lib/target-supports.exp
>> >> +++ b/gcc/testsuite/lib/target-supports.exp
>> >> @@ -5213,6 +5213,10 @@ proc 
>> >> check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } {
>> >>  return 0;
>> >>  }
>> >>
>> >> +if { ! [check_effective_target_arm_hard_ok] } {
>> >> + return 0;
>> >> +}
>> >> +
>> >> foreach flags {"" "-mfloat-abi=hard -mfpu=neon-fp-armv8" 
>> >> "-mfloat-abi=softfp -mfpu=neon-fp-armv8" } {
>> >> if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok 
>> >> object {
>> >> #include 
>> >
>> > This seems strange since you would now exit early if
>> > check_effective_target_arm_hard_ok is false, so you'll never need the
>> > -mfloat-abi=softfp version of the flags.
>>
>> So IIUC your suggestion would be to test with higher priority softfp and
>> in case we decide to go for hardfp make sure
>> check_effective_target_arm_hard_ok is satisfied.  Am I correct?
>>
> ISTM that other tests that need hardfp check if it's supported in the
> test, not in other effective targets.
>
> For instance mve/intrinsics/mve_fpu1.c
>
> I can see that quite a few tests that use -mfloat-abi=hard do not
> check whether it's supported. Those I checked do not include
> arm_neon.h and thus do not end up with the gnu/stubs-hard.h error
> above.

I see, thanks for the explanation.  The attached should do the job.

  Andrea

From 1fc3854d1cb48840d7b8db9fcf7b2997a25f35f4 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 5 Nov 2020 08:57:03 +
Subject: [PATCH] arm: [testcase] Better narrow some bfloat16 testcase

2020-11-05  Andrea Corallo  

* gcc.target/arm/simd/vld1_lane_bf16_1.c: Require target to
support and add -mfloat-abi=hard flag.
* gcc.target/arm/simd/vld1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vst1_lane_bf16_1.c: Likewise.
* gcc.target/arm/simd/vst1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c  | 3 ++-
 gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c  | 2 ++
 gcc/testsuite/gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c | 2 ++
 gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c  | 3 ++-
 gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c  | 2 ++
 gcc/testsuite/gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c | 2 ++
 6 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c
index fa4e45b7217..94fb38f32b8 100644
--- a/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c
+++ b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c
@@ -1,7 +1,8 @@
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
+/* { dg-require-effective-target arm_hard_ok } */
 /* { dg-add-options arm_v8_2a_bf16_neon } */
-/* { dg-additional-options "-O3 --save-temps" } */
+/* { dg-additional-options "-O3 --save-temps -mfloat-abi=hard" } */
 
 #include "arm_neon.h"
 
diff --git a/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vld1_

Re: [00/32] C++ 20 Modules

2020-11-06 Thread Boris Kolpackov
Nathan Sidwell  writes:

> The repo is providing a mechanism by which two processes can synchronize 
> on a fixed location in the file system that is not /.  You need such a 
> capability as the file system is the bulk transfer mechanism.
> 
> The alternatives are to always use absolute paths, or require the two 
> ends of the communication to have the same working directory [...]

Isn't the latter pretty much the norm for a build system that spawns
the compiler?


> The location of the repo is entirely under the mapper-server's control. 
> Set it to / if you want.

Except that now all my relative paths are relative to / and not CWD.

I find the current semantics heavily skewed towards the mapper operating
outside the build system (like the builtin mapper) while I expect most
non-toy/legacy build systems that wish to support C++ modules to have
an integrated mapper (build2 certainly does it this way). I think there
should at least be a way for the mapper to opt out of this repository
functionality.


Also, you mentioning synchronization reminded me of this part from
Invoking GCC/C++ Modules:

> When creating an output CMI any missing directory components are
> created in a manner that is safe for concurrent builds creating
> multiple, different, CMIs within a common subdirectory tree.
>
> CMI contents are written to a temporary file, which is then atomically
> renamed.  Observers will either see old contents (if there is an
> existing file), or complete new contents.  They will not observe the CMI
> during its creation.

This works atomically on POSIX but not on Windows. Also, from experience,
on Windows creating a temporary file and then renaming it often causes
more problems than creating it in the final destination from the outset.
That's because on Windows you cannot (re)move a file that is open by
another process. And there are various processes on Windows (anti-virus/
malware, indexers, IDEs, etc) that routinely scan the filesystem.
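
To make the scheme under discussion concrete, here is a minimal sketch
of the write-to-temporary-then-rename pattern, assuming a POSIX system
where rename() atomically replaces an existing destination.  The helper
name write_file_atomically and the ".tmp.<pid>" suffix are invented for
illustration; this is not the code GCC uses when writing CMIs.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write LEN bytes of DATA to PATH so that observers see either the
   old contents or the complete new contents, never a partial file.
   Returns 0 on success, -1 on failure.  */
static int
write_file_atomically (const char *path, const void *data, size_t len)
{
  char tmp[4096];
  int n = snprintf (tmp, sizeof tmp, "%s.tmp.%ld", path, (long) getpid ());
  if (n < 0 || (size_t) n >= sizeof tmp)
    return -1;

  /* Write the full contents under the temporary name first.  */
  FILE *f = fopen (tmp, "wb");
  if (!f)
    return -1;
  if (fwrite (data, 1, len, f) != len)
    {
      fclose (f);
      remove (tmp);
      return -1;
    }
  if (fclose (f) != 0)
    {
      remove (tmp);
      return -1;
    }

  /* Publish: on POSIX this replaces PATH in a single step.  */
  if (rename (tmp, path) != 0)
    {
      remove (tmp);
      return -1;
    }
  return 0;
}
```

The last step is exactly where the Windows concern applies: if another
process holds the destination (or the temporary) open, the rename can
fail, so a caller there would need a retry or a fallback path.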


[PATCH 4/6] Add documentation for dead field elimination

2020-11-06 Thread Erick Ochoa

From 015634bee522cf6224b0d4bcfd3adaf3a6a38fa0 Mon Sep 17 00:00:00 2001
From: Erick Ochoa 
Date: Mon, 10 Aug 2020 09:10:37 +0200
Subject: [PATCH 4/6] Add documentation for dead field elimination

2020-11-04  Erick Ochoa  

* gcc/Makefile.in: Add file to documentation sources
* gcc/doc/dfe.texi: New section
* gcc/doc/gccint.texi: Include new section
---
 gcc/Makefile.in |   3 +-
 gcc/doc/dfe.texi| 187 
 gcc/doc/gccint.texi |   2 +
 3 files changed, 191 insertions(+), 1 deletion(-)
 create mode 100644 gcc/doc/dfe.texi

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2184bd0fc3d..7e4c442416d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3275,7 +3275,8 @@ TEXI_GCCINT_FILES = gccint.texi gcc-common.texi 
gcc-vers.texi		\

 gnu.texi gpl_v3.texi fdl.texi contrib.texi languages.texi  \
 sourcebuild.texi gty.texi libgcc.texi cfg.texi tree-ssa.texi   \
 loop.texi generic.texi gimple.texi plugins.texi optinfo.texi   \
-match-and-simplify.texi analyzer.texi ux.texi poly-int.texi
+match-and-simplify.texi analyzer.texi ux.texi poly-int.texi\
+dfe.texi

 TEXI_GCCINSTALL_FILES = install.texi install-old.texi fdl.texi \
 gcc-common.texi gcc-vers.texi
diff --git a/gcc/doc/dfe.texi b/gcc/doc/dfe.texi
new file mode 100644
index 000..e8d01d817d3
--- /dev/null
+++ b/gcc/doc/dfe.texi
@@ -0,0 +1,187 @@
+@c Copyright (C) 2001 Free Software Foundation, Inc.
+@c This is part of the GCC manual.
+@c For copying conditions, see the file gcc.texi.
+
+@node Dead Field Elimination
+@chapter Dead Field Elimination
+
+@node Dead Field Elimination Internals
+@section Dead Field Elimination Internals
+
+@subsection Introduction
+
+Dead field elimination is a compiler transformation that removes fields 
from structs. There are several challenges to removing fields from 
structs at link time but, depending on the workload of the compiled 
program and the architecture where the program runs, dead field 
elimination might be a worthwhile transformation to apply. Generally 
speaking, when the bottle-neck of an application is given by the memory 
bandwidth of the host system and the memory requested is of a struct 
which can be reduced in size, then that combination of workload, program 
and architecture can benefit from applying dead field elimination. The 
benefits come from removing unnecessary fields from structures and thus 
reducing the memory/cache requirements to represent a structure.

+
+
+
+While challenges exist to fully automate a dead field elimination 
transformation, similar and more powerful optimizations have been 
implemented in the past. Chakrabarti et al [0] implement struct peeling, 
splitting into hot and cold parts of a structure, and field reordering. 
Golovanevsky et al [1] also show efforts to implement data layout 
optimizations at link time. Unlike the work of Chakrabarti and 
Golovanevsky, this text only talks about dead field elimination. This 
doesn't mean that the implementation can't be expanded to perform other 
link-time layout optimizations, it just means that dead field 
elimination is the only transformation that is implemented at the time 
of this writing.

+
+[0] Chakrabarti, Gautam, Fred Chow, and L. PathScale. "Structure layout 
optimizations in the open64 compiler: Design, implementation and 
measurements." Open64 Workshop at the International Symposium on Code 
Generation and Optimization. 2008.

+
+[1] Golovanevsky, Olga, and Ayal Zaks. "Struct-reorg: current status 
and future perspectives." Proceedings of the GCC Developers’ Summit. 2007.

+
+@subsection Overview
+
+The dead field implementation is structured in the following way:
+
+
+@itemize @bullet
+@item
+Collect all types which can refer to a @code{RECORD_TYPE}. This means 
that if we have a pointer to a record, we also collect this pointer. Or 
an array, or a union.

+@item
+Mark types as escaping. More of this in the following section.
+@item
+Find fields which can be deleted. (Iterate over all gimple code and 
find which fields are read.)

+@item
+Create new types with removed fields (and reference these types in 
pointers, arrays, etc.)

+@item
+Modify gimple to include these types.
+@end itemize
+
+
+Most of this code relies on the visitor pattern. Types, Expr, and 
Gimple statements are visited using this pattern. You can find the base 
classes in @file{type-walker.c} @file{expr-walker.c} and 
@file{gimple-walker.c}. There are assertions in place where a type, 
expr, or gimple code is encountered which has not been encountered 
before during the testing of this transformation. This facilitates 
fuzzing of the transformation.

+
+@subsubsection Implementation Details: Is a global variable escaping?
+
+How does the analysis determine whether a global variable is visible to 
code outside the current linking unit? In the file 
@file{gimple-escaper.c} we have a simple function called 
@co

[PATCH 2/6] Add Dead Field Elimination

2020-11-06 Thread Erick Ochoa

From 2cd94824269e94babedd2a963e4b9ee96889ec82 Mon Sep 17 00:00:00 2001
From: Erick Ochoa 
Date: Thu, 6 Aug 2020 14:07:20 +0200
Subject: [PATCH 2/6] Add Dead Field Elimination

Using the Dead Field Analysis, Dead Field Elimination
automatically transforms gimple to eliminate fields that
are never read.

2020-11-04  Erick Ochoa  

* gcc/Makefile.in: add file to list of sources
* gcc/ipa-dfe.c: New
* gcc/ipa-dfe.h: Same
* gcc/ipa-type-escape-analysis.h: Export code used in dfe.
* gcc/ipa-type-escape-analysis.c: Call transformation
---
 gcc/Makefile.in|1 +
 gcc/ipa-dfe.c  | 1283 
 gcc/ipa-dfe.h  |  247 ++
 gcc/ipa-type-escape-analysis.c |   22 +-
 gcc/ipa-type-escape-analysis.h |   10 +
 5 files changed, 1553 insertions(+), 10 deletions(-)
 create mode 100644 gcc/ipa-dfe.c
 create mode 100644 gcc/ipa-dfe.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 8b18c9217a2..8ef6047870b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1416,6 +1416,7 @@ OBJS = \
init-regs.o \
internal-fn.o \
ipa-type-escape-analysis.o \
+   ipa-dfe.o \
ipa-cp.o \
ipa-sra.o \
ipa-devirt.o \
diff --git a/gcc/ipa-dfe.c b/gcc/ipa-dfe.c
new file mode 100644
index 000..31a5066f1b5
--- /dev/null
+++ b/gcc/ipa-dfe.c
@@ -0,0 +1,1283 @@
+/* IPA Type Escape Analysis and Dead Field Elimination
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+  Contributed by Erick Ochoa 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Interprocedural dead field elimination (IPA-DFE)
+
+   The goal of this transformation is to
+
+   1) Create new types to replace RECORD_TYPEs which hold dead fields.
+   2) Substitute instances of old RECORD_TYPEs for new RECORD_TYPEs.
+   3) Substitute instances of old FIELD_DECLs for new FIELD_DECLs.
+   4) Fix some instances of pointer arithmetic.
+   5) Relayout where needed.
+
+   First stage - DFA
+   =
+
+   Use DFA to compute the set of FIELD_DECLs which can be deleted.
+
+   Second stage - Reconstruct Types
+   
+
+   This stage is done by two families of classes, the SpecificTypeCollector
+   and the TypeReconstructor.
+
+   The SpecificTypeCollector collects all TYPE_P trees which point to
+   RECORD_TYPE trees returned by DFA.  The TypeReconstructor will create
+   new RECORD_TYPE trees and new TYPE_P trees replacing the old RECORD_TYPE
+   trees with the new RECORD_TYPE trees.
+
+   Third stage - Substitute Types and Relayout
+   ===
+
+   This stage is handled by ExprRewriter and GimpleRewriter.
+   Some pointer arithmetic is fixed here to take into account those 
eliminated

+   FIELD_DECLS.
+ */
+
+#include "config.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple-expr.h"
+#include "predict.h"
+#include "alloc-pool.h"
+#include "tree-pass.h"
+#include "cgraph.h"
+#include "diagnostic.h"
+#include "fold-const.h"
+#include "gimple-fold.h"
+#include "symbol-summary.h"
+#include "tree-vrp.h"
+#include "ipa-prop.h"
+#include "tree-pretty-print.h"
+#include "tree-inline.h"
+#include "ipa-fnsummary.h"
+#include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "basic-block.h" //needed for gimple.h
+#include "function.h"//needed for gimple.h
+#include "gimple.h"
+#include "stor-layout.h"
+#include "cfg.h" // needed for gimple-iterator.h
+#include "gimple-iterator.h"
+#include "gimplify.h"  //unshare_expr
+#include "value-range.h"   // make_ssa_name dependency
+#include "tree-ssanames.h" // make_ssa_name
+#include "ssa.h"
+#include "tree-into-ssa.h"
+#include "gimple-ssa.h" // update_stmt
+#include "tree.h"
+#include "gimple-expr.h"
+#include "predict.h"
+#include "alloc-pool.h"
+#include "tree-pass.h"
+#include "cgraph.h"
+#include "diagnostic.h"
+#include "fold-const.h"
+#include "gimple-fold.h"
+#include "symbol-summary.h"
+#include "tree-vrp.h"
+#include "ipa-prop.h"
+#include "tree-pretty-print.h"
+#include "tree-inline.h"
+#include "ipa-fnsummary.h"
+#include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "tree-ssa-alias.h"
+#include "tree-s

[PATCH 0/6] Dead Field Elimination and Field Reorder

2020-11-06 Thread Erick Ochoa

Just submitting changes to the previously submitted patches.

* I have removed all non-essential flags I introduced
* I have placed the standard headers before config
* I have squashed some changes that I sent to the patches mailing list
  and made sure that the transformation bootstraps on every commit


[PATCH 6/6] Add heuristic to take into account void* pattern.

2020-11-06 Thread Erick Ochoa

From ccd82a7e484d9e4562c23f1b9cbebf3f47e2a822 Mon Sep 17 00:00:00 2001
From: Erick Ochoa 
Date: Fri, 16 Oct 2020 08:49:08 +0200
Subject: [PATCH 6/6] Add heuristic to take into account void* pattern.

We add a heuristic in order to be able to transform functions which
receive void* arguments as a way to generalize over arguments. An
example of this is qsort. The heuristic works by first inspecting
leaves in the call graph. If the leaves only contain a reference
to a single RECORD_TYPE then we color the nodes in the call graph
as "casts are safe in this function and it does not call externally
visible functions". We propagate this property up the call graph
until a fixed point is reached. This will later be changed to
use ipa-modref.
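
The propagation described above can be sketched outside of GCC as a
plain worklist algorithm. This is only an illustration under stated
assumptions: Node, add_call and propagate_safety are hypothetical names,
the graph is a vector of indices rather than GCC's cgraph_node, and the
sketch seeds the worklist with every node instead of starting from the
leaves.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a call-graph node; this is not GCC's cgraph_node.
struct Node
{
  bool locally_safe = true;          // result of the per-function check
  std::vector<std::size_t> callees;  // functions this node calls
  std::vector<std::size_t> callers;  // functions that call this node
};

// Record a call edge in both directions.
void
add_call (std::vector<Node> &graph, std::size_t caller, std::size_t callee)
{
  assert (caller < graph.size () && callee < graph.size ());
  graph[caller].callees.push_back (callee);
  graph[callee].callers.push_back (caller);
}

// A node stays "safe" only if its own body is safe and all of its callees
// are safe.  Colors only ever change from safe to unsafe, so the loop
// reaches a fixed point and terminates.
std::vector<bool>
propagate_safety (const std::vector<Node> &graph)
{
  std::vector<bool> safe (graph.size ());
  std::queue<std::size_t> worklist;
  for (std::size_t i = 0; i < graph.size (); ++i)
    {
      safe[i] = graph[i].locally_safe;
      worklist.push (i);
    }

  while (!worklist.empty ())
    {
      std::size_t n = worklist.front ();
      worklist.pop ();

      bool value = graph[n].locally_safe;
      for (std::size_t callee : graph[n].callees)
        value = value && safe[callee];

      if (value == safe[n])
        continue;  // nothing changed, so callers need no revisit

      safe[n] = value;
      for (std::size_t caller : graph[n].callers)
        worklist.push (caller);
    }
  return safe;
}
```

Callers are re-queued only when a node's color actually changes, which
is what bounds the amount of work.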

2020-11-04  Erick Ochoa  

* ipa-type-escape-analysis.c : Add new heuristic
* ipa-field-reorder.c : Use heuristic
* ipa-type-escape-analysis.h : Change signatures
---
 gcc/ipa-field-reorder.c|   3 +-
 gcc/ipa-type-escape-analysis.c | 182 +++--
 gcc/ipa-type-escape-analysis.h |  72 -
 3 files changed, 243 insertions(+), 14 deletions(-)

diff --git a/gcc/ipa-field-reorder.c b/gcc/ipa-field-reorder.c
index 9a28097b473..2f694cff7ea 100644
--- a/gcc/ipa-field-reorder.c
+++ b/gcc/ipa-field-reorder.c
@@ -588,8 +588,9 @@ lto_fr_execute ()
   log ("here in field reordering \n");
   // Analysis.
   detected_incompatible_syntax = false;
+  std::map whitelisted = get_whitelisted_nodes();
   tpartitions_t escaping_nonescaping_sets
-= partition_types_into_escaping_nonescaping ();
+= partition_types_into_escaping_nonescaping (whitelisted);
   record_field_map_t record_field_map = find_fields_accessed ();
   record_field_offset_map_t record_field_offset_map
 = obtain_nonescaping_unaccessed_fields (escaping_nonescaping_sets,
diff --git a/gcc/ipa-type-escape-analysis.c b/gcc/ipa-type-escape-analysis.c
index 40dc89c51a2..2fc504ce6f5 100644
--- a/gcc/ipa-type-escape-analysis.c
+++ b/gcc/ipa-type-escape-analysis.c
@@ -104,6 +104,7 @@ along with GCC; see the file COPYING3.  If not see
 #include 
 #include 
 #include 
+#include 

 #include "config.h"
 #include "system.h"
@@ -249,6 +250,99 @@ lto_dfe_execute ()
   return 0;
 }

+/* Heuristic to determine if casting is allowed in a function.
+ * This heuristic attempts to allow casting in functions which follow the
+ * pattern where a struct pointer or array pointer is cast to void* or
+ * char*.  The heuristic works as follows:
+ *
+ * There is a simple per-function analysis that determines whether there
+ * is more than 1 type of struct referenced in the body of the function.
+ * If there is more than 1 type of struct referenced in the body,
+ * then the structures referenced within the body cannot be cast.
+ * However, if there's only one type of struct referenced in the body
+ * of the function, casting is allowed in the function itself.
+ * The logic behind this is that if the code follows good programming
+ * practices, the only way the memory should be accessed is via a singular
+ * type.  There is also another requirement for this per-function analysis:
+ * the function can only call colored functions or functions
+ * which are available in the linking unit.
+ *
+ * Using this per-function analysis, we then start coloring leaf nodes in
+ * the call graph as ``safe'' or ``unsafe''.  The color is propagated to the
+ * callers of the functions until a fixed point is reached.
+ */
+std::map
+get_whitelisted_nodes ()
+{
+  cgraph_node *node = NULL;
+  std::set nodes;
+  std::set leaf_nodes;
+  std::set leaf_nodes_decl;
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+  {
+node->get_untransformed_body ();
+nodes.insert(node);
+if (node->callees) continue;
+
+leaf_nodes.insert (node);
+leaf_nodes_decl.insert (node->decl);
+  }
+
+  std::queue worklist;
+  for (std::set::iterator i = leaf_nodes.begin (),
+e = leaf_nodes.end (); i != e; ++i)
+  {
+if (dump_file) fprintf (dump_file, "is a leaf node %s\n", (*i)->name ());
+worklist.push (*i);
+  }
+
+  for (std::set::iterator i = nodes.begin (),
+e = nodes.end (); i != e; ++i)
+  {
+worklist.push (*i);
+  }
+
+  std::map map;
+  while (!worklist.empty ())
+  {
+
+if (detected_incompatible_syntax) return map;
+cgraph_node *i = worklist.front ();
+worklist.pop ();
+if (dump_file) fprintf (dump_file, "analyzing %s %p\n", i->name (), (void*)i);
+GimpleWhiteLister whitelister;
+whitelister._walk_cnode (i);
+bool no_external = whitelister.does_not_call_external_functions (i, map);
+bool before_in_map = map.find (i->decl) != map.end ();
+bool place_callers_in_worklist = !before_in_map;
+if (!before_in_map)
+{
+  map.insert(std::pair(i->decl, no_external));
+} else
+{
+  map[i->decl] = no_external;
+}
+bool previous_value = map[i->decl];
+place_callers_in_worklist |= previous_value != no_external;

[PATCH 5/6] Abort if Gimple from C++ or Fortran sources is found.

2020-11-06 Thread Erick Ochoa

From f5dbfa73962d5443013d0193b2f91ea112a6d2d1 Mon Sep 17 00:00:00 2001
From: Erick Ochoa 
Date: Sun, 30 Aug 2020 10:21:35 +0200
Subject: [PATCH 5/6] Abort if Gimple from C++ or Fortran sources is found.

2020-11-04  Erick Ochoa  

* gcc/ipa-field-reorder: Add flag to exit transformation
* gcc/ipa-type-escape-analysis: Same
---
 gcc/ipa-field-reorder.c|  3 +-
 gcc/ipa-type-escape-analysis.c | 53 --
 gcc/ipa-type-escape-analysis.h |  2 ++
 3 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/gcc/ipa-field-reorder.c b/gcc/ipa-field-reorder.c
index 4c1ddc6d0e3..9a28097b473 100644
--- a/gcc/ipa-field-reorder.c
+++ b/gcc/ipa-field-reorder.c
@@ -587,6 +587,7 @@ lto_fr_execute ()
 {
   log ("here in field reordering \n");
   // Analysis.
+  detected_incompatible_syntax = false;
   tpartitions_t escaping_nonescaping_sets
 = partition_types_into_escaping_nonescaping ();
   record_field_map_t record_field_map = find_fields_accessed ();
@@ -594,7 +595,7 @@ lto_fr_execute ()
 = obtain_nonescaping_unaccessed_fields (escaping_nonescaping_sets,
record_field_map, 0);

-  if (record_field_offset_map.empty ())
+  if (detected_incompatible_syntax || record_field_offset_map.empty ())
 return 0;

   // Prepare for transformation.
diff --git a/gcc/ipa-type-escape-analysis.c b/gcc/ipa-type-escape-analysis.c
index aec7b924533..40dc89c51a2 100644
--- a/gcc/ipa-type-escape-analysis.c
+++ b/gcc/ipa-type-escape-analysis.c
@@ -171,6 +171,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-type-escape-analysis.h"
 #include "ipa-dfe.h"

+#define ABORT_IF_NOT_C true
+
+bool detected_incompatible_syntax = false;
+
 // Main function that drives dfe.
 static unsigned int
 lto_dfe_execute ();
@@ -256,13 +260,14 @@ static void
 lto_dead_field_elimination ()
 {
   // Analysis.
+  detected_incompatible_syntax = false;
   tpartitions_t escaping_nonescaping_sets
 = partition_types_into_escaping_nonescaping ();
   record_field_map_t record_field_map = find_fields_accessed ();
   record_field_offset_map_t record_field_offset_map
 = obtain_nonescaping_unaccessed_fields (escaping_nonescaping_sets,
record_field_map, OPT_Wdfa);
-  if (record_field_offset_map.empty ())
+  if (detected_incompatible_syntax || record_field_offset_map.empty ())
 return;

 // Prepare for transformation.
@@ -581,6 +586,7 @@ TypeWalker::_walk (tree type)
   // Improve, verify that having a type is an invariant.
   // I think there was a specific example which didn't
   // allow for it
+  if (detected_incompatible_syntax) return;
   if (!type)
 return;

@@ -634,9 +640,9 @@ TypeWalker::_walk (tree type)
 case POINTER_TYPE:
   this->walk_POINTER_TYPE (type);
   break;
-case REFERENCE_TYPE:
-  this->walk_REFERENCE_TYPE (type);
-  break;
+//case REFERENCE_TYPE:
+//  this->walk_REFERENCE_TYPE (type);
+//  break;
 case ARRAY_TYPE:
   this->walk_ARRAY_TYPE (type);
   break;
@@ -646,18 +652,24 @@ TypeWalker::_walk (tree type)
 case FUNCTION_TYPE:
   this->walk_FUNCTION_TYPE (type);
   break;
-case METHOD_TYPE:
-  this->walk_METHOD_TYPE (type);
-  break;
+//case METHOD_TYPE:
+  //this->walk_METHOD_TYPE (type);
+  //break;
 // Since we are dealing only with C at the moment,
 // we don't care about QUAL_UNION_TYPE nor LANG_TYPEs
 // So fail early.
+case REFERENCE_TYPE:
+case METHOD_TYPE:
 case QUAL_UNION_TYPE:
 case LANG_TYPE:
 default:
   {
log ("missing %s\n", get_tree_code_name (code));
+#ifdef ABORT_IF_NOT_C
+   detected_incompatible_syntax = true;
+#else
gcc_unreachable ();
+#endif
   }
   break;
 }
@@ -840,6 +852,7 @@ TypeWalker::_walk_arg (tree t)
 void
 ExprWalker::walk (tree e)
 {
+  if (detected_incompatible_syntax) return;
   _walk_pre (e);
   _walk (e);
   _walk_post (e);
@@ -924,7 +937,11 @@ ExprWalker::_walk (tree e)
 default:
   {
log ("missing %s\n", get_tree_code_name (code));
+#ifdef ABORT_IF_NOT_C
+   detected_incompatible_syntax = true;
+#else
gcc_unreachable ();
+#endif
   }
   break;
 }
@@ -1157,6 +1174,7 @@ GimpleWalker::walk ()
   cgraph_node *node = NULL;
   FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
 {
+  if (detected_incompatible_syntax) return;
   node->get_untransformed_body ();
   tree decl = node->decl;
   gcc_assert (decl);
@@ -1403,7 +1421,11 @@ GimpleWalker::_walk_gimple (gimple *stmt)
   // Break if something is unexpected.
   const char *name = gimple_code_name[code];
   log ("gimple code name %s\n", name);
+#ifdef ABORT_IF_NOT_C
+  detected_incompatible_syntax = true;
+#else
   gcc_unreachable ();
+#endif
 }

 void
@@ -2935,6 +2957,8 @@ TypeStringifier::stringify (tree t)
 return std::string ("");
   _stringification.clear ();
   gcc_assert (t);

[PATCH 3/6] Add Field Reordering

2020-11-06 Thread Erick Ochoa

From 72e6ea57b04ca2bf223faef262b478dc407cdca7 Mon Sep 17 00:00:00 2001
From: Erick Ochoa 
Date: Sun, 9 Aug 2020 10:22:49 +0200
Subject: [PATCH 3/6] Add Field Reordering

Field reordering of structs at link-time
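
As a rough illustration of why reordering fields can pay off, the
sketch below compares two layouts of the same three fields. The struct
names and the helper padding_saved are hypothetical and are not part of
the patch. The exact sizes depend on the target ABI; on a typical LP64
target the first layout usually takes 24 bytes and the second 16.

```cpp
#include <cassert>
#include <cstddef>

// Declaration order as a programmer might write it: each char is
// followed by padding so that the double and the struct stay aligned.
struct Original
{
  char tag;
  double value;
  char flag;
};

// Same fields, largest alignment first: the two chars share one
// padded tail.
struct Reordered
{
  double value;
  char tag;
  char flag;
};

static_assert (sizeof (Reordered) <= sizeof (Original),
               "reordering should never make this layout larger");

// Number of bytes saved per object by the reordered layout.
std::size_t
padding_saved ()
{
  assert (sizeof (Original) >= sizeof (Reordered));
  return sizeof (Original) - sizeof (Reordered);
}
```

A compiler may only do this on its own when it can show the type does
not escape, which is what the escape analysis in this series is for.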

2020-11-04  Erick Ochoa  

* gcc/Makefile.in: add new file to list of sources
* gcc/common.opt: add new flag for field reordering
* gcc/passes.def: add new pass
* gcc/tree-pass.h: same
* gcc/ipa-field-reorder.c: New file
* gcc/ipa-type-escape-analysis.c: Export common functions
* gcc/ipa-type-escape-analysis.h: Same
---
 gcc/Makefile.in|   1 +
 gcc/common.opt |   4 +
 gcc/ipa-dfe.c  |  86 -
 gcc/ipa-dfe.h  |  26 +-
 gcc/ipa-field-reorder.c| 622 +
 gcc/ipa-type-escape-analysis.c |  44 ++-
 gcc/ipa-type-escape-analysis.h |  12 +-
 gcc/passes.def |   1 +
 gcc/tree-pass.h|   2 +
 9 files changed, 749 insertions(+), 49 deletions(-)
 create mode 100644 gcc/ipa-field-reorder.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 8ef6047870b..2184bd0fc3d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1417,6 +1417,7 @@ OBJS = \
internal-fn.o \
ipa-type-escape-analysis.o \
ipa-dfe.o \
+   ipa-field-reorder.o \
ipa-cp.o \
ipa-sra.o \
ipa-devirt.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 85351738a29..7885d0f5c0c 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3468,4 +3468,8 @@ Wdfa
 Common Var(warn_dfa) Init(1) Warning
 Warn about dead fields at link time.

+fipa-field-reorder
+Common Report Var(flag_ipa_field_reorder) Optimization
+Reorder fields.
+
 ; This comment is to ensure we retain the blank line above.
diff --git a/gcc/ipa-dfe.c b/gcc/ipa-dfe.c
index 31a5066f1b5..5602ee667d4 100644
--- a/gcc/ipa-dfe.c
+++ b/gcc/ipa-dfe.c
@@ -185,7 +185,7 @@ get_types_replacement (record_field_offset_map_t record_field_offset_map,
 {
   TypeStringifier stringifier;

-  TypeReconstructor reconstructor (record_field_offset_map);
+  TypeReconstructor reconstructor (record_field_offset_map, "reorg");
   for (std::set::const_iterator i = to_modify.begin (),
e = to_modify.end ();
i != e; ++i)
@@ -245,9 +245,9 @@ get_types_replacement (record_field_offset_map_t record_field_offset_map,
  */
 void
 substitute_types_in_program (reorg_record_map_t map,
-reorg_field_map_t field_map)
+reorg_field_map_t field_map, bool _delete)
 {
-  GimpleTypeRewriter rewriter (map, field_map);
+  GimpleTypeRewriter rewriter (map, field_map, _delete);
   rewriter.walk ();
   rewriter._rewrite_function_decl ();
 }
@@ -361,8 +361,11 @@ TypeReconstructor::set_is_not_modified_yet (tree t)
 return;

   tree type = _reorg_map[tt];
-  const bool is_modified
+  bool is_modified
 = strstr (TypeStringifier::get_type_identifier (type).c_str (), ".reorg");
+  is_modified
+|= (bool) strstr (TypeStringifier::get_type_identifier (type).c_str (),
+ ".reorder");
   if (!is_modified)
 return;

@@ -408,14 +411,20 @@ TypeReconstructor::is_memoized (tree t)
   return already_changed;
 }

-static tree
-get_new_identifier (tree type)
+const char *
+TypeReconstructor::get_new_suffix ()
+{
+  return _suffix;
+}
+
+tree
+get_new_identifier (tree type, const char *suffix)
 {
   const char *identifier = TypeStringifier::get_type_identifier (type).c_str ();
-  const bool is_new_type = strstr (identifier, "reorg");
+  const bool is_new_type = strstr (identifier, suffix);
   gcc_assert (!is_new_type);
   char *new_name;
-  asprintf (&new_name, "%s.reorg", identifier);
+  asprintf (&new_name, "%s.%s", identifier, suffix);
   return get_identifier (new_name);
 }

@@ -471,7 +480,9 @@ TypeReconstructor::_walk_ARRAY_TYPE_post (tree t)
   TREE_TYPE (copy) = build_variant_type_copy (TREE_TYPE (copy));
   copy = is_modified ? build_distinct_type_copy (copy) : copy;
   TREE_TYPE (copy) = is_modified ? _reorg_map[TREE_TYPE (t)] : TREE_TYPE (copy);
-  TYPE_NAME (copy) = is_modified ? get_new_identifier (copy) : TYPE_NAME (copy);
+  TYPE_NAME (copy) = is_modified
+  ? get_new_identifier (copy, this->get_new_suffix ())
+  : TYPE_NAME (copy);
   // This is useful so that we go again through type layout
   TYPE_SIZE (copy) = is_modified ? NULL : TYPE_SIZE (copy);
   tree domain = TYPE_DOMAIN (t);
@@ -524,7 +535,9 @@ TypeReconstructor::_walk_POINTER_TYPE_post (tree t)

   copy = is_modified ? build_variant_type_copy (copy) : copy;
   TREE_TYPE (copy) = is_modified ? _reorg_map[TREE_TYPE (t)] : TREE_TYPE (copy);
-  TYPE_NAME (copy) = is_modified ? get_new_identifier (copy) : TYPE_NAME (copy);
+  TYPE_NAME (copy) = is_modified
+  ? get_new_identifier (copy, this->get_new_suffix ())
+  : TYPE_NAME (copy);
   TYPE_CACHED_V

[PATCH] make PRE constant value IDs negative

2020-11-06 Thread Richard Biener
This separates constant and non-constant value-ids to allow for
a more efficient constant_value_id_p and for more efficient bit-packing
inside the bitmap sets which never contain any constant values.
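
The idea of splitting the two kinds of IDs by sign can be sketched in
isolation as follows. This is not the code from the patch: value_table
and its members are hypothetical, and the starting IDs (1 and -1) are
an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical table that keeps constant and non-constant values apart.
// Non-constant IDs are positive, constant IDs are negative, so the
// "is it a constant?" test is a single sign check.
struct value_table
{
  std::vector<int> regular;   // indexed by id, for id > 0
  std::vector<int> constant;  // indexed by -id, for id < 0
  int next_regular_id = 1;
  int next_constant_id = -1;

  static bool is_constant (int id) { return id < 0; }

  int alloc (bool constant_p)
  {
    return constant_p ? next_constant_id-- : next_regular_id++;
  }

  // Store EXPR for ID, growing the matching table on demand.
  void record (int id, int expr)
  {
    assert (id != 0);
    std::vector<int> &table = is_constant (id) ? constant : regular;
    std::size_t index = static_cast<std::size_t> (is_constant (id) ? -id : id);
    if (index >= table.size ())
      table.resize (index + 1, 0);
    table[index] = expr;
  }

  // Return the expression recorded for ID, or 0 if there is none.
  int lookup (int id) const
  {
    const std::vector<int> &table = is_constant (id) ? constant : regular;
    std::size_t index = static_cast<std::size_t> (is_constant (id) ? -id : id);
    return index < table.size () ? table[index] : 0;
  }
};
```

Because constant IDs never appear in the positive index space, a dense
set over non-constant IDs wastes no bits on constants.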

There are further optimization opportunities, but at this stage
I'll do small refactorings.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

2020-11-06  Richard Biener  

* tree-ssa-sccvn.h (get_max_constant_value_id): Declare.
(get_next_constant_value_id): Likewise.
(value_id_constant_p): Inline and simplify.
* tree-ssa-sccvn.c (constant_value_ids): Remove.
(next_constant_value_id): Add.
(get_or_alloc_constant_value_id): Adjust.
(value_id_constant_p): Remove definition.
(get_max_constant_value_id): Define.
(get_next_value_id): Add assert for overflow.
(get_next_constant_value_id): Define.
(run_rpo_vn): Adjust.
(free_rpo_vn): Likewise.
(do_rpo_vn): Initialize next_constant_value_id.
* tree-ssa-pre.c (constant_value_expressions): New.
(add_to_value): Split into constant/non-constant value
handling.  Avoid exact re-allocation.
(vn_valnum_from_value_id): Adjust.
(phi_translate_1): Remove spurious exact re-allocation.
(bitmap_find_leader): Adjust.  Make sure we return
a CONSTANT value for a constant value id.
(do_pre_regular_insertion): Use 2 auto-elements for avail.
(do_pre_partial_partial_insertion): Likewise.
(init_pre): Allocate constant_value_expressions.
(fini_pre): Release constant_value_expressions.
---
 gcc/tree-ssa-pre.c   | 57 
 gcc/tree-ssa-sccvn.c | 34 --
 gcc/tree-ssa-sccvn.h | 12 +-
 3 files changed, 69 insertions(+), 34 deletions(-)

diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 39c52c9b0f0..65e8aaaca02 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -444,6 +444,9 @@ public:
 
 /* Mapping from value id to expressions with that value_id.  */
 static vec value_expressions;
+/* ???  We want to just record a single expression for each constant
+   value, one of kind CONSTANT.  */
+static vec constant_value_expressions;
 
 /* Sets that we need to keep track of.  */
 typedef struct bb_bitmap_sets
@@ -624,18 +627,30 @@ add_to_value (unsigned int v, pre_expr e)
 
   gcc_checking_assert (get_expr_value_id (e) == v);
 
-  if (v >= value_expressions.length ())
+  if (value_id_constant_p (v))
 {
-  value_expressions.safe_grow_cleared (v + 1, true);
-}
+  if (-v >= constant_value_expressions.length ())
+   constant_value_expressions.safe_grow_cleared (-v + 1);
 
-  set = value_expressions[v];
-  if (!set)
-{
-  set = BITMAP_ALLOC (&grand_bitmap_obstack);
-  value_expressions[v] = set;
+  set = constant_value_expressions[-v];
+  if (!set)
+   {
+ set = BITMAP_ALLOC (&grand_bitmap_obstack);
+ constant_value_expressions[-v] = set;
+   }
 }
+  else
+{
+  if (v >= value_expressions.length ())
+   value_expressions.safe_grow_cleared (v + 1);
 
+  set = value_expressions[v];
+  if (!set)
+   {
+ set = BITMAP_ALLOC (&grand_bitmap_obstack);
+ value_expressions[v] = set;
+   }
+}
   bitmap_set_bit (set, get_or_alloc_expression_id (e));
 }
 
@@ -687,7 +702,11 @@ vn_valnum_from_value_id (unsigned int val)
 {
   bitmap_iterator bi;
   unsigned int i;
-  bitmap exprset = value_expressions[val];
+  bitmap exprset;
+  if (value_id_constant_p (val))
+exprset = constant_value_expressions[-val];
+  else
+exprset = value_expressions[val];
   EXECUTE_IF_SET_IN_BITMAP (exprset, 0, i, bi)
 {
   pre_expr vexpr = expression_for_id (i);
@@ -1451,8 +1470,6 @@ phi_translate_1 (bitmap_set_t dest,
else
  {
new_val_id = get_next_value_id ();
-   value_expressions.safe_grow_cleared (get_max_value_id () + 1,
-true);
nary = vn_nary_op_insert_pieces (newnary->length,
 newnary->opcode,
 newnary->type,
@@ -1603,11 +1620,7 @@ phi_translate_1 (bitmap_set_t dest,
else
  {
if (changed || !same_valid)
- {
-   new_val_id = get_next_value_id ();
-   value_expressions.safe_grow_cleared
- (get_max_value_id () + 1, true);
- }
+ new_val_id = get_next_value_id ();
else
  new_val_id = ref->value_id;
if (!newoperands.exists ())
@@ -1745,7 +1758,7 @@ bitmap_find_leader (bitmap_set_t set, unsigned int val)
 {
   unsigned int i;
   bitmap_iterator bi;
-  bitmap exprset = value_expressions[val];
+  bitmap exprset = con

Re: [PATCH]ira: recompute regstat as max_regno changes [PR97705]

2020-11-06 Thread Vladimir Makarov via Gcc-patches



On 2020-11-06 1:15 a.m., Kewen.Lin wrote:

Hi,

As PR97705 shows, my commit r11-4637 caused dump comparison
differences in the ira pass.  It exposed an issue with the newly
introduced function remove_scratches, which can increase the
largest pseudo register number when it succeeds.  Later, some
functions call max_reg_num () to get the latest max_regno and
iterate up to it, accessing data structures that were allocated
for the previous max_regno.  This can cause out-of-bounds array
accesses, and the failure can be random since the values beyond
the array are arbitrary.

This patch frees, reinitializes and recomputes the relevant data
structures, namely regstat_n_sets_and_refs and reg_info_p,
to ensure we won't access beyond the array bounds.
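
The failure mode and the fix can be modelled in a few lines that do not
depend on GCC. Everything here is hypothetical (reg_stats,
remove_scratches_sketch, run_pass_sketch); it only shows a table sized
by an old register count being rebuilt, via a small lambda, once the
count has grown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-register statistics table, sized by the register
// count that was current when it was last (re)computed.
struct reg_stats
{
  std::vector<int> n_refs;
};

// Hypothetical stand-in for a step that may create new pseudo
// registers.  Returns the new register count.
int
remove_scratches_sketch (int max_regno, int new_pseudos)
{
  return max_regno + new_pseudos;
}

// Returns the number of registers the statistics cover after the pass.
std::size_t
run_pass_sketch (int initial_regs, int new_pseudos)
{
  assert (initial_regs >= 0 && new_pseudos >= 0);
  reg_stats stats;

  // Free and recompute everything that is indexed by register number.
  auto recompute = [&stats] (int max_regno) {
    stats.n_refs.assign (static_cast<std::size_t> (max_regno), 0);
  };

  recompute (initial_regs);
  int max_regno = remove_scratches_sketch (initial_regs, new_pseudos);

  // Without this step the loop below would index past the old table.
  if (static_cast<std::size_t> (max_regno) != stats.n_refs.size ())
    recompute (max_regno);

  for (int regno = 0; regno < max_regno; ++regno)
    stats.n_refs.at (regno) += 1;  // .at () throws on out-of-bounds access

  return stats.n_refs.size ();
}
```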

Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.

Any thoughts?  Is it a reasonable fix?

Sure, Kewen.  It is a bit unexpected to see a lambda used for this, but I
checked and found a couple of places in GCC where lambdas are already used.


The patch is ok.  Please, commit it to the mainline.

Thank you for the patch.



Re: [PATCH] arm: Implement vceqq_p64, vceqz_p64 and vceqzq_p64 intrinsics

2020-11-06 Thread Christophe Lyon via Gcc-patches
On Thu, 5 Nov 2020 at 12:55, Christophe Lyon  wrote:
>
> On Thu, 5 Nov 2020 at 10:36, Kyrylo Tkachov  wrote:
> >
> > Hi Christophe,
> >
> > > -Original Message-
> > > From: Gcc-patches  On Behalf Of
> > > Christophe Lyon via Gcc-patches
> > > Sent: 15 October 2020 18:23
> > > To: gcc-patches@gcc.gnu.org
> > > Subject: [PATCH] arm: Implement vceqq_p64, vceqz_p64 and vceqzq_p64
> > > intrinsics
> > >
> > > This patch adds implementations for vceqq_p64, vceqz_p64 and
> > > vceqzq_p64 intrinsics.
> > >
> > > vceqq_p64 uses the existing vceq_p64 after splitting the input vectors
> > > into their high and low halves.
> > >
> > > vceqz[q] simply call the vceq and vceqq with a second argument equal
> > > to zero.
> > >
> > > The added (executable) testcases make sure that the poly64x2_t
> > > variants have results with one element of all zeroes (false) and the
> > > other element with all bits set to one (true).
> > >
> > > 2020-10-15  Christophe Lyon  
> > >
> > >   gcc/
> > >   * config/arm/arm_neon.h (vceqz_p64, vceqq_p64, vceqzq_p64):
> > > New.
> > >
> > >   gcc/testsuite/
> > >   * gcc.target/aarch64/advsimd-intrinsics/p64_p128.c: Add tests for
> > >   vceqz_p64, vceqq_p64 and vceqzq_p64.
> > > ---
> > >  gcc/config/arm/arm_neon.h  | 31 +++
> > >  .../aarch64/advsimd-intrinsics/p64_p128.c  | 46
> > > +-
> > >  2 files changed, 76 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
> > > index aa21730..f7eff37 100644
> > > --- a/gcc/config/arm/arm_neon.h
> > > +++ b/gcc/config/arm/arm_neon.h
> > > @@ -16912,6 +16912,37 @@ vceq_p64 (poly64x1_t __a, poly64x1_t __b)
> > >return vreinterpret_u64_u32 (__m);
> > >  }
> > >
> > > +__extension__ extern __inline uint64x1_t
> > > +__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
> > > +vceqz_p64 (poly64x1_t __a)
> > > +{
> > > +  poly64x1_t __b = vreinterpret_p64_u32 (vdup_n_u32 (0));
> > > +  return vceq_p64 (__a, __b);
> > > +}
> >
> > This approach is okay, but can we have some kind of test to confirm it 
> > generates the VCEQ instruction with immediate zero rather than having a 
> > separate DUP...
>
> I had checked that manually, but I'll add a test.
> However, I have noticed that although vceqz_p64 uses vceq.i32 dX, dY, #0,
> the vceqzq_p64 version below first sets
> vmov dZ, #0
> and then emits two
> vceq dX, dY, dZ
>
> I'm looking at why this happens.
>

Hi,

Here is an updated version, which adds two tests (arm/simd/vceqz_p64.c
and arm/simd/vceqzq_p64.c).
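
For readers without the NEON headers at hand, the lane semantics the
quoted patch description relies on can be modelled in portable C++.
This is a scalar sketch only: lane64x2, ceq_lane, ceqq, ceqz_lane and
ceqzq are hypothetical names, not the arm_neon.h intrinsics, and the
model says nothing about which instructions get generated.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of a vector with two 64-bit lanes.
using lane64x2 = std::array<std::uint64_t, 2>;

// One lane: all bits set when equal (true), all bits clear otherwise.
std::uint64_t
ceq_lane (std::uint64_t a, std::uint64_t b)
{
  return a == b ? UINT64_MAX : 0;
}

// Two lanes: compare the low and high halves independently, as the
// patch description does by splitting the input vectors.
lane64x2
ceqq (const lane64x2 &a, const lane64x2 &b)
{
  return {{ceq_lane (a[0], b[0]), ceq_lane (a[1], b[1])}};
}

// The "z" variants are the same comparisons against zero.
std::uint64_t
ceqz_lane (std::uint64_t a)
{
  return ceq_lane (a, 0);
}

lane64x2
ceqzq (const lane64x2 &a)
{
  return ceqq (a, {{0, 0}});
}
```

This mirrors what the executable tests check: one lane all zeroes and
the other with all bits set.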

The vceqzq_p64 test does not currently expect instructions with
immediate zero, because we generate:
        vmov.i32        q9, #0  @ v4si
        [...]
        vceq.i32        d16, d16, d19
        vceq.i32        d17, d17, d19

Looking at the traces, I can see this in reload:
(insn 19 8 15 2 (set (reg:V2SI 48 d16 [orig:128 _18 ] [128])
(neg:V2SI (eq:V2SI (reg:V2SI 48 d16 [orig:139 v1 ] [139])
(reg:V2SI 54 d19 [ _5+8 ]
"/home/christophe.lyon/src/GCC/builds/gcc-fsf-git-neon-intrinsics/tools/lib/gcc/arm-none-linux-gnueabihf/11.0.0/include/arm_neon.h":2404:22
1650 {neon_vceqv2si_insn}
 (expr_list:REG_EQUAL (neg:V2SI (eq:V2SI (subreg:V2SI (reg:DI 48
d16 [orig:139 v1 ] [139]) 0)
(const_vector:V2SI [
(const_int 0 [0]) repeated x2
])))
(nil)))
(insn 15 19 20 2 (set (reg:V2SI 50 d17 [orig:121 _11 ] [121])
(neg:V2SI (eq:V2SI (reg:V2SI 50 d17 [orig:141 v2 ] [141])
(reg:V2SI 54 d19 [ _5+8 ]
"/home/christophe.lyon/src/GCC/builds/gcc-fsf-git-neon-intrinsics/tools/lib/gcc/arm-none-linux-gnueabihf/11.0.0/include/arm_neon.h":2404:22
1650 {neon_vceqv2si_insn}
 (expr_list:REG_EQUAL (neg:V2SI (eq:V2SI (subreg:V2SI (reg:DI 50
d17 [orig:141 v2 ] [141]) 0)
(const_vector:V2SI [
(const_int 0 [0]) repeated x2
])))
(nil)))

but it says:
 Choosing alt 0 in insn 19:  (0) =w  (1) w  (2) w {neon_vceqv2si_insn}
  alt=0,overall=0,losers=0,rld_nregs=0
 Choosing alt 0 in insn 15:  (0) =w  (1) w  (2) w {neon_vceqv2si_insn}
  alt=0,overall=0,losers=0,rld_nregs=0

Why isn't it picking alternative 1 with the Dz constraint?

Christophe


> Thanks,
>
> Christophe
>
>
> > Thanks,
> > Kyrill
> >
> > > +
> > > +/* For vceqq_p64, we rely on vceq_p64 for each of the two elements.  */
> > > +__extension__ extern __inline uint64x2_t
> > > +__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
> > > +vceqq_p64 (poly64x2_t __a, poly64x2_t __b)
> > > +{
> > > +  poly64_t __high_a = vget_high_p64 (__a);
> > > +  poly64_t __high_b = vget_high_p64 (__b);
> > > +  uint64x1_t __high = vceq_p64(__high_a, __high_b);
> > > +
> > > +  poly64_t __low_a = vget_low_p64 (__a);
> > > +  poly64_t __low_b = vget_low_p64 (__b);
> > > +  uint64x1_t __low = vceq_p64(__low_a, __low_b);
> > > +  r

Re: Fix uninitialized memory use in ipa-modref

2020-11-06 Thread Martin Liška

On 11/5/20 6:54 PM, Jan Hubicka wrote:

On 11/5/20 3:27 PM, Jan Hubicka wrote:

 poly_int64 offset;
 struct modref_parm_map parm_map;
+  parm_map.parm_offset_known = false;
+  parm_map.parm_offset = 0;
+


I'm curious, can't we use proper C++ class construction?
The IPA pass is new, so can we make it more C++-ish? Similarly
for all newly introduced structs in modref.


We can't because our vec does not accept non-pods and this needs to be
GGC safe since it points to trees.
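
The trade-off can be illustrated with standard type traits. This is a
sketch under assumptions: the thread does not spell out the exact
requirement GCC's vec places on its element type, so std::is_trivial is
used here only as an approximation of "POD", and the struct and
function names are hypothetical, with int64_t standing in for the
offset type.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// No constructors: the type stays trivial, but every user has to
// initialize the fields explicitly.
struct parm_map_sketch
{
  int parm_index;
  bool parm_offset_known;
  std::int64_t parm_offset;
};

static_assert (std::is_trivial<parm_map_sketch>::value,
               "plain struct can live in a POD-only container");

// Adding a user-provided default constructor makes the type
// non-trivial, which a POD-only container would reject.
struct parm_map_with_ctor
{
  bool parm_offset_known;
  std::int64_t parm_offset;
  parm_map_with_ctor () : parm_offset_known (false), parm_offset (0) {}
};

static_assert (!std::is_trivial<parm_map_with_ctor>::value,
               "user-provided constructor makes the type non-trivial");

// A free helper gives one place to do the initialization without
// giving up triviality.  The non-negative index is a precondition
// assumed for this example only.
parm_map_sketch
make_unknown_parm_map (int parm_index)
{
  assert (parm_index >= 0);
  parm_map_sketch map;
  map.parm_index = parm_index;
  map.parm_offset_known = false;
  map.parm_offset = 0;
  return map;
}
```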


We could probably add construction of writes_errno, even though in a correct
run it should never be used (in analysis we need to be able to
reinitialize, and during stream-in we will always stream it in).


It may be an error-prone approach to initialize it to NULL.


What else do you think can be more C++-ish? The pass even has two
templates :).


Heh, all right :)



diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index b40f3da3ba2..e80f6de09f2 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -124,7 +124,7 @@ static GTY(()) fast_function_summary 
  /* Summary for a single function which this pass produces.  */
  
  modref_summary::modref_summary ()

-  : loads (NULL), stores (NULL)
+  : loads (NULL), stores (NULL), writes_errno (NULL)
  {
  }
  





Re: Fix uninitialized memory use in ipa-modref

2020-11-06 Thread Martin Liška

On 11/5/20 6:37 PM, Jan Hubicka wrote:

We can't because our vec does not accept non-pods and this needs to be
GGC safe since it points to trees.


Ah, that's new to me!

Thanks,
Martin


Re: [PATCH] generalized range_query class for multiple contexts

2020-11-06 Thread Martin Sebor via Gcc-patches

On 11/5/20 8:08 PM, Andrew MacLeod wrote:

On 11/5/20 7:50 PM, Martin Sebor wrote:

On 11/5/20 5:02 PM, Andrew MacLeod wrote:

On 11/5/20 4:02 PM, Martin Sebor wrote:

On 11/5/20 12:29 PM, Martin Sebor wrote:

On 10/1/20 11:25 AM, Martin Sebor wrote:

On 10/1/20 9:34 AM, Aldy Hernandez wrote:



On 10/1/20 3:22 PM, Andrew MacLeod wrote:
 > On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
 >>> Thanks for doing all this!  There isn't anything I don't understand
 >>> in the sprintf changes so no questions from me (well, almost none).
 >>> Just some comments:
 >> Thanks for your comments on the sprintf/strlen API conversion.
 >>
 >>> The current call statement is available in all functions that take
 >>> a directive argument, as dir->info.callstmt.  There should be no need
 >>> to also add it as a new argument to the functions that now need it.
 >> Fixed.
 >>
 >>> The change adds code along these lines in a bunch of places:
 >>>
 >>> + value_range vr;
 >>> + if (!query->range_of_expr (vr, arg, stmt))
 >>> +   vr.set_varying (TREE_TYPE (arg));
 >>>
 >>> I thought under the new Ranger APIs when a range couldn't be
 >>> determined it would be automatically set to the maximum for
 >>> the type.  I like that and have been moving in that direction
 >>> with my code myself (rather than having an API fail, have it
 >>> set the max range and succeed).
 >> I went through all the above idioms and noticed all are being used on
 >> supported types (integers or pointers).  So range_of_expr will always
 >> return true.  I've removed the if() and the set_varying.
 >>
 >>> Since that isn't so in this case, I think it would still be nice
 >>> if the added code could be written as if the range were set to
 >>> varying in this case and (ideally) reduced to just initialization:
 >>>
 >>> value_range vr = some-function (query, stmt, arg);
 >>>
 >>> some-function could be an inline helper defined just for the sprintf
 >>> pass (and maybe also strlen which also seems to use the same pattern),
 >>> or it could be a value_range AKA irange ctor, or it could be a member
 >>> of range_query, whatever is the most appropriate.
 >>>
 >>> (If assigning/copying a value_range is thought to be too expensive,
 >>> declaring it first and then passing it to that helper to set it
 >>> would work too).
 >>>
 >>> In strlen, is the removed comment no longer relevant? (I.e., does
 >>> the ranger solve the problem?)
 >>>
 >>> -  /* The range below may be "inaccurate" if a constant has been
 >>> -    substituted earlier for VAL by this pass that hasn't been
 >>> -    propagated through the CFG.  This shoud be fixed by the new
 >>> -    on-demand VRP if/when it becomes available (hopefully in
 >>> -    GCC 11).  */
 >> It should.
 >>
 >>> I'm wondering about the comment added to get_range_strlen_dynamic
 >>> and other places:
 >>>
 >>> + // FIXME: Use range_query instead of global ranges.
 >>>
 >>> Is that something you're planning to do in a followup or should
 >>> I remember to do it at some point?
 >> I'm not planning on doing it.  It's just a reminder that it would be
 >> beneficial to do so.
 >>
 >>> Otherwise I have no concern with the changes.
 >> It's not clear whether Andrew approved all 3 parts of the patchset
 >> or just the valuation part.  I'll wait for his nod before committing
 >> this chunk.
 >>
 >> Aldy
 >>
 > I have no issue with it, so OK.

Pushed all 3 patches.
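
The helper suggested earlier in this exchange (query a range, fall back
to "varying" on failure, and hand back an initialized value) might be
shaped roughly like this. It is a standalone model only: range_sketch,
query_sketch and range_or_varying are hypothetical and deliberately do
not use GCC's value_range or range_query types.

```cpp
#include <cassert>
#include <limits>

// Hypothetical integer range; "varying" means the full range of the type.
struct range_sketch
{
  long min;
  long max;

  void set_varying ()
  {
    min = std::numeric_limits<long>::min ();
    max = std::numeric_limits<long>::max ();
  }

  bool varying_p () const
  {
    return min == std::numeric_limits<long>::min ()
           && max == std::numeric_limits<long>::max ();
  }
};

// Hypothetical query object whose lookup may fail.
struct query_sketch
{
  bool has_range;
  long min;
  long max;

  // Fill R and return true if a range is known, else return false.
  bool range_of_expr (range_sketch &r) const
  {
    if (!has_range)
      return false;
    assert (min <= max);
    r.min = min;
    r.max = max;
    return true;
  }
};

// The wrapper: callers always get a usable range and need no
// if/else around every query.
range_sketch
range_or_varying (const query_sketch &query)
{
  range_sketch r;
  if (!query.range_of_expr (r))
    r.set_varying ();
  return r;
}
```

With such a wrapper each call site reduces to a single initialization,
which is the shape asked for in the review comment.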

 >
 > Just an observation that should be pointed out, I believe Aldy has all
 > the code for converting to a ranger, but we have not pursued that any
 > further yet since there is a regression due to our lack of equivalence
 > processing I think?  That should be resolved in the coming month, but at
 > the moment is a holdback/concern for converting these passes... iirc.

Yes.  Martin, the take away here is that the strlen/sprintf pass 
has been converted to the new API, but ranger is still not up and 
running on it (even on the branch).


With the new API, all you have to do is remove all instances of 
evrp_range_analyzer and replace them with a ranger. That's it.
Below is an untested patch that would convert you to a ranger 
once it's contributed.


IIRC when I enabled the ranger for your pass a while back, there 
was one or two regressions due to missing equivalences, and the 
rest were because the tests were expecting an actual specific 
range, and the ranger returned a slightly different/better one. 
You'll need to adjust your tests.


Ack.  I'll be on the lookout for the ranger commit (if you happen
to remember and CC me on it just in case I might miss it that would
be great).


I have applied the patch and run some tests.  There are quite
a few failures (see the list below).  I have only looked at
a couple.  The one in gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
boils down to the following test case.  There should be no warning
for either sprintf call.  The one in h()

Re: [PATCH] use get_size_range to get allocated size (PR 92942)

2020-11-06 Thread Jeff Law via Gcc-patches


On 8/28/20 11:12 AM, Martin Sebor via Gcc-patches wrote:
> The gimple_call_alloc_size() function that determines the range
> of sizes of allocated objects and constrains the bounds in calls
> to functions like memcpy calls get_range() instead of
> get_size_range() to obtain its result.  The latter is the right
> function to call because it has the necessary logic to constrain
> the range to just the values that are valid for object sizes.
> This is especially useful when the range is the result of
> a conversion from a signed to a wider unsigned integer where
> the upper subrange is excessive and can be eliminated such as in:
>
>   char* f (int n)
>   {
>     if (n > 8)
>       n = 8;
>     char *p = malloc (n);
>     strcpy (p, "0123456789");   // buffer overflow
>     ...
>   }
>
> Attached is a fix that lets -Wstringop-overflow diagnose the buffer
> overflow above.  Besides with GCC I have also tested the change by
> building Binutils/GDB and Glibc and verifying that it doesn't
> introduce any false positives.
>
> Martin
>
> gcc-92942.diff
>
> PR middle-end/92942 - missing -Wstringop-overflow for allocations with a 
> negative lower bound size
>
> gcc/ChangeLog:
>
>   PR middle-end/92942
>   * builtins.c (gimple_call_alloc_size): Call get_size_range instead
>   of get_range.
>   * calls.c (get_size_range): Define new overload.  Handle anti-ranges
> whose upper part is with the valid size range.
>   * calls.h (get_size_range): Declare new overload.
>
> gcc/testsuite/ChangeLog:
>
>   PR middle-end/92942
>   * gcc.dg/Wstringop-overflow-40.c: New test.
>   * gcc.dg/Wstringop-overflow-41.c: New test.
>   * gcc.dg/attr-alloc_size-10.c: Disable macro tracking.

Please re-test and once re-validated, this is fine for the trunk.


jeff
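
For readers following the signed-to-unsigned point in the quoted
description, here is a minimal sketch in plain C.  It is not GCC's
get_size_range; struct size_range, valid_size_range and must_overflow
are hypothetical names used only for illustration.  It assumes the
signed range has a non-negative upper bound, and it relies on the fact
that a negative int converted to size_t lands above PTRDIFF_MAX, which
no object size can exceed.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a range of object sizes; not GCC's internal
   representation.  */
struct size_range { size_t min, max; };

/* Given a signed range [LO, HI] with HI >= 0 whose values are converted
   to size_t, return the part that can be a valid object size.  */
struct size_range
valid_size_range (int lo, int hi)
{
  assert (lo <= hi && hi >= 0);
  struct size_range r;
  /* Negative values wrap to sizes above PTRDIFF_MAX, so that upper
     subrange is dropped and the lower bound becomes zero.  */
  r.min = lo < 0 ? 0 : (size_t) lo;
  r.max = (size_t) hi;
  return r;
}

/* Nonzero if writing NBYTES into an object of at most R.max bytes
   cannot fit.  */
int
must_overflow (struct size_range r, size_t nbytes)
{
  return nbytes > r.max;
}
```

With n clamped to at most 8 as in the quoted f(), the signed range
[INT_MIN, 8] reduces to sizes [0, 8], and the 11 bytes that strcpy
writes for "0123456789" cannot fit.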




Re: [PATCH] issue -Wstring-compare in more case (PR 95673)

2020-11-06 Thread Jeff Law via Gcc-patches


On 9/30/20 6:14 PM, Martin Sebor via Gcc-patches wrote:
> -Wstring-compare triggers under the same strict conditions as
> the strcmp/strncmp call is folded into a constant: only when
> all the uses of the result are [in]equality expressions with
> zero.  However, even when the call cannot be folded into
> a constant because the result is in addition used in other
> expressions besides equality to zero, GCC still sets the range
> of the result to nonzero.  So in more complex functions where
> some of the uses of the same result are in tests for equality
> to zero and others in other expressions, the warning fails to
> point out the very mistake it's designed to detect.
>
> The attached change enhances the function that determines how
> the strcmp/strncmp is used to also make it possible to detect
> the mistakes in the multi-use situations.
>
> Tested on x86_64-linux & by building Glibc and Binutils/GDB
> and confirming it triggers no new warnings.
>
> Martin
>
> gcc-95673.diff
>
> PR middle-end/95673 - missing -Wstring-compare for an impossible strncmp test
>
> gcc/ChangeLog:
>
>   PR middle-end/95673
>   * tree-ssa-strlen.c (used_only_for_zero_equality): Rename...
>   (use_in_zero_equality): ...to this.  Add a default argument.
>   (handle_builtin_memcmp): Adjust to the name change above.
>   (handle_builtin_string_cmp): Same.
>   (maybe_warn_pointless_strcmp): Same.  Pass in an explicit argument.
>
> gcc/testsuite/ChangeLog:
>
>   PR middle-end/95673
>   * gcc.dg/Wstring-compare-3.c: New test.

Please retest on the trunk and if testing is OK this is fine for the trunk.

jeff
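
A small illustration of the multi-use situation described above, as a
sketch only: struct tag and compare_tag are made-up names, and whether
a particular GCC release warns on this exact code is not something the
sketch claims.  The point is that the strcmp result feeds both an
equality test against zero and another expression.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record with a 4-byte name field (3 characters + NUL).  */
struct tag { char name[4]; };

/* The strcmp result is used twice: in an equality test against zero
   and as the return value.  The equality can never hold because NAME
   cannot contain the 5-character string "12345".  */
int
compare_tag (const struct tag *t, int *matched)
{
  assert (t && matched);
  int r = strcmp (t->name, "12345");
  *matched = (r == 0);   /* the impossible test the warning is about */
  return r;              /* the additional, non-equality use */
}
```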




Re: Patch for 96948

2020-11-06 Thread Jeff Law via Gcc-patches


On 9/8/20 9:34 AM, Martin Storsjö wrote:
> Hi,
>
> On Tue, 8 Sep 2020, Kirill Müller wrote:
>
>> Thanks for the heads up. The coincidence is funny -- a file that
>> hasn't been touched for years.
>
> I think we both may originally be triggered from the same guy asking
> around in different places about implementations of _Unwind_Backtrace
> for windows, actually.
>
>> I do believe that we need the logic around the `first` flag for
>> consistency with the other unwind-*.c implementations.
>
> Yes, if you store ms_context.Rip/Rsp before the RtlVirtualUnwind step
> - but my patch stores them afterwards; after RtlVirtualUnwind, before
> calling the callback.
>
> The result should be the same, except if using the first flag
> approach, I believe you're missing the last frame that is printed if
> using my patch.

Presumably with your patch installed, the patch from Kirill is
unnecessary, right?


jeff




Re: [PATCH, rs6000] Update instruction attributes for Power10

2020-11-06 Thread Pat Haugen via Gcc-patches
On 11/5/20 4:32 PM, will schmidt wrote:
> On Wed, 2020-11-04 at 14:42 -0600, Pat Haugen via Gcc-patches wrote:
>>  * config/rs6000/rs6000.c (rs6000_final_prescan_insn): Only add 'p' for
>>  PREFIXED_YES.
> 
> The code change reads as roughly 
> - next_insn_prefixed_p != PREFIXED_NO
> 
> + next_insn_prefixed_p == PREFIXED_YES"
> 
> So just an inversion of the logic? I don't obviously see the 'p' impact
> there.
> 
It's no longer an inversion of the logic since I added a PREFIXED_ALWAYS value. 
'next_insn_prefixed' is used by rs6000_final_prescan_insn() to determine 
whether an insn mnemonic needs a 'p' prefix. We want it set for PREFIXED_YES, 
but not for PREFIXED_NO or PREFIXED_ALWAYS.

> 
>>  * config/rs6000/rs6000.md (define_attr "size"): Add 256.
>>  (define_attr "prefixed"): Add 'always'.
>>  (define_mode_attr bits): Add DD/TD modes.
>>  (cfuged, cntlzdm, cnttzdm, pdepd, pextd, bswaphi2_reg, bswapsi2_reg,
>>  bswapdi2_brd, setbc_signed_,
>>  *setbcr_signed_, *setnbc_signed_,
>>  *setnbcr_signed_): Update instruction attributes for
>>  Power10.
> 
> ok.  (assuming the assorted 'integer' -> 'crypto' changes are correct,
> of course).  
> 
Yes, crypto represents the correct pipe the insns are executed on.

Thanks for the review,
Pat
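
To make the point about the third value concrete, here is a tiny
stand-alone sketch.  The enum below only mirrors the three names
mentioned above; its ordering and the helper names needs_p_old and
needs_p_new are assumptions for illustration, not the rs6000 code.

```c
#include <assert.h>

/* Illustrative tri-state; the real values live in the rs6000 backend.  */
enum prefixed { PREFIXED_NO, PREFIXED_YES, PREFIXED_ALWAYS };

/* Old test: anything that is not PREFIXED_NO gets the 'p'.  */
int
needs_p_old (enum prefixed v)
{
  return v != PREFIXED_NO;
}

/* New test: only PREFIXED_YES gets the 'p'.  */
int
needs_p_new (enum prefixed v)
{
  return v == PREFIXED_YES;
}
```

The two predicates agree for PREFIXED_NO and PREFIXED_YES and differ
only for PREFIXED_ALWAYS, which is why the change is not a plain
inversion once the third value exists.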



[PATCH] aarch64: Support permutes on unpacked SVE vectors

2020-11-06 Thread Richard Sandiford via Gcc-patches
This patch adds support for permuting unpacked SVE vectors using:

- DUP
- EXT
- REV[BHW]
- REV
- TRN[12]
- UZP[12]
- ZIP[12]

This involves rewriting the REV[BHW] permute code so that the inputs
and outputs of the insn pattern have the same mode as the vectors
being permuted.  This is different from the ACLE form, where the
reversal happens within individual elements rather than within
groups of multiple elements.

The patch does not add a conditional version of REV[BHW].  I'll come
back to that once we have partial-vector comparisons and selects.

The patch is really just enablement, adding an extra tool to the
toolbox.  It doesn't bring any significant vectorisation opportunities
on its own.  However, the patch does have one artificial example that
is now vectorised in a better way than before.

Tested on aarch64-linux-gnu (with and without SVE), applied.

Richard
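
As a rough scalar model of the kind of permute discussed for REV[BHW]
(reversing the order of elements inside fixed-size groups), the
following sketch may help.  It is only an illustration in plain C:
rev_within_groups is a hypothetical helper, and it says nothing about
how unpacked SVE vectors lay out elements in their containers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the order of the elements inside each group of GROUP
   consecutive elements of V.  N must be a multiple of GROUP.  */
void
rev_within_groups (uint16_t *v, size_t n, size_t group)
{
  assert (v && group != 0 && n % group == 0);
  for (size_t base = 0; base < n; base += group)
    for (size_t i = 0, j = group - 1; i < j; i++, j--)
      {
	uint16_t tmp = v[base + i];
	v[base + i] = v[base + j];
	v[base + j] = tmp;
      }
}
```

With group == 2 the selector is { 1, 0, 3, 2, ... }; with group == 4 it
is { 3, 2, 1, 0, 7, 6, 5, 4, ... }.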


gcc/
* config/aarch64/aarch64-modes.def (VNx2BF, VNx4BF): Adjust nunits
and alignment based on the current VG.
* config/aarch64/iterators.md (SVE_ALL, SVE_24, SVE_2, SVE_4): Add
partial SVE BF modes.
(UNSPEC_REVBHW): New unspec.
(Vetype, Vesize, Vctype, VEL, Vel, vwcore, V_INT_CONTAINER)
(v_int_container, VPRED, vpred): Handle partial SVE BF modes.
(container_bits, Vcwtype): New mode attributes.
* config/aarch64/aarch64-sve.md
(@aarch64_sve_revbhw_): New pattern.
(@aarch64_sve_dup_lane): Extended from SVE_FULL to SVE_ALL.
(@aarch64_sve_rev, @aarch64_sve_): Likewise.
(@aarch64_sve_ext): Likewise.
* config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle
E_VNx2BFmode and E_VNx4BFmode.
(aarch64_evpc_rev_local): Base the analysis on the container size
instead of the element size.  Use the new aarch64_sve_revbhw
patterns for SVE.
(aarch64_evpc_dup): Handle partial SVE data modes.  Use the
container size instead of the element size when applying the
SVE immediate limit.  Fix a previously incorrect bounds check.
(aarch64_expand_vec_perm_const_1): Handle partial SVE data modes.

gcc/testsuite/
* gcc.target/aarch64/sve/dup_lane_2.c: New test.
* gcc.target/aarch64/sve/dup_lane_3.c: Likewise.
* gcc.target/aarch64/sve/ext_4.c: Likewise.
* gcc.target/aarch64/sve/rev_2.c: Likewise.
* gcc.target/aarch64/sve/revhw_1.c: Likewise.
* gcc.target/aarch64/sve/revhw_2.c: Likewise.
* gcc.target/aarch64/sve/slp_perm_8.c: Likewise.
* gcc.target/aarch64/sve/trn1_2.c: Likewise.
* gcc.target/aarch64/sve/trn2_2.c: Likewise.
* gcc.target/aarch64/sve/uzp1_2.c: Likewise.
* gcc.target/aarch64/sve/uzp2_2.c: Likewise.
* gcc.target/aarch64/sve/zip1_2.c: Likewise.
* gcc.target/aarch64/sve/zip2_2.c: Likewise.
---
 gcc/config/aarch64/aarch64-modes.def  |   4 +
 gcc/config/aarch64/aarch64-sve.md |  57 ++-
 gcc/config/aarch64/aarch64.c  |  45 +-
 gcc/config/aarch64/iterators.md   |  54 ++-
 .../gcc.target/aarch64/sve/dup_lane_2.c   | 331 ++
 .../gcc.target/aarch64/sve/dup_lane_3.c   |  90 
 gcc/testsuite/gcc.target/aarch64/sve/ext_4.c  | 353 +++
 gcc/testsuite/gcc.target/aarch64/sve/rev_2.c  | 177 
 .../gcc.target/aarch64/sve/revhw_1.c  | 127 ++
 .../gcc.target/aarch64/sve/revhw_2.c  | 127 ++
 .../gcc.target/aarch64/sve/slp_perm_8.c   |  18 +
 gcc/testsuite/gcc.target/aarch64/sve/trn1_2.c | 403 ++
 gcc/testsuite/gcc.target/aarch64/sve/trn2_2.c | 403 ++
 gcc/testsuite/gcc.target/aarch64/sve/uzp1_2.c | 375 
 gcc/testsuite/gcc.target/aarch64/sve/uzp2_2.c | 375 
 gcc/testsuite/gcc.target/aarch64/sve/zip1_2.c | 403 ++
 gcc/testsuite/gcc.target/aarch64/sve/zip2_2.c | 403 ++
 17 files changed, 3685 insertions(+), 60 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/dup_lane_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/dup_lane_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/ext_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/rev_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/revhw_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/revhw_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_perm_8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/trn1_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/trn2_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/uzp1_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/uzp2_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/zip1_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/zip2_2.c

diff --git a/gcc/config/aarch64/aarch64-modes.def b/gcc/config/aarch64/aarch64-modes.def
index af972e8f72b..f3049

libcpp: Provide date routine

2020-11-06 Thread Nathan Sidwell

Joseph pointed me at cb_get_source_date_epoch, which allows repeatable
builds and solves a FIXME I had on the modules branch.  Unfortunately
it's used exclusively to generate __DATE__ and __TIME__ values, which
fallback to using a time(2) call.  It'd be nicer if the preprocessor
made whatever time value it determined available to the rest of the
compiler.  So this patch adds a new cpp_get_date function, which
abstracts the call to the get_source_date_epoch hook, or uses time
directly.  The value is cached.  Thus the timestamp I end up putting
on CMI files matches __DATE__ and __TIME__ expansions.  That seems
worthwhile.
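
The caching idea can be sketched outside libcpp roughly as follows.
This is not the patch itself: struct reader, enum time_kind, get_date
and example_fixed_epoch are stand-in names, and the hook is assumed to
return (time_t) -1 when no fixed epoch is configured.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Stand-in for the kind of time stamp that was determined.  */
enum time_kind { TIME_UNSET, TIME_FIXED, TIME_DYNAMIC, TIME_UNKNOWN };

/* Stand-in for the reader state; only the fields the sketch needs.  */
struct reader
{
  time_t (*get_source_date_epoch) (void); /* may be NULL */
  time_t time_stamp;
  enum time_kind kind;
};

/* Determine the time stamp once, cache it, and hand it out on every
   later call.  */
enum time_kind
get_date (struct reader *r, time_t *result)
{
  assert (r && result);
  if (r->kind == TIME_UNSET)
    {
      time_t t = (time_t) -1;
      if (r->get_source_date_epoch)
	t = r->get_source_date_epoch ();
      if (t != (time_t) -1)
	r->kind = TIME_FIXED;
      else
	{
	  /* (time_t) -1 can be a legitimate result, so check errno too.  */
	  errno = 0;
	  t = time (NULL);
	  r->kind = (t != (time_t) -1 || errno == 0
		     ? TIME_DYNAMIC : TIME_UNKNOWN);
	}
      r->time_stamp = t;
    }
  *result = r->time_stamp;
  return r->kind;
}

/* Example hook that pretends a fixed epoch was configured.  */
time_t
example_fixed_epoch (void)
{
  return (time_t) 1604620800;
}
```

Because the value is cached on first use, a second caller (a module
file writer, say) sees the same stamp that the date and time macros
were expanded from, even if the hook would answer differently later.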

libcpp/
* include/libcpp.h (enum class CPP_time_kind): New.
(cpp_get_date): Declare.
* internal.h (struct cpp_reader): Replace source_date_epoch with
time_stamp and time_stamp_kind.
* init.c (cpp_create_reader): Initialize them.
* macro.c (_cpp_builtin_macro_text): Use cpp_get_date.
(cpp_get_date): Broken out from _cpp_builtin_macro_text and
genericized.

pushing to trunk

nathan

--
Nathan Sidwell
diff --git i/libcpp/include/cpplib.h w/libcpp/include/cpplib.h
index 8e398863cf6..c4d7cc520d1 100644
--- i/libcpp/include/cpplib.h
+++ w/libcpp/include/cpplib.h
@@ -1040,6 +1040,15 @@ inline location_t cpp_macro_definition_location (cpp_hashnode *node)
 {
   return node->value.macro->line;
 }
+/* Return an idempotent time stamp (possibly from SOURCE_DATE_EPOCH).  */
+enum class CPP_time_kind 
+{
+  FIXED = -1,	/* Fixed time via source epoch.  */
+  DYNAMIC = -2,	/* Dynamic via time(2).  */
+  UNKNOWN = -3	/* Wibbly wobbly, timey wimey.  */
+};
+extern CPP_time_kind cpp_get_date (cpp_reader *, time_t *);
+
 extern void _cpp_backup_tokens (cpp_reader *, unsigned int);
 extern const cpp_token *cpp_peek_token (cpp_reader *, int);
 
diff --git i/libcpp/init.c w/libcpp/init.c
index 6c52f50de39..dcf1d4be587 100644
--- i/libcpp/init.c
+++ w/libcpp/init.c
@@ -273,8 +273,9 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   /* Do not force token locations by default.  */
   pfile->forced_token_location = 0;
 
-  /* Initialize source_date_epoch to -2 (not yet set).  */
-  pfile->source_date_epoch = (time_t) -2;
+  /* Note the timestamp is unset.  */
+  pfile->time_stamp = time_t (-1);
+  pfile->time_stamp_kind = 0;
 
   /* The expression parser stack.  */
   _cpp_expand_op_stack (pfile);
diff --git i/libcpp/internal.h w/libcpp/internal.h
index 4759961a33a..d7780e49d27 100644
--- i/libcpp/internal.h
+++ w/libcpp/internal.h
@@ -512,10 +512,9 @@ struct cpp_reader
   const unsigned char *date;
   const unsigned char *time;
 
-  /* Externally set timestamp to replace current date and time useful for
- reproducibility.  It should be initialized to -2 (not yet set) and
- set to -1 to disable it or to a non-negative value to enable it.  */
-  time_t source_date_epoch;
+  /* Time stamp, set idempotently lazily.  */
+  time_t time_stamp;
+  int time_stamp_kind; /* Or errno.  */
 
   /* A token forcing paste avoidance, and one demarking macro arguments.  */
   cpp_token avoid_paste;
diff --git i/libcpp/macro.c w/libcpp/macro.c
index e304f67c2e0..e2cb89e4c43 100644
--- i/libcpp/macro.c
+++ w/libcpp/macro.c
@@ -606,29 +606,21 @@ _cpp_builtin_macro_text (cpp_reader *pfile, cpp_hashnode *node,
 	 at init time, because time() and localtime() are very
 	 slow on some systems.  */
 	  time_t tt;
-	  struct tm *tb = NULL;
+	  auto kind = cpp_get_date (pfile, &tt);
 
-	  /* Set a reproducible timestamp for __DATE__ and __TIME__ macro
-	 if SOURCE_DATE_EPOCH is defined.  */
-	  if (pfile->source_date_epoch == (time_t) -2
-	  && pfile->cb.get_source_date_epoch != NULL)
-	pfile->source_date_epoch = pfile->cb.get_source_date_epoch (pfile);
-
-	  if (pfile->source_date_epoch >= (time_t) 0)
-	tb = gmtime (&pfile->source_date_epoch);
-	  else
+	  if (kind == CPP_time_kind::UNKNOWN)
 	{
-	  /* (time_t) -1 is a legitimate value for "number of seconds
-		 since the Epoch", so we have to do a little dance to
-		 distinguish that from a genuine error.  */
-	  errno = 0;
-	  tt = time (NULL);
-	  if (tt != (time_t)-1 || errno == 0)
-		tb = localtime (&tt);
+	  cpp_errno (pfile, CPP_DL_WARNING,
+			 "could not determine date and time");
+		
+	  pfile->date = UC"\"??? ?? \"";
+	  pfile->time = UC"\"??:??:??\"";
 	}
-
-	  if (tb)
+	  else
 	{
+	  struct tm *tb = (kind == CPP_time_kind::FIXED
+			   ? gmtime : localtime) (&tt);
+
 	  pfile->date = _cpp_unaligned_alloc (pfile,
 		  sizeof ("\"Oct 11 1347\""));
 	  sprintf ((char *) pfile->date, "\"%s %2d %4d\"",
@@ -640,14 +632,6 @@ _cpp_builtin_macro_text (cpp_reader *pfile, cpp_hashnode *node,
 	  sprintf ((char *) pfile->time, "\"%02d:%02d:%02d\"",
 		   tb->tm_hour, tb->tm_min, tb->tm_sec);
 	}
-	  else
-	{
-	  cpp_errno (pfile, CPP_DL_WARNING,
-			 "could not determine date and time");
-		

[PATCH] rework PRE PHI translation cache

2020-11-06 Thread Richard Biener
Turns out its size and time requirements can be stripped down
dramatically.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

2020-11-06  Richard Biener  

* tree-ssa-pre.c (expr_pred_trans_d): Modify so elements
are embedded rather than allocated.  Remove hashval member,
make all members integers.
(phi_trans_add): Adjust accordingly.
(phi_translate): Likewise.  Deal with re-allocation
of the table.
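
The general shape of such a cache (entries embedded in the table, all
members plain integers, an expression ID of zero meaning "empty") can
be sketched in plain C as below.  This is only a model under stated
assumptions: the table is fixed-size and never grows, hash_pair is an
arbitrary mixing function, and none of the names are taken from
tree-ssa-pre.c.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 64u	/* power of two, fixed for the sketch */

/* Entry stored directly in the table; E == 0 marks an empty slot.  */
struct trans_entry { unsigned e; int pred; unsigned v; };

/* Zero-initialised, so every slot starts out empty.  */
static struct trans_entry trans_table[TABLE_SIZE];

/* Arbitrary hash of the (expression ID, predecessor index) pair.  */
static unsigned
hash_pair (unsigned e, int pred)
{
  return (e * 2654435761u) ^ ((unsigned) pred * 40503u);
}

/* Find or create the slot for (E, PRED) using linear probing.
   Returns 1 if the entry already existed, 0 if it was just created.
   *ENTRY is set to NULL only if the table is full.  */
int
trans_add (struct trans_entry **entry, unsigned e, int pred)
{
  assert (entry && e != 0);
  unsigned i = hash_pair (e, pred) & (TABLE_SIZE - 1);
  for (unsigned probes = 0; probes < TABLE_SIZE; probes++)
    {
      struct trans_entry *slot = &trans_table[i];
      if (slot->e == 0)
	{
	  slot->e = e;
	  slot->pred = pred;
	  slot->v = 0;
	  *entry = slot;
	  return 0;
	}
      if (slot->e == e && slot->pred == pred)
	{
	  *entry = slot;
	  return 1;
	}
      i = (i + 1) & (TABLE_SIZE - 1);
    }
  *entry = NULL;
  return 0;
}
```

One consequence of embedding the entries: a pointer into the table is
only valid until the table is resized, which a growable table has to
account for when a slot is held across further insertions.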
---
 gcc/tree-ssa-pre.c | 104 +
 1 file changed, 68 insertions(+), 36 deletions(-)

diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 65e8aaaca02..3496891f8b5 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -545,45 +545,74 @@ static bitmap_obstack grand_bitmap_obstack;
 /* A three tuple {e, pred, v} used to cache phi translations in the
phi_translate_table.  */
 
-typedef struct expr_pred_trans_d : free_ptr_hash
+typedef struct expr_pred_trans_d : public typed_noop_remove 
 {
-  /* The expression.  */
-  pre_expr e;
+  typedef expr_pred_trans_d value_type;
+  typedef expr_pred_trans_d compare_type;
 
-  /* The predecessor block along which we translated the expression.  */
-  basic_block pred;
+  /* The expression ID.  */
+  unsigned e;
 
-  /* The value that resulted from the translation.  */
-  pre_expr v;
+  /* The predecessor block index along which we translated the expression.  */
+  int pred;
 
-  /* The hashcode for the expression, pred pair. This is cached for
- speed reasons.  */
-  hashval_t hashcode;
+  /* The value expression ID that resulted from the translation.  */
+  unsigned v;
 
   /* hash_table support.  */
-  static inline hashval_t hash (const expr_pred_trans_d *);
-  static inline int equal (const expr_pred_trans_d *, const expr_pred_trans_d *);
+  static inline void mark_empty (expr_pred_trans_d &);
+  static inline bool is_empty (const expr_pred_trans_d &);
+  static inline void mark_deleted (expr_pred_trans_d &);
+  static inline bool is_deleted (const expr_pred_trans_d &);
+  static const bool empty_zero_p = true;
+  static inline hashval_t hash (const expr_pred_trans_d &);
+  static inline int equal (const expr_pred_trans_d &, const expr_pred_trans_d &);
 } *expr_pred_trans_t;
 typedef const struct expr_pred_trans_d *const_expr_pred_trans_t;
 
+inline bool
+expr_pred_trans_d::is_empty (const expr_pred_trans_d &e)
+{
+  return e.e == 0;
+}
+
+inline bool
+expr_pred_trans_d::is_deleted (const expr_pred_trans_d &e)
+{
+  return e.e == -1u;
+}
+
+inline void
+expr_pred_trans_d::mark_empty (expr_pred_trans_d &e)
+{
+  e.e = 0;
+}
+
+inline void
+expr_pred_trans_d::mark_deleted (expr_pred_trans_d &e)
+{
+  e.e = -1u;
+}
+
 inline hashval_t
-expr_pred_trans_d::hash (const expr_pred_trans_d *e)
+expr_pred_trans_d::hash (const expr_pred_trans_d &e)
 {
-  return e->hashcode;
+  return iterative_hash_hashval_t (e.e, e.pred);
 }
 
 inline int
-expr_pred_trans_d::equal (const expr_pred_trans_d *ve1,
- const expr_pred_trans_d *ve2)
+expr_pred_trans_d::equal (const expr_pred_trans_d &ve1,
+ const expr_pred_trans_d &ve2)
 {
-  basic_block b1 = ve1->pred;
-  basic_block b2 = ve2->pred;
+  int b1 = ve1.pred;
+  int b2 = ve2.pred;
 
   /* If they are not translations for the same basic block, they can't
  be equal.  */
   if (b1 != b2)
 return false;
-  return pre_expr_d::equal (ve1->e, ve2->e);
+
+  return ve1.e == ve2.e;
 }
 
 /* The phi_translate_table caches phi translations for a given
@@ -596,24 +625,22 @@ static hash_table *phi_translate_table;
 static inline bool
 phi_trans_add (expr_pred_trans_t *entry, pre_expr e, basic_block pred)
 {
-  expr_pred_trans_t *slot;
+  expr_pred_trans_t slot;
   expr_pred_trans_d tem;
-  hashval_t hash = iterative_hash_hashval_t (pre_expr_d::hash (e),
-pred->index);
-  tem.e = e;
-  tem.pred = pred;
-  tem.hashcode = hash;
-  slot = phi_translate_table->find_slot_with_hash (&tem, hash, INSERT);
-  if (*slot)
+  unsigned id = get_expression_id (e);
+  hashval_t hash = iterative_hash_hashval_t (id, pred->index);
+  tem.e = id;
+  tem.pred = pred->index;
+  slot = phi_translate_table->find_slot_with_hash (tem, hash, INSERT);
+  if (slot->e)
 {
-  *entry = *slot;
+  *entry = slot;
   return true;
 }
 
-  *entry = *slot = XNEW (struct expr_pred_trans_d);
-  (*entry)->e = e;
-  (*entry)->pred = pred;
-  (*entry)->hashcode = hash;
+  *entry = slot;
+  slot->e = id;
+  slot->pred = pred->index;
   return false;
 }
 
@@ -1675,6 +1702,7 @@ phi_translate (bitmap_set_t dest, pre_expr expr,
   bitmap_set_t set1, bitmap_set_t set2, edge e)
 {
   expr_pred_trans_t slot = NULL;
+  size_t slot_size = 0;
   pre_expr phitrans;
 
   if (!expr)
@@ -1691,10 +1719,11 @@ phi_translate (bitmap_set_t dest, pre_expr expr,
   if (expr->kind != NAME)
 {
   if (phi_trans_add (&slot, expr, e->src))
-   return slot->v;
+   

Re: [21/32] miscelaneous

2020-11-06 Thread Nathan Sidwell

On 11/5/20 8:30 AM, Richard Biener wrote:

On Tue, Nov 3, 2020 at 10:16 PM Nathan Sidwell  wrote:


These are changes to gcc/tree.h adding some raw accessors to nodes,
which seemed preferable to direct field access.  I also needed access to
the integral constant cache


can you please document the adjusted interface to cache_integer_cst in
its (non-existing) function level comment?  It looks like 'replace'== true
turns it into get_or_insert from now put with an assertion it wasn't in the
cache.


Sure.  It's a little weird in that the current behaviour is to allow 
duplicates in the hash table, but not in the type's small-value vector.


I renamed the new parameter and documented what happens.  I'll apply 
this as a distinct patch during the merge (with changelog).  For now it 
lives on the modules branch
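
In toy form, the return-the-cached-node behaviour for the small-value
vector looks roughly like this.  It is a sketch only: struct int_cst,
small_cache and cache_small_cst are invented for illustration, the
limit of 16 is arbitrary, and the hash table used for larger values is
not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define SMALL_LIMIT 16	/* arbitrary bound for the sketch */

/* Toy constant node.  */
struct int_cst { int value; };

/* Per-type vector of cached small values; all slots start out NULL.  */
static struct int_cst *small_cache[SMALL_LIMIT];

/* Insert T into the cache and return the cached node, which may or may
   not be T.  Finding an existing entry is only expected when
   MAY_DUPLICATE is nonzero.  */
struct int_cst *
cache_small_cst (struct int_cst *t, int may_duplicate)
{
  assert (t && t->value >= 0 && t->value < SMALL_LIMIT);
  struct int_cst *r = small_cache[t->value];
  if (r)
    {
      assert (may_duplicate);
      return r;
    }
  small_cache[t->value] = t;
  return t;
}
```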


nathan

--
Nathan Sidwell
diff --git c/gcc/tree.c w/gcc/tree.c
index 9260772b846..9e10df0d7d0 100644
--- c/gcc/tree.c
+++ w/gcc/tree.c
@@ -1727,8 +1727,15 @@ wide_int_to_tree (tree type, const poly_wide_int_ref &value)
   return build_poly_int_cst (type, value);
 }
 
-void
-cache_integer_cst (tree t)
+/* Insert INTEGER_CST T into a cache of integer constants.  And return
+   the cached constant (which may or may not be T).  If MAY_DUPLICATE
+   is false, and T falls into the type's 'smaller values' range, there
+   cannot be an existing entry.  Otherwise, if MAY_DUPLICATE is true,
+   or the value is large, should an existing entry exist, it is
+   returned (rather than inserting T).  */
+
+tree
+cache_integer_cst (tree t, bool may_duplicate ATTRIBUTE_UNUSED)
 {
   tree type = TREE_TYPE (t);
   int ix = -1;
@@ -1742,7 +1749,7 @@ cache_integer_cst (tree t)
   switch (TREE_CODE (type))
 {
 case NULLPTR_TYPE:
-  gcc_assert (integer_zerop (t));
+  gcc_checking_assert (integer_zerop (t));
   /* Fallthru.  */
 
 case POINTER_TYPE:
@@ -1822,21 +1829,32 @@ cache_integer_cst (tree t)
 	  TYPE_CACHED_VALUES (type) = make_tree_vec (limit);
 	}
 
-  gcc_assert (TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) == NULL_TREE);
-  TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) = t;
+  if (tree r = TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix))
+	{
+	  gcc_checking_assert (may_duplicate);
+	  t = r;
+	}
+  else
+	TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) = t;
 }
   else
 {
   /* Use the cache of larger shared ints.  */
   tree *slot = int_cst_hash_table->find_slot (t, INSERT);
-  /* If there is already an entry for the number verify it's the
- same.  */
-  if (*slot)
-	gcc_assert (wi::to_wide (tree (*slot)) == wi::to_wide (t));
+  if (tree r = *slot)
+	{
+	  /* If there is already an entry for the number verify it's the
+	 same value.  */
+	  gcc_checking_assert (wi::to_wide (tree (r)) == wi::to_wide (t));
+	  /* And return the cached value.  */
+	  t = r;
+	}
   else
 	/* Otherwise insert this one into the hash table.  */
 	*slot = t;
 }
+
+  return t;
 }
 
 


Re: [PATCH] Support the new ("v0") mangling scheme in rust-demangle.

2020-11-06 Thread Jeff Law via Gcc-patches


On 11/1/20 10:26 AM, Nikhil Benesch via Gcc-patches wrote:
>
>
> On 11/1/20 6:57 AM, Eduard-Mihai Burtescu wrote:
>> Reading the diff patch, the v0 changes look great. I wouldn't be too
>> worried
>> about the "printable character" aspect, there are similar
>> Unicode-related
>> issues elsewhere, e.g. the "non-control ASCII" comment in
>> decode_legacy_escape
>> (I suppose we could make it also go through the "print a non-control
>> ASCII
>> character or some escape sequence" logic you added, if you think that
>> helps).
>
> No, it's entirely fine with me! I just wasn't sure if the small
> deviations in output were acceptable. It sounds like they are.

So I think the best path forward is to let you and Eduard-Mihai make the
technical decisions about what bits are ready for the trunk.  When y'all
think something is ready, let's go ahead and get it installed and
iterate on things that aren't quite ready yet.


For bits y'all think are ready, ISTM that Eduard-Mihai should commit the
changes.



>> I can test the patch and upload the dataset tomorrow, but if you want
>> to get
>> something committed sooner (is there a deadline for the next
>> release?), feel
>> free to land the v0 changes (snprintf + const values) without the
>> legacy ones.
>
> My understanding is that the GCC tree closes to new features on
> November 16 (for "GCC 11 Stage 3"), but I'm not sure whether that
> applies to libiberty or whether this patch would be classified as a
> feature or a bugfix.
>
> I don't have commit rights (nor am I even a GCC developer). Just
> wanted to tee things up for you and Ian this week. I'm very much
> looking forward to the new demangling scheme and didn't want to be
> just another +1 on the GitHub issue.
>
> So certainly no time pressure from me. But perhaps someone from the
> GCC side can confirm whether we are under a bit of time pressure here
> given the GCC 11 release.

It's better to get it in sooner, but there is some degree of freedom
depending on the impact of the changes.  Changes in the rust demangler
aren't likely to trigger codegen or ABI breakages in the compiler itself
-- so with that in mind I think we should give this code a higher degree
of freedom to land after the stage1 close deadline.


jeff




Re: [PATCH] arc: Improve/add instruction patterns to better use MAC instructions.

2020-11-06 Thread Jeff Law via Gcc-patches


On 10/9/20 8:24 AM, Claudiu Zissulescu wrote:
> From: Claudiu Zissulescu 
>
> ARC MYP7+ instructions add MAC instructions for vector and scalar data
> types. This patch adds a madd pattern for 16bit datum that is using the
> 32bit MAC instruction, and dot_prod patterns for v4hi vector
> types. The 64bit moves are also upgraded by using vadd2 instruction.
>
> gcc/
> -xx-xx  Claudiu Zissulescu  
>
>   * config/arc/arc.c (arc_split_move): Recognize vadd2 instructions.
>   * config/arc/arc.md (movdi_insn): Update pattern to use vadd2
>   instructions.
>   (movdf_insn): Likewise.
>   (maddhisi4): New pattern.
>   (umaddhisi4): Likewise.
>   * config/arc/simdext.md (mov_int): Update pattern to use
>   vadd2.
>   (sdot_prodv4hi): New pattern.
>   (udot_prodv4hi): Likewise.
>   (arc_vec_mac_hi_v4hi): Update/renamed to
>   arc_vec_mac_v2hiv2si.
>   (arc_vec_mac_v2hiv2si_zero): New pattern.

OK for the trunk.  Sorry for the delay.

jeff




Re: [PATCH] Optimize macro: make it more predictable

2020-11-06 Thread Jeff Law via Gcc-patches


On 10/23/20 5:47 AM, Martin Liška wrote:
> Hey.
>
> This is a follow-up of the discussion that happened in thread about
> no_stack_protector
> attribute: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545916.html
>
> The current optimize attribute works in the following way:
> - 1) we take current global_options as base
> - 2) maybe_default_options is called for the currently selected
> optimization level, which
>      means all rules in default_options_table are executed
> - 3) attribute values are applied (via decode_options)
>
> So the step 2) is problematic: in case of -O2 -fno-omit-frame-pointer
> and __attribute__((optimize("-fno-stack-protector"))) one
> ends basically with -O2 -fno-stack-protector because
> -fomit-frame-pointer is the default:
>     /* -O1 and -Og optimizations.  */
>     { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
>
> My patch handles that, and the optimize attribute now really behaves
> the same as appending the attribute value to the command line.
> So far so good. We should also reflect that in the documentation
> entry, which is quite vague right now:
>
> """
> The optimize attribute is used to specify that a function is to be
> compiled with different optimization options than specified on the
> command line.
> """
>
> and we may want to handle -Ox in the attribute in a special way. I
> guess many macro/pragma users expect that
>
> -O2 -ftree-vectorize and __attribute__((optimize(1))) will end with
> -O1 and not
> with -ftree-vectorize -O1 ?
>
> I'm also planning to take a look at the target macro/attribute, I
> expect similar problems:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97469
>
> Thoughts?
> Thanks,
> Martin
>
> gcc/c-family/ChangeLog:
>
>     * c-common.c (parse_optimize_options): Decoded attribute options
>     with the ones that were already set on the command line.
>
> gcc/ChangeLog:
>
>     * toplev.c (toplev::main): Save decoded Optimization options.
>     * toplev.h (save_opt_decoded_options): New.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable -ffast-math.
>     * gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
So you XNEWVEC and store the result into "merge_decoded_options".  But
you free "decoded_options".  Was that intentional?

This seems to bring a bit more predictability, but I suspect there's
more to do here.


jeff
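
The ordering problem described in the quoted message can be modelled
with a toy option decoder.  Everything here is hypothetical (struct
opts, enum opt, old_scheme, new_scheme, and the assumption that the
stack protector starts out enabled); it only illustrates why re-running
the level defaults loses an explicit -fno-omit-frame-pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy option state; field names are illustrative only.  */
struct opts { int level; int omit_frame_pointer; int stack_protector; };

enum opt { OPT_O2, OPT_NO_OMIT_FRAME_POINTER, OPT_NO_STACK_PROTECTOR };

/* Stand-in for running the per-level default rules.  */
static void
apply_level_defaults (struct opts *o)
{
  if (o->level >= 1)
    o->omit_frame_pointer = 1;
}

/* Apply one decoded option to the state.  */
static void
apply_option (struct opts *o, enum opt opt)
{
  switch (opt)
    {
    case OPT_O2:
      o->level = 2;
      apply_level_defaults (o);
      break;
    case OPT_NO_OMIT_FRAME_POINTER:
      o->omit_frame_pointer = 0;
      break;
    case OPT_NO_STACK_PROTECTOR:
      o->stack_protector = 0;
      break;
    }
}

/* Decode a command line from scratch (stack protector assumed on).  */
static struct opts
decode (const enum opt *v, size_t n)
{
  struct opts o = { 0, 0, 1 };
  for (size_t i = 0; i < n; i++)
    apply_option (&o, v[i]);
  return o;
}

/* Old scheme: take the global state, re-run the level defaults, then
   apply the attribute options.  */
struct opts
old_scheme (const enum opt *cmd, size_t ncmd,
	    const enum opt *attr, size_t nattr)
{
  struct opts o = decode (cmd, ncmd);
  apply_level_defaults (&o);
  for (size_t i = 0; i < nattr; i++)
    apply_option (&o, attr[i]);
  return o;
}

/* New scheme: behave as if the attribute options were appended to the
   saved command line.  */
struct opts
new_scheme (const enum opt *cmd, size_t ncmd,
	    const enum opt *attr, size_t nattr)
{
  struct opts o = decode (cmd, ncmd);
  for (size_t i = 0; i < nattr; i++)
    apply_option (&o, attr[i]);
  return o;
}
```

For the command line { -O2, -fno-omit-frame-pointer } and the attribute
{ -fno-stack-protector }, the old scheme ends with the frame pointer
omitted again, while the new one keeps the user's choice.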




Re: Rename DECL_IS_BUILTIN to DECL_IS_UNDECLARED_BUILTIN

2020-11-06 Thread Jeff Law via Gcc-patches


On 10/23/20 6:48 AM, Nathan Sidwell wrote:
> Patch affects C++, C, GO, common-core
>
> In cleaning up C++'s handling of hidden decls, I renamed its
> DECL_BUILTIN_P, which checks for loc == BUILTINS_LOCATION to
> DECL_UNDECLARED_BUILTIN_P, because the location gets updated, if user
> source declares the builtin, and the predicate no longer holds.  The
> original name was confusing me.  (The builtin may still retain builtin
> properties in the redeclaration, and other predicates can still detect
> that.)
>
> I discovered that tree.h had its own variant 'DECL_IS_BUILTIN', which
> behaves in (almost) the same manner.  And therefore has the same
> mutating behaviour.
>
> This patch deletes the C++ one, and renames tree.h's to
> DECL_IS_UNDECLARED_BUILTIN, to emphasize its non-constantness.  I
> guess _IS_ wins over _P :)
>
> The indirection via SOURCE_LOCUS was introduced by Richard in 2012:
>     2012-09-26  Richard Guenther  
>
>     * tree.h (DECL_IS_BUILTIN): Compare LOCATION_LOCUS.
>
>     From-SVN: r191759
>
> I couldn't find the email on gcc-patches, but I don't see why this is
> necessary -- no undeclared builtin has an adhoc location, they're all
> BUILTINS_LOCATION, or UNKNOWN_LOCATION.
>
> That some builtins have UNKNOWN_LOCATION is why the test is <= rather
> than ==.  This seems wrong, and we should be using BUILTINS_LOCATION
> everywhere.  But that's a different bug.
>
> bootstrapped on x86-64-linux and test results look the same across all
> the languages I can build. ok?
>
> gcc/
> * tree.h (DECL_IS_BUILTIN): Rename to ...
> (DECL_IS_UNDECLARED_BUILTIN): ... here.  No need to use
>     SOURCE_LOCUS.
> * calls.c (maybe_warn_alloc_args_overflow): Adjust for rename.
> * cfgexpand.c (pass_expand::execute): Likewise.
> * dwarf2out.c (base_type_die, is_naming_typedef_decl): Likewise.
> * godump.c (go_decl, go_type_decl): Likewise.
> * print-tree.c (print_decl_identifier): Likewise.
> * tree-pretty-print.c (dump_generic_node): Likewise.
> * tree-ssa-ccp.c (pass_post_ipa_warn::execute): Likewise.
> * xcoffout.c (xcoff_assign_fundamental_type_number): Likewise.
> gcc/c-family/
> * c-ada-spec.c (collect_ada_nodes): Rename
> DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
> (collect_ada_node): Likewise.
> (dump_forward_type): Likewise.
> * c-common.c (set_underlying_type): Rename
> DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
> (user_facing_original_type): Likewise.
> (c_common_finalize_early_debug): Likewise.
> gcc/c/
> * c-decl.c (diagnose_mismatched_decls): Rename
> DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
> (warn_if_shadowing, implicitly_declare, names_builtin_p)
> (collect_source_refs): Likewise.
> * c-typeck.c (inform_declaration, inform_for_arg)
> (convert_for_assignment): Likewise.
> gcc/cp/
> * cp-tree.h (DECL_UNDECLARED_BUILTIN_P): Delete.
> * cp-objcp-common.c (names_builtin_p): Rename
> DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
> * decl.c (decls_match): Likewise.  Replace
> DECL_UNDECLARED_BUILTIN_P with DECL_IS_UNDECLARED_BUILTIN.
> (duplicate_decls): Likewise.
> * decl2.c (collect_source_refs): Likewise.
> * name-lookup.c (anticipated_builtin_p, print_binding_level)
> (do_nonmember_using_decl): Likewise.
> * pt.c (builtin_pack_fn_p): Likewise.
> * typeck.c (error_args_num): Likewise.
> gcc/lto/
> * lto-symtab.c (lto_symtab_merge_decls_1): Rename
> DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
> gcc/go/
> * go-gcc.cc (Gcc_backend::call_expression): Rename
> DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
> libcc1/
> * libcc1plugin.cc (address_rewriter): Rename
> DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
> * libcp1plugin.cc (supplement_binding): Likewise.

OK

jeff
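
For readers new to the thread, here is a minimal sketch of why the old name was misleading. It is a toy model, not GCC code: the toy_* types and functions are invented for illustration. Only the two location values (UNKNOWN_LOCATION is 0 and BUILTINS_LOCATION is 1 in GCC) and the <= comparison come from the discussion above.

```cpp
// Toy model only: none of these types are GCC's real tree or location_t.
typedef unsigned int toy_location;

constexpr toy_location TOY_UNKNOWN_LOCATION = 0;   // mirrors UNKNOWN_LOCATION
constexpr toy_location TOY_BUILTINS_LOCATION = 1;  // mirrors BUILTINS_LOCATION

struct toy_decl
{
  toy_location loc;  // location of the most recent declaration
  bool is_builtin;   // stands in for builtin properties that survive redeclaration
};

// Same shape as the renamed predicate: true only while the decl still
// carries a builtin (or unknown) location, hence the <= test.
constexpr bool
toy_decl_is_undeclared_builtin (const toy_decl &d)
{
  return d.loc <= TOY_BUILTINS_LOCATION;
}

// A user declaration updates the location; the builtin property stays.
constexpr toy_decl
toy_redeclare (const toy_decl &d, toy_location user_loc)
{
  return toy_decl { user_loc, d.is_builtin };
}
```

In this model the predicate flips from true to false once toy_redeclare runs, while is_builtin is unchanged, which is the "mutating behaviour" the new name is meant to advertise.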




Re: [PATCH] libcpp: Update cpp_wcwidth() to Unicode 13.0.0

2020-11-06 Thread Jeff Law via Gcc-patches


On 10/23/20 9:01 AM, Lewis Hyatt via Gcc-patches wrote:
> Hello-
>
> The attached patch updates cpp_wcwidth() (for computation of display
> widths needed to calculate column numbers in diagnostics) from Unicode 12
> to Unicode 13. The patch was purely mechanical, following the directions
> in contrib/unicode/README without any unexpected hiccups. A couple
> questions please:
>
> -Is it OK for master?

Yes, it is OK for the trunk.  Please go ahead and commit it.


>
> -Unicode 13 actually came out just immediately before GCC 10 was
>  released. Would it make sense to put this on GCC 10 branch as well?

I wouldn't.  The general guidance is that we fix regressions on the
release branches and this wouldn't qualify. 


Jeff




Re: [PATCH] c-family: Fix regression in location-overflow-test-1.c [PR97117]

2020-11-06 Thread Jeff Law via Gcc-patches


On 10/14/20 9:56 AM, Patrick Palka via Gcc-patches wrote:
> The r11-3266 patch that added macro support to -Wmisleading-indentation
> accidentally suppressed the column-tracking diagnostic in
> get_visual_column in some cases, e.g. in the location-overflow-test-1.c
> testcase.
>
> More generally, when all three tokens are on the same line and we've run
> out of locations with column info, then their location_t values will be
> equal, and we exit early from should_warn_for_misleading_indentation due
> to the new check
>
>   /* Give up if the loci are not all distinct.  */
>   if (guard_loc == body_loc || body_loc == next_stmt_loc)
> return false;
>
> before we ever call get_visual_column.
>
> [ This new check is needed to detect and give up on analyzing code
>   fragments where exactly two out of the three tokens come from the same
>   macro expansion, e.g.
>
> #define MACRO \
>   if (a)  \
> foo ();
>
> MACRO; bar ();
>
>   Here, guard_loc and body_loc will be equal and point to the macro
>   expansion point.  The heuristics the warning uses are not really valid
>   in scenarios like these.  ]
>
> In order to restore the column-tracking diagnostic, this patch moves the
> diagnostic code out of get_visual_column to earlier in
> should_warn_for_misleading_indentation.  Moreover, it tests the three
> location_t values for a zero column all at once, which I suppose should
> make us issue the diagnostic more consistently.
>
> Tested on x86_64-pc-linux-gnu, does this look OK to commit?
>
> gcc/c-family/ChangeLog:
>
>   PR testsuite/97117
>   * c-indentation.c (get_visual_column): Remove location_t
>   parameter.  Move the column-tracking diagnostic code from here
>   to ...
>   (should_warn_for_misleading_indentation): ... here, before the
>   early exit for when the loci are not all distinct.  Don't pass a
>   location_t argument to get_visual_column.
>   (assert_get_visual_column_succeeds): Don't pass a location_t
>   argument to get_visual_column.
>   (assert_get_visual_column_fails): Likewise.

OK.


jeff
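
A rough sketch of the reordering described in the patch, under loud assumptions: real location_t values are opaque integers, not structs, and the toy_* names below are invented. The only point shown is that the zero-column test on all three locations now runs before the "loci are not all distinct" early exit.

```cpp
// Toy stand-in for a location; column 0 means "no column info left".
struct toy_loc
{
  unsigned int line;
  unsigned int column;
};

enum toy_result
{
  TOY_NOTE_NO_COLUMNS,  // issue the column-tracking diagnostic, then stop
  TOY_GIVE_UP,          // loci not all distinct, heuristics do not apply
  TOY_ANALYZE           // carry on to the real indentation analysis
};

constexpr bool
toy_same (toy_loc a, toy_loc b)
{
  return a.line == b.line && a.column == b.column;
}

// Order of checks after the patch, as described in the ChangeLog:
// zero-column test first, distinctness test second.
constexpr toy_result
toy_prologue (toy_loc guard, toy_loc body, toy_loc next_stmt)
{
  return (guard.column == 0 || body.column == 0 || next_stmt.column == 0)
	   ? TOY_NOTE_NO_COLUMNS
	 : (toy_same (guard, body) || toy_same (body, next_stmt))
	   ? TOY_GIVE_UP
	 : TOY_ANALYZE;
}
```

With the checks in the opposite order, three identical column-less locations would hit TOY_GIVE_UP and the diagnostic would never be reached, which is the regression the testcase exposed.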




Re: [Patch, fortran] PR83118 - [8/9/10/11 Regression] Bad intrinsic assignment of class(*) array component of derived type

2020-11-06 Thread Tobias Burnus

Hi Paul,

sorry for the belated attempt to review your patch.
I say "attempt" because neither via @gcc.gnu.org nor in the direct email
did I see the attached patch.
(This matches what Andre mentioned to you today on IRC, #gfortran.)

I only see:
* Content-Type: multipart/alternative
  Content-Type: text/plain; charset="UTF-8"
* Content-Type: text/html; charset="UTF-8"
(Side remark: I did not know that GCC now accepts
 text/html multipart emails. Still, it is better to
 avoid this.)

* Content-Type: application/octet-stream; name="Change2.Logs"
* Content-Type: application/octet-stream; name="unlimited_polymorphic_32.f03"

Regarding the changelog: The first line for the git commit is missing.
That should be a single (not too long) line, the one that "git log --oneline"
shows, followed by an empty line.
If possible, it should be a substring of the email subject (if that makes
sense). For this thread, it helps if "PR83118", which is also in the thread
name, is present in that line.

Additionally spotted:

"Reallocation of lhs only to happen if siz changes"
Typo "siz"? Or is this a badly named variable?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: [Patch] testsuite: Avoid TCL errors when rootme or ASAN/TSAN/UBSAN is not available (was: Re: [Patch] testsuite: Avoid TCL errors when ASAN/TSAN/UBSAN is not available)

2020-11-06 Thread Jeff Law via Gcc-patches


On 10/19/20 10:03 AM, Tobias Burnus wrote:
> Thomas Schwinge and Joseph convinced me that 'rootme' only makes sense
> for in-tree testing and, hence, does not need (or: should not) be set in
> site.exp.
>
> Thus, if it is not set, we have to check its existence before using it –
> to avoid similar TCL errors.
> Hence, I updated the patch to check also for 'rootme'.
>
> OK?
>
> Tobias
>
> On 10/19/20 11:46 AM, Tobias Burnus wrote:
>> In a --disable-libsanitizer build, I see errors such as:
>>   g++.sum:ERROR: can't read "asan_saved_library_path": no such variable
>>
>> I believe the following patch is the right way to solve this.
>> OK?
>>
>> Tobias
>>
>
> san-test-v2.diff
>
> testsuite: Avoid TCL errors when rootme or ASAN/TSAN/UBSAN is not avail
>
>   * g++.dg/guality/guality.exp:
>   * gcc.dg/guality/guality.exp:
>   * gfortran.dg/guality/guality.exp:
>   * lib/asan-dg.exp:
>   * lib/tsan-dg.exp:
>   * lib/ubsan-dg.exp:
>
>  gcc/testsuite/g++.dg/guality/guality.exp  | 2 +-
>  gcc/testsuite/gcc.dg/guality/guality.exp  | 2 +-
>  gcc/testsuite/gfortran.dg/guality/guality.exp | 2 +-
>  gcc/testsuite/lib/asan-dg.exp | 6 --
>  gcc/testsuite/lib/tsan-dg.exp | 6 --
>  gcc/testsuite/lib/ubsan-dg.exp| 6 --
>  6 files changed, 15 insertions(+), 9 deletions(-)

OK with ChangeLog entry completed.


jeff




Re: builtins: Add DFP signaling NaN built-in functions

2020-11-06 Thread Jeff Law via Gcc-patches


On 11/4/20 5:35 PM, Joseph Myers wrote:
> Add built-in functions __builtin_nansd32, __builtin_nansd64 and
> __builtin_nansd128 to return signaling NaNs of decimal floating-point
> types, analogous to the functions already present for binary
> floating-point types.
>
> This patch, independent of
> 
> (pending review), is in preparation for adding the <float.h> macros
> for such signaling NaNs that are in C2x, analogous to the macros for
> other types that are in that patch.
>
> Bootstrapped with no regressions for x86_64-pc-linux-gnu.  Also ran
> the new tests for powerpc64le-linux-gnu to confirm they do work in the
> case (hardware DFP) where floating-point exceptions are supported for
> DFP.  OK to commit?
>
> gcc/
> 2020-11-05  Joseph Myers  
>
>   * builtins.def (BUILT_IN_NANSD32, BUILT_IN_NANSD64)
>   (BUILT_IN_NANSD128): New built-in functions.
>   * fold-const-call.c (fold_const_call): Handle the new built-in
>   functions.
>   * doc/extend.texi (__builtin_nansd32, __builtin_nansd64)
>   (__builtin_nansd128): Document.
>   * doc/sourcebuild.texi (Effective-Target Keywords): Document
>   fenv_exceptions_dfp.
>
> gcc/testsuite/
> 2020-11-05  Joseph Myers  
>
>   * lib/target-supports.exp
>   (check_effective_target_fenv_exceptions_dfp): New.
>   * gcc.dg/dfp/builtin-snan-1.c, gcc.dg/dfp/builtin-snan-2.c: New
>   tests.

OK

jeff




Re: float.h: C2x decimal signaling NaN macros

2020-11-06 Thread Jeff Law via Gcc-patches


On 11/5/20 4:21 PM, Joseph Myers wrote:
> C2x adds macros for decimal floating-point signaling NaNs to
> <float.h>.  Add these macros to GCC's <float.h> implementation.
>
> Note that the current C2x draft has these under incorrect names
> D32_SNAN, D64_SNAN, D128_SNAN.  The intent was to change the naming
> convention to be consistent with other <float.h> macros when they were
> moved to <float.h>, so DEC32_SNAN, DEC64_SNAN, DEC128_SNAN, which this
> patch uses (as does the current draft integration of TS 18661-3 as an
> Annex to C2x, for its _Decimal* and _Decimal*x types).
>
> This patch is relative to a tree with
> 
> and
> 
> (both pending review) applied.
>
> Bootstrapped with no regressions for x86_64-pc-linux-gnu.  OK to commit?
>
> gcc/
> 2020-11-05  Joseph Myers  
>
>   * ginclude/float.h (DEC32_SNAN, DEC64_SNAN, DEC128_SNAN): New C2x
>   macros.
>
> gcc/testsuite/
> 2020-11-05  Joseph Myers  
>
>   * gcc.dg/dfp/c2x-float-dfp-7.c, gcc.dg/dfp/c2x-float-dfp-8.c: New
>   tests.
>   * gcc.dg/c2x-float-no-dfp-3.c: Also check that DEC32_SNAN,
>   DEC64_SNAN and DEC128_SNAN are not defined.

OK

jeff




Re: [PATCH][AArch64] Use intrinsics for upper saturating shift right

2020-11-06 Thread Richard Sandiford via Gcc-patches
David Candler  writes:
> Hi Richard,
>
> Thanks for the feedback.
>
> Richard Sandiford  writes:
>> > diff --git a/gcc/config/aarch64/aarch64-builtins.c 
>> > b/gcc/config/aarch64/aarch64-builtins.c
>> > index 4f33dd936c7..f93f4e29c89 100644
>> > --- a/gcc/config/aarch64/aarch64-builtins.c
>> > +++ b/gcc/config/aarch64/aarch64-builtins.c
>> > @@ -254,6 +254,10 @@ 
>> > aarch64_types_binop_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>> >  #define TYPES_GETREG (aarch64_types_binop_imm_qualifiers)
>> >  #define TYPES_SHIFTIMM (aarch64_types_binop_imm_qualifiers)
>> >  static enum aarch64_type_qualifiers
>> > +aarch64_types_ternop_s_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>> > +  = { qualifier_none, qualifier_none, qualifier_none, 
>> > qualifier_immediate};
>> > +#define TYPES_SHIFT2IMM (aarch64_types_ternop_s_imm_qualifiers)
>> > +static enum aarch64_type_qualifiers
>> >  aarch64_types_shift_to_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>> >= { qualifier_unsigned, qualifier_none, qualifier_immediate };
>> >  #define TYPES_SHIFTIMM_USS (aarch64_types_shift_to_unsigned_qualifiers)
>> > @@ -265,14 +269,16 @@ static enum aarch64_type_qualifiers
>> >  aarch64_types_unsigned_shift_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>> >= { qualifier_unsigned, qualifier_unsigned, qualifier_immediate };
>> >  #define TYPES_USHIFTIMM (aarch64_types_unsigned_shift_qualifiers)
>> > +#define TYPES_USHIFT2IMM (aarch64_types_ternopu_imm_qualifiers)
>> > +static enum aarch64_type_qualifiers
>> > +aarch64_types_shift2_to_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>> > +  = { qualifier_unsigned, qualifier_unsigned, qualifier_none, 
>> > qualifier_immediate };
>> > +#define TYPES_SHIFT2IMM_UUSS (aarch64_types_shift2_to_unsigned_qualifiers)
>> >
>> >  static enum aarch64_type_qualifiers
>> >  aarch64_types_ternop_s_imm_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>> >= { qualifier_none, qualifier_none, qualifier_poly, 
>> > qualifier_immediate};
>> >  #define TYPES_SETREGP (aarch64_types_ternop_s_imm_p_qualifiers)
>> > -static enum aarch64_type_qualifiers
>> > -aarch64_types_ternop_s_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>> > -  = { qualifier_none, qualifier_none, qualifier_none, 
>> > qualifier_immediate};
>> >  #define TYPES_SETREG (aarch64_types_ternop_s_imm_qualifiers)
>> >  #define TYPES_SHIFTINSERT (aarch64_types_ternop_s_imm_qualifiers)
>> >  #define TYPES_SHIFTACC (aarch64_types_ternop_s_imm_qualifiers)
>>
>> Very minor, but I think it would be better to keep
>> aarch64_types_ternop_s_imm_qualifiers where it is and define
>> TYPES_SHIFT2IMM here rather than above.  For better or worse,
>> the current style seems to be to keep the defines next to the
>> associated arrays, rather than group them based on the TYPES_* name.
>>
>> > diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
>> > b/gcc/config/aarch64/aarch64-simd-builtins.def
>> > index d1b21102b2f..0b82b9c072b 100644
>> > --- a/gcc/config/aarch64/aarch64-simd-builtins.def
>> > +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
>> > @@ -285,6 +285,13 @@
>> >BUILTIN_VSQN_HSDI (USHIFTIMM, uqshrn_n, 0, ALL)
>> >BUILTIN_VSQN_HSDI (SHIFTIMM, sqrshrn_n, 0, ALL)
>> >BUILTIN_VSQN_HSDI (USHIFTIMM, uqrshrn_n, 0, ALL)
>> > +  /* Implemented by aarch64_qshrn2_n.  */
>> > +  BUILTIN_VQN (SHIFT2IMM_UUSS, sqshrun2_n, 0, ALL)
>> > +  BUILTIN_VQN (SHIFT2IMM_UUSS, sqrshrun2_n, 0, ALL)
>> > +  BUILTIN_VQN (SHIFT2IMM, sqshrn2_n, 0, ALL)
>> > +  BUILTIN_VQN (USHIFT2IMM, uqshrn2_n, 0, ALL)
>> > +  BUILTIN_VQN (SHIFT2IMM, sqrshrn2_n, 0, ALL)
>> > +  BUILTIN_VQN (USHIFT2IMM, uqrshrn2_n, 0, ALL)
>>
>> Using ALL is a holdover from the time (until a few weeks ago) when we
>> didn't record function attributes.  New intrinsics should therefore
>> have something more specific than ALL.
>>
>> We discussed offline whether the Q flag side effect of the intrinsics
>> should be observable or not, and the conclusion was that it shouldn't.
>> I think we can therefore treat these functions as pure functions,
>> meaning that they should have flags NONE rather than ALL.
>>
>> For that reason, I think we should also remove the Set_Neon_Cumulative_Sat
>> and CHECK_CUMULATIVE_SAT parts of the test (sorry).
>>
>> Other than that, the patch looks good to go.
>>
>> Thanks,
>> Richard
>
> I've updated the patch with TYPES_SHIFT2IMM moved, the builtins changed
> to NONE, and the Q flag portion of the tests removed.

Looks good to me, thanks.  Pushed to trunk.

Richard
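
For readers unfamiliar with these instructions, a scalar, single-lane sketch of what "saturating shift right narrow" computes. This is not the ACLE intrinsic and not GCC's builtin: the toy_* names are invented, only the 32-bit to 16-bit case is shown, and the `2` (upper-half) placement of the results is not modelled. The Q flag is not modelled either, in line with the conclusion above that it should not be observable.

```cpp
#include <cstdint>

// Assumes >> on a negative signed value is an arithmetic shift, as it is
// with GCC.  `shift` is expected to be in the range 1..16.

// Clamp a 64-bit intermediate into the signed 16-bit range.
constexpr int16_t
toy_sat_s16 (int64_t v)
{
  return v > INT16_MAX ? INT16_MAX
	 : v < INT16_MIN ? INT16_MIN
	 : static_cast<int16_t> (v);
}

// Clamp a 64-bit intermediate into the unsigned 16-bit range.
constexpr uint16_t
toy_sat_u16 (int64_t v)
{
  return v > UINT16_MAX ? UINT16_MAX
	 : v < 0 ? 0
	 : static_cast<uint16_t> (v);
}

// sqshrn-like: shift right, then saturate to signed 16 bits.
constexpr int16_t
toy_sqshrn_lane (int32_t x, int shift)
{
  return toy_sat_s16 (static_cast<int64_t> (x) >> shift);
}

// sqrshrn-like: add the rounding constant before shifting.
constexpr int16_t
toy_sqrshrn_lane (int32_t x, int shift)
{
  return toy_sat_s16 ((static_cast<int64_t> (x)
		       + (INT64_C (1) << (shift - 1))) >> shift);
}

// sqshrun-like: signed input, unsigned saturated result.
constexpr uint16_t
toy_sqshrun_lane (int32_t x, int shift)
{
  return toy_sat_u16 (static_cast<int64_t> (x) >> shift);
}
```

Each function depends only on its arguments, which matches the decision to give the new builtins flags NONE rather than ALL.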


Re: [PATCH, rs6000] Update instruction attributes for Power10

2020-11-06 Thread will schmidt via Gcc-patches
On Fri, 2020-11-06 at 10:46 -0600, Pat Haugen wrote:
> On 11/5/20 4:32 PM, will schmidt wrote:
> > On Wed, 2020-11-04 at 14:42 -0600, Pat Haugen via Gcc-patches
> > wrote:
> > >   * config/rs6000/rs6000.c (rs6000_final_prescan_insn): Only add
> > > 'p' for
> > >   PREFIXED_YES.
> > 
> > The code change reads as roughly 
> > - next_insn_prefixed_p != PREFIXED_NO
> > 
> > + next_insn_prefixed_p == PREFIXED_YES
> > 
> > So just an inversion of the logic? I don't obviously see the 'p'
> > impact
> > there.
> > 
> 
> It's no longer an inversion of the logic since I added a
> PREFIXED_ALWAYS value. 'next_insn_prefixed_p' is used by
> rs6000_final_prescan_insn() to determine whether an insn mnemonic
> needs a 'p' prefix. We want it set for PREFIXED_YES, but not for
> PREFIXED_NO or PREFIXED_ALWAYS.

Ok.  So the next_insn_prefixed_p indicates whether the instruction
has/gets/needs a p prefix.  gotcha.  thanks for clarifying.  :-)

thanks
-will


> 
> > 
> > >   * config/rs6000/rs6000.md (define_attr "size"): Add 256.
> > >   (define_attr "prefixed"): Add 'always'.
> > >   (define_mode_attr bits): Add DD/TD modes.
> > >   (cfuged, cntlzdm, cnttzdm, pdepd, pextd, bswaphi2_reg,
> > > bswapsi2_reg,
> > >   bswapdi2_brd, setbc_signed_,
> > >   *setbcr_signed_, *setnbc_signed_,
> > >   *setnbcr_signed_): Update instruction attributes
> > > for
> > >   Power10.
> > 
> > ok.  (assuming the assorted 'integer' -> 'crypto' changes are
> > correct,
> > of course).  
> > 
> 
> Yes, crypto represents the correct pipe the insns are executed on.
> 
> Thanks for the review,
> Pat
> 
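
A small sketch of Pat's point that the change is no longer a plain inversion once a third value exists. The enum and function names are invented for illustration and are not the rs6000 definitions; only the three states and the two comparisons come from the exchange above.

```cpp
// Toy three-state attribute, mirroring PREFIXED_NO / PREFIXED_YES /
// PREFIXED_ALWAYS from the discussion.
enum toy_prefixed
{
  TOY_PREFIXED_NO,
  TOY_PREFIXED_YES,
  TOY_PREFIXED_ALWAYS
};

// Old test: anything that is not NO gets the 'p'.
constexpr bool
toy_needs_p_old (toy_prefixed v)
{
  return v != TOY_PREFIXED_NO;
}

// New test: only YES gets the 'p'.
constexpr bool
toy_needs_p_new (toy_prefixed v)
{
  return v == TOY_PREFIXED_YES;
}
```

The two tests agree for NO and YES and differ only for ALWAYS, which is the case the patch introduces.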



Re: [10/32] config

2020-11-06 Thread Jeff Law via Gcc-patches


On 11/4/20 12:24 PM, Nathan Sidwell wrote:
> I managed to flub sending this yesterday.
>
> This is the gcc/configure.ac changes (rebuild configure and
> config.h.in after applying).  Generally just checking for
> network-related functionality.  If it's not available, those features
> of the module mapper will be unavailable.

OK with a ChangeLog entry.


jeff

ps.  Yes, I know this is out of order WRT the others in the series. 
It's not being threaded with the main patch series, so it stood out in
my queue :-)





Re: [patch] Add dg-require-effective-target fpic to an aarch64 specific test in gcc.dg

2020-11-06 Thread Richard Sandiford via Gcc-patches
Olivier Hainque  writes:
>> On 4 Nov 2020, at 20:16, Richard Sandiford  wrote:
>> 
>> Olivier Hainque  writes:
>>> Hello,
>>> 
>>> This patch adds dg-require-effective-target fpic
>>> to an aarch64 specific gcc.dg test using -fPIC,
>>> which helps circumvent a failure we observed while
>>> testing the aarch64 port for VxWorks.
>>> 
>>> ok to commit ?
>> 
>> OK, thanks.  Also OK for any other current or future aarch64 test that
>> has -fpic or -fPIC in the options and forgets to do this.
>
> Great, thanks Richard!
>
> This echoes what you had actually already told me some
> months ago at
>
>   https://gcc.gnu.org/pipermail/gcc-patches/2019-December/536909.html
>
> which I didn't remember before sending this patch.
>
> For the avoidance of doubt, that would hold for arm
> tests as well, right ?

Yeah, it does.

Thanks,
Richard

> (I'll be careful not to introduce obvious redundancy with e.g.
> os=linux, as pointed out by Jakub earlier this week for another
> set on i386).
>
> Cheers,
>
> Olivier


Re: [patch] Add dg-require-effective-target fpic to an aarch64 specific test in gcc.dg

2020-11-06 Thread Olivier Hainque



> On 6 Nov 2020, at 19:03, Richard Sandiford  wrote:

>>> OK, thanks.  Also OK for any other current or future aarch64 test that
>>> has -fpic or -fPIC in the options and forgets to do this.

>> For the avoidance of doubt, that would hold for arm
>> tests as well, right ?
> 
> Yeah, it does.
> 
> Thanks,

Sure, thanks for confirming.

Cheers,

Olivier



Re: [10/32] config

2020-11-06 Thread Nathan Sidwell

On 11/6/20 12:56 PM, Jeff Law wrote:
>
> On 11/4/20 12:24 PM, Nathan Sidwell wrote:
>> I managed to flub sending this yesterday.
>>
>> This is the gcc/configure.ac changes (rebuild configure and
>> config.h.in after applying).  Generally just checking for
>> network-related functionality.  If it's not available, those features
>> of the module mapper will be unavailable.
>
> OK with a ChangeLog entry.


thanks.  I should have mentioned the lack of changelogs on the patches. 
I will of course write them when committing, along with rationale that 
may have already been written in the emails


nathan

--
Nathan Sidwell


[r11-4770 Regression] FAIL: gcc.dg/ipa/modref-2.c scan-ipa-dump modref "Parm 1 param offset:0 offset:0 size:-1 max_size:64" on Linux/x86_64

2020-11-06 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

6cef01c32817b3d08af2cadcdb0e23c72ceed426 is the first bad commit
commit 6cef01c32817b3d08af2cadcdb0e23c72ceed426
Author: Jan Hubicka 
Date:   Fri Nov 6 10:23:58 2020 +0100

Add fnspec handling to ipa mode of ipa-modef.

caused

FAIL: gcc.dg/ipa/modref-2.c scan-ipa-dump modref "Parm 1 param offset:0 
offset:0 size:-1 max_size:64"

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-4770/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/modref-2.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/modref-2.c 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PATCH] RX add control register PC

2020-11-06 Thread Jeff Law via Gcc-patches


On 8/26/20 4:24 AM, Darius Galis wrote:
> Hello,
> Thank you for adjusting the patch.
> I don't have commit privileges, so if you could please commit it, that would 
> be great.

Sorry it took so long.  I got pulled away for most of the last few months.


It's committed to the trunk.


jeff




[PATCH] c++: DR 1914 - Allow duplicate standard attributes.

2020-11-06 Thread Marek Polacek via Gcc-patches
Following Joseph's change for C to allow duplicate C2x standard attributes
,
this patch does a similar thing for C++.  This is DR 1914, to be resolved by
, which is not part of the standard yet, but has wide
support, so it looks like a shoo-in.  Some duplications still produce warnings;
I didn't change that because a warning might be desirable.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/c-family/ChangeLog:

DR 1914
* c-common.c (attribute_fallthrough_p): Update comment.

gcc/cp/ChangeLog:

DR 1914
* parser.c (cp_parser_check_std_attribute): Remove.
(cp_parser_std_attribute_list): Don't call it.

gcc/testsuite/ChangeLog:

DR 1914
* g++.dg/cpp0x/gen-attrs-60.C: Remove dg-error.
* g++.dg/cpp2a/nodiscard-once.C: Likewise.
* g++.dg/cpp1y/attr-deprecated-2.C: Likewise.
* g++.dg/cpp0x/gen-attrs-72.C: New test.
---
 gcc/c-family/c-common.c   |  3 +-
 gcc/cp/parser.c   | 27 
 gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C |  2 +-
 gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C | 42 +++
 .../g++.dg/cpp1y/attr-deprecated-2.C  |  2 +-
 gcc/testsuite/g++.dg/cpp2a/nodiscard-once.C   |  2 +-
 6 files changed, 47 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 232a4797c09..03d8fec9936 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5751,7 +5751,8 @@ attribute_fallthrough_p (tree attr)
   tree t = lookup_attribute ("fallthrough", attr);
   if (t == NULL_TREE)
 return false;
-  /* This attribute shall appear at most once in each attribute-list.  */
+  /* It is no longer true that "this attribute shall appear at most once in
+ each attribute-list", but we still give a warning.  */
   if (lookup_attribute ("fallthrough", TREE_CHAIN (t)))
 warning (OPT_Wattributes, "%<fallthrough%> attribute specified multiple "
 "times");
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 6e7b982f073..0567646fbe2 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -27269,32 +27269,6 @@ cp_parser_std_attribute (cp_parser *parser, tree 
attr_ns)
   return attribute;
 }
 
-/* Check that the attribute ATTRIBUTE appears at most once in the
-   attribute-list ATTRIBUTES.  This is enforced for noreturn (7.6.3),
-   nodiscard, and deprecated (7.6.5).  Note that
-   carries_dependency (7.6.4) isn't implemented yet in GCC.  */
-
-static void
-cp_parser_check_std_attribute (tree attributes, tree attribute)
-{
-  if (attributes)
-{
-  tree name = get_attribute_name (attribute);
-  if (is_attribute_p ("noreturn", name)
- && lookup_attribute ("noreturn", attributes))
-   error ("attribute %<noreturn%> can appear at most once "
-  "in an attribute-list");
-  else if (is_attribute_p ("deprecated", name)
-  && lookup_attribute ("deprecated", attributes))
-   error ("attribute %<deprecated%> can appear at most once "
-  "in an attribute-list");
-  else if (is_attribute_p ("nodiscard", name)
-  && lookup_attribute ("nodiscard", attributes))
-   error ("attribute %<nodiscard%> can appear at most once "
-  "in an attribute-list");
-}
-}
-
 /* Parse a list of standard C++-11 attributes.
 
attribute-list:
@@ -27317,7 +27291,6 @@ cp_parser_std_attribute_list (cp_parser *parser, tree 
attr_ns)
break;
   if (attribute != NULL_TREE)
{
- cp_parser_check_std_attribute (attributes, attribute);
  TREE_CHAIN (attribute) = attributes;
  attributes = attribute;
}
diff --git a/gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C 
b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C
index cb0c31ec63f..4e905c394ca 100644
--- a/gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C
+++ b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C
@@ -1,4 +1,4 @@
 // PR c++/60365
 // { dg-do compile { target c++11 } }
 
-void func [[noreturn, noreturn]] (); // { dg-error "at most once" }
+void func [[noreturn, noreturn]] ();
diff --git a/gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C 
b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C
new file mode 100644
index 000..0cb874b6191
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C
@@ -0,0 +1,42 @@
+// DR 1914 - Duplicate standard attributes 
+// { dg-do compile { target c++11 } }
+
+[[noreturn, noreturn]] void fn0();
+[[noreturn]] [[noreturn]] void fn1();
+[[deprecated, deprecated]] void fn2();
+[[deprecated]] [[deprecated]] void fn3();
+[[maybe_unused]] [[maybe_unused]] int fn4();
+[[maybe_unused, maybe_unused]] int fn5();
+[[nodiscard]] [[nodiscard]] int fn6();
+[[nodiscard, nodiscard]] int fn7();
+
+struct E { };
+struct A {
+  [[no_unique_address]] [[no_unique_address]] E e;
+};
+struct B {
+  [[no_unique_address, no_unique_address]] E e;
+};
+
+int
+f

Re: [PATCH] c++: DR 1914 - Allow duplicate standard attributes.

2020-11-06 Thread Jason Merrill via Gcc-patches

On 11/6/20 2:06 PM, Marek Polacek wrote:

Following Joseph's change for C to allow duplicate C2x standard attributes
,
this patch does a similar thing for C++.  This is DR 1914, to be resolved by
, which is not part of the standard yet, but has wide
support, so it looks like a shoo-in.  Some duplications still produce warnings;
I didn't change that because a warning might be desirable.


What's the rationale for warning about some and not others?


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/c-family/ChangeLog:

DR 1914
* c-common.c (attribute_fallthrough_p): Update comment.

gcc/cp/ChangeLog:

DR 1914
* parser.c (cp_parser_check_std_attribute): Remove.
(cp_parser_std_attribute_list): Don't call it.

gcc/testsuite/ChangeLog:

DR 1914
* g++.dg/cpp0x/gen-attrs-60.C: Remove dg-error.
* g++.dg/cpp2a/nodiscard-once.C: Likewise.
* g++.dg/cpp1y/attr-deprecated-2.C: Likewise.
* g++.dg/cpp0x/gen-attrs-72.C: New test.
---
  gcc/c-family/c-common.c   |  3 +-
  gcc/cp/parser.c   | 27 
  gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C |  2 +-
  gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C | 42 +++
  .../g++.dg/cpp1y/attr-deprecated-2.C  |  2 +-
  gcc/testsuite/g++.dg/cpp2a/nodiscard-once.C   |  2 +-
  6 files changed, 47 insertions(+), 31 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 232a4797c09..03d8fec9936 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5751,7 +5751,8 @@ attribute_fallthrough_p (tree attr)
tree t = lookup_attribute ("fallthrough", attr);
if (t == NULL_TREE)
  return false;
-  /* This attribute shall appear at most once in each attribute-list.  */
+  /* It is no longer true that "this attribute shall appear at most once in
+ each attribute-list", but we still give a warning.  */
if (lookup_attribute ("fallthrough", TREE_CHAIN (t)))
  warning (OPT_Wattributes, "%<fallthrough%> attribute specified multiple "
 "times");
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 6e7b982f073..0567646fbe2 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -27269,32 +27269,6 @@ cp_parser_std_attribute (cp_parser *parser, tree 
attr_ns)
return attribute;
  }
  
-/* Check that the attribute ATTRIBUTE appears at most once in the

-   attribute-list ATTRIBUTES.  This is enforced for noreturn (7.6.3),
-   nodiscard, and deprecated (7.6.5).  Note that
-   carries_dependency (7.6.4) isn't implemented yet in GCC.  */
-
-static void
-cp_parser_check_std_attribute (tree attributes, tree attribute)
-{
-  if (attributes)
-{
-  tree name = get_attribute_name (attribute);
-  if (is_attribute_p ("noreturn", name)
- && lookup_attribute ("noreturn", attributes))
-   error ("attribute %<noreturn%> can appear at most once "
-  "in an attribute-list");
-  else if (is_attribute_p ("deprecated", name)
-  && lookup_attribute ("deprecated", attributes))
-   error ("attribute %<deprecated%> can appear at most once "
-  "in an attribute-list");
-  else if (is_attribute_p ("nodiscard", name)
-  && lookup_attribute ("nodiscard", attributes))
-   error ("attribute %<nodiscard%> can appear at most once "
-  "in an attribute-list");
-}
-}
-
  /* Parse a list of standard C++-11 attributes.
  
 attribute-list:

@@ -27317,7 +27291,6 @@ cp_parser_std_attribute_list (cp_parser *parser, tree 
attr_ns)
break;
if (attribute != NULL_TREE)
{
- cp_parser_check_std_attribute (attributes, attribute);
  TREE_CHAIN (attribute) = attributes;
  attributes = attribute;
}
diff --git a/gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C 
b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C
index cb0c31ec63f..4e905c394ca 100644
--- a/gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C
+++ b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-60.C
@@ -1,4 +1,4 @@
  // PR c++/60365
  // { dg-do compile { target c++11 } }
  
-void func [[noreturn, noreturn]] (); // { dg-error "at most once" }

+void func [[noreturn, noreturn]] ();
diff --git a/gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C 
b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C
new file mode 100644
index 000..0cb874b6191
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-72.C
@@ -0,0 +1,42 @@
+// DR 1914 - Duplicate standard attributes
+// { dg-do compile { target c++11 } }
+
+[[noreturn, noreturn]] void fn0();
+[[noreturn]] [[noreturn]] void fn1();
+[[deprecated, deprecated]] void fn2();
+[[deprecated]] [[deprecated]] void fn3();
+[[maybe_unused]] [[maybe_unused]] int fn4();
+[[maybe_unused, maybe_unused]] int fn5();
+[[nodiscard]] [[nodiscard]] int fn6();
+[[nodiscard, nodiscard]] int fn7();
+
+struct E { };
+struct A {

Re: [PATCH] c++: Propagate attributes to clones in duplicate_decls [PR67453]

2020-11-06 Thread Jason Merrill via Gcc-patches

On 11/6/20 4:07 AM, Jakub Jelinek wrote:

Hi!

On the following testcase where the cdtor attributes aren't on the
in-class declaration but on an out-of-class definition, the cdtors
have their clones created from the in-class declaration, and later on
duplicate_decls updates attributes on the abstract cdtors, but nothing
propagates them to the clones.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?


OK.


2020-11-06  Jakub Jelinek  

PR c++/67453
* decl.c (duplicate_decls): Propagate DECL_ATTRIBUTES and
DECL_PRESERVE_P from olddecl to its clones if any.

* g++.dg/ext/attr-used-2.C: New test.

--- gcc/cp/decl.c.jj2020-11-03 21:42:00.536043737 +0100
+++ gcc/cp/decl.c   2020-11-05 17:33:40.064072970 +0100
@@ -2921,6 +2921,16 @@ duplicate_decls (tree newdecl, tree oldd
snode->remove ();
  }
  
+  if (TREE_CODE (olddecl) == FUNCTION_DECL)

+{
+  tree clone;
+  FOR_EACH_CLONE (clone, olddecl)
+   {
+ DECL_ATTRIBUTES (clone) = DECL_ATTRIBUTES (olddecl);
+ DECL_PRESERVE_P (clone) |= DECL_PRESERVE_P (olddecl);
+   }
+}
+
/* Remove the associated constraints for newdecl, if any, before
   reclaiming memory. */
if (flag_concepts)
--- gcc/testsuite/g++.dg/ext/attr-used-2.C.jj   2020-11-05 17:42:49.895949119 +0100
+++ gcc/testsuite/g++.dg/ext/attr-used-2.C  2020-11-05 17:42:07.934416482 +0100
@@ -0,0 +1,15 @@
+// PR c++/67453
+// { dg-do compile }
+// { dg-final { scan-assembler "_ZN1SC\[12]Ev" } }
+// { dg-final { scan-assembler "_ZN1SD\[12]Ev" } }
+// { dg-final { scan-assembler "_ZN1SC\[12]ERKS_" } }
+
+struct S {
+S();
+~S();
+S(const S&);
+};
+
+__attribute__((used)) inline S::S()  { }
+__attribute__((used)) inline S::~S() { }
+__attribute__((used)) inline S::S(const S&) { }

Jakub





Re: [PATCH] c++: Small tweak to can_convert_eh [PR81660]

2020-11-06 Thread Jason Merrill via Gcc-patches

On 11/6/20 1:15 AM, Marek Polacek wrote:

While messing with check_handlers_1, I spotted this bug report which
complains that we don't warn about the case when we have two duplicated
handlers of type int.  can_convert_eh implements [except.handle] and
that says: A handler is a match for an exception object of type E if
  - The handler is of type cv T or cv T& and E and T are the same type
(ignoring the top-level cv-qualifiers), or [...]

but we don't implement this bullet properly for non-class types.  The
fix therefore seems pretty obvious.  Also change the return type to
bool when we're only returning yes/no.
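
For illustration only, the matching rule quoted above can be approximated at compile time with type traits. This is a sketch, not the can_convert_eh implementation: handler_matches, Base and Derived are hypothetical names, std::is_base_of is used as a stand-in for "public, unambiguous base", and the pointer conversion rules are left out.

```cpp
#include <type_traits>

// Simplified model of [except.handle]: Handler is the declared catch
// type, E is the type of the thrown exception object.
template <typename Handler, typename E>
struct handler_matches
{
  // Strip the reference first, then the top-level cv-qualifiers.
  using T = typename std::remove_cv<
    typename std::remove_reference<Handler>::type>::type;
  using X = typename std::remove_cv<E>::type;

  // First bullet: same type, ignoring top-level cv-qualifiers.  This is
  // the case that was not handled for non-class types such as int.
  static constexpr bool same = std::is_same<T, X>::value;

  // Assumption: is_base_of approximates "public, unambiguous base class".
  static constexpr bool base = std::is_class<T>::value
    && std::is_class<X>::value && std::is_base_of<T, X>::value;

  static constexpr bool value = same || base;
};

// Example types for trying the trait out.
struct Base {};
struct Derived : Base {};
```

With this model, handlers of type int, const int, int & and const int & all match a thrown int, which is why each later handler in the new test is expected to warn.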


OK.


gcc/cp/ChangeLog:

PR c++/81660
* except.c (can_convert_eh): Change the return type to bool.  If
the type TO and FROM are the same, return true.

gcc/testsuite/ChangeLog:

PR c++/81660
* g++.dg/warn/Wexceptions3.C: New test.
* g++.dg/eh/pr42859.C: Add dg-warning.
* g++.dg/torture/pr81659.C: Likewise.
---
  gcc/cp/except.c  | 14 +++-
  gcc/testsuite/g++.dg/eh/pr42859.C|  2 +-
  gcc/testsuite/g++.dg/torture/pr81659.C   |  2 +-
  gcc/testsuite/g++.dg/warn/Wexceptions3.C | 29 
  4 files changed, 39 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wexceptions3.C

diff --git a/gcc/cp/except.c b/gcc/cp/except.c
index b72a28c1aa9..0f6c76b9892 100644
--- a/gcc/cp/except.c
+++ b/gcc/cp/except.c
@@ -41,7 +41,6 @@ static tree do_allocate_exception (tree);
  static tree wrap_cleanups_r (tree *, int *, void *);
  static int complete_ptr_ref_or_void_ptr_p (tree, tree);
  static bool is_admissible_throw_operand_or_catch_parameter (tree, bool);
-static int can_convert_eh (tree, tree);
  
  /* Sets up all the global eh stuff that needs to be initialized at the
 start of compilation.  */
@@ -932,31 +931,34 @@ nothrow_libfn_p (const_tree fn)
  /* Returns nonzero if an exception of type FROM will be caught by a
 handler for type TO, as per [except.handle].  */
  
-static int
+static bool
  can_convert_eh (tree to, tree from)
  {
to = non_reference (to);
from = non_reference (from);
  
+  if (same_type_ignoring_top_level_qualifiers_p (to, from))
+return true;
+
if (TYPE_PTR_P (to) && TYPE_PTR_P (from))
  {
to = TREE_TYPE (to);
from = TREE_TYPE (from);
  
if (! at_least_as_qualified_p (to, from))
-   return 0;
+   return false;
  
if (VOID_TYPE_P (to))
-   return 1;
+   return true;
  
/* Else fall through.  */
  }
  
if (CLASS_TYPE_P (to) && CLASS_TYPE_P (from)
&& publicly_uniquely_derived_p (to, from))
-return 1;
+return true;
  
-  return 0;
+  return false;
  }
  
  /* Check whether any of the handlers in I are shadowed by another handler

diff --git a/gcc/testsuite/g++.dg/eh/pr42859.C 
b/gcc/testsuite/g++.dg/eh/pr42859.C
index a9f1473bc85..0de91409c83 100644
--- a/gcc/testsuite/g++.dg/eh/pr42859.C
+++ b/gcc/testsuite/g++.dg/eh/pr42859.C
@@ -13,7 +13,7 @@ ptw32_terminate (void)
  catch (int)
  {
  }
-catch (int)
+catch (int) // { dg-warning "will be caught by earlier handler" }
  {
  }
}
diff --git a/gcc/testsuite/g++.dg/torture/pr81659.C 
b/gcc/testsuite/g++.dg/torture/pr81659.C
index 3696957532e..074099be6fc 100644
--- a/gcc/testsuite/g++.dg/torture/pr81659.C
+++ b/gcc/testsuite/g++.dg/torture/pr81659.C
@@ -12,7 +12,7 @@ a (int b)
catch (int)
  {
  }
-  catch (int)
+  catch (int) // { dg-warning "will be caught by earlier handler" }
  {
  }
  }
diff --git a/gcc/testsuite/g++.dg/warn/Wexceptions3.C 
b/gcc/testsuite/g++.dg/warn/Wexceptions3.C
new file mode 100644
index 000..97fda9dbd91
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wexceptions3.C
@@ -0,0 +1,29 @@
+// PR c++/81660
+
+void bar (int);
+
+void
+fn (int b)
+{
+  if (b)
+throw;
+  try
+{
+  bar (3);
+}
+  catch (int)
+{
+}
+  catch (int) // { dg-warning "will be caught by earlier handler" }
+{
+}
+  catch (const int) // { dg-warning "will be caught by earlier handler" }
+{
+}
+  catch (int &) // { dg-warning "will be caught by earlier handler" }
+{
+}
+  catch (const int &) // { dg-warning "will be caught by earlier handler" }
+{
+}
+}

base-commit: f72af3af8d526793e4927daf44ae0611c3d0cc85





[PATCH] Combine new calculated ranges with existing range.

2020-11-06 Thread Andrew MacLeod via Gcc-patches

This patch fixes PR 97737 and 97741.

Basically, when we re-calculate a range, intersect it with whatever we 
knew before so we don't lose any accumulated information.  PR97741 
details the specific case, but I was considering doing this anyway as it 
will allow us to retain any externally accumulated information that is 
known about an SSA_NAME which we would lose right now.


It wasn't really an issue before because the ranger never recalculated a 
range, but with the temporal cache allowing for follow-on calculations 
it becomes more important.
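
A minimal sketch of the idea, assuming a single closed interval rather than the multi-subrange irange the ranger really uses. Range, intersect and recalculated are hypothetical names for illustration only.

```cpp
// Single closed interval [lo, hi]; treated as empty when lo > hi.
struct Range
{
  long lo;
  long hi;
};

// Intersection of two intervals: the larger lower bound and the
// smaller upper bound.
constexpr Range intersect(Range a, Range b)
{
  return Range{ a.lo > b.lo ? a.lo : b.lo, a.hi < b.hi ? a.hi : b.hi };
}

constexpr bool is_empty(Range r)
{
  return r.lo > r.hi;
}

// Combine a freshly recalculated range with what was already known, so
// information accumulated earlier is not lost.
constexpr Range recalculated(Range known, Range fresh)
{
  return intersect(known, fresh);
}
```

For example, if [0, 10] was already known and a recalculation yields [5, 100], the combined result is [5, 10] rather than the wider new value.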


bootstrapped on x86_64-pc-linux-gnu, no regressions. pushed.

Andrew

commit 129e1a8a96d140150705fab30d25afb464eb1d99
Author: Andrew MacLeod 
Date:   Fri Nov 6 14:14:46 2020 -0500

Combine new calculated ranges with existing range.

When a range is recalculated, retain what was previously known as IL changes
can produce different results from un-executed code.   This also paves
the way for external injection of ranges.

gcc/
PR tree-optimization/97737
PR tree-optimization/97741
* gimple-range.cc: (gimple_ranger::range_of_stmt): Intersect newly
calculated ranges with the existing known global range.
gcc/testsuite/
* gcc.dg/pr97737.c: New.
* gcc.dg/pr97741.c: New.

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 0c8ec40448f..92a6335bec5 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -1026,8 +1026,14 @@ gimple_ranger::range_of_stmt (irange &r, gimple *s, tree name)
   if (m_cache.get_non_stale_global_range (r, name))
 return true;
 
-  // Otherwise calculate a new value and save it.
-  calc_stmt (r, s, name);
+  // Otherwise calculate a new value.
+  int_range_max tmp;
+  calc_stmt (tmp, s, name);
+
+  // Combine the new value with the old value.  This is required because
+  // the way value propagation works, when the IL changes on the fly we
+  // can sometimes get different results.  See PR 97741.
+  r.intersect (tmp);
   m_cache.set_global_range (name, r);
   return true;
 }
diff --git a/gcc/testsuite/gcc.dg/pr97737.c b/gcc/testsuite/gcc.dg/pr97737.c
new file mode 100644
index 000..eef1c353191
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr97737.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int a = 1, b, c;
+
+void d() {
+  int e = 1;
+L1:
+  b = e;
+L2:
+  e = e / a;
+  if (!(e || c || e - 1))
+goto L1;
+  if (!b)
+goto L2;
+}
diff --git a/gcc/testsuite/gcc.dg/pr97741.c b/gcc/testsuite/gcc.dg/pr97741.c
new file mode 100644
index 000..47115d31d4a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr97741.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -Wextra -fno-strict-aliasing -fwrapv -Os -fno-toplevel-reorder -fno-tree-ccp -fno-tree-fre" } */
+
+short a = 0;
+long b = 0;
+char c = 0;
+void d() {
+  int e = 0;
+f:
+  for (a = 6; a;)
+c = e;
+  e = 0;
+  for (; e == 20; ++e)
+for (; b;)
+  goto f;
+}
+int main() { return 0; }


Re: [PATCH] c++: DR 1914 - Allow duplicate standard attributes.

2020-11-06 Thread Marek Polacek via Gcc-patches
On Fri, Nov 06, 2020 at 02:23:10PM -0500, Jason Merrill via Gcc-patches wrote:
> On 11/6/20 2:06 PM, Marek Polacek wrote:
> > Following Joseph's change for C to allow duplicate C2x standard attributes
> > ,
> > this patch does a similar thing for C++.  This is DR 1914, to be resolved by
> > , which is not part of the standard yet, but has wide
> > support so looks like a shoo-in.  Some duplications still produce warnings;
> > I didn't change that because a warning might be desirable.
> 
> What's the rationale for warning about some and not others?

I don't have any.  Joseph's patch removed the error for a duplicated
'fallthrough' attribute, but the warning remained so I left it as-is
too.

So either we just downgrade the error to a warning, or remove the
remaining warnings too.  I think I slightly prefer the former; with perhaps
a small tweak not to warn when the duplicated attribute comes from a macro
expansion.

Marek



[pushed] Darwin: Darwin 20 is to be macOS 11 (Big Sur).

2020-11-06 Thread Iain Sandoe
Hi,

It looks like the next macOS release is imminent.

Tested across the Darwin patch, and on x86_64-linux-gnu,
pushed to master
thanks
Iain

--

As per Nigel Tufnel's assertion "... this one goes to 11".

The various parts of the code that deal with mapping Darwin versions
to macOS (X) versions need updating to deal with  a major version of
11.

So now we have, for example:

Darwin  4 => macOS (X) 10.0
…
Darwin 14 => macOS (X) 10.10
...
Darwin 19 => macOS (X) 10.15

Darwin 20 => macOS  11.0
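
The mapping listed above can be sketched as a small function. This is an illustration, not the driver code: MacOSVersion and darwin_to_macos are hypothetical names, the {0, 0} result for kernels older than Darwin 4 is an invented sentinel, and the handling of kernels newer than 20 simply follows the "major_vers - 20" arithmetic in the patch as posted.

```cpp
// Hypothetical result type: macOS major and minor version numbers.
struct MacOSVersion
{
  int os_major;
  int os_minor;
};

// Map a Darwin kernel major version to a macOS version.
//   Darwin 4..19 => macOS (X) 10.(N - 4)
//   Darwin 20+   => macOS 11.(N - 20), as in the patch as posted
// Anything older than Darwin 4 yields the {0, 0} sentinel (assumption).
constexpr MacOSVersion darwin_to_macos(int darwin_major)
{
  return darwin_major >= 20 ? MacOSVersion{11, darwin_major - 20}
       : darwin_major >= 4  ? MacOSVersion{10, darwin_major - 4}
       : MacOSVersion{0, 0};
}
```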

Because of the historical duplication of the "10" in macOSX 10.xx and
the number of tools that expect this, it is likely that system tools will
allow macos11.0 and/or macosx11.0 (despite that the latter makes little
sense).

Update the link test to cover Catalina (Darwin19/10.15) and
Big Sur (Darwin20/11.0).

gcc/ChangeLog:

* config/darwin-c.c: Allow for Darwin20 to correspond to macOS 11.
* config/darwin-driver.c: Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/darwin-minversion-link.c: Allow for Darwin19 (macOS 10.15)
and Darwin20 (macOS 11.0).
---
 gcc/config/darwin-c.c |  4 ++--
 gcc/config/darwin-driver.c| 21 ---
 gcc/testsuite/gcc.dg/darwin-minversion-link.c |  5 +++--
 3 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/gcc/config/darwin-c.c b/gcc/config/darwin-c.c
index e3b999e166b..9034f49908e 100644
--- a/gcc/config/darwin-c.c
+++ b/gcc/config/darwin-c.c
@@ -692,10 +692,10 @@ macosx_version_as_macro (void)
   if (!version_array)
 goto fail;
 
-  if (version_array[MAJOR] != 10)
+  if (version_array[MAJOR] < 10 || version_array[MAJOR] > 11)
 goto fail;
 
-  if (version_array[MINOR] < 10)
+  if (version_array[MAJOR] == 10 && version_array[MINOR] < 10)
 version_macro = version_as_legacy_macro (version_array);
   else
 version_macro = version_as_modern_macro (version_array);
diff --git a/gcc/config/darwin-driver.c b/gcc/config/darwin-driver.c
index 8fdd32e2858..8ae300057fd 100644
--- a/gcc/config/darwin-driver.c
+++ b/gcc/config/darwin-driver.c
@@ -65,7 +65,7 @@ validate_macosx_version_min (const char *version_str)
   major = strtoul (version_str, &end, 10);
   version_str = end + ((*end == '.') ? 1 : 0);
 
-  if (major != 10) /* So far .. all MacOS 10 ... */
+  if (major < 10 || major > 11 ) /* MacOS 10 and 11 are known. */
 return NULL;
 
   /* Version string components must be present and numeric.  */
@@ -104,7 +104,7 @@ validate_macosx_version_min (const char *version_str)
   if (need_rewrite)
 {
   char *new_version;
-  asprintf (&new_version, "10.%lu.%lu", minor, tiny);
+  asprintf (&new_version, "%2lu.%lu.%lu", major, minor, tiny);
   return new_version;
 }
 
@@ -115,6 +115,12 @@ validate_macosx_version_min (const char *version_str)
 #include 
 #include "xregex.h"
 
+/* Determine the version of the running OS.
+   We only look at the first two components (ignoring the patch one) and
+   report NN.MM.0 where NN is currently either 10 or 11 and MM is the OS
+   minor release number.
+   If we can't parse what the kernel gives us, warn the user, and do nothing.  
*/
+
 static char *
 darwin_find_version_from_kernel (void)
 {
@@ -125,8 +131,6 @@ darwin_find_version_from_kernel (void)
   char * version_p;
   char * new_flag;
 
-  /* Determine the version of the running OS.  If we can't, warn user,
- and do nothing.  */
   if (sysctl (osversion_name, ARRAY_SIZE (osversion_name), osversion,
  &osversion_len, NULL, 0) == -1)
 {
@@ -144,10 +148,11 @@ darwin_find_version_from_kernel (void)
 major_vers = major_vers * 10 + (*version_p++ - '0');
   if (*version_p++ != '.')
 goto parse_failed;
-  
-  /* The major kernel version number is 4 plus the second OS version
- component.  */
-  if (major_vers - 4 <= 4)
+
+  /* Darwin20 sees a transition to macOS 11.  */
+  if (major_vers >= 20)
+asprintf (&new_flag, "11.%02d.00", major_vers - 20);
+  else if (major_vers - 4 <= 4)
 /* On 10.4 and earlier, the old linker is used which does not
support three-component system versions.
FIXME: we should not assume this - a newer linker could be used.  */
diff --git a/gcc/testsuite/gcc.dg/darwin-minversion-link.c 
b/gcc/testsuite/gcc.dg/darwin-minversion-link.c
index 0a80048ba35..765fb799a91 100644
--- a/gcc/testsuite/gcc.dg/darwin-minversion-link.c
+++ b/gcc/testsuite/gcc.dg/darwin-minversion-link.c
@@ -13,8 +13,9 @@
 /* { dg-additional-options "-mmacosx-version-min=010.011.06 -DCHECK=101106" { 
target *-*-darwin15* } } */
 /* { dg-additional-options "-mmacosx-version-min=010.012.06 -DCHECK=101206" { 
target *-*-darwin16* } } */
 /* { dg-additional-options "-mmacosx-version-min=010.013.06 -DCHECK=101306" { 
target *-*-darwin17* } } */
-/* This next test covers 10.18 and (currently unreleased) 10.19 for now. */  
-/* { dg-additional-options "-mmacosx-version-min=010.014.05 -DCHECK=101405" { 
target *-*-darwin1[89]* } } */
+/* { d

[pushed] Objective-C/C++ (parsers) : Update @property attribute parsing.

2020-11-06 Thread Iain Sandoe
Hi

This is preparatory work for bringing at least one aspect of the
Objective-C implementation up to date.

Tested on a number of Darwin revisions, and x86_64-linux-gnu
pushed to master,
thanks
Iain

---

At present, we are missing parsing and checking for around
half of the property attributes in use.  The existing ad hoc scheme
for the parser's communication with the Objective C validation
is not suitable for extending to cover all the missing cases.

Additionally:

1/ We were declaring errors in two cases where the reference
   implementation only warns (or is silent).

   I've elected to warn for both of those cases (under Wattributes);
   it could be that we should implement Wobjc-xxx-property warning
   masks (TODO).

2/ We were emitting spurious complaints about missing property
   attributes when these were not being parsed because we gave
   up on the first syntax error.

3/ The quality of the diagnostic locations was poor (that's
   true for much of Objective-C, we will have to improve it as
   we modernise areas).

We continue to attempt to keep the code, warning and error output
similar (preferably identical output) between the C and C++ front
ends.

The interface to the Objective-C-specific parts of the parsing is
simplified to a vector of parsed (but not fully-checked) property
attributes, this will simplify the addition of new attributes.
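
To illustrate how a vector of parsed attributes might be checked, here is a sketch based on the "exclusion group" idea described in the c-objc.h comment in the patch. All names (Group, ParsedAttr, has_group_conflict) are hypothetical and much simpler than the real enums; the additional getter/setter rules mentioned in that comment are not modelled, and a repeated attribute is treated the same as two different members of one group.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical exclusion groups, loosely following the patch's enum.
enum class Group { Unknown = 0, Getter, Setter, ReadWrite, Assign, Atomic, Max };

// Hypothetical parsed (but not fully checked) property attribute.
struct ParsedAttr
{
  int kind;     // which attribute this is
  Group group;  // which exclusion group it belongs to
};

// Return true if two attributes in the list belong to the same
// exclusion group.  Attributes in Group::Unknown never conflict.
inline bool has_group_conflict(const std::vector<ParsedAttr> &attrs)
{
  bool seen[static_cast<std::size_t>(Group::Max)] = {};
  for (const ParsedAttr &a : attrs)
    {
      if (a.group == Group::Unknown)
        continue;
      const std::size_t idx = static_cast<std::size_t>(a.group);
      if (seen[idx])
        return true;
      seen[idx] = true;
    }
  return false;
}
```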

gcc/c-family/ChangeLog:

* c-objc.h (enum objc_property_attribute_group): New
(enum objc_property_attribute_kind): New.
(OBJC_PROPATTR_GROUP_MASK): New.
(struct property_attribute_info): Small class encapsulating
parser output from property attributes.
(objc_prop_attr_kind_for_rid): New
(objc_add_property_declaration): Simplify interface.
* stub-objc.c (enum rid): Dummy type.
(objc_add_property_declaration): Simplify interface.
(objc_prop_attr_kind_for_rid): New.

gcc/c/ChangeLog:

* c-parser.c (c_parser_objc_at_property_declaration):
Improve parsing fidelity. Associate better location info
with @property attributes.  Clean up the interface to
objc_add_property_declaration ().

gcc/cp/ChangeLog:

* parser.c (cp_parser_objc_at_property_declaration):
Improve parsing fidelity. Associate better location info
with @property attributes.  Clean up the interface to
objc_add_property_declaration ().

gcc/objc/ChangeLog:

* objc-act.c (objc_prop_attr_kind_for_rid): New.
(objc_add_property_declaration): Adjust to consume the
parser output using a vector of parsed attributes.

gcc/testsuite/ChangeLog:

* obj-c++.dg/property/at-property-1.mm: Adjust expected
diagnostics.
* obj-c++.dg/property/at-property-29.mm: Likewise.
* obj-c++.dg/property/at-property-4.mm: Likewise.
* obj-c++.dg/property/property-neg-2.mm: Likewise.
* objc.dg/property/at-property-1.m: Likewise.
* objc.dg/property/at-property-29.m: Likewise.
* objc.dg/property/at-property-4.m: Likewise.
* objc.dg/property/at-property-5.m: Likewise.
* objc.dg/property/property-neg-2.m: Likewise.
---
 gcc/c-family/c-objc.h |  65 +++-
 gcc/c-family/stub-objc.c  |  21 +-
 gcc/c/c-parser.c  | 280 ++---
 gcc/cp/parser.c   | 266 +---
 gcc/objc/objc-act.c   | 295 +++---
 .../obj-c++.dg/property/at-property-1.mm  |  12 +-
 .../obj-c++.dg/property/at-property-29.mm |   8 +-
 .../obj-c++.dg/property/at-property-4.mm  |  10 +-
 .../obj-c++.dg/property/property-neg-2.mm |   2 +-
 .../objc.dg/property/at-property-1.m  |  12 +-
 .../objc.dg/property/at-property-29.m |   7 +-
 .../objc.dg/property/at-property-4.m  |  10 +-
 .../objc.dg/property/at-property-5.m  |   2 +-
 .../objc.dg/property/property-neg-2.m |   2 +-
 14 files changed, 598 insertions(+), 394 deletions(-)

diff --git a/gcc/c-family/c-objc.h b/gcc/c-family/c-objc.h
index 4577e4f1c7f..a2ca112dcc6 100644
--- a/gcc/c-family/c-objc.h
+++ b/gcc/c-family/c-objc.h
@@ -28,6 +28,67 @@ enum GTY(()) objc_ivar_visibility_kind {
   OBJC_IVAR_VIS_PACKAGE   = 3
 };
 
+/* ObjC property attribute kinds.
+   These have two fields; a unique value (that identifies which attribute)
+   and a group key that indicates membership of an exclusion group.
+   Only one member may be present from an exclusion group in a given attribute
+   list.
+   getters and setters have additional rules, since they are excluded from
+   non-overlapping group sets.  */
+
+enum objc_property_attribute_group
+{
+  OBJC_PROPATTR_GROUP_UNKNOWN = 0,
+  OBJC_PROPATTR_GROUP_GETTER,
+  OBJC_PROPATTR_GROUP_SETTER,
+  OBJC_PROPATTR_GROUP_READWRITE,
+  OBJC_PROPATTR_GROUP_ASSIGN,
+  OBJC_PROPATTR_GROUP_ATOMIC,
+  OBJC_PROPATTR_GROUP_MAX
+};
+
+enum objc_property_attribute_ki

Re: [01/32] langhooks

2020-11-06 Thread Jeff Law via Gcc-patches


On 11/3/20 2:13 PM, Nathan Sidwell wrote:
> I needed a set of hooks interfacing the preprocessor to the language.
> They get called from pieces in c-family.
>
> preprocess_main_file:  we need to know when any forced headers have
> been parsed in order to deal with linemaps and macro visibility
>
> preprocess_options: A way for the language to adjust any preprocessor
> options and alter direct callbacks
>
> preprocess_undef: We need visibility of #undefs
>
> preprocess_deferred_macro: macros from header-units are instantiated
> lazily.  This is the hook for the preprocessor to get that done.
>
> preprocess_token: Even in -E processing, we need to observe the token
> stream in order to load up the macro tables of header units.
>
> c-family's c-lex.c, c-opts.c & c-ppoutput.c get to call these hooks in
> various cases

LGTM.

jeff




Re: [PATCH 1/4] c++: Fix ICE with variadic concepts and aliases [PR93907]

2020-11-06 Thread Jason Merrill via Gcc-patches

On 11/5/20 8:40 PM, Patrick Palka wrote:

This patch (naively) extends the PR93907 fix to also apply to variadic
concepts invoked with a type argument pack.  Without this, we ICE on
the below testcase (a variadic version of concepts-using2.C) in the same
manner as we used to on concepts-using2.C before r10-7133.

Patch series bootstrapped and regtested on x86_64-pc-linux-gnu,
and also tested against cmcstl2 and range-v3.

gcc/cp/ChangeLog:

PR c++/93907
* constraint.cc (tsubst_parameter_mapping): Also canonicalize
the type arguments of a TYPE_ARGUMENT_PACK.

gcc/testsuite/ChangeLog:

PR c++/93907
* g++.dg/cpp2a/concepts-using3.C: New test, based off of
concepts-using2.C.
---
  gcc/cp/constraint.cc | 10 
  gcc/testsuite/g++.dg/cpp2a/concepts-using3.C | 52 
  2 files changed, 62 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-using3.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index b6f6f0d02a5..c871a8ab86a 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2252,6 +2252,16 @@ tsubst_parameter_mapping (tree map, tree args, 
subst_info info)


Hmm, the

>   else if (ARGUMENT_PACK_P (arg))
> new_arg = tsubst_argument_pack (arg, args, complain, in_decl);

just above this seems redundant, since tsubst_template_arg handles packs 
just fine.  In fact, I wonder why tsubst_argument_pack is used 
specifically anywhere?  It seems to get some edge cases better than the 
code in tsubst, but the solution to that would seem to be replacing the 
code in tsubst with a call to tsubst_argument_pack; then we can remove 
all the other calls to the function.



  new_arg = tsubst_template_arg (arg, args, complain, in_decl);
  if (TYPE_P (new_arg))
new_arg = canonicalize_type_argument (new_arg, complain);
+ if (TREE_CODE (new_arg) == TYPE_ARGUMENT_PACK)
+   {
+ tree pack_args = ARGUMENT_PACK_ARGS (new_arg);
+ for (int i = 0; i < TREE_VEC_LENGTH (pack_args); i++)
+   {
+ tree& pack_arg = TREE_VEC_ELT (pack_args, i);
+ if (TYPE_P (pack_arg))
+   pack_arg = canonicalize_type_argument (pack_arg, complain);


Do we need the TYPE_P here, since we already know we're in a 
TYPE_ARGUMENT_PACK?  That is, can an element of a TYPE_ARGUMENT_PACK be 
an invalid argument to canonicalize_type_argument?


OTOH, I wonder if we need to canonicalize non-type arguments here as well?

I wonder if tsubst_template_arg should canonicalize rather than leave 
that up to the caller?  I suppose that could do a bit more work when the 
result is going to end up in convert_template_argument and get 
canonicalized again; I don't know if that would be significant.



}
if (new_arg == error_mark_node)
return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-using3.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-using3.C
new file mode 100644
index 000..2c8ad40d104
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-using3.C
@@ -0,0 +1,52 @@
+// PR c++/93907
+// { dg-options -std=gnu++20 }
+
+// This testcase is a variadic version of concepts-using2.C; the only
+// difference is that 'cd' and 'ce' are now variadic concepts.
+
+template  struct c {
+  static constexpr int d = a;
+  typedef c e;
+};
+template  struct f;
+template  using g = typename f::e;
+struct b;
+template  struct f { using e = b; };
+template  struct m { typedef g aj; };
+template  struct n { typedef typename m::aj e; };
+template  using an = typename n::e;
+template  constexpr bool ao = c::d;
+template  constexpr bool i = c<1>::d;
+template  concept bb = i;
+#ifdef __SIZEOF_INT128__
+using cc = __int128;
+#else
+using cc = long long;
+#endif
+template  concept cd = bb;
+template  concept ce = requires { requires cd; };
+template  concept h = ce;
+template  concept l = h;
+template  concept cl = ao;
+template  concept cp = requires(b j) {
+  requires h>;
+};
+struct o {
+  template  requires cp auto operator()(b) {}
+};
+template  using cm = decltype(o{}(b()));
+template  concept ct = l;
+template  concept dd = ct>;
+template  concept de = dd;
+struct {
+  template  void operator()(da, b);
+} di;
+struct p {
+  void begin();
+};
+template  using df = p;
+template  void q() {
+  df k;
+  int d;
+  di(k, d);
+}





[pushed] Objective-C/C++ : Allow visibility prefix attributes on interfaces.

2020-11-06 Thread Iain Sandoe
Hi,

Some system headers apply visibility attributes to Objective-C
@interface declarations.  Those are “default”, but still need to be
accepted.

tested across the Darwin patch and on x86_64-linux-gnu,
pushed to master,
thanks
Iain

-

This passes visibility through without warning (so that, for example,
__attribute__((__visibility("default"))) does not result in any
diagnostic).

gcc/objc/ChangeLog:

* objc-act.c (start_class): Accept visibility attributes
without warning.
---
 gcc/objc/objc-act.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/objc/objc-act.c b/gcc/objc/objc-act.c
index 26cdeddfc5a..68d829fd773 100644
--- a/gcc/objc/objc-act.c
+++ b/gcc/objc/objc-act.c
@@ -7013,12 +7013,14 @@ start_class (enum tree_code code, tree class_name, tree 
super_name,
  tree name = TREE_PURPOSE (attribute);
 
  /* TODO: Document what the objc_exception attribute is/does.  */
- /* We handle the 'deprecated' and (undocumented) 'objc_exception'
-attributes.  */
+ /* We handle the 'deprecated', 'visibility' and (undocumented)
+'objc_exception' attributes.  */
  if (is_attribute_p  ("deprecated", name))
TREE_DEPRECATED (klass) = 1;
  else if (is_attribute_p  ("objc_exception", name))
CLASS_HAS_EXCEPTION_ATTR (klass) = 1;
+ else if (is_attribute_p  ("visibility", name))
+   ;
  else
/* Warn about and ignore all others for now, but store them.  */
warning (OPT_Wattributes, "%qE attribute directive ignored", 
name);
-- 
2.24.1




Re: [PATCH] c++: DR 1914 - Allow duplicate standard attributes.

2020-11-06 Thread Jason Merrill via Gcc-patches

On 11/6/20 2:34 PM, Marek Polacek wrote:

On Fri, Nov 06, 2020 at 02:23:10PM -0500, Jason Merrill via Gcc-patches wrote:

On 11/6/20 2:06 PM, Marek Polacek wrote:

Following Joseph's change for C to allow duplicate C2x standard attributes
,
this patch does a similar thing for C++.  This is DR 1914, to be resolved by
, which is not part of the standard yet, but has wide
support so looks like a shoo-in.  Some duplications still produce warnings;
I didn't change that because a warning might be desirable.


What's the rationale for warning about some and not others?


I don't have any.  Joseph's patch removed the error for a duplicated
'fallthrough' attribute, but the warning remained so I left it as-is
too.

So either we just downgrade the error to a warning, or remove the
remaining warnings too.  I think I slightly prefer the former; with perhaps
a small tweak not to warn when the duplicated attribute comes from a macro
expansion.


Sounds good.

Jason



Re: [02/32] linemaps

2020-11-06 Thread Jeff Law via Gcc-patches


On 11/3/20 2:13 PM, Nathan Sidwell wrote:
> Location handling needs to add 2 things
>
> 1) a new kind of inclusion -- namely a module.  We add LC_MODULE as a
> map kind,
>
> 2) the ability to allocate blocks of line-maps for both ordinary
> locations and macro locations, that are then filled in by the module
> loader.

LGTM, but give David M a little more time to chime in (he's been on PTO
for at least part of this week).


Say perhaps until Tuesday?


jeff




Re: [03/32] cpp-callbacks

2020-11-06 Thread Jeff Law via Gcc-patches


On 11/3/20 2:13 PM, Nathan Sidwell wrote:
> Here are the callbacks in the preprocessor itself.
>
> a) one to handle deferred macros
>
> b) one to handle include translation.  For every '#include ',
> there is the possibility of replacing that with 'import '. 
> This hook determines if that happens.

OK

jeff




[PATCH] openmp: Retire nest-var ICV

2020-11-06 Thread Kwok Cheung Yeung

Hello

In addition to deprecating the omp_(get|set)_nested() functions and OMP_NESTED 
environment variable, OpenMP 5.0 also removes the nest-var ICV altogether, 
defining it in terms of the max-active-levels-var ICV instead.


This patch removes the ICV, and implements the handling of 
omp_(get|set)_nested() in terms of max-active-levels-var as defined in the spec.
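
As a sketch of what "in terms of max-active-levels-var" means, the two routines can be modelled as pure functions on the ICV value. This is not the libgomp code: the function names are hypothetical and supported_active_levels is a placeholder value standing in for whatever the implementation supports.

```cpp
// Placeholder for the number of active levels the runtime supports
// (hypothetical value, chosen only for the example).
constexpr unsigned long supported_active_levels = 255;

// Model of omp_set_nested: returns the new max-active-levels-var.
// Enabling nesting raises the ICV to the supported maximum; disabling
// it clamps the ICV to 1 if it was greater than 1.
constexpr unsigned long after_set_nested(unsigned long max_active_levels,
                                         bool enable)
{
  return enable ? supported_active_levels
       : max_active_levels > 1 ? 1UL
       : max_active_levels;
}

// Model of omp_get_nested: nesting counts as enabled when more than
// one active level is allowed.
constexpr bool nested_enabled(unsigned long max_active_levels)
{
  return max_active_levels > 1;
}
```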


The initial value of max-active-levels-var now depends on the number of items in 
OMP_NUM_THREADS and OMP_PROC_BIND as defined in section 2.5.2 of the spec. 
OMP_NESTED now changes the value of max-active-levels-var. If OMP_NESTED is 
false and OMP_MAX_ACTIVE_LEVELS is > 1, I have opted to use the value specified 
by OMP_MAX_ACTIVE_LEVELS, as OMP_NESTED is deprecated in OpenMP 5.0 (the spec 
says this is implementation defined in section 6.9).


The default value of max-active-levels-var is implementation defined (section 
2.5.2). It was previously set to the maximum supported number, but I think it 
should be 1 now, since OMP_NESTED defaults to false on OpenMP 4.5, and this 
replicates that behaviour.
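
One way to picture the resulting initial value is the sketch below. It is only a model of the behaviour described in this mail, not the initialize_env code: Env and initial_max_active_levels are hypothetical, supported_active_levels is a placeholder value, and the precedence between the variables (an explicit OMP_MAX_ACTIVE_LEVELS wins, then OMP_NESTED, then the list lengths) is an assumption where the mail does not spell it out.

```cpp
// Placeholder for the number of active levels the runtime supports.
constexpr unsigned long supported_active_levels = 255;

// Hypothetical summary of the relevant environment settings.
struct Env
{
  bool has_max_active_levels;        // OMP_MAX_ACTIVE_LEVELS was set
  unsigned long max_active_levels;   // its value, if set
  bool has_nested;                   // OMP_NESTED was set
  bool nested;                       // its value, if set
  unsigned num_threads_items;        // items in the OMP_NUM_THREADS list
  unsigned proc_bind_items;          // items in the OMP_PROC_BIND list
};

// Initial max-active-levels-var under the assumed precedence.
constexpr unsigned long initial_max_active_levels(Env e)
{
  return e.has_max_active_levels ? e.max_active_levels
       : e.has_nested ? (e.nested ? supported_active_levels : 1UL)
       : (e.num_threads_items > 1 || e.proc_bind_items > 1)
           ? supported_active_levels
       : 1UL;  // new default proposed in this patch
}
```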


This change regresses the testcase libgomp.c/target-5.c because nested-var is 
per data environment, while max-active-levels-var is per-device. The change in 
semantics causes the test for the nesting setting to fail, because now any 
changes to the nesting setting apply to the whole device, and not just to the 
current data environment. I just deleted this part of the testing as the test 
looks like it is testing per data environment ICVs.


Bootstrapped on x86_64 with no offloading, and libgomp testing carried out with 
nvptx offloading with no regressions.


Okay for trunk?

Thanks

Kwok
commit aad8afea37b33b4d5836b2b64be8f4dab6d74509
Author: Kwok Cheung Yeung 
Date:   Wed Nov 4 15:34:12 2020 -0800

openmp: Retire nest-var ICV for OpenMP 5.0

This removes the nest-var ICV, expressing nesting in terms of the
max-active-levels-var ICV instead.

2020-11-06  Kwok Cheung Yeung  

libgomp/
* env.c (gomp_global_icv): Remove nest_var field.
(gomp_max_active_levels_var): Initialize to 1.
(parse_boolean): Return true on success.
(handle_omp_display_env): Express OMP_NESTED in terms of
gomp_max_active_levels_var.
(initialize_env): Set gomp_max_active_levels_var from
OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and
OMP_PROC_BIND.
* icv.c (omp_set_nested): Express in terms of
gomp_max_active_levels_var.
(omp_get_nested): Likewise.
* libgomp.h (struct gomp_task_icv): Remove nest_var field.
* parallel.c (gomp_resolve_num_threads): Replace reference
to nest_var with gomp_max_active_levels_var.
* testsuite/libgomp.c/target-5.c: Remove additional options.
(main): Remove references to omp_get_nested and omp_set_nested.

diff --git a/libgomp/env.c b/libgomp/env.c
index ab22525..75d0fe2 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -68,12 +68,11 @@ struct gomp_task_icv gomp_global_icv = {
   .run_sched_chunk_size = 1,
   .default_device_var = 0,
   .dyn_var = false,
-  .nest_var = false,
   .bind_var = omp_proc_bind_false,
   .target_data = NULL
 };
 
-unsigned long gomp_max_active_levels_var = gomp_supported_active_levels;
+unsigned long gomp_max_active_levels_var = 1;
 bool gomp_cancel_var = false;
 enum gomp_target_offload_t gomp_target_offload_var
   = GOMP_TARGET_OFFLOAD_DEFAULT;
@@ -959,16 +958,17 @@ parse_spincount (const char *name, unsigned long long 
*pvalue)
 }
 
 /* Parse a boolean value for environment variable NAME and store the
-   result in VALUE.  */
+   result in VALUE.  Return true if one was present and it was
+   successfully parsed.  */
 
-static void
+static bool
 parse_boolean (const char *name, bool *value)
 {
   const char *env;
 
   env = getenv (name);
   if (env == NULL)
-return;
+return false;
 
   while (isspace ((unsigned char) *env))
 ++env;
@@ -987,7 +987,11 @@ parse_boolean (const char *name, bool *value)
   while (isspace ((unsigned char) *env))
 ++env;
   if (*env != '\0')
-gomp_error ("Invalid value for environment variable %s", name);
+{
+  gomp_error ("Invalid value for environment variable %s", name);
+  return false;
+}
+  return true;
 }
 
 /* Parse the OMP_WAIT_POLICY environment variable and return the value.  */
@@ -1252,7 +1256,7 @@ handle_omp_display_env (unsigned long stacksize, int 
wait_policy)
   fprintf (stderr, "  OMP_DYNAMIC = '%s'\n",
   gomp_global_icv.dyn_var ? "TRUE" : "FALSE");
   fprintf (stderr, "  OMP_NESTED = '%s'\n",
-  gomp_global_icv.nest_var ? "TRUE" : "FALSE");
+  gomp_max_active_levels_var > 1 ? "TRUE" : "FALSE");
 
   fprintf (stderr, "  OMP_NUM_THREADS = '%lu", gomp_global_icv.nthreads_var);
   for (i = 1; i < gomp_nthreads_var_list_len; i++)
@@ -1417,16 +1421,11 @@ initialize_env (void)
 
   parse_schedule ();
   parse_boolean ("OMP_DYNAMIC", &gomp_global_ic

Re: [committed 1/2] libstdc++: Export basic_stringbuf constructor [PR 97729]

2020-11-06 Thread Jonathan Wakely via Gcc-patches

On 06/11/20 11:56 +0100, Rainer Orth wrote:

Hi Jonathan,


libstdc++-v3/ChangeLog:

PR libstdc++/97729
* config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Add exports.
* src/c++20/sstream-inst.cc (basic_stringbuf): Instantiate
private constructor taking __xfer_bufptrs.

Tested powerpc64le-linux. Committed to trunk.


unfortunately, this broke Solaris bootstrap again:

ld: fatal: libstdc++-symbols.ver-sun: 7314: symbol '_ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEEC1EOS4_RKS3_ONS4_14__xfer_bufptrsE': symbol version conflict
ld: fatal: libstdc++-symbols.ver-sun: 7315: symbol '_ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEEC2EOS4_RKS3_ONS4_14__xfer_bufptrsE': symbol version conflict
ld: fatal: libstdc++-symbols.ver-sun: 7316: symbol '_ZNSt7__cxx1115basic_stringbufIwSt11char_traitsIwESaIwEEC1EOS4_RKS3_ONS4_14__xfer_bufptrsE': symbol version conflict
ld: fatal: libstdc++-symbols.ver-sun: 7317: symbol '_ZNSt7__cxx1115basic_stringbufIwSt11char_traitsIwESaIwEEC2EOS4_RKS3_ONS4_14__xfer_bufptrsE': symbol version conflict

Those are matched by both

   
   ##_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]EOS4_RKS3_ONS4_14__xfer_bufptrsE (glob)

but also by the previous

   ##_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]*__xfer_bufptrs* (glob)

I do have a hacky patch to avoid this, but I guess I'd best leave it to
you how best to tighten the previous pattern.


It should be fixed at 887515acd27e49c176395ab76d5826959d89cb9b which
is the attached patch. Only tested on x86_64-linux, but my script no
longer shows the conflicts.

I'll try to incorporate that script into the testsuite for gcc-11, or
rewrite it as part of testsuite/util/testsuite_abi.cc.


commit 887515acd27e49c176395ab76d5826959d89cb9b
Author: Jonathan Wakely 
Date:   Fri Nov 6 19:53:36 2020

libstdc++: Fix symbol version conflict in linker script

The change in r11-4748-50b840ac5e1d6534e345c3fee9a97ae45ced6bc7 causes
a build error on Solaris, due to the new explicit instantiation matching
patterns for two different symbol versions.

libstdc++-v3/ChangeLog:

* config/abi/pre/gnu.ver (GLIBCXX_3.4.21): Tighten up patterns
for basic_stringbuf that refer to __xfer_bufptrs.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index ed68ffa28723..2d0f87aa7cc7 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -1774,7 +1774,8 @@ GLIBCXX_3.4.21 {
 _ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]ERKNS_12basic_stringI[cw]S2_S3_EESt13_Ios_Openmode;
 _ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]ESt13_Ios_Openmode;
 _ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EED[012]Ev;
-_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]*__xfer_bufptrs*;
+_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]EOS4_ONS4_14__xfer_bufptrsE;
+_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]14__xfer_bufptrs[CD][12]*;
 _ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE[a1346789]*;
 #   _ZNSt7__cxx1118basic_stringstreamI[cw]St11char_traitsI[cw]*;
 _ZNSt7__cxx1118basic_stringstreamI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]EOS4_;

