Re: [PR77691] x86-vxworks malloc aligns to 8 bytes like solaris

2020-05-13 Thread Alexandre Oliva
Hello, Jonathan,

On May  9, 2020, Jonathan Wakely  wrote:

> On 08/05/20 17:22 -0300, Alexandre Oliva wrote:

>> (Couldn't r1->allocate(2, alignof(char)) possibly return a pointer
>> that's *not* aligned?  Maybe we should drop the test even
>> if !defined(BAD_MAX_ALIGN_T).)

> Yes.

> Different malloc implementations interpret the C standard differently
> here. One interpretation is that all allocations must be aligned to
> alignof(max_align_t) but another is that allocations smaller than that
> don't need to meet that requirement. An object that is two bytes in
> size cannot require 16-byte alignment (otherwise its sizeof would be
> 16 too).

I understand you're talking about malloc because that's what our
implementation ultimately uses, but my question was on language
lawyering, on whether C++ would mandate more alignment than requested by
the caller of allocate.  If it were to do so, I wonder what the point of
specifying the alignment explicitly would be.
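
To make the size/alignment argument above concrete, here is a minimal C++
sketch.  The names is_aligned and Two are illustrative only (they are not
taken from the testsuite); the point is just that an object's size is a
multiple of its alignment, so a two-byte object cannot demand 16-byte
alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the testsuite's aligned()): true if p is a
// multiple of `alignment`, which must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// A two-byte object.  sizeof is always a multiple of alignof, so the
// alignment requirement of this type can be at most 2.
struct Two { char bytes[2]; };
static_assert(sizeof(Two) == 2, "two-byte object");
static_assert(sizeof(Two) % alignof(Two) == 0,
              "sizeof is a multiple of alignof");
static_assert(alignof(Two) <= sizeof(Two),
              "alignment cannot exceed size");
```

A pointer that is only 2-byte aligned is therefore still suitably aligned
for any object that fits in an allocate(2, alignof(char)) request.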

> Please do remove that line of the test, instead of wrapping it in the
> #ifdef.

> OK for master.

Thanks, here's what I'm installing in master.


x86-vxworks malloc aligns to 8 bytes like solaris

From: Alexandre Oliva 

VxWorks 7's malloc, like Solaris', only ensures 8-byte alignment of
returned pointers on 32-bit x86, though GCC's stddef.h defines
max_align_t with 16-byte alignment for __float128.  This patch enables
on x86-vxworks the same memory_resource workaround used for x86-solaris.

The testsuite also had a workaround, defining BAD_MAX_ALIGN_T and
xfailing the test; extend those to x86-vxworks as well, and remove the
check for char-aligned requested allocation to be aligned like
max_align_t.  With that change, the test passes on x86-vxworks; I'm
guessing that's the same reason for the test not to pass on
x86-solaris (and on x86_64-solaris -m32), so with the fix, I'm
tentatively removing the xfail.
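
For readers unfamiliar with the problem: when malloc guarantees less
alignment than alignof(max_align_t), the caller has to over-allocate and
align by hand.  The following is only a rough sketch of that general
technique under stated assumptions; alloc_aligned and free_aligned are
hypothetical names, and this is not the code used by
__resource_adaptor_imp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: return storage aligned to `alignment` (a power of
// two) without assuming anything about malloc's own alignment guarantee.
inline void* alloc_aligned(std::size_t bytes, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  // Room for the payload, worst-case padding and the saved raw pointer.
  // (Overflow of this sum is not checked in this sketch.)
  void* raw = std::malloc(bytes + alignment + sizeof(void*));
  if (!raw)
    return nullptr;
  std::uintptr_t start
    = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
  std::uintptr_t aligned
    = (start + alignment - 1) & ~(std::uintptr_t(alignment) - 1);
  // Remember the pointer malloc returned just below the aligned block.
  // memcpy avoids any alignment assumption for the stored pointer.
  std::memcpy(reinterpret_cast<char*>(aligned) - sizeof(void*),
              &raw, sizeof raw);
  return reinterpret_cast<void*>(aligned);
}

// Hypothetical counterpart: recover the raw pointer and free it.
inline void free_aligned(void* p)
{
  if (!p)
    return;
  void* raw;
  std::memcpy(&raw, static_cast<char*>(p) - sizeof(void*), sizeof raw);
  std::free(raw);
}
```

The cost is a few extra bytes per allocation, which is why a fast path
that trusts malloc is only skipped on targets where that trust is known
to be misplaced.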


for libstdc++-v3/ChangeLog

PR libstdc++/77691
* include/experimental/memory_resource
(__resource_adaptor_imp::do_allocate): Handle max_align_t on
x86-vxworks as on x86-solaris.
(__resource_adaptor_imp::do_deallocate): Likewise.
* testsuite/experimental/memory_resource/new_delete_resource.cc:
Drop xfail.
(BAD_MAX_ALIGN_T): Define on x86-vxworks as on x86-solaris.
(test03): Drop max-align test for char-aligned alloc.
---
 libstdc++-v3/include/experimental/memory_resource  |4 ++--
 .../memory_resource/new_delete_resource.cc |4 +---
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/include/experimental/memory_resource b/libstdc++-v3/include/experimental/memory_resource
index 850a78d..1c4de70 100644
--- a/libstdc++-v3/include/experimental/memory_resource
+++ b/libstdc++-v3/include/experimental/memory_resource
@@ -413,7 +413,7 @@ namespace pmr {
   do_allocate(size_t __bytes, size_t __alignment) override
   {
// Cannot use max_align_t on 32-bit Solaris x86, see PR libstdc++/77691
-#if ! (defined __sun__ && defined __i386__)
+#if ! ((defined __sun__ || defined __VXWORKS__) && defined __i386__)
if (__alignment == alignof(max_align_t))
  return _M_allocate(__bytes);
 #endif
@@ -439,7 +439,7 @@ namespace pmr {
   do_deallocate(void* __ptr, size_t __bytes, size_t __alignment) noexcept
   override
   {
-#if ! (defined __sun__ && defined __i386__)
+#if ! ((defined __sun__ || defined __VXWORKS__) && defined __i386__)
if (__alignment == alignof(max_align_t))
  return (void) _M_deallocate(__ptr, __bytes);
 #endif
diff --git a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
index 8a98954..65a42da 100644
--- a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
+++ b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
@@ -17,13 +17,12 @@
 
 // { dg-do run { target c++14 } }
 // { dg-require-cstdint "" }
-// { dg-xfail-run-if "PR libstdc++/77691" { { i?86-*-solaris2.* x86_64-*-solaris2.* } && ilp32 } }
 
 #include 
 #include 
 #include 
 
-#if defined __sun__ && defined __i386__
+#if (defined __sun__ || defined __VXWORKS__) && defined __i386__
 // See PR libstdc++/77691
 # define BAD_MAX_ALIGN_T 1
 #endif
@@ -128,7 +127,6 @@ test03()
 
   p = r1->allocate(2, alignof(char));
   VERIFY( bytes_allocated == 2 );
-  VERIFY( aligned(p) );
   r1->deallocate(p, 2, alignof(char));
   VERIFY( bytes_allocated == 0 );
 


-- 
Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically


Re: testsuite: Fix up gcc.dg/asan/pr95051.c testcase [PR95051]

2020-05-13 Thread Martin Liška

On 5/12/20 9:23 PM, Jakub Jelinek wrote:

Hi!

On Tue, May 12, 2020 at 12:06:25PM -0700, H.J. Lu wrote:

Excess errors:
cc1: error: '-fsanitize=address' is incompatible with '-fsanitize=kernel-address'


asan.exp adds -fsanitize=address which is incompatible with
-fsanitize=kernel-address, so we need to disable it first.


Sorry for the breakage, I added this test late, after I had run the tests.
There's a simplification: the test-case also fails with just
-fsanitize=address -O2.

Martin



Tested on x86_64-linux -m32/-m64, committed to trunk as obvious.

2020-05-12  Jakub Jelinek  

PR sanitizer/95051
* gcc.dg/asan/pr95051.c: Add -fno-sanitize=all to dg-options.

--- gcc/testsuite/gcc.dg/asan/pr95051.c.jj  2020-05-12 11:25:46.209148953 +0200
+++ gcc/testsuite/gcc.dg/asan/pr95051.c 2020-05-12 21:12:28.170118274 +0200
@@ -1,6 +1,6 @@
  /* PR sanitizer/95051 */
  /* { dg-do compile } */
-/* { dg-options "-fsanitize=kernel-address --param=asan-stack=1 -O2" } */
+/* { dg-options "-fno-sanitize=all -fsanitize=kernel-address --param=asan-stack=1 -O2" } */
  
  struct a {

struct {


Jakub



From 678e6b5c127307f29f0ce883497d4e7af399399c Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 12 May 2020 22:02:24 +0200
Subject: [PATCH] Simplify test-case options.

gcc/testsuite/ChangeLog:

2020-05-12  Martin Liska  

	PR sanitizer/95051
	* gcc.dg/asan/pr95051.c: Simplify options as -fsanitize=address
	and -O2 were enough to trigger the original ICE.
---
 gcc/testsuite/gcc.dg/asan/pr95051.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/asan/pr95051.c b/gcc/testsuite/gcc.dg/asan/pr95051.c
index caa5e667209..e32c04599a8 100644
--- a/gcc/testsuite/gcc.dg/asan/pr95051.c
+++ b/gcc/testsuite/gcc.dg/asan/pr95051.c
@@ -1,6 +1,6 @@
 /* PR sanitizer/95051 */
 /* { dg-do compile } */
-/* { dg-options "-fno-sanitize=all -fsanitize=kernel-address --param=asan-stack=1 -O2" } */
+/* { dg-options "-O2" } */
 
 struct a {
   struct {
-- 
2.26.2



[PATCH] Fold single imm use of a FMA if it is a negation [PR95060]

2020-05-13 Thread Jakub Jelinek via Gcc-patches
Hi!

match.pd already has simplifications for negation of a FMA (FMS, FNMA, FNMS)
call if it is single use, but when the widening_mul pass discovers FMAs,
nothing folds the statements anymore.

So, the following patch adjusts the widening_mul pass to handle that.

I had to adjust quite a lot of tests, because they have in them nested FMAs
(one FMA feeding another one) and the patch results in some (equivalent) changes
in the chosen instructions, previously the negation of one FMA's result
would result in the dependent FMA being adjusted for the negation, but now
instead the first FMA is adjusted.
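
As a small illustration of the identity behind this fold (a sketch using
std::fma from <cmath>, not GCC's internal functions; the function names
are made up for the example): negating the result of an FMA gives the
same value as an FNMS-style operation, i.e. (-a)*b - c.

```cpp
#include <cassert>
#include <cmath>

// Negation applied to the result of an FMA: -(a*b + c).
inline double neg_of_fma(double a, double b, double c)
{
  return -std::fma(a, b, c);
}

// The folded form, corresponding to FNMS: (-a)*b - c, computed with a
// single fused operation and no separate negation afterwards.
inline double folded_fnms(double a, double b, double c)
{
  return std::fma(-a, b, -c);
}
```

With nested FMAs the negation can be absorbed either by the FMA that
produces the value or by the one that consumes it, which is why the
expected instruction counts in the adjusted tests shift between variants.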

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-05-13  Jakub Jelinek  

PR tree-optimization/95060
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Fold a NEGATE_EXPR
if it is the single use of the FMA internal builtin.

* gcc.target/i386/avx512f-pr95060.c: New test.
* gcc.target/i386/fma_double_1.c: Adjust expected insn counts.
* gcc.target/i386/fma_double_2.c: Likewise.
* gcc.target/i386/fma_double_3.c: Likewise.
* gcc.target/i386/fma_double_4.c: Likewise.
* gcc.target/i386/fma_double_5.c: Likewise.
* gcc.target/i386/fma_double_6.c: Likewise.
* gcc.target/i386/fma_float_1.c: Likewise.
* gcc.target/i386/fma_float_2.c: Likewise.
* gcc.target/i386/fma_float_3.c: Likewise.
* gcc.target/i386/fma_float_4.c: Likewise.
* gcc.target/i386/fma_float_5.c: Likewise.
* gcc.target/i386/fma_float_6.c: Likewise.
* gcc.target/i386/l_fma_double_1.c: Likewise.
* gcc.target/i386/l_fma_double_2.c: Likewise.
* gcc.target/i386/l_fma_double_3.c: Likewise.
* gcc.target/i386/l_fma_double_4.c: Likewise.
* gcc.target/i386/l_fma_double_5.c: Likewise.
* gcc.target/i386/l_fma_double_6.c: Likewise.
* gcc.target/i386/l_fma_float_1.c: Likewise.
* gcc.target/i386/l_fma_float_2.c: Likewise.
* gcc.target/i386/l_fma_float_3.c: Likewise.
* gcc.target/i386/l_fma_float_4.c: Likewise.
* gcc.target/i386/l_fma_float_5.c: Likewise.
* gcc.target/i386/l_fma_float_6.c: Likewise.

--- gcc/tree-ssa-math-opts.c.jj 2020-03-26 09:14:53.367045348 +0100
+++ gcc/tree-ssa-math-opts.c    2020-05-12 12:06:19.718387179 +0200
@@ -2930,6 +2930,35 @@ convert_mult_to_fma_1 (tree mul_result,
  fprintf (dump_file, "\n");
}
 
+  /* If the FMA result is negated in a single use, fold the negation
+too.  */
+  orig_stmt = gsi_stmt (gsi);
+  use_operand_p use_p;
+  gimple *neg_stmt;
+  if (is_gimple_call (orig_stmt)
+ && gimple_call_internal_p (orig_stmt)
+ && gimple_call_lhs (orig_stmt)
+ && TREE_CODE (gimple_call_lhs (orig_stmt)) == SSA_NAME
+ && single_imm_use (gimple_call_lhs (orig_stmt), &use_p, &neg_stmt)
+ && is_gimple_assign (neg_stmt)
+ && gimple_assign_rhs_code (neg_stmt) == NEGATE_EXPR
+ && !stmt_could_throw_p (cfun, neg_stmt))
+   {
+ gsi = gsi_for_stmt (neg_stmt);
+ if (fold_stmt (&gsi, follow_all_ssa_edges))
+   {
+ if (maybe_clean_or_replace_eh_stmt (neg_stmt, gsi_stmt (gsi)))
+   gcc_unreachable ();
+ update_stmt (gsi_stmt (gsi));
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "Folded FMA negation ");
+ print_gimple_stmt (dump_file, gsi_stmt (gsi), 0, TDF_NONE);
+ fprintf (dump_file, "\n");
+   }
+   }
+   }
+
   widen_mul_stats.fmas_inserted++;
 }
 }
--- gcc/testsuite/gcc.target/i386/avx512f-pr95060.c.jj  2020-05-12 12:17:16.052468438 +0200
+++ gcc/testsuite/gcc.target/i386/avx512f-pr95060.c 2020-05-12 12:16:52.333826884 +0200
@@ -0,0 +1,22 @@
+/* PR tree-optimization/95060 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -ffast-math -mavx512f" } */
+/* { dg-final { scan-assembler "\tvfnmsub" } } */
+/* { dg-final { scan-assembler-not "\tvfmadd" } } */
+
+#define N 32
+float r[N], a[N], b[N], c[N];
+
+void
+foo (void)
+{
+  for (int i = 0; i < N; i++)
+r[i] = -(a[i] * b[i]) - c[i];
+}
+
+void
+bar (void)
+{
+  for (int i = 0; i < N; i++)
+r[i] = -(a[i] * b[i] + c[i]);
+}
--- gcc/testsuite/gcc.target/i386/fma_double_1.c.jj 2020-01-12 11:54:37.943390325 +0100
+++ gcc/testsuite/gcc.target/i386/fma_double_1.c    2020-05-13 09:55:10.878118046 +0200
@@ -8,11 +8,9 @@
 
 #include "fma_1.h"
 
-/* { dg-final { scan-assembler-times "vfmadd132sd" 4  } } */
+/* { dg-final { scan-assembler-times "vfmadd132sd" 8  } } */
 /* { dg-final { scan-assembler-times "vfmadd231sd" 4  } } */
-/* { dg-final { scan-assembler-times "vfmsub132sd" 4  } } */
+/* { dg-final { scan-assembler-times "vfmsub132sd" 8  } } */
 /* { dg-final { scan-assembler-times "vfmsub231sd" 4  } } */
-/* { dg-final { scan-assembler-times "vfnmadd132sd" 4  }

[PATCH] tree-optimization/33315 - common stores during sinking

2020-05-13 Thread Richard Biener


This implements commoning of stores to a common successor in
a simple ad-hoc way.  I've decided to put it into the code sinking
pass since, well, it sinks stores.  It's still separate since
it does not really sink code into less executed places.

It's ad-hoc since it does not perform any dataflow or alias analysis
but simply only considers trailing stores in a block, iteratively
though.  If the stores are from different values a PHI node is
inserted to merge them.  gcc.dg/tree-ssa/split-path-7.c shows
that path splitting will eventually undo this very transform,
I've decided to not bother with it and simply disable sinking for
the particular testcase.

Doing this transform is good for code size when the stores are
from constants, once we have to insert PHIs the situation becomes
less clear but it's a transform we do elsewhere as well
(cselim for one), and reversing the transform should be easy.
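
At the source level the transform amounts to something like the
following hand-written sketch (the function names are made up for the
example and this is not the pass's output): two trailing stores to the
same location become one store in the common successor, with the stored
values merged the way a PHI node would merge them.

```cpp
#include <cassert>

// Before: each arm of the branch ends in its own store to *p.
inline void store_in_each_arm(bool cond, int* p, int x, int y)
{
  assert(p != nullptr);
  if (cond)
    *p = x;
  else
    *p = y;
}

// After: the stores are commoned into the successor block.  `merged`
// plays the role of the PHI node that selects the value to store.
inline void store_commoned(bool cond, int* p, int x, int y)
{
  assert(p != nullptr);
  int merged;
  if (cond)
    merged = x;
  else
    merged = y;
  *p = merged;
}
```

When x and y are the same constant, no merge is needed at all and the
branch can disappear, which is the code-size win mentioned above.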

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Any comments?

2020-05-13  Richard Biener  

PR tree-optimization/33315
* tree-ssa-sink.c: Include tree-eh.h.
(sink_code_in_bb): Return TODO_cleanup_cfg if we commonized
and sunk stores.  Implement store commoning by sinking to
the successor.

* gcc.dg/tree-ssa/ssa-sink-13.c: New testcase.
* gcc.dg/tree-ssa/ssa-sink-14.c: Likewise.
* gcc.dg/tree-ssa/split-path-7.c: Disable sinking.
---
 gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c  |  25 
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-14.c  |  17 +++
 gcc/tree-ssa-sink.c  | 168 ++-
 4 files changed, 207 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-14.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c
index 3d6186b34d9..a5df75c9b72 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details -w" } */
+/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fno-tree-sink -fdump-tree-split-paths-details -w" } */
 
 
 struct _reent
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c
new file mode 100644
index 000..a65ba35d4ba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c
@@ -0,0 +1,25 @@
+/* PR33315 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sink" } */
+
+int num;
+int a[20];
+
+void test ()
+{
+  int i;
+  int *ptr;
+  ptr = & a[0];
+  i = num;
+  if ( i == 1) *(ptr+0) = 0;
+  if ( i != 1) *(ptr+0) = 0;
+  if ( i == 2) *(ptr+1) = 0;
+  if ( i != 2) *(ptr+1) = 0;
+  if ( i == 3) *(ptr+2) = 0;
+  if ( i != 3) *(ptr+2) = 0;
+}
+
+/* We should sink/merge all stores and end up with a single BB.  */
+
+/* { dg-final { scan-tree-dump-times "MEM\[^\n\r\]* = 0;" 3 "sink" } } */
+/* { dg-final { scan-tree-dump-times "preds) > 1
+  && (phi = get_virtual_phi (bb)))
+{
+  /* Repeat until no more common stores are found.  */
+  while (1)
+   {
+ gimple *first_store = NULL;
+ auto_vec  vdefs;
+
+ /* Search for common stores defined by all virtual PHI args.
+???  Common stores not present in all predecessors could
+be handled by inserting a forwarder to sink to.  Generally
+this involves deciding which stores to do this for if
+multiple common stores are present for different sets of
+predecessors.  See PR11832 for an interesting case.  */
+ for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+   {
+ tree arg = gimple_phi_arg_def (phi, i);
+ gimple *def = SSA_NAME_DEF_STMT (arg);
+ if (! is_gimple_assign (def)
+ || stmt_can_throw_internal (cfun, def))
+   {
+ /* ???  We could handle some cascading with the def being
+another PHI.  We'd have to insert multiple PHIs for
+the rhs then though (if they are not all equal).  */
+ first_store = NULL;
+ break;
+   }
+ /* ???  Do not try to do anything fancy with aliasing, thus
+do not sink across non-aliased loads (or even stores,
+so different store order will make the sinking fail).  */
+ bool all_uses_on_phi = true;
+ imm_use_iterator iter;
+ use_operand_p use_p;
+ FOR_EACH_IMM_USE_FAST (use_p, iter, arg)
+   if (USE_STMT (use_p) != phi)
+ {
+   all_uses_on_phi = false;
+   break;
+ }
+ if (! all_uses_on_phi)
+   {
+ 

[PATCH] Fix -fcompare-debug issue in purge_dead_edges [PR95080]

2020-05-13 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase fails with -fcompare-debug, the bug used to be latent
since introduction of -fcompare-debug.
The loop at the start of purge_dead_edges behaves differently between -g0
and -g - if the last insn is a DEBUG_INSN, then it skips not just
DEBUG_INSNs but also NOTEs until it finds some other real insn (or bb head),
while with -g0 it will not skip any NOTEs, so if we have
real_insn
note
debug_insn // not present with -g0
then with -g it might remove useless REG_EH_REGION from real_insn, while
with -g0 it will not.
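
The difference can be reproduced with a toy model of that loop.  This is
only a sketch: Kind and settle are invented for the illustration, a
vector stands in for the insn chain, index 0 plays BB_HEAD and the last
element plays BB_END.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { Real, Note, Debug };

// Returns the index of the insn the loop settles on.  With
// patched == false the loop is entered only when the last insn is a
// debug insn; with patched == true it is also entered for a note.
inline std::size_t settle(const std::vector<Kind>& bb, bool patched)
{
  assert(!bb.empty());
  auto skippable = [](Kind k) {
    return k == Kind::Debug || k == Kind::Note;
  };
  std::size_t i = bb.size() - 1;
  bool enter = patched ? skippable(bb[i]) : bb[i] == Kind::Debug;
  if (enter && i != 0)
    do
      --i;
    while (skippable(bb[i]) && i != 0);
  return i;
}
```

For the block { real_insn, note } (-g0) and { real_insn, note,
debug_insn } (-g), the unpatched condition settles on the note in the
first case and on real_insn in the second, while the patched condition
settles on real_insn in both.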

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

Yet another option would be not skipping NOTE_P in the loop; I couldn't find
in history rationale for why it is done.

2020-05-13  Jakub Jelinek  

PR debug/95080
* cfgrtl.c (purge_dead_edges): Skip over debug and note insns even
if the last insn is a note.

* g++.dg/opt/pr95080.C: New test.

--- gcc/cfgrtl.c.jj 2020-04-17 10:32:58.589780140 +0200
+++ gcc/cfgrtl.c    2020-05-12 21:02:09.808395583 +0200
@@ -3100,7 +3100,7 @@ purge_dead_edges (basic_block bb)
   bool found;
   edge_iterator ei;
 
-  if (DEBUG_INSN_P (insn) && insn != BB_HEAD (bb))
+  if ((DEBUG_INSN_P (insn) || NOTE_P (insn)) && insn != BB_HEAD (bb))
 do
   insn = PREV_INSN (insn);
 while ((DEBUG_INSN_P (insn) || NOTE_P (insn)) && insn != BB_HEAD (bb));
--- gcc/testsuite/g++.dg/opt/pr95080.C.jj   2020-05-12 21:01:09.804295824 +0200
+++ gcc/testsuite/g++.dg/opt/pr95080.C  2020-05-12 21:00:52.738551862 +0200
@@ -0,0 +1,41 @@
+// PR debug/95080
+// { dg-do compile }
+// { dg-options "-Og -fcse-follow-jumps -fnon-call-exceptions -fcompare-debug" }
+
+char *a;
+
+void baz ();
+
+static inline bool
+bar ()
+{
+  int j = a[0] - 1;
+  switch (j)
+{
+case 0:
+case 2:
+  return true;
+default:
+  return false;
+}
+}
+
+static inline bool
+foo ()
+{
+  if (bar ())
+baz ();
+  return 0;
+}
+
+struct S
+{
+  int h;
+   ~S ();
+};
+
+S::~S ()
+{
+  if (a[0] == 0)
+foo () != h;
+}

Jakub



Re: [PATCH] Fold single imm use of a FMA if it is a negation [PR95060]

2020-05-13 Thread Richard Biener
On Wed, 13 May 2020, Jakub Jelinek wrote:

> Hi!
> 
> match.pd already has simplifications for negation of a FMA (FMS, FNMA, FNMS)
> call if it is single use, but when the widening_mul pass discovers FMAs,
> nothing folds the statements anymore.
> 
> So, the following patch adjusts the widening_mul pass to handle that.
> 
> I had to adjust quite a lot of tests, because they have in them nested FMAs
> (one FMA feeding another one) and the patch results in some (equivalent) changes
> in the chosen instructions, previously the negation of one FMA's result
> would result in the dependent FMA being adjusted for the negation, but now
> instead the first FMA is adjusted.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2020-05-13  Jakub Jelinek  
> 
>   PR tree-optimization/95060
>   * tree-ssa-math-opts.c (convert_mult_to_fma_1): Fold a NEGATE_EXPR
>   if it is the single use of the FMA internal builtin.
> 
>   * gcc.target/i386/avx512f-pr95060.c: New test.
>   * gcc.target/i386/fma_double_1.c: Adjust expected insn counts.
>   * gcc.target/i386/fma_double_2.c: Likewise.
>   * gcc.target/i386/fma_double_3.c: Likewise.
>   * gcc.target/i386/fma_double_4.c: Likewise.
>   * gcc.target/i386/fma_double_5.c: Likewise.
>   * gcc.target/i386/fma_double_6.c: Likewise.
>   * gcc.target/i386/fma_float_1.c: Likewise.
>   * gcc.target/i386/fma_float_2.c: Likewise.
>   * gcc.target/i386/fma_float_3.c: Likewise.
>   * gcc.target/i386/fma_float_4.c: Likewise.
>   * gcc.target/i386/fma_float_5.c: Likewise.
>   * gcc.target/i386/fma_float_6.c: Likewise.
>   * gcc.target/i386/l_fma_double_1.c: Likewise.
>   * gcc.target/i386/l_fma_double_2.c: Likewise.
>   * gcc.target/i386/l_fma_double_3.c: Likewise.
>   * gcc.target/i386/l_fma_double_4.c: Likewise.
>   * gcc.target/i386/l_fma_double_5.c: Likewise.
>   * gcc.target/i386/l_fma_double_6.c: Likewise.
>   * gcc.target/i386/l_fma_float_1.c: Likewise.
>   * gcc.target/i386/l_fma_float_2.c: Likewise.
>   * gcc.target/i386/l_fma_float_3.c: Likewise.
>   * gcc.target/i386/l_fma_float_4.c: Likewise.
>   * gcc.target/i386/l_fma_float_5.c: Likewise.
>   * gcc.target/i386/l_fma_float_6.c: Likewise.
> 
> --- gcc/tree-ssa-math-opts.c.jj   2020-03-26 09:14:53.367045348 +0100
> +++ gcc/tree-ssa-math-opts.c  2020-05-12 12:06:19.718387179 +0200
> @@ -2930,6 +2930,35 @@ convert_mult_to_fma_1 (tree mul_result,
> fprintf (dump_file, "\n");
>   }
>  
> +  /* If the FMA result is negated in a single use, fold the negation
> +  too.  */
> +  orig_stmt = gsi_stmt (gsi);
> +  use_operand_p use_p;
> +  gimple *neg_stmt;
> +  if (is_gimple_call (orig_stmt)
> +   && gimple_call_internal_p (orig_stmt)
> +   && gimple_call_lhs (orig_stmt)
> +   && TREE_CODE (gimple_call_lhs (orig_stmt)) == SSA_NAME
> +   && single_imm_use (gimple_call_lhs (orig_stmt), &use_p, &neg_stmt)
> +   && is_gimple_assign (neg_stmt)
> +   && gimple_assign_rhs_code (neg_stmt) == NEGATE_EXPR
> +   && !stmt_could_throw_p (cfun, neg_stmt))
> + {
> +   gsi = gsi_for_stmt (neg_stmt);
> +   if (fold_stmt (&gsi, follow_all_ssa_edges))
> + {
> +   if (maybe_clean_or_replace_eh_stmt (neg_stmt, gsi_stmt (gsi)))
> + gcc_unreachable ();
> +   update_stmt (gsi_stmt (gsi));
> +   if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> +   fprintf (dump_file, "Folded FMA negation ");
> +   print_gimple_stmt (dump_file, gsi_stmt (gsi), 0, TDF_NONE);
> +   fprintf (dump_file, "\n");
> + }
> + }
> + }
> +
>widen_mul_stats.fmas_inserted++;
>  }
>  }
> --- gcc/testsuite/gcc.target/i386/avx512f-pr95060.c.jj  2020-05-12 12:17:16.052468438 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512f-pr95060.c 2020-05-12 12:16:52.333826884 +0200
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/95060 */
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -ffast-math -mavx512f" } */
> +/* { dg-final { scan-assembler "\tvfnmsub" } } */
> +/* { dg-final { scan-assembler-not "\tvfmadd" } } */
> +
> +#define N 32
> +float r[N], a[N], b[N], c[N];
> +
> +void
> +foo (void)
> +{
> +  for (int i = 0; i < N; i++)
> +r[i] = -(a[i] * b[i]) - c[i];
> +}
> +
> +void
> +bar (void)
> +{
> +  for (int i = 0; i < N; i++)
> +r[i] = -(a[i] * b[i] + c[i]);
> +}
> --- gcc/testsuite/gcc.target/i386/fma_double_1.c.jj 2020-01-12 11:54:37.943390325 +0100
> +++ gcc/testsuite/gcc.target/i386/fma_double_1.c    2020-05-13 09:55:10.878118046 +0200
> @@ -8,11 +8,9 @@
>  
>  #include "fma_1.h"
>  
> -/* { dg-final { scan-assembler-times "vfmadd132sd" 4  } } */
> +/* { dg-final { scan-assembler-times "vfmadd132sd" 8  } } */
>  /* { dg-final { scan-assembler-times "vfmadd231sd" 4  } } */
> -/* { dg-final { scan-assemb

Re: [PATCH] Fix -fcompare-debug issue in purge_dead_edges [PR95080]

2020-05-13 Thread Richard Biener
On Wed, 13 May 2020, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase fails with -fcompare-debug, the bug used to be latent
> since introduction of -fcompare-debug.
> The loop at the start of purge_dead_edges behaves differently between -g0
> and -g - if the last insn is a DEBUG_INSN, then it skips not just
> DEBUG_INSNs but also NOTEs until it finds some other real insn (or bb head),
> while with -g0 it will not skip any NOTEs, so if we have
> real_insn
> note
> debug_insn // not present with -g0
> then with -g it might remove useless REG_EH_REGION from real_insn, while
> with -g0 it will not.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

Thanks,
Richard.

> Yet another option would be not skipping NOTE_P in the loop; I couldn't find
> in history rationale for why it is done.
> 
> 2020-05-13  Jakub Jelinek  
> 
>   PR debug/95080
>   * cfgrtl.c (purge_dead_edges): Skip over debug and note insns even
>   if the last insn is a note.
> 
>   * g++.dg/opt/pr95080.C: New test.
> 
> --- gcc/cfgrtl.c.jj   2020-04-17 10:32:58.589780140 +0200
> +++ gcc/cfgrtl.c  2020-05-12 21:02:09.808395583 +0200
> @@ -3100,7 +3100,7 @@ purge_dead_edges (basic_block bb)
>bool found;
>edge_iterator ei;
>  
> -  if (DEBUG_INSN_P (insn) && insn != BB_HEAD (bb))
> +  if ((DEBUG_INSN_P (insn) || NOTE_P (insn)) && insn != BB_HEAD (bb))
>  do
>insn = PREV_INSN (insn);
>  while ((DEBUG_INSN_P (insn) || NOTE_P (insn)) && insn != BB_HEAD (bb));
> --- gcc/testsuite/g++.dg/opt/pr95080.C.jj   2020-05-12 21:01:09.804295824 +0200
> +++ gcc/testsuite/g++.dg/opt/pr95080.C  2020-05-12 21:00:52.738551862 +0200
> @@ -0,0 +1,41 @@
> +// PR debug/95080
> +// { dg-do compile }
> +// { dg-options "-Og -fcse-follow-jumps -fnon-call-exceptions -fcompare-debug" }
> +
> +char *a;
> +
> +void baz ();
> +
> +static inline bool
> +bar ()
> +{
> +  int j = a[0] - 1;
> +  switch (j)
> +{
> +case 0:
> +case 2:
> +  return true;
> +default:
> +  return false;
> +}
> +}
> +
> +static inline bool
> +foo ()
> +{
> +  if (bar ())
> +baz ();
> +  return 0;
> +}
> +
> +struct S
> +{
> +  int h;
> +   ~S ();
> +};
> +
> +S::~S ()
> +{
> +  if (a[0] == 0)
> +foo () != h;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH, FORTRAN] ICE in gfc_conv_array_constructor_expr PR93497

2020-05-13 Thread Tobias Burnus

On 5/12/20 5:08 PM, Mark Eggleston wrote:

> fortran : ICE in gfc_conv_array_constructor_expr PR93497

"F" in "Fortran". Extra space before ":".

> PR fortran/93497
> * decl.c (char_len_param_value) : Check whether character

Likewise. (Do you like French typography? There, one uses a space before
(and after) the colon.)

> OK to commit to master and to backport to releases/gcc-8,
> releases/gcc-9 and releases/gcc-10?

Otherwise: LGTM. However, as it is just an ICE-on-invalid code, I'd
prefer if you do not backport it to GCC 8, which will soon see its last
release and then the branch will be closed.

Cheers,

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter


Re: [PR77691] x86-vxworks malloc aligns to 8 bytes like solaris

2020-05-13 Thread Jonathan Wakely via Gcc-patches

On 13/05/20 04:49 -0300, Alexandre Oliva wrote:
> Hello, Jonathan,
>
> On May  9, 2020, Jonathan Wakely  wrote:
>
>> On 08/05/20 17:22 -0300, Alexandre Oliva wrote:
>
>>> (Couldn't r1->allocate(2, alignof(char)) possibly return a pointer
>>> that's *not* aligned?  Maybe we should drop the test even
>>> if !defined(BAD_MAX_ALIGN_T).)
>
>> Yes.
>
>> Different malloc implementations interpret the C standard differently
>> here. One interpretation is that all allocations must be aligned to
>> alignof(max_align_t) but another is that allocations smaller than that
>> don't need to meet that requirement. An object that is two bytes in
>> size cannot require 16-byte alignment (otherwise its sizeof would be
>> 16 too).
>
> I understand you're talking about malloc because that's what our
> implementation ultimately uses, but my question was on language
> lawyering, on whether C++ would mandate more alignment than requested by
> the caller of allocate.

No it doesn't, but this is a test for our implementation, not the
standard, and the new_delete_resource uses new which (in our
implementation) uses malloc.

That said, I'm not sure if I was really trying to test that property,
or if including that line was just a mistake. I suspect it was just a
mistake.

> If it were to do so, I wonder what the point of
> specifying the alignment explicitly would be.
>
>> Please do remove that line of the test, instead of wrapping it in the
>> #ifdef.
>
>> OK for master.
>
> Thanks, here's what I'm installing in master.

Thanks.



[GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).

2020-05-13 Thread Srinath Parvathaneni
Hello,

For a few MVE intrinsics, such as vldrbq_s32 and vldrhq_s32, the assembler
instructions generated by the current compiler are wrong.
E.g. vldrbq_s32 generates the assembly instruction `vldrb.s32 q0,[ip]`.
But as per the Arm ARM, the second argument in the above instruction must
also be a low register (<= r7).
This patch fixes this issue by creating a new predicate "mve_memory_operand"
and constraint "Ux", which allow low registers as arguments to the generated
instructions depending on the mode of the argument.
A new constraint "Ul" is created to handle loading to PC-relative addressing
modes for vector store/load intrinsics.

All the corresponding MVE intrinsics that generate wrong code in the same
way as vldrbq_s32 are modified in this patch.


Please refer to M-profile Vector Extension (MVE) intrinsics [1] and Armv8-M 
Architecture Reference Manual [2] for more details.
[1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
[2] https://developer.arm.com/docs/ddi0553/latest

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk? If ok, can I also back-port this patch to GCC-10 branch?

Thanks,
Srinath.

gcc/ChangeLog:

2020-05-13  Srinath Parvathaneni  
Andre Vieira  

PR target/94959
* config/arm/arm-protos.h (arm_mode_base_reg_class): Function
declaration.
(mve_vector_mem_operand): Likewise.
* config/arm/arm.c (thumb2_legitimate_address_p): For MVE target check
the load from memory to a core register is legitimate for given mode.
(mve_vector_mem_operand): Define function.
(arm_print_operand): Modify comment.
(arm_mode_base_reg_class): Define.
* config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check for
TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on TRUE.
* config/arm/constraints.md (Ux): Likewise.
(Ul): Likewise.
* config/arm/mve.md (mve_mov): Replace constraint Us with Ux and also
add support for missing Vector Store Register and Vector Load Register.
Add a new alternative to support load from memory to PC (or label) in
vector store/load.
(mve_vstrbq_): Modify constraint Us to Ux.
(mve_vldrbq_): Modify constraint Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vldrbq_z_): Modify constraint Us to Ux.
(mve_vldrhq_fv8hf): Modify constraint Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vldrhq_): Modify constraint Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vldrhq_z_fv8hf): Likewise.
(mve_vldrhq_z_): Likewise.
(mve_vldrwq_fv4sf): Likewise.
(mve_vldrwq_v4si): Likewise.
(mve_vldrwq_z_fv4sf): Likewise.
(mve_vldrwq_z_v4si): Likewise.
(mve_vld1q_f): Modify constraint Us to Ux.
(mve_vld1q_): Likewise.
(mve_vstrhq_fv8hf): Modify constraint Us to Ux, predicate to
mve_memory_operand.
(mve_vstrhq_p_fv8hf): Modify constraint Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vstrhq_p_): Likewise.
(mve_vstrhq_): Modify constraint Us to Ux, predicate to
mve_memory_operand.
(mve_vstrwq_fv4sf): Modify constraint Us to Ux.
(mve_vstrwq_p_fv4sf): Modify constraint Us to Ux and also modify the MVE
instructions to emit.
(mve_vstrwq_p_v4si): Likewise.
(mve_vstrwq_v4si): Likewise. Modify constraint Us to Ux.
* config/arm/predicates.md (mve_memory_operand): Define.

gcc/testsuite/ChangeLog:

2020-05-13  Srinath Parvathaneni  

PR target/94959
* gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Modify.
* gcc.target/arm/mve/intrinsics/vld1q_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_u8.c

Re: [PATCH] Add -fsplit-dwarf

2020-05-13 Thread Eric Botcazou
> Did I mention I dislike -fsplit-dwarf? ;)

Seconded, this will be confusing for almost all users.  Since the option only 
affects debug info generation, it should be prefixed with 'g' in any case.

-- 
Eric Botcazou


Re: [GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).

2020-05-13 Thread Christophe Lyon via Gcc-patches
Hi,


On Wed, 13 May 2020 at 11:47, Srinath Parvathaneni
 wrote:
>
> Hello,
>
> For a few MVE intrinsics such as vldrbq_s32 and vldrhq_s32, the assembler
> instructions generated by the current compiler are wrong.
> E.g. vldrbq_s32 generates the assembly instruction `vldrb.s32 q0,[ip]`.
> But as per the Arm ARM, the second argument in the above instruction must
> also be a low register (<= r7).

How did you catch this? Does gas complain?

If so, then do the existing tests pass (I mean before your patch)?


> This patch fixes this issue by creating a new predicate "mve_memory_operand"
> and constraint "Ux" which allows low registers
> as arguments to the generated instructions depending on the mode of the
> argument.
> A new constraint "Ul" is created to handle loading to PC-relative addressing
> modes for vector store/load intrinsics.
>
> All the corresponding MVE intrinsics generating wrong code like vldrbq_s32
> are modified in this patch.
>
>
> Please refer to M-profile Vector Extension (MVE) intrinsics [1] and Armv8-M 
> Architecture Reference Manual [2] for more details.
> [1] 
> https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
> [2] https://developer.arm.com/docs/ddi0553/latest
>
> Regression tested on arm-none-eabi and found no regressions.
>
> Ok for trunk? If ok, can I also back-port this patch to GCC-10 branch?
>
> Thanks,
> Srinath.
>
> gcc/ChangeLog:
>
> 2020-05-13  Srinath Parvathaneni  
> Andre Vieira  
>
> PR target/94959
> * config/arm/arm-protos.h (arm_mode_base_reg_class): Function
> declaration.
> (mve_vector_mem_operand): Likewise.
> * config/arm/arm.c (thumb2_legitimate_address_p): For MVE target check
> the load from memory to a core register is legitimate for given mode.
> (mve_vector_mem_operand): Define function.
> (arm_print_operand): Modify comment.
> (arm_mode_base_reg_class): Define.
> * config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check for
> TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on TRUE.
> * config/arm/constraints.md (Ux): Likewise.
> (Ul): Likewise.
> * config/arm/mve.md (mve_mov): Replace constraint Us with Ux and also
> add support for missing Vector Store Register and Vector Load 
> Register.
> Add a new alternative to support load from memory to PC (or label) in
> vector store/load.
> (mve_vstrbq_): Modify constraint Us to Ux.
> (mve_vldrbq_): Modify constraint Us to Ux, predicate to
> mve_memory_operand and also modify the MVE instructions to emit.
> (mve_vldrbq_z_): Modify constraint Us to Ux.
> (mve_vldrhq_fv8hf): Modify constraint Us to Ux, predicate to
> mve_memory_operand and also modify the MVE instructions to emit.
> (mve_vldrhq_): Modify constraint Us to Ux, predicate to
> mve_memory_operand and also modify the MVE instructions to emit.
> (mve_vldrhq_z_fv8hf): Likewise.
> (mve_vldrhq_z_): Likewise.
> (mve_vldrwq_fv4sf): Likewise.
> (mve_vldrwq_v4si): Likewise.
> (mve_vldrwq_z_fv4sf): Likewise.
> (mve_vldrwq_z_v4si): Likewise.
> (mve_vld1q_f): Modify constraint Us to Ux.
> (mve_vld1q_): Likewise.
> (mve_vstrhq_fv8hf): Modify constraint Us to Ux, predicate to
> mve_memory_operand.
> (mve_vstrhq_p_fv8hf): Modify constraint Us to Ux, predicate to
> mve_memory_operand and also modify the MVE instructions to emit.
> (mve_vstrhq_p_): Likewise.
> (mve_vstrhq_): Modify constraint Us to Ux, predicate to
> mve_memory_operand.
> (mve_vstrwq_fv4sf): Modify constraint Us to Ux.
> (mve_vstrwq_p_fv4sf): Modify constraint Us to Ux and also modify the MVE
> instructions to emit.
> (mve_vstrwq_p_v4si): Likewise.
> (mve_vstrwq_v4si): Likewise.  Modify constraint Us to Ux.
> * config/arm/predicates.md (mve_memory_operand): Define.
>
> gcc/testsuite/ChangeLog:
>
> 2020-05-13  Srinath Parvathaneni  
>
> PR target/94959
> * gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Modify.
> * gcc.target/arm/mve/intrinsics/vld1q_f16.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_f32.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_s16.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_s32.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_s8.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_u16.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_u32.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_u8.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_z_f16.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_z_f32.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_z_s16.c: Likewise.
> * gcc.target/arm/mve/intrinsics/vld1q_z_s32.c: Likewi

Re: [PATCH] tsan: Add optional support for distinguishing volatiles

2020-05-13 Thread Marco Elver via Gcc-patches
On Wed, 6 May 2020 at 16:33, Marco Elver  wrote:
>
> Hello, Jakub,
>
> On Tue, 28 Apr 2020 at 16:58, Dmitry Vyukov  wrote:
> >
> > On Tue, Apr 28, 2020 at 4:55 PM Jakub Jelinek  wrote:
> > >
> > > On Tue, Apr 28, 2020 at 04:48:31PM +0200, Dmitry Vyukov wrote:
> > > > FWIW this is:
> > > >
> > > > Acked-by: Dmitry Vyukov 
> > > >
> > > > We just landed a similar change to llvm:
> > > > https://github.com/llvm/llvm-project/commit/5a2c31116f412c3b6888be361137efd705e05814
> > > >
> > > > Do you have any objections?
> > >
> > > I don't have objections or anything right now, we are just trying to
> > > finalize GCC 10 and once it branches, patches like this can be
> > > reviewed/committed for GCC11.
> >
> > Thanks for clarification!
> > Then we will just wait.
>
> Just saw the announcement that GCC11 is in development stage 1 [1]. In
> case it is still too early, do let us know what time window we shall
> follow up.
>
> Would it be useful to rebase and resend the patch?

So, it's starting to look like we're really going to need this sooner
rather than later. Given the feature is guarded behind a flag, and otherwise
does not affect anything else, would it be possible to take this for
GCC11? What do we need to do to make this happen?

Thanks,
-- Marco

> [1] https://gcc.gnu.org/pipermail/gcc/2020-April/000505.html


[PATCH] coroutines: Implicitly movable objects should use move CTORs for co_return.

2020-05-13 Thread Iain Sandoe
.. and now to the right list…

I came across a build failure in a folly experimental test case where,
at first, it appeared that GCC was DTRT … however, further
investigation concluded that this was a case of differing interpretations
between implementations.

It’s kinda unhelpful that the discussion appears under [class.copy.elision],
but it motivates the selection of the correct overload even in cases
where there’s no elision.

The (conflicting info) issue is being taken to WG21 / Core.

Tested on x86_64-darwin so far.
OK for master after regstrap on Linux?
And for 10.2 after some bake time on master?
thanks
Iain



This is a case where the standard contains conflicting information.
After discussion between implementors, the accepted intent is that of
[class.copy.elision].  This amends the handling of co_return statements
to follow that.

gcc/cp/ChangeLog:

2020-05-13  Iain Sandoe  

* coroutines.cc (finish_co_return_stmt): Implement rules
from [class.copy.elision] /3.

gcc/testsuite/ChangeLog:

2020-05-13  Iain Sandoe  

* g++.dg/coroutines/co-return-syntax-10-movable.C: New test.
---
 gcc/cp/coroutines.cc  | 56 +++-
 .../coroutines/co-return-syntax-10-movable.C  | 67 +++
 2 files changed, 107 insertions(+), 16 deletions(-)
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/co-return-syntax-10-movable.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 730e6fef82a..423c37e1c9c 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -999,11 +999,10 @@ finish_co_return_stmt (location_t kw, tree expr)
 expression as it is.  */
   if (dependent_type_p (functype) || type_dependent_expression_p (expr))
{
- expr
-   = build2_loc (kw, CO_RETURN_EXPR, void_type_node, expr, NULL_TREE);
+ expr = build2_loc (kw, CO_RETURN_EXPR, unknown_type_node,
+expr, NULL_TREE);
  expr = maybe_cleanup_point_expr_void (expr);
- expr = add_stmt (expr);
- return expr;
+ return add_stmt (expr);
}
 }
 
@@ -1022,7 +1021,7 @@ finish_co_return_stmt (location_t kw, tree expr)
 
   /* If the promise object doesn't have the correct return call then
  there's a mis-match between the co_return  and this.  */
-  tree co_ret_call = NULL_TREE;
+  tree co_ret_call = error_mark_node;
   if (expr == NULL_TREE || VOID_TYPE_P (TREE_TYPE (expr)))
 {
   tree crv_meth
@@ -1045,25 +1044,50 @@ finish_co_return_stmt (location_t kw, tree expr)
   if (!crv_meth || crv_meth == error_mark_node)
return error_mark_node;
 
-  vec *args = make_tree_vector_single (expr);
-  co_ret_call = build_new_method_call (
-   get_coroutine_promise_proxy (current_function_decl), crv_meth, &args,
-   NULL_TREE, LOOKUP_NORMAL, NULL, tf_warning_or_error);
+  /* [class.copy.elision] / 3.
+An implicitly movable entity is a variable of automatic storage
+duration that is either a non-volatile object or an rvalue reference
+to a non-volatile object type.  For such objects in the context of
+the co_return, the overload resolution should be carried out first
+treating the object as an rvalue, if that fails, then we fall back
+to regular overload resolution.  */
+  tree obj = STRIP_NOPS (expr);
+  if (TREE_TYPE (obj)
+ && TYPE_REF_P (TREE_TYPE (obj))
+ && TYPE_REF_IS_RVALUE (TREE_TYPE (obj)))
+   obj = TREE_OPERAND (obj, 0);
+
+  if (TREE_CODE (obj) == PARM_DECL
+ || (VAR_P (obj)
+ && decl_storage_duration (obj) == dk_auto
+ && !TYPE_VOLATILE (TREE_TYPE (obj))))
+   {
+ vec *args = make_tree_vector_single (rvalue (expr));
+ co_ret_call = build_new_method_call
+   (get_coroutine_promise_proxy (current_function_decl), crv_meth,
+&args, NULL_TREE, LOOKUP_NORMAL, NULL, tf_none);
+   }
+
+  if (co_ret_call == error_mark_node )
+   {
+ vec *args = make_tree_vector_single (expr);
+ co_ret_call = build_new_method_call
+   (get_coroutine_promise_proxy (current_function_decl), crv_meth,
+&args,NULL_TREE, LOOKUP_NORMAL, NULL, tf_warning_or_error);
+ release_tree_vector (args);
+   }
 }
 
+  expr = build2_loc (kw, CO_RETURN_EXPR, void_type_node, expr, co_ret_call);
+  expr = maybe_cleanup_point_expr_void (expr);
+
   /* Makes no sense for a co-routine really. */
   if (TREE_THIS_VOLATILE (current_function_decl))
 warning_at (kw, 0,
"function declared % has a"
" % statement");
 
-  if (!co_ret_call || co_ret_call == error_mark_node)
-return error_mark_node;
-
-  expr = build2_loc (kw, CO_RETURN_EXPR, void_type_node, expr, co_ret_call);
-  expr = maybe_cleanup_point_expr_void (expr);
-  expr = add_stmt (expr);
-  return expr;
+  return add_stmt (expr);
 }
 
 /* We need to validate the arguments to __builtin_cor

ChangeLog files - server and client scripts

2020-05-13 Thread Martin Liška

Hi.

I'm sending the gcc-changelog related scripts, which should be added to the
contrib folder. The patch contains:
- git_check_commit.py - checking script that verifies git message format
- git_update_version.py - a replacement of
  maintainer-scripts/update_version_git which bumps DATESTAMP and generates
  ChangeLog entries (for now into ChangeLog.test files)
- git_commit.py, git_email.py and git_repository.py - helper classes

I also added a new git.config alias: 'gcc-verify' which can be used in the
following way:

$ git gcc-verify HEAD~2..HEAD -p -n
Checking 0e4009e9d523270e26856d2441c1be3d8119a477
OK
@@CL contrib
2020-05-13  Martin Liska  

* gcc-changelog/git_check_commit.py: New file.
* gcc-changelog/git_commit.py: New file.
* gcc-changelog/git_email.py: New file.
* gcc-changelog/git_repository.py: New file.
* gcc-changelog/git_update_version.py: New file.
* gcc-git-customization.sh: Add gcc-verify alias.
@@CL
Checking 18edc195442291525e04f0fa4d5ef972155117da
OK
@@CL gcc
2020-05-13  Jakub Jelinek  

PR debug/95080
* cfgrtl.c (purge_dead_edges): Skip over debug and note insns even
if the last insn is a note.
@@CL gcc/testsuite
2020-05-13  Jakub Jelinek  

PR debug/95080
* g++.dg/opt/pr95080.C: New test.
@@CL

Note the -n option, which disables _strict mode_ and so allows a commit to
modify both ChangeLog and other files.
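
As a rough illustration of what strict mode is checking, here is a minimal
sketch. It is not taken from the patch: the function name
violates_strict_mode and the rule "any file whose basename starts with
ChangeLog counts as a ChangeLog" are assumptions made for the example.

```python
import os


def violates_strict_mode(changed_paths):
    """Return True if a commit touches both ChangeLog files and other files.

    Hypothetical helper: a path is treated as a ChangeLog when its basename
    starts with 'ChangeLog' (an assumption, not the rule used by the patch).
    """
    is_changelog = [os.path.basename(path).startswith('ChangeLog')
                    for path in changed_paths]
    # A violation needs at least one ChangeLog and at least one other file.
    return any(is_changelog) and not all(is_changelog)


if __name__ == '__main__':
    print(violates_strict_mode(['gcc/ChangeLog', 'gcc/cfgrtl.c']))
    print(violates_strict_mode(['gcc/cfgrtl.c']))
```

With a check like this, -n would simply skip the test instead of rejecting
the commit.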

The second part is a git hook that will reject all commits to release and
master branches that violate the ChangeLog format. Right now, strict mode is
disabled in the hooks.

What still remains to be done is the format of Revert and Backport commits.
I suggest using native 'git revert XYZ' and 'git cherry-pick -x XYZ'.
That way the commit messages will provide a link to the original commit and
the script can later append the corresponding 'Backported ..' or 'Reverted'
line.
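
A minimal sketch of how a script might recognise such commits, assuming the
default messages git writes: 'git cherry-pick -x' appends a
"(cherry picked from commit <sha>)" line and 'git revert' adds a
"This reverts commit <sha>." line. The function name classify_commit and the
returned labels are hypothetical and not part of the patch.

```python
import re

# Trailer appended by 'git cherry-pick -x'.
CHERRY_PICK_RE = re.compile(
    r'^\(cherry picked from commit ([0-9a-f]{7,40})\)$', re.MULTILINE)
# Line written into the body by 'git revert'.
REVERT_RE = re.compile(
    r'^This reverts commit ([0-9a-f]{7,40})\b', re.MULTILINE)


def classify_commit(message):
    """Return (kind, original_sha) for a commit message.

    kind is 'backport', 'revert' or 'normal' (hypothetical labels);
    original_sha is None for a normal commit.
    """
    match = CHERRY_PICK_RE.search(message)
    if match:
        return ('backport', match.group(1))
    match = REVERT_RE.search(message)
    if match:
        return ('revert', match.group(1))
    return ('normal', None)


if __name__ == '__main__':
    msg = ('Fix foo\n\n(cherry picked from commit '
           '18edc195442291525e04f0fa4d5ef972155117da)\n')
    print(classify_commit(msg))
```

The returned hash is what a later step could use to look up the original
commit when generating the extra ChangeLog line.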

Thoughts?
Martin
From 0e4009e9d523270e26856d2441c1be3d8119a477 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 13 May 2020 12:22:39 +0200
Subject: [PATCH] Add gcc-changelog related scripts.

contrib/ChangeLog:

2020-05-13  Martin Liska  

	* gcc-changelog/git_check_commit.py: New file.
	* gcc-changelog/git_commit.py: New file.
	* gcc-changelog/git_email.py: New file.
	* gcc-changelog/git_repository.py: New file.
	* gcc-changelog/git_update_version.py: New file.
	* gcc-git-customization.sh: Add gcc-verify alias.
---
 contrib/gcc-changelog/git_check_commit.py   |  49 ++
 contrib/gcc-changelog/git_commit.py | 536 
 contrib/gcc-changelog/git_email.py  |  92 
 contrib/gcc-changelog/git_repository.py |  60 +++
 contrib/gcc-changelog/git_update_version.py | 105 
 contrib/gcc-git-customization.sh|   2 +
 6 files changed, 844 insertions(+)
 create mode 100755 contrib/gcc-changelog/git_check_commit.py
 create mode 100755 contrib/gcc-changelog/git_commit.py
 create mode 100755 contrib/gcc-changelog/git_email.py
 create mode 100755 contrib/gcc-changelog/git_repository.py
 create mode 100755 contrib/gcc-changelog/git_update_version.py

diff --git a/contrib/gcc-changelog/git_check_commit.py b/contrib/gcc-changelog/git_check_commit.py
new file mode 100755
index 000..b2d1d08a242
--- /dev/null
+++ b/contrib/gcc-changelog/git_check_commit.py
@@ -0,0 +1,49 @@
+#!/usr/bin/env python3
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 3, or (at your option) any later
+# version.
+#
+# GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+# WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+# for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .  */
+
+import argparse
+
+from git_repository import parse_git_revisions
+
+parser = argparse.ArgumentParser(description='Check git ChangeLog format '
+ 'of a commit')
+parser.add_argument('revisions',
+help='Git revisions (e.g. hash~5..hash or just hash)')
+parser.add_argument('-g', '--git-path', default='.',
+help='Path to git repository')
+parser.add_argument('-p', '--print-changelog', action='store_true',
+help='Print final changelog entries')
+parser.add_argument('-n', '--allow-non-strict-mode', action='store_true',
+help='Allow non-strict mode (change in both ChangeLog and '
+'other files.')
+args = parser.parse_args()
+
+retval = 0
+for git_commit in parse_git_revisions(args.git_path, args.revisions,
+  not args.allow_non_strict_mode):
+print('Checking %s' % git_commit.hexsha)
+if git_co

Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread Uros Bizjak via Gcc-patches
On Tue, May 12, 2020 at 10:07 PM H.J. Lu  wrote:
>
> Update STV pass to properly count cost of XMM register push.  In 32-bit
> mode, to convert XMM register push in DImode, we do an XMM store in
> DImode, followed by 2 memory pushes in SImode, instead of 2 integer
> register pushes in SImode.  To convert XMM register push in SImode, we
> do an XMM register to integer register move in SImode, followed by an
> integer register push in SImode, instead of an integer register push in
> SImode.  In 64-bit mode, we do an XMM register to integer register move
> in SImode or DImode, followed by an integer register push in SImode or
> DImode, instead of an integer register push in SImode or DImode.
>
> Tested on Linux/x86 and Linux/x86-64.

I think it is better to implement XMM register pushes, and split them
after reload to a sequence of:

(set (reg:P SP_REG) (plus:P (reg:P SP_REG) (const_int -8)))
(set (match_dup 0) (match_dup 1))

This is definitely better than trips through memory to stack.

There are plenty of examples of fake pushes in i386.md, just grep for
"%%% Kill this when call"

Uros.

> OK for master?
>
> Thanks.
>
> H.J.
> --
> gcc/
> PR target/95021
> * config/i386/i386-features.c
> (general_scalar_chain::general_scalar_chain): Initialize
> n_sse_push.
> (general_scalar_chain::mark_dual_mode_def): Add a df_ref
> argument for reference.  Increment n_sse_push for XMM register
> push.
> (timode_scalar_chain::mark_dual_mode_def): Add a dummy df_ref
> argument.
> (scalar_chain::analyze_register_chain): Pass chain->ref
> to mark_dual_mode_def.
> (general_scalar_chain::compute_convert_gain): Count cost of
> XMM register push.
> * config/i386/i386-features.h (scalar_chain::mark_dual_mode_def):
> Add a df_ref argument.
> (general_scalar_chain): Add n_sse_push.
> (general_scalar_chain::mark_dual_mode_def): Add a df_ref
> argument.
> (timode_scalar_chain::mark_dual_mode_def): Add a df_ref
> argument.
>
> gcc/testsuite/
>
> PR target/95021
> * gcc.target/i386/pr95021-1.c: New test.
> * gcc.target/i386/pr95021-2.c: Likewise.
> ---
>  gcc/config/i386/i386-features.c   | 33 ---
>  gcc/config/i386/i386-features.h   |  7 ++---
>  gcc/testsuite/gcc.target/i386/pr95021-1.c | 25 +
>  gcc/testsuite/gcc.target/i386/pr95021-2.c | 25 +
>  4 files changed, 83 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95021-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95021-2.c
>
> diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
> index 78fb373db6e..c85ab41350c 100644
> --- a/gcc/config/i386/i386-features.c
> +++ b/gcc/config/i386/i386-features.c
> @@ -326,6 +326,7 @@ general_scalar_chain::general_scalar_chain (enum 
> machine_mode smode_,
>insns_conv = BITMAP_ALLOC (NULL);
>n_sse_to_integer = 0;
>n_integer_to_sse = 0;
> +  n_sse_push = 0;
>  }
>
>  general_scalar_chain::~general_scalar_chain ()
> @@ -337,7 +338,7 @@ general_scalar_chain::~general_scalar_chain ()
> conversion.  */
>
>  void
> -general_scalar_chain::mark_dual_mode_def (df_ref def)
> +general_scalar_chain::mark_dual_mode_def (df_ref def, df_ref ref)
>  {
>gcc_assert (DF_REF_REG_DEF_P (def));
>
> @@ -356,6 +357,12 @@ general_scalar_chain::mark_dual_mode_def (df_ref def)
>if (!reg_new)
> return;
>n_sse_to_integer++;
> +  rtx_insn *insn = DF_REF_INSN (ref);
> +  rtx set = single_set (insn);
> +  /* Count XMM register push.  */

Count XMM register pushes.

> +  if (set
> + && push_operand (SET_DEST (set), GET_MODE (SET_DEST (set))))
> +   n_sse_push++;
>  }
>
>if (dump_file)
> @@ -367,7 +374,7 @@ general_scalar_chain::mark_dual_mode_def (df_ref def)
>  /* For TImode conversion, it is unused.  */
>
>  void
> -timode_scalar_chain::mark_dual_mode_def (df_ref)
> +timode_scalar_chain::mark_dual_mode_def (df_ref, df_ref)
>  {
>gcc_unreachable ();
>  }
> @@ -408,14 +415,14 @@ scalar_chain::analyze_register_chain (bitmap 
> candidates, df_ref ref)
>   if (dump_file)
> fprintf (dump_file, "  r%d def in insn %d isn't convertible\n",
>  DF_REF_REGNO (chain->ref), uid);
> - mark_dual_mode_def (chain->ref);
> + mark_dual_mode_def (chain->ref, chain->ref);
> }
>else
> {
>   if (dump_file)
> fprintf (dump_file, "  r%d use in insn %d isn't convertible\n",
>  DF_REF_REGNO (chain->ref), uid);
> - mark_dual_mode_def (ref);
> + mark_dual_mode_def (ref, chain->ref);
> }
>  }
>  }
> @@ -627,6 +634,24 @@ general_scalar_chain::compute_convert_gain ()
>   are at the moment.  */
>cost += n_integer_to_sse * ix86_cost->sse_to_integer;
>
> +  /* In 32-bit mode,

[RFC PATCH v2] cgraph support for late declare variant resolution

2020-05-13 Thread Jakub Jelinek via Gcc-patches
Hi!

This is a new version of the
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg01493.html
patch.  Unlike the previous version, this one actually works properly
except for LTO; it has been bootstrapped/regtested on x86_64-linux and
i686-linux too.

In short, #pragma omp declare variant is a directive which allows
redirection of direct calls to a certain function to other functions using a
scoring system, and some of those decisions need to be deferred until after
IPA.  The patch represents them with calls to an artificial FUNCTION_DECL
with declare_variant_alt set in the cgraph_node.

Honza/Martin, are the cgraph related changes acceptable to you?

For LTO, the patch only saves/restores the two cgraph_node bits added in the
patch, but doesn't yet stream out and back in the side info for the
declare_variant_alt.  For the LTO partitioning, I believe those artificial
FUNCTION_DECLs with declare_variant_alt need to go into the same partition
as anything that calls them (possibly duplicated); is there any way to
achieve that?  Say the declare variant artificial fn foobar is directly
called from all of foo, bar and baz and not from qux, and we want 4
partitions, one for each of foo, bar, baz, qux; then foobar is needed in the
first 3 partitions.  The IPA_REF_ADDRs recorded for foobar (which say that
right after IPA the foobar call will be replaced with calls to foobar1,
foobar2, foobar3 or the non-artificial foobar) can of course stay in
different partitions if needed.
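
To make the partitioning requirement concrete, here is a small sketch of the
foo/bar/baz/qux example. It only models the idea; the function
partitions_needing and the edge-list representation are made up for the
illustration and have nothing to do with the actual LTO partitioner.

```python
def partitions_needing(decl, call_edges, partition_of):
    """Return the sorted partition ids that need a copy of DECL.

    call_edges is a list of (caller, callee) pairs for direct calls and
    partition_of maps each caller to its partition id (both hypothetical
    stand-ins for the real cgraph and partition data).
    """
    return sorted({partition_of[caller]
                   for caller, callee in call_edges
                   if callee == decl})


if __name__ == '__main__':
    # One partition per function, as in the example above.
    partition_of = {'foo': 0, 'bar': 1, 'baz': 2, 'qux': 3}
    # foobar is called directly from foo, bar and baz, but not from qux.
    call_edges = [('foo', 'foobar'), ('bar', 'foobar'), ('baz', 'foobar'),
                  ('qux', 'something_else')]
    print(partitions_needing('foobar', call_edges, partition_of))
```

For the example data this yields the first three partitions, i.e. foobar
would have to be duplicated into each of them.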

2020-05-13  Jakub Jelinek  

* Makefile.in (GTFILES): Add omp-general.c.
* cgraph.h (struct cgraph_node): Add declare_variant_alt and
calls_declare_variant_alt members and initialize them in the
ctor.
* ipa.c (symbol_table::remove_unreachable_nodes): Handle direct
calls to declare_variant_alt nodes.
* lto-cgraph.c (lto_output_node): Write declare_variant_alt
and calls_declare_variant_alt.
(input_overwrite_node): Read them back.
* omp-simd-clone.c (simd_clone_create): Copy calls_declare_variant_alt
bit.
* tree-inline.c (expand_call_inline): Or in calls_declare_variant_alt
bit.
(tree_function_versioning): Copy calls_declare_variant_alt bit.
* omp-offload.c (execute_omp_device_lower): Call
omp_resolve_declare_variant on direct function calls.
(pass_omp_device_lower::gate): Also enable for
calls_declare_variant_alt functions.
* omp-general.c (omp_maybe_offloaded): Return false after inlining.
(omp_context_selector_matches): Handle the case when
cfun->curr_properties has PROP_gimple_any bit set.
(struct omp_declare_variant_entry): New type.
(struct omp_declare_variant_base_entry): New type.
(struct omp_declare_variant_hasher): New type. 
(omp_declare_variant_hasher::hash, omp_declare_variant_hasher::equal):
New methods.
(omp_declare_variants): New variable.
(struct omp_declare_variant_alt_hasher): New type.
(omp_declare_variant_alt_hasher::hash,
omp_declare_variant_alt_hasher::equal): New methods.
(omp_declare_variant_alt): New variables.
(omp_resolve_late_declare_variant): New function.
(omp_resolve_declare_variant): Call omp_resolve_late_declare_variant
when called late.  Create a magic declare_variant_alt fndecl and
cgraph node and return that if decision needs to be deferred until
after gimplification.
* cgraph.c (symbol_table::create_edge): Or in calls_declare_variant_alt
bit.

* c-c++-common/gomp/declare-variant-14.c: New test.

--- gcc/Makefile.in.jj  2020-05-12 21:20:52.701547377 +0200
+++ gcc/Makefile.in 2020-05-13 11:34:54.869947514 +0200
@@ -2616,6 +2616,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h
   $(srcdir)/omp-offload.h \
   $(srcdir)/omp-offload.c \
   $(srcdir)/omp-expand.c \
+  $(srcdir)/omp-general.c \
   $(srcdir)/omp-low.c \
   $(srcdir)/targhooks.c $(out_file) $(srcdir)/passes.c $(srcdir)/cgraphunit.c \
   $(srcdir)/cgraphclones.c \
--- gcc/cgraph.h.jj 2020-05-12 21:20:47.433626426 +0200
+++ gcc/cgraph.h2020-05-13 11:34:54.870947499 +0200
@@ -937,7 +937,8 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cg
   split_part (false), indirect_call_target (false), local (false),
   versionable (false), can_change_signature (false),
   redefined_extern_inline (false), tm_may_enter_irr (false),
-  ipcp_clone (false), m_uid (uid), m_summary_id (-1)
+  ipcp_clone (false), declare_variant_alt (false),
+  calls_declare_variant_alt (false), m_uid (uid), m_summary_id (-1)
   {}
 
   /* Remove the node from cgraph and all inline clones inlined into it.
@@ -1539,6 +1540,11 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cg
   unsigned tm_may_enter_irr : 1;
   /* True if this was a clone created by ipa-cp.  */
   unsigned ipcp_clone : 1;
+  /* True if this is the deferred declare variant resolution artificial
+ function.  */
+  unsi

[PATCH] Remove SLP_INSTANCE_GROUP_SIZE

2020-05-13 Thread Richard Biener


This removes the SLP_INSTANCE_GROUP_SIZE member since the number of
lanes throughout a SLP subgraph is not necessarily constant.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2020-05-13  Richard Biener  

* tree-vectorizer.h (SLP_INSTANCE_GROUP_SIZE): Remove.
(_slp_instance::group_size): Likewise.
* tree-vect-loop.c (vectorizable_reduction): The group size
is the number of lanes in the node.
* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Likewise.
(vect_analyze_slp_instance): Do not set SLP_INSTANCE_GROUP_SIZE,
verify it matches the instance tree's number of lanes.
(vect_slp_analyze_node_operations_1): Use the number of lanes
in the node as group size.
(vect_bb_vectorization_profitable_p): Use the instance root
number of lanes for the size of life.
(vect_schedule_slp_instance): Use the number of lanes as
group_size.
* tree-vect-stmts.c (vectorizable_load): Remove SLP instance
parameter.  Use the number of lanes of the load for the group
size in the gap adjustment code.
(vect_analyze_stmt): Adjust.
(vect_transform_stmt): Likewise.
---
 gcc/tree-vect-loop.c  |  2 +-
 gcc/tree-vect-slp.c   | 28 +---
 gcc/tree-vect-stmts.c | 13 ++---
 gcc/tree-vectorizer.h |  8 
 4 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 180790abf42..a1f52dcc2ad 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6574,7 +6574,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
 which each SLP statement has its own initial value and in which
 that value needs to be repeated for every instance of the
 statement within the initial vector.  */
-  unsigned int group_size = SLP_INSTANCE_GROUP_SIZE (slp_node_instance);
+  unsigned int group_size = SLP_TREE_SCALAR_STMTS (slp_node).length ();
   if (!neutral_op
  && !can_duplicate_and_interleave_p (loop_vinfo, group_size,
  TREE_TYPE (vectype_out)))
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index f9ad0821fa0..6f623955ce5 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1810,7 +1810,6 @@ vect_slp_rearrange_stmts (slp_tree node, unsigned int 
group_size,
 static bool
 vect_attempt_slp_rearrange_stmts (slp_instance slp_instn)
 {
-  unsigned int group_size = SLP_INSTANCE_GROUP_SIZE (slp_instn);
   unsigned int i, j;
   unsigned int lidx;
   slp_tree node, load;
@@ -1821,14 +1820,16 @@ vect_attempt_slp_rearrange_stmts (slp_instance 
slp_instn)
   /* Compare all the permutation sequences to the first one.  We know
  that at least one load is permuted.  */
   node = SLP_INSTANCE_LOADS (slp_instn)[0];
-  if (!node->load_permutation.exists ())
+  if (!SLP_TREE_LOAD_PERMUTATION (node).exists ())
 return false;
+  unsigned int group_size = SLP_TREE_LOAD_PERMUTATION (node).length ();
   for (i = 1; SLP_INSTANCE_LOADS (slp_instn).iterate (i, &load); ++i)
 {
-  if (!load->load_permutation.exists ())
+  if (!SLP_TREE_LOAD_PERMUTATION (load).exists ()
+ || SLP_TREE_LOAD_PERMUTATION (load).length () != group_size)
return false;
-  FOR_EACH_VEC_ELT (load->load_permutation, j, lidx)
-   if (lidx != node->load_permutation[j])
+  FOR_EACH_VEC_ELT (SLP_TREE_LOAD_PERMUTATION (load), j, lidx)
+   if (lidx != SLP_TREE_LOAD_PERMUTATION (node)[j])
  return false;
 }
 
@@ -2151,7 +2152,6 @@ vect_analyze_slp_instance (vec_info *vinfo,
  /* Create a new SLP instance.  */
  new_instance = XNEW (class _slp_instance);
  SLP_INSTANCE_TREE (new_instance) = node;
- SLP_INSTANCE_GROUP_SIZE (new_instance) = group_size;
  SLP_INSTANCE_UNROLLING_FACTOR (new_instance) = unrolling_factor;
  SLP_INSTANCE_LOADS (new_instance) = vNULL;
  SLP_INSTANCE_ROOT_STMT (new_instance) = constructor ? stmt_info : 
NULL;
@@ -2240,6 +2240,12 @@ vect_analyze_slp_instance (vec_info *vinfo,
 
  vinfo->slp_instances.safe_push (new_instance);
 
+ /* ???  We've replaced the old SLP_INSTANCE_GROUP_SIZE with
+the number of scalar stmts in the root in a few places.
+Verify that assumption holds.  */
+ gcc_assert (SLP_TREE_SCALAR_STMTS (SLP_INSTANCE_TREE (new_instance))
+   .length () == group_size);
+
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location,
@@ -2670,7 +2676,7 @@ vect_slp_analyze_node_operations_1 (vec_info *vinfo, 
slp_tree node,
vf = loop_vinfo->vectorization_factor;
   else
vf = 1;
-  unsigned int group_size = SLP_INSTANCE_GROUP_SIZE (node_instance);
+  unsigned int group_size = SLP_TREE_SCALAR_STMTS (node).length ();
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
 

RE: [GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).

2020-05-13 Thread Srinath Parvathaneni
Hi,

> -Original Message-
> From: Christophe Lyon 
> Sent: 13 May 2020 11:20
> To: Srinath Parvathaneni 
> Cc: gcc Patches ; Richard Earnshaw
> 
> Subject: Re: [GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE
> vector load/store intrinsics (PR94959).
> 
> Hi,
> 
> 
> On Wed, 13 May 2020 at 11:47, Srinath Parvathaneni
>  wrote:
> >
> > Hello,
> >
> > For a few MVE intrinsics such as vldrbq_s32 and vldrhq_s32, the assembler
> > instructions generated by the current compiler are wrong.
> > E.g. vldrbq_s32 generates the assembly instruction `vldrb.s32 q0,[ip]`.
> > But as per the Arm ARM, the second argument in the above instruction must
> > also be a low register (<= r7).
> 
> How did you catch this? Does gas complain?

This was caught by running the CMSIS-DSP testsuite, and yes, gas was complaining.

> 
> If so, then do the existing tests pass (I mean before your patch)?

All the existing tests pass because the instructions generated in these tests
always use a low register (<= r7) for the second argument, which is because
these tests are simple intrinsic calls.
But modifying the testcase as mentioned in PR94959 surfaced the problem fixed
by this patch.
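
For illustration only, the restriction described for `vldrb.s32 q0,[ip]` can
be sketched as a check on the base register. This is a simplified model, not
the logic in the patch: per the description above the real rule depends on
the mode of the operand, and the helper names core_reg_number and
is_low_register are invented for this example.

```python
# Assembler aliases for the high core registers (ip is r12).
CORE_REG_ALIASES = {'ip': 12, 'sp': 13, 'lr': 14, 'pc': 15}


def core_reg_number(name):
    """Map a core register name such as 'r3' or 'ip' to its number."""
    name = name.lower()
    if name in CORE_REG_ALIASES:
        return CORE_REG_ALIASES[name]
    if name.startswith('r') and name[1:].isdigit():
        number = int(name[1:])
        if 0 <= number <= 15:
            return number
    raise ValueError('not a core register: %s' % name)


def is_low_register(name):
    """Return True for r0-r7, the low registers mentioned above."""
    return core_reg_number(name) <= 7


if __name__ == '__main__':
    for reg in ('r0', 'r7', 'r8', 'ip'):
        print(reg, is_low_register(reg))
```

In this model `[ip]` is rejected as a base register while `[r0]` through
`[r7]` are accepted, which matches why the simple existing tests did not
trigger the assembler error.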

> 
> > This patch fixes this issue by creating a new predicate
> > "mve_memory_operand" and constraint "Ux" which allows low registers
> > as arguments to the generated instructions depending on the mode of the
> > argument.
> > A new constraint "Ul" is created to handle loading to PC-relative addressing
> > modes for vector store/load intrinsics.
> >
> > All the corresponding MVE intrinsics generating wrong code like
> > vldrbq_s32 are modified in this patch.
> >
> >
> > Please refer to M-profile Vector Extension (MVE) intrinsics [1] and Armv8-
> M Architecture Reference Manual [2] for more details.
> > [1] https://developer.arm.com/architectures/instruction-sets/simd-
> isas/helium/mve-intrinsics
> > [2] https://developer.arm.com/docs/ddi0553/latest
> >
> > Regression tested on arm-none-eabi and found no regressions.
> >
> > Ok for trunk? If ok, can I also back-port this patch to GCC-10 branch?
> >
> > Thanks,
> > Srinath.
> >
> > gcc/ChangeLog:
> >
> > 2020-05-13  Srinath Parvathaneni  
> > Andre Vieira  
> >
> > PR target/94959
> > * config/arm/arm-protos.h (arm_mode_base_reg_class): Function
> > declaration.
> > (mve_vector_mem_operand): Likewise.
> > * config/arm/arm.c (thumb2_legitimate_address_p): For MVE target check
> > the load from memory to a core register is legitimate for the given mode.
> > (mve_vector_mem_operand): Define function.
> > (arm_print_operand): Modify comment.
> > (arm_mode_base_reg_class): Define.
> > * config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check for
> > TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on TRUE.
> > * config/arm/constraints.md (Ux): Likewise.
> > (Ul): Likewise.
> > * config/arm/mve.md (mve_mov): Replace constraint Us with Ux and also
> > add support for missing Vector Store Register and Vector Load Register.
> > Add a new alternative to support load from memory to PC (or label) in
> > vector store/load.
> > (mve_vstrbq_): Modify constraint Us to Ux.
> > (mve_vldrbq_): Modify constraint Us to Ux, predicate to
> > mve_memory_operand and also modify the MVE instructions to emit.
> > (mve_vldrbq_z_): Modify constraint Us to Ux.
> > (mve_vldrhq_fv8hf): Modify constraint Us to Ux, predicate to
> > mve_memory_operand and also modify the MVE instructions to emit.
> > (mve_vldrhq_): Modify constraint Us to Ux, predicate to
> > mve_memory_operand and also modify the MVE instructions to emit.
> > (mve_vldrhq_z_fv8hf): Likewise.
> > (mve_vldrhq_z_): Likewise.
> > (mve_vldrwq_fv4sf): Likewise.
> > (mve_vldrwq_v4si): Likewise.
> > (mve_vldrwq_z_fv4sf): Likewise.
> > (mve_vldrwq_z_v4si): Likewise.
> > (mve_vld1q_f): Modify constraint Us to Ux.
> > (mve_vld1q_): Likewise.
> > (mve_vstrhq_fv8hf): Modify constraint Us to Ux, predicate to
> > mve_memory_operand.
> > (mve_vstrhq_p_fv8hf): Modify constraint Us to Ux, predicate to
> > mve_memory_operand and also modify the MVE instructions to emit.
> > (mve_vstrhq_p_): Likewise.
> > (mve_vstrhq_): Modify constraint Us to Ux, predicate to
> > mve_memory_operand.
> > (mve_vstrwq_fv4sf): Modify constraint Us to Ux.
> > (mve_vstrwq_p_fv4sf): Modify constraint Us to Ux and also modify the MVE
> > instructions to emit.
> > (mve_vstrwq_p_v4si): Likewise.
> > (mve_vstrwq_v4si): Likewise.  Modify constraint Us to Ux.
> > * config/arm/predicates.md (mve_memory_operand): Define.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2020-05-13  Srinath Parvathaneni  
> >
> > PR target/94959
> > *

Re: [GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).

2020-05-13 Thread Christophe Lyon via Gcc-patches
On Wed, 13 May 2020 at 13:45, Srinath Parvathaneni
 wrote:
>
> Hi,
>
> > -----Original Message-----
> > From: Christophe Lyon 
> > Sent: 13 May 2020 11:20
> > To: Srinath Parvathaneni 
> > Cc: gcc Patches ; Richard Earnshaw
> > 
> > Subject: Re: [GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE
> > vector load/store intrinsics (PR94959).
> >
> > Hi,
> >
> >
> > On Wed, 13 May 2020 at 11:47, Srinath Parvathaneni
> >  wrote:
> > >
> > > Hello,
> > >
> > > For a few MVE intrinsics, such as vldrbq_s32 and vldrhq_s32, the assembler
> > > instructions generated by the current compiler are wrong.
> > > e.g. vldrbq_s32 generates the assembly instruction `vldrb.s32 q0,[ip]`.
> > > But as per the Arm ARM, the second argument in the above instruction must
> > > also be a low register (<= r7).
> >
> > How did you catch this? Does gas complain?
>
> This was caught by running CMSIS-DSP testsuite and yes gas was complaining.
>
> >
> > If so, then do the existing tests pass (I mean before your patch)?
>
> All the existing tests pass because the instructions generated in these
> tests always use a low register (<= r7) for the second argument, and that
> is because these tests are simple intrinsic calls.
> But modifying the testcase as mentioned in PR94959 surfaced the problem
> fixed by this patch.

Thanks for the clarification.

But after a quick look it seems your patch does not modify the code in
the tests, only the scan-assembler directives.
Don't you need to modify the actual test code, such that they fail
without your fix?


>
> >
> > > This patch fixes this issue by creating a new predicate
> > > "mve_memory_operand" and constraint "Ux" which allows low registers
> > > as arguments to the generated instructions depending on the mode of the
> > > argument.
> > > A new constraint "Ul" is created to handle loading to PC-relative
> > > addressing modes for vector store/load intrinsics.
> > >
> > > All the corresponding MVE intrinsics that generate wrong code, like
> > > vldrbq_s32, are modified in this patch.
> > >
> > >
> > > Please refer to M-profile Vector Extension (MVE) intrinsics [1] and Armv8-M
> > > Architecture Reference Manual [2] for more details.
> > > [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
> > > [2] https://developer.arm.com/docs/ddi0553/latest
> > >
> > > Regression tested on arm-none-eabi and found no regressions.
> > >
> > > Ok for trunk? If ok, can I also back-port this patch to GCC-10 branch?
> > >
> > > Thanks,
> > > Srinath.
> > >
> > > gcc/ChangeLog:
> > >
> > > 2020-05-13  Srinath Parvathaneni  
> > > Andre Vieira  
> > >
> > > PR target/94959
> > > * config/arm/arm-protos.h (arm_mode_base_reg_class): Function
> > > declaration.
> > > (mve_vector_mem_operand): Likewise.
> > > * config/arm/arm.c (thumb2_legitimate_address_p): For MVE target check
> > > the load from memory to a core register is legitimate for the given mode.
> > > (mve_vector_mem_operand): Define function.
> > > (arm_print_operand): Modify comment.
> > > (arm_mode_base_reg_class): Define.
> > > * config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check for
> > > TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on TRUE.
> > > * config/arm/constraints.md (Ux): Likewise.
> > > (Ul): Likewise.
> > > * config/arm/mve.md (mve_mov): Replace constraint Us with Ux and also
> > > add support for missing Vector Store Register and Vector Load Register.
> > > Add a new alternative to support load from memory to PC (or label) in
> > > vector store/load.
> > > (mve_vstrbq_): Modify constraint Us to Ux.
> > > (mve_vldrbq_): Modify constraint Us to Ux, predicate to
> > > mve_memory_operand and also modify the MVE instructions to emit.
> > > (mve_vldrbq_z_): Modify constraint Us to Ux.
> > > (mve_vldrhq_fv8hf): Modify constraint Us to Ux, predicate to
> > > mve_memory_operand and also modify the MVE instructions to emit.
> > > (mve_vldrhq_): Modify constraint Us to Ux, predicate to
> > > mve_memory_operand and also modify the MVE instructions to emit.
> > > (mve_vldrhq_z_fv8hf): Likewise.
> > > (mve_vldrhq_z_): Likewise.
> > > (mve_vldrwq_fv4sf): Likewise.
> > > (mve_vldrwq_v4si): Likewise.
> > > (mve_vldrwq_z_fv4sf): Likewise.
> > > (mve_vldrwq_z_v4si): Likewise.
> > > (mve_vld1q_f): Modify constraint Us to Ux.
> > > (mve_vld1q_): Likewise.
> > > (mve_vstrhq_fv8hf): Modify constraint Us to Ux, predicate to
> > > mve_memory_operand.
> > > (mve_vstrhq_p_fv8hf): Modify constraint Us to Ux, predicate to
> > > mve_memory_operand and also modify the MVE instructions to emit.
> > > (mve_vstrhq_p_): Likewise.
> > > (mve_vstrhq_): Modify con

Re: [PATCH PR94969]Add unit distant vector to DDR in case of invariant access functions

2020-05-13 Thread Christophe Lyon via Gcc-patches
Hi Bin,


On Mon, 11 May 2020 at 14:54, Richard Biener via Gcc-patches
 wrote:
>
> On Mon, May 11, 2020 at 7:52 AM bin.cheng via Gcc-patches
>  wrote:
> >
> > Hi,
> > As analyzed in PR94969, data dependence analysis currently misses the
> > dependence vector for the specific case in which the DRs in a DDR have the
> > same invariant access functions.  This simple patch fixes the issue by also
> > covering invariant cases.  Bootstrapped and tested on x86_64; is it OK?
>
> OK.
>
> Thanks,
> Richard.
>
> > Thanks,
> > bin
> >
> > 2020-05-11  Bin Cheng  
> >
> > PR tree-optimization/94969
> > * tree-data-dependence.c (constant_access_functions): Rename to...
> > (invariant_access_functions): ...this.  Add parameter.  Check for
> > invariant access function, rather than constant.
> > (build_classic_dist_vector): Call above function.
> > * tree-loop-distribution.c (pg_add_dependence_edges): Add comment.
> >
> > gcc/testsuite
> > 2020-05-11  Bin Cheng  
> >
> > PR tree-optimization/94969
> > * gcc.dg/tree-ssa/pr94969.c: New test.

The new test fails on arm and aarch64 and probably everywhere:
gcc.dg/tree-ssa/pr94969.c: dump file does not exist
UNRESOLVED: gcc.dg/tree-ssa/pr94969.c scan-tree-dump-not Loop 1
distributed: split to 3 loops "ldist"

Can you fix this?

Thanks


Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread Uros Bizjak via Gcc-patches
On Wed, May 13, 2020 at 1:05 PM Uros Bizjak  wrote:
>
> On Tue, May 12, 2020 at 10:07 PM H.J. Lu  wrote:
> >
> > Update STV pass to properly count cost of XMM register push.  In 32-bit
> > mode, to convert XMM register push in DImode, we do an XMM store in
> > DImode, followed by 2 memory pushes in SImode, instead of 2 integer
> > register pushes in SImode.  To convert XMM register push in SImode, we
> > do an XMM register to integer register move in SImode, followed by an
> > integer register push in SImode, instead of an integer register push in
> > SImode.  In 64-bit mode, we do an XMM register to integer register move
> > in SImode or DImode, followed by an integer register push in SImode or
> > DImode, instead of an integer register push in SImode or DImode.
> >
> > Tested on Linux/x86 and Linux/x86-64.
>
> I think it is better to implement XMM register pushes, and split them
> after reload to a sequence of:
>
> (set (reg:P SP_REG) (plus:P (reg:P SP_REG) (const_int -8)))
> (set (match_dup 0) (match_dup 1))
>
> This is definitely better than trips through memory to stack.

The attached (untested) patch allows fake pushes from XMM registers, so
the STV pass can allow pushes.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 722eb9b5ec8..9f741ce7602 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1049,6 +1049,9 @@
 ;; SWI and DWI together.
 (define_mode_iterator SWIDWI [QI HI SI DI (TI "TARGET_64BIT")])
 
+;; SWI48 and DWI together.
+(define_mode_iterator SWI48DWI [SI DI (TI "TARGET_64BIT")])
+
 ;; GET_MODE_SIZE for selected modes.  As GET_MODE_SIZE is not
 ;; compile time constant, it is faster to use <MODE_SIZE> than
 ;; GET_MODE_SIZE (<MODE>mode).  For XFmode which depends on
@@ -1670,8 +1673,8 @@
 ;; Push/pop instructions.
 
 (define_insn "*push2"
-  [(set (match_operand:DWI 0 "push_operand" "=<")
-   (match_operand:DWI 1 "general_no_elim_operand" "riF*o"))]
+  [(set (match_operand:DWI 0 "push_operand" "=<,<")
+   (match_operand:DWI 1 "general_no_elim_operand" "riF*o,v"))]
   ""
   "#"
   [(set_attr "type" "multi")
@@ -1685,13 +1688,14 @@
   "ix86_split_long_move (operands); DONE;")
 
 (define_insn "*pushdi2_rex64"
-  [(set (match_operand:DI 0 "push_operand" "=<,!<")
-   (match_operand:DI 1 "general_no_elim_operand" "re*m,n"))]
+  [(set (match_operand:DI 0 "push_operand" "=<,<,!<")
+   (match_operand:DI 1 "general_no_elim_operand" "re*m,v,n"))]
   "TARGET_64BIT"
   "@
push{q}\t%1
+   #
#"
-  [(set_attr "type" "push,multi")
+  [(set_attr "type" "push,multi,multi")
(set_attr "mode" "DI")])
 
 ;; Convert impossible pushes of immediate to existing instructions.
@@ -1726,11 +1730,13 @@
 })
 
 (define_insn "*pushsi2"
-  [(set (match_operand:SI 0 "push_operand" "=<")
-   (match_operand:SI 1 "general_no_elim_operand" "ri*m"))]
+  [(set (match_operand:SI 0 "push_operand" "=<,<")
+   (match_operand:SI 1 "general_no_elim_operand" "ri*m,v"))]
   "!TARGET_64BIT"
-  "push{l}\t%1"
-  [(set_attr "type" "push")
+  "@
+   push{l}\t%1
+   #"
+  [(set_attr "type" "push,multi")
(set_attr "mode" "SI")])
 
 ;; emit_push_insn when it calls move_by_pieces requires an insn to
@@ -1739,11 +1745,13 @@
 
 ;; For TARGET_64BIT we always round up to 8 bytes.
 (define_insn "*push2_rex64"
-  [(set (match_operand:SWI124 0 "push_operand" "=X")
-   (match_operand:SWI124 1 "nonmemory_no_elim_operand" "r"))]
+  [(set (match_operand:SWI124 0 "push_operand" "=X,X")
+   (match_operand:SWI124 1 "nonmemory_no_elim_operand" "r,v"))]
   "TARGET_64BIT"
-  "push{q}\t%q1"
-  [(set_attr "type" "push")
+  "@
+   push{q}\t%q1
+   #"
+  [(set_attr "type" "push,multi")
(set_attr "mode" "DI")])
 
 (define_insn "*push2"
@@ -1754,6 +1762,18 @@
   [(set_attr "type" "push")
(set_attr "mode" "SI")])
 
+(define_split
+  [(set (match_operand:SWI48DWI 0 "push_operand")
+   (match_operand:SWI48DWI 1 "sse_reg_operand"))]
+  "TARGET_SSE && reload_completed"
+  [(set (reg:P SP_REG) (plus:P (reg:P SP_REG) (match_dup 2)))
+(set (match_dup 0) (match_dup 1))]
+{
+  operands[2] = GEN_INT (-PUSH_ROUNDING (GET_MODE_SIZE (<MODE>mode)));
+  /* Preserve memory attributes. */
+  operands[0] = replace_equiv_address (operands[0], stack_pointer_rtx);
+})
+
 (define_insn "*push2_prologue"
   [(set (match_operand:W 0 "push_operand" "=<")
(match_operand:W 1 "general_no_elim_operand" "r*m"))


ChangeLog files - server and client scripts (git cherry-pick)

2020-05-13 Thread Martin Liška

On 5/13/20 1:05 PM, Martin Liška wrote:

I suggest to use native 'git revert XYZ' and 'git cherry-pick -x XYZ'.


I've prepared a working version of Revert format:
https://github.com/marxin/gcc-changelog/tree/cherry-pick

So using git cherry-pick -x HASH one gets something like:

$ cat patches-artificial/0001-Test-tree.h.patch
From a71eeba28ffa2427d24d5b2654e93b261980b9e3 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 13 May 2020 13:19:22 +0200
Subject: [PATCH] Test tree.h.

gcc/ChangeLog:

2020-01-03  Martin Liska  

PR ipa/12345
* tree.h: Just test it.

(cherry picked from commit a2bdf56b15b51c3a7bd988943bdbc42aa156f133)
---
 gcc/tree.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/tree.h b/gcc/tree.h
index 9ca9ab58ec0..99a9e1a73d9 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1,6 +1,8 @@
 /* Definitions for the ubiquitous 'tree' type for GNU compilers.
Copyright (C) 1989-2020 Free Software Foundation, Inc.
 
+

+
 This file is part of GCC.
 
 GCC is free software; you can redistribute it and/or modify it under

--
2.26.2

and the script generates:

$ ./git_email.py patches-artificial/0001-Test-tree.h.patch
OK
@@CL gcc
2020-05-13  Martin Liska  

Backport from master:
2020-01-03  Martin Liska  

PR ipa/12345
* tree.h: Just test it.
@@CL

So the datestamp and the author are taken from the commit, and the original
authors are added after the 'Backport from master' line. The script scans for
the '(cherry picked from commit' line in the message.

The benefit of this approach is that one can adjust the commit message (which
influences the ChangeLog output).
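
To make the scanning step concrete, here is a minimal sketch. It is not the
actual git_email.py code: the helper name split_cherry_pick, the regular
expression and the return shape are assumptions made for illustration only.
It looks for the trailer that 'git cherry-pick -x' appends and separates it
from the rest of the message.

```python
import re

# Hypothetical helper, not taken from the gcc-changelog scripts.
# 'git cherry-pick -x' appends a trailer line of this form to the message.
CHERRY_PICK_RE = re.compile(
    r'^\(cherry picked from commit ([0-9a-f]{7,40})\)$')


def split_cherry_pick(message):
    """Return (original_hash_or_None, message lines without the trailer)."""
    original = None
    kept = []
    for line in message.splitlines():
        match = CHERRY_PICK_RE.match(line.strip())
        if match:
            original = match.group(1)
        else:
            kept.append(line)
    return original, kept


msg = """Test tree.h.

gcc/ChangeLog:

2020-01-03  Martin Liska

\tPR ipa/12345
\t* tree.h: Just test it.

(cherry picked from commit a2bdf56b15b51c3a7bd988943bdbc42aa156f133)
"""

original, body = split_cherry_pick(msg)
# A backport is recognised purely by the presence of the trailer.
print('Backport from master:' if original else 'Regular commit', original)
```

A commit message without the trailer yields None for the hash, so such a
commit would be treated as a regular one in this sketch.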

Martin


Re: [PR 95013] EOF location is at end of file

2020-05-13 Thread Nathan Sidwell

On 5/13/20 2:44 AM, Christophe Lyon wrote:

On Wed, 13 May 2020 at 02:24, H.J. Lu via Gcc-patches



 [PR 95013] Fix gcc.dg/unclosed-init.c

 2020-05-13  Christophe Lyon  

 PR preprocessor/95013
 * gcc.dg/unclosed-init.c: Add missing comment in dg-error.


Thanks for beating me to it.

nathan

--
Nathan Sidwell


Re: ChangeLog files - server and client scripts

2020-05-13 Thread Martin Liška

The scripts were just installed to master except the git alias.
I'm sending that in a separate patch. Now the alias can be used
from any subfolder in a gcc git repository.

Martin
From eb47191e8d8cbbda285c4df7eb2d1e98091edab9 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 13 May 2020 14:32:50 +0200
Subject: [PATCH] Add gcc-verify alias.

contrib/ChangeLog:

2020-05-13  Martin Liska  

	* gcc-git-customization.sh: Add gcc-verify alias
	that uses contrib/gcc-changelog/git_check_commit.py.
---
 contrib/gcc-git-customization.sh | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index a932bf8c06a..ce293d1fe42 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -25,6 +25,8 @@ git config alias.svn-rev '!f() { rev=$1; shift; git log --all --grep="^From-SVN:
 git config alias.gcc-descr \!"f() { if test \${1:-no} = --full; then c=\${2:-master}; r=\$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-,r,p'); expr match \${r:-no} '^r[0-9]\\+\$' >/dev/null && r=\${r}-0-g\$(git rev-parse \${2:-master}); else c=\${1:-master}; r=\$(git describe --all --match 'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)-\\([0-9]\\+\\)-g[0-9a-f]*\$,r\\2-\\3,p;s,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)\$,r\\2-0,p'); fi; if test -n \$r; then o=\$(git config --get gcc-config.upstream); rr=\$(echo \$r | sed -n 's,^r\\([0-9]\\+\\)-[0-9]\\+\\(-g[0-9a-f]\\+\\)\\?\$,\\1,p'); if git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$rr >/dev/null; then m=releases/gcc-\$rr; else m=master; fi; git merge-base --is-ancestor \$c \${o:-origin}/\$m && \echo \${r}; fi; }; f"
 git config alias.gcc-undescr \!"f() { o=\$(git config --get gcc-config.upstream); r=\$(echo \$1 | sed -n 's,^r\\([0-9]\\+\\)-[0-9]\\+\$,\\1,p'); n=\$(echo \$1 | sed -n 's,^r[0-9]\\+-\\([0-9]\\+\\)\$,\\1,p'); test -z \$r && echo Invalid id \$1 && exit 1; h=\$(git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$r); test -z \$h && h=\$(git rev-parse --verify --quiet \${o:-origin}/master); p=\$(git describe --all --match 'basepoints/gcc-'\$r \$h | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+-\\([0-9]\\+\\)-g[0-9a-f]*\$,\\2,p;s,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+\$,0,p'); git rev-parse --verify \$h~\$(expr \$p - \$n); }; f"
 
+git config alias.gcc-verify '!f() { "`git rev-parse --show-toplevel`/contrib/gcc-changelog/git_check_commit.py" $@; } ; f'
+
 # Make diff on MD files use "(define" as a function marker.
 # Use this in conjunction with a .gitattributes file containing
 # *.mddiff=md
-- 
2.26.2
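
For readers who want to see why the alias works from any subfolder, here is
a rough Python equivalent of what it does. This is only a sketch: the
function names checker_path and repo_toplevel are made up for illustration,
and only 'git rev-parse --show-toplevel' and the script location come from
the patch above.

```python
import os
import subprocess


def checker_path(toplevel):
    """Build the path of the checker script below a repository top level."""
    return os.path.join(toplevel, 'contrib', 'gcc-changelog',
                        'git_check_commit.py')


def repo_toplevel(cwd='.'):
    """Ask git for the top-level directory of the repository containing cwd.

    Raises subprocess.CalledProcessError when cwd is not inside a git
    repository.
    """
    result = subprocess.run(['git', 'rev-parse', '--show-toplevel'],
                            cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


# Illustrative path only; in a real checkout one would call
# checker_path(repo_toplevel()) from any subdirectory.
print(checker_path('/src/gcc'))
```

Because the top level is resolved by git itself, the current directory never
needs to be the repository root.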



Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread H.J. Lu via Gcc-patches
On Wed, May 13, 2020 at 5:04 AM Uros Bizjak  wrote:
>
> On Wed, May 13, 2020 at 1:05 PM Uros Bizjak  wrote:
> >
> > On Tue, May 12, 2020 at 10:07 PM H.J. Lu  wrote:
> > >
> > > Update STV pass to properly count cost of XMM register push.  In 32-bit
> > > mode, to convert XMM register push in DImode, we do an XMM store in
> > > DImode, followed by 2 memory pushes in SImode, instead of 2 integer
> > > register pushes in SImode.  To convert XMM register push in SImode, we
> > > do an XMM register to integer register move in SImode, followed by an
> > > integer register push in SImode, instead of an integer register push in
> > > SImode.  In 64-bit mode, we do an XMM register to integer register move
> > > in SImode or DImode, followed by an integer register push in SImode or
> > > DImode, instead of an integer register push in SImode or DImode.
> > >
> > > Tested on Linux/x86 and Linux/x86-64.
> >
> > I think it is better to implement XMM register pushes, and split them
> > after reload to a sequence of:
> >
> > (set (reg:P SP_REG) (plus:P (reg:P SP_REG) (const_int -8)))
> > (set (match_dup 0) (match_dup 1))
> >
> > This is definitely better than trips through memory to stack.
>
> The attached (untested) patch allows fake pushes from XMM registers, so
> the STV pass can allow pushes.

The problem isn't the STV pass.  The IRA pass won't assign a hard register for

(insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
(reg/v:DI 85 [ target ])) "x.i":19:5 40 {*pushdi2}
 (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
(expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
(nil))))

and the reload pass turns it into


(insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
(mem/c:DI (plus:SI (reg/f:SI 7 sp)
(const_int 16 [0x10])) [8 %sfp+-8 S8 A64])) "x.i":19:5
40 {*pushdi2}
 (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
(nil)))

-- 
H.J.


[PATCH] Add default revisions argument for git_check_commit.py.

2020-05-13 Thread Martin Liška

A small tweak to the script that I'm going to install.

Martin

contrib/ChangeLog:

2020-05-13  Martin Liska  

* gcc-changelog/git_check_commit.py: Add default argument HEAD
for revisions and improve error message output.
---
 contrib/gcc-changelog/git_check_commit.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)


diff --git a/contrib/gcc-changelog/git_check_commit.py b/contrib/gcc-changelog/git_check_commit.py
index b2d1d08a242..8553c90a96f 100755
--- a/contrib/gcc-changelog/git_check_commit.py
+++ b/contrib/gcc-changelog/git_check_commit.py
@@ -22,7 +22,7 @@ from git_repository import parse_git_revisions
 
 parser = argparse.ArgumentParser(description='Check git ChangeLog format '
                                  'of a commit')
-parser.add_argument('revisions',
+parser.add_argument('revisions', default='HEAD', nargs='?',
                     help='Git revisions (e.g. hash~5..hash or just hash)')
 parser.add_argument('-g', '--git-path', default='.',
                     help='Path to git repository')
@@ -36,9 +36,9 @@ args = parser.parse_args()
 retval = 0
 for git_commit in parse_git_revisions(args.git_path, args.revisions,
                                       not args.allow_non_strict_mode):
-    print('Checking %s' % git_commit.hexsha)
+    res = 'OK' if git_commit.success else 'FAILED'
+    print('Checking %s: %s' % (git_commit.hexsha, res))
     if git_commit.success:
-        print('OK')
         if args.print_changelog:
             git_commit.print_output()
     else:
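
As a quick illustration of the first hunk, the snippet below shows how
argparse treats an optional positional argument. It is a stand-alone sketch
that copies only the add_argument call from the patch; it does not import
the real git_check_commit.py, and the example revision range is arbitrary.

```python
import argparse

# Stand-alone sketch mirroring the add_argument call in the patch above.
parser = argparse.ArgumentParser(description='Check git ChangeLog format '
                                 'of a commit')
# nargs='?' makes the positional optional; default is used when it is absent.
parser.add_argument('revisions', default='HEAD', nargs='?',
                    help='Git revisions (e.g. hash~5..hash or just hash)')

# No positional argument given: argparse falls back to the default.
print(parser.parse_args([]).revisions)
# An explicit revision range still overrides the default.
print(parser.parse_args(['HEAD~5..HEAD']).revisions)
```

Without nargs='?', a positional argument is required and the default would
never be used, which is why both keywords appear together in the patch.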



[PATCH] add vectype parameter to add_stmt_cost hook

2020-05-13 Thread Richard Biener


This adds a vectype parameter to add_stmt_cost which avoids the need
to pass down a (wrong) stmt_info just to carry this information.
Useful for invariants which do not have a stmt_info associated.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2020-05-13  Richard Biener  

* target.def (add_stmt_cost): Add new vectype parameter.
* targhooks.c (default_add_stmt_cost): Adjust.
* targhooks.h (default_add_stmt_cost): Likewise.
* config/aarch64/aarch64.c (aarch64_add_stmt_cost): Take new
vectype parameter.
* config/arm/arm.c (arm_add_stmt_cost): Likewise.
* config/i386/i386.c (ix86_add_stmt_cost): Likewise.
* config/rs6000/rs6000.c (rs6000_add_stmt_cost): Likewise.

* tree-vectorizer.h (stmt_info_for_cost::vectype): Add.
(dump_stmt_cost): Add new vectype parameter.
(add_stmt_cost): Likewise.
(record_stmt_cost): Likewise.
(record_stmt_cost): Add overload with old signature.
* tree-vect-loop.c (vect_compute_single_scalar_iteration_cost):
Adjust.
(vect_get_known_peeling_cost): Likewise.
(vect_estimate_min_profitable_iters): Likewise.
* tree-vectorizer.c (dump_stmt_cost): Add new vectype parameter.
* tree-vect-stmts.c (record_stmt_cost): Likewise.
(vect_prologue_cost_for_slp_op): Remove stmt_vec_info parameter
and pass down correct vectype and NULL stmt_info.
(vect_model_simple_cost): Adjust.
(vect_model_store_cost): Likewise.
---
 gcc/config/aarch64/aarch64.c |  5 ++---
 gcc/config/arm/arm.c |  7 +++
 gcc/config/i386/i386.c   |  5 ++---
 gcc/config/rs6000/rs6000.c   |  5 ++---
 gcc/target.def   |  2 +-
 gcc/targhooks.c  |  5 ++---
 gcc/targhooks.h  |  2 +-
 gcc/tree-vect-loop.c | 46 +++-
 gcc/tree-vect-stmts.c| 17 ++--
 gcc/tree-vectorizer.c|  2 +-
 gcc/tree-vectorizer.h| 27 --
 11 files changed, 65 insertions(+), 58 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 434e095cb66..70aa2f752b5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -13753,15 +13753,14 @@ aarch64_sve_adjust_stmt_cost (class vec_info *vinfo, vect_cost_for_stmt kind,
 static unsigned
 aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
   enum vect_cost_for_stmt kind,
-  struct _stmt_vec_info *stmt_info, int misalign,
-  enum vect_cost_model_location where)
+  struct _stmt_vec_info *stmt_info, tree vectype,
+  int misalign, enum vect_cost_model_location where)
 {
   unsigned *cost = (unsigned *) data;
   unsigned retval = 0;
 
   if (flag_vect_cost_model)
 {
-  tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
   int stmt_cost =
aarch64_builtin_vectorization_cost (kind, vectype, misalign);
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d50781953c0..56d6be02996 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -309,7 +309,7 @@ static int arm_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
 static unsigned arm_add_stmt_cost (vec_info *vinfo, void *data, int count,
   enum vect_cost_for_stmt kind,
   struct _stmt_vec_info *stmt_info,
-  int misalign,
+  tree vectype, int misalign,
   enum vect_cost_model_location where);
 
 static void arm_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
@@ -12133,15 +12133,14 @@ arm_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
 static unsigned
 arm_add_stmt_cost (vec_info *vinfo, void *data, int count,
   enum vect_cost_for_stmt kind,
-  struct _stmt_vec_info *stmt_info, int misalign,
-  enum vect_cost_model_location where)
+  struct _stmt_vec_info *stmt_info, tree vectype,
+  int misalign, enum vect_cost_model_location where)
 {
   unsigned *cost = (unsigned *) data;
   unsigned retval = 0;
 
   if (flag_vect_cost_model)
 {
-  tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
   int stmt_cost = arm_builtin_vectorization_cost (kind, vectype, misalign);
 
   /* Statements in an inner loop relative to the loop being
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f7a4bae49bb..060e2df62ea 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21886,15 +21886,14 @@ ix86_init_cost (class loop *)
 static unsigned
 ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count,
enum vect_cost_for_stmt kind,
-   cla

[PATCH] [V2] rs6000: Add vec_extracth and vec_extractl

2020-05-13 Thread Bill Schmidt via Gcc-patches
From: Kelvin Nilsen 

Add new insns vextdu[bhw]vlx, vextddvlx, vextdu[bhw]vhx, and
vextddvhx, along with built-in access and overloaded built-in
access to these insns.

Changes from previous patch:
 * Removed the int iterators
 * Created separate expansions and insns
vextractl
vextractl_internal
vextractr
vextractr_internal
 * Adjusted rs6000-builtin.def entries to match the new expansion
   names

I didn't understand the comment about moving the decision making
part to the built-in handling code.  All the built-in handling
does is a table-driven call to the expansions; this logic *is*
the built-in handling code.  I don't see any way to simplify that.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions, using a Power9 configuration.  Is this okay for
master?

Thanks,
Bill


[gcc]

2020-05-12  Kelvin Nilsen  

* config/rs6000/altivec.h (vec_extractl): New #define.
(vec_extracth): Likewise.
* config/rs6000/altivec.md (UNSPEC_EXTRACTL): New constant.
(UNSPEC_EXTRACTR): Likewise.
(vextractl): New expansion.
(vextractl_internal): New insn.
(vextractr): New expansion.
(vextractr_internal): New insn.
* config/rs6000/rs6000-builtin.def (__builtin_altivec_vextdubvlx):
New built-in function.
(__builtin_altivec_vextduhvlx): Likewise.
(__builtin_altivec_vextduwvlx): Likewise.
(__builtin_altivec_vextddvlx): Likewise.
(__builtin_altivec_vextdubvhx): Likewise.
(__builtin_altivec_vextduhvhx): Likewise.
(__builtin_altivec_vextduwvhx): Likewise.
(__builtin_altivec_vextddvhx): Likewise.
(__builtin_vec_extractl): New overloaded built-in function.
(__builtin_vec_extracth): Likewise.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
Define overloaded forms of __builtin_vec_extractl and
__builtin_vec_extracth.
(builtin_function_type): Add cases to mark arguments of new
built-in functions as unsigned.
(rs6000_common_init_builtins): Add
opaque_ftype_opaque_opaque_opaque_opaque.
* config/rs6000/rs6000.md (du_or_d): New mode attribute.
* doc/extend.texi (PowerPC AltiVec Built-in Functions Available
for a Future Architecture): Add description of vec_extractl and
vec_extractr built-in functions.

[gcc/testsuite]

2020-05-10  Kelvin Nilsen  

* gcc.target/powerpc/vec-extracth-0.c: New.
* gcc.target/powerpc/vec-extracth-1.c: New.
* gcc.target/powerpc/vec-extracth-2.c: New.
* gcc.target/powerpc/vec-extracth-3.c: New.
* gcc.target/powerpc/vec-extracth-4.c: New.
* gcc.target/powerpc/vec-extracth-5.c: New.
* gcc.target/powerpc/vec-extracth-6.c: New.
* gcc.target/powerpc/vec-extracth-7.c: New.
* gcc.target/powerpc/vec-extracth-be-0.c: New.
* gcc.target/powerpc/vec-extracth-be-1.c: New.
* gcc.target/powerpc/vec-extracth-be-2.c: New.
* gcc.target/powerpc/vec-extracth-be-3.c: New.
* gcc.target/powerpc/vec-extractl-0.c: New.
* gcc.target/powerpc/vec-extractl-1.c: New.
* gcc.target/powerpc/vec-extractl-2.c: New.
* gcc.target/powerpc/vec-extractl-3.c: New.
* gcc.target/powerpc/vec-extractl-4.c: New.
* gcc.target/powerpc/vec-extractl-5.c: New.
* gcc.target/powerpc/vec-extractl-6.c: New.
* gcc.target/powerpc/vec-extractl-7.c: New.
* gcc.target/powerpc/vec-extractl-be-0.c: New.
* gcc.target/powerpc/vec-extractl-be-1.c: New.
* gcc.target/powerpc/vec-extractl-be-2.c: New.
* gcc.target/powerpc/vec-extractl-be-3.c: New.
---
 gcc/config/rs6000/altivec.h   |  3 +
 gcc/config/rs6000/altivec.md  | 62 +++
 gcc/config/rs6000/rs6000-builtin.def  | 13 
 gcc/config/rs6000/rs6000-call.c   | 39 +++-
 gcc/config/rs6000/rs6000.md   | 10 +++
 gcc/doc/extend.texi   | 56 +
 .../gcc.target/powerpc/vec-extracth-0.c   | 33 ++
 .../gcc.target/powerpc/vec-extracth-1.c   | 32 ++
 .../gcc.target/powerpc/vec-extracth-2.c   | 31 ++
 .../gcc.target/powerpc/vec-extracth-3.c   | 30 +
 .../gcc.target/powerpc/vec-extracth-4.c   | 31 ++
 .../gcc.target/powerpc/vec-extracth-5.c   | 29 +
 .../gcc.target/powerpc/vec-extracth-6.c   | 31 ++
 .../gcc.target/powerpc/vec-extracth-7.c   | 30 +
 .../gcc.target/powerpc/vec-extracth-be-0.c| 32 ++
 .../gcc.target/powerpc/vec-extracth-be-1.c| 30 +
 .../gcc.target/powerpc/vec-extracth-be-2.c| 30 +
 .../gcc.target/powerpc/vec-extracth-be-3.c| 30 +
 .../gcc.target/powerpc/vec-extractl-0.c   | 33 ++
 .../gcc.target/powerpc/vec-extractl-1.c   | 32 ++
 .../gcc.target/powerpc/vec-e

Re: [PATCH] contrib/vimrc: Reduce textwidth for commit messages

2020-05-13 Thread Martin Liška

On 5/4/20 8:18 PM, Martin Liška wrote:

I support the patch,


And as there's no feedback I also installed the patch.

Martin


Ping: [PATCH] wwwdocs: Add D front-end section for GCC 10 changes

2020-05-13 Thread Iain Buclaw via Gcc-patches
Ping.

On 07/05/2020 16:04, Iain Buclaw via Gcc-patches wrote:
> Hi,
> 
> Updated the patch to include missed changes, and slightly reworded some
> entries to make them clearer and easier to read.
> 
> OK to commit?
> 
> Iain.
> 
> ---
>  htdocs/gcc-10/changes.html | 35 +++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
> index 41c2dc0d..f10cfd56 100644
> --- a/htdocs/gcc-10/changes.html
> +++ b/htdocs/gcc-10/changes.html
> @@ -432,6 +432,41 @@ a work-in-progress.
>
>  
>  
> +D
> +
> +  Support for static foreach has been implemented.
> +  Aliases can now be created directly from any __trait that
> +  return symbols or tuples.  Previously, an AliasSeq was
> +  necessary in order to alias their return.
> +  
> +  It is now possible to detect the language ABI specified for a struct,
> +  class, or interface using __traits(getLinkage, ...)
> +  Support for core.math.toPrec intrinsics have been added.
> +  These intrinsics guarantee the rounding to specific floating-point
> +  precisions at required points in the code.
> +  
> +  Support for pragma(inline) has been implemented.  Previously
> +  the pragma was recognized, but had no effect on the compilation.
> +  
> +  Optional parentheses in asm operands are now deprecated and
> +  will be removed in a future release.
> +  
> +  All content imported files are now included in the make dependency list
> +  when compiling with -M.
> +  
> +  Compiler recognized attributes provided by the gcc.attribute
> +  module will now take effect when applied to function prototypes as well
> +  as when applied to full function declarations.
> +  
> +  Added --enable-libphobos-checking configure option to
> +  control whether run-time checks are compiled into the D runtime library.
> +  
> +  Added --with-libphobos-druntime-only configure option to
> +  allow specifying whether to build only the core D runtime library, or
> +  both the core and standard libraries into libphobos.
> +  
> +
> +
>  Fortran
>  
>use_device_addr of version 5.0 of the
> 


Re: [PATCH] coroutines: Implicitly movable objects should use move CTORs for co_return.

2020-05-13 Thread Nathan Sidwell

On 5/13/20 6:59 AM, Iain Sandoe wrote:

.. and now to the right list…

I came across a build failure in a folly experimental test case where,
at first, it appeared that GCC was DTRT … however, further
investigation concluded that this was a case of differing interpretations
between implementations.

It’s kinda unhelpful that the discussion appears in class elision
but it motivates the selection of the correct overload even in cases
where there’s no elision.

The (conflicting info) issue is being taken to WG21 / Core.


Yeah, let's have an xrefing note at the co_return point.  not an 
intercal come-from at the eliding point!



@@ -1045,25 +1044,50 @@ finish_co_return_stmt (location_t kw, tree expr)
if (!crv_meth || crv_meth == error_mark_node)
return error_mark_node;
  
-  vec *args = make_tree_vector_single (expr);

-  co_ret_call = build_new_method_call (
-   get_coroutine_promise_proxy (current_function_decl), crv_meth, &args,
-   NULL_TREE, LOOKUP_NORMAL, NULL, tf_warning_or_error);
+  /* [class.copy.elision] / 3.
+An implicitly movable entity is a variable of automatic storage
+duration that is either a non-volatile object or an rvalue reference
+to a non-volatile object type.  For such objects in the context of
+the co_return, the overload resolution should be carried out first
+treating the object as an rvalue, if that fails, then we fall back
+to regular overload resolution.  */
+  tree obj = STRIP_NOPS (expr);
+  if (TREE_TYPE (obj)
+ && TYPE_REF_P (TREE_TYPE (obj))
+ && TYPE_REF_IS_RVALUE (TREE_TYPE (obj)))
+   obj = TREE_OPERAND (obj, 0);
+
+  if (TREE_CODE (obj) == PARM_DECL
+ || (VAR_P (obj)
+ && decl_storage_duration (obj) == dk_auto
+ && !TYPE_VOLATILE (TREE_TYPE (obj


this is wrong and insufficient. As I explained on IRC -- 'expressions of 
reference type are not a thing'.  We always bash them into an 
indirect-ref of the referency thing.


Look at check_return_value and how it handles NRVO.  There's a
treat_lvalue_as_rvalue_p call that I'll bet you can use[*], along with a
move wrapper, which may be what you want rather than the below rvalue (expr).


a return_expr's operand is an INIT_EXPR or MODIFY_EXPR whose LHS is 
DECL_RESULT(fn).  So that's almost what you want here.  I think the 
general form is

 if (treat_lval_as_rval ())
   { silently try callexpr with moved operand }

if no joy either way
   { noisily try callexpr with unmoved operand }

[*] That is insufficient, because elision also applies to objects you've
placed into the coro frame.  By the std those are still automatic
storage objects, but inside the compiler they are of course
COMPONENT_REFs (or do you cons up VAR_DECLs of reference type that refer
to them?).  Further, the PARM_DECL case can't occur here -- the actor
fn's only parm is the coro frame pointer.
Either way you'll need to teach TLAR a new trick, or clone it locally
(probably the better choice).
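
For readers following along, the "try it moved first, then fall back" shape can be mimicked in ordinary library code.  This is only an analogy under stated assumptions, not the front-end implementation: the promise-like types, accepts_rvalue and deliver are hypothetical names, and the silent first attempt is modelled with SFINAE rather than with the compiler's own overload resolution.  It assumes C++17.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

struct Value {};

// Hypothetical promise-like types; only their overload sets matter here.
struct MoveAwarePromise {
  std::string chosen;
  void return_value(const Value&) { chosen = "lvalue"; }
  void return_value(Value&&) { chosen = "rvalue"; }
};

struct LvalueOnlyPromise {
  std::string chosen;
  void return_value(Value&) { chosen = "lvalue"; }
};

// Detects whether promise.return_value (<rvalue of T>) is well-formed.
// A failed substitution produces no diagnostic, like the silent first try.
template <typename Promise, typename T, typename = void>
struct accepts_rvalue : std::false_type {};

template <typename Promise, typename T>
struct accepts_rvalue<
    Promise, T,
    std::void_t<decltype(std::declval<Promise&>().return_value(
        std::declval<T&&>()))>> : std::true_type {};

template <typename Promise, typename T>
void deliver(Promise& promise, T& local) {
  if constexpr (accepts_rvalue<Promise, T>::value)
    promise.return_value(std::move(local));  // first try: operand as rvalue
  else
    promise.return_value(local);  // fallback: plain lvalue call, errors show here
}

// Small usage example.
void example_usage() {
  Value v;
  MoveAwarePromise promise;
  deliver(promise, v);
  assert(promise.chosen == "rvalue");
}
```

With LvalueOnlyPromise the rvalue attempt is not viable (an rvalue cannot bind to Value&), so deliver quietly takes the second branch.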


in place of TLAR's (parm_ok && ...) I think you want
 (code == COMPONENT_REF && OPERAND (0) == coro_frame)
and of course move it outside the (irrelevant) context check

does that help?

nathan
--
Nathan Sidwell


Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread Uros Bizjak via Gcc-patches
On Wed, May 13, 2020 at 2:37 PM H.J. Lu  wrote:
>
> On Wed, May 13, 2020 at 5:04 AM Uros Bizjak  wrote:
> >
> > On Wed, May 13, 2020 at 1:05 PM Uros Bizjak  wrote:
> > >
> > > On Tue, May 12, 2020 at 10:07 PM H.J. Lu  wrote:
> > > >
> > > > > Update STV pass to properly count cost of XMM register push.  In 32-bit
> > > > > mode, to convert XMM register push in DImode, we do an XMM store in
> > > > > DImode, followed by 2 memory pushes in SImode, instead of 2 integer
> > > > > register pushes in SImode.  To convert XMM register push in SImode, we
> > > > > do an XMM register to integer register move in SImode, followed by an
> > > > > integer register push in SImode, instead of an integer register push in
> > > > > SImode.  In 64-bit mode, we do an XMM register to integer register move
> > > > > in SImode or DImode, followed by an integer register push in SImode or
> > > > > DImode, instead of an integer register push in SImode or DImode.
> > > >
> > > > Tested on Linux/x86 and Linux/x86-64.
> > >
> > > I think it is better to implement XMM register pushes, and split them
> > > after reload to a sequence of:
> > >
> > > > (set (reg:P SP_REG) (plus:P (reg:P SP_REG) (const_int -8)))
> > > (set (match_dup 0) (match_dup 1))
> > >
> > > This is definitely better than trips through memory to stack.
> >
> > Attached (untested patch) allows fake pushes from XMM registers, so
> > STV pass can allow pushes.
>
> The problem isn't STV pass.  The IRA pass won't assign hard register for

Please see the sequence before STV pass:

(insn 24 23 25 3 (set (reg/v:DI 85 [ target ])
(mem:DI (reg/f:SI 86 [ target_p ]) [2 *target_p.0_1+0 S8
A32])) "pr95021.c":17:7 64 {*movdi_internal}
 (expr_list:REG_DEAD (reg/f:SI 86 [ target_p ])
(nil)))

(insn 26 25 27 3 (set (mem:DI (reg/f:SI 87 [ c ]) [2 c.1_2->target+0 S8 A32])
(reg/v:DI 85 [ target ])) "pr95021.c":18:15 64 {*movdi_internal}
 (expr_list:REG_DEAD (reg/f:SI 87 [ c ])
(nil)))

(insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
(reg/v:DI 85 [ target ])) "pr95021.c":19:5 40 {*pushdi2}
 (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
(expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
(nil))))

So, (reg 85) gets moved to a memory in (insn 26) and pushed to a stack
in (insn 28).
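
For orientation, source of roughly the following shape would give a load, a store and a stack-argument push like the three insns above.  This is a hypothetical reduction written for illustration, not the contents of pr95021.c; the names Holder, consume and forward are made up, and the mapping to insn numbers assumes a 32-bit x86 build where a 64-bit argument is passed on the stack.

```cpp
#include <cassert>

struct Holder {
  long long target;
};

static long long last_pushed = 0;

// Callee taking the 64-bit value as an argument.  An optimizing compiler is
// free to inline this; the RTL above corresponds to a real call.
void consume(Holder* c, long long value) {
  (void)c;
  last_pushed = value;
}

void forward(Holder* c, const long long* target_p) {
  long long target = *target_p;  // like insn 24: 64-bit load into a pseudo
  c->target = target;            // like insn 26: 64-bit store to memory
  consume(c, target);            // like insn 28: value pushed as an argument
}

// Small usage example.
void example_usage() {
  Holder h{0};
  long long v = 7;
  forward(&h, &v);
  assert(h.target == 7);
}
```

The same 64-bit value feeds both the memory store and the push, which is why keeping it in one XMM register for both uses looks attractive.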

STV pass does this:

(insn 24 41 37 3 (set (subreg:V2DI (reg:DI 89) 0)
(subreg:V2DI (reg:DI 91) 0)) "pr95021.c":17:7 1342 {movv2di_internal}
 (expr_list:REG_DEAD (reg/f:SI 86 [ target_p ])
(nil)))

(insn 37 24 38 3 (set (reg:V2DI 90)
(subreg:V2DI (reg:DI 89) 0)) "pr95021.c":17:7 -1
 (nil))
(insn 38 37 39 3 (set (subreg:SI (reg/v:DI 85 [ target ]) 0)
(subreg:SI (reg:V2DI 90) 0)) "pr95021.c":17:7 -1
 (nil))
(insn 39 38 40 3 (set (reg:V2DI 90)
(lshiftrt:V2DI (reg:V2DI 90)
(const_int 32 [0x20]))) "pr95021.c":17:7 -1
 (nil))
(insn 40 39 25 3 (set (subreg:SI (reg/v:DI 85 [ target ]) 4)
(subreg:SI (reg:V2DI 90) 0)) "pr95021.c":17:7 -1
 (nil))

(insn 26 25 27 3 (set (mem:DI (reg/f:SI 87 [ c ]) [2 c.1_2->target+0 S8 A32])
(reg:DI 89)) "pr95021.c":18:15 64 {*movdi_internal}
 (expr_list:REG_DEAD (reg/f:SI 87 [ c ])
(nil)))

(insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
(reg/v:DI 85 [ target ])) "pr95021.c":19:5 40 {*pushdi2}
 (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
(expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
(nil))))

For some reason (reg 89) moves to (reg 85) via sequence (insn 37) to
(insn 40), splitting to SImode and reassembling back to (reg 85).

>
> (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> (reg/v:DI 85 [ target ])) "x.i":19:5 40 {*pushdi2}
>  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> (nil))))
>
> and the reload pass turns it into
>
> 
> (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> (mem/c:DI (plus:SI (reg/f:SI 7 sp)
> (const_int 16 [0x10])) [8 %sfp+-8 S8 A64])) "x.i":19:5
> 40 {*pushdi2}
>  (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> (nil)))

This is an optimization of the reload pass: it figures out where the
values live and tries its best to assign hard regs and stack slots to
the above convoluted sequence. Note that a DImode push from integer
regs splits to SImode pushes after reload. This has nothing to do with
the STV pass.

The question is, why does the STV pass create its funny sequence? The
original sequence should be easily solved by storing DImode from the XMM
register and (with patched gcc) pushing the DImode value from the same
XMM register.

Uros.


Re: ChangeLog files - server and client scripts

2020-05-13 Thread Richard Earnshaw
On 13/05/2020 12:05, Martin Liška wrote:
> Hi.
> 
> I'm sending the gcc-changelog related scripts, which should be added to the
> contrib folder. The patch contains:
> - git_check_commit.py - checking script that verifies git message format
> - git_update_version.py - a replacement of
> maintainer-scripts/update_version_git which
> bumps DATESTAMP and generates ChangeLog entries (for now into
> ChangeLog.test files)
> - git_commit.py, git_email.py and git_repository.py - helper classes
> 
> I also added a new git.config alias: 'gcc-verify' which can be used in
> the following
> way:
> 
> $ git gcc-verify HEAD~2..HEAD -p -n
> Checking 0e4009e9d523270e26856d2441c1be3d8119a477
> OK
> @@CL contrib
> 2020-05-13  Martin Liska  
> 
> * gcc-changelog/git_check_commit.py: New file.
> * gcc-changelog/git_commit.py: New file.
> * gcc-changelog/git_email.py: New file.
> * gcc-changelog/git_repository.py: New file.
> * gcc-changelog/git_update_version.py: New file.
> * gcc-git-customization.sh: Add gcc-verify alias.
> @@CL
> Checking 18edc195442291525e04f0fa4d5ef972155117da
> OK
> @@CL gcc
> 2020-05-13  Jakub Jelinek  
> 
> PR debug/95080
> * cfgrtl.c (purge_dead_edges): Skip over debug and note insns even
> if the last insn is a note.
> @@CL gcc/testsuite
> 2020-05-13  Jakub Jelinek  
> 
> PR debug/95080
> * g++.dg/opt/pr95080.C: New test.
> @@CL
> 
> Note the -n option which disables _strict mode_ (modification of both
> ChangeLog and other files).
> 
> The second part is a git hook that will reject all commits for release and
> master branches that violate the ChangeLog format. Right now, strict mode
> is disabled in the hooks.
> 
> What's still missing is the format of Revert and Backport commits.
> I suggest using native 'git revert XYZ' and 'git cherry-pick -x XYZ'.
> That way the commit messages will provide a link to the original commit and
> the script can later append the corresponding 'Backported ..' or 'Reverted'
> line.
> 
> Thoughts?
> Martin

I've just realized this doesn't give us an easy way to mark changes for
the root-level ChangeLog file, unless, perhaps "@@ CL ." works?

R.


Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread H.J. Lu via Gcc-patches
On Wed, May 13, 2020 at 6:17 AM Uros Bizjak  wrote:
>
> On Wed, May 13, 2020 at 2:37 PM H.J. Lu  wrote:
> >
> > On Wed, May 13, 2020 at 5:04 AM Uros Bizjak  wrote:
> > >
> > > On Wed, May 13, 2020 at 1:05 PM Uros Bizjak  wrote:
> > > >
> > > > On Tue, May 12, 2020 at 10:07 PM H.J. Lu  wrote:
> > > > >
> > > > > > Update STV pass to properly count cost of XMM register push.  In
> > > > > > 32-bit mode, to convert XMM register push in DImode, we do an XMM
> > > > > > store in DImode, followed by 2 memory pushes in SImode, instead of
> > > > > > 2 integer register pushes in SImode.  To convert XMM register push
> > > > > > in SImode, we do an XMM register to integer register move in SImode,
> > > > > > followed by an integer register push in SImode, instead of an
> > > > > > integer register push in SImode.  In 64-bit mode, we do an XMM
> > > > > > register to integer register move in SImode or DImode, followed by
> > > > > > an integer register push in SImode or DImode, instead of an integer
> > > > > > register push in SImode or DImode.
> > > > >
> > > > > Tested on Linux/x86 and Linux/x86-64.
> > > >
> > > > I think it is better to implement XMM register pushes, and split them
> > > > after reload to a sequence of:
> > > >
> > > > > (set (reg:P SP_REG) (plus:P (reg:P SP_REG) (const_int -8)))
> > > > (set (match_dup 0) (match_dup 1))
> > > >
> > > > This is definitely better than trips through memory to stack.
> > >
> > > Attached (untested patch) allows fake pushes from XMM registers, so
> > > STV pass can allow pushes.
> >
> > The problem isn't STV pass.  The IRA pass won't assign hard register for
>
> Please see the sequence before STV pass:
>
> (insn 24 23 25 3 (set (reg/v:DI 85 [ target ])
> (mem:DI (reg/f:SI 86 [ target_p ]) [2 *target_p.0_1+0 S8
> A32])) "pr95021.c":17:7 64 {*movdi_internal}
>  (expr_list:REG_DEAD (reg/f:SI 86 [ target_p ])
> (nil)))
>
> (insn 26 25 27 3 (set (mem:DI (reg/f:SI 87 [ c ]) [2 c.1_2->target+0 S8 A32])
> (reg/v:DI 85 [ target ])) "pr95021.c":18:15 64 {*movdi_internal}
>  (expr_list:REG_DEAD (reg/f:SI 87 [ c ])
> (nil)))
>
> (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> (reg/v:DI 85 [ target ])) "pr95021.c":19:5 40 {*pushdi2}
>  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> (nil))))
>
> So, (reg 85) gets moved to a memory in (insn 26) and pushed to a stack
> in (insn 28).
>
> STV pass does this:
>
> (insn 24 41 37 3 (set (subreg:V2DI (reg:DI 89) 0)
> (subreg:V2DI (reg:DI 91) 0)) "pr95021.c":17:7 1342 {movv2di_internal}
>  (expr_list:REG_DEAD (reg/f:SI 86 [ target_p ])
> (nil)))
>
> (insn 37 24 38 3 (set (reg:V2DI 90)
> (subreg:V2DI (reg:DI 89) 0)) "pr95021.c":17:7 -1
>  (nil))
> (insn 38 37 39 3 (set (subreg:SI (reg/v:DI 85 [ target ]) 0)
> (subreg:SI (reg:V2DI 90) 0)) "pr95021.c":17:7 -1
>  (nil))
> (insn 39 38 40 3 (set (reg:V2DI 90)
> (lshiftrt:V2DI (reg:V2DI 90)
> (const_int 32 [0x20]))) "pr95021.c":17:7 -1
>  (nil))
> (insn 40 39 25 3 (set (subreg:SI (reg/v:DI 85 [ target ]) 4)
> (subreg:SI (reg:V2DI 90) 0)) "pr95021.c":17:7 -1
>  (nil))
>
> (insn 26 25 27 3 (set (mem:DI (reg/f:SI 87 [ c ]) [2 c.1_2->target+0 S8 A32])
> (reg:DI 89)) "pr95021.c":18:15 64 {*movdi_internal}
>  (expr_list:REG_DEAD (reg/f:SI 87 [ c ])
> (nil)))
>
> (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> (reg/v:DI 85 [ target ])) "pr95021.c":19:5 40 {*pushdi2}
>  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> (nil))))
>
> For some reason (reg 89) moves to (reg 85) via sequence (insn 37) to
> (insn 40), splitting to SImode and reassembling back to (reg 85).
>
> >
> > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > (reg/v:DI 85 [ target ])) "x.i":19:5 40 {*pushdi2}
> >  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> > (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> > (nil))))
> >
> > and the reload pass turns it into
> >
> > 
> > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > (mem/c:DI (plus:SI (reg/f:SI 7 sp)
> > (const_int 16 [0x10])) [8 %sfp+-8 S8 A64])) "x.i":19:5
> > 40 {*pushdi2}
> >  (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> > (nil)))
>
> This is an optimization of the reload pass: it figures out where the
> values live and tries its best to assign hard regs and stack slots to
> the above convoluted sequence. Note that a DImode push from integer
> regs splits to SImode pushes after reload. This has nothing to do with
> the STV pass.
>
> The question is, why does the STV pass create its funny sequence? The
> original sequence should be easily solved by storing DImode from the XMM
> register and (with patched gcc) pushing the DImode value from the same XMM

Re: [PATCH] coroutines: Implicitly movable objects should use move CTORs for co_return.

2020-05-13 Thread Iain Sandoe

Nathan Sidwell  wrote:


On 5/13/20 6:59 AM, Iain Sandoe wrote:



@@ -1045,25 +1044,50 @@ finish_co_return_stmt (location_t kw, tree expr)
   if (!crv_meth || crv_meth == error_mark_node)
return error_mark_node;
 -  vec *args = make_tree_vector_single (expr);
-  co_ret_call = build_new_method_call (
-   get_coroutine_promise_proxy (current_function_decl), crv_meth, &args,
-   NULL_TREE, LOOKUP_NORMAL, NULL, tf_warning_or_error);
+  /* [class.copy.elision] / 3.
+An implicitly movable entity is a variable of automatic storage
+duration that is either a non-volatile object or an rvalue reference
+to a non-volatile object type.  For such objects in the context of
+the co_return, the overload resolution should be carried out first
+treating the object as an rvalue, if that fails, then we fall back
+to regular overload resolution.  */
+  tree obj = STRIP_NOPS (expr);
+  if (TREE_TYPE (obj)
+ && TYPE_REF_P (TREE_TYPE (obj))
+ && TYPE_REF_IS_RVALUE (TREE_TYPE (obj)))
+   obj = TREE_OPERAND (obj, 0);
+
+  if (TREE_CODE (obj) == PARM_DECL
+ || (VAR_P (obj)
+ && decl_storage_duration (obj) == dk_auto
+ && !TYPE_VOLATILE (TREE_TYPE (obj


this is wrong and insufficient. As I explained on IRC -- 'expressions of  
reference type are not a thing'.  We always bash them into an  
indirect-ref of the referency thing.


Look at check_return_value and how it handles NRVO.  There's a
treat_lvalue_as_rvalue_p call that I'll bet you can use[*], along with a
move wrapper, which may be what you want rather than the below rvalue
(expr).


a return_expr's operand is an INIT_EXPR or MODIFY_EXPR whose LHS is  
DECL_RESULT(fn).  So that's almost what you want here.  I think the  
general form is

if (treat_lval_as_rval ())
  { silently try callexpr with moved operand }

if no joy either way
  { noisily try callexpr with unmoved operand }

[*] That is insufficient, because elision also applies to objects you've
placed into the coro frame.  By the std those are still automatic storage
objects, but inside the compiler they are of course COMPONENT_REFs (or do
you cons up VAR_DECLs of reference type that refer to them?).  Further,
the PARM_DECL case can't occur here -- the actor fn's only parm is the
coro frame pointer.


This is the equivalent of finish_return_stmt () in the parser; it knows
nothing of the eventual morphing of local vars (or parms) into frame
references.


So I only need to handle what can be returned by "expr =
cp_parser_expression (parser);".
Dependent expressions are dealt with above, with an early return with
"type_unknown_node".


So, it might be I can just reuse the code you point to?

Iain





Re: ChangeLog files - server and client scripts

2020-05-13 Thread Martin Liška

On 5/13/20 3:24 PM, Richard Earnshaw wrote:

I've just realized this doesn't give us an easy way to mark changes for
the root-level ChangeLog file, unless, perhaps "@@ CL ." works?


This works fine:
'ChangeLog:'

as seen for instance here:

commit 9ad3c1d81c129fc76594b9df5b798c380cbf03ee
Author: Stefan Schulze Frielinghaus 
Date:   Wed Apr 22 09:20:08 2020 +0200

MAINTAINERS: add myself for write after approval

ChangeLog:

2020-04-22  Stefan Schulze Frielinghaus  

* MAINTAINERS (Write After Approval): add myself
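
To make the header convention concrete: a line such as 'ChangeLog:' selects the root-level file, while 'gcc/testsuite/ChangeLog:' selects the one in that directory.  The helper below is a hypothetical sketch of that mapping and is not taken from the gcc-changelog scripts; the name changelog_directory and the "." result for the root are assumptions made for the example.  It assumes C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Maps a commit-message header line to the directory whose ChangeLog it
// names.  Returns "." for the root-level file and nothing for other lines.
std::optional<std::string> changelog_directory(const std::string& line) {
  static const std::string suffix = "ChangeLog:";
  if (line.size() < suffix.size()) return std::nullopt;

  const std::size_t split = line.size() - suffix.size();
  if (line.compare(split, suffix.size(), suffix) != 0) return std::nullopt;

  if (split == 0) return std::string(".");  // bare "ChangeLog:" means the root
  // Require a non-empty directory followed by '/', so "FooChangeLog:" and
  // "/ChangeLog:" are rejected.
  if (split == 1 || line[split - 1] != '/') return std::nullopt;
  return line.substr(0, split - 1);
}

// Small usage example.
void example_usage() {
  assert(changelog_directory("gcc/ChangeLog:").value() == "gcc");
}
```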


Martin


Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread Uros Bizjak via Gcc-patches
On Wed, May 13, 2020 at 3:25 PM H.J. Lu  wrote:
>
> On Wed, May 13, 2020 at 6:17 AM Uros Bizjak  wrote:
> >
> > On Wed, May 13, 2020 at 2:37 PM H.J. Lu  wrote:
> > >
> > > On Wed, May 13, 2020 at 5:04 AM Uros Bizjak  wrote:
> > > >
> > > > On Wed, May 13, 2020 at 1:05 PM Uros Bizjak  wrote:
> > > > >
> > > > > On Tue, May 12, 2020 at 10:07 PM H.J. Lu  wrote:
> > > > > >
> > > > > > > Update STV pass to properly count cost of XMM register push.  In
> > > > > > > 32-bit mode, to convert XMM register push in DImode, we do an XMM
> > > > > > > store in DImode, followed by 2 memory pushes in SImode, instead of
> > > > > > > 2 integer register pushes in SImode.  To convert XMM register push
> > > > > > > in SImode, we do an XMM register to integer register move in
> > > > > > > SImode, followed by an integer register push in SImode, instead of
> > > > > > > an integer register push in SImode.  In 64-bit mode, we do an XMM
> > > > > > > register to integer register move in SImode or DImode, followed by
> > > > > > > an integer register push in SImode or DImode, instead of an integer
> > > > > > > register push in SImode or DImode.
> > > > > >
> > > > > > Tested on Linux/x86 and Linux/x86-64.
> > > > >
> > > > > I think it is better to implement XMM register pushes, and split them
> > > > > after reload to a sequence of:
> > > > >
> > > > > > (set (reg:P SP_REG) (plus:P (reg:P SP_REG) (const_int -8)))
> > > > > (set (match_dup 0) (match_dup 1))
> > > > >
> > > > > This is definitely better than trips through memory to stack.
> > > >
> > > > Attached (untested patch) allows fake pushes from XMM registers, so
> > > > STV pass can allow pushes.
> > >
> > > The problem isn't STV pass.  The IRA pass won't assign hard register for
> >
> > Please see the sequence before STV pass:
> >
> > (insn 24 23 25 3 (set (reg/v:DI 85 [ target ])
> > (mem:DI (reg/f:SI 86 [ target_p ]) [2 *target_p.0_1+0 S8
> > A32])) "pr95021.c":17:7 64 {*movdi_internal}
> >  (expr_list:REG_DEAD (reg/f:SI 86 [ target_p ])
> > (nil)))
> >
> > (insn 26 25 27 3 (set (mem:DI (reg/f:SI 87 [ c ]) [2 c.1_2->target+0 S8 
> > A32])
> > (reg/v:DI 85 [ target ])) "pr95021.c":18:15 64 {*movdi_internal}
> >  (expr_list:REG_DEAD (reg/f:SI 87 [ c ])
> > (nil)))
> >
> > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > (reg/v:DI 85 [ target ])) "pr95021.c":19:5 40 {*pushdi2}
> >  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> > (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> > (nil))))
> >
> > So, (reg 85) gets moved to a memory in (insn 26) and pushed to a stack
> > in (insn 28).
> >
> > STV pass does this:
> >
> > (insn 24 41 37 3 (set (subreg:V2DI (reg:DI 89) 0)
> > (subreg:V2DI (reg:DI 91) 0)) "pr95021.c":17:7 1342 
> > {movv2di_internal}
> >  (expr_list:REG_DEAD (reg/f:SI 86 [ target_p ])
> > (nil)))
> >
> > (insn 37 24 38 3 (set (reg:V2DI 90)
> > (subreg:V2DI (reg:DI 89) 0)) "pr95021.c":17:7 -1
> >  (nil))
> > (insn 38 37 39 3 (set (subreg:SI (reg/v:DI 85 [ target ]) 0)
> > (subreg:SI (reg:V2DI 90) 0)) "pr95021.c":17:7 -1
> >  (nil))
> > (insn 39 38 40 3 (set (reg:V2DI 90)
> > (lshiftrt:V2DI (reg:V2DI 90)
> > (const_int 32 [0x20]))) "pr95021.c":17:7 -1
> >  (nil))
> > (insn 40 39 25 3 (set (subreg:SI (reg/v:DI 85 [ target ]) 4)
> > (subreg:SI (reg:V2DI 90) 0)) "pr95021.c":17:7 -1
> >  (nil))
> >
> > (insn 26 25 27 3 (set (mem:DI (reg/f:SI 87 [ c ]) [2 c.1_2->target+0 S8 
> > A32])
> > (reg:DI 89)) "pr95021.c":18:15 64 {*movdi_internal}
> >  (expr_list:REG_DEAD (reg/f:SI 87 [ c ])
> > (nil)))
> >
> > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > (reg/v:DI 85 [ target ])) "pr95021.c":19:5 40 {*pushdi2}
> >  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> > (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> > (nil))))
> >
> > For some reason (reg 89) moves to (reg 85) via sequence (insn 37) to
> > (insn 40), splitting to SImode and reassembling back to (reg 85).
> >
> > >
> > > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > > (reg/v:DI 85 [ target ])) "x.i":19:5 40 {*pushdi2}
> > >  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> > > (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> > > (nil))))
> > >
> > > and the reload pass turns it into
> > >
> > > 
> > > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > > (mem/c:DI (plus:SI (reg/f:SI 7 sp)
> > > (const_int 16 [0x10])) [8 %sfp+-8 S8 A64])) "x.i":19:5
> > > 40 {*pushdi2}
> > >  (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> > > (nil)))
> >
> > This is an optimization of the reload pass, it figures out where the
> > values live and tries its best to assign hard regs and stack slots to
> > the above convoluted sequence. Note, that DImod

[committed] c++: Add testcase for already-fixed PR [PR70642]

2020-05-13 Thread Patrick Palka via Gcc-patches
We correctly reject the testcase in this PR ever since commit r9-7046.

gcc/testsuite/ChangeLog:

PR c++/70642
* g++.dg/cpp0x/alias-decl-70.C: New test.
---
 gcc/testsuite/ChangeLog|  5 +
 gcc/testsuite/g++.dg/cpp0x/alias-decl-70.C | 23 ++
 2 files changed, 28 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-70.C

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 22adacb264c..4f602ed6f31 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2020-05-13  Patrick Palka  
+
+   PR c++/70642
+   * g++.dg/cpp0x/alias-decl-70.C: New test.
+
 2020-05-13  Jakub Jelinek  
 
PR debug/95080
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-70.C 
b/gcc/testsuite/g++.dg/cpp0x/alias-decl-70.C
new file mode 100644
index 000..28d9279e8a4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-70.C
@@ -0,0 +1,23 @@
+// PR c++/70642
+// { dg-do compile { target c++11 } }
+
+template struct enable_if {};
+template struct enable_if { using type = T; };
+
+template 
+struct foo
+{
+ template 
+ using meow = typename enable_if::type; // { dg-error 
"no type named .type." }
+
+ template// 1
+ meow bar () = delete;
+
+ int bar ()
+ {
+meow i;  // 2
+return 0; // 3
+ }
+};
+
+int j = foo().bar();
-- 
2.26.2.561.g07d8ea56f2



Re: [PATCH] tree-optimization/33315 - common stores during sinking

2020-05-13 Thread Martin Sebor via Gcc-patches

On 5/13/20 2:20 AM, Richard Biener wrote:


This implements commoning of stores to a common successor in
a simple ad-hoc way.  I've decided to put it into the code sinking
pass since, well, it sinks stores.  It's still separate since
it does not really sink code into less executed places.

It's ad-hoc since it does not perform any dataflow or alias analysis
but simply only considers trailing stores in a block, iteratively
though.  If the stores are from different values a PHI node is
inserted to merge them.  gcc.dg/tree-ssa/split-path-7.c shows
that path splitting will eventually undo this very transform,
I've decided to not bother with it and simply disable sinking for
the particular testcase.

Doing this transform is good for code size when the stores are
from constants, once we have to insert PHIs the situation becomes
less clear but it's a transform we do elsewhere as well
(cselim for one), and reversing the transform should be easy.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Any comments?


Rather than adding the new code to sink_code_in_bb I would suggest
considering adding either all of it or just the while loop (to reduce
block nesting) into a new function and calling it from sink_code_in_bb.

I'd expect it to improve readability (sink_code_in_bb is fairly
small by GCC standards, only 100 lines long, but with the new code
it would grow to over 250).  It might also make it easier to move
the code later as suggested in the comments.

Other than that, should the new code update the pass stats counter?

Martin



2020-05-13  Richard Biener  

PR tree-optimization/33315
* tree-ssa-sink.c: Include tree-eh.h.
(sink_code_in_bb): Return TODO_cleanup_cfg if we commonized
and sunk stores.  Implement store commoning by sinking to
the successor.

* gcc.dg/tree-ssa/ssa-sink-13.c: New testcase.
* gcc.dg/tree-ssa/ssa-sink-14.c: Likewise.
* gcc.dg/tree-ssa/split-path-7.c: Disable sinking.
---
  gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c |   2 +-
  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c  |  25 
  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-14.c  |  17 +++
  gcc/tree-ssa-sink.c  | 168 ++-
  4 files changed, 207 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-14.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c
index 3d6186b34d9..a5df75c9b72 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-7.c
@@ -1,5 +1,5 @@
  /* { dg-do compile } */
-/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fdump-tree-split-paths-details 
-w" } */
+/* { dg-options "-O2 -fsplit-paths -fno-tree-cselim -fno-tree-sink 
-fdump-tree-split-paths-details -w" } */
  
  
  struct _reent

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c
new file mode 100644
index 000..a65ba35d4ba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c
@@ -0,0 +1,25 @@
+/* PR33315 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sink" } */
+
+int num;
+int a[20];
+
+void test ()
+{
+  int i;
+  int *ptr;
+  ptr = & a[0];
+  i = num;
+  if ( i == 1) *(ptr+0) = 0;
+  if ( i != 1) *(ptr+0) = 0;
+  if ( i == 2) *(ptr+1) = 0;
+  if ( i != 2) *(ptr+1) = 0;
+  if ( i == 3) *(ptr+2) = 0;
+  if ( i != 3) *(ptr+2) = 0;
+}
+
+/* We should sink/merge all stores and end up with a single BB.  */
+
+/* { dg-final { scan-tree-dump-times "MEM\[^\n\r\]* = 0;" 3 "sink" } } */
+/* { dg-final { scan-tree-dump-times "  
  /* TODO:

 1. Sinking store only using scalar promotion (IE without moving the RHS):
@@ -469,7 +470,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
  
  /* Perform code sinking on BB */
  
-static void

+static unsigned
  sink_code_in_bb (basic_block bb)
  {
basic_block son;
@@ -477,6 +478,163 @@ sink_code_in_bb (basic_block bb)
edge_iterator ei;
edge e;
bool last = true;
+  unsigned todo = 0;
+
+  /* Very simplistic code to sink common stores from the predecessor through
+ our virtual PHI.  We do this before sinking stmts from BB as it might
+ expose sinking opportunities of the merged stores.
+ Once we have partial dead code elimination through sth like SSU-PRE this
+ should be moved there.  */
+  gphi *phi;
+  if (EDGE_COUNT (bb->preds) > 1
+  && (phi = get_virtual_phi (bb)))
+{
+  /* Repeat until no more common stores are found.  */
+  while (1)
+   {
+ gimple *first_store = NULL;
+ auto_vec  vdefs;
+
+ /* Search for common stores defined by all virtual PHI args.
+???  Common stores not present in all predecessors could
+be handled by inserting a forwarder to sink to.  Generally
+this involves deciding wh

Re: [PATCH] coroutines: Implicitly movable objects should use move CTORs for co_return.

2020-05-13 Thread Nathan Sidwell

On 5/13/20 9:26 AM, Iain Sandoe wrote:

Nathan Sidwell  wrote:


On 5/13/20 6:59 AM, Iain Sandoe wrote:



@@ -1045,25 +1044,50 @@ finish_co_return_stmt (location_t kw, tree expr)
   if (!crv_meth || crv_meth == error_mark_node)
 return error_mark_node;
 -  vec *args = make_tree_vector_single (expr);
-  co_ret_call = build_new_method_call (
-    get_coroutine_promise_proxy (current_function_decl), crv_meth, 
&args,

-    NULL_TREE, LOOKUP_NORMAL, NULL, tf_warning_or_error);
+  /* [class.copy.elision] / 3.
+ An implicitly movable entity is a variable of automatic storage
+ duration that is either a non-volatile object or an rvalue 
reference

+ to a non-volatile object type.  For such objects in the context of
+ the co_return, the overload resolution should be carried out first
+ treating the object as an rvalue, if that fails, then we fall back
+ to regular overload resolution.  */
+  tree obj = STRIP_NOPS (expr);
+  if (TREE_TYPE (obj)
+  && TYPE_REF_P (TREE_TYPE (obj))
+  && TYPE_REF_IS_RVALUE (TREE_TYPE (obj)))
+    obj = TREE_OPERAND (obj, 0);
+
+  if (TREE_CODE (obj) == PARM_DECL
+  || (VAR_P (obj)
+  && decl_storage_duration (obj) == dk_auto
+  && !TYPE_VOLATILE (TREE_TYPE (obj


this is wrong and insufficient. As I explained on IRC -- 'expressions 
of reference type are not a thing'.  We always bash them into an 
indirect-ref of the referency thing.


Look at check_return_value and how it handles NRVO.  There's a
treat_lvalue_as_rvalue_p call that I'll bet you can use[*], along with
a move wrapper, which may be what you want rather than the below rvalue
(expr).


a return_expr's operand is an INIT_EXPR or MODIFY_EXPR whose LHS is 
DECL_RESULT(fn).  So that's almost what you want here.  I think the 
general form is

if (treat_lval_as_rval ())
  { silently try callexpr with moved operand }

if no joy either way
  { noisily try callexpr with unmoved operand }

[*] That is insufficient, because elision also applies to objects
you've placed into the coro frame.  By the std those are still
automatic storage objects, but inside the compiler they are of course
COMPONENT_REFs (or do you cons up VAR_DECLs of reference type that
refer to them?).  Further, the PARM_DECL case can't occur here -- the
actor fn's only parm is the coro frame pointer.


This is the equivalent of finish_return_stmt () in the parser, it knows 
nothing of the eventual morphing of local vars (or parms) into frame 
references.


So I only need to handle what can be returned by "expr =
cp_parser_expression (parser);"
dependent expressions are dealt with above, with an early return with
"type_unknown_node".


So, it might be I can just reuse the code you point to?


Yeah I think so.  I realized it was pre-xform so the component-ref stuff 
was irrelevant when out shopping.  In other news, flour's back in stock :)


nathan

--
Nathan Sidwell


[PATCH] Don't make -gsplit-dwarf imply -g

2020-05-13 Thread Fangrui Song via Gcc-patches

On 2020-05-13, Eric Botcazou wrote:

Did I mention I dislike -fsplit-dwarf? ;)


Seconded, this will be confusing for almost all users.  Since the option only
affects debug info generation, it should be prefixed with 'g' in any case.


Updating the semantics of -gsplit-dwarf is actually my favorite as
well:)

-gsplit-dwarf is not common. Many uses have separate -g. Let's change it.

Attached the patch.
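
To make the order dependency concrete, here is a toy model of left-to-right option processing. It is only a sketch under stated assumptions: DebugState, process and the split_implies_g switch are hypothetical names, and the rule "an option that implies -g raises a level below 2 up to 2" is an assumption chosen to match the description that -gsplit-dwarf overrides an earlier -g0 or -g1. It is not GCC's option machinery.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only; none of these names exist in GCC.
struct DebugState {
  int level = 0;       // 0 = no debug info, 2 = default -g level
  bool split = false;  // whether .dwo splitting was requested
};

// Processes options left to right.  'split_implies_g' selects the old
// behavior (true) or the proposed behavior (false).
DebugState process(const std::vector<std::string>& args, bool split_implies_g) {
  DebugState st;
  for (const std::string& a : args) {
    if (a == "-g0") {
      st.level = 0;
    } else if (a == "-g1") {
      st.level = 1;
    } else if (a == "-g") {
      if (st.level < 2) st.level = 2;
    } else if (a == "-gsplit-dwarf") {
      st.split = true;
      // Old behavior: splitting also turns debug info on.
      if (split_implies_g && st.level < 2) st.level = 2;
    }
  }
  return st;
}
```

In this model, "-g0 -gsplit-dwarf" ends at level 2 under the old behavior but "-gsplit-dwarf -g0" ends at level 0. With the proposed behavior both orders keep whatever level the -g options selected.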


(I also wish -gdwarf-5 did not imply -g, but that ship may have sailed.)
From d389afcaf66ae9f0549ec91437a7bcb1e3b0d7d7 Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Wed, 13 May 2020 08:27:29 -0700
Subject: [PATCH] Don't make -gsplit-dwarf imply -g

-gsplit-dwarf introduces order dependency: it overrides previous -g0 and -g1.

Don't imply -g so that it can be plugged into a build without worrying
that unnecessary debugging information may be generated.

2020-05-13  Fangrui Song  

	PR debug/95096
	* opts.c (common_handle_option): Don't make -gsplit-dwarf imply -g.
	* doc/invoke.texi: Update documentation.
---
 gcc/doc/invoke.texi | 10 +-
 gcc/opts.c  |  5 -
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 850aeac033d..12b65f0f604 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8796,11 +8796,11 @@ except when selective scheduling is enabled.
 
 @item -gsplit-dwarf
 @opindex gsplit-dwarf
-Separate as much DWARF debugging information as possible into a
-separate output file with the extension @file{.dwo}.  This option allows
-the build system to avoid linking files with debug information.  To
-be useful, this option requires a debugger capable of reading @file{.dwo}
-files.
+If DWARF debugging information is enabled, separate as much debugging
+information as possible into a separate output file with the extension
+@file{.dwo}.  This option allows the build system to avoid linking files with
+debug information.  To be useful, this option requires a debugger capable of
+reading @file{.dwo} files.
 
 @item -gdescribe-dies
 @opindex gdescribe-dies
diff --git a/gcc/opts.c b/gcc/opts.c
index ec3ca0720f9..b45f22d9634 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2723,11 +2723,6 @@ common_handle_option (struct gcc_options *opts,
   set_debug_level (DWARF2_DEBUG, false, "", opts, opts_set, loc);
   break;
 
-case OPT_gsplit_dwarf:
-  set_debug_level (NO_DEBUG, DEFAULT_GDB_EXTENSIONS, "", opts, opts_set,
-		   loc);
-  break;
-
 case OPT_ggdb:
   set_debug_level (NO_DEBUG, 2, arg, opts, opts_set, loc);
   break;
-- 
2.26.2.645.ge9eca65c58-goog



[PATCH] x86: Allow vector register pushes

2020-05-13 Thread H.J. Lu via Gcc-patches
On Wed, May 13, 2020 at 6:35 AM Uros Bizjak  wrote:
>
> On Wed, May 13, 2020 at 3:25 PM H.J. Lu  wrote:
> >
> > On Wed, May 13, 2020 at 6:17 AM Uros Bizjak  wrote:
> > >
> > > On Wed, May 13, 2020 at 2:37 PM H.J. Lu  wrote:
> > > >
> > > > On Wed, May 13, 2020 at 5:04 AM Uros Bizjak  wrote:
> > > > >
> > > > > On Wed, May 13, 2020 at 1:05 PM Uros Bizjak  wrote:
> > > > > >
> > > > > > > On Tue, May 12, 2020 at 10:07 PM H.J. Lu  wrote:
> > > > > > >
> > > > > > > > Update STV pass to properly count cost of XMM register push.  In 32-bit
> > > > > > > > mode, to convert XMM register push in DImode, we do an XMM store in
> > > > > > > > DImode, followed by 2 memory pushes in SImode, instead of 2 integer
> > > > > > > > register pushes in SImode.  To convert XMM register push in SImode, we
> > > > > > > > do an XMM register to integer register move in SImode, followed by an
> > > > > > > > integer register push in SImode, instead of an integer register push in
> > > > > > > > SImode.  In 64-bit mode, we do an XMM register to integer register move
> > > > > > > > in SImode or DImode, followed by an integer register push in SImode or
> > > > > > > > DImode, instead of an integer register push in SImode or DImode.
> > > > > > >
> > > > > > > Tested on Linux/x86 and Linux/x86-64.
> > > > > >
> > > > > > > I think it is better to implement XMM register pushes, and split them
> > > > > > > after reload to a sequence of:
> > > > > >
> > > > > > > (set (reg:P SP_REG) (plus:P (reg:P SP_REG) (const_int -8)))
> > > > > > (set (match_dup 0) (match_dup 1))
> > > > > >
> > > > > > This is definitely better than trips through memory to stack.
> > > > >
> > > > > Attached (untested patch) allows fake pushes from XMM registers, so
> > > > > STV pass can allow pushes.
> > > >
> > > > The problem isn't STV pass.  The IRA pass won't assign hard register for
> > >
> > > Please see the sequence before STV pass:
> > >
> > > (insn 24 23 25 3 (set (reg/v:DI 85 [ target ])
> > > (mem:DI (reg/f:SI 86 [ target_p ]) [2 *target_p.0_1+0 S8
> > > A32])) "pr95021.c":17:7 64 {*movdi_internal}
> > >  (expr_list:REG_DEAD (reg/f:SI 86 [ target_p ])
> > > (nil)))
> > >
> > > (insn 26 25 27 3 (set (mem:DI (reg/f:SI 87 [ c ]) [2 c.1_2->target+0 S8 
> > > A32])
> > > (reg/v:DI 85 [ target ])) "pr95021.c":18:15 64 {*movdi_internal}
> > >  (expr_list:REG_DEAD (reg/f:SI 87 [ c ])
> > > (nil)))
> > >
> > > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > > (reg/v:DI 85 [ target ])) "pr95021.c":19:5 40 {*pushdi2}
> > >  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> > > (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> > > (nil
> > >
> > > So, (reg 85) gets moved to a memory in (insn 26) and pushed to a stack
> > > in (insn 28).
> > >
> > > STV pass does this:
> > >
> > > (insn 24 41 37 3 (set (subreg:V2DI (reg:DI 89) 0)
> > > (subreg:V2DI (reg:DI 91) 0)) "pr95021.c":17:7 1342 
> > > {movv2di_internal}
> > >  (expr_list:REG_DEAD (reg/f:SI 86 [ target_p ])
> > > (nil)))
> > >
> > > (insn 37 24 38 3 (set (reg:V2DI 90)
> > > (subreg:V2DI (reg:DI 89) 0)) "pr95021.c":17:7 -1
> > >  (nil))
> > > (insn 38 37 39 3 (set (subreg:SI (reg/v:DI 85 [ target ]) 0)
> > > (subreg:SI (reg:V2DI 90) 0)) "pr95021.c":17:7 -1
> > >  (nil))
> > > (insn 39 38 40 3 (set (reg:V2DI 90)
> > > (lshiftrt:V2DI (reg:V2DI 90)
> > > (const_int 32 [0x20]))) "pr95021.c":17:7 -1
> > >  (nil))
> > > (insn 40 39 25 3 (set (subreg:SI (reg/v:DI 85 [ target ]) 4)
> > > (subreg:SI (reg:V2DI 90) 0)) "pr95021.c":17:7 -1
> > >  (nil))
> > >
> > > (insn 26 25 27 3 (set (mem:DI (reg/f:SI 87 [ c ]) [2 c.1_2->target+0 S8 
> > > A32])
> > > (reg:DI 89)) "pr95021.c":18:15 64 {*movdi_internal}
> > >  (expr_list:REG_DEAD (reg/f:SI 87 [ c ])
> > > (nil)))
> > >
> > > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > > (reg/v:DI 85 [ target ])) "pr95021.c":19:5 40 {*pushdi2}
> > >  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> > > (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> > > (nil
> > >
> > > For some reason (reg 89) moves to (reg 85) via sequence (insn 37) to
> > > (insn 40), splitting to SImode and reassembling back to (reg 85).
> > >
> > > >
> > > > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > > > (reg/v:DI 85 [ target ])) "x.i":19:5 40 {*pushdi2}
> > > >  (expr_list:REG_DEAD (reg/v:DI 85 [ target ])
> > > > (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> > > > (nil
> > > >
> > > > and the reload pass turns into
> > > >
> > > > 
> > > > (insn 28 27 29 3 (set (mem:DI (pre_dec:SI (reg/f:SI 7 sp)) [2  S8 A64])
> > > > (mem/c:DI (plus:SI (reg/f:SI 7 sp)
> > > > (con

[committed] c++: Fix g++.dg/parse/attr4.C test.

2020-05-13 Thread Marek Polacek via Gcc-patches
I noticed this test failing in C++11 mode.

Tested x86_64-pc-linux-gnu, applying to trunk.

* g++.dg/parse/attr4.C: Use c++11 in a target selector.
---
 gcc/testsuite/g++.dg/parse/attr4.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/parse/attr4.C b/gcc/testsuite/g++.dg/parse/attr4.C
index 160c33e2625..36ebdbd0bae 100644
--- a/gcc/testsuite/g++.dg/parse/attr4.C
+++ b/gcc/testsuite/g++.dg/parse/attr4.C
@@ -1,4 +1,4 @@
 // PR c++/93684 - ICE-on-invalid with broken attribute.
 
 [[a:: // { dg-error "expected" }
-  // { dg-error "-:expected" "" { target c++14 } .+1 }
+  // { dg-error "-:expected" "" { target c++11 } .+1 }

base-commit: 78db0e093e69f360ac1ef871ca08895a4d2bec06
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



[PATCH] c++: Implement DR 2289, Uniqueness of structured binding names [PR94553]

2020-05-13 Thread Marek Polacek via Gcc-patches
DR 2289 clarified that since structured bindings have no C compatibility
implications, they should be unique in their declarative region, see
[basic.scope.declarative]/4.2.

The duplicate_decls hunk is the gist of the patch, but that alone would
not be enough to detect the 'A' case: cp_parser_decomposition_declaration
uses

  tree decl2 = start_decl (declarator, &decl_specs, SD_INITIALIZED,
                           NULL_TREE, NULL_TREE, &elt_pushed_scope);

to create the 'A' VAR_DECL but in this start_decl's grokdeclarator we
don't do fit_decomposition_lang_decl because the declarator kind is not
cdk_decomp, so then when start_decl calls maybe_push_decl, the decl 'A'
isn't DECL_DECOMPOSITION_P and we don't detect this case.  So I needed a
way to signal to start_decl that it should fit_decomposition_lang_decl.
A simple flag seems the easiest solution.
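
For readers less familiar with the rule, the following sketch shows the contrast DR 2289 draws. It assumes a C++17 compiler, and all names in it (S, first_of_one, read_hidden_class) are hypothetical. The C-compatibility rule lets an ordinary variable share a name with a class in the same scope; a structured binding gets no such exemption, so the commented-out lines are the kind of code this patch is meant to reject.

```cpp
#include <cassert>

struct S { int v; };
int S = 5;  // OK: a variable may hide a class name declared in the same scope

int first_of_one() {
  int arr[1] = {7};
  auto [elem] = arr;  // fine: 'elem' is unique in this scope
  // struct A { };
  // auto [A] = arr;  // ill-formed under DR 2289: 'A' already names a class here
  return elem;
}

int read_hidden_class() {
  struct S s{3};   // the elaborated-type-specifier still reaches the class
  return s.v + S;  // plain 'S' names the int variable
}
```

Here first_of_one() returns 7 and read_hidden_class() returns 8.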

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

DR 2289
PR c++/94553
* cp-tree.h (groktypename): Fix formatting.
(start_decl): Add new bool parameter with a default argument.
* decl.c (duplicate_decls): Make sure a structured binding is unique
in its declarative region.
(start_decl): Add new bool parameter.  If it's true, call
fit_decomposition_lang_decl.
* parser.c (cp_parser_decomposition_declaration): Pass true to
start_decl.

* g++.dg/cpp1z/decomp52.C: New test.
---
 gcc/cp/cp-tree.h  |  7 +--
 gcc/cp/decl.c | 13 +++--
 gcc/cp/parser.c   |  3 ++-
 gcc/testsuite/g++.dg/cpp1z/decomp52.C | 14 ++
 4 files changed, 32 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/decomp52.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index f7c11bcf838..1f1f5b69313 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6525,8 +6525,11 @@ extern void warn_misplaced_attr_for_class_type  (location_t location,
 tree class_type);
 extern tree check_tag_decl (cp_decl_specifier_seq *, bool);
 extern tree shadow_tag (cp_decl_specifier_seq *);
-extern tree groktypename   (cp_decl_specifier_seq *, const cp_declarator *, bool);
-extern tree start_decl (const cp_declarator *, cp_decl_specifier_seq *, int, tree, tree, tree *);
+extern tree groktypename   (cp_decl_specifier_seq *,
+const cp_declarator *, bool);
+extern tree start_decl (const cp_declarator *,
+cp_decl_specifier_seq *, int,
+tree, tree, tree *, bool = false);
 extern void start_decl_1   (tree, bool);
 extern bool check_array_initializer(tree, tree, tree);
 extern void omp_declare_variant_finalize   (tree, tree);
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 1b6a5672334..6464dbbfc99 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -1705,6 +1705,9 @@ duplicate_decls (tree newdecl, tree olddecl, bool newdecl_is_friend)
  inform (olddecl_loc, "previous declaration %q#D", olddecl);
  return error_mark_node;
}
+  else if ((VAR_P (olddecl) && DECL_DECOMPOSITION_P (olddecl))
+  || (VAR_P (newdecl) && DECL_DECOMPOSITION_P (newdecl)))
+   /* A structured binding must be unique in its declarative region.  */;
   else if (DECL_IMPLICIT_TYPEDEF_P (olddecl)
   || DECL_IMPLICIT_TYPEDEF_P (newdecl))
/* One is an implicit typedef, that's ok.  */
@@ -5199,7 +5202,8 @@ groktypename (cp_decl_specifier_seq *type_specifiers,
The scope represented by the context of the returned DECL is pushed
(if it is not the global namespace) and is assigned to
*PUSHED_SCOPE_P.  The caller is then responsible for calling
-   pop_scope on *PUSHED_SCOPE_P if it is set.  */
+   pop_scope on *PUSHED_SCOPE_P if it is set.  If FORCE_LANG_DECL_P, create
+   a lang decl node for the new decl.  */
 
 tree
 start_decl (const cp_declarator *declarator,
@@ -5207,7 +5211,8 @@ start_decl (const cp_declarator *declarator,
int initialized,
tree attributes,
tree prefix_attributes,
-   tree *pushed_scope_p)
+   tree *pushed_scope_p,
+   bool force_lang_decl_p/*=false*/)
 {
   tree decl;
   tree context;
@@ -5387,6 +5392,10 @@ start_decl (const cp_declarator *declarator,
   decl);
 }
 
+  /* Create a DECL_LANG_SPECIFIC so that DECL_DECOMPOSITION_P works.  */
+  if (force_lang_decl_p)
+fit_decomposition_lang_decl (decl, NULL_TREE);
+
   was_public = TREE_PUBLIC (decl);
 
   /* Enter this declaration into the symbol table.  Don't push the plain
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index f1ddef220fe..06d6a2

Re: Ping: [RFA] Require powerpc_vsx_ok in gcc.target/powerpc/pr71763.c

2020-05-13 Thread Joel Brobecker
Hello,

Would someone mind reviewing this patch, please?

The test explicitly uses -mvsx in the compilation options, so it seems
reasonable to require powerpc_vsx_ok...

Thank you!

> Just a friendly ping on the following patch, hopefully sufficiently
> straightforward and tested to be allowed onto branch master.
> 
> 
> On Fri, Apr 17, 2020 at 04:49:47PM -0700, Joel Brobecker wrote:
> > From: Douglas Rupp 
> > 
> > Hello,
> > 
> > (submitting this on behalf of Doug Rupp, one of my colleagues)
> > 
> > We're getting an error when running this test on PowerPC VxWorks 7,
> > due to an unexpected warning:
> > 
> > | Excess errors:
> > | cc1: warning: '-mvsx' and '-mno-altivec' are incompatible
> > 
> > The warning comes from a combination of factors:
> >   - The test itself uses -mvsx explicitly via the following directive:
> >// { dg-options "-O1 -mvsx" }
> >   - Our toolchain was configured so as to make -mno-altivec
> > the default;
> >   - These two options are mutually exclusive.
> > 
> > > This commit adds a powerpc_vsx_ok dg-require-effective-target directive
> > > to that test, thus making it UNSUPPORTED instead.
> > 
> > > Tested on PowerPC VxWorks 7. Also tested on PowerPC ELF,
> > > a platform where we do not make -mno-altivec the default, to verify
> > > that the test continues to run as usual in that case.
> > 
> > gcc/testsuite/
> > 
> > * gcc.target/powerpc/pr71763.c: Require powerpc_vsx_ok.
> > 
> > OK for master?
> > 
> > Thanks!
> > -- 
> > Joel
> > 
> > ---
> >  gcc/testsuite/gcc.target/powerpc/pr71763.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr71763.c b/gcc/testsuite/gcc.target/powerpc/pr71763.c
> > index b36ddfa26b0..b394393 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/pr71763.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr71763.c
> > @@ -1,5 +1,6 @@
> >  // PR target/71763
> >  // { dg-do compile }
> > +// { dg-require-effective-target powerpc_vsx_ok }
> >  // { dg-options "-O1 -mvsx" }
> >  
> >  int a, b;
> > -- 
> > 2.17.1
> 
> -- 
> Joel

-- 
Joel


[PATCH resend] rs6000, pr 94833: fix vec_first_match_index for nulls

2020-05-13 Thread Carl Love via Gcc-patches
GCC maintainers:

This is a resend of "[PATCH]rs6000, fix vec_first_match_index for
nulls".

Per the comments received, the PR number was added to the subject line.
I also tweaked the message to make it clear that the patch fixes issues
with vectors whose elements contain zeros rather than with a zero-length
vector.

-
The following patch fixes PR94833, vec_first_match_index does not
behave as documented.

The builtin does not handle vector elements which are zero correctly. 
The following patch fixes the issue and adds additional test cases to
verify the vec_first_match_index builtin and related builtins work
correctly with elements that are zero.
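
A scalar model may help show why zero elements trip up the old expansion. This is a sketch under assumptions, not the builtin itself: Bytes, first_match_index and first_match_index_ne_or_zero are hypothetical names, and the second function assumes that the compare being replaced flags an element pair when the elements differ or when either is zero, which is how I read the "nez" in gen_vcmpnez. Inverting such a result cannot report a match on a pair of equal zeros.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Bytes = std::array<signed char, 16>;

// Documented behavior: index of the first equal element pair, or the
// number of elements if there is no match.
std::size_t first_match_index(const Bytes& a, const Bytes& b) {
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] == b[i]) return i;
  return a.size();
}

// Model of the pre-patch sequence (assumption): compute "not equal or
// zero" per element, invert it, and take the first set position.
std::size_t first_match_index_ne_or_zero(const Bytes& a, const Bytes& b) {
  for (std::size_t i = 0; i < a.size(); ++i) {
    bool ne_or_zero = a[i] != b[i] || a[i] == 0 || b[i] == 0;
    if (!ne_or_zero) return i;
  }
  return a.size();
}
```

With the first new test vector from the patch (0x40 everywhere except a zero at index 1, compared against all zeros), the documented behavior gives 1, while the modeled old sequence skips the matching zeros and reports 16.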

The patch has been compiled and tested on

  powerpc64le-unknown-linux-gnu (Power 9 LE)

with no regression errors.

Please let me know if the patch is acceptable for mainline and for
backporting as needed.

Thanks.

   Carl Love

--

gcc/ChangeLog

2020-04-30  Carl Love  

PR target/94833
* config/rs6000/vsx.md (define_expand): Fix instruction generation for
first_match_index_.
* testsuite/gcc.target/powerpc/builtins-8-p9-runnable.c (main): Add
additional test cases with zero vector elements.
---
 gcc/config/rs6000/vsx.md  |   4 +-
 .../powerpc/builtins-8-p9-runnable.c  | 118 ++
 2 files changed, 120 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 1fcc1b03096..12a0d5e668c 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4803,8 +4803,8 @@
   rtx cmp_result = gen_reg_rtx (mode);
   rtx not_result = gen_reg_rtx (mode);
 
-  emit_insn (gen_vcmpnez (cmp_result, operands[1],
-operands[2]));
+  emit_insn (gen_vcmpne (cmp_result, operands[1],
+   operands[2]));
   emit_insn (gen_one_cmpl2 (not_result, cmp_result));
 
   sh = GET_MODE_SIZE (GET_MODE_INNER (mode)) / 2;
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-8-p9-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-8-p9-runnable.c
index b2f7dc855e8..19457eebfc4 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-8-p9-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-8-p9-runnable.c
@@ -103,6 +103,31 @@ int main() {
  The element index in natural element order is returned for the
  first match or the number of elements if there is no match.  */
   /* char */
+  char_src1 = (vector signed char) { 0x40, 0, 0x40, 0x40,
+0x40, 0x40, 0x40, 0x40,
+0x40, 0x40, 0x40, 0x40,
+0x40, 0x40, 0x40, 0x40 };
+   
+  char_src2 = (vector signed char) {0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0};
+  expected_result = 1;
+
+  result = vec_first_match_index (char_src1, char_src2);
+
+#ifdef DEBUG2
+  print_signed_char("src1", char_src1);
+  print_signed_char("src2", char_src2);
+  printf(" vec_first_match_index = %d\n\n", result);
+#endif
+
+  if (result != expected_result)
+#ifdef DEBUG
+printf("Error: char first match result (%d) does not match expected result (%d)\n",
+  result, expected_result);
+#else
+abort();
+#endif
+
   char_src1 = (vector signed char) {-1, 2, 3, 4, -5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16};
   char_src2 = (vector signed char) {-1, 2, 3, 20, -5, 6, 7, 8,
@@ -367,6 +392,50 @@ int main() {
  The element index in BE order is returned for the first mismatch
  or the number of elements if there is no match.   */
   /* char */
+  char_src1 = (vector signed char) {1, 2, 0, 4, -5, 6, 7, 8,
+   9, 10, 11, 12, 13, 14, 15, 16};
+  char_src2 = (vector signed char) {1, 2, 0, 20, -5, 6, 7, 8,
+   9, 10, 11, 12, 13, 14, 15, 16};
+  expected_result = 3;
+
+  result = vec_first_mismatch_index (char_src1, char_src2);
+
+#ifdef DEBUG2
+  print_signed_char("src1", char_src1);
+  print_signed_char("src2", char_src2);
+  printf("vec_first_mismatch_index = %d\n\n", result);
+#endif
+
+  if (result != expected_result)
+#ifdef DEBUG
+printf("Error: char first mismatch result (%d) does not match expected result (%d)\n",
+  result, expected_result);
+#else
+abort();
+#endif
+
+  char_src1 = (vector signed char) {0, 2, 3, 4, -5, 6, 7, 8,
+   9, 10, 11, 12, 13, 14, 15, 16};
+  char_src2 = (vector signed char) {0, 2, 3, 20, -5, 6, 7, 8,
+   9, 10, 11, 12, 13, 14, 15, 16};
+  expected_result = 3;
+
+  result = vec_first_mismatch_index (char_src1, char_src2);
+
+#ifdef DEBUG2
+  print_signed_char("src1", char_src1);
+  print_signed_char("src2", char_src2);
+  print

Re: [PATCH PR94969]Add unit distant vector to DDR in case of invariant access functions

2020-05-13 Thread Jakub Jelinek via Gcc-patches
On Wed, May 13, 2020 at 02:00:11PM +0200, Christophe Lyon via Gcc-patches wrote:
> > > 2020-05-11  Bin Cheng  
> > >
> > > PR tree-optimization/94969
> > > * gcc.dg/tree-ssa/pr94969.c: New test.
> 
> The new test fails on arm and aarch64 and probably everywhere:
> gcc.dg/tree-ssa/pr94969.c: dump file does not exist
> UNRESOLVED: gcc.dg/tree-ssa/pr94969.c scan-tree-dump-not Loop 1
> distributed: split to 3 loops "ldist"
> 
> Can you fix this?

Seems a mere swapping of the scan-tree-dump-not args, I've verified the
test passes after this change and fails with older trunk, committed to trunk
as obvious.

2020-05-13  Jakub Jelinek  

PR testsuite/95110
* gcc.dg/tree-ssa/pr94969.c: Swap scan-tree-dump-not arguments.

--- gcc/testsuite/gcc.dg/tree-ssa/pr94969.c.jj  2020-05-13 09:24:36.959012780 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr94969.c 2020-05-13 19:13:53.664499322 +0200
@@ -25,4 +25,4 @@ int main()
 __builtin_abort ();
 }
 
-/* { dg-final { scan-tree-dump-not "ldist" "Loop 1 distributed: split to 3 loops"} } */
+/* { dg-final { scan-tree-dump-not "Loop 1 distributed: split to 3 loops" "ldist" } } */

Jakub



[PATCH] PR94397 the compiler consider "type is( real(kind(1.)) )" as a syntax error

2020-05-13 Thread Mark Eggleston

Please find attached a patch for PR94397.

Commit message:

Fortran  : "type is( real(kind(1.)) )" spurious syntax error PR94397

Based on a patch in the comments of the PR. That patch fixed this problem
but caused the test cases for PR93484 to fail. Changed to reduce
initialisation expressions if the expression is not EXPR_VARIABLE and not
EXPR_CONSTANT.

2020-05-13  Steven G. Kargl  
            Mark Eggleston 

gcc/fortran/

    PR fortran/94397
    * match.c (gfc_match_type_spec): New variable ok initialised
    to true. Set ok with the return value of gfc_reduce_init_expr
    called only if the expression is not EXPR_CONSTANT and is not
    EXPR_VARIABLE. Add !ok to the check for type not being integer
    or the rank being greater than zero.

2020-05-13  Mark Eggleston 

gcc/testsuite/

    PR fortran/94397
    * gfortran.dg/pr94397.F90: New test.

The formatting with tabs and date will be corrected prior to commit.

Tested on x86_64 for master, releases/gcc-9, releases/gcc-10 branches. 
OK to commit and backport?


--
https://www.codethink.co.uk/privacy.html

From 425d05f2e735cf5fd30de2d0edc9d8a0e99b823c Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Wed, 1 Apr 2020 09:52:41 +0100
Subject: [PATCH] Fortran  : "type is( real(kind(1.)) )" spurious syntax error
 PR94397

Based on a patch in the comments of the PR. That patch fixed this problem
but caused the test cases for PR93484 to fail. Changed to reduce
initialisation expressions if the expression is not EXPR_VARIABLE and not
EXPR_CONSTANT.

2020-05-13  Steven G. Kargl  
	Mark Eggleston  

gcc/fortran/

	PR fortran/94397
	* match.c (gfc_match_type_spec): New variable ok initialised
	to true. Set ok with the return value of gfc_reduce_init_expr
	called only if the expression is not EXPR_CONSTANT and is not
	EXPR_VARIABLE. Add !ok to the check for type not being integer
	or the rank being greater than zero.

2020-05-13  Mark Eggleston  

gcc/testsuite/

	PR fortran/94397
	* gfortran.dg/pr94397.F90: New test.
---
 gcc/fortran/match.c   |  5 -
 gcc/testsuite/gfortran.dg/pr94397.F90 | 26 ++
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr94397.F90

diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index 8ae34a94a95..82d2b5087e5 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -2265,7 +2265,10 @@ found:
 	 a scalar integer initialization-expr and valid kind parameter. */
   if (c == ')')
 	{
-	  if (e->ts.type != BT_INTEGER || e->rank > 0)
+	  bool ok = true;
+	  if (e->expr_type != EXPR_CONSTANT && e->expr_type != EXPR_VARIABLE)
+	ok = gfc_reduce_init_expr (e);
+	  if (!ok || e->ts.type != BT_INTEGER || e->rank > 0)
 	{
 	  gfc_free_expr (e);
 	  return MATCH_NO;
diff --git a/gcc/testsuite/gfortran.dg/pr94397.F90 b/gcc/testsuite/gfortran.dg/pr94397.F90
new file mode 100644
index 000..fda10c1a88b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr94397.F90
@@ -0,0 +1,26 @@
+! { dg-do run }
+!
+
+module m
+  implicit none
+contains
+  function is_real8(a)
+class(*) :: a
+logical :: is_real8
+is_real8 = .false.
+select type(a)
+  type is(real(kind(1.0_8)))
+is_real8 = .true. 
+end select
+  end function is_real8
+end module m
+
+program test
+  use m
+
+  if (is_real8(1.0_4)) stop 1
+  if (.not. is_real8(1.0_8)) stop 2
+#ifdef __GFC_REAL_16__
+  if (is_real8(1.0_16)) stop 3
+#endif
+end program
-- 
2.11.0



libbacktrace patch committed: Treat EACCES like ENOENT

2020-05-13 Thread Ian Lance Taylor via Gcc-patches
This patch to libbacktrace treats an EACCES error when opening a file
like an ENOENT error.  This case happens when running the libgo
syscall tests as root, when testing various ways of restricting a
child process.  Bootstrapped and ran libbacktrace and Go tests on
x86_64-pc-linux-gnu.  Committed to mainline.
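
The behavior described above can be sketched in isolation as follows. This is a simplified stand-in, not libbacktrace's function: open_readonly and its reported_errno out-parameter are hypothetical, replacing the real error callback so the example stays self-contained, and it assumes a POSIX system.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>

// Hypothetical helper.  Returns a file descriptor, or -1 on failure.
// If 'does_not_exist' is non-null, a missing file (ENOENT) or one we
// may not open (EACCES) is reported through it instead of as an error.
// Any other failure is recorded in '*reported_errno'.
int open_readonly(const char* filename, int* does_not_exist,
                  int* reported_errno) {
  if (does_not_exist != nullptr) *does_not_exist = 0;
  if (reported_errno != nullptr) *reported_errno = 0;

  int fd = open(filename, O_RDONLY);
  if (fd < 0) {
    int saved = errno;  // keep errno before anything else can change it
    if (does_not_exist != nullptr && (saved == ENOENT || saved == EACCES))
      *does_not_exist = 1;
    else if (reported_errno != nullptr)
      *reported_errno = saved;
    return -1;
  }
  return fd;
}
```

A caller that passes a non-null does_not_exist can then treat "missing" and "not permitted" the same way and quietly move on to the next candidate file.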

Ian

2020-05-13  Ian Lance Taylor  

PR go/95061
* posix.c (backtrace_open): Treat EACCES like ENOENT.
diff --git a/libbacktrace/posix.c b/libbacktrace/posix.c
index 356e72b4a3b..a2c88dd8e4a 100644
--- a/libbacktrace/posix.c
+++ b/libbacktrace/posix.c
@@ -67,7 +67,11 @@ backtrace_open (const char *filename, backtrace_error_callback error_callback,
   descriptor = open (filename, (int) (O_RDONLY | O_BINARY | O_CLOEXEC));
   if (descriptor < 0)
 {
-  if (does_not_exist != NULL && errno == ENOENT)
+  /* If DOES_NOT_EXIST is not NULL, then don't call ERROR_CALLBACK
+if the file does not exist.  We treat lacking permission to
+open the file as the file not existing; this case arises when
+running the libgo syscall package tests as root.  */
+  if (does_not_exist != NULL && (errno == ENOENT || errno == EACCES))
*does_not_exist = 1;
   else
error_callback (data, filename, errno);


Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass

2020-05-13 Thread Craig Blackmore
On 12/05/2020 23:33, Jim Wilson wrote:
> On Mon, Apr 27, 2020 at 10:08 AM Craig Blackmore
>  wrote:
>> Thanks for the review. I have updated the following patch with those changes.
> This looks good, and the tree is open for development work again, so I
> committed both parts 1 and 2 and pushed it.
>
> One weird thing is that while the patch program accepted the patch
> fine, git am would not, and kept giving me an error that didn't make
> any sense, pointing at a line that didn't have any visible problem.
> So I had to commit it by hand and then use git commit --amend to fix
> the authorship before pushing it.  The end result looks OK to me
> though.
>
> Jim

Hi Jim,

Many thanks for your help in reviewing this patch.

I am not sure what happened with git am, thanks for working around it,
the end result looks good to me.

Craig



Re: ChangeLog files - server and client scripts

2020-05-13 Thread Joseph Myers
On Wed, 13 May 2020, Martin Liška wrote:

> I'm sending the gcc-changelog related scripts which should be added to the
> contrib folder. The patch contains:
> - git_check_commit.py - checking script that verifies git message format

We need a documentation patch to contribute.html or gitwrite.html that 
describes the exact commit message format being used.

> - git_update_version.py - a replacement of
> maintainer-scripts/update_version_git which bumps DATESTAMP and generates
> ChangeLog entries (for now into ChangeLog.test files)

Where does this check things out?  (The existing ~gccadmin/gcc-checkout 
isn't suitable for that, it needs to stay on master to have the correct 
version of maintainer-scripts rather than being switched to other 
branches, though I suppose a second long-lived checkout that gets updated 
automatically could be used.  If you check things out somewhere else 
temporarily, it's important to be sure the checkout gets deleted 
afterwards rather than having multiple checkouts accumulating.  That's 
especially the case if you use a checkout in /tmp as a single GCC 
repository clone / checkout uses a significant proportion of the free 
space on the root filesystem; /sourceware/snapshot-tmp/gcc has more free 
space for large temporary directories.)

> The second part is a git hook that will reject all commits for release and
> master branches that violate the ChangeLog format. Right now, strict mode is
> disabled in the hooks.

Note that the present state of having GCC-specific patches to the git 
hooks is supposed to be a temporary one; we want to move to all relevant 
GCC-specific configuration being in refs/meta/config rather than custom 
code, so GCC and sourceware can share a single instance of the hooks which 
in turn can use the same code as in the upstream AdaCore repository, so 
that future updates of the hooks from upstream are easier.  See the issues 
I filed at https://github.com/AdaCore/git-hooks/issues for the existing 
custom GCC changes and the pull request 
https://github.com/AdaCore/git-hooks/pull/12 to bring in implementations 
of many of those features (not sure if it covers everything or not).  So 
it's important to consider how these checks could be implemented without 
needing GCC-specific code directly in these hooks (for example, using the 
new hooks.update-hook mechanism added by one of the commits in that pull 
request, or getting extra features added to the upstream hooks in a 
generic form if necessary).

-- 
Joseph S. Myers
jos...@codesourcery.com


libbacktrace patch committed: Mark state unused in ztest.c test_large

2020-05-13 Thread Ian Lance Taylor via Gcc-patches
This libbacktrace patch marks the state parameter of test_large in
ztest.c as ATTRIBUTE_UNUSED.  The parameter is not used if HAVE_ZLIB
is not defined.  Bootstrapped and ran libbacktrace tests on
x86_64-pc-linux-gnu.  Committed to mainline.
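
The pattern is the usual one for a parameter that is only referenced under a configuration macro. A minimal sketch with hypothetical stand-in names (SKETCH_UNUSED, SKETCH_HAVE_FEATURE, sketch_state, test_large_sketch), assuming a GNU-compatible compiler for the attribute:

```cpp
#include <cassert>

// Hypothetical stand-in for an "unused" annotation macro.
#if defined(__GNUC__)
#define SKETCH_UNUSED __attribute__((unused))
#else
#define SKETCH_UNUSED
#endif

struct sketch_state { int uses; };

// 'state' is only touched when SKETCH_HAVE_FEATURE is defined; the
// annotation keeps unused-parameter warnings quiet in the other case.
int test_large_sketch(sketch_state* state SKETCH_UNUSED) {
#ifdef SKETCH_HAVE_FEATURE
  state->uses++;
  return 1;
#else
  return 0;
#endif
}
```

Without SKETCH_HAVE_FEATURE defined, the function returns 0 and leaves the state untouched, and the build stays warning-free even with unused-parameter warnings enabled.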

Ian

2020-05-13  Ian Lance Taylor  

* ztest.c (test_large): Mark state ATTRIBUTE_UNUSED.
diff --git a/libbacktrace/ztest.c b/libbacktrace/ztest.c
index 2663c90061a..39476704972 100644
--- a/libbacktrace/ztest.c
+++ b/libbacktrace/ztest.c
@@ -294,7 +294,7 @@ average_time (const size_t *times, size_t trials)
 /* Test a larger text, if available.  */
 
 static void
-test_large (struct backtrace_state *state)
+test_large (struct backtrace_state *state ATTRIBUTE_UNUSED)
 {
 #ifdef HAVE_ZLIB
   unsigned char *orig_buf;


libgo patch committed: Build syscall test with -static

2020-05-13 Thread Ian Lance Taylor via Gcc-patches
This libgo patch builds the syscall test with -static.  This avoids
problems finding libgo.so when running the test as root, which invokes
the test as a child process in various limited environments.  This
fixes GCC PR 95061.  Bootstrapped and ran Go tests on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
0d5d880994660e231f82b7cb1dcfab744158f7e0
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 02f6746cf6b..b63bb3bd547 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-876bdf3df3bb33dbf1414237d84be5da32a48082
+93b3d88515db85e203d54f382200b84b56b0ae4c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/Makefile.am b/libgo/Makefile.am
index dea09de592b..eff0e00e787 100644
--- a/libgo/Makefile.am
+++ b/libgo/Makefile.am
@@ -967,6 +967,10 @@ endif
 # Also use -fno-inline to get better results from the memory profiler.
 runtime_pprof_check_GOCFLAGS = -static-libgo -fno-inline
 
+# Use -static for the syscall tests, because otherwise when
+# running as root the re-execs ignore LD_LIBRARY_PATH.
+syscall_check_GOCFLAGS = -static
+
 extra_go_files_runtime_internal_sys = version.go
 runtime/internal/sys.lo.dep: $(extra_go_files_runtime_internal_sys)
 


[pushed] testsuite: Support { target c++20 } in tests.

2020-05-13 Thread Jason Merrill via Gcc-patches
I'm not sure why I didn't check this in along with adding -std=c++20, since
I wrote this patch at the same time.  The testsuite should support both
{ target c++2a } and { target c++20 }.

Tested x86_64-pc-linux-gnu, applying to trunk and 10.

gcc/testsuite/ChangeLog
2020-05-13  Jason Merrill  

* lib/target-supports.exp (check_effective_target_c++20_only)
(check_effective_target_c++20): New.
---
 gcc/testsuite/lib/target-supports.exp | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 3a6ab8740c3..88f4a9cd812 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -9134,6 +9134,14 @@ proc check_effective_target_c++2a { } {
 return [check_effective_target_c++2a_only]
 }
 
+proc check_effective_target_c++20_only { } {
+return [check_effective_target_c++2a_only]
+}
+
+proc check_effective_target_c++20 { } {
+return [check_effective_target_c++2a]
+}
+
 # Check for C++ Concepts support, i.e. -fconcepts flag.
 proc check_effective_target_concepts { } {
 if [check_effective_target_c++2a] {

base-commit: 18edc195442291525e04f0fa4d5ef972155117da
-- 
2.18.1



Re: [PATCH] c++: SFINAE for invalid delete-expression [PR79706]

2020-05-13 Thread Jason Merrill via Gcc-patches

On 5/12/20 11:36 PM, Patrick Palka wrote:

This fixes SFINAE when substitution yields an invalid delete-expression
due to the pertinent deallocation function being marked deleted or
otherwise inaccessible.

We need to check for an erroneous result from build_op_delete_call and
exit early in that case, so that we don't build a COND_EXPR around the
erroneous call which finish_decltype_type would then quietly accept.
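
For readers less familiar with the idiom, here is a minimal sketch of the kind of SFINAE check that depends on this behavior. It is not part of the patch; can_delete, PlainType and NoScalarDelete are hypothetical names chosen for the example.

```cpp
#include <type_traits>
#include <utility>

struct PlainType {};

struct NoScalarDelete {
  void operator delete (void *) = delete;
};

// Primary template: assume "delete p" is not valid for a T*.
template<typename T, typename = void>
struct can_delete : std::false_type {};

// Selected only when "delete p" is well-formed.  A delete-expression has
// type void, which matches the defaulted second template argument.
template<typename T>
struct can_delete<T, decltype (delete std::declval<T *> ())>
  : std::true_type {};
```

can_delete<PlainType>::value is true. With a compiler that treats the invalid delete-expression as a substitution failure, can_delete<NoScalarDelete>::value should come out false instead of the specialization being quietly accepted.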

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK to
commit?


OK.


gcc/cp/ChangeLog:

PR c++/79706
* init.c (build_vec_delete_1): Return error_mark_node if
deallocate_expr is error_mark_node.
(build_delete): Return error_mark_node if do_delete is
error_mark_node.

gcc/testsuite/ChangeLog:

PR c++/79706
* g++.dg/template/sfinae30.C: New test.
---
  gcc/cp/init.c|  8 ++--
  gcc/testsuite/g++.dg/template/sfinae30.C | 21 +
  2 files changed, 27 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/sfinae30.C

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index e2e547afd96..c1047dbb1cc 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -4076,7 +4076,9 @@ build_vec_delete_1 (location_t loc, tree base, tree maxindex, tree type,
  }
  
body = loop;
-  if (!deallocate_expr)
+  if (deallocate_expr == error_mark_node)
+return error_mark_node;
+  else if (!deallocate_expr)
  ;
else if (!body)
  body = deallocate_expr;
@@ -4993,7 +4995,9 @@ build_delete (location_t loc, tree otype, tree addr,
return expr;
  }
  
-  if (do_delete)
+  if (do_delete == error_mark_node)
+return error_mark_node;
+  else if (do_delete)
  {
tree do_delete_call_expr = extract_call_expr (do_delete);
if (TREE_CODE (do_delete_call_expr) == CALL_EXPR)
diff --git a/gcc/testsuite/g++.dg/template/sfinae30.C b/gcc/testsuite/g++.dg/template/sfinae30.C
new file mode 100644
index 000..82f31aaa625
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/sfinae30.C
@@ -0,0 +1,21 @@
+// PR c++/79706
+// { dg-do compile { target c++11 } }
+
+struct A {
+  void operator delete(void*) = delete;
+private:
+  void operator delete[](void*);
+};
+
+extern A *p;
+
+template
+auto foo(T *t) -> decltype(delete t); // { dg-error "use of deleted function" }
+
+template
+auto bar(T *t) -> decltype(delete[] t); // { dg-error "private within this context" }
+
+void baz() {
+  foo(p); // { dg-error "no match" }
+  bar(p); // { dg-error "no match" }
+}





Re: [PATCH] c++: explicit(bool) malfunction with dependent expression [PR95066]

2020-05-13 Thread Jason Merrill via Gcc-patches

On 5/11/20 7:06 PM, Marek Polacek wrote:

I forgot to set DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P when merging two
function declarations and as a sad consequence, we never tsubsted
the dependent explicit-specifier in tsubst_function_decl, leading to
disregarding the explicit-specifier altogether, and wrongly accepting
this test.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10/9?


OK.


PR c++/95066
* decl.c (duplicate_decls): Set DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P.

* g++.dg/cpp2a/explicit16.C: New test.
---
  gcc/cp/decl.c   |  2 ++
  gcc/testsuite/g++.dg/cpp2a/explicit16.C | 21 +
  2 files changed, 23 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit16.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 1b6a5672334..604ecf42e95 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -2035,6 +2035,8 @@ duplicate_decls (tree newdecl, tree olddecl, bool newdecl_is_friend)
DECL_FINAL_P (newdecl) |= DECL_FINAL_P (olddecl);
DECL_OVERRIDE_P (newdecl) |= DECL_OVERRIDE_P (olddecl);
DECL_THIS_STATIC (newdecl) |= DECL_THIS_STATIC (olddecl);
+  DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P (newdecl)
+   |= DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P (olddecl);
if (DECL_OVERLOADED_OPERATOR_P (olddecl))
DECL_OVERLOADED_OPERATOR_CODE_RAW (newdecl)
  = DECL_OVERLOADED_OPERATOR_CODE_RAW (olddecl);
diff --git a/gcc/testsuite/g++.dg/cpp2a/explicit16.C b/gcc/testsuite/g++.dg/cpp2a/explicit16.C
new file mode 100644
index 000..9d95b0d669e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/explicit16.C
@@ -0,0 +1,21 @@
+// PR c++/95066 - explicit malfunction with dependent expression.
+// { dg-do compile { target c++2a } }
+
+template 
+struct Foo {
+  template 
+  explicit(static_cast(true)) operator Foo();
+};
+
+template 
+template 
+Foo::operator Foo() {
+  return {};
+}
+
+int
+main ()
+{
+  Foo a;
+  Foo b = a; // { dg-error "conversion" }
+}

base-commit: 840ac85ced0695fefecee433327e4298b4adb20a





Re: [PATCH] c++: premature requires-expression folding [PR95020]

2020-05-13 Thread Jason Merrill via Gcc-patches

On 5/11/20 6:43 PM, Patrick Palka wrote:

In the testcase below we're prematurely folding away the
requires-expression to 'true' after substituting in the function's
template arguments, but before substituting in the lambda's deduced
template arguments.

This happens because during the first tsubst_requires_expr,
processing_template_decl is 1 but 'args' is just {void} and therefore
non-dependent, so we end up folding away the requires-expression to
boolean_true_node before we could substitute in the lambda's template
arguments and determine that '*v' is ill-formed.

This patch removes the uses_template_parms check when deciding in
tsubst_requires_expr whether to keep around a new requires-expression.
Regardless of whether the template arguments are dependent, there still
might be more template parameters to later substitute in -- as in the
testcase below -- and even if not, tsubst_expr doesn't perform full
semantic processing unless !processing_template_decl, so it seems we
should wait until then to fold away the requires-expression.

Passes 'make check-c++', does this look OK to commit after a full
bootstrap/regtest?


OK.


gcc/cp/ChangeLog:

PR c++/95020
* constraint.c (tsubst_requires_expr): Produce a new
requires-expression when processing_template_decl, even if
template arguments are not dependent.

gcc/testsuite/ChangeLog:

PR c++/95020
* g++.dg/cpp2a/concepts-lambda7.C: New test.
---
  gcc/cp/constraint.cc  |  4 +---
  gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C | 14 ++
  2 files changed, 15 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 4ad17f3b7d8..8ee347cae60 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2173,9 +2173,7 @@ tsubst_requires_expr (tree t, tree args,
if (reqs == error_mark_node)
  return boolean_false_node;
  
-  /* In certain cases, produce a new requires-expression.
- Otherwise the value of the expression is true.  */
-  if (processing_template_decl && uses_template_parms (args))
+  if (processing_template_decl)
  return finish_requires_expr (cp_expr_location (t), parms, reqs);
  
return boolean_true_node;

diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
new file mode 100644
index 000..50746b777a3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda7.C
@@ -0,0 +1,14 @@
+// PR c++/95020
+// { dg-do compile { target c++2a } }
+
+template
+void foo() {
+  auto t = [](auto v) {
+static_assert(requires { *v; }); // { dg-error "static assertion failed" }
+  };
+  t(0);
+}
+
+void bar() {
+  foo();
+}





[C++] some cleanup patches

2020-05-13 Thread Nathan Sidwell

I've committed this set of minor cleanups from the modules branch.

nathan
--
Nathan Sidwell
2020-05-13  Nathan Sidwell  

	Formatting fixups & some simplifications.
	* pt.c (spec_hash_table): New typedef.
	(decl_specializations, type_specializations): Use it.
	(retrieve_specialization): Likewise.
	(register_specialization): Remove unnecessary casts.
	(push_template_decl_real): Reformat.
	(instantiate_class_template_1): Use more RAII.
	(make_argument_pack): Simplify.
	(instantiate_template_1): Use gcc_checking_assert for expensive
	asserts.
	(instantiate_decl): Likewise.
	(resolve_typename_type): Reformat comment.
	* semantics.c (struct deferred_access): Remove unnecessary GTY on
	member.
	(begin_class_definition): Fix formatting.

diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index 7911293571e..ff15a926db9 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -48,9 +48,9 @@ along with GCC; see the file COPYING3.  If not see
returning an int.  */
 typedef int (*tree_fn_t) (tree, void*);
 
-/* The PENDING_TEMPLATES is a TREE_LIST of templates whose
-   instantiations have been deferred, either because their definitions
-   were not yet available, or because we were putting off doing the work.  */
+/* The PENDING_TEMPLATES is a list of templates whose instantiations
+   have been deferred, either because their definitions were not yet
+   available, or because we were putting off doing the work.  */
 struct GTY ((chain_next ("%h.next"))) pending_template
 {
   struct pending_template *next;
@@ -116,9 +116,10 @@ struct spec_hasher : ggc_ptr_hash
   static bool equal (spec_entry *, spec_entry *);
 };
 
-static GTY (()) hash_table *decl_specializations;
-
-static GTY (()) hash_table *type_specializations;
+/* The general template is not in these tables.  */
+typedef hash_table spec_hash_table;
+static GTY (()) spec_hash_table *decl_specializations;
+static GTY (()) spec_hash_table *type_specializations;
 
 /* Contains canonical template parameter types. The vector is indexed by
the TEMPLATE_TYPE_IDX of the template parameter. Each element is a
@@ -1278,7 +1279,7 @@ retrieve_specialization (tree tmpl, tree args, hashval_t hash)
 {
   spec_entry *found;
   spec_entry elt;
-  hash_table *specializations;
+  spec_hash_table *specializations;
 
   elt.tmpl = tmpl;
   elt.args = args;
@@ -1573,10 +1574,9 @@ register_specialization (tree spec, tree tmpl, tree args, bool is_friend,
   if (hash == 0)
 	hash = spec_hasher::hash (&elt);
 
-  slot =
-	decl_specializations->find_slot_with_hash (&elt, hash, INSERT);
+  slot = decl_specializations->find_slot_with_hash (&elt, hash, INSERT);
   if (*slot)
-	fn = ((spec_entry *) *slot)->spec;
+	fn = (*slot)->spec;
   else
 	fn = NULL_TREE;
 }
@@ -4474,7 +4466,7 @@ reduce_template_parm_level (tree index, tree type, int levels, tree args,
   TEMPLATE_PARM_PARAMETER_PACK (tpi)
 	= TEMPLATE_PARM_PARAMETER_PACK (index);
 
-	/* Template template parameters need this.  */
+  /* Template template parameters need this.  */
   tree inner = decl;
   if (TREE_CODE (decl) == TEMPLATE_DECL)
 	{
@@ -5705,8 +5697,7 @@ push_template_decl_real (tree decl, bool is_friend)
  template  friend void A::f();
is not primary.  */
 is_primary = false;
-  else if (TREE_CODE (decl) == TYPE_DECL
-	   && LAMBDA_TYPE_P (TREE_TYPE (decl)))
+  else if (TREE_CODE (decl) == TYPE_DECL && LAMBDA_TYPE_P (TREE_TYPE (decl)))
 is_primary = false;
   else
 is_primary = template_parm_scope_p ();
@@ -5845,8 +5836,7 @@ push_template_decl_real (tree decl, bool is_friend)
   if (!ctx
   || TREE_CODE (ctx) == FUNCTION_DECL
   || (CLASS_TYPE_P (ctx) && TYPE_BEING_DEFINED (ctx))
-  || (TREE_CODE (decl) == TYPE_DECL
-	  && LAMBDA_TYPE_P (TREE_TYPE (decl)))
+  || (TREE_CODE (decl) == TYPE_DECL && LAMBDA_TYPE_P (TREE_TYPE (decl)))
   || (is_friend && !DECL_TEMPLATE_INFO (decl)))
 {
   if (DECL_LANG_SPECIFIC (decl)
@@ -11752,13 +11743,10 @@ instantiate_class_template_1 (tree type)
 		continue;
 
 	  /* Build new CLASSTYPE_NESTED_UTDS.  */
+	  bool class_template_p = (TREE_CODE (t) != ENUMERAL_TYPE
+   && TYPE_LANG_SPECIFIC (t)
+   && CLASSTYPE_IS_TEMPLATE (t));
 
-	  tree newtag;
-	  bool class_template_p;
-
-	  class_template_p = (TREE_CODE (t) != ENUMERAL_TYPE
-  && TYPE_LANG_SPECIFIC (t)
-  && CLASSTYPE_IS_TEMPLATE (t));
 	  /* If the member is a class template, then -- even after
 		 substitution -- there may be dependent types in the
 		 template argument list for the class.  We increment
@@ -11767,7 +11755,7 @@ instantiate_class_template_1 (tree type)
 		 when outside of a template.  */
 	  if (class_template_p)
 		++processing_template_decl;
-	  newtag = tsubst (t, args, tf_error, NULL_TREE);
+	  tree newtag = tsubst (t, args, tf_error, NULL_TREE);
 	  if (class_template_p)
 		--processing_template_decl;
 	  if (newtag == error_mark_node)
@@ -

[C++] canonical_type_parameter

2020-05-13 Thread Nathan Sidwell
Canonical_type_parameter shows C-like thinking.  This modernizes it, 
which I found simpler to understand.
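
To show the shape of the simplification outside of GCC's tree machinery, here is a minimal sketch of the same "grow, search with early return, otherwise prepend" pattern. The container and the names canonical_chains and canonical_entry are hypothetical stand-ins, not GCC APIs.

```cpp
#include <cstddef>
#include <deque>
#include <forward_list>
#include <string>

// Hypothetical stand-in for canonical_template_parms: one chain per index.
// A deque is used so that growing at the end keeps existing chains in place.
static std::deque<std::forward_list<std::string>> canonical_chains;

// Return the canonical entry equal to VALUE at index IDX, adding VALUE to
// the front of the chain if no equal entry exists yet.
const std::string *
canonical_entry (std::size_t idx, const std::string &value)
{
  if (canonical_chains.size () <= idx)
    canonical_chains.resize (idx + 1);

  // Early return from the loop replaces the "walk, then test" structure.
  for (const std::string &seen : canonical_chains[idx])
    if (seen == value)
      return &seen;

  canonical_chains[idx].push_front (value);
  return &canonical_chains[idx].front ();
}
```

Calling canonical_entry twice with the same index and value yields the same pointer; a different value at the same index gets its own entry.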


pushed to master

nathan
--
Nathan Sidwell
2020-05-13  Nathan Sidwell  

	* pt.c (canonical_type_parameter): Simplify.

diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index a732ced2d8d..a36f603761c 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -4417,29 +4417,21 @@ build_template_parm_index (int index,
 static tree
 canonical_type_parameter (tree type)
 {
-  tree list;
   int idx = TEMPLATE_TYPE_IDX (type);
 
   gcc_assert (TREE_CODE (type) != TEMPLATE_TEMPLATE_PARM);
 
-  if (!canonical_template_parms)
-vec_alloc (canonical_template_parms, idx + 1);
-
-  if (canonical_template_parms->length () <= (unsigned) idx)
+  if (vec_safe_length (canonical_template_parms) <= (unsigned) idx)
 vec_safe_grow_cleared (canonical_template_parms, idx + 1);
 
-  list = (*canonical_template_parms)[idx];
-  while (list && !comptypes (type, TREE_VALUE (list), COMPARE_STRUCTURAL))
-list = TREE_CHAIN (list);
+  for (tree list = (*canonical_template_parms)[idx];
+   list; list = TREE_CHAIN (list))
+if (comptypes (type, TREE_VALUE (list), COMPARE_STRUCTURAL))
+  return TREE_VALUE (list);
 
-  if (list)
-return TREE_VALUE (list);
-  else
-{
-  (*canonical_template_parms)[idx]
-	= tree_cons (NULL_TREE, type, (*canonical_template_parms)[idx]);
-  return type;
-}
+  (*canonical_template_parms)[idx]
+= tree_cons (NULL_TREE, type, (*canonical_template_parms)[idx]);
+  return type;
 }
 
 /* Return a TEMPLATE_PARM_INDEX, similar to INDEX, but whose


[C++] simplify typedef access checking

2020-05-13 Thread Nathan Sidwell
I discovered template typedef access checking was more expensive than it 
need be.  The call of get_types_needed_access_check in the 
FOR_EACH_VEC_SAFE_ELT is the moral equivalent of


  for (size_t pos = 0; pos != strlen (string); pos++)

Let's not do that.
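
Spelled out as a minimal sketch (the function names are made up for this example, and it is only an analogy for the vector lookup, not code from pt.c): the first form calls the expensive function in the loop condition, the second computes the bound once.

```cpp
#include <cstddef>
#include <cstring>

// As written, strlen is called in the loop condition on every iteration,
// so each pass may rescan the whole string.
std::size_t
count_spaces_rescanning (const char *string)
{
  std::size_t spaces = 0;
  for (std::size_t pos = 0; pos != std::strlen (string); pos++)
    if (string[pos] == ' ')
      spaces++;
  return spaces;
}

// The bound is computed once and cached, which is the shape the patch
// moves the access-check loops toward.
std::size_t
count_spaces_cached (const char *string)
{
  std::size_t spaces = 0;
  const std::size_t length = std::strlen (string);
  for (std::size_t pos = 0; pos != length; pos++)
    if (string[pos] == ' ')
      spaces++;
  return spaces;
}
```

Both functions return the same result; only the number of calls to the expensive function differs.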

nathan

--
Nathan Sidwell
2020-05-13  Nathan Sidwell  

	* pt.c (perform_typedefs_access_check): Cache expensively
	calculated object references.
	(check_auto_in_tmpl_args): Just assert we do not get unexpected
	nodes, rather than silently do nothing.
	(append_type_to_template_for_access): Likewise, cache expensive
	object reference.

diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index a732ced2d8d..a36f603761c 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -11521,26 +11512,28 @@ perform_typedefs_access_check (tree tmpl, tree targs)
 	  && TREE_CODE (tmpl) != FUNCTION_DECL))
 return;
 
-  FOR_EACH_VEC_SAFE_ELT (get_types_needing_access_check (tmpl), i, iter)
-{
-  tree type_decl = iter->typedef_decl;
-  tree type_scope = iter->context;
-
-  if (!type_decl || !type_scope || !CLASS_TYPE_P (type_scope))
-	continue;
+  if (vec *tdefs
+  = get_types_needing_access_check (tmpl))
+FOR_EACH_VEC_ELT (*tdefs, i, iter)
+  {
+	tree type_decl = iter->typedef_decl;
+	tree type_scope = iter->context;
 
-  if (uses_template_parms (type_decl))
-	type_decl = tsubst (type_decl, targs, tf_error, NULL_TREE);
-  if (uses_template_parms (type_scope))
-	type_scope = tsubst (type_scope, targs, tf_error, NULL_TREE);
+	if (!type_decl || !type_scope || !CLASS_TYPE_P (type_scope))
+	  continue;
 
-  /* Make access check error messages point to the location
- of the use of the typedef.  */
-  iloc_sentinel ils (iter->locus);
-  perform_or_defer_access_check (TYPE_BINFO (type_scope),
- type_decl, type_decl,
- tf_warning_or_error);
-}
+	if (uses_template_parms (type_decl))
+	  type_decl = tsubst (type_decl, targs, tf_error, NULL_TREE);
+	if (uses_template_parms (type_scope))
+	  type_scope = tsubst (type_scope, targs, tf_error, NULL_TREE);
+
+	/* Make access check error messages point to the location
+	   of the use of the typedef.  */
+	iloc_sentinel ils (iter->locus);
+	perform_or_defer_access_check (TYPE_BINFO (type_scope),
+   type_decl, type_decl,
+   tf_warning_or_error);
+  }
 }
 
 static tree
@@ -29225,25 +29218,13 @@ check_auto_in_tmpl_args (tree tmpl, tree args)
 vec *
 get_types_needing_access_check (tree t)
 {
-  tree ti;
-  vec *result = NULL;
-
-  if (!t || t == error_mark_node)
-return NULL;
-
-  if (!(ti = get_template_info (t)))
-return NULL;
+  gcc_checking_assert ((CLASS_TYPE_P (t) || TREE_CODE (t) == FUNCTION_DECL));
+  
+  if (tree ti = get_template_info (t))
+if (TI_TEMPLATE (ti))
+  return TI_TYPEDEFS_NEEDING_ACCESS_CHECKING (ti);
 
-  if (CLASS_TYPE_P (t)
-  || TREE_CODE (t) == FUNCTION_DECL)
-{
-  if (!TI_TEMPLATE (ti))
-	return NULL;
-
-  result = TI_TYPEDEFS_NEEDING_ACCESS_CHECKING (ti);
-}
-
-  return result;
+  return NULL;
 }
 
 /* Append the typedef TYPE_DECL used in template T to a list of typedefs
@@ -29328,9 +29309,11 @@ append_type_to_template_for_access_check (tree templ,
   gcc_assert (type_decl && (TREE_CODE (type_decl) == TYPE_DECL));
 
   /* Make sure we don't append the type to the template twice.  */
-  FOR_EACH_VEC_SAFE_ELT (get_types_needing_access_check (templ), i, iter)
-if (iter->typedef_decl == type_decl && scope == iter->context)
-  return;
+  if (vec *tdefs
+  = get_types_needing_access_check (templ))
+FOR_EACH_VEC_ELT (*tdefs, i, iter)
+  if (iter->typedef_decl == type_decl && scope == iter->context)
+	return;
 
   append_type_to_template_for_access_check_1 (templ, type_decl,
 	  scope, location);


[C++] template arg comparison

2020-05-13 Thread Nathan Sidwell
When fixing up the template specialization hasher I was confused by the 
control flow through template_args_equal.  This reorders the category 
checking, so it is clearer as to what kind of node can reach which point.
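
The ordering idea can be sketched independently of GCC's trees: for each category, ask "is either side in this category?" first, and then require both sides to be. ArgKind, Arg and args_equal below are hypothetical names for illustration only and do not model the real tree codes.

```cpp
enum class ArgKind { Vec, Type, Value };

struct Arg {
  ArgKind kind;
  int payload;
};

// Symmetric category checks: once one side is in a category, a mismatch
// on the other side is decided right there, so later branches never see
// that category.
bool
args_equal (const Arg &ot, const Arg &nt)
{
  if (ot.kind == ArgKind::Vec || nt.kind == ArgKind::Vec)
    return ot.kind == nt.kind && ot.payload == nt.payload;
  else if (ot.kind == ArgKind::Type || nt.kind == ArgKind::Type)
    {
      if (!(ot.kind == ArgKind::Type && nt.kind == ArgKind::Type))
	return false;
      return ot.payload == nt.payload;
    }
  else
    /* Only Value nodes can reach this point.  */
    return ot.payload == nt.payload;
}
```

With this structure there is no need for a trailing "other side was a Vec or Type, return false" branch.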


nathan

--
Nathan Sidwell
2020-05-13  Nathan Sidwell  

	* pt.c (template_args_equal): Reorder category checking for
	clarity.

diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index a732ced2d8d..a36f603761c 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -9092,22 +9084,22 @@ template_args_equal (tree ot, tree nt, bool partial_order /* = false */)
   if (class_nttp_const_wrapper_p (ot))
 ot = TREE_OPERAND (ot, 0);
 
-  if (TREE_CODE (nt) == TREE_VEC)
+  if (TREE_CODE (nt) == TREE_VEC || TREE_CODE (ot) == TREE_VEC)
 /* For member templates */
-return TREE_CODE (ot) == TREE_VEC && comp_template_args (ot, nt);
-  else if (PACK_EXPANSION_P (ot))
-return (PACK_EXPANSION_P (nt)
+return TREE_CODE (ot) == TREE_CODE (nt) && comp_template_args (ot, nt);
+  else if (PACK_EXPANSION_P (ot) || PACK_EXPANSION_P (nt))
+return (PACK_EXPANSION_P (ot) && PACK_EXPANSION_P (nt)
 	&& template_args_equal (PACK_EXPANSION_PATTERN (ot),
 PACK_EXPANSION_PATTERN (nt))
 	&& template_args_equal (PACK_EXPANSION_EXTRA_ARGS (ot),
 PACK_EXPANSION_EXTRA_ARGS (nt)));
   else if (ARGUMENT_PACK_P (ot) || ARGUMENT_PACK_P (nt))
 return cp_tree_equal (ot, nt);
-  else if (ot && TREE_CODE (ot) == ARGUMENT_PACK_SELECT)
+  else if (TREE_CODE (ot) == ARGUMENT_PACK_SELECT)
 gcc_unreachable ();
-  else if (TYPE_P (nt))
+  else if (TYPE_P (nt) || TYPE_P (ot))
 {
-  if (!TYPE_P (ot))
+  if (!(TYPE_P (nt) && TYPE_P (ot)))
 	return false;
   /* Don't treat an alias template specialization with dependent
 	 arguments as equivalent to its underlying type when used as a
@@ -9125,8 +9117,6 @@ template_args_equal (tree ot, tree nt, bool partial_order /* = false */)
   else
 	return same_type_p (ot, nt);
 }
-  else if (TREE_CODE (ot) == TREE_VEC || TYPE_P (ot))
-return 0;
   else
 {
   /* Try to treat a template non-type argument that has been converted
@@ -9136,6 +9126,7 @@ template_args_equal (tree ot, tree nt, bool partial_order /* = false */)
 	 || code1 == NON_LVALUE_EXPR;
 	   code1 = TREE_CODE (ot))
 	ot = TREE_OPERAND (ot, 0);
+
   for (enum tree_code code2 = TREE_CODE (nt);
 	   CONVERT_EXPR_CODE_P (code2)
 	 || code2 == NON_LVALUE_EXPR;


Re: [PATCH] [V2] rs6000: Add vec_extracth and vec_extractl

2020-05-13 Thread Segher Boessenkool
Hi!

On Wed, May 13, 2020 at 07:50:50AM -0500, Bill Schmidt wrote:
> From: Kelvin Nilsen 
> 
> Add new insns vextdu[bhw]vlx, vextddvlx, vextdu[bhw]vhx, and
> vextddvhx, along with built-in access and overloaded built-in
> access to these insns.
> 
> Changes from previous patch:
>  * Removed the int iterators
>  * Created separate expansions and insns
> vextractl
> vextractl_internal
> vextractr
> vextractr_internal
>  * Adjusted rs6000-builtin.def entries to match the new expansion
>names
> 
> I didn't understand the comment about moving the decision making
> part to the built-in handling code.  All the built-in handling
> does is a table-driven call to the expansions; this logic *is*
> the built-in handling code.  I don't see any way to simplify that.

I'll try it myself :-)

> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions, using a Power9 configuration.  Is this okay for
> master?

This is okay for trunk.  Thanks!


Segher


> 2020-05-12  Kelvin Nilsen  
> 
>   * config/rs6000/altivec.h (vec_extractl): New #define.
>   (vec_extracth): Likewise.
>   * config/rs6000/altivec.md (UNSPEC_EXTRACTL): New constant.
>   (UNSPEC_EXTRACTR): Likewise.
>   (vextractl): New expansion.
>   (vextractl_internal): New insn.
>   (vextractr): New expansion.
>   (vextractr_internal): New insn.
>   * config/rs6000/rs6000-builtin.def (__builtin_altivec_vextdubvlx):
>   New built-in function.
>   (__builtin_altivec_vextduhvlx): Likewise.
>   (__builtin_altivec_vextduwvlx): Likewise.
>   (__builtin_altivec_vextddvlx): Likewise.
>   (__builtin_altivec_vextdubvhx): Likewise.
>   (__builtin_altivec_vextduhvhx): Likewise.
>   (__builtin_altivec_vextduwvhx): Likewise.
>   (__builtin_altivec_vextddvhx): Likewise.
>   (__builtin_vec_extractl): New overloaded built-in function.
>   (__builtin_vec_extracth): Likewise.
>   * config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
>   Define overloaded forms of __builtin_vec_extractl and
>   __builtin_vec_extracth.
>   (builtin_function_type): Add cases to mark arguments of new
>   built-in functions as unsigned.
>   (rs6000_common_init_builtins): Add
>   opaque_ftype_opaque_opaque_opaque_opaque.
>   * config/rs6000/rs6000.md (du_or_d): New mode attribute.
>   * doc/extend.texi (PowerPC AltiVec Built-in Functions Available
>   for a Future Architecture): Add description of vec_extractl and
>   vec_extractr built-in functions.
> 
> [gcc/testsuite]
> 
> 2020-05-10  Kelvin Nilsen  
> 
>   * gcc.target/powerpc/vec-extracth-0.c: New.
>   * gcc.target/powerpc/vec-extracth-1.c: New.
>   * gcc.target/powerpc/vec-extracth-2.c: New.
>   * gcc.target/powerpc/vec-extracth-3.c: New.
>   * gcc.target/powerpc/vec-extracth-4.c: New.
>   * gcc.target/powerpc/vec-extracth-5.c: New.
>   * gcc.target/powerpc/vec-extracth-6.c: New.
>   * gcc.target/powerpc/vec-extracth-7.c: New.
>   * gcc.target/powerpc/vec-extracth-be-0.c: New.
>   * gcc.target/powerpc/vec-extracth-be-1.c: New.
>   * gcc.target/powerpc/vec-extracth-be-2.c: New.
>   * gcc.target/powerpc/vec-extracth-be-3.c: New.
>   * gcc.target/powerpc/vec-extractl-0.c: New.
>   * gcc.target/powerpc/vec-extractl-1.c: New.
>   * gcc.target/powerpc/vec-extractl-2.c: New.
>   * gcc.target/powerpc/vec-extractl-3.c: New.
>   * gcc.target/powerpc/vec-extractl-4.c: New.
>   * gcc.target/powerpc/vec-extractl-5.c: New.
>   * gcc.target/powerpc/vec-extractl-6.c: New.
>   * gcc.target/powerpc/vec-extractl-7.c: New.
>   * gcc.target/powerpc/vec-extractl-be-0.c: New.
>   * gcc.target/powerpc/vec-extractl-be-1.c: New.
>   * gcc.target/powerpc/vec-extractl-be-2.c: New.
>   * gcc.target/powerpc/vec-extractl-be-3.c: New.


Re: [PATCH] RISC-V: Make unique SECCAT_SRODATA names start with .srodata

2020-05-13 Thread Palmer Dabbelt

On Tue, 12 May 2020 16:53:14 PDT (-0700), Jim Wilson wrote:

This fixes a bug reported to the RISC-V sw-dev mailing list late last year.
https://groups.google.com/a/groups.riscv.org/forum/#!topic/sw-dev/JV5Jdh4UjVw

Keith Packard wrote the obvious patch to fix it.  I tested it with cross builds
for riscv32-newlib and riscv64-linux.  There were no regressions.  Checking
toolchain libraries with objdump shows that there are no longer sdata2 sections
in the libraries.

Committed.


Thanks!



Jim

2020-05-12  Keith Packard  
* config/riscv/riscv.c (riscv_unique_section): New.
(TARGET_ASM_UNIQUE_SECTION): New.

default_unique_section uses ".sdata2" as a prefix for SECCAT_SRODATA
unique sections, but RISC-V uses ".srodata" instead. Override the
TARGET_ASM_UNIQUE_SECTION function to catch this case, allowing the
default to be used for all other sections.

Signed-off-by: Keith Packard 
---
 gcc/config/riscv/riscv.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index e4c08d780db..1ad9799fce4 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3492,6 +3492,43 @@ riscv_select_section (tree decl, int reloc,
 }
 }

+/* Switch to the appropriate section for output of DECL.  */
+
+static void
+riscv_unique_section (tree decl, int reloc)
+{
+  const char *prefix = NULL;
+  bool one_only = DECL_ONE_ONLY (decl) && !HAVE_COMDAT_GROUP;
+
+  switch (categorize_decl_for_section (decl, reloc))
+{
+case SECCAT_SRODATA:
+  prefix = one_only ? ".sr" : ".srodata";
+  break;
+
+default:
+  break;
+}
+  if (prefix)
+{
+  const char *name, *linkonce;
+  char *string;
+
+  name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
+  name = targetm.strip_name_encoding (name);
+
+  /* If we're using one_only, then there needs to be a .gnu.linkonce
+prefix to the section name.  */
+  linkonce = one_only ? ".gnu.linkonce" : "";
+
+  string = ACONCAT ((linkonce, prefix, ".", name, NULL));
+
+  set_decl_section_name (decl, string);
+  return;
+}
+  default_unique_section (decl, reloc);
+}
+
 /* Return a section for X, handling small data. */

 static section *
@@ -5254,6 +5291,9 @@ riscv_new_address_profitable_p (rtx memref, rtx_insn *insn, rtx new_addr)
 #undef TARGET_ASM_SELECT_SECTION
 #define TARGET_ASM_SELECT_SECTION riscv_select_section

+#undef TARGET_ASM_UNIQUE_SECTION
+#define TARGET_ASM_UNIQUE_SECTION riscv_unique_section
+
 #undef TARGET_ASM_SELECT_RTX_SECTION
 #define TARGET_ASM_SELECT_RTX_SECTION  riscv_elf_select_rtx_section


Re: [PATCH PR94969]Add unit distant vector to DDR in case of invariant access functions

2020-05-13 Thread Bin.Cheng via Gcc-patches
On Thu, May 14, 2020 at 1:46 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Wed, May 13, 2020 at 02:00:11PM +0200, Christophe Lyon via Gcc-patches 
> wrote:
> > > > 2020-05-11  Bin Cheng  
> > > >
> > > > PR tree-optimization/94969
> > > > * gcc.dg/tree-ssa/pr94969.c: New test.
> >
> > The new test fails on arm and aarch64 and probably everywhere:
> > gcc.dg/tree-ssa/pr94969.c: dump file does not exist
> > UNRESOLVED: gcc.dg/tree-ssa/pr94969.c scan-tree-dump-not Loop 1
> > distributed: split to 3 loops "ldist"
> >
> > Can you fix this?
>
> Seems a mere swapping of the scan-tree-dump-not args, I've verified the
> test passes after this change and fails with older trunk, committed to trunk
> as obvious.
Oh, sorry for the breakage, and thanks for fixing this.

Thanks,
bin
>
> 2020-05-13  Jakub Jelinek  
>
> PR testsuite/95110
> * gcc.dg/tree-ssa/pr94969.c: Swap scan-tree-dump-not arguments.
>
> --- gcc/testsuite/gcc.dg/tree-ssa/pr94969.c.jj  2020-05-13 09:24:36.959012780 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr94969.c 2020-05-13 19:13:53.664499322 +0200
> @@ -25,4 +25,4 @@ int main()
>  __builtin_abort ();
>  }
>
> -/* { dg-final { scan-tree-dump-not "ldist" "Loop 1 distributed: split to 3 loops"} } */
> +/* { dg-final { scan-tree-dump-not "Loop 1 distributed: split to 3 loops" "ldist" } } */
>
> Jakub
>


Re: [PATCH v2] Fold (add -1; zero_ext; add +1) operations to zero_ext when not overflow (PR37451, part of PR61837)

2020-05-13 Thread luoxhu via Gcc-patches
On 2020/5/13 02:24, Richard Sandiford wrote:
> luoxhu  writes:
>> +  /* Fold (add -1; zero_ext; add +1) operations to zero_ext. i.e:
>> +
>> + 73: r145:SI=r123:DI#0-0x1
>> + 74: r144:DI=zero_extend (r145:SI)
>> + 75: r143:DI=r144:DI+0x1
>> + ...
>> + 31: r135:CC=cmp (r123:DI,0)
>> + 72: {pc={(r143:DI!=0x1)?L70:pc};r143:DI=r143:DI-0x1;clobber
>> + scratch;clobber scratch;}
> 
> Minor, but it might be worth stubbing out the clobbers, since they're
> not really necessary to understand the comment:
> 
>72: {pc={(r143:DI!=0x1)?L70:pc};r143:DI=r143:DI-0x1;...}
> 
>> +
>> + r123:DI#0-0x1 is param count derived from loop->niter_expr equal to the
>> + loop iterations, if loop iterations expression doesn't overflow, then
>> + (zero_extend (r123:DI#0-1))+1 could be simplified to zero_extend only.
>> +   */
>> +  bool simplify_zext = false;
> 
> I think it'd be easier to follow if this was split out into
> a subroutine, rather than having the simplify_zext variable.
> 
>> +  rtx extop0 = XEXP (count, 0);
>> +  if (GET_CODE (count) == ZERO_EXTEND && GET_CODE (extop0) == PLUS)
> 
> This isn't valid: we can only do XEXP (count, 0) *after* checking
> for a ZERO_EXTEND.  (It'd be good to test the patch with
> --enable-checking=yes,extra,rtl , which hopefully would have
> caught this.)
> 
>> +{
>> +  rtx addop0 = XEXP (extop0, 0);
>> +  rtx addop1 = XEXP (extop0, 1);
>> +
>> +  int nonoverflow = 0;
>> +  unsigned int_mode
>> += GET_MODE_PRECISION (as_a GET_MODE (addop0));
> 
> Heh.  I wondered at first how on earth this compiled.  It looked like
> there was a missing "(...)" around the GET_MODE.  But of course,
> GET_MODE adds its own parentheses, so it all works out. :-)
> 
> Please add the "(...)" anyway though.  We shouldn't rely on that.
> 
> "int_mode" seems a bit of a confusing name, since it's actually a precision
> in bits rather than a mode.
> 
>> +  unsigned HOST_WIDE_INT int_mode_max
>> += (HOST_WIDE_INT_1U << (int_mode - 1) << 1) - 1;
>> +  if (get_max_loop_iterations (loop, &iterations)
>> +  && wi::ltu_p (iterations, int_mode_max))
> 
> You could use GET_MODE_MASK instead of int_mode_max here.
> 
> For extra safety, it would be good to add a HWI_COMPUTABLE_P test,
> to make sure that using HWIs is valid.
> 
>> +nonoverflow = 1;
>> +
>> +  if (nonoverflow
> 
> Having the nonoverflow variable doesn't seem necessary.  We could
> just fuse the two "if" conditions together.
> 
>> +  && CONST_SCALAR_INT_P (addop1)
>> +  && GET_MODE_PRECISION (mode) == int_mode * 2
> 
> This GET_MODE_PRECISION condition also shouldn't be necessary.
> If we can prove that the subtraction doesn't wrap, we can extend
> to any wider mode, not just to double the width.
> 
>> +  && addop1 == GEN_INT (-1))
> 
> This can just be:
> 
> addop1 == constm1_rtx
> 
> There's then no need for the CONST_SCALAR_INT_P check.
> 
> Thanks,
> Richard
> 

Thanks for all your great comments; I have addressed them all in the update
below.  "--enable-checking=yes,extra,rtl" did catch the ICE, at the cost of a
performance penalty.


This "subtract/extend/add" sequence has existed for a long time and is still
annoying us (PR37451, part of PR61837) when converting from 32 bits to 64 bits,
because the ctr register is used as a 64-bit register on powerpc64.  Andrew
Pinski had a patch, but it caused some issues and was reverted by
Joseph S. Myers (PR37451, PR37782).

Andrew:
http://gcc.gnu.org/ml/gcc-patches/2008-09/msg01070.html
http://gcc.gnu.org/ml/gcc-patches/2008-10/msg01321.html
Joseph:
https://gcc.gnu.org/legacy-ml/gcc-patches/2011-11/msg02405.html

We can still simplify "subtract/zero_ext/add" to "zero_ext" when the number
of loop iterations is known to be less than MODE_MAX (that is, we only
simplify when counter+0x1 does not overflow).
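
The arithmetic behind this simplification can be sketched in plain C.  This
is only an illustration, not the code in loop-doloop.c: the function names
are hypothetical, and uint32_t/uint64_t stand in for SImode/DImode.

```c
#include <assert.h>
#include <stdint.h>

/* Unsimplified count: subtract 1 in 32 bits, zero-extend to 64 bits,
   then add 1 in 64 bits.  (Hypothetical helper for illustration.)  */
static uint64_t
count_sub_zext_add (uint64_t n)
{
  uint32_t niter = (uint32_t) n - 1u;   /* r145:SI = r123:DI#0 - 0x1 */
  return (uint64_t) niter + 1;          /* zero_extend, then + 1 */
}

/* Simplified count: zero-extend the low 32 bits directly.  */
static uint64_t
count_zext_only (uint64_t n)
{
  return (uint64_t) (uint32_t) n;
}

/* Assumed guard: the simplification is valid only when the 32-bit
   iteration count is below the mode maximum, so niter + 1 cannot wrap.  */
static int
simplification_is_safe (uint64_t n)
{
  uint32_t niter = (uint32_t) n - 1u;
  return niter < UINT32_MAX;
}

/* Small self-check covering the safe case and the wrapping case.  */
static void
check_examples (void)
{
  assert (simplification_is_safe (5));
  assert (count_sub_zext_add (5) == count_zext_only (5));

  /* With a low part of 0, niter wraps to 0xffffffff and the two forms
     differ, so the guard must reject the simplification.  */
  assert (!simplification_is_safe (0));
  assert (count_sub_zext_add (0) == 0x100000000ull);
  assert (count_zext_only (0) == 0);
}
```

The two forms disagree only when the 32-bit subtraction wraps, which is
exactly the case the iteration bound excludes.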

Bootstrapped and regression tested on Power8-LE; all tests pass.

gcc/ChangeLog

2020-05-14  Xiong Hu Luo  

PR rtl-optimization/37451, part of PR target/61837
* loop-doloop.c (doloop_simplify_count): New function.  Simplify
(add -1; zero_ext; add +1) to zero_ext when not wrapping.
(doloop_modify): Call doloop_simplify_count.

gcc/testsuite/ChangeLog

2020-05-14  Xiong Hu Luo  

PR rtl-optimization/37451, part of PR target/61837
* gcc.target/powerpc/doloop-2.c: New test.
---
 gcc/loop-doloop.c   | 38 -
 gcc/testsuite/gcc.target/powerpc/doloop-2.c | 29 
 2 files changed, 66 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/doloop-2.c

diff --git a/gcc/loop-doloop.c b/gcc/loop-doloop.c
index db6a014e43d..02282d45bd5 100644
--- a/gcc/loop-doloop.c
+++ b/gcc/loop-doloop.c
@@ -397,6 +397,42 @@ add_test (rtx cond, edge *e, basic_block dest)
   return true;
 }
 
+/* Fold (add -1; zero_ext; add +1) operations to zero_ext if not wrapping. i.e:
+
+   73: r145:SI=r123:DI#0-0x1
+   74: r144:DI=zero_extend (r145:SI)

[Committed] IBM Z: stack clash prot: add missing updates of last_probe_offset

2020-05-13 Thread Andreas Krebbel via Gcc-patches
After emitting probes in a loop, last_probe_offset needs to be updated.
Without the update, the code emitting the remainder usually assumes too
small a distance to the last access, which leads to stack probes being omitted.
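
A toy model may help to see why a stale value drops the probe.  Everything
here is an assumption made for illustration and not the s390.c code: the
4096-byte probe interval, the 8-byte word size, the stale starting value
of 0, the rule that a residual probe is emitted once the distance to the
last access reaches the probe interval, and all function names.

```c
#include <assert.h>

/* Assumed parameters for the model only.  */
enum { PROBE_INTERVAL = 4096, WORD_SIZE = 8 };

/* Assumed rule: the remainder gets a probe only if the distance to the
   last access plus the residual reaches the probe interval.  */
static int
residual_probe_emitted (long last_probe_offset, long residual)
{
  return last_probe_offset + residual >= PROBE_INTERVAL;
}

/* Model an allocation of SIZE bytes: full intervals are probed in a
   loop, the rest is the residual.  UPDATE_OFFSET selects whether the
   distance to the last access is refreshed after the loop.  */
static int
residual_probe_after_loop (long size, int update_offset)
{
  long rounded = size & -(long) PROBE_INTERVAL;
  long residual = size - rounded;
  long last_probe_offset = 0;   /* hypothetical stale value */

  if (update_offset && rounded > 0)
    last_probe_offset = PROBE_INTERVAL - WORD_SIZE;

  return residual_probe_emitted (last_probe_offset, residual);
}
```

For a 32000-byte frame (the size of `volatile int stack[8000]` with 4-byte
int), the model leaves a 3328-byte residual: with the stale offset the
residual probe is skipped, while with the refreshed offset it is emitted.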

Bootstrapped and regression tested on s390x

Committed to mainline

gcc/ChangeLog:

2020-05-14  Andreas Krebbel  

* config/s390/s390.c (allocate_stack_space): Add missing updates
of last_probe_offset.

gcc/testsuite/ChangeLog:

2020-05-14  Andreas Krebbel  

* gcc.target/s390/stack-clash-1.c: New test.
---
 gcc/ChangeLog |  5 +
 gcc/config/s390/s390.c|  3 +++
 gcc/testsuite/ChangeLog   |  4 
 gcc/testsuite/gcc.target/s390/stack-clash-1.c | 17 +
 4 files changed, 29 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/stack-clash-1.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 0b326ee09e8..51d3e425ad5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2020-05-14  Andreas Krebbel  
+
+   * config/s390/s390.c (allocate_stack_space): Add missing updates
+   of last_probe_offset.
+
 2020-05-14  Andreas Krebbel  
 
* config/s390/s390.md ("allocate_stack"): Call
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 18332271ed7..b4897256af5 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -10996,6 +10996,8 @@ allocate_stack_space (rtx size, HOST_WIDE_INT last_probe_offset,
   stack_pointer_rtx,
   offset));
}
+ if (num_probes > 0)
+   last_probe_offset = INTVAL (offset);
  dump_stack_clash_frame_info (PROBE_INLINE, residual != 0);
}
  else
@@ -11029,6 +11031,7 @@ allocate_stack_space (rtx size, HOST_WIDE_INT last_probe_offset,
  s390_prologue_plus_offset (stack_pointer_rtx, temp_reg,
 const0_rtx, true);
  temp_reg_clobbered_p = true;
+ last_probe_offset = INTVAL (offset);
  dump_stack_clash_frame_info (PROBE_LOOP, residual != 0);
}
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index bb3e4c86adc..8ff0bbcc85b 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2020-05-14  Andreas Krebbel  
+
+   * gcc.target/s390/stack-clash-1.c: New test.
+
 2020-05-14  Andreas Krebbel  
 
* gcc.target/s390/stack-clash-3.c: New test.
diff --git a/gcc/testsuite/gcc.target/s390/stack-clash-1.c b/gcc/testsuite/gcc.target/s390/stack-clash-1.c
new file mode 100644
index 000..3d29cab9446
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/stack-clash-1.c
@@ -0,0 +1,17 @@
+/* Make sure a stack probe is emitted also for the remaining bytes
+   after the loop probing the large chunk.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=z9-ec -fstack-clash-protection" } */
+
+void large_stack() {
+  volatile int stack[8000];
+  int i;
+  for (i = 0; i < sizeof(stack) / sizeof(int); ++i)
+stack[i] = i;
+}
+
+/* We use a compare for the stack probe.  There needs to be one inside
+   a loop and another for the remaining bytes.  */
+/* { dg-final { scan-assembler-times "cg\t" 2 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "c\t" 2 { target { ! lp64 } } } } */
-- 
2.17.1



[Committed] IBM Z: Define probe_stack expander

2020-05-13 Thread Andreas Krebbel via Gcc-patches
Probes emitted by the common code routines still use a store.  Define
the "probe_stack" pattern to use a compare instead.

Bootstrapped and regression tested on s390x

Committed to mainline

gcc/ChangeLog:

2020-05-14  Andreas Krebbel  

* config/s390/s390.c (s390_emit_stack_probe): Call the probe_stack
expander.
* config/s390/s390.md ("@probe_stack2", "probe_stack"): New
expanders.

gcc/testsuite/ChangeLog:

2020-05-14  Andreas Krebbel  

* gcc.target/s390/stack-clash-2.c: New test.
---
 gcc/ChangeLog |  7 +++
 gcc/config/s390/s390.c|  7 +++
 gcc/config/s390/s390.md   | 16 +++-
 gcc/testsuite/ChangeLog   |  4 
 gcc/testsuite/gcc.target/s390/stack-clash-2.c | 17 +
 5 files changed, 46 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/stack-clash-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 51d3e425ad5..5c2366e3671 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2020-05-14  Andreas Krebbel  
+
+   * config/s390/s390.c (s390_emit_stack_probe): Call the probe_stack
+   expander.
+   * config/s390/s390.md ("@probe_stack2", "probe_stack"): New
+   expanders.
+
 2020-05-14  Andreas Krebbel  
 
* config/s390/s390.c (allocate_stack_space): Add missing updates
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index b4897256af5..4de3129f88e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -10946,10 +10946,9 @@ s390_prologue_plus_offset (rtx target, rtx reg, rtx offset, bool frame_related_p
 static void
 s390_emit_stack_probe (rtx addr)
 {
-  rtx tmp = gen_rtx_MEM (Pmode, addr);
-  MEM_VOLATILE_P (tmp) = 1;
-  s390_emit_compare (EQ, gen_rtx_REG (Pmode, 0), tmp);
-  emit_insn (gen_blockage ());
+  rtx mem = gen_rtx_MEM (Pmode, addr);
+  MEM_VOLATILE_P (mem) = 1;
+  emit_insn (gen_probe_stack (mem));
 }
 
 /* Use a runtime loop if we have to emit more probes than this.  */
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 908de587e17..cd1c0634b71 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -11017,8 +11017,22 @@
 
   emit_move_insn (operands[0], virtual_stack_dynamic_rtx);
   DONE;
-})
+  })
+
+(define_expand "@probe_stack2"
+  [(set (reg:CCZ CC_REGNUM)
+   (compare:CCZ (reg:P 0)
+(match_operand 0 "memory_operand")))
+   (unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  "")
 
+(define_expand "probe_stack"
+  [(match_operand 0 "memory_operand")]
+  ""
+{
+  emit_insn (gen_probe_stack2 (Pmode, operands[0]));
+  DONE;
+})
 
 ;
 ; setjmp instruction pattern.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 8ff0bbcc85b..498ebb7f678 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2020-05-14  Andreas Krebbel  
+
+   * gcc.target/s390/stack-clash-2.c: New test.
+
 2020-05-14  Andreas Krebbel  
 
* gcc.target/s390/stack-clash-1.c: New test.
diff --git a/gcc/testsuite/gcc.target/s390/stack-clash-2.c b/gcc/testsuite/gcc.target/s390/stack-clash-2.c
new file mode 100644
index 000..e554ad5ed0d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/stack-clash-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=z900 -fstack-clash-protection" } */
+
+extern void bar (char *);
+
+void
+foo ()
+{
+  char * mem = __builtin_alloca (2);
+  bar (mem);
+}
+
+/* For alloca a common code routine emits the probes.  Make sure the
+   "probe_stack" expander is used in that case. We want to use mem
+   compares instead of stores.  */
+/* { dg-final { scan-assembler-times "cg\t" 5 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "c\t" 5 { target { ! lp64 } } } } */
-- 
2.17.1



[PATCH] middle-end/95118 - fix printing of "denormal" zero

2020-05-13 Thread Richard Biener


This fixes printing a REAL_CST that value-numbering generates by punning
some bits to a real, which turns out to be a zero with a big negative
exponent.  Such a value causes the loop in real_to_decimal_for_mode to
never terminate.
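
The non-termination can be illustrated with a small stand-alone model.  This
is a sketch under assumptions, not the real.c code: it uses a plain double
instead of REAL_VALUE_TYPE, tests "u >= 1.0" in place of REAL_EXP (&u) > 0,
and adds a hypothetical step cap so the failing case can be observed without
hanging.

```c
#include <assert.h>

/* Multiply U by ten until it is >= 1, as in the scaling loop described
   above.  STOP_ON_ZERO models the added zero check.  Returns the number
   of multiplications, or -1 if MAX_STEPS is reached first (the cap is
   part of this model only).  */
static int
steps_until_ge_one (double u, int stop_on_zero, int max_steps)
{
  for (int steps = 1; steps <= max_steps; steps++)
    {
      u *= 10.0;

      /* Stop if we're now >= 1 or, with the fix, zero.  */
      if (u >= 1.0 || (stop_on_zero && u == 0.0))
        return steps;
    }
  return -1;
}
```

A nonzero value such as 0.0625 reaches 1 after a finite number of steps,
whereas zero stays zero under multiplication, so only the extra zero check
lets the loop exit.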

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

2020-05-14  Richard Biener  

PR middle-end/95118
* real.c (real_to_decimal_for_mode): Make sure we handle
a zero with nonzero exponent.

* gcc.dg/pr95118.c: New testcase.
---
 gcc/real.c |  4 ++--
 gcc/testsuite/gcc.dg/pr95118.c | 11 +++
 2 files changed, 13 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr95118.c

diff --git a/gcc/real.c b/gcc/real.c
index 00b23ceb41e..09ec5c08c38 100644
--- a/gcc/real.c
+++ b/gcc/real.c
@@ -1714,8 +1714,8 @@ real_to_decimal_for_mode (char *str, const REAL_VALUE_TYPE *r_orig,
 
  do_multiply (&u, &v, ten);
 
- /* Stop if we're now >= 1.  */
- if (REAL_EXP (&u) > 0)
+ /* Stop if we're now >= 1 or zero.  */
+ if (REAL_EXP (&u) > 0 || u.cl == rvc_zero)
break;
 
  v = u;
diff --git a/gcc/testsuite/gcc.dg/pr95118.c b/gcc/testsuite/gcc.dg/pr95118.c
new file mode 100644
index 000..69bc47fd7aa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr95118.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fre" } */
+
+void a();
+void b() {
+union {
+   int c[4];
+   long double d;
+} e = {{0, 0, 4}};
+a(e.d);
+}
-- 
2.16.4