[PATCH] Generalized value pass-through for self-recursive function (ipa/pr93203)

2020-01-25 Thread Feng Xue OS
Besides a simple pass-through (aggregate) jump function, an arithmetic
(aggregate) jump function can also bring the same (aggregate) value as the
parameter passed in for a self-feeding recursive call.  For example,

  f1 (int i)    /* normal jump function */
    {
      f1 (i & 1);
    }

Suppose i is 0; recursive propagation via (i & 1) also gets 0, which
can be seen as a simple pass-through of i.

  f2 (int *p)   /* aggregate jump function */
    {
      int t = *p & 1;
      f2 (&t);
    }

Likewise, if *p is 0, (*p & 1) is also 0, and &t is an aggregate simple
pass-through of p.

This patch is to support these two kinds of value pass-through.
Bootstrapped/regtested on x86_64-linux and aarch64-linux.
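
To make the pass-through claim concrete, here is a minimal sketch that is not
part of the patch.  It models the arithmetic jump function on the recursive
edge of f1 as a plain C function and checks whether a constant arriving from
outside maps to itself.  The names jump_f1 and is_self_feeding_constant are
hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the arithmetic jump function on the recursive
   edge of f1: the callee receives (i & 1).  */
int
jump_f1 (int i)
{
  return i & 1;
}

/* A constant C behaves like a simple pass-through on the self-recursive
   edge when the jump function maps C back to C.  */
bool
is_self_feeding_constant (int c)
{
  return jump_f1 (c) == c;
}
```

Under this model 0 and 1 are self-feeding for (i & 1), while 2 is not,
because 2 & 1 is 0.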

Feng

---
2020-01-25  Feng Xue  

PR ipa/93203
* ipa-cp.c (ipcp_lattice::add_value): Add source with same call
edge but different source value.
(adjust_callers_for_value_intersection): New function.
(gather_edges_for_value): Adjust order of callers to let a
non-self-recursive caller be the first element.
(self_recursive_pass_through_p): Add a new parameter simple, and
check generalized self-recursive pass-through jump function. 
(self_recursive_agg_pass_through_p): Likewise.
(find_more_scalar_values_for_callers_subset): Compute value from
pass-through jump function for self-recursive.
(intersect_with_plats): Remove code of intersection with unknown
place holder value.
(intersect_with_agg_replacements): Likewise.
(intersect_aggregates_with_edge): Deduce value from pass-through
jump function for self-recursive.
(decide_whether_version_node): Remove dead callers and adjust
order to let a non-self-recursive caller be the first element.

From 406c23711077c0df18d5e77270d0a82be098224b Mon Sep 17 00:00:00 2001
From: Feng Xue 
Date: Tue, 21 Jan 2020 20:53:38 +0800
Subject: [PATCH] Generalized value pass-through for self-recursive function

---
 gcc/ipa-cp.c   | 196 ++---
 gcc/testsuite/g++.dg/ipa/pr93203.C |  95 ++
 gcc/testsuite/gcc.dg/ipa/ipcp-1.c  |   2 +-
 3 files changed, 217 insertions(+), 76 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr93203.C

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 17da1d8e8a7..533a429ba3b 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -1850,7 +1850,7 @@ ipcp_lattice::add_value (valtype newval, cgraph_edge *cs,
 	  {
 	ipcp_value_source *s;
 	for (s = val->sources; s; s = s->next)
-	  if (s->cs == cs)
+	  if (s->cs == cs && s->val == src_val)
 		break;
 	if (s)
 	  return false;
@@ -4207,6 +4207,33 @@ get_info_about_necessary_edges (ipcp_value *val, cgraph_node *dest,
   return hot;
 }
 
+/* Given a NODE, and a set of its CALLERS, try to adjust order of the callers
+   to let a non-self-recursive caller be the first element.  Thus, we can
+   simplify intersecting operations on values that arrive from all of these
+   callers, especially when there exists self-recursive call.  Return true if
+   this kind of adjustment is possible.  */
+
+static bool
+adjust_callers_for_value_intersection (vec<cgraph_edge *> callers,
+                                       cgraph_node *node)
+{
+  for (unsigned i = 0; i < callers.length (); i++)
+    {
+      cgraph_edge *cs = callers[i];
+
+      if (cs->caller != node)
+        {
+          if (i > 0)
+            {
+              callers[i] = callers[0];
+              callers[0] = cs;
+            }
+          return true;
+        }
+    }
+  return false;
+}
+
 /* Return a vector of incoming edges that do bring value VAL to node DEST.  It
is assumed their number is known and equal to CALLER_COUNT.  */
 
@@ -4230,6 +4257,9 @@ gather_edges_for_value (ipcp_value *val, cgraph_node *dest,
 	}
 }
 
+  if (caller_count > 1)
+adjust_callers_for_value_intersection (ret, dest);
+
   return ret;
 }
 
@@ -4241,7 +4271,6 @@ get_replacement_map (class ipa_node_params *info, tree value, int parm_num)
 {
   struct ipa_replace_map *replace_map;
 
-
   replace_map = ggc_alloc<ipa_replace_map> ();
   if (dump_file)
 {
@@ -4592,36 +4621,40 @@ create_specialized_node (struct cgraph_node *node,
 }
 
 /* Return true, if JFUNC, which describes a i-th parameter of call CS, is a
-   simple no-operation pass-through function to itself.  */
+   pass-through function to itself.  When SIMPLE is true, further check if
+   JFUNC is a simple no-operation pass-through.  */
 
 static bool
-self_recursive_pass_through_p (cgraph_edge *cs, ipa_jump_func *jfunc, int i)
+self_recursive_pass_through_p (cgraph_edge *cs, ipa_jump_func *jfunc, int i,
+			   bool simple = true)
 {
   enum availability availability;
   if (cs->caller == cs->callee->function_symbol (&availability)
   && availability > AVAIL_INTERPOSABLE
   && jfunc->type == IPA_JF_PASS_THROUGH
-  && ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR
+  && (!simple || ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
   && ipa

[PATCH] Fix gcc.target/aarch64/vec_zeroextend.c for big-endian

2020-01-25 Thread apinski
From: Andrew Pinski 

vec_zeroextend.c fails on big-endian because it assumes that
index 0 is the lower part of the vector, which is not the
case on big-endian.  This patch fixes the test by using the
correct index for the lower part when compiling for
big-endian.

Committed as obvious after a test on aarch64_be-linux-gnu.
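
As a scalar analogy only (this is not the vector test itself), the sketch
below shows the same idea on a 64-bit integer viewed as two 32-bit words in
memory order: the word holding the least-significant bits sits at index 0 on
little-endian and at index 1 on big-endian.  The names LOW_HALF_INDEX and
low_half are hypothetical; __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__ are the
GCC/Clang predefined macros that the patch also relies on.

```c
#include <stdint.h>
#include <string.h>

/* Index of the 32-bit word that holds the least-significant bits when a
   64-bit integer is viewed as two 32-bit words in memory order.  */
#define LOW_HALF_INDEX (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ ? 1 : 0)

/* Return the low 32 bits of VALUE by picking the right word in memory,
   rather than by shifting or masking.  */
uint32_t
low_half (uint64_t value)
{
  uint32_t halves[2];
  memcpy (halves, &value, sizeof halves);
  return halves[LOW_HALF_INDEX];
}
```

Hard-coding index 0 here would return the high word on a big-endian host,
which mirrors the assumption the test made about lane 0.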

Thanks,
Andrew Pinski

ChangeLog:
* gcc.target/aarch64/vec_zeroextend.c: Fix for big-endian.
---
 gcc/testsuite/gcc.target/aarch64/vec_zeroextend.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/vec_zeroextend.c b/gcc/testsuite/gcc.target/aarch64/vec_zeroextend.c
index 9c3971f036a..5a74cbc5aba 100644
--- a/gcc/testsuite/gcc.target/aarch64/vec_zeroextend.c
+++ b/gcc/testsuite/gcc.target/aarch64/vec_zeroextend.c
@@ -3,17 +3,21 @@
 
 #define vector __attribute__((vector_size(16) ))
 
+#define lowull (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ ? 1 : 0)
+#define lowui (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ ? 3 : 0)
+
+
 vector unsigned long long
 f1(vector unsigned long long b, vector unsigned int a)
 {
-  b[0] = a[0];
+  b[lowull] = a[lowui];
   return b;
 }
 
 unsigned long long
 f2(vector unsigned int a)
 {
-  return a[0];
+  return a[lowui];
 }
 
 /* { dg-final { scan-assembler-times {fmov} 2 } } */
-- 
2.17.1



[PATCH V2] Generalized value pass-through for self-recursive function (ipa/pr93203)

2020-01-25 Thread Feng Xue OS
Made some changes.

Feng


From: Feng Xue OS
Sent: Saturday, January 25, 2020 5:54 PM
To: mjam...@suse.cz; Jan Hubicka; gcc-patches@gcc.gnu.org
Subject: [PATCH] Generalized value pass-through for self-recursive function (ipa/pr93203)

Besides a simple pass-through (aggregate) jump function, an arithmetic
(aggregate) jump function can also bring the same (aggregate) value as the
parameter passed in for a self-feeding recursive call.  For example,

  f1 (int i)    /* normal jump function */
    {
      f1 (i & 1);
    }

Suppose i is 0; recursive propagation via (i & 1) also gets 0, which
can be seen as a simple pass-through of i.

  f2 (int *p)   /* aggregate jump function */
    {
      int t = *p & 1;
      f2 (&t);
    }

Likewise, if *p is 0, (*p & 1) is also 0, and &t is an aggregate simple
pass-through of p.

This patch is to support these two kinds of value pass-through.
Bootstrapped/regtested on x86_64-linux and aarch64-linux.

Feng

---
2020-01-25  Feng Xue  

PR ipa/93203
* ipa-cp.c (ipcp_lattice::add_value): Add source with same call
edge but different source value.
(adjust_callers_for_value_intersection): New function.
(gather_edges_for_value): Adjust order of callers to let a
non-self-recursive caller be the first element.
(self_recursive_pass_through_p): Add a new parameter simple, and
check generalized self-recursive pass-through jump function.
(self_recursive_agg_pass_through_p): Likewise.
(find_more_scalar_values_for_callers_subset): Compute value from
pass-through jump function for self-recursive.
(intersect_with_plats): Remove code of intersection with unknown
place holder value.
(intersect_with_agg_replacements): Likewise.
(intersect_aggregates_with_edge): Deduce value from pass-through
jump function for self-recursive.
(decide_whether_version_node): Remove dead callers and adjust
order to let a non-self-recursive caller be the first element.

From 74aef0cd2f40ff828a4b2abcbbdbbf4b1aea1fcf Mon Sep 17 00:00:00 2001
From: Feng Xue 
Date: Tue, 21 Jan 2020 20:53:38 +0800
Subject: [PATCH] Generalized value pass-through for self-recursive function

---
 gcc/ipa-cp.c   | 195 ++---
 gcc/testsuite/g++.dg/ipa/pr93203.C |  95 ++
 gcc/testsuite/gcc.dg/ipa/ipcp-1.c  |   2 +-
 3 files changed, 216 insertions(+), 76 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr93203.C

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 17da1d8e8a7..64d23a34292 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -1850,7 +1850,7 @@ ipcp_lattice::add_value (valtype newval, cgraph_edge *cs,
 	  {
 	ipcp_value_source *s;
 	for (s = val->sources; s; s = s->next)
-	  if (s->cs == cs)
+	  if (s->cs == cs && s->val == src_val)
 		break;
 	if (s)
 	  return false;
@@ -4207,6 +4207,33 @@ get_info_about_necessary_edges (ipcp_value *val, cgraph_node *dest,
   return hot;
 }
 
+/* Given a NODE, and a set of its CALLERS, try to adjust order of the callers
+   to let a non-self-recursive caller be the first element.  Thus, we can
+   simplify intersecting operations on values that arrive from all of these
+   callers, especially when there exists self-recursive call.  Return true if
+   this kind of adjustment is possible.  */
+
+static bool
+adjust_callers_for_value_intersection (vec<cgraph_edge *> callers,
+                                       cgraph_node *node)
+{
+  for (unsigned i = 0; i < callers.length (); i++)
+    {
+      cgraph_edge *cs = callers[i];
+
+      if (cs->caller != node)
+        {
+          if (i > 0)
+            {
+              callers[i] = callers[0];
+              callers[0] = cs;
+            }
+          return true;
+        }
+    }
+  return false;
+}
+
 /* Return a vector of incoming edges that do bring value VAL to node DEST.  It
is assumed their number is known and equal to CALLER_COUNT.  */
 
@@ -4230,6 +4257,9 @@ gather_edges_for_value (ipcp_value *val, cgraph_node *dest,
 	}
 }
 
+  if (caller_count > 1)
+adjust_callers_for_value_intersection (ret, dest);
+
   return ret;
 }
 
@@ -4241,7 +4271,6 @@ get_replacement_map (class ipa_node_params *info, tree value, int parm_num)
 {
   struct ipa_replace_map *replace_map;
 
-
   replace_map = ggc_alloc<ipa_replace_map> ();
   if (dump_file)
 {
@@ -4592,36 +4621,40 @@ create_specialized_node (struct cgraph_node *node,
 }
 
 /* Return true, if JFUNC, which describes a i-th parameter of call CS, is a
-   simple no-operation pass-through function to itself.  */
+   pass-through function to itself.  When SIMPLE is true, further check if
+   JFUNC is a simple no-operation pass-through.  */
 
 static bool
-self_recursive_pass_through_p (cgraph_edge *cs, ipa_jump_func *jfunc, int i)
+self_recursive_pass_through_p (cgraph_edge *cs, ipa_jump_func *jfunc, int i,
+			   bool simple = true)
 {
   enum availability availability;
   if (cs->caller == cs->callee->

Re: [patch, fortran] Fix PR 85781, ICE on valid

2020-01-25 Thread Thomas König

Hi Tobias,


> I hope my patch covers all issues. – OK for the trunk?


Yep.

Thanks a lot for the patch!

Regards

Thomas


[PATCH] dbr: Filter-out TARGET_FLAGS_REGNUM from end_of_function_needs.

2020-01-25 Thread Hans-Peter Nilsson
Compared to the cc0 version, I noticed a regression in
delay-slot-filling for CRIS for several functions in libgcc with
a similar layout, one being lshrdi3, where with cc0 all
delay-slots were filled, as exposed by the test-case.  I ended
up including the thankfully-small lshrdi3 as-is, for simplicity
of testing, after failing to cook up an artificial test-case.

There's one slot that fails to be filled for the decc0rated CRIS
port.  A gdb session shows it is because of the automatic
inclusion of TARGET_FLAGS_REGNUM in "registers needed at the end
of the function" because there are insns in the epilogue that
clobber the condition-code register.  I'm not trying to tell a
clobber from a set, as parallels with set instead of clobber
seem likely to happen too, for targets with TARGET_FLAGS_REGNUM
set.
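
For illustration only, the filtering can be pictured with a register set
modelled as a plain bitmask.  This is a hedged sketch, not GCC code:
regset_t, NO_FLAGS_REGNUM, filter_flags_reg and reg_needed_p are hypothetical
stand-ins for HARD_REG_SET, INVALID_REGNUM and the CLEAR_HARD_REG_BIT call in
the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins: a register set as a 64-bit mask and a sentinel
   meaning "this target has no flags register".  */
typedef uint64_t regset_t;
#define NO_FLAGS_REGNUM 0xffffffffu

/* Return NEEDS with the flags register removed, if the target has one.  */
regset_t
filter_flags_reg (regset_t needs, unsigned flags_regnum)
{
  if (flags_regnum != NO_FLAGS_REGNUM && flags_regnum < 64)
    needs &= ~((regset_t) 1 << flags_regnum);
  return needs;
}

/* Return true if register REGNUM is a member of NEEDS.  */
bool
reg_needed_p (regset_t needs, unsigned regnum)
{
  return regnum < 64 && ((needs >> regnum) & 1) != 0;
}
```

With the flags register cleared from the "needed at end of function" set, an
insn that only clobbers the condition codes no longer looks like it conflicts
with the epilogue.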

Other targets with delay-slots and one dedicated often-clobbered
condition-code-register should consider defining
TARGET_FLAGS_REGNUM.  I noticed it improved delay-slot-filling
also in other situations than this.  (Author of introduction of
TARGET_FLAGS_REGNUM use in dbr is CC:ed.)

Tested cris-elf.

Ok to commit or perhaps wait to gcc11?

(The test-case goes in either way, as it passes with cc0-CRIS.)

gcc:
* resource.c (init_resource_info): Filter-out TARGET_FLAGS_REGNUM
from end_of_function_needs.

gcc/testsuite:
* gcc.target/cris/pr93372-1.c: New.

---
 gcc/resource.c|  6 +++
 gcc/testsuite/gcc.target/cris/pr93372-1.c | 62 +++
 2 files changed, 68 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/cris/pr93372-1.c

diff --git a/gcc/resource.c b/gcc/resource.c
index d26217c..62a69c0 100644
--- a/gcc/resource.c
+++ b/gcc/resource.c
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
+#include "target.h"
 #include "rtl.h"
 #include "df.h"
 #include "memmodel.h"
@@ -1214,6 +1215,11 @@ init_resource_info (rtx_insn *epilogue_insn)
   if (return_insn_p (epilogue_insn))
break;
 }
+  
+  /* Filter-out the flags register from those additionally required
+     registers. */
+  if (targetm.flags_regnum != INVALID_REGNUM)
+    CLEAR_HARD_REG_BIT (end_of_function_needs.regs, targetm.flags_regnum);
 
   /* Allocate and initialize the tables used by mark_target_live_regs.  */
   target_hash_table = XCNEWVEC (struct target_info *, TARGET_HASH_PRIME);
diff --git a/gcc/testsuite/gcc.target/cris/pr93372-1.c b/gcc/testsuite/gcc.target/cris/pr93372-1.c
new file mode 100644
index 000..b625eda
--- /dev/null
+++ b/gcc/testsuite/gcc.target/cris/pr93372-1.c
@@ -0,0 +1,62 @@
+/* Check that all more-or-less trivially fillable delayed-branch-slots
+   are filled. */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "\tnop" } } */
+
+void *f(void **p)
+{
+  /* Supposedly the memory read finds its way into the "ret"
+     delay-slot. */
+  return *p;
+}
+
+int g(int *x, int *y, char *v, int n)
+{
+  int z = *x;
+  int w = *v + 31;
+
+  /* Two branch and two return slots, all filled. */
+  if (z != 23 && z != n+1)
+    return *x+*y+24+w;
+  return *y+24+w;
+}
+
+/* No problem with the two examples above, but with a more involved
+   example, the epilogue contents matter (the condition-code register
+   clobber was mistaken for a register that needed to be alive). */
+
+struct DWstruct {int low, high;};
+typedef unsigned long long DItype;
+typedef unsigned int USItype;
+
+typedef union
+{
+  struct DWstruct s;
+  DItype ll;
+} DWunion;
+
+unsigned long long
+xlshrdi3 (DItype u, unsigned int b)
+{
+  if (b == 0)
+    return u;
+
+  const DWunion uu = {.ll = u};
+  const int bm = (4 * 8) - b;
+  DWunion w;
+
+  if (bm <= 0)
+    {
+      w.s.high = 0;
+      w.s.low = (USItype) uu.s.high >> -bm;
+    }
+  else
+    {
+      const USItype carries = (USItype) uu.s.high << bm;
+      w.s.high = (USItype) uu.s.high >> b;
+      w.s.low = ((USItype) uu.s.low >> b) | carries;
+    }
+
+  return w.ll;
+}
-- 
2.11.0

brgds, H-P


Re: [PATCH] dbr: Filter-out TARGET_FLAGS_REGNUM from end_of_function_needs.

2020-01-25 Thread Jeff Law
On Sat, 2020-01-25 at 15:14 +0100, Hans-Peter Nilsson wrote:
> Compared to the cc0 version, I noticed a regression in
> delay-slot-filling for CRIS for several functions in libgcc with
> a similar layout, one being lshrdi3, where with cc0 all
> delay-slots were filled, as exposed by the test-case.  I ended
> up including the thankfully-small lshrdi3 as-is, for simplicity
> of testing, after failing to cook up an artificial test-case.
> 
> There's one slot that fails to be filled for the decc0rated CRIS
> port.  A gdb session shows it is because of the automatic
> inclusion of TARGET_FLAGS_REGNUM in "registers needed at the end
> of the function" because there are insns in the epilogue that
> clobber the condition-code register.  I'm not trying to tell a
> clobber from a set, as parallels with set instead of clobber
> seems likely to happen too, for targets with TARGET_FLAGS_REGNUM
> set.
> 
> Other targets with delay-slots and one dedicated often-clobbered
> condition-code-register should consider defining
> TARGET_FLAGS_REGNUM.  I noticed it improved delay-slot-filling
> also in other situations than this.  (Author of introduction of
> TARGET_FLAGS_REGNUM use in dbr is CC:ed.)
> 
> Tested cris-elf.
> 
> Ok to commit or perhaps wait to gcc11?
> 
> (The test-case goes in either way, as it passes with cc0-CRIS.)
> 
> gcc:
> * resource.c (init_resource_info): Filter-out TARGET_FLAGS_REGNUM
> from end_of_function_needs.
> 
> gcc/testsuite:
> * gcc.target/cris/pr93372-1.c: New.
Looks reasonable to me.  I'd probably wait for gcc-11 though out of an
abundance of caution.

jeff
> 



[committed] Add include hack to fix missing SCNuMAX defines in inttypes.h on hpux11.[01]*

2020-01-25 Thread John David Anglin
In porting git trunk to hppa2.0w-hp-hpux11.11, I found that we lack defines for 
SCNuMAX:
https://public-inbox.org/git/c9aa5047-7438-8f2f-985c-1c8771354...@bell.net/T/#u

This patch adds the missing defines.

Tested on hppa2.0w-hp-hpux11.11.  Committed to trunk.
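
For readers unfamiliar with the macro, here is a small sketch of the kind of
code that needs SCNuMAX from <inttypes.h>.  It is an illustration, not the
git code that exposed the problem; parse_uintmax is a hypothetical helper.

```c
#include <inttypes.h>
#include <stdio.h>

/* Parse an unsigned decimal number from TEXT into *OUT using the C99
   SCNuMAX conversion macro.  Returns 1 on success, 0 otherwise.  */
int
parse_uintmax (const char *text, uintmax_t *out)
{
  return sscanf (text, "%" SCNuMAX, out) == 1;
}
```

Without the define, the string concatenation in the format argument fails to
compile, which is what the include hack works around.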

Dave

2020-01-25  John David Anglin  

* inclhack.def (hpux_c99_inttypes4): New, add missing SCNuMAX defines.
* fixincl.x: Regenerate.
* tests/base/inttypes.h: Update for above fix.

 fixincludes/inclhack.def  | 15 +++
 fixincludes/tests/base/inttypes.h |  9 +
 2 files changed, 24 insertions(+)

diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def
index bf136fdaa20..f58e7771e1c 100644
--- a/fixincludes/inclhack.def
+++ b/fixincludes/inclhack.def
@@ -2613,6 +2613,21 @@ fix = {
"#define UINTPTR_MAX\n";
 };

+/*
+ * Fix missing SCNuMAX defines in inttypes.h
+ */
+fix = {
+hackname  = hpux_c99_inttypes4;
+mach  = "hppa*-hp-hpux11.[01]*";
+files = inttypes.h;
+sed   = "/^[ \t]*#[ \t]*define[ \t]*SCNxMAX[ \t]*SCNx64/a\\\n"
+   "#define SCNuMAX \t SCNu64\n";
+sed   = "/^[ \t]*#[ \t]*define[ \t]*SCNxMAX[ \t]*SCNx32/a\\\n"
+   "#define SCNuMAX \t SCNu32\n";
+test_text = "#define SCNxMAX SCNx64\n"
+   "#define SCNxMAX SCNx32\n";
+};
+
 /*
  *  Fix hpux broken ctype macros
  */
diff --git a/fixincludes/tests/base/inttypes.h b/fixincludes/tests/base/inttypes.h
index e2216832666..144ea6596e8 100644
--- a/fixincludes/tests/base/inttypes.h
+++ b/fixincludes/tests/base/inttypes.h
@@ -33,3 +33,12 @@
 #endif

 #endif  /* HPUX_C99_INTTYPES3_CHECK */
+
+
+#if defined( HPUX_C99_INTTYPES4_CHECK )
+#define SCNxMAX SCNx64
+#define SCNuMAX SCNu64
+#define SCNxMAX SCNx32
+#define SCNuMAX SCNu32
+
+#endif  /* HPUX_C99_INTTYPES4_CHECK */


Re: [C++ PATCH] c++: Poor diagnostic for dynamic_cast in constexpr context [PR93414]

2020-01-25 Thread Marek Polacek
On Fri, Jan 24, 2020 at 10:39:14PM -0500, Jason Merrill wrote:
> [C++ PATCH] c++: is unnecessarily redundant, you can just write [PATCH].

True enough, I've dropped --subject-prefix from my git patch alias.

> On 1/24/20 6:20 PM, Marek Polacek wrote:
> > I neglected to add a proper diagnostic for the reference dynamic_cast
> > case when the operand of a dynamic_cast doesn't refer to a public base
> > of Derived, resulting in suboptimal error message
> >
> > error: call to non-'constexpr' function 'void* __cxa_bad_cast()'
> > 
> > Tested x86_64-linux, ok for trunk?
> 
> OK.

Thanks.

Marek



Re: [PATCH] libsanitizer: Add missing file and regen Makefile.in

2020-01-25 Thread Andreas Tobler

On 23.01.20 21:09, Jeff Law wrote:
> On Wed, 2020-01-22 at 22:23 +0100, Andreas Tobler wrote:
> > Hi all,
> >
> > I'm digging out old patches and I want to complete the libasan support
> > for FreeBSD x86_64. The below one was not that obvious when you have
> > been away for the past years.
> >
> > In the last import the sanitizer_platform_limits_freebsd.cpp got
> > forgotten. Fix this.
> >
> > Ok for trunk once it's open again?
> >
> > Thanks,
> > Andreas
> >
> > libsanitizer/sanitizer_common:
> >
> >   * Makefile.am: Add sanitizer_platform_limits_freebsd.cpp.
> >   * Makefile.in: Regenerate.
> >
> I think all the patches in this space are fine for the trunk.  As
> someone else mentioned, the sanitizer patches should probably go
> through the upstream project as GCC is downstream.


Yup. Thanks for the feedback. I'm working on this one. It'll take some 
time since I learned it is not that easy to build llvm and run a simple 
test. But that is another story.


I'll commit w/o the sanitizer bit once trunk is open for new features.

Andreas


Re: [committed] [PR tree-optimization/92788] Check right edge flags when suppressing jump threading

2020-01-25 Thread Jakub Jelinek
On Fri, Jan 24, 2020 at 03:52:51PM -0700, Jeff Law wrote:
> When we thread through the successor of a joiner block we make a clone
> of the joiner block and redirect its outgoing edges.  Of course if
> there's cases where we can't redirect an edge, then bad things will
> happen.
> 
> The code already checked for EDGE_ABNORMAL to suppress threading in
> that case.  But it really should have been checking EDGE_COMPLEX which
> includes ABNORMAL_CALL, EH and PRESERVE.
> 
> This patch fixes that oversight and resolves the BZ.  Bootstrapped and
> regression tested on x86_64.  Committed to the trunk.

The test FAILs on i686-linux, operator new's first parameter needs to be
size_t, which for ia32 is not unsigned long, but unsigned int.
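
The point is about type identity rather than width.  As a hedged sketch in C11
(the helper names are hypothetical), _Generic can show which type size_t
really is on the target being compiled for; __SIZE_TYPE__ is the GCC/Clang
predefined macro used in the fix.

```c
#include <stddef.h>

/* Always 1 with GCC/Clang: __SIZE_TYPE__ names the type behind size_t.  */
int
size_t_is_size_type (void)
{
  return _Generic ((size_t) 0, __SIZE_TYPE__: 1, default: 0);
}

/* 1 only on targets where size_t is exactly unsigned long.  */
int
size_t_is_unsigned_long (void)
{
  return _Generic ((size_t) 0, unsigned long: 1, default: 0);
}

/* 1 only on targets where size_t is exactly unsigned int.  */
int
size_t_is_unsigned_int (void)
{
  return _Generic ((size_t) 0, unsigned int: 1, default: 0);
}
```

A declaration written with unsigned long is therefore only correct where the
second helper returns 1, whereas one written with __SIZE_TYPE__ is correct
everywhere.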

Also, I think we shouldn't be adding tests to g++.dg/ directly; for an
optimization test g++.dg/opt/ might be better, but as it is x86
guarded, I've moved it to g++.target/i386/ instead.

Tested on x86_64-linux -m32/-m64, committed to trunk as obvious.

2020-01-26  Jakub Jelinek  

PR tree-optimization/92788
* g++.dg/pr92788.C: Move to ...
* g++.target/i386/pr92788.C: ... here.  Remove target from dg-do line.
Change type of operator new's first parameter to __SIZE_TYPE__.

diff --git a/gcc/testsuite/g++.dg/pr92788.C b/gcc/testsuite/g++.target/i386/pr92788.C
similarity index 98%
rename from gcc/testsuite/g++.dg/pr92788.C
rename to gcc/testsuite/g++.target/i386/pr92788.C
index b92ae38f7aa..048bbd1b9b8 100644
--- a/gcc/testsuite/g++.dg/pr92788.C
+++ b/gcc/testsuite/g++.target/i386/pr92788.C
@@ -1,4 +1,4 @@
-/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-do compile } */
 /* { dg-require-effective-target c++11 } */
 /* { dg-options "-O3 -fnon-call-exceptions -ftracer -march=k8 -Wno-return-type" } */
 
@@ -17,7 +17,7 @@ struct is_same : integral_constant {};
 
 template  using __enable_if_t = _Tp;
 
-void *operator new(unsigned long, void *__p) { return __p; }
+void *operator new(__SIZE_TYPE__, void *__p) { return __p; }
 
 template  class __normal_iterator {
 


Jakub



Re: [PATCH] Remove assertion in get_info_about_necessary_edges (PR ipa/93166)

2020-01-25 Thread Jakub Jelinek
On Sun, Jan 19, 2020 at 12:54:51PM +0100, Jan Hubicka wrote:
> > Bootstrapped/regtested on x86_64-linux and aarch64-linux.
> > 
> > Feng
> > ---
> > 2020-01-19  Feng Xue  
> > 
> > PR ipa/93166
> > * ipa-cp.c (get_info_about_necessary_edges): Remove value
> > check assertion.
> 
> OK.
> Please next time write short description on the problem in email so one
> does not need to look up the PR log.

The test FAILs with -std=c++98 (because it really requires C++11), but more
importantly, it doesn't actually test what it is supposed to test, i.e.
doesn't FAIL before the ipa-cp.c change and succeed with it.
That is because it is a dg-do compile test, it needs to be a proper LTO test
for this.

Fixed thusly, tested on x86_64-linux and i686-linux (with Thursday snapshot
where it FAILs and current where it PASSes), committed to trunk as obvious.

2020-01-26  Jakub Jelinek  

PR ipa/93166
* g++.dg/pr93166.C: Move to ...
* g++.dg/lto/pr93166_0.C: ... here.  Turn it into a proper lto test.

diff --git a/gcc/testsuite/g++.dg/pr93166.C b/gcc/testsuite/g++.dg/lto/pr93166_0.C
similarity index 95%
rename from gcc/testsuite/g++.dg/pr93166.C
rename to gcc/testsuite/g++.dg/lto/pr93166_0.C
index e9234ce7a0c..52f7ddf4016 100644
--- a/gcc/testsuite/g++.dg/pr93166.C
+++ b/gcc/testsuite/g++.dg/lto/pr93166_0.C
@@ -1,5 +1,10 @@
-// { dg-do compile }
-// { dg-options "-shared -flto -O2 -fPIC -fvisibility=hidden" }
+// PR ipa/93166
+// { dg-lto-do link }
+// { dg-lto-options { { -fPIC -O2 -flto -fvisibility=hidden } } }
+// { dg-require-effective-target shared }
+// { dg-require-effective-target fpic }
+// { dg-extra-ld-options "-shared" }
+// { dg-require-visibility "" }
 
 namespace Qt {
 enum DropAction {};


Jakub



[PATCH] i386: Fix up *avx_vperm_broadcast_v4df [PR93430]

2020-01-25 Thread Jakub Jelinek
Hi!

Apparently my recent patch, which moved the *avx_vperm_broadcast* and
*vpermil* patterns before vpermpd, broke the following testcase.  The
define_insn_and_split always matched, but for V4DFmode the splitter
condition only split it when not -mavx2, basically relying on the vpermpd
pattern coming first.

The following patch fixes it by moving that part of SPLIT-CONDITION into
CONDITION, so that when it is not met, we just don't match the pattern
and thus match the later vpermpd pattern in that case.
The exception is the { 0, 0, 0, 0 } permutation, where there is actually no
reason to do that: vbroadcastsd from memory seems to be slightly cheaper
than vpermpd $0.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-01-26  Jakub Jelinek  

PR target/93430
* config/i386/sse.md (*avx_vperm_broadcast_<mode>): Disallow for
TARGET_AVX2 and V4DFmode not in the split condition, but in the
pattern condition, though allow { 0, 0, 0, 0 } broadcast always.

* gcc.dg/pr93430.c: New test.
* gcc.target/i386/avx2-pr93430.c: New test.

--- gcc/config/i386/sse.md.jj   2020-01-24 22:49:19.0 +0100
+++ gcc/config/i386/sse.md  2020-01-25 18:32:02.100439737 +0100
@@ -19912,9 +19912,10 @@ (define_insn_and_split "*avx_vperm_broad
  (match_operand:VF_256 1 "nonimmediate_operand" "m,o,?v")
  (match_parallel 2 "avx_vbroadcast_operand"
[(match_operand 3 "const_int_operand" "C,n,n")])))]
-  "TARGET_AVX"
+  "TARGET_AVX
+   && (<MODE>mode != V4DFmode || !TARGET_AVX2 || operands[3] == const0_rtx)"
   "#"
-  "&& reload_completed && (<MODE>mode != V4DFmode || !TARGET_AVX2)"
+  "&& reload_completed"
   [(set (match_dup 0) (vec_duplicate:VF_256 (match_dup 1)))]
 {
   rtx op0 = operands[0], op1 = operands[1];
--- gcc/testsuite/gcc.dg/pr93430.c.jj   2020-01-25 18:39:33.455584367 +0100
+++ gcc/testsuite/gcc.dg/pr93430.c  2020-01-25 18:38:03.725950223 +0100
@@ -0,0 +1,33 @@
+/* PR target/93430 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-mavx -mno-avx2" { target avx } } */
+
+typedef double V __attribute__((vector_size (4 * sizeof (double))));
+typedef long long VI __attribute__((vector_size (4 * sizeof (long long))));
+
+#if __SIZEOF_DOUBLE__ == __SIZEOF_LONG_LONG__
+void
+foo (V *x, V *y)
+{
+  y[0] = __builtin_shuffle (x[0], x[0], (VI) { 0, 0, 0, 0 });
+}
+
+void
+bar (V *x, V *y)
+{
+  y[0] = __builtin_shuffle (x[0], x[0], (VI) { 1, 1, 1, 1 });
+}
+
+void
+baz (V *x, V *y)
+{
+  y[0] = __builtin_shuffle (x[0], x[0], (VI) { 2, 2, 2, 2 });
+}
+
+void
+qux (V *x, V *y)
+{
+  y[0] = __builtin_shuffle (x[0], x[0], (VI) { 3, 3, 3, 3 });
+}
+#endif
--- gcc/testsuite/gcc.target/i386/avx2-pr93430.c.jj  2020-01-25 18:39:55.282252126 +0100
+++ gcc/testsuite/gcc.target/i386/avx2-pr93430.c  2020-01-25 18:40:35.080646319 +0100
@@ -0,0 +1,5 @@
+/* PR target/93430 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx2" } */
+
+#include "../../gcc.dg/pr93430.c"

Jakub



[PATCH] i386: Fix up *{add,sub}v4_doubleword patterns (PR target/93412)

2020-01-25 Thread Jakub Jelinek
Hi!

In the *{add,sub}v4_doubleword patterns, we don't really want to see a
VOIDmode last operand, because it then means invalid RTL
(sign_extend:{TI,POI} (const_int ...)) or so, and therefore something we
don't really handle in the splitter either.  We have
*{add,sub}v4_doubleword_1 pattern for those and that is what combine
will match, the problem in this testcase is just that it was only RA that
propagated the constant into the instruction.

In the similar *{add,sub}v4 patterns, we make sure not to accept
VOIDmode operand and similarly to these have _1 suffixed variant that allows
constants.  Fixed thusly, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2020-01-26  Jakub Jelinek  

PR target/93412
* config/i386/i386.md (*addv4_doubleword, *subv4_doubleword):
Use nonimmediate_operand instead of x86_64_hilo_general_operand and
drop  from constraint of last operand.

* gcc.dg/pr93412.c: New test.

--- gcc/config/i386/i386.md.jj  2020-01-23 16:16:55.982005551 +0100
+++ gcc/config/i386/i386.md 2020-01-25 19:30:20.358205644 +0100
@@ -6135,7 +6135,7 @@ (define_insn_and_split "*addv4_doub
(sign_extend:
  (match_operand: 1 "nonimmediate_operand" "%0,0"))
(sign_extend:
- (match_operand: 2 "x86_64_hilo_general_operand" "r,o")))
+ (match_operand: 2 "nonimmediate_operand" "r,o")))
  (sign_extend:
(plus: (match_dup 1) (match_dup 2)
(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
@@ -6644,7 +6644,7 @@ (define_insn_and_split "*subv4_doub
(sign_extend:
  (match_operand: 1 "nonimmediate_operand" "0,0"))
(sign_extend:
- (match_operand: 2 "x86_64_hilo_general_operand" "r,o")))
+ (match_operand: 2 "nonimmediate_operand" "r,o")))
  (sign_extend:
(minus: (match_dup 1) (match_dup 2)
(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
--- gcc/testsuite/gcc.dg/pr93412.c.jj   2020-01-25 19:36:23.083680678 +0100
+++ gcc/testsuite/gcc.dg/pr93412.c  2020-01-25 19:36:09.771883437 +0100
@@ -0,0 +1,15 @@
+/* PR target/93412 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-Og" } */
+
+unsigned char a;
+int b;
+unsigned c;
+
+int
+foo (int e, int f, int g, int h, int k, int i, short j)
+{
+  b = __builtin_add_overflow (a, 0, &c);
+  b = __builtin_add_overflow_p (b, a, (unsigned __int128) 0) ? b : 0;
+  return e + f + g + a + h + k + i + j + c;
+}

Jakub



[PATCH] sanopt: Avoid crash on anonymous parameter [PR93436]

2020-01-25 Thread Marek Polacek
Here we crash when using -fsanitize=address -fdump-tree-sanopt because
the dumping code uses IDENTIFIER_POINTER on a null DECL_NAME.  Instead,
we can print "<anonymous>" in such a case.  Or we could avoid printing
that diagnostic altogether.
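
The guard is the usual null-check-before-print idiom.  A minimal standalone
sketch, assuming plain C strings in place of GCC's DECL_NAME and
IDENTIFIER_POINTER (both helper names below are hypothetical):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: return NAME, or a placeholder when it is null.  */
const char *
name_or_placeholder (const char *name)
{
  return name ? name : "<anonymous>";
}

/* Write the dump line to OUT; passing a null NAME is safe.  Returns the
   value of fprintf.  */
int
dump_rewritten_param (FILE *out, const char *name)
{
  return fprintf (out, "Rewriting parameter whose address is taken: %s\n",
                  name_or_placeholder (name));
}
```

Passing a null pointer straight to a %s conversion is undefined behaviour, so
the substitution has to happen before the fprintf call.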

I don't think this warrants a testcase.

Tested x86_64-linux, ok for trunk and 9?

2020-01-25  Marek Polacek  

PR tree-optimization/93436
* sanopt.c (sanitize_rewrite_addressable_params): Avoid crash on
null DECL_NAME.
---
 gcc/sanopt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/sanopt.c b/gcc/sanopt.c
index 619aae45a15..63fd68d4ad1 100644
--- a/gcc/sanopt.c
+++ b/gcc/sanopt.c
@@ -1176,7 +1176,9 @@ sanitize_rewrite_addressable_params (function *fun)
  if (dump_file)
fprintf (dump_file,
 "Rewriting parameter whose address is taken: %s\n",
-IDENTIFIER_POINTER (DECL_NAME (arg)));
+(DECL_NAME (arg)
+ ? IDENTIFIER_POINTER (DECL_NAME (arg))
+ : "<anonymous>"));
 
  SET_DECL_PT_UID (var, DECL_PT_UID (arg));
 

base-commit: 05107d4e4ccd11ecc8712d6e271fcb4b5f17060f
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



[PATCH] checking: avoid verify_type_variant crash on incomplete type.

2020-01-25 Thread Jason Merrill
Here, we end up calling gen_type_die_with_usage for a type that's in the
middle of finish_struct_1, after we set TYPE_NEEDS_CONSTRUCTING on it but
before we copy all the flags to the variants--and, significantly, before we
set its TYPE_SIZE.  It seems reasonable to only look at
TYPE_NEEDS_CONSTRUCTING on complete types, since we aren't going to try to
create an object of an incomplete type any other way.

Tested x86_64-pc-linux-gnu, OK for trunk/9?

PR c++/92601
* tree.c (verify_type_variant): Only verify TYPE_NEEDS_CONSTRUCTING
of complete types.
---
 gcc/testsuite/g++.dg/debug/verify1.C | 64 
 gcc/tree.c   |  2 +-
 2 files changed, 65 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/debug/verify1.C

diff --git a/gcc/testsuite/g++.dg/debug/verify1.C b/gcc/testsuite/g++.dg/debug/verify1.C
new file mode 100644
index 000..67e407251a1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/debug/verify1.C
@@ -0,0 +1,64 @@
+// PR c++/92601
+// { dg-additional-options "-g -fchecking -std=c++17" }
+
+typedef int size_t;
+template  struct integral_constant {
+  static constexpr int value = __v;
+};
+template  struct A;
+template  using __remove_cv_t = typename A<_Tp>::type;
+template 
+struct B : integral_constant {};
+template  class tuple;
+template  struct A {
+  using type = tuple;
+};
+template  struct C { typedef __remove_cv_t __type; };
+template  class D {
+public:
+  typedef typename C<_Tp>::__type type;
+};
+template  struct enable_if;
+template  struct F {};
+template  class G {
+public:
+  int operator*();
+  void operator++();
+};
+template 
+bool operator!=(G<_Iterator, _Container>, G<_Iterator, _Container>);
+template  class H;
+template >> class vector {
+public:
+  typedef G iterator;
+  iterator begin();
+  iterator end();
+};
+template  struct pack_c { typedef pack_c type; };
+template  struct make_index_pack_join;
+template 
+struct make_index_pack_join, pack_c>
+: pack_c {};
+template 
+struct I
+: make_index_pack_join::type, typename I::type> 
{};
+template <> struct I<1> : pack_c {};
+template 
+struct are_tuples_compatible_not_same
+: F::type, int>::value> {};
+template  struct tuple_impl;
+template 
+struct tuple_impl, Ts...> {
+  template , UTuple>::value>::type>
+  tuple_impl(UTuple &&);
+};
+template  class tuple {
+  tuple_impl::type> _impl;
+  tuple(tuple &) = default;
+};
+vector message_handler_registrations;
+void fn1() {
+  for (auto t : message_handler_registrations)
+;
+}
diff --git a/gcc/tree.c b/gcc/tree.c
index 0ddf002e9eb..298499fe876 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -13881,9 +13881,9 @@ verify_type_variant (const_tree t, tree tv)
  debug_tree (TYPE_SIZE_UNIT (t));
  return false;
}
+  verify_variant_match (TYPE_NEEDS_CONSTRUCTING);
 }
   verify_variant_match (TYPE_PRECISION);
-  verify_variant_match (TYPE_NEEDS_CONSTRUCTING);
   if (RECORD_OR_UNION_TYPE_P (t))
 verify_variant_match (TYPE_TRANSPARENT_AGGR);
   else if (TREE_CODE (t) == ARRAY_TYPE)

base-commit: 9c1179c339e050e2ce7c545f648b684d38dec69d
-- 
2.18.1



[COMMITTED] c++: avoid ICE with __builtin_memset (PR90997).

2020-01-25 Thread Jason Merrill
warn_for_memset calls fold_for_warn, which calls fold_non_dependent_expr, so
also calling instantiate_non_dependent_expr here is undesirable.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/90997
* semantics.c (finish_call_expr): Don't call
instantiate_non_dependent_expr before warn_for_memset.
---
 gcc/cp/semantics.c   | 1 -
 gcc/testsuite/g++.dg/ext/builtin14.C | 4 
 2 files changed, 4 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/builtin14.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 3b88f1520bc..a489e2cf399 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2664,7 +2664,6 @@ finish_call_expr (tree fn, vec<tree, va_gc> **args, bool disallow_virtual,
  tree arg2 = (*orig_args)[2];
  int literal_mask = ((literal_integer_zerop (arg1) << 1)
  | (literal_integer_zerop (arg2) << 2));
- arg2 = instantiate_non_dependent_expr (arg2);
  warn_for_memset (input_location, arg0, arg2, literal_mask);
}
 
diff --git a/gcc/testsuite/g++.dg/ext/builtin14.C b/gcc/testsuite/g++.dg/ext/builtin14.C
new file mode 100644
index 000..38d5a39fd73
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/builtin14.C
@@ -0,0 +1,4 @@
+// PR c++/90997
+
+template void f ()
+{ __builtin_memset (0, 0, int(0.)); }

base-commit: d0683c187f1806b887ff8b7e476edbde992310ef
-- 
2.18.1